My clients at Cloudera have been around for a while, in effect positioned as “the Hadoop company.” Their business, in a nutshell, consists of:
- Packaging up a Cloudera distribution of Apache Hadoop. This distribution doesn’t have proprietary code; it’s just packaged by Cloudera from Apache projects (with a decent minority of the code happening to have been contributed by Cloudera engineers).
- Paid subscription support for Apache Hadoop and, in connection with that …
- … proprietary software that all support customers automatically get. There are two points to this proprietary software:
  - It adds value for the customer.
  - It makes Cloudera’s support job easier.
- Professional services around Hadoop.
- Training and conferences around Hadoop, which probably don’t generate all that much money, but are great marketing in terms of visibility, thought leadership, and lead generation.
Hortonworks spun out of Yahoo last week, with parts of the Cloudera business model, namely Hadoop support, training, and I guess conferences. Hortonworks emphatically rules out professional services, and says that it will contribute all code back to Apache Hadoop. Hortonworks does grudgingly admit that it might get into the proprietary software business at some point — but evidently hopes that day will never actually come.
Hortonworks’ two main initial marketing messages — and there’s some synergy between these — boil down to:
- Open source purism
- “We have most of the Hadoop developers, so we’re better”*
Frankly, the open source purism part sounds like doubletalk to me, in that Hortonworks has trouble articulating what supposedly-less-pure Cloudera does wrong that Hortonworks will do better. However, I’ve been hearing for a long time that Yahoo’s MapReduce developers feel very strongly about open source, so perhaps this is in part an emotional issue for them. More substantively, it fits well with the pro-Hortonworks story I’ve outlined below.
*“We have most of the Hadoop developers” seems fairly defensible, give or take dueling definitions of “committer,” “core developer,” “patch” or for that matter “Hadoop.”
The other branch of the Hortonworks marketing message can be lampooned as “We’re the right folks to identify your bugs, since we’re probably the ones who put them there in the first place.” More darkly, that pitch could be “If you want the bugs fixed that bother you, we’re the ones who have control over whether or not that happens.” Well, maybe. But I also see Cloudera having a couple of years’ experience supporting Hadoop, as well as shipping some code that perhaps makes Hadoop more supportable.
That’s the skeptical view. A more favorable view of Hortonworks’ prospects would go something like this:
- One version of Apache Hadoop is plenty.
- Cloudera (and arguably other Hadoop platform software vendors) sells capabilities that will soon be eclipsed by core Apache Hadoop. Folks should just please wait.
- Now that Hortonworks is an independent company focused on the task, it will speedily solve the packaging problems that have made Cloudera’s Hadoop distribution (perceived to be) necessary.
- Yahoo and IBM both back Hortonworks’ approach. That’s got to count for something.
- Apache Hadoop will be quickly enhanced, and Hortonworks will be driving the enhancements. Hortonworks simply is the top Hadoop authority.
We’ll see. Cloudera’s been around for a couple years, has smart people, and by definition has no technical inferiority to Hortonworks (since it has access to all Hortonworks’ code). What’s more, it will be a long time before Hadoop technology is so mature that there’s nothing left to do; add-on software should long prove to be useful. As for “We’re purer about open source than the other guys” — well, I’m dubious that that will turn out to be a great marketing message.
And so I think Cloudera is the early favorite in the competition. But perhaps Hadoop users will be able to play Cloudera and Hortonworks off against each other in price negotiations. Perhaps, notwithstanding my skepticism about Hadoop appliances, some hardware vendors will play them against each other for appliance partnerships.
Meanwhile, whatever else happens, I’m pretty psyched about some enhancements the Hortonworks folks plan to lead for Hadoop.
- A Hortonworks/Apache Hadoop slide deck Hortonworks graciously allowed me to post
- Cloudera’s post about its recent 3.5 release of Cloudera Enterprise
- Pros and cons of professional services efforts at young software companies