July 10, 2011

Cloudera and Hortonworks

My clients at Cloudera have been around for a while, in effect positioned as “the Hadoop company.” Their business, in a nutshell, consists of:

Hortonworks spun out of Yahoo last week, with parts of the Cloudera business model, namely Hadoop support, training, and I guess conferences. Hortonworks emphatically rules out professional services, and says that it will contribute all code back to Apache Hadoop. Hortonworks does grudgingly admit that it might get into the proprietary software business at some point — but evidently hopes that day will never actually come.

Hortonworks’ two main initial marketing messages — and there’s some synergy between these — boil down to:

Frankly, the open source purism part sounds like doubletalk to me, in that Hortonworks has trouble articulating what supposedly-less-pure Cloudera does wrong that Hortonworks will do better. However, I’ve been hearing for a long time that Yahoo’s MapReduce developers feel very strongly about open source, so perhaps this is in part an emotional issue for them. More substantively, it fits well with the pro-Hortonworks story I’ve outlined below.

“We have most of the Hadoop developers” seems fairly defensible, give or take dueling definitions of “committer,” “core developer,” “patch” or for that matter “Hadoop.”

The other branch of the Hortonworks marketing message can be lampooned as “We’re the right folks to identify your bugs, since we’re probably the ones who put them there in the first place.” More darkly, that pitch could be “If you want the bugs fixed that bother you, we’re the ones who have control over whether or not that happens.” Well, maybe. But I also see Cloudera having a couple of years’ experience supporting Hadoop, as well as shipping some code that perhaps makes Hadoop more supportable.

That’s the skeptical view. A more favorable view of Hortonworks’ prospects would go something like this:

We’ll see. Cloudera’s been around for a couple years, has smart people, and by definition has no technical inferiority to Hortonworks (since it has access to all Hortonworks’ code). What’s more, it will be a long time before Hadoop technology is so mature that there’s nothing left to do; add-on software should long prove to be useful. As for “We’re purer about open source than the other guys” — well, I’m dubious that that will turn out to be a great marketing message.

And so I think Cloudera is the early favorite in the competition. But perhaps Hadoop users will be able to play Cloudera and Hortonworks off against each other in price negotiations. Perhaps, notwithstanding my skepticism about Hadoop appliances, some hardware vendors will play them against each other for appliance partnerships.

Meanwhile, whatever else happens, I’m pretty psyched about some enhancements the Hortonworks folks plan to lead for Hadoop.

Comments

9 Responses to “Cloudera and Hortonworks”

  1. Hadoop futures and enhancements | DBMS 2 : DataBase Management System Services on July 10th, 2011 10:14 pm

    […] a new Hadoop company spun out of Yahoo, graciously permitted me to post a slide deck outlining an Apache Hadoop roadmap. Phase 1 refers to […]

  2. Daniel Weinreb on July 11th, 2011 7:42 am

    I very much hope that there isn’t a fork in the future. These parties need to work together, on the same branch of code, which means sharing each others’ contributions.

  3. Michael on July 11th, 2011 12:46 pm

    Those who care about performance and reliability will go with MapR. It’ll probably be years before Apache Hadoop catches up with them in terms of performance and features.

  4. Curt Monash on July 11th, 2011 2:19 pm

    Dan,

    Cloudera vs. Hortonworks isn’t a matter of a fork, so far as I can see, even though Hortonworks talks a fair amount about that concern.

    MapR, Brisk, et al. are indeed forks, if not pitchforks.

  5. Eric Baldeschwieler on July 12th, 2011 1:29 pm

    Hi Curt,

    Thanks for covering Hortonworks. In particular, I appreciate you pointing to the slides we shared with you. I’d also like to point out our HadoopSummit slides to anyone interested in learning more about Hortonworks’ plans for improving Apache Hadoop: http://www.hortonworks.com/hadoop-summit-presentations/.

    I’d like to respond to a few of the points that you made in your post. First, I respectfully disagree with your assertion that our marketing message is limited to open source purism and “we have most of the Hadoop developers”. I’ve summarized points we covered in our conversation for your readers. Our objectives at Hortonworks are to:

    1. Make Apache Hadoop projects easier to install, manage and use. We believe that anyone should be able to easily deploy Hadoop projects downloaded directly from Apache.

    2. Make Apache Hadoop more robust. Much of this is spelled out in the slides referenced above. We plan to improve Hadoop performance, add high availability and improve administration and monitoring.

    3. Make Apache Hadoop easier to integrate and extend. We want to work with technology vendors and other community members to create or improve open APIs that will make it easier to extend and experiment with Apache Hadoop.

    This is not about focusing our energies on competing with any other vendor. We want to make Apache Hadoop better for everyone. There will still be value in having third parties package Apache Hadoop and add incremental functionality on top of it. These vendors will benefit from our work on core Apache Hadoop just like we will benefit from their contributions back to core Apache Hadoop. That’s one of the great things about open source. We firmly believe that we are in the early stages of a fundamental shift in how organizations store, manage and analyze the ever-increasing volume of data created inside and outside of their company’s walls. We believe that by focusing our efforts on making Apache Hadoop better, Apache Hadoop will become the de facto big data platform, which will create a huge business opportunity not only for Hortonworks but other vendors as well.

    I know that a Hortonworks vs. Cloudera battle is a compelling story, but it’s clearly not our focus and I highly doubt it’s Cloudera’s focus either. Both companies can benefit from jointly working to improve Apache Hadoop. Our futures are much brighter because we are both going to be out there helping enterprises and technology vendors adopt Apache Hadoop.

    Eric Baldeschwieler (a.k.a. Eric14), Hortonworks
    Twitter @jeric14, @hortonworks

  6. Curt Monash on July 12th, 2011 2:07 pm

    Eric,

    Fair enough that your marketing messaging talks about a bunch of things other than head-to-head vs. Cloudera. That’s why, for example, I had a whole other blog post about the general Apache Hadoop roadmap as laid out by you.

    But for now, the default prospect view has to be “Coolness! Hortonworks is going to help make Apache Hadoop better. All the more reason to install Hadoop and have it be supported by the company with a track record of supporting it, Cloudera.” And I do tend to focus on those aspects of a company’s marketing message that are relevant to its closest head-to-head competition.

  7. Jeremy Hanna on July 12th, 2011 4:14 pm

    ‘But for now, the default prospect view has to be “Coolness! Hortonworks is going to help make Apache Hadoop better. All the more reason to install Hadoop and have it be supported by the company with a track record of supporting it, Cloudera.”’

    As a spectator, that last comment sounded pretty biased. I think Cloudera has some great things for the community. However, give Hortonworks a chance. They’ve outlined plans and have been in operation for less than two weeks. Take Eric and his team at their word for now. Cloudera has a great team, but so does Hortonworks.

  8. Curt Monash on July 12th, 2011 6:47 pm

    Jeremy,

    As I said — “for now”. Cloudera has the lead, and Hortonworks could surely at some point overtake them. Both companies are still very, very young. But if I had a choice of getting support from an organization that has ~100 support clients (or whatever it is), with a >2 year history of serving some of them, vs. an organization with 1 client and no history, I know who I’d be more inclined to rely on in the short term.

  9. Hadoop Ecosystem Happenings « Andre's Tech Blog on November 10th, 2011 1:53 pm

    […] Cloudera and Hortonworks – dbms2 – July 2011 And then there is the animated discussion on who contributes more to the Hadoop source repo – e.g. number of patches vs. lines of code! Very entertaining stuff. […]
