February 18, 2015

Hadoop: And then there were three

Hortonworks, IBM, EMC Pivotal and others have announced a project called “Open Data Platform” to do … well, I’m not exactly sure what. Mainly, it sounds like:

Edit: Now there’s a press report saying explicitly that Hortonworks is taking over Pivotal’s Hadoop distro customers (which basically would mean taking over the support contracts and then working to migrate them to Hortonworks’ distro).

The claim is being made that this announcement solves some kind of problem about developing to multiple versions of the Hadoop platform, but to my knowledge that’s a problem rarely encountered in real life. When you already have a multi-enterprise open source community agreeing on APIs (Application Programming Interfaces), what API inconsistency remains for a vendor consortium to painstakingly resolve?

Anyhow, it now seems clear that if you want to use a Hadoop distribution, there are three main choices:

In saying that, I’m glossing over a few points, such as:

But the main point stands — big computer companies, such as IBM, EMC (Pivotal) and previously Intel, are figuring out that they can’t bigfoot something that started out as an elephant — stuffed or otherwise — in the first place.

If you think I’m not taking this whole ODP thing very seriously, you’re right.


11 Responses to “Hadoop: And then there were three”

  1. Greenplum is being open sourced | DBMS 2 : DataBase Management System Services on February 18th, 2015 4:51 pm

    […] I don’t find the Open Data Platform thing very significant, an associated piece of news seems cooler — Pivotal is open sourcing a […]

  2. The open data platform, like United Linux before it, will fail | Nagg on February 20th, 2015 5:38 pm

    […] ODP is a sign of weakness for the sponsoring members. Analyst Curt Monash described it as “A face-saving way to admit that IBM’s and Pivotal’s insistence on having […]

  3. clive boulton on February 20th, 2015 6:13 pm

    Lord Ganesha says Google.

    Elephants grant blessings of prosperity and wisdom.

    Coda. Lord Ganesha made Google!

  4. The Open Data Platform, like United Linux before it, will fail | Big Data on February 21st, 2015 10:18 am

    […] ODP is a sign of weakness for the sponsoring members. Analyst Curt Monash described it as “A face-saving way to admit that IBM’s and Pivotal’s insistence on having […]

  5. Free Hadoop! on February 21st, 2015 2:53 pm

    http://blog.pivotal.io/pivotal/p-o-v/open-data-platform-initiative-putting-an-end-to-faux-pen-source-apache-hadoop-distributions sheds a new light on ODP.

    There are actually quite a few API inconsistencies to resolve.
    But these inconsistencies are not found by looking at each project individually. Distributions sit at a layer above that, where all these projects interact with each other.
    They have to answer questions such as “If I upgrade Apache Hive on a Kerberos-enabled cluster, will it break Apache HBase integration?”
    And while Apache projects have great test coverage and a lot of effort is put into testing them individually, this sort of question is very difficult to answer without testing.
    Which is also why these sorts of tests are rarely contributed back to the Apache community.

    I am curious how this effort will pan out, but if it enables the various actors to contribute scenarios or test cases they care about, it will be a huge win for everyone.

  6. Curt Monash on February 21st, 2015 6:49 pm

    So according to that blog Pivotal has been planning this move to open source since the first half of 2014 (or longer)?

  7. Allen on February 22nd, 2015 5:50 pm

  8. Harmony or Hail Mary? Experts debate need for the Open Data Platform | SiliconANGLE on February 26th, 2015 8:00 am

    […] opinion was shared by analyst Curt Monash, who dismissed the effort as “a face-saving way to admit that IBM’s and Pivotal’s insistence on having […]

  9. Databricks and Spark update | DBMS 2 : DataBase Management System Services on February 28th, 2015 6:06 am

    […] API works against a test harness. Speaking of certification, Ion basically agrees with my views on ODP, although like many — most? — people he expresses himself more politely than I […]

  10. Deenar Toraskar on March 2nd, 2015 12:11 am

    >> You could get Apache Hadoop directly, rather than using the free or paid versions of a vendor distro. But why would you make that choice, unless you’re an internet bad-ass on the level of Facebook, or at least think that you are?

    I disagree with the point that you need to be an internet bad-ass to use the Apache Hadoop distro. I have first-hand experience of using Apache Hadoop in production for over 3 years now and have never had any issues. In fact, the Apache Hadoop distro can run purely on an unprivileged account (no root access/sudo required). There has been the occasional need to patch Hive, mainly because we use a rarely used database for the metastore.


  11. Strata Hadoop World 2015 summary - Simba Technologies on November 19th, 2015 3:15 pm

    […] in San Jose Tuesday afternoon, Cloudera’s Doug Cutting had responded as did Curt Monash the next day. The level of activities in this market makes it difficult to judge the merit of such broad […]
