A few weeks ago I wrote:
The other big part of Hortonworks’ story is the claim that it holds the axe in Apache Hadoop development.
… just how dominant Hortonworks really is in core Hadoop development is a bit unclear. Meanwhile, Cloudera people seem to be leading a number of Hadoop companion or sub-projects, including the first two I can think of that relate to Hadoop integration or connectivity, namely Sqoop and Flume. So I’m not persuaded that the “we know this stuff better” part of the Hortonworks partnering story really holds up.
- It’s ridiculous to say any one company, e.g. Hortonworks, has a controlling position in Hadoop development.
- Such diversity is a Very Good Thing.
- Cloudera folks now contribute and always have contributed to Hadoop at a higher rate than Hortonworks folks.
- If you consider just core Hadoop projects (the most favorable way of counting from a Hortonworks standpoint), Hortonworks has a lead, but not all that big of one.
I think Hortonworks likes to make the argument “But our contributions, on average, are more important than Cloudera’s contributions.” Except perhaps on that point, Cloudera’s argument looks persuasive.
Anyhow, the main bases for deciding whose enterprise support for Hadoop to buy — Cloudera’s or Hortonworks’ — are probably:
- Who is even offering it? Hortonworks, last I checked, wasn’t yet — Yahoo perhaps excepted — although it’s a near-term roadmap item for them to start doing so.
- Whose is better? Even when Hortonworks does offer enterprise support, it will lack experience with the support process. (To some extent, that could be worked around by providing money-losingly inefficient support at first.)
- Who bundles more useful proprietary software with their support? Unless you think the code in Cloudera Enterprise is 100% worthless, Cloudera wins that one.
- Price. I have no idea how that one will shake out.