I successfully resisted telephone consulting while on vacation, but I did do some by email. One exchange was on the oft-recurring subject of Hadoop adoption. I think it’s OK to adapt some of that into a post.
Notes on past and current Hadoop adoption include:
- Enterprise Hadoop adoption is so far mostly for experimental uses or departmental production (as opposed to serious enterprise-level production). Indeed, it’s rather tough to disambiguate those two. If an enterprise uses Hadoop to search for new insights and gets a few, is that an experiment that went well, or is it production?
- One of the core internet-business use cases for Hadoop is a many-step ETL, ELT, and data refinement pipeline, with Hadoop executing some or many of the steps. But I don’t think that’s in production at many enterprises yet, except in the usual forward-leaning sectors of financial services and (we’re all guessing) national intelligence.
- In terms of industry adoption:
  - Financial services on the investment/trading side are all over Hadoop, just as they’re all over any technology. Ditto national intelligence, one thinks.
  - Consumer financial services, especially credit card, are giving Hadoop a try too, for marketing and/or anti-fraud.
  - I’m sure there’s some telecom usage, but I’m hearing of less than I thought I would. Perhaps this is because telcos have spent so long optimizing their data into short, structured records.
  - Whatever consumer financial services firms do, retailers do too, albeit with smaller budgets.
Thoughts on how Hadoop adoption will look going forward include:
- Enterprise adoption of Hadoop for ETL/ELT/data refinement could explode after more software vendors offer support for it.
- The Hadoop community is trying hard to make it easy(ier) to manage multiple Hadoop clusters as one (preferably with elasticity among them). That could lead to more enterprise-level Hadoop deployments.
- There will be very few cases of Hadoop replacing existing relational data warehouses. But Hadoop could get a sizable share of new opportunities that might otherwise go to scale-out analytic RDBMS.
- I think Hadoop will do a good job of subsuming some of the newer efforts that might otherwise threaten to replace it. (I’m not sure whether Dremel/Drill is a major example of same, but it illustrates the point in any case.)
- If data starts out in the cloud, then the right place to do Hadoop on it may be in the same cloud.
- Hadoop appliances have dubious value for customers; everybody has or should have similar software, and nobody’s adding much value in their hardware designs. Even so, Hadoop gear is basically cheap, so overpaying for it isn’t a big deal. Thus, an enduring Hadoop appliance market may emerge.