I met with the Hadapt guys today. I think I can be a bit crisper than before in positioning Hadapt and its use cases, namely:
- Hadapt is additional software on a cluster that also runs fully functional Hadoop/HDFS. (Cloudera Hadoop more than straight-from-Apache Hadoop to date, but that’s not a requirement.)
- The cluster also runs a DBMS on every node, such as PostgreSQL, Infobright, or Vectorwise.
- Hadapt’s software manages parallel SQL queries by distributing them to the DBMS living on each node. Hadapt says that the resulting query performance far outshines Hive’s.
- Hadapt further says that, by exploiting the partner DBMS, its SQL functionality outpaces Hive’s as well.
- Target Hadapt use cases are centered around keeping machine-generated or other poly-structured data in Hadoop, and extracting, enhancing, or otherwise deriving some of it to live in the relational store.
- In particular, Hadapt seems like an interesting choice when you want to use that relational data as you work on other data that’s still in HDFS, or if you want to keep using the relational data in other kinds of MapReduce jobs.
- That all fits well with my thoughts about the importance of derived data.
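To make the "distribute the query to the DBMS on each node" idea concrete, here is a minimal sketch of the general scatter-gather pattern. It is not Hadapt's code or API. It assumes in-memory SQLite databases standing in for the per-node DBMS, a hypothetical `events` table, and a coordinator that merges per-node partial aggregates; every name in it is made up for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node query: each node aggregates only its own slice of the data.
LOCAL_SQL = "SELECT host, SUM(bytes), COUNT(*) FROM events GROUP BY host"


def make_node(rows):
    """Create an in-memory SQLite database standing in for one node's local DBMS."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE events (host TEXT, bytes INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    return conn


def run_local(conn):
    """Run the partial aggregate on a single node."""
    return conn.execute(LOCAL_SQL).fetchall()


def distributed_totals(nodes):
    """Scatter the query to every node in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(run_local, nodes))

    merged = {}
    for partial in partials:
        for host, total, count in partial:
            prev_total, prev_count = merged.get(host, (0, 0))
            merged[host] = (prev_total + total, prev_count + count)
    return merged


# Three simulated nodes, each holding a different slice of the (made-up) data.
NODES = [
    make_node([("a", 10), ("b", 5)]),
    make_node([("a", 7), ("c", 1)]),
    make_node([("b", 3)]),
]
RESULT = distributed_totals(NODES)
print(RESULT)
```

The point of the sketch is only that sums and counts can be computed per node and combined afterward, so most of the work stays inside each local DBMS. How Hadapt actually plans, splits, and merges queries is not something this example claims to show.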
Other changes since I wrote about Hadapt a few months ago include:
- Hadapt is in beta now.
- Hadapt has added adult supervision in the form of Philip Wickline, late of Endeca.
In other news, Hadapt is our newest client.