The Forrester Wave: Enterprise Data Warehouse Platforms, Q1 2011 is now out,* hot on the heels of the Gartner Magic Quadrant. Unfortunately, this particular Forrester Wave is riddled with inaccuracy.
*At the time of this writing, I don’t have a link to a free version of the full report, but the 2011 Forrester Wave for Enterprise Data Warehouse Platforms graphic can be found here.
One example of the confusion pervading the 2011 Forrester Wave for Enterprise Data Warehouse Platforms lies in a list of three supposed trends.
- The Forrester Wave somehow conflates SaaS and MPP processing, tying them both to the term “cloud.” (In reality, the SaaS/cloud and MPP/cloud equations depend on two rather different word-senses for “cloud”.)
- The Forrester Wave then conflates EDWs, analytic computing systems, and application servers, the last perhaps because of the “data-application server” product category name Aster Data floated. The Forrester Wave also conflates investigative analytics with low-latency operational processes that exploit investigative analytics’ results.
- The Forrester Wave then conflates social media, “unstructured data” (by which it seems at one point to mean text and at another point to also mean logs), solid-state drives, and a whole bunch of other technologies (especially but not only low-latency ones) into another supposed single trend.
Some of the sillier specific claims in the Forrester Wave for Enterprise Data Warehouse Platforms include:
- According to the 2011 Forrester Wave for Enterprise Data Warehouse Platforms, Netezza has hybrid row/columnar persistence, while most other vendors cited don’t. To recycle an old Larry Ellison joke, somebody obviously has a better pharmacist than I do. It’s tough to imagine how anybody who understands columnar storage could possibly believe Netezza currently offers it.
- According to the 2011 Forrester Wave for Enterprise Data Warehouse Platforms, EMC/Greenplum is limited in the hardware it supports. Actually, Greenplum runs on pretty much any commodity Intel hardware, just like any other software-only DBMS does.
- According to the 2011 Forrester Wave for Enterprise Data Warehouse Platforms, Teradata, Sybase, and others are differentiated in their Hadoop support. Actually, Hadoop support of various forms is a checkmark item for analytic DBMS vendors.
- According to the 2011 Forrester Wave for Enterprise Data Warehouse Platforms, Oracle, Teradata, and others are differentiated in their cloud/SaaS support. Actually, having some kind of public cloud offering is a checkmark item; use of same is quite a different matter.
- The 2011 Forrester Wave for Enterprise Data Warehouse Platforms calls out EMC Greenplum for special praise in mixed workload management. Greenplum will probably be fine in concurrency and workload management, but implying that it’s a leader overstates the case.
- According to the 2011 Forrester Wave for Enterprise Data Warehouse Platforms, Vertica has not made a significant investment in real-time technologies (despite doing a lot of work with StreamBase and selling a lot into the algorithmic trading market). I disagree.
- Also according to the 2011 Forrester Wave for Enterprise Data Warehouse Platforms, Vertica has not made a significant investment in in-memory technology, despite the fact that all its updates pass through Vertica’s in-memory, query-responsive “Write-Optimized Store.” I disagree.
Even leaving aside the errors that obviously riddled the Forrester Wave for Enterprise Data Warehouse Platforms’ underlying 56-row matrix, I dispute the whole premise of the exercise. I’m not a big fan of overarching scorecard-based rankings, because the right choice of product varies so much by use case. For example:
- If you’re a smallish enterprise that can realistically do OLTP and data warehousing on the same instance of your DBMS, Oracle and Microsoft blow away everybody else mentioned.
- If columnar compression methods work really well for your use case, Vertica or maybe Oracle Exadata might shine.
- If you typically only retrieve a few columns from a wide table, so that columnar I/O is what you care most about, Vertica, Sybase, or even EMC Greenplum might shine. (The decidedly non-columnar Netezza and Oracle Exadata approaches to predicate pushdown might or might not excel as well.)
- If your database is above a certain size, some of the alternatives (such as Sybase IQ or non-Exadata Oracle) should be taken off the table.
- If you have a highly concurrent mixed workload, nobody else is as proven as Teradata.
- If you don’t want to invest much in database administration, Oracle is about the last vendor you should consider, and Netezza might be the first.
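To make the point about scorecards concrete, here is a minimal sketch of how a weighted scorecard can crown different winners depending on the use case. Everything in it is hypothetical: the two products, the criteria names, the 0-5 scores, and the weights are invented for illustration and do not come from Forrester or from any real vendor evaluation.

```python
# Hypothetical 0-5 scores for two made-up products; not real vendor data.
SCORES = {
    "Product A": {"mixed_workload": 5, "columnar_io": 2, "low_admin": 2},
    "Product B": {"mixed_workload": 2, "columnar_io": 5, "low_admin": 4},
}

# Hypothetical per-use-case weights; each set of weights sums to 1.0.
USE_CASES = {
    "concurrent mixed workload": {
        "mixed_workload": 0.7, "columnar_io": 0.1, "low_admin": 0.2,
    },
    "wide-table column scans": {
        "mixed_workload": 0.1, "columnar_io": 0.7, "low_admin": 0.2,
    },
}


def weighted_score(scores, weights):
    """Sum of score * weight over every criterion named in the weights."""
    return sum(scores[criterion] * w for criterion, w in weights.items())


def rank(weights):
    """Product names ordered from highest to lowest weighted score."""
    return sorted(
        SCORES,
        key=lambda product: weighted_score(SCORES[product], weights),
        reverse=True,
    )


def winners():
    """Top-ranked product for each hypothetical use case."""
    return {name: rank(weights)[0] for name, weights in USE_CASES.items()}


if __name__ == "__main__":
    for name, weights in USE_CASES.items():
        print(name, "->", rank(weights))
```

With these invented numbers, the same score matrix puts Product A first for one use case and Product B first for the other, which is why a single overall ranking tells a buyer little.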
More excusable is some terminological confusion in the Forrester Wave for Enterprise Data Warehouse Platforms, the essence of which is this:
Notwithstanding its name, the Forrester Wave for Enterprise Data Warehouse Platforms isn’t just talking about what are called enterprise data warehouses (EDWs), but rather a broader range of analytic database management systems and use cases. These include:
- What are classically called operational data stores (the focus on “Next-Best Actions” suggests those are included).
- Analytic platforms/analytic computing systems (the high-level mentions of MapReduce, predictive modeling integration, and so on suggest they’re in too).
- Reporting data marts (some of the vendors cited might not make the minimum count threshold unless those are included too).
Indeed, the definition provided of “EDW” basically boils down to “runs SQL, is tuned in some way for analytics, has a cost-based or other query optimizer, and isn’t tied to a specific application.”
Frankly, I think classical EDWs have their problems, and are not necessarily the best way to address the numerous use cases for analytic DBMS technology. And product category names are commonly problematic anyhow. So I don’t much mind this overloading of the EDW term. But in one respect I think the Forrester Wave overdoes its inclusiveness: it includes things that aren’t actually DBMS, at least when those things are sold by SAP, and then marks down just about every product cited for being a real DBMS rather than some sort of above-DBMS layer. I’ve never agreed with the idea that SAP’s BW/BWA products should be included in a comparison with the other products cited in the Forrester Wave at all, and SAP HANA doesn’t change my mind.
One last thing: I’m suspicious of the Forrester Wave for Enterprise Data Warehouse Platforms’ comments on data warehouse appliance prices. However, they are hard to judge without knowing whether Forrester is using the term “raw data” in its usual sense or actually means “user data”, and also without knowing whether Forrester is talking about list or “street” pricing.