There are at least 16 different vendors offering appliances and/or software that do database management primarily for analytic purposes.* That’s a lot to keep up with. So I’ve thrown together a little overview of the analytic data management landscape, liberally salted with links to information about specific vendors, products, or technical issues. In some ways, this is a companion piece to my prior post about data warehouse appliance myths and realities.
*And that’s just the tabular/alphanumeric guys. Add in text search and you run the total a lot higher.
Numerous data warehouse specialists offer traditional row-based relational DBMS architectures, but optimize them for analytic workloads. These include Teradata, Netezza, DATAllegro, Greenplum, Dataupia, and SAS. All of those except SAS are wholly or primarily vendors of MPP/shared-nothing data warehouse appliances. EDIT: See the comment thread for a correction re Kognitio.
Numerous data warehouse specialists offer column-based relational DBMS architectures. These include Sybase (with the Sybase IQ product, originally from Expressway), Vertica, ParAccel, Infobright, Kognitio (formerly White Cross), and Sand. Their products are generally available in software-only formats, although Vertica and ParAccel package their offerings as appliances too.
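To make the row-based vs. column-based distinction concrete, here is a toy Python sketch. It is not how any of the vendors above actually store data; the table, the field names, and the `values_read` counter are all made up for illustration. It only shows why an aggregate over one column touches fewer values when each column is stored contiguously.

```python
# Toy illustration only: real engines add compression, indexing, and parallelism.
# The table and its field names are hypothetical.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 30.0},
]

# Row-based layout: all fields of a record are stored together.
row_store = [tuple(r.values()) for r in rows]

# Column-based layout: all values of a field are stored together.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def total_amount_row_store(store):
    """Sum the amount field, counting every value a full-record scan reads."""
    values_read = 0
    total = 0.0
    for record in store:
        values_read += len(record)  # the whole record is read to get one field
        total += record[2]          # amount is the third field
    return total, values_read


def total_amount_column_store(store):
    """Sum the amount field, reading only the amount column."""
    column = store["amount"]
    return sum(column), len(column)


print(total_amount_row_store(row_store))
print(total_amount_column_store(column_store))
```

Both functions return the same total, but the row-store scan reads every field of every record, while the column-store scan reads only the one column it needs. With a few dozen columns and a query that uses three of them, that gap is the basic argument for columnar analytics.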
There are some array-based MOLAP (Multidimensional OnLine Analytical Processing) systems left. But the major ones are all now at Oracle, Microsoft, and IBM. Essbase wound up at Oracle, via the Hyperion acquisition. Express went to Oracle long ago, and got tightly integrated into the Oracle DBMS. Microsoft Analysis Services contains a MOLAP engine federated to Microsoft SQL Server. Applix’s memory-centric TM1 went to Cognos, which had a couple of other MOLAP engines as well; Cognos is being bought by IBM.
There aren’t any star-schema specialists of note left. Most of them (actually just two, namely Red Brick and Stanford Technology Group) merged into Informix a decade ago. Informix was later bought (in two stages) by IBM. Star schemas are now just a feature of general-purpose systems.
Of course, every general-purpose relational database management system can be used for a lot of analytic purposes. That’s the whole reason Codd introduced the relational model. What’s more, the leading SMP/shared-everything DBMS – Oracle, DB2 mainframe, and to a lesser extent Microsoft SQL Server – can be used even for very large databases, if you partition carefully and write your SQL code accordingly.
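As a rough picture of what "partition carefully and write your SQL code accordingly" buys you, here is a small Python sketch of partition pruning. It is an assumption-laden toy, not any vendor's implementation: the monthly partitioning scheme, the `sales` data, and the function names are all hypothetical. The idea is just that a query which filters on the partitioning key only has to scan the matching partition.

```python
from collections import defaultdict
from datetime import date


def partition_key(sale_date):
    """Hypothetical scheme: one partition per calendar month."""
    return (sale_date.year, sale_date.month)


def build_partitions(rows):
    """Group (sale_date, amount) rows into monthly partitions."""
    partitions = defaultdict(list)
    for sale_date, amount in rows:
        partitions[partition_key(sale_date)].append((sale_date, amount))
    return partitions


def monthly_total(partitions, year, month):
    """Total for one month, scanning only that month's partition."""
    scanned = partitions.get((year, month), [])
    return sum(amount for _, amount in scanned), len(scanned)


# Made-up data for illustration.
sales = [
    (date(2007, 10, 3), 100),
    (date(2007, 10, 19), 250),
    (date(2007, 11, 2), 40),
    (date(2007, 12, 24), 500),
]
partitions = build_partitions(sales)
print(monthly_total(partitions, 2007, 10))
```

The second value returned is the number of rows scanned. A query that filters on something other than the partitioning key would have to walk every partition, which is why the SQL has to be written with the partitioning scheme in mind.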
That’s 14 vendors already, without mentioning Calpont (hasn’t briefed me recently enough), HP (ditto, and partly working through Vertica), Sun (working through Greenplum and ParAccel), Attivio, the memory-centric engines of BI vendors such as QlikTech and SAP (not exactly database management), or the complex event/stream processing vendors such as Coral8, StreamBase, or Progress Apama (ditto). Methinks there’s some consolidation ahead.
Yet more links:
- Why Oracle and Microsoft are losing in VLDB data warehousing
- Three ways Oracle and Microsoft could catch up in MPP data warehousing
- IBM is oddly weak in the data warehouse market
- Some very big Teradata sites
- Extensive and overlapping coverage of Netezza, Vertica, database compression, and column-oriented database architectures.
- DATAllegro as an exemplar of non-proprietary index-light MPP data warehouse appliances
- An old article on Oracle’s integration of Express.