December 14, 2007

A quick survey of data warehouse management technology

There are at least 16 different vendors offering appliances and/or software that do database management primarily for analytic purposes.* That’s a lot to keep up with, so I’ve thrown together a little overview of the analytic data management landscape, liberally salted with links to information about specific vendors, products, or technical issues. In some ways, this is a companion piece to my prior post about data warehouse appliance myths and realities.

*And that’s just the tabular/alphanumeric guys. Add in text search and you run the total a lot higher.

Numerous data warehouse specialists offer traditional row-based relational DBMS architectures, but optimize them for analytic workloads. These include Teradata, Netezza, DATAllegro, Greenplum, Dataupia, and SAS. All of those except SAS are wholly or primarily vendors of MPP/shared-nothing data warehouse appliances. EDIT: See the comment thread for a correction re Kognitio.

Numerous data warehouse specialists offer column-based relational DBMS architectures. These include Sybase (with the Sybase IQ product, originally from Expressway), Vertica, ParAccel, Infobright, Kognitio (formerly White Cross), and Sand. Their products are generally available in software-only formats, although Vertica and ParAccel package their offerings as appliances too.

There are some array-based MOLAP (Multidimensional OnLine Analytical Processing) systems left. But the major ones are all now at Oracle, Microsoft, and IBM. Essbase wound up at Oracle, via the Hyperion acquisition. Express went to Oracle long ago, and got tightly integrated into the Oracle DBMS. Microsoft Analysis Services contains a MOLAP engine federated to Microsoft SQL Server. Applix’s memory-centric TM1 went to Cognos, which had a couple of other MOLAP engines as well; Cognos is being bought by IBM.

There aren’t any star-schema specialists of note left. Most of them – actually just two, namely Red Brick and Stanford — merged into Informix a decade ago. Informix was later bought (in two stages) by IBM. Star schemas are now just a feature of general-purpose systems.

Of course, every general-purpose relational database management system can be used for a lot of analytic purposes. That’s the whole reason Codd introduced the relational model. What’s more, the leading SMP/shared-everything DBMS – Oracle, DB2 mainframe, and to a lesser extent Microsoft SQL Server – can be used even for very large databases, if you partition carefully and write your SQL code accordingly.

That’s 14 vendors already, without mentioning Calpont (hasn’t briefed me recently enough), HP (ditto, and partly working through Vertica), Sun (working through Greenplum and ParAccel), Attivio, the memory-centric engines of BI vendors such as QlikTech and SAP (not exactly database management), or the complex event/stream processing vendors such as Coral8, StreamBase, or Progress Apama (ditto). Methinks there’s some consolidation ahead.

Comments

11 Responses to “A quick survey of data warehouse management technology”

  1. Paul Groom on December 14th, 2007 12:18 pm

    Curt, Kognitio WX2 (formerly WhiteCross) is a standard row-based relational database similar to Netezza, Datallegro, Teradata, etc. The differentiation from these other vendors is that WX2 is a software-based product that will run on commodity x86 servers running Linux, i.e. a virtual data warehouse appliance. WX2 has always been row-based for simplicity (in an MPP architecture) and for high-performance scalable loading.

    Paul Groom
    Director, Business Intelligence
    Kognitio

  2. Curt Monash on December 14th, 2007 12:34 pm

    Oh, crumb. I’m sorry about that.

    I should have checked my own post from last year. http://www.dbms2.com/2006/10/05/introduction-to-kognitio-wx-2/

    CAM

  3. Rich T on December 14th, 2007 2:33 pm

    No DB2 on AIX? Are you serious? Gartner continues to put DB2/AIX in the top right-hand corner of its quadrants.

  4. Curt Monash on December 14th, 2007 4:00 pm

    Fair enough. I didn’t say what I could or should about DB2. DB2 mainframe is another shared-everything system. DB2 on open systems — in practice, that means AIX — is in theory a solid MPP/shared-nothing system, with the BCUs playing a somewhat appliance-like role.

    As I said in http://www.dbms2.com/2007/10/05/the-four-horsemen-of-data-warehousing/ and http://www.dbms2.com/2007/10/09/another-firm-that-never-sees-db2-in-data-warehousing/ , it’s pretty surprising how little data warehouse traction DB2 has, given DB2’s architecture as per http://www.dbms2.com/2006/10/03/ibm-and-teradata-too/ .

    My comments about the Gartner MQ are summed up in http://www.dbms2.com/2007/10/19/gartner-2007-magic-quadrant-for-data-warehouse-database-management-systems/ (2007) and http://www.dbms2.com/2006/10/03/vendor-segmentation-for-data-warehouse-dbms/ (2006).

    CAM

  5. Curt Monash on December 23rd, 2007 6:32 pm

    Wait a moment — was I also wrong when I wrote that Kognitio “relies on compressed bitmaps” for data access?

    CAM

  6. Seth Grimes on January 8th, 2008 8:37 pm

    Curt, with regard to “compressed bitmaps,” and not knowing much about Kognitio, two things:

    – Why bother to compress a bitmap? It’s already compact, and I’d think that the overhead in compression/decompression wouldn’t be worth the space savings.

    – I believe that column stores typically don’t rely on indexes. That’s one reason they have fast load times.

  7. Curt Monash on January 9th, 2008 2:22 am

    1. My confusion was to remember the bitmaps and think that Kognitio was actually a column store. In part, it’s a distinction without a difference. Bitmaps have the same updating issues column stores do.

    2. “Compression” in bitmaps comes into play in at least two ways. One is sparsity. In a naive bitmap, if the cardinality of a column is N, then 1/N of the entries will be 1 and (N-1)/N of them will be 0. That’s food for sparsity compression.

    Second, that’s not how bitmaps really are implemented. If cardinality is 1024, there aren’t 1024 columns of bits implemented. Rather, numbers are assigned from 0 to 1023, and those are represented in 10 columns of bits. I.e., bitmaps and dictionary/tokenized compression are pretty much the same thing these days, with “bitmap” being a somewhat antiquated term.

    CAM
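    The two points about bitmap "compression" in the comment above can be illustrated with a minimal Python sketch. It is not a description of how Kognitio or any other product actually stores data; the function names and the tiny example column are hypothetical, and run-length encoding is used only as one simple stand-in for sparsity compression.

```python
def one_hot_bitmaps(values, cardinality):
    """Naive bitmap index: one bit vector per distinct value (cardinality N -> N vectors)."""
    return [[1 if v == code else 0 for v in values] for code in range(cardinality)]


def run_length_encode(bits):
    """Collapse a bit vector into [bit, run_length] pairs; long runs of 0s shrink well."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return runs


def bit_sliced(values, cardinality):
    """Store each value's integer code in just enough bit columns to cover 0..cardinality-1."""
    width = max(1, (cardinality - 1).bit_length())
    return [[(v >> i) & 1 for v in values] for i in range(width)]


def decode_bit_sliced(slices):
    """Rebuild the integer codes from the bit columns (least significant column first)."""
    n_rows = len(slices[0])
    return [sum(slices[i][row] << i for i in range(len(slices))) for row in range(n_rows)]


if __name__ == "__main__":
    column = [3, 0, 3, 1, 2, 3, 3, 0]   # hypothetical dictionary codes, cardinality 4
    naive = one_hot_bitmaps(column, 4)
    print(len(naive), "naive bit vectors")
    print(run_length_encode(naive[1]))   # the mostly-zero vector for code 1
    sliced = bit_sliced(column, 4)
    print(len(sliced), "bit columns in the sliced form")
    print(decode_bit_sliced(sliced) == column)
```

    With a cardinality of 1024, the naive form needs 1024 bit vectors while the sliced form needs 10, which is the sense in which bitmaps and dictionary/tokenized compression converge.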

  8. MMIT on February 26th, 2008 1:43 pm

    Hi:
    I have worked pretty extensively with DB2 ESE/EEE, and somewhat with both Greenplum and WX2. The latter two are marketing themselves as high-end data warehouse MPP databases. Can any of you please explain to me why they are superior to DB2? I do not see any difference in their architecture! All three are MPPs. One of them is very proven software, another runs on Postgres, and the third is homemade.

    Performance-wise, no one would be able to beat DB2, as it runs on P series hardware with POWER6 CPUs running at 4 GHz.

    I am wondering why IBM cannot position db2 as a competitor of Teradata and why they are going after Oracle.
    Thanks

  9. Curt Monash on February 26th, 2008 6:01 pm

    DB2 absolutely sounds like it has a good architecture. I don’t know what their problem is either. Perhaps, as in the special case of Viper, the practical implementation doesn’t live up to theory?

    I would take issue with you on one thing — DB2 rightfully competes BOTH with Teradata and Oracle.

    CAM

  10. DBMS2 — DataBase Management System Services » Blog Archive » Positioning the data warehouse appliances and specialty DBMS on April 25th, 2008 12:11 am

    [...] A quick survey of data warehouse management technology [...]

  11. Kognitio WX2 overview | DBMS2 -- DataBase Management System Services on September 5th, 2008 4:15 am

    [...] execs Paul Groom and John Thompson. Hopefully I can now clear up some confusion that was created in this comment thread. (Most of what I wrote about Kognitio in October, 2006 still applies.) Here are some [...]
