August 4, 2009

PAX Analytica? Row- and column-stores begin to come together

Column-store proponents are prone to argue, in effect, that the only reason to implement an analytic DBMS with row-based storage is laziness. Their case generally runs along the lines:

Pushbacks to this argument from row-based vendors include:

plus generous dollops of:

(OK, I made that last one up, but I do hear the other claims frequently.)

However, there are at least two ways in which row- and column-stores are beginning to come together. First, there are lots of rumors about row-store vendors bringing out column-store options, even beyond the recent Ingres/VectorWise announcement. (But anything I may know about same beyond noticing the rumors fly by is surely under NDA.) Second, column-store vendors Vertica and VectorWise are bringing out a kind of row/column hybrid storage option.

Vertica 3.5 introduces what Vertica calls “FlexStore.” A key part of FlexStore is the ability to store data not just in pure columnar format, but also to group columns together in what amounts to sub-rows. This is advantageous when the grouped columns are retrieved together and, I presume, when they are updated. There’s a tradeoff in giving up column stores’ compression advantages, however, and use of this feature is not recommended for columns that are frequently retrieved independently. Vertica also notes that since it typically uses 1-megabyte block sizes, any table smaller than that shouldn’t be broken into columns at all.
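To make the column-grouping tradeoff concrete, here is a minimal Python sketch. It is only an illustration of the general idea of storing some columns as sub-rows; it is not Vertica's implementation or syntax, and the names `store_grouped` and `containers_touched` are hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Tuple[Any, ...]
Storage = Dict[Tuple[str, ...], List[Row]]


def store_grouped(rows: Sequence[Row],
                  column_names: Sequence[str],
                  groups: Sequence[Sequence[str]]) -> Storage:
    """Store each column group as its own list of sub-rows.

    A group with one column behaves like a pure column; a group with
    several columns behaves like a narrow row store. (Hypothetical layout,
    assumed for illustration only.)
    """
    position = {name: i for i, name in enumerate(column_names)}
    storage: Storage = {}
    for group in groups:
        idx = [position[name] for name in group]
        storage[tuple(group)] = [tuple(row[i] for i in idx) for row in rows]
    return storage


def containers_touched(storage: Storage,
                       wanted: Sequence[str]) -> List[Tuple[str, ...]]:
    """List the groups a query must read to get the `wanted` columns."""
    need = set(wanted)
    return [group for group in storage if need & set(group)]
```

With columns `id`, `bid`, `ask`, and `note`, grouping `bid` and `ask` means a query that wants both reads one container instead of two. A query that wants only `bid` still has to read the `(bid, ask)` sub-rows, which is why grouping is a poor fit for columns that are frequently retrieved independently.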

VectorWise, of course, doesn’t have a product right now, but has gotten a bunch of recent publicity around the column-store product it plans to ship via its partner Ingres in 2010. When I asked Peter Boncz about row/column hybridization inside VectorWise (not federating between Ingres and VectorWise, but rather truly within VectorWise), he said one of the storage options was PAX, and pointed me at a 2001 paper by a group of academics that includes the ubiquitous Dave DeWitt. PAX turns out to stand, in creative spelling, for Partition Attributes Across.

The PAX idea is to store as many rows of data as can fit into a block, but within the block store them in columns. This preserves some of the compression and cache-efficiency benefits of column stores, while still allowing a whole row to be brought back by reading a single block. (I think Vertica’s FlexStore does something similar to this, but I’m not sure.)
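The PAX layout described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not code from the PAX paper or from VectorWise: a "block" is a dict with one list per column, the block capacity is a fixed row count (`rows_per_page`) rather than a byte size, and all function names are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Sequence, Tuple

Row = Tuple[Any, ...]
Page = Dict[str, List[Any]]


def build_pax_pages(rows: Sequence[Row],
                    column_names: Sequence[str],
                    rows_per_page: int) -> List[Page]:
    """Pack rows into pages; inside each page keep one list per column."""
    pages: List[Page] = []
    for start in range(0, len(rows), rows_per_page):
        chunk = rows[start:start + rows_per_page]
        pages.append({name: [row[i] for row in chunk]
                      for i, name in enumerate(column_names)})
    return pages


def fetch_row(pages: Sequence[Page],
              column_names: Sequence[str],
              rows_per_page: int,
              row_id: int) -> Row:
    """Reassemble a whole row while touching exactly one page."""
    page = pages[row_id // rows_per_page]
    offset = row_id % rows_per_page
    return tuple(page[name][offset] for name in column_names)


def scan_column(pages: Sequence[Page], name: str) -> Iterator[Any]:
    """Scan one column, reading only that column's values in each page."""
    for page in pages:
        yield from page[name]
```

In this model every value of a row lives in the same page, so `fetch_row` never needs a second block, while `scan_column` walks contiguous values of a single column within each page, which is where the cache-efficiency and compression benefits would come from.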

Further confusing things, Peter Boncz of VectorWise told me VectorWise can support “any hybrid” of columnar storage and PAX.

Bottom line: The distinction between row- and column-stores isn’t going to go away any time soon, but it is at least beginning to blur a bit.

Comments

4 Responses to “PAX Analytica? Row- and column-stores begin to come together”

  1. VectorWise, Ingres, and MonetDB | DBMS2 -- DataBase Management System Services on August 4th, 2009 6:43 am

    [...] VectorWise, the product, will be an open-source columnar analytic DBMS. (But that’s not quite true. Pending productization, it’s more accurate to call the VectorWise technology a row/column hybrid.) [...]

  2. The future of the database is… plaid? — Too much information on September 2nd, 2009 9:44 am

    [...] Curt Monash recently noted there are a couple of approaches emerging to hybrid row/column [...]

  3. Oracle Exadata Hybrid Columnar Compression | DBMS2 -- DataBase Management System Services on September 3rd, 2009 5:33 am

    [...] sounds a whole lot like PAX. Specifically, in Oracle’s case I would guess “hybrid columnar compression” [...]

  4. This and that | DBMS2 -- DataBase Management System Services on December 29th, 2009 5:15 am

    [...] Vertica offers a post on its 3.5 release, with a riff on the popular theme “We’ve fixed some weaknesses in our prior versions that we didn’t previously say we had.” More important, Vertica is pretty clear on the virtues of its hybrid columnar architecture. [...]
