The following is by Mike Stonebraker, CTO of Vertica Systems, copyright 2007, as part of our ongoing discussion of data compression. My comments are in a separate post.
Row Store Compression versus Column Store Compression
There are three aspects of space requirements, which we discuss in this short note, namely:
structural space requirements
index space requirements
attribute space requirements.
II Structural Space Requirements
In this section we discuss slotted pages, headers, round-up, and empty space.
All row stores that we are familiar with use a slotted page structure. Hence, records are identified by a (page_number, slot_number) pair. At a fixed location on the page there is a slot table, which in turn points to the actual record. The advantage of a slotted page structure is that a record can be moved within the page (say, when an update makes it wider) without changing its record id or moving it to a different page. However, a slotted page architecture costs a level of indirection and the extra space that goes with it.
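The indirection can be sketched in a few lines. This is an illustrative toy, not any vendor's actual page layout; the 4K page size and the Python list standing in for the slot table are assumptions:

```python
PAGE_SIZE = 4096  # typical block size, per the figure mentioned later in this note

class SlottedPage:
    """Toy slotted page: a slot table at the front, records packed at the back."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.slots = []            # slot i -> (offset, length) of record i
        self.free_end = PAGE_SIZE  # records grow downward from the page end

    def insert(self, record: bytes) -> int:
        """Store a record; its slot number stays valid even if the record moves."""
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1   # record id = (page_number, this slot)

    def fetch(self, slot: int) -> bytes:
        """Follow the slot-table indirection to the record's current bytes."""
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])
```

Every lookup pays the extra hop through the slot table, and each slot entry consumes space beyond the record itself – the indirection cost described above.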
Vertica does not use a slotted page structure, and hence does not pay this cost.
In addition, every row store we are familiar with has a “header” on the front of each record. Typical items in this header include:
An immutable tuple identifier (to identify a record if it moves to a different page)
A bit map indicating the non-null fields in the record (so nulls do not have to be stored)
A pointer to the first byte of each variable length field (variable length records are exactly that – variable length)
Although the length of this header depends on the number of variable length fields, typical values are 16-32 bytes. To discover what this overhead is for your favorite DBMS, try the following test.
Generate 1 million integers (say, the values 1 through 1,000,000) and store them as 4-byte integers in a one-column table. Without compression, the data occupies 4 Mbytes. Any amount over this value is structural overhead. For example, a DBMS with a 16-byte header would occupy 20 Mbytes.
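The arithmetic of the test is easy to sketch; the 16-byte header here is the assumed value from the example, not a measured figure for any particular DBMS:

```python
N = 1_000_000   # rows in the one-column table
PAYLOAD = 4     # bytes of real data per row (one 4-byte integer)
HEADER = 16     # assumed per-record header; real values vary by DBMS

payload_bytes = N * PAYLOAD             # 4,000,000 bytes of actual data
stored_bytes = N * (PAYLOAD + HEADER)   # 20,000,000 bytes on disk

print(stored_bytes / payload_bytes)     # 5.0 -- a 5x structural blow-up
```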
Vertica has no header and no structural overhead. Moreover, since it automatically applies data compression, the storage cost of this table is way under 4 Mbytes.
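One way the well-under-4-Mbytes claim becomes plausible: a sorted integer column can be delta-encoded and then run-length encoded. This sketch is illustrative only and is not Vertica's actual encoding:

```python
from itertools import groupby

# A sorted column of 1,000,000 4-byte integers: 4 Mbytes uncompressed.
values = list(range(1, 1_000_001))

# Delta encoding: keep the first value, then store only successive gaps.
deltas = [b - a for a, b in zip(values, values[1:])]

# Run-length encode the gaps; a dense ascending column collapses completely.
runs = [(gap, sum(1 for _ in group)) for gap, group in groupby(deltas)]

print(values[0], runs)   # 1 [(1, 999999)]
```

The entire 4-Mbyte column reduces to a starting value plus a single (gap, count) pair; real data compresses less dramatically, but sorted columns of similar values are exactly where such schemes shine.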
Most row stores we are familiar with pad entries to byte or word boundaries. Hence, there is empty space left inside each tuple. The rationale for this decision is that the CPU cost of shifting unaligned data in main memory during processing was judged to outweigh the cost of the wasted space in the tuple.
Obviously a column store does not store tuples contiguously, and hence does not pay this cost.
Lastly, there is inevitably empty space left on a storage block (typically 4K in current systems). This results from many causes, including:
Failure to reuse the space for deleted records
Failure to reuse space from records which are updated to become shorter
The inability to pack the page completely full
The roughly 1/3 of each page that is empty in B-tree structures
In contrast, Vertica packs every 64K block full.
In summary, there is substantial structural overhead in a row store which need not be paid by a column store. Although a row store can be changed to lower the amount of structural overhead, this requires extensive modifications, and will not be accomplished by merely compressing the attribute data.
III Index Space Requirements
Most major row stores support B-tree indexing, which can be either clustered (tuples are approximately in index order) or unclustered (tuples are not approximately in index order). Indexes are dense, i.e. there is one index entry for each record. The typical implementation stores (key_value, record_id) pairs in an auxiliary structure. A sophisticated implementation (which is not the norm at the present time) would “delta compress” the key values: because they are in sorted order, each key_value shares a prefix with the previous key_value, and that prefix does not need to be stored again. However, the record_id is typically 8 bytes and is not easily compressed. Moreover, it is difficult to pack index pages tightly because of insertions and deletions.
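The delta compression of sorted keys can be sketched as follows; the string keys are made up for illustration:

```python
def prefix_compress(sorted_keys):
    """Replace each sorted key with (shared-prefix length, remaining suffix)."""
    out, prev = [], ""
    for key in sorted_keys:
        n = 0
        while n < min(len(prev), len(key)) and prev[n] == key[n]:
            n += 1                      # length of prefix shared with predecessor
        out.append((n, key[n:]))        # only the unshared suffix is stored
        prev = key
    return out

compressed = prefix_compress(["smith", "smithson", "smyth", "snow"])
print(compressed)   # [(0, 'smith'), (5, 'son'), (2, 'yth'), (1, 'now')]
```

Note that this helps only the key half of each index entry; the record_id half remains as incompressible as the text above says.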
Hence, the cost of indexing is significant. Moreover, data compression (to be discussed next) does nothing to help the space requirements of indexes.
In contrast Vertica stores materialized views, not indexes. Every materialized view is in some sorted order and is indexed. However, the index is sparse (one index entry per 64K block). Moreover, record_ids are never stored, because of the Vertica architecture. Hence, indexing overhead is dramatically reduced, compared to a row store.
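With one entry per block, a lookup needs only the first key of each block. A minimal sketch of such a sparse index, with invented keys and block numbers, and not Vertica's actual structure:

```python
import bisect

# One sparse entry per storage block: (first key in block, block number).
block_index = [("aardvark", 0), ("lion", 1), ("otter", 2), ("zebra", 3)]
first_keys = [key for key, _ in block_index]

def block_for(key):
    """Binary-search the sparse index for the one block that could hold `key`."""
    pos = bisect.bisect_right(first_keys, key) - 1
    return block_index[max(pos, 0)][1]

print(block_for("moose"))   # 1 -- "moose" falls between "lion" and "otter"
```

Four entries here index four full blocks of data; a dense B-tree would instead carry one entry per record in those blocks.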
Of course, there are other kinds of indexes (hashing, bit maps, etc.); we have used B-trees only as an example, and one would have to compare the specifics of each vendor’s implementation. The point is that attribute compression does not necessarily do anything for indexes.
IV Data Compression
A column store tends to be more effective at data compression than a row store. The standard wisdom in most row stores is to use block compression. Hence, a storage block is compressed using a single technique (say Lempel-Ziv or dictionary). The technique chosen then compresses all the attributes in all the columns which occur on the block. In contrast, Vertica compresses a storage block that only contains one attribute. Hence, it can use a different compression scheme for each attribute. Obviously a compression scheme that is type-specific will beat an implementation that is “one size fits all”.
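The gap can be illustrated with Python's zlib standing in for a generic "one size fits all" block compressor and run-length encoding as the type-specific scheme; the column data is invented:

```python
import zlib

# A low-cardinality, sorted column: 6,000 "CA" values then 4,000 "NY" values.
state_column = ("CA" * 6000 + "NY" * 4000).encode()

# Generic "one size fits all" block compression of the raw column bytes.
generic = zlib.compress(state_column)

# Type-specific scheme for a sorted, low-cardinality column: run-length encoding.
rle_runs = [("CA", 6000), ("NY", 4000)]   # just two (value, count) pairs

print(len(state_column), len(generic), len(rle_runs))
```

The generic compressor does well here, but the two-run RLE representation is smaller still – and, crucially, a row store's block compressor never even sees this regularity, because the state values are interleaved with every other attribute of each record.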
It is possible for a row store to use a type-specific compression scheme. However, if there are 50 attributes in a record, then it must remember the state for 50 type-specific implementations, and complexity increases significantly.
In addition, all row stores we are familiar with decompress each storage block on access, so that the query executor processes uncompressed tuples. In contrast, the Vertica executor processes compressed tuples. This results in better L2 cache locality, less main memory copying and generally much better performance.
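Operating directly on compressed data can be sketched with run-length encoding: an RLE-aware executor works per run rather than per tuple. The runs below are invented, and this is not Vertica's executor:

```python
# A column stored as run-length-encoded (value, run_length) pairs.
runs = [(5, 1000), (7, 250), (5, 4000)]

# SUM over the column: one multiply-add per run instead of 5,250 additions.
total = sum(value * count for value, count in runs)

# COUNT of tuples satisfying "value = 5": also evaluated once per run.
matching = sum(count for value, count in runs if value == 5)

print(total, matching)   # 26750 5000
```

Three runs stand in for 5,250 tuples, so both the working set and the instruction count shrink – the cache-locality and copying benefits the paragraph above describes.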
Of course, any row store vendor can rewrite its executor to run on compressed data. However, this is a rewrite – and a lot of work.
The aim of compression is superior overall performance. The best solution is a “from the ground up” architecture that simultaneously deals with:
1) Minimizing structural overhead
2) Minimizing index overhead
3) Sophisticated data compression
4) An executor that runs on compressed data
A row store that performs a “one size fits all” version of 3) is only solving part of the problem, and will be dramatically beaten by a system such as Vertica that does all four. For example, in a head-to-head, apples-to-apples comparison (both systems are storing exactly the same objects) against one of the very popular row stores with their “one size fits all” compression scheme turned on, Vertica required less than ½ of the space of the other system. It also ran more than an order of magnitude faster.