By: Will database compression change the hardware game? | DBMS2 -- DataBase Management System Services

Tue, 01 Jul 2008 08:58:05 +0000

[…] recently made a lot of posts about database compression. 3X or more compression is rapidly becoming standard; 5X+ is coming soon […]

By: Chuck

Chuck — Wed, 21 Mar 2007 14:02:11 +0000

One of the often overlooked reasons that Vertica compresses so well is that we don’t do updates in place. We can squeeze the data down to its entropy without worrying about what happens if an updated value will take more space, because the updated value gets written somewhere else.

The typical update-in-place row store could maybe compress a little better than it does today, but it could never come close to our compression schemes. Because we compress sorted data by column, we can sometimes fit millions of values into a single block. A row store needs block-level access to whole rows, so it cannot repeat this trick: the number of values of any one column in a block is the same as the number of rows in that block.
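To make the "millions of values in a block" point concrete, here is a minimal sketch using run-length encoding on a sorted column. RLE is only an illustrative assumption here; the comment does not say which schemes are actually used, and the column contents and function names are made up.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in runs for _ in range(length)]


# Hypothetical sorted column: one million rows, only three distinct values.
column = ["CA"] * 400_000 + ["NY"] * 350_000 + ["TX"] * 250_000

runs = rle_encode(column)
print(len(column), "values stored as", len(runs), "runs")
```

Because the column is sorted, each distinct value forms one long run, so a million values reduce to three pairs. The same values interleaved row by row with other columns would not form runs like this.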

This argument extends to processing as well. A row store is required to fetch the block and process its rows, an operation dominated by I/O time, so it has nothing to gain by operating on compressed data.
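As a rough illustration of what operating on compressed data can mean for a column, the sketch below aggregates directly over run-length pairs without expanding them. The encoding, the data, and the function names are assumptions for illustration, not a description of any product's internals.

```python
def sum_runs(runs):
    """Sum a run-length-encoded numeric column without decompressing it."""
    return sum(value * length for value, length in runs)


def count_where(runs, predicate):
    """Count rows whose value satisfies predicate, one test per run."""
    return sum(length for value, length in runs if predicate(value))


# Hypothetical sorted price column: one million rows held as three runs.
price_runs = [(5, 400_000), (7, 350_000), (9, 250_000)]

total = sum_runs(price_runs)
expensive = count_where(price_runs, lambda price: price >= 7)
print(total, expensive)
```

Each aggregate touches three pairs rather than a million values, so the work scales with the number of runs and not with the number of rows.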

Note: I work for Vertica.

Comments on: Compression in columnar data stores
