<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Compression in columnar data stores</title>
	<atom:link href="http://www.dbms2.com/2007/03/21/compression-in-columnar-data-stores/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2007/03/21/compression-in-columnar-data-stores/</link>
	<description>Choices in data management and analysis</description>
	<pubDate>Sun, 20 Jul 2008 01:30:14 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: Will database compression change the hardware game? &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2007/03/21/compression-in-columnar-data-stores/#comment-89429</link>
		<dc:creator>Will database compression change the hardware game? &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Tue, 01 Jul 2008 08:58:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2007/03/21/compression-in-columnar-data-stores/#comment-89429</guid>
		<description>[...] recently made a lot of posts about database compression. 3X or more compression is rapidly becoming standard; 5X+ is coming soon [...]</description>
		<content:encoded><![CDATA[<p>[...] recently made a lot of posts about database compression. 3X or more compression is rapidly becoming standard; 5X+ is coming soon [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chuck</title>
		<link>http://www.dbms2.com/2007/03/21/compression-in-columnar-data-stores/#comment-22553</link>
		<dc:creator>Chuck</dc:creator>
		<pubDate>Wed, 21 Mar 2007 14:02:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2007/03/21/compression-in-columnar-data-stores/#comment-22553</guid>
		<description>One of the often-overlooked reasons that Vertica compresses so well is that we don't do updates in place.  We can squeeze the data down to its entropy without worrying about what happens if an updated value takes more space, because the updated value gets written somewhere else.

The typical update-in-place row store could perhaps compress a little better, but it could never come close to our compression schemes.  Because we compress sorted data by column, we can sometimes fit millions of values into a block.  Since a row store needs block-level access, this trick is impossible to repeat; the number of column values in a block is the same as the number of rows in that block.

This argument extends to processing as well.  The row store is required to fetch the block and process the rows, an operation dominated by I/O time.  Thus, for a row store, there isn't anything to gain by operating on compressed data.

Note: I work for Vertica.</description>
		<content:encoded><![CDATA[<p>One of the often-overlooked reasons that Vertica compresses so well is that we don&#8217;t do updates in place.  We can squeeze the data down to its entropy without worrying about what happens if an updated value takes more space, because the updated value gets written somewhere else.</p>
<p>The typical update-in-place row store could perhaps compress a little better, but it could never come close to our compression schemes.  Because we compress sorted data by column, we can sometimes fit millions of values into a block.  Since a row store needs block-level access, this trick is impossible to repeat; the number of column values in a block is the same as the number of rows in that block.</p>
<p>This argument extends to processing as well.  The row store is required to fetch the block and process the rows, an operation dominated by I/O time.  Thus, for a row store, there isn&#8217;t anything to gain by operating on compressed data.</p>
<p>Note: I work for Vertica.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
