<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Database compression coming to the fore</title>
	<atom:link href="http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 16:57:09 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: Infology.Ru &#187; Blog Archive &#187; Database compression in DBMSs comes to the fore</title>
		<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/#comment-98582</link>
		<dc:creator>Infology.Ru &#187; Blog Archive &#187; Database compression in DBMSs comes to the fore</dc:creator>
		<pubDate>Sun, 05 Oct 2008 21:06:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=482#comment-98582</guid>
		<description>[...] Author: Curt Monash Original publication date: 2008-08-08 Translation: Oleg Kuzmenko Source: Curt Monash&#039;s blog [...]</description>
		<content:encoded><![CDATA[<p>[...] Author: Curt Monash Original publication date: 2008-08-08 Translation: Oleg Kuzmenko Source: Curt Monash&#8217;s blog [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Potter</title>
		<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/#comment-96771</link>
		<dc:creator>Robert Potter</dc:creator>
		<pubDate>Thu, 04 Sep 2008 20:33:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=482#comment-96771</guid>
		<description>Per information theory and encoding practice, compressing plaintext before encrypting it yields ciphertext that is harder to decipher, because compression reduces the variance in value frequencies and thereby obscures frequency-based patterns. Furthermore, optimally compressing first can yield better space savings than encrypting first. To my knowledge, in most business cases, for efficiency reasons, compression/decompression is done in computer memory, while encryption/decryption is performed on the disk controller during disk I/O, using a faster encryption algorithm such as DES that trades off encryption strength for speed.</description>
		<content:encoded><![CDATA[<p>Per information theory and encoding practice, compressing plaintext before encrypting it yields ciphertext that is harder to decipher, because compression reduces the variance in value frequencies and thereby obscures frequency-based patterns. Furthermore, optimally compressing first can yield better space savings than encrypting first. To my knowledge, in most business cases, for efficiency reasons, compression/decompression is done in computer memory, while encryption/decryption is performed on the disk controller during disk I/O, using a faster encryption algorithm such as DES that trades off encryption strength for speed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stuart Frost</title>
		<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/#comment-94135</link>
		<dc:creator>Stuart Frost</dc:creator>
		<pubDate>Fri, 15 Aug 2008 02:08:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=482#comment-94135</guid>
		<description>In the DATAllegro architecture, compression helps us to balance CPU and I/O better. The basic problem is that CPU power is increasing much faster than I/O bandwidth. By using some of the &#039;excess&#039; CPU power to decompress data after it&#039;s read from disk, we effectively get more I/O bandwidth without having to add extra disks, switches and fiber channel cards, thereby keeping costs lower.

To maintain these benefits with an encrypted database, we encrypt after we compress. Otherwise, the compression ratio would fall dramatically, since encrypted data doesn&#039;t compress as well.

Stuart Frost
CEO, DATAllegro</description>
		<content:encoded><![CDATA[<p>In the DATAllegro architecture, compression helps us to balance CPU and I/O better. The basic problem is that CPU power is increasing much faster than I/O bandwidth. By using some of the &#8216;excess&#8217; CPU power to decompress data after it&#8217;s read from disk, we effectively get more I/O bandwidth without having to add extra disks, switches and fiber channel cards, thereby keeping costs lower.</p>
<p>To maintain these benefits with an encrypted database, we encrypt after we compress. Otherwise, the compression ratio would fall dramatically, since encrypted data doesn&#8217;t compress as well.</p>
<p>Stuart Frost<br />
CEO, DATAllegro</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/#comment-93934</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Wed, 13 Aug 2008 22:46:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=482#comment-93934</guid>
		<description>Jason, that&#039;s a good question, and offhand I don&#039;t know the answer.  But speculating:

1.  Dictionary/token compression, which is the main form, shouldn&#039;t be much affected by encryption.  An exception would be if you&#039;re SO paranoid that you don&#039;t want to reveal anything about your distribution of values even if the values are protected, but outside of a few national security applications I don&#039;t see why that would be the case.

2. Delta compression would be problematic.  Compression after encryption would seem not to work, and compressing before encryption might rule out what are otherwise some shortcuts in getting reasonable write speed.

CAM</description>
		<content:encoded><![CDATA[<p>Jason, that&#8217;s a good question, and offhand I don&#8217;t know the answer.  But speculating:</p>
<p>1.  Dictionary/token compression, which is the main form, shouldn&#8217;t be much affected by encryption.  An exception would be if you&#8217;re SO paranoid that you don&#8217;t want to reveal anything about your distribution of values even if the values are protected, but outside of a few national security applications I don&#8217;t see why that would be the case.</p>
<p>2. Delta compression would be problematic.  Compression after encryption would seem not to work, and compressing before encryption might rule out what are otherwise some shortcuts in getting reasonable write speed.</p>
<p>CAM</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jason</title>
		<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/#comment-93916</link>
		<dc:creator>Jason</dc:creator>
		<pubDate>Wed, 13 Aug 2008 20:05:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=482#comment-93916</guid>
		<description>From what I understand, if a database is encrypted, compression is very difficult. If so, good compression would come at the expense of decreased database file security. Or do data warehouse users mostly rely on physical security of the database server hardware (i.e., in a locked-down data center) to protect against theft of data?</description>
		<content:encoded><![CDATA[<p>From what I understand, if a database is encrypted, compression is very difficult. If so, good compression would come at the expense of decreased database file security. Or do data warehouse users mostly rely on physical security of the database server hardware (i.e., in a locked-down data center) to protect against theft of data?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/#comment-93656</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Mon, 11 Aug 2008 15:17:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=482#comment-93656</guid>
		<description>Excellent point, Dominik.

Vertica&#039;s the most famous for making the claim, but I&#039;ve heard a few times now &quot;We query on compressed data all the way through; we don&#039;t have to decompress it before query execution.&quot;

Best,

CAM</description>
		<content:encoded><![CDATA[<p>Excellent point, Dominik.</p>
<p>Vertica&#8217;s the most famous for making the claim, but I&#8217;ve heard a few times now &#8220;We query on compressed data all the way through; we don&#8217;t have to decompress it before query execution.&#8221;</p>
<p>Best,</p>
<p>CAM</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dominik Slezak</title>
		<link>http://www.dbms2.com/2008/08/08/database-compression-coming-to-the-fore/#comment-93655</link>
		<dc:creator>Dominik Slezak</dc:creator>
		<pubDate>Mon, 11 Aug 2008 14:59:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=482#comment-93655</guid>
		<description>Indeed, there are numerous approaches to data compression in databases, and they can be categorized in many different ways. I would like to draw attention to the trade-off between compression ratio and (de)compression speed. Many database vendors use relatively light compression algorithms, which give relatively worse compression ratios but, on the other hand, make it possible to work on data that is not fully decompressed. Other vendors achieve better compression ratios by applying more advanced compression algorithms, which, however, require adding a more time-consuming decompression phase to the whole solution. Which strategy is better, for row and column stores, for general-purpose and data-warehouse-focused products, etc., is an interesting dilemma. I think the answer lies in the database engines’ (dis)ability to precisely identify (or heuristically predict) which pieces of data should be accessed, decompressed, and processed, and in which order. To summarize, some ability to manipulate compressed data is great. But, on top of that, the better the database can isolate and organize the data required for the query, minimizing the need for decompression, the more sophisticated the data compression techniques that may be applied.</description>
		<content:encoded><![CDATA[<p>Indeed, there are numerous approaches to data compression in databases, and they can be categorized in many different ways. I would like to draw attention to the trade-off between compression ratio and (de)compression speed. Many database vendors use relatively light compression algorithms, which give relatively worse compression ratios but, on the other hand, make it possible to work on data that is not fully decompressed. Other vendors achieve better compression ratios by applying more advanced compression algorithms, which, however, require adding a more time-consuming decompression phase to the whole solution. Which strategy is better, for row and column stores, for general-purpose and data-warehouse-focused products, etc., is an interesting dilemma. I think the answer lies in the database engines’ (dis)ability to precisely identify (or heuristically predict) which pieces of data should be accessed, decompressed, and processed, and in which order. To summarize, some ability to manipulate compressed data is great. But, on top of that, the better the database can isolate and organize the data required for the query, minimizing the need for decompression, the more sophisticated the data compression techniques that may be applied.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

