<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Facebook&#8217;s experiences with compression</title>
	<atom:link href="http://www.dbms2.com/2009/05/14/facebooks-experiences-with-compression/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2009/05/14/facebooks-experiences-with-compression/</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 16:57:09 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: Ashish Thusoo</title>
		<link>http://www.dbms2.com/2009/05/14/facebooks-experiences-with-compression/#comment-121924</link>
		<dc:creator>Ashish Thusoo</dc:creator>
		<pubDate>Mon, 18 May 2009 04:53:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=783#comment-121924</guid>
		<description>Ivan,

Memory usage is definitely an issue, and it typically means that we would not be able to run as many map/reduce slots in the cluster.

We are targeting this mostly for archival at this point, and there the latency requirements for compressing or decompressing this data are not that strict yet.</description>
		<content:encoded><![CDATA[<p>Ivan,</p>
<p>Memory usage is definitely an issue, and it typically means that we would not be able to run as many map/reduce slots in the cluster.</p>
<p>We are targeting this mostly for archival at this point, and there the latency requirements for compressing or decompressing this data are not that strict yet.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ivan Novick</title>
		<link>http://www.dbms2.com/2009/05/14/facebooks-experiences-with-compression/#comment-121534</link>
		<dc:creator>Ivan Novick</dc:creator>
		<pubDate>Thu, 14 May 2009 17:12:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=783#comment-121534</guid>
		<description>Also be careful about the memory usage of bzip2, which may be prohibitively high on large data sets.

gzip has 9 compression levels; level 6 seems to give the best balance between cost and data size. 6X compression seems about right for web log data.</description>
		<content:encoded><![CDATA[<p>Also be careful about the memory usage of bzip2, which may be prohibitively high on large data sets.</p>
<p>gzip has 9 compression levels; level 6 seems to give the best balance between cost and data size. 6X compression seems about right for web log data.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

