<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: VectorWise, Ingres, and MonetDB</title>
	<atom:link href="http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 09:22:14 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: Ingres VectorWise technical highlights &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-171512</link>
		<dc:creator>Ingres VectorWise technical highlights &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Fri, 11 Jun 2010 11:28:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-171512</guid>
		<description>[...] caught up with me for a regrettably brief call. Peter gave me the strong impression that what I&#8217;d written in the past about VectorWise had been and remained accurate, so I focused on filling in the gaps. Highlights [...]</description>
		<content:encoded><![CDATA[<p>[...] caught up with me for a regrettably brief call. Peter gave me the strong impression that what I&#8217;d written in the past about VectorWise had been and remained accurate, so I focused on filling in the gaps. Highlights [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martin Kersten on issues in scientific data management &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-142315</link>
		<dc:creator>Martin Kersten on issues in scientific data management &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Sat, 03 Oct 2009 10:33:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-142315</guid>
		<description>[...] Martin Kersten emailed a response to my post on issues in scientific data management. With his permission, I&#8217;ve lightly edited it, and am posting it below. [...]</description>
		<content:encoded><![CDATA[<p>[...] Martin Kersten emailed a response to my post on issues in scientific data management. With his permission, I&#8217;ve lightly edited it, and am posting it below. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: HadoopDB &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-140919</link>
		<dc:creator>HadoopDB &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Sun, 20 Sep 2009 00:05:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-140919</guid>
		<description>[...] where X=2. Column-store guru Abadi has repeatedly signaled his intention to try out HadoopDB with VectorWise at the nodes instead. (Recall that VectorWise is shared-everything.) It will be interesting to see [...]</description>
		<content:encoded><![CDATA[<p>[...] where X=2. Column-store guru Abadi has repeatedly signaled his intention to try out HadoopDB with VectorWise at the nodes instead. (Recall that VectorWise is shared-everything.) It will be interesting to see [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Do hash tables work in constant time?</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-135474</link>
		<dc:creator>Do hash tables work in constant time?</dc:creator>
		<pubDate>Tue, 18 Aug 2009 14:15:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-135474</guid>
		<description>[...] Am I being pedantic? Does the time required to multiply integers on a modern machine depend on the size of the integers? It certainly does if you are using vectorization. And vectorization is used in commercial databases! [...]</description>
		<content:encoded><![CDATA[<p>[...] Am I being pedantic? Does the time required to multiply integers on a modern machine depend on the size of the integers? It certainly does if you are using vectorization. And vectorization is used in commercial databases! [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Edward</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-133890</link>
		<dc:creator>Edward</dc:creator>
		<pubDate>Wed, 05 Aug 2009 02:03:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-133890</guid>
		<description>There&#039;s a 2008 talk by Peter Boncz about the MonetDB/X100 project that illustrates principles that seem to be used by VectorWise&#039;s DBMS:

http://www.youtube.com/watch?v=yrLd-3lnZ58

Cool stuff,
E.</description>
		<content:encoded><![CDATA[<p>There&#8217;s a 2008 talk by Peter Boncz about the MonetDB/X100 project that illustrates principles that seem to be used by VectorWise&#8217;s DBMS:</p>
<p><a href="http://www.youtube.com/watch?v=yrLd-3lnZ58" rel="nofollow">http://www.youtube.com/watch?v=yrLd-3lnZ58</a></p>
<p>Cool stuff,<br />
E.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Marcin Zukowski</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-133829</link>
		<dc:creator>Marcin Zukowski</dc:creator>
		<pubDate>Tue, 04 Aug 2009 19:36:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-133829</guid>
		<description>@Daniel

One thing to note is that the opinion that working on compressed data sets is mostly useful for the major ordering columns applies only to RLE compression. As you write, in cases with large domain cardinality RLE won&#039;t do much for non-sorted data.

Still, other forms of compression can be used and data compressed with those can be analyzed without decompressing, see e.g. http://scholar.google.com/scholar?q=%22The+Implementation+and+Performance+of+Compressed+Databases.%22

m.</description>
		<content:encoded><![CDATA[<p>@Daniel</p>
<p>One thing to note is that the opinion that working on compressed data sets is mostly useful for the major ordering columns applies only to RLE compression. As you write, in cases with large domain cardinality RLE won&#8217;t do much for non-sorted data.</p>
<p>Still, other forms of compression can be used and data compressed with those can be analyzed without decompressing, see e.g. <a href="http://scholar.google.com/scholar?q=%22The+Implementation+and+Performance+of+Compressed+Databases.%22" rel="nofollow">http://scholar.google.com/scholar?q=%22The+Implementation+and+Performance+of+Compressed+Databases.%22</a></p>
<p>m.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-133810</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Tue, 04 Aug 2009 15:58:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-133810</guid>
		<description>Thanks, Marcin!

I edited in two corrections (Ph.D, CPU cycles).

Best,

CAM</description>
		<content:encoded><![CDATA[<p>Thanks, Marcin!</p>
<p>I edited in two corrections (Ph.D, CPU cycles).</p>
<p>Best,</p>
<p>CAM</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Marcin Zukowski</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-133803</link>
		<dc:creator>Marcin Zukowski</dc:creator>
		<pubDate>Tue, 04 Aug 2009 15:31:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-133803</guid>
		<description>Hi Curt,

Thank you for a nice writeup on VectorWise. While generally correct, here are some clarifications:

- the VectorWise technology belongs fully to our company (no academic institution, including CWI, can control it)

- the MonetDB open-source system originated from the PhD research of Peter Boncz under the supervision of Martin Kersten, while the VectorWise database engine is a technology generation later and came out of my own PhD (not MSc) research, supervised in turn by Peter Boncz. Other CWI group members have also made significant contributions to both projects.

- we do hope to make VectorWise technology available as early as possible, and 2010 is very possible, but please do not treat it as an official plan

- as for string compression, we use something called PDICT, which is a new, outlier-resistant form of dictionary encoding.

- as you wrote, the main thing about the compression methods in VectorWise is that they are much faster than existing methods. As for the performance, we take a few &quot;CPU cycles&quot; (not &quot;steps&quot;) per element. Links to publications with more technical info can be found at: http://www.vectorwise.com/index_js.php?page=company_origins

- the place to visit for more info on the Ingres VectorWise project is http://www.ingres.com/vectorwise

Best regards,
Marcin Zukowski</description>
		<content:encoded><![CDATA[<p>Hi Curt,</p>
<p>Thank you for a nice writeup on VectorWise. While generally correct, here are some clarifications:</p>
<p>- the VectorWise technology belongs fully to our company (no academic institution, including CWI, can control it)</p>
<p>- the MonetDB open-source system originated from the PhD research of Peter Boncz under the supervision of Martin Kersten, while the VectorWise database engine is a technology generation later and came out of my own PhD (not MSc) research, supervised in turn by Peter Boncz. Other CWI group members have also made significant contributions to both projects.</p>
<p>- we do hope to make VectorWise technology available as early as possible, and 2010 is very possible, but please do not treat it as an official plan</p>
<p>- as for string compression, we use something called PDICT, which is a new, outlier-resistant form of dictionary encoding.</p>
<p>- as you wrote, the main thing about the compression methods in VectorWise is that they are much faster than existing methods. As for the performance, we take a few &#8220;CPU cycles&#8221; (not &#8220;steps&#8221;) per element. Links to publications with more technical info can be found at: <a href="http://www.vectorwise.com/index_js.php?page=company_origins" rel="nofollow">http://www.vectorwise.com/index_js.php?page=company_origins</a></p>
<p>- the place to visit for more info on the Ingres VectorWise project is <a href="http://www.ingres.com/vectorwise" rel="nofollow">http://www.ingres.com/vectorwise</a></p>
<p>Best regards,<br />
Marcin Zukowski</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Lemire</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-133783</link>
		<dc:creator>Daniel Lemire</dc:creator>
		<pubDate>Tue, 04 Aug 2009 12:54:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-133783</guid>
		<description>&lt;i&gt;the advantages of operating on compressed data are only significant if the database stores columns in multiple sort orders each.&lt;/i&gt;

If your table has few dimensions, this makes no sense. But for high dimensional tables, it rings true. Indeed, columnar compression often comes through run-length encoding (RLE), after sorting (lexicographically). Yet, only the first few columns (in sorting order) will end up compressible by RLE after sorting them.

See for example:

Daniel Lemire, Owen Kaser, Kamel Aouiche, Sorting improves word-aligned bitmap indexes. Data &amp; Knowledge Engineering (to appear).
http://arxiv.org/abs/0901.3751
http://www.slideshare.net/lemire/all-about-bitmap-indexes-and-sorting-them

This suggests that they are not relying much on RLE. It might be that vector processing does not work well in conjunction with RLE?</description>
		<content:encoded><![CDATA[<p><i>the advantages of operating on compressed data are only significant if the database stores columns in multiple sort orders each.</i></p>
<p>If your table has few dimensions, this makes no sense. But for high dimensional tables, it rings true. Indeed, columnar compression often comes through run-length encoding (RLE), after sorting (lexicographically). Yet, only the first few columns (in sorting order) will end up compressible by RLE after sorting them.</p>
<p>See for example:</p>
<p>Daniel Lemire, Owen Kaser, Kamel Aouiche, Sorting improves word-aligned bitmap indexes. Data &amp; Knowledge Engineering (to appear).<br />
<a href="http://arxiv.org/abs/0901.3751" rel="nofollow">http://arxiv.org/abs/0901.3751</a><br />
<a href="http://www.slideshare.net/lemire/all-about-bitmap-indexes-and-sorting-them" rel="nofollow">http://www.slideshare.net/lemire/all-about-bitmap-indexes-and-sorting-them</a></p>
<p>This suggests that they are not relying much on RLE. It might be that vector processing does not work well in conjunction with RLE?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vertica&#8217;s version of MapReduce integration &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/#comment-133768</link>
		<dc:creator>Vertica&#8217;s version of MapReduce integration &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Tue, 04 Aug 2009 10:29:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=857#comment-133768</guid>
		<description>[...] VectorWise guys also told me they are looking forward to seeing how the two projects work together.   [...]</description>
		<content:encoded><![CDATA[<p>[...] VectorWise guys also told me they are looking forward to seeing how the two projects work together.   [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>

