<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Mike Stonebraker calls for the complete destruction of the old DBMS order</title>
	<atom:link href="http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Mon, 01 Mar 2010 19:09:32 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-148982</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Thu, 12 Nov 2009 16:58:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-148982</guid>
		<description>Mike,

I don&#039;t know what you&#039;re talking about, because I don&#039;t take benchmark descriptions seriously enough to read and remember them. :)  See the &quot;Benchmarking&quot; section here or search on &quot;TPC&quot; to see what I mean. :)

CAM</description>
		<content:encoded><![CDATA[<p>Mike,</p>
<p>I don&#8217;t know what you&#8217;re talking about, because I don&#8217;t take benchmark descriptions seriously enough to read and remember them. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />   See the &#8220;Benchmarking&#8221; section here or search on &#8220;TPC&#8221; to see what I mean. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>CAM</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Chen</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-148929</link>
		<dc:creator>Mike Chen</dc:creator>
		<pubDate>Thu, 12 Nov 2009 08:16:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-148929</guid>
		<description>I&#039;m pondering the validity of the TPC-C comparison in the H-Store paper. To achieve 70k transactions per second, one is required to run on a database of 200TB. There is no hope of any machine holding such a huge amount of data in memory any time soon. The paper did point out that real-world workloads don&#039;t require such a huge data set. Is TPC-C being ridiculous, or is it the assumption in the paper?</description>
		<content:encoded><![CDATA[<p>I&#8217;m pondering the validity of the TPC-C comparison in the H-Store paper. To achieve 70k transactions per second, one is required to run on a database of 200TB. There is no hope of any machine holding such a huge amount of data in memory any time soon. The paper did point out that real-world workloads don&#8217;t require such a huge data set. Is TPC-C being ridiculous, or is it the assumption in the paper?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hasso Plattner calls for in-memory OLTP column stores &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-129483</link>
		<dc:creator>Hasso Plattner calls for in-memory OLTP column stores &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Wed, 08 Jul 2009 03:33:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-129483</guid>
		<description>[...] technology. There also are strong similarities to the MPP in-memory row store project H-Store/VoltDB, although I don&#8217;t know whether Plattner would go so far as to adopt the H-Store view [...]</description>
		<content:encoded><![CDATA[<p>[...] technology. There also are strong similarities to the MPP in-memory row store project H-Store/VoltDB, although I don&#8217;t know whether Plattner would go so far as to adopt the H-Store view [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: H-Store is now VoltDB &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-126508</link>
		<dc:creator>H-Store is now VoltDB &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Mon, 22 Jun 2009 20:14:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-126508</guid>
		<description>[...] always honored more of an NDA about the H-Store project and its commercialization than I really felt obligated to, given how freely information was [...]</description>
		<content:encoded><![CDATA[<p>[...] always honored more of an NDA about the H-Store project and its commercialization than I really felt obligated to, given how freely information was [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: pawel lubczonok</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-90515</link>
		<dc:creator>pawel lubczonok</dc:creator>
		<pubDate>Sat, 12 Jul 2008 18:08:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-90515</guid>
		<description>Great, this needs to be more widely distributed. Traditional IT managers are afraid to go for anything else but Oracle, DB2 etc. Good job on the part of these vendors. It is about time to shatter their ideas and destroy taboos.

We have worked on new database platforms for the last 6 years, and we have been running enterprises on an in-memory model that shares many of the points listed in this article. From our research we concur: the relational DB model is no good. It is not only about performance but also inflexibility and the resulting introduction of unnecessary complexity.

We have gone much further than all this. Combining these DB ideas with semantics leads to all kinds of other weird and amazing properties for a DB. Radical reductions in the size of a DB are possible. Atomisation of databases increases speed, etc. To discover what is right, all the angles have to be investigated. Ultimately, a DB is just a component to keep and act on knowledge/information.

Pawel Lubczonok</description>
		<content:encoded><![CDATA[<p>Great, this needs to be more widely distributed. Traditional IT managers are afraid to go for anything else but Oracle, DB2 etc. Good job on the part of these vendors. It is about time to shatter their ideas and destroy taboos.</p>
<p>We have worked on new database platforms for the last 6 years, and we have been running enterprises on an in-memory model that shares many of the points listed in this article. From our research we concur: the relational DB model is no good. It is not only about performance but also inflexibility and the resulting introduction of unnecessary complexity.</p>
<p>We have gone much further than all this. Combining these DB ideas with semantics leads to all kinds of other weird and amazing properties for a DB. Radical reductions in the size of a DB are possible. Atomisation of databases increases speed, etc. To discover what is right, all the angles have to be investigated. Ultimately, a DB is just a component to keep and act on knowledge/information.</p>
<p>Pawel Lubczonok</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Database management system choices &#8212; overview &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-88892</link>
		<dc:creator>Database management system choices &#8212; overview &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Thu, 26 Jun 2008 07:45:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-88892</guid>
		<description>[...] to traditional shared-everything. Oracle RAC, high availability wide-area replication, and the H-Store research project all suggest that shared-everything&#8217;s dominance of high-end OLTP is at [...]</description>
		<content:encoded><![CDATA[<p>[...] to traditional shared-everything. Oracle RAC, high availability wide-area replication, and the H-Store research project all suggest that shared-everything&#8217;s dominance of high-end OLTP is at [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-77037</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Sat, 08 Mar 2008 10:04:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-77037</guid>
		<description>Dave,

The H-Store claim is that they get an order of magnitude or whatever single-processor performance advantage over disk-based systems, even for database sizes that disk-based systems can put entirely in cache. (Indeed, the starting number is more like 15X, but that&#039;s the really rose-colored view.) That&#039;s supposed to make up for any parallelization awkwardness.

The same team did C-Store/Vertica, and they followed a similar approach. I&#039;m not aware of Vertica having the same level of interprocessor data movement (or, better yet, data movement prevention) sophistication as some of its competitors. But based on their sales figures, it seems they&#039;re winning a lot of POCs even so.

Best,

CAM</description>
		<content:encoded><![CDATA[<p>Dave,</p>
<p>The H-Store claim is that they get an order of magnitude or whatever single-processor performance advantage over disk-based systems, even for database sizes that disk-based systems can put entirely in cache. (Indeed, the starting number is more like 15X, but that&#8217;s the really rose-colored view.) That&#8217;s supposed to make up for any parallelization awkwardness.</p>
<p>The same team did C-Store/Vertica, and they followed a similar approach. I&#8217;m not aware of Vertica having the same level of interprocessor data movement (or, better yet, data movement prevention) sophistication as some of its competitors. But based on their sales figures, it seems they&#8217;re winning a lot of POCs even so.</p>
<p>Best,</p>
<p>CAM</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Gudeman</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-76995</link>
		<dc:creator>Dave Gudeman</dc:creator>
		<pubDate>Sat, 08 Mar 2008 02:02:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-76995</guid>
		<description>Good answers, Daniel. I guess we both agree that we&#039;ll have to see.

I&#039;d like to expand my point about idle processors. Basically, you can always make a database application faster by throwing hardware at it. For many applications, the big modern database products don&#039;t scale very well with multiple machines, so typically you have to buy faster computers, which is expensive because price goes up much faster than computer speed. That is, to get a box that&#039;s twice as fast, you pay lots more than twice as much. What H-Store is trying to take advantage of is that buying twice as many computers of the same speed costs exactly twice as much (less if you get bulk discounts :-) ). So it&#039;s less expensive to have a database product that scales and buy several cheap computers to run it on than to buy one that doesn&#039;t scale and buy big iron.

However, if you have idle processors, that changes the equation. If your processors are idle half the time then you have to buy four computers, not just two. Suddenly the equation doesn&#039;t look quite so good for multiple computers. At that rate, you are better off going with a computer that is twice as fast, so long as it is less than four times as expensive.</description>
		<content:encoded><![CDATA[<p>Good answers, Daniel. I guess we both agree that we&#8217;ll have to see.</p>
<p>I&#8217;d like to expand my point about idle processors. Basically, you can always make a database application faster by throwing hardware at it. For many applications, the big modern database products don&#8217;t scale very well with multiple machines, so typically you have to buy faster computers, which is expensive because price goes up much faster than computer speed. That is, to get a box that&#8217;s twice as fast, you pay lots more than twice as much. What H-Store is trying to take advantage of is that buying twice as many computers of the same speed costs exactly twice as much (less if you get bulk discounts <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  ). So it&#8217;s less expensive to have a database product that scales and buy several cheap computers to run it on than to buy one that doesn&#8217;t scale and buy big iron.</p>
<p>However, if you have idle processors, that changes the equation. If your processors are idle half the time then you have to buy four computers, not just two. Suddenly the equation doesn&#8217;t look quite so good for multiple computers. At that rate, you are better off going with a computer that is twice as fast, so long as it is less than four times as expensive.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-76984</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Fri, 07 Mar 2008 22:24:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-76984</guid>
		<description>Dan,

Thanks for your comments in the first two paragraphs of 3)!

I&#039;m glad they&#039;re working on better synchronization; despite quite a bit of time talking with them I never got to exactly that point.

Where you confused me is where you seemed to be saying that a stored procedure has to have all of the following properties:

A.  No application logic.
B.  Single ACID transaction.
C.  Coarse-grained and doing a lot of work.

While that&#039;s what I read, it surely can&#039;t be exactly what you meant. ;)

Best,

CAM</description>
		<content:encoded><![CDATA[<p>Dan,</p>
<p>Thanks for your comments in the first two paragraphs of 3)!</p>
<p>I&#8217;m glad they&#8217;re working on better synchronization; despite quite a bit of time talking with them I never got to exactly that point.</p>
<p>Where you confused me is where you seemed to be saying that a stored procedure has to have all of the following properties:</p>
<p>A.  No application logic.<br />
B.  Single ACID transaction.<br />
C.  Coarse-grained and doing a lot of work.</p>
<p>While that&#8217;s what I read, it surely can&#8217;t be exactly what you meant. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>Best,</p>
<p>CAM</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Weinreb</title>
		<link>http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/comment-page-1/#comment-76878</link>
		<dc:creator>Daniel Weinreb</dc:creator>
		<pubDate>Fri, 07 Mar 2008 11:22:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-the-complete-destruction-of-the-old-dbms-order/#comment-76878</guid>
		<description>In reply to Dave Gudeman:  Your comments are very cogent.  Here&#039;s my best take on what the answers would be if you put this to the H-store guys.  (I am not one of them, so these are just my own best understanding.)

1. Idle processors: Yes, I think that&#039;s true. I&#039;m not sure it&#039;s a problem. The real measure of a system is whether it gets the latency and throughput that you want. Saying that idle processors are a drawback is like saying that not using the whole disk is a drawback. So what if it leaves processors idle? Processors are cheap.  Scaling is what matters.  (You may or may not buy this point...)

2. Moving data: Yes, that&#039;s right.  The paper calls this &quot;general transactions&quot;: the ones where the different processors need to interact during the execution of commands. It admits that these won&#039;t work well. The hope is that the great majority of the commands whose speed is critical can be done as single-node or one-shot transactions (see paper).  The degree to which this will work depends a lot on your schema and your queries.  In some cases, it&#039;s very easy to make this work.  For example, in the database of one prominent company I know, there are only two kinds of data: data that is specific to one user account, and a small amount of slow-changing configuration data.  The former can be partitioned trivially, and the latter just gets replicated everywhere. Now all transactions can run on a single node. In other applications, it&#039;s much harder to make this work.  We won&#039;t know until and unless H-store gets productized and applied to a wide variety of applications.  This is certainly a potential drawback of the H-store approach.

3. Synchronization: Yes, you&#039;re right.  H-store has a synchronization mechanism.  The one in the paper depends on guaranteed hard maximum realtime limits on network transmission, which is unfortunately not a condition obtainable in real-world circumstances.  However, the authors of the paper are working on improving this (private communication).  Assuming they do, the claim is that the overhead will be amortized over a relatively large amount of actual work per command, so that its percentage of overhead will be acceptably low.  Also, a single-node read transaction can all happen on one processor so it doesn&#039;t run into this problem, and probably the hope is that those will be relatively common.

What appears to be the big issue is that you need, as you say, to do program re-writing. You have to arrange things so that your commands to H-store are of high enough granularity (lots of actual work per command).  Furthermore, a single command always performs one ACID transaction. So you can&#039;t interleave application logic with transaction execution.  Some applications will be easy to rewrite this way and some will be hard.  Again, they&#039;ll need more experience to see how this works out.

About &quot;not having any persistent store&quot;, please read my earlier post, which I feel takes care of this issue.

About doing 5-way joins: well, it&#039;ll also depend a lot on the application.  If the five-way joins are on five tables that are generally used together, they can be co-located on single hosts, and there&#039;s nothing to stop H-store from acquiring a more sophisticated query optimizer down the line if it turns out that it&#039;s needed more often than the paper says to expect.  So I don&#039;t think this is an inherent problem with the H-store concept.

On the whole, I agree that only time will tell, and it all depends on the specifics of the particular application and database structure.</description>
		<content:encoded><![CDATA[<p>In reply to Dave Gudeman:  Your comments are very cogent.  Here&#8217;s my best take on what the answers would be if you put this to the H-store guys.  (I am not one of them, so these are just my own best understanding.)</p>
<p>1. Idle processors: Yes, I think that&#8217;s true. I&#8217;m not sure it&#8217;s a problem. The real measure of a system is whether it gets the latency and throughput that you want. Saying that idle processors are a drawback is like saying that not using the whole disk is a drawback. So what if it leaves processors idle? Processors are cheap.  Scaling is what matters.  (You may or may not buy this point&#8230;)</p>
<p>2. Moving data: Yes, that&#8217;s right.  The paper calls this &#8220;general transactions&#8221;: the ones where the different processors need to interact during the execution of commands. It admits that these won&#8217;t work well. The hope is that the great majority of the commands whose speed is critical can be done as single-node or one-shot transactions (see paper).  The degree to which this will work depends a lot on your schema and your queries.  In some cases, it&#8217;s very easy to make this work.  For example, in the database of one prominent company I know, there are only two kinds of data: data that is specific to one user account, and a small amount of slow-changing configuration data.  The former can be partitioned trivially, and the latter just gets replicated everywhere. Now all transactions can run on a single node. In other applications, it&#8217;s much harder to make this work.  We won&#8217;t know until and unless H-store gets productized and applied to a wide variety of applications.  This is certainly a potential drawback of the H-store approach.</p>
<p>3. Synchronization: Yes, you&#8217;re right.  H-store has a synchronization mechanism.  The one in the paper depends on guaranteed hard maximum realtime limits on network transmission, which is unfortunately not a condition obtainable in real-world circumstances.  However, the authors of the paper are working on improving this (private communication).  Assuming they do, the claim is that the overhead will be amortized over a relatively large amount of actual work per command, so that its percentage of overhead will be acceptably low.  Also, a single-node read transaction can all happen on one processor so it doesn&#8217;t run into this problem, and probably the hope is that those will be relatively common.</p>
<p>What appears to be the big issue is that you need, as you say, to do program re-writing. You have to arrange things so that your commands to H-store are of high enough granularity (lots of actual work per command).  Furthermore, a single command always performs one ACID transaction. So you can&#8217;t interleave application logic with transaction execution.  Some applications will be easy to rewrite this way and some will be hard.  Again, they&#8217;ll need more experience to see how this works out.</p>
<p>About &#8220;not having any persistent store&#8221;, please read my earlier post, which I feel takes care of this issue.</p>
<p>About doing 5-way joins: well, it&#8217;ll also depend a lot on the application.  If the five-way joins are on five tables that are generally used together, they can be co-located on single hosts, and there&#8217;s nothing to stop H-store from acquiring a more sophisticated query optimizer down the line if it turns out that it&#8217;s needed more often than the paper says to expect.  So I don&#8217;t think this is an inherent problem with the H-store concept.</p>
<p>On the whole, I agree that only time will tell, and it all depends on the specifics of the particular application and database structure.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

