<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; VoltDB and H-Store</title>
	<atom:link href="http://www.dbms2.com/category/products-and-vendors/h-store/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 09:21:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Soundbites: the Facebook/MySQL/NoSQL/VoltDB/Stonebraker flap, continued</title>
		<link>http://www.dbms2.com/2011/07/15/facebook-mysql-nosql-voltdb-stonebraker/</link>
		<comments>http://www.dbms2.com/2011/07/15/facebook-mysql-nosql-voltdb-stonebraker/#comments</comments>
		<pubDate>Fri, 15 Jul 2011 08:27:18 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Cache]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Clustrix]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[Data models and architecture]]></category>
		<category><![CDATA[Database diversity]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Michael Stonebraker]]></category>
		<category><![CDATA[MongoDB and 10gen]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[ScaleBase]]></category>
		<category><![CDATA[ScaleDB]]></category>
		<category><![CDATA[Schooner Information Technology]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Tokutek]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[dbShards and CodeFutures]]></category>
		<category><![CDATA[memcached]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4977</guid>
		<description><![CDATA[As a follow-up to the latest Stonebraker kerfuffle, Derrick Harris asked me a bunch of smart followup questions. My responses and afterthoughts include: Facebook et al. are in effect Software as a Service (SaaS) vendors, not enterprise technology users. In particular: They have the technical chops to rewrite their code as  needed. Unlike packaged software [...]]]></description>
			<content:encoded><![CDATA[<p>As a follow-up to the latest <a href="http://www.dbms2.com/2011/07/14/an-odd-claim-attributed-to-mike-stonebraker/">Stonebraker kerfuffle</a>, Derrick Harris asked me a bunch of smart follow-up questions. My responses and afterthoughts include:</p>
<ul>
<li>Facebook et al. are in effect Software as a Service (SaaS) vendors, not enterprise technology users. In particular:
<ul>
<li>They have the technical chops to rewrite their code as needed.</li>
<li>Unlike packaged software vendors, they&#8217;re not answerable to anybody for keeping legacy code alive after a rewrite. That makes migration a lot easier.</li>
<li>If they want to write different parts of their system on different technical underpinnings, nobody can stop them. For example &#8230;</li>
<li>&#8230;  <a href="http://www.dbms2.com/2008/07/21/project-cassandra-facebook-open-sourced-quasi-dbms/">Facebook created Cassandra</a>, and is now heavily committed to HBase.</li>
</ul>
</li>
<li>It makes little sense to talk of Facebook&#8217;s use of &#8220;MySQL.&#8221; Better to talk of Facebook&#8217;s use of &#8220;MySQL +  memcached  + non-transparent sharding.&#8221; That said:
<ul>
<li>It&#8217;s hard to see why somebody today would use MySQL +  memcached  + non-transparent sharding for a new project. At least one of <a href="http://www.dbms2.com/2011/02/08/couchbase-membase-couchone-couchdb/">Couchbase</a> or <a href="http://www.dbms2.com/2011/02/24/transparent-sharding/">transparently-sharded</a> MySQL is very likely a superior alternative. Other alternatives might be better yet.</li>
<li>As noted above in the example of Facebook, the many major web businesses that are using MySQL +  memcached  + non-transparent sharding for existing projects can be presumed able to migrate away from that stack as the need arises.</li>
</ul>
</li>
</ul>
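<p>For readers who haven&#8217;t met the term: &#8220;non-transparent&#8221; sharding means the application itself decides which database holds each row, typically with a cache in front. Here is a minimal Python sketch of that pattern. It is not Facebook&#8217;s actual code; plain dictionaries stand in for the MySQL shards and for memcached, and the names <code>shard_for</code>, <code>write_user</code>, and <code>read_user</code> are hypothetical.</p>

```python
import zlib

# Hypothetical stand-ins: plain dicts play the role of MySQL shards and memcached.
SHARDS = [dict() for _ in range(4)]
CACHE = {}


def shard_for(user_id):
    # The application, not the DBMS, decides where a row lives.
    return zlib.crc32(str(user_id).encode("utf-8")) % len(SHARDS)


def write_user(user_id, row):
    # Write to the shard the application picked, then drop any stale cached copy.
    SHARDS[shard_for(user_id)][user_id] = row
    CACHE.pop(user_id, None)


def read_user(user_id):
    # Cache-aside read: try the cache first, fall back to the owning shard.
    if user_id in CACHE:
        return CACHE[user_id]
    row = SHARDS[shard_for(user_id)].get(user_id)
    if row is not None:
        CACHE[user_id] = row  # populate the cache for later reads
    return row
```

<p>The routing and the cache invalidation both live in application code, which is the burden transparent sharding is meant to remove.</p>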
<p>Continuing with that discussion of DBMS alternatives:</p>
<ul>
<li>If you just want to write to the memcached API anyway, why not go with Couchbase?</li>
<li>If you want to go relational, why not go with MySQL? There are many alternatives for scaling or accelerating MySQL &#8212; dbShards, Schooner, Akiban, Tokutek, ScaleBase, ScaleDB, Clustrix, and Xeround come to mind quickly, so there&#8217;s a great chance that one or more will fit your use case. (And if you don&#8217;t get the choice of MySQL flavor right the first time, porting to another one shouldn&#8217;t be all THAT awful.)</li>
<li>If you really, really want to go in-memory, and don&#8217;t mind writing Java stored procedures, and don&#8217;t need to do the kinds of joins it isn&#8217;t good at, but do need to do the kinds of joins it is, VoltDB could indeed be a good alternative.</li>
</ul>
<p>And while we&#8217;re at it &#8212; going <strong>schema-free</strong> often makes a whole lot of sense. I need to write much more about the point, but for now let&#8217;s just say that I look favorably on the Big Four schema-free/NoSQL options of MongoDB, Couchbase, HBase, and Cassandra.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/07/15/facebook-mysql-nosql-voltdb-stonebraker/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>An odd claim attributed to Mike Stonebraker</title>
		<link>http://www.dbms2.com/2011/07/14/an-odd-claim-attributed-to-mike-stonebraker/</link>
		<comments>http://www.dbms2.com/2011/07/14/an-odd-claim-attributed-to-mike-stonebraker/#comments</comments>
		<pubDate>Thu, 14 Jul 2011 11:10:34 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Cache]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[Games and virtual worlds]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Michael Stonebraker]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[memcached]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4964</guid>
		<description><![CDATA[This post has a sequel. Last week, Mike Stonebraker insulted MySQL and Facebook&#8217;s use of it, by implication advocating VoltDB instead. Kerfuffle ensued. To the extent Mike was saying that non-transparently sharded MySQL isn&#8217;t an ideal way to do things, he&#8217;s surely right. That still leaves a lot of options for massive short-request databases, however, [...]]]></description>
			<content:encoded><![CDATA[<p><em>This post has a <a href="http://www.dbms2.com/2011/07/15/facebook-mysql-nosql-voltdb-stonebraker/">sequel</a>.</em></p>
<p>Last week, Mike Stonebraker <a href="http://gigaom.com/cloud/facebook-trapped-in-mysql-fate-worse-than-death/">insulted MySQL and Facebook&#8217;s use of it</a>, by implication advocating <a href="http://www.dbms2.com/2010/06/30/details-and-analysis-of-the-voltdb-argument/">VoltDB</a> instead. Kerfuffle ensued. To the extent Mike was saying that non-transparently sharded MySQL isn&#8217;t an ideal way to do things, he&#8217;s surely right. That still leaves a lot of options for massive <a href="http://www.dbms2.com/2011/03/02/short-request-processing/">short-request</a> databases, however, including <a href="http://www.dbms2.com/2011/02/24/transparent-sharding/">transparently sharded</a> RDBMS, scale-out <a href="http://www.dbms2.com/2011/05/23/databases-ram/">in-memory DBMS</a> (whether or not VoltDB*), and various NoSQL options. If nothing else, <a href="http://www.dbms2.com/2011/02/08/couchbase-membase-couchone-couchdb/">Couchbase</a> would seem superior to memcached/non-transparent MySQL if you were starting a project today.</p>
<p><em>*The big problem with VoltDB, last I checked, was its reliance on Java stored procedures to get work done.</em></p>
<p>Pleasantries continued in <em><a href="http://www.theregister.co.uk/2011/07/13/mike_stonebraker_versus_facebook/">The Register</a>,</em> which got an amazing-sounding quote from Mike. If <em>The Reg</em> is to be believed &#8212; something <a href="http://www.monashreport.com/2006/03/22/goodmail-esther-dyson-andrew-orlowski-etc/">I wouldn&#8217;t necessarily take for granted</a> &#8212; Mike claimed that he (i.e. VoltDB) knows how to solve the <strong>distributed join</strong> performance problem.  <span id="more-4964"></span></p>
<blockquote><p>So, it&#8217;s Stonebraker against the web. And the difference of option is  severe. In May, at a MongoDB developer conference in San Francisco,  Mongo creator Dwight Merriman told his audience there was &#8220;no way&#8221; to do distributed joins in a way that really scales.  &#8220;I&#8217;m not smart enough to do distributed joins that scale horizontally,  widely, and are super fast. You have to choose something else. We have  no choice but to not be relational,&#8221; he said</p>
<p>&#8220;You can do distributed transactions, but if you do them with no loss  of generality and you do them across a thousand machines, it&#8217;s not  going to be that fast.&#8221;</p>
<p>Stonebraker says precisely the opposite, and in typical fashion, he  goes right for the jugular. &#8220;I reject what Merriman says out of hand,&#8221;  he tells <em>The Register</em>. Merriman and his company, 10gen, declined  to comment for this story. But Stonebaker says words don&#8217;t matter. As  much as he likes to wield his opinions, he insists the debate will be  decided elsewhere. &#8220;Let the bake-off begin,&#8221; he crows.</p></blockquote>
<p>But when last I checked, VoltDB made nowhere near that claim. And well it shouldn&#8217;t have. In the fully general case, there&#8217;s no way to ensure super distributed join performance other than by throwing lots and lots of gear at the problem. But if you do that, many alternatives are fast. More specialized cases may be a different matter &#8212; but there are many fast alternatives for those too.</p>
<p>I imagine there will be use cases for which VoltDB sustains a lead as the truly fastest alternative, similarly-architected competitors perhaps excepted.* But what Mike supposedly said seems quite forward-leaning when compared to technical reality.</p>
<p><em>*The canonical VoltDB use case is <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">e-commerce in virtual goods</a>, the point of &#8220;virtual&#8221; being that physical inventory might necessitate costlier kinds of joins.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/07/14/an-odd-claim-attributed-to-mike-stonebraker/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Traditional databases will eventually wind up in RAM</title>
		<link>http://www.dbms2.com/2011/05/23/databases-ram/</link>
		<comments>http://www.dbms2.com/2011/05/23/databases-ram/#comments</comments>
		<pubDate>Mon, 23 May 2011 16:05:24 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Cache]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Oracle TimesTen]]></category>
		<category><![CDATA[SAP AG]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[solidDB]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4520</guid>
		<description><![CDATA[In January, 2010, I posited that it might be helpful to view data as being divided into three categories: Human/Tabular data –i.e., human-generated data that fits well into relational tables or arrays. Human/Nontabular data — i.e., all other data generated by humans. Machine-Generated data. I won&#8217;t now stand by every nuance in that post, which [...]]]></description>
			<content:encoded><![CDATA[<p>In January, 2010, I posited that <a href="http://www.dbms2.com/2010/01/17/three-broad-categories-of-data/">it might be helpful to view data as being divided into three categories</a>:</p>
<ul>
<li><strong>Human/Tabular</strong> data – i.e., human-generated data that fits well into relational tables or arrays.</li>
<li><strong>Human/Nontabular</strong> data — i.e., all other data  generated by humans.</li>
<li><strong>Machine-Generated</strong> data.</li>
</ul>
<p>I won&#8217;t now stand by every nuance in that post, which may differ slightly from those in my more recent posts about <a href="http://www.dbms2.com/2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a> and <a href="http://www.dbms2.com/2011/05/17/poly-structured-database/">poly-structured databases</a>. But one general idea is hard to dispute:</p>
<p><strong>Traditional database data</strong> &#8212; records of human transactional activity, referred to as &#8220;Human/Tabular data&#8221; above &#8212; <strong>will not grow as fast as Moore&#8217;s Law makes computer chips cheaper.</strong></p>
<p>And that point has a straightforward corollary, namely:</p>
<p><strong>It will become ever more affordable to put traditional database data entirely into RAM.</strong> <span id="more-4520"></span> </p>
<p>Actually, there are numerous ways for OLTP, other <a href="http://www.dbms2.com/2011/03/30/short-request-and-analytic-processing/">short-request</a>, and some analytic databases to wind up in RAM.</p>
<ul>
<li><a href="http://www.dbms2.com/2009/07/07/hasso-plattner-calls-for-in-memory-oltp-column-stores/">SAP has some good ideas</a> for how it could happen, banging transactions into what is essentially an in-memory analytic database. (I dispute SAP&#8217;s claims of transformational database technology leadership, but that doesn&#8217;t mean the underlying ideas aren&#8217;t good.)</li>
<li>For those who can afford the associated technology disruption, <a href="http://www.dbms2.com/2011/05/21/object-oriented-database-management-systems-oodbms/">memory-centric object-oriented DBMS</a> could be appealing.</li>
<li>Web scalability best practices commonly include keeping data in RAM (e.g., that&#8217;s pretty much the point of caching layer memcached).</li>
<li>SaaS (Software as a Service) companies &#8212; such as <a href="http://www.dbms2.com/2010/08/22/workday-technology-stack/">Workday</a> &#8212; often bring a particular tenant&#8217;s database entirely into RAM.</li>
<li><a href="http://www.dbms2.com/2010/06/12/the-underlying-technology-of-qlikview/">QlikView</a> highlights the benefits of doing business intelligence in RAM.</li>
<li><a href="http://www.dbms2.com/2011/04/21/sas-hpa-does-make-sense-after-all/">SAS HPA</a> makes the argument that even &#8220;big data analytics&#8221; should sometimes be done in RAM.</li>
<li>I don&#8217;t have particularly favorable opinions at this time about marketing strategies or momentum at <a href="http://www.dbms2.com/2008/12/29/ordinary-oltp-dbms-vs-memory-centric-processing/">Oracle TimesTen, IBM solidDB</a>, or <a href="http://www.dbms2.com/2010/06/30/details-and-analysis-of-the-voltdb-argument/">VoltDB</a>, but those examples at least serve to illustrate that memory-centric OLTP DBMS have existed for years.</li>
<li>Actually, SAP has at least two good ideas, if you count <a href="http://www.dbms2.com/2010/02/05/sybase-aleri-rap/">Sybase</a> as part of SAP.</li>
</ul>
<p>And here&#8217;s the kicker: Intel told me last year that <strong>CPUs are headed to 46-bit address spaces around mid-decade.</strong> Indeed, they hired me to help figure out if that was enough.* That multiplies out to <strong>64 terabytes of RAM on a single server,</strong> chip costs permitting. So most of what we now think of as operational databases &#8212; and many of the analytic ones too &#8212; will fit in-memory, even if they run very large businesses.</p>
<p><em>*And did so without putting the discussion under any kind of NDA.</em></p>
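<p>The arithmetic behind that 64 terabyte figure is easy to check. A minimal Python sketch, assuming byte-addressable memory and binary terabytes (2<sup>40</sup> bytes); the variable names are mine, not Intel&#8217;s:</p>

```python
ADDRESS_BITS = 46                      # address width cited above
addressable_bytes = 2 ** ADDRESS_BITS  # one byte per address (assumption)
TERABYTE = 2 ** 40                     # binary terabyte (assumption)

terabytes = addressable_bytes // TERABYTE
print(terabytes)  # prints 64
```

<p>Each additional address bit doubles the ceiling, so 47 bits would allow 128 terabytes on the same assumptions.</p>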
<p>Likely consequences of all this include:</p>
<ul>
<li><strong>Legacy apps will</strong> (eventually)<strong> be consolidated and virtualized in-memory.</strong> Their underlying databases will grow so slowly that eventually the cost of putting them in RAM will be too low to worry about.</li>
<li><strong>Expensive storage systems will </strong>(continue to)<strong> be irrelevant to database processing. </strong>Databases that don&#8217;t fit in RAM will typically be big enough to require the attention of a lot of CPUs &#8212; and in those cases the DBMS software itself will handle all the storage tasks.</li>
<li><strong>Major OLTP DBMS vendors, </strong>such as Oracle,<strong> will need alternate in-memory code lines, </strong>because disk-centric architectures are sub-optimal in-memory. Well, that&#8217;s what they have those big R&amp;D budgets for.</li>
<li><strong>SaaS vendors and web businesses may not rely on today&#8217;s major OLTP DBMS vendors.</strong> (I was going to say &#8220;won&#8217;t&#8221; rather than &#8220;may not&#8221; until I recalled the likely M&amp;A endgame.) Traditional enterprises may blanch at migrating away from their legacy DBMS environments, but the trade-offs are different for technology companies using DBMS as subsystems.</li>
</ul>
<p>Of course, the same trends that make data-storing chips cheaper will make data-generating chips cheaper too. So, just as there are huge amounts of machine-generated data that you&#8217;d never pay to store in RAM, the same will still be true 10 years from now; the data volumes involved will just be a lot bigger. And thus there will still be plenty of very large analytic databases using relatively cheap forms of storage, perhaps even disk.</p>
<p>But <strong>OLTP and other short-request processing are likely to wind up in-memory.</strong> And the same may be true for a considerable amount of <strong>analytics,</strong> especially but not only if the analytics have a low-latency requirement.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/05/23/databases-ram/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Some quick notes on HP-Vertica</title>
		<link>http://www.dbms2.com/2011/02/14/some-quick-notes-on-hp-vertica/</link>
		<comments>http://www.dbms2.com/2011/02/14/some-quick-notes-on-hp-vertica/#comments</comments>
		<pubDate>Mon, 14 Feb 2011 17:19:57 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[StreamBase]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3862</guid>
		<description><![CDATA[HP is acquiring Vertica.  Now we know (at least in part) why Vertica went oddly silent for a while. As per that same link, Vertica has &#62;250 ordinary customers, and &#62;70 more OEM sell-through ones. This is a setback for speculation about any kind of upcoming Aster/HP tie-up. Edit:  Forgot this one briefly &#8212; HP [...]]]></description>
			<content:encoded><![CDATA[<p>HP is acquiring Vertica.  <span id="more-3862"></span></p>
<ul>
<li>Now we know (at least in part) why <a href="http://www.dbms2.com/2011/02/14/now-we-know-why-vertica-has-been-so-weirdly-evasive/">Vertica went oddly silent</a> for a while.</li>
<li>As per that same link, Vertica has &gt;250 ordinary customers, and &gt;70 more OEM sell-through ones.</li>
<li>This is a setback for speculation about any kind of upcoming <a href="http://www.dbms2.com/2011/01/19/sound-bites-on-hpmicrosoft-and-neoview/">Aster/HP</a> tie-up.</li>
<li><em>Edit:  Forgot this one briefly &#8212; HP chairman Ray Lane was previously involved with Vertica.</em></li>
<li>Vertica arguably is the most mature of the modern <a href="http://www.dbms2.com/2011/02/06/columnar-compression-database-storage/">column-store DBMS</a> &#8212; i.e., the ones that don&#8217;t have their roots in bitmaps the way Sybase and SAND do.</li>
<li><a href="http://www.dbms2.com/2010/09/07/soundbites-about-mark-hurd-joining-oracle/">HP executed really badly in data warehouse DBMS and appliances</a> under former CEO Mark Hurd.</li>
<li>Unfortunately, if you&#8217;re quickly researching Vertica, neither <a href="http://www.dbms2.com/2011/02/05/gartner-magic-quadrant-data-warehouse-database-management-2010/">Gartner</a> nor <a href="http://www.dbms2.com/2011/02/11/comments-on-the-2011-forrester-wave-for-enterprise-data-warehouse-platforms/">Forrester</a> is a reliable source of detailed information.</li>
<li>It would make sense for HP to acquire <a href="http://www.dbms2.com/category/products-and-vendors/streambase/">StreamBase</a> too, and fold StreamBase into Vertica. Reasons include:
<ul>
<li>StreamBase and Vertica are aligned with each other. Both were founded by Mike Stonebraker, with overlapping groups of academic contributors. Both are in the Boston area. StreamBase and Vertica have worked together on various joint customer accounts, especially in the financial services sector.</li>
<li>Like other <a href="http://www.dbms2.com/2009/03/09/independent-cep-vendors-continue-to-flounder/">independent CEP vendors</a>, StreamBase can&#8217;t or won&#8217;t accomplish much outside certain niches (mainly financial services).</li>
<li>StreamBase reports, rather credibly, that it&#8217;s doing well in its niches. While StreamBase&#8217;s success seems to include a heavy dose of professional services, that hardly would be a deal-breaker for HP.</li>
</ul>
</li>
<li>It would make partial sense for HP to acquire <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">VoltDB</a>, and fold VoltDB into Vertica.
<ul>
<li>VoltDB was actually spun out of Vertica, and incubated in Vertica offices. A lot of thinking has already been done about how to integrate VoltDB and Vertica.</li>
<li>VoltDB needs the help, as its strategy is not attuned to the needs of succeeding in a highly competitive, rapidly innovative marketplace.</li>
<li>VoltDB doesn&#8217;t have the kind of traction on which a big company like HP could hang an acquisition case or acquisition strategy.</li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/02/14/some-quick-notes-on-hp-vertica/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Notes and links October 22, 2010</title>
		<link>http://www.dbms2.com/2010/10/22/notes-and-links-october-22-2010/</link>
		<comments>http://www.dbms2.com/2010/10/22/notes-and-links-october-22-2010/#comments</comments>
		<pubDate>Fri, 22 Oct 2010 06:47:05 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Liberty and privacy]]></category>
		<category><![CDATA[Market share and customer counts]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[SAS Institute]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[eBay]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3346</guid>
		<description><![CDATA[A number of recent posts have had good comments. This time, I won&#8217;t call them out individually. Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on &#8220;sensor data&#8221; &#8230; &#8230; and, even better, went on to [...]]]></description>
			<content:encoded><![CDATA[<p>A number of recent posts have had good comments. This time, I won&#8217;t call them out individually.</p>
<p>Evidently <a href="http://www.cscyphers.com/blog/2010/10/12/hadoop-world-2010/">Mike Olson of Cloudera is still telling the machine-generated data story</a>, exactly as he should be. The <a href="http://informationarbitrage.com/post/1359525958/big-ideas-around-big-problems-in-big-data">Information Arbitrage/IA Ventures</a> folks said something similar, focusing specifically on &#8220;sensor data&#8221; &#8230;</p>
<p>&#8230; and, even better, went on to say:  <span id="more-3346"></span></p>
<blockquote><p><strong>Privacy is dead</strong>.<br />
What do we consider to be the  boundaries of privacy, especially with respect to items like medical  data? In a data privacy-free world, should we be regulating data usage  instead? How do we deal with asymmetric access to our personal data,  e.g., how is it that insurance companies claim the right to our personal  information?</p></blockquote>
<p>Obviously, <a href="http://www.dbms2.com/2010/04/04/privacy-liberty-continued/">my answer to the second question is Yes!!!!</a></p>
<p>Also from Hadoop World &#8212; Dave Menninger, now an analyst, reports on <a href="http://www.ventanaresearch.com/blog/commentblog.aspx?id=4003">some Hadoop metrics</a>:</p>
<blockquote><p><span id="Contentblock1"><span>How big is “big data”?  In his opening remarks, Mike shared some statistics from a survey of  attendees. The average Hadoop cluster among respondents was 66 nodes and  114 terabytes of data. However there is quite a range. The largest in  the survey responses was a cluster of 1,300 nodes and more than 2  petabytes of data. (Presenters from eBay blew this away, describing  their production cluster of  8,500 nodes and 16 petabytes of storage.)  Over 60 percent of respondents had 10 terabytes or less, and half were  running 10 nodes or less.</span></span></p></blockquote>
<p><a href="http://www.dbms2.com/2010/10/06/ebay-followup-greenplum-out-teradata-10-petabytes-hadoop-has-some-value-and-more/">That eBay comment was particularly interesting</a>. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>A while back, Doug Henschen noted that Netezza flagship reference Catalina Marketing is now at <a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/07/big_data_the_ea.html#more">2.5 petabytes</a>. Most of that is in one 600 billion row table. Oddly, the article talks of the Netezza/SAS partnership accelerating model-building via in-database scoring (not modeling) technology. Doug also wrote of a lot of <a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/08/whats_at_stake.html#more">analytic DBMS replacements</a>, including:</p>
<ul>
<li>Microsoft by ParAccel</li>
<li>Oracle by Aster Data, IBM, Oracle Exadata, probably Netezza, and probably Hadoop</li>
<li>Netezza by Greenplum</li>
<li>IBM by Teradata</li>
</ul>
<p>Carl Olofson pointed out on Twitter that <a href="http://www.oracle.com/us/corporate/Acquisitions/datascaler/index.html">DataScaler was an in-memory database technology just bought by Oracle</a>. This inspired me to google on them, and I found a sparse <a href="http://www.svadventure.com/">DataScaler CEO blog</a>. I link it because of an amusing juxtaposition &#8212; the second-to-last post says, in effect, &#8220;We make appliances and we recommend all these awesome technology design partners who helped us design the hardware,&#8221; while the very last post says &#8220;Designing our own hardware was a mistake.&#8221; <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><a href="http://www.dbms2.com/2010/07/23/some-interesting-links/">Fred Holahan</a> is now VP of Marketing at <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">VoltDB</a>, which is a lesson to me about giving free consulting &#8230; Anyhow, Fred tells me that VoltDB has about a dozen users on their way to production, some of whom are headed to being VoltDB paying customers, some of whom are not.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/10/22/notes-and-links-october-22-2010/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>I&#8217;m collecting data points on NoSQL and HVSP adoption</title>
		<link>http://www.dbms2.com/2010/08/18/nosql-hvsp-adoption/</link>
		<comments>http://www.dbms2.com/2010/08/18/nosql-hvsp-adoption/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 13:09:08 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Clustrix]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Groovy Corporation]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[ScaleDB]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[Zynga]]></category>
		<category><![CDATA[dbShards and CodeFutures]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2840</guid>
		<description><![CDATA[I was asked to do a magazine article on NoSQL, where by &#8220;NoSQL&#8221; is meant &#8220;whatever they talk about at NoSQL conferences.&#8221; By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about HVSP in [...]]]></description>
			<content:encoded><![CDATA[<p>I was asked to do a magazine article on NoSQL, where by &#8220;NoSQL&#8221; is meant &#8220;whatever they talk about at NoSQL conferences.&#8221; By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about <a href="http://www.dbms2.com/2010/03/13/the-naming-of-the-foo/">HVSP</a> in general, NoSQL and SQL alike.</p>
<p>It also is understood that, realistically, I can&#8217;t be expected to know and mention the very latest news for all the many products in the categories. Even so, I think this would be a fine time to check just where NoSQL and HVSP adoption stand. Here is most of what I know, or links to same; it would be great if you guys would contribute additional data in the comment thread.</p>
<p>In the NoSQL area:  <span id="more-2840"></span></p>
<ul>
<li>Back in April, the VoltDB guys told me they thought Cassandra and HBase were the two NoSQL systems with the most momentum.</li>
<li>I know distressingly little about HBase adoption, but a source who may or may not wish to remain anonymous was kind enough to alert me that Twitter and StumbleUpon each have ~30 node deployments, for analytics and analytics/HVSP respectively.</li>
<li>I wrote in detail on <a href="http://www.dbms2.com/2010/07/06/riptano-and-cassandra-adoption/">Cassandra adoption</a> last month. News since then includes:
<ul>
<li>Facebook is rumored to have dropped Cassandra completely.</li>
<li><a href="http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html">Twitter clarified that it may not be quite as lovestruck by Cassandra as before</a>, but they&#8217;re still very close friends.</li>
<li>It&#8217;s not obvious that the <a href="http://www.riptano.com/blog/cassandra-summit-recap">Cassandra Summit</a> unveiled a lot of new adoption stories.</li>
</ul>
</li>
<li>Northscale&#8217;s <a href="http://www.dbms2.com/2010/08/18/northscale-membase-roadmap/">Membase</a> is still in its early days.  Zynga is bought in, however, as is something called NHN Korea. <em>(Edit: I subsequently saw NHN Korea on a prominent SEO expert&#8217;s list of the top half dozen or so search engines in the world. Who knew?)</em></li>
<li>Basho has listed a few <a href="http://www.basho.com/customers.html">Riak customers</a>. If memory serves (I haven&#8217;t spoken with Basho for a while, and some of my notes are misplaced due to some computer sloppiness), Basho has a few dozen customers in total.</li>
<li>Mozilla has <a href="http://blog.mozilla.com/data/2010/08/16/benchmarking-riak-for-the-mozilla-test-pilot-project/">a 4 machine, 64 core Riak cluster</a> in production.</li>
<li><a href="http://highscalability.com/hypertable-new-bigtable-clone-runs-hdfs-or-kfs">Hypertable</a> has a few users/project sponsors, Baidu being the biggest name among them.</li>
<li>I don&#8217;t really know how the MongoDB/10gen guys are doing. I think this is at least as much my fault as theirs. Anyhow, they seem to have <a href="http://www.10gen.com/news">links</a> to a couple of folks who have written about MongoDB usage.</li>
<li>NimbusDB is still in stealth mode. I&#8217;d be surprised if they had users for a while yet, since in January they didn&#8217;t yet sound as if development was very far underway. (Actually, I forget whether NimbusDB is supposed to be SQL-based or not.)</li>
</ul>
<p>Among the SQL or SQL-friendly guys:</p>
<ul>
<li><a href="http://www.dbms2.com/2010/05/12/the-clustrix-story/">Clustrix</a> says it has a few production users, some big-name, but is not disclosing them yet.</li>
<li><a href="http://www.dbms2.com/2010/07/28/dbshards/">dbShards has around 6 customers</a>, including Facebook. (Facebook may outpace even Twitter and Zynga in using the most products mentioned in this post.)</li>
<li>As of May, <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">VoltDB</a> had one paying customer, plus 150 beta customers who weren&#8217;t in production yet.</li>
<li><a href="http://www.dbms2.com/2010/04/03/akiban-highlights/">Akiban</a> says they&#8217;ll get me up to speed on Thursday. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </li>
<li><a href="http://www.dbms2.com/2008/04/13/scaledb-presents-the-revenge-of-the-pointer/">ScaleDB</a> seems to be pedaling along in perennial beta. Whether ScaleDB has any actual beta users is less clear. On the plus side, checking that out uncovered a pretty funny <a href="http://scaledb.blogspot.com/2010/04/scaledb-introduces-clustered-database.html">April Fool blog post</a>.</li>
<li><a href="http://www.dbms2.com/2009/07/30/groovy-corp-puts-out-a-ridiculous-press-release/">Groovy Corporation</a> seems to have disappeared, or morphed into something called <a href="http://www.groovycorp.com/home.html">uCirrus</a>, or something like that.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/18/nosql-hvsp-adoption/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Details and analysis of the VoltDB argument</title>
		<link>http://www.dbms2.com/2010/06/30/details-and-analysis-of-the-voltdb-argument/</link>
		<comments>http://www.dbms2.com/2010/06/30/details-and-analysis-of-the-voltdb-argument/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 14:37:37 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Michael Stonebraker]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2432</guid>
		<description><![CDATA[Todd Hoff (High Scalability blog) posted a lengthy examination of the case and use cases for VoltDB. That excellent post, in turn, is based on a Mike Stonebraker* webinar for VoltDB, for which the slide deck is happily available. It&#8217;s all nicely consistent with what I wrote about VoltDB last month, in connection with its [...]]]></description>
			<content:encoded><![CDATA[<p>Todd Hoff (<em>High Scalability</em> blog) posted <a href="http://highscalability.com/blog/2010/6/28/voltdb-decapitates-six-sql-urban-myths-and-delivers-internet.html">a lengthy examination of the case and use cases for VoltDB</a>. That excellent post, in turn, is based on <a href="http://voltdb.com/voltdb-webinar-sql-urban-myths">a Mike Stonebraker* webinar for VoltDB</a>, for which the <a href="http://voltdb.com/_pdf/VoltDB-MikeStonebraker-SQLMythsWebinar-060310.pdf">slide deck</a> is happily available. It&#8217;s all nicely consistent with <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">what I wrote about VoltDB</a> last month, in connection with its launch.  <span id="more-2432"></span></p>
<p><em>*Who, in Todd&#8217;s apt description, is &#8220;the sword wielding Johnny Appleseed of the database world&#8221;.</em></p>
<p>Todd wrote:</p>
<blockquote><p>What matters to VoltDB is: <em>speed at scale, speed at scale, speed at scale, SQL, and ACID</em>. If that matches your priorities then you&#8217;ll probably be happy. Otherwise, as you&#8217;ll see, everything is sacrificed for speed at scale and what is sacrificed is often ease of use, generality, and <a href="http://community.voltdb.com/node/77">error checking</a>. It&#8217;s likely we&#8217;ll see ease of use improve over time, but for now it looks like rough going, unless of course, you are a going for speed at scale.</p></blockquote>
<p>Indeed.</p>
<p>Todd&#8217;s list of interesting VoltDB features is also pretty good, namely</p>
<blockquote>
<ul>
<li>Main-memory storage.</li>
<li>Run transactions to completion – single threaded – in timestamp order.</li>
<li>Replicas.</li>
<li>Tables are partitioned across multiple servers.</li>
<li>Stored procedures, written in Java, are the unit of transaction.</li>
<li>A limited subset of SQL &#8217;99 is supported.</li>
<li>Design a schema and workflow to use single-sited procedures.</li>
<li>Challenging operations model.</li>
<li>No WAN support.</li>
<li>OLAP is purposefully kept separate.</li>
</ul>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/30/details-and-analysis-of-the-voltdb-argument/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>VoltDB finally launches</title>
		<link>http://www.dbms2.com/2010/05/25/voltdb-finally-launches/</link>
		<comments>http://www.dbms2.com/2010/05/25/voltdb-finally-launches/#comments</comments>
		<pubDate>Tue, 25 May 2010 07:15:04 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[Games and virtual worlds]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Michael Stonebraker]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2201</guid>
		<description><![CDATA[VoltDB is finally launching today. As is common for companies in sectors I write about, VoltDB &#8212; or just &#8220;Volt&#8221; &#8212; has discovered the virtues of embargoes that end at 12:01 am. Let&#8217;s go straight to the technical highlights: VoltDB is based on the H-Store technology, which I wrote about in February, 2008. Most of what [...]]]></description>
			<content:encoded><![CDATA[<p>VoltDB is finally launching today. As is common for companies in sectors I write about, VoltDB &#8212; or just &#8220;Volt&#8221; &#8212; has discovered the virtues of embargoes that end at 12:01 am. Let&#8217;s go straight to the technical highlights:</p>
<ul>
<li>VoltDB is based on the <a href="http://www.dbms2.com/2008/02/19/h-store-architecture/">H-Store</a> technology, which I wrote about in February, 2008. Most of what I said about H-Store then applies to VoltDB today.</li>
<li>VoltDB is a no-apologies ACID relational DBMS, which runs entirely in RAM.</li>
<li>VoltDB has rather limited SQL. (One example: VoltDB can&#8217;t do SUMs in SQL.) However, VoltDB guy Tim Callaghan (Mark Callaghan&#8217;s lesser-known but nonetheless smart brother) asserts that if you code up the missing functionality, it&#8217;s almost as fast as if it were present in the DBMS to begin with, because there&#8217;s no added I/O from the handoff between the DBMS and the procedural code. (The data&#8217;s in RAM one way or the other.)</li>
<li>VoltDB&#8217;s Big Conceptual Performance Story is that it does away with most locks, latches, logs, etc., and also most context switching.</li>
<li>In particular, you&#8217;re supposed to partition your data and architect your application so that most transactions execute on a single core. When you can do that, you get VoltDB&#8217;s performance benefits. To the extent you can&#8217;t, you&#8217;re in two-phase-commit performance land. (More precisely, you&#8217;re doing 2PC for multi-core writes, which is surely a major reason that multi-core reads are a lot faster in VoltDB than multi-core writes.)</li>
<li>VoltDB has a little less than one DBMS thread per core. When the data partitioning works as it should, you execute a complete transaction in that single thread. Poof. No context switching.</li>
<li>A transaction in VoltDB is a Java stored procedure. (The early idea of Ruby on Rails in lieu of the Java/SQL combo didn&#8217;t hold up performance-wise.)</li>
<li>Solid-state memory is not a viable alternative to RAM for VoltDB. Too slow.</li>
<li>Instead, VoltDB lets you snapshot data to disk at tunable intervals. &#8220;Continuous&#8221; is one of the options, wherein a new snapshot starts being made as soon as the last one completes.</li>
<li>In addition, VoltDB will also spool a kind of transaction log to the target of your choice. (Obvious choice: An analytic DBMS such as Vertica, but there&#8217;s no such connectivity partnership actually in place at this time.)</li>
</ul>
<p><span id="more-2201"></span>I should also note that when Tim Callaghan described architectural options to get around 2PC performance issues, they sounded a lot like eventual consistency. Maybe tunable <a href="http://www.dbms2.com/2010/05/01/ryw-read-your-writes-consistency/">RYW consistency</a> isn&#8217;t in the cards, but at least there&#8217;s a NoSQL-like possibility with VoltDB.</p>
<p>VoltDB&#8217;s open source strategy is:</p>
<ul>
<li>VoltDB will be open sourced.</li>
<li>Community VoltDB will be GPLed. Professional Edition VoltDB has a non-GPL license.</li>
<li>The VoltDB Professional Edition won&#8217;t start out with features beyond the Community Edition ones, but will gain such later on. I didn&#8217;t get the sense the plans for those features were completely baked yet, but ideas mentioned included:
<ul>
<li>Management/monitoring tools.</li>
<li>Integration with expensive closed-source enterprise software products, such as ones in the management/monitoring area.</li>
<li>Yet more &#8220;extreme&#8221;/edge-case performance.</li>
</ul>
</li>
<li>Before VoltDB decided for sure that it wasn&#8217;t selling licenses, it sold a license to Getco, which also seems to be an investor in the company.</li>
</ul>
<p>VoltDB had a beta test with about 150 participants. None is in production yet, although at least a few are clearly headed there. Most VoltDB beta testers are in some kind of online business, with a particular concentration in everybody&#8217;s new favorite market, online gaming. Most of the rest are in investment/trading &#8212; a major target market for at least three different Mike Stonebraker companies &#8212; and a few are in telecom. VoltDB assures me that some of the beta users are companies one actually has heard of before, but VoltDB is not in a position to name any of those.</p>
<p>VoltDB is not ideally suited for a classic order management system, since you&#8217;d want to partition both on CustomerID and SKU, the latter because you&#8217;d constantly be updating inventory stock levels. However, this argument doesn&#8217;t apply in the case of virtual goods. Virtual goods that are sold for real money &#8212; and hence need ACID levels of transaction integrity &#8212; are thus a clear target market for VoltDB. (The example that came up was in, you guessed it, online gaming.) The other interesting use case that Tim highlighted was low-latency analytics/ELT. For reasons I didn&#8217;t totally grasp, Tim likes to call this &#8220;Stateful ELT.&#8221; (Given that the data goes into the VoltDB database before much else happens to it, I&#8217;m pretty sure I heard &#8220;ELT&#8221; correctly. But I guess I might have been mishearing &#8220;ETL&#8221;.)</p>
<p>VoltDB company highlights include:</p>
<ul>
<li>VoltDB has about a dozen employees, all but two of whom are technical. (However, I&#8217;m not sure they&#8217;re counting Andy Ellicott against the two. But then, last I heard he wasn&#8217;t full time at VoltDB.)</li>
<li>VoltDB&#8217;s venture funding status is, if I may paraphrase, &#8220;Mumble mumble.&#8221;</li>
<li>Although long separate from Vertica, VoltDB is still located in Vertica&#8217;s offices.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/05/25/voltdb-finally-launches/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Notes on the evolution of OLTP database management systems</title>
		<link>http://www.dbms2.com/2010/04/05/oltp-database-management-systems-2/</link>
		<comments>http://www.dbms2.com/2010/04/05/oltp-database-management-systems-2/#comments</comments>
		<pubDate>Mon, 05 Apr 2010 08:22:03 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EnterpriseDB and Postgres Plus]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Market share and customer counts]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[Mid-range]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[RDF and graphs]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1841</guid>
		<description><![CDATA[The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part).  OLTP (OnLine Transaction Processing) and general purpose DBMS startups, however, have not yet done as well, with [...]]]></description>
			<content:encoded><![CDATA[<p>The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part).  OLTP <span style="font-weight: normal;">(OnLine Transaction Processing) </span>and general purpose DBMS startups, however, have not yet done as well, with such success as there has been (MySQL, Intersystems Cache&#8217;, solidDB&#8217;s exit, etc.) generally accruing to products that originated in the 20th Century.</p>
<p>Nonetheless, OLTP/general-purpose data management startup activity has recently picked up, targeting what I see as some very real opportunities and needs. So as a jumping-off point for further writing, I thought it might be interesting to collect a few observations about the market in one place.  These include:</p>
<ul>
<li><span style="font-weight: normal;">Big-brand 	OLTP/general-purpose DBMS have more “stickiness” 	than analytic DBMS.</span></li>
<li><span style="font-weight: normal;">By 	number, most of an enterprise&#8217;s OLTP/general-purpose databases are low-volume and 	low-value. </span></li>
<li>Most 	interesting new OLTP/general-purpose data management products are <span style="font-style: normal;">either 	MySQL-based or NoSQL.</span></li>
<li>It&#8217;s not yet 	clear whether MySQL will prevail over MySQL forks, or vice-versa, or 	whether they will co-exist.</li>
<li>The era of 	silicon-centric relational DBMS is coming.</li>
<li>The emphasis 	on scale-out and reducing the cost of joins spans the NoSQL and 	SQL-based worlds.<em> </em></li>
<li><span style="font-weight: normal;">Users&#8217; insistence on “free” could be a major problem for OLTP DBMS innovation.</span></li>
</ul>
<p style="margin-bottom: 0in;">I shall explain.<span id="more-1841"></span></p>
<p style="margin-bottom: 0in;"><strong>Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.</strong></p>
<ul>
<li>OLTP 	applications are more complex than analytic ones, and hence more 	tightly wired into particular brands of DBMS. For example, 	third-party packaged OLTP applications are typically portable among 	only a few brands of DBMS. But third-party business intelligence 	tools, and the BI “applications” built in them, are more easily 	and widely portable.</li>
<li>Specific technical observations 	such as “OLTP apps tend to use stored procedures, which are 	DBMS-specific” or “OLTP apps tend to have lots and lots of 	tables” serve to underscore the first point.</li>
<li>An enterprise&#8217;s highest-value data 	is commonly the financial stuff handled by its core OLTP systems, so 	those are the last things they want to mess around with just to get 	some cost savings. Security, high availability, and so on are major 	considerations that can outweigh cost.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>By number, most of an enterprise&#8217;s OLTP/general-purpose databases are low-volume and low-value. </strong>Indeed, “OLTP” is often a misnomer, which is why I tend to go with “general-purpose” or some similarly wishy-washy phrase instead.</p>
<ul>
<li>In theory, this is a ripe area for 	what I&#8217;ve called <a href="http://www.dbms2.com/category/database-management-system/mid-range/">mid-range DBMS</a>.</li>
<li>The big brand vendors try hard to 	keep as many of those databases for themselves as they can. 	Enterprise-wide license pricing helps. Going forward, so will 	virtualization/consolidation strategies, such as <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/">Oracle&#8217;s 	Exadata-centric approach</a>.</li>
<li>A variety of mid-range DBMS 	alternatives beyond the big brands have technical merit, at least in 	some cases and configurations – MySQL, PostgreSQL, Intersystems 	Cache&#8217;, and so on.</li>
<li>The only such mid-range DBMS 	alternative with much large enterprise business momentum, however, 	appears to be MySQL.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>&#8220;General-purpose&#8221; might be a better term than &#8220;OLTP&#8221; anyway.</strong></p>
<ul>
<li>I don&#8217;t have a link, but it&#8217;s widely agreed that over half of the processing on an &#8220;OLTP&#8221; enterprise app is commonly reporting and so on.</li>
<li>&#8220;Operational BI&#8221; is progressing by fits and starts, but it is progressing.</li>
<li>Anything customer-facing &#8212; web-based, call center, or otherwise &#8212; is likely to include a heavy dose of &#8220;real-time&#8221; analytic optimization.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Most interesting new OLTP/general-purpose data management products are <span style="font-style: normal;">either MySQL-based or NoSQL.</span></strong></p>
<ul>
<li><a href="http://www.dbms2.com/2009/06/22/h-store-horizontica-voltdb/">VoltDB</a> is the main 	exception that jumps to mind.</li>
<li>This isn&#8217;t true in the analytic 	DBMS area, where Netezza, Greenplum, Aster, Vertica and others 	started from PostgreSQL&#8217;s code, APIs, or both.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>It&#8217;s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.</strong></p>
<ul>
<li>MySQL is a limited product without 	all the third-party storage engines that are being developed.</li>
<li><a href="http://www.dbms2.com/2009/12/14/oracle-mysql-storage-engine/">Oracle&#8217;s promise of MySQL good 	behavior</a> has an expiration date.</li>
<li>None of the MySQL front-end 	alternatives are remotely mature yet.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>The era of silicon-centric relational DBMS is coming.</strong></p>
<ul>
<li>I think “silicon” means 	“solid-state memory” as much as or more than it means “RAM,” 	but that&#8217;s not yet certain.</li>
<li>What is pretty certain is that, 	thanks to Moore&#8217;s Law, some kind of silicon will increasingly 	replace disk.</li>
<li><a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/">Oracle&#8217;s increasingly 	Flash-centric story</a> is a challenge to everybody.</li>
<li>RAM-centric VoltDB will launch 	fairly soon. (By the way, while VoltDB still has <a href="http://www.dbms2.com/2009/06/22/h-store-horizontica-voltdb/">a lot in common 	with H-Store</a>, they&#8217;re not exactly the same thing. And <a href="http://bit.ly/9QxjV2.">H-Store 	research</a> is progressing too.)</li>
<li><a href="http://rethinkdb.com/">RethinkDB</a> is being developed, focused directly on solid-state memory. Based on the sparse information available online, RethinkDB sounds somewhat like a dumbed-down H-Store.</li>
<li>New disk-based vendors may never 	optimize their use of disk, instead targeting a solid-state future. 	(E.g., I think Akiban should and quite well might follow this path.)</li>
</ul>
<p style="margin-bottom: 0in; font-weight: normal;"><strong>The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds.</strong> We hear that from the <a href="http://www.dbms2.com/2010/03/14/nosql-taxonomy/">NoSQL</a> guys all the time. But I also just heard it from <a href="http://www.dbms2.com/2010/04/03/akiban-highlights/">Akiban</a>.</p>
<p style="margin-bottom: 0in;"><strong>Users&#8217; insistence on “free” could be a major problem for OLTP DBMS innovation.</strong> Vendors of new OLTP data management technologies often feel obligated to open source their products, notwithstanding the historical lack of revenue in the open source OLTP DBMS market. As just one of many examples, <a href="http://www.novaspivack.com/uncategorized/evri-ties-the-knot-with-twine">Nova Spivack</a> wrote:</p>
<blockquote>
<p style="margin-bottom: 0in;">I have recently seen some new graph data storage products that may provide the levels of scale and performance needed, but pricing has not been determined yet. In short, storage and retrieval of semantic graph datasets is a big unsolved challenge that is holding back the entire industry. We need federated database systems that can handle hundreds of billions to trillions of triples under high load conditions, in the cloud, on commodity hardware and open source software. Only then will it be affordable to make semantic applications and services at Web-scale.</p>
</blockquote>
<p style="margin-bottom: 0in;">I hear similar things from other startups, who evidently believe they need and/or are entitled to enjoy sophisticated, high-performance, zero-cost, specialized database management technology.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/05/oltp-database-management-systems-2/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>The Boston Globe had an article on VoltDB</title>
		<link>http://www.dbms2.com/2009/08/04/the-boston-globe-had-an-article-on-voltdb/</link>
		<comments>http://www.dbms2.com/2009/08/04/the-boston-globe-had-an-article-on-voltdb/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 09:17:10 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=856</guid>
		<description><![CDATA[The Boston Globe article has more detail than Vertica and VoltDB have ever OKed me to put out, and some business details they&#8217;ve never given me.]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.boston.com/business/technology/innoeco/2009/08/on_the_radar_voltdb_just_the_l.html">Boston Globe article</a> has more detail than Vertica and VoltDB have ever OKed me to put out, and some business details they&#8217;ve never given me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/08/04/the-boston-globe-had-an-article-on-voltdb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

