<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; ScaleDB</title>
	<atom:link href="http://www.dbms2.com/category/products-and-vendors/scaledb/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 09:21:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Soundbites: the Facebook/MySQL/NoSQL/VoltDB/Stonebraker flap, continued</title>
		<link>http://www.dbms2.com/2011/07/15/facebook-mysql-nosql-voltdb-stonebraker/</link>
		<comments>http://www.dbms2.com/2011/07/15/facebook-mysql-nosql-voltdb-stonebraker/#comments</comments>
		<pubDate>Fri, 15 Jul 2011 08:27:18 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Cache]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Clustrix]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[Data models and architecture]]></category>
		<category><![CDATA[Database diversity]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Michael Stonebraker]]></category>
		<category><![CDATA[MongoDB and 10gen]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[ScaleBase]]></category>
		<category><![CDATA[ScaleDB]]></category>
		<category><![CDATA[Schooner Information Technology]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Tokutek]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[dbShards and CodeFutures]]></category>
		<category><![CDATA[memcached]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4977</guid>
		<description><![CDATA[As a follow-up to the latest Stonebraker kerfuffle, Derrick Harris asked me a bunch of smart followup questions. My responses and afterthoughts include: Facebook et al. are in effect Software as a Service (SaaS) vendors, not enterprise technology users. In particular: They have the technical chops to rewrite their code as needed. Unlike packaged software [...]]]></description>
			<content:encoded><![CDATA[<p>As a follow-up to the latest <a href="http://www.dbms2.com/2011/07/14/an-odd-claim-attributed-to-mike-stonebraker/">Stonebraker kerfuffle</a>, Derrick Harris asked me a bunch of smart followup questions. My responses and afterthoughts include:</p>
<ul>
<li>Facebook et al. are in effect Software as a Service (SaaS) vendors, not enterprise technology users. In particular:
<ul>
<li>They have the technical chops to rewrite their code as needed.</li>
<li>Unlike packaged software vendors, they&#8217;re not answerable to anybody for keeping legacy code alive after a rewrite. That makes migration a lot easier.</li>
<li>If they want to write different parts of their system on different technical underpinnings, nobody can stop them. For example &#8230;</li>
<li>&#8230; <a href="http://www.dbms2.com/2008/07/21/project-cassandra-facebook-open-sourced-quasi-dbms/">Facebook originated Cassandra</a>, and is now heavily committed to HBase.</li>
</ul>
</li>
<li>It makes little sense to talk of Facebook&#8217;s use of &#8220;MySQL.&#8221; Better to talk of Facebook&#8217;s use of &#8220;MySQL + memcached + non-transparent sharding.&#8221; That said:
<ul>
<li>It&#8217;s hard to see why somebody today would use MySQL + memcached + non-transparent sharding for a new project. At least one of <a href="http://www.dbms2.com/2011/02/08/couchbase-membase-couchone-couchdb/">Couchbase</a> or <a href="http://www.dbms2.com/2011/02/24/transparent-sharding/">transparently-sharded</a> MySQL is very likely a superior alternative. Other alternatives might be better yet.</li>
<li>As noted above in the example of Facebook, the many major web businesses that are using MySQL + memcached + non-transparent sharding for existing projects can be presumed able to migrate away from that stack as the need arises.</li>
</ul>
</li>
</ul>
<p>Continuing with that discussion of DBMS alternatives:</p>
<ul>
<li>If you just want to write to the memcached API anyway, why not go with Couchbase?</li>
<li>If you want to go relational, why not go with MySQL? There are many alternatives for scaling or accelerating MySQL &#8212; dbShards, Schooner, Akiban, Tokutek, ScaleBase, ScaleDB, Clustrix, and Xeround come to mind quickly, so there&#8217;s a great chance that one or more will fit your use case. (And if you don&#8217;t get the choice of MySQL flavor right the first time, porting to another one shouldn&#8217;t be all THAT awful.)</li>
<li>If you really, really want to go in-memory, and don&#8217;t mind writing Java stored procedures, and don&#8217;t need to do the kinds of joins it isn&#8217;t good at, but do need to do the kinds of joins it is, VoltDB could indeed be a good alternative.</li>
</ul>
<p>And while we&#8217;re at it &#8212; going <strong>schema-free</strong> often makes a whole lot of sense. I need to write much more about the point, but for now let&#8217;s just say that I look favorably on the Big Four schema-free/NoSQL options of MongoDB, Couchbase, HBase, and Cassandra.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/07/15/facebook-mysql-nosql-voltdb-stonebraker/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Notes on short-request scale-out MySQL</title>
		<link>http://www.dbms2.com/2011/04/19/notes-on-short-request-scale-out-mysql/</link>
		<comments>http://www.dbms2.com/2011/04/19/notes-on-short-request-scale-out-mysql/#comments</comments>
		<pubDate>Tue, 19 Apr 2011 09:52:28 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Kaminario]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[ScaleBase]]></category>
		<category><![CDATA[ScaleDB]]></category>
		<category><![CDATA[Schooner Information Technology]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Tokutek]]></category>
		<category><![CDATA[Web analytics]]></category>
		<category><![CDATA[dbShards and CodeFutures]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4329</guid>
		<description><![CDATA[A press person recently asked about: &#8230; start-ups that are building technologies to enable MySQL and other SQL databases to get over some of the problems they have in scaling past a certain size. &#8230; I’d like to get a sense as to whether or not the problems are as severe and wide spread as [...]]]></description>
			<content:encoded><![CDATA[<p>A press person recently asked about:</p>
<blockquote><p>&#8230; start-ups that are building technologies to enable MySQL and other SQL databases to get over some of the problems they have in scaling past a certain size. &#8230; I’d like to get a sense as to whether or not the problems are as severe and wide spread as these companies are telling me? If so, why wouldn’t a customer just move to a new database?</p></blockquote>
<p>While that sounds as if he was asking about scale-out relational DBMS in general, MySQL or otherwise, <a href="http://www.dbms2.com/2011/03/30/short-request-and-analytic-processing/">short-request or analytic</a>, it turned out that he was asking just about <strong>short-request scale-out MySQL.</strong> My thoughts and comments on that narrower subject include(d) but are not limited to:  <span id="more-4329"></span></p>
<ul>
<li>The biggest web companies had to go to non-<a href="http://www.dbms2.com/2011/02/24/transparent-sharding/">transparently sharded</a> MySQL years ago. The NoSQL movement is, in no small part, <a href="http://www.dbms2.com/2010/03/02/cassandra-nosql-scalable-oltp/">an attempt to improve upon that</a>. Ditto for scale-out short-request MySQL.</li>
<li>Some overlapping categories of companies or projects that need scale-out short-request database processing are:
<ul>
<li>The aforementioned big companies who have other applications they haven&#8217;t hand-sharded yet.</li>
<li>Other web companies whose applications are getting that big.</li>
<li>Conventional enterprises whose web efforts happen to be very big.</li>
<li>Sensor networks and other massive sources of <a href="http://www.dbms2.com/2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a>.</li>
<li>Certain specialized areas (e.g., financial trading).</li>
</ul>
</li>
</ul>
<ul>
<li>Relatively few of these applications are totally impossible to do in Oracle. But the Oracle approach might be very expensive.</li>
<li>In particular, there&#8217;s a break point when companies &#8212; often SaaS vendors &#8212; <a href="http://www.dbms2.com/2011/04/01/the-client-that-was-confused-about-security/">outgrow Oracle Standard Edition</a>.</li>
<li>Yes, the alternatives usually are one of MySQL or Oracle.</li>
<li>InnoDB isn&#8217;t an alternative to these newer technologies; it&#8217;s just a piece of the puzzle and indeed of default MySQL now. Several of them &#8212; e.g. dbShards &#8212; are meant to be used in conjunction with InnoDB.</li>
<li>Merging his list and mine, the high-performance/scale-out MySQL alternatives look like <a href="http://www.dbms2.com/2011/01/25/dbshards-update/">dbShards</a>, <a href="http://www.dbms2.com/2011/01/28/schooner-software-onl/">Schooner</a>, <a href="http://www.dbms2.com/2011/01/25/scalebase-another-mpp-oltp-quasi-dbms/">ScaleBase</a>, <a href="http://www.dbms2.com/2008/04/13/scaledb-presents-the-revenge-of-the-pointer/">ScaleDB</a>, <a href="http://www.dbms2.com/2009/04/16/introduction-to-tokutek/">Tokutek</a>, <a href="http://www.dbms2.com/2010/04/03/akiban-highlights/">Akiban</a>, Xeround, and <a href="http://www.dbms2.com/2010/05/12/the-clustrix-story/">Clustrix</a>. The first two are to my knowledge more proven than the rest.</li>
<li>Proprietary hardware and the associated hardware/appliance pricing aren&#8217;t very appealing for these applications. That speaks against Oracle Exadata and Clustrix, and is the reason Schooner switched to a software-only strategy despite some initial appliance sales.</li>
<li>However, hardware band-aids such as solid-state drives or even <a href="http://www.dbms2.com/2010/10/19/introduction-to-kaminario/">RAM-based solid-state storage</a> could make more sense:
<ul>
<li>If, for performance, you&#8217;re scaling out your database so that it fits in RAM on each box, you don&#8217;t really have a disk-based architecture anyway, now do you?</li>
<li>Even if you&#8217;re not doing that yet &#8212; if your problem is throughput rather than storage capacity, silicon-based storage could be a big help.</li>
<li>In principle, devices of that kind can be moved from one application to another, after the first one is rearchitected not to need them. (In practice, however, I don&#8217;t know of anybody who is doing that. I also don&#8217;t believe that Kaminario et al. are marketing that kind of idea, more&#8217;s the pity.)</li>
</ul>
</li>
<li>My notes on all this from <a href="http://www.dbms2.com/2010/04/05/oltp-database-management-systems-2/">April, 2010</a> are already badly outdated, but may be interesting anyway.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/04/19/notes-on-short-request-scale-out-mysql/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>I&#8217;m collecting data points on NoSQL and HVSP adoption</title>
		<link>http://www.dbms2.com/2010/08/18/nosql-hvsp-adoption/</link>
		<comments>http://www.dbms2.com/2010/08/18/nosql-hvsp-adoption/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 13:09:08 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Clustrix]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Groovy Corporation]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[ScaleDB]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[Zynga]]></category>
		<category><![CDATA[dbShards and CodeFutures]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2840</guid>
		<description><![CDATA[I was asked to do a magazine article on NoSQL, where by &#8220;NoSQL&#8221; is meant &#8220;whatever they talk about at NoSQL conferences.&#8221; By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about HVSP in [...]]]></description>
			<content:encoded><![CDATA[<p>I was asked to do a magazine article on NoSQL, where by &#8220;NoSQL&#8221; is meant &#8220;whatever they talk about at NoSQL conferences.&#8221; By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about <a href="http://www.dbms2.com/2010/03/13/the-naming-of-the-foo/">HVSP</a> in general, NoSQL and SQL alike.</p>
<p>It also is understood that, realistically, I can&#8217;t be expected to know and mention the very latest news for all the many products in the categories. Even so, I think this would be a fine time to check just where NoSQL and HVSP adoption stand. Here is most of what I know, or links to same; it would be great if you guys would contribute additional data in the comment thread.</p>
<p>In the NoSQL area:  <span id="more-2840"></span></p>
<ul>
<li>Back in April, the VoltDB guys told me they thought Cassandra and HBase were the two NoSQL systems with the most momentum.</li>
<li>I know distressingly little about HBase adoption, but a source who may or may not wish to remain anonymous was kind enough to alert me that Twitter and StumbleUpon each have ~30 node deployments, for analytics and analytics/HVSP respectively.</li>
<li>I wrote in detail on <a href="http://www.dbms2.com/2010/07/06/riptano-and-cassandra-adoption/">Cassandra adoption</a> last month. News since then includes:
<ul>
<li>Facebook is rumored to have dropped Cassandra completely.</li>
<li><a href="http://engineering.twitter.com/2010/07/cassandra-at-twitter-today.html">Twitter clarified that it may not be quite as lovestruck by Cassandra as before</a>, but they&#8217;re still very close friends.</li>
<li>It&#8217;s not obvious that the <a href="http://www.riptano.com/blog/cassandra-summit-recap">Cassandra Summit</a> unveiled a lot of new adoption stories.</li>
</ul>
</li>
<li>Northscale&#8217;s <a href="http://www.dbms2.com/2010/08/18/northscale-membase-roadmap/">Membase</a> is still in its early days.  Zynga is bought in, however, as is something called NHN Korea. <em>(Edit: I subsequently saw NHN Korea on a prominent SEO expert&#8217;s list of the top half dozen or so search engines in the world. Who knew?)</em></li>
<li>Basho has listed a few <a href="http://www.basho.com/customers.html">Riak customers</a>. If memory serves (I haven&#8217;t spoken with Basho for a while, and some of my notes are misplaced due to some computer sloppiness), Basho has a few dozen customers in total.</li>
<li>Mozilla has <a href="http://blog.mozilla.com/data/2010/08/16/benchmarking-riak-for-the-mozilla-test-pilot-project/">a 4 machine, 64 core Riak cluster</a> in production.</li>
<li><a href="http://highscalability.com/hypertable-new-bigtable-clone-runs-hdfs-or-kfs">Hypertable</a> has a few users/project sponsors, Baidu being the biggest name among them.</li>
<li>I don&#8217;t really know how the MongoDB/10gen guys are doing. I think this is at least as much my fault as theirs. Anyhow, they seem to have <a href="http://www.10gen.com/news">links</a> to a couple of folks who have written about MongoDB usage.</li>
<li>NimbusDB is still in stealth mode. I&#8217;d be surprised if they had users for a while yet, since in January they didn&#8217;t yet sound as if development was very far underway. (Actually, I forget whether NimbusDB is supposed to be SQL-based or not.)</li>
</ul>
<p>Among the SQL or SQL-friendly guys:</p>
<ul>
<li><a href="http://www.dbms2.com/2010/05/12/the-clustrix-story/">Clustrix</a> says it has a few production users, some big-name, but is not disclosing them yet.</li>
<li><a href="http://www.dbms2.com/2010/07/28/dbshards/">dbShards has around 6 customers</a>, including Facebook. (Facebook may outpace even Twitter and Zynga in using the most products mentioned in this post.)</li>
<li>As of May, <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">VoltDB</a> had one paying customer, plus 150 beta customers who weren&#8217;t in production yet.</li>
<li><a href="http://www.dbms2.com/2010/04/03/akiban-highlights/">Akiban</a> says they&#8217;ll get me up to speed on Thursday. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </li>
<li><a href="http://www.dbms2.com/2008/04/13/scaledb-presents-the-revenge-of-the-pointer/">ScaleDB</a> seems to be pedaling along in perennial beta. Whether ScaleDB has any actual beta users is less clear. On the plus side, checking that out uncovered a pretty funny <a href="http://scaledb.blogspot.com/2010/04/scaledb-introduces-clustered-database.html">April Fool blog post</a>.</li>
<li><a href="http://www.dbms2.com/2009/07/30/groovy-corp-puts-out-a-ridiculous-press-release/">Groovy Corporation</a> seems to have disappeared, or morphed into something called <a href="http://www.groovycorp.com/home.html">uCirrus</a>, or something like that.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/18/nosql-hvsp-adoption/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>ScaleDB presents The Revenge of the Pointer</title>
		<link>http://www.dbms2.com/2008/04/13/scaledb-presents-the-revenge-of-the-pointer/</link>
		<comments>http://www.dbms2.com/2008/04/13/scaledb-presents-the-revenge-of-the-pointer/#comments</comments>
		<pubDate>Sun, 13 Apr 2008 14:03:42 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data models and architecture]]></category>
		<category><![CDATA[Mid-range]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[ScaleDB]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[Relational database management systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/2008/04/13/scaledb-presents-the-revenge-of-the-pointer/</guid>
		<description><![CDATA[The MySQL user conference is upon us, and hence so are MySQL-related product announcements, including storage engines. One such is Kickfire. ScaleDB &#8212; smaller and earlier-stage &#8212; is another. In a nutshell, ScaleDB&#8217;s proposition is: Innovative approach to indexing relational DBMS, providing performance advantages. Shared-everything scale-up that ScaleDB believes will leapfrog the MySQL engine competition [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in">The MySQL user conference is upon us, and hence so are MySQL-related product announcements, including storage engines.  One such is <a href="http://www.dbms2.com/2008/04/08/kickfire-is-de-cloaking/">Kickfire.</a> <a href="http://www.scaledb.com/">ScaleDB</a> &#8212; smaller and earlier-stage &#8212;  is another.</p>
<p style="margin-bottom: 0in">In a nutshell, ScaleDB&#8217;s proposition is:</p>
<ul>
<li>
<p style="margin-bottom: 0in">Innovative approach to indexing relational DBMS, providing performance advantages.</p>
</li>
<li>
<p style="margin-bottom: 0in">Shared-everything scale-up that ScaleDB believes will leapfrog the MySQL engine competition already in Release 1. (In my opinion, this is the least plausible part of the ScaleDB story.)</p>
</li>
<li>
<p style="margin-bottom: 0in">State-of-the-art me-too facilities for locking, logging, replication/fail-over, etc., also already in Release 1.</p>
</li>
</ul>
<p style="margin-bottom: 0in">Like many software companies with non-US roots, ScaleDB seems to have started with a single custom project, using a <em>Patricia trie</em> indexing system.  Then they decided Patricia tries might be really useful for relational OLTP as well.  The ScaleDB team now features four developers, plus half-time or so “Chief Architect” involvement from <a href="http://vcwatts.org/ibm_story.html">Vern Watts</a>.  Watts seems to pretty much have been Mr. IMS for the past four decades, and thus surely knows a whole lot about pointer-based database management systems; presumably, he&#8217;s responsible for the generic DBMS design features that are being added to the innovative indexing scheme.  On ScaleDB&#8217;s advisory board is PeopleSoft veteran Rick Berquist, about whom I&#8217;ve had fond thoughts ever since he talked me into focusing on consulting as the core of my business.*</p>
<p style="margin-bottom: 0in"><em>*More precisely, Rick pretty much tricked me into doing a day of consulting for $15K, then revealed that&#8217;s what he&#8217;d done, expressing the thought that he&#8217;d very much gotten his money&#8217;s worth.  But I digress &#8230;</em></p>
<p style="margin-bottom: 0in">ScaleDB has no customers to date, but hopes to be in beta by the end of this year.  Angels and a small VC firm have provided bridge loans; otherwise, ScaleDB has no outside investment.  ScaleDB&#8217;s business model thoughts include: <span id="more-403"></span></p>
<ul>
<li>
<p style="margin-bottom: 0in">$1,000/server/year license fee, or something in that range.</p>
</li>
<li>
<p style="margin-bottom: 0in">Early focus on Web 2.0 kinds of customers (e.g., social networking companies may enjoy the join performance ScaleDB plans to offer).</p>
</li>
<li>
<p style="margin-bottom: 0in">Early focus on MySQL OLTP (but, like proud parents everywhere, they think the technology is so wonderful that it could eventually be pretty much all things to all people).</p>
</li>
</ul>
<p style="margin-bottom: 0in">The company is based in Menlo Park, CA.</p>
<p style="margin-bottom: 0in; font-style: normal">Probably I should explain what Patricia tries actually are, and how they can help relational DBMS. An ordinary <em>trie*</em> is a way of indexing data that looks a lot like – unsurprisingly – a tree.  For example, suppose you need to index a lot of character strings, each consisting of lower-case Latin letters.  From the root node you point to the 26 possibilities for starting letter.  From those you point to the next possible letter, and so on.  Combinatorial explosion is averted because you only have edges if there&#8217;s actually a string with that letter combination.  Thus, when indexing a corpus of classic novels, there might be a path i-t-i-s-a-t-r-u-t-h-u-n-&#8230; and so on, but none that starts i-a-u-z-z-z.</p>
<p style="margin-bottom: 0in"><em><span style="text-decoration: none">*“Trie” is sometimes </span>pronounced like “tree”, sometimes like “try.”</em></p>
<p style="margin-bottom: 0in; font-style: normal">Patricia tries add a now-obvious compression technique.  Namely, if there&#8217;s only one branch from a node, just collapse it.  Thus, the example I gave above would become something more like i-t-i-s-a-truth-universally-acknowledged-&#8230;, or perhaps something even more compact.</p>
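<p style="margin-bottom: 0in">To make the two ideas above concrete, here is a minimal Python sketch of an ordinary character trie and of the collapse step that turns it into a Patricia-style trie. It is a textbook illustration only, not ScaleDB code; the names <em>Node</em>, <em>build_trie</em>, <em>compress</em>, <em>edge_labels</em> and <em>contains</em> are made up for the example, and the sample strings are assumptions.</p>

```python
class Node:
    """One trie node; edges are labelled with strings of one or more characters."""

    def __init__(self):
        self.children = {}     # edge label -> child Node
        self.terminal = False  # True if an indexed string ends here


def build_trie(words):
    """Build an ordinary trie: one edge per character, only for strings that exist."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.terminal = True
    return root


def compress(node):
    """Patricia-style step: collapse chains of single-branch nodes into one edge."""
    merged = {}
    for label, child in node.children.items():
        # Keep absorbing the next edge while there is exactly one way forward
        # and no indexed string ends at the intermediate node.
        while len(child.children) == 1 and not child.terminal:
            (next_label, next_child), = child.children.items()
            label += next_label
            child = next_child
        compress(child)
        merged[label] = child
    node.children = merged
    return node


def edge_labels(node):
    """Return the sorted labels of the edges leaving a node."""
    return sorted(node.children)


def contains(node, word):
    """Look up a string in a plain or compressed trie."""
    if not word:
        return node.terminal
    for label, child in node.children.items():
        if word.startswith(label):
            return contains(child, word[len(label):])
    return False
```

<p style="margin-bottom: 0in">Indexing the hypothetical strings "itisatruth", "itisafact" and "iamhere" and then calling <em>compress</em> leaves a single edge "i" from the root, with edges "tisa" and "amhere" below it. Lookups still succeed for the three indexed strings and fail for something like "iauzzz", which never had an edge in the first place.</p>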
<p style="margin-bottom: 0in"><span style="font-style: normal">While these ideas were evidently invented with text documents in mind, there&#8217;s no reason they can&#8217;t be applied to other kinds of strings – specifically, to those stored in relational databases.  (And numbers can just be treated as strings of bits.) As I wrote last year in <a href="http://www.dbms2.com/2007/06/22/in-memory-database-solid/">discussing solidDB</a>, which uses a similar approach:</span></p>
<blockquote><p>The canonical index structure in a disk-centric OLTP RDBMS is a tree of blocks. The record sought is in a block somewhere. There are index blocks whose entries are pointers to the correct block based on values in the index column. There are index blocks of pointers to other index blocks. And so on. One can traverse these trees in very few steps, but each step is costly, because each step involves examining the whole block.</p>
<p>SolidDB, by way of contrast, uses a core index structure called the <em>trie.</em> The key value on which the record search is based is divided into chunks of bits. Each chunk leads to a tree node with a small number of choices for the next chunk. There are more steps, but each step is much cheaper.</p></blockquote>
<p>Benefits of this strategy include compression and in-memory performance.  But a naive implementation would, as in other pointer-based systems, lead to unacceptable disk thrashing.  ScaleDB&#8217;s answer is to layer the index, essentially creating a “trie of tries.”  The company confidently claims that, in almost all cases, data can be found via a single disk read.  Part of that story is the assertion that their indexing scheme achieves tremendous compression vs. conventional b-trees.</p>
<p style="margin-bottom: 0in">So far, that all sounds like a performance win, of unclear magnitude.  (ScaleDB says it&#8217;s hoping for a 3X or better performance advantage versus traditional b-tree-based approaches.)  <span>But there&#8217;s another cool part as well.</span> The ScaleDB trie doesn&#8217;t necessarily end with the first row it finds; it also reaches through to capture foreign-key relationships.  E.g., if customer FOO123 places an order with OrderID BAR456, the BAR456 isn&#8217;t just found via the path B-A-R-4-5-6. It also can be found via FOO-1-2-3-BAR-456.  Thus, <strong>referential integrity and updatable views are baked into the core database management architecture.</strong></p>
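<p style="margin-bottom: 0in">A rough way to picture the two access paths is the Python sketch below. It is a toy built on nested dictionaries, assuming one character per edge, and is in no way a description of how ScaleDB actually lays out its index; the helper names <em>insert</em>, <em>find</em>, <em>add_order</em> and <em>order_of</em> and the sample rows are invented for illustration.</p>

```python
ROW = "row"            # multi-character keys cannot collide with one-character edges
CHILDREN = "children"  # sub-trie of dependent rows hanging off a parent row


def insert(trie, key, row):
    """Store a row at the end of the character path spelled by key."""
    node = trie
    for ch in key:
        node = node.setdefault(ch, {})
    node[ROW] = row
    return node


def find(trie, key):
    """Return the node at the end of the path for key, or None if the path is missing."""
    node = trie
    for ch in key:
        node = node.get(ch)
        if node is None:
            return None
    return node


def add_order(orders, customers, customer_id, order_id, row):
    """Index an order under its own key and again beneath its customer."""
    customer = find(customers, customer_id)
    if customer is None or ROW not in customer:
        # The parent path must exist first: a crude referential-integrity check.
        raise KeyError("unknown customer " + customer_id)
    insert(orders, order_id, row)                             # path B-A-R-4-5-6
    insert(customer.setdefault(CHILDREN, {}), order_id, row)  # path F-O-O-1-2-3, then B-A-R-4-5-6


def order_of(customers, customer_id, order_id):
    """Reach an order by walking through its customer, with no separate join step."""
    customer = find(customers, customer_id)
    if customer is None:
        return None
    node = find(customer.get(CHILDREN, {}), order_id)
    return None if node is None else node.get(ROW)
```

<p style="margin-bottom: 0in">In this toy, both paths end at the same row object, and an order cannot be indexed unless its customer is already present. That is the flavor of the claim about referential integrity, though a real engine would of course have to deal with updates, deletes, concurrency and disk layout.</p>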
<p style="margin-bottom: 0in">I look forward to seeing how this all works out, in Release 1 and beyond.</p>
<p><em>Edit: One way to think of this is as the integration of the network and relational data models, à la IDMS/R, but with more compact linked lists.  And I believe Predrag Dizdarevic when he tells me IDMS/R did wind up working pretty well, in a rare instance of a DBMS technology success post acquisition by CA.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2008/04/13/scaledb-presents-the-revenge-of-the-pointer/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

