<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Introduction to Tokutek</title>
	<atom:link href="http://www.dbms2.com/2009/04/16/introduction-to-tokutek/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 09:22:14 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: More on NoSQL and HVSP (or OLRP) &#124; DBMS 2 : DataBase Management System Services</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-182214</link>
		<dc:creator>More on NoSQL and HVSP (or OLRP) &#124; DBMS 2 : DataBase Management System Services</dc:creator>
		<pubDate>Thu, 26 Aug 2010 09:10:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-182214</guid>
		<description>[...] same site says Tokutek finally was able to raise some VC. [...]</description>
		<content:encoded><![CDATA[<p>[...] same site says Tokutek finally was able to raise some VC. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Some NoSQL links &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-162021</link>
		<dc:creator>Some NoSQL links &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Sat, 13 Mar 2010 16:58:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-162021</guid>
		<description>[...] hand, he praised or at least expressed hope for a variety of MySQL-related technologies, including Tokutek&#8217;s TokuDB and Continuent&#8217;s [...]</description>
		<content:encoded><![CDATA[<p>[...] hand, he praised or at least expressed hope for a variety of MySQL-related technologies, including Tokutek&#8217;s TokuDB and Continuent&#8217;s [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Calpont update &#8212; you read it here first! &#124; DBMS2 -- DataBase Management System Services</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117395</link>
		<dc:creator>Calpont update &#8212; you read it here first! &#124; DBMS2 -- DataBase Management System Services</dc:creator>
		<pubDate>Mon, 20 Apr 2009 07:15:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117395</guid>
		<description>[...] Calpont plans to offer a MySQL storage engine for analytic database processing. Thus, Calpont will compete with Infobright, Kickfire, and perhaps Tokutek. [...]</description>
		<content:encoded><![CDATA[<p>[...] Calpont plans to offer a MySQL storage engine for analytic database processing. Thus, Calpont will compete with Infobright, Kickfire, and perhaps Tokutek. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Weinreb</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117177</link>
		<dc:creator>Daniel Weinreb</dc:creator>
		<pubDate>Fri, 17 Apr 2009 14:41:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117177</guid>
		<description>@Seth: Sure, a lot of what I wrote is just background, in order to provide the context to explain about Tokutek&#039;s index.  I intended it for readers with less experience than you.

I don&#039;t know whether it&#039;s aimed at analytics users.  I have a tendency to think in terms of OLTP, since I am working on an airline reservation system.

As you say, the fact that it has a MySQL front end may mean that they have in mind LAMP-stack web sites, Drupal sites, etc.  I don&#039;t know.  And also as you say, streaming technology is good for caching recent data in memory.  I don&#039;t know whether they are aiming at applications with that property.

As Prof. Stonebraker says these days, one size does not fit all.  There are many useful architectures now, and for high performance you have to be careful to match the architecture to the use cases you care about.

ObjectStore, which I co-architected, was aimed at C++ programmers who wanted language transparency, sharing, ability to take advantage of powerful client-side computers, operating on data with good spatial and temporal locality, and not doing complex queries.  (There were plenty of customers who were happy with that and Object Design was very successful.)  That&#039;s another example of a different (in some ways) DBMS architecture.</description>
		<content:encoded><![CDATA[<p>@Seth: Sure, a lot of what I wrote is just background, in order to provide the context to explain about Tokutek&#8217;s index.  I intended it for readers with less experience than you.</p>
<p>I don&#8217;t know whether it&#8217;s aimed at analytics users.  I have a tendency to think in terms of OLTP, since I am working on an airline reservation system.</p>
<p>As you say, the fact that it has a MySQL front end may mean that they have in mind LAMP-stack web sites, Drupal sites, etc.  I don&#8217;t know.  And also as you say, streaming technology is good for caching recent data in memory.  I don&#8217;t know whether they are aiming at applications with that property.</p>
<p>As Prof. Stonebraker says these days, one size does not fit all.  There are many useful architectures now, and for high performance you have to be careful to match the architecture to the use cases you care about.</p>
<p>ObjectStore, which I co-architected, was aimed at C++ programmers who wanted language transparency, sharing, ability to take advantage of powerful client-side computers, operating on data with good spatial and temporal locality, and not doing complex queries.  (There were plenty of customers who were happy with that and Object Design was very successful.)  That&#8217;s another example of a different (in some ways) DBMS architecture.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jerome</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117169</link>
		<dc:creator>Jerome</dc:creator>
		<pubDate>Fri, 17 Apr 2009 13:39:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117169</guid>
		<description>Ouch that hurts :)</description>
		<content:encoded><![CDATA[<p>Ouch that hurts <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117166</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Fri, 17 Apr 2009 13:02:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117166</guid>
		<description>Seth,

Nothing that comes to mind is faster than a streaming engine integrated with a DBMS.  But for use cases where Tokutek does the job, it could be simpler/cheaper than that approach.</description>
		<content:encoded><![CDATA[<p>Seth,</p>
<p>Nothing that comes to mind is faster than a streaming engine integrated with a DBMS.  But for use cases where Tokutek does the job, it could be simpler/cheaper than that approach.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117165</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Fri, 17 Apr 2009 12:59:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117165</guid>
		<description>Jerome,

Not at this time. But I took pity on them and did a favor anyway.</description>
		<content:encoded><![CDATA[<p>Jerome,</p>
<p>Not at this time. But I took pity on them and did a favor anyway.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jerome</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117163</link>
		<dc:creator>Jerome</dc:creator>
		<pubDate>Fri, 17 Apr 2009 12:33:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117163</guid>
		<description>Curt, are they a client of yours?
Thanks.</description>
		<content:encoded><![CDATA[<p>Curt, are they a client of yours?<br />
Thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Seth Grimes</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117162</link>
		<dc:creator>Seth Grimes</dc:creator>
		<pubDate>Fri, 17 Apr 2009 12:21:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117162</guid>
		<description>Dan Weinreb, you&#039;re restating several years&#039; worth of justification for non-b-tree approaches to management of data for analytical use put forward by various academics and software companies, in particular column-store backers.  No problem there.  I just wanted to point it out.

Is reliance on a &quot;cache-oblivious dynamic search tree as an alternative to the ubiquitious [sic] B-tree&quot; (quote from the paper you cite) meant to appeal to analytics users?  B-trees are good for random-access retrieval of relatively small numbers of records and the implication of the term &quot;search tree&quot; is that it would be similarly suited.

If the goal is fast data availability, why not look at a streaming-data engine that caches good volumes of recent data in memory?  Or is the advantage here that this software is a MySQL engine?</description>
		<content:encoded><![CDATA[<p>Dan Weinreb, you&#8217;re restating several years&#8217; worth of justification for non-b-tree approaches to management of data for analytical use put forward by various academics and software companies, in particular column-store backers.  No problem there.  I just wanted to point it out.</p>
<p>Is reliance on a &#8220;cache-oblivious dynamic search tree as an alternative to the ubiquitious [sic] B-tree&#8221; (quote from the paper you cite) meant to appeal to analytics users?  B-trees are good for random-access retrieval of relatively small numbers of records and the implication of the term &#8220;search tree&#8221; is that it would be similarly suited.</p>
<p>If the goal is fast data availability, why not look at a streaming-data engine that caches good volumes of recent data in memory?  Or is the advantage here that this software is a MySQL engine?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Weinreb</title>
		<link>http://www.dbms2.com/2009/04/16/introduction-to-tokutek/#comment-117152</link>
		<dc:creator>Daniel Weinreb</dc:creator>
		<pubDate>Fri, 17 Apr 2009 10:53:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=752#comment-117152</guid>
		<description>Curt, may I rephrase your description slightly?  The way I&#039;d put it is: Tokutek&#039;s technology is based on a fundamentally new index structure/algorithm.  Many database experts thought that the B-tree was the ultimate best answer, but this new one has a huge advantage over ordinary B-trees: inserts are much, much faster.  This is an impressive and fundamental new concept, widely applicable.

In a conventional relational database system, you have a tradeoff.  Adding more indexes makes queries (that use whatever columns are indexed) faster, but it makes updates slower because you have to update all those indexes (and write back the pages they&#039;re stored on).  Tokutek&#039;s basic idea is that now you&#039;re free to put in as many indexes as you want (hence the reason it&#039;s good for query-oriented uses), because the indexes do not slow down writes the way ordinary B-tree indexes would.

The &quot;fractal tree&quot; algorithm has been published.  It&#039;s based on the &quot;Cache Oblivious&quot; concept; see Wikipedia, which in turn points you to a nice explanation by Prof. Erik Demaine of MIT.  The &quot;fractal tree&quot; index papers are pointed to from http://supertech.csail.mit.edu/cacheObliviousBTree.html.  Note that the name &quot;fractal tree&quot; was introduced by Tokutek and is not used in these papers.  They use names like &quot;cache-oblivious search trees&quot;, just so you know that that&#039;s the thing we&#039;re talking about.  &quot;Fractal Tree&quot;, in my opinion, is a reasonable name from a technical point of view and better from a marketing point of view.  Naturally, Tokutek and the inventors have patents pending.  Bradley Kuszmaul of MIT, Michael Bender of Stony Brook U., and Martin Farach-Colton of Rutgers are the inventors.

It seems to me that doing ACID transactions will still incur costs for committing the index changes to disk, if you implement it the most obvious way.  Presumably they will implement it more cleverly than the obvious one.  This is not discussed in the academic papers.  I speculate that this is why they have not yet implemented ACID -- it&#039;s not as easy as just following the published methods.  But that&#039;s just a speculation.</description>
		<content:encoded><![CDATA[<p>Curt, may I rephrase your description slightly?  The way I&#8217;d put it is: Tokutek&#8217;s technology is based on a fundamentally new index structure/algorithm.  Many database experts thought that the B-tree was the ultimate best answer, but this new one has a huge advantage over ordinary B-trees: inserts are much, much faster.  This is an impressive and fundamental new concept, widely applicable.</p>
<p>In a conventional relational database system, you have a tradeoff.  Adding more indexes makes queries (that use whatever columns are indexed) faster, but it makes updates slower because you have to update all those indexes (and write back the pages they&#8217;re stored on).  Tokutek&#8217;s basic idea is that now you&#8217;re free to put in as many indexes as you want (hence the reason it&#8217;s good for query-oriented uses), because the indexes do not slow down writes the way ordinary B-tree indexes would.</p>
<p>The &#8220;fractal tree&#8221; algorithm has been published.  It&#8217;s based on the &#8220;Cache Oblivious&#8221; concept; see Wikipedia, which in turn points you to a nice explanation by Prof. Erik Demaine of MIT.  The &#8220;fractal tree&#8221; index papers are pointed to from <a href="http://supertech.csail.mit.edu/cacheObliviousBTree.html" rel="nofollow">http://supertech.csail.mit.edu/cacheObliviousBTree.html</a>.  Note that the name &#8220;fractal tree&#8221; was introduced by Tokutek and is not used in these papers.  They use names like &#8220;cache-oblivious search trees&#8221;, just so you know that that&#8217;s the thing we&#8217;re talking about.  &#8220;Fractal Tree&#8221;, in my opinion, is a reasonable name from a technical point of view and better from a marketing point of view.  Naturally, Tokutek and the inventors have patents pending.  Bradley Kuszmaul of MIT, Michael Bender of Stony Brook U., and Martin Farach-Colton of Rutgers are the inventors.</p>
<p>It seems to me that doing ACID transactions will still incur costs for committing the index changes to disk, if you implement it the most obvious way.  Presumably they will implement it more cleverly than the obvious one.  This is not discussed in the academic papers.  I speculate that this is why they have not yet implemented ACID &#8212; it&#8217;s not as easy as just following the published methods.  But that&#8217;s just a speculation.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

