August 31, 2013

Tokutek’s interesting indexing strategy

The general Tokutek strategy has always been:

But the details of “writes indexes efficiently” have been hard to nail down. For example, my post about Tokutek indexing last January, while not really mistaken, is drastically incomplete.

Adding further confusion is that Tokutek now has two product lines:

TokuMX further adds language support for transactions and a rewrite of MongoDB’s replication code.

So let’s try again. I had a couple of conversations with Martin Farach-Colton, who:

The core ideas of Tokutek’s architecture start:

A central concept is the interplay between the buffers and the write load.

Early on Tokutek made the natural choice to flush buffers when they were touched by a query, but now buffers are just flushed when the total buffer pool runs out of space, fullest buffer first.
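To make the current flushing policy concrete, here is a minimal sketch in Python. It is not Tokutek's code: the `BufferPool` and `Node` classes, the capacity measured in message counts, and the tie-breaking behavior are all assumptions made for illustration. The only part taken from the description above is the rule itself: nothing is flushed until the pool as a whole is over its limit, and then the fullest buffer goes first.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node with a buffer of pending messages (hypothetical model)."""
    name: str
    buffer: list = field(default_factory=list)


class BufferPool:
    """Toy buffer pool: flush only under memory pressure, fullest buffer first."""

    def __init__(self, capacity):
        # Assumption: capacity is counted in messages, not bytes.
        self.capacity = capacity
        self.nodes = {}
        self.flushed = []  # (node name, messages flushed), in flush order

    def total(self):
        return sum(len(n.buffer) for n in self.nodes.values())

    def add_message(self, node_name, message):
        node = self.nodes.setdefault(node_name, Node(node_name))
        node.buffer.append(message)
        # Queries never trigger a flush here; only pool-wide pressure does.
        while self.total() > self.capacity:
            self.flush_fullest()

    def flush_fullest(self):
        victim = max(self.nodes.values(), key=lambda n: len(n.buffer))
        self.flushed.append((victim.name, len(victim.buffer)))
        # A real system would push these messages down to child nodes;
        # this sketch simply discards them.
        victim.buffer.clear()
```

In this model a buffer that receives few writes can sit unflushed indefinitely, which is the point of the policy: flushing work is spent where the most messages move per flush.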

This all raises the question: what is a “message”? It turns out that there are a lot of possibilities. Four of the main ones are:

Since messages are a big part of what’s stored at a node, and they can have a variety of formats, columnar compression would be hard to implement. Instead, Tokutek offers a variety of standard block-level approaches.
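A rough sketch of what block-level compression means here, in Python. The message shapes and the JSON serialization are hypothetical and are not Tokutek's on-disk format; `zlib` simply stands in for whichever standard codec is configured. The point is that the compressor treats the whole block as opaque bytes, so it does not care that the messages inside have different formats.

```python
import json
import zlib


def pack_block(messages):
    """Serialize a node's mixed messages into one block and compress it whole."""
    raw = json.dumps(messages, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 6)


def unpack_block(blob):
    """Inverse of pack_block: decompress the block, then deserialize."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical message shapes, invented for this example only.
example_block = [
    {"op": "insert", "key": 17, "row": {"name": "alice", "qty": 3}},
    {"op": "delete", "key": 42},
    {"op": "update", "key": 17, "set": {"qty": 4}},
]
```

A columnar scheme would need every entry in the block to share one layout, which is exactly what a buffer full of mixed messages does not provide.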

A natural question to ask about OLTP (OnLine Transaction Processing) and other short-request DBMS is “When are there locks and latches?” Four cases came up:

I forgot to ask whether the locks at buffer flushing time cause performance hiccups.

Other notes include:

And finally — Tokutek has been slow to offer MySQL scale-out, but with the MongoDB version, scale-out is indeed happening. One would think that data could just be distributed among nodes in one of the usual ways, with all the indexes pertaining to that data stored at the same node as the data itself. So far as I can tell, that’s pretty close to being exactly what happens.
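For readers who want the “usual way” spelled out, here is a minimal Python sketch of data spread across nodes with each node's indexes kept next to its own data. Everything in it is an assumption for illustration: the `Shard` and `Cluster` classes, the CRC32 hash routing, and the `city` field are invented, and none of it is taken from TokuMX or MongoDB.

```python
import zlib
from collections import defaultdict


class Shard:
    """One node: holds its rows plus a secondary index over those rows only."""

    def __init__(self):
        self.rows = {}                    # primary key -> row
        self.by_city = defaultdict(set)   # local secondary index: city -> keys

    def put(self, pk, row):
        old = self.rows.get(pk)
        if old is not None:
            self.by_city[old["city"]].discard(pk)
        self.rows[pk] = row
        self.by_city[row["city"]].add(pk)

    def find_by_city(self, city):
        return [self.rows[pk] for pk in sorted(self.by_city.get(city, ()))]


class Cluster:
    """Routes by primary key; secondary-key lookups must ask every shard."""

    def __init__(self, n_shards):
        self.shards = [Shard() for _ in range(n_shards)]

    def shard_for(self, pk):
        # Assumption: hash routing with CRC32; range partitioning works too.
        return self.shards[zlib.crc32(pk.encode("utf-8")) % len(self.shards)]

    def put(self, pk, row):
        self.shard_for(pk).put(pk, row)

    def get(self, pk):
        return self.shard_for(pk).rows.get(pk)

    def find_by_city(self, city):
        out = []
        for shard in self.shards:
            out.extend(shard.find_by_city(city))
        return out
```

The tradeoff in this layout is the familiar one: a write touches a single node, because the row and its index entries live together, while a lookup by secondary key fans out to all nodes.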


4 Responses to “Tokutek’s interesting indexing strategy”

  1. BohuTANG on September 1st, 2013 12:33 am

    ‘I forgot to ask whether the locks at buffer flushing time cause performance hiccups.’
    –no, or it’s less.
    because the flushing locking-granularity is very small, and they are all in background threads.
    But the sharp checkpoint may cause performance hiccups, that’s a tradeoff, we don’t need an idempotent operation when recovering.

    ‘It seems that the branching factor is in line with a Bε-tree’
    — branching fanout is between 4 and 16, much lower than the b-tree, this is a result of large block.


  2. Layering of database technology & DBMS with multiple DMLs | DBMS 2 : DataBase Management System Services on September 8th, 2013 4:52 am

    […] TokuMX, the Tokutek/MongoDB hybrid I just blogged about. […]

  3. Thoughts and notes, Thanksgiving weekend 2014 | DBMS 2 : DataBase Management System Services on November 30th, 2014 8:49 pm

    […] ship with two storage engines – the traditional one and a new one from WiredTiger (but not TokuMX). Both will be equally supported by MongoDB, the company, although there surely are some tiers of […]

  4. MongoDB 3.0 | DBMS 2 : DataBase Management System Services on September 8th, 2015 8:05 pm

    […] of any other storage engines using this architecture at this time. In particular, last I heard TokuMX was not an example. (Edit: Actually, see Tim Callaghan’s comment […]
