Alternate title: TokuDB updates
Tokutek turns a performance argument into a functionality one. In particular, Tokutek claims that TokuDB does a much better job than alternatives of making it practical for you to update indexes at OLTP speeds. Hence, it claims to do a much better job than alternatives of making it practical for you to write and execute queries that only make sense when indexes (or other analytic performance boosts) are in place.
That’s all been true since I first wrote about Tokutek and TokuDB in 2009. However, TokuDB’s technical details have changed. In particular, Tokutek has deemphasized the ideas that:
- Vaguely justified the “fractal” metaphor, namely …
- … the stuff in that post about having one block each sized for each power of 2, …
- … which seem to be a form of what is more ordinarily called “cache-oblivious” technology.
Rather, Tokutek’s new focus for getting the same benefits is to provide a separate buffer for each node of a B-tree. In essence, Tokutek is taking the usual “big blocks are better” story and extending it to indexes. TokuDB also uses block-level compression. Notes on that include:
- It’s LZMA.
- It’s expensive to write, cheap to read.
- 5X compression is common, 9X happens, and higher figures yet happen in a few edge cases.
- LZMA detects and compresses repeated values, so it has some of the benefits of tokenization.
- However, TokuDB has to decompress data before operating on it.
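To make the compression notes above concrete, here is a minimal sketch using Python's standard `lzma` module. It is not TokuDB code; the block contents, row layout, and variable names are assumptions made purely for illustration. It shows a block with many repeated values being compressed on write, and then decompressed on read before any filtering can happen.

```python
import lzma

# Hypothetical "block" of rows with heavily repeated column values.
rows = [f"{i},{'US' if i % 3 else 'CA'},active\n" for i in range(5000)]
block = "".join(rows).encode("utf-8")

# Write path: compression is the expensive step.
compressed = lzma.compress(block)
ratio = len(block) / len(compressed)

# Read path: decompression is comparatively cheap.
restored = lzma.decompress(compressed)

# The block has to be decompressed before it can be operated on;
# here we count rows whose second column is "CA".
active_ca = sum(
    1
    for line in restored.decode("utf-8").splitlines()
    if line.split(",")[1] == "CA"
)

print(f"raw={len(block)} bytes, compressed={len(compressed)} bytes, "
      f"ratio={ratio:.1f}x, CA rows={active_ca}")
```

The ratio on synthetic data like this says nothing about what real workloads will see; the point is only the shape of the write and read paths.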
Somewhat like NuoDB, Tokutek talks in terms of sending messages to blocks. The TokuDB durability story involves streaming messages to disk and also checkpointing all dirty blocks to disk every minute or so. Further, TokuDB has an online schema change approach based on broadcasting messages about various column operations (delete, add with default value, etc.).
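The per-node buffers and message passing described above can be sketched roughly as follows. This is a toy two-level tree under stated assumptions, not Tokutek's implementation: the `Node` and `Leaf` classes, the message tuples, and the tiny `BUFFER_LIMIT` are all hypothetical. Keyed messages (insert, delete) are routed to one child when the buffer flushes, while a schema-change message is broadcast to every child.

```python
from dataclasses import dataclass, field

BUFFER_LIMIT = 4  # hypothetical; chosen small so the flush is easy to see


@dataclass
class Leaf:
    rows: dict = field(default_factory=dict)  # key -> row (dict of columns)

    def apply(self, msg):
        kind = msg[0]
        if kind == "insert":
            _, key, row = msg
            self.rows[key] = dict(row)
        elif kind == "delete":
            self.rows.pop(msg[1], None)
        elif kind == "add_column":
            _, name, default = msg
            for row in self.rows.values():
                row.setdefault(name, default)


@dataclass
class Node:
    pivot: int   # keys below the pivot go left, all others go right
    left: Leaf
    right: Leaf
    buffer: list = field(default_factory=list)

    def send(self, msg):
        """Queue a message at this node; flush only when the buffer fills."""
        self.buffer.append(msg)
        if len(self.buffer) >= BUFFER_LIMIT:
            self.flush()

    def flush(self):
        """Push buffered messages down to the children in arrival order."""
        for msg in self.buffer:
            if msg[0] == "add_column":
                targets = (self.left, self.right)  # broadcast message
            else:
                child = self.left if msg[1] < self.pivot else self.right
                targets = (child,)                 # keyed message
            for target in targets:
                target.apply(msg)
        self.buffer.clear()


root = Node(pivot=100, left=Leaf(), right=Leaf())
root.send(("insert", 5, {"name": "a"}))
root.send(("insert", 150, {"name": "b"}))
root.send(("add_column", "status", "new"))
pending = len(root.buffer)   # messages are still buffered at the root
root.send(("delete", 5))     # fourth message fills the buffer and flushes
```

The design point is that several updates reach the leaves in one batch rather than one write each. A real system would also have to consult buffered messages on reads and persist them for durability, which this sketch omits.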
- Like most other RDBMS vendors I talk with, Tokutek goes for MVCC (Multi-Version Concurrency Control), if for no other reason than to obviate a need for read locks.
- TokuDB doesn’t have much in the way of a scale-out story. But as with any other NewSQL vendor of whom that’s true (Akiban, for example), expect that to change. And even if it doesn’t, one could use TokuDB in conjunction with a transparent sharding tool such as dbShards.
- For more technical detail, Tokutek offers a web page with several detailed slide decks and so on.
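As a generic illustration of the MVCC point in the list above (why keeping multiple versions removes the need for read locks), here is a minimal sketch. It makes no claim about how TokuDB implements MVCC; the `MVCCStore` class and its methods are hypothetical names. A reader takes a snapshot timestamp and sees only versions committed at or before it, so later writers never block it.

```python
import itertools


class MVCCStore:
    """Toy multi-version store: each write appends a timestamped version."""

    def __init__(self):
        self._versions = {}               # key -> list of (commit_ts, value)
        self._clock = itertools.count(1)  # monotonically increasing timestamps
        self._last_commit = 0

    def write(self, key, value):
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def snapshot(self):
        """Return a timestamp that fixes what a reader is allowed to see."""
        return self._last_commit

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = [
            value
            for ts, value in self._versions.get(key, [])
            if ts <= snapshot_ts
        ]
        return visible[-1] if visible else None


store = MVCCStore()
store.write("balance", 100)
snap = store.snapshot()          # reader starts here
store.write("balance", 250)      # a later writer does not disturb the reader
old_view = store.read("balance", snap)
new_view = store.read("balance", store.snapshot())
```

Here `old_view` stays at the value the reader's snapshot saw, while `new_view` reflects the later write. Garbage collection of old versions and transaction aborts are left out.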
And finally, Tokutek company basics include:
- 15-16 employees.
- A few more paying customers than those logoed on its website.
- Free customers beyond that. (TokuDB is free under 50 GB.)
- Notwithstanding the meaninglessness of the phrase, “Fractal Tree indexing” is Tokutek’s story and it’s sticking to it.