July 18, 2012

Clustrix 4.0 and other Clustrix stuff

It feels like time to write about Clustrix, which I last covered in detail in May 2010, and which is releasing Clustrix 4.0 today. Clustrix and Clustrix 4.0 basics include:

Clustrix makes a short-request processing appliance.
As you might guess from the name, Clustrix is clustered — peer-to-peer, with no head node.
The Clustrix appliance uses flash/solid-state storage.
Traditionally, Clustrix has run a MySQL-compatible DBMS.
Clustrix 4.0 introduces JSON support. More on that below.
Clustrix 4.0 introduces a bunch of administrative features, and parallel backup.
Also in today’s announcement is a Rackspace partnership to offer Clustrix remotely, at monthly pricing.
Clustrix has been shipping product for about 4 years.
Clustrix has 20 customers in production, running more than 125 Clustrix nodes in total.
Clustrix has 60 people.
List price for the smallest Clustrix system (3 nodes) is $150K. Highest-end maintenance costs 15%.
There’s also a $100K version meant for high availability/disaster recovery. Over half of Clustrix’s customers use off-site disaster recovery.
Clustrix is raising a C round. Part of it has already been raised from insiders, as a kind of bridge.
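To make the pricing figures above concrete, here is a rough first-year cost sketch. The list prices and the 15% rate come from the post; the assumptions are that the 15% is charged annually on list price, and that it may or may not also apply to the HA/DR system (hence the hypothetical `maintenance_on_dr` switch). Actual contract terms could differ.

```python
# Figures quoted in the post.
PRIMARY_LIST = 150_000   # smallest system, 3 nodes
DR_LIST = 100_000        # HA/DR version
MAINTENANCE_PCT = 15     # highest maintenance tier

def first_year_cost(with_dr: bool, maintenance_on_dr: bool = True) -> int:
    """Purchase price plus one year of top-tier maintenance.

    ASSUMPTION: maintenance is an annual charge on list price. Whether it
    also covers the DR system is unknown, so it is a parameter.
    """
    purchase = PRIMARY_LIST + (DR_LIST if with_dr else 0)
    maintained = PRIMARY_LIST + (DR_LIST if with_dr and maintenance_on_dr else 0)
    return purchase + maintained * MAINTENANCE_PCT // 100

if __name__ == "__main__":
    print(first_year_cost(with_dr=False))                          # 172500
    print(first_year_cost(with_dr=True))                           # 287500
    print(first_year_cost(with_dr=True, maintenance_on_dr=False))  # 272500
```

Under those assumptions, a minimal system with off-site DR lands somewhere between roughly $270K and $290K in the first year at list.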

The biggest Clustrix installation seems to be 20 nodes or so. Others seem to have 10+. I presume those disaster recovery customers have 6 or more nodes each. I’m not quite sure how the arithmetic on that all works; perhaps the 125ish count of nodes is a bit low.
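One way to run that arithmetic is to compute a floor on the node count. The sketch below uses the figures quoted above (20 customers, a 3-node minimum, 6 nodes for a DR customer) and adds assumptions of its own: "over half" is read as at least 11 DR customers, exactly two installations are taken to have 10 nodes, and the large installations are counted among the DR customers so as to keep the total as low as possible.

```python
CUSTOMERS = 20        # from the post
REPORTED_NODES = 125  # "running >125 Clustrix nodes total"
MIN_NODES = 3         # smallest system
MIN_DR_NODES = 6      # the post's presumption for DR customers

def min_total_nodes(big=(20, 10, 10), dr_customers=11):
    """Lower bound on total nodes.

    ASSUMPTIONS: `big` lists the large installations (one at 20, two at 10),
    all of which are also DR customers; every other DR customer has exactly
    6 nodes; everyone else has the 3-node minimum.
    """
    remaining_dr = dr_customers - len(big)
    non_dr = CUSTOMERS - dr_customers
    return sum(big) + remaining_dr * MIN_DR_NODES + non_dr * MIN_NODES

if __name__ == "__main__":
    floor = min_total_nodes()
    print(floor, REPORTED_NODES - floor)  # 115 10
```

Under these deliberately minimal assumptions the floor is 115 nodes, so a count a little above 125 is feasible, but only if nearly every customer sits at the smallest possible configuration.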

Clustrix technical notes include:

Clustrix is MVCC (Multi-Version Concurrency Control).
Clustrix exploits MVCC to allow online, lockless schema changes. Clustrix says these changes are typically single-column, for example an add or a widening/datatype change.
Clustrix indexes are a mix of b-trees and log-structured merge files.
Clustrix sounds like it’s paid attention to being multi-core. For example, DR replication is via parallel, multi-core log streaming, going single-core only when transactions have the potential to influence each other.
MySQL features Clustrix lacks include triggers and XML support.
Clustrix uses MLC flash.
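For readers who want the MVCC point spelled out, here is a toy sketch of why multi-versioning lets readers proceed without locks while a row's shape changes. It is a generic illustration, not a description of Clustrix internals; `MVCCStore`, `Version`, and the logical clock are all hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Version:
    value: dict
    created_at: int  # logical commit timestamp of the writer

@dataclass
class MVCCStore:
    clock: int = 0
    rows: dict = field(default_factory=dict)  # key -> list of Version, oldest first

    def write(self, key, value):
        """Append a new version rather than overwriting in place."""
        self.clock += 1
        self.rows.setdefault(key, []).append(Version(value, self.clock))
        return self.clock

    def snapshot(self):
        """A reader's snapshot is simply the current commit timestamp."""
        return self.clock

    def read(self, key, snapshot):
        """Return the newest version visible at the snapshot, with no locking."""
        visible = [v for v in self.rows.get(key, []) if v.created_at <= snapshot]
        return visible[-1].value if visible else None

if __name__ == "__main__":
    store = MVCCStore()
    store.write("u1", {"name": "Ann"})
    old_reader = store.snapshot()
    # A later writer rewrites the row with an added column.
    store.write("u1", {"name": "Ann", "email": None})
    print(store.read("u1", old_reader))        # still the one-column row
    print(store.read("u1", store.snapshot()))  # the widened row
```

The reader that started before the change keeps seeing the old row shape, which is the property an online schema change can build on.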

Clustrix doesn’t have compression, with the usual excuse of excessive CPU cost. When I pointed out that dictionary/token compression is cheap, Clustrix cofounder/CTO Sergei Tsarev suggested that it doesn’t make sense now due to high cardinalities in OLTP workloads, but could become more important as more analytic use cases emerge.
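The cardinality argument can be illustrated with a small dictionary-encoding sketch. This is a generic toy, not anything Clustrix ships; the 4-byte token width and the sample columns are assumptions chosen for illustration.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer token."""
    dictionary = {}
    tokens = []
    for v in values:
        # The first time a value is seen, it gets the next free token.
        tokens.append(dictionary.setdefault(v, len(dictionary)))
    return dictionary, tokens

def raw_size(values):
    """Bytes needed to store the strings as-is (ignoring per-value overhead)."""
    return sum(len(v.encode()) for v in values)

def encoded_size(values, token_bytes=4):
    """Bytes for the dictionary itself plus one fixed-width token per value."""
    dictionary, tokens = dictionary_encode(values)
    return sum(len(v.encode()) for v in dictionary) + token_bytes * len(tokens)

if __name__ == "__main__":
    low_card = ["shipped", "pending", "cancelled"] * 1000   # 3 distinct values
    high_card = [f"order-{i:06d}" for i in range(3000)]     # all distinct
    print(raw_size(low_card), encoded_size(low_card))       # 23000 12023
    print(raw_size(high_card), encoded_size(high_card))     # 36000 48000
```

With three distinct values the encoded column is about half the raw size; with every value distinct, the dictionary is as large as the data and the tokens are pure overhead, which is the shape of the objection above.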

Clustrix’ JSON story seems to be:

The JSON goes into a relational column.
Fields inside a JSON document can be indexed.
One can then reference those fields in SQL just as if they were relational columns, including in joins.
If you’re reckless when joining on multi-valued fields, trouble could in theory ensue.

That sounds a lot like other schemes for sticking documents into relational BLOBs/CLOBs (Binary/Character Large OBjects), although it happens to be the first time I’ve heard it in connection with JSON.
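The JSON points above can be sketched in plain Python. This is not Clustrix syntax, and the tables (`orders`, `tag_info`) and helper names are hypothetical; it only shows a JSON document held in an ordinary column, a secondary index built on one field inside it, and a join on that field. It also shows one plausible reading of the multi-valued trouble: a join on an array field repeats the parent row once per matching element.

```python
import json
from collections import defaultdict

# Hypothetical table: (id, doc), where doc is JSON text in an ordinary column.
orders = [
    (1, json.dumps({"customer": "acme", "tags": ["rush", "gift"]})),
    (2, json.dumps({"customer": "zenith", "tags": ["gift"]})),
]
# Hypothetical plain relational table to join against.
tag_info = [("rush", "ship today"), ("gift", "add wrapping")]

def build_index(rows, field):
    """Secondary index on one JSON field: field value -> list of row ids."""
    index = defaultdict(list)
    for row_id, doc in rows:
        value = json.loads(doc).get(field)
        if value is None:
            continue
        # A multi-valued field contributes one index entry per element.
        for v in (value if isinstance(value, list) else [value]):
            index[v].append(row_id)
    return index

def join_on_field(index, other):
    """Join other(key, payload) to the indexed rows on the JSON field."""
    return [(row_id, key, payload)
            for key, payload in other
            for row_id in index.get(key, [])]

if __name__ == "__main__":
    print(dict(build_index(orders, "customer")))
    for row in join_on_field(build_index(orders, "tags"), tag_info):
        print(row)  # order 1 appears twice, once per matching tag
```

Any count or sum computed over that join result would count order 1 twice unless the query accounts for the fan-out.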

Clustrix has one cool idea I haven’t heard from anybody else, which I’m calling index distribution. The idea is that each index can be distributed differently across the cluster (this includes the JSON secondary indexes), i.e. on different distribution keys. Clustrix thinks that paying special attention to index distribution and movement is helpful to the performance of distributed joins.
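A minimal sketch of the idea, under loud assumptions: three nodes, hash placement via `zlib.crc32` as a stand-in for whatever distribution function Clustrix really uses, and a made-up `users` table. The base table is placed by `id`, while the secondary index on `email` is placed by `email`, so a lookup on either key goes to exactly one node.

```python
import zlib

NODES = 3  # hypothetical cluster size

def node_for(key) -> int:
    """Deterministic hash placement; crc32 is only a stand-in."""
    return zlib.crc32(str(key).encode()) % NODES

# Made-up sample rows.
users = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "cy@example.com"},
    {"id": 4, "email": "dee@example.com"},
]

def distribute(rows, key_column):
    """Place one representation (base table or index) by its own key."""
    placement = {n: [] for n in range(NODES)}
    for row in rows:
        placement[node_for(row[key_column])].append(row)
    return placement

def lookup(placement, key_column, value):
    """A lookup on the distribution key needs to visit only one node."""
    node = node_for(value)
    return node, [r for r in placement[node] if r[key_column] == value]

if __name__ == "__main__":
    base = distribute(users, "id")         # base table, distributed on id
    by_email = distribute(users, "email")  # secondary index, distributed on email
    print(lookup(base, "id", 3))
    print(lookup(by_email, "email", "cy@example.com"))
```

If the index were instead co-located with the base rows, an `email` lookup would have to ask every node, which is the cost that per-index distribution aims to avoid for lookups and for the probe side of distributed joins.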

I still wish Clustrix were available on a software-only/bring your own hardware/bring your own cloud basis. Absent that, pricing and lock-in are concerns. True, I didn’t immediately see any flaws in Clustrix’ claims that its Rackspace offering was at once cheaper and more performant than MySQL on Amazon; but then, Amazon isn’t always that cost-effective an option. Price aside, Clustrix does sound as if it’s one of a number of appealing NewSQL options, and probably even one of the (relatively speaking) more proven ones.

Categories: Cloud computing, Clustering, Clustrix, Database compression, Market share and customer counts, MySQL, OLTP, Pricing, Structured documents

Comments

4 Responses to “Clustrix 4.0 and other Clustrix stuff”

Clustrix 4.0 Review – Leigh Anne Varney's Blog on August 10th, 2012 12:23 pm

[…] Read the article in DBMS2 […]

NewSQL thoughts | DBMS 2 : DataBase Management System Services on January 6th, 2013 6:54 am

[…] vendors I’ve written about in the past include Akiban, Tokutek, CodeFutures (dbShards), Clustrix, Schooner (Membrain), VoltDB, ScaleBase, and ScaleDB, with GenieDB and NuoDB coming […]

A data distribution idea at Vertica and Clustrix | DBMS 2 : DataBase Management System Services on May 3rd, 2013 3:37 pm

[…] Yesterday I wrote: Clustrix has one cool idea I haven’t heard from anybody else, which I’m calling index distribution. The idea is that each index can be distributed differently across the cluster … i.e. on different distribution keys. Clustrix thinks that paying special attention to index distribution and movement is helpful to the performance of distributed joins. […]

Christine Lieu on November 22nd, 2013 1:17 pm

“I still wish Clustrix were available on a software-only/bring your own hardware/bring your own cloud basis”

Your wish has come true! http://www.clustrix.com/get-software
