Clustrix – DBMS 2 : DataBase Management System Services

MemSQL update

Curt Monash — Fri, 02 May 2014 03:40:39 +0000

I stopped by MemSQL last week, and got a range of new or clarified information. For starters:

Even though MemSQL (the product) was originally designed for OLTP (OnLine Transaction Processing), MemSQL (the company) is now focused on analytic use cases …
… which was the point of introducing MemSQL’s flash-based columnar option.
One MemSQL customer has a 100 TB “data warehouse” installation on Amazon.
Another has “dozens” of terabytes of data spread across 500 machines, which aggregate 36 TB of RAM.
At customer Shutterstock, 1000s of non-MemSQL nodes are monitored by 4 MemSQL machines.
A couple of MemSQL’s top references are also Vertica flagship customers; one of course is Zynga.
MemSQL reports encountering Clustrix and VoltDB in a few competitive situations, but not NuoDB. MemSQL believes that VoltDB is still hampered by its traditional issues — Java, reliance on stored procedures, etc.

On the more technical side:

Some MemSQL users are running 7- or 8-way joins and other long-ish SQL statements.
But MemSQL doesn’t yet have fully peer-to-peer data redistribution.
- MemSQL “leaves” only talk to MemSQL “aggregator nodes,” not each other …
- … but note the plural on “aggregator nodes”, which should immunize MemSQL from the worst of “fat head” bottlenecks.
- Of course, you can sometimes get join locality by sharding multiple tables on the same key …
- … or by broadcast-replicating tables that are sufficiently small.
Better SQL coverage — e.g. SQL Windowing — is coming soon.
MemSQL believes it has an aggressive data skipping story.
MemSQL doesn’t yet have a true workload management story; they’re still at the stage “Our queries run so fast not many of them have to be active at once, and if things nevertheless get too busy we have some throttling capabilities.” But MemSQL at least sounds aware of the difference between that and true workload management, which puts them ahead of some other vendors I talk with.
MemSQL doesn’t have stored procedures. In particular, since MemSQL (the product) generates code on the fly, MemSQL (the company) doesn’t think the performance benefits of stored procedure pre-compilation are needed.
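
The two join-locality points above (sharding several tables on the same key, and broadcast-replicating small ones) can be illustrated with a toy sketch. This is not MemSQL's implementation; the leaf count, the CRC-based hash, and the names `shard_for`, `distribute` and `local_join` are all assumptions made for the example.

```python
from collections import defaultdict
from zlib import crc32

NUM_LEAVES = 4  # hypothetical cluster size


def shard_for(key, num_leaves=NUM_LEAVES):
    """Map a shard-key value to a leaf with a stable hash."""
    return crc32(str(key).encode("utf-8")) % num_leaves


def distribute(rows, key_column, num_leaves=NUM_LEAVES):
    """Place each row on the leaf chosen by its shard key."""
    leaves = defaultdict(list)
    for row in rows:
        leaves[shard_for(row[key_column], num_leaves)].append(row)
    return leaves


def local_join(left_rows, right_rows, key_column):
    """Hash join executed entirely on one leaf, with no data movement."""
    index = defaultdict(list)
    for row in right_rows:
        index[row[key_column]].append(row)
    return [{**left, **right}
            for left in left_rows
            for right in index.get(left[key_column], [])]


users = [{"user_id": i, "name": f"user{i}"} for i in range(8)]
orders = [{"user_id": i % 8, "order_id": 100 + i} for i in range(20)]
countries = [{"country": "US"}, {"country": "DE"}]  # small dimension table

# Both tables are sharded on user_id, so matching rows share a leaf.
user_leaves = distribute(users, "user_id")
order_leaves = distribute(orders, "user_id")

joined = []
for leaf in range(NUM_LEAVES):
    joined.extend(local_join(order_leaves[leaf], user_leaves[leaf], "user_id"))

# A sufficiently small table is simply copied to every leaf instead.
broadcast = {leaf: list(countries) for leaf in range(NUM_LEAVES)}
```

Because every order lands on the same leaf as its user, each leaf can finish its share of the join without talking to any other leaf, which is the situation where the leaf-to-aggregator-only topology costs nothing.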

And finally, MemSQL’s column-store compression story — which I mangled in a previous post — goes like this:

There are numerous compression algorithm choices, both columnar (e.g. dictionary/tokenization, run-length encoding) and block (Lempel-Ziv, I presume in multiple variations).
Compression is block-by-block, something I hear more commonly these days than Vertica’s alternative of global compression choices.
The choice of compression scheme is automagic for each block, unless you give explicit hints.
Default block size for the columnar store is 10 million rows.
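
As a rough illustration of block-by-block scheme selection with optional hints, here is a minimal sketch. Nothing in it reflects MemSQL's actual code: the JSON serialization, the "smallest payload wins" rule, and the names `choose_scheme` and `compress_column` are assumptions, and zlib stands in for whichever Lempel-Ziv variants are really used.

```python
import json
import zlib
from itertools import groupby


def encode_rle(values):
    """Run-length encoding: store [value, run_length] pairs."""
    runs = [[value, len(list(group))] for value, group in groupby(values)]
    return json.dumps(runs).encode("utf-8")


def encode_dictionary(values):
    """Dictionary/token encoding: distinct values plus integer codes."""
    dictionary = sorted(set(values))
    codes = {value: i for i, value in enumerate(dictionary)}
    return json.dumps([dictionary, [codes[v] for v in values]]).encode("utf-8")


def encode_block_lz(values):
    """Generic block compression (zlib's DEFLATE, an LZ77 derivative)."""
    return zlib.compress(json.dumps(values).encode("utf-8"))


ENCODERS = {"rle": encode_rle, "dictionary": encode_dictionary, "lz": encode_block_lz}


def choose_scheme(values, hint=None):
    """Return (scheme, payload): the hinted scheme, else the smallest payload."""
    if hint is not None:
        return hint, ENCODERS[hint](values)
    candidates = {name: encode(values) for name, encode in ENCODERS.items()}
    best = min(candidates, key=lambda name: len(candidates[name]))
    return best, candidates[best]


def compress_column(values, block_size, hint=None):
    """Compress a column block by block, choosing a scheme for each block."""
    return [choose_scheme(values[i:i + block_size], hint)
            for i in range(0, len(values), block_size)]
```

Calling `compress_column(column, block_size=1000)` lets each block pick its own scheme, while `compress_column(column, 1000, hint="rle")` forces one. The point of the per-block choice is that a column whose character changes partway through (long runs early, high cardinality later) is not stuck with a single global decision.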

Comments on the 2013 Gartner Magic Quadrant for Operational Database Management Systems

Curt Monash — Fri, 08 Nov 2013 16:46:46 +0000

The 2013 Gartner Magic Quadrant for Operational Database Management Systems is out. “Operational” seems to be Gartner’s term for what I call short-request, in each case the point being that OLTP (OnLine Transaction Processing) is a dubious term when systems omit strict consistency, and when even strictly consistent systems may lack full transactional semantics. As is usually the case with Gartner Magic Quadrants:

I admire the raw research.
The opinions contained are generally reasonable (especially since Merv Adrian joined the Gartner team).
Some of the details are questionable.
There’s generally an excessive focus on Gartner’s perception of vendors’ business skills, and on vendors’ willingness to parrot all the buzzphrases Gartner wants to hear.
The trends Gartner highlights are similar to those I see, although our emphasis may be different, and they may leave some important ones out. (Big omission — support for lightweight analytics integrated into operational applications, one of the more genuine forms of real-time analytics.)

Anyhow:

The 2013 Gartner Magic Quadrant for Operational Database Management Systems puts Oracle in the lead, closely followed in some order by Microsoft, SAP, and IBM, with everybody else way behind. That’s reasonable, harkening back to the time when Oracle, IBM, Microsoft and to some extent Sybase were seemingly secure oligopolists, and most of the other vendors mentioned didn’t yet exist.
Gartner seems to view a proprietary appliance strategy as good for customers, without mentioning that it’s also a way to sell hardware at ridiculous prices.
Gartner evidently likes memory-centric positioning. SAP, Aerospike, VoltDB and McObject all get surprisingly high marks.
Gartner gives Intersystems pretty high marks, while Progress Software isn’t even mentioned. Despite Progress’ recent restructuring, I’d think the core Progress OpenEdge business — arguably Intersystems’ closest rival — deserves more respect than that. (But given how rarely I write about it myself, perhaps I shouldn’t criticize.)
Gartner has long been oddly positive on Actian, which is a floundering hodgepodge of half a dozen database also-rans. I like Mike Hoskins a lot too, but just how much has Actian’s supposedly “energized” “strong leadership” accomplished in the recent past, at Actian or elsewhere?
Gartner has brutally low “vision” rankings for NuoDB and Clustrix. I think scaling out SQL effectively is more impressive than that. Gartner also omits to mention Clustrix’s past as an appliance vendor.
Gartner refers to Oracle’s multi-tenancy support as if … well, as if it supported multi-tenancy.
I don’t understand Gartner’s rankings of or comments about NoSQL vendors. For example:
- Three “strengths” are mentioned for MongoDB, yet none reference MongoDB’s developer outreach, which may be second only to prime Microsoft’s.
- HBase is discussed as if the Hadoop vendors were still pushing it hard, or as if it were showing up in a lot of enterprise evaluations.
- Geo-distribution is mentioned as a strength for Riak, yet not for Cassandra.
Every Gartner Magic Quadrant (or Forrester Wave) features one or more outright brain cramps. In this one:
- Gartner writes “the Clustrix database offers no support for data types beyond traditional relational types,” when in fact Clustrix was one of the early indicators of a trend toward relational DBMS JSON support.
- Gartner suggests that EnterpriseDB’s Oracle compatibility is something new, when it was actually the company’s whole strategy 6-7 years ago.

Finally, since I’ve struggled with the definition of “DBMS”, I’ll finish by quoting with approval the start of Gartner’s:

We define a DBMS as a complete software system used to define, create, manage, update and query a database.

Related links

Comments on the most recent Gartner Magic Quadrant for Data Warehouse Database Management Systems
My definition of operational analytics

Introduction to Deep Information Sciences and DeepDB

Curt Monash — Sun, 14 Apr 2013 04:33:17 +0000

I talked Friday with Deep Information Sciences, makers of DeepDB. Much like TokuDB — albeit with different technical strategies — DeepDB is a single-server DBMS in the form of a MySQL engine, whose technology is concentrated around writing indexes quickly. That said:

DeepDB’s indexes can help you with analytic queries; hence, DeepDB is marketed as supporting OLTP (OnLine Transaction Processing) and analytics in the same system.
DeepDB is marketed as “designed for big data and the cloud”, with reference to “Volume, Velocity, and Variety”. What I could discern in support of that is mainly:
- DeepDB has been tested at up to 3 terabytes at customer sites and up to 1 billion rows internally.
- Like most other NewSQL and NoSQL DBMS, DeepDB is append-only, and hence could be said to “stream” data to disk.
- DeepDB’s indexes could at some point in the future be made to work well with non-tabular data.*
- The Deep guys have plans and designs for scale-out — transparent sharding and so on.

*For reasons that do not seem closely related to product reality, DeepDB is marketed as if it supports “unstructured” data today.

Other NewSQL DBMS seem “designed for big data and the cloud” to at least the same extent DeepDB is. However, if we’re interpreting “big data” to include multi-structured data support — well, only half or so of the NewSQL products and companies I know of share Deep’s interest in branching out. In particular:

Akiban definitely does. (Note: Stay tuned for some next-steps company news about Akiban.)
Tokutek has planted a small stake there too.
Key-value-store-backed NuoDB and GenieDB probably lean that way. (And SanDisk evidently shut down Schooner’s RDBMS while keeping its key-value store.)
VoltDB, Clustrix, ScaleDB and MemSQL seem more strictly tabular, except insofar as text search is a requirement for everybody. (Edit: Oops; I forgot about Clustrix’s approach to JSON support.)

Edit: MySQL has some sort of an optional NoSQL interface, and hence so presumably do MySQL-compatible TokuDB, GenieDB, Clustrix, and MemSQL.

Also, some of those products do not today have the transparent scale-out that Deep plans to offer in the future.

Among the 10 people listed as part of Deep Information Sciences’ team, I noticed 2 who arguably had DBMS industry experience, in that they worked at virtualization vendor Virtual Iron, and stayed on for a while after Virtual Iron was bought by Oracle. One of them, Chief Scientist & Architect Tom Hazel, also was at Akiban for a few months, where he did actually work on a DBMS. Other Deep Information Sciences notes include:

Deep has 25 or so people in all.
Deep had a recent $10 million funding round.
Deep Information Sciences is the former Cloudtree, which as of February, 2011 was pursuing quite a different strategy. (Evidently there was a pivot.) Deep was founded in 2010.
There are 2 paying customers for DeepDB, even though it’s still in beta, and 8 trials. A similar number of trials and strategic partners are queued up.
DeepDB general availability is expected later this quarter.

Although our call was blessedly technical, we didn’t have a chance to go through the DeepDB architecture in great detail. That said, DeepDB seems to store data in all of 3 ways:

An in-memory row store.
An on-disk row store with a very different architecture.
Indexes, which can also serve as a column store.

Notes on that include:

DeepDB’s in-memory row store is designed to manage single rows as much as possible, rather than pages. Indeed, there are “aspects of tries”, although we didn’t drill down into what exactly that meant.
Indexes are streamed to disk no less than once every 15 seconds, by default, and perhaps with latency as low as 10 milliseconds.
Perhaps the most important point I didn’t grasp is “segments”. The data and indexes on disk are stored in segments, which can be of different sizes, and which may each carry some summary data/metadata/whatever. Somehow, this is central to DeepDB’s design.
In what is evidently a design focus, DeepDB tries to get the benefit of “in-memory data” that isn’t actually taking up RAM. B-trees can point at rows that aren’t actually in memory. Segments evicted from cache can leave some metadata or summary data behind.
DeepDB’s compression story seems to be a work in progress.
- There’s prefix compression already, at least in the indexes, which Deep just calls “compaction”.
- Other compression is working in the lab, but not scheduled for Version 1.0.
  - Block compression seems to be in play.
  - Delta compression was mentioned once.
  - Dictionary compression wasn’t mentioned at all.
- DeepDB apparently will keep compressed data in cache, then decompress it to operate on it.
- Different segments can be compressed/uncompressed differently.
DeepDB’s on-disk row store is append-only. Time-travel is being worked on. While I forgot to ask, it seems likely that DeepDB has MVCC (Multi-Version Concurrency Control).
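
For readers unfamiliar with prefix compression of index keys, the generic technique looks roughly like the sketch below. Deep did not describe its encoding on the call, so this is the textbook form rather than DeepDB's "compaction"; the function names and sample keys are made up for illustration.

```python
def common_prefix_len(a, b):
    """Length of the longest shared leading substring of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_compress(sorted_keys):
    """Encode each key as (shared_prefix_length, remaining_suffix)."""
    encoded, previous = [], ""
    for key in sorted_keys:
        shared = common_prefix_len(previous, key)
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def prefix_decompress(encoded):
    """Rebuild the keys by reusing the prefix of the previous key."""
    keys, previous = [], ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


keys = sorted(["customer:1001", "customer:1002", "customer:1017", "order:77"])
packed = prefix_compress(keys)
```

The scheme works well on indexes precisely because index entries are kept in sorted order, so neighboring keys tend to share long prefixes.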

And finally: DeepDB in its current form is a “drop-in” InnoDB replacement, but not necessarily bug-compatible.

NewSQL thoughts

Curt Monash — Sat, 05 Jan 2013 18:04:08 +0000

I plan to write about several NewSQL vendors soon, but first here’s an overview post. Like “NoSQL”, the term “NewSQL” has an identifiable, recent coiner — Matt Aslett in 2011 — yet a somewhat fluid meaning. Wikipedia suggests that NewSQL comprises three things:

OLTP- (OnLine Transaction Processing)/short-request-oriented SQL DBMS that are newer than MySQL.
Innovative MySQL engines.
Transparent sharding systems that can be used with, for example, MySQL.

I think that’s a pretty good working definition, and will likely remain one unless or until:

SQL-oriented and NoSQL-oriented systems blur indistinguishably.
MySQL (or PostgreSQL) laps the field with innovative features.

To date, NewSQL adoption has been limited.

NewSQL vendors I’ve written about in the past include Akiban, Tokutek, CodeFutures (dbShards), Clustrix, Schooner (Membrain), VoltDB, ScaleBase, and ScaleDB, with GenieDB and NuoDB coming soon.
But I’m dubious whether, even taken together, all those vendors have as many customers or production references as any of 10gen, Couchbase, DataStax, or Cloudant.*

That said, the problem may lie more on the supply side than in demand. Developing a competitive SQL DBMS turns out to be harder than developing something in the NoSQL state of the art.

*Revenue might be a different matter.

The main reasons for NewSQL adoption tend to fall in the areas of performance, scaling, manageability and cost. But while they all support SQL, some NewSQL DBMS have differentiated programming models even so.

Akiban wants you to consider mixing access — to the same data in the same data structures — among SQL, JSON and, say, Hibernate.
Tokutek turns a performance argument into a functionality one. In particular, Tokutek claims that TokuDB does a much better job than alternatives of making it practical for you to update indexes at OLTP speeds. Hence, it claims to do a much better job than alternatives of making it practical for you to write and execute queries that only make sense when indexes (or other analytic performance boosts) are in place.
As a trade-off for blazing in-memory performance, VoltDB is hampered by an innovative and restrictive programming model.

Also, the MySQL add-ons and lookalikes vary in the (in)completeness of their MySQL emulation or support.

The most common performance/scaling NewSQL claims are simply “We scale, giving you the power of multiple servers, with sufficiently little downside in the way of tradeoffs.” That story is central to Clustrix, VoltDB, ScaleDB, NuoDB, and to anybody active in transparent sharding. Other performance/scaling claims include but are not limited to:

Optimized for RAM (VoltDB).
Optimized for flash (Schooner/Membrain).
Writes indexes quickly (TokuDB).
Fast joins (Akiban).

Management claims include (from multiple NewSQL vendors in each case):

Little added management pain, but you get scale-out!
Little added management pain, but you get active-active/multi-master wide-area replication!
Online schema change and other uninterrupted operation features.
Not as cumbersome as Oracle.

And that’s about as much as I’m ready to generalize about the NewSQL sector. Posts about particular products and companies are on the way.

A data distribution idea at Vertica and Clustrix

Curt Monash — Thu, 19 Jul 2012 21:44:23 +0000

Yesterday I wrote:

Clustrix has one cool idea I haven’t heard from anybody else, which I’m calling index distribution. The idea is that each index can be distributed differently across the cluster … i.e. on different distribution keys. Clustrix thinks that paying special attention to index distribution and movement is helpful to the performance of distributed joins.

While that’s true, I thought I’d heard something similar from Vertica; so I checked, and indeed I had. Vertica famously lets you store columns in different sort orders, in both reasonable senses:

Different columns in a table can be sorted in different ways.
A single column, which is stored multiple times for usual reasons of replication safety, can be sorted differently in its different copies.

It turns out those columns can also be distributed on different keys as well.

Related link

Vertica projections explained at length (September, 2011)

Clustrix 4.0 and other Clustrix stuff

Curt Monash — Wed, 18 Jul 2012 04:01:08 +0000

It feels like time to write about Clustrix, which I last covered in detail in May, 2010, and which is releasing Clustrix 4.0 today. Clustrix and Clustrix 4.0 basics include:

Clustrix makes a short-request processing appliance.
As you might guess from the name, Clustrix is clustered — peer-to-peer, with no head node.
The Clustrix appliance uses flash/solid-state storage.
Traditionally, Clustrix has run a MySQL-compatible DBMS.
Clustrix 4.0 introduces JSON support. More on that below.
Clustrix 4.0 introduces a bunch of administrative features, and parallel backup.
Also in today’s announcement is a Rackspace partnership to offer Clustrix remotely, at monthly pricing.
Clustrix has been shipping product for about 4 years.
Clustrix has 20 customers in production, running >125 Clustrix nodes total.
Clustrix has 60 people.
List price for a (smallest size) Clustrix system is $150K for 3 nodes. Highest-end maintenance costs 15%.
There’s also a $100K version meant for high availability/disaster recovery. Over half of Clustrix’s customers use off-site disaster recovery.
Clustrix is raising a C round. Part of it has already been raised from insiders, as a kind of bridge.

The biggest Clustrix installation seems to be 20 nodes or so. Others seem to have 10+. I presume those disaster recovery customers have 6 or more nodes each. I’m not quite sure how the arithmetic on that all works; perhaps the 125ish count of nodes is a bit low.

Clustrix technical notes include:

Clustrix is MVCC (Multi-Version Concurrency Control).
Clustrix exploits MVCC to allow online, lockless schema changes. Clustrix says these changes are typically single-column, for example an add or a widening/datatype change.
Clustrix indexes are a mix of b-trees and log-structured merge files.
Clustrix sounds like it’s paid attention to being multi-core. For example, DR replication is via parallel, multi-core log streaming, going single-core only when transactions have the potential to influence each other.
MySQL features Clustrix lacks include triggers and XML support.
Clustrix uses MLC flash.
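
Since MVCC underlies the lockless schema-change claim above, a minimal sketch of the general idea may help: writers append new versions rather than overwriting, so a reader holding an older snapshot is never blocked. This is a single-threaded toy with a hypothetical `MVCCStore` class and a simple counter as the commit clock; it says nothing about how Clustrix actually implements versioning.

```python
class MVCCStore:
    """Toy multi-version store: every write adds a version, none are overwritten."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), ascending
        self._clock = 0      # hypothetical global commit counter

    def begin(self):
        """A reader's snapshot is just the current commit timestamp."""
        return self._clock

    def write(self, key, value):
        """Commit a new version of key and return its commit timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before the snapshot."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None
```

A reader that called `begin()` before a later `write()` keeps seeing the older value, while new readers see the new one. An online schema change can be thought of in the same way, with the new layout becoming visible only to snapshots taken after it commits.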

Clustrix doesn’t have compression, with the usual excuse of excessive CPU cost. When I pointed out that dictionary/token compression is cheap, Clustrix cofounder/CTO Sergei Tsarev suggested that it doesn’t make sense now due to high cardinalities in OLTP workloads, but could become more important as more analytic use cases emerge.

Clustrix’ JSON story seems to be:

The JSON goes into a relational column.
Fields inside a JSON document can be indexed.
One can then reference those fields in SQL just as if they were relational columns, including in joins.
If you’re reckless when joining on multi-valued fields, trouble could in theory ensue.
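
To make the multi-valued-field caveat concrete, here is a small sketch of JSON stored in an ordinary column, a secondary index built on one JSON field, and a join against that field. The tables, the `tags` field, and the helper names are all invented for the example; Clustrix's actual syntax and index structures were not part of the briefing.

```python
import json

# Each row keeps its document as JSON text in an ordinary column.
products = [
    {"id": 1, "doc": json.dumps({"name": "lamp", "tags": ["home", "sale"]})},
    {"id": 2, "doc": json.dumps({"name": "desk", "tags": ["home"]})},
]
promotions = [
    {"promo": "spring", "tag": "home"},
    {"promo": "clearance", "tag": "sale"},
    {"promo": "housewarming", "tag": "home"},
]


def field_values(row, field):
    """Extract a JSON field; always return a list so multi-valued fields work."""
    value = json.loads(row["doc"]).get(field)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def build_field_index(rows, field):
    """Secondary index: field value -> ids of rows containing that value."""
    index = {}
    for row in rows:
        for value in field_values(row, field):
            index.setdefault(value, []).append(row["id"])
    return index


def join_on_field(index, other_rows, other_column):
    """Join the indexed JSON field against a plain relational column."""
    return [(row_id, other)
            for other in other_rows
            for row_id in index.get(other[other_column], [])]


tag_index = build_field_index(products, "tags")
matches = join_on_field(tag_index, promotions, "tag")
```

Two products joined to three promotions yield five result rows here, because product 1 matches once per tag-and-promotion pair. On real data with long arrays on both sides, that multiplication is where the trouble would come from.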

That sounds a lot like other schemes for sticking documents into relational BLOBs/CLOBs (Binary/Character Large OBjects), although it happens to be the first time I’ve heard it in connection with JSON.

Clustrix has one cool idea I haven’t heard from anybody else, which I’m calling index distribution. The idea is that each index can be distributed differently across the cluster (this includes the JSON secondary indexes), i.e. on different distribution keys. Clustrix thinks that paying special attention to index distribution and movement is helpful to the performance of distributed joins.
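
A sketch of what distributing each index on its own key can buy: the base table is hashed on `id`, a secondary index is hashed on `email`, and a lookup by email touches at most two nodes instead of being broadcast to all of them. The `Cluster` class, the node count, and the CRC-based placement are assumptions for illustration only, not a description of Clustrix internals.

```python
from zlib import crc32

NUM_NODES = 3  # hypothetical cluster size


def node_for(value):
    """Pick the owning node for a distribution-key value."""
    return crc32(str(value).encode("utf-8")) % NUM_NODES


class Cluster:
    def __init__(self):
        self.base = [dict() for _ in range(NUM_NODES)]      # id -> row
        self.by_email = [dict() for _ in range(NUM_NODES)]  # email -> id

    def insert(self, row):
        # The base table is distributed on id ...
        self.base[node_for(row["id"])][row["id"]] = row
        # ... while the secondary index is distributed on email.
        self.by_email[node_for(row["email"])][row["email"]] = row["id"]

    def find_by_email(self, email):
        """Return (row, nodes_visited) using two targeted hops."""
        index_node = node_for(email)
        row_id = self.by_email[index_node].get(email)
        if row_id is None:
            return None, [index_node]
        base_node = node_for(row_id)
        return self.base[base_node][row_id], [index_node, base_node]
```

If the index were instead co-located with the base rows it describes, a lookup by email would have to ask every node, since the email value alone would not reveal where the entry lives.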

I still wish Clustrix were available on a software-only/bring your own hardware/bring your own cloud basis. Absent that, pricing and lock-in are concerns. True, I didn’t immediately see any flaws in Clustrix’ claims that its Rackspace offering was at once cheaper and more performant than MySQL on Amazon; but then, Amazon isn’t always that cost-effective an option. Price aside, Clustrix does sound as if it’s one of a number of appealing NewSQL options, and probably even one of the (relatively speaking) more proven ones.

Soundbites: the Facebook/MySQL/NoSQL/VoltDB/Stonebraker flap, continued

Curt Monash — Fri, 15 Jul 2011 08:27:18 +0000

As a follow-up to the latest Stonebraker kerfuffle, Derrick Harris asked me a bunch of smart questions. My responses and afterthoughts include:

Facebook et al. are in effect Software as a Service (SaaS) vendors, not enterprise technology users. In particular:
- They have the technical chops to rewrite their code as needed.
- Unlike packaged software vendors, they’re not answerable to anybody for keeping legacy code alive after a rewrite. That makes migration a lot easier.
- If they want to write different parts of their system on different technical underpinnings, nobody can stop them. For example …
- … Facebook innovated Cassandra, and is now heavily committed to HBase.
It makes little sense to talk of Facebook’s use of “MySQL.” Better to talk of Facebook’s use of “MySQL + memcached + non-transparent sharding.” That said:
- It’s hard to see why somebody today would use MySQL + memcached + non-transparent sharding for a new project. At least one of Couchbase or transparently-sharded MySQL is very likely a superior alternative. Other alternatives might be better yet.
- As noted above in the example of Facebook, the many major web businesses that are using MySQL + memcached + non-transparent sharding for existing projects can be presumed able to migrate away from that stack as the need arises.

Continuing with that discussion of DBMS alternatives:

If you just want to write to the memcached API anyway, why not go with Couchbase?
If you want to go relational, why not go with MySQL? There are many alternatives for scaling or accelerating MySQL — dbShards, Schooner, Akiban, Tokutek, ScaleBase, ScaleDB, Clustrix, and Xeround come to mind quickly, so there’s a great chance that one or more will fit your use case. (And if you don’t get the choice of MySQL flavor right the first time, porting to another one shouldn’t be all THAT awful.)
If you really, really want to go in-memory, and don’t mind writing Java stored procedures, and don’t need to do the kinds of joins it isn’t good at, but do need to do the kinds of joins it is, VoltDB could indeed be a good alternative.

And while we’re at it — going schema-free often makes a whole lot of sense. I need to write much more about the point, but for now let’s just say that I look favorably on the Big Four schema-free/NoSQL options of MongoDB, Couchbase, HBase, and Cassandra.

More on NoSQL and HVSP (or OLRP)

Curt Monash — Thu, 26 Aug 2010 09:10:31 +0000

Since posting last Wednesday morning that I’m looking into NoSQL and HVSP, I’ve had a lot of conversations, including with (among others):

Dwight Merriman of 10gen (MongoDB)
Damien Katz of Couchio (CouchDB)
Matt Pfeil of Riptano (Cassandra)
Todd Lipcon of Cloudera (HBase committer)
Tony Falco of Basho (Riak)
John Busch of Schooner
Ori Herrnstadt of Akiban

By no means do I have time to do these conversations justice, in terms of giving them the write-ups and/or immediate follow-up that they deserve. Indeed, I’ll leave for vacation Saturday morning with my 2000-word NoSQL article still unwritten. So I’ll dump as many observations as I can into one or a few posts now, and play catch-up later as circumstances allow.

In no particular order:

A number of NoSQL offerings have had more uptake to date than most of the scale-out SQL offerings have.
“Document-oriented” NoSQL projects CouchDB and MongoDB have probably had the most users get into production, but perhaps for pretty small systems.
Cassandra and HBase (the column-group-architecture guys) have probably had the most bang-in-lots-of-writes HVSP production uptake.*
I didn’t talk customer count with Schooner, but the decently-stocked Schooner customer page suggests Schooner may be something of an exception to these generalities.
A lot of these companies are in the low-to-mid-teens of employees.
The SQL-oriented companies, despite having fewer or no customers, often seem to have more money. (One reason I get the impression SQL guys have more money is, frankly, that more of them are talking about engaging my services.)
- Schooner cites $20 million in VC.
- Clustrix cites a figure close to that.
- Basho cites $10 million, plus a new round of $1.5 or $2 or $2.5 million. The new round is at a lowered valuation.
- That same site says Tokutek finally was able to raise some VC. Congrats!
It’s only a two-company trend, but I was pleased to hear that both 10gen/MongoDB and Akiban were seeing Drupal as a major use case or potential use case. No word on rescuing WordPress from its MySQL implementation, alas, but it seems that a Drupal site typically has 40-200+ tables, while a WordPress one has 10ish.
Another trend I think I’m seeing is serious object-oriented apps banging things straight into a simple back end. Workday is a huge example of that. Akiban hopes to do something similar with Hibernate.
Stability and maturity are still issues for many of these products. E.g., HBase isn’t even in Release 1.0 yet. Ditto Cassandra, and surely many of the others. Unsurprisingly, making Cassandra stable is still a challenge.

*As is common for terms I suggest, the “HVSP” name is not getting any traction. What do you think of Marton Trencseni’s suggestion of OLRP, for OnLine Request Processing?

One thing that makes following this area interesting is that so many projects are open source, so there is a lot of information in the wild. I hardly have time to read the mailing list for each project; but the people I talk with do, and often they may sorta kinda remember something somebody else posted one or several months back. As just one example, the mailing lists are said to confirm:

Contrary to rumor, Facebook hasn’t moved in-box search off of Cassandra.
Apparently, however, it’s true that Cassandra inventor Facebook has stopped working on Cassandra, and Facebook’s core Cassandra developers have shifted over to HBase.

Also, figuring out usage of open source software can be … interesting.

People who use open source software don’t have to reveal themselves, as there’s no purchase transaction to kick things off.
On the other hand, if they’re serious enough in their use, they often do.
- There are two main ways to get tech support for open source software — the community or a company that sells support — and both ways let the main support-selling company know that one is a user.
- Some folks even add themselves to open lists of users, for example these rather long lists for HBase and CouchDB.
- Or they show up at conferences. For example, two tweets from Riptano founder Jonathan Ellis suggest at least 30 production Cassandra users were represented at a recent event. That’s more detail than his colleague Matt Pfeil wanted to give me when we talked.

OK. This post has gotten pretty long, even without me saying anything resembling an overview of any of the seven companies I listed up top, or of their products’ adoption. So I’ll just publish this now.

I’m collecting data points on NoSQL and HVSP adoption

Curt Monash — Wed, 18 Aug 2010 13:09:08 +0000

I was asked to do a magazine article on NoSQL, where by “NoSQL” is meant “whatever they talk about at NoSQL conferences.” By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about HVSP in general, NoSQL and SQL alike.

It also is understood that, realistically, I can’t be expected to know and mention the very latest news for all the many products in the categories. Even so, I think this would be a fine time to check just where NoSQL and HVSP adoption stand. Here is most of what I know, or links to same; it would be great if you guys would contribute additional data in the comment thread.

In the NoSQL area:

Back in April, the VoltDB guys told me they thought Cassandra and HBase were the two NoSQL systems with the most momentum.
I know distressingly little about HBase adoption, but a source who may or may not wish to remain anonymous was kind enough to alert me that Twitter and StumbleUpon each have ~30 node deployments, for analytics and analytics/HVSP respectively.
I wrote in detail on Cassandra adoption last month. News since then includes:
- Facebook is rumored to have dropped Cassandra completely.
- Twitter clarified that it may not be quite as lovestruck by Cassandra as before, but they’re still very close friends.
- It’s not obvious that the Cassandra Summit unveiled a lot of new adoption stories.
Northscale’s Membase is still in its early days. Zynga is bought in, however, as is something called NHN Korea. (Edit: I subsequently saw NHN Korea on a prominent SEO expert’s list of the top half dozen or so search engines in the world. Who knew?)
Basho has listed a few Riak customers. If memory serves (I haven’t spoken with Basho for a while, and some of my notes are misplaced due to some computer sloppiness), Basho has a few dozen customers in total.
Mozilla has a 4 machine, 64 core Riak cluster in production.
Hypertable has a few users/project sponsors, Baidu being the biggest name among them.
I don’t really know how the MongoDB/10gen guys are doing. I think this is at least as much my fault as theirs. Anyhow, they seem to have links to a couple of folks who have written about MongoDB usage.
NimbusDB is still in stealth mode. I’d be surprised if they had users for a while yet, since in January they didn’t yet sound as if development was very far underway. (Actually, I forget whether NimbusDB is supposed to be SQL-based or not.)

Among the SQL or SQL-friendly guys:

Clustrix says it has a few production users, some big-name, but is not disclosing them yet.
dbShards has around 6 customers, including Facebook. (Facebook may outpace even Twitter and Zynga in using the most products mentioned in this post.)
As of May, VoltDB had one paying customer, plus 150 beta customers who weren’t in production yet.
Akiban says they’ll get me up to speed on Thursday.
ScaleDB seems to be pedaling along in perennial beta. Whether ScaleDB has any actual beta users is less clear. On the plus side, checking that out uncovered a pretty funny April Fool blog post.
Groovy Corporation seems to have disappeared, or morphed into something called uCirrus, or something like that.

The Clustrix story

Curt Monash — Wed, 12 May 2010 08:53:48 +0000

After my recent post, the Clustrix guys raised their hands and briefed me. Takeaways included:

Nothing in my original short post about Clustrix was actually incorrect.
Clustrix plans to reveal actual production “name-brand” customers soon.
The name of Clustrix’s software, or at least the guts thereof, is Sierra.
Clustrix’s products have actually been in general availability since last quarter, with some versions at customer sites for 2 years. Development started 3 ½ years ago.
Clustrix says its technology is for OLTP systems, which it calls “non-batch/non-analytic,” with mixed read/write workloads. All Clustrix’s example target markets are “internet verticals,” such as photo sharing, gaming, social media, e-commerce, etc.
Clustrix’s heart is in SQL, as is most of its customer base. Clustrix Sierra’s key-value-store option has little or no performance advantage over Clustrix Sierra’s SQL option, nor any other advantage over SQL that came up in discussion.
Clustrix Sierra is “wire-compatible” with MySQL, but doesn’t use MySQL code; Clustrix wrote all the code itself.
Clustrix asserts that Clustrix Sierra supports the “vast majority” of MySQL features. Examples of MySQL features Clustrix doesn’t support at this time are full-text search and geospatial indexing.
Indeed, Clustrix claims Clustrix Sierra can be used to replace MySQL with few or zero changes to existing applications.
I specifically asked about referential integrity, which has a poor performance reputation in MySQL. Besides saying they supported it, Clustrix said that some customers actually use referential integrity in some of their less active tables.
Clustrix Sierra is fully ACID-compliant, with no eventual consistency or RYW consistency story. The default number of copies of each datum is two, and they’re kept consistent via two-phase commit.
Clustrix Sierra is fully parallel, with no “head” node. I forgot to ask how it was determined which queries would be addressed to and/or controlled by which nodes, but I presume there’s some sort of a load-balancing scheme.
Clustrix says that because Clustrix Sierra uses MVCC (Multi-Version Concurrency Control), and thus reads and writes don’t block each other, global locks aren’t a major issue. (They’re rare or short or something – I have trouble seeing why they would be non-existent.)
Clustrix says there’s a second class of locks and latches that are purely local and short-lived, for B-tree indexes and the like. (I didn’t drill down into those either.) I guess this means Clustrix Sierra is B-tree-centric, which makes sense for an OLTP-oriented system.
Clustrix Sierra distributes data among nodes via consistent hashing (default), range partitioning, or “full distribution” (i.e., copying a – presumably small – table to each node). The choice of distribution plans is manual now; more automation is a future feature.
Clustrix Sierra’s CBO (Cost-Based Optimizer) is, as one would hope, distribution-aware.
Clustrix Sierra compiles query fragments and ships them off to the relevant nodes. A fragment might contain both instructions for SQL to be executed locally and for where data is to be sent next.
Clustrix says that Clustrix Sierra does data migration and redistribution (e.g., when you add a node) transparently online, and further says that in practice this doesn’t cause a performance hit.
As for Clustrix hardware:
- Clustrix makes Type I appliances.
- A Clustrix node contains 2 quad-core chips, 32 gigs of RAM, and 7 160 GB solid-state drives.
- Specifically, Clustrix is using Intel SSDs, with a SAS interface.
- Clustrix says solid-state memory isn’t really essential to the product design; it’s just cheap in terms of $/IOPS (I/O Per Second).
A minimum Clustrix configuration is 3 nodes, for redundancy. After that you can add nodes one at a time. Clustrix says it built a 20-node system in-house, leading me to suspect that customers don’t have anything bigger than 20 nodes either.
That 20-node Clustrix system was tested to show near-linear scalability. (In discussing this, Clustrix tends to forget to use the word “near”.)
Clustrix has partnered with somebody to provide global 4-hour-response support. As of now Clustrix seems to be active mainly in North America and Europe.
Clustrix is formed from the combination of two startups, which I’ve heard elsewhere were called Clustrix and Sprout. Exactly when the combination happened sounds a little different depending on who’s telling the story (one version has the predecessors still being separate well into 2008, but Clustrix implies the combination happened pretty much on Day 1).
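
As background for the consistent-hashing default and the online redistribution claim in the list above, here is a minimal sketch of a consistent-hash ring. It shows the generic technique only; the `HashRing` class, the use of SHA-256, and the 64 virtual points per node are assumptions, and nothing here describes how Clustrix Sierra actually places data.

```python
import bisect
import hashlib


def _hash(value):
    """Stable integer hash of any value's string form."""
    return int(hashlib.sha256(str(value).encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # virtual points per node, to smooth the spread
        self._points = []     # sorted hash positions on the ring
        self._owner = {}      # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Place the node's virtual points on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def node_for(self, key):
        """A key belongs to the first point clockwise from its hash."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node1", "node2", "node3"])
before = {key: ring.node_for(key) for key in range(1000)}
ring.add_node("node4")
after = {key: ring.node_for(key) for key in range(1000)}
moved = [key for key in before if before[key] != after[key]]
```

The property that matters for adding nodes one at a time is that every key in `moved` ends up on the new node; keys never shuffle between the existing nodes, so only a fraction of the data has to migrate.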