Clustering

Analysis of products and issues in database clustering. Related subjects include:

July 14, 2011

An odd claim attributed to Mike Stonebraker

This post has a sequel.

Last week, Mike Stonebraker insulted MySQL and Facebook’s use of it, by implication advocating VoltDB instead. Kerfuffle ensued. To the extent Mike was saying that non-transparently sharded MySQL isn’t an ideal way to do things, he’s surely right. That still leaves a lot of options for massive short-request databases, however, including transparently sharded RDBMS, scale-out in-memory DBMS (whether or not VoltDB*), and various NoSQL options. If nothing else, Couchbase would seem superior to memcached/non-transparent MySQL if you were starting a project today.

*The big problem with VoltDB, last I checked, was its reliance on Java stored procedures to get work done.

Pleasantries continued in The Register, which got an amazing-sounding quote from Mike. If The Reg is to be believed — something I wouldn’t necessarily take for granted — Mike claimed that he (i.e. VoltDB) knows how to solve the distributed join performance problem.  Read more

May 6, 2011

DB2 OLTP scale-out: pureScale

Tim Vincent of IBM talked me through DB2 pureScale Monday. IBM DB2 pureScale is a kind of shared-disk scale-out parallel OLTP DBMS, with some interesting twists. IBM’s scalability claims for pureScale, on a 90% read/10% write workload, include:

More precisely, those are counts of cluster “members,” but the recommended configuration is one member per operating system instance — i.e. one member per machine — for reasons of availability. In an 80% read/20% write workload, scalability is less — perhaps 90% scalability over 16 members.
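To make the arithmetic concrete, here is a minimal sketch. It assumes that "90% scalability" means the cluster delivers 90% of perfectly linear throughput; that reading is an assumption, not something IBM's figures spell out here, and the function name is purely illustrative.

```python
def effective_members(members: int, scalability: float) -> float:
    """Cluster throughput in single-member equivalents.

    Assumption: 'scalability' is the fraction of ideal linear
    scaling actually achieved (1.0 would be perfectly linear).
    """
    return members * scalability


# 16 members at 90% scalability on the 80% read/20% write workload
print(effective_members(16, 0.90))
```

Under that assumption, 16 members would do the work of about 14.4 ideal members.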

Several elements of IBM’s DB2 pureScale architecture are pretty straightforward:

Something called GPFS (General Parallel File System), which comes bundled with DB2, sits underneath all this. It’s all based on the mainframe technology IBM Parallel Sysplex.

The weirdest part (to me) of DB2 pureScale is something called the Global Cluster Facility, which runs on its own set of boxes. (Edit: Actually, see Tim Vincent’s comment below.) Read more

May 3, 2011

Oracle and Exadata: Business and technical notes

Last Friday I stopped by Oracle for my first conversation since January, 2010, in this case for a chat with Andy Mendelsohn, Mark Townsend, Tim Shetler, and George Lumpkin, covering Exadata and the Oracle DBMS. Key points included:  Read more

April 4, 2011

The MongoDB story

Along with CouchDB/Couchbase, MongoDB was one of the top examples I had in mind when I wrote about document-oriented NoSQL. Invented by 10gen, MongoDB is an open source, no-schema DBMS, so it is suitable for very quick development cycles. Accordingly, there are a lot of MongoDB users who build small things quickly. But MongoDB has heftier uses as well, and naturally I’m focused more on those.

MongoDB’s data model is based on BSON, which seems to be JSON-on-steroids. In particular:

Read more

January 28, 2011

Schooner — flash-based, now software-only, and very fast

Last October I wrote about Schooner Information Technology, which made flash-based appliances, for MySQL, memcached, or persistent memcached. Schooner sold those appliances to close to 20 customers, but even so decided software-only was a better way to go.

Schooner’s core value proposition is that one Schooner box with flash does the job of a lot of MySQL or NoSQL boxes with hard drives. Highlights of the Schooner story — of which you can find more detail at the Schooner website — now include:  Read more

January 25, 2011

ScaleBase, another MPP OLTP quasi-DBMS

Liran Zelkha of ScaleBase raised his hand on Twitter. It turns out ScaleBase has a story rather similar to that of CodeFutures/dbShards. That is:

Our talk didn’t get deeply technical, and I don’t know exactly how ScaleBase’s replication works. But a website reference to a small transaction log in a distributed cache does sound, while not identical to the dbShards approach, at least directionally similar.

ScaleBase is a year or so old, with about 6 people, based in the Boston area despite strong Israeli roots. ScaleBase has raised a round of venture capital; I didn’t ask for details.

Liran says that ScaleBase is in closed beta, with some production users, at least one of whom has over 100 database servers.

January 25, 2011

dbShards update

I talked yesterday with Cory Isaacson of CodeFutures, and hence can follow up on my previous post about dbShards. dbShards basics include:

One dbShards customer writes half a billion rows on a busy day, and serves 3,000-4,000 pages per second, naturally with multiple queries per page. This is on a 32-node cluster, with uninspiring hardware, in the cloud. The database has 16 shards, aggregating 128 virtual shards. I forgot to ask how big the database actually is. Overall, dbShards is up to a dozen or so signed customers, half of whom are in production or soon will be.
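For readers unfamiliar with virtual shards, here is a rough sketch of how 128 virtual shards could be folded onto 16 physical ones. This is not dbShards' actual scheme, which I haven't seen; the hash function, the contiguous 8-per-shard grouping, and every name below are assumptions made purely for illustration.

```python
import hashlib
from typing import Tuple

NUM_VIRTUAL_SHARDS = 128   # figure from the customer example above
NUM_PHYSICAL_SHARDS = 16   # figure from the customer example above
VIRTUAL_PER_PHYSICAL = NUM_VIRTUAL_SHARDS // NUM_PHYSICAL_SHARDS  # 8


def virtual_shard(key: str) -> int:
    """Hash a shard key to one of the virtual shards (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VIRTUAL_SHARDS


def physical_shard(vshard: int) -> int:
    """Map a virtual shard to its physical shard (assumed contiguous groups)."""
    return vshard // VIRTUAL_PER_PHYSICAL


def route(key: str) -> Tuple[int, int]:
    """Return (virtual shard, physical shard) for a shard key."""
    v = virtual_shard(key)
    return v, physical_shard(v)


v, p = route("customer:42")
print(f"virtual shard {v} lives on physical shard {p}")
```

The usual appeal of the extra layer is that when physical shards are added, whole virtual shards can be moved without rehashing individual keys.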

dbShards’ replication scheme works like this:  Read more

August 18, 2010

I’m collecting data points on NoSQL and HVSP adoption

I was asked to do a magazine article on NoSQL, where by “NoSQL” is meant “whatever they talk about at NoSQL conferences.” By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about HVSP in general, NoSQL and SQL alike.

It also is understood that, realistically, I can’t be expected to know and mention the very latest news for all the many products in the categories. Even so, I think this would be a fine time to check just where NoSQL and HVSP adoption stand. Here is most of what I know, or links to same; it would be great if you guys would contribute additional data in the comment thread.

In the NoSQL area:  Read more

July 31, 2010

Teradata, Xkoto Gridscale (RIP), and active-active clustering

Having gotten a number of questions about Teradata’s acquisition of Xkoto, I leaned on Teradata for an update, and eventually connected with Scott Gnau. Takeaways included:

Frankly, I’m disappointed at the struggles of clustering efforts such as Xkoto Gridscale or Continuent’s pre-Tungsten products, but if the DBMS vendors meet the same needs themselves, that’s OK too.

The logic behind active-active database implementations actually seems pretty compelling:  Read more

April 27, 2010

Gear6 seems to have failed in the memcached market too

As previously noted, I’ve briefly cut back on blogging (and research) due to some family health issues. The first casualty was a post about memcached. One of the two companies to be featured was my new client Northscale. The other was Gear6. What they had in common was:

Read more
