Discussion of transparent sharding vendors and products.
One of my lesser-known clients is Citus Data, a largely Turkish company that is however headquartered in San Francisco. They make CitusDB, which puts a scale-out layer over a collection of fully-functional PostgreSQL nodes, much like Greenplum and Aster Data before it. However, in contrast to those and other Postgres-based analytic MPP (Massively Parallel Processing) DBMS:
- CitusDB does not permanently fork PostgreSQL; Citus Data has committed to always working with the latest PostgreSQL release, or at least with one that’s less than a year old.
- Citus Data never made the “fat head” mistake — if a join can’t be executed directly on the CitusDB data-storing nodes, it can’t be executed in CitusDB at all.
- CitusDB follows the modern best-practice of having many virtual nodes on each physical node. Default size of a virtual node is one gigabyte. Each virtual node is technically its own PostgreSQL table.*
- Citus Data has already introduced an open source column-store option for PostgreSQL, which CitusDB of course exploits.
*One benefit to this strategy, besides the usual elasticity and recovery stuff, is that while PostgreSQL may be single-core for any given query, a CitusDB query can use multiple cores by virtue of hitting multiple PostgreSQL tables on each node.
Citus has thrown a few things against the wall; for example, there are two versions of its product, one which involves HDFS (Hadoop Distributed File System) and one of which doesn’t. But I think Citus’ focus will be scale-out PostgreSQL for at least the medium-term future. Citus does have actual customers, and they weren’t all PostgreSQL users previously. Still, the main hope — at least until the product is more built-out — is that existing PostgreSQL users will find CitusDB easy to adopt, in technology and price alike.
|Categories: Aster Data, Citus Data, Columnar database management, Data warehousing, Database compression, Greenplum, Hadoop, Parallelization, PostgreSQL, SQL/Hadoop integration, Transparent sharding, Workload management||6 Comments|
The third of my three MySQL-oriented clients I alluded to yesterday is MemSQL. When I wrote about MemSQL last June, the product was an in-memory single-server MySQL workalike. Now scale-out has been added, with general availability today.
MemSQL’s flagship reference is Zynga, across 100s of servers. Beyond that, the company claims (to quote a late draft of the press release):
Enterprises are already using distributed MemSQL in production for operational analytics, network security, real-time recommendations, and risk management.
All four of those use cases fit MemSQL’s positioning in “real-time analytics”. Besides Zynga, MemSQL cites penetration into traditional low-latency markets — financial services (various subsectors) and ad-tech.
Highlights of MemSQL’s new distributed architecture start: Read more
|Categories: Clustering, Database compression, Emulation, transparency, portability, Games and virtual worlds, Investment research and trading, Log analysis, MemSQL, MySQL, NewSQL, Transparent sharding, Zynga||6 Comments|
I talked Friday with Deep Information Sciences, makers of DeepDB. Much like TokuDB — albeit with different technical strategies — DeepDB is a single-server DBMS in the form of a MySQL engine, whose technology is concentrated around writing indexes quickly. That said:
- DeepDB’s indexes can help you with analytic queries; hence, DeepDB is marketed as supporting OLTP (OnLine Transaction Processing) and analytics in the same system.
- DeepDB is marketed as “designed for big data and the cloud”, with reference to “Volume, Velocity, and Variety”. What I could discern in support of that is mainly:
- DeepDB has been tested at up to 3 terabytes at customer sites and up to 1 billion rows internally.
- Like most other NewSQL and NoSQL DBMS, DeepDB is append-only, and hence could be said to “stream” data to disk.
- DeepDB’s indexes could at some point in the future be made to work well with non-tabular data.*
- The Deep guys have plans and designs for scale-out — transparent sharding and so on.
*For reasons that do not seem closely related to product reality, DeepDB is marketed as if it supports “unstructured” data today.
Other NewSQL DBMS seem “designed for big data and the cloud” to at least the same extent DeepDB is. However, if we’re interpreting “big data” to include multi-structured data support — well, only half or so of the NewSQL products and companies I know of share Deep’s interest in branching out. In particular:
- Akiban definitely does. (Note: Stay tuned for some next-steps company news about Akiban.)
- Tokutek has planted a small stake there too.
- Key-value-store-backed NuoDB and GenieDB probably leans that way. (And SanDisk evidently shut down Schooner’s RDBMS while keeping its key-value store.)
- VoltDB, Clustrix, ScaleDB and MemSQL seem more strictly tabular, except insofar as text search is a requirement for everybody. (Edit: Oops; I forgot about Clustrix’s approach to JSON support.)
Edit: MySQL has some sort of an optional NoSQL interface, and hence so presumably do MySQL-compatible TokuDB, GenieDB, Clustrix, and MemSQL.
Also, some of those products do not today have the transparent scale-out that Deep plans to offer in the future.
I plan to write about several NewSQL vendors soon, but first here’s an overview post. Like “NoSQL”, the term “NewSQL” has an identifiable, recent coiner — Matt Aslett in 2011 — yet a somewhat fluid meaning. Wikipedia suggests that NewSQL comprises three things:
- OLTP- (OnLine Transaction Processing)/short-request-oriented SQL DBMS that are newer than MySQL.
- Innovative MySQL engines.
- Transparent sharding systems that can be used with, for example, MySQL.
I think that’s a pretty good working definition, and will likely remain one unless or until:
- SQL-oriented and NoSQL-oriented systems blur indistinguishably.
- MySQL (or PostgreSQL) laps the field with innovative features.
To date, NewSQL adoption has been limited.
- NewSQL vendors I’ve written about in the past include Akiban, Tokutek, CodeFutures (dbShards), Clustrix, Schooner (Membrain), VoltDB, ScaleBase, and ScaleDB, with GenieDB and NuoDB coming soon.
- But I’m dubious whether, even taken together, all those vendors have as many customers or production references as any of 10gen, Couchbase, DataStax, or Cloudant.*
That said, the problem may lie more on the supply side than in demand. Developing a competitive SQL DBMS turns out to be harder than developing something in the NoSQL state of the art.
Data/database virtualization seems to be a hot subject right now, and vendors of a broad variety of different technologies are all claiming to be in the space. A terminological mess has ensued, as Monash’s First and Third Laws of Commercial Semantics are borne out in spades.
If something is like “virtualization”, then it should resemble hypervisors such as VMware. To me:
- The core feature of a hypervisor is that it allows many somethings to run and coexist where ordinarily only one something would come into play. Here the “many somethings” are virtual machines and what’s going on inside them, and the “one something” is the ordinary operating system/hardware computing stack.
- A core feature of original VMware was that the “many somethings” could be quite different — for example, the operating environments of numerous different hardware systems you wanted to decommission, or of new systems that you didn’t want to buy quite yet.
- Important features of hypervisors include:
- The ability to have multiple virtual machines run side by side at once, safely.
- Flexible and powerful workload management if the virtual machines do contend for resources.
- Easy management.
- The negative feature of having sufficiently low overhead.
Anything that claims to be “like virtualization” should be viewed in that light. Read more
|Categories: Clustering, Data integration and middleware, ScaleDB, Theory and architecture, Transparent sharding||5 Comments|
I’ve been talking a fair bit with Cory Isaacson, CEO of my client CodeFutures, which makes dbShards. Business notes include:
- 7 production users, plus an 8th imminent.
- 12-14 signed contracts beyond that.
- ~160 servers in production.
- One customer who has almost 15 terabytes of data (in the cloud).
- Still <10 people, pretty much all engineers.
- Profitable, but looking to raise a bit of growth capital.
Apparently, the figure of 6 dbShards customers in July, 2010 is more comparable to today’s 20ish contracts than to today’s 7-8 production users. About 4 of the original 6 are in production now.
NDA stuff aside, the main technical subject we talked about is something Cory calls “relational sharding”. The point is that dbShards’ transparent sharding can be done in such a way as to make many joins be single-server. Specifically:
- When a table is sufficiently small to be replicated in full at every nodes, you can join on it without moving data across the network.
- When two tables are sharded on the same key, you can join on that key without moving data across the network.
dbShards can’t do cross-shard joins, but it can do distributed transactions comprising multiple updates. Cory argues persuasively that in almost all cases this is enough; but I see cross-shard joins as a feature that should someday be added to dbShards even so.
The real issue with dbShards’ transparent sharding is ensuring it’s really transparent. Cory regards as typical a customer with a couple thousand tables, who had to change a dozen or so SQL statements to implement dbShards. But there are near-term plans to automate the number of SQL changes needed down to 0. The essence of that change is this: Read more
|Categories: Clustering, Data integration and middleware, Data models and architecture, dbShards and CodeFutures, Market share and customer counts, Transparent sharding||1 Comment|
There’s a perception that, if you want (relatively) worry-free database scale-out, you need a non-relational/NoSQL strategy. That perception is false. In the analytic case it’s completely ridiculous, as has been demonstrated by Teradata, Vertica, Netezza, and various other MPP (Massively Parallel Processing) analytic DBMS vendors. And now it’s false for short-request/OLTP (OnLine Transaction Processing) use cases as well.
My favorite relational OLTP scale-out choice these days is the SchoonerSQL/dbShards partnership. Schooner Information Technology (SchoonerSQL) and Code Futures (dbShards) are young, small companies, but I’m not too concerned about that, because the APIs they want you to write to are just MySQL’s. The main scenarios in which I can see them failing are ones in which they are competitively leapfrogged, either by other small competitors – e.g. ScaleBase, Akiban, TokuDB, or ScaleDB — or by Oracle/MySQL itself. While that could suck for my clients Schooner and Code Futures, it would still provide users relying on MySQL scale-out with one or more good product alternatives.
Relying on non-MySQL NewSQL startups, by way of contrast, would leave me somewhat more concerned. (However, if their code is open sourced. you have at least some vendor-failure protection.) And big-vendor scale-out offerings, such as Oracle RAC or DB2 pureScale, may be more complex to deploy and administer than the MySQL and NewSQL alternatives.
|Categories: Clustering, dbShards and CodeFutures, IBM and DB2, MySQL, NewSQL, NoSQL, OLTP, Open source, Oracle, Parallelization, Schooner Information Technology, Transparent sharding||2 Comments|
When databases are too big to manage via a single server, responsibility for them is spread among multiple servers. There are numerous names for this strategy, or versions of it — all of them at least somewhat problematic. The most common terms include:
- (Shared-nothing) MPP (Massively Parallel Processing), often used to describe analytic DBMS. On the whole, these terms have worked pretty well, but they have issues even so. First, “MPP” means different things to different marketers. Second, most ostensibly “shared-nothing” systems aren’t really “shared-nothing.” They generally support at least storage arrays, if not storage-area networks (SANs); indeed, in a couple of cases (most notably EMC Greenplum), SAN support is prominent in their marketing message.
- (Horizontal) partitioning and/or data distribution. These have significant problems. “Partitioning” and “distribution” are easily confused with each other, not least because the term “partitioning” is used in different ways by different DBMS product vendors.
- Sharding, commonly used to describe scaled-out MySQL in Internet Request Processing use cases. This one has the advantage of being concise, but is beginning to mean two different things, in that it is used both when the data is REALLY in separate databases on different machines (i.e., the application has to explicitly reference the shard it wants to talk to) and also when the database is transparently distributed (e.g. via dbShards).
- Coherent caching and/or distributed shared memory, describing cases when data is in RAM. Besides being RAM-specific, these terms can be vague as to whether the same data is recopied onto different systems, or whether they are focused on letting (relatively) large in-memory data stores be spread across a cluster.
I plan to start using the term transparent sharding to denote a data management strategy in which data is assigned to multiple servers (or CPUs, cores, etc.), yet looks to programmers and applications as if it were managed by just one. Thus,
- dbShards and ScaleBase feature transparent sharding (this is the case which inspired me to introduce the term).
- Anything which has ever reasonably been called a “shared-nothing” MPP DBMS features transparent sharding.
- Memcached features transparent sharding. So, I imagine, do other caching systems I am less familiar with.
- Shared-disk DBMS do not feature transparent sharding, even if their query work can be scaled out across multiple servers. (But Oracle Exadata does, because of its server tier.)
After I posted recently about dbShards, a Very Smart Commenter emailed me with the challenge “but each individual shard is still replicated via two-phase commit, and everybody knows two-phase commit is fundamentally slow.” I replied that no, it wasn’t exactly two-phase commit, but fumbled the explanation of why — so I decided to escalate straight to dbShards honcho Cory Isaacson. Read more
Liran Zelkha of ScaleBase raised his hand on Twitter. It turns out ScaleBase has a story rather similar to that of CodeFutures/dbShards. That is:
- Like dbShards, ScaleBase is a proxy that looks to the application like a scale-out DBMS, but routes work to multiple servers running MySQL against different shards of the database. Other DBMS beyond MySQL are planned, but PostgreSQL — which dbShards supports — did not get mentioned.
- Sharding is done at configuration time, and is transparent to the application. You want to shard the big tables and replicate the small ones, because if you join two sharded tables, performance can be slow. ScaleBase may have more of a configuration-advisor wizard than dbShards does.
- Each shard is replicated to a mirror, in a high-availability way.
- You can use ScaleBase across multiple data centers, but there’s little or no magic to overcome the performance issues that would arise in many use cases.
- Much like dbShards, ScaleBase supports three kinds of sharding — hash, list, and range.
- ScaleBase currently has no support whatsoever for stored procedures, which is slightly less than dbShards has.
- Liran stresses that ScaleBase looks even to management tools — e.g. TOAD — like a single DBMS.
- ScaleBase runs on EC2 and private cloud.
Our talk didn’t get deeply technical, and I don’t know exactly how ScaleBase’s replication works. But a website reference to a small transaction log in a distributed cache does sound, while not identical to the dbShards approach, at least directionally similar.
ScaleBase is a year or so old, with about 6 people, based in the Boston area despite strong Israeli roots. ScaleBase has raised a round of venture capital; I didn’t ask for details.
Liran says that ScaleBase is in closed beta, with some production users, at least one of whom has over 100 database servers.
|Categories: Clustering, dbShards and CodeFutures, MySQL, OLTP, Parallelization, ScaleBase, Transparent sharding||9 Comments|