VoltDB and H-Store
Analysis of the OLTP DBMS research project H-Store and its commercialization, VoltDB. Related subjects include:
- Key H-Store researcher and VoltDB company founder Michael Stonebraker
- OLTP (OnLine Transaction Processing) database management
- Memory-centric data management
- Vertica Systems, which is incubating VoltDB
HP is acquiring Vertica. Read more
|Categories: Complex event processing (CEP), In-memory DBMS, Investment research and trading, Memory-centric data management, StreamBase, VoltDB and H-Store||12 Comments|
A number of recent posts have had good comments. This time, I won’t call them out individually.
Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on “sensor data” …
… and, even better, went on to say: Read more
I was asked to do a magazine article on NoSQL, where by “NoSQL” is meant “whatever they talk about at NoSQL conferences.” By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about HVSP in general, NoSQL and SQL alike.
It also is understood that, realistically, I can’t be expected to know and mention the very latest news for all the many products in the categories. Even so, I think this would be a fine time to check just where NoSQL and HVSP adoption stand. Here is most of what I know, or links to same; it would be great if you guys would contribute additional data in the comment thread.
In the NoSQL area: Read more
|Categories: Akiban, Cassandra, Clustering, Clustrix, Couchbase, dbShards and CodeFutures, Facebook, Groovy Corporation, NewSQL, NoSQL, OLTP, Parallelization, ScaleDB, Specific users, VoltDB and H-Store, Zynga||17 Comments|
Todd Hoff (High Scalability blog) posted a lengthy examination of the case and use cases for VoltDB. That excellent post, in turn, is based on a Mike Stonebraker webinar for VoltDB, for which the slide deck is happily available. It’s all nicely consistent with what I wrote about VoltDB last month, in connection with its launch. Read more
|Categories: In-memory DBMS, Michael Stonebraker, OLTP, Parallelization, Theory and architecture, VoltDB and H-Store||3 Comments|
VoltDB is finally launching today. As is common for companies in sectors I write about, VoltDB — or just “Volt” — has discovered the virtues of embargoes that end 12:01 am. Let’s go straight to the technical highlights:
- VoltDB is based on the H-Store technology, which I wrote about in February, 2009. Most of what I said about H-Store then applies to VoltDB today.
- VoltDB is a no-apologies ACID relational DBMS, which runs entirely in RAM.
- VoltDB has rather limited SQL. (One example: VoltDB can’t do SUMs in SQL.) However, VoltDB guy Tim Callaghan (Mark Callaghan’s lesser-known but nonetheless smart brother) asserts that if you code up the missing functionality, it’s almost as fast as if it were present in the DBMS to begin with, because there’s no added I/O from the handoff between the DBMS and the procedural code. (The data’s in RAM one way or the other.)
- VoltDB’s Big Conceptual Performance Story is that it does away with most locks, latches, logs, etc., and also most context switching.
- In particular, you’re supposed to partition your data and architect your application so that most transactions execute on a single core. When you can do that, you get VoltDB’s performance benefits. To the extent you can’t, you’re in two-phase-commit performance land. (More precisely, you’re doing 2PC for multi-core writes, which is surely a major reason that multi-core reads are a lot faster in VoltDB than multi-core writes.)
- VoltDB has a little less than one DBMS thread per core. When the data partitioning works as it should, you execute a complete transaction in that single thread. Poof. No context switching.
- A transaction in VoltDB is a Java stored procedure. (The early idea of Ruby on Rails in lieu of the Java/SQL combo didn’t hold up performance-wise.)
- Solid-state memory is not a viable alternative to RAM for VoltDB. Too slow.
- Instead, VoltDB lets you snapshot data to disk at tunable intervals. “Continuous” is one of the options, wherein a new snapshot starts being made as soon as the last one completes.
- In addition, VoltDB will also spool a kind of transaction log to the target of your choice. (Obvious choice: An analytic DBMS such as Vertica, but there’s no such connectivity partnership actually in place at this time.)
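To make the partitioning story above a little more concrete, here is a minimal sketch in Python. It is not VoltDB code and does not use VoltDB's API (real VoltDB transactions are Java stored procedures); the names `Partition`, `Cluster`, `add_order`, `customer_total` and `partition_total` are all hypothetical, and modulo routing on an integer key is an assumption made for illustration. It shows a single-partition transaction running start to finish against one partition's in-memory data, a SUM coded up procedurally, and a multi-partition read that has to visit every partition.

```python
# Illustrative sketch only: every name here is hypothetical, not VoltDB's API.
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One slice of the data, meant to be owned by exactly one thread."""
    rows: dict = field(default_factory=dict)  # customer_id -> list of order amounts


class Cluster:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def partition_for(self, customer_id):
        # Route on the partitioning key (assumed here: integer id, modulo).
        return self.partitions[customer_id % len(self.partitions)]

    def run_single(self, procedure, customer_id, *args):
        # Single-partition transaction: the whole procedure runs on one
        # partition, so no locks or latches are needed in this model.
        return procedure(self.partition_for(customer_id), customer_id, *args)

    def run_everywhere(self, procedure, *args):
        # Multi-partition read: run on every partition and let the caller
        # combine the results. Multi-partition writes would additionally
        # need a commit protocol, which this sketch does not model.
        return [procedure(p, *args) for p in self.partitions]


def add_order(partition, customer_id, amount):
    partition.rows.setdefault(customer_id, []).append(amount)
    return len(partition.rows[customer_id])


def customer_total(partition, customer_id):
    # SUM written in procedural code; the data is already in memory.
    return sum(partition.rows.get(customer_id, []))


def partition_total(partition):
    return sum(sum(amounts) for amounts in partition.rows.values())


cluster = Cluster(num_partitions=4)
cluster.run_single(add_order, 7, 20)
cluster.run_single(add_order, 7, 5)
cluster.run_single(add_order, 10, 100)

total_for_7 = cluster.run_single(customer_total, 7)
grand_total = sum(cluster.run_everywhere(partition_total))
```

In this toy model, `total_for_7` touches one partition while `grand_total` touches all four, which is the distinction the performance story hinges on.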
The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part). OLTP (OnLine Transaction Processing) and general purpose DBMS startups, however, have not yet done as well, with such success as there has been (MySQL, InterSystems Caché, solidDB’s exit, etc.) generally accruing to products that originated in the 20th Century.
Nonetheless, OLTP/general-purpose data management startup activity has recently picked up, targeting what I see as some very real opportunities and needs. So as a jumping-off point for further writing, I thought it might be interesting to collect a few observations about the market in one place. These include:
- Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.
- By number, most of an enterprise’s OLTP/general-purpose databases are low-volume and low-value.
- Most interesting new OLTP/general-purpose data management products are either MySQL-based or NoSQL.
- It’s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.
- The era of silicon-centric relational DBMS is coming.
- The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds.
- Users’ insistence on “free” could be a major problem for OLTP DBMS innovation.
I shall explain. Read more
The Boston Globe article has more detail than Vertica and VoltDB have ever OKed me to put out, and some business details they’ve never given me.
|Categories: In-memory DBMS, Memory-centric data management, OLTP, Vertica Systems, VoltDB and H-Store||Leave a Comment|
Eric Lai emailed today to ask what I thought about the NoSQL folks, and especially whether I thought their ideas were useful for enterprises in general, as opposed to just Web 2.0 companies. That was the first I heard of NoSQL, which seems to be a community discussing SQL alternatives popular among the cloud/big-web-company set, such as BigTable, Hadoop, Cassandra and so on. My short answers are:
- In most cases, no.
- Most of these technologies are designed for simple, high-volume OLTP (OnLine Transaction Processing). Most large enterprises have an established way of doing OLTP, probably via relational database management systems. Why change?
- MapReduce is an exception, in that it’s designed for analytics. MapReduce may be useful for enterprises. But where it is, it probably should be integrated into an analytic DBMS.
- There’s one big countervailing factor to all these generalities — schema flexibility.
As for the longer form, let me start by noting that there are two main kinds of reason for not liking SQL. Read more
I’ve always honored more of an NDA about the H-Store project and its commercialization than I really felt obligated to, given how freely information was being bandied about to others. I’m still doing so.
But I think I’ll at least say that the H-Store project is now named VoltDB. The VoltDB website names two individuals — Mike Stonebraker and Andy Palmer — both of whom are founders of Vertica. Job listings on the site are for field engineer and trainer, but not developer, so that suggests something about the project’s/product’s maturity level.
If you have an extreme OLTP need, you should talk to VoltDB. If you don’t have access to Mike or Andy directly, I can hook you up with a key VoltDB marketing/outreach guy. Price may not be as much of a barrier as you’d initially fear.
If anybody from VoltDB wants to be less cloak-and-daggery and say more in the comment thread, I’d be pleased.
And yes — an open-secret working name for H-Store/VoltDB was, for a while, “Horizontica.”
|Categories: In-memory DBMS, Memory-centric data management, OLTP, Vertica Systems, VoltDB and H-Store||15 Comments|
Oracle Exadata was pre-teased as “Extreme performance.” Some incorrect speculation shortly before the announcement focused on the possibility of OLTP without disk, which clearly would speed things up a lot. I interpret that in part as being wishful thinking.
The most compelling approach I’ve seen to that problem yet is H-Store, which however makes some radical architectural assumptions. One point I didn’t stress in my earlier posts, but which turned out to be a deal-breaker for one early tire-kicker, is that to use H-Store you have to be able to shoehorn each transaction into its own stored procedure. Depending on how intricate your logic is, that might make it hard to port an existing app to H-Store.
Even for new apps, it could get in the way of some things you might want to do, such as rule-based processing. And that could be a problem. A significant fraction of the highest-performance OLTP apps are customer-facing, and customer-facing apps are one of the biggest areas where rule-based processing comes into play.
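The one-transaction-per-stored-procedure constraint discussed above can be illustrated with a small Python sketch. This is an assumption-laden toy, not H-Store's implementation: `ProcedureStore`, `register`, `call` and `transfer` are hypothetical names, and copying the whole dataset to get all-or-nothing behavior is purely for illustration. The point is that the client gets exactly one call per transaction, with no chance to interleave its own logic between statements.

```python
# Illustrative sketch only: hypothetical names, not H-Store's or VoltDB's API.
import copy


class ProcedureStore:
    def __init__(self):
        self.data = {}
        self.procedures = {}

    def register(self, name, func):
        self.procedures[name] = func

    def call(self, name, *args):
        # The entire transaction is this one call. The procedure works on a
        # copy, which replaces the live data only if it finishes without
        # raising; otherwise the live data is left untouched.
        working = copy.deepcopy(self.data)
        result = self.procedures[name](working, *args)
        self.data = working
        return result


def transfer(accounts, src, dst, amount):
    if accounts.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    accounts[src] -= amount
    accounts[dst] = accounts.get(dst, 0) + amount
    return accounts[src]


store = ProcedureStore()
store.data = {"alice": 100, "bob": 0}
store.register("transfer", transfer)

remaining = store.call("transfer", "alice", "bob", 30)

try:
    store.call("transfer", "alice", "bob", 500)
    failed = False
except ValueError:
    failed = True
```

Logic that needs to consult an external rules engine or go back to the client partway through would have to be folded into the procedure body itself, which is where porting an intricate existing app could get awkward.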