The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part). OLTP (OnLine Transaction Processing) and general-purpose DBMS startups, however, have not yet done as well. Such success as there has been (MySQL, InterSystems Caché, solidDB’s exit, etc.) has generally accrued to products that originated in the 20th century.
Nonetheless, OLTP/general-purpose data management startup activity has recently picked up, targeting what I see as some very real opportunities and needs. So as a jumping-off point for further writing, I thought it might be interesting to collect a few observations about the market in one place. These include:
- Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.
- By number, most of an enterprise’s OLTP/general-purpose databases are low-volume and low-value.
- Most interesting new OLTP/general-purpose data management products are either MySQL-based or NoSQL.
- It’s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.
- The era of silicon-centric relational DBMS is coming.
- The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds.
- Users’ insistence on “free” could be a major problem for OLTP DBMS innovation.
I shall explain.
Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.
- OLTP applications are more complex than analytic ones, and hence more tightly wired into particular brands of DBMS. For example, third-party packaged OLTP applications are typically portable among only a few brands of DBMS. But third-party business intelligence tools, and the BI “applications” built in them, are more easily and widely portable.
- Specific technical observations such as “OLTP apps tend to use stored procedures, which are DBMS-specific” or “OLTP apps tend to have lots and lots of tables” serve to underscore the first point.
- An enterprise’s highest-value data is commonly the financial stuff handled by its core OLTP systems, so those systems are the last things it wants to mess around with just to get some cost savings. Security, high availability, and so on are major considerations that can outweigh cost.
By number, most of an enterprise’s OLTP/general-purpose databases are low-volume and low-value. Indeed, “OLTP” is often a misnomer, which is why I tend to go with “general-purpose” or some similarly wishy-washy phrase instead.
- In theory, this is a ripe area for what I’ve called mid-range DBMS.
- The big brand vendors try hard to keep as many of those databases for themselves as they can. Enterprise-wide license pricing helps. Going forward, so will virtualization/consolidation strategies, such as Oracle’s Exadata-centric approach.
- A variety of mid-range DBMS alternatives beyond the big brands have technical merit, at least in some cases and configurations – MySQL, PostgreSQL, InterSystems Caché, and so on.
- The only such mid-range DBMS alternative with much large-enterprise business momentum, however, appears to be MySQL.
“General-purpose” might be a better term than “OLTP” anyway.
- I don’t have a link, but it’s widely agreed that over half of the processing in an “OLTP” enterprise app is commonly reporting and the like.
- “Operational BI” is progressing by fits and starts, but it is progressing.
- Anything customer-facing — web-based, call center, or otherwise — is likely to include a heavy dose of “real-time” analytic optimization.
Most interesting new OLTP/general-purpose data management products are either MySQL-based or NoSQL.
- VoltDB is the main exception that jumps to mind.
- This isn’t true in the analytic DBMS area, where Netezza, Greenplum, Aster, Vertica and others started from PostgreSQL’s code, APIs, or both.
It’s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.
- MySQL is a limited product without all the third-party storage engines that are being developed.
- Oracle’s promise of MySQL good behavior has an expiration date.
- None of the MySQL front-end alternatives are remotely mature yet.
The era of silicon-centric relational DBMS is coming.
- I think “silicon” means “solid-state memory” as much as or more than it means “RAM,” but that’s not yet certain.
- What is pretty certain is that, thanks to Moore’s Law, some kind of silicon will increasingly replace disk.
- Oracle’s increasingly Flash-centric story is a challenge to everybody.
- RAM-centric VoltDB will launch fairly soon. (By the way, while VoltDB still has a lot in common with H-Store, they’re not exactly the same thing. And H-Store research is progressing too.)
- RethinkDB is being developed, focused directly on solid-state memory. Based on the sparse information available online, RethinkDB sounds somewhat like a dumbed-down H-Store.
- New disk-based vendors may never optimize their use of disk, instead targeting a solid-state future. (E.g., I think Akiban should and quite well might follow this path.)
Users’ insistence on “free” could be a major problem for OLTP DBMS innovation. Vendors of new OLTP data management technologies often feel obligated to open source their products, notwithstanding the historical lack of revenue in the open source OLTP DBMS market. As just one of many examples, Nova Spivack wrote:
“I have recently seen some new graph data storage products that may provide the levels of scale and performance needed, but pricing has not been determined yet. In short, storage and retrieval of semantic graph datasets is a big unsolved challenge that is holding back the entire industry. We need federated database systems that can handle hundreds of billions to trillions of triples under high load conditions, in the cloud, on commodity hardware and open source software. Only then will it be affordable to make semantic applications and services at Web-scale.”
I hear similar things from other startups, who evidently believe they need and/or are entitled to enjoy sophisticated, high-performance, zero-cost, specialized database management technology.