Analysis of hybrid memory-centric DBMS solidDB and its developer Solid Technology. Related subjects include:
Indexes are central to database management.
- My first-ever stock analyst report, in 1982, correctly predicted that index-based DBMS would supplant linked-list ones …
- … and to this day, if one wants to retrieve a small fraction of a database, indexes are generally the most efficient way to go.
- Recently, I’ve had numerous conversations in which indexing strategies played a central role.
Perhaps it’s time for a round-up post on indexing.
1. First, let’s review some basics. Classically:
- An index is a DBMS data structure that you probe to discover where to find the data you really want.
- Indexes make data retrieval much more selective and hence faster.
- While indexes make queries cheaper, they make writes more expensive — because when you write data, you need to update your index as well.
- Indexes also induce costs in database size and administrative efforts. (Manual index management is often the biggest hurdle for “zero-DBA” RDBMS installations.)
2. Further: Read more
|Categories: Data warehousing, Database compression, GIS and geospatial, Google, MapReduce, McObject, MemSQL, MySQL, ScaleDB, solidDB, Sybase, Text, Tokutek and TokuDB||19 Comments|
Three months ago, I pointed out that it is hard to generalize about memory-centric database management, because there are so many different kinds. That said, there are some basic points that I’d like to record as background for any future discussion of the subject, focusing on differences between disk and RAM. And while I’m at it, I’ll throw in a few comments about flash memory as well.
This post would probably be better if I had actual numbers for the speeds of various kinds of silicon operations, but I’ll do what I can without them.
For most purposes, database speed is a function of a few kinds of number:
- CPU cycles consumed.
- I/O throughput.
- I/O wait time.
- Network throughput.
- Network wait time.
The amount of storage used is also important, both directly — storage hardware costs money — and because if you save storage via compression, you may get corresponding benefits in I/O. Power consumption and similar costs are usually tied to hardware efficiency; the less gear you use, the less floor space and cooling you may be able to get away with.
When databases move to RAM from spinning disk, major consequences include: Read more
|Categories: Database compression, Memory-centric data management, Solid-state memory, solidDB||6 Comments|
I’m frequently asked to generalize in some way about in-memory or memory-centric data management. I can start:
- The desire for human real-time interactive response naturally leads to keeping data in RAM.
- Many databases will be ever cheaper to put into RAM over time, thanks to Moore’s Law. (Most) traditional databases will eventually wind up in RAM.
- However, there will be exceptions, mainly on the machine-generated side. Where data creation and RAM data storage are getting cheaper at similar rates … well, the overall cost of RAM storage may not significantly decline.
Getting more specific than that is hard, however, because:
- The possibilities for in-memory data storage are as numerous and varied as those for disk.
- The individual technologies and products for in-memory storage are much less mature than those for disk.
- Solid-state options such as flash just confuse things further.
Consider, for example, some of the in-memory data management ideas kicking around. Read more
- This is a list of Monash Advantage members.
- All our vendor clients are Monash Advantage members, unless …
- … we work with them primarily in their capacity as technology users. (A large fraction of our user clients happen to be SaaS vendors.)
- We do not usually disclose our user clients.
- We do not usually disclose our venture capital clients, nor those who invest in publicly-traded securities.
- Excluded from this round of disclosure is one vendor I have never written about.
- Included in this round of disclosure is one client paying for services partly in stock. All our other clients are cash-only.
For reasons explained below, I’ll group the clients geographically. Obviously, companies often have multiple locations, but this is approximately how it works from the standpoint of their interactions with me. Read more
I talked with McObject yesterday. McObject has two product lines, both of which are something like in-memory DBMS — eXtremeDB, which is the main one, and Perst. McObject has been around since at least 2003, probably has no venture capital, and probably has a very low double-digit number of employees.*
*I could be wrong in those guesses; as small companies go, McObject is unusually prone to secrecy games.
As best I understand:
- eXtremeDB is something like an in-memory object-oriented DBMS, designed to be embeddable.
- However, much as with Objectivity and other old-school OODBMS, eXtremeDB winds up being more of a toolkit with which to build DBMS than a full DBMS.
- eXtremeDB has a few indexing schemes. The main one is good old B-trees. One customer wanted Patricia tries, so they’re in there. (Perhaps not coincidentally, solidDB relies on Patricia tries.) At least one wanted R-trees, so they’re in there too.
- eXtremeDB has long had the option of persistent logs.
- eXtremeDB newly has a hybrid memory-centric option, in which you can have more data in the database than fits into RAM.
- eXtremeDB newly has multi-master two-phase-commit clustering.
My guess three years ago that eXtremeDB might emerge as an alternative to solidDB seems to have been borne out. McObject CEO Steve Graves says that the core of McObject’s business is OEMs, in sectors such as telecom equipment and defense/aerospace. That’s exactly solidDB’s traditional market, except that solidDB got acquired by IBM and deemphasized it.
I’ve said before that if I were starting a SaaS effort — and it wasn’t just focused on analytics — I’d look at using a memory-centric OODBMS. Perhaps eXtremeDB is worth looking at in such scenarios.
|Categories: In-memory DBMS, McObject, Memory-centric data management, Object, Objectivity and Infinite Graph, solidDB, Telecommunications||11 Comments|
In January, 2010, I posited that it might be helpful to view data as being divided into three categories:
- Human/Tabular data –i.e., human-generated data that fits well into relational tables or arrays.
- Human/Nontabular data — i.e., all other data generated by humans.
- Machine-Generated data.
I won’t now stand by every nuance in that post, which may differ slightly from those in my more recent posts about machine-generated data and poly-structured databases. But one general idea is hard to dispute:
Traditional database data — records of human transactional activity, referred to as “Human/Tabular data above” — will not grow as fast as Moore’s Law makes computer chips cheaper.
And that point has a straightforward corollary, namely:
It will become ever more affordable to put traditional database data entirely into RAM. Read more
|Categories: Analytic technologies, Cache, In-memory DBMS, memcached, Memory-centric data management, OLTP, Oracle, Oracle TimesTen, SAP AG, solidDB, Storage, Theory and architecture, VoltDB and H-Store||25 Comments|
Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion today (some links below). Here are some quick thoughts around the subject of how the IBM/Sun deal — if it happens — might affect the database management system industry. Read more
|Categories: Actian and Ingres, Data warehousing, EnterpriseDB and Postgres Plus, Greenplum, IBM and DB2, Infobright, Kickfire, Kognitio, Microsoft and SQL*Server, Mid-range, MySQL, Open source, ParAccel, PostgreSQL, solidDB||10 Comments|
A correspondent from China wrote in to ask about products that matched the following application scenario: Read more
|Categories: In-memory DBMS, McObject, Memory-centric data management, OLTP, Oracle TimesTen, solidDB||7 Comments|
Apparently, IBM is rolling out an appliance for small businesses. MySQL is under the covers. The appliance won’t have a keyboard or monitor, so there won’t be a lot of database administration going on.
Before Solid and solidDB were acquired by IBM, one of the things Solid was proudest of was some embedded apps in which solidDB ran for years in boxes without keyboards or monitors.
I still think it’s a pity that IBM isn’t using solidDB as broadly as the technology deserves. Even so, this is a nice endorsement of MySQL for reliable zero-DBA mid-range use.
McObject — vendor of memory-centric DBMS eXtremeDB — is a tiny, tiny company, without a development team of the size one would think needed to turn out one or more highly-reliable DBMS. So I haven’t spent a lot of time thinking about whether it’s a serious alternative to solidDB for embedded DBMS, e.g. in telecom equipment. However:
- IBM’s acquisition of Solid seems to suggest a focus on DB2 caching rather than the embedded market
- McObject actually has built up something of a customer list, as per the boilerplate on any of its press releases.
And they do seem to have some nice features, including Patricia tries (like solidDB), R-trees (for geospatial), and some kind of hybrid disk-centric/memory-centric operation.
|Categories: GIS and geospatial, In-memory DBMS, McObject, Memory-centric data management, solidDB||6 Comments|