January 5, 2013

Data(base) virtualization — a terminological mess

Data/database virtualization seems to be a hot subject right now, and vendors of a wide variety of technologies are all claiming to be in the space. A terminological mess has ensued, as Monash’s First and Third Laws of Commercial Semantics are borne out in spades.

If something is like “virtualization”, then it should resemble hypervisors such as VMware. To me:

Anything that claims to be “like virtualization” should be viewed in that light. I.e., it isn’t real virtualization unless it has the ex uno plures* feature.

*“Out of one, many”. It turns out that e unum pluribus just means the same as e pluribus unum, namely “Out of many, one”; word order isn’t as important in Latin as in English.

Most commonly, “data/database virtualization” is used to denote some kind of transparent data federation.

I think “virtualization” is a bad name for this, because there isn’t much ex uno plures going on. But at least it’s a name that’s in widespread use.

More solid is the sense of “database virtualization” used by Delphix. Their core idea is to take all your different database copies for production, test, development, archiving and so on, and to the extent possible turn them into one real database, plus a bunch of diffs. Cost savings are obvious if that works. The ex uno plures feature is present.
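To make “one real database, plus a bunch of diffs” concrete, here is a minimal copy-on-write sketch. It is only an illustration of the general idea, not a description of how Delphix actually works; the class and method names (`BaseDatabase`, `VirtualCopy`, `read`, `write`, `diff_size`) are hypothetical, and a “database” is reduced to a dictionary of numbered blocks.

```python
class BaseDatabase:
    """The one real copy: maps block number -> contents (hypothetical model)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)


class VirtualCopy:
    """A virtual database: the shared base plus this copy's private diffs."""

    def __init__(self, base):
        self._base = base
        self._diffs = {}  # only blocks this copy has changed

    def read(self, block_id):
        # Prefer the private diff; otherwise fall through to the shared base.
        if block_id in self._diffs:
            return self._diffs[block_id]
        return self._base.blocks[block_id]

    def write(self, block_id, value):
        # Writes never touch the base, so other copies are unaffected.
        self._diffs[block_id] = value

    def diff_size(self):
        # Number of blocks this copy stores on its own.
        return len(self._diffs)


base = BaseDatabase({0: "customers", 1: "orders", 2: "invoices"})
test = VirtualCopy(base)
dev = VirtualCopy(base)

test.write(1, "orders (scrubbed)")
```

After the write, `test` sees its own version of block 1 while `dev` and the base still see the original, and `test` stores only the one block it changed. That is the ex uno plures property in miniature: many apparent databases, one real one.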

Recently, I’ve noticed that transparent sharding is being referred to as database virtualization, especially by ParElastic. Transparent sharding is a great feature, but I don’t think calling it “database virtualization” makes much sense.

I noted back in October that the essence of multitenancy is a special-case version of ex uno plures. If somebody offered that and wanted to call it “virtualization”, I might not argue too much.

Weirdest of all is ScaleDB’s use of the term. ScaleDB seems to be claiming that:

Neither logic nor language support ScaleDB’s side.


5 Responses to “Data(base) virtualization — a terminological mess”

  1. Al DeLosSantos on January 7th, 2013 10:22 am

    Happy New Year Curt.
    Good job diving right into another cloudy topic in the DBMS space. I started researching database virtualization technology a little last year, also with the VMware model as a backdrop, but found it didn’t quite fit, as you mention. I have not resumed my research yet (need to work through the OS/App model vs the DBMS/Data model for a proper solution?), but would welcome another of your classification writeups on vendors in this space if you’re looking for additional topics :^)
    Al D.

  2. Jerry Leichter on February 2nd, 2013 7:50 am

    The broad definitions I’ve used for a pair of distinct but related concepts are:

    – Something is virtual if it appears to be there but is not;
    – Something is transparent if it appears *not* to be there although it *is*.

    A virtual machine appears to be a machine, but it’s just a piece of something else. A VPN appears to be a private network made of real, private wires and routers and such, but that’s not really there – it’s a software construct. The Delphix example you give fits right in: There appear to be separate databases for production, test, and so on – real disks holding private copies of data and indexes and all the rest – but, again, these are really just software.

    You can, of course, tie yourself in knots deciding when it’s virtual and when it’s “just software”. That network the VPN virtualized was much more than just the wires and routers – without the software construct of an “IP network” on top of it, it wouldn’t have been worth much. But in practice the lines are quite clear – and if they aren’t, “virtual” is really the wrong word.

    — Jerry

  3. Comments on Gartner’s 2012 Magic Quadrant for Data Warehouse Database Management Systems — concepts | DBMS 2 : DataBase Management System Services on February 5th, 2013 8:26 am

    […] I disapprove, data virtualization seems to be the term that will win for describing data […]

  4. Kyle Hailey on May 7th, 2013 12:59 pm

    Thanks for the lucid discussion of these terms. The wiki pages on “data virtualization” and “database virtualization” have left much to be desired.
    VMware set the precedent with virtualization, making many out of one, and in a similar way Delphix makes many virtual databases out of one set of database files.
    I like Jerry’s comment that one out of many is “transparency.” If something doesn’t appear to be there but is there, then it’s transparent, such as aggregating multiple databases into one apparent data source. The multiple sources (say, an Oracle source and a SQL Server source) that look like one hide the individual players. Thus, as you say, “transparent data federation” would be a great term for that technology to use.

    – Kyle Hailey

  5. The case for not Delphix | switchedfabric on November 20th, 2013 9:51 am

    […] platform (the virtualization idea of “ex uno plures”, that is, out of one, many). Curt Monash likes Delphix’s idea of database virtualization, but i still […]
