March 23, 2015

A new logical data layer?

I’m skeptical of data federation. I’m skeptical of all-things-to-all-people claims about logical data layers, and in particular of Gartner’s years-premature “Logical Data Warehouse” buzzphrase. Still, a reasonable number of my clients are stealthily trying to do some kind of data layer middleware, as are other vendors more openly, and I don’t think they’re all crazy.

Here are some thoughts as to why, and also as to challenges that need to be overcome.

There are many things a logical data layer might be trying to facilitate — writing, querying, batch data integration, real-time data integration and more. That said:

Trivial query routing or federation is … trivial.

In fact, what I just described is Business Objects’ original innovation — the semantic layer — two decades ago.

Careless query routing or federation can be a performance nightmare. Do a full scan. Move all the data to some intermediate server that lacks capacity or optimization to process it quickly. Wait. Wait. Wait. Wait … hmmm, maybe this wasn’t the best data-architecture strategy.
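To put rough numbers on that, here is a minimal sketch, assuming two in-memory SQLite databases stand in for remote sources. The `orders` table, its columns and the function names are all made up for illustration. The careless version drags every row to the middle tier and filters there, while the other pushes the filter and the aggregate down to each source.

```python
import sqlite3

def make_source(rows):
    """Build an in-memory SQLite database standing in for one remote source (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn

def careless_total(sources, region):
    """Full-scan every source, move every row to the middle tier, filter and sum there."""
    moved = 0
    total = 0.0
    for conn in sources:
        for _id, row_region, amount in conn.execute("SELECT id, region, amount FROM orders"):
            moved += 1
            if row_region == region:
                total += amount
    return total, moved

def pushdown_total(sources, region):
    """Push the filter and the aggregate down, so each source returns a single row."""
    moved = 0
    total = 0.0
    for conn in sources:
        (partial,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE region = ?", (region,)
        ).fetchone()
        moved += 1
        total += partial
    return total, moved

# Two toy sources with 1,000 rows each; only a small fraction is in the "EU" region.
source_a = make_source([(i, "EU" if i % 10 == 0 else "US", 1.0) for i in range(1000)])
source_b = make_source([(i, "EU" if i % 20 == 0 else "US", 2.0) for i in range(1000)])

careless = careless_total([source_a, source_b], "EU")
pushed = pushdown_total([source_a, source_b], "EU")
print("careless:", careless)
print("pushdown:", pushed)
```

Both calls produce the same total, but the second element of each tuple (rows moved to the middle tier) differs by three orders of magnitude even at this toy scale. Real sources differ in what they can accept as pushdown, which is where the non-trivial work starts.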

Streaming goes well with federation. Some data just arrived, and you want to analyze it before it ever gets persisted. You want to analyze it in conjunction with data that’s been around longer. That’s a form of federation right there.
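A minimal sketch of that idea, assuming persisted history lives in a SQLite table and newly arrived events are just an in-memory list that has not been written anywhere yet. The `events` table, the `user_id` field and the function names are hypothetical.

```python
import sqlite3
from collections import Counter

def stored_counts(conn):
    """Per-user event counts from data that has already been persisted."""
    return Counter(dict(conn.execute("SELECT user_id, COUNT(*) FROM events GROUP BY user_id")))

def federated_counts(conn, in_flight):
    """Combine persisted counts with events that have arrived but are not yet stored."""
    combined = stored_counts(conn)
    for event in in_flight:
        combined[event["user_id"]] += 1
    return combined

# Older data, already persisted.
history = sqlite3.connect(":memory:")
history.execute("CREATE TABLE events (user_id TEXT, action TEXT)")
history.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("ann", "login"), ("ann", "view"), ("bob", "login")],
)

# Data that just arrived; it is analyzed here without being written to the table.
arriving = [{"user_id": "bob", "action": "view"}, {"user_id": "cat", "action": "login"}]

result = federated_counts(history, arriving)
print(dict(result))
```

The answer spans both the stored and the in-flight data, and the stored table is left untouched. A production system would also have to deal with ordering, late arrivals and double counting once the stream is eventually persisted, none of which this sketch attempts.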

There are ways to navigate schema messes. Sometimes they work.

Neither extreme view here — “It’s easy!” or “It will never work!” — seems right. Rather, I think there’s room for a lot of effort and differentiation in exposing cross-database schema information.
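As one small illustration of what exposing cross-database schema information can mean, here is a sketch that builds a combined column catalog from two SQLite databases and flags column names that appear in more than one source. The source names, tables and helper functions are invented for the example, and a shared name is only a hint that two columns might be joinable, not proof.

```python
import sqlite3

def describe(name, conn):
    """Yield (source, table, column, declared_type) for every column in one source."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    for table in tables:
        # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
        for _cid, column, col_type, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
            yield (name, table, column, col_type)

def shared_columns(catalog):
    """Map each column name to the sources it appears in, keeping names found in 2+ sources."""
    seen = {}
    for source, _table, column, _type in catalog:
        seen.setdefault(column, set()).add(source)
    return {column: sources for column, sources in seen.items() if len(sources) > 1}

# Two toy sources with hypothetical schemas.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (customer_id INTEGER, email TEXT)")

billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE invoices (invoice_id INTEGER, customer_id INTEGER, total REAL)")

catalog = list(describe("crm", crm)) + list(describe("billing", billing))
candidates = shared_columns(catalog)
print(candidates)
```

Name matching is the easy part. The differentiation comes from handling the cases where names, types or meanings do not line up across systems.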

I’m leaving out one part of the story on purpose: how these data layers are going to be packaged, and specifically what other functionality they will be bundled with. Confidentiality would screw up that part of the discussion; so also would my doubts as to whether some of those plans are fully baked yet. That said, there’s an aspect of logical data layer to CDAP, and to Kiji as well. And of course it’s central to BI (Business Intelligence) and ETL (Extract/Transform/Load) alike.

One way or another, I don’t think the subject of logical data layers is going away any time soon.

3 Responses to “A new logical data layer?”

  1. David Gruzman on March 23rd, 2015 6:02 pm

    It’s funny to see what’s happening. There used to be good old RDBMSs. They can do lots of things – transactions, aggregations, lookups and joins. They’re just hard to scale. So the gang of NoSQLs showed up, and each one took a piece of RDBMS functionality. Some built indexes but no scans, some had full text search, some had joins and group by (map reduce) but no lookups, etc. And now we see attempts to build a layer on top of all this just to gain back what we had with our trusty swiss army knife, RDBMS.

  2. dorian on March 24th, 2015 11:09 am

    Did nobody ever consider whether it would be easier to scale an RDBMS by building a sharding layer for it, instead of creating a sharded KV store and adding features on top?

  3. PeterCsillag on May 19th, 2015 4:32 pm

    I don’t think federation is only about scaling. It is also about preserving and reusing the data, logic and capabilities (e.g. the ability to scale) already built into the underlying source systems.
    My preferred solution, instead of building a superior logical data layer on top of all systems, is to embed the data federation / virtualization functionality into all the databases, so that each can mix data and resources with other systems. This way no trade-offs are required to use one or the other (SQL vs. NoSQL or anything else) all the time. Use the one you actually prefer in front, and utilize the strengths of the rest in the background.

    Not surprisingly, I am a founder of VirtDB, an early-phase data virtualization product which does this.
