March 23, 2014

DBMS2 revisited

The name of this blog comes from an August, 2005 column. 8 1/2 years later, that analysis holds up pretty well. Indeed, I’d keep the first two precepts exactly as I proposed back then:

I’d also keep the general sense of the third precept, namely appropriately-capable data integration, but for that one the specifics do need some serious rework.

For starters, let me say:

Next, I’d like to call out what is generally a non-problem: when a query can go to two or more systems for the same information, which one should it tap? In theory, that’s a much harder problem than ordinary DBMS optimization. But in practice, only the simplest forms of the challenge tend to arise, because when data is stored in more than one system, those systems tend to have wildly different use cases, performance profiles and/or permissions.
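To make that concrete, here is a minimal sketch of why the choice is usually trivial. It is an illustration only, not a description of any real product: the `Store` fields, the `pick_store` function, and the two sample systems are all hypothetical. The point is that once you filter by permission, use case and acceptable latency, there is typically one system left standing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    """A hypothetical system holding a copy of the same data."""
    name: str
    use_case: str            # e.g. "operational" or "analytic"
    typical_latency_ms: int  # rough response time for this kind of query
    allowed_roles: frozenset  # who is permitted to query it


def pick_store(stores, role, use_case, max_latency_ms):
    """Return the one store that fits; raise if simple rules do not settle it."""
    candidates = [
        s for s in stores
        if role in s.allowed_roles
        and s.use_case == use_case
        and s.typical_latency_ms <= max_latency_ms
    ]
    if len(candidates) != 1:
        raise LookupError(
            f"{len(candidates)} candidate stores; simple rules do not settle it"
        )
    return candidates[0]


# Assumed example: order data lives in both an OLTP system and a warehouse.
STORES = [
    Store("orders_oltp", "operational", 10, frozenset({"app"})),
    Store("orders_warehouse", "analytic", 2000, frozenset({"analyst", "app"})),
]

chosen = pick_store(STORES, role="analyst", use_case="analytic", max_latency_ms=5000)
print(chosen.name)
```

The hard version of the problem only shows up when more than one candidate survives the filter, which is the case the sketch deliberately refuses to handle.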

So what I’m saying is that most traditional kinds of data integration problems are well understood and on their way to being solved in practice. We have our silos; data is replicated as needed between silos; and everything is more or less cool. But of course, as traditional problems get solved, new ones arise, and those turn out to be concentrated among real-time requirements.

“Real-time” of course means different things in different contexts, but for now I think we can safely partition it into two buckets:

The latter category arises in the case of automated bidding, famously in high-frequency securities trading, but now in real-time advertising auctions as well. But those vertical markets aside, human real-time integration generally is fast enough.

Narrowing the scope further, I’d say that real-time transactional integration has worked for a while. I date it back to the initially clunky EAI (Enterprise Application Integration) vendors of the latter 1990s. The market didn’t turn out to be that big, but neither did the ETL market, so it’s all good. SOAs, as previously noted, are doing pretty well.

Where things still seem to be dicier is in the area of real-time analytic integration. How can analytic processing be tougher in this regard than transactional? Two ways. One, of course, is data volume. The second is that it’s more likely to involve machine-generated data streams. That said, while I hear a lot about a BI need-for-speed, I often suspect it of being a want-for-speed instead. So while I’m interested in writing a more focused future post on real-time data integration, there may be a bit of latency before it comes out.

Comments

2 Responses to “DBMS2 revisited”

  1. MattK on March 23rd, 2014 3:54 pm

    > People associate it with the kind of schema-heavy relational database design that’s now widely hated, and the long project cycles it is believed to bring with it.

    That might be a bit strong. I am seeing more examples, in some scenarios, that the fixed schema is the right solution, with some projects migrating away from schemaless.

    As always, one approach does not solve all problems.

  2. SteveF on March 26th, 2014 8:18 pm

    Re: multi-systems of record non-problem

    I respectfully disagree that this is a non-problem. A couple common real-world examples:

    – Same entity, horizontally partitioned between different systems with different data models and reference values. E.g. a customer dimension. How do you combine them into one dimension? A: Nasty rollup logic.

    – Same entity and record, but contradictory information across two systems (e.g. one says an order is open, the other says it is closed). Which is it? Business rules come into play.

    – Two giant tables that have to be joined at the middle tier. Only the most advanced BI products can break down a query and make each system aggregate its respective tables, and join the result sets in a second step. And it’s not trivial to design.

    I’ve never been able to assume that only trivial cross-system queries will be required in the real world.
