November 5, 2012

Do you need an analytic RDBMS?

I can think of seven major reasons not to use an analytic RDBMS. One is good; but the other six seem pretty questionable, niche circumstances excepted, especially at this time.

The good reason to not have an analytic RDBMS is that most organizations can run perfectly well on some combination of:

Those enterprises, however, are generally not who I write for or about.

The six bad reasons to not have an analytic RDBMS all take the form “Can’t some other technology do the job better?”, namely:

*I’ve long used “tabular” to cover both relational and MOLAP structures, the point being that in both cases you have a neat and regular schema, well-represented as a set of arrays.

What could change this picture would be a future in which:

In that case, it might be reasonable to get by with:

I’m on record as suggesting that traditional databases will indeed wind up in RAM. But I’m more doubtful that a single in-memory DBMS will suffice for OTLP and analytics alike.

What are some key aspects of a specialized analytic RDMS? My partial and overlapping list starts:

An analytic RDBMS typically:

If all the data is in RAM, these problems are indeed lessened. Also, Oracle Exadata is dedicated to the premise that, even using conventional computer parts, I/O bottlenecks can be reduced with enough hardware — and price aside, it seems to work. Still, if you talk with analytic RDBMS designers, you repeatedly hear that it’s not that simple — even getting data efficiently out of RAM is different in the analytic and OLTP cases.

Query planning/execution, data movement, and workload management go together — they’re all about getting the most work done with the least machine effort, and they all depend on determining which specific execution choices might be synergistic or anti-synergistic with each other. Taken together, they form a very tough optimization challenge, which is different in the OLTP and analytic cases. Adding in analytic platform capabilities adds yet more difficulty to the optimization problem. And so:

A fast analytic database manager is a hard thing to build; expecting it to be fast at OLTP as well may be too much to ask for.

Given that, the discussion pivots to:

OK, but can we overprovision the RAM by so much that suboptimal performance doesn’t matter?

My guess is “Not any time soon” — because efficiency is always a good thing, databases will always grow, and RAM will never be free.

Bottom line: Analytic RDBMS will likely be needed for a long time.

Related link

Comments

10 Responses to “Do you need an analytic RDBMS?”

  1. Thomas W Dinsmore on November 6th, 2012 9:15 am

    Analytic databases will always be with us in part due to organizational conflict between the needs of business stakeholders and the IT organization. Analysts whose careers depend on speed and agility are often at odds with IT, which tends to be control-oriented and focused on cost.

    There are entire industries that depend on this divide. In 1995, Gartner predicted that improvements to enterprise data warehouses would put Marketing Service Organizations out of business. MSOs today continue to grow and expand.

  2. Igor Stolarov on November 6th, 2012 11:39 am

    from the point of the start-up that has all the computing resources on Amazon cloud, I’d like to make a couple of points:

    – Access to elastic map reduce resources (Hadoop and Amazon map reduce) is very important. However, map reduce coding (which needs to be flexible, quick and “versioning friendly”) is not trivial, and the deployment of the environment takes time.

    – Availability of the large analytical DMBS with advanced SQL capabilities can accomplish most of the required (and quite advanced) processing leaving a portion of only what’s absolutely required for Hadoop and other non-db and non-mapreduce processing.
    We use Greenplum CE (on the largest single cluster instance in EC2 deployed with multiple EBS volumes) and it does the job beautifully even with the bad I/O typical for EC2. It is also very easily integrated with S3 storage and non-database processing.
    Data partitioning, compression, and advanced querying capabilities all what PostgreSQL has to offer with parallel processing is very hard to beat.
    This setup allows for a very flexible, very capable, and very cost effective infrastructure that is very hard to achieve on any single platform.

  3. Brian on November 8th, 2012 4:27 pm

    My company runs Database with a data base company- We also use Greenplum CE and it does a wonderful job as well… It is very flexible.. Our IT dept has been very appreciative of the new set up. Thanks for the great post!

  4. MattK on November 9th, 2012 6:41 pm

    > MOLAP. That ship has sailed …

    So pre-aggregated cubes are no longer a thing now?

    Replaced by ROLAP, or is it now NoLAP?

  5. Curt Monash on November 9th, 2012 7:16 pm

    Pretty much. Performance can be gotten without MOLAP rigidity.

  6. Teradata, SAP, HANA, in memory on November 13th, 2012 9:29 am
  7. Ian on November 15th, 2012 5:58 am

    The MOLAP ship is alive and well – but perhaps has found its place in the spectrum of BI solutions: Essentially batch-based and conceived around pre-aggregation as the strategy for increasing performance, it’s retreating into the mid-data/complex structured data arena leaving the lower latency, continuous reporting of very big data to the big appliance vendors.

    In particular, the rich (but complex) language that can define sets and calculated members is very useful in encapsulating complex logic within the cube structure for use in slice-and-dice interfaces. Additionally the MDX language being devoid of JOIN syntax is far more suited to reporting queries than SQL ever is.

  8. MattK on November 28th, 2012 11:25 am

    > Performance can be gotten without MOLAP rigidity.

    Is this perhaps what is driving the Microsoft BISM / Tabular mode and DAX?

  9. Notes on Microsoft SQL Server | DBMS 2 : DataBase Management System Services on November 29th, 2012 5:36 am

    […] can’t yet comment about is MOLAP/ROLAP, which is a pity; if anybody can refute my claim that ROLAP trumps MOLAP, it’s either Microsoft or […]

  10. Thoughts on in-memory columnar add-ons | DBMS 2 : DataBase Management System Services on October 10th, 2013 6:15 am

    […] certain merits of this approach, I don’t believe in complete alternatives to analytic RDBMS. The rise of analytic DBMS oriented toward multi-structured data just strengthens that […]

Leave a Reply




Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.