November 5, 2012

Do you need an analytic RDBMS?

I can think of seven major reasons not to use an analytic RDBMS. One is good; but the other six seem pretty questionable, niche circumstances excepted, especially at this time.

The good reason to not have an analytic RDBMS is that most organizations can run perfectly well on some combination of:

SaaS (Software as a Service).
A low-volume static website.
A network focused on office software.
A single cheap server, likely running a single instance of a general-purpose RDBMS.

Those enterprises, however, are generally not who I write for or about.

The six bad reasons to not have an analytic RDBMS all take the form “Can’t some other technology do the job better?”, namely:

A data warehouse that’s just another instance of your OLTP (OnLine Transaction Processing) RDBMS. If your problem is that big, it’s likely that a specialized analytic RDBMS will be more cost-effective and generally easier to deal with.
MOLAP (Multi-Dimensional OnLine Analytic Processing). That ship has sailed … and foundered … and been towed to drydock.
In-memory BI. QlikView, SAP HANA, Oracle Exalytics, and Platfora are just four examples of many. But few enterprises will want to confine their analytics to such data as fits affordably in RAM.
Non-tabular* approaches to investigative analytics. There are many examples in the Hadoop world — including the recent wave of SQL add-ons to Hadoop — and some in the graph area as well. But those choices will rarely suffice for the whole job, as most enterprises will want better analytic SQL performance for (big) parts of their workloads.
Tighter integration of analytics and OLTP (OnLine Transaction Processing). Workday worklets illustrate that business intelligence/OLTP integration is a really good idea. And it’s an idea that Oracle and SAP can be expected to push heavily, when they finally get their product acts together. But again, that’s hardly all the analytics you’re going to want to do.
Tighter integration of analytics and other short-request processing. An example would be maintaining a casual game’s leaderboard via a NoSQL write-optimized database. Yet again, that’s hardly all the analytics a typical enterprise will want to do.

*I’ve long used “tabular” to cover both relational and MOLAP structures, the point being that in both cases you have a neat and regular schema, well-represented as a set of arrays.

What could change this picture would be a future in which:

All your tabular business data fits into RAM.
Also, the OLTP/analytic DBMS distinction becomes less important.

In that case, it might be reasonable to get by with:

A single in-memory relational DBMS, handling OLTP and some analytics alike.
Whichever additional short-request systems you need (mainly for internet-heavy uses).
A Hadoop-based analytic data store.

I’m on record as suggesting that traditional databases will indeed wind up in RAM. But I’m more doubtful that a single in-memory DBMS will suffice for OTLP and analytics alike.

What are some key aspects of a specialized analytic RDMS? My partial and overlapping list starts:

Fast, high-volume analytic I/O.
Smart query planning.
Smart, high-volume internal data movement.
Smart workload management.
Good data compression, including in cache and during query execution.
Strong analytic platform capabilities.
Fast execution of analytic requests — standard SQL, advanced SQL, or other.

An analytic RDBMS typically:

Is optimized for reads, which are often large, and perhaps temporary large writes as well.
Reduces I/O bottlenecks via, for example, compression, columnar storage, and/or scale-out.

If all the data is in RAM, these problems are indeed lessened. Also, Oracle Exadata is dedicated to the premise that, even using conventional computer parts, I/O bottlenecks can be reduced with enough hardware — and price aside, it seems to work. Still, if you talk with analytic RDBMS designers, you repeatedly hear that it’s not that simple — even getting data efficiently out of RAM is different in the analytic and OLTP cases.

Query planning/execution, data movement, and workload management go together — they’re all about getting the most work done with the least machine effort, and they all depend on determining which specific execution choices might be synergistic or anti-synergistic with each other. Taken together, they form a very tough optimization challenge, which is different in the OLTP and analytic cases. Adding in analytic platform capabilities adds yet more difficulty to the optimization problem. And so:

A fast analytic database manager is a hard thing to build; expecting it to be fast at OLTP as well may be too much to ask for.

Given that, the discussion pivots to:

OK, but can we overprovision the RAM by so much that suboptimal performance doesn’t matter?

My guess is “Not any time soon” — because efficiency is always a good thing, databases will always grow, and RAM will never be free.

Bottom line: Analytic RDBMS will likely be needed for a long time.

Related link

Integrating short-request and analytic processing (March, 2011)

Categories: Business intelligence, Data warehousing, Games and virtual worlds, Hadoop, In-memory DBMS, MOLAP

Subscribe to our complete feed!

Comments

12 Responses to “Do you need an analytic RDBMS?”

Thomas W Dinsmore on November 6th, 2012 9:15 am

Analytic databases will always be with us in part due to organizational conflict between the needs of business stakeholders and the IT organization. Analysts whose careers depend on speed and agility are often at odds with IT, which tends to be control-oriented and focused on cost.

There are entire industries that depend on this divide. In 1995, Gartner predicted that improvements to enterprise data warehouses would put Marketing Service Organizations out of business. MSOs today continue to grow and expand.
Igor Stolarov on November 6th, 2012 11:39 am

from the point of the start-up that has all the computing resources on Amazon cloud, I’d like to make a couple of points:

– Access to elastic map reduce resources (Hadoop and Amazon map reduce) is very important. However, map reduce coding (which needs to be flexible, quick and “versioning friendly”) is not trivial, and the deployment of the environment takes time.

– Availability of the large analytical DMBS with advanced SQL capabilities can accomplish most of the required (and quite advanced) processing leaving a portion of only what’s absolutely required for Hadoop and other non-db and non-mapreduce processing.
We use Greenplum CE (on the largest single cluster instance in EC2 deployed with multiple EBS volumes) and it does the job beautifully even with the bad I/O typical for EC2. It is also very easily integrated with S3 storage and non-database processing.
Data partitioning, compression, and advanced querying capabilities all what PostgreSQL has to offer with parallel processing is very hard to beat.
This setup allows for a very flexible, very capable, and very cost effective infrastructure that is very hard to achieve on any single platform.
Brian on November 8th, 2012 4:27 pm

My company runs Database with a data base company- We also use Greenplum CE and it does a wonderful job as well… It is very flexible.. Our IT dept has been very appreciative of the new set up. Thanks for the great post!
MattK on November 9th, 2012 6:41 pm

> MOLAP. That ship has sailed …

So pre-aggregated cubes are no longer a thing now?

Replaced by ROLAP, or is it now NoLAP?
Curt Monash on November 9th, 2012 7:16 pm

Pretty much. Performance can be gotten without MOLAP rigidity.
Teradata, SAP, HANA, in memory on November 13th, 2012 9:29 am

[…] http://www.dbms2.com/2012/11/05/do-you-need-an-analytic-rdbms/ […]
Ian on November 15th, 2012 5:58 am

The MOLAP ship is alive and well – but perhaps has found its place in the spectrum of BI solutions: Essentially batch-based and conceived around pre-aggregation as the strategy for increasing performance, it’s retreating into the mid-data/complex structured data arena leaving the lower latency, continuous reporting of very big data to the big appliance vendors.

In particular, the rich (but complex) language that can define sets and calculated members is very useful in encapsulating complex logic within the cube structure for use in slice-and-dice interfaces. Additionally the MDX language being devoid of JOIN syntax is far more suited to reporting queries than SQL ever is.
MattK on November 28th, 2012 11:25 am

> Performance can be gotten without MOLAP rigidity.

Is this perhaps what is driving the Microsoft BISM / Tabular mode and DAX?
Notes on Microsoft SQL Server | DBMS 2 : DataBase Management System Services on November 29th, 2012 5:36 am

[…] can’t yet comment about is MOLAP/ROLAP, which is a pity; if anybody can refute my claim that ROLAP trumps MOLAP, it’s either Microsoft or […]
Thoughts on in-memory columnar add-ons | DBMS 2 : DataBase Management System Services on October 10th, 2013 6:15 am

[…] certain merits of this approach, I don’t believe in complete alternatives to analytic RDBMS. The rise of analytic DBMS oriented toward multi-structured data just strengthens that […]
top.moncs.su on October 15th, 2015 1:23 pm

top.moncs.su

Do you need an analytic RDBMS? | DBMS 2 : DataBase Management System Services
run your own web server on November 25th, 2015 11:04 am

run your own web server

Do you need an analytic RDBMS? | DBMS 2 : DataBase Management System Services

Leave a Reply

Search our blogs and white papers

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.

Links
- Monash Research
- White Papers
Admin
- Log in

Do you need an analytic RDBMS?

Comments

Search our blogs and white papers

Monash Research blogs

User consulting

Vendor advisory

Monash Research highlights

Recent posts

Categories

Date archives

Admin