July 7, 2010

Why analytic DBMS increasingly need to be storage-aware

In my quick reactions to the EMC/Greenplum announcement, I opined

I think that even software-only analytic DBMS vendors should design their systems in an increasingly storage-aware manner

promising to explain what I meant later on. So here goes. 

There always have been good technical reasons to tailor hardware to analytic database software. Data moves through disk controller, network, RAM, CPU and more, each with its own data rate. Getting different kinds of parts into the right balance doesn’t completely eliminate bottlenecks – the Wonderful One-Hoss Shay is poetic fiction – but it certainly can help. As a result, every analytic DBMS vendor of any size offers at least one of:

And beyond performance, appliances and pre-specified hardware configurations offer at least the possibility of easing installation, administration, and support.

There also are marketing reasons to offer an appliance or something appliance-like.

Finally, there are three overlapping technical trends that increase the need for storage-awareness in analytic DBMS. First and foremost is the rise of solid-state memory. For starters, I believe:

But this move to flash will require analytic DBMS vendors to be increasingly storage-aware for at least three reasons:

Another trend that could naturally lead analytic DBMS vendors to be more storage-aware is their incorporation of what could be viewed as hierarchical storage/ILM technologies. Different data is stored in different ways and/or on different kinds of storage hardware. (Vendors pursuing – you guessed it – different approaches to this include Teradata, Greenplum, Vertica, and Sybase.) The more automatic that process is, the more storage-aware the DBMS will need to be.
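The kind of automatic, temperature-driven placement described above can be sketched in miniature. Everything here — the class names, the two-tier model, the thresholds — is a hypothetical illustration, not any vendor's actual mechanism:

```python
# Hypothetical sketch of temperature-based storage tiering.
# All names and thresholds are illustrative, not a real vendor API.
from dataclasses import dataclass, field
import time

@dataclass
class Block:
    block_id: int
    tier: str = "disk"                      # current storage tier
    access_times: list = field(default_factory=list)

    def record_access(self, now=None):
        self.access_times.append(now if now is not None else time.time())

    def temperature(self, window=3600.0, now=None):
        """Accesses within the last `window` seconds: a crude heat score."""
        now = now if now is not None else time.time()
        return sum(1 for t in self.access_times if now - t <= window)

def migrate(blocks, hot_threshold=10, cold_threshold=1, now=None):
    """Move hot blocks toward flash, cold blocks toward disk."""
    for b in blocks:
        heat = b.temperature(now=now)
        if heat >= hot_threshold:
            b.tier = "flash"
        elif heat <= cold_threshold:
            b.tier = "disk"
    return blocks
```

A real DBMS would track heat with decayed counters at block or extent granularity and migrate asynchronously in the background; the point is only that the engine must know which tier each block lives on and how fast that tier is — i.e., it must be storage-aware.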

Finally, there are reasons to think that DBMS should be split between conventional servers and smart storage. This is, of course, the Exadata strategy. Netezza’s two-processor approach, while rather different, also somewhat validates the idea.


6 Responses to “Why analytic DBMS increasingly need to be storage-aware”

  1. Mark Weiss on July 7th, 2010 12:12 pm

    The idea that flash helps remove the storage/disk access bottleneck, and that columnar and other analytic DB architectures are largely predicated on opening that bottleneck, isn’t unlike the insight followed by VoltDB (and before it, in the early ’90s, by kdb). The idea is that if you remove the need to support slow, blocking operations, then you can just do serial operations very fast, use all your resources on computation, and dramatically simplify your system by cutting out all the thread management, concurrency management, etc. required to support concurrent blocking operations. Do you think this speaks to a convergence toward simpler architectures that rely on in-memory and fast storage, and can support transactional as well as analytic workloads because disk access is removed as an issue?

  2. Curt Monash on July 7th, 2010 2:38 pm


    I think OLTP and OLAP will long call for different DBMS architectures. It still matters whether you’re bringing back big blocks or single records. It still matters whether you have a big data redistribution issue. It still matters at what speed you’re doing updates.

  3. Vlad Rodionov on July 7th, 2010 5:15 pm

    “The idea is that if you remove the need to support slow, blocking operations, then you can just do serial operations very fast, use all your resources on computation, and dramatically simplify your system by cutting out all the thread management, concurrency management, etc. required to support concurrent blocking operations.”

    This is the “one query – one thread – one CPU core” approach, and it has proved to be suboptimal (hey, Infobright). You will definitely need thread management and concurrency control in your system if you care about performance and resource utilization. I am not talking about extreme OLTP systems like VoltDB, though – mostly about OLAP and analytical DBMS.

  4. Jim Dietz on July 9th, 2010 7:20 pm

    Good points made here. Flash memory technology is indeed a game-changer for the DW industry. As Curt has posted here in the past, there’s already an appliance available that leverages an all-flash approach, in the form of solid-state drive arrays for data storage. This product opens up whole new applications for DW in very high-performance analytics, such as near-real-time cyber security and capital markets portfolio risk assessment.

    The ultimate best use of flash is intelligently combining its capabilities with the other storage technologies to take best advantage of each. To do this right, the DBMS has to be very much storage-aware. It has to be able to characterize the speed of the various types of storage hardware, and at the same time characterize the usage patterns of each small data block – essentially taking its “temperature”. Then the DBMS has to be able to intelligently migrate data to the storage type that best matches its usage “temperature” – and do all this automatically. See another blog post (Teradata Virtual Storage) on this site for an available approach.

    On the last point, splitting the DBMS between conventional servers and smart storage makes a lot of sense, but it has to be done carefully to be sure you can efficiently accomplish the “hybrid” storage work discussed above. We decided long ago that virtualizing the database servers and the smart storage intelligence in the same powerful node made the best use of all the resources, including the enormous power of Intel multi-core processors.

  5. andy on July 14th, 2010 12:14 pm

    I am impressed with most of the discussion here. However, one point should be made. A “DBMS” has no intelligence, in the traditional sense. Further, those not practicing our discipline are done a disservice when anthropomorphic tags are used. Since computer systems execute what they are instructed to do by humans, humans need to optimize the performance of DBMSs in order to take advantage of flash technology. Designers and architects need to inform developers and users of the “characteristic speed of the various types of storage hardware and at the same time characterize the usage patterns of each small data block” in order for the DBMS to reach its full performance potential. “Then the [project team]… has to be able to intelligently [prescribe the method to] migrate data to the storage type that best meets…” the characterized speed of the storage hardware and the usage of each small data block. Obviously, this technical work needs to be undertaken to enhance the performance of electronic, digitally based warehouse artifacts.
    Using the shorthand of anthropomorphism is as wrong as the consequences of not properly executing due diligence in developing reference databases in the first place.
    To me the term AI is an oxymoron.

  6. Alin Dobra on July 17th, 2010 8:55 am


    You said in your post:

    “Flash overturns some of the fundamental assumptions of modern analytic DBMS, in particular:

    * Sequential reads are hugely better than random”

    In my experience this might not necessarily be the case. A couple of months back I got my hands on two of the new OCZ P88 SSD drives (PCI cards, really). They are the top of the line in SSD technology, with an advertised 30,000 IO/s and a 1.4GB/s sequential read rate (and they cost $4,000 apiece). After playing with them I noticed the following:

    1. With 4K pages, even sequential reads are slow: 33MB/s. Indeed, the “seek time” is negligible, but that does not mean that things are perfect.

    2. With 4M pages, sequential reads run at 1.4GB/s. That is 40 times better than with 4K pages, which means that mostly-sequential access using 4K pages is nowhere close to large-page sequential access. In fact, I think most people do not realize just how large pages have to be to get peak performance from modern storage.

    3. If you do the math, at 33MB/s with 4K pages you only get roughly 8,000 IO/s. To get the rest you must have multiple pending requests. That means you must have an engine that can “parallelize” index algorithms if you want good performance for a single query.
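The arithmetic in point 3 can be sanity-checked in a few lines. Decimal MB/GB are assumed here; binary units would shift the figures slightly, and the numbers below come only from the measurements quoted in this comment:

```python
# Back-of-the-envelope check of the SSD figures above (decimal units).
MB = 10**6
GB = 10**9

seq_4k_bw = 33 * MB        # measured sequential bandwidth with 4K pages
seq_4m_bw = 1.4 * GB       # measured sequential bandwidth with 4M pages
page_4k = 4 * 1024         # 4K page size in bytes
advertised_iops = 30_000   # vendor-advertised IO/s

# IO/s actually achieved with a single outstanding stream of 4K reads
achieved_iops = seq_4k_bw / page_4k            # ≈ 8,057

# Rough queue depth needed to reach the advertised IOPS
queue_depth = advertised_iops / achieved_iops  # ≈ 3.7 pending requests

# Bandwidth ratio between 4M and 4K page reads
ratio = seq_4m_bw / seq_4k_bw                  # ≈ 42x
```

The queue-depth line is the commenter's point restated: to collect the advertised IOPS, the engine must keep several requests in flight at once, i.e., "parallelize" its index access paths.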

    What the above means to me is that you should still do pure sequential access for analytical queries, and focus on a design that supports large pages and can use multiple devices in parallel. Case in point: for TPC-H Q6 at 1TB scale on the two P88 disks with DataPath (see the SIGMOD paper this year), I get running times of 62s. If you look at published TPC-H results you will notice that systems with 500+ disks take 5-6s, which is 10-12 times better. The bulk of that comes from selecting 1 year out of 7 (a 7X improvement), so the indexing does not help that much. The 7X is easy to pick up in a scan-based system (you only have to start the scan where tuples with the correct date appear; this is a one-trick pony, since it works for a single attribute). The tremendous advantage of linear scans is that you can do parallel query execution. For example, on DataPath 32 instances of Q6 take 70s (for all of them), and 64 instances take 92s. This suggests that whatever advantage you have from indexing (random I/O) is wiped out for concurrent queries. DataPath was using 4M pages and was reading data sequentially at 2GB/s (over both disks).
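The concurrency numbers quoted above imply near-linear throughput scaling for shared scans; a quick calculation, using only the figures given in the comment, makes that explicit:

```python
# Rough check of the shared-scan concurrency numbers reported above.
single_query_s = 62.0   # one Q6 instance, wall time
batch32_s = 70.0        # 32 concurrent instances, total wall time
batch64_s = 92.0        # 64 concurrent instances, total wall time

# Effective per-query time when queries share the linear scan
per_query_32 = batch32_s / 32                    # ≈ 2.2s per query
per_query_64 = batch64_s / 64                    # ≈ 1.4s per query

# Throughput gain versus running the same queries serially
speedup_32 = (single_query_s * 32) / batch32_s   # ≈ 28x
speedup_64 = (single_query_s * 64) / batch64_s   # ≈ 43x
```

In other words, 64 concurrent queries cost only about 50% more wall time than one, which is why the indexed systems' single-query advantage evaporates under concurrent load.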

    In the end, when it comes to analytical queries, even on SSDs you might be better off with large linear scans. The good news is that SSDs improved both sequential and random I/O. In the case of the OCZ Z-Drives, that allows tremendous I/O throughput in a very small package (an over-sized PCI card). For $20,000 you can get a system that rivals the I/O capabilities of much more expensive systems.

