July 12, 2012

Disk, flash, and RAM

Three months ago, I pointed out that it is hard to generalize about memory-centric database management, because there are so many different kinds. That said, there are some basic points that I’d like to record as background for any future discussion of the subject, focusing on differences between disk and RAM. And while I’m at it, I’ll throw in a few comments about flash memory as well.

This post would probably be better if I had actual numbers for the speeds of various kinds of silicon operations, but I’ll do what I can without them.

For most purposes, database speed is a function of a few kinds of number:

The amount of storage used is also important, both directly — storage hardware costs money — and because if you save storage via compression, you may get corresponding benefits in I/O. Power consumption and similar costs are usually tied to hardware efficiency; the less gear you use, the less floor space and cooling you may be able to get away with.

When databases move to RAM from spinning disk, major consequences include:

because:

Consequently:

But notwithstanding everything else, you still need a persistent-storage story. Typically, that’s just your update/transaction log. Hence in-memory write performance is actually gated by the speed at which you can stream your update log to persistent storage — unless, of course, you’re running some kind of event processing/data reduction system and truly are willing to discard most of the data that passes through.
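To make the point about the update log concrete, here is a minimal sketch in Python. It is an illustration, not how any particular product works. The class name `LoggedStore`, the JSON-lines log format, and the one-`fsync`-per-write policy are all assumptions made for the example.

```python
import json
import os


class LoggedStore:
    """Toy in-memory key-value store backed by an append-only update log.

    Hypothetical example: the name, log format, and fsync policy are
    assumptions, not taken from any real DBMS.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the in-memory state from the log after a restart.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Persist the update first, then apply it in RAM. The flush and
        # fsync are the slow steps, so they bound write throughput.
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self.data[key] = value

    def get(self, key):
        # Reads never touch persistent storage.
        return self.data.get(key)

    def close(self):
        self._log.close()
```

Reads are served entirely from the dictionary, while every `put` waits on persistent storage. Batching several updates per `fsync` (often called group commit) is one common way to loosen that limit, at the cost of slightly higher latency per write.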

When you have to go to spinning disk, your data access methods are commonly indexes and scans, because those are the approaches that minimize the number of disk reads. But when data lives in RAM, pointer-chasing is a reasonable choice. Also, directly calculated addresses seem to be used more in memory than they are on disk. For example:

Flash, of course, is another kind of silicon memory — persistent, and slower than RAM. Beyond that:

In theory, all the comments about random vs. sequential, pointers vs. indexes, and so on carry over pretty well from RAM to flash. In practice, however, data access methods used on flash seem to be pretty similar to those on spinning disk. I’m not totally sure why.
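For readers who want the three access styles mentioned above side by side, here is a rough Python sketch. It is only an illustration under simple assumptions (integer keys, and a dense key range for the calculated-address case). The names `Node`, `pointer_chase`, `index_lookup`, and `direct_address` are hypothetical and not drawn from any particular system.

```python
import bisect
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One record in a linked structure (hypothetical layout)."""
    key: int
    value: str
    next: Optional["Node"] = None


def pointer_chase(head, key):
    # Follow references from record to record. Each hop is a random
    # access: cheap in RAM, expensive if every hop were a disk seek.
    node = head
    while node is not None:
        if node.key == key:
            return node.value
        node = node.next
    return None


def index_lookup(sorted_keys, values, key):
    # Binary search over a sorted key array, standing in for an index
    # that keeps the number of reads small.
    i = bisect.bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return values[i]
    return None


def direct_address(slots, base_key, key):
    # Compute the slot position arithmetically from the key. This
    # assumes keys are dense integers starting at base_key.
    offset = key - base_key
    if 0 <= offset < len(slots):
        return slots[offset]
    return None
```

All three return the same answer for the same data. What differs is the access pattern: a chain of dependent random reads, a logarithmic number of probes, or a single computed read.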

Comments

6 Responses to “Disk, flash, and RAM”

  1. Igor on July 13th, 2012 7:06 am

    “Sequential writes to flash are slow, perhaps even slower than sequential writes to spinning disk.” – that hasn’t been the case for the last several years.
    The best hard drives can write at ~200 MB/sec, while most modern SATA/SAS SSDs write at over 400 MB/sec (and the best ones at over 500 MB/sec). PCIe SSDs are much faster still.

  2. Daniel Lemire on July 13th, 2012 9:34 am

    With RAM, compression may not save you actual dollars in real life because people don’t tend to dynamically grow the amount of RAM on their machines. A realistic setting is that you try to keep the “hot” data in RAM, and you have a backend of archives that are expensive to recover. If you can keep more data in RAM through compression, you get better overall performance.

  3. Ethan Jewett on July 13th, 2012 10:00 am

    Good blog, as is the norm around here. It’s important to keep this kind of stuff up to date in one’s head when considering differently optimized database solutions.

    This isn’t a disagreement with you at all – more just a bit more detail – but I’d like to point out that random access to RAM still seems to be about an order of magnitude slower than sequential access. So while the sequential/random difference on RAM is much smaller as a % of throughput than it is for disk, it is actually a *larger* difference in terms of net throughput! So, is it a smaller or larger difference? It depends on how you count and what you want to do. But data structures optimized for sequential access certainly remain an important consideration even in RAM-based systems.

    My touchstone on this topic is the admittedly somewhat old article “The Pathologies of Big Data” on ACM Queue – http://queue.acm.org/detail.cfm?id=1563874

    Cheers,
    Ethan

  4. Daniel Abadi on July 13th, 2012 10:46 am

    Yes, Igor is right. One place where flash data structures often differ from disk-based and memory-based data structures is log-structured storage (e.g. log-structured merge trees instead of B-trees), because of the very large difference between random writes and sequential writes on flash.

    Also, as far as main memory DBs needing to stream the update log to storage, I’d like to point out that one big advantage of deterministic database systems (like the Calvin system we’re building at Yale) is that you only have to log the transaction input rather than every action of the transaction (as the ARIES protocol requires). This is because the input deterministically generates the final state, so logging the input is all you need. This can result in a 10X decrease in log output.

  5. Curt Monash on July 13th, 2012 1:34 pm

    Igor, Ethan — thanks for the catches!

    Dan — just one great example of a general point — when the log becomes the bottleneck, it becomes more important to optimize performance of the log. :)

  6. Clustrix 4.0 and other Clustrix stuff | DBMS 2 : DataBase Management System Services on July 18th, 2012 5:55 pm

    [...] Clustrix uses MLC flash. [...]
