Solid-state memory

Discussion of how developments in solid-state memory will affect database management. Related subjects include:

July 12, 2012

Disk, flash, and RAM

Three months ago, I pointed out that it is hard to generalize about memory-centric database management, because there are so many different kinds. That said, there are some basic points that I’d like to record as background for any future discussion of the subject, focusing on differences between disk and RAM. And while I’m at it, I’ll throw in a few comments about flash memory as well.

This post would probably be better if I had actual numbers for the speeds of various kinds of silicon operations, but I’ll do what I can without them.

For most purposes, database speed is a function of a few kinds of number:

The amount of storage used is also important, both directly — storage hardware costs money — and because if you save storage via compression, you may get corresponding benefits in I/O. Power consumption and similar costs are usually tied to hardware efficiency; the less gear you use, the less floor space and cooling you may be able to get away with.

When databases move to RAM from spinning disk, major consequences include: Read more

June 27, 2012

Schooner got acquired by SanDisk

SanDisk has acquired my client Schooner Information Technology. Notes on that include:

That’s about all I have at this time.

April 4, 2012

IBM DB2 10

Shortly before Tuesday’s launch of DB2 10, IBM’s Conor O’Mahony checked in for a relatively non-technical briefing.* More precisely, this is about DB2 for “distributed” systems, aka LUW (Linux/Unix/Windows); some of the features have already been in the mainframe version of DB2 for a while. IBM is graciously permitting me to post the associated DB2 10 announcement slide deck.

*I hope any errors in interpretation are minor.

Major aspects of DB2 10 include new or improved capabilities in the areas of:

Of course, there are various other enhancements too, including to security (fine-grained access control), Oracle compatibility, and DB2 pureScale. Everything except the pureScale part is also reflected in IBM InfoSphere Warehouse, which is a near-superset of DB2.*

*Also, the data ingest part isn’t in base DB2.

Read more

March 9, 2012

Hardware and components — lessons from Teradata

I love talking with Carson Schmidt, chief of Teradata’s hardware engineering (among other things), even if I don’t always understand the details of what he’s talking about. It had been way too long since our last chat, so I requested another one. We were joined by Keith Muller, who I presume is pictured here. Takeaways included:

Read more

November 1, 2011

MarkLogic 5, and why you might care

MarkLogic is releasing MarkLogic 5. Key elements of the announcement are:

Also, MarkLogic is early with a feature that most serious DBMS vendors will soon have – support for tiered storage, with writes going first to solid-state storage, then being flushed to disk via a caching-style algorithm.* And as befits a sometime search-engine-substitute, MarkLogic has finally licensed a large set of document filters, from an Australian company called Isys. Apparently, the special virtue of the Isys filters is that they’re good at extracting not only text, but metadata as well.

*If there’s a caching algorithm that doesn’t contain a major element of LRU (Least Recently Used), I don’t recall ever hearing about it.

MarkLogic seems to have settled on a positioning that, although distressingly buzzword-heavy, is at least partly based upon reality. The real part includes:

Based on that reality, MarkLogic talks a lot about Volume, Velocity, Variety, Big Data, unstructured data, semi-structured data, and big data analytics.

Read more

September 22, 2011

HP systems soundbites

It is widely rumored that there will be a leadership change at HP (Meg Whitman in, Leo Apotheker out). In connection with that, I found myself holding forth on points such as:

September 19, 2011

Are there any remaining reasons to put new OLTP applications on disk?

Once again, I’m working with an OLTP SaaS vendor client on the architecture for their next-generation system. Parameters include:

So I’m leaning to saying:   Read more

September 14, 2011

Kaminario goes (mainly) flash

Kaminario, which used to be in the business of solid state storage via DRAM, now is emphasizing hybrid DRAM/flash storage appliances instead. The reason is evidently price. Per terabyte of primary storage (before mirroring onto disk and so on):

Kaminario positions DRAM as where you focus your most write-intensive/ bottlenecking loads, such as logging or temp space, with the primary benefit being performance and a secondary benefit being slowing the wear on your flash.

Read more

August 13, 2011

Couchbase technical update

My Couchbase business update with Bob Wiederhold was very interesting, but it didn’t answer much about the actual Couchbase product. For that, I talked with Dustin Sallings. We jumped around a lot, and some important parts of the Couchbase product haven’t had their designs locked down yet anyway. But here’s at least a partial explanation of what’s up.

memcached is a way to cache data in RAM across a cluster of servers and have it all look logically like a single memory pool, extremely popular among large internet companies. The Membase product — which is what Couchbase has been selling this year — adds persistence to memcached, an obvious improvement on requiring application developers to write both to memcached and to non-transparently-sharded MySQL. The main technical points in adding persistence seem to have been:

Couchbase is essentially Membase improved by integrating CouchDB into it, with the main changes being:

Let’s drill down a bit into Membase/Couchbase clustering and consistency. Read more

July 27, 2011

MongoDB users and use cases

I spoke with Eliot Horowitz and Max Schierson of 10gen last month about MongoDB users and use cases. The biggest clusters they came up with weren’t much over 100 nodes, but clusters an order of magnitude bigger were under development. The 100 node one we talked the most about had 33 replica sets, each with about 100 gigabytes of data, so that’s in the 3-4 terabyte range total. In general, the largest MongoDB databases are 20-30 TB; I’d guess those really do use the bulk of available disk space.   Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.