In-memory DBMS

Analysis of memory-centric OLTP DBMS. Related subjects include:

December 2, 2012

Are column stores really better at compression?

A consensus has evolved that:

Still somewhat controversial is the claim that:

A strong plausibility argument for the latter point is that new in-memory analytic data stores tend to be columnar — think HANA or Platfora; compression is commonly cited as a big reason for the choice. (Another reason is that I/O bandwidth matters even when the I/O is from RAM, and there are further reasons yet.)

One group that made the in-memory columnar choice is the Spark/Shark guys at UC Berkeley’s AMP Lab. So when I talked with them Thursday (more on that another time, but it sounds like cool stuff), I took some time to ask why columnar stores are better at compression. In essence, they gave two reasons — simplicity, and speed of decompression.

In each case, the main supporting argument seemed to be that finding the values in a column is easier when they’re all together in a column store. Read more

November 29, 2012

Notes on Microsoft SQL Server

I’ve been known to gripe that covering big companies such as Microsoft is hard. Still, Doug Leland of Microsoft’s SQL Server team checked in for phone calls in August and again today, and I think I got enough to be worth writing about, albeit at a survey level only,

Subjects I’ll mention include:

One topic I can’t yet comment about is MOLAP/ROLAP, which is a pity; if anybody can refute my claim that ROLAP trumps MOLAP, it’s either Microsoft or Oracle.

Microsoft’s slides mentioned Yahoo refining a 6 petabyte Hadoop cluster into a 24 terabyte SQL Server “cube”, which was surprising in light of Yahoo’s history as an Oracle reference.

Read more

November 5, 2012

Do you need an analytic RDBMS?

I can think of seven major reasons not to use an analytic RDBMS. One is good; but the other six seem pretty questionable, niche circumstances excepted, especially at this time.

The good reason to not have an analytic RDBMS is that most organizations can run perfectly well on some combination of:

Those enterprises, however, are generally not who I write for or about.

The six bad reasons to not have an analytic RDBMS all take the form “Can’t some other technology do the job better?”, namely:

Read more

August 20, 2012

In-memory, (hybrid) memory-centric DBMS — three analytic glossary draft entries

These are three closely-related draft entries for the DBMS2 analytic glossary. Please comment with any ideas you have for their improvement!

1. We coined the term memory-centric data management to comprise several kinds of technology that manage data in RAM (Random Access Memory), including:

Related link

2. An in-memory DBMS is a DBMS designed under the assumption that substantially all database operations will be performed in RAM (Random Access Memory). Thus, in-memory DBMS form a subcategory of memory-centric data management systems.

Ways in which in-memory DBMS are commonly different from those that query and update persistent storage include: Read more

July 17, 2012

Why I recommend avoiding Kognitio

Since my recent post about Kognitio, things have gotten worse. The company is insistently pushing the marketing message that Kognitio has always been an in-memory product, and at one point went so far as to publicly pretend that I had agreed.

I do not agree. Yes, it’s fair to say — as I did in 2008 — that Kognitio is very RAM-centric, but that’s not at all the same thing. In particular:

The truth is that Kognitio offers a disk-based DBMS that has long been worked on by a small team. I believe that the team really has put considerable effort into how Kognitio uses RAM. But there’s no basis to give Kognitio credit for being “really” in-memory vs. a variety of other analytic RDBMS alternatives. And a row-based product that doesn’t currently offer compression is at a large disadvantage versus, say, columnar products that already do.*

*Columnar systems don’t clobber row-based ones in-memory as extremely as they do in some disk-based use cases. But even in-memory it’s good not to have to move around data that isn’t relevant to your query.

Until Kognitio gets at least somewhat more honest in its marketing, I recommend avoiding Kognitio like the plague. It’s simply not a big enough company to buy from unless you have some level of trust in the management team.

July 2, 2012

Introduction to Yarcdata

Cray’s strategy these days seems to be:

At the moment, the main diversifications are:

The last of the three is what Cray subsidiary Yarcdata is all about. Read more

June 18, 2012

Introduction to MemSQL

I talked with MemSQL shortly before today’s launch. MemSQL technology basics are:

MemSQL’s performance claims include:

MemSQL company basics include: Read more

April 7, 2012

Many kinds of memory-centric data management

I’m frequently asked to generalize in some way about in-memory or memory-centric data management. I can start:

Getting more specific than that is hard, however, because:

Consider, for example, some of the in-memory data management ideas kicking around. Read more

March 21, 2012

Comments on Oracle’s third quarter 2012 earnings call

Various reporters have asked me about Oracle’s third quarter 2012 earnings conference call. Specific Q&A includes:

What did Oracle do to have its earnings beat Wall Street’s estimates?

Have a bad second quarter and then set Wall Street’s expectations too low for Q3. This isn’t about strong results; it’s about modest expectations.

Can Oracle be a leader in both hardware and software?

Beyond that, please see below.

What about Oracle in the cloud?

MySQL is an important cloud supplier. But Oracle overall hasn’t demonstrated much understanding of what cloud technology and business are all about. An expensive SaaS acquisition here or there could indeed help somewhat, but it seems as if Oracle still has a very long way to go.

Other comments

Other comments on the call, whose transcript is available, include: Read more

February 26, 2012

SAP HANA today

SAP HANA has gotten much attention, mainly for its potential. I finally got briefed on HANA a few weeks ago. While we didn’t have time for all that much detail, it still might be interesting to talk about where SAP HANA stands today.

The HANA section of SAP’s website is a confusing and sometimes inaccurate mess. But an IBM whitepaper on SAP HANA gives some helpful background.

SAP HANA is positioned as an “appliance”. So far as I can tell, that really means it’s a software product for which there are a variety of emphatically-recommended hardware configurations — Intel-only, from what right now are eight usual-suspect hardware partners. Anyhow, the core of SAP HANA is an in-memory DBMS. Particulars include:

SAP says that the row-store part is based both on P*Time, an acquisition from Korea some time ago, and also on SAP’s own MaxDB. The IBM white paper mentions only the MaxDB aspect. (Edit: Actually, see the comment thread below.) Based on a variety of clues, I conjecture that this was an aspect of SAP HANA development that did not go entirely smoothly.

Other SAP HANA components include:  Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.