Theory and architecture

Analysis of design choices in databases and database management systems. Related subjects include:

May 13, 2010

SAP believes in database proliferation

For as long as we’ve had the concept of database management, there’s been a debate as to whether it is realistic for large enterprises to have a single Grand Unified Enterprise Storehouse Of All Information, or whether database proliferation actually makes sense. This argument has been particularly intense in the area of data warehouse/data marts. I’m generally on the side of data mart proliferation.

4 1/2 years ago, I noted that SAP believed strongly in database proliferation: Read more

May 12, 2010

Quick reactions to SAP acquiring Sybase

SAP is acquiring Sybase. On the conference call SAP said Sybase would be run as a separate division of SAP (no surprise). Most of the focus was on Sybase’s mobile technology, which is forecast at >$400 million in 2010 revenues (which would be 30%ish of the total). My quick reactions include: Read more

May 12, 2010

The Clustrix story

After my recent post, the Clustrix guys raised their hands and briefed me. Takeaways included:    Read more

May 4, 2010

Clustrix may be doing something interesting

Clustrix launched without briefing me or, at least so far as I can tell, anybody else who knows much about database technology. But Clustrix did post a somewhat crunchy, no-registration-required, white paper. Based on that, I get the impression:

May 2, 2010

Daniel Abadi on NoSQL design tradeoffs

In a thought-provoking post, Daniel Abadi points out NoSQL-related terminological problems similar to the ones I just railed against, and argues

To me, CAP should really be PACELC — if there is a partition (P) how does the system tradeoff between availability and consistency (A and C); else (E) when the system is running as normal in the absence of partitions, how does the system tradeoff between latency (L) and consistency (C)?

and goes on to say

For example, Amazon’s Dynamo (and related systems like Cassandra and SimpleDB) are PA/EL in PACELC — upon a partition, they give up consistency for availability; and under normal operation they give up consistency for lower latency. Giving up C in both parts of PACELC makes the design simpler — once the application is configured to be able to handle inconsistencies, it makes sense to give up consistency for both availability and lower latency.

However, I think Daniel’s improved formulation is still misleading, in at least two ways:

May 1, 2010

Read-your-writes (RYW), aka immediate, consistency

In which we reveal the fundamental inequality of NoSQL, and why NoSQL folks are so negative about joins.

Discussions of NoSQL design philosophies tend to quickly focus in on the matter of consistency. “Consistency”, however, turns out to be a rather overloaded concept, and confusion often ensues.

In this post I plan to address one essential subject, while ducking various related ones as hard as I can. It’s what Werner Vogel of Amazon called read-your-writes consistency (a term to which I was actually introduced by Justin Sheehy of Basho). It’s either identical or very similar to what is sometimes called immediate consistency, and presumably also to what Amazon has recently called the “read my last write” capability of SimpleDB.

This is something every database-savvy person should know about, but most so far still don’t. I didn’t myself until a few weeks ago.

Considering the many different kinds of consistency outlined in the Werner Vogel link above or in the Wikipedia consistency models article — whose names may not always be used in, er, a wholly consistent manner — I don’t think there’s much benefit to renaming read-your-writes consistency yet again. Rather, let’s just call it RYW consistency, come up with a way to pronounce “RYW”, and have done with it. (I suggest “ree-ooh”, which evokes two syllables from the original phrase. Thoughts?)

Definition: RYW (Read-Your-Writes) consistency is achieved when the system guarantees that, once a record has been updated, any attempt to read the record will return the updated value.

Read more

April 29, 2010

Vertica update

Last month, Vertica’s CEO Ralph Breslauer quit,* and Vertica made it sound like there would be a new CEO late in April. And indeed, as of April 29, there was. He’s a guy I’ve never heard of before named Chris Lynch, apparently quite the sales machine builder. The most substance I’ve found is a pair of Mass High Tech articles — the latter exceedingly typo-ridden — to the general effect that:

Read more

April 27, 2010

Gear6 seems to have failed in the memcached market too

As previously noted, I’ve briefly cut back on blogging (and research) due to some family health issues. The first casualty was a post about memcached. One of the two companies to be featured were my new clients at Northscale. The other was Gear6. What they had in common was:

Read more

April 18, 2010

Greenplum et alia’s BigDataNews.com site

Greenplum recently started a website BigDataNews.com, and quickly signed up Aster Data as a co-sponsor. (Edit: As per a comment below, the decision to sign up additional sponsors was made by the site’s independent publisher.) It’s actually being run by Brett Sheppard, a former Gartner/DataQuest analyst who now gets involved in this kind of thing. (Brett and I may be working on another project soon, with Greenplum funding.)

The heart of the site is feeds* from a variety of high-profile blogs (DBMS2, Daniel Abadi’s, Joe Hellerstein’s, James Kobelius’, et al.), plus some additional posts written by Brett (primarily) or Greenplum folks. Highlights of Brett’s posts include:

*At least in my case, that’s just a post title or snippet, plus a link back to the main post. The same goes for mapreduce.org, actually.

April 12, 2010

Greenplum Chorus and Greenplum 4.0

Greenplum is making two product announcements this morning. Greenplum 4.0 is a revision of the core Greenplum database technology. In addition, Greenplum is announcing Greenplum Chorus, which is the first product release instantiating last year’s EDC (Enterprise Data Cloud) vision statement and marketing campaign.

Greenplum 4.0 highlights and related observations include: Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.