Structured documents

Analysis of data management technology based on a structured-document model, or optimized for XML data. Related subjects include:

December 18, 2016

Introduction to and CrateDB and CrateDB basics include:

In essence, CrateDB is an open source and less mature alternative to MemSQL. The opportunity for MemSQL and CrateDB alike exists in part because analytic RDBMS vendors didn’t close it off.

CrateDB’s not-just-relational story starts:

Read more

November 23, 2016

MongoDB 3.4 and “multimodel” query

“Multimodel” database management is a hot new concept these days, notwithstanding that it’s been around since at least the 1990s. My clients at MongoDB of course had to join the train as well, but they’ve taken a clear and interesting stance:

When I pointed out that it would make sense to call this “multimodel query” — because the storage isn’t “multimodel” at all — they quickly agreed.

To be clear: While there are multiple ways to read data in MongoDB, there’s still only one way to write it. Letting that sink in helps clear up confusion as to what about MongoDB is or isn’t “multimodel”. To spell that out a bit further: Read more

December 10, 2015

Readings in Database Systems

Mike Stonebraker and Larry Ellison have numerous things in common. If nothing else:

I mention the latter because there’s a new edition of Readings in Database Systems, aka the Red Book, available online, courtesy of Mike, Joe Hellerstein and Peter Bailis. Besides the recommended-reading academic papers themselves, there are 12 survey articles by the editors, and an occasional response where, for example, editors disagree. Whether or not one chooses to tackle the papers themselves — and I in fact have not dived into them — the commentary is of great interest.

But I would not take every word as the gospel truth, especially when academics describe what they see as commercial market realities. In particular, as per my quip in the first paragraph, the data warehouse market has not yet gone to the extremes that Mike suggests,* if indeed it ever will. And while Joe is close to correct when he says that the company Essbase was acquired by Oracle, what actually happened is that Arbor Software, which made Essbase, merged with Hyperion Software, and the latter was eventually indeed bought by the giant of Redwood Shores.**

*When it comes to data warehouse market assessment, Mike seems to often be ahead of the trend.

**Let me interrupt my tweaking of very smart people to confess that my own commentary on the Oracle/Hyperion deal was not, in retrospect, especially prescient.

Mike pretty much opened the discussion with a blistering attack against hierarchical data models such as JSON or XML. To a first approximation, his views might be summarized as:  Read more

October 15, 2015

Couchbase 4.0 and related subjects

I last wrote about Couchbase in November, 2012, around the time of Couchbase 2.0. One of the many new features I mentioned then was secondary indexing. Ravi Mayuram just checked in to tell me about Couchbase 4.0. One of the important new features he mentioned was what I think he said was Couchbase’s “first version” of secondary indexing. Obviously, I’m confused.

Now that you’re duly warned, let me remind you of aspects of Couchbase timeline.

Technical notes on Couchbase 4.0 — and related riffs :) — start: Read more

September 10, 2015

MongoDB update

One pleasure in talking with my clients at MongoDB is that few things are NDA. So let’s start with some numbers:

Also >530 staff, and I think that number is a little out of date.

MongoDB lacks many capabilities RDBMS users take for granted. MongoDB 3.2, which I gather is slated for early November, narrows that gap, but only by a little. Features include:

There’s also a closed-source database introspection tool coming, currently codenamed MongoDB Scout.  Read more

May 20, 2015

MemSQL 4.0

I talked with my clients at MemSQL about the release of MemSQL 4.0. Let’s start with the reminders:

The main new aspects of MemSQL 4.0 are:

There’s also a new free MemSQL “Community Edition”. MemSQL hopes you’ll experiment with this but not use it in production. And MemSQL pricing is now wholly based on RAM usage, so the column store is quasi-free from a licensing standpoint is as well.

Read more

March 15, 2015

BI for NoSQL — some very early comments

Over the past couple years, there have been various quick comments and vague press releases about “BI for NoSQL”. I’ve had trouble, however, imagining what it could amount to that was particularly interesting, with my confusion boiling down to “Just what are you aggregating over what?” Recently I raised the subject with a few leading NoSQL companies. The result is that my confusion was expanded. :) Here’s the small amount that I have actually figured out.

As I noted in a recent post about data models, many databases — in particular SQL and NoSQL ones — can be viewed as collections of <name, value> pairs.

Consequently, a NoSQL database can often be viewed as a table or a collection of tables, except that:

That’s all straightforward to deal with if you’re willing to write scripts to extract the NoSQL data and transform or aggregate it as needed. But things get tricky when you try to insist on some kind of point-and-click. And by the way, that last comment pertains to BI and ETL (Extract/Transform/Load) alike. Indeed, multiple people I talked with on this subject conflated BI and ETL, and they were probably right to do so.

Read more

February 22, 2015

Data models

7-10 years ago, I repeatedly argued the viewpoints:

Since then, however:

So it’s probably best to revisit all that in a somewhat organized way.

Read more

February 12, 2015

MongoDB 3.0

Old joke:

A lot has happened in MongoDB technology over the past year. For starters:

*Newly-released MongoDB 3.0 is what was previously going to be MongoDB 2.8. My clients at MongoDB finally decided to give a “bigger” release a new first-digit version number.

To forestall confusion, let me quickly add: Read more

October 22, 2014

Snowflake Computing

I talked with the Snowflake Computing guys Friday. For starters:

Much of the Snowflake story can be summarized as cloud/elastic/simple/cheap.*

*Excuse me — inexpensive. Companies rarely like their products to be labeled as “cheap”.

In addition to its purely relational functionality, Snowflake accepts poly-structured data. Notes on that start:

I don’t know enough details to judge whether I’d call that an example of schema-on-need.

A key element of Snowflake’s poly-structured data story seems to be lateral views. I’m not too clear on that concept, but I gather: Read more

Next Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:


Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.