September 30, 2009

Facts and rumors

September 29, 2009

What Nielsen really uses in data warehousing DBMS

In its latest earnings call, Oracle made a reference to The Nielsen Company that was — to put it politely — rather confusing. I just plopped down in a chair next to Greg Goff, who evidently runs data warehousing at Nielsen, and had a quick chat. Here’s the real story.

September 29, 2009

Thoughts on the integration of OLTP and data warehousing, especially in Exadata 2

Oracle is pushing Exadata 2 as being a great system for any of OLTP (OnLine Transaction Processing), data warehousing or, presumably, the integration of same. This claim rests on a few premises, namely: Read more

September 25, 2009

The hunt for Oracle Exadata production references

Over the past four weeks, I’ve given speeches in Boston, DC, Milan, London, and SF,* attended a conference in Lyon, done a fair amount of consulting, and taken a few non-client briefings as well. That’s why I haven’t had much of a chance to sit down, analyze the tea leaves, and write about Exadata 2. (Small exception: Highlights from and remarks on the Oracle Database 11g Release 2 white paper.) I hope to do that soon.

*I’ll bop over to Chicago for the last of the series early next week.

But first — can anybody identify much in the way of Exadata production references? Oracle recently talked of a few flagship data warehouse customers, but those don’t seem to be running Exadata. I talked recently with an Oracle prospect from the US, who only got one reference from Oracle — in Eastern Europe. (Well, two references, if you also count the system integrator on the same deal.)

So far as I can tell, Oracle Exadata production sites are pretty scarce on the ground. What, if anything, am I missing?

September 21, 2009

Notes on the Oracle Database 11g Release 2 white paper

The Oracle Database 11g Release 2 white paper I cited a couple of weeks ago has evidently been edited, given that a phrase I quoted last month is no longer to be found. Anyhow, here are some quotes from and comments on what appears to be the latest version. Read more

September 19, 2009

Some issues in comparing analytic DBMS performance

The analytic DBMS/data warehouse appliance market is full of competitive performance claims. Sometimes, they’re completely fabricated, with no basis in fact whatsoever. But often performance-advantage claims are based on one or more head-to-head performance comparisons. That is, System A and System B are used to run the same set of queries, and some function is applied that takes the two sets of query running times as an input, and spits out a relative performance number as an output. Read more
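To make the "some function" point concrete, here is a minimal sketch in Python. The running times are made up purely for illustration, and the three aggregation functions are just common choices, not anything a particular vendor is known to use. The point is only that the same two sets of query times can yield quite different relative performance numbers depending on the function applied.

```python
from math import prod

def ratio_of_totals(times_a, times_b):
    """Total running time of B divided by total running time of A."""
    return sum(times_b) / sum(times_a)

def arithmetic_mean_of_ratios(times_a, times_b):
    """Average of the per-query ratios B/A."""
    ratios = [b / a for a, b in zip(times_a, times_b)]
    return sum(ratios) / len(ratios)

def geometric_mean_of_ratios(times_a, times_b):
    """Geometric mean of the per-query ratios B/A."""
    ratios = [b / a for a, b in zip(times_a, times_b)]
    return prod(ratios) ** (1 / len(ratios))

# Hypothetical running times in seconds for the same three queries.
system_a = [1.0, 2.0, 100.0]
system_b = [10.0, 20.0, 100.0]

for fn in (ratio_of_totals, arithmetic_mean_of_ratios, geometric_mean_of_ratios):
    print(f"{fn.__name__}: System A is {fn(system_a, system_b):.2f}x faster")
```

With these invented numbers, System A comes out roughly 1.3x, 7x, or 4.6x faster, depending on which function is chosen. One long query dominates the ratio of totals, while the two short queries dominate the per-query averages.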

September 19, 2009

Oracle gives a few customer database size examples

In its recent quarterly conference call, Oracle said (as per the Seeking Alpha transcript):

AC Neilsen, for instance, we deployed a 45-terabyte data [mart], they called it; Adidas, 13 terabytes; Australian Bureau of Statistics, 250 terabytes; and of course, some of our high-end ones that you have probably heard of in the past, AT&T, 250 terabytes; Yahoo!, 700 terabytes — just gives you an idea of the size of the databases that are out there and how they are growing, and that’s driving the need for greater throughput.

I don’t know what’s being counted there, but I wouldn’t be surprised if those were legit user-data figures.

Some other notes:

September 13, 2009

HadoopDB

Despite a thoughtful heads-up from Daniel Abadi at the time of his original posting about HadoopDB, I’m just getting around to writing about it now. HadoopDB is a research project carried out by a couple of Abadi’s students. Further research is definitely planned. But it seems too early to say that HadoopDB will ever get past the “research and oh by the way the code is open sourced” stage and become a real code line — whether commercialized, open source, or both.

The basic idea of HadoopDB is to put copies of a DBMS at different nodes of a grid, and use Hadoop to parcel work among them. Major benefits when compared with massively parallel DBMS are said to be:

HadoopDB has actually been built with PostgreSQL. That version achieved performance well below that of a commercial DBMS “DBX”, where X=2. Column-store guru Abadi has repeatedly signaled his intention to try out HadoopDB with VectorWise at the nodes instead. (Recall that VectorWise is shared-everything.) It will be interesting to see how that configuration performs.

In my opinion, however, the real opportunity for HadoopDB may lie elsewhere. Read more
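The basic idea described above (a separate DBMS copy at each node, with a coordinating layer parceling out work and combining results) can be sketched in a few lines of Python. This is an illustration of the general pattern only, not HadoopDB's actual code: in-memory SQLite stands in for PostgreSQL at the nodes, a plain Python loop stands in for Hadoop, and the table, column, and function names are all hypothetical.

```python
import sqlite3

def make_node(rows):
    """Create one 'node': an independent in-memory SQLite database
    holding its own partition of a hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn

def map_phase(node):
    """Push the SQL down to the node's local DBMS; return partial aggregates."""
    return node.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()

def reduce_phase(partials):
    """Combine the per-node partial sums into global totals."""
    totals = {}
    for partial in partials:
        for region, subtotal in partial:
            totals[region] = totals.get(region, 0.0) + subtotal
    return totals

# Two nodes, each with a different slice of the data.
nodes = [
    make_node([("east", 10.0), ("west", 5.0)]),
    make_node([("east", 2.5), ("north", 4.0)]),
]

result = reduce_phase(map_phase(node) for node in nodes)
print(result)
```

In this sketch each node does its own local aggregation in SQL, so only small partial results travel to the combining step. Swapping a different DBMS in at the nodes, as Abadi proposes to do with VectorWise, would in principle change only what sits behind `make_node` and `map_phase`.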

September 13, 2009

Fault-tolerant queries

MapReduce/Hadoop fans sometimes raise the question of query fault-tolerance. That is — if a node fails, does the query need to be restarted, or can it keep going? For example, Daniel Abadi et al. trumpet query fault-tolerance as one of the virtues of HadoopDB. Some of the scientists at XLDB spoke of query fault-tolerance as being a good reason to leave 100s or 1000s of terabytes of data in Hadoop-managed file systems.

When we discussed this subject a few months ago in a couple of comment threads, it seemed to be the case that:

This raises an obvious (pair of) question(s) — why and/or when would anybody ever care about query fault-tolerance? Read more

September 12, 2009

Introduction to the XLDB and SciDB projects

Before I write anything else about the overlapping efforts known as XLDB and SciDB, I probably should explain and disambiguate what they are as best I can. XLDB was organized and still is run by guys who want to solve a scientific problem in eXtremely Large DataBase Management, most especially Jacek Becla of SLAC (the organization previously known as Stanford Linear Accelerator Center). Becla’s original motivation was that he needs a DBMS to manage what will be 55 petabytes of raw image data and 100 petabytes of astronomical data total for LSST (Large Synoptic Survey Telescope). Read more
