November 11, 2006

Federation in the MySQL empire

Marten Mickos, CEO of MySQL, recently gave a speech speculating about a big federated “database in the sky,” providing all sorts of Web 2.0 benefits. Apparently, the idea isn’t at all fleshed out yet. Even so, I have a nagging suspicion he’s pointing in somewhat the wrong direction.

That’s because I think federating relational databases is a generically bad idea. You can federate sets of services, and you can generate services from relational databases – and that’s where DBMS2 (DataBase Management System Services) got its name. This is a superior approach to direct database federation, for two main reasons. (By “direct federation,” I mean some sort of structure in which there’s a giant virtual database whose schema more or less directly incorporates much of the schema of each individual database.)

Read more

October 5, 2006

Introduction to Kognitio WX-2

Kognitio called me for a briefing this morning on their WX-2 product. Technical highlights included:

Much like the other “new” MPP data warehouse vendors, Kognitio claims to never have knowingly been outbenchmarked, whether on performance or on TCO factors such as ease of installation.
Read more

October 4, 2006

SAS Intelligence Storage

SAS has its own data store, called SAS Intelligence Storage. It’s a relational system running on SMP boxes, whose unique feature is that it has fixed-length records and hence is a perfect array, for speedy lookup. This is highly analogous to classical MOLAP systems. However, SAS reports that customers store up to several hundred terabytes of data in SAS Intelligence Storage, which is definitely not very analogous to what goes on in the MOLAP world.

It sounds as if the product is optimized for data mining and generic OLAP alike. Indeed, SAS Intelligence Storage is used to power both SAS’s data mining and other advanced analytics, and also its more conventional BI suite.
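To illustrate why fixed-length records make for speedy lookup, here is a minimal sketch in which the byte offset of record n is simply n times the record size. The record layout, field names, and helper functions are hypothetical, chosen for illustration only; nothing here reflects SAS's actual storage format.

```python
import struct

# Hypothetical fixed-length layout (an assumption, not SAS's real format):
# 4-byte int id, 8-byte float amount, 12-byte null-padded name = 24 bytes.
RECORD = struct.Struct("<id12s")


def pack_table(rows):
    """Serialize rows into one contiguous buffer of fixed-length records."""
    return b"".join(
        RECORD.pack(rec_id, amount, name.encode("ascii"))
        for rec_id, amount, name in rows
    )


def lookup(buf, n):
    """Fetch record n by computing its byte offset directly, with no scan."""
    rec_id, amount, name = RECORD.unpack_from(buf, n * RECORD.size)
    return rec_id, amount, name.rstrip(b"\x00").decode("ascii")


table = pack_table([(1, 9.5, "alice"), (2, 3.25, "bob"), (3, 7.0, "carol")])
second = lookup(table, 1)  # reads bytes 24..47 only
```

Because every record has the same length, the store behaves like an array: finding a row is arithmetic rather than a search.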

October 4, 2006

Data mining is driving much of data warehousing

Until I did all this recent research on data warehousing, I didn’t realize just how big a role data mining plays in driving the whole thing. Basically, there are three things you can do with a data warehouse – classical BI, “operational” BI, and data mining. If we’re talking about long-running queries, that’s not operational BI, and it’s not all of classical BI either. The rest is data mining. Indeed, if you think back to what you know of the customer bases at data warehouse appliance vendors Netezza and DATallegro, there are a lot of credit-reporting-data types of users – i.e., data miners. And it’s hard to talk about uses for those appliances very long without SAS extracts and the like coming up.
Read more

October 4, 2006

Philip Howard on Netezza

Philip Howard has published a write-up based on Netezza’s user conference, entertainingly mixing fantasy and reality in his usual manner. Notably, he confuses Netezza’s zone maps, which are basically a very limited form of range partitioning, with something that can substitute for real indices. And the mind boggles at his implication that Netezza has neglected the FPGA in its overall market messaging. More understandable is his regurgitation of Netezza’s claims about heat and power. I must confess to not having checked either side’s arithmetic, but I find Stuart Frost’s rebuttal in the comments to this thread pretty interesting.

But little nits like that aside — yeah, he went to the same conference I did. 😉
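For readers unfamiliar with the zone map idea mentioned above, here is a rough sketch of the general technique: keep a minimum and maximum per extent, and skip any extent whose range cannot overlap the query. The function names and extent size are hypothetical, and this is a generic illustration, not Netezza's implementation.

```python
def build_zone_map(values, extent_size):
    """Record (start, min, max) for each fixed-size extent of a column."""
    zones = []
    for start in range(0, len(values), extent_size):
        chunk = values[start:start + extent_size]
        zones.append((start, min(chunk), max(chunk)))
    return zones


def scan_range(values, zones, extent_size, lo, hi):
    """Return values in [lo, hi] plus the number of extents actually read."""
    hits, extents_read = [], 0
    for start, zone_min, zone_max in zones:
        if zone_max < lo or zone_min > hi:
            continue  # the extent cannot contain a match, so skip it
        extents_read += 1
        chunk = values[start:start + extent_size]
        hits.extend(v for v in chunk if lo <= v <= hi)
    return hits, extents_read


# Data that arrives in order: most extents can be skipped.
ordered = list(range(100))
hits, read = scan_range(ordered, build_zone_map(ordered, 10), 10, 25, 34)

# Data with no ordering: every extent spans the full range, so none is skipped.
mixed = [0, 99] * 50
_, read_mixed = scan_range(mixed, build_zone_map(mixed, 10), 10, 25, 34)
```

The second case shows why this is not a substitute for a real index: the benefit depends entirely on how the data happens to be laid out.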

October 3, 2006

Vendor segmentation for data warehouse DBMS

February, 2011 edit: I’ve now commented on Gartner’s 2010 Data Warehouse Database Management System Magic Quadrant as well.

Several vendors are offering links to Gartner’s new Magic Quadrant report on data warehouse DBMS. (Edit: This is now a much better link to the 2006 MQ.) Somewhat atypically for Gartner, there’s a strict hierarchy among most of the vendors, with Teradata > IBM > Oracle > Microsoft > Sybase > Kognitio > MySQL > Sand, in each case on both axes of the matrix. The only two exceptions are Netezza and DATallegro, which are depicted as outvisioning Microsoft somewhat even as they trail both Microsoft and Sybase in execution.

Gartner Magic Quadrants tend to annoy me, and I’m not going to critique the rankings in detail. But I do think this particular MQ is helpful in framing a vendor segmentation, namely:

  1. Big full-spectrum MPP/shared-nothing vendors: Teradata and IBM
  2. MPP/shared-nothing appliance upstarts: Netezza and DATallegro
  3. Big SMP/shared-everything vendors who also are apt to be your OLTP incumbent, and who want to integrate your software stack soup-to-nuts: Oracle and Microsoft
  4. Niche vendors: Pretty much everybody else

Read more

October 3, 2006

IBM and Teradata too

If I had to name one company with the broadest possible overview of the data warehouse engine market, it would have to be IBM. IBM offers software and hardware, services-heavy deals and quasi-appliances, OLTP and ROLAP, shared-everything and shared-nothing, integrated-(almost)-everything and best-of-breed. So their ROLAP recommendations, while still rather self-serving (just as any other vendor’s would be), are at least somewhat more than just a case of “Where you stand depends upon where you sit.”

At its core, the current IBM ROLAP story is:

Here’s some more detail, about IBM and other vendors alike.

Read more

October 2, 2006

Comment spam continues to be ridiculous

As previously noted, this blog is under serious attack from the comment spammers, and there’s a slight chance a legitimate comment will get lost as supposed spam. That said, I know of only one such confirmed incident in all the time I’ve had WordPress-based blogs.

Also, the spam blockers are imperfect, and some vile spam comments do get through until I delete them, commonly the same day. Sorry about that. It’s nothing that you haven’t also seen many times over in your email, I’m sure.

I just checked a few minutes ago, and Akismet intercepted 372 comments since the last time I cleared the buffer, less than a day ago. The top 150 (all I could check) were certainly real spam …

EDIT:  51 more spam cleared out 4 1/2 hours later.

September 28, 2006

Relational data warehouse Expansion (or Explosion) Ratios

One of the least understood aspects of data warehouse technology is what may be called the

Expansion Ratio = (Total disk space used, except for mirroring) / (Size of the base database).

This is similar to the explosion ratio discussed in the OLAP Report’s justly famous discussion of database explosion, but I’m going with my own terminology because I don’t want to be tied to their precise terminology, nor to their technical focus. Expansion Ratios are hotly debated, with some figures being:

I don’t have actual figures from Netezza and DATallegro, but I imagine they’d come out lower than 2X, possibly well below.
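As a worked illustration of the Expansion Ratio defined above, here is a small sketch. The breakdown of disk usage into indexes, aggregates, and temp space is my assumption about what typically sits on top of the base data, and the sample sizes are made up for the arithmetic, not measurements from any vendor.

```python
def expansion_ratio(base_gb, indexes_gb=0.0, aggregates_gb=0.0, temp_gb=0.0):
    """Total disk used (excluding mirroring) divided by base database size.

    The component breakdown is an illustrative assumption; any non-mirror
    disk usage beyond the base data belongs in the numerator.
    """
    if base_gb <= 0:
        raise ValueError("base database size must be positive")
    total_gb = base_gb + indexes_gb + aggregates_gb + temp_gb
    return total_gb / base_gb


# Hypothetical warehouse: 1000 GB of base data plus 1000 GB of overhead.
ratio = expansion_ratio(1000, indexes_gb=600, aggregates_gb=300, temp_gb=100)
```

In this made-up case the ratio is 2X. A system that needs few or no indexes and aggregates would come out closer to 1X.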

Read more

September 27, 2006

Logless, lockless Netezza more carefully explained

I talked at length with Bill Blake and Doug Johnson of Netezza today. (Bill is exactly the guy I previously complained about having lost access to.) One takeaway was a clarification of their approach to transactions, which sounds even cooler than I first thought. It’s actually not a new idea; they just timestamp rows with CreateIDs and DeleteIDs, then exploit those to the hilt. Actually, it seems like this approach would be interesting in OLTP as well, although I’m not aware of it being used in any of the more successful OLTP DBMSs. (Yes, this is an open invitation to fans of less-established DBMS products to tell me of their virtues, preferably in a flame-free manner.)
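To make the CreateID/DeleteID idea concrete, here is a minimal sketch of row versioning by transaction ID. It is loosely modeled on the description above and is not Netezza's actual implementation; the class names, methods, and the simplifying assumption that every transaction commits in ID order are all mine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    data: dict
    create_id: int                   # transaction that created this version
    delete_id: Optional[int] = None  # transaction that deleted it, if any


class Table:
    """Append-only table: rows are never overwritten, only stamped."""

    def __init__(self):
        self.rows = []
        self.next_txn = 1

    def begin(self):
        txn = self.next_txn
        self.next_txn += 1
        return txn

    def insert(self, txn, data):
        self.rows.append(Row(dict(data), create_id=txn))

    def delete(self, txn, predicate):
        for row in self.rows:
            if row.delete_id is None and predicate(row.data):
                row.delete_id = txn

    def update(self, txn, predicate, changes):
        # An update is a delete of the old version plus an insert of a new one.
        for row in list(self.rows):
            if row.delete_id is None and predicate(row.data):
                row.delete_id = txn
                self.rows.append(Row({**row.data, **changes}, create_id=txn))

    def snapshot(self, as_of):
        # A version is visible if created at or before as_of and not yet deleted.
        return [
            r.data for r in self.rows
            if r.create_id <= as_of and (r.delete_id is None or r.delete_id > as_of)
        ]


t = Table()
t1 = t.begin(); t.insert(t1, {"k": 1, "v": "a"})
t2 = t.begin(); t.update(t2, lambda d: d["k"] == 1, {"v": "b"})
t3 = t.begin(); t.delete(t3, lambda d: d["k"] == 1)
```

Since old versions stay in place, a reader can get a consistent view as of any transaction ID just by comparing IDs, which is how a scheme like this can avoid locks. A real system would also have to handle aborted and in-flight transactions, which this sketch ignores.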
Read more
