Analytic technologies

Discussion of technologies related to information query and analysis.

October 4, 2006

Data mining is driving much of data warehousing

Until I did all this recent research on data warehousing, I didn’t realize just how big a role data mining plays in driving the whole thing. Basically, there are three things you can do with a data warehouse – classical BI, “operational” BI, and data mining. If we’re talking about long-running queries, that’s not operational BI, and it’s not all of classical BI either. The rest is data mining. Indeed, if you think back to what you know of the customer bases at data warehouse appliance vendors Netezza and DATallegro, there are a lot of credit-reporting-data types of users – i.e., data miners. And it’s hard to talk about uses for those appliances very long without SAS extracts and the like coming up.
Read more

October 4, 2006

Philip Howard on Netezza

Philip Howard has published a write-up based on Netezza’s user conference, entertainingly mixing fantasy and reality in his usual manner. Notably, he confuses Netezza’s zone maps, which are basically a very limited form of range partitioning, with something that can substitute for real indices. And the mind boggles at his implication that Netezza has neglected the FPGA in its overall market messaging. More understandable is his regurgitation of Netezza’s claims about heat and power, but although I must confess to not having checked either side’s arithmetic, I find Stuart Frost’s rebuttal in the comments to this thread pretty interesting.

But little nits like that aside — yeah, he went to the same conference I did. 😉
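To make the zone-map point concrete, here is a minimal sketch of the general idea: keep a minimum and maximum value per storage block, and skip any block whose range cannot match the query. This is a generic illustration, not Netezza's actual implementation; the names `Block`, `build_blocks`, and `scan_range` and the block size are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A storage block plus its zone-map entry (min and max of one column)."""
    rows: List[int]
    lo: int
    hi: int


def build_blocks(values: List[int], block_size: int) -> List[Block]:
    """Split values into fixed-size blocks in their physical order,
    recording min/max for each block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_range(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return matching values and the number of blocks actually scanned."""
    hits: List[int] = []
    scanned = 0
    for block in blocks:
        # Skip the block if its [lo, hi] range cannot overlap [low, high].
        if block.hi < low or block.lo > high:
            continue
        scanned += 1
        hits.extend(v for v in block.rows if low <= v <= high)
    return hits, scanned


# Data stored in sorted order: most blocks can be skipped.
ordered = build_blocks(list(range(100)), block_size=10)

# The same values interleaved on disk: every block spans a wide range.
interleaved = build_blocks([(i % 10) * 10 + i // 10 for i in range(100)], block_size=10)
```

In this toy setup, a query for values 25 through 34 touches only two of the ten ordered blocks but all ten of the interleaved ones. The benefit depends entirely on how the data happens to be laid out, which is why this kind of structure is no substitute for a real index.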

October 3, 2006

Vendor segmentation for data warehouse DBMS

February, 2011 edit: I’ve now commented on Gartner’s 2010 Data Warehouse Database Management System Magic Quadrant as well.

Several vendors are offering links to Gartner’s new Magic Quadrant report on data warehouse DBMS. (Edit: This is now a much better link to the 2006 MQ.) Somewhat atypically for Gartner, there’s a strict hierarchy among most of the vendors, with Teradata > IBM > Oracle > Microsoft > Sybase > Kognitio > MySQL > Sand, in each case on both axes of the matrix. The only two exceptions are Netezza and DATallegro, which are depicted as outvisioning Microsoft somewhat even as they trail both Microsoft and Sybase in execution.

Gartner Magic Quadrants tend to annoy me, and I’m not going to critique the rankings in detail. But I do think this particular MQ is helpful in framing a vendor segmentation, namely:

  1. Big full-spectrum MPP/shared-nothing vendors: Teradata and IBM.
  2. MPP/shared-nothing appliance upstarts: Netezza and DATallegro.
  3. Big SMP/shared-everything vendors who also are apt to be your OLTP incumbent, and who want to integrate your software stack soup-to-nuts: Oracle and Microsoft.
  4. Niche vendors: Pretty much everybody else.

Read more

October 3, 2006

IBM and Teradata too

If I had to name one company with the broadest possible overview of the data warehouse engine market, it would have to be IBM. IBM offers software and hardware, services-heavy deals and quasi-appliances, OLTP and ROLAP, shared-everything and shared-nothing, integrated-(almost)-everything and best-of-breed. So their ROLAP recommendations, while still rather self-serving (just as any other vendor’s would be), are at least somewhat more than just a case of “Where you stand depends upon where you sit.”

At its core, the current IBM ROLAP story is:

Here’s some more detail, about IBM and other vendors alike.

Read more

September 28, 2006

Relational data warehouse Expansion (or Explosion) Ratios

One of the least understood aspects of data warehouse technology is what may be called the

Expansion Ratio = (Total disk space used, except for mirroring) / (Size of the base database).

This is similar to the explosion ratio discussed in the OLAP Report’s justly famous discussion of database explosion, but I’m going with my own terminology because I don’t want to be tied to their precise terminology, nor to their technical focus. Expansion Ratios are hotly debated, with some figures being:

I don’t have actual figures from Netezza and DATallegro, but I imagine they’d come out lower than 2X, possibly well below.
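As a rough illustration of the arithmetic behind the Expansion Ratio defined above, here is a small sketch. The breakdown into indexes, aggregates, and work space is an assumption about where the extra disk might go, and the numbers are purely hypothetical; they are not figures from any vendor.

```python
from dataclasses import dataclass


@dataclass
class WarehouseFootprint:
    """Disk usage in GB, counted before any mirroring is applied.

    The component breakdown is an illustrative assumption, not a
    standard accounting scheme.
    """
    base_data_gb: float
    indexes_gb: float = 0.0
    aggregates_gb: float = 0.0
    work_space_gb: float = 0.0

    def expansion_ratio(self) -> float:
        """(Total disk space used, except for mirroring) / (size of the base database)."""
        if self.base_data_gb <= 0:
            raise ValueError("base database size must be positive")
        total = (self.base_data_gb + self.indexes_gb
                 + self.aggregates_gb + self.work_space_gb)
        return total / self.base_data_gb


# Hypothetical example: 1,000 GB of base data plus 1,000 GB of everything else.
example = WarehouseFootprint(base_data_gb=1000, indexes_gb=500,
                             aggregates_gb=300, work_space_gb=200)
ratio = example.expansion_ratio()
```

With these made-up inputs the ratio works out to 2.0, meaning the warehouse takes twice the disk of its base data. A system that carries few indexes or aggregates would, under the same accounting, land closer to 1.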

Read more

September 24, 2006

Data warehouse and mart uses – a tentative taxonomy

I’ve been posting a lot recently about the diverse database technologies used to support data warehousing. With the marketplace supporting such a broad range of architectures, it seems clear that a lot of those architectures actually deserve to thrive, presumably each in a different kind of usage scenario. So in this post I’ll take a pass at dividing up use cases for data warehouses, and suggesting which kinds of data warehouse management technologies might do the best job of supporting them. To start with, I’ve divided things into a number of buckets:

Read more

September 22, 2006

Competitive issues in data warehouse ease of administration

The last person I spoke with at the Netezza conference on Tuesday was a customer/presenter that the company had picked out for me. One thing he said baffled me — he claimed that Netezza was a real appliance vendor, but DATallegro wasn’t, presumably due to administrability issues. Now, it wasn’t clear to me that he’d ever evaluated DATallegro, so I didn’t take this too seriously, but still the exchange brought into focus the great differences between data warehouse products in the area of administration. For example:

September 20, 2006

SAP’s BI Accelerator

I wrote about SAP’s BI Accelerator quite a bit in my white paper on memory-centric data management, but otherwise I seem not to have posted much about it here. In essence, it’s a product that’s all RAM-based, and generally geared for multi-hundred-gigabyte data marts. The basic design is a compression-heavy column-based architecture, evolved from SAP’s text-indexing technology TREX. Like data warehouse appliances, it eschews indexing, relying instead on blazingly fast table scans.

I asked Lothar Schubert of SAP how BIA was doing in the market in its early going. This was his response:

Read more

September 20, 2006

Netezza vs. conventional data warehousing RDBMS

For various reasons, I’m not going to try to give a comprehensive overview of the Netezza story. But I’d like to highlight four points that illustrate a lot of the difference between Netezza’s architecture and that of more conventional data warehousing DBMS.
Read more

August 17, 2006

Business Objects on EIM, ETL, etc.

I chatted with some Business Objects ETL/EIM (Enterprise Information Management) folks today, in a call that was a direct response to what I heard from and posted about Informatica. The core of the Business Objects story can be summarized (albeit brutally!) like this:

Read more
