Analytic technologies

Discussion of technologies related to information query and analysis.

September 26, 2008

So what does Oracle Exadata mean for HP Neoview?

That HP is committed to selling a lot of data warehouse hardware — and probably data warehouse appliances in particular — seems obvious, for a variety of reasons.

But Oracle Exadata could produce those appliance sales. So where does HP Neoview fit in?

I was told by an investor today that HP’s investor relations department is saying Oracle Exadata is a Netezza competitor, while Neoview is more in the Teradata market. That’s laughable. Read more

September 25, 2008

Other notes on Oracle data warehousing

Obviously, the big news this week is Exadata, and its parallelization or lack thereof. But let’s not forget the rest of Oracle’s data warehousing technology.

  1. Frankly, I’ve come to think that disk-based OLAP cubes and materialized views are both cop-outs, indicative of a relational data warehouse architecture that can’t answer queries quickly enough straight-up. But if you disagree, then you might like Oracle’s new OLAP cube materialized views, which sound like a worthy competitor to Microsoft Analysis Services. (Further confusing things, I’ve seen reports that Oracle is increasing its commitment to Essbase, a separate MOLAP engine. I hope those are incorrect.)
  2. A few weeks ago, I came to realize that Oracle’s data mining database features actually mattered — perhaps not quite as much as Charlie Berger might think, but to say that is to praise with faint damns. 😉 SPSS seems to be getting large performance gains from leveraging the scoring part, and perhaps the transformation part as well. I haven’t focused on getting my details right yet, so I haven’t been writing about it. But heck, with all the other Oracle data warehousing discussion, it seems right to at least mention this part too.
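
On the in-database scoring point in #2: I haven't verified Oracle's actual interfaces, so the sketch below is generic, with sqlite3 standing in for the DBMS and made-up model coefficients. The gist is that the scoring formula gets compiled into SQL, so the database ships back one score per row instead of shipping every input column to the client — which is presumably where gains like SPSS's come from.

```python
# Generic illustration only: sqlite3 stands in for the DBMS, and the
# logistic-regression coefficients are made up. Oracle's real data
# mining interfaces differ in detail.
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, tenure REAL, spend REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, 2.0, 50.0), (2, 24.0, 900.0), (3, 6.0, 120.0)])

coef = {"tenure": -0.08, "spend": 0.002}   # hypothetical trained model
intercept = -0.5

# Client-side scoring: pull every row out of the database, then compute.
def score_client_side():
    rows = conn.execute("SELECT id, tenure, spend FROM customers")
    return {rid: 1 / (1 + math.exp(-(intercept
                                     + coef["tenure"] * t
                                     + coef["spend"] * s)))
            for rid, t, s in rows}

# In-database scoring: compile the formula into SQL; only (id, score)
# comes back. SQLite lacks EXP(), so we register one; a real DBMS
# would evaluate the whole expression natively.
conn.create_function("EXP", 1, math.exp)
linear = " + ".join(f"{c} * {col}" for col, c in coef.items())
query = f"SELECT id, 1.0 / (1.0 + EXP(-({intercept} + {linear}))) FROM customers"

def score_in_database():
    return dict(conn.execute(query))

client, pushed = score_client_side(), score_in_database()
assert all(abs(client[k] - pushed[k]) < 1e-12 for k in client)
print(pushed)
```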
September 25, 2008

So what’s Oracle’s MPP-aware optimizer and query execution plan story?

Edit: Answers to the title question have now shown up, and so the post below is now superseded by this one.

In most respects — including most data warehousing respects — Oracle’s query optimizer is the most sophisticated on the planet (even ahead of IBM’s, I’d say). But in all the Exadata discussion — and also in a good, comprehensive review of Oracle’s data warehouse technology — I haven’t seen any claims that Oracle has tackled the hard problems of parallel analytics.

Yes, Oracle is now getting data off of multiple disks onto multiple processors at once, without SAN bottlenecks, and doing some local filtering. That’s the heart of the Exadata storage story, and it’s indeed a huge advance over Oracle’s prior technology. But what happens to the data after that? It’s sent over to a RAC cluster. And unless I’m terribly mistaken, any further processing will be done on just a single node in that cluster.
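
To make that worry concrete, here is a toy Amdahl's-law-style model. Every throughput figure in it is invented for illustration; none of it is an Oracle benchmark. However well the scan parallelizes across storage cells, total query time is floored by whatever processing remains on a single database node.

```python
# Toy model: a perfectly parallel storage scan followed by single-node
# post-processing (joins, aggregation). All figures are invented.

def query_seconds(raw_tb, storage_cells, selectivity,
                  cell_scan_tb_per_s=0.001, single_node_tb_per_s=0.0005):
    scan = raw_tb / (storage_cells * cell_scan_tb_per_s)   # parallel part
    post = raw_tb * selectivity / single_node_tb_per_s     # serial part
    return scan + post

# Doubling the storage cells keeps shrinking the scan time, but the
# serial post-processing term (~1000 s here) soon dominates.
for cells in (14, 28, 56, 112):
    print(cells, "cells:", round(query_seconds(10, cells, 0.05)), "s")
```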

September 24, 2008

Oracle Exadata and Oracle data warehouse appliance sound bites

In addition to my previously posted thoughts on the Oracle Exadata/data warehouse appliance announcement, let me offer some more concise observations.

Contradicting all that potential goodness, Oracle has been making ringing anti-shared-nothing statements, such as the silly:

There are “speed-of-light issues” associated with … scale-out-style grids

That mindset doesn’t augur well for Oracle ever becoming a fully competitive high-end data warehouse DBMS vendor.

September 24, 2008

Some of Oracle’s largest data warehouses

Googling around, I came across an Oracle presentation – given some time this year – that lists some of Oracle’s largest data warehouses. 10 databases total are listed at >16 TB, which is fairly consistent with Larry Ellison’s confession during the Exadata announcement that Oracle has trouble over 10 TB (something a few Oracle partisans have given me a lot of flak for pointing out … 😀 ).

However, what’s being measured is probably not the same in all cases. For example, I think the Amazon 70 TB figure is obviously for spinning disk (elsewhere in the presentation it’s stated that Amazon has 71 TB of disk). But the 16 TB British Telecom figure probably is user data — indeed, it’s the same figure Computergram cited for BT user data way back in 2001.

The list is: Read more
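
As a purely hypothetical illustration of why the two kinds of figures diverge (the overhead multipliers below are my guesses, not anyone's published ratios), 70 TB of raw disk can shrink to something near Ellison's 10 TB once typical overheads are divided out:

```python
# Hypothetical arithmetic only: the multipliers are illustrative guesses.
disk_tb = 70.0
mirroring = 2.0       # redundancy (e.g., mirrored storage)
index_overhead = 2.5  # indexes, aggregates, etc. on top of base data
temp_and_slack = 1.3  # temp/sort space, logs, free space

user_data_tb = disk_tb / (mirroring * index_overhead * temp_and_slack)
print(f"~{user_data_tb:.0f} TB of user data")  # ~11 TB on these guesses
```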

September 24, 2008

Exadata: Oracle finally answers the data warehouse challengers

Oracle, in partnership with HP, has announced a new data warehouse appliance product line, cleverly branded “Exadata.” The basic idea seems to be that database processing is split between two sets of servers:

  1. Exadata storage servers, which read data off disk in parallel and do some local filtering.
  2. A grid of conventional Oracle database servers (a RAC cluster), which handles the rest of query processing.

Numbers are being thrown around suggesting that, unlike prior Oracle offerings, the Oracle Exadata-based appliance at least has scalability and price/performance worth comparing to Teradata — hey, Exa is bigger than Tera! — Netezza, et al.

Kevin Closson, who evidently worked on the project, offers the most useful and detailed description of Oracle Exadata I’ve seen so far. In particular, he and Oracle seem to claim: Read more
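
As a rough sketch of the claimed offload idea (my own toy data and function names, not Oracle's): if the storage tier evaluates the filter and projects only the referenced columns before shipping anything, the data crossing the interconnect shrinks on both dimensions at once.

```python
# Sketch of storage-side filtering plus column projection. The schema,
# predicate, and row counts are all invented for the example.
rows = [
    {"order_id": i,
     "region": "US" if i % 4 == 0 else "EU",
     "amount": float(i),
     "notes": "x" * 80}   # a wide column the query never touches
    for i in range(100_000)
]

def ship_everything(rows):
    """Non-offloaded path: storage returns every row and every column."""
    return list(rows)

def storage_side_scan(rows, predicate, columns):
    """Offloaded path: filter rows and project columns before shipping."""
    return [{c: r[c] for c in columns} for r in rows if predicate(r)]

full = ship_everything(rows)
narrow = storage_side_scan(rows,
                           predicate=lambda r: r["region"] == "US",
                           columns=("order_id", "amount"))
print(len(full), "wide rows vs.", len(narrow), "narrow rows shipped")
```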

September 23, 2008

Oracle is integrating clickstream and network analytics too

Oracle announced today the not-so-concisely-named Oracle Real User Experience Insight, which actually seems to be an official nickname for what is more properly called “Oracle Enterprise Manager Real User Experience Insight.” Try saying that 10 times straight at network speeds … but I digress.

If I’m reading things correctly, add Oracle to the already long list of vendors who see clickstream and network event analytics as being two sides of the same coin.

September 23, 2008

A few operational BI/BPM/business rules stories

Intersystems is rolling out DeepSee, which is a Caché-specific BI engine. Since some Intersystems OEMs have been known to pay more money to Business Objects/Crystal Reports than to Intersystems itself, the business motivation is obvious. Technically, Intersystems’ claims include: Read more

September 23, 2008

Peter Batty on Netezza Spatial

As previously noted, I’m not up to speed on Netezza Spatial. Phil Francisco of Netezza has promised we’ll fix that ASAP. In the meantime, I found a blog by a guy named Peter Batty, who evidently knows this area well.

Batty offers a lot of detail in two recent posts, intermixed with some gollygeewhiz about Netezza in general. If you’re interested in this stuff, Batty’s blog is well worth checking out. Read more

September 22, 2008

Database compression is heavily affected by the kind of data

I’ve written often of how different kinds or brands of data warehouse DBMS get very different compression figures. But I haven’t focused enough on how much compression figures can vary among different kinds of data. This was really brought home to me when Vertica told me that web analytics/clickstream data can often be compressed 60X in Vertica, while at the other extreme — some kind of floating point data, whose details I forget for now — they could only do 2.5X. Edit: Vertica has now posted much more accurate versions of those numbers. Infobright’s 30X compression reference at TradeDoubler seems to be for a clickstream-type app. Greenplum’s customer getting 7.5X — high for a row-based system — is managing clickstream data and related stuff. Bottom line:

When evaluating compression ratios — especially large ones — it is wise to inquire about the nature of the data.
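
That spread is easy to reproduce directionally even with a generic byte-level compressor (Vertica, Infobright, et al. use much more specialized encodings, so treat the exact ratios as meaningless): low-cardinality, repetitive clickstream-style values compress enormously, while random floating-point payloads barely compress at all.

```python
# Directional illustration with zlib; only the spread matters, not the
# specific ratios, since column stores use specialized encodings.
import random
import struct
import zlib

random.seed(0)

# Clickstream-style column: few distinct values, heavily repeated.
urls = ["/home", "/search", "/product/42", "/checkout"]
clicks = "\n".join(random.choice(urls) for _ in range(100_000)).encode()

# Floating-point column: mantissa bits are effectively random.
floats = b"".join(struct.pack("d", random.random()) for _ in range(100_000))

for name, payload in (("clickstream", clicks), ("floats", floats)):
    ratio = len(payload) / len(zlib.compress(payload, 9))
    print(f"{name}: {ratio:.1f}x")
```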
