September 3, 2009

Teradata and Netezza are doing MapReduce too

Netezza told me a while ago that it planned to introduce MapReduce, and agreed yesterday this was no longer NDAed. Stephen Brobst of Teradata* let slip at XLDB that Teradata has MapReduce too, apparently implemented but not yet generally available.

I don’t have details in either case. Netezza and Teradata evidently aren’t taking MapReduce as seriously as Aster Data, or even Greenplum or Vertica. But even so, MapReduce has become pretty much a “checkmark” item for large-database analytic DBMS vendors.

*Technically, Brobst is not and never has been a Teradata employee — but he’s widely and correctly regarded as being “of Teradata” even so. 🙂

September 3, 2009

SAS on Netezza and other Netezza extensibility

I chatted with SAS CTO Keith Collins yesterday about the new SAS/Netezza in-database parallel data mining scoring offering. My impression is that this is very similar to SAS’ current Teradata support, notwithstanding SAS’ and Teradata’s apparent original intention of offering in-database modeling by now as well.

I gather this is a big performance-enhancing deal, just as it is for SPSS or Oracle’s own data mining over Oracle.  However, I must confess to not yet understanding why.  That is, I don’t know what’s so complicated about data mining scoring algorithms that makes hand-coding them in SQL particularly forbidding. My naive view of data mining is that you do a big regression to get a bunch of weights, and the resulting scoring algorithm is a linear combination of a few dozen variables.  Evidently, that’s not quite right.
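To make that naive view concrete, here’s a toy Python sketch. Every column name and weight below is invented for illustration, and this is not SAS’ actual scoring code; it just contrasts a pure linear-combination score with the sorts of transformations (skew corrections, missing-value handling, link functions) that, I presume, are part of what makes real scoring code awkward to hand-translate into SQL:

```python
# Toy illustration only -- invented columns and weights, not a real SAS model.
import math

weights = {"age": 0.012, "income": 0.00003, "tenure_months": -0.004}
intercept = -1.5

def naive_score(row):
    """The 'naive view': intercept plus weight * value, trivially translatable to SQL."""
    return intercept + sum(w * row[col] for col, w in weights.items())

def logistic_score(row):
    """Real scoring typically layers on transformations, which is where SQL hand-coding
    starts to get forbidding."""
    x = intercept
    x += weights["age"] * row["age"]
    x += weights["income"] * math.log1p(max(row["income"], 0))   # log-transform a skewed variable
    x += weights["tenure_months"] * (row["tenure_months"] or 0)  # crude missing-value handling
    return 1.0 / (1.0 + math.exp(-x))                            # logistic link function

example = {"age": 40, "income": 80000, "tenure_months": 24}
print(naive_score(example))
print(logistic_score(example))
```

Even the toy version shows where a hand translation starts to sprawl: each transformation, missing-value rule, and link function has to be re-expressed as SQL, and presumably real models carry far more such terms than this.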

Anyhow, it turns out that SAS held off on this work until it could be done for TwinFin. That’s largely because TwinFin lets partners write code on Intel CPUs, while previously they had to write in C for Netezza’s FPGAs. I got a similar sense from at least one other Netezza partner as well.

September 3, 2009

Oracle Exadata hybrid columnar compression

Oracle Database 11g Release 2 is out, and as usual I wasn’t briefed — perhaps because Oracle is more scared of hard questions than its competitors are, perhaps for some other reason entirely.*  Anyhow, Oracle Database 11g Release 2 contains an Exadata-only feature called hybrid columnar compression. The Oracle Database 11g Release 2 white paper says “data is grouped, ordered, and stored one column at a time.” But Kevin Closson clarifies:

The word hybrid is important.

Rows are still used. They are stored in an object called a Compression Unit. Compression Units can span multiple blocks. Like values are stored in the compression unit with metadata that maps back to the rows.

So, “hybrid” is the word. But, none of that matters as much as the effectiveness. This form of compression is extremely effective.

That sounds a whole lot like PAX (Partition Attributes Across). Specifically, in Oracle’s case I would guess “hybrid columnar compression” provides the compression benefits of column stores, but not column stores’ I/O benefits, and also not any kind of in-memory compression. Read more
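To illustrate what a PAX-like layout means in general, here’s a toy Python sketch (purely my own illustration of the concept, not Oracle’s actual compression unit format). Rows are grouped into a “compression unit,” each column’s values within the unit are stored together and run-length encoded, and whole rows can still be reassembled from the unit:

```python
# Toy PAX-style "compression unit": column-wise storage within a row group.
from itertools import groupby

rows = [
    ("US", "2009-09-01", 10.0),
    ("US", "2009-09-01", 12.5),
    ("UK", "2009-09-02", 10.0),
    ("UK", "2009-09-02", 10.0),
]

def build_compression_unit(rows):
    """Pivot a group of rows column-wise and run-length encode each column."""
    columns = list(zip(*rows))  # one tuple of values per column
    encoded = [[(value, len(list(g))) for value, g in groupby(col)] for col in columns]
    return {"row_count": len(rows), "columns": encoded}

def read_row(unit, n):
    """Reassemble row n from the column-wise encoding (the 'hybrid' part)."""
    row = []
    for col in unit["columns"]:
        i = n
        for value, count in col:
            if i < count:
                row.append(value)
                break
            i -= count
    return tuple(row)

unit = build_compression_unit(rows)
print(unit["columns"])    # like values stored together within the unit
print(read_row(unit, 2))  # ('UK', '2009-09-02', 10.0)
```

The point of the toy is just that column-oriented encodings can be applied inside the unit while the unit, rather than an entire table column, remains the granularity of storage and I/O — which is why I would not expect column-store-style I/O savings from it.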

September 2, 2009

Teradata has over 100 appliances in production

I recently wrote that Teradata had gotten serious about appliance product lines, and had non-trivial sales figures for them. In a press release today, Teradata is now explicitly saying (emphasis mine):

Teradata now has more than 100 appliances in production, including the Data Mart Appliance 551, the Data Warehouse Appliance 2550, and the Extreme Data Appliance 1550, which complement the core platform, the Teradata Active Enterprise Data Warehouse 5550.

The breakdowns on that are NDA, and anyhow I can’t find them immediately in my notes.* But if memory serves, while a lot of those appliances are used for test and development, a whole other lot of them are used to do actual production query-answering work. (Edit: Memory turned out to be wrong.) Read more

August 25, 2009

Sybase IQ technical highlights

General highlights of the Sybase IQ technical story include:

Highlights of the Sybase IQ compression story include: Read more

August 25, 2009

Sybase IQ business notes

As specialized analytic DBMS go, Sybase is near the top of the charts both in age (Sybase IQ was first introduced in the mid 1990s) and adoption. That’s even more true, of course, if we restrict the discussion strictly to columnar DBMS, aka column stores. Basic Sybase IQ adoption claims include:

Note that 98% of Sybase IQ installations are under 5 terabytes; the heart of Sybase IQ’s business is the sub-terabyte data warehouse market.* Read more

August 24, 2009

Teradata highlights some analytic use cases

A couple of slides in my recent briefing on Teradata’s Active Enterprise Data Warehouse Story contained long lists of analytic use cases, at a finer level of granularity than I’m focusing on for a September speaking tour. I think they’re interesting to pass along. Read more

August 24, 2009

Teradata’s Active Enterprise Data Warehouse story

Teradata used to tell a one-size-fits-all Enterprise Data Warehouse (EDW) story. That’s no longer the case. Last year, Teradata introduced a range of products. I think Teradata is serious about selling its full product range, and by now has achieved buy-in from its sales force for that strategy. I base these beliefs on data points such as:

But that raises the question: How does Teradata pitch the advantages of its top-end product line these days? At least at the corporate level, the answer seems to focus less on the “EDW” concept than it used to, and more on “Active.” Teradata, which actually has been talking about “Active Data Warehousing” for about a decade, indeed calls its top-end 55xx series the “Teradata Active Enterprise Data Warehouse.”

Teradata proudly told me that it has >100 customers who have truly adopted an “Active” EDW. When we discussed what that meant, supported by a whole lot of named examples, it became clear that “Active” data warehousing: Read more

August 21, 2009

Social network analysis, aka relationship analytics

A number of applications lend themselves to graph-oriented analytics, including:

There are plenty more graph-oriented applications, of course, such as the identification of biochemical pathways. But I want to focus for now on ones like those on my list. My key points are:

Here’s what I mean. Read more

August 21, 2009

Bottleneck Whack-A-Mole

Developing a good software product is often a process of incremental improvement. Obviously, that can happen in the case of feature addition or bug-fixing. Less obviously, there’s also great scope for incremental improvement in how the product works at its core.

And it goes even further. For example, I was told by a guy who is now a senior researcher at Attivio: “How do you make a good speech recognition product? You start with a bad one and keep incrementally improving it.”

In particular, I’ve taken to calling the process of enhancing a product’s performance across multiple releases “Bottleneck Whack-A-Mole” (rhymes with guacamole). This is a reference to the Whack-A-Mole arcade game,* the core idea of which is:

Read more
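To illustrate the loop I have in mind, here’s a toy Python sketch (the stage names and timings are invented for illustration, not measurements of any real product). Each “release” profiles the query pipeline, fixes the single worst stage, and thereby promotes the next-slowest stage to being the new bottleneck:

```python
# Toy "Bottleneck Whack-A-Mole": profile, fix the worst stage, repeat.
# Stage names and millisecond costs are made up for illustration.
stage_cost = {"parse": 5.0, "optimize": 12.0, "scan": 40.0, "aggregate": 25.0}

for release in range(1, 4):
    bottleneck = max(stage_cost, key=stage_cost.get)   # profiling: find the worst stage
    print(f"Release {release}: whacking '{bottleneck}' ({stage_cost[bottleneck]:.0f} ms)")
    stage_cost[bottleneck] *= 0.4                       # an incremental fix, not a rewrite
    print(f"  total query time now {sum(stage_cost.values()):.0f} ms")
```

The serious point is simply that the profile-fix-repeat loop never really terminates; each release just moves the worst offender somewhere else.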
