Analytic technologies

Discussion of technologies related to information query and analysis.

May 8, 2008

Vertica update

Another TDWI conference approaches. Not coincidentally, I had another Vertica briefing. Primary subjects included some embargoed stuff, plus (at my instigation) outsourced data marts. But I also had the opportunity to follow up on a couple of points from February’s briefing, namely:

Vertica has about 35 paying customers. That doesn’t sound like a lot more than they had a quarter ago, but first quarters can be slow.

Vertica’s list price is $150K/terabyte of user data. That sounds very high versus the competition. On the other hand, if you do the math versus what they told me a few months ago — average initial selling price $250K or less, multi-terabyte sites — it’s obvious that discounting is rampant, so I wouldn’t actually assume that Vertica is a high-priced alternative.
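To make "do the math" concrete, here is a minimal sketch of the implied discount. The $150K/TB list price and the $250K average initial sale come from the figures above; the site sizes are purely hypothetical, since "multi-terabyte" is all I was told, and $250K is treated as the actual sale price even though the real figure may be lower.

```python
LIST_PRICE_PER_TB = 150_000   # list price per TB of user data (from the post)
AVG_INITIAL_SALE = 250_000    # average initial selling price; the post says "or less"


def implied_discount(terabytes: float) -> float:
    """Fraction off list implied by a $250K sale covering `terabytes` of user data."""
    list_total = LIST_PRICE_PER_TB * terabytes
    return 1 - AVG_INITIAL_SALE / list_total


# Hypothetical site sizes, chosen only for illustration.
for tb in (2, 3, 5):
    list_total = LIST_PRICE_PER_TB * tb
    print(f"{tb} TB: list ${list_total:,}, implied discount {implied_discount(tb):.0%}")
```

Because the $250K figure is an upper bound, the real discounts at any given size would be at least this large.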

Vertica does stress several reasons for thinking its TCO is competitive. First, with all that compression and performance, they think their hardware costs are very modest. Second, with the self-tuning, they think their DBA costs are modest too. Finally, they charge only for deployed data; the software that stores copies of data for development and test is free.

May 8, 2008

Outsourced data marts

Call me slow on the uptake if you like, but it’s finally dawned on me that outsourced data marts are a nontrivial segment of the analytics business. For example:

To a first approximation, here’s what I think is going on.

April 29, 2008

Truviso and EnterpriseDB blend event processing with ordinary database management

Truviso and EnterpriseDB announced today that there’s a Truviso “blade” for Postgres Plus. By email, EnterpriseDB’s Bob Zurek endorsed my tentative summary of what this means technically, namely:

  • There’s data being managed transactionally by EnterpriseDB.

  • Truviso’s DML has all along included ways to talk to a persistent Postgres data store.

  • If, in addition, one wants to do stream processing things on the same data, that’s now possible, using Truviso’s usual DML.

April 25, 2008

ParAccel pricing

I made a round of queries about data warehouse software or appliance pricing, and am posting the results as I get them. Earlier installments featured Teradata and Netezza. Now ParAccel is up.

ParAccel’s software license fees are actually very simple — $50K per server or $100K per terabyte, whichever is less. (If you’re wondering how the per-TB fee can ever be the smaller one, please recall that ParAccel offers a memory-centric approach to sub-TB databases.)
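The "whichever is less" rule can be sketched as a one-line function. The two rates come from the paragraph above; the function name and the example configurations are my own assumptions, and maintenance is left out because I don't have clear numbers for it.

```python
PER_SERVER_FEE = 50_000   # license fee per server (from the post)
PER_TB_FEE = 100_000      # license fee per terabyte (from the post)


def paraccel_license_fee(servers: int, terabytes: float) -> float:
    """License fee under the 'per server or per TB, whichever is less' rule."""
    return min(PER_SERVER_FEE * servers, PER_TB_FEE * terabytes)


# Hypothetical configurations, for illustration only.
print(paraccel_license_fee(servers=4, terabytes=10))    # per-server fee is the smaller one
print(paraccel_license_fee(servers=1, terabytes=0.25))  # sub-TB case: per-TB fee is the smaller one
```

The second call is the memory-centric, sub-TB case in which the per-terabyte fee comes in below the per-server fee.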

Details about how much data fits on a node are hard to come by, as is clarity about maintenance costs. Even so, pricing turns out to be one of the rare subjects on which ParAccel is more forthcoming than most competitors.

April 25, 2008

Yet another data warehouse database and appliance overview

For a recent project, it seemed best to recapitulate my thoughts on the overall data warehouse specialty DBMS and appliance marketplace. While what resulted is highly redundant with what I’ve posted in this blog before, I’m sharing anyway, in case somebody finds this integrated presentation more useful. The original is excerpted to remove confidential parts.

… This is a crowded market, with a lot of subsegments, and blurry, shifting borders among the subsegments.

Everybody starts out selling consumer marketing and telecom call-detail-record apps. …

Oracle and similar products are optimized for updates above everything else. That is, short rows of data are banged into tables. The main indexing scheme is the “b-tree,” which is optimized for finding specific rows of data as needed, and also for being updated quickly in lockstep with updates to the data itself.
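As a rough illustration of that access pattern, here is a toy sketch in which every row insert also updates an index, and lookups go through the index to a single row. It uses a sorted list with `bisect` as a stand-in for a b-tree; it is not a real b-tree and not how Oracle implements anything, and all names are hypothetical.

```python
import bisect


class SortedIndex:
    """Toy stand-in for a b-tree index: sorted keys mapped to row positions."""

    def __init__(self):
        self.keys = []
        self.positions = []

    def insert(self, key, position):
        # Keep the keys sorted so later lookups can use binary search.
        i = bisect.bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.positions.insert(i, position)

    def find(self, key):
        # Binary search for one specific key; return its row position or None.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.positions[i]
        return None


table = []             # short rows, appended as they arrive
index = SortedIndex()  # updated in lockstep with the table


def insert_row(row):
    table.append(row)
    index.insert(row[0], len(table) - 1)


for row in [(42, "alice", 10.0), (7, "bob", 25.5), (19, "carol", 3.2)]:
    insert_row(row)

print(table[index.find(19)])  # fetch one specific row via the index
```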

By way of contrast, an analytic DBMS is optimized for some or all of:

Database and/or DBMS design techniques that have been applied to analytic uses include:

April 21, 2008

DATAllegro finally has a blog

It took a lot of patient nagging, but DATAllegro finally has a blog. Based on the first post, I predict:

The crunchiest part of the first post is probably

Another very important aspect of performance is ensuring sequential reads under a complex workload. Traditional databases do not do a good job in this area – even though some of the management tools might tell you that they are! What we typically see is that the combination of RAID arrays and intervening storage infrastructure conspires to break even large reads by the database into very small reads against each disk. The end result is that most large DW installations have very large arrays of expensive, high-speed disks behind them – and still suffer from poor performance.

I’ve pounded the table about sequential reads multiple times — including in a (DATAllegro-sponsored) white paper — but the point about misleading management tools is new to me.
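To see how a large read can turn into small per-disk reads, here is a minimal sketch assuming simple RAID-0 style striping. The 64 KiB stripe unit, the 8 disks, and the 1 MiB read are assumptions of mine, not DATAllegro's numbers, and the function name is hypothetical.

```python
from collections import defaultdict

KIB = 1024


def split_read(offset: int, length: int, stripe_unit: int, disks: int):
    """Split one logical read into (disk, size) chunks under simple striping."""
    chunks = []
    pos, end = offset, offset + length
    while pos < end:
        stripe_index = pos // stripe_unit
        disk = stripe_index % disks
        # A chunk stops at the end of the read or at the next stripe-unit boundary.
        chunk_end = min(end, (stripe_index + 1) * stripe_unit)
        chunks.append((disk, chunk_end - pos))
        pos = chunk_end
    return chunks


# Assumed layout: one 1 MiB read, 64 KiB stripe unit, 8 disks.
chunks = split_read(offset=0, length=1024 * KIB, stripe_unit=64 * KIB, disks=8)

per_disk = defaultdict(list)
for disk, size in chunks:
    per_disk[disk].append(size // KIB)

print(len(chunks), "chunks across", len(per_disk), "disks")
for disk in sorted(per_disk):
    print(f"disk {disk}: reads of {per_disk[disk]} KiB")
```

Whether each disk actually services its chunks as one sequential read depends on the controller and on what else is hitting the array at the same time, which this sketch does not model.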

Now if I could just get a production DATAllegro reference, I’d be completely happy …

April 21, 2008

Netezza pricing

In connection with the announcement of the Teradata 2500, I asked some Teradata competitors about pricing. Netezza’s response amounted to “We don’t disclose list pricing, but our cheapest system handles about 3 1/4 TB and sells for under $200K.” So Netezza’s actual pricing is well below the list price of the Teradata 2500.
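For comparison with per-terabyte prices quoted elsewhere, Netezza's answer works out to a ceiling on price per terabyte. This is a back-of-the-envelope sketch that assumes "handles about 3 1/4 TB" refers to user data and that treats $200K as an upper bound.

```python
PRICE_CEILING = 200_000   # "sells for under $200K" (from the post)
CAPACITY_TB = 3.25        # "handles about 3 1/4 TB" (assumed to mean user data)

per_tb_ceiling = PRICE_CEILING / CAPACITY_TB
print(f"under ${per_tb_ceiling:,.0f} per TB")
```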

April 21, 2008

Teradata introduces lower-cost appliances

After months of leaks, Teradata has unveiled its new lines of data warehouse appliances, raising the total number either from 1 to 3 (my view) or 0 to 2 (what you believe if you think Teradata wasn’t previously an appliance vendor). Most significant is the new Teradata 2500 series, meant to compete directly with the smaller data warehouse specialists. Highlights include:

April 18, 2008

Kickfire kicks off

I chatted with Raj Cherabuddi and others on the Kickfire (formerly C2) team for over an hour on Monday, and now have a better sense of their story. There are some very basic questions I still don’t have answers to; I’ll fill those in when I can.

Highlights of what I have and haven’t figured out so far include:

*Somebody (perhaps adman extraordinaire Rick Bennett?) may want to check my memory on this, but I think Oracle’s famed “Gentlemen, start your snails” ad in the early 1990s was about PC World tests, not TPCs. Oracle also had an ad about WW1-style planes nosediving, but I don’t think that one referenced TPCs either.

April 8, 2008

Kickfire is de-cloaking

Kickfire, the renamed C2, is doing one of those buzz-building rollouts in which they make sure the first word comes from people on their payroll golly-gee-whizzing. You can see those at Xarpb and Diamond Notes, as well as a forthcoming article in MySQL magazine. Farhan Mashraqi also appears to be involved. Kickfire is also sponsoring the MySQL user conference next week.

I plan to write more after I get some substance, but a few things seem clear:

1. Kickfire’s product is an appliance that functions as a MySQL storage engine.
2. There’s a custom chip involved.
3. Kickfire plans to throw around the “stream processing” buzzphrase a lot.

Now, “stream processing” means a lot of different things to different people. E.g., Netezza uses the phrase just because their FPGA throws away a lot of data before ever routing it to more conventional SQL processing. But pending a briefing, I’m guessing that Kickfire’s sense is similar to what underlies the case for using CEP in BI.

Edit: Here’s an update after an actual Kickfire briefing.
