ParAccel

Analysis of columnar data warehouse DBMS vendor ParAccel.

June 23, 2009

ParAccel pricing

As I noted in connection with ParAccel’s recent TPC-H filing, I think the whole exercise is basically an expensive joke. But one slightly useful spin-off is that ParAccel disclosed pricing.  Specifically, ParAccel’s stated price in the disclosure document is:

Last year ParAccel quoted prices of $100,000/TB or $50,000/server.  The latter figure would seem to have led to lower numbers on the benchmark configuration, so perhaps it’s no longer an option on ParAccel’s price list.

June 22, 2009

The TPC-H benchmark is a blight upon the industry

ParAccel has released a 30,000-gigabyte TPC-H benchmark, and no less a sage than Merv Adrian paid attention. Now, the TPCs may have had some use in the 1990s. Indeed, Merv was my analyst relations contact for a visit to my clients at Sybase around the time (1996 or so) I was advising Sybase on how to market against its poor benchmark results. But TPCs are worthless today.

It’s not just that TPCs are highly tuned. (ParAccel’s claim of “load-and-go” is laughable. Edit: Looking at Appendix A of the full disclosure report, maybe it’s more justified than I thought.) It’s also not just that different analytic database management products perform very differently on different workloads, making the TPC-H not much of an indicator of anything real-life. The biggest problem is: Most TPC benchmarks are run on absurdly unrealistic hardware configurations.

For example, if you look at some details, the ParAccel 30-terabyte benchmark ran on 43 nodes, each with 64 gigabytes of RAM and 24 terabytes of disk. That’s 961,124.9 gigabytes of disk, officially, for a 32:1 disk/data ratio. By way of contrast, real-life analytic DBMS with good compression often have disk/data ratios of well under 1:1.

Meanwhile, the RAM:data ratio is around 1:11. It’s clear that ParAccel’s early TPC-H benchmarks ran entirely in RAM; indeed, ParAccel even admits that. And so I conjecture that ParAccel’s latest TPC-H benchmark ran (almost) entirely in RAM as well. Once again, this would illustrate that the TPC-H is irrelevant to judging an analytic DBMS’ real-world performance.
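The two ratios above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in this post; it assumes decimal units (30 terabytes = 30,000 gigabytes), and the variable names are mine, not anything from the disclosure report.

```python
# Figures as quoted in the post; nothing here comes from the
# full disclosure report beyond those numbers.
NODES = 43
RAM_PER_NODE_GB = 64
DATA_GB = 30_000              # the 30-terabyte TPC-H scale, in decimal GB
OFFICIAL_DISK_GB = 961_124.9  # the "official" disk total cited above

# Disk capacity relative to raw data size.
disk_to_data = OFFICIAL_DISK_GB / DATA_GB

# Total RAM across the cluster, and raw data size relative to it.
total_ram_gb = NODES * RAM_PER_NODE_GB
data_to_ram = DATA_GB / total_ram_gb

print(f"disk:data = {disk_to_data:.1f}:1")
print(f"total RAM = {total_ram_gb} GB")
print(f"RAM:data  = 1:{data_to_ram:.1f}")
```

Note that the RAM:data figure is against raw data size. With columnar compression of a few-fold, the compressed database could plausibly come much closer to fitting in memory, which is consistent with the conjecture above but does not prove it.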

More generally — I would not advise anybody to consider ParAccel’s product, for any use, except after a proof-of-concept in which ParAccel was not given the time and opportunity to perform extensive off-site tuning. I tend to feel that way about all analytic DBMS, but it’s a particular concern in the case of ParAccel.

April 22, 2009

DBMS transparency layers never seem to sell well

A DBMS transparency layer, roughly speaking, is software that makes things that are written for one brand of database management system run unaltered on another.* These never seem to sell well. ANTs has failed in a couple of product strategies. EnterpriseDB’s Oracle compatibility only seems to have netted it a few sales, and only a small fraction of its total business. ParAccel’s and Dataupia’s transparency strategies have produced even less.

*The looseness in that definition highlights a key reason these technologies don’t sell well — it’s hard to be sure that what you’re buying will do a good job of running your particular apps.

This subject comes to mind for two reasons. One is that IBM seems to have licensed EnterpriseDB’s Oracle transparency layer for DB2. The other is that a natural upgrade path from MySQL to Oracle might be a MySQL transparency layer on top of an Oracle base.

Read more

April 1, 2009

Lots of analytic DBMS vendors are hiring

After writing about a Twitter jobs page, it occurred to me to check out whether analytic DBMS vendors are still hiring. Based on the Careers pages on their websites, I determined that Aster, Greenplum, Kickfire, and ParAccel all evidently are, in various mixes of (mainly) technical and field positions. At that point I got bored and stopped.

I didn’t choose those vendors entirely at random. If I had to name three vendors who are said to have had small layoffs at some point over the past few quarters, it would be ParAccel, Greenplum, and Kickfire.  So if even they are hiring, the analytic DBMS sector is still pretty healthy … or at least thinks it is. ;)

March 18, 2009

Database implications if IBM acquires Sun

Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion today (some links below). Here are some quick thoughts around the subject of how the IBM/Sun deal — if it happens — might affect the database management system industry.

Read more

February 4, 2009

Draft slides on how to select an analytic DBMS

I need to finalize an already-too-long slide deck on how to select an analytic DBMS by late Thursday night.  Anybody see something I’m overlooking, or just plain got wrong?

Edit: The slides have now been finalized.

January 3, 2009

ParAccel’s market momentum

After my recent blog post, ParAccel is once again angry that I haven’t given it proper credit for its accomplishments. So let me try to redress the failing.

Uh, that’s about all I can think of. What else am I forgetting? Surely that can’t be ParAccel’s entire litany of market success!

December 29, 2008

ParAccel actually uses relatively little PostgreSQL code

I often find it hard to write about ParAccel’s technology, for a variety of reasons:

ParAccel is quick, however, to send email if I post anything about them they think is incorrect.

All that said, I did get careless when I neglected to doublecheck something I already knew. Read more

December 20, 2008

More grist for the column vs. row mill

Daniel Abadi and Sam Madden are at it again, following up on their blog posts of six months ago arguing for the general superiority of column stores over row stores (for analytic query processing). The gist is to recite a number of bases for superiority, beyond the two standard ones of less I/O and better compression, and seems to be based largely on Section 5 of a SIGMOD paper they wrote with Nabil Hachem.

A big part of their argument is that if you carry the processing of columnar and/or compressed data all the way through in memory, you get lots of advantages, especially because everything’s smaller and hence fits better into Level 2 cache. There also is some kind of join algorithm enhancement, which seems to be based on noticing when the result wound up falling into a range according to some dimension, and perhaps using dictionary encoding in a way that will help induce such an outcome.
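To make the "carry the compressed data all the way through" idea a bit more concrete, here is a minimal sketch of one generic technique: an order-preserving dictionary encoding, where a range predicate on the original values becomes a range check on small integer codes. This is my own illustration, not the algorithm from the Abadi/Madden/Hachem paper, and the function names and sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right

def dictionary_encode(values):
    """Return (sorted dictionary, codes). Because the dictionary is sorted,
    code order matches value order."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]

def range_filter(dictionary, codes, low, high):
    """Row positions whose value lies in [low, high], found by comparing
    integer codes only; the column is never decoded."""
    lo_code = bisect_left(dictionary, low)
    hi_code = bisect_right(dictionary, high) - 1
    return [pos for pos, c in enumerate(codes) if lo_code <= c <= hi_code]

# Hypothetical column of region names.
regions = ["east", "west", "north", "east", "south", "west"]
dictionary, codes = dictionary_encode(regions)
matches = range_filter(dictionary, codes, "north", "south")

print(dictionary)
print(codes)
print(matches)
```

The point of the sketch is just that the scan touches a compact array of integers rather than the original strings, which is the kind of thing that helps data fit in cache.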

The main enemy here is row-store vendors who say, in effect, “Oh, it’s easy to shoehorn almost all the benefits of a column-store into a row-based system.”  They also take a swipe — for being insufficiently purely columnar — at unnamed columnar Vertica competitors, described in terms that seemingly apply directly to ParAccel.

August 24, 2008

My current customer list among the data warehouse specialists

One of my favorite pages on the Monash Research website is the list of many current and a few notable past customers. (Another favorite page is the one for testimonials.) For a variety of reasons, I won’t undertake to be more precise about my current customer list than that. But I don’t think it would hurt anything to list the data warehouse DBMS/appliance specialists in the group. They are:

All of those are Monash Advantage members.

If you care about all this, you may also be interested in the rest of my standards and disclosures.

August 12, 2008

Compare/contrast of Vertica, ParAccel, and Exasol

I talked with Exasol today (at 5:00 am!) and of course want to blog about it. For clarity, I’d like to start by comparing/contrasting the fundamental data structures at Vertica, ParAccel, and Exasol. And it feels like that should be a separate post. So here goes.

Beyond the above, I plan to discuss in a separate post how Exasol does MPP shared-nothing software-only columnar data warehouse database management differently than Vertica and ParAccel do shared-nothing software-only columnar data warehouse database management. :)

July 24, 2008

How will Oracle save its data warehouse business?

By acquiring DATAllegro, Microsoft has seriously leapfrogged Oracle in data warehouse technology. All doubts about maturity and versatility notwithstanding, DATAllegro has a 10X or better size advantage (actually, I think it’s more like 20-40X) versus Oracle in warehouses its technology can straightforwardly handle. Oracle cannot afford to let this move go unanswered.

It’s of course possible that Oracle has been successfully developing comparable data warehouse technology internally. But it’s unlikely. Oracle hasn’t done anything that radical, internally and successfully, for about 15 years, RAC (Real Application Clusters) excepted. (I.e., since the object/relational extensibility framework started in Release 7.) So in all likelihood, the answer will come via acquisition. I think there are four candidates that make the most sense: Teradata, Vertica, ParAccel, and Greenplum. Kognitio (controlled by former Oracle honcho Geoff Squire) might be in the mix as well. Netezza is probably a non-starter because of its hardware-centric strategy.

Here’s why I’m emphasizing Teradata, Vertica, ParAccel, and Greenplum:

Read more

May 19, 2008

ParAccel unveils its EMC-related appliance strategy

Embargoes are getting ever more stupid these days, wasting analysts’ and bloggers’ time in doomed attempts to micromanage the news flow. ParAccel is no exception to the rule. An announcement that’s actually been public knowledge for a couple of months was finally made official a few minutes ago. It’s an appliance, or at least an attempt to gain customers for an appliance. The core ideas include:

April 25, 2008

ParAccel pricing

I made a round of queries about data warehouse software or appliance pricing, and am posting the results as I get them. Earlier installments featured Teradata and Netezza. Now ParAccel is up.

ParAccel’s software license fees are actually very simple — $50K per server or $100K per terabyte, whichever is less. (If you’re wondering how the per-TB fee can ever be the smaller one, please recall that ParAccel offers a memory-centric approach to sub-TB databases.)
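The "whichever is less" rule is easy to express as a calculation. The sketch below uses only the two list prices quoted above; the function name and the sample configurations are hypothetical, and it assumes no rounding of partial terabytes, which the quoted pricing does not specify.

```python
PRICE_PER_SERVER = 50_000   # USD, as quoted above
PRICE_PER_TB = 100_000      # USD, as quoted above

def license_fee(servers, terabytes):
    """Lesser of the per-server fee and the per-terabyte fee."""
    return min(servers * PRICE_PER_SERVER, terabytes * PRICE_PER_TB)

# Hypothetical sub-terabyte, memory-centric setup: per-TB pricing wins.
small = license_fee(servers=4, terabytes=0.5)

# Hypothetical larger disk-based setup: per-server pricing wins.
large = license_fee(servers=10, terabytes=20)

print(small, large)
```

In the first case four servers would cost $200K at the per-server rate, so the $50K per-TB figure is the smaller one; that is the sub-TB scenario the parenthetical above refers to.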

Details about how much data fits on a node are hard to come by, as is clarity about maintenance costs. Even so, pricing turns out to be one of the rare subjects on which ParAccel is more forthcoming than most competitors.

April 5, 2008

Positioning the data warehouse appliances and specialty DBMS

There now are four hardware vendors that each offer or seem about to announce two different tiers of data warehouse appliances: Sun, HP, EMC, and Teradata. Specifically:

Read more

April 5, 2008

EMC is partnering with ParAccel

A talk about a ParAccel/EMC partnership has been promised for a forthcoming EMC user conference. Otherwise, ParAccel is exposing no useful information on the matter.*

*So what else is new?

The talk is called Highly Scalable Analytic Appliance Powered by EMC and ParAccel, and the abstract says: Read more

February 18, 2008

ParAccel technical highlights

I recently caught up with ParAccel’s CTO Barry Zane and Marketing VP Kim Stanick for a long technical discussion, which they have graciously continued by email. It would be impolitic in the extreme to comment on what led up to that. Let’s just note that many things I’ve previously written about ParAccel are now inoperative, and go straight to the highlights.

Read more

February 8, 2008

Load speeds and related issues in columnar DBMS

Please do not rely on the parts of the post below that are about ParAccel. See our February 18 post about ParAccel instead.

I’ve already posted about a chat I had with Mike Stonebraker regarding Vertica yesterday. I naturally raised the subject of load speed, unaware that Mike’s colleague Stan Zlodnik had posted at length about load speed the day before. Given that post, it seems timely to go into a bit more detail, and in particular to address three questions:

  1. Can columnar DBMS do operational BI?
  2. Can columnar DBMS do ELT (Extract-Load-Transform, as opposed to ETL)?
  3. Are columnar DBMS’ load speeds a problem other than in issues #1 and #2?

Read more

January 16, 2008

Things could get interesting for Infobright

Of the many new specialty data warehouse DBMS and appliances, Infobright’s BrightHouse is the only leading one based on MySQL. I expect Sun and Infobright to have some interesting conversations now. Conversely, I wouldn’t be optimistic about any partnering discussions Infobright might have with, say, HP.

The most directly competitive relationship Sun now has to any future Infobright partnership is with ParAccel.

January 14, 2008

Intelligent Enterprise’s list of 12/36/48 vendors

I’m getting a flood of press releases today, because many of the companies I write about were selected to Intelligent Enterprise’s list of 12 most influential vendors plus 36 more to watch in the areas Intelligent Enterprise covers (which seems to be pretty much the analytics-related parts of what I write about here and on Text Technologies). It looks like a pretty reasonable list, although I think they forced the issue in some of the small analytics vendors they selected, and of course anybody can quibble with some of the omissions.

Among the companies they cited, you can find topical categories here for IBM (and Cognos), Informatica, Microsoft, Netezza, Oracle, SAP/Business Objects (both), SAS, and Teradata; QlikTech; Cast Iron, Coral8, DATAllegro, HP, ParAccel, and StreamBase; and Software AG. On Text Technologies you’ll find categories for some of the same vendors, plus Attensity, Clarabridge, and Google. There also are categories for some of these vendors on the Monash Report.

December 14, 2007

A quick survey of data warehouse management technology

There are at least 16 different vendors offering appliances and/or software that do database management primarily for analytic purposes.* That’s a lot to keep up with. So I’ve thrown together a little overview of the analytic data management landscape, liberally salted with links to information about specific vendors, products, or technical issues. In some ways, this is a companion piece to my prior post about data warehouse appliance myths and realities.

*And that’s just the tabular/alphanumeric guys. Add in text search and you run the total a lot higher.

Numerous data warehouse specialists offer traditional row-based relational DBMS architectures, but optimize them for analytic workloads. These include Teradata, Netezza, DATAllegro, Greenplum, Dataupia, and SAS. All of those except SAS are wholly or primarily vendors of MPP/shared-nothing data warehouse appliances. EDIT: See the comment thread for a correction re Kognitio.

Numerous data warehouse specialists offer column-based relational DBMS architectures. These include Sybase (with the Sybase IQ product, originally from Expressway), Vertica, ParAccel, Infobright, Kognitio (formerly White Cross), and Sand. Read more

November 12, 2007

An interesting claim regarding BI openness

Analyst conference calls about merger announcements are generally pretty boring. Indeed, the companies involved tend to feel they are legally barred from saying anything interesting, by mandate of both the antitrust regulators and the SEC.

Still, such calls are joyful events, full of strategic happy talk. If one is really lucky, there may be a virtuoso tap-dancing exhibition as well. On today’s IBM/Cognos call, Cognos CEO Rob Ashe was asked whether he thought Cognos’ independence or lack thereof was as important today as he said it was after SAP announced its BOBJ takeover. Without missing a beat, he responded that there were two kinds of openness:

  1. Database openness (not important)
  2. ERP/business process openness (indeed important)

Hmm. I’m not so sure I agree. To begin with, there aren’t just two major points of potential integration. There’s also a whole lot of middleware: obviously data integration, but also app servers, portals, and query execution acceleration as well. Read more

October 29, 2007

ParAccel opens the kimono slightly

Please do not rely on the parts of this post that draw a distinction between in-memory and disk-based operation. See our February 18, 2008 post about ParAccel instead. It turns out that communication with ParAccel was yet worse than I had realized.

Officially launched today at the TDWI conference, ParAccel is out to compete with Netezza. Right out of the chute, ParAccel may have surpassed Netezza in at least one area: pointlessly annoying secrecy. (In other regards I love them dearly, but that paranoia can be a real pain.) As best I can remember, here are some things about ParAccel that I both am allowed to say and find interesting:

Read more
