ParAccel
Analysis of columnar data warehouse DBMS vendor ParAccel.
There sure seem to be a lot of inaccuracies on ParAccel’s website
In what is actually an interesting post on database compression, ParAccel CTO Barry Zane threw in:
Anyone who has met with us knows ParAccel shies away from hype.
But like many things ParAccel says, that is not true.
The latest whoppers came in the form of several customers ParAccel listed on its website who hadn’t actually bought ParAccel’s DBMS, nor even decided to do so. It is fairly common to claim a customer win, then retract the claim due to lack of permission to disclose. But that’s not what happened in these cases. Based on emails helpfully shared by a ParAccel competitor competing in some of those accounts, it seems clear that ParAccel actually posted fabricated claims of customer wins. Read more
| Categories: Columnar database management, Data warehousing, Database compression, Market share, ParAccel, Telecommunications | 21 Comments |
Facts and rumors
- Vertica is putting out a press release today touting its 100th customer, and talking of triple digit growth last year.
- Multiple sources have told me that the DATAllegro system is being thrown out of Dell, so evidently Dell is telling this to one and all. If that goes through, this would presumably leave TEOCO as DATAllegro’s single happy customer. (I haven’t checked with Microsoft for its view.)
- A rumor has it that InfiniBand technology vendor Voltaire, Ltd. privately claims triple-digit sales of switches for Exadata 1 (I think that would be one switch per Exadata installation, not per rack). Based just on a quick glance, this is far from confirmed by Voltaire’s earnings conference call transcripts or SEC filings. However, the most recent transcript does seem to indicate Voltaire got multiple Exadata deals in the telecommunications sector, and suggests some Exadata penetration in other sectors as well.
- I was told of a classified-agency user that has >1 petabyte of data on Exadata 1 and 600 terabytes or so on Netezza. My not-obviously-biased source says the agency is distinctly happier with Netezza than Exadata.
- Like ParAccel, Oracle just got dinged for TPC-related misbehavior.
- Rumor has it that Sun has no intention of helping ParAccel rerun its withdrawn TPC-H benchmark.
- ParAccel has withdrawn from its home page the claim to be the “CERTIFIED” price-performance leader. This seems to confirm that the claim was a reference to the TPC-H. In my opinion, that was a gross misrepresentation of what the TPC-H shows.
Progress in figuring out what ParAccel is doing
(Oops: Thought I’d posted this before I went out for the afternoon …)
Barry Zane of ParAccel has (finally!) started a blog. Barry’s first post, probably in connection with ParAccel’s recent TPC-H submission and subsequent brouhaha, consisted mainly of metaphor plus very elementary and well-known arguments for column stores. Barry’s second post, however, was in direct response to Daniel Abadi’s speculation about ParAccel’s architecture. That post also promises a follow-up addressing the TPC-H in a more substantive way.
Barry’s points include:
- ParAccel never used the row-oriented Postgres execution engine. This is contrary to Daniel’s speculation.
- ParAccel previously used an adaptation of the Postgres cost-based optimizer, but has now written a new one from scratch.
- ParAccel has designed its optimizer to handle lots and lots of joins. One reason Barry offers is that ParAccel wants to run customers’ old schemas unaltered, whether or not those are really optimal for the ParAccel DBMS. That approach is somewhat in contrast to Vertica, which originally focused entirely on star schemas. And it goes well with ParAccel’s interest in appealing to customers who at least think they want to run ParAccel in Oracle or SQL Server emulation mode.
Also in the post, Barry:
- Makes an extremely silly marketing exaggeration by referring to “the only other vendor that was able to run the 30TB TPC-H” (emphasis mine).
- Makes the more excusable marketing exaggeration “Publishing the benchmark with unmatched performance is simply one way to demonstrate robustness and flexibility. Nothing more, nothing less.”
- Makes the very clear marketing claim “For customers, the real test will be their own bake-offs, where our performance has never been beaten.” (Emphasis mine.) That last one directly contradicts what I’ve been told by at least two ParAccel competitors, so I’ll be curious to see what they come up with to substantiate their version of the story.
Anyhow, it’s great to see ParAccel retreating from its obsessive secrecy, which in my opinion has been even worse than Netezza’s used to be.
| Categories: Columnar database management, Data warehousing, ParAccel | 2 Comments |
Daniel Abadi has a theory about ParAccel
When I was at SIGMOD last week, ParAccel and its SIGMOD talk were mentioned several times, always in puzzled and at least slightly unflattering terms. (Typical comment: “Why did they present a paper about that? We were doing the same thing in our company years ago.”) That doesn’t prove much per se, since most of the mentions were by competitors and/or Vertica-affiliated academics, and since my own unflattering ParAccel-related comments were rather fresh at the time.
But now Daniel Abadi has done a brilliant, detailed, speculative analysis of ParAccel’s publications. Here’s the meat, emphasis mine: Read more
| Categories: Benchmarks and POCs, Columnar database management, Data warehousing, ParAccel, Theory and architecture | 26 Comments |
ParAccel pricing
As I noted in connection with ParAccel’s recent TPC-H filing, I think the whole exercise is basically an expensive joke. But one slightly useful spin-off is that ParAccel disclosed pricing. Specifically, ParAccel’s stated price in the disclosure document is:
- $100,000/TB license fee (user data). That’s like Vertica, although I don’t know whether ParAccel emulates Vertica’s policy of making test and development licenses free.
- 57% quantity discount at 30 terabytes. That’s not surprising.
- 1% annual maintenance fee (applied to the discounted price). That’s astounding.
Last year ParAccel quoted prices of $100,000/TB or $50,000/server. The latter figure would seem to have led to lower numbers on the benchmark configuration, so perhaps it’s no longer an option on ParAccel’s price list.
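For readers who want the arithmetic behind those figures, here is a minimal sketch. It assumes the 57% discount applies to the whole 30-terabyte license and that maintenance is 1% of the discounted total per year, which is how the bullets above read. The 43-node count is taken from the TPC-H benchmark configuration described in the TPC-H post, and the variable names are purely illustrative. How, or whether, a quantity discount would have applied to per-server pricing is not stated, so that figure is shown at list price only.

```python
# Figures as quoted from ParAccel's TPC-H disclosure (see the bullets above).
LIST_PRICE_PER_TB = 100_000      # USD per terabyte of user data
QUANTITY_DISCOUNT_PCT = 57       # percent off at 30 terabytes
MAINTENANCE_PCT = 1              # percent per year, on the discounted price
USER_DATA_TB = 30

# Last year's alternative quote, applied to the benchmark's node count.
PRICE_PER_SERVER = 50_000        # USD per server
BENCHMARK_NODES = 43             # assumption: node count from the TPC-H post

# Integer arithmetic keeps the dollar amounts exact.
per_tb_list = USER_DATA_TB * LIST_PRICE_PER_TB
per_tb_discounted = per_tb_list * (100 - QUANTITY_DISCOUNT_PCT) // 100
annual_maintenance = per_tb_discounted * MAINTENANCE_PCT // 100
per_server_list = BENCHMARK_NODES * PRICE_PER_SERVER

print(f"Per-TB license at list:       ${per_tb_list:,}")
print(f"Per-TB license after discount: ${per_tb_discounted:,}")
print(f"Annual maintenance:            ${annual_maintenance:,}")
print(f"Per-server license at list:    ${per_server_list:,}")
```

On these assumptions, the per-server list figure comes in below the per-TB list figure for the benchmark configuration.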
| Categories: Benchmarks and POCs, Data warehousing, ParAccel, Pricing | 2 Comments |
The TPC-H benchmark is a blight upon the industry
ParAccel has released a 30,000-gigabyte TPC-H benchmark, and no less a sage than Merv Adrian paid attention. Now, the TPCs may have had some use in the 1990s. Indeed, Merv was my analyst relations contact for a visit to my clients at Sybase around the time (1996 or so) I was advising Sybase on how to market against its poor benchmark results. But TPCs are worthless today.
It’s not just that TPCs are highly tuned (ParAccel’s claim of “load-and-go” is laughable. Edit: Looking at Appendix A of the full disclosure report, maybe it’s more justified than I thought.) It’s also not just that different analytic database management products perform very differently on different workloads, making the TPC-H not much of an indicator of anything real-life. The biggest problem is: Most TPC benchmarks are run on absurdly unrealistic hardware configurations.
For example, if you look at some details, the ParAccel 30-terabyte benchmark ran on 43 nodes, each with 64 gigabytes of RAM and 24 terabytes of disk. That’s 961,124.9 gigabytes of disk, officially, for a 32:1 disk/data ratio. By way of contrast, real-life analytic DBMS with good compression often have disk/data ratios of well under 1:1.
Meanwhile, the RAM:data ratio is around 1:11. It’s clear that ParAccel’s early TPC-H benchmarks ran entirely in RAM; indeed, ParAccel even admits that. And so I conjecture that ParAccel’s latest TPC-H benchmark ran (almost) entirely in RAM as well. Once again, this would illustrate that the TPC-H is irrelevant to judging an analytic DBMS’ real world performance.
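The two ratios above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in this post; the variable names are illustrative. One assumption is labeled in the code: the official 961,124.9-gigabyte figure happens to match 43 × 24 decimal terabytes re-expressed in binary gigabytes, which is an observation about the numbers rather than something the disclosure is quoted as saying.

```python
# Benchmark configuration as quoted in the post.
NODES = 43
RAM_GB_PER_NODE = 64
DISK_TB_PER_NODE = 24
DATA_GB = 30_000                 # 30-terabyte TPC-H scale factor
OFFICIAL_DISK_GB = 961_124.9     # disk figure quoted as "official"

# Disk: raw capacity and the disk/data ratio.
raw_disk_tb = NODES * DISK_TB_PER_NODE
disk_to_data = OFFICIAL_DISK_GB / DATA_GB

# Assumption (not stated in the source): the official figure looks like
# decimal terabytes converted to binary gigabytes (2**30 bytes each).
raw_disk_binary_gb = raw_disk_tb * 10**12 / 2**30

# RAM: total memory and how many gigabytes of data sit behind each gigabyte of RAM.
total_ram_gb = NODES * RAM_GB_PER_NODE
data_per_ram_gb = DATA_GB / total_ram_gb

print(f"Raw disk: {raw_disk_tb} TB (~{raw_disk_binary_gb:,.1f} binary GB)")
print(f"Disk/data ratio: {disk_to_data:.1f}:1")
print(f"Total RAM: {total_ram_gb} GB")
print(f"RAM:data ratio: 1:{data_per_ram_gb:.1f}")
```

With these inputs the disk/data ratio works out to roughly 32:1 and the RAM:data ratio to roughly 1:11, matching the figures in the text.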
More generally — I would not advise anybody to consider ParAccel’s product, for any use, except after a proof-of-concept in which ParAccel was not given the time and opportunity to perform extensive off-site tuning. I tend to feel that way about all analytic DBMS, but it’s a particular concern in the case of ParAccel.
| Categories: Analytic technologies, Benchmarks and POCs, Buying processes, Columnar database management, Data warehousing, Database compression, ParAccel | 88 Comments |
DBMS transparency layers never seem to sell well
A DBMS transparency layer, roughly speaking, is software that makes things that are written for one brand of database management system run unaltered on another.* These never seem to sell well. ANTs has failed in a couple of product strategies. EnterpriseDB’s Oracle compatibility only seems to have netted it a few sales, and only a small fraction of its total business. ParAccel’s and Dataupia’s transparency strategies have produced even less.
*The looseness in that definition highlights a key reason these technologies don’t sell well — it’s hard to be sure that what you’re buying will do a good job of running your particular apps.
This subject comes to mind for two reasons. One is that IBM seems to have licensed EnterpriseDB’s Oracle transparency layer for DB2. The other is that a natural upgrade path from MySQL to Oracle might be a MySQL transparency layer on top of an Oracle base.
| Categories: ANTs Software, Dataupia, Emulation, transparency, portability, EnterpriseDB and Postgres Plus, IBM and DB2, Market share, MySQL, Oracle, ParAccel | 9 Comments |
Lots of analytic DBMS vendors are hiring
After writing about a Twitter jobs page, it occurred to me to check out whether analytic DBMS vendors are still hiring. Based on the Careers pages on their websites, I determined that Aster, Greenplum, Kickfire, and ParAccel all evidently are, in various mixes of (mainly) technical and field positions. At that point I got bored and stopped.
I didn’t choose those vendors entirely at random. If I had to name three vendors who are said to have had small layoffs at some point over the past few quarters, it would be ParAccel, Greenplum, and Kickfire. So if even they are hiring, the analytic DBMS sector is still pretty healthy … or at least thinks it is.
| Categories: Aster Data, Data warehousing, Greenplum, Kickfire, ParAccel | 5 Comments |
Database implications if IBM acquires Sun
Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion today (some links below). Here are some quick thoughts around the subject of how the IBM/Sun deal — if it happens — might affect the database management system industry. Read more
Draft slides on how to select an analytic DBMS
I need to finalize an already-too-long slide deck on how to select an analytic DBMS by late Thursday night. Anybody see something I’m overlooking, or just plain got wrong?
Edit: The slides have now been finalized.
