Analysis of columnar data warehouse DBMS vendor ParAccel, maker of PADB (ParAccel Analytic DataBase).
- Subsecond load latency is substantially impossible. Doing that amounts to OLTP.
- 5 seconds or so is doable with aggressive investment and tuning.
- Several-minute load latency is pretty easy.
- 10-15 minute latency or longer is now very routine.
There’s generally a throughput/latency tradeoff, so if you want very low latency with good throughput, you may have to throw a lot of hardware at the problem.
I’d expect to hear similar things from any other vendor with reasonably mature analytic DBMS technology. Low-latency load is a problem for columnar systems, but both Vertica and ParAccel designed in workarounds from the get-go. Aster Data probably didn’t meet these criteria until Version 4.0, its old “frontline” positioning notwithstanding, but I think it does now.
Just what is your need for speed anyway?
|Categories: Analytic technologies, Aster Data, Columnar database management, Data warehousing, IBM and DB2, Netezza, ParAccel, Vertica Systems||4 Comments|
When you are selecting an analytic DBMS or appliance, most of the evaluation boils down to two questions:
- How quickly and cost-effectively does it execute SQL?
- What analytic functionality, SQL or otherwise, does it do a good job of executing?
And so, in undertaking such a selection, you need to start by addressing three issues:
- What does “speed” mean to you?
- What does “cost” mean to you?
- What analytic functionality do you need anyway?
SAP is acquiring Sybase. On the conference call SAP said Sybase would be run as a separate division of SAP (no surprise). Most of the focus was on Sybase’s mobile technology, which is forecast at >$400 million in 2010 revenues (which would be 30%ish of the total). My quick reactions include: Read more
One of our readers was kind enough to walk me through his analytic DBMS evaluation process. The story is:
- The X Company (XCo) has a <1 TB database.
- 100s of XCo’s customers log in at once to run reports. 50-200 concurrent queries is a good target number.
- XCo had been “suffering” with Oracle and wanted to upgrade.
- XCo didn’t have a lot of money to spend. Netezza pulled out of the sales cycle early due to budget (and this was recently enough that Netezza Skimmer could have been bid).
- Greenplum didn’t offer any references that approached the desired number of concurrent users.
- Ultimately the evaluation came down to Vertica and ParAccel.
- Vertica won.
Notes on the Vertica vs. ParAccel selection include: Read more
|Categories: Analytic technologies, Benchmarks and POCs, Buying processes, Data warehousing, Greenplum, Netezza, Oracle, ParAccel, Vertica Systems||7 Comments|
I caught up with Jerry Held (Chairman) and Dave Menninger (VP Marketing) of Vertica for a chat yesterday. The immediate reason for the call was that a competitor had tipped me off to the departure of Vertica CEO Ralph Breslauer, which of course raises a host of questions. Highlights of the call included:
- Vertica had a “killer” Q4 and is doing very well in Q1 again.
- Vertica burned hardly any cash last year; i.e., it was close to cash-flow neutral in 2009.
- Vertica is hiring aggressively, e.g., in sales.
- Vertica is well down the path with several CEO candidates who Jerry regards as outstanding. He is hopeful there will be a new CEO in April. (But I bet that would be late April, given what Jerry mentioned about his own travel plans.)
- Absent a full-time CEO, Jerry and Andy Palmer are spending a lot more time with Vertica.
- One Vertica customer is approaching a petabyte of user data. The last time Vertica had checked, that customer had been more in the ¼ petabyte range.
- Other multi-hundred terabyte Vertica databases were mentioned, including one where Vertica claims to have beaten Teradata and perhaps other competitors in a head-to-head competition (it sounds like that one’s too recent to be deployed yet).
- Vertica sees Aster and Greenplum competitively more often than it sees ParAccel.
- Vertica sees Sybase IQ competitively a lot in financial services (in new-name accounts for Sybase as well as where some kind of Sybase DBMS is an incumbent), and more occasionally in other sectors.
NDA parts of the conversation also gave me the impression that Vertica is moving forward just as eagerly as its peers. I.e., I didn’t uncover any reason to think that Ralph’s departure is a sign of trouble, of the company being shopped, etc. Read more
|Categories: Analytic technologies, Data warehousing, Investment research and trading, Market share and customer counts, ParAccel, Petabyte-scale data management, Sybase, Vertica Systems||6 Comments|
In what is actually an interesting post on database compression, ParAccel CTO Barry Zane threw in
Anyone who has met with us knows ParAccel shies away from hype.
But like many things ParAccel says, that is not true.
Edit (October, 2010): Like other posts I’ve linked to from Barry Zane’s blog, that one seems to be gone, with the URL redirecting elsewhere on ParAccel’s website.
The latest whoppers came in the form of several customers ParAccel listed on its website who hadn’t actually bought ParAccel’s DBMS, nor even decided to do so. It is fairly common to claim a customer win, then retract the claim due to lack of permission to disclose. But that’s not what happened in these cases. Based on emails helpfully shared by a ParAccel competitor competing in some of those accounts, it seems clear that ParAccel actually posted fabricated claims of customer wins. Read more
|Categories: Columnar database management, Data warehousing, Database compression, Market share and customer counts, ParAccel, Telecommunications||24 Comments|
- Vertica is putting out a press release today touting its 100th customer, and talking of triple digit growth last year.
- Multiple sources have told me that the DATAllegro system is being thrown out of Dell, so evidently Dell is telling this to one and all. If that goes through, this would presumably leave TEOCO as DATAllegro’s single happy customer. (I haven’t checked with Microsoft for its view.)
- A rumor has it that InfiniBand technology vendor Voltaire, Ltd. privately claims triple-digit sales of switches for Exadata 1 (I think that would be one switch per Exadata installation, not per rack). Based just on a quick glance, this is far from confirmed by Voltaire’s earnings conference call transcripts or SEC filings. However, the most recent transcript does seem to indicate Voltaire got multiple Exadata deals in the telecommunications sector, and suggests some Exadata penetration in other sectors as well.
- I was told of a classified-agency user that has >1 petabyte of data on Exadata 1 and 600 terabytes or so on Netezza. My not-obviously-biased source says the agency is distinctly happier with Netezza than Exadata.
- Like ParAccel, Oracle just got dinged for TPC-related misbehavior.
- Rumor has it that Sun has no intention of helping ParAccel rerun its withdrawn TPC-H benchmark.
- ParAccel has withdrawn the claim from its home page to be the “CERTIFIED” price-performance leader. This seems to confirm that the claim was a reference to the TPC-H. In my opinion, that was a gross misrepresentation of what the TPC-H shows.
Barry Zane of ParAccel has (finally!) started a blog. Barry’s first post, probably in connection with ParAccel’s recent TPC-H submission and subsequent brouhaha, consisted mainly of metaphor + very elementary and well-known arguments for column stores. Barry’s second post, however, was in direct response to Daniel Abadi’s speculation about ParAccel’s architecture. That post also promises a follow-up addressing the TPC-H in a more substantive way.
(Edit: As of October, 2010, those links have been redirected away from the original posts, which seem to have been taken down.)
Barry’s points include:
- ParAccel never used the row-oriented Postgres execution engine. This is contrary to Daniel’s speculation.
- ParAccel previously used an adaptation of the Postgres cost-based optimizer, but now has written a new one from scratch.
- ParAccel has designed its optimizer to handle lots and lots of joins. One reason Barry offers is that ParAccel wants to run customers’ old schemas unaltered, whether or not those are really optimal for the ParAccel DBMS. That approach is somewhat in contrast to Vertica, which originally focused entirely on star schemas. And it goes well with ParAccel’s interest in appealing to customers who at least think they want to run ParAccel in Oracle or SQL Server emulation mode.
Also in the post, Barry:
- Makes an extremely silly marketing exaggeration by referring to “the only other vendor that was able to run the 30TB TPC-H” (emphasis mine).
- Makes the more excusable marketing exaggeration “Publishing the benchmark with unmatched performance is simply one way to demonstrate robustness and flexibility. Nothing more, nothing less.”
- Makes the very clear marketing claim “For customers, the real test will be their own bake-offs, where our performance has never been beaten.” (Emphasis mine.) That last one directly contradicts what I’ve been told by at least two ParAccel competitors, so I’ll be curious to see what they come up with to substantiate their version of the story.
Anyhow, it’s great to see ParAccel retreating from its obsessive secrecy, which in my opinion has been even worse than Netezza’s used to be.
When I was at SIGMOD last week, ParAccel and its SIGMOD talk were mentioned several times, always in puzzled and at least slightly unflattering terms. (Typical comment: “Why did they present a paper about that? We were doing the same thing in our company years ago.”) That doesn’t prove much per se, since most of the mentions were by competitors and/or Vertica-affiliated academics, and since my own unflattering ParAccel-related comments were rather fresh at the time.
But now Daniel Abadi has done a brilliant, detailed, speculative analysis of ParAccel’s publications. Here’s the meat, emphasis mine: Read more
|Categories: Benchmarks and POCs, Columnar database management, Data warehousing, ParAccel, Theory and architecture||30 Comments|
As I noted in connection with ParAccel’s recent TPC-H filing, I think the whole exercise is basically an expensive joke. But one slightly useful spin-off is that ParAccel disclosed pricing. Specifically, ParAccel’s stated price in the disclosure document is:
- $100,000/TB license fee (user data). That’s like Vertica, although I don’t know whether ParAccel emulates Vertica’s policy of making test and development licenses free.
- 57% quantity discount at 30 terabytes. That’s not surprising.
- 1% annual maintenance fee (applied to the discounted price). That’s astounding.
Last year ParAccel quoted prices of $100,000/TB or $50,000/server. The latter figure would seem to have led to lower numbers on the benchmark configuration, so perhaps it’s no longer an option on ParAccel’s price list.
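To make those disclosed figures concrete, here is a back-of-the-envelope sketch of what they imply at the 30 TB benchmark size. The function name and its integer-percent parameters are hypothetical, and it assumes the 57% discount applies to the entire license fee; the disclosure only states the discount at 30 terabytes, so plugging in other sizes would be guesswork.

```python
# Hypothetical helper: the name and the integer-percent interface are
# illustrative only, not anything taken from ParAccel's price list.
def license_cost(terabytes, price_per_tb=100_000, discount_pct=57, maintenance_pct=1):
    """Return (list_price, discounted_price, annual_maintenance) in whole dollars."""
    list_price = terabytes * price_per_tb
    # Assumption: the quantity discount applies to the whole license fee.
    discounted = list_price * (100 - discount_pct) // 100
    # Maintenance is charged on the discounted price, as stated in the disclosure.
    maintenance = discounted * maintenance_pct // 100
    return list_price, discounted, maintenance


list_price, discounted, maintenance = license_cost(30)
print(f"List price:         ${list_price:,}")   # $3,000,000
print(f"After 57% discount: ${discounted:,}")   # $1,290,000
print(f"Annual maintenance: ${maintenance:,}")  # $12,900
```

On those assumptions, a 30 TB license lists at $3 million, discounts to $1.29 million, and carries only $12,900 per year in maintenance, which is why the 1% figure stands out.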