Vertica Systems
Analysis of columnar data warehouse DBMS vendor Vertica Systems.
More on temp space, compression, and “random” I/O
My PhD was in a probability-related area of mathematics (game theory), so I tend to squirm when something is described as “random” that clearly is not. That said, a comment by Shilpa Lawande on our recent Flash/temp space discussion suggests the following way of framing a key point:
- You really, really want to have multiple data streams coming out of temp space, as close to simultaneously as possible.
- The storage performance characteristics of such a workload are more reminiscent of “random” than “sequential” I/O.
If everybody else is cool with it too, I can live with that.
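To make the "multiple streams look random" point concrete, here is a minimal sketch. It assumes, purely for illustration, that temp space holds several sorted runs being merged back together; the function name `merge_runs` and the sample data are hypothetical, not anything Vertica or IBM described. Each run is read strictly in order, yet consecutive reads keep landing in different runs.

```python
import heapq

def merge_runs(runs):
    """Merge sorted runs and log which (run, offset) each read touches."""
    heap = []
    access_log = []
    # Prime the heap with the first element of every non-empty run.
    for run_id, run in enumerate(runs):
        if run:
            heapq.heappush(heap, (run[0], run_id, 0))
            access_log.append((run_id, 0))
    merged = []
    while heap:
        value, run_id, offset = heapq.heappop(heap)
        merged.append(value)
        nxt = offset + 1
        if nxt < len(runs[run_id]):
            # The next read comes from whichever run just gave up a value.
            heapq.heappush(heap, (runs[run_id][nxt], run_id, nxt))
            access_log.append((run_id, nxt))
    return merged, access_log

# Hypothetical spilled runs, each sorted on its own.
runs = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
merged, access_log = merge_runs(runs)

# Count how often two consecutive reads hit different runs.
run_switches = sum(1 for a, b in zip(access_log, access_log[1:]) if a[0] != b[0])
print(merged)
print(access_log)
print(run_switches)
```

With this deliberately interleaved data, every consecutive pair of reads switches runs, which is the kind of pattern a storage device sees as closer to "random" than "sequential."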
Meanwhile, I talked again with Tim Vincent of IBM this afternoon. Tim endorsed the temp space/Flash fit, but with a different emphasis, which upon review I find I don’t really understand. The idea is:
- Analytic DBMS processing generally stresses reads over writes.
- Temp space is an exception — read and write use of temp space is pretty balanced. (You spool data out once, you read it back in once, and that’s the end of that; next time it will be overwritten.)
My problem with that is: Flash typically has lower write than read IOPS (I/O operations per second), so being (relatively) write-intensive would, to a first approximation, seem if anything to disfavor a workload for Flash.
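The first-approximation arithmetic behind that objection can be sketched as follows. The device figures are hypothetical placeholders, not measurements of any real product, and the model ignores queueing, block sizes, and caching; it simply blends read and write rates by the share of operations of each kind.

```python
def blended_iops(read_fraction, read_iops, write_iops):
    """Effective IOPS when a share of operations are reads and the rest writes.

    Average time per operation is the weighted sum of the per-read and
    per-write times, so the blended rate is a weighted harmonic mean.
    """
    write_fraction = 1.0 - read_fraction
    seconds_per_op = read_fraction / read_iops + write_fraction / write_iops
    return 1.0 / seconds_per_op

# Hypothetical device: writes run at a quarter of the read rate.
READ_IOPS = 100_000
WRITE_IOPS = 25_000

read_heavy = blended_iops(0.9, READ_IOPS, WRITE_IOPS)  # typical analytic reads
balanced = blended_iops(0.5, READ_IOPS, WRITE_IOPS)    # temp-space style mix
print(round(read_heavy), round(balanced))
```

Under these assumed numbers the balanced mix gets roughly half the effective rate of the read-heavy one, which is why a write-intensive workload looks, on this crude measure, like a worse fit for Flash rather than a better one.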
On the plus side, I was reminded of something I should have noted when I wrote about DB2 compression before:
Much like Vertica, DB2 operates on compressed data all the way through, including in temp space.
| Categories: Data warehousing, Database compression, IBM and DB2, Vertica Systems | 5 Comments |
Vertica’s innovative architecture for Flash, plus more about temp space than you perhaps wanted to know
Vertica is announcing:
- Technology it already has released*, but has not published any reference architectures for
- A Barney partnership**
In other words, Vertica has succumbed to the common delusion that it’s a good idea to put out half-baked press releases the week of TDWI conferences. But if we look past that kind of all-too-common nonsense, Vertica is highlighting an interesting technical story, about how the analytic DBMS industry can exploit solid-state memory technology.
*Upgrades to Vertica FlexStore to handle Flash memory, actually released as part of Vertica 4.0
**With Fusion-io
To set the context, let’s recall a few points I’ve noted in the past:
- Solid-state memory’s price/throughput tradeoffs obviously make it the future of database storage.
- The Flash future is coming soon, in part because Flash’s propensity to wear out is overstated. This is especially true in the case of modern analytic DBMS, which tend to write to blocks all at once, and most particularly the case for append-only systems such as Vertica.
- Being able to intelligently split databases among various cost tiers of storage – e.g. Flash and disk – makes a whole lot of sense.
Taken together, those points tell us:
For optimal price/performance, analytic DBMS should support databases that run part on Flash, part on disk.
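As a rough illustration of why a part-Flash, part-disk split can pay off, here is a minimal sketch of a greedy placement policy. The segment names, sizes, and read counts are invented for the example, and the policy itself (rank by reads per gigabyte, fill Flash until it is full) is an assumption, not a description of how Vertica or any other product places data.

```python
def place_on_flash(segments, flash_capacity_gb):
    """Pick segments for Flash, hottest per gigabyte first.

    segments: list of (name, size_gb, reads_per_day) tuples.
    Returns the chosen names and the fraction of reads Flash would serve.
    """
    ranked = sorted(segments, key=lambda s: s[2] / s[1], reverse=True)
    on_flash, used_gb = [], 0.0
    for name, size_gb, _reads in ranked:
        if used_gb + size_gb <= flash_capacity_gb:
            on_flash.append(name)
            used_gb += size_gb
    total_reads = sum(s[2] for s in segments)
    flash_reads = sum(s[2] for s in segments if s[0] in on_flash)
    return on_flash, flash_reads / total_reads

# Hypothetical 2 TB database where recent data draws most of the reads.
segments = [
    ("recent", 100, 9000),
    ("last_quarter", 300, 800),
    ("archive", 1600, 200),
]
on_flash, flash_read_fraction = place_on_flash(segments, flash_capacity_gb=100)
print(on_flash, flash_read_fraction)
```

In this made-up case, Flash holding 5% of the data serves 90% of the reads. Real access patterns may be far less skewed, so the payoff depends entirely on how concentrated the hot data is.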
While all this is a future for some other analytic DBMS vendors, Vertica is shipping it today.* What’s more, three aspects of Vertica’s architecture make it particularly well-suited for hybrid Flash/disk storage, in each case for a similar reason – you can get most of the performance benefit of all-Flash for a relatively low actual investment in Flash chips: Read more
| Categories: Columnar database management, Data warehousing, Database compression, Solid-state memory, Vertica Systems | 10 Comments |
What kinds of data warehouse load latency are practical?
I took advantage of my recent conversations with Netezza and IBM to discuss what kinds of data warehouse load latency were practical. In both cases I got the impression:
- Subsecond load latency is substantially impossible. Doing that amounts to OLTP.
- 5 seconds or so is doable with aggressive investment and tuning.
- Several-minute load latency is pretty easy.
- 10-15 minute latency or longer is now very routine.
There’s generally a throughput/latency tradeoff, so if you want very low latency with good throughput, you may have to throw a lot of hardware at the problem.
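One simple way to see that tradeoff is a micro-batch model, sketched below. It assumes each load pays a fixed overhead (commit, metadata, file handling) plus a per-row cost; both figures are hypothetical, and real systems have many more moving parts.

```python
def sustainable_rows_per_sec(interval_s, overhead_s, per_row_s):
    """Rows per second a single loader can sustain at a given batch interval.

    Each interval, overhead_s is spent on fixed per-batch work and the
    remainder on rows. Shorter intervals mean lower load latency but a
    larger share of time lost to overhead.
    """
    if interval_s <= overhead_s:
        return 0.0  # the fixed overhead alone consumes the whole interval
    rows_per_batch = (interval_s - overhead_s) / per_row_s
    return rows_per_batch / interval_s

# Hypothetical costs: 2 seconds of fixed work per batch, 10 microseconds per row.
OVERHEAD_S = 2.0
PER_ROW_S = 1e-5

for interval in (5, 60, 600):
    rate = sustainable_rows_per_sec(interval, OVERHEAD_S, PER_ROW_S)
    print(interval, round(rate))
```

Under these assumptions a 5-second interval sustains about 60% of the row rate that a 10-minute interval does, so holding throughput constant at low latency means adding loaders or hardware.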
I’d expect to hear similar things from any other vendor with reasonably mature analytic DBMS technology. Low-latency load is a problem for columnar systems, but both Vertica and ParAccel designed in workarounds from the get-go. Aster Data probably didn’t meet these criteria until Version 4.0, its old “frontline” positioning notwithstanding, but I think it does now.
Related link
- Just what is your need for speed anyway?
| Categories: Analytic technologies, Aster Data, Columnar database management, Data warehousing, IBM and DB2, Netezza, ParAccel, Vertica Systems | 4 Comments |
Quick reactions to SAP acquiring Sybase
SAP is acquiring Sybase. On the conference call SAP said Sybase would be run as a separate division of SAP (no surprise). Most of the focus was on Sybase’s mobile technology, which is forecast at >$400 million in 2010 revenues (which would be 30%ish of the total). My quick reactions include: Read more
Vertica update
Last month, Vertica’s CEO Ralph Breslauer quit,* and Vertica made it sound like there would be a new CEO late in April. And indeed, as of April 29, there was. He’s a guy I’ve never heard of before named Chris Lynch, apparently quite the sales machine builder. The most substance I’ve found is a pair of Mass High Tech articles — the latter exceedingly typo-ridden — to the general effect that:
- Vertica plans to build a massive, world-conquering sales force.
- If Vertica dips back into negative cash flow to do that and has to raise more venture capital, so be it.
- “Triple-digit” revenue growth is expected for this year.
| Categories: Analytic technologies, Columnar database management, Data warehousing, Games and virtual worlds, Market share, Specific users, Vertica Systems, Web analytics | 1 Comment |
Story of an analytic DBMS evaluation
One of our readers was kind enough to walk me through his analytic DBMS evaluation process. The story is:
- The X Company (XCo) has a <1 TB database.
- 100s of XCo’s customers log in at once to run reports. 50-200 concurrent queries is a good target number.
- XCo had been “suffering” with Oracle and wanted to upgrade.
- XCo didn’t have a lot of money to spend. Netezza pulled out of the sales cycle early due to budget (and this was recently enough that Netezza Skimmer could have been bid).
- Greenplum didn’t offer any references that approached the desired number of concurrent users.
- Ultimately the evaluation came down to Vertica and ParAccel.
- Vertica won.
Notes on the Vertica vs. ParAccel selection include: Read more
| Categories: Analytic technologies, Benchmarks and POCs, Buying processes, Data warehousing, Greenplum, Netezza, Oracle, ParAccel, Vertica Systems | 7 Comments |
Vertica update
I caught up with Jerry Held (Chairman) and Dave Menninger (VP Marketing) of Vertica for a chat yesterday. The immediate reason for the call was that a competitor had tipped me off to the departure of Vertica CEO Ralph Breslauer, which of course raises a host of questions. Highlights of the call included:
- Vertica had a “killer” Q4 and is doing very well in Q1 again.
- Vertica burned hardly any cash last year; i.e., it was close to cash-flow neutral in 2009.
- Vertica is hiring aggressively, e.g., in sales.
- Vertica is well down the path with several CEO candidates whom Jerry regards as outstanding. He is hopeful there will be a new CEO in April. (But I bet that would be late April, given what Jerry mentioned about his own travel plans.)
- Absent a full-time CEO, Jerry and Andy Palmer are spending a lot more time with Vertica.
- One Vertica customer is approaching a petabyte of user data. The last time Vertica had checked, that customer had been more in the ¼ petabyte range.
- Other multi-hundred terabyte Vertica databases were mentioned, including one where Vertica claims to have beaten Teradata and perhaps other competitors in a head-to-head competition (it sounds like that one’s too recent to be deployed yet).
- Vertica sees Aster and Greenplum competitively more often than it sees ParAccel.
- Vertica sees Sybase IQ competitively a lot in financial services (in new-name accounts for Sybase as well as where some kind of Sybase DBMS is an incumbent), and more occasionally in other sectors.
NDA parts of the conversation also gave me the impression that Vertica is moving forward just as eagerly as its peers. I.e., I didn’t uncover any reason to think that Ralph’s departure is a sign of trouble, of the company being shopped, etc. Read more
| Categories: Analytic technologies, Data warehousing, Investment research and trading, Market share, ParAccel, Petabyte-scale data management, Sybase, Vertica Systems | 6 Comments |
February 2010 data warehouse DBMS news roundup
February is usually a busy month for data warehouse DBMS product releases, product announcements, and other real or contrived data warehouse DBMS news, and it can get pretty confusing trying to keep those categories of “news” apart.* This year is no exception, although several vendors – including Teradata and Netezza – are taking “rolling thunder” approaches, doing some of their announcements this month while holding others back for March or April.
*I probably have it worse than most people in that regard, because my clients run tentative feature lists and announcement schedules by me well in advance, which may get changed multiple times before the final dates roll around. I also occasionally miss some detail, if it wasn’t in a pre-briefing but gets added at the end.
Anyhow, the three big themes of this month’s announcements are probably:
- Integrating different kinds of analytic processing into databases and DBMS.
- Taking advantage of hardware advances.
- Playing catchup in areas where small vendors’ products weren’t mature yet.
| Categories: Analytic technologies, Aster Data, Data warehousing, Netezza, Teradata, Vertica Systems | Leave a Comment |
Vertica 4.0
Vertica briefed me last month on its forthcoming Vertica 4.0 release. I think it’s fair to say that Vertica 4.0 is mainly a cleanup/catchup release, washing away some of the tradeoffs Vertica had previously made in support of its innovative DBMS architecture.
For starters, there’s a lot of new analytic functionality. This isn’t Aster/Netezza-style ambitious. Rather, there’s a lot more SQL-99 functionality, plus some time series extensions of the sort that financial services firms – an important market for Vertica – need and love. Vertica did suggest a couple of these time series extensions are innovative, but I haven’t yet gotten detail about those.
Perhaps even more important, Vertica is cleaning up a lot of its previous SQL optimization and execution weirdnesses. In no particular order, I was told: Read more
| Categories: Analytic technologies, Columnar database management, Data warehousing, Vertica Systems | 4 Comments |
Intelligent Enterprise’s Editors’/Editor’s Choice list for 2010
As he has before, Intelligent Enterprise Editor Doug Henschen
- Personally selected annual lists of 12 “Most influential” companies and 36 “Companies to watch” in analytics- and database-related sectors.
- Made it clear that these are his personal selections.
- Nonetheless has called it an Editors’ Choice list, rather than Editor’s Choice.
(Actually, he’s really called it an “award.”)
