Vertica Systems
Analysis of columnar data warehouse DBMS vendor Vertica Systems.
Eight kinds of analytic database (Part 2)
In Part 1 of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I’ll cover four more kinds of analytic database. For the most part these are even newer, and the match between use case and product short list is even less clear.
Eight kinds of analytic database (Part 1)
Analytic data management technology has blossomed, leading to many questions along the lines of “So which products should I use for which category of problem?” The old EDW/data mart dichotomy is hopelessly outdated for that purpose, and adding a third category for “big data” is little help.
Let’s try eight categories instead. While no categorization is ever perfect, each of these has at least some degree of technical homogeneity. Figuring out which types of analytic database you have or need (and in most cases you’ll need several) is a great early step in your analytic technology planning.
The Vertica story (with soundbites!)
I’ve blogged separately that:
- Vertica has a bunch of customers, including seven with 1 or more petabytes of data each.
- Vertica has progressed down the analytic platform path, with Monday’s release of Vertica 5.0.
And of course you know:
- Vertica (the product) is columnar, MPP, and fast.*
- Vertica (the company) was recently acquired by HP.**
Vertica as an analytic platform
Vertica 5.0 is coming out today, and delivering the down payment on Vertica’s analytic platform strategy. In Vertica lingo, there’s now a Vertica SDK (Software Development Kit), featuring Vertica UDT(F)s* (User-Defined Transform Functions).
Temporal data, time series, and imprecise predicates
I’ve been confused about temporal data management for a while, because there are several different things going on.
- Date arithmetic. This of course has been around for a very long — er, for a very long time.
- Time-series-aware compression. This has been around for quite a while too.
- “Time travel”/snapshotting — preserving the state of the database at previous points in time. This is a matter of exposing (and not throwing away) the information you capture via MVCC (Multi-Version Concurrency Control) and/or append-only updates (as opposed to update-in-place). Those update strategies are increasingly popular for pretty much anything except update-intensive OLTP (OnLine Transaction Processing) DBMS, so time-travel/snapshotting is an achievable feature for most vendors.
- Bitemporal data access. This occurs when a fact has both a transaction timestamp and a separate validity duration. A Wikipedia article seems to cover the subject pretty well, and I touched on Teradata’s bitemporal plans back in 2009.
- Time series SQL extensions. Vertica explained its version of these to me a few days ago. I imagine Sybase IQ and other serious financial-trading market players have similar features.
In essence, the point of time series/event series SQL functionality is to do SQL against incomplete, imprecise, or derived data.*
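To make "imprecise predicate" a bit more concrete, here is a minimal sketch of one common flavor, the "as-of" match: instead of requiring two timestamps to be equal, each event is paired with the most recent reference row at or before it. This is plain Python rather than any vendor's SQL syntax, and the function name `as_of_join` and the sample trade/quote data are hypothetical, chosen only for illustration. It assumes the reference rows are already sorted by timestamp.

```python
from bisect import bisect_right


def as_of_join(events, reference):
    """Pair each (ts, payload) event with the latest reference value
    whose timestamp is <= ts, or None if no such row exists.

    Assumes `reference` is a list of (ts, value) sorted by ts.
    """
    ref_times = [t for t, _ in reference]
    joined = []
    for ts, payload in events:
        # Index of the first reference timestamp strictly greater than ts.
        i = bisect_right(ref_times, ts)
        match = reference[i - 1][1] if i > 0 else None
        joined.append((ts, payload, match))
    return joined


# Hypothetical sample data: quotes and trades arrive at different times.
quotes = [(1, 10.0), (5, 10.5), (9, 10.2)]
trades = [(0, "t0"), (5, "t1"), (7, "t2")]
result = as_of_join(trades, quotes)
```

An exact-equality join would match only the trade at time 5. The as-of version also gives the trade at time 7 the quote from time 5, and reports no match for the trade that precedes every quote.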
Columnar DBMS vendor customer metrics
Last April, I asked some columnar DBMS vendors to share customer metrics. They answered, but it took until now to iron out a couple of details. Overall, the answers are pretty impressive.
Attensity update
I talked with Michelle de Haaff and Ian Hersey of Attensity back in February. We covered a lot of ground, so let’s start with a very high-level view.
- Two years ago, Attensity merged with two other companies in somewhat related businesses, thus expanding 4X or so in size.
- Due to the merger, Attensity now has two core lines of business:
  - Text analytics.
  - Driving actions, such as call center or social media response, based on text analytics.
- The combined Attensity is part American, part German.
- Attensity’s German part compels it to do some public financial reporting. Attensity will do $50-60 million in 2011 revenue.
- Attensity crunches text in 17 languages. English is preeminent. #2 is — you guessed it! — German.
- A big part of Attensity’s business (or at least of its value proposition) is analyzing the text in social media. Attensity boasts coverage of 75 million social media sources, such as blogs, forums, or review sites.
The four most interesting technical points were probably:
- Attensity has changed how it does exhaustive extraction. I’m having some trouble writing that part up, so for now I’ll just refer you to Attensity’s own description of the new way of doing things.
- Attensity has development work underway meant to address some of the problems in text analytics/other analytics integration. I don’t feel I got enough detail to want to talk about that yet.
- Attensity runs its own data centers, with approximately 60 Hadoop/HBase nodes and 30 nodes of Apache Solr (open source text search). More on that below.
- Attensity now OEMs Vertica. More on that below too.
Updating our vendor client disclosures
Edit: This disclosure has been superseded by a March, 2012 version.
From time to time, I disclose our vendor client lists. Another iteration is below. To be clear:
- This is a list of Monash Advantage members.
- All our vendor clients are Monash Advantage members, unless …
- … we work with them primarily in their capacity as technology users. (A large fraction of our user clients happen to be SaaS vendors.)
- We do not usually disclose our user clients.
- We do not usually disclose our venture capital clients, nor those who invest in publicly-traded securities.
- Included in the list below are two expired Monash Advantage members who haven’t said they will renew, as mentioned in my recent post on analyst bias. (You can probably imagine a couple of reasons for that obfuscation.)
With that said, our vendor client disclosures at this time are:
- Aster Data
- Cloudera
- CodeFutures/dbShards
- Couchbase
- EMC/Greenplum
- Endeca
- IBM/Netezza
- Infobright
- Intel
- MarkLogic
- ParAccel
- QlikTech
- salesforce.com/database.com
- SAND Technology
- SAP/Sybase
- Schooner Information Technology
- Skytide
- Splunk
- Teradata
- Vertica
Now we know why Vertica has been so weirdly evasive
Communicating with Vertica has been tricky recently. But HP has now announced that it is buying Vertica, which pretty much forces me to comment about Vertica.
So I’ll indulge in a little bit of explanation as to what I know about Vertica, whether for publication or under NDA. My analysis of the HP/Vertica combination, and expectations for same, will go into another post.
Comments on the 2011 Forrester Wave for Enterprise Data Warehouse Platforms
The Forrester Wave: Enterprise Data Warehouse Platforms, Q1 2011 is now out,* hot on the heels of the Gartner Magic Quadrant. Unfortunately, this particular Forrester Wave is riddled with inaccuracy.
