February 16th, 2008 Curt Monash
In a response to my recent five-part series on DBMS diversity, Mike Stonebraker has proposed his own taxonomy of data management technologies over on Vertica’s Database Column blog.
- OLTP DBMSs focused on fast, reliable transaction processing
- Analytic/Data Warehouse DBMSs focused on efficient load and ad-hoc query performance
- Science DBMSs — after all MatLab does not scale to disk-sized arrays
- RDF stores focused on efficiently storing semi-structured data in this format
- XML stores focused on semi-structured data in this format
- Search engines — the big players all use proprietary engines in this area
- Stream Processing Engines focused on real-time StreamSQL
- “Lean and Mean,” less-than-a-database engines focused on doing a small number of things very well (embedded databases are probably in this category)
- MapReduce and Hadoop — after all Google has enough “throw weight” to define a category
He goes on to say that each will be architected differently, except that, as he already convinced me back in July, RDF will be well managed by specialty data warehouse DBMSs. Read the rest of this entry »
Posted in Data types, Database diversity, Database theory and practice, Michael Stonebraker, Mid-range DBMS, OLTP database management, RDF and graphs, Relational database management systems | No Comments »
November 7th, 2007 Curt Monash
Vertica quietly announced an appliance bundling deal with HP and Red Hat today. That got me quickly onto the phone with Vertica’s Andy Ellicott, to discuss a few different subjects. Most interesting was the part about Vertica’s customer base, highlights of which included:
- Vertica’s claim to have “50” customers includes a bunch of unpaid licenses, many of them in academia.
- Vertica has about 15 paying customers.
- Based on conversations with mutual prospects, Vertica believes that’s more customers than DATAllegro has. (Of course, each DATAllegro sale is bigger than one of Vertica’s. Even so, I hope Vertica is wrong in its estimate, since DATAllegro told me its customer count was “double digit” quite a while ago.)
- Most Vertica customers manage over 1 terabyte of user data. A couple have bought licenses showing they intend to manage 20 terabytes or so.
- Vertica’s biggest customer/application category, for existing customers and sales pipeline alike, is call detail records for telecommunications companies. (Other data warehouse specialists also have activity in the CDR area.) Major applications are billing assurance (getting the inter-carrier charges right) and marketing analysis. Call center uses are still in the future.
- Vertica’s other big market to date is investment research/tick history. Surely not coincidentally, this is a big area of focus for Mike Stonebraker, evidently at both companies for which he’s CTO. (The other, of course, is StreamBase.)
- Runners-up in market activity are clickstream analysis and general consumer analytics. These seem to be present in Vertica’s pipeline more than in the actual customer base.
Read the rest of this entry »
Posted in Analytics and analytic technologies, Business Objects, DATAllegro, Data warehouse appliances, Data warehousing, HP and Neoview, RDF and graphs, Relational database management systems, Vertica Systems | No Comments »
July 13th, 2007 Curt Monash
I just finished a short Monash Letter on markets for nonstandard data management software. Of course, the whole thing is available only to Monash Advantage members, but here are some salient points:
- When new kinds of data are managed, new kinds of data management are used. More precisely, the old ways are tried first, but once they fail, new technologies are tried out.
- Up through the “Bowling Alley,” markets for nonstandard data management technology commonly follow the classic Geoffrey Moore pattern. However, they rarely experience a “Tornado” or mass adoption.
- I think this is apt to change. My three strongest candidates are native XML, RDF, and memory-centric event/stream processing used for data reduction (as opposed to sub-millisecond latency, which I do think will continue to be a niche requirement).
Posted in Complex event/stream processing (CEP), Hierarchies, networks, graphs, and trees, Memory-centric data management, Native XML, RDF and graphs | No Comments »
June 15th, 2007 Curt Monash
When Mike Stonebraker and I discussed RDF yesterday, he quickly turned to suggesting fast ways of implementing it over an RDBMS. Then, quite characteristically, he sent over a paper that allegedly covered them, but actually was about closely related schemes instead.
Edit: The paper has a new, stable URL. Hat tip to Daniel Abadi.
All minor confusion aside, here’s the story. At its core, an RDF database is one huge three-column table storing subject-property-object triples. In the naive implementation, you then have to join this table to itself repeatedly. Materialized views are a good start, but they only take you so far. Read the rest of this entry »
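To make the "one huge three-column table" picture concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. The table name, column names, and sample triples are all hypothetical, and this shows only the naive scheme described above, not any of the faster implementations Stonebraker suggested. Each additional triple pattern in a query costs one more self-join of the table.

```python
import sqlite3

# Hypothetical sample data; names and values are illustrative only.
TRIPLES = [
    ("alice", "works_for", "acme"),
    ("bob", "works_for", "acme"),
    ("acme", "located_in", "boston"),
    ("alice", "knows", "bob"),
]


def build_triple_store(rows):
    """Load subject-property-object triples into a single three-column table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, property TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)
    return conn


def people_working_in(conn, city):
    """Find subjects whose employer is located in `city`.

    The query has two triple patterns that share the employer,
    so the naive approach joins the triples table to itself once.
    """
    sql = """
        SELECT t1.subject
        FROM triples AS t1
        JOIN triples AS t2 ON t1.object = t2.subject
        WHERE t1.property = 'works_for'
          AND t2.property = 'located_in'
          AND t2.object = ?
        ORDER BY t1.subject
    """
    return [row[0] for row in conn.execute(sql, (city,))]


if __name__ == "__main__":
    store = build_triple_store(TRIPLES)
    print(people_working_in(store, "boston"))
```

A query with three or four patterns would need two or three such joins against the same table, which is why the naive layout gets expensive quickly.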
Posted in Columnar architectures, Data warehousing, Database compression, Database theory and practice, Hierarchies, networks, graphs, and trees, RDF and graphs, Relational database management systems, Vertica Systems | No Comments »
June 15th, 2007 Curt Monash
Thus spake Mike Stonebraker to me, on a call we’d scheduled to talk about several other things altogether. This was one day after I was told at the Text Analytics Summit that the US government is going nuts for RDF. And I continue to get confirmation of something I first noted last year — Oracle is pushing RDF heavily, especially in the life sciences market.
Evidently, the RDF data model is for real … unless, of course, you’re the kind of purist who cares to dispute whether RDF is a true “data model” at all.
Posted in Database theory and practice, Hierarchies, networks, graphs, and trees, Oracle, RDF and graphs | Comments Off
May 7th, 2007 Curt Monash
A major Semantic Web researcher has built a cluster that can do RDF queries, and hence can get subsecond response time on queries against a database of 7 billion three-column records, The Register obsequiously reports. Golly gee whiz wow.
“The importance of this breakthrough cannot be overestimated,” said Professor Stefan Decker, director of DERI.
I actually think the Semantic Web contains some good ideas, but this kind of over-the-top breathlessness doesn’t seem to do anybody very much good.
Posted in Hierarchies, networks, graphs, and trees, RDF and graphs | 3 Comments »
December 27th, 2006 Curt Monash
My Bulletin on Cogito — i.e., a short-short white paper — is now available for download. Thankfully, it turned out to be pretty consistent with what I previously wrote on the company and its technology.
The conclusion to the paper bears quoting here:
In deciding between conventional DBMS and specialty graph-oriented tools such as Cogito’s, there’s one key criterion: Path length. If path lengths are short and predictable, there’s a good chance that relational DBMS and their forthcoming extensions can do the job. In complex graphs with longer paths, however, relational approaches may not scale well. In such cases, specialty technologies warrant serious consideration.
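The path-length criterion can be illustrated with a small Python sketch. It is not Cogito's or Oracle's method; the edge list and function names are hypothetical. The first function mimics the relational approach, where a path of a known length k is found with k - 1 self-joins of an edge table. The second is a plain breadth-first search, which does not need the path length in advance.

```python
from collections import deque

# Hypothetical directed edge list; illustrative only.
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]


def pairs_at_fixed_hops(edges, hops):
    """Emulate `hops - 1` self-joins of an edge table.

    Returns the (start, end) pairs connected by a path of exactly `hops` edges.
    The number of joins must be chosen before the query runs.
    """
    pairs = list(edges)  # paths of length 1
    for _ in range(hops - 1):
        pairs = [
            (start, nxt_end)
            for start, end in pairs
            for nxt_start, nxt_end in edges
            if end == nxt_start
        ]
    return sorted(set(pairs))


def shortest_path_length(edges, start, goal):
    """Breadth-first search; works without knowing the path length up front."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal not reachable from start


if __name__ == "__main__":
    print(pairs_at_fixed_hops(EDGES, 2))
    print(shortest_path_length(EDGES, "a", "e"))
```

When paths are short and predictable, the fixed number of joins is manageable. When they are long or of unknown length, the join count grows with the path, which is the scaling concern the quoted conclusion raises.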
Posted in Cogito and 7 Degrees, Hierarchies, networks, graphs, and trees, RDF and graphs | Comments Off
July 3rd, 2006 Curt Monash
I wrote recently of Cogito’s high-performance engine for modeling graphs. Oracle has taken a very different approach to the same problem, and last Monday I drove over to Burlington to be briefed on it.
Name an approach to data management, and Oracle has probably
- Hacked together a version on a consulting contract
- Packaged it up for other customers in the same industry
- Set to work on improving and generalizing it
- Integrated it into SQL as a preference over supporting standalone data manipulation languages for it
- Stopped short of being 100% competitive in that functionality
(At least, that’s the general template; truth be told, most of the important cases deviate in some way or other.)
Read the rest of this entry »
Posted in Hierarchies, networks, graphs, and trees, Oracle, RDF and graphs, Relational database management systems | 2 Comments »
May 22nd, 2006 Curt Monash
In my Computerworld column appearing today, I promised to post here about Cogito. Let me start with a disclosure and a confession: Read the rest of this entry »
Posted in Cogito and 7 Degrees, Hierarchies, networks, graphs, and trees, RDF and graphs | 7 Comments »