January 22, 2007

Are row-oriented RDBMS obsolete?

If Mike Stonebraker is to be believed, the era of columnar data stores is upon us.

Whether or not you buy completely into Mike’s claims, there certainly are cool ideas in his latest columnar offering, from startup Vertica Systems. The Vertica corporate site offers little detail, but Mike tells me that the product’s architecture closely resembles that of C-Store, which is described in this November, 2005 paper.

The core ideas behind Vertica’s product are as follows.

Obviously, my post title was exaggerated; nobody, including Mike, thinks row-oriented data stores are obsolete for OLTP. But what about data warehousing? Will an approach like Vertica’s eventually win versus, say, the shared-nothing row-oriented RDBMS leaders (that would be some combination of IBM, Teradata, Netezza, and DATAllegro, depending on what you mean by “leader”)? Well, apparently Vertica has a bunch of tests going on, at database sizes from the low 100s of gigabytes to the low 10s of terabytes. And of course they have those great-looking benchmark results, for which they swear they tuned competitors’ products with passionate care.

If I have to make an early guess, I’d say that the success of columnar systems will depend in no small part on what kind of data warehouse applications we’re talking about. Referencing a taxonomy I previously posted:

Comments

19 Responses to “Are row-oriented RDBMS obsolete?”

  1. DBMS2 — DataBase Management System Services » Blog Archive » Mike Stonebraker Blasts “One Size Fits All” on January 22nd, 2007 7:24 am

    […] More recently, the argument in that paper has been extended with a benchmark-filled follow-up based on another Stonebraker startup, Vertica. • • • […]

  2. Stuart Frost on January 22nd, 2007 11:38 am

    Curt,

    I took a look at C-Store a while ago when Vertica first came on the scene. The idea that row-oriented databases are going to be superseded “real soon” by column-oriented has been pushed on and off for around 20 years. Sand and Sybase IQ are OK for small (sub-TB) data warehouses, but they just don’t scale beyond that. Am I missing something in Vertica that would change that?

    In practice, our row-oriented DATAllegro appliance is CPU bound for most queries, so I/O isn’t really the bottleneck as it is with most systems. We’re also about to introduce compression to move the bar even further.

    Stuart
    DATAllegro

  3. Curt Monash on January 22nd, 2007 11:50 am

    Hi Stuart!

    Sybase IQ doesn’t scale because it isn’t properly parallelized. I can’t comment on Sand; they certainly claim to scale.

    Kognitio, on the other hand, does show every sign of scaling with a columnar, shared-nothing architecture.

    The one thing that worries me is what’s highlighted in Point #6 above. Just for what kinds of queries does or doesn’t the system scale? (I also don’t know the answer to that for Kognitio.) Otherwise, the story sounds pretty clean to me.

    Best,

    CAM

  4. DBMS2 — DataBase Management System Services » Blog Archive » It’s a good week for puns … on January 31st, 2007 6:47 pm

    […] … unless you think that is inherently an oxymoron. I thought I was doing well catching and expanding on a clever pop culture reference. But the folks at columnar DBMS start-up Vertica Systems may have topped that with their slogan […]

  5. DBMS2 — DataBase Management System Services » Blog Archive » Word of the day: “Compression” on March 16th, 2007 5:30 am

    […] IBM sent over a bunch of success stories recently, with DB2’s new aggressive compression prominently mentioned. Mike Stonebraker made a big point of Vertica’s compression when last we talked; other column-oriented data warehouse/mart software vendors (e.g. Kognitio, SAP, Sybase) get strong compression benefits as well. Other data warehouse/mart specialists are doing a lot with compression too, although some of that is governed by please-don’t-say-anything-good-about-us NDA agreements. […]

  6. DBMS2 — DataBase Management System Services » Blog Archive » DATAllegro vs. Vertica and other columnar systems on March 19th, 2007 11:25 pm

    […] I’m hard pressed to see why, for some applications, this wouldn’t have all the benefits of the full columnar architectures of, say, Vertica or Kognitio. That said, I can also envision other applications in which Vertica would offer large performance benefits by allowing redundant storage with a variety of sort orders. […]

  7. Phil Bowermaster on May 4th, 2007 6:46 pm

    Hi Curt,

    Bloor Research recently published an excellent evaluation white paper on Sybase IQ, authored by Philip Howard, which addresses (among other subjects) the Sybase IQ approach to parallelization.

    http://www.sybase.com/content/1035804/SybaseIQ-12.7-010407-wp.pdf

    As for Sybase IQ’s ability to scale — it has been dramatically demonstrated in a number of benchmark exercises (up to 155 TB) and customer implementations (40+ TB in production). A few examples:

    http://www.sybase.com/detail_list?id=49108
    http://www.sybase.com/detail?id=1027323
    http://www-03.ibm.com/systems/p/solutions/sybase/iq/index.html

    The entry of Vertica and other players into the column-based database market helps to demonstrate the growth potential of this space. We can expect to see more such entrants as database sizes continue to increase and organizations continue to look for technology that can reliably handle their analytics requirements.

    Phil Bowermaster
    Sybase

  8. Curt Monash on May 5th, 2007 12:41 am

    Hi Phil,

    Nice paper! Did you guys sponsor it? I didn’t see any disclosure statements about that, but I noticed that “evaluation” was in quotes in the title.

    Either way, I’m a great admirer of Philip Howard’s unrelentingly optimistic view of technology, as per http://www.dbms2.com/2006/05/15/philip-howard-likes-viper/. And I wonder whether it’s really true that the appliance vendors don’t do tokenization/dictionary compression. If they don’t, they surely should, and probably will soon.

    Seriously, I’d be interested to learn what unnatural acts you did or didn’t have to perform to scale that high. And I’d really like to learn about the complexity you do or don’t offer in text analysis, since I’ve long thought that columnar relational indexing and text indexing were apt to fit very well together.

    CAM

  9. Ruslan on September 7th, 2007 4:11 am

    Hi all,

    Well, BEFORE Vertica was invented, and BEFORE Sybase shipped its column-oriented product, Valentina Database (www.paradigmasoft.com) had already been introduced in 1998, with major development having started in 1994-1995.

    Interesting to compare 🙂

  10. Curt Monash on September 8th, 2007 2:10 am

    Hi Ruslan,

    I’m trying to remember when Bob Epstein of Sybase first enthused to me about the Expressway acquisition, and I think it was a little earlier than the timeframe you’re suggesting.

    Anyhow, after looking at your website I have a few suggestions:

    1. If your main claim is speed, don’t have the benchmark link be dead.

    2. Developer pricing is a bad business model in most markets.

    3. Your web site doesn’t really say very much.

    4. You need a copy editor who is a native English speaker.

    Best regards,

    CAM

  11. Ileana Somesan on November 17th, 2007 12:58 pm

    Hi all,

    where is the novelty of column-oriented DBMS? Is this storage architecture another name for vertical partitioning in traditional RDBMS?

    Ileana

  12. Curt Monash on November 17th, 2007 6:29 pm

    Hi Ileana,

    You might want to look through http://www.dbms2.com/category/database-theory-practice/columnar-database-management/ for some ideas and answers. ParAccel and SAP would say that columnar architectures make memory-centric processing easier. Vertica and Infobright would say they make compression easier. DATAllegro and other row-based vendors, however, would offer the same skeptical questions you did.

    Best,

    CAM

  13. Steve on December 6th, 2007 7:51 am

    Sybase IQ doesn’t scale beyond one TB… Damn, I must tell my client that; they have been using Sybase IQ for a 7TB DWH for the past 3 years (40TB raw data btw)…

  14. Curt Monash on December 6th, 2007 1:04 pm

    Steve,

    As I asked above — are there any unnatural acts of partitioning reflected in the SQL to get that kind of scalability?

    Any serious DBMS can scale almost arbitrarily large if you just put a lot of database instances side by side …

    CAM

  15. DBMS2 — DataBase Management System Services » Blog Archive » Arguments AGAINST data warehouse appliances on April 25th, 2008 12:11 am

    […] similar arguments to me a few days ago. They are not wholly unbiased; indeed, both are involved in Vertica Systems. With that caveat, they have an interesting three-part […]

  16. DBMS2 — DataBase Management System Services » Blog Archive » Who’s who in columnar relational database management systems on May 30th, 2008 3:45 am

    […] entirely in-memory and hence is limited in possible database size. Mike Stonebraker’s startup Vertica is of course the new kid on the block, and there are other columnar startups as well whose names […]

  17. ITEC-470: Draft Schedule on June 18th, 2009 7:45 pm

  18. ITEC-470: Fall 09 Schedule (Draft) on June 28th, 2009 7:37 pm

    […] article and article and article […]

  19. Technical basics of Sybase IQ | DBMS2 -- DataBase Management System Services on May 23rd, 2010 4:35 am

    […] columns themselves can be used as indexes in the usual Vertica-like […]
