August 4, 2009

Vertica’s version of MapReduce integration

I talked with Omer Trajman of Vertica Monday night about Vertica’s MapReduce integration, part of its Vertica 3.5 release. Highlights included:

Apparently, the use cases for Vertica/Hadoop integration to date lie in algorithmic trading and two kinds of web analytics. Specifically:

By the way, Vertica is based on C-Store, the Ph.D. thesis project of Daniel Abadi, who recently wrote:

To me, it is far more efficient from a performance and a “green” perspective to push the computation to the data. Hence, I am not a fan of decoupling the compute grid and the data grid.

Not coincidentally, Daniel also recently wrote that

If the VectorWise/Ingres solution does get released open source, I believe they will be an excellent column-store storage engine for HadoopDB. I have already requested an academic preview edition of their software to play with.

The VectorWise guys also told me they are looking forward to seeing how the two projects work together.

Comments

5 Responses to “Vertica’s version of MapReduce integration”

  1. Omer Trajman on August 4th, 2009 8:12 am

    One clarification regarding compute/data locality. MR necessarily has a data re-distribution phase prior to reduce (unless data is distributed by map key on load). When pushing the map down to Vertica there is no more data shuffling beyond what any other MR requires. You do get the added flexibility of being able to reduce on a different collection of nodes.

  2. Daniel Abadi on August 4th, 2009 11:01 am

    I agree with Omer’s clarification.

    Also, just for the record, it’s probably giving me too much credit to say that C-Store was my PhD thesis. My thesis involved research behind building the query execution engine for C-Store, but the C-Store project was much bigger than just the work that I did.

  3. Vertica Projects Leadership, Embraces MapReduce (Sorta) « Market Strategies for IT Suppliers on August 11th, 2009 10:02 pm

    [...] MapReduce support, but with a difference. Unlike Greenplum and Aster, who are bringing it into the database itself, Vertica is providing a streaming connection to Hadoop instances (the open source implementation of MapReduce; Vertica is contributing the adapter to the community). This architecture mirrors usage patterns we’ve seen, and which Vertica asserts its customers have told them they want. One scenario: use your ADBMS to retrieve stored data, pass it to Hadoop for analysis by staff with different skill sets from the typical ADBMS users, and then bring result sets back. A separate hardware for the Hadoop sandbox is fairly typical among early adopters today, and via a Cloudera partnership, Vertica can offer a deployment architecture that doesn’t break the bank. Curt Monash does the usual excellent summary of Hadoop issues in his blog. [...]

  4. How 30+ enterprises are using Hadoop | DBMS2 -- DataBase Management System Services on December 11th, 2009 11:26 pm

    [...] (Vertica recently made its 100th sale, and of course not all those buyers are in production yet.) Vertica/Hadoop usage seems to have started in Vertica’s financial services stronghold — specifically [...]

  5. Will a Shotgun Marriage Avert Squabbles among the Data Clans : Beyond Search on December 17th, 2009 2:03 am

    [...] TechWorld.com ran a story that I thought was interesting and closer to the truth about the relational databases and big data. “Sybase Embraces Google MapReduce” runs down a number of data management companies expressing interest in one of Google’s earlier innovations. One comment worth noting in my opinion was: Relational database pioneer Michael Stonebraker co-authored a paper earlier this year contending that SQL technology still beats MapReduce in most cases. But that conclusion didn’t stop Vertica Systems, the startup where he serves as CTO, from adding Hadoop functionality to its new Vertica 3.5 database. [...]
