June 16, 2012

Metamarkets Druid overview

This is part of a three-post series:

My clients at Metamarkets are planning to open source part of their technology, called Druid, which is described in the Druid section of Metamarkets’ blog. The timing is a bit unclear; I know the target date under NDA, but it’s not set in stone. If you care, though, you can probably contact the company to get involved before the official unveiling.

I imagine that open-source Druid will be pretty bare-bones in its early days. Code was first checked in early in 2011, and Druid seems to have averaged around 1 full-time developer since then. What’s more, it’s not obvious that all the features I’m citing here will be open-sourced; indeed, some of the ones I’m describing probably won’t be.

In essence, Druid is a distributed analytic DBMS. Druid’s design choices are best understood when you recall that it was invented to support Metamarkets’ large-scale, RAM-speed, internet marketing/personalization SaaS (Software as a Service) offering. In particular:

Interestingly, the single-table/multi-valued choice is echoed at WibiData, which deals with similar data sets. However, WibiData’s use cases are different from Metamarkets’, and in most respects the WibiData architecture is quite different from that of Metamarkets/Druid.

As with many DBMSs, much of what’s interesting about Druid is how it organizes and chunks data. Most importantly, Druid has MVCC (Multi-Version Concurrency Control) on a segment-by-segment basis. That is, an update requires a new version of the whole segment to be written; while that happens, reads can continue unabated on the old version.

Obviously, this is better suited to streaming or batch-load scenarios than to ones with many single-row updates.
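
To make the segment-level MVCC idea concrete, here is a minimal Python sketch. It is not Druid's code or API; the names Segment, SegmentStore, snapshot and publish are hypothetical, and a real system would also have to handle distribution, persistence and cleanup of old versions. It only illustrates that an update writes a whole new segment version while a read already in flight keeps using the old one.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Segment:
    """An immutable chunk of rows; a new version replaces it wholesale."""
    segment_id: str
    version: int
    rows: Tuple[dict, ...]


class SegmentStore:
    """Hypothetical store mapping each segment id to its current version."""

    def __init__(self) -> None:
        self._current: Dict[str, Segment] = {}
        self._lock = threading.Lock()  # serializes writers only

    def snapshot(self, segment_id: str) -> Segment:
        # A reader grabs a reference to the current version and keeps using
        # it, even if a writer publishes a newer version in the meantime.
        return self._current[segment_id]

    def publish(self, segment_id: str, rows) -> Segment:
        # An "update" rewrites the whole segment as a new version and then
        # swaps the pointer. The old version is never modified in place.
        with self._lock:
            old = self._current.get(segment_id)
            version = old.version + 1 if old else 1
            new = Segment(segment_id, version, tuple(rows))
            self._current[segment_id] = new
            return new


store = SegmentStore()
store.publish("2012-06-16", [{"publisher": "a", "clicks": 3}])

reader_view = store.snapshot("2012-06-16")  # a read begins on version 1
store.publish("2012-06-16", [{"publisher": "a", "clicks": 5}])  # full rewrite

old_clicks = sum(r["clicks"] for r in reader_view.rows)
new_clicks = sum(r["clicks"] for r in store.snapshot("2012-06-16").rows)
print(reader_view.version, old_clicks)  # the in-flight read still sees version 1
print(store.snapshot("2012-06-16").version, new_clicks)
```

The sketch also shows why single-row updates are a poor fit: changing one row costs a rewrite of the entire segment, which is only worthwhile when changes arrive in batches.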

Other Druid specifics include:

For more on Druid, please see my post on Metamarkets’ back-end technology.

Comments

12 Responses to “Metamarkets Druid overview”

  1. Introduction to Metamarkets and Druid | DBMS 2 : DataBase Management System Services on June 16th, 2012 5:54 pm

    […] Druid overview […]

  2. Metamarkets open sources Druid, its in-memory database — Data | GigaOM on October 24th, 2012 9:03 am

    […] Metamarkets runs Druid on an 800-core system running on Amazon EC2. Others have done a decent job explaining what Druid seems good for and where the tradeoffs might […]

  3. Metamarkets open sources Druid, its in-memory database ← techtings on October 24th, 2012 9:06 am

    […] Metamarkets runs Druid on an 800-core system running on Amazon EC2. Others have done a decent job explaining what Druid seems good for and where the tradeoffs might […]

  4. Patrick Wendell on October 31st, 2012 12:54 am

    This is the best explanation of Druid that exists anywhere – inclusive of their Marketing material, the Strata talk, and the documentation in the code. Thanks!

  5. Curt Monash on October 31st, 2012 2:29 am

    Thanks for the kind words!

    I put a lot of effort into it, but was still frustrated by the results (mainly around the in-memory part, not Druid itself).

  6. Notes and comments — October 31, 2012 | DBMS 2 : DataBase Management System Services on November 1st, 2012 7:16 am

    […] Metamarkets’ Druid was open-sourced. Numerous other product introductions and so on that I’ve hinted at have […]

  7. Big Data Warehouse in the cloud « Ravi's Technology Blog on November 28th, 2012 10:04 pm

    […] HANA but cringe at the licensing costs?  One option is to look into open source alternatives like Druid which was created by the vendor MetaMarkets.   Druid claims to provide real-time analytics using […]

  8. Hadoop’s Successors | Christopher Berry on October 5th, 2013 11:22 am

    […] “I would encourage you to keep an eye on Metamarkets’ Druid, which Curt Monash recently covered: http://www.dbms2.com/2012/06/16/metamarkets-druid-overview/ […]

  9. Hadoop’s Successors – ChristopherBerry.ca on August 7th, 2021 3:38 pm

    […] “I would encourage you to keep an eye on Metamarkets’ Druid, which Curt Monash recently covered: http://www.dbms2.com/2012/06/16/metamarkets-druid-overview/ […]

