August 18, 2010

I’m collecting data points on NoSQL and HVSP adoption

I was asked to do a magazine article on NoSQL, where by “NoSQL” is meant “whatever they talk about at NoSQL conferences.” By now the number of publications planning to run the article is up to 2, the deadline is next week and, crucially, it has been agreed that I may talk about HVSP in general, NoSQL and SQL alike.

It also is understood that, realistically, I can’t be expected to know and mention the very latest news for all the many products in the categories. Even so, I think this would be a fine time to check just where NoSQL and HVSP adoption stand. Here is most of what I know, or links to same; it would be great if you guys would contribute additional data in the comment thread.

In the NoSQL area: 

Among the SQL or SQL-friendly guys:

Comments

17 Responses to “I’m collecting data points on NoSQL and HVSP adoption”

  1. Curt Monash on August 18th, 2010 10:34 am
  2. Daniel Einspanjer on August 18th, 2010 11:04 am

    Mozilla has a few other clusters too.
    Besides our 4 node Vertica cluster serving our data warehouse, we have a 5 node 80 core staging Hadoop+HBase cluster, a 20 node 296 core production Hadoop+HBase cluster and a new 8 node 128 core test cluster that we are putting the latest versions of Cloudera’s CDH3 and HBase 0.90 on.

    Guess we could also toss in a 7 node Mac Mini cluster for prototyping and our two big ETL servers that run Pentaho Data Integration processes.

  3. Alan Hoffman on August 18th, 2010 11:33 am

    In the NoSQL portion of your post, you neglect to mention Apache CouchDB. I know you’ve been critical of CouchDB in the past, but it did recently release its 1.0 version (followed quickly by 1.0.1 to fix a critical bug). There are at least two companies that do commercial support and services for CouchDB, Cloudant (full disclosure, I am a cofounder of Cloudant) and Couch.io.(1)

    There are a great number of companies using CouchDB in production. The BBC, for instance, runs their website off of CouchDB. I know of a large retail chain (not at liberty to disclose their name) that uses a CouchDB-based system to connect their many points-of-sale to their main data center. At Cloudant, we’ve built dynamo-style clustering into CouchDB to provide horizontal scalability. We have customers in real-time search, advertising, and data analytics, some of whom have many TB of data and up to 1 billion documents in a single db.

    CouchDB does not really fall into the HVSP category, but it still warrants mentioning as a NoSQL option that is gaining traction with production users.

    (1):
    http://cloudant.com
    http://couch.io

  4. Curt Monash on August 18th, 2010 11:40 am

    Daniel,

    Thanks for all the info!

    What are the specs on those 16-core machines, and why so many cores?

  5. Roger Bodamer on August 18th, 2010 2:32 pm

    MongoDB is doing great! Regarding MongoDB usage in production:

    A public, partial list of people who are using MongoDB in production can be found here:
    http://www.mongodb.org/display/DOCS/Production+Deployments

    We’ve been seeing a significant number of people moving from development into production, so that list is growing. In terms of downloads, we’re seeing more than 50k database server downloads/month, and that number is also growing rapidly.

    Thanks,
    -Roger
    10Gen.com / mongodb.org

  6. Alex Popescu on August 18th, 2010 3:07 pm

    Curt,

    If you are truly interested in what’s going on in the NoSQL market, I think you should check the myNoSQL blog on a regular basis: http://altdbase.com, which is focused on exactly this.

    PS: yes, I am biased as I’m the main maintainer of the myNoSQL blog

  7. Curt Monash on August 18th, 2010 4:00 pm

    Alex,

    I scrolled through the first two pages, and I saw almost nothing that addressed the question in this blog post. How much further back should I go?

    Or if you were just plugging your blog because it is indeed active in providing other kinds of NoSQL news — well, consider it plugged! :)

  8. Alex Popescu on August 18th, 2010 7:45 pm

    Curt,

    The blog covers 9 months of activity in the NoSQL market, so it will be kind of difficult to get a detailed answer to your question directly on the home page :-). The tagging system should allow you to look for the status of the products you are missing details about.

  9. Dawn Wolthuis on August 19th, 2010 7:42 am

    Hi Curt, there is a significant base of NoSQL implementations if you include the ~40-year-old data models of PICK and MUMPS, which are now known as MultiValue databases from many vendors and M or InterSystems Caché. The latter has both of these NoSQL data models. Most of these vendors have worked hard over the years to project their models for a SQL implementation as well, so they would not all claim to be NoSQL. In the case of InterSystems, their SQL implementation is the fastest I have experienced, for example, but it need not be used in an application.

    Some folks defining the NoSQL label want to limit this tag to new databases or specific data models or architectures, but I am convinced it should include these older implementations too. It might not be proof positive, but when I made the no sql graphic on this blog entry http://www.tincat-group.com/mewsings/2007/01/otlt-metadata-piece-not-apartheid.html it was after a colleague acquired the nosql.com and .org domains, when we were planning to use those to showcase MultiValue databases and applications. We changed directions but he still has those domains. That and the fact that all MultiValue databases can be accessed without SQL (some also with SQL) should be enough to give these a seat at the table.

    Including the pre-relational data models (again, not the way these are positioned by marketing teams) gives a significant installed base from vendors such as InterSystems, Rocket Software, Tiger Logic, Revelation, jBASE, Ladybridge, and Northgate. I think these no sql databases and their logical NF2 data models should be mentioned in any treatment regarding real implementations of NoSQL databases. Thanks for your consideration and cheers!

  10. Curt Monash on August 19th, 2010 8:44 am

    Hi Dawn!

    Personally, I don’t think the term “NoSQL” is meaningful if it includes classic DBMS that happen to have a different approach to data organization or DML. And your examples are part of the reason why.

    But I do think of you every time I hear of multi-value as an exciting new feature. ;)

    Best,

    CAM

  11. Mike on August 19th, 2010 4:20 pm

    Hi Curt, I’m glad you enjoyed my April Fools post. I figured someone out there might appreciate it. Regarding our perennial beta, it has been really more of a deciduous beta. We started with one locking architecture, only to find people wanted more nodal scalability and switched to another (done). Then we found that sharing data via disk doesn’t provide sufficient performance (rather obvious, but we thought we could squeeze by on that one for a while) so we developed an alternative to Oracle’s cache fusion based on a cache tier (analogous to Memcached, but between the DB and storage devices) that also provides more storage flexibility (see various blog posts here: http://scaledb.blogspot.com/). This cache tier is in the final stages of debug/tuning work. It seems that we climb a mountain only to see another hidden behind it; fitting based on our name and logo I guess. We believe the last big mountain is addressed by the cache tier and we expect to re-enter beta within weeks.

  12. RC on August 20th, 2010 1:58 pm

    People do talk about durability when it comes to MongoDB and whether this issue is an issue or an ‘issue’. See http://nosql.mypopescu.com/post/392868405/mongodb-durability-a-tradeoff-to-be-aware-of

    I think that one of the reasons why you see so many downloads of MongoDB is that they often come with a new version. I for instance have a folder called “c:\nosql\mongodb” on my laptop that contains sub folders mongodb124, mongodb140, mongodb141, mongo142, mongodb151, mongodb152, mongodb155 and mongodb160. So I download a new version quite often.

    This however also shows that MongoDB is very easy to install.

  13. Rick Cattell on August 23rd, 2010 12:23 am

    I posted a paper comparing some of the NoSQL and SQL HVSP systems on my website, if that is of any help to you or anyone else:

    http://cattell.net/datastores/

    I plan to do an update to that paper in September, if you have any input.

  14. Curt Monash on August 23rd, 2010 1:53 am

    Looks good, Rick, with a lot of detail I’m unlikely to ever post here. :)

  15. More on NoSQL and HVSP (or OLRP) | DBMS 2 : DataBase Management System Services on August 26th, 2010 5:10 am

    […] posting last Wednesday morning that I’m looking into NoSQL and HVSP, I’ve had a lot of conversations, including with (among […]

  16. Devin Knighton on September 29th, 2010 5:39 pm

    Adoption in the Cassandra community has continued to increase since this blog post. And today another important step was taken to facilitate further adoption. For the first time, documentation is available that addresses everything from installation and configuration to data modeling. A reference guide for the API is also available.

    The documentation is hosted on Riptano’s website. It can be viewed here: http://www.riptano.com/docs/0.6.5

  17. Data management at Zynga and LinkedIn | DBMS 2 : DataBase Management System Services on September 5th, 2011 3:49 am

    […] particularly interesting. First, those 5 TB/day are going straight into Vertica (from, I presume, memcached/Membase/Couchbase), as Zynga decided that sending the data to some kind of log first was more trouble than it’s […]
