October 22, 2010

Notes and links October 22, 2010

A number of recent posts have had good comments. This time, I won’t call them out individually.

Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on “sensor data” …

… and, even better, went on to say: 

Privacy is dead.
What do we consider to be the boundaries of privacy, especially with respect to items like medical data? In a data privacy-free world, should we be regulating data usage instead? How do we deal with asymmetric access to our personal data, e.g., how is it that insurance companies claim the right to our personal information?

Obviously, my answer to the second question is Yes!!!!

Also from Hadoop World — Dave Menninger, now an analyst, reports on some Hadoop metrics:

How big is “big data”? In his opening remarks, Mike shared some statistics from a survey of attendees. The average Hadoop cluster among respondents was 66 nodes and 114 terabytes of data. However there is quite a range. The largest in the survey responses was a cluster of 1,300 nodes and more than 2 petabytes of data. (Presenters from eBay blew this away, describing their production cluster of 8,500 nodes and 16 petabytes of storage.) Over 60 percent of respondents had 10 terabytes or less, and half were running 10 nodes or less.

That eBay comment was particularly interesting. 🙂

A while back, Doug Henschen noted that Netezza flagship reference Catalina Marketing is now at 2.5 petabytes. Most of that is in one 600 billion row table. Oddly, the article talks of the Netezza/SAS partnership accelerating model-building via in-database scoring (not modeling) technology. Doug also wrote of a lot of analytic DBMS replacements, including:

Carl Olofson pointed out on Twitter that DataScaler was an in-memory database technology just bought by Oracle. This inspired me to google on them, and I found a sparse DataScaler CEO blog. I link it because of an amusing juxtaposition — the second-to-last post says, in effect, “We make appliances and we recommend all these awesome technology design partners who helped us design the hardware,” while the very last post says “Designing our own hardware was a mistake.” 🙂

Fred Holahan is now VP of Marketing at VoltDB, which is a lesson to me about giving free consulting … Anyhow, Fred tells me that VoltDB has about a dozen users on their way to production, some of whom are headed to being VoltDB paying customers, some of whom are not.

