July 2, 2009

User data vs. raw disk space as a marketing metric

I tried to post a comment on Daniel Abadi’s blog, but doing so seems to require some sort of registration process, so I’m posting here instead.

In a comment to his post on node scalability, Daniel Abadi argued that disk space is a better metric to use in marketing than (presumably compressed) user data.  Well, I imagine he didn’t quite mean to say that, but that’s actually what he wound up saying, starting from the accurate observation that compression ratios vary wildly from one data set to another, even more than they vary from product to product on the same data.

Nonetheless, I favor user data as a metric because:

Comments

3 Responses to “User data vs. raw disk space as a marketing metric”

  1. Jerome Pineau on July 2nd, 2009 6:08 pm

    To quote from Daniel’s post:
    “If you store a lot of data but don’t have the CPU processing power to process it at the speed it can be read off of disk, this is a potential indication of a scalability problem not a scalability success.”

    We’ve been saying this at XSPRADA for years now. Of course, we are currently an SMP/DAS implementation. Either way, I always say the best way to configure our system is to have “as many individually addressable drives accessible via I/O providing sufficient bandwidth so that the drives can deliver their peak continuous read/write speeds.” This means balancing your cores+channels+disks accordingly.

  2. Daniel Abadi on July 3rd, 2009 1:27 am

    To be honest with you, I do not know a lot about how Vertica or other vendors price their products. It seems to me that if you’re buying hardware (i.e. a traditional appliance) it’s a little more straightforward to reason about how much space you are actually getting, rather than how much data could theoretically fit into your appliance if your data compresses similarly to how other people’s data compresses. On the other hand, if you’re buying a software-only solution, then user data is easier to reason about (since there are no fixed hardware limitations). However, I don’t really feel strongly about this. I still think it is odd to quote numbers using both methods for different configurations — to me this is really misleading. But the overall point of my post is that despite playing dumb marketing tricks, the new metrics that Aster Data introduces are the right things to be thinking about in tomorrow’s market, where more and more analysis is being pushed into the DBMS.

  3. Curt Monash on July 3rd, 2009 4:12 am

    The confusion in Aster’s marketing sheet was careless, not malicious. The penultimate draft didn’t have that weirdness, or else I would have talked them out of it.

    (Somehow, I doubt they’ll mind my bending NDA to say that. :D )
