November 12, 2011

Clarifying SAND’s customer metrics, positioning and technical story

Talking with my clients at SAND can be confusing. That said:

I need to revise my figures for SAND’s customer count way downward.
SAND finally has a reasonably clear positioning.
SAND’s product actually seems to have a lot of features.

A few months ago, I wrote:

SAND Technology reported >600 total customers, including >100 direct.

Upon talking with the company, I need to revise that figure downward, from >600 to 15.

One embarrassing point: SAND is a client, and I view it as part of my job to save clients from that kind of inadvertent misstatement.

It turns out that SAND has a very impressive customer — Dunnhumby, a data mart outsourcer with 200 terabytes of data in SAND, 30 or so incoming data streams, 400 or so nodes … and 600 or so end customers, all of which SAND was counting as OEM end customers for its DBMS. But I, other industry observers, and other vendors generally don’t count that way.

Besides Dunnhumby, SAND has 14 other customers on maintenance, with <1 terabyte of data each. Until recently, SAND had a couple dozen more customers than that, but it sold its SAP-oriented archiving/near-line storage product line to Informatica.

I still don’t know where the “>100 direct” part came from.

After the sale of its other product line, SAND is squarely in the market for analytic DBMS. SAND’s sales efforts seem to be focused on investigative analytics, although some of its existing users seem to be more focused on operational analytics. Most specifically, SAND is trying to focus on “people data” (customer loyalty, health care, etc.) rather than purely machine-generated data, with the paradigmatic target application being personalized marketing.

SAND technical highlights include:

SAND sells a columnar analytic DBMS.
The SAND DBMS operates on bitmaps, with heavy use of run-length encoding on the bitmaps. Bitmaps are used for everything except BLOBs (Binary Large OBjects).
Actual data compression also comes into play, e.g. as result sets are being assembled. This is based on a true global dictionary — multiple columns are tokenized together.
Indeed, SAND can decompose columns and tokenize their parts (e.g. time stamps).
SAND’s workload management sees RAM and CPU, but not explicitly I/O.
SAND lets you pin certain tables or even table segments in RAM if you want to.
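
To make the bitmap, run-length encoding, and global dictionary points concrete, here is a minimal Python sketch. It is not SAND's implementation, whose internals were not described in that detail. The class and function names (GlobalDictionary, build_bitmaps, rle_encode, rle_decode) and the sample columns are all hypothetical, and it assumes the simplest possible layout: one bitmap per distinct token per column, stored as plain lists.

```python
from itertools import groupby


class GlobalDictionary:
    """One value-to-token mapping shared by every column (hypothetical)."""

    def __init__(self):
        self.token_of = {}   # value -> integer token
        self.value_of = []   # token -> value

    def tokenize(self, value):
        # Assign the next integer the first time a value is seen.
        if value not in self.token_of:
            self.token_of[value] = len(self.value_of)
            self.value_of.append(value)
        return self.token_of[value]


def build_bitmaps(tokens):
    """Return {token: [0/1 per row]}, one bitmap per distinct token."""
    bitmaps = {}
    for row, token in enumerate(tokens):
        bitmaps.setdefault(token, [0] * len(tokens))[row] = 1
    return bitmaps


def rle_encode(bits):
    """Collapse a bitmap into (bit, run_length) pairs."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]


def rle_decode(runs):
    """Expand (bit, run_length) pairs back into the original bitmap."""
    return [bit for bit, length in runs for _ in range(length)]


# Two columns share one dictionary, so "Oslo" gets the same token in both.
dictionary = GlobalDictionary()
home_city = ["Paris", "Paris", "Oslo", "Oslo", "Oslo"]
ship_city = ["Oslo", "Paris", "Paris", "Oslo", "Oslo"]

home_tokens = [dictionary.tokenize(v) for v in home_city]
ship_tokens = [dictionary.tokenize(v) for v in ship_city]

home_rle = {t: rle_encode(b) for t, b in build_bitmaps(home_tokens).items()}
ship_rle = {t: rle_encode(b) for t, b in build_bitmaps(ship_tokens).items()}
```

Run-length encoding pays off when equal values cluster in adjacent rows, as in home_city here; a column with rapidly alternating values would produce many short runs and compress poorly.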

SAND’s update story is straightforward: when data comes in, all the columns and bitmaps are updated as needed. Still, since SAND is columnar, you wouldn’t expect true updates in place, and you’d be right. Rather, SAND uses lock-free MVCC (MultiVersion Concurrency Control) with garbage collection. The MVCC is also exploited for a kind of time travel, and further for some kind of virtual data mart capability.
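
For readers unfamiliar with how MVCC yields time travel, the following Python sketch shows the general idea under strong simplifying assumptions. It is single-threaded, so it says nothing about how SAND achieves lock-freedom, and it is not based on SAND's code. VersionedStore and its methods are hypothetical names, and the logical clock is just a counter.

```python
class VersionedStore:
    """Append-only versions per key; reads pick a version by timestamp."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), ascending
        self.clock = 0       # logical commit counter (assumption)

    def write(self, key, value):
        # Never overwrite in place: append a new version instead.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read(self, key, as_of=None):
        # Return the newest version committed at or before `as_of`.
        ts = self.clock if as_of is None else as_of
        visible = None
        for commit_ts, value in self.versions.get(key, []):
            if commit_ts > ts:
                break
            visible = value
        return visible

    def collect_garbage(self, oldest_needed):
        # Drop versions that no reader at or after `oldest_needed` can see.
        for key, chain in list(self.versions.items()):
            keep_from = 0
            for i, (commit_ts, _) in enumerate(chain):
                if commit_ts <= oldest_needed:
                    keep_from = i
            self.versions[key] = chain[keep_from:]


store = VersionedStore()
store.write("tier", "silver")     # commit 1
store.write("tier", "gold")       # commit 2
store.write("tier", "platinum")   # commit 3

latest = store.read("tier")
as_of_first_commit = store.read("tier", as_of=1)

store.collect_garbage(oldest_needed=2)
after_gc = store.read("tier", as_of=1)
```

Before garbage collection, a read as of commit 1 still sees the original value; afterwards that version is gone, which is the trade-off between how far back time travel can reach and how much old data is retained.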

SAND’s parallelization story is a bit complicated.

SAND has, or at least has the potential for, node specialization, with database and storage nodes being different.
In principle, disks are specific to storage nodes, and it’s a configuration option as to whether a database node sees one, some, or all storage nodes.
In practice, only Dunnhumby among SAND’s customers operates on other than a shared-disk basis. Dunnhumby’s configuration is mixed/matched among various SAND sharing options.
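
The configuration option described above, where each database node sees one, some, or all storage nodes, can be pictured as a simple visibility map. The Python sketch below is purely illustrative: the node names, the classify_sharing function, and the labels "shared-disk", "one-to-one", and "mixed" are assumptions for this example, not SAND terminology or configuration syntax.

```python
def classify_sharing(visibility, storage_nodes):
    """Label a topology from a {database_node: [storage_nodes]} map."""
    all_storage = set(storage_nodes)
    seen = [set(nodes) for nodes in visibility.values()]

    # Every database node sees every storage node.
    if all(s == all_storage for s in seen):
        return "shared-disk"

    # Every database node sees exactly one storage node, none shared.
    if all(len(s) == 1 for s in seen) and len(set.union(*seen)) == len(seen):
        return "one-to-one"

    return "mixed"


storage = ["s1", "s2", "s3"]

shared = {"db1": storage, "db2": storage}
dedicated = {"db1": ["s1"], "db2": ["s2"]}
mixed = {"db1": ["s1"], "db2": ["s1", "s2"], "db3": storage}

shared_label = classify_sharing(shared, storage)
dedicated_label = classify_sharing(dedicated, storage)
mixed_label = classify_sharing(mixed, storage)
```

In these terms, most SAND customers would correspond to the first map, while a mixed/matched configuration such as Dunnhumby's would look more like the third.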

SAND is proud of its PMML (Predictive Modeling Markup Language) scoring capabilities, but otherwise hasn’t shipped much in the way of analytic platform capabilities. That said, work is underway on a user-defined table function capability that can also query external tables, fire off MapReduce jobs, and so on, under the code name UQL.

Categories: Archiving and information preservation, Columnar database management, Data mart outsourcing, Data warehousing, Database compression, Market share and customer counts, Parallelization, Predictive modeling and advanced analytics, SAND Technology, Specific users, Workload management

Comments

One Response to “Clarifying SAND’s customer metrics, positioning and technical story”

Comments on the analytic DBMS industry and Gartner’s Magic Quadrant for same : DBMS 2 : DataBase Management System Services on February 9th, 2012 4:19 am

[…] Gartner completely missed the errors in SAND’s reported customer counts. […]
