I spent a few hours at Aster Data on my West Coast swing last week. Aster has now officially put out Version 3 of nCluster. Highlights included:
- Aster says it now has “tens” of paying nCluster customers.
- MySpace aside, I got the sense that Aster Data’s customer databases are concentrated in the low single digits of terabytes. Speed and robustness of analytics seem to outweigh data volumes for Aster.
- Aster’s nCluster pricing remains a deep, dark secret.
- The majority of Aster customers are still internet companies, but that category is no longer 100% of the total. Aster even gave me one application example – admittedly, a not-yet-signed deal – that had nothing to do with web clicks, network management, or consumer marketing in any direct way.
- nCluster applications still appear concentrated in what Aster calls the “frontline,” which essentially equates to “decision-making based – at least in part – on very fresh data.” This fits well with Aster’s focus on uptime.
- A minority of Aster Data customers are using MapReduce.
- MapReduce is very pipelined on nCluster. Queries are rather pipelined too, RAM permitting, but – well, blocking operators are blocking operators.
- nCluster partitions are technically implemented as separate tables. Therefore, anything that’s true of nCluster tables is true of nCluster partitions as well.
- Since my last Aster Data overview, Aster has added compression, specifically three levels (Low/Medium/High) that are all variants of Lempel-Ziv. Ajeet Singh of Aster, based on his experience both there and at Oracle, confidently asserts that Aster’s compression levels approach those of column stores on comparable kinds of data. Ajeet believes it’s been proven that row stores with megabytish block sizes are close to column stores in the compression they can achieve.
- nCluster compression schemes can be chosen on a table-by-table (and hence partition-by-partition) basis.
- Aster Data has extended its node specialization (I think that’s a better name than “node heterogeneity”, which is what I had been using) story. Along with specialized nodes for bulk load and export, Aster now has added a couple of specialized node kinds for backup (one to coordinate backup work, one to actually do it). Obviously, this helps with assuring consistent analytic performance, at the cost of requiring extra hardware.
- Aster Data is proud of backup and restore performance, which it sees as central to a general 24×7 uptime story. E.g., Aster says nCluster’s incremental backup capability is robust enough that users never need to do a full backup. nCluster backup freshness can be scheduled table-by-table, and hence partition-by-partition. Similarly, restores can be done on the most crucial tables or partitions first.
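The pipelined-vs-blocking distinction above is easy to see with Python generators. This is purely my illustration of the general concept, not anything about nCluster's internals: a filter can emit each row as it arrives, while a sort cannot produce its first output row until it has consumed its entire input.

```python
def scan(rows):
    # Streaming source: yields one row at a time.
    for row in rows:
        yield row

def filter_clicks(rows, min_clicks):
    # Pipelined operator: emits each qualifying row immediately.
    for row in rows:
        if row["clicks"] >= min_clicks:
            yield row

def sort_by_clicks(rows):
    # Blocking operator: must materialize all of its input before
    # the first output row can be produced.
    return iter(sorted(rows, key=lambda r: r["clicks"]))

data = [{"user": "a", "clicks": 5}, {"user": "b", "clicks": 12},
        {"user": "c", "clicks": 9}]
pipeline = sort_by_clicks(filter_clicks(scan(data), 6))
print([r["user"] for r in pipeline])  # prints ['c', 'b']
```

RAM permitting, everything upstream of the sort streams; the sort is where the pipeline stalls, which is exactly the "blocking operators are blocking operators" point.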
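The partitions-are-really-tables design can be mimicked in any SQL database. Here's a quick sqlite3 sketch of the idea – my own hypothetical table and view names, not Aster syntax: each "partition" is literally its own table, so anything you can set per table (compression scheme, backup schedule) you automatically get per partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Each "partition" is its own physical table.
for month in ("2008_09", "2008_10"):
    cur.execute(f"CREATE TABLE clicks_{month} (user_id INTEGER, url TEXT)")

cur.execute("INSERT INTO clicks_2008_09 VALUES (1, '/home')")
cur.execute("INSERT INTO clicks_2008_10 VALUES (2, '/search')")

# A view stitches the partitions back into one logical table.
cur.execute("""CREATE VIEW clicks AS
               SELECT * FROM clicks_2008_09
               UNION ALL
               SELECT * FROM clicks_2008_10""")
print(cur.execute("SELECT COUNT(*) FROM clicks").fetchone()[0])  # prints 2
```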
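As for the compression claims, zlib (a Lempel-Ziv variant) makes the level-vs-ratio tradeoff easy to demonstrate. The Low/Medium/High mapping below is my guess for illustration only – Aster hasn't published its codecs or level settings – but it shows why a megabytish block of repetitive row data gives an LZ window plenty of redundancy to exploit.

```python
import zlib

# Roughly 1 MB of repetitive, row-store-flavored data.
block = b"user_id=12345,page=/home,clicks=7;" * 32768

# Hypothetical mapping of Aster's Low/Medium/High onto zlib levels.
for label, level in [("Low", 1), ("Medium", 6), ("High", 9)]:
    compressed = zlib.compress(block, level)
    print(f"{label}: {len(block) / len(compressed):.1f}x")
```

Higher levels spend more CPU searching for matches and compress at least as well; on data this redundant, even the Low level gets a large ratio.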
In other news, Aster has picked up a second well-known customer, because Akamai is acquiring Aster customer Acerno.