February 22, 2010

Aster Data nCluster 4.5

Like Vertica, Netezza, and Teradata, Aster is using this week to pre-announce a forthcoming product release, Aster Data nCluster 4.5. Aster is really hanging its identity on “Big Data Analytics” or some variant of that concept, and so the two major named parts of Aster nCluster 4.5 are:

And in other Aster news:

Aster Data Developer Express evidently does some cool stuff, like providing some sort of parallelism testing right on your desktop. It also generates lots of stub code, saving humans from the tedium of doing that. Useful, obviously.

But mainly, I want to write about the analytic packages. I’m not convinced that they’re a big deal in themselves yet, or that a whole lot of person-months have gone into their combined development. Still, I think they provide a great indication of one direction in which analytic functionality is going. And by the way, Aster promises to release a lot more of that kind of thing over the next 12 months.

Aster’s flagship analytic package is nPath, which is like a regular expression matcher, but for (time) series of data rather than for character strings. The main use for nPath is in pulling specific kinds of event sequences out of web or network event logs. However, one could imagine uses in other sectors that focus on temporal or sequential data (e.g., trading, intelligence, other sensor analysis), should existing SQL- and/or CEP-based technologies not prove sufficiently flexible. Aster 4.5 adds some new aggregation capabilities around nPath.
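To make the regular-expression analogy concrete, here is a minimal Python sketch. It is not nPath syntax or Aster's implementation. The one-letter event codes, the `match_paths` helper, and the sample events are all assumptions made up for illustration. Each user's time-ordered events are encoded as a string, and an ordinary regex is run over it.

```python
import re
from itertools import groupby
from operator import itemgetter

# Hypothetical one-letter codes for event types (not Aster's syntax).
SYMBOLS = {"home": "H", "search": "S", "product": "P", "checkout": "C"}

def match_paths(events, pattern):
    """Return, per user, the event sub-sequences that match `pattern`.

    `events` is an iterable of (user, timestamp, event_type) tuples.
    """
    results = {}
    # Order by user, then by time, so each user's events form one sequence.
    ordered = sorted(events, key=itemgetter(0, 1))
    for user, rows in groupby(ordered, key=itemgetter(0)):
        rows = list(rows)
        encoded = "".join(SYMBOLS[row[2]] for row in rows)
        # One character per event, so match offsets index directly into rows.
        hits = [
            [row[2] for row in rows[m.start():m.end()]]
            for m in re.finditer(pattern, encoded)
        ]
        if hits:
            results[user] = hits
    return results

events = [
    ("u1", 1, "home"), ("u1", 2, "search"), ("u1", 3, "product"),
    ("u1", 4, "product"), ("u1", 5, "checkout"),
    ("u2", 1, "home"), ("u2", 2, "product"),
]
# A search, then one or more product views, then a checkout.
paths = match_paths(events, r"SP+C")
```

Here `paths` contains a single match for `u1` (search, product, product, checkout) and nothing for `u2`.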

Other not-wholly-new packages in the Aster Data Analytic Foundation announcement are for sessionization (of clickstream data and the like) and tokenization (of text/character string data). While sessionization can be done in SQL, Aster thinks its MapReduce-based version is faster, since it doesn’t require self-joins. Makes sense. Aster’s tokenization sounds lame, however – text analytics in MapReduce tends to reinvent simplistic wheels for no clear reason, and Aster doesn’t seem to be an exception. (Aster would argue, however, that anything it does in SQL-MapReduce is more flexible than pure SQL or pure MapReduce alternatives.)
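The no-self-join point can be illustrated with a small Python sketch. This is not Aster's SQL-MapReduce code. The `sessionize` function, the 30-minute timeout, and the sample clicks are assumptions. The idea is that once clicks are grouped by user and ordered by time, a single pass that remembers the previous timestamp is enough to assign session numbers.

```python
from itertools import groupby
from operator import itemgetter

def sessionize(clicks, timeout=1800):
    """Tag each (user, timestamp) click with a per-user session number.

    A new session starts when the gap since the user's previous click
    exceeds `timeout` seconds (an assumed 30-minute default).
    """
    out = []
    ordered = sorted(clicks, key=itemgetter(0, 1))
    for user, rows in groupby(ordered, key=itemgetter(0)):
        session, prev = 0, None
        for _, ts in rows:
            # Compare with the previous click only; no join is needed.
            if prev is not None and ts - prev > timeout:
                session += 1
            out.append((user, ts, session))
            prev = ts
    return out

clicks = [("u1", 0), ("u1", 600), ("u1", 4000), ("u2", 100)]
sessions = sessionize(clicks)
```

In the sample, the gap between `u1`'s second and third clicks exceeds the timeout, so the third click opens that user's second session.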

Another example of better-living-without-self-joins is Aster’s new market basket package. This lets you look at a set of point-of-sale data, pick a small integer N, and pull out all the sets of N things that were bought by the same person at the same time. I haven’t probed the claim in detail, but Aster implies there’s less combinatorial explosion in its approach than there is in the self-join alternative.
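As a rough picture of the operation described, and not of Aster's package, the following Python sketch enumerates the N-item sets inside each basket and counts how often each occurs. The `itemsets` function and the sample baskets are assumptions. The sketch says nothing about whether Aster's claim regarding combinatorial explosion holds.

```python
from collections import Counter
from itertools import combinations

def itemsets(baskets, n):
    """Count every n-item set that occurs together in a single basket.

    `baskets` maps a transaction id to the list of items bought in it.
    """
    counts = Counter()
    for items in baskets.values():
        # Sort and de-duplicate so each set has one canonical tuple form.
        for combo in combinations(sorted(set(items)), n):
            counts[combo] += 1
    return counts

baskets = {
    "t1": ["milk", "bread", "eggs"],
    "t2": ["milk", "bread"],
    "t3": ["eggs"],
}
pair_counts = itemsets(baskets, 2)
```

Each basket is processed on its own, so the work is one pass over the grouped data rather than N-way joins of the transaction table with itself.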

Note: Gartner highlighted self joins as a performance challenge in its recent Data Warehouse Magic Quadrant.

Aster is also releasing a few statistical and general analytic functions — specifically (and I quote a slide):

The point of the last two items on the list is that if you set a non-zero tolerance for error, you can count things or order them into bins very efficiently (especially in terms of RAM) while being guaranteed not to exceed your error tolerance.
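Which algorithms Aster uses for this is not stated here. As a generic illustration of trading a bounded error for memory, the sketch below uses the Misra-Gries frequency summary, a swapped-in textbook technique chosen as an assumption, not a description of Aster's functions. It keeps at most k - 1 counters no matter how many distinct items appear, and each estimate undercounts the true frequency by at most n / k, where n is the stream length.

```python
def misra_gries(stream, k):
    """Approximate item frequencies using at most k - 1 counters.

    Each returned estimate is no larger than the true count and
    undercounts it by at most len(stream) / k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those reaching zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

stream = ["a"] * 60 + ["b"] * 30 + list("cdefghijkl")
estimates = misra_gries(stream, k=4)
```

With k = 4 and 100 items, the estimates for the two frequent items stay within 25 of their true counts of 60 and 30, while only three counters are ever held in memory.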

Note: One obvious inference from this list — which Aster gladly confirms — is that Aster has high hopes of selling to the financial services industry.

Finally, Aster is releasing its first pure graph-analytic function, for finding the shortest path between a given pair of nodes.
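For readers who want the problem spelled out, here is a minimal single-machine Python sketch of shortest path between two nodes, using breadth-first search on an unweighted graph. Whether Aster's function handles weights, and how it parallelizes the search, is not something the announcement specifies. The `shortest_path` function and the sample graph are assumptions.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Return a fewest-hops path from source to target, or None.

    `graph` maps each node to a list of its neighbors (unweighted edges).
    """
    if source == target:
        return [source]
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                if neighbor == target:
                    # Walk parent links back to the source.
                    path = [neighbor]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
path = shortest_path(graph, "a", "e")
```

Breadth-first search visits nodes in order of hop count, so the first time the target is reached the path found is a shortest one.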

While I had the Aster folks on the phone anyway, I also took the opportunity to ask about the Aster nCluster 4.0 capability to create fairly persistent non-relational in-memory data structures. Specifically, I asked whether different users could access the same in-memory structure, and was told that this is a little klugey but not too horrendous. That suggests Aster’s capability may be a strict superset of UDF-based (User-Defined Function) approaches to meeting the same need, at least from a functionality standpoint. However, ease of creating those in-memory structures may still be better in the more SQL/UDF-centric approach favored by Teradata.

Comments

8 Responses to “Aster Data nCluster 4.5”

  1. February 2010 data warehouse DBMS news roundup | DBMS2 -- DataBase Management System Services on February 22nd, 2010 5:05 pm

    [...] Data nCluster 4.5. Much like Aster’s prior release — Aster Data nCluster 4.0 – Aster Data nCluster 4.5 has a major focus on integrating analytics and database processing. This time, the emphasis is on [...]

  2. Clarifying the state of MPP in-database SAS | DBMS2 -- DataBase Management System Services on May 7th, 2010 4:46 pm

    [...] I routinely am briefed way in advance of products’ introductions. For that reason and others, it can be hard for me to keep straight what’s been officially announced, introduced for test, introduced for general availability, vaguely planned for the indefinite future, and so on. Perhaps nothing has confused me more in that regard than the SAS Institute’s multi-year effort to get SAS integrated into various MPP DBMS, specifically Teradata, Netezza Twinfin(i), and Aster Data nCluster. [...]

  3. So can logistic regression be parallelized or not? | DBMS 2 : DataBase Management System Services on April 6th, 2011 5:04 am

    [...] the other hand, Aster Data said it had parallelized logistic regression a year ago. (Slides 6-7 from a mid-2010 Aster deck may be clearer.) I’m guessing Fuzzy Logix might make [...]

  4. Lots of Aster Data analytic packages | DBMS 2 : DataBase Management System Services on April 8th, 2011 12:00 am

    [...] start with Aster Data, which added to the list of analytic packages it previously announced, and kindly gave me permission to post a partial slide deck from the [...]

  5. TwinFin(i) – Netezza’s version of a parallel analytic platform | DBMS 2 : DataBase Management System Services on April 8th, 2011 12:02 am

    [...] like Aster Data did in Aster 4.0 and now Aster 4.5, Netezza is announcing a general parallel big data analytic platform strategy. It is called Netezza [...]

  6. Teradata’s future product strategy | DBMS 2 : DataBase Management System Services on September 24th, 2011 11:10 pm

    [...] Netezza or Aster, Teradata doesn’t seem to plan analytic capability that works outside the UDF (User Defined [...]

  7. mICHAEL rAMIREZ on November 10th, 2011 12:06 pm

    dO YOU HAVE A PRICE LIST FOR THE NC-PE-75TB
    PLEASE SEND ME THE PRICE LIST.
    THANKS,
    MIKE RAMIREZ
    PROCUREMENT DEPT
    CACI INC FEDERAL

  8. Entity-centric event series analytics | DBMS 2 : DataBase Management System Services on October 18th, 2013 4:29 am

    [...] number of my clients are focused on such scenarios, including WibiData, Teradata Aster (e.g. via nPath), Platfora (in the imminent Platfora 3), and others. And so I get involved in naming exercises. The [...]
