SAS Institute

Analysis of data mining powerhouse SAS, and especially the relationship between SAS’s data mining products and various database management systems. Related subjects include:

November 28, 2011

Agile predictive analytics – the heart of the matter

I’ve already suggested that several apparent issues in predictive analytic agility can be dismissed by straightforwardly applying best-of-breed technology, for example in analytic data management. At first blush, the same could be said about the actual analysis, which comprises:

Numerous statistical software vendors (or open source projects) help you with the second part; some make strong claims in the first area as well (e.g., my clients at KXEN). Even so, large enterprises typically have statistical silos, commonly featuring expensive annual SAS licenses and seemingly slow-moving SAS programmers.

As I see it, the predictive analytics workflow goes something like this: Read more

April 21, 2011

Application areas for SAS HPA

When I talked with SAS about its forthcoming in-memory parallel SAS HPA offering, we talked briefly about application areas. The three SAS cited were:

Meanwhile, in another interview I heard about, SAS emphasized retailers. Indeed, that’s what spawned my recent post about logistic regression.

The mobile communications one is a bit scary. Your cell phone — and hence your cellular company — know where you are, pretty much from moment to moment. Even without advanced analytic technology applied to it, that’s a pretty direct privacy threat. Throw in some analytics, and your cell company might know, for example, who you hang out with (in person), where you shop, and how those things predict your future behavior. And so the government — or just your employer — might know those things too.

April 21, 2011

In-memory, parallel, not-in-database SAS HPA does make sense after all

I talked with SAS about its new approach to parallel modeling. The two key points are:

The whole thing is called SAS HPA (High-Performance Analytics), in an obvious reference to HPC (High-Performance Computing). It will run initially on RAM-heavy appliances from Teradata and EMC Greenplum.

A lot of what’s going on here is that SAS found it annoyingly difficult to parallelize modeling within the framework of a massively parallel DBMS such as Teradata. Notes on that aspect include:

Read more

April 8, 2011

Revolution Analytics update

I wasn’t too impressed when I spoke with Revolution Analytics at the time of its relaunch last year. But a conversation Thursday evening was much clearer. And I even learned some cool stuff about general predictive modeling trends (see the bottom of this post).

Revolution Analytics business and business model highlights include:

Read more

April 6, 2011

So can logistic regression be parallelized or not?

A core point in SAS’ pitch for its new MPI (Message-Passing Interface) in-memory technology seems to be that logistic regression is really important, and that shared-nothing MPP doesn’t let you parallelize it. The Mahout/Hadoop folks also seem to despair of parallelizing logistic regression.

On the other hand, Aster Data said it had parallelized logistic regression a year ago. (Slides 6-7 from a mid-2010 Aster deck may be clearer.) I’m guessing Fuzzy Logix might make a similar claim, although I’m not really sure.

What gives?
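One way to make the question concrete is a minimal sketch, under my own assumptions and not taken from SAS, Aster Data, or Mahout. The log-loss gradient for logistic regression is a sum over rows, so each data partition can compute its share independently and the shares can then be added together. The catch is that the fit is iterative, so every step needs a global combine. The names `partial_gradient`, `combine`, and `fit`, the toy data, and the use of plain batch gradient descent are all illustrative choices.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def partial_gradient(rows, labels, weights):
    """Un-averaged log-loss gradient over one partition's rows."""
    grad = [0.0] * len(weights)
    for x, y in zip(rows, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for j, xi in enumerate(x):
            grad[j] += (p - y) * xi
    return grad

def combine(partials):
    """Sum per-partition gradients; stands in for a reduce/all-reduce step."""
    return [sum(col) for col in zip(*partials)]

def fit(partitions, n_features, steps=500, lr=0.1):
    """Batch gradient descent: each step is one local pass plus one combine."""
    n = sum(len(rows) for rows, _ in partitions)
    weights = [0.0] * n_features
    for _ in range(steps):
        partials = [partial_gradient(rows, labels, weights)
                    for rows, labels in partitions]
        grad = combine(partials)
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

# Hypothetical toy data; the first column is an intercept term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
y = [0, 0, 1, 0, 1, 1]

single = [(X, y)]                                           # everything on one node
split = [(X[:2], y[:2]), (X[2:4], y[2:4]), (X[4:], y[4:])]  # three partitions

w_single = fit(single, 2)
w_split = fit(split, 2)
```

Up to floating-point rounding, `w_split` matches `w_single`, because the partitioned gradients sum to the same total. What the sketch does not show is the cost of doing that combine once per iteration across many nodes, which is plausibly where the different architectures part ways.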

April 5, 2011

Comments on EMC Greenplum

I am annoyed with my former friends at Greenplum, who took umbrage at a brief sentence I wrote in October, namely “eBay has thrown out Greenplum”. Their reaction included:

The last one really hurt, because in trusting them, I put in quite a bit of effort, and discussed their promise with quite a few other people.

Read more

October 22, 2010

Notes and links October 22, 2010

A number of recent posts have had good comments. This time, I won’t call them out individually.

Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on “sensor data” …

… and, even better, went on to say: Read more

October 10, 2010

It can be hard to analyze analytics

When vendors talk about the integration of advanced analytics into database technology, confusion tends to ensue. For example: Read more

September 27, 2010

Further thoughts on previous posts

One thing I love about DBMS 2 is the really smart comments a number of readers — that would be you guys — make. However, not all the smart comments are made in the first 5 minutes a post is up, so some readers (unless you circle back) might miss great points other readers make. Well, here are some pointers to some of what you might have missed, along with other follow-up comments to old posts while I’m at it. Read more

September 20, 2010

Some thoughts on the announcement that IBM is buying Netezza

As you’ve probably read, IBM and Netezza announced a deal today for IBM to buy Netezza. I didn’t sit in on the conference call, but I’ve seen the reporting. Naturally, I have some quick thoughts, which I’ve broken up into several sections below:

Read more
