SAS Institute

Analysis of data mining powerhouse SAS, and especially the relationship between SAS’s data mining products and various database management systems. Related subjects include:

Statistics and predictive modeling
Business intelligence
(in The Monash Report) Data mining (older posts)
(in Text Technologies) SAS’ offerings in text mining

May 23, 2010

Various quick notes

As you might imagine, there are a lot of blog posts I’d like to write that I never seem to get around to, or things I’d like to comment on that I don’t want to bother ever writing a full post about. In some cases I just tweet a comment or link and leave it at that.

And it’s not going to get any better. Next week = the oft-postponed elder care trip. Then I’m back for a short week. Then I’m off on my quarterly visit to the SF area. Soon thereafter I’ll have a lot to do in connection with Enzee Universe. And at that point another month will have gone by.

Anyhow: Read more

Categories: Analytic technologies, Business intelligence, Data warehousing, Exadata, GIS and geospatial, Google, IBM and DB2, Netezza, Oracle, Parallelization, SAP AG, SAS Institute

3 Comments

May 15, 2010

Further clarifying in-database MPP SAS

My recent post about SAS’ MPP/in-database efforts was based on a discussion in a shared ride to the airport, and was correspondingly rough. SAS’ Shannon Heath was kind enough to write in with clarifications, and to allow me to post same. Read more

Categories: Aster Data, Netezza, Parallelization, Predictive modeling and advanced analytics, SAS Institute

4 Comments

May 7, 2010

Clarifying the state of MPP in-database SAS

I routinely am briefed way in advance of products’ introductions. For that reason and others, it can be hard for me to keep straight what’s been officially announced, introduced for test, introduced for general availability, vaguely planned for the indefinite future, and so on. Perhaps nothing has confused me more in that regard than the SAS Institute’s multi-year effort to get SAS integrated into various MPP DBMS, specifically Teradata, Netezza TwinFin(i), and Aster Data nCluster.

However, I chatted briefly Thursday with Michelle Wilkie, who is the SAS product manager overseeing all this (and also some other stuff, like SAS running on grids without being integrated into a DBMS). As best I understood, the story is: Read more

Categories: Aster Data, Data warehouse appliances, MapReduce, Netezza, Parallelization, Predictive modeling and advanced analytics, SAS Institute, Specific users, Teradata

11 Comments

May 4, 2010

Revolution Analytics seems very confused

Revolution Analytics is a relaunch of a company previously known as REvolution Computing, built around the open source R language. Last week they sent around email claiming they were a new company (false), and asking for briefings in connection with an embargo this morning. I talked to Revolution Analytics yesterday, and they told me the embargo had been moved to Thursday.* However, Revolution apparently neglected to tell the press the same thing, and there’s an article out today — quoting me, because I’d given quotes in line with the original embargo, before I’d had the briefing myself. And what’s all this botched timing about? Mainly, it seems to be for a “statement of direction” about software Revolution Analytics hasn’t actually developed yet.

*More precisely, they spoke as if the embargo had been Thursday all along.

Categories: Investment research and trading, Parallelization, Predictive modeling and advanced analytics, Revolution Analytics, SAS Institute

13 Comments

February 22, 2010

TwinFin(i) – Netezza’s version of a parallel analytic platform

Much like Aster Data did in Aster 4.0 and now Aster 4.5, Netezza is announcing a general parallel big data analytic platform strategy. It is called Netezza TwinFin(i), it is a chargeable option for the Netezza TwinFin appliance, and many announced details are on the vague side, with Netezza promising more clarity at or before its Enzee Universe conference in June. At a high level, the Aster and Netezza approaches compare/contrast as follows: Read more

Categories: Aster Data, Data warehouse appliances, Data warehousing, Hadoop, MapReduce, Netezza, Predictive modeling and advanced analytics, SAS Institute, Teradata

10 Comments

February 22, 2010

Aster Data nCluster 4.5

Like Vertica, Netezza, and Teradata, Aster is using this week to pre-announce a forthcoming product release, Aster Data nCluster 4.5. Aster is really hanging its identity on “Big Data Analytics” or some variant of that concept, and so the two major named parts of Aster nCluster 4.5 are:

Aster Data Analytic Foundation, a set of analytic packages prebuilt in Aster’s SQL-MapReduce
Aster Data Developer Express, an Eclipse-based IDE (Integrated Development Environment) for developing and testing applications built on Aster nCluster, Aster SQL-MapReduce, and Aster Data Analytic Foundation

And in other Aster news:

Along with the development GUI in Aster nCluster 4.5, there is also a new administrative GUI.
Aster has certified that nCluster works with Fusion I/O boards, because at least one retail industry prospect cares. However, that in no way means that arm’s-length Fusion I/O certification is Aster’s ultimate solid-state memory strategy.
I had the wrong impression about how far Aster/SAS integration has gotten. So far, it’s just at the connector level.

Aster Data Developer Express evidently does some cool stuff, like providing some sort of parallelism testing right on your desktop. It also generates lots of stub code, saving humans from the tedium of doing that. Useful, obviously.

But mainly, I want to write about the analytic packages. Read more

Categories: Aster Data, Data warehousing, Investment research and trading, Predictive modeling and advanced analytics, RDF and graphs, SAS Institute, Teradata

9 Comments

September 3, 2009

SAS on Netezza and other Netezza extensibility

I chatted with SAS CTO Keith Collins yesterday about the new SAS/Netezza in-database parallel data mining scoring offering. My impression is that this is very similar to SAS’ current Teradata support, notwithstanding SAS’ and Teradata’s apparent original intention of offering in-database modeling by now as well.

I gather this is a big performance-enhancing deal, just as it is for SPSS or Oracle’s own data mining over Oracle. However, I must confess to not yet understanding why. That is, I don’t know what’s so complicated about data mining scoring algorithms that makes hand-coding them in SQL particularly forbidding. My naive view of data mining is that you do a big regression to get a bunch of weights, and the resulting scoring algorithm is a linear combination of a few dozen variables. Evidently, that’s not quite right.
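To make that naive view concrete, here is a minimal sketch of scoring as a plain linear combination, along with the same formula rendered as a SQL statement. The weights, column names, and table name are all hypothetical, and this illustrates only the simple case described above, not how SAS or any DBMS actually implements scoring.

```python
from typing import Mapping


def linear_score(weights: Mapping[str, float], intercept: float,
                 row: Mapping[str, float]) -> float:
    """Score one record as intercept + sum(weight * value)."""
    return intercept + sum(w * row[name] for name, w in weights.items())


def scoring_sql(weights: Mapping[str, float], intercept: float,
                table: str, key: str) -> str:
    """Render the same linear combination as a SQL SELECT statement."""
    terms = " + ".join(f"{w!r} * {name}" for name, w in weights.items())
    return f"SELECT {key}, {intercept!r} + {terms} AS score FROM {table}"


# Hypothetical model: the weights, columns, and table are made up.
weights = {"tenure_months": 0.5, "monthly_spend": -0.25}
intercept = 1.0

# Score a single record in application code.
print(linear_score(weights, intercept,
                   {"tenure_months": 4.0, "monthly_spend": 8.0}))

# Generate SQL that would compute the same score inside the database.
print(scoring_sql(weights, intercept, table="customers", key="customer_id"))
```

If scoring really were this simple, generating the SQL by hand would be trivial. Presumably the difficulty lies in whatever this sketch leaves out.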

Anyhow, it turns out that SAS held off on this work until it could be done for TwinFin. That’s largely because TwinFin lets partners write code on Intel CPUs, while previously they had to write in C for Netezza’s FPGAs. I got a similar sense from at least one other Netezza partner as well.

Categories: Data warehouse appliances, Data warehousing, Netezza, Predictive modeling and advanced analytics, SAS Institute

5 Comments

August 2, 2009

Teradata 13 focuses on advanced analytic performance

Last October I wrote about the Teradata 13 release of Teradata’s database management software. Teradata 13, which will be used across the various Teradata product lines, has now been announced for GCA (General Customer Availability)*. So far as I can tell, there were two main points of emphasis for Teradata 13:

Performance (of course, performance is a point of emphasis for almost any release of any analytic DBMS product), especially but not only in the areas of aggregates, ETL (Extract/Transform/Load), and UDFs.
UDFs (User Defined Functions), especially but not only in the areas of data mining and geospatial analysis.

To put it even more concisely, the focus of Teradata 13 is on advanced analytic performance, although there of course are some enhancements in simple query performance and in analytic functionality as well. Read more

Categories: Analytic technologies, Data types, Data warehouse appliances, Data warehousing, EAI, EII, ETL, ELT, ETLT, GIS and geospatial, Parallelization, SAS Institute, Teradata, Theory and architecture

6 Comments

July 28, 2009

Initial reactions to IBM acquiring SPSS

IBM is acquiring SPSS. My initial thoughts (questions by Eric Lai of Computerworld) include:

1) good buy for IBM? why or why not?

Yes. The integration of predictive analytics with other analytic or operational technologies is still ahead of us, so there was a lot of value to be gained from SPSS beyond what it had standalone. (That said, I haven’t actually looked at the numbers, so I have no comment on the price.)

By the way, SPSS coined the phrase “predictive analytics”, with the rest of the industry then coming around to use it. As with all successful marketing phrases, it’s somewhat misleading, in that it’s not wholly focused on prediction.

2) how does it position IBM vs. competitors?

IBM’s ownership immediately makes SPSS a stronger competitor to SAS. Any advantage to the rest of IBM depends on the integration roadmap and execution.

3) How does this particularly affect SAP and SAS and Oracle, IBM’s closest competitors by revenue according to IDC’s figures?

If one of Oracle or SAP had bought SPSS, it would have given them a competitive advantage against the other, in the integration of predictive analytics with packaged operational apps. That’s a missed opportunity for each.

One notable point is that SPSS is more SQL-oriented than SAS. Thus, SPSS has gotten performance benefits from Oracle’s in-database data mining technology that SAS apparently hasn’t.

IBM’s done a good job of keeping its acquired products working well with Oracle and other competitive DBMS in the past, and SPSS will surely be no exception.

Obviously, if IBM does a good job of Cognos/SPSS integration, that’s bad for competitors, starting with Oracle and SAP/Business Objects. So far business intelligence/predictive analytics integration has been pretty minor, because nobody’s figured out how to do it right, but some day that will change. Hmm — I feel another “Future of … ” post coming on.

4) Do you predict further M&A?

Always. 🙂

Related links

Official word from SPSS and IBM
Blog posts from Larry Dignan and James Taylor
James Kobielus’s post, which includes the obvious point that Oracle — unlike SAP — has pretty decent data mining of its own
Eric Lai’s actual article

Categories: Analytic technologies, Cognos, IBM and DB2, Oracle, SAP AG, SAS Institute

8 Comments

March 23, 2009

SAS in its own cloud

The Register has a fairly detailed article about SAS expanding its cloud/SaaS offerings. I disagree with one part, namely:

SAS may not have a choice but to build its own cloud. Given the sensitive nature of the data its customers analyze, moving that data out to a public cloud such as the Amazon EC2 and S3 combo is just not going to happen.

And even if rugged security could make customers comfortable with that idea, moving large data sets into clouds (as Sun Microsystems discovered with the Sun Grid) is problematic. Even if you can parallelize the uploads of large data sets, it takes time.

But if you run the applications locally in the SAS cloud, then doing further analysis on that data is no big deal. It’s all on the same SAN anyway, locked down locally just as you would do in your own data center.

I fail to see why SAS’s campus would be better than leading hosting companies’ data centers for either data privacy/security or data upload speed. Rather, I think major reasons for SAS building its own data center for cloud computing probably focus on: Read more

Categories: SAS Institute, Software as a Service (SaaS)

15 Comments
