April 8, 2011

Revolution Analytics update

I wasn’t too impressed when I spoke with Revolution Analytics at the time of its relaunch last year. But a conversation Thursday evening was much clearer. And I even learned some cool stuff about general predictive modeling trends (see the bottom of this post).

Revolution Analytics business and business model highlights include:

Revolution Analytics is an open-core vendor built around the R language. That is, Revolution Analytics offers proprietary code and support, with subscription pricing, that help in the use of open source software.
Unlike most open-core vendors I can think of, Revolution Analytics takes little responsibility for the actual open source part. Some “grants” for developing certain open source R pieces seem to be the main exception. While this has caused some hard feelings, I don’t have an accurate sense of their scope or severity.
Revolution Analytics also sells a single-user/workstation version of its product, freely admitting that this is mainly a lead generation strategy or, in my lingo, a “break-even leader.”
Revolution Analytics boasts around 100 customers, split about 70-30 between the workstation seeding stuff and the real server product.
Revolution Analytics has “about” 37 employees. Headquarters are at 101 University Avenue (do I have to say in what city? 🙂). There is also a development office in Seattle and a sales office in New York.
Revolution Analytics’ pricing is by size of server. “Small” servers — i.e. up to 12 cores — start at $25K/year.
Unsurprisingly, adoption is more alongside SAS et al. than rip-and-replace.

Revolution Analytics’ top market sector by far appears to be financial services, both in trading/investment banks/hedge funds and in credit cards/risk analysis. Pharma/life sciences is second, but sales cycles are slow. There’s also been at least a little activity each in a variety of internet/media/entertainment/gaming/telecom sectors.

When I asked Revolution Analytics why one would use R rather than, say, SAS, Revolution cited three reasons that seemed to be driving customer interest:

You can do more with R. That may be debatable, but what’s harder to dispute is that there are a bunch of things you can do straightforwardly in R and its thousands of routines that would at best be more difficult in SAS.
Students today are learning R, so you have access to (affordable?) talent. That’s pretty clearly correct, although I do note SPSS’ long history of academic social sciences use.
R is cheaper. It’s hard to argue with that one. 🙂

Revolution Analytics’ parallelized-R story starts something like this:

Although R is generally thought of as requiring all data to be in RAM, Revolution also offers external memory algorithms. (“External memory algorithms” seems to be the discipline-standard way of saying “Not all data has to be in RAM.”)
In principle, Revolution is willing to parallelize external memory algorithms for you any which way — MapReduce, MPI (Message Passing Interface), and more.
Revolution parallelized for multi-core last fall. Multi-server scale-out is coming this summer.
Revolution is working on Netezza support. Revolution expects to use nzMatrix in the effort.
Yes, logistic regression is one of the algorithms Revolution parallelizes.
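
To make the "not all data has to be in RAM" idea concrete, here is a minimal Python sketch of an external memory computation. It is not Revolution's code or API; the function names, the chunk size, and the choice of mean and variance as the statistic are all assumptions made for illustration. The point is the shape of the algorithm: each chunk is reduced to a small summary, and summaries are combined by addition, which is also what makes the work easy to spread across cores or servers.

```python
from typing import Iterable, Iterator, List, Tuple

Summary = Tuple[int, float, float]  # (count, sum, sum of squares)


def read_in_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive chunks so that only one chunk is in memory at a time."""
    chunk: List[float] = []
    for v in values:
        chunk.append(v)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def chunk_summary(chunk: List[float]) -> Summary:
    """Per-chunk work. This part could run on any core or server."""
    return len(chunk), float(sum(chunk)), float(sum(x * x for x in chunk))


def combine(a: Summary, b: Summary) -> Summary:
    """Summaries merge by plain addition, in any order."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]


def mean_and_variance(values: Iterable[float], chunk_size: int = 1000) -> Tuple[float, float]:
    """One pass over the data, holding only a chunk plus three running totals.

    Note: the sum-of-squares formula is simple but can lose precision on
    badly scaled data; it is used here only to keep the sketch short.
    """
    total: Summary = (0, 0.0, 0.0)
    for chunk in read_in_chunks(values, chunk_size):
        total = combine(total, chunk_summary(chunk))
    n, s, ss = total
    if n < 2:
        raise ValueError("need at least two values for a sample variance")
    mean = s / n
    variance = (ss - n * mean * mean) / (n - 1)
    return mean, variance
```

In a real setting `values` would be a generator reading rows from disk or a database rather than an in-memory sequence, and the `chunk_summary` calls would be farmed out to workers.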

Like Netezza with nzMatrix or Greenplum (now EMC) with its sparse vector routine, Revolution has some useful underpinnings to help with parallelization/scale-out as well. The main one seems to be a variance/covariance matrix, which can be arbitrarily large and can be computed in a very distributed way. Revolution notes that you can use this not just on data but also, for example, on parameters.
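
One way to see why a variance/covariance matrix can be computed "in a very distributed way" is that it depends on the data only through a row count, the column sums, and the matrix of cross-products, all of which add up across partitions. The Python sketch below illustrates that standard identity. It is an assumption about the general technique, not a description of Revolution's implementation, and every name in it is hypothetical.

```python
from functools import reduce
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Stats = Tuple[int, List[float], Matrix]  # (rows, column sums, cross-products)


def partial_stats(rows: Sequence[Sequence[float]]) -> Stats:
    """Work done independently on one partition of the data."""
    p = len(rows[0])
    sums = [0.0] * p
    cross = [[0.0] * p for _ in range(p)]
    for row in rows:
        for i in range(p):
            sums[i] += row[i]
            for j in range(p):
                cross[i][j] += row[i] * row[j]
    return len(rows), sums, cross


def merge(a: Stats, b: Stats) -> Stats:
    """Combine two partitions' statistics by element-wise addition."""
    p = len(a[1])
    sums = [a[1][i] + b[1][i] for i in range(p)]
    cross = [[a[2][i][j] + b[2][i][j] for j in range(p)] for i in range(p)]
    return a[0] + b[0], sums, cross


def covariance(stats: Stats) -> Matrix:
    """Turn the merged statistics into a sample covariance matrix."""
    n, sums, cross = stats
    p = len(sums)
    means = [s / n for s in sums]
    return [
        [(cross[i][j] - n * means[i] * means[j]) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def distributed_covariance(partitions: Sequence[Sequence[Sequence[float]]]) -> Matrix:
    """Each partition could live on a different core or server."""
    return covariance(reduce(merge, (partial_stats(part) for part in partitions)))
```

Only the small per-partition statistics travel between workers, never the rows themselves, so the size of the data set does not limit the computation.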

One analytic approach — if not meta-approach — that Revolution sees as hot is ensemble learning. Specifically mentioned was Max Kuhn’s caret package, which evidently automates ensemble techniques. Also specifically mentioned was the Netflix Prize, which I gather was won by an ensemble approach. The idea behind ensemble techniques is that, rather than pick a particular kind of model, you throw a bunch against the wall. The first benefit is that you get to see what works best. The second benefit is that you can combine results and hopefully outperform any one of the models.
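
The two benefits described above can be sketched in a few lines of Python. This is a toy illustration, not the caret package and not the Netflix Prize method: the three model types, the function names, and the unweighted averaging are all assumptions chosen to keep the example small. Real ensembles typically use many more models and a more careful way of weighting them.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

Model = Callable[[float], float]


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Baseline: always predict the average of the training targets."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Ordinary least squares with a single predictor."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """One-nearest-neighbor: return the target of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


FITTERS: Dict[str, Callable[[Sequence[float], Sequence[float]], Model]] = {
    "mean": fit_mean,
    "line": fit_line,
    "nearest": fit_nearest,
}


def mse(model: Model, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mean squared error, used to see which model works best."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def fit_ensemble(xs: Sequence[float], ys: Sequence[float]) -> Tuple[Dict[str, Model], Model]:
    """Fit every model type, then build a combined predictor.

    The combination here is an unweighted average of member predictions.
    """
    models = {name: fit(xs, ys) for name, fit in FITTERS.items()}

    def predict(x: float) -> float:
        return mean(m(x) for m in models.values())

    return models, predict
```

Calling `mse` on held-out data for each entry in `models` gives the "see what works best" comparison, while `predict` is the combined model. Note that every member has to be fitted, so the cost grows with the number of model types tried.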

Obviously, ensemble techniques can require vastly more performance than just running a single model. I wouldn’t be surprised if, going forward, they turned out to be one of analytics’ biggest performance challenges.

Categories: Health care, Investment research and trading, Open source, Parallelization, Predictive modeling and advanced analytics, Pricing, Revolution Analytics, SAS Institute


Comments

2 Responses to “Revolution Analytics update”

Ajay Ohri on April 8th, 2011 11:44 am

R basically has 2396 packages (http://cran.r-project.org/web/packages/). SAS admits R is more extensive in terms of statistical functionality and offers extensions from JMP and SAS/IML, and there is a SAS language clone, WPS, with a Bridge to R extension.
As hardware expands from PC/server to clouds with on-demand resources, SAS’s main advantage of faster processing of data disappears. However, it has much more maturity and size in business intelligence.
Students prefer learning SAS to learning R, though this is changing with newer R GUIs. One reason is that the job market gives a premium to SAS skills. Even SPSS skills are preferred by career-minded students.
The innovation in R in graphics, data analysis, and GUIs is coming from the community. 2011 was the year Revolution was going to launch its own GUI (it currently has an IDE).
There is inherent tension between Revolution contributing 6-10 packages and claiming credit and sales for the remaining 2380 packages.
Given the cost savings from a correct analytics solution, price is not a reason for SAS to worry about bottom feeders.
High Performance Analytics « DECISION STATS on April 22nd, 2011 2:32 pm

[…] Revolution Analytics update (dbms2.com) […]
