May 18, 2009

Followup on IBM System S/InfoSphere Streams

After posting about IBM’s System S/InfoSphere Streams CEP offering, I sent three followup questions over to Jeff Jones.  It seems simplest to just post the Q&A verbatim.

1.  Just how many processors or cores does it take to get those 5 million messages/sec through? A little birdie says 4,000 cores.

The TD Securities First of a Kind (FOAK) project ran on 356 nodes of a Blue Gene/P supercomputer. The Blue Gene/P may have one or more racks. Each rack consists of 1,024 nodes, each with four Power4 processor cores. The Power4 core was selected as the optimal choice for power consumption and performance, rather than the latest generation of Power chips (Power6). So, the 5 million messages per second were handled on 1,424 Power4 processor cores. This extra information is important so that customers understand the environment, and don’t jump to the conclusion that the latest generation of processor cores (e.g. Power6, or Intel Nehalem, for example) are required to gain this type of performance.
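As a quick check on the arithmetic in that answer, here is a minimal sketch that recomputes the core count and the implied average throughput from the quoted figures (356 nodes, four cores per node, 5 million messages/sec). The variable names are mine, and the per-core and per-node numbers are simple averages that assume the load was spread evenly, which the answer does not actually say.

```python
# Figures quoted in the answer above.
nodes = 356
cores_per_node = 4
messages_per_sec = 5_000_000

# Derived values; the averages assume an even spread of load (an assumption).
total_cores = nodes * cores_per_node
per_core = messages_per_sec / total_cores
per_node = messages_per_sec / nodes

print(f"total cores:       {total_cores}")
print(f"messages/sec/core: {per_core:.0f}")
print(f"messages/sec/node: {per_node:.0f}")
```

That works out to 1,424 cores, matching the answer, and to roughly 3,500 messages/sec per core, or about 14,000 per four-core node.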

2.  Did the NSA really pay you for Streams, or did you throw your own money into the development, say $35 million?

The US Government has been working with IBM Research since 2003 on a radical new approach to data analysis that enables high speed, scalable and complex analytics of heterogeneous data streams in motion. We will not comment on the particular part of the US Government that we have worked with. Yes, the US Government has a paid contract with IBM Research for this work. IBM will not comment on the amount paid to develop Streams, although we have stated that we have had 40 to 70 people on the team over the last 6 years.

3.  Does Streams have much in the way of the decisioning logic that financial services outfits like for applications like order routing, trade matching, algo trading, etc.?

Streams, via the SPADE programming language, has many commands that simplify creating financial applications. For example,

  • Filtering (on things like stock tick)
  • Split and Join on streams
  • Aggregate to perform sliding window operations based on several parameters like time, count, custom window, etc
  • Math functions such as Min, Max, Avg, Sum, Any, First, Last, etc
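To make the list above concrete, here is a rough sketch of a filter feeding a count-based sliding-window aggregate. It is plain Python, not SPADE, and it says nothing about how Streams implements these operators; the function names, the tick layout, and the sample prices are all made up for illustration.

```python
from collections import deque


def filter_symbol(ticks, symbol):
    """Pass through only the ticks for one symbol (the filtering step)."""
    for tick in ticks:
        if tick["symbol"] == symbol:
            yield tick


def sliding_aggregate(ticks, window_size):
    """Yield (min, max, avg) of price over the last `window_size` ticks."""
    window = deque(maxlen=window_size)  # oldest price drops out automatically
    for tick in ticks:
        window.append(tick["price"])
        yield min(window), max(window), sum(window) / len(window)


# Hypothetical sample data.
ticks = [
    {"symbol": "IBM", "price": 100.0},
    {"symbol": "XYZ", "price": 50.0},
    {"symbol": "IBM", "price": 102.0},
    {"symbol": "IBM", "price": 101.0},
    {"symbol": "IBM", "price": 105.0},
]

results = list(sliding_aggregate(filter_symbol(ticks, "IBM"), window_size=3))
for low, high, avg in results:
    print(f"min={low} max={high} avg={avg:.2f}")
```

A time-based window would evict by timestamp rather than by count, but the shape of the computation is the same.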

We’ve also been working with customers and prospects to add User Defined Operators to calculate ‘greeks’ such as Volume Weighted Average Price (VWAP), Daily VWAP, Price to Earnings Ratio, Dividend Yield, Year to Date Return, etc.
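For readers who have not met VWAP before, it is the sum of price times volume divided by total volume. A minimal sketch of a running version follows, again in plain Python rather than as a Streams User Defined Operator; the function name and the sample trades are hypothetical.

```python
def running_vwap(trades):
    """Yield the VWAP so far after each (price, volume) trade."""
    total_value = 0.0
    total_volume = 0
    for price, volume in trades:
        if volume <= 0:
            continue  # ignore trades with no volume to avoid dividing by zero
        total_value += price * volume
        total_volume += volume
        yield total_value / total_volume


# Hypothetical trades as (price, volume) pairs.
trades = [(10.0, 100), (11.0, 300), (12.0, 100)]
vwaps = list(running_vwap(trades))
print(vwaps)
```

A daily VWAP would simply reset the two running totals at the start of each trading day.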

Comments

7 Responses to “Followup on IBM System S/InfoSphere Streams”

  1. Marc on May 18th, 2009 5:37 pm

    Curt,

    Did you tell Jeff that VWAP is not a greek?

    -marc

  2. Curt Monash on May 18th, 2009 5:44 pm

    No, because I didn’t know any better. I’d never heard the term before in this context myself, even if I know what alpha and beta are (but not, frankly, gamma et al., at least not w/o rederiving the concepts myself … or looking them up).

  3. Robert Young on May 18th, 2009 6:57 pm

    >> Does Streams have much in the way of the decisioning logic that financial services outfits like for applications like order routing, trade matching, algo trading, etc.?

    Let me guess? If the next AIG gets its hands on one of these machines, the world economy goes down the tubes 100 times faster. Sometimes change is not progress, just change.

  4. Marco Seiriö on May 19th, 2009 1:40 am

    An announcement like this does actually have more wow factor to it than any practical value. How many % of today’s data processing customers will actually need this, think it’s worth the $$ and at the same time have the competence to appreciate it?

    Not many if you ask me. It seems to be a good fit for a very few and very specialized use cases. So one might wonder why make so much public noise about a niche product?

    But it’s cool that the CEP label is so attractive that companies start to release products under it.

  5. rc on May 19th, 2009 3:32 am

    @marc

    So what is a greek?

    RC

  6. Notes on CEP performance | DBMS2 -- DataBase Management System Services on May 21st, 2009 4:20 am

    […] IBM just disclosed >15,000 messages/core on a pretty low-powered processor. […]

  7. Hans on May 21st, 2009 6:16 pm
