After posting about IBM’s System S/InfoSphere Streams CEP offering, I sent three follow-up questions to Jeff Jones. It seems simplest to just post the Q&A verbatim.
1. Just how many processors or cores does it take to get those 5 million messages/sec through? A little birdie says 4,000 cores.
The TD Securities First of a Kind (FOAK) project ran on 356 nodes of a Blue Gene/P supercomputer. The Blue Gene/P may have one or more racks. Each rack consists of 1,024 nodes, each with four Power4 processor cores. The Power4 core was selected as the optimal choice for power consumption and performance, rather than the latest generation of Power chips (Power6). So, the 5 million messages per second were handled on 1,424 Power4 processor cores. This extra information is important so that customers understand the environment, and don’t jump to the conclusion that the latest generation of processor cores (e.g. Power6, or Intel Nehalem, for example) are required to gain this type of performance.
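As a quick check on those figures (my arithmetic, not IBM's), the node and core counts in the answer are consistent with each other, and they imply a per-core rate of roughly 3,500 messages per second. This assumes the load was spread evenly across all cores, which the answer does not actually claim.

```python
# Figures quoted in the answer above.
nodes = 356
cores_per_node = 4
messages_per_sec = 5_000_000

total_cores = nodes * cores_per_node

# Assumption: messages are spread evenly across cores.
per_core = messages_per_sec / total_cores

print(total_cores)      # 1424
print(round(per_core))  # 3511
```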
2. Did the NSA really pay you for Streams, or did you throw your own money into the development, say $35 million?
The US Government has been working with IBM Research since 2003 on a radical new approach to data analysis that enables high speed, scalable and complex analytics of heterogeneous data streams in motion. We will not comment on the particular part of the US Government that we have worked with. Yes, the US Government has a paid contract with IBM Research for this work. IBM will not comment on the amount paid to develop Streams, although we have stated that we have had 40 to 70 people on the team over the last 6 years.
3. Does Streams have much in the way of the decisioning logic that financial services outfits like for applications like order routing, trade matching, algo trading, etc.?
Streams, via the SPADE programming language, has many commands that simplify creating financial applications. For example,
- Filtering (on things like stock tick)
- Split and Join on streams
- Aggregate to perform sliding window operations based on several parameters like time, count, custom window, etc
- Math functions such as Min, Max, Avg, Sum, Any, First, Last, etc
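To make the filter and sliding-window items concrete, here is a minimal sketch in Python. It is not SPADE and not IBM's code; the function names, the tick layout, and the sample data are all hypothetical, and the window here is count-based only.

```python
from collections import deque

def filter_symbol(ticks, symbol):
    """Keep only the ticks for one symbol (hypothetical tick layout)."""
    return (t for t in ticks if t["symbol"] == symbol)

def sliding_aggregate(prices, size):
    """Yield (min, max, avg) over a count-based sliding window of `size` prices."""
    window = deque(maxlen=size)  # oldest price drops out automatically
    for p in prices:
        window.append(p)
        if len(window) == size:
            yield min(window), max(window), sum(window) / size

# Made-up sample data, for illustration only.
ticks = [
    {"symbol": "IBM", "price": 100.0},
    {"symbol": "TD", "price": 50.0},
    {"symbol": "IBM", "price": 102.0},
    {"symbol": "IBM", "price": 101.0},
    {"symbol": "IBM", "price": 104.0},
]

ibm_prices = (t["price"] for t in filter_symbol(ticks, "IBM"))
results = list(sliding_aggregate(ibm_prices, 3))
print(results[0])  # (100.0, 102.0, 101.0)
```

A time-based or custom window would need a different eviction rule than `maxlen`, but the shape of the operator stays the same.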
We’ve also been working with customers and prospects to add User Defined Operators to calculate ‘greeks’ such as Volume Weighted Average Price (VWAP), Daily VWAP, Price to Earnings Ratio, Dividend Yield, Year to Date Return, etc.
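For reference, VWAP is the sum of price times volume divided by total volume. The sketch below is plain Python, not a Streams User Defined Operator; the `vwap` function and the sample trades are hypothetical and only illustrate the calculation.

```python
def vwap(trades):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    total_volume = sum(volume for _, volume in trades)
    if total_volume == 0:
        raise ValueError("VWAP is undefined with zero total volume")
    return sum(price * volume for price, volume in trades) / total_volume

# Made-up (price, volume) pairs, for illustration only.
trades = [(10.0, 100), (11.0, 300), (12.0, 100)]
result = vwap(trades)
print(result)  # 11.0
```

A daily VWAP would apply the same formula to all trades since the session open rather than to a fixed list.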