Analysis of complex event/stream processing vendor StreamBase.
I’ve been talking to CEP vendors on and off for a few years. So what I hear about performance is fairly patchwork. On the other hand, maybe 1-2+ year-old figures of per-core performance are still meaningful today. After all, Moore’s Law is being reflected more in core count than per-core performance, and it seems CEP vendors’ development efforts haven’t necessarily been concentrated on raw engine speed.
So anyway, what do you guys have to add to the following observations?
- Super-low-latency financial services industry tasks are often “embarrassingly parallel.” Thus, near-linear scale-out is common.
- That said, good parallelism seems fairly new in CEP engines (of course, CEP engines are fairly new themselves — for all I know, some have been parallel since inception).
- I’ve heard claims of up to 400,000 messages/second/core for simple queries or patterns.
- I’ve heard claims of 70,000 messages/second/core for not-so-simple queries or patterns, and probably higher than that depending on what the meaning of “simple” is.
- IBM just disclosed >15,000 messages/second/core on a pretty low-powered processor.
- I’ve heard that Coral8, Apama, and StreamBase rarely lost deals due to performance or throughput problems. I’ve heard that the same is not as true of Aleri.
- StreamBase proudly says it’s been fully multithreaded since academic research-project days. For Apama, multithreading is evidently a more recent feature. But does it matter much?
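To make the scale-out arithmetic concrete, here is a rough sketch of how many cores each of the quoted per-core rates would imply for a given feed. It treats all three figures as messages/second/core and assumes near-linear scale-out; the target rate and the `efficiency` factor are hypothetical values chosen for illustration, not vendor numbers.

```python
import math

# Per-core throughput claims quoted above, all treated as messages/second/core
# (an assumption for any figure that was quoted without a time unit).
CLAIMED_RATES = {
    "simple queries": 400_000,
    "not-so-simple queries": 70_000,
    "IBM, low-powered processor": 15_000,
}


def cores_needed(target_msgs_per_sec, per_core_rate, efficiency=1.0):
    """Cores required if throughput scales (nearly) linearly with core count.

    efficiency is a hypothetical fudge factor: 1.0 means perfectly linear
    scale-out, 0.8 means each added core delivers 80% of its nominal rate.
    """
    return math.ceil(target_msgs_per_sec / (per_core_rate * efficiency))


target = 1_000_000  # hypothetical feed rate in messages/second
for label, rate in CLAIMED_RATES.items():
    print(f"{label}: {cores_needed(target, rate)} cores")
```

Under those assumptions the gap between the highest and lowest claims is the difference between a handful of cores and several dozen, which is why the meaning of “simple” matters so much.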
|Categories: Aleri and Coral8, IBM and DB2, Memory-centric data management, Progress, Apama, and DataDirect, StreamBase, Streaming and complex event processing (CEP)||13 Comments|
Independent CEP (Complex/Event Processing) vendors continue to flounder, at least outside the financial services and national intelligence markets.
- StreamBase once planned to conquer the world, making an impact as big as database management’s. Now it has retreated into niche markets.
- Progress Software, a decent-sized company, put a large fraction of its energy into Apama. Little has happened outside the financial services sector.
- Coral8 has some great-sounding ideas. But Coral8 now has merged into Aleri, basically a financial-markets specialist.
- Mike Franklin says some ambitious things on behalf of Truviso, but I haven’t noticed much traction there either.
CEP’s penetration outside of its classical markets isn’t quite zero. Customers include several transportation companies (various vendors), Sallie Mae (Coral8), a game vendor or two (StreamBase, if I recall correctly), Verizon (Aleri, I think), and more. But I just wrote that list from memory — based mainly on not-so-recent deals — and a quick tour of the vendors’ web sites hasn’t turned up much I overlooked. (Truviso does have a recent deal with Technorati, but that’s not exactly a blue chip customer these days.)
So far as I can tell, this is a new version of a repeated story. Read more
|Categories: Aleri and Coral8, Analytic technologies, Business intelligence, Progress, Apama, and DataDirect, StreamBase, Streaming and complex event processing (CEP), Truviso||12 Comments|
There’s a lot of agitation today because Twitter broke under the message volume generated during Steve Jobs’ Macworld keynote. I don’t know what that volume was, but I just checked the lower volume of tweets (i.e., updates) going through the “public timeline” (i.e., everything) twice, and both times it was under 200 messages per minute. So, let’s say there’s a much higher volume at peak times, and also hypothesize that Twitter would like to grow a lot, and say that Twitter would like to handle 10,000-100,000 messages/minute (i.e., 1,000+/second at the top of that range) as soon as possible.
That’s easy using CEP (Complex Event Processing). A Twitter update is just a string of 140 or fewer characters. It is associated with three pieces of metadata – author, time, and mode of posting. It should be visible in real time to any of the author’s “followers,” as well as in a single public timeline; perhaps there will be other kinds of Twitter channels in the future. In most cases, these updates are only visible to a user upon page refresh. Almost n
No Twitter user seems to have more than about 7,000 followers, even Robert Scoble or Evan Williams.* The average number of followers, at least among active updaters, is probably in the low hundreds now. So basically, this is all a heckuva lot easier than the tick-monitoring systems Wall Street firms are using today.
*I believe there’s a hard cap of 7,500, but nobody seems to have bumped against it yet. Twitterholic gives a different figure than Twitter does for Scoble. And it correctly shows Dave Troy with a little over 10,000.
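A back-of-the-envelope sketch of the volumes involved may help. It converts updates per minute to updates per second and, assuming every update is pushed once to the public timeline and once to each follower, estimates deliveries per second. The push model and the average of 300 followers are illustrative assumptions (the latter standing in for “low hundreds”), not measured figures.

```python
def per_second(messages_per_minute):
    """Convert a per-minute message rate to a per-second rate."""
    return messages_per_minute / 60


def fanout_deliveries_per_second(updates_per_minute, avg_followers):
    """Deliveries/second if each update goes to every follower plus the
    public timeline. The push-to-every-follower model is an assumption."""
    return per_second(updates_per_minute) * (avg_followers + 1)


observed = 200        # updates/minute seen on the public timeline
target = 100_000      # updates/minute, top of the hypothesized range
avg_followers = 300   # hypothetical average, standing in for "low hundreds"

print(f"observed: {per_second(observed):.1f} updates/second")
print(f"target:   {per_second(target):.0f} updates/second")
print(f"fan-out:  {fanout_deliveries_per_second(target, avg_followers):,.0f} deliveries/second")
```

Even with fan-out included, the totals under these assumptions stay within the range of per-core throughput figures that CEP vendors quote for simple patterns.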
Here’s how to implement that. Read more
|Categories: Aleri and Coral8, Memory-centric data management, StreamBase, Streaming and complex event processing (CEP)||12 Comments|
I’m getting a flood of press releases today, because many of the companies I write about were selected to Intelligent Enterprise’s list of 12 most influential vendors plus 36 more to watch in the areas Intelligent Enterprise covers (which seems to be pretty much the analytics-related parts of what I write about here and on Text Technologies). It looks like a pretty reasonable list, although I think they forced the issue in some of the small analytics vendors they selected, and of course anybody can quibble with some of the omissions.
Among the companies they cited, you can find topical categories here for IBM (and Cognos), Informatica, Microsoft, Netezza, Oracle, SAP/Business Objects (both), SAS, and Teradata; QlikTech; Cast Iron, Coral8, DATAllegro, HP, ParAccel, and StreamBase; and Software AG. On Text Technologies you’ll find categories for some of the same vendors, plus Attensity, Clarabridge, and Google. There also are categories for some of these vendors on the Monash Report.
The highest-profile applications for complex event/stream processing are probably the ones that require super-low latency, especially in financial trading. However, as I already noted in writing about StreamBase and Truviso, there are plenty of other CEP apps with less extreme latency requirements.
Commonly, these are data reduction apps – i.e., there’s a gushing stream of inputs, and the CEP engine filters and “enhances” it, so that only a small, modified subset is sent forward. In other cases, disk-based systems could do the job perfectly well from a performance standpoint, but the pattern matching and filtering requirements are just a better fit for the CEP paradigm.
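As a rough illustration of the data reduction pattern, the sketch below filters an input stream and “enhances” the surviving events with reference data, so that only a small, modified subset moves forward. It is plain Python rather than any vendor’s CEP language, and every name in it (`reduce_stream`, the event fields, the threshold) is hypothetical.

```python
from typing import Dict, Iterable, Iterator


def reduce_stream(events: Iterable[Dict], threshold: float,
                  reference: Dict[str, str]) -> Iterator[Dict]:
    """Filter a stream and enrich whatever passes the filter."""
    for event in events:
        # Filter: drop everything below the threshold.
        if event["value"] < threshold:
            continue
        # Enhance: copy the event and attach reference data to the copy.
        enriched = dict(event)
        enriched["label"] = reference.get(event["source"], "unknown")
        yield enriched


readings = [
    {"source": "sensor-a", "value": 3.0},
    {"source": "sensor-b", "value": 42.0},
    {"source": "sensor-c", "value": 57.5},
]
labels = {"sensor-b": "loading dock"}

for out in reduce_stream(readings, threshold=40.0, reference=labels):
    print(out)
```

Because the function is a generator, events are handled one at a time as they arrive rather than being collected first, which is the essential difference from a store-then-query approach.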
|Categories: Aleri and Coral8, IBM and DB2, Memory-centric data management, StreamBase, Streaming and complex event processing (CEP), Structured documents||3 Comments|
Besides talking about what Coral8 and StreamBase (and other CEP vendors) have in common, Mark Tsimelzon and I talked quite a bit about what he sees as some of the important differences. There were a lot, of course, but three in particular stood out.
1. Mark believes Coral8 has significantly lower latency than StreamBase. E.g., the Wombat/Coral8 combo achieves sub-millisecond latency, with Coral8 itself consuming less than a tenth of that. The best comparable figures from StreamBase that I currently know of are almost an order of magnitude slower.
2. Top-end speed aside, Mark believes that Coral8 is fundamentally better suited for complex queries and pattern recognition, while StreamBase works well with simpler queries. For example, his other performance claims notwithstanding, he concedes that StreamBase is at least comparable to Coral8 in its throughput for huge numbers of simple queries. (The number he mentioned was ½ million queries/second.) Indeed, while we barely talked about customer/marketing issues, Mark asserts that the companies’ respective customer bases reflect this complex/simple distinction.*
|Categories: Aleri and Coral8, Memory-centric data management, Progress, Apama, and DataDirect, StreamBase, Streaming and complex event processing (CEP)||5 Comments|
For the most part, the vendors I talk with in complex event/stream processing like and speak well of each other (most of the exceptions seem to involve StreamBase). Even so, there are a lot of interesting competitive claims and counterclaims in this market. Prior posts and comment threads have covered Apama/StreamBase jousting on the subjects of who has more business and how many financial data feeds StreamBase supports. Other areas that generate interesting sparks are performance, parallelism, and determinism. Read more
|Categories: Aleri and Coral8, Investment research and trading, Memory-centric data management, Progress, Apama, and DataDirect, StreamBase, Streaming and complex event processing (CEP)||1 Comment|
Complex event/stream processing vendor Coral8 raised its hand and offered a briefing – non-technical, alas, but at least it was a start. Here are some of the highlights: Read more
|Categories: Aleri and Coral8, Application areas, Investment research and trading, Memory-centric data management, StreamBase, Streaming and complex event processing (CEP), Structured documents||Leave a Comment|
In my post Monday about Apama, I complained that StreamBase hadn’t offered a rebuttal to some of Apama’s claims. This has now been fixed. Bill Hobbib, StreamBase’s VP of Marketing, wrote in. Part of what he had to say was the following.
Adapters to Data Feeds
Your blog comment that adapters doesn’t seem like a key competitive differentiator is accurate, and since adapters are so straightforward to develop with StreamBase as part of a customer engagement, we’ve never found adapters to be a key competitive differentiator. The comment by a competitor that their advantage over StreamBase comes from their having developed more adapters suggests they cannot distinguish themselves based on the other functional capabilities that are important to customers. In reality, our speed/performance and scalability are orders of magnitude superior to competitors, as is the speed with which StreamBase applications are developed, deployed, and modified when business needs change. (If it were easy to develop applications with certain competitive systems, then one might assume they would make free evaluation versions of their product available for download from their websites!)
That being said, StreamBase offers adapters to a broad array of data feeds. Most of these are offered out-of-the-box by StreamBase, including the following:
* Financial Market Data: processes data from Reuters® RMDS™ and Reuters Triarch™
* TIBCO® Rendezvous™: converts Rendezvous message into StreamBase tuples and vice versa.
* StreamBase Adapter for JDBC: connects StreamBase to enterprise databases, allowing submission of SQL queries to external resources such as IBM® DB2™, Oracle®, Microsoft® SQLServer™, and Sybase®.
* StreamBase Adapter for JMS: integrates StreamBase with any JMS-compliant message bus.
* StreamBase Adapter for Microsoft Excel™: allows applications to publish data to Excel or read data from Excel.
* StreamBase CSV Adapters: allow applications to read data from, and write data to, comma-separated value (CSV) files.
* StreamBase SMTP adapter: taps into the IP stack on a running system to process live data, converts the IP packets into a TCP data stream, or reads IP packets from captured files.
* StreamBase XML Adapter: streams XML-formatted data records into and out of StreamBase applications
We also can connect to financial exchanges either using our own adapters or through a third-party partnership. Below you’ll find a listing of those.
|Categories: Memory-centric data management, Progress, Apama, and DataDirect, StreamBase, Streaming and complex event processing (CEP)||Leave a Comment|
Mike Stonebraker wrote in with one “nit pick” about yesterday’s blog. I had credited Truviso for strong DBMS/stream processor integration. He shot back that StreamBase has Sleepycat integrated in-process. He further pointed out that a Sleepycat record lookup takes only 5 microseconds if the data is in cache. Assuming what he means is that it’s in Sleepycat’s cache, that would be tight integration indeed.
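For a sense of scale, this small sketch turns the quoted 5-microsecond lookup time into a ceiling on serial lookups per second, and into a share of a per-message time budget. The 70,000 messages/second rate used for the budget is a hypothetical workload picked for illustration, not a figure from Stonebraker.

```python
MICROSECONDS_PER_SECOND = 1_000_000


def lookups_per_second(lookup_micros):
    """Upper bound on back-to-back lookups one thread could do per second."""
    return MICROSECONDS_PER_SECOND / lookup_micros


def share_of_budget(lookup_micros, msgs_per_sec):
    """Fraction of each message's time budget consumed by one lookup."""
    budget_micros = MICROSECONDS_PER_SECOND / msgs_per_sec
    return lookup_micros / budget_micros


print(f"{lookups_per_second(5):,.0f} cached lookups/second at most")
# Hypothetical workload: 70,000 messages/second on one core.
print(f"{share_of_budget(5, 70_000):.0%} of the per-message budget")
```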
I wonder whether StreamBase will indefinitely rely on Sleepycat, which is of course now an Oracle product …
|Categories: Memory-centric data management, Michael Stonebraker, Oracle, StreamBase, Streaming and complex event processing (CEP)||Leave a Comment|