For the most part, the vendors I talk with in complex event/stream processing like and speak well of each other (most of the exceptions seem to involve StreamBase). Even so, there are a lot of interesting competitive claims and counterclaims in this market. Prior posts and comment threads have covered Apama/StreamBase jousting on the subjects of who has more business and how many financial data feeds StreamBase supports. Other areas that generate interesting sparks are performance, parallelism, and determinism.
The most confusing of the three is determinism. Apparently, you can run the same query twice against substantially the same event streams, but if the time stamps are a little different or you parallelize differently, you can get different answers. StreamBase suggests this is a problem for Coral8 but not StreamBase. Coral8 assures me they have never proven non-deterministic in a practical customer test, and furthermore have a theoretical example in which StreamBase is non-deterministic. Apama has “determinism” mentioned on one of its slides, although (by my choice) we focused on other stuff during yesterday’s briefing. All told, I’m still in the dark as to whether the determinism problem arises only in theoretical edge cases, whether it really occurs in significant production situations, or indeed whether it’s a noticeable problem at all.
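To make the time-stamp point concrete, here is a toy sketch of my own. It does not represent any vendor's engine; the window size, the event data, and the function name `tumbling_window_counts` are all made up for illustration. It counts events in fixed one-second windows, then replays the same three events with one time stamp shifted by a single millisecond.

```python
from collections import Counter


def tumbling_window_counts(events, window_ms):
    """Count events per fixed (tumbling) window, keyed by window start in ms."""
    counts = Counter()
    for timestamp_ms, _payload in events:
        # Each event belongs to the window whose start is the timestamp
        # rounded down to a multiple of the window size.
        window_start = (timestamp_ms // window_ms) * window_ms
        counts[window_start] += 1
    return dict(counts)


# Hypothetical data: the same three ticks seen in two runs. In run B the
# second tick is stamped 1 ms later, which pushes it across a window boundary.
run_a = [(998, "tick1"), (999, "tick2"), (1001, "tick3")]
run_b = [(998, "tick1"), (1000, "tick2"), (1001, "tick3")]

counts_a = tumbling_window_counts(run_a, window_ms=1000)
counts_b = tumbling_window_counts(run_b, window_ms=1000)

print(counts_a)  # {0: 2, 1000: 1}
print(counts_b)  # {0: 1, 1000: 2}
```

Same query, substantially the same stream, different answers. Whether real engines hit this in production is exactly the open question.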
Parallelism comes up primarily as a subpoint to determinism or performance. However, Coral8 is proud of its cluster manager. Apparently, a major way to parallelize is akin to range partitioning; e.g., price data for different securities gets routed to different processors. One also clusters for high availability/hot standby, of course. Coral8 claims as a distinguishing feature that all this parallelization can simply be configured from a GUI rather than coded. What’s more, one processor can serve as a hot standby to multiple “range-partitioned” parallel active nodes.
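For readers who haven't seen range partitioning applied to a stream, a minimal sketch follows. It is my own illustration, not Coral8's cluster manager: the boundaries, the node names, and the functions `route` and `partition` are hypothetical, and the point of Coral8's claim is that you would configure this mapping in a GUI rather than write it.

```python
from bisect import bisect_right

# Hypothetical split points on the ticker symbol: symbols sorting before "H"
# go to the first node, "H" up to (not including) "P" to the second, and
# everything else to the third.
BOUNDARIES = ["H", "P"]
NODES = ["node-0", "node-1", "node-2"]


def route(symbol):
    """Pick the node whose symbol range contains this ticker."""
    return NODES[bisect_right(BOUNDARIES, symbol)]


def partition(ticks):
    """Group (symbol, price) ticks by destination node, preserving order."""
    by_node = {node: [] for node in NODES}
    for symbol, price in ticks:
        by_node[route(symbol)].append((symbol, price))
    return by_node


# Made-up price ticks.
ticks = [("AAPL", 101.5), ("IBM", 88.2), ("XOM", 75.0), ("AAPL", 101.6)]
assignments = partition(ticks)
print(assignments)
```

Because every tick for a given security lands on the same node, per-security queries see their events in arrival order no matter how many nodes there are.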
Naturally, one of the top areas for competitive debate is performance. StreamBase claims to beat the others, including Coral8, by a factor of 10 or better. Coral8, however, cites figures of 60-80 microsecond latency for its portion of a 1-2 millisecond total latency (partnered with Wombat), and I’m pretty sure StreamBase doesn’t claim to be 10 times as fast as that. And Apama, which claims to be involved in trading most asset classes for most major investment banks, claims never to have lost a customer benchmark.
So what’s going on here? Well, some of it is surely “My next release beats the pants off of the competition’s old release,” a common feature of competitive jockeying. And some of it no doubt depends on whose system is most expertly tuned in which situation. Beyond that, it may not all be apples-to-apples. Indeed, Apama has a slide suggesting there are at least ten different dimensions to performance. And while that’s somewhat padded, at least three dimensions are real:
- Absolute latency (should be low).
- Total throughput (should be high).
- The complexity of the queries/patterns/filters being used when the first two factors are observed.
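As a rough illustration of why the first two dimensions are separate numbers, here is a small sketch that computes both from the same set of measurements. Everything in it is assumed: the sample time stamps are invented, and `summarize` is a hypothetical helper, not any vendor's benchmark harness.

```python
def summarize(samples):
    """Compute latency and throughput from (arrival_us, emit_us) pairs."""
    latencies = [emit - arrive for arrive, emit in samples]
    first_arrival = min(arrive for arrive, _ in samples)
    last_emit = max(emit for _, emit in samples)
    span_seconds = (last_emit - first_arrival) / 1_000_000
    return {
        "mean_latency_us": sum(latencies) / len(latencies),
        "max_latency_us": max(latencies),
        "throughput_per_s": len(samples) / span_seconds,
    }


# Invented measurements, in microseconds: when each event arrived and when
# the corresponding result was emitted.
samples = [(0, 70), (100, 160), (200, 280), (300, 370)]
stats = summarize(samples)
print(stats)
```

Two vendors can quote very different headline figures from the same kind of run depending on which of these they report, and neither number means much without knowing how complex the query was.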
And then there’s the jockeying about customer lists. StreamBase doesn’t actually publish customer lists yet, leading to this blog post from Apama suggesting StreamBase has only two disclosed customers, one of them a no-name and the other a game company. Well, I poked around StreamBase’s web site some, and I found the no-name described as a “Goldman Sachs company.” I also found a reference to NASA as a customer, and a claim that over half the biggest investment firms were customers, although that last claim is vastly less substantial than what Apama can claim (and name). StreamBase is also known to be active in the classified/intelligence market. Perhaps that particular blog post could use some light editing …