Besides talking about what Coral8 and StreamBase (and other CEP vendors) have in common, Mark Tsimelzon and I talked quite a bit about what he sees as some of the important differences. There were a lot, of course, but three in particular stood out.
1. Mark believes Coral8 has significantly lower latency than StreamBase. E.g., the Wombat/Coral8 combo achieves sub-millisecond latency, with Coral8 itself consuming less than a tenth of that. The best comparable figures from StreamBase that I currently know of are almost an order of magnitude slower.
Top-end speed aside, Mark believes that Coral8 is fundamentally better suited for complex queries and pattern recognition, while StreamBase works well with simpler queries. For example, his other performance claims notwithstanding, he concedes that StreamBase is at least comparable to Coral8 in its throughput for huge numbers of simple queries. (The number he mentioned was ½ million queries/second.) Indeed, while we barely talked about customer/marketing issues, Mark asserts that the companies’ respective customer bases reflect this complex/simple distinction.*
*I don’t think I can judge that claim overall yet. However, it seems consistent with one particular data point. The intelligence market – which StreamBase seems to dominate – probably does feature very high volumes of relatively simple filters.
2. Mark thinks Coral8 has a much richer set of language primitives than StreamBase. In particular, he calls attention to Coral8’s primitives for synchronizing multiple data streams. That particular example seems to be the core of his claim that StreamBase is subject to risks of nondeterminism. However, he did seem to concede that StreamBase might be just as deterministic as Coral8 with sufficiently careful coding.* In another example, Mark thinks subqueries are important, whereas Mike Stonebraker told me a while ago he doesn’t think they arise much in real-life CEP.
*To a first approximation, I continue to doubt that determinism is more than a theoretical/edge-case issue. I don’t think many indeterminate, untestable programs are actually being written using any of these engines. On the other hand, it may be the case that programming effort to assure determinism varies significantly between different systems. For a rough analogy here, think of referential integrity. Few database application systems truly suffer from lack of integrity (but feel free to insert the MySQL snark of your choice). Even so, the effort needed to assure integrity can vary widely among different DBMS.
3. Mark positions StreamBase as having been developed with a limited-power query-specification GUI, with the SQL-based language coming only after the fact. By way of contrast, Coral8 was SQL-based all along. He attributes this to the different university research projects they are respectively based on, and asserts that most StreamBase customers he knows of still use the GUI rather than the language. This is central to his argument that StreamBase may be fine for simple queries, but has trouble with more complex ones.
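To make the determinism question from point 2 concrete, here is a minimal sketch in Python. It is not Coral8 or StreamBase code and does not reflect either vendor's language; the `Event` type, the `merge_by_timestamp` helper, and the sample data are all hypothetical. It only illustrates the general idea that merging streams by an explicit (timestamp, source) key, rather than by arrival order, gives the same result no matter how the inputs happen to interleave.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    ts: int        # event timestamp (hypothetical units)
    source: str    # name of the originating stream (hypothetical)
    value: float


def merge_by_timestamp(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge already time-ordered streams into one sequence.

    Ordering by (ts, source) breaks timestamp ties by stream name, so the
    output does not depend on the order in which the streams are supplied.
    """
    return heapq.merge(*streams, key=lambda e: (e.ts, e.source))


# Illustrative data only; both streams have an event at ts=3.
trades = [Event(1, "trades", 10.0), Event(3, "trades", 10.5)]
quotes = [Event(2, "quotes", 10.1), Event(3, "quotes", 10.4)]

# Supplying the streams in either order yields the same merged sequence.
merged_a = list(merge_by_timestamp(trades, quotes))
merged_b = list(merge_by_timestamp(quotes, trades))
```

Without the explicit tie-break on `source`, the two events at timestamp 3 could come out in either order depending on which stream was read first. That is the kind of thing that "sufficiently careful coding" would have to pin down by hand in an engine lacking a built-in synchronization primitive.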
Other highlights included:
- Coral8 is written in C++, while most of the competition uses Java. Mark hypothesizes that this may provide a performance advantage.
- Coral8 automatically handles locks, while in StreamBase you (sometimes?) need to code them specifically.
- The same goes for asynchronous persistence (used for high availability/failover). Coral8 boasts a flexible, no-programming-required configurable clustering capability.
- As part of the “we handle more complex stuff” theme, he thinks Coral8’s optimization is more intricate than StreamBase’s.
- Mark thinks Coral8’s homegrown portal is more flexible, with more user control, than competitors’ licensed tools.