<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; Complex event processing (CEP)</title>
	<atom:link href="http://www.dbms2.com/category/memory-centric-data-management/event-stream-processing/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 09:21:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Terminology: Data mustering</title>
		<link>http://www.dbms2.com/2011/11/28/terminology-data-mustering/</link>
		<comments>http://www.dbms2.com/2011/11/28/terminology-data-mustering/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 19:10:11 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Teradata]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=5736</guid>
		<description><![CDATA[I find myself in need of a word or phrase that means bring data together from various sources so that it&#8217;s ready to be used, where the use can be analysis or operations. The first words I thought of were &#8220;aggregation&#8221; and &#8220;collection,&#8221; but they both have other meanings in IT. Even &#8220;data marshalling&#8221; has [...]]]></description>
			<content:encoded><![CDATA[<p>I find myself in need of a word or phrase that means <strong>bring data together from various sources so that it&#8217;s ready to be used,</strong> where the use can be analysis or operations. The first words I thought of were &#8220;aggregation&#8221; and &#8220;collection,&#8221; but they both have other meanings in IT. Even &#8220;data marshalling&#8221; has a specific meaning different from what I want. So instead, I&#8217;ll go with <strong>data mustering.</strong></p>
<p>I mean for the term &#8220;data mustering&#8221; to encompass at least three scenarios:</p>
<ul>
<li>Integrated (relational) data warehouse.</li>
<li>Big bit bucket.</li>
<li>Big bit stream.</li>
</ul>
<p>Let me explain what I mean by each.  <span id="more-5736"></span></p>
<p><strong>&#8220;Integrated data warehouse&#8221;</strong> is a phrase Teradata has started using for enterprise data warehouses that, <a href="../../../../../2010/04/12/enterprise-data-warehouse-edw-myt/">like approximately every other EDW in the entire history of data warehousing</a>, aren&#8217;t truly enterprise-wide. In other words, it means &#8220;not just a data mart&#8221;. <a href="http://www.strategicmessaging.com/no-market-categorization-is-ever-precise/2011/03/01/">No category name is perfect</a>, but I think that one works reasonably well.</p>
<p>I previously described the <strong><a href="../../../../../2011/06/04/dirty-data-stored-dirt-cheap/">big bit bucket</a></strong> use case as</p>
<blockquote><p>Users take a whole lot of data, often <a href="../../../../../2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a> in logs of different kinds, and dump it into one place, managed by Hadoop, at open-source pricing.</p></blockquote>
<p>and quickly added</p>
<blockquote><p>Of course, there are various outfits who’d like to sell you not-so-cheap bit buckets. Contending technologies include <a href="../../../../../2011/06/02/why-you-would-want-an-appliance-and-when-you-wouldnt/">Hadoop appliances</a> (which I don’t believe in), <a href="../../../../../2009/10/18/technical-introduction-to-splunk/">Splunk</a> (which in many use cases I do), and <a href="../../../../../2010/11/29/marklogic-and-its-document-dbms/">MarkLogic</a> (ditto, but often the cases are different from Splunk’s). Cloudera and IBM, among other vendors, would also like to sell you some proprietary software to go with your standard Apache Hadoop code.</p></blockquote>
<p>I think I&#8217;ll stand pat on that explanation. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>By analogy, a <strong>big bit stream</strong> is various streams of data, assembled in the custody of a streaming engine. Sybase told me Wednesday that this scenario appears in both of the traditional markets for CEP/streaming: national intelligence, where it is a major use of streaming, and capital markets, where it shows up in some use cases. That&#8217;s consistent with what I&#8217;ve heard from other CEP/streaming vendors too.</p>
<p>As for where I got the word &#8220;mustering&#8221; &#8212; it&#8217;s a military term, for when you assemble your troops and their gear either for inspection or for actual use. The main modern usage I know of the word is as part of the phrase &#8220;pass muster&#8221;, which originally referred to the concept that the person being paid to put a regiment together should from time to time demonstrate that the regiment physically existed in the form that regimental records seemed to show.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/11/28/terminology-data-mustering/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>StreamBase LiveView &#8212; push-based real-time BI</title>
		<link>http://www.dbms2.com/2011/11/10/streambase-liveview-push-based-real-time-bi/</link>
		<comments>http://www.dbms2.com/2011/11/10/streambase-liveview-push-based-real-time-bi/#comments</comments>
		<pubDate>Fri, 11 Nov 2011 03:38:53 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[StreamBase]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=5631</guid>
		<description><![CDATA[My clients at StreamBase are coming out with a new product line called LiveView, and I agreed they could launch it via this blog. Key points about StreamBase LiveView Version 1.0 include: LiveView is a business intelligence and alerting suite built on/in the rest of StreamBase&#8217;s technology, meant to operate on streaming data. LiveView is [...]]]></description>
			<content:encoded><![CDATA[<p>My clients at StreamBase are coming out with a new product line called LiveView, and I agreed they could launch it via this blog. Key points about StreamBase LiveView Version 1.0 include:</p>
<ul>
<li>LiveView is a business intelligence and alerting suite built on/in <a href="http://www.dbms2.com/2011/11/10/streambase-catchup/">the rest of StreamBase&#8217;s technology</a>, meant to operate on streaming data.</li>
<li>LiveView is positioned by StreamBase as having a true push event-driven architecture rather than pull/poll.</li>
<li>StreamBase LiveView is designed to query in-memory data and then have the results change in real time as the data set changes.</li>
<li>The LiveView user interface is a rapidly changing work in progress.</li>
<li>LiveView has other Version 1 limitations as well.</li>
<li>LiveView is targeted squarely at StreamBase&#8217;s financial trading core market until some of the Version 1 limitations are lifted.</li>
</ul>
<p>The basic StreamBase LiveView pipeline goes something like:   <span id="more-5631"></span></p>
<ul>
<li>Data comes into the system via multiple streams.</li>
<li>Transformations upon data arrival can include but are not limited to:
<ul>
<li>Aggregations.</li>
<li>Joins to reference data.</li>
<li>Joins to other streams.</li>
</ul>
</li>
<li>The streams (transformed or perhaps otherwise) are output to tables &#8230;</li>
<li>&#8230; which are continuously updated as more data streams through.</li>
<li>The data in the resulting table can be consumed:
<ul>
<li>Via LiveView-provided BI capabilities.</li>
<li>Via an API.</li>
</ul>
</li>
</ul>
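<p>A minimal sketch of the pipeline above, in plain Python rather than anything StreamBase actually ships. The <code>LiveTable</code> class, the <code>REFERENCE</code> lookup, and the tick fields are all hypothetical names chosen for illustration; the point is only that the aggregation and the reference-data join happen on arrival, so the table is current after every tick.</p>

```python
from collections import defaultdict

# Hypothetical reference data (symbol to sector); not taken from StreamBase.
REFERENCE = {"IBM": "Tech", "XOM": "Energy"}


class LiveTable:
    """Toy in-memory table keyed by symbol, updated on every arriving tick."""

    def __init__(self):
        self.rows = {}
        self._volume = defaultdict(int)

    def on_tick(self, tick):
        symbol = tick["symbol"]
        # Aggregation: keep a running quantity per symbol.
        self._volume[symbol] += tick["qty"]
        # Join to reference data: attach the sector for this symbol.
        self.rows[symbol] = {
            "sector": REFERENCE.get(symbol, "Unknown"),
            "last_price": tick["price"],
            "total_qty": self._volume[symbol],
        }


table = LiveTable()
stream = [
    {"symbol": "IBM", "price": 180.0, "qty": 100},
    {"symbol": "XOM", "price": 79.5, "qty": 200},
    {"symbol": "IBM", "price": 180.5, "qty": 50},
]
for tick in stream:
    table.on_tick(tick)  # no batch step: the table changes with each tick
```

<p>A consumer, whether a BI front end or an API client, would then read <code>table.rows</code> at any moment and see the latest state.</p>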
<p>When wearing my vendor consultant hat, I warmly encourage StreamBase to emphasize the lack of a batch step anywhere in this process. As an analyst, however, I&#8217;m more restrained about a claim like &#8220;We uniquely free you from batch.&#8221; I agree that avoiding batch jobs is a Very Nice Thing. But you also are spared most batch-cycle processing if you stream updates from your short-request database to an analytic DBMS, e.g. via some kind of near-real-time replication.</p>
<p>That said, the push-versus-pull continuous filtering part of the StreamBase LiveView story seems pretty real. I think having sub-second display updates is cool in all sorts of BI use cases, and seriously useful in some number of them. While I don&#8217;t have a clear opinion as to whether the StreamBase approach offers huge performance advantages for that kind of latency over &#8220;pull&#8221; alternatives, my guess is in the direction of &#8220;yes&#8221;.</p>
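<p>To make the push-versus-pull distinction concrete, here is a rough sketch under my own assumptions; <code>PushTable</code>, <code>subscribe</code>, <code>upsert</code>, and <code>poll</code> are hypothetical names, not LiveView&#8217;s API. In the push path a registered filter is checked once, when a row changes. In the pull path the caller rescans the table every time it asks.</p>

```python
class PushTable:
    """Toy table that pushes matching row changes to subscribers."""

    def __init__(self):
        self.rows = {}
        self.subscribers = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscribers.append((predicate, callback))

    def upsert(self, key, row):
        self.rows[key] = row
        # Push: each registered filter is evaluated when the row changes.
        for predicate, callback in self.subscribers:
            if predicate(row):
                callback(key, row)

    def poll(self, predicate):
        # Pull: the caller rescans the whole table on every request.
        return {k: r for k, r in self.rows.items() if predicate(r)}


def is_large(row):
    return row["qty"] >= 1000


alerts = []
table = PushTable()
table.subscribe(is_large, lambda key, row: alerts.append(key))
table.upsert("order-1", {"qty": 500})    # filtered out, nothing pushed
table.upsert("order-2", {"qty": 2500})   # pushed immediately
table.upsert("order-1", {"qty": 1500})   # now matches, pushed immediately
polled = table.poll(is_large)            # pull alternative: full rescan
```

<p>Whether push wins on performance in a real system depends on update rates and the number of standing filters; the sketch only shows where the work happens in each style.</p>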
<p>Version 1 limitations on StreamBase LiveView include:</p>
<ul>
<li>You consume data one table at a time, with no possibility of a join after the data has originally been put into a LiveView table.</li>
<li>While LiveView in principle offers rich alerting potential, you get at it via an API rather than much in the way of alerting-specific tools.</li>
<li>The first LiveView UI StreamBase put together looks a lot like 1980s stock quote machines. The next one it added looks a lot like Panopticon. Much cool-looking enhancement remains to be done.</li>
</ul>
<p><em>One competitive (non)-note: This all sounds something like what TIBCO has been pushing for years, but in fact I don&#8217;t have much knowledge of TIBCO&#8217;s efforts in the area. I had a meeting set up to learn about it some time ago, but it got canceled because TIBCO&#8217;s PR people:</em></p>
<ul>
<li><em>Didn&#8217;t want to let any kind of meeting happen without them, even though a serious CTO-type representative seemed happy to talk, but also &#8230;</em></li>
<li><em>&#8230; didn&#8217;t want to work at dinner time.</em></li>
</ul>
<p><em>I haven&#8217;t had substantive contact with TIBCO since.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/11/10/streambase-liveview-push-based-real-time-bi/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>StreamBase catchup</title>
		<link>http://www.dbms2.com/2011/11/10/streambase-catchup/</link>
		<comments>http://www.dbms2.com/2011/11/10/streambase-catchup/#comments</comments>
		<pubDate>Fri, 11 Nov 2011 03:31:44 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[StreamBase]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=5630</guid>
		<description><![CDATA[While I was cryptic in my general CEP/streaming catchup, I&#8217;ll say a bit more regarding StreamBase in particular. At the highest level, non-technically: StreamBase once planned to conquer the world. However, StreamBase really only sold effectively in the financial trading and intelligence markets. StreamBase retrenched, focusing almost exclusively on the financial trading market. With StreamBase [...]]]></description>
			<content:encoded><![CDATA[<p>While I was cryptic in my general <a href="http://www.dbms2.com/2011/11/10/cep-streaming-catchup/">CEP/streaming catchup</a>, I&#8217;ll say a bit more regarding StreamBase in particular. At the highest level, non-technically:</p>
<ul>
<li>StreamBase once planned to conquer the world.</li>
<li>However, StreamBase really only sold effectively in the financial trading and intelligence markets.</li>
<li>StreamBase retrenched, focusing almost exclusively on the financial trading market.</li>
<li>With <a href="http://www.dbms2.com/2011/11/10/streambase-liveview-push-based-real-time-bi/">StreamBase LiveView</a>, StreamBase is expanding from embedded <a href="../../../../../2011/11/08/terminology-operational-analytics/">operational analytics</a> to do (also operational) business intelligence as well.</li>
<li>StreamBase is hopeful that, perhaps starting with Version 2 or so, LiveView will be successful outside the financial trading market.</li>
</ul>
<p><span id="more-5630"></span><em>Not coincidental to these shifts in focus, StreamBase was our client, then stopped being one for a while, and now is a client again.</em></p>
<p>StreamBase (the product set) consists primarily of three things (LiveView aside):</p>
<ul>
<li>A development environment, whose output is in &#8230;</li>
<li>&#8230; a visual programming language called EventFlow &#8230;</li>
<li>&#8230; which is compiled and executed by StreamBase&#8217;s execution layers.</li>
</ul>
<p>One important set of ancillary products is StreamBase&#8217;s connectors to various data sources. StreamBase offers about 125 of its own, a number that approaches 200 when <a href="../../../../../2010/02/16/quick-thoughts-on-the-streambase-component-exchange/">community contributions</a> are included.</p>
<p>StreamBase has a second programming language called StreamSQL, but that&#8217;s rarely used except for embedding in or connecting to third-party software. EventFlow and StreamSQL compile to nearly identical byte code. (The main difference seems to be that as a practical matter you&#8217;ll name things a bit differently in the two languages, focusing on verbs in EventFlow and nouns in StreamSQL.)</p>
<p>StreamBase says that in the financial trading market, great performance out of the box equates to better time-to-value, since you are spared time you&#8217;d otherwise have to spend tuning the system. Implicit in that is a claim &#8212; which competitors might dispute <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  &#8212; that StreamBase has great <a href="../../../../../2009/05/21/notes-on-cep-performance/">performance</a>. StreamBase fondly thinks that having a domain-specific language gives it a leg up in achieving great compiler optimization. (The same would presumably apply to StreamBase&#8217;s competitors, but only if they have optimizing compilers themselves.)</p>
<p>One point that&#8217;s a little unusual for me these days is that StreamBase favors big SMP (Symmetric MultiProcessing) boxes over blade-based scale-out. 16+ cores and 256 gigabytes of RAM are not uncommon. Clusters commonly include 4-8 machines, but rarely more; the largest StreamBase cluster evidently contains 36 machines.</p>
<p>And with that I&#8217;ll turn to StreamBase&#8217;s newest offering, <a href="http://www.dbms2.com/2011/11/10/streambase-liveview-push-based-real-time-bi/">LiveView</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/11/10/streambase-catchup/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Very brief CEP/streaming catchup</title>
		<link>http://www.dbms2.com/2011/11/10/cep-streaming-catchup/</link>
		<comments>http://www.dbms2.com/2011/11/10/cep-streaming-catchup/#comments</comments>
		<pubDate>Fri, 11 Nov 2011 03:29:37 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[StreamBase]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Truviso]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=5632</guid>
		<description><![CDATA[When I agreed to launch the StreamBase LiveView product via DBMS 2, I planned to catch up on the whole CEP/streaming area first. Due to the power and internet outages last week, that didn&#8217;t entirely happen. So I&#8217;ll do a bit of that now, albeit more cryptically than I hoped and intended. The upshot of [...]]]></description>
			<content:encoded><![CDATA[<p>When I agreed to launch the StreamBase LiveView product via <em>DBMS 2,</em> I planned to catch up on the whole CEP/streaming area first. Due to the power and internet outages last week, that didn&#8217;t entirely happen. So I&#8217;ll do a bit of that now, albeit more cryptically than I hoped and intended.</p>
<ul>
<li>The upshot of my <a href="../../../../../2011/08/25/renaming-cep-or-not/">what to call CEP thread</a> in August was that &#8220;streaming&#8221; and &#8220;event processing&#8221; are not the same concept, but it so happens that they have the most traction where they intersect. That said, I both observe and endorse an apparent shift from &#8220;event&#8221; to &#8220;stream&#8221; as the core of the terminology, in <a href="../../../../../2008/03/19/what-to-call-cep/">a reversal of my opinion of several years ago</a>.</li>
<li>IBM continues to throw a lot of resources at its <a href="../../../../../2009/05/13/ibm-system-s-infosphere-streams-processing/">System S/ InfoSphere Streams</a> product, but I haven&#8217;t heard yet of much marketplace success. That said, I believe IBM is still pretty serious about Streams, as one would expect from an effort whose code name so cheekily references <a href="http://www.softwarememories.com/2008/10/02/a-bit-of-db2-history-per-ibm/">System R</a>. In particular, Streams shows up prominently on IBM&#8217;s top-level analytic architecture slide.</li>
<li>Sybase recently released its ESP (Event Stream Processor) 5.0, which it says is the full merger of the Aleri and Coral8 predecessors. You can still get Sybase ESP without buying into the full <a href="../../../../../2010/02/05/sybase-aleri-rap/">Sybase RAP</a> stack, and Sybase has no plans to change that.</li>
<li>Sybase has discontinued all <a href="../../../../../2009/03/25/aleri-update/">the business intelligence types of products Aleri and Coral8 were developing</a>. Rather, Sybase is OEMing Panopticon, which it reports has been well received. Other than the discontinuation of the BI efforts, there seem to be few Aleri or Coral8 features missing from the merged Sybase ESP product.</li>
<li>Truviso continues to be <a href="../../../../../2010/05/04/truviso-evidently-reinvents-itself/">out of the picture</a>.</li>
<li>I have more to say about <a href="http://www.dbms2.com/2011/11/10/streambase-catchup/">StreamBase</a> separately.</li>
<li>I have more to say about Sybase and IBM, which I&#8217;ll get to when I can.</li>
<li>I have nothing new on Progress Apama. I also know little about any of the open source efforts.</li>
</ul>
<p>Meanwhile, if you want to see technically nitty-gritty posts about the CEP/streaming area, you may want to look at <a href="../../../../../category/memory-centric-data-management/event-stream-processing/page/4/">my CEP/streaming coverage circa 2007-9</a>, based on conversations with (among others) <a href="../../../../../2007/06/18/mike-stonebraker-on-financial-stream-processing/">Mike Stonebraker</a>, <a href="../../../../../2007/08/03/a-deeper-dive-into-apama/">John Bates</a>, and <a href="../../../../../2007/08/10/the-essence-of-cep-according-to-coral8/">Mark Tsimelzon</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/11/10/cep-streaming-catchup/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Renaming CEP &#8230; or not</title>
		<link>http://www.dbms2.com/2011/08/25/renaming-cep-or-not/</link>
		<comments>http://www.dbms2.com/2011/08/25/renaming-cep-or-not/#comments</comments>
		<pubDate>Fri, 26 Aug 2011 02:58:22 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[StreamBase]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=5127</guid>
		<description><![CDATA[One of the less popular category names I deal with is &#8220;Complex Event Processing (CEP)&#8221;. The word &#8220;complex&#8221; looks weird, and many are unsure about the &#8220;event processing&#8221; part as well. CEP does have one virtue as a name, however &#8212; it&#8217;s concise. The other main alternative is to base the name on &#8220;stream processing&#8221; [...]]]></description>
			<content:encoded><![CDATA[<p>One of the less popular category names I deal with is &#8220;Complex Event Processing (CEP)&#8221;. The word &#8220;complex&#8221; looks weird, and many are unsure about the &#8220;event processing&#8221; part as well. CEP does have one virtue as a name, however &#8212; it&#8217;s concise.</p>
<p>The other main alternative is to base the name on &#8220;stream processing&#8221; instead.* The CEP-or-whatever industry is split between these choices, with <a href="http://www.streambase.com/about-home.htm">StreamBase</a> currently favoring &#8220;CEP&#8221; (despite its company name), <a href="../../../../../2009/05/13/ibm-system-s-infosphere-streams-processing/">IBM emphatically favoring &#8220;stream&#8221;</a>, and Sybase seemingly trying to have things both ways.</p>
<p><em>*And then, of course, there is &#8220;event stream processing&#8221;, regarding which please see below.</em></p>
<p><span id="more-5127"></span>I&#8217;ve been juggling this terminological divide myself, referring to <a href="../../../../../2007/08/12/applications-for-not-so-low-latency-cep/">complex event/stream processing</a> as long as four years ago. But enough is enough. I&#8217;d like to write more about the category without repeatedly apologizing for its name. And so, always bearing in mind <a href="http://www.strategicmessaging.com/no-market-categorization-is-ever-precise/2011/03/01/">Monash&#8217;s Third Law of Commercial Semantics</a>, here&#8217;s where I&#8217;m coming down.</p>
<p>The more I think about it, the less I like the term &#8220;event processing&#8221;. Here&#8217;s why. Events happen; data is produced; CEP systems most commonly try to identify and categorize the events based on the data. The CEP systems may then do significant further processing, but more often they just pass the information on to another system (most commonly either a persistent DBMS or &#8220;real-time&#8221; business intelligence). How much of that is really &#8220;event processing&#8221;? Relatively little, I&#8217;d say. And referring specifically to &#8220;complex&#8221; events doesn&#8217;t address my complaints at all.</p>
<p>So I&#8217;d like to go with some version of &#8220;stream&#8221;. But &#8220;<a href="http://en.wikipedia.org/wiki/Stream_processing">stream processing</a>&#8221; has other computer-related uses, while &#8220;Stream management&#8221; commonly describes care and planning for small waterways. So &#8220;stream&#8221; might do best with a modifier, such as &#8220;event&#8221; or &#8220;data&#8221;. Of the two, I prefer &#8220;data stream&#8221; (or &#8220;datastream&#8221;) to &#8220;event stream&#8221;; the events aren&#8217;t really streaming, but the data is.</p>
<p>So should it be &#8220;data stream processing&#8221; or &#8220;data stream management&#8221;? Well, the only one of numerous Wikipedia definitions I&#8217;ve actually liked while researching this post is the one for &#8220;<a href="http://en.wikipedia.org/wiki/Data_Stream_Management_System">Data Stream Management System</a>&#8221;:</p>
<blockquote><p>A <strong>Data Stream Management System</strong> (<strong>DSMS</strong>) is a set of computer programs that controls the maintenance and querying of data in data streams. The use of a DSMS to manage a data stream is roughly analogous to the use of a Database Management System (DBMS) to manage a conventional database.</p>
<p>A key feature of a DSMS is the ability to execute a <em>continuous query</em> against a data stream. A conventional database query executes once and returns a set of results for a given point in time. In contrast, a continuous query continues to execute over time, as new data enters the stream. The results of the continuous query are updated as new data appears.</p></blockquote>
<p>I think the data stream/database management analogy is spot on. Your queries work a little differently, but otherwise you&#8217;re doing pretty much the same things. Indeed, you&#8217;re probably even going to persistently store some of the data, and ideally that DBMS capability would be tightly integrated into your CEP system. (In practice they&#8217;re apt to be more loosely coupled; for most purposes that works well enough.) Query execution, data ingestion, performance monitoring/tuning, workload prioritization &#8212; it&#8217;s very DBMS-like stuff. And by the way, &#8220;data stream management system&#8221; is the term that was used by the researchers &#8212; Mike Stonebraker, Stan Zdonik, Dan Abadi, et al. &#8212; who wrote a paper describing <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.67.8671&amp;rep=rep1&amp;type=pdf">the project on which StreamBase was based</a> &#8230; although some might question whether that particular observation is a strong signal of accuracy. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
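<p>The one-shot versus continuous distinction in the quoted definition can be sketched in a few lines of Python. This is an illustration under my own assumptions, not how any particular DSMS implements it; both function names are hypothetical.</p>

```python
def one_shot_query(rows, threshold):
    """Conventional query: runs once over the data present right now."""
    return [r for r in rows if r >= threshold]


def continuous_query(stream, threshold):
    """Continuous query: yields an updated result each time data arrives."""
    result = []
    for value in stream:
        if value >= threshold:
            result.append(value)
        yield list(result)  # snapshot of the result after this arrival


# The conventional query sees only what is stored at the moment it runs.
stored = one_shot_query([5, 12, 7, 20], threshold=10)

# The continuous query produces one (possibly unchanged) result per arrival.
snapshots = list(continuous_query(iter([5, 12, 7, 20]), threshold=10))
```

<p>The filter logic is identical in both functions; what differs is when it runs and how often the answer is refreshed, which is roughly the sense in which the queries &#8220;work a little differently&#8221;.</p>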
<p>This reasoning suggests <strong>Data Stream Management System</strong> is what it should be. The usual kinds of abbreviation, such as datastream (product), datastream manager, and DSMS, would no doubt follow. So should it be &#8220;Data Stream&#8221;, &#8220;Datastream&#8221;, or &#8220;Data-stream&#8221;? At that level of detail, I don&#8217;t yet have an opinion.</p>
<p>The only thing is &#8212; that&#8217;s all pretty wordy compared to <strong>CEP. </strong>So after all this, I&#8217;m still not sure which term(s) I prefer.</p>
<p>What are your thoughts?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/08/25/renaming-cep-or-not/feed/</wfw:commentRss>
		<slash:comments>24</slash:comments>
		</item>
		<item>
		<title>Eight kinds of analytic database (Part 2)</title>
		<link>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-2/</link>
		<comments>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-2/#comments</comments>
		<pubDate>Tue, 05 Jul 2011 08:18:18 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Archiving and information preservation]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Buying processes]]></category>
		<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Data mart outsourcing]]></category>
		<category><![CDATA[Data types]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Database diversity]]></category>
		<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[MOLAP]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[Predictive modeling and advanced analytics]]></category>
		<category><![CDATA[Rainstor]]></category>
		<category><![CDATA[SAND Technology]]></category>
		<category><![CDATA[Scientific research]]></category>
		<category><![CDATA[SenSage]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4867</guid>
		<description><![CDATA[In Part 1 of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I&#8217;ll cover four more kinds of analytic database &#8212; even newer, for the most part, with a use case/product short list [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-1/">Part 1</a> of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I&#8217;ll cover four more kinds of analytic database &#8212; even newer, for the most part, with a use case/product short list match that is even less clear.  <span id="more-4867"></span></p>
<p><strong><em>Bit bucket</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included: </em>Logs, other technical/external</li>
<li><em>Likely use styles:</em> Staging/ETL, investigative</li>
<li><em>Canonical example:</em> Log files in a Hadoop cluster</li>
<li><em>Stresses:</em> TCO, scale-out, transform/big-query performance, ETL functionality</li>
</ul>
<p>With the explosion of <a href="../../../../../2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a> has come the need for a place to put it all, sometimes called the <a href="../../../../../2011/06/04/dirty-data-stored-dirt-cheap/">big bit bucket</a>. This is like the investigative data mart for big databases, but more <a href="../../../../../2011/05/17/poly-structured-database/">poly-structured</a>. In some cases it is focused on data staging and transformation; but it can also be used for analysis in place.</p>
<p>The list of candidate technologies to run your bit bucket starts with Hadoop and Splunk.</p>
<p><strong><em>Archival data store</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included: </em>Operational, CDR (call detail record), security log</li>
<li><em>Likely use styles:</em> Archival, reporting (for compliance), possibly also investigative</li>
<li><em>Examples:</em> Any long-term detailed historical store</li>
<li><em>Stresses:</em> TCO, compression, scale-out, performance (if multi-use)</li>
</ul>
<p>Analytic DBMS vendors have been insulting each other with the claim &#8220;that&#8217;s just an archival data store,&#8221; dating back at least to the first time Greenplum was deployed on an underpowered Sun Thumper system. Perhaps only <a href="../../../../../2010/06/11/rainstor-update/">Rainstor</a> truly embraces the archival positioning, and I&#8217;ve become pretty dubious about their technical claims and their company alike.</p>
<p>Still, there&#8217;s a legitimate need for data stores, especially relational analytic DBMS, that:</p>
<ul>
<li>Store data cheaply, with high rates of compression.</li>
<li>Have decent performance if you do want to query the data.</li>
<li>May have archiving/compliance-specific features as well.</li>
</ul>
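<p>As a rough illustration of the first two points, here is a sketch using Python&#8217;s standard <code>zlib</code> module on synthetic, highly repetitive log lines. The log format is made up for the example, and general-purpose compression of a text blob is only a stand-in for whatever columnar or deduplicating schemes the archival products actually use.</p>

```python
import zlib

# Hypothetical machine-generated log lines; real logs will differ.
lines = [
    f"2011-07-05 08:{i // 60:02d}:{i % 60:02d} INFO request id={i} status=200\n"
    for i in range(1000)
]
raw = "".join(lines).encode("utf-8")

# Store cheaply: compress at the highest zlib level.
packed = zlib.compress(raw, 9)
ratio = len(raw) / len(packed)

# Still queryable: decompress, then filter for one request.
restored = zlib.decompress(packed).decode("utf-8").splitlines()
hits = [line for line in restored if "id=42 " in line]
```

<p>The ratio you get depends heavily on how repetitive the data is, which is part of why log and call-detail data are such natural candidates for this kind of store.</p>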
<p>Along with Rainstor, SAND and SenSage have at least partially targeted that use case. In addition, appliance vendors such as Teradata and Netezza try to have an archive-oriented product version in their lineups.</p>
<p><strong><em>Outsourced data mart</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included:</em> All</li>
<li><em>Likely use styles:</em> Traditional BI, investigative analytics, staging/ETL</li>
<li><em>Examples:</em> Advertising tracking, SaaS CRM</li>
<li><em>Stresses:</em> Performance, TCO, reliability, concurrency</li>
</ul>
<p>Much of what happens in analytic database management can also be outsourced. Some applications that run via SaaS (Software as a Service) are analytic. I&#8217;ve had three different clients whose main business is picking marketing targets in various vertical segments; others who wanted to add analytics to what were historically OLTP applications; and others yet who just offered online business intelligence. Also, if your fundamental business is gathering data and reselling it to a variety of user organizations, that&#8217;s an analytic data management challenge. The possibilities expand from there.</p>
<p>Data outsourcers are in the IT business, and so their IT development is &#8212; hopefully! &#8212; more serious and less politically encumbered than at many conventional enterprises. Thus, legacy systems and master data management issues are commonly less prevalent, or at least more aggressively disposed of. The same, up to a point, goes for vendor politics.* <a href="../../../../../2011/06/26/what-to-think-about-before-you-make-a-technology-decision/">Multitenancy</a> is commonly an issue, as is running in the cloud.</p>
<p><em>*Even so, there&#8217;s often That Guy who doesn&#8217;t want to migrate away from Oracle, no matter what.</em></p>
<p>Vertica gets the nod in a number of these cases; it&#8217;s cloud-friendly, and often the problem is naturally columnar. Other columnar products can be good choices too, with added brownie points for Infobright if the shop is MySQL-oriented anyway. Running Netezza or other appliances makes sense mainly if you&#8217;re pretty sure you want to keep operating your own data centers, but some data outsourcers are just fine with that assumption.</p>
<p><strong><em>Operational analytic(s) server</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included:</em> Customer-centric, log, financial trade</li>
<li><em>Likely use styles:</em> Advanced operational analytics</li>
<li><em>Examples:</em>
<ul>
<li>Lower latency: Web or call-center personalization, anti-fraud</li>
<li>Higher latency: Customer profiling, Basel 3 risk analysis</li>
</ul>
</li>
<li><em>Stresses:</em> Performance, reliability, analytic functionality, perhaps concurrency</li>
</ul>
<p>Even with eight different choices, I need a &#8220;catch-all&#8221; category; this is it.</p>
<p>Suppose you want to do reasonably sophisticated analytics, then use the results in operations. This is the classical challenge in <a href="../../../../../2011/03/30/short-request-and-analytic-processing/">integrating short-request and analytic processing</a>. There are multiple ways to tackle it, embodying different trade-offs in cost, convenience, or analytic accuracy. If the platform on which you want to run your investigative analytics also has the reliability and concurrency appropriate for mission-critical operations, you&#8217;re set. Otherwise, you may want to pipe <a href="../../../../../2010/11/29/data-that-is-derived-augmented-enhanced-adjusted-or-cooked/">derived data</a> into a more &#8220;industrial-strength&#8221; DBMS, ideally the one that runs your operational apps anyway.</p>
<p>Another option is to integrate a limited amount of analytics immediately into your short-request processing system. For example, as bad as they are at the kinds of queries that require joins, NoSQL systems are often fast at simple aggregations. As MapReduce/NoSQL integrations mature, that option may not require pumping the data anywhere else for deeper analytics; even if it does, at least you&#8217;re starting out with the data in a convenient bit bucket.</p>
<p>Streaming/CEP-centric architectures could come into play as well. And it goes on from there. The possibilities in this last category are just too varied to generalize about.</p>
<p><em>So did I get them all? Or are there yet other analytic data management use cases that I don&#8217;t fit into my eight categories?</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-2/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Some quick notes on HP-Vertica</title>
		<link>http://www.dbms2.com/2011/02/14/some-quick-notes-on-hp-vertica/</link>
		<comments>http://www.dbms2.com/2011/02/14/some-quick-notes-on-hp-vertica/#comments</comments>
		<pubDate>Mon, 14 Feb 2011 17:19:57 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[StreamBase]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3862</guid>
		<description><![CDATA[HP is acquiring Vertica.  Now we know (at least in part) why Vertica went oddly silent for a while. As per that same link, Vertica has &#62;250 ordinary customers, and &#62;70 more OEM sell-through ones. This is a setback for speculation about any kind of upcoming Aster/HP tie-up. Edit:  Forgot this one briefly &#8212; HP [...]]]></description>
			<content:encoded><![CDATA[<p>HP is acquiring Vertica.  <span id="more-3862"></span></p>
<ul>
<li>Now we know (at least in part) why <a href="http://www.dbms2.com/2011/02/14/now-we-know-why-vertica-has-been-so-weirdly-evasive/">Vertica went oddly silent</a> for a while.</li>
<li>As per that same link, Vertica has &gt;250 ordinary customers, and &gt;70 more OEM sell-through ones.</li>
<li>This is a setback for speculation about any kind of upcoming <a href="http://www.dbms2.com/2011/01/19/sound-bites-on-hpmicrosoft-and-neoview/">Aster/HP</a> tie-up.</li>
<li><em>Edit:  Forgot this one briefly &#8212; HP chairman Ray Lane was previously involved with Vertica.</em></li>
<li>Vertica arguably is the most mature of the modern <a href="http://www.dbms2.com/2011/02/06/columnar-compression-database-storage/">column-store DBMS</a> &#8212; i.e., the ones that don&#8217;t have their roots in bitmaps the way Sybase and SAND do.</li>
<li><a href="http://www.dbms2.com/2010/09/07/soundbites-about-mark-hurd-joining-oracle/">HP executed really badly in data warehouse DBMS and appliances</a> under former CEO Mark Hurd.</li>
<li>Unfortunately, if you&#8217;re quickly researching Vertica, neither <a href="http://www.dbms2.com/2011/02/05/gartner-magic-quadrant-data-warehouse-database-management-2010/">Gartner</a> nor <a href="http://www.dbms2.com/2011/02/11/comments-on-the-2011-forrester-wave-for-enterprise-data-warehouse-platforms/">Forrester</a> is a reliable source of detailed information.</li>
<li>It would make sense for HP to acquire <a href="http://www.dbms2.com/category/products-and-vendors/streambase/">StreamBase</a> too, and fold StreamBase into Vertica. Reasons include:
<ul>
<li>StreamBase and Vertica are aligned with each other. Both were founded by Mike Stonebraker, with overlapping groups of academic contributors. Both are in the Boston area. StreamBase and Vertica have worked together on various joint customer accounts, especially in the financial services sector.</li>
<li>Like other <a href="http://www.dbms2.com/2009/03/09/independent-cep-vendors-continue-to-flounder/">independent CEP vendors</a>, StreamBase can&#8217;t or won&#8217;t accomplish much outside certain niches (mainly financial services).</li>
<li>StreamBase reports, rather credibly, that it&#8217;s doing well in its niches. While StreamBase&#8217;s success seems to include a heavy dose of professional services, that hardly would be a deal-breaker for HP.</li>
</ul>
</li>
<li>It would make partial sense for HP to acquire <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">VoltDB</a>, and fold VoltDB into Vertica.
<ul>
<li>VoltDB was actually spun out of Vertica, and incubated in Vertica offices. A lot of thinking has already been done about how to integrate VoltDB and Vertica.</li>
<li>VoltDB needs the help, as its strategy is not attuned to the needs of succeeding in a highly competitive, rapidly innovative marketplace.</li>
<li>VoltDB doesn&#8217;t have the kind of traction on which a big company like HP could hang an acquisition case or acquisition strategy.</li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/02/14/some-quick-notes-on-hp-vertica/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Further quick SAP/Sybase reactions</title>
		<link>http://www.dbms2.com/2010/05/13/sap-sybase-reactions/</link>
		<comments>http://www.dbms2.com/2010/05/13/sap-sybase-reactions/#comments</comments>
		<pubDate>Thu, 13 May 2010 15:30:46 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Aleri and Coral8]]></category>
		<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[Mid-range]]></category>
		<category><![CDATA[SAP AG]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Theory and architecture]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2128</guid>
		<description><![CDATA[Raj Nathan of Sybase has been calling around to chat quickly about the SAP/Sybase deal and related matters. Talking with Raj didn&#8217;t change any of my initial reactions to SAP&#8217;s acquisition of Sybase. I also didn&#8217;t bother Raj with too many hard questions, as he was clearly in call-and-reassure mode, reaching out to customers and [...]]]></description>
			<content:encoded><![CDATA[<p>Raj Nathan of Sybase has been calling around to chat quickly about the SAP/Sybase deal and related matters. Talking with Raj didn&#8217;t change any of <a href="http://www.dbms2.com/2010/05/12/sap-acquire-sybase/">my initial reactions to SAP&#8217;s acquisition of Sybase</a>. I also didn&#8217;t bother Raj with too many hard questions, as he was clearly in call-and-reassure mode, reaching out to customers and influencers alike.</p>
<p>That said,   <span id="more-2128"></span></p>
<ul>
<li>Raj said that Sybase&#8217;s Aleri acquisition was, if anything, tracking ahead of expectations.</li>
<li>Raj didn&#8217;t seem the slightest bit focused on the Coral8/Aleri CEP-based BI strategy that John Morell had long championed.</li>
<li>Raj reminded me that Sybase SQL Anywhere has numerous OEMs, not just on the true desktop/laptop or smaller, but also in a return to its server/workgroup roots. Sybase SQL Anywhere even added geospatial indexing recently.</li>
</ul>
<p>Raj also spoke glowingly of SAP&#8217;s in-memory database technology and the potential for Sybase of same &#8212; until I asked a follow-up question. At that point, he confessed that he didn&#8217;t really know much about SAP&#8217;s in-memory database technology yet. As I said before, I believe SAP is fairly sincere about its belief that its in-memory database technology will conquer the world &#8212; but this is a naive and poorly founded opinion even so.</p>
<p>One tidbit I did get is that SAP&#8217;s in-memory database technology is not just <a href="http://www.dbms2.com/2006/09/20/saps-bi-accelerator/">son-of-T-REX</a>. A Korean (Raj thinks) company SAP had acquired is also in the mix. Raj also had the impression SAP&#8217;s in-memory technology can do rows, columns, or hybrid structures. On the one hand, that makes sense. On the other, it&#8217;s not a perfect fit with <a href="http://www.dbms2.com/2009/07/07/hasso-plattner-calls-for-in-memory-oltp-column-stores/">what Hasso Plattner said last year</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/05/13/sap-sybase-reactions/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Truviso evidently reinvents itself</title>
		<link>http://www.dbms2.com/2010/05/04/truviso-evidently-reinvents-itself/</link>
		<comments>http://www.dbms2.com/2010/05/04/truviso-evidently-reinvents-itself/#comments</comments>
		<pubDate>Tue, 04 May 2010 19:26:03 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Truviso]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2045</guid>
		<description><![CDATA[When Aleri bought Coral8 last year, I wrote that the independent CEP (Complex Event Processing) vendors were floundering. Aleri quickly threw in the towel and sold out to Sybase, which hardly changed my opinion. StreamBase actually is persevering, but not with any kind of breakout success. Big vendors, such as Microsoft and IBM, have at [...]]]></description>
			<content:encoded><![CDATA[<p>When Aleri bought Coral8 last year, I wrote that <a href="http://www.dbms2.com/2009/03/09/independent-cep-vendors-continue-to-flounder/">the independent CEP (Complex Event Processing) vendors were floundering</a>. Aleri quickly threw in the towel and <a href="http://www.dbms2.com/2010/02/05/sybase-aleri-rap/">sold out to Sybase</a>, which hardly changed my opinion. <a href="http://www.dbms2.com/2010/02/16/quick-thoughts-on-the-streambase-component-exchange/">StreamBase actually is persevering</a>, but not with any kind of breakout success. Big vendors, such as <a href="http://www.dbms2.com/2009/05/13/microsoft-announced-cep-this-week-too/">Microsoft</a> and <a href="http://www.dbms2.com/2009/05/18/followup-on-ibm-system-sinfosphere-streams/">IBM</a>, have at least some aspirations of eventually filling the gap.</p>
<p>Meanwhile, Truviso &#8212; which never got much market traction in the first place &#8212; was in hiding; Roman Bukary never did keep his promise to brief me on the company&#8217;s new and improved strategy. Then Truviso had yet another management change, amidst rumors that it was repositioning away from CEP. As per a press release Truviso emailed today, that&#8217;s now official, with Truviso&#8217;s main business being something to do with web analytics.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/05/04/truviso-evidently-reinvents-itself/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Quick thoughts on the StreamBase Component Exchange</title>
		<link>http://www.dbms2.com/2010/02/16/quick-thoughts-on-the-streambase-component-exchange/</link>
		<comments>http://www.dbms2.com/2010/02/16/quick-thoughts-on-the-streambase-component-exchange/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 15:04:22 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Games and virtual worlds]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[StreamBase]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1591</guid>
		<description><![CDATA[StreamBase is announcing something called the StreamBase Component Exchange, for developers to exchange components to be used with the StreamBase engine, presumably on an open source basis. I simultaneously think: This is a good idea, and many software vendors should do it if they aren&#8217;t already. It&#8217;s no big deal. For reasons why, let me [...]]]></description>
			<content:encoded><![CDATA[<p>StreamBase is announcing something called the <a href="http://streambase.com/b6409b0d-7d1f-4cf8-99b9-98b2b1858628/press-release-detail.htm">StreamBase Component Exchange</a>, for developers to exchange components to be used with the StreamBase engine, presumably on an open source basis. I simultaneously think:</p>
<ul>
<li>This is a good idea, and many software vendors should do it if they aren&#8217;t already.</li>
<li>It&#8217;s no big deal.</li>
</ul>
<p>For reasons why, let me quote an email I just sent to an inquiring reporter:</p>
<ul>
<li>StreamBase sells mainly to the financial services and intelligence community markets. Neither group will share much in the way of core algorithms.</li>
<li>But both groups are <a href="http://www.dbms2.com/2009/01/27/introduction-to-pentaho/">pretty interested in open source software</a> even so. (I think for both the price and customizability benefits.)</li>
<li>Open source software commonly gets community contributions for connectors, adapters, and (national) language translations.</li>
<li>But useful contributions in other areas are much rarer.</li>
<li>Linden Lab is one of StreamBase&#8217;s <a href="http://www.dbms2.com/2009/03/09/independent-cep-vendors-continue-to-flounder/">few significant customers outside its two core markets</a>.</li>
<li>All of the above are consistent with the press release (which quotes only one StreamBase customer &#8212; guess who?).</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/02/16/quick-thoughts-on-the-streambase-component-exchange/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

