<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; SenSage</title>
	<atom:link href="http://www.dbms2.com/category/products-and-vendors/sensage/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 09:21:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Eight kinds of analytic database (Part 2)</title>
		<link>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-2/</link>
		<comments>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-2/#comments</comments>
		<pubDate>Tue, 05 Jul 2011 08:18:18 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Archiving and information preservation]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Buying processes]]></category>
		<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Data mart outsourcing]]></category>
		<category><![CDATA[Data types]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Database diversity]]></category>
		<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[MOLAP]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[Predictive modeling and advanced analytics]]></category>
		<category><![CDATA[Rainstor]]></category>
		<category><![CDATA[SAND Technology]]></category>
		<category><![CDATA[Scientific research]]></category>
		<category><![CDATA[SenSage]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4867</guid>
		<description><![CDATA[In Part 1 of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I&#8217;ll cover four more kinds of analytic database &#8212; even newer, for the most part, with a use case/product short list [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-1/">Part 1</a> of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I&#8217;ll cover four more kinds of analytic database &#8212; even newer, for the most part, with a use case/product short list match that is even less clear.  <span id="more-4867"></span></p>
<p><strong><em>Bit bucket</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included: </em>Logs, other technical/external</li>
<li><em>Likely use styles:</em> Staging/ETL, investigative</li>
<li><em>Canonical example: </em>Log files in a Hadoop cluster</li>
<li><em>Stresses:</em> TCO, scale-out, transform/big-query performance, ETL functionality</li>
</ul>
<p>With the explosion of <a href="../../../../../2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a> has come the need for a place to put it all, sometimes called the <a href="../../../../../2011/06/04/dirty-data-stored-dirt-cheap/">big bit bucket</a>. This is like the investigative data mart for big databases, but more <a href="../../../../../2011/05/17/poly-structured-database/">poly-structured</a>. In some cases it is focused on data staging and transformation; but it can also be used for analysis in place.</p>
<p>The list of candidate technologies to run your bit bucket starts with Hadoop and Splunk.</p>
<p><strong><em>Archival data store</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included: </em>Operational, CDR (call detail record), security log</li>
<li><em>Likely use styles:</em> Archival, reporting (for compliance), possibly also investigative</li>
<li><em>Examples:</em> Any long-term detailed historical store</li>
<li><em>Stresses: </em>TCO, compression, scale-out, performance (if multi-use)</li>
</ul>
<p>Analytic DBMS vendors have been insulting each other with the claim &#8220;that&#8217;s just an archival data store,&#8221; dating back at least to the first time Greenplum was deployed on an underpowered Sun Thumper system. Perhaps only <a href="../../../../../2010/06/11/rainstor-update/">Rainstor</a> truly embraces the archival positioning, and I&#8217;ve become pretty dubious about their technical claims and their company alike.</p>
<p>Still, there&#8217;s a legitimate need for data stores, especially relational analytic DBMS, that:</p>
<ul>
<li>Store data cheaply, with high rates of compression.</li>
<li>Have decent performance if you do want to query the data.</li>
<li>May have archiving/compliance-specific features as well.</li>
</ul>
<p>Along with Rainstor, SAND and SenSage have at least partially targeted that use case. In addition, appliance vendors such as Teradata and Netezza try to have an archive-oriented product version in their lineups.</p>
<p><strong><em>Outsourced data mart</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included:</em> All</li>
<li><em>Likely use styles:</em> Traditional BI, investigative analytics, staging/ETL</li>
<li><em>Examples:</em> Advertising tracking, SaaS CRM</li>
<li><em>Stresses:</em> Performance, TCO, reliability, concurrency</li>
</ul>
<p>Much of what happens in analytic database management can also be outsourced. Some applications that run via SaaS (Software as a Service) are analytic. I&#8217;ve had three different clients whose main business is picking marketing targets in various vertical segments; others who wanted to add analytics to what were historically OLTP applications; and others yet who just offered online business intelligence. Also, if your fundamental business is gathering data and reselling it to a variety of user organizations, that&#8217;s an analytic data management challenge. The possibilities expand from there.</p>
<p>Data outsourcers are in the IT business, and so their IT development is &#8212; hopefully! &#8212; more serious and less politically encumbered than at many conventional enterprises. Thus, legacy systems and master data management issues are commonly less prevalent, or at least more aggressively disposed of. The same, up to a point, goes for vendor politics.* <a href="../../../../../2011/06/26/what-to-think-about-before-you-make-a-technology-decision/">Multitenancy</a> is commonly an issue, as is running in the cloud.</p>
<p><em>*Even so, there&#8217;s often That Guy who doesn&#8217;t want to migrate away from Oracle, no matter what.</em></p>
<p>Vertica gets the nod in a number of these cases; it&#8217;s cloud-friendly, and often the problem is naturally columnar. Other columnar products can be good choices too, with added brownie points for Infobright if the shop is MySQL-oriented anyway. Running Netezza or other appliances makes sense mainly if you&#8217;re pretty sure you want to keep operating your own data centers, but some data outsourcers are just fine with that assumption.</p>
<p><strong><em>Operational analytic(s) server</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included:</em> Customer-centric, log, financial trade</li>
<li><em>Likely use styles:</em> Advanced operational analytics</li>
<li><em>Examples:</em>
<ul>
<li>Lower latency: Web or call-center personalization, anti-fraud</li>
<li>Higher latency: Customer profiling, Basel 3 risk analysis</li>
</ul>
</li>
<li><em>Stresses:</em> Performance, reliability, analytic functionality, perhaps concurrency</li>
</ul>
<p>Even with eight different choices, I need a &#8220;catch-all&#8221; category; this is it.</p>
<p>Suppose you want to do reasonably sophisticated analytics, then use the results in operations. This is the classical challenge in <a href="../../../../../2011/03/30/short-request-and-analytic-processing/">integrating short-request and analytic processing</a>. There are multiple ways to tackle it, embodying different trade-offs in cost, convenience, or analytic accuracy. If the platform on which you want to run your investigative analytics also has the reliability and concurrency appropriate for mission-critical operations, you&#8217;re set. Otherwise, you may want to pipe <a href="../../../../../2010/11/29/data-that-is-derived-augmented-enhanced-adjusted-or-cooked/">derived data</a> into a more &#8220;industrial-strength&#8221; DBMS, ideally the one that runs your operational apps anyway.</p>
<p>Another option is to integrate a limited amount of analytics immediately into your short-request processing system. For example, as bad as they are at the kinds of queries that require joins, NoSQL systems are often fast at simple aggregations. As MapReduce/NoSQL integrations mature, that option may not require pumping the data anywhere else for deeper analytics; even if it does, at least you&#8217;re starting out with the data in a convenient bit bucket.</p>
<p>Streaming/CEP-centric architectures could come into play as well. And it goes on from there. The possibilities in this last category are just too varied to generalize about.</p>
<p><em>So did I get them all? Or are there yet other analytic data management use cases that don&#8217;t fit into my eight categories?</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-2/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Advice for some non-clients</title>
		<link>http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/</link>
		<comments>http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/#comments</comments>
		<pubDate>Fri, 30 Jul 2010 14:35:52 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[HP and Neoview]]></category>
		<category><![CDATA[Information Builders]]></category>
		<category><![CDATA[Ingres]]></category>
		<category><![CDATA[Kalido]]></category>
		<category><![CDATA[MarkLogic]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Objectivity and Infinite Graph]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[SenSage]]></category>
		<category><![CDATA[Tableau Software]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2699</guid>
		<description><![CDATA[Edit: Any further anonymous comments to this post will be deleted. Signed comments are permitted as always. Most of what I get paid for is in some form or other consulting. (The same would be true for many other analysts.) And so I can be a bit stingy with my advice toward non-clients. But my [...]]]></description>
			<content:encoded><![CDATA[<p><em>Edit: Any further anonymous comments to this post will be deleted. Signed comments are permitted as always.<br />
</em></p>
<p>Most of what I get paid for is in some form or other consulting. (<a href="http://www.strategicmessaging.com/blurring-analyst-consultant-line/2010/07/28/">The same would be true for many other analysts</a>.) And so I can be a bit stingy with my advice toward non-clients. But my non-clients are a distinguished and powerful group, including in their number Oracle, IBM, Microsoft, and most of the BI vendors. So here&#8217;s a bit of advice for them too.</p>
<p><strong>Oracle. </strong>On the plus side, you guys have been making progress against your reputation for untruthfulness. Oh, I&#8217;ve dinged you for some <a href="http://www.dbms2.com/2008/09/30/oracle-crosses-the-line-on-integrity/">past</a> <a href="http://www.dbms2.com/2008/06/28/response-to-rita-sallam-of-oracle/">slip-ups</a>, but on the whole they&#8217;ve been no worse than other vendors&#8217;. But recently you pulled a doozy. The <a href="http://www.oracle.com/us/corporate/analystreports/infrastructure/index.html">analyst reports</a> section of your website fails to distinguish between unsponsored and sponsored work.* That is a horrible ethical stumble. Fix it fast. Then put processes in place to ensure nothing that dishonest happens again for a good long time.</p>
<p><em>*Merv Adrian&#8217;s &#8220;report&#8221; listed high on that page is actually a sponsored white paper. That Merv himself screwed up by not labeling it clearly as such in no way exonerates Oracle. Besides, I&#8217;m sure Merv won&#8217;t soon repeat the error &#8212; but for Oracle, this represents a whole pattern of behavior.</em></p>
<p><strong>Oracle.</strong> And while I&#8217;m at it, outright dishonesty isn&#8217;t your only unnecessary credibility problem. <a href="http://www.strategicmessaging.com/so-what-is-an-analyst-anyway/2010/07/25/">You&#8217;re also playing too many games in analyst relations</a>.</p>
<p><strong>HP.</strong> Neoview will never succeed. Admit it to yourselves. Go buy something that can.  <span id="more-2699"></span></p>
<p><strong>Smaller BI vendors.</strong> Analytic DBMS evaluations commonly include BI strategy and tool selection as well. If an analytic DBMS expert tells you he needs to learn more about your product line, don&#8217;t blow him off. In fact, you should be particularly embracing anybody who&#8217;s shown a fondness for small DBMS vendors; maybe he or his clients will like small BI vendors as well. That means (among others) <strong>Jaspersoft, Endeca, </strong>and <strong>Tableau.</strong></p>
<p><strong>Information Builders. </strong>Is there anything about your BI products that is in any way technologically differentiated? If so, you might want to mention some examples to somebody some time.</p>
<p><strong>Kalido.</strong> I&#8217;ve said this to you before, but it bears repeating &#8212; your positioning translates to &#8220;I-CASE for analytics,&#8221; and that&#8217;s not a good thing. If your product is not as cumbersome and entrapping as that sounds, you need to do a much better job of explaining why not.</p>
<p><strong>SenSage.</strong> You are what you are. Sell out while the selling is good. You don&#8217;t have the corporate personality to make it into the analytic DBMS mainstream on your own.</p>
<p><strong>Ingres. </strong>You need to be more engaged with analysts than you are. <a href="http://www.softwarememories.com/2010/07/25/ingres-history/">Ingres navel-gazed too much 25 years ago</a>, and evidently you haven&#8217;t outgrown it yet.</p>
<p><strong>TIBCO.</strong> You probably have a lot of cool analytic technology, but I don&#8217;t know of an influencer who has much relationship with or trust in you. Rethink how you&#8217;re approaching influencer relations top to bottom.</p>
<p><strong>Tableau.</strong> You had a lot of mindshare, but it&#8217;s fading. Do something.</p>
<p><strong>MarkLogic, graph DBMS vendors, etc.</strong> You&#8217;re clinging too hard to the NoSQL label. Nobody is out there deciding among Cassandra, neo4j, and MarkLogic. They might be deciding between MongoDB and MarkLogic, I guess, but if you admit to yourself that&#8217;s all it is you&#8217;ll probably change your messaging somewhat.</p>
<p><strong>Objectivity.</strong> Get real about marketing. Infinite Graph is a cool opportunity. But I didn&#8217;t even ping you for a meeting when I&#8217;m in your area next week, because I wouldn&#8217;t have known who to reach out to.</p>
<p><strong>Everybody (especially Objectivity).</strong> &#8220;First X deployed in the cloud&#8221; is almost surely an inaccurate claim. Don&#8217;t make it. And by the way, even if it were true, it probably wouldn&#8217;t be interesting.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/feed/</wfw:commentRss>
		<slash:comments>45</slash:comments>
		</item>
		<item>
		<title>Clearing up MapReduce confusion, yet again</title>
		<link>http://www.dbms2.com/2009/12/30/clearing-up-mapreduce-confusion-yet-again/</link>
		<comments>http://www.dbms2.com/2009/12/30/clearing-up-mapreduce-confusion-yet-again/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 10:50:53 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[SenSage]]></category>
		<category><![CDATA[Splunk]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1371</guid>
		<description><![CDATA[I&#8217;m frustrated by a constant need &#8212; or at least urge &#8212; to correct myths and errors about MapReduce. Let&#8217;s try one more time: MapReduce was named and popularized &#8212; but not invented &#8212; by Google. &#8220;MapReduce&#8221; variously refers to: A programming paradigm Execution engines that implement the programming paradigm Distributed file systems that work [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m frustrated by a constant need &#8212; or at least urge <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  &#8212; to correct <a href="http://www.dbms2.com/2009/10/18/three-big-myths-about-mapreduce/">myths and errors about MapReduce</a>. Let&#8217;s try one more time:<span id="more-1371"></span></p>
<ul>
<li>MapReduce was named and popularized &#8212; but not invented &#8212; by Google.</li>
<li>&#8220;MapReduce&#8221; variously refers to:
<ul>
<li>A programming paradigm</li>
<li>Execution engines that implement the programming paradigm</li>
<li>Distributed file systems that work with the execution engines</li>
</ul>
</li>
<li>In particular, Hadoop is a MapReduce execution engine that includes or is closely associated with HDFS (Hadoop Distributed File System).</li>
<li>MapReduce and analytic DBMS can interact in a number of different ways, including:
<ul>
<li>Tight integration between a DBMS and exposed MapReduce functionality, e.g. <a href="http://www.dbms2.com/2009/10/15/mapreduce-webinar-slides/">Aster Data&#8217;s SQL/MapReduce</a> or Greenplum.</li>
<li>Integrated MapReduce &#8220;under the covers&#8221;, e.g. SenSage or <a href="http://www.dbms2.com/2009/10/06/oracle-mapreduce/">Oracle</a>. This may or may not follow all the rules Google laid out for MapReduce, but it&#8217;s at least similar in spirit.</li>
<li>Looser coupling between DBMS and a MapReduce system, e.g. <a href="http://www.dbms2.com/2009/08/04/verticas-version-of-mapreduce-integration/">Vertica/Hadoop</a>, in which MapReduce may or may not run on a different cluster than the DBMS.</li>
<li>Not at all, except perhaps insofar as a quasi-DBMS such as <a href="http://www.dbms2.com/2009/05/11/facebook-hadoop-and-hive/">Hive</a> is implemented over a MapReduce system such as Hadoop/HDFS.</li>
</ul>
</li>
<li>As predicted by <a href="http://www.strategicmessaging.com/monashs-first-law-of-commercial-semantics-explained/2009/01/09/">Monash&#8217;s First Law of Commercial Semantics</a>, different vendors have individual variants on those themes. For example, as per <a href="http://www.splunk.com/product">a registration-required white paper</a>, Splunk is moving to publicly expose a not-quite-complete form of MapReduce.</li>
<li>MapReduce implementations such as Hadoop are sometimes regarded as part of the <a href="http://www.dbms2.com/2009/12/12/legit-nosql-key-value-store/">NoSQL</a> &#8220;movement&#8221;. When they are, many generalities about NoSQL &#8212; such as that it doesn&#8217;t deal with analytics &#8212; are falsified.</li>
<li>So far as I can tell, mainstream enterprise (as opposed to web, scientific, investment, etc.) data mining folks may be looking at MapReduce for data mining, but they haven&#8217;t done much to adopt it yet. Probably that&#8217;s because the outfits who have the greatest need are the same ones that have the largest sunk investments in more traditional ways of doing data mining.</li>
<li>Cloudera != Hadoop. On the other hand, if you want to use Hadoop, it makes a lot of sense to do business with Cloudera.</li>
<li>Non-DBMS MapReduce != Hadoop. On the other hand, Hadoop is the default choice for non-DBMS MapReduce.</li>
<li>MapReduce != Hadoop, period. DBMS-based MapReduce is also a legitimate technical strategy.</li>
</ul>
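<p>To make the first sense above (MapReduce as a programming paradigm) concrete, here is a minimal single-process sketch in Python. It is a toy under stated assumptions: the log lines and the function names <code>map_phase</code>, <code>shuffle</code> and <code>reduce_phase</code> are hypothetical, and no Hadoop or vendor API is involved.</p>
<pre>
```python
# Toy, single-process illustration of the MapReduce programming paradigm.
# The log lines and function names are hypothetical; no Hadoop API is used.
from collections import defaultdict


def map_phase(line):
    """Emit (key, value) pairs: one (status_code, 1) per log line."""
    _host, status = line.split()
    yield status, 1


def shuffle(pairs):
    """Group all values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single result."""
    return key, sum(values)


log_lines = ["web1 200", "web2 404", "web1 200", "web3 500", "web2 200"]

mapped = [pair for line in log_lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```
</pre>
<p>An execution engine such as Hadoop runs the map and reduce steps in parallel across many machines and handles the grouping in between. The paradigm itself is only the pair of functions, which is why it can also be implemented inside a DBMS.</p>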
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/12/30/clearing-up-mapreduce-confusion-yet-again/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Notes on RainStor, the company formerly known as Clearpace</title>
		<link>http://www.dbms2.com/2009/12/11/rainstor-clearpace/</link>
		<comments>http://www.dbms2.com/2009/12/11/rainstor-clearpace/#comments</comments>
		<pubDate>Sat, 12 Dec 2009 00:15:02 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Archiving and information preservation]]></category>
		<category><![CDATA[Market share and customer counts]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Rainstor]]></category>
		<category><![CDATA[SenSage]]></category>
		<category><![CDATA[Telecommunications]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1295</guid>
		<description><![CDATA[Information preservation* DBMS vendor Clearpace officially changed its name to RainStor this week. RainStor is also relocating its CEO John Bantleman and more generally its headquarters to San Francisco. This all led to a visit with John and his colleague Ramon Chen, highlights of which included: RainStor expects to finish the year with &#62; 50 [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;"><a href="http://www.dbms2.com/2008/12/16/database-archiving-and-information-preservation/">I</a><a href="http://www.dbms2.com/2008/12/16/database-archiving-and-information-preservation/">nformation preservation</a>* DBMS vendor Clearpace officially changed its name to RainStor this week. RainStor is also relocating its CEO John Bantleman and more generally its headquarters to San Francisco. This all led to a visit with John and his colleague Ramon Chen, highlights of which included:<span id="more-1295"></span><!--more--></p>
<ul>
<li>RainStor expects to finish the 	year with &gt; 50 users (overwhelmingly via partners)</li>
<li>A big market for RainStor (at 	least in terms of signed partnerships and large deal activity) is 	retention of telecom records, for compliance purposes, typically for 	a 1-3 year period. This includes:
<ul>
<li>CDRs (Call Detail Records)</li>
<li>Mobile phone records including 	CDRs and missed calls</li>
<li>SMS (Short Message Service), 	including the complete text of same</li>
</ul>
</li>
<li>RainStor thinks a number of larger 	telcos have the need to store a billion records per day each. (I&#8217;m 	not sure how many subscribers such a telco would have to have).</li>
<li>John further thinks that, for the 	same query performance, RainStor can handle such a database on 4 	blades. More precisely, he says that&#8217;s what happened at a test 	conducted by a major technology firm. In the same test case, SenSage 	required 40 blades, and Oracle required 80 or more cores on a pair 	of big SMP machines.  John further says that the Oracle solution 	required a new table and new tablespace every day, while RainStor&#8217;s 	took 3 days for initial installation and required no DBA afterwards. 	However, I&#8217;m in no position to verify this report independently.</li>
<li>In a different kind of proof 	point, so extreme it gives even the RainStor folks pause, a user has 	retired 300 different applications and put their databases onto a 	single 2-core box. (Presumably, this is via RainStor&#8217;s OEM 	relationship with Informatica.)</li>
<li>Coming Very Soon are some services tying RainStor&#8217;s DBMS to obvious-suspect SaaS offerings. The core positioning is “SaaS data escrow”; i.e., RainStor will help you ensure that, in a worst-case scenario, there&#8217;s a nice safe copy of your data you can get at. RainStor also encourages you to do basic reporting and BI against the RainStor copy of the data, if you choose.</li>
<li>The idea I&#8217;ve been pushing lately 	of taking a heterogeneous replication offering like Continuent&#8217;s and 	having it feed an archiving store like RainStor&#8217;s has hit a rather 	basic snag. RainStor doesn&#8217;t actually consume change data capture 	kinds of information directly, at least as of yet, because of 	difficulties fitting such a stream into its 	guaranteed-data-immutability model.</li>
</ul>
<p><em>*I coined that category description for John in the tea room of the Park Lane Hotel. He&#8217;s subsequently embraced it enthusiastically, and I kind of like it myself. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </em></p>
<p style="margin-bottom: 0in;"><em><strong>Related links</strong></em></p>
<ul>
<li>
<p style="margin-bottom: 0in;">RainStor&#8217;s approach to 	compression, as described by <a href="http://www.dbms2.com/2009/05/14/the-secret-sauce-to-clearpaces-compression/">me</a> and by <a href="http://www.rainstor.com/news-blog/blog/rainstors-secret-sauce-data-and-pattern-deduplication">RainStor itself</a></p>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/12/11/rainstor-clearpace/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Introduction to SenSage</title>
		<link>http://www.dbms2.com/2009/10/18/introduction-to-sensage/</link>
		<comments>http://www.dbms2.com/2009/10/18/introduction-to-sensage/#comments</comments>
		<pubDate>Sun, 18 Oct 2009 16:02:42 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Complex event processing (CEP)]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[SenSage]]></category>
		<category><![CDATA[Telecommunications]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1115</guid>
		<description><![CDATA[I visited with SenSage on my two most recent trips to San Francisco. Both visits were, through no fault of SenSage&#8217;s, hasty. Still, I think I have enough of a handle on SenSage basics to be worth writing up. General SenSage highlights include: SenSage used to be known as Addamark. SenSage used to characterize itself [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I visited with SenSage on my two most recent trips to San Francisco. Both visits were, through no fault of SenSage&#8217;s, hasty.  Still, I think I have enough of a handle on SenSage basics to be worth writing up.</p>
<p style="margin-bottom: 0in;">General SenSage highlights include:</p>
<p><span id="more-1115"></span></p>
<ul>
<li>SenSage used to be known as 	Addamark.</li>
<li>SenSage used to characterize 	itself as being in the Security Information Management (SIM) market.</li>
<li>Now SenSage characterizes itself 	(approximately) as selling technology built around a columnar DBMS 	that happens to be pretty good at log analysis, compliance, and/or 	archiving.</li>
<li>More concisely, SenSage says it is 	in the <a href="http://sensage.com/company/index.php">event data 	warehouse</a> category.  (The same could arguably be said of 	<a href="http://www.dbms2.com/?p=1119">Splunk</a>.)</li>
<li>SenSage says it has &gt;400 paying 	customers, of which ~200 are direct.</li>
<li>SenSage has &gt;120 employees and, 	like Splunk, is profitable.</li>
<li>SenSage has enjoyed &gt;50% annual 	revenue growth the past four years.</li>
<li>Some SenSage deals are in the 	multiple-million dollar range.</li>
<li>A major SenSage channel partner (dozens of installations) is SAP, which resells SenSage software on HP hardware as a “Compliance Log Warehouse.”</li>
<li>A hot market for SenSage is CDRs 	(Call Detail Records).</li>
<li>SenSage says that, among analytic 	DBMS vendors, it competes with Oracle, IBM, Teradata, Netezza and, 	to some extent, Vertica and Greenplum.</li>
</ul>
<p>Technical SenSage highlights include:</p>
<ul>
<li>SenSage&#8217;s core technology is an 	append-only columnar DBMS, with no master node.</li>
<li>SenSage&#8217;s DBMS uses no indexes and 	requires “no” database administration.</li>
<li>SenSage&#8217;s database is 	range-partitioned, with the range-partition key always being time.</li>
<li>SenSage has something it calls SQO 	(Sparse Query Optimization), which sounds a lot like Netezza zone 	maps. SQO never yields a false negative on whether data is in a 	block, never yields a false positive on equality predicates, and 	only rarely yields a false positive on range predicates.</li>
<li>SenSage&#8217;s database uses large 	block sizes – typically 250,000 records/block, at 200-250 bytes 	per record.  (That&#8217;s in the range of 64 megabytes/block.)</li>
<li>SenSage says its software can load 10,000-50,000 records/second/node. If I&#8217;m doing the arithmetic correctly, that&#8217;s roughly 7-40 gigabytes/node/hour.</li>
<li>SenSage collects log data into its 	event data warehouse in what it characterizes as an agentless 	manner. Even so, it seems that for a majority of kinds of data 	sources one does have to write custom agents. The two other ways to 	get data into SenSage – and presumably most of the data volume 	comes through these – are:
<ul>
<li>File transfer in the usual way</li>
<li>syslog</li>
</ul>
</li>
<li>SenSage says its software can read 	100s of data sources, and that this is a huge competitive advantage. 	I&#8217;m not totally sure how that jibes with the prior point.</li>
<li>SenSage says it gets 5X 	compression on CDR data, 10-20X on other kinds of logs. That&#8217;s not 	too far off from <a href="../2008/09/24/vertica-finally-spells-out-its-compression-claims/">Vertica&#8217;s 	compression figures</a>.</li>
<li>SenSage says that it has 	datatype-aware compression as well as more standard stuff, with 	VARCHAR compressing particularly well.</li>
<li>In particular, SenSage uses both 	dictionary/token and delta compression.</li>
<li>SenSage&#8217;s software is pretty 	agnostic with respect to storage kind – DAS (Direct Attached 	Storage), SAN (Storage-Area Network), or content-addressable. In 	particular, there&#8217;s only about a 4% performance hit for using 	content-addressable storage.</li>
<li>When using WORM (Write Once Read 	Many) storage like EMC&#8217;s Centera, SenSage leaves record locator 	information behind on ordinary storage and otherwise queries the 	WORM storage just like it queries anything else.</li>
<li>SenSage says it has been using 	MapReduce since “Day 1”.</li>
<li>Probably not coincidentally, you 	can use Perl and other aggregates in SenSage SQL statements.</li>
<li>Perhaps also not coincidentally, 	SenSage says it has a number of advanced built-in analytic 	functions, including some focused on sessionization.</li>
</ul>
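<p>The block-size and load-rate arithmetic in the list above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the inputs are the figures quoted in the list, while the use of decimal units (1 MB = 10**6 bytes) and the pairing of the low load rate with the small record size are assumptions.</p>
<pre>
```python
# Back-of-the-envelope check of the block-size and load-rate figures quoted above.
# Inputs are the numbers cited in the post; decimal units are an assumption.

RECORDS_PER_BLOCK = 250_000
BYTES_PER_RECORD = (200, 250)   # low and high record sizes, in bytes
LOAD_RATE = (10_000, 50_000)    # records/second/node, low and high
SECONDS_PER_HOUR = 3_600


def block_size_mb(bytes_per_record):
    """Uncompressed size of one block, in megabytes."""
    return RECORDS_PER_BLOCK * bytes_per_record / 10**6


def load_gb_per_hour(records_per_second, bytes_per_record):
    """Uncompressed load volume per node per hour, in gigabytes."""
    return records_per_second * bytes_per_record * SECONDS_PER_HOUR / 10**9


block_range = tuple(block_size_mb(b) for b in BYTES_PER_RECORD)
load_range = (
    load_gb_per_hour(LOAD_RATE[0], BYTES_PER_RECORD[0]),
    load_gb_per_hour(LOAD_RATE[1], BYTES_PER_RECORD[1]),
)

print(f"Block size: {block_range[0]:.1f}-{block_range[1]:.1f} MB")
print(f"Load rate:  {load_range[0]:.1f}-{load_range[1]:.1f} GB/node/hour")
```
</pre>
<p>Under those assumptions a block works out to 50-62.5 MB and the load rate to 7.2-45 GB/node/hour. The upper bound drops to 36 GB/node/hour if the records are 200 bytes, so both results are in line with the rounded figures quoted in the list.</p>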
<p style="margin-bottom: 0in;">In addition to all that, SenSage offers a built-in event processing engine, consisting of:</p>
<ul>
<li>A finite-state machine correlation 	engine.</li>
<li>A proprietary event processing 	language.</li>
<li>A GUI to “abstract” (i.e., 	generate?) the event processing language.</li>
</ul>
<p style="margin-bottom: 0in;">The SenSage event processing engine is used to generate alerts. Data that comes into SenSage actually is passed to two places at once, namely to both the event processing engine and the database itself.</p>
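<p>As a rough illustration of that dual path, the sketch below sends each event both to an append-only store and to a small finite-state correlator that raises alerts. Everything in it is hypothetical: the <code>LoginCorrelator</code> class, the event fields and the rule (three failed logins followed by a success) are invented for the example and say nothing about how SenSage actually implements its engine or its event processing language.</p>
<pre>
```python
# Hypothetical sketch: every event goes both to an append-only store and to a
# small finite-state machine that raises an alert on a suspicious sequence.
# The rule (three failed logins, then a success) is an invented example.

FAIL_THRESHOLD = 3


class LoginCorrelator:
    """Per-user state machine whose states are 0..FAIL_THRESHOLD failures."""

    def __init__(self):
        self.failures = {}   # user -> consecutive failed logins (the state)
        self.alerts = []

    def feed(self, event):
        user, outcome = event["user"], event["outcome"]
        count = self.failures.get(user, 0)
        if outcome == "fail":
            # Advance the state, capping it so the number of states is finite.
            self.failures[user] = min(count + 1, FAIL_THRESHOLD)
        else:
            if count == FAIL_THRESHOLD:
                self.alerts.append(f"possible brute force on {user}")
            self.failures[user] = 0   # a success resets the machine


store = []                    # stands in for the append-only database
correlator = LoginCorrelator()

events = [
    {"user": "alice", "outcome": "fail"},
    {"user": "alice", "outcome": "fail"},
    {"user": "bob", "outcome": "ok"},
    {"user": "alice", "outcome": "fail"},
    {"user": "alice", "outcome": "ok"},
]

for event in events:
    store.append(event)       # path 1: persist the raw event
    correlator.feed(event)    # path 2: run it through the correlation engine

print(correlator.alerts)
```
</pre>
<p>The point of the design is that alerting does not have to wait for a database query: the correlator sees each event as it arrives, while the stored copy remains available for later analysis.</p>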
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/10/18/introduction-to-sensage/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

