<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS2 -- DataBase Management System Services &#187; Market share</title>
	<atom:link href="http://www.dbms2.com/category/market-share/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Fri, 30 Jul 2010 15:51:32 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Breakthrough: Exadata now has as many reference accounts as Aster Data!</title>
		<link>http://www.dbms2.com/2010/07/14/exadata-reference-accounts/</link>
		<comments>http://www.dbms2.com/2010/07/14/exadata-reference-accounts/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 13:21:59 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Oracle]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2572</guid>
		<description><![CDATA[According to Bob Evans of Information Week, there now are 15 disclosed Exadata reference accounts. Coincidentally, there are exactly 15 logos on Aster Data&#8217;s customer page. So on its own, that&#8217;s not a particularly impressive piece of information.
But other highlights of his column include:

Some of those accounts are rather big-name. However, I&#8217;m not at all [...]]]></description>
			<content:encoded><![CDATA[<p>According to Bob Evans of Information Week, there now are <a href="http://www.informationweek.com/news/global-cio/interviews/showArticle.jhtml?articleID=225800024&amp;cid=RSSfeed_IWK_ALL" onclick="javascript:pageTracker._trackPageview('/www.informationweek.com');">15 disclosed Exadata reference accounts</a>. Coincidentally, there are exactly 15 logos on <a href="http://www.asterdata.com/customers/index.php" onclick="javascript:pageTracker._trackPageview('/www.asterdata.com');">Aster Data&#8217;s customer page</a>. So on its own, that&#8217;s not a particularly impressive piece of information.</p>
<p>But other highlights of his column include:</p>
<ul>
<li><strong>Some of those accounts are rather big-name.</strong> However, I&#8217;m not at all sure whether they&#8217;re actual production references.</li>
<li>Andy Mendelsohn characterizes the sweet spot of Exadata&#8217;s market as <strong>&#8220;virtual private cloud.&#8221;</strong> That matches <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >what Juan Loaiza told me six months ago</a>.</li>
<li>Oracle claims <strong>numerous competitive wins for Exadata.</strong> Let me hasten to note that one vendor&#8217;s &#8220;competitive win&#8221; is another vendor&#8217;s &#8220;our salesman read the deal as an unfavorable one and chose not to compete,&#8221; or even sometimes &#8220;Huh? We never heard about that deal.&#8221; That said, what I&#8217;m hearing is that <a href="http://www.dbms2.com/2010/03/19/some-business-trends-in-the-data-warehouse-market/" >Exadata is indeed a much stronger competitor than it used to be</a>.</li>
<li>Oracle claims a <strong>near $1 billion sales run rate</strong> for Exadata. No doubt, a large majority of those are hardware upgrades for existing Oracle database customers, often from non-Sun/Oracle hardware. Even so, some of those are surely deals that would have migrated away from Oracle in the pre-Exadata past.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/07/14/exadata-reference-accounts/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>More on Greenplum and EMC</title>
		<link>http://www.dbms2.com/2010/07/07/more-on-greenplum-and-emc/</link>
		<comments>http://www.dbms2.com/2010/07/07/more-on-greenplum-and-emc/#comments</comments>
		<pubDate>Wed, 07 Jul 2010 19:15:21 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Market share]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2534</guid>
		<description><![CDATA[I talked with Ben Werther of Greenplum for about 40 minutes, which was my first post-merger Greenplum/EMC briefing. &#8220;Historical&#8221; highlights include:

Ben says Greenplum wasn&#8217;t being shopped, by which he means Greenplum was out raising more capital and the fund-raising was going well.  Note: Half or so of Greenplum&#8217;s deals were subscription-priced, so it had weaker [...]]]></description>
			<content:encoded><![CDATA[<p>I talked with Ben Werther of Greenplum for about 40 minutes, which was my first post-merger Greenplum/EMC briefing. &#8220;Historical&#8221; highlights include:</p>
<ul>
<li>Ben says Greenplum wasn&#8217;t being shopped, by which he means Greenplum was out raising more capital and the fund-raising was going well.  <em>Note: <a href="http://www.dbms2.com/2009/10/18/greenplum-customer-notes/" >Half or so of Greenplum&#8217;s deals were subscription-priced</a>, so it had weaker cash flow than it would have if it were doing equally well selling perpetual licenses.</em></li>
<li>However, joint engineering was also going well with, e.g., Greenplum CTO Luke Lonergan spending time at EMC facilities in Cork, Ireland. And one thing led to another &#8230;</li>
<li>Greenplum has ~ 140 customers, vs. <a href="http://www.dbms2.com/2009/06/05/greenplum-update-release-3-3/" >~65 five quarters ago</a>, 100+ at year-end, and an acquisition rate of 12-15/quarter last fall.</li>
<li>A typical &#8220;small&#8221; paying customer for Greenplum starts with 10-20 TB of data.</li>
<li><a href="http://www.dbms2.com/2010/04/12/greenplumchorus/" >Greenplum Chorus</a> isn&#8217;t generally available yet, with rollout energy being focused on Greenplum 4.0. <em>Note: As important as it is for overall industry direction, Greenplum Chorus is a product which won&#8217;t be a terribly big deal in Release 1 anyway.</em></li>
</ul>
<p>Highlights looking forward include:  <span id="more-2534"></span></p>
<ul>
<li>When I challenged him, Ben sounded quite optimistic that Pat Gelsinger will immunize Greenplum against and generally counteract some of EMC&#8217;s traditionally stifling bureaucracy. (My words, of course, not his.)</li>
<li>The initial Greenplum/EMC product vision appears truly centered around &#8220;private cloud,&#8221; specifically including Greenplum, VMware, and EMC storage arrays.</li>
<li>Some other areas of potential Greenplum/EMC technical synergy I think are cool obviously haven&#8217;t been seriously addressed yet.</li>
<li>Based on what I heard from Ben about the aura around the deal and also on what I know of the individual executives at Greenplum, I think each of them is a good bet to stick around EMC for a while. (That&#8217;s on average. Of course, it would be surprising if 100% of them stayed around very long.) Basically, there&#8217;s at least a chance EMC/Greenplum will do some pretty cool stuff, and most of the guys will probably stick around to see if that actually starts to happen.*</li>
</ul>
<p><em>*Also, when they do eventually leave, they&#8217;ll surely say things to the effect &#8220;The cool stuff is well underway; my work here is done.&#8221; That party line is almost guaranteed, no matter how things unfold in reality. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/07/07/more-on-greenplum-and-emc/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Riptano, and Cassandra adoption</title>
		<link>http://www.dbms2.com/2010/07/06/riptano-and-cassandra-adoption/</link>
		<comments>http://www.dbms2.com/2010/07/06/riptano-and-cassandra-adoption/#comments</comments>
		<pubDate>Tue, 06 Jul 2010 09:11:40 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[Riptano]]></category>
		<category><![CDATA[Specific users]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2480</guid>
		<description><![CDATA[Tonight&#8217;s Cassandra technology post got plenty long enough on its own, so I&#8217;m separating out business and adoption issues here. For starters, known Cassandra users include:

Facebook, which has said it has 	150 or so Cassandra nodes (but see below)
Twitter, which has said it has 45 	or so Cassandra nodes
Rackspace, which used to be 	Jonathan Ellis&#8217; [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Tonight&#8217;s <a href="http://www.dbms2.com/2010/07/06/cassandra-technical-overview/" >Cassandra technology post</a> got plenty long enough on its own, so I&#8217;m separating out business and adoption issues here. For starters, known Cassandra users include:</p>
<ul>
<li>Facebook, which has said it has 	150 or so Cassandra nodes (but see below)</li>
<li>Twitter, which has said it has 45 	or so Cassandra nodes</li>
<li>Rackspace, which used to be 	Jonathan Ellis&#8217; employer, and now is backing Cassandra company 	Riptano</li>
<li>Digg, which along with Twitter and 	Rackspace was one of the three major users helping advance the 	Cassandra project</li>
<li>OpenX, Simple Geo, Digital 	Reasoning, who Jonathan cited as production users in March</li>
<li>Cloudkick, as noted and linked in 	my other post</li>
<li>Two 	customers Riptano named at launch (but I&#8217;ve forgotten who they were*)</li>
</ul>
<p style="margin-bottom: 0in;">Fetlife, Meebo, and others seem to at least have a healthy interest in Cassandra, based on their level of involvement in a forthcoming <a href="http://cassandrasummit2010.eventbrite.com/" onclick="javascript:pageTracker._trackPageview('/cassandrasummit2010.eventbrite.com');">Cassandra Summit</a>. That said, the <a href="http://twitter.com/fetlife" onclick="javascript:pageTracker._trackPageview('/twitter.com');">@Fetlife</a> tweetstream features numerous yelps of pain, and I don&#8217;t mean the recreational kind.  <span id="more-2480"></span></p>
<p style="margin-bottom: 0in;"><em>*And I can&#8217;t easily find a launch press release, whether on the rather minimalist Riptano website or elsewhere.</em></p>
<p style="margin-bottom: 0in;">Beyond that, when Riptano launched in May, the Riptano guys (mainly Jonathan Ellis) said:</p>
<ul>
<li>They were sure there were dozens 	of Cassandra user organizations, maybe even &gt;100. But there 	weren&#8217;t 100s.</li>
<li>Maybe 20-40% of those Cassandra 	sites were in production. (But I don&#8217;t think I&#8217;d multiply that out 	to suggest there were, say, 35-50 production Cassandra users.)</li>
<li>4000 people were going daily to 	the Apache Cassandra site.</li>
<li>There were 250 Cassandra downloads 	daily.</li>
<li>Lots of startups were using 	Cassandra.</li>
<li>Lots of other companies were 	looking at switching over to Cassandra.</li>
<li>Many potential Cassandra users had 	been waiting for a Cassandra company to be available to support it.</li>
<li>The median number of Cassandra 	(production?) nodes is probably 8-10. 4 would be a low end figure.</li>
</ul>
<p style="margin-bottom: 0in;">That&#8217;s a lot of adoption for a not-even-Release-1 open source project. Even so, there&#8217;s a feeling going around that Cassandra has lost some momentum the past couple of months. Most notably, <a href="../2008/07/21/project-cassandra-facebook-open-sourced-quasi-dbms/">Facebook, which created Cassandra in the first place,</a> isn&#8217;t using it for new projects. True, I&#8217;m hearing even less evidence that any one of Membase, Voldemort, <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/" >VoltDB</a>, <a href="http://www.dbms2.com/2010/04/03/akiban-highlights/" >Akiban</a>, <a href="http://www.dbms2.com/2010/05/12/the-clustrix-story/" >Clustrix</a>, or Riak – for example – is setting the world on fire than I am for Cassandra. But the viable Cassandra alternatives are piling up. Cassandra isn&#8217;t the only or even primary game in town, and for that matter I haven&#8217;t heard any concise description of a niche in which Cassandra is the unquestioned leader.</p>
<p style="margin-bottom: 0in;"><em>Edit: <a href="http://twitter.com/EventCloudPro/status/17872687577" onclick="javascript:pageTracker._trackPageview('/twitter.com');">A/the Facebook project that continues to run on Cassandra</a> is Inbox search.</em></p>
<p style="margin-bottom: 0in;">As for Riptano itself:</p>
<ul>
<li>Riptano launched with two founders 	and immediately made an offer to a third guy. I don&#8217;t know how many 	folks they have now, two months later.</li>
<li>Rackspace put some funding into 	Riptano.</li>
<li>Riptano&#8217;s strategy sounds a lot 	like <a href="../2010/06/30/cloudera-enterprise-hadoop-evolution/">Cloudera&#8217;s</a>, 	by which I mean:
<ul>
<li>Riptano&#8217;s business is all 	services, whether training, consulting, or support.</li>
<li>Riptano&#8217;s intended main business 	is obviously support.</li>
<li>Notwithstanding the above, Riptano 	intends to eventually offer proprietary software, bundled with its 	support services.</li>
<li>The first area of focus for that 	proprietary software is intended to be management tools.</li>
<li>I wouldn&#8217;t be surprised if, like 	Cloudera, Riptano tweaks its software focus from “stuff that lets 	us support you better” to “integration with stuff you pay for.” 	Those strategies are actually pretty similar.</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;">Riptano seems to be starting out with support pricing around $1,000-$4,000/server/year, before quantity discounts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/07/06/riptano-and-cassandra-adoption/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Cloudera Enterprise and Hadoop evolution</title>
		<link>http://www.dbms2.com/2010/06/30/cloudera-enterprise-hadoop-evolution/</link>
		<comments>http://www.dbms2.com/2010/06/30/cloudera-enterprise-hadoop-evolution/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 17:22:27 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Data integration and middleware]]></category>
		<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[Web analytics]]></category>
		<category><![CDATA[eBay]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2440</guid>
		<description><![CDATA[I talked with Cloudera a couple of weeks ago in connection with the impending release of Cloudera Enterprise. I&#8217;d say:  

If you are or want to be a serious 	MapReduce user – and you&#8217;re past the “play around over the 	weekend” stage &#8212; you probably should have either:

A serious non-DBMS MapReduce 	distribution.
MapReduce integrated into your [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked with Cloudera a couple of weeks ago in connection with the impending release of Cloudera Enterprise. I&#8217;d say:  <span id="more-2440"></span></p>
<ul>
<li>If you are or want to be a serious 	MapReduce user – and you&#8217;re past the “play around over the 	weekend” stage &#8212; you probably should have either:
<ul>
<li>A serious non-DBMS MapReduce 	distribution.</li>
<li>MapReduce integrated into your 	analytic DBMS.</li>
<li>Both.</li>
</ul>
</li>
<li>The obvious choice for non-DBMS 	MapReduce is Hadoop.</li>
<li>The obvious choice for a Hadoop 	distribution is <strong>Cloudera Enterprise.</strong></li>
<li>Cloudera Enterprise has three main 	aspects, in an inseparable bundle:
<ul>
<li>Distributions for a double-digit 	number of open source projects. It&#8217;s nice having all that in one 	package – unless, of course, you like playing with Tinkertoys.</li>
<li>Proprietary Cloudera code.</li>
<li>Cloudera support.</li>
</ul>
</li>
<li>Cloudera says its proprietary code is, and is planned to remain, concentrated – at least in large part &#8212; on integrating open source technology with closed source products. This has the virtue of being targeted directly at that segment of the market which has proven it&#8217;s actually willing to pay money for software.</li>
<li>Cloudera Enterprise areas of 	focus, now and in the presumed future, include:
<ul>
<li><strong>Core Hadoop engine,</strong> which 	Cloudera says is quite predictably and appropriately evolving more 	slowly than the tools around it.</li>
</ul>
<ul>
<li><strong>Development, management and 	administrative tools,</strong> including:
<ul>
<li><strong>Pig</strong> and <strong>Hive</strong>. Cloudera says &gt;70% 	of Facebook Hadoop jobs are initiated through Hive, and the same is 	true of Yahoo and Pig.</li>
<li>Connectivity to commercial tools.</li>
<li>The product formerly known as 	“Cloudera Desktop.”</li>
</ul>
</li>
<li><strong>Workflow</strong>, which in this context 	refers to letting you create a Hadoop application as a sequence of 	small steps, rather than forcing you to kluge it into being one 	unwieldy thing. At the moment, this is much less widely adopted than 	Pig and Hive, but Cloudera has high hopes for it, because of its 	obvious benefits in modularity and manageability.</li>
<li><strong>Quasi-DBMS technology.</strong> Besides Hive and Pig, this includes <strong>HBase.</strong> Cloudera says there has 	been considerable demand for HBase, and it is pleased that project 	is now mature enough to ship. Cloudera stresses that it intends 	HBase not for OLTP, but as an adjunct to analytic processing. E.g., 	Cloudera suggests HBase would be a fine vehicle for replicating 	dimension tables across each node of a cluster.</li>
<li><strong>Data connectivity, </strong><span style="font-weight: normal;">e.g. 	to MySQL or to sensor log files.</span></li>
</ul>
</li>
<li>Cloudera Enterprise pricing is 	well below DBMS prices – not by a full order of magnitude, if I&#8217;m 	right about everybody&#8217;s quantity discount policies, but even so by a 	lot. Details are NDA.</li>
</ul>
<p style="margin-bottom: 0in;">Cloudera sometimes sends confusing signals about its beliefs and strategies. For example, one can get different stories depending on whether one talks to:</p>
<ul>
<li>Somebody at Cloudera who comes 	primarily from the user and open source communities.</li>
<li>Somebody at Cloudera who has 	actually worked at a software company before.</li>
</ul>
<p style="margin-bottom: 0in;">But I predict that Cloudera will now stick for a while with more or less the strategy outlined above.</p>
<p style="margin-bottom: 0in;">Naturally, we also talked about Hadoop adoption. Highlights of that part – no doubt somewhat biased towards Cloudera&#8217;s own customer base &#8212; included:</p>
<ul>
<li>Notwithstanding <a href="http://www.dbms2.com/2009/04/14/ebay-thinks-mpp-dbms-clobber-mapreduce/" >eBay&#8217;s prior 	skepticism about MapReduce</a>, it is quoted saying nice things in a Cloudera press release, 	and has apparently become quite a large Hadoop user, starting out 	with a search-quality use case.</li>
<li>Typical Hadoop deployment sizes 	are 10 nodes or so when experimenting, 80-500+ in production.</li>
<li>10 terabytes/node – I&#8217;m pretty 	sure Cloudera meant of user data &#8212; is not inconceivable, so a 	cost-conscious 500-node user could have 5 petabytes of data managed 	by Hadoop.</li>
<li>Cloudera has half a dozen 	customers at the 75+ node production level.</li>
<li>Web and financial services are the 	two vertical markets moving most aggressively into Hadoop 	production. The government is also in significant Hadoop production, 	but the details of that are classified.</li>
<li>Web uses for Hadoop include:
<ul>
<li>Clickstream – sessionization, 	etc. – that&#8217;s a super-mainstream use.</li>
<li>Search – analyzing search 	attempts in conjunction with structured data.</li>
<li>Machine learning (for ad serving, 	etc.).</li>
</ul>
</li>
<li>Financial services uses for Hadoop 	include:
<ul>
<li>Internal trading rule 	enforcement/fraud detection.</li>
<li>Complex ETL.</li>
<li>Portfolio risk assessment 	(typically overnight).</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;">None of this is inconsistent with previous surveys of <a href="http://www.dbms2.com/2009/10/10/enterprises-using-hadoo/" >Hadoop use cases</a>.</p>
<p style="margin-bottom: 0in; font-style: normal;">Various users talked at the Hadoop Summit this week. I wasn&#8217;t there, and won&#8217;t write about their stories for now. That said, <a href="http://www.slideshare.net/kevinweil/hadoop-at-twitter-hadoop-summit-2010" onclick="javascript:pageTracker._trackPageview('/www.slideshare.net');">Twitter&#8217;s slide deck</a> from same has some interesting stuff, including:</p>
<ul>
<li><span style="font-style: normal;">7 	TB/day ETLed from MySQL.</span></li>
<li><span style="font-style: normal;">Petabytes-being-stored 	accordingly coming soon.</span></li>
<li><span style="font-style: normal;">Open 	sourcing their ETL tool Crane.</span></li>
<li><span style="font-style: normal;">3-4X 	LZO compression at little CPU cost.</span></li>
<li><span style="font-style: normal;">HBase is more usable for them than HDFS, which isn&#8217;t mutable enough.</span></li>
<li><span style="font-style: normal;">Pig 	= 5% of code and coding effort vs. vanilla Hadoop at 30% or less 	performance hit.</span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/30/cloudera-enterprise-hadoop-evolution/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Kickfire update</title>
		<link>http://www.dbms2.com/2010/06/11/kickfire-update-2/</link>
		<comments>http://www.dbms2.com/2010/06/11/kickfire-update-2/#comments</comments>
		<pubDate>Fri, 11 Jun 2010 11:31:57 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[Market share]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2276</guid>
		<description><![CDATA[A Kickfire competitor tipped me off that he got 3 Kickfire salesmen&#8217;s resumes in 24 hours. I ran this by Kickfire CEO Bruce Armstrong, who confirmed that Kickfire has had a layoff, but gave me no further details.
Bruce also told me that Kickfire is now up to 10 paying customers, and that there are repeat [...]]]></description>
			<content:encoded><![CDATA[<p>A Kickfire competitor tipped me off that he got 3 Kickfire salesmen&#8217;s resumes in 24 hours. I ran this by Kickfire CEO Bruce Armstrong, who confirmed that Kickfire has had a layoff, but gave me no further details.</p>
<p>Bruce also told me that Kickfire is now up to 10 paying customers, and that there are repeat deals.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/11/kickfire-update-2/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>More on Sybase IQ, including Version 15.2</title>
		<link>http://www.dbms2.com/2010/05/23/sybase-iq-15/</link>
		<comments>http://www.dbms2.com/2010/05/23/sybase-iq-15/#comments</comments>
		<pubDate>Sun, 23 May 2010 08:34:28 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data mart outsourcing]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Text]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2186</guid>
		<description><![CDATA[Back in March, Sybase was kind enough to give me permission to post a slide deck about Sybase IQ. Well, I&#8217;m finally getting around to doing so. Highlights include but are not limited to:

Slide 2 has some market success figures and so on. (&#62;3100 copies at &#62;1800 users, &#62;200 sales last year)
Slides 6-11 give more [...]]]></description>
			<content:encoded><![CDATA[<p>Back in March, Sybase was kind enough to give me permission to post <a href="http://www.monash.com/uploads/Sybase-IQ-slides-March-2010.pdf" onclick="javascript:pageTracker._trackPageview('/www.monash.com');">a slide deck about Sybase IQ</a>. Well, I&#8217;m finally getting around to doing so. Highlights include but are not limited to:</p>
<ul>
<li>Slide 2 has some market success figures and so on. (&gt;3100 copies at &gt;1800 users, &gt;200 sales last year)</li>
<li>Slides 6-11 give more detail on Sybase&#8217;s indexing and data access methods than I put into my recent <a href="http://www.dbms2.com/2010/05/17/technical-basics-of-sybase-iq/" >technical basics of Sybase IQ</a> post.</li>
<li>Slide 16 reminds us that Sybase IQ&#8217;s in-database data mining is quite competitive with what <a href="http://www.dbms2.com/2010/05/15/further-clarifying-in-database-mpp-sas/" >SAS has actually delivered with its DBMS partners</a>, even if it doesn&#8217;t have the nice architectural approach of <a href="http://www.dbms2.com/2010/02/22/netezza-twinfin/" >Aster or Netezza</a>. (I.e., Sybase IQ&#8217;s more-than-SQL advanced analytics story relies on C++ UDFs  &#8212; User Defined Functions &#8212; running in-process with the DBMS.) In particular, there&#8217;s a data mining/predictive analytics library &#8212; modeling and scoring both &#8212; licensed from a small third party.</li>
<li>A number of the other later slides also have quite a bit of technical crunch. (More on some of those points below too.)</li>
</ul>
<p>Sybase IQ may have a bit of a funky architecture (e.g., no MPP), but the age of the product and the substantial revenue it generates have allowed Sybase to put in a bunch of product features that newer vendors haven&#8217;t gotten around to yet.</p>
<p>More recently, Sybase volunteered permission for me to preannounce <strong>Sybase IQ Version 15.2</strong> by a few days (it&#8217;s scheduled to come out this week). <span id="more-2186"></span>Sybase IQ seems to be focused in large part on the government/intelligence market, with three major features being:</p>
<ul>
<li>A kind of <strong>data federation,</strong> querying external databases, that makes sense mainly in the context of rigorous security rules. (I find that confusing, since Sybase IQ&#8217;s indexes tend to hold all the information in the database, but I didn&#8217;t push the point.)</li>
<li>An upgrade to Sybase IQ&#8217;s built-in <strong>text indexing.</strong> I doubt anybody would confuse this with best-of-breed text search, but evidently the intelligence community is satisfied with less. But even before 15.2, Sybase IQ could do both LIKE and WHERE CONTAINS searching.</li>
<li>Improved LOB (Large OBject) management.</li>
</ul>
<p>One part of my Sybase IQ conversations I haven&#8217;t blogged yet in much detail is <strong>scale-out, concurrency, </strong>and<strong> &#8220;multiplexing.&#8221;</strong></p>
<ul>
<li>Sybase feels that Sybase IQ&#8217;s competitive sweet spot, especially in terms of performance, is reached when there are 20 or more concurrent queries.</li>
<li>In general, Sybase asserts that a shared-everything architecture is great for concurrency &#8212; just run different queries on different boxes, all against the same data.</li>
<li>The ability to use a bunch of boxes to run Sybase IQ is called &#8220;multiplexing.&#8221;  This is a chargeable option, without which one is limited to a single SMP box.</li>
<li>Just under 20% of the top 250 Sybase IQ customers have multi-node scale-out configuration (vs. single-node SMP scale-up). And around 8% have it overall.</li>
<li>Sybase IQ nodes can be heterogeneous (e.g., in compute power).</li>
<li>Sybase IQ nodes can be dedicated to be read-only, or can be read-write. Indeed, Sybase IQ nodes can change roles dynamically, for example becoming write-only during nightly batch load. (I didn&#8217;t clarify whether all this applies just to nodes-as-boxes, or if some parts apply to specific processors or cores within the same box.)</li>
<li>Sybase noted that data mart outsourcers can offer differentiated SLAs (Service Level Agreements) depending upon which nodes they give which customers access to.</li>
<li>Most Sybase IQ installations start at 8 cores or more. The Sybase IQ Small Business Edition, limited to 4 cores, is not a big seller.</li>
<li>Sybase IQ has a straightforward round-robin load-balancing story via third-party technology.</li>
</ul>
<p>Finally, along the way in the discussions I picked up various tidbits about the Sybase IQ user base. Unfortunately, Sybase is pretty vague in discussing database sizes &#8212; are they user data? Are they compressed? What do the numbers mean? With that huge caveat:</p>
<ul>
<li>By some metric or other, a couple of classified customers are approaching petabyte scale.</li>
<li>The largest commercial Sybase IQ customer &#8212; a credit card company &#8212; has a couple hundred terabytes or so.</li>
<li>The largest financial services Sybase IQ databases are 50-70 terabytes. This sounds low, frankly, so maybe those are compressed figures, with user data being 200+ terabytes. But I&#8217;m just speculating there.</li>
<li>Sybase IQ has a little less than 100 customers in the &#8220;data aggregator&#8221; market, which is a lot like what I call &#8220;data mart outsourcer.&#8221;</li>
<li><a href="http://www.dbms2.com/2009/08/25/sybase-iq-technical-highlights/" >Sybase IQ&#8217;s ILM technology</a> is a chargeable option, with Sybase being &#8220;cautious&#8221; about sales. Compliance is a big market driver for it.</li>
<li>Sybase IQ&#8217;s #1 vertical market is financial services. Other biggies are government, telecom, marketing services, and to some extent retail.</li>
<li>As of February, there were 40-45 production users of Sybase IQ 15.0 and 15.1.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/05/23/sybase-iq-15/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Vertica update</title>
		<link>http://www.dbms2.com/2010/04/29/vertica-zynga/</link>
		<comments>http://www.dbms2.com/2010/04/29/vertica-zynga/#comments</comments>
		<pubDate>Fri, 30 Apr 2010 03:44:59 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Games and virtual worlds]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1973</guid>
		<description><![CDATA[Last month, Vertica&#8217;s CEO Ralph Breslauer quit,* and Vertica made it sound like there would be a new CEO late in April. And indeed, as of April 29, there was. He&#8217;s a guy I&#8217;ve never heard of before named Chris Lynch, apparently quite the sales machine builder. The most substance I&#8217;ve found is a pair [...]]]></description>
			<content:encoded><![CDATA[<p>Last month, <a href="http://www.dbms2.com/2010/03/19/vertica-update-4/" >Vertica&#8217;s CEO Ralph Breslauer</a> quit,* and Vertica made it sound like there would be a new CEO late in April. And indeed, as of April 29, there was. He&#8217;s a guy I&#8217;ve never heard of before named <a href="http://www.vertica.com/company/news/Vertica-appoints-Christopher-Lynch-new-president-and-CEO" onclick="javascript:pageTracker._trackPageview('/www.vertica.com');">Chris Lynch</a>, apparently quite the sales machine builder. The most substance I&#8217;ve found is a pair of <a href="http://www.masshightech.com/stories/2010/04/26/daily40-Vertica-names-Acopia-vet-Lynch-to-CEO-post.html" onclick="javascript:pageTracker._trackPageview('/www.masshightech.com');">Mass High Tech</a> <a href="http://www.masshightech.com/stories/2010/04/26/daily42-New-Vertica-CEO-Lynch-talks-of-plans-to-hire.html" onclick="javascript:pageTracker._trackPageview('/www.masshightech.com');">articles</a> &#8212; the latter exceedingly typo-ridden &#8212; to the general effect that:</p>
<ul>
<li>Vertica plans to build a massive, world-conquering sales force.</li>
<li>If Vertica dips back into negative cash flow to do that and has to raise more venture capital, so be it.</li>
<li>&#8220;Triple-digit&#8221; revenue growth is expected for this year.</li>
</ul>
<p><em><span id="more-1973"></span>*I&#8217;ve since heard more both from Ralph and his former colleagues, and I&#8217;m comfortable taking the move more or less at face value &#8212; for some reasons he doesn&#8217;t want to spell out, Ralph really wanted to move back home to South Africa.</em></p>
<p>While they were at it, Vertica also put out a press release reporting very good <a href="http://www.vertica.com/company/news/worlds-top-social-gaming-companies-tap-Vertica" onclick="javascript:pageTracker._trackPageview('/www.vertica.com');">success in the social gaming market</a>. The biggest and best known of the bunch is Zynga. Three months ago, <a href="http://tdwi.org/Blogs/WayneEckerson/2010/02/Zynga.aspx" onclick="javascript:pageTracker._trackPageview('/tdwi.org');">Wayne Eckerson</a> had figures of 3 TB/day added to the database, 200 nodes, and &gt;40 million users. Now Zynga is using a figure of &gt;65 million daily users and 230 nodes. More precisely, at Zynga:</p>
<ul>
<li>There are two Vertica databases with identical data.</li>
<li>Each Zynga Vertica database runs on 115 nodes.</li>
<li>Zynga&#8217;s two Vertica database clusters are used for different applications.</li>
<li>Zynga hasn&#8217;t disclosed exactly what it runs on which Vertica cluster. But best practice would be to put mission-critical, fast-response stuff on one cluster, and use the other for longer-running or less-critical queries, while also keeping it available as a hot standby. I say that because I don&#8217;t see much reason to put data geographically close to users around the world for reasons of latency or whatever.</li>
<li>The full daily data load, undisclosed in size but earlier estimated by Wayne at 3 TB, is added to each of Zynga&#8217;s Vertica databases.</li>
</ul>
<p>In other news, Vertica now states its customer count as being &gt;130.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/29/vertica-zynga/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Greenplum Chorus and Greenplum 4.0</title>
		<link>http://www.dbms2.com/2010/04/12/greenplumchorus/</link>
		<comments>http://www.dbms2.com/2010/04/12/greenplumchorus/#comments</comments>
		<pubDate>Mon, 12 Apr 2010 11:54:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Data integration and middleware]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Theory and architecture]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1887</guid>
		<description><![CDATA[Greenplum is making two product announcements this morning. Greenplum 4.0 is a revision of the core Greenplum database technology. In addition, Greenplum is announcing Greenplum Chorus, which is the first product release instantiating last year&#8217;s EDC (Enterprise Data Cloud) vision statement and marketing campaign.
Greenplum 4.0 highlights and related observations include:

For the most part, Greenplum 	4.0 [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Greenplum is making two product announcements this morning. Greenplum 4.0 is a revision of the core Greenplum database technology. In addition, Greenplum is announcing Greenplum Chorus, which is the first product release instantiating last year&#8217;s <a href="http://www.dbms2.com/2009/06/08/the-future-of-data-marts/" >EDC (Enterprise Data Cloud) vision statement and marketing campaign</a>.</p>
<p style="margin-bottom: 0in;">Greenplum 4.0 highlights and related observations include:<span id="more-1887"></span></p>
<ul>
<li>For the most part, <strong>Greenplum 4.0 is focused on general robustness catch-up and </strong><a href="http://www.dbms2.com/2009/08/21/bottleneck-whack-a-mole/" ><strong>Bottleneck Whack-A-Mole</strong></a><strong>,</strong> much like the latest releases from fellow analytic DBMS vendors <a href="http://www.dbms2.com/2010/02/22/data-warehouse-dbms-news-roundup/" >Vertica and Aster Data</a>.</li>
<li>Greenplum has switched its 	replication approach from logical (execute transactions against two 	different disks) to block-level (just ship over the blocks that were 	changed by the original transaction). This seems to increase a 	Greenplum database&#8217;s robustness/performance/uptime in the face of 	disk/node failure. It also provides Greenplum with an ongoing 	performance advantage in that data only has to be compressed once in 	total for both disk writes.</li>
<li>The Greenplum DBMS now has something called “tablespaces,” which sounds as if it extends <a href="http://www.dbms2.com/2009/10/14/greenplum-hybrid-columnar/" >Greenplum&#8217;s “polymorphic storage”</a> to accommodate different kinds of storage device. Everybody has to do this, and for the most part everybody is, e.g. <a href="http://www.dbms2.com/2008/10/14/teradata-virtual-storage/" >Teradata</a><span style="font-style: normal;"> and </span><a href="http://www.dbms2.com/2009/08/25/sybase-iq-technical-highlights/" >Sybase</a>. At least for now, you need to have the same mix of storage technology at every Greenplum node. That said, while Greenplum&#8217;s customers will surely want solid-state storage in the future, that&#8217;s not yet a major issue.</li>
<li>The timetable on Greenplum 4.0 is 	a salami-thin-slicer&#8217;s delight:
<ul>
<li>Greenplum 4.0 has been used in 	POCs (Proofs of Concept) for a while.</li>
<li>Greenplum 4.0 has been in early 	access for a few weeks.</li>
<li>Greenplum 4.0 controlled 	availability is planned for the end of April.</li>
<li>Greenplum 4.0 general availability 	is planned around the end of May or early June.</li>
<li>(Note: Everything in Greenplum 4.0 	has been built, and is undergoing QA).</li>
</ul>
</li>
<li>Greenplum has put together a nice 	list of big-name customers, including <a href="http://www.dbms2.com/2009/03/05/fox-interactive-medias-multi-hundred-terabyte-database-running-on-greenplum/" >Fox/MySpace</a>, <a href="http://www.dbms2.com/2009/04/30/ebays-two-enormous-data-warehouses/" >eBay</a>, Sears, and T-Mobile. While Fox/MySpace never got to the <a href="http://www.dbms2.com/2008/08/25/greenplum-is-in-the-big-leagues/" >predicted</a> 1-petabyte level of user data, T-Mobile is loosely projected to 	indeed get there. The same 1-petabyte projection is made more 	confidently about another Greenplum telecom customer (unnamed), 	which seems to be in the process of acquiring a 300-node Greenplum 	system.</li>
</ul>
<p style="margin-bottom: 0in;">The really interesting part of this announcement, however, is Greenplum Chorus. Greenplum agrees with my assertion that <strong>Greenplum Chorus is a new kind of data integration/ETL technology.</strong> In particular, Greenplum Chorus is designed around a stance I agree with, namely <a href="http://www.dbms2.com/2010/04/12/enterprise-data-warehouse-edw-myt/" >it&#8217;s unrealistic to put everything into a single enterprise data warehouse (EDW)</a>; you need to manage data marts as well, preferably in a coordinated way. Mainstream data integration/ETL (Extract/Transform/Load) vendors such as Informatica<span style="font-style: normal;"> or </span><a href="http://www.dbms2.com/category/products-and-vendors/ab-initio-software-corporation/" >Ab Initio</a><span style="font-style: normal;"> would surely say “That&#8217;s often quite true, and our technology can handle such scenarios just as it handles single-EDW-data-sink environments.” But Greenplum Chorus offers three capabilities not generally found in traditional data integration products (and offers only those three capabilities), namely:</span></p>
<ul>
<li><span style="font-style: normal;">Spin 	out data marts, whether by recopying the data or by creating a 	virtual data mart inside another data warehouse/mart.</span></li>
<li><span style="font-style: normal;">Find/discover 	data in databases across your enterprise.</span></li>
<li><span style="font-style: normal;">Do 	social networking around databases/data marts.</span></li>
</ul>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">Greenplum Chorus is heading into early access soon, with general availability slated around midyear. Also in the mix is a Greenplum “Hypervisor” that can somehow relate to an almost unlimited number of nodes or databases; however, I didn&#8217;t get a lot of details on the Greenplum Hypervisor technology or on the target dates for delivering and integrating the Hypervisor with other parts of Greenplum&#8217;s technology.</span></p>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">When Greenplum first talked about the enterprise data cloud (EDC) idea, it emphasized <a href="../2009/06/08/the-future-of-data-marts/">the spinning out of physical data marts in an easy way</a></span>, as opposed to the virtual <span style="font-style: normal;">data marts pushed by <a href="../2009/10/27/teradatas-nebulous-cloud-strategy/">Oliver Ratzesberger and Teradata</a>. Greenplum Chorus, however, supports both kinds (as, at least directionally, does Teradata), specifically letting you choose between:</span></p>
<ul>
<li>“<span style="font-style: normal;">Independent 	sandboxes” – physical copies of the data, in a separate 	Greenplum database instance.</span></li>
<li>“<span style="font-style: normal;">Satellite 	sandboxes” – virtual data marts, of course managed by the same 	Greenplum database instance.</span></li>
</ul>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">Actually, if you want to recopy data in the same Greenplum database instance, you can do that too, via something called “data sets,” but that&#8217;s not the main focus. Either option, I presume, can be configured to provide either or both of the two main benefits of spun-out data marts, namely:</span></p>
<ul>
<li><span style="font-style: normal;">Control 	over the performance and SLAs (Service-Level Agreements) of your 	analytic workload</span></li>
<li><span style="font-style: normal;">Ability 	to mix in new raw data and/or new aggregations</span></li>
</ul>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">in either case without messing up the performance, SLAs, security, or “one truth-ness” of the existing database.</span></p>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">To provide those capabilities in an analytic DBMS, you need sufficiently robust parallel data movement (for the physical sandboxes) and workload management (for the virtual ones). Greenplum obviously believes it has both. Teradata makes the same claim. Other vendors would make similar assertions, and presumably will offer similar capabilities soon. You also want some kind of ability to ingest data from foreign databases, but that can be pretty routine stuff; e.g., in Release 1 of Chorus, Greenplum is content to offer ODBC access to Oracle, SQL Server, et al.</span></p>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">The “data discovery” and “social networking” aspects of Greenplum Chorus seem to be quite Release 1 as well. Basically, Greenplum lets people post discussion threads about databases and data marts, discussing what value can be derived from them. I guess somebody could include links to web-technology reports based on those databases, but otherwise there&#8217;s no integration with business intelligence tools and their collaboration capabilities. Even so, Greenplum reports that business executives liked this capability in early access testing.</span></p>
<p style="margin-bottom: 0in;"><span style="font-style: normal;">Greenplum Chorus is ETL without a lot of T, and without a lot of performance optimizations either. That may not be much of a problem in its paradigmatic use case, spinning out a data mart quickly for some analysis to see if valuable conclusions can be drawn. Presumably, in the most successful cases, business and technical processes would emerge after the fact to pipe up-to-date versions of that analysis into operational systems, rendering any ETL deficiencies in the initial exploration moot. In a world where “data exploration” is becoming an increasingly important concept, something like Greenplum Chorus may suffice to provide significant customer value. But whether Greenplum Chorus&#8217;s capabilities are eventually co-opted by more fully-featured data integration suites remains an open question.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/12/greenplumchorus/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Notes on the evolution of OLTP database management systems</title>
		<link>http://www.dbms2.com/2010/04/05/oltp-database-management-systems-2/</link>
		<comments>http://www.dbms2.com/2010/04/05/oltp-database-management-systems-2/#comments</comments>
		<pubDate>Mon, 05 Apr 2010 08:22:03 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EnterpriseDB and Postgres Plus]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[Mid-range]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[RDF and graphs]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1841</guid>
		<description><![CDATA[The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part).  OLTP (OnLine Transaction Processing) and general purpose DBMS startups, however, have not yet done as well, with [...]]]></description>
			<content:encoded><![CDATA[<p>The past few years have seen a spate of startups in the analytic DBMS business. Netezza, Vertica, Greenplum, Aster Data and others are all reasonably prosperous, alongside older specialty product vendors Teradata and Sybase (the Sybase IQ part).  OLTP <span style="font-weight: normal;">(OnLine Transaction Processing) </span>and general purpose DBMS startups, however, have not yet done as well, with such success as there has been (MySQL, Intersystems Cache&#8217;, solidDB&#8217;s exit, etc.) generally accruing to products that originated in the 20th Century.</p>
<p>Nonetheless, OLTP/general-purpose data management startup activity has recently picked up, targeting what I see as some very real opportunities and needs. So as a jumping-off point for further writing, I thought it might be interesting to collect a few observations about the market in one place.  These include:</p>
<ul>
<li><span style="font-weight: normal;">Big-brand 	OLTP/general-purpose DBMS have more “stickiness” 	than analytic DBMS.</span></li>
<li><span style="font-weight: normal;">By 	number, most of an enterprise&#8217;s OLTP/general-purpose databases are low-volume and 	low-value. </span></li>
<li>Most 	interesting new OLTP/general-purpose data management products are <span style="font-style: normal;">either 	MySQL-based or NoSQL.</span></li>
<li>It&#8217;s not yet 	clear whether MySQL will prevail over MySQL forks, or vice-versa, or 	whether they will co-exist.</li>
<li>The era of 	silicon-centric relational DBMS is coming.</li>
<li>The emphasis 	on scale-out and reducing the cost of joins spans the NoSQL and 	SQL-based worlds.<em> </em></li>
<li><span style="font-weight: normal;">Users&#8217; insistence on “free” could be a major problem for OLTP DBMS innovation. </span></li>
</ul>
<p style="margin-bottom: 0in;">I shall explain.<span id="more-1841"></span></p>
<p style="margin-bottom: 0in;"><strong>Big-brand OLTP/general-purpose DBMS have more “stickiness” than analytic DBMS.</strong></p>
<ul>
<li>OLTP 	applications are more complex than analytic ones, and hence more 	tightly wired into particular brands of DBMS. For example, 	third-party packaged OLTP applications are typically portable among 	only a few brands of DBMS. But third-party business intelligence 	tools, and the BI “applications” built in them, are more easily 	and widely portable.</li>
<li>Specific technical observations 	such as “OLTP apps tend to use stored procedures, which are 	DBMS-specific” or “OLTP apps tend to have lots and lots of 	tables” serve to underscore the first point.</li>
<li>An enterprise&#8217;s highest-value data 	is commonly the financial stuff handled by its core OLTP systems, so 	those are the last things they want to mess around with just to get 	some cost savings. Security, high availability, and so on are major 	considerations that can outweigh cost.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>By number, most of an enterprise&#8217;s OLTP/general-purpose databases are low-volume and low-value. </strong>Indeed, “OLTP” is often a misnomer, which is why I tend to go with “general-purpose” or some similarly wishy-washy phrase instead.</p>
<ul>
<li>In theory, this is a ripe area for 	what I&#8217;ve called <a href="http://www.dbms2.com/category/database-management-system/mid-range/" >mid-range DBMS</a>.</li>
<li>The big brand vendors try hard to 	keep as many of those databases for themselves as they can. 	Enterprise-wide license pricing helps. Going forward, so will 	virtualization/consolidation strategies, such as <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >Oracle&#8217;s 	Exadata-centric approach</a>.</li>
<li>A variety of mid-range DBMS 	alternatives beyond the big brands have technical merit, at least in 	some cases and configurations – MySQL, PostgreSQL, Intersystems 	Cache&#8217;, and so on.</li>
<li>The only such mid-range DBMS 	alternative with much large enterprise business momentum, however, 	appears to be MySQL.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>&#8220;General-purpose&#8221; might be a better term than &#8220;OLTP&#8221; anyway.</strong></p>
<ul>
<li>I don&#8217;t have a link, but it&#8217;s widely agreed that over half of the processing on an &#8220;OLTP&#8221; enterprise app is commonly reporting and so on.</li>
<li>&#8220;Operational BI&#8221; is progressing by fits and starts, but it is progressing.</li>
<li>Anything customer-facing &#8212; web-based, call center, or otherwise &#8212; is likely to include a heavy dose of &#8220;real-time&#8221; analytic optimization.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Most interesting new OLTP/general-purpose data management products are <span style="font-style: normal;">either MySQL-based or NoSQL.</span></strong></p>
<ul>
<li><a href="http://www.dbms2.com/2009/06/22/h-store-horizontica-voltdb/" >VoltDB</a> is the main 	exception that jumps to mind.</li>
<li>This isn&#8217;t true in the analytic 	DBMS area, where Netezza, Greenplum, Aster, Vertica and others 	started from PostgreSQL&#8217;s code, APIs, or both.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>It&#8217;s not yet clear whether MySQL will prevail over MySQL forks, or vice-versa, or whether they will co-exist.</strong></p>
<ul>
<li>MySQL is a limited product without 	all the third-party storage engines that are being developed.</li>
<li><a href="http://www.dbms2.com/2009/12/14/oracle-mysql-storage-engine/" >Oracle&#8217;s promise of MySQL good 	behavior</a> has an expiration date.</li>
<li>None of the MySQL front-end 	alternatives are remotely mature yet.</li>
</ul>
<p style="margin-bottom: 0in;"><strong>The era of silicon-centric relational DBMS is coming.</strong></p>
<ul>
<li>I think “silicon” means 	“solid-state memory” as much as or more than it means “RAM,” 	but that&#8217;s not yet certain.</li>
<li>What is pretty certain is that, 	thanks to Moore&#8217;s Law, some kind of silicon will increasingly 	replace disk.</li>
<li><a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >Oracle&#8217;s increasingly 	Flash-centric story</a> is a challenge to everybody.</li>
<li>RAM-centric VoltDB will launch 	fairly soon. (By the way, while VoltDB still has <a href="http://www.dbms2.com/2009/06/22/h-store-horizontica-voltdb/" >a lot in common 	with H-Store</a>, they&#8217;re not exactly the same thing. And <a href="http://bit.ly/9QxjV2." onclick="javascript:pageTracker._trackPageview('/bit.ly');">H-Store 	research</a> is progressing too.)</li>
<li><span style="font-style: normal;"><a href="http://rethinkdb.com/" onclick="javascript:pageTracker._trackPageview('/rethinkdb.com');">RethinkDB</a> is being developed,</span> focused directly on solid-state memory. Based on the sparse information available online, RethinkDB sounds somewhat like a dumbed-down H-Store.</li>
<li>New disk-based vendors may never 	optimize their use of disk, instead targeting a solid-state future. 	(E.g., I think Akiban should and quite well might follow this path.)</li>
</ul>
<p style="margin-bottom: 0in; font-weight: normal;"><strong>The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds.</strong> We hear that from the <a href="http://www.dbms2.com/2010/03/14/nosql-taxonomy/" >NoSQL</a> guys all the time. But I also just heard it from <a href="http://www.dbms2.com/2010/04/03/akiban-highlights/" >Akiban</a>.</p>
<p style="margin-bottom: 0in;"><strong>Users&#8217; insistence on “free” could be a major problem for OLTP DBMS innovation.</strong> Vendors of new OLTP data management technologies often feel obligated to open source their products, notwithstanding the historical lack of revenue in the open source OLTP DBMS market. As just one of many examples, <a href="http://www.novaspivack.com/uncategorized/evri-ties-the-knot-with-twine" onclick="javascript:pageTracker._trackPageview('/www.novaspivack.com');">Nova Spivack</a> wrote:</p>
<blockquote>
<p style="margin-bottom: 0in;">I have recently seen some new graph data storage products that may provide the levels of scale and performance needed, but pricing has not been determined yet. In short, storage and retrieval of semantic graph datasets is a big unsolved challenge that is holding back the entire industry. We need federated database systems that can handle hundreds of billions to trillions of triples under high load conditions, in the cloud, on commodity hardware and open source software. Only then will it be affordable to make semantic applications and services at Web-scale.</p>
</blockquote>
<p style="margin-bottom: 0in;">I hear similar things from other startups, who evidently believe they need and/or are entitled to enjoy sophisticated, high-performance, zero-cost, specialized database management technology.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/05/oltp-database-management-systems-2/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Some business trends in the data warehouse market</title>
		<link>http://www.dbms2.com/2010/03/19/some-business-trends-in-the-data-warehouse-market/</link>
		<comments>http://www.dbms2.com/2010/03/19/some-business-trends-in-the-data-warehouse-market/#comments</comments>
		<pubDate>Fri, 19 Mar 2010 13:48:42 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[HP and Neoview]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[Teradata]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1741</guid>
		<description><![CDATA[In recent conversations with various analytic DBMS vendors, a fairly consistent picture has emerged.

Business is strong. Multiple vendors claim to be going gangbusters, with the happy sounds coming out of Vertica and Infobright being echoed by several competitors. Hearsay suggests 	some other companies in related businesses are doing well too. 	Depending on who you talk [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">In recent conversations with various analytic DBMS vendors, a fairly consistent picture has emerged.</p>
<ul>
<li><strong>Business is strong.</strong> Multiple vendors claim to be going gangbusters, with the happy sounds coming out of <a href="../2010/03/19/vertica-update-4/">Vertica</a> and <a href="../2010/03/19/infobright-blog-update/">Infobright</a> being echoed by several competitors. Hearsay suggests 	some other companies in related businesses are doing well too. 	Depending on who you talk to, the business pickup dates back to Q4, give or 	take a quarter.</li>
<li><strong>Oracle Exadata has become a 	formidable competitor,</strong><span style="font-weight: normal;"> on the 	strength of Exadata 2.</span> Exadata 2&#8217;s positioning and perception 	among Oracle users seem to be pretty much in line with <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >what 	Oracle portrayed to me</a>.</li>
<li><strong>Teradata is portrayed as a weak 	competitor.</strong> Competitors don&#8217;t worry about Teradata nearly as 	much as they do about Oracle. That said, I suspect a bit of wishful 	thinking; Teradata is clearly still getting a lot of business the 	other vendors would dearly love to have.</li>
<li><strong>HP Neoview is reeling.</strong> (Almost) nobody sees Neoview competitively. The Walmart Neoview 	installation is said to have stayed small at best. JP Morgan Chase is said 	to have completely thrown Neoview out (and a bunch of HP engineers 	with it).</li>
<li><strong>(Almost) nobody mentions 	competing against DB2</strong> either. This continues to baffle me.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/03/19/some-business-trends-in-the-data-warehouse-market/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
