<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; Exasol</title>
	<atom:link href="http://www.dbms2.com/category/products-and-vendors/exasol/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 09:21:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Comments on the analytic DBMS industry and Gartner&#8217;s Magic Quadrant for same</title>
		<link>http://www.dbms2.com/2012/02/08/gartner-magic-quadrant-data-warehouse-2011-2012/</link>
		<comments>http://www.dbms2.com/2012/02/08/gartner-magic-quadrant-data-warehouse-2011-2012/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 17:17:32 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data mart outsourcing]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Exasol]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Infobright]]></category>
		<category><![CDATA[Kognitio]]></category>
		<category><![CDATA[Market share and customer counts]]></category>
		<category><![CDATA[Microsoft and SQL*Server]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[illuminate Solutions]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=5926</guid>
		<description><![CDATA[This year&#8217;s Gartner Magic Quadrant for Data Warehouse Database Management Systems is out.* I shall now comment, just as I did on the 2010, 2009, 2008, 2007, and 2006 Gartner Data Warehouse Database Management System Magic Quadrants, to varying extents. To frame the discussion, let me start by saying: In general, I regard Gartner Magic [...]]]></description>
			<content:encoded><![CDATA[<p>This year&#8217;s Gartner Magic Quadrant for Data Warehouse Database Management Systems is out.* I shall now comment, just as I did on the <a href="http://www.dbms2.com/2011/02/05/gartner-magic-quadrant-data-warehouse-database-management-2010/">2010</a>, <a href="../../../../../2010/02/10/gartner-magic-quadrant-data-warehouse-2009-2010/">2009</a>, <a href="../../../../../2009/01/12/gartners-2008-data-warehouse-database-management-system-magic-quadrant-is-out/">2008</a>, <a href="../../../../../2007/10/19/gartner-2007-magic-quadrant-for-data-warehouse-database-management-systems/">2007</a>, and <a href="../../../../../2006/10/03/vendor-segmentation-for-data-warehouse-dbms/">2006</a> Gartner Data Warehouse Database Management System Magic Quadrants, to varying extents. To frame the discussion, let me start by saying:</p>
<ul>
<li>In general, I regard Gartner Magic Quadrants as a bad use of good research.</li>
<li>Illustrating the uselessness of &#8212; or at least poor execution on &#8212; the overall quadrant metaphor, a large majority of the vendors covered are lined up near the line x = y, each outpacing the one below in both of the quadrant&#8217;s dimensions.</li>
<li>I find fewer specifics to disagree with in this Gartner Magic Quadrant than in previous years&#8217; versions. Two factors jump to mind as possible reasons:
<ul>
<li>This year&#8217;s Gartner Magic Quadrant for Data Warehouse Database Management Systems is somewhat less ambitious than others; while it gives as much company detail as its predecessors, it doesn&#8217;t add as much discussion of overall trends. So there&#8217;s less to (potentially) disagree with.</li>
<li><a href="http://www.dbms2.com/2010/12/28/evolving-definitions-and-technology-categories-for-2011/">Merv Adrian is now at Gartner</a>.</li>
</ul>
</li>
<li>Whatever the problems may be with Gartner&#8217;s approach, the whole thing comes out better than do <a href="http://www.dbms2.com/2011/02/11/comments-on-the-2011-forrester-wave-for-enterprise-data-warehouse-platforms/">Forrester&#8217;s failed imitations</a>.</li>
</ul>
<p><em>*At the time of this posting, I don&#8217;t yet have a link. However, I expect that to change quickly, and I plan to edit this paragraph accordingly. If nothing else, I hope people will drop links into the comment thread. </em></p>
<p>Specific company comments, roughly following Gartner&#8217;s single-dimensional rank ordering, include: <span id="more-5926"></span></p>
<ul>
<li>The Gartner Magic Quadrant&#8217;s comments on Teradata seem pretty fair. I don&#8217;t think I&#8217;m much in disagreement when I say:
<ul>
<li>Teradata has the richest, most mature analytic DBMS offering.</li>
<li>Teradata has an outstanding track record both for <a href="http://www.dbms2.com/2011/09/24/confusion-about-teradatas-big-customers/">managing large data volumes</a> and for high-concurrency mixed workloads.</li>
<li>Aster Data was a cool Teradata acquisition, even if Teradata/Aster synergies or integration have been nominal to date.</li>
<li>Teradata still needs to get out of its own way in marketing, positioning, packaging, and/or defining its premium-priced system vs. its more moderately priced alternatives. Indeed, as necessary as this approach may have been to fending off encroachments by Netezza and others, what Teradata really needs to do is evolve to a more pick-your-own-node-combination, mix-and-match kind of offering.</li>
</ul>
</li>
<li>Gartner has talked with a lot of Oracle Exadata users who say that the product works; Gartner also has stopped beating Oracle up for <a href="http://www.dbms2.com/2010/06/14/best-practices-analytic-database-poc/">its previous policy of almost never doing onsite POCs (Proofs of Concept)</a>; both parts of that ring true with me. But Gartner also rightly dings Oracle for various issues in cost and cumbersomeness. Overall, while I agree there are organizations for which Oracle should indeed be a top-ranked choice, there are many others who shouldn&#8217;t put Oracle on their short list.</li>
<li>Third in the Gartner MQ rankings is IBM.
<ul>
<li>Gartner gets so caught up in reciting the names of various IBM product offerings that it neglects to say much good about DB2 itself. (I tend to have a similar problem.)</li>
<li>But Gartner does mention concurrency as a strength. I agree, especially if we presume that that was a reference to DB2 rather than Netezza.</li>
<li>Gartner cites Netezza&#8217;s post-acquisition annual growth rate as 30%. Gartner seems to think this is a good number. I disagree, but in Netezza&#8217;s defense, it has had to endure IBM&#8217;s post-acquisition on-boarding process.</li>
</ul>
</li>
<li>Arguably fourth in the Gartner Data Warehouse Magic Quadrant rankings is EMC/Greenplum.
<ul>
<li>In general, Gartner likes the taste of Greenplum Kool-Aid.</li>
<li>Gartner neglects to ding Greenplum for concurrency challenges, which I view as an oversight given Gartner&#8217;s general stress on that area.</li>
<li>Gartner does ding Greenplum for support challenges.</li>
<li>Gartner neglects to praise Greenplum for true <a href="http://www.dbms2.com/2009/10/14/greenplum-hybrid-columnar/">hybrid row/columnar data management</a>, a feature shared by <a href="http://www.dbms2.com/2011/09/22/teradata-columnar-compression/">Teradata</a> and <a href="http://www.dbms2.com/2009/08/04/pax-analytica-row-and-column-stores-begin-to-come-together/">Vertica</a>, among others, but not by <a href="http://www.dbms2.com/2011/02/06/columnar-compression-database-storage/">Oracle</a>, DB2, or Netezza.</li>
<li>Gartner located a half-petabyte Greenplum database. This doesn&#8217;t surprise me, even though Greenplum has frequently made exaggerated claims about large-size database successes in the past.</li>
<li>Gartner reports a &gt;400 figure for Greenplum customers, which is plausible.</li>
</ul>
</li>
<li>In its first deviation from strict one-dimensional rank ordering, the Gartner Magic Quadrant ranks Sybase ahead of Greenplum in completeness of vision but behind in &#8220;ability to execute&#8221;.
<ul>
<li>If that were the other way around, it might make more sense. Greenplum promises anything and everything you might ever want for analytic data management or the associated analysis; but Sybase has vastly more analytic DBMS users than Greenplum does, running a variety of demanding workloads.</li>
<li>Gartner appears to think that Sybase IQ requires less database administration than I think it does.</li>
<li>Gartner seems concerned that SAP will position HANA and Sybase ASE as, between them, the only DBMS you&#8217;ll ever need, casting doubt on Sybase IQ&#8217;s future. I wouldn&#8217;t worry about that if you have a problem you want to solve today.</li>
</ul>
</li>
<li>The Gartner Magic Quadrant for Data Warehouse Database Management Systems ranks Microsoft sixth overall, despite noting that there isn&#8217;t a single production reference for Microsoft&#8217;s Parallel Data Warehouse. In support of this ranking it cites, for example, the compression feature, which distinguishes Microsoft SQL Server from no other product on the list except Kognitio. If you have such an undemanding data warehousing problem that many different analytic DBMS could meet your needs, there&#8217;s a good chance Microsoft SQL Server can also do the job; and if you&#8217;ve bought into the Microsoft technology stack, you might as well keep going down that path. Otherwise, I don&#8217;t know why somebody should adopt Microsoft&#8217;s offering at this time.</li>
<li>Seventh along the main diagonal path in the Gartner Magic Quadrant is HP Vertica. I&#8217;d rank Vertica higher than that, but in fairness I note two execution concerns. First, HP has a lousy track record, both in acquisitions and in data warehousing/analytics. Second, Vertica is bad about answering my email. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Anyhow, Gartner doesn&#8217;t seem to have given Vertica credit either for <a href="http://www.dbms2.com/2011/06/20/columnar-dbms-vendor-customer-metrics/">its full customer count or for the multiple petabyte-scale databases Vertica runs</a>.</li>
<li>1010data is an outlier, with Gartner noting that it only partly fits in with other &#8220;Data Warehousing Database Management&#8221; companies, and hence kind of confessing that 1010data on the Magic Quadrant is somewhat arbitrary. Stuff like that is bound to happen, given <a href="http://www.strategicmessaging.com/no-market-categorization-is-ever-precise/2011/03/01/">the inherent difficulties of defining market categories</a>. Anyhow, my thoughts on 1010data include:
<ul>
<li>I&#8217;m nervous about the fact that 1010data doesn&#8217;t actually control its own DBMS technology, but rather relies on old code from the small private company KX Systems.</li>
<li>There are three main reasons to consider 1010data:
<ul>
<li>You want to enter the data mart outsourcing business in a casual way, and you like its SaaS offering.</li>
<li>You want to engage in <a href="http://www.dbms2.com/2010/05/15/stakeholder-facing-analytics/">stakeholder-facing analytics</a> in a casual way, and you like its SaaS offering.</li>
<li>You love 1010data&#8217;s particular set of interactive analytic features and performance.</li>
</ul>
</li>
</ul>
</li>
<li>Back to the main path winding along the Gartner Magic Quadrant main diagonal &#8212; next up is ParAccel. While I question some of the peripheral comments, I agree with Gartner&#8217;s core messages that:
<ul>
<li>ParAccel, the product, is blazingly fast in certain use cases.</li>
<li>ParAccel, the company, is dangerously small.</li>
</ul>
</li>
<li>Eighth on the Gartner MQ&#8217;s main path is Kognitio. This is too high. Kognitio positions itself as offering in-memory DBMS, yet stubbornly refuses to do any kind of data compression. That&#8217;s an awful combination of choices. As for using Kognitio&#8217;s data warehousing SaaS offering &#8212; why would you do that, when more modern products are available on a SaaS/cloud basis as well?</li>
<li>Ninth in the Gartner Magic Quadrant main rankings is SAND.
<ul>
<li>The SAND section is not a triumph of Gartner accuracy. For example:
<ul>
<li>Gartner completely missed <a href="http://www.dbms2.com/2011/11/12/clarifying-sands-customer-metrics-positioning-and-technical-story/">the errors in SAND&#8217;s reported customer counts</a>.</li>
<li>Gartner refers to SAND as being &#8220;in existence for approximately nine years&#8221;, which is too low by at least a factor of 2.</li>
<li>Gartner says &#8220;SAND is a privately held company&#8221;, even though <a href="http://itmarketstrategy.com/2009/06/07/sand-technology-a-risky-bet/">Merv knows better than that</a>.</li>
</ul>
</li>
<li>Otherwise, Gartner&#8217;s opinion on SAND seems to boil down to &#8220;Interesting technology and ideas, but dangerously small company.&#8221; I agree.</li>
</ul>
</li>
<li>Tenth and too low in the Gartner MQ main rankings is Infobright.
<ul>
<li>At least by some metrics (e.g. customer count), Infobright isn&#8217;t as dangerously small as ParAccel, SAND, Kognitio, et al.</li>
<li>That said, Infobright is small and focused on <a href="http://www.dbms2.com/2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a>. So I wouldn&#8217;t be confident in Infobright&#8217;s future technology path for human-generated data use cases.</li>
<li>Infobright&#8217;s performance is uneven &#8212; blazing in cases where the Knowledge Grid helps, but not necessarily stellar by analytic DBMS standards when full table scans are called for.</li>
<li>I agree with Gartner that the possibility of Oracle/MySQL future shenanigans is a concern. But while the energy behind MySQL forking efforts doesn&#8217;t seem too great right now, I&#8217;d expect them to revive and offer a successful escape path if it seemed Oracle was going to indeed play hardball.</li>
<li>Also, given that it&#8217;s already an open source vendor, there are various kinds of assurances Infobright could give that would also help alleviate customer concerns.</li>
</ul>
</li>
<li>Actian, formerly Ingres, took a big tumble in Gartner&#8217;s rankings versus last year, when I simply wrote &#8220;<a href="http://www.dbms2.com/2011/02/05/gartner-magic-quadrant-data-warehouse-database-management-2010/">What Gartner said in connection with <strong>Ingres</strong> is too inaccurate to deserve detailed attention</a>.&#8221; I&#8217;m even a little harsher about <a href="http://www.dbms2.com/2011/09/25/ingres-actian/">Ingres/Actian&#8217;s DBMS products and prospects</a> than Gartner is, but at least now we&#8217;re in the same ballpark.</li>
<li>Along with Infobright, ParAccel, and SAND, <a href="http://www.dbms2.com/2011/11/12/exasol-update/">Exasol</a> appears to be another of the &#8220;good columnar technology/small company&#8221; crowd. As with other such products, one should be careful about fit-and-finish features that are missing today, as there is no assurance they&#8217;ll be added in a timely manner going forward.</li>
<li>illuminate Solutions, which was on last year&#8217;s Gartner list, <a href="http://www.dbms2.com/2012/01/16/has-illuminate-solutions-joined-the-choir-invisible/">now appears to be an ex-company</a>.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2012/02/08/gartner-magic-quadrant-data-warehouse-2011-2012/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Exasol update</title>
		<link>http://www.dbms2.com/2011/11/12/exasol-update/</link>
		<comments>http://www.dbms2.com/2011/11/12/exasol-update/#comments</comments>
		<pubDate>Sun, 13 Nov 2011 02:37:13 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Exasol]]></category>
		<category><![CDATA[Market share and customer counts]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Workload management]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=5661</guid>
		<description><![CDATA[I last wrote about Exasol in 2008. After talking with the team Friday, I&#8217;m fixing that now. The general theme was as you&#8217;d expect: Since last we talked, Exasol has added some new management, put some effort into sales and marketing, got some customers, kept enhancing the product and so on. Top-level points included: Exasol&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p><a href="../../../../../2008/08/16/exasol-technical-briefing/">I last wrote about Exasol in 2008</a>. After talking with the team Friday, I&#8217;m fixing that now. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  The general theme was as you&#8217;d expect: Since last we talked, Exasol has added some new management, put some effort into sales and marketing, got some customers, kept enhancing the product and so on.</p>
<p>Top-level points included:</p>
<ul>
<li>Exasol&#8217;s technical philosophy is substantially the same as before, albeit not with as extreme a focus on fitting everything in RAM.</li>
<li>Exasol believes its flagship DBMS EXASolution has great performance on a load-and-go basis.</li>
<li>Exasol has 25 EXASolution customers, all in Germany.*</li>
<li>5 of those are &#8220;cloud&#8221; customers, at hosting providers engaged by Exasol.</li>
<li>EXASolution database sizes now range from the low 100s of gigabytes up to 30 terabytes.</li>
<li>Pretty much the whole company is in Nuremberg.</li>
</ul>
<p><span id="more-5661"></span><em>*That excludes some money from Hitachi. Exasol&#8217;s Hitachi partnership is still in limbo, an apparent casualty of the world economic crisis.</em></p>
<p>On the technical side:</p>
<ul>
<li>As noted in my 2008 post, EXASolution is a columnar, no-head-node MPP (Massively Parallel Processing) DBMS.</li>
<li>The main way EXASolution compresses data is via dictionary/tokenization. 5:1 is a typical compression ratio before mirroring and so on, out of a 2-10:1 range.</li>
<li>EXASolution writes data to blocks in memory that are smaller than what is otherwise its preferred size (1/2 to 5 megabytes). These are sent to disk, where merge eventually happens. Exasol insists that write performance has always been fully satisfactory to customers to date.</li>
<li>EXASolution doesn&#8217;t have much in the way of performance tuning knobs. Exasol says they aren&#8217;t needed, and says that one really can start an EXASolution POC (Proof of Concept) in a day or so.</li>
<li>EXASolution doesn&#8217;t have much in the way of workload management capabilities, except what&#8217;s automagic (e.g., short query bias). However, it does collect statistics you can query via your favorite BI tool.</li>
<li>EXASolution doesn&#8217;t have much in the way of <a href="../../../../../2011/02/24/analytic-platforms/">analytic platform</a> capabilities, although there is some Lua-based scripting. However, there&#8217;s something NDA in the analytic platform area Coming Soon.*</li>
</ul>
<p>In general, the whole thing sounds somewhat like ParAccel, at least at a high level.</p>
<p><em>*Exasol is not and never has been our client, but we can keep secrets for them even so.</em></p>
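<p>For readers who haven&#8217;t seen dictionary/tokenization compression up close, here is a minimal Python sketch of the general technique applied to one column. It is a generic illustration only, not a description of how EXASolution does it; the function names <code>dictionary_encode</code> and <code>dictionary_decode</code> and the sample column are assumptions made up for the example.</p>
<pre><code>def dictionary_encode(values):
    """Replace each distinct value with a small integer token."""
    dictionary = {}  # maps value to token
    tokens = []
    for value in values:
        if value not in dictionary:
            # First time this value is seen: assign the next free token.
            dictionary[value] = len(dictionary)
        tokens.append(dictionary[value])
    # Build the reverse lookup (token position holds the original value).
    lookup = [None] * len(dictionary)
    for value, token in dictionary.items():
        lookup[token] = value
    return lookup, tokens


def dictionary_decode(lookup, tokens):
    """Turn tokens back into the original column values."""
    return [lookup[token] for token in tokens]


# Hypothetical sample column with few distinct values.
column = ["Germany", "Austria", "Germany", "Germany", "Switzerland", "Austria"]
lookup, tokens = dictionary_encode(column)
</code></pre>
<p>The fewer distinct values a column has relative to its row count, the better this works, which is one reason real-world ratios vary by column and by data set.</p>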
<p>Naturally, Exasol believes EXASolution has fine concurrency, with at least one customer routinely running 2000 concurrent users, 200 concurrent sessions (via connection pooling), and 5-10 concurrent queries. Another customer has 3500 Cognos users. 1-200 concurrent queries appears to be the record peak load. Anyhow, Exasol says that plans to offer real workload management could be accelerated if a need were discovered.</p>
<p>Exasol says it almost never loses POCs, but admits that it competes fairly rarely against Vertica and ParAccel, no doubt for reasons of geography. Exasol boasts one visible Sybase IQ replacement (Sony Music).</p>
<p>While Exasol&#8217;s sales to date have been in Germany, there are plans to change that soon. At least one sales cycle is well underway in Eastern Europe. Offices in other Germanic countries are planned. Existing customers are planning to deploy additional copies outside Germany. Discussions are underway regarding other geographies, e.g. English-speaking ones.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/11/12/exasol-update/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Draft slides on how to select an analytic DBMS</title>
		<link>http://www.dbms2.com/2009/02/04/draft-slides-on-how-to-select-an-analytic-dbms/</link>
		<comments>http://www.dbms2.com/2009/02/04/draft-slides-on-how-to-select-an-analytic-dbms/#comments</comments>
		<pubDate>Wed, 04 Feb 2009 22:44:12 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Buying processes]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Exasol]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Infobright]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[Microsoft and SQL*Server]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[SAND Technology]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=681</guid>
		<description><![CDATA[I need to finalize an already-too-long slide deck on how to select an analytic DBMS by late Thursday night.  Anybody see something I&#8217;m overlooking, or just plain got wrong? Edit: The slides have now been finalized.]]></description>
			<content:encoded><![CDATA[<p>I need to finalize an already-too-long <a href="http://www.monash.com/uploads/How-to-buy-data-warehouse-draft-February-2009.ppt">slide deck</a> on how to select an analytic DBMS by late Thursday night.  Anybody see something I&#8217;m overlooking, or just plain got wrong?</p>
<p><em>Edit: The slides have now been <a href="http://www.dbms2.com/2009/02/06/final-for-now-slides-on-how-to-select-a-data-warehouse-dbms/">finalized</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/02/04/draft-slides-on-how-to-select-an-analytic-dbms/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Dividing the data warehousing work among MPP nodes</title>
		<link>http://www.dbms2.com/2008/09/05/mpp-data-warehouse-nodes/</link>
		<comments>http://www.dbms2.com/2008/09/05/mpp-data-warehouse-nodes/#comments</comments>
		<pubDate>Fri, 05 Sep 2008 08:48:47 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Calpont]]></category>
		<category><![CDATA[Exasol]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=522</guid>
		<description><![CDATA[I talk with lots of vendors of MPP data warehouse DBMS. I&#8217;ve now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives. The base-case MPP DBMS architecture is one in which there are two kinds of nodes: A boss node, whose jobs include: Receiving [...]]]></description>
			<content:encoded><![CDATA[<p>I talk with lots of vendors of MPP data warehouse DBMS.  I&#8217;ve now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives.</p>
<p style="margin-bottom: 0in;"><span id="more-522"></span>The base-case MPP DBMS architecture is one in which there are two kinds of nodes:</p>
<ul>
<li>A boss node, whose jobs include:
<ul>
<li>Receiving and parsing queries</li>
<li>Optimizing queries, determining execution plans, and sending execution plans to the nodes</li>
<li>Receiving result sets and sending them back to the querier</li>
</ul>
</li>
<li>Worker nodes, which do their part of the query execution job and eventually ship data back to the boss node (the &#8220;head&#8221;)</li>
</ul>
<p style="margin-bottom: 0in;">In primitive forms of this architecture, there&#8217;s a “fat head” that does altogether too much aggregation and query resolution.  In more mature versions, data is shipped intelligently from worker nodes to their peers, reducing or eliminating “fat head” bottlenecks.</p>
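<p style="margin-bottom: 0in;">To make the base case concrete, here is a toy Python sketch of a boss node that sends the same plan to every worker node and merges their partial results. Everything in it (the <code>Worker</code> and <code>Boss</code> classes, the sample rows, the sum-by-key query) is a made-up illustration under simplifying assumptions, not any vendor&#8217;s design.</p>
<pre><code>from collections import Counter


class Worker:
    """Holds one slice of the data and aggregates it locally."""

    def __init__(self, rows):
        self.rows = rows  # list of (key, amount) pairs

    def partial_sum_by_key(self):
        totals = Counter()
        for key, amount in self.rows:
            totals[key] += amount
        return totals


class Boss:
    """Sends the same plan to every worker and merges the partial results."""

    def __init__(self, workers):
        self.workers = workers

    def sum_by_key(self):
        merged = Counter()
        for worker in self.workers:
            # Counter.update adds the partial totals key by key.
            merged.update(worker.partial_sum_by_key())
        return dict(merged)


# Hypothetical data, split across two worker nodes.
workers = [
    Worker([("books", 10), ("music", 5)]),
    Worker([("books", 7), ("games", 3)]),
]
result = Boss(workers).sum_by_key()
</code></pre>
<p style="margin-bottom: 0in;">Because each worker aggregates its own slice first, the boss only merges small partial results. A boss that instead pulled raw rows from every worker and did all the aggregation itself would be the “fat head” case.</p>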
<p style="margin-bottom: 0in;">Exceptions to the base case include Vertica and Exasol.  In their systems, all nodes run identical software.  At the other extreme, some vendors use dedicated nodes for particular purposes.  For example, Aster Data famously has special nodes for bulk data loading and export.  Greenplum has a logical split between nodes that execute queries and nodes that talk to storage, and is considering offering the option of physically separating them in a future release.</p>
<p style="margin-bottom: 0in;">The basic tradeoffs between these schemes go something like this:</p>
<ul>
<li>If there are more kinds of dedicated nodes, real-time load-balancing is harder; you&#8217;re more likely to have idle capacity.</li>
<li>If there are more kinds of dedicated nodes, you can optimize hardware better, by using different kinds of hardware for different kinds of nodes.  Potentially, this is a bigger factor if some kinds of nodes have dedicated disks attached and some don&#8217;t.</li>
</ul>
<p style="margin-bottom: 0in;">Calpont, which hasn&#8217;t actually shipped a DBMS yet, has an interesting twist. They&#8217;re building a columnar DBMS in which the querying work is split between a kind of worker node, which does the query processing, and a storage node, which talks to disk.  These nodes are not in any kind of one-to-one correspondence; any worker node can talk with any storage node.  Calpont believes that in the future some of the storage node logic can migrate into storage systems themselves, in almost a Netezza-like strategy, but on more standard equipment.</p>
<p style="margin-bottom: 0in;">The Calpont story may actually make more sense in a shared-disk storage-area-network implementation than for a fully shared-nothing MPP, but that&#8217;s a subject for a different post.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2008/09/05/mpp-data-warehouse-nodes/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Exasol technical briefing</title>
		<link>http://www.dbms2.com/2008/08/16/exasol-technical-briefing/</link>
		<comments>http://www.dbms2.com/2008/08/16/exasol-technical-briefing/#comments</comments>
		<pubDate>Sun, 17 Aug 2008 00:14:56 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exasol]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[EXACluster]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=488</guid>
		<description><![CDATA[It took 5 ½ months after my non-technical introduction, but I finally got a briefing from Exasol&#8217;s technical folks (specifically, the very helpful Mathias Golombek and Carsten Weidmann). Here are some highlights: Like Vertica and ParAccel, Exasol is in the business of MPP shared-nothing software-only columnar data warehouse database management. Exasol has no concept of [...]]]></description>
			<content:encoded><![CDATA[<p>It took 5 ½ months after my <a href="../2008/02/26/introduction-to-exasol/">non-technical introduction</a>, but I finally got a briefing from Exasol&#8217;s technical folks (specifically, the very helpful Mathias Golombek and Carsten Weidmann). Here are some highlights: <span id="more-488"></span></p>
<ul>
<li>
<p style="margin-bottom: 0in;">Like Vertica and ParAccel, <strong>Exasol is in the business of MPP shared-nothing software-only columnar data warehouse database management.</strong></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol has no concept of a “head” or “master” node,</strong> with different software than the others.  Instead, all nodes are peers.  For example, any node&#8217;s IP address can be given to an application; that node will then parse the SQL and distribute it appropriately to the other nodes.</p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol is ACID-compliant,</strong> swapping blocks to disk when there&#8217;s an update.  And one certainly 	can query data that&#8217;s on disk &#8230;</p>
</li>
<li>
<p style="margin-bottom: 0in;">&#8230; however, Exasol&#8217;s memory 	structures are totally optimized for in-memory operation.   Exasol 	is perfectly happy to swap in different parts of the database on a 	scheduled basis every few hours, but sending queries straight to 	disk isn&#8217;t optimal. <strong>Exasol&#8217;s recommended hardware configurations 	always are designed so that most queries can be executed against 	data already in RAM.</strong><span> However, 	if for example only the last 30 days of data are in RAM and a few 	queries go against full-year data, that&#8217;s OK.</span></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol has a compression story 	typical for a columnar DBMS vendor </strong><span>– 	heavy use of dictionary/token compression, other unspecified 	compression algorithms as well, data kept compressed in RAM, etc.</span></p>
</li>
<li>
<p style="margin-bottom: 0in;">Like most 	other MPP data warehousing vendors, <strong>Exasol partitions data among 	nodes via a hash key.</strong> This is the industry&#8217;s most common 	scheme, because it has the parallelization benefits of random/equal 	distribution of data, yet still lets you get a head start on some 	hairy hash joins for extra performance.</p>
</li>
<li>
<p style="margin-bottom: 0in;"><span>Like 	Vertica, </span><strong>Exasol replicates small tables </strong><span>(e.g., 	dimension tables)</span><strong> across each node. </strong></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol&#8217;s 	optimizer creates and maintains join indexes automagically on the 	fly. </strong><span>Exasol disagreed when I 	say “Oh, like a materialized view?”  But I suspect this is the 	kind of join index that Teradata says privately is a special case of 	materialized view, and says publicly is <a href="http://datawarehouse.ittoolbox.com/groups/technical-functional/teradata-l/join-indexes-vs-classic-denormalized-base-table-1386909">a 	lot like a materialized view</a>.</span></p>
</li>
<li>
<p style="margin-bottom: 0in;"><span>Generally, 	Exasol describes its optimizer as being “very MPP-aware.”</span></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol mainly wrote its own 	code from scratch. </strong><span>Right now 	they seem to have a kind of distributed operating system called 	EXACluster running over Linux, but they seem to be replacing the 	Linux underpinnings with their own stuff.  E.g., disk access is 	going into EXACluster.</span></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>EXACluster already handles high 	availability/failover between nodes.</strong></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol replicates data among 	nodes to allow for failover.</strong><span> That sounds similar to Vertica&#8217;s approach. Also, if you add nodes 	and restart Exasol, the database will automagically be 	repartitioned.</span></p>
</li>
<li>
<p style="margin-bottom: 0in;"><span>The biggest deployed Exasol system mentioned has </span><strong>3 terabytes of user data.</strong> It is running on 5 nodes with 32 GB of RAM each.</p>
</li>
<li>
<p style="margin-bottom: 0in;"><span>For 	any given amount of total RAM a user is willing to deploy, </span><strong>Exasol 	recommends more nodes with less RAM/node. </strong> I didn&#8217;t probe 	directly as to why.</p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol 	doesn&#8217;t have stored procedures.</strong> They assert that stored 	procedures would be useful mainly for ELT/ETL, and that alternatives 	perform well enough.</p>
</li>
<li>
<p style="margin-bottom: 0in;">Like many 	data warehouse specialists, <strong>Exasol recommends ELT 	(Extract/Load/Transform) over ETL (Extract/Transform/Load).</strong></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol has user-defined 	functions (UDFs).</strong></p>
</li>
<li>
<p style="margin-bottom: 0in;"><strong>Exasol is 	working on BLOB support.</strong> Geospatial data is also on the radar 	(no pun intended), but it didn&#8217;t sound as if there was a currently 	active project.</p>
</li>
</ul>
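<p style="margin-bottom: 0in;">To make the hash-partitioning and small-table replication points above concrete, here is a minimal sketch in Python. It is not Exasol&#8217;s code. The node count, the table and column names, and the <code>node_for</code> and <code>partition</code> helpers are all hypothetical, and CRC32 merely stands in for whatever hash function a real system uses.</p>

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Map a distribution-key value to a node; CRC32 stands in for a real hash."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition(rows, key_column, num_nodes=NUM_NODES):
    """Split rows among nodes by hashing one column of each row."""
    shards = defaultdict(list)
    for row in rows:
        shards[node_for(str(row[key_column]), num_nodes)].append(row)
    return shards


# Two hypothetical fact tables, both distributed on customer_id.
orders = [{"order_id": i, "customer_id": f"c{i % 7}"} for i in range(20)]
payments = [{"payment_id": i, "customer_id": f"c{i % 7}"} for i in range(10)]

order_shards = partition(orders, "customer_id")
payment_shards = partition(payments, "customer_id")

# A small dimension table is simply copied to every node.
regions = {"c0": "north", "c1": "south"}
replicated_regions = {node: dict(regions) for node in range(NUM_NODES)}
```

<p style="margin-bottom: 0in;">Because both fact tables are hashed on the same key, rows with the same <code>customer_id</code> land on the same node, so a join on that column can run node-locally. That is one way to read the &#8220;head start on hash joins&#8221; point. The replicated dimension table can likewise be joined on any node without moving data.</p>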
<p style="margin-bottom: 0in;">We also talked about concurrency, which is always a confusing subject.  Exasol said that to date there were no more than 50 concurrent “log-ins,” which they equate to there being 1000s of named users (because queries execute so quickly).  They also say they&#8217;ve tested up to 400 concurrent queries internally.  I didn&#8217;t probe about what they&#8217;d do to balance short-running and long-running queries, in part because Exasol gives the impression that on their systems, there is no such thing as a long-running query.  But obviously this is all somewhat fuzzy.</p>
<p style="margin-bottom: 0in;">In a related point, Exasol says that overall throughput is higher when there is at least a certain number of concurrent users.  The supporting evidence offered was, of all things, <a href="http://www.tpc.org/tpch/results/tpch_price_perf_results.asp">TPC-H benchmarks</a>.  Apparently (I haven&#8217;t checked this myself), Exasol (and also ParAccel, which of course has a similar architecture) chose to run the benchmark with more than the minimum number of simultaneous users required.  SMP systems, Exasol believes, don&#8217;t exhibit similar behavior.</p>
<p style="margin-bottom: 0in;">Finally, a couple of less technical highlights:</p>
<ul>
<li>
<p style="margin-bottom: 0in;"><strong>Licensing is per gigabyte of RAM. </strong> (This fits with the whole memory-centric orientation.) The list price for 100 gigabytes of RAM is 120,000 euros.  Price doesn&#8217;t scale linearly with the amount of RAM.</p>
</li>
<li>
<p style="margin-bottom: 0in;">The partner whose name was 	redacted in February is now officially disclosed.  <strong>Exasol is 	partnering in Japan with the services side of Hitachi. </strong> Exasol 	says Hitachi has 15-20 people working on introducing Exasol to 	Japan.  Target customers are not primarily Hitachi&#8217;s hardware 	installed base.</p>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2008/08/16/exasol-technical-briefing/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Compare/contrast of Vertica, ParAccel, and Exasol</title>
		<link>http://www.dbms2.com/2008/08/12/vertica-paraccel-exasol/</link>
		<comments>http://www.dbms2.com/2008/08/12/vertica-paraccel-exasol/#comments</comments>
		<pubDate>Tue, 12 Aug 2008 22:31:21 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Exasol]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=485</guid>
		<description><![CDATA[I talked with Exasol today – at 5:00 am! &#8212; and of course want to blog about it. For clarity, I&#8217;d like to start by comparing/contrasting the fundamental data structures at Vertica, ParAccel, and Exasol. And it feels like that should be a separate post. So here goes. Exasol, Vertica, and ParAccel all store data [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked with Exasol today – at 5:00 am! &#8212; and of course want to blog about it.  For clarity, I&#8217;d like to start by comparing/contrasting the fundamental data structures at Vertica, ParAccel, and Exasol.  And it feels like that should be a separate post.  So here goes.</p>
<ul>
<li>Exasol, Vertica, and ParAccel all 	store data in columnar formats.</li>
<li>Exasol, Vertica, and ParAccel all 	compress data heavily.</li>
<li><span style="text-decoration: line-through;">Exasol and Vertica operate on 	in-memory data in compressed formats.  ParAccel decompresses the 	data when it gets to RAM</span>.   Exasol, Vertica, and ParAccel all &#8212; perhaps to varying extents &#8212; operate on 	in-memory data in compressed formats.</li>
<li>ParAccel and Exasol write data to 	what amounts to the in-memory part of their basic data structures; 	the data then gets persisted to disk.  Vertica, however, has a 	separate in-memory data structure to accept data and write it to 	disk.</li>
<li>Vertica is a disk-centric system 	that doesn&#8217;t rely on there being a lot of RAM.</li>
<li>ParAccel can be described that way 	too; however, in some cases (including on the TPC-H benchmarks), 	ParAccel recommends loading all your data into RAM for maximum 	performance.</li>
<li>Exasol is totally optimized for the assumption that queries will be run against data that has already been loaded into RAM.</li>
</ul>
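<p>As a rough illustration of what &#8220;operating on in-memory data in compressed formats&#8221; can mean, here is a minimal Python sketch of dictionary (token) compression for a single column. It does not reflect any of the three vendors&#8217; actual formats. The function names and the sample column are hypothetical.</p>

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer token."""
    dictionary = {}
    codes = []
    for value in values:
        # A new value gets the next unused token; a repeat reuses its token.
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes


def count_equal(dictionary, codes, literal):
    """Evaluate column = literal by comparing integer tokens only."""
    code = dictionary.get(literal)
    if code is None:
        return 0  # the literal never occurs in the column
    return sum(1 for c in codes if c == code)


country = ["DE", "US", "DE", "JP", "DE", "US"]
dictionary, codes = dictionary_encode(country)
matches = count_equal(dictionary, codes, "DE")
```

<p>The predicate is translated into a token once, and the scan then touches only the compact integer codes. The original strings are never rebuilt, which is the sense in which a query can run against compressed data.</p>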
<p>Beyond the above, I plan to discuss in a separate post how Exasol does MPP shared-nothing software-only columnar data warehouse database management differently than Vertica and ParAccel do shared-nothing software-only columnar data warehouse database management. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2008/08/12/vertica-paraccel-exasol/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Introduction to Exasol</title>
		<link>http://www.dbms2.com/2008/02/26/introduction-to-exasol/</link>
		<comments>http://www.dbms2.com/2008/02/26/introduction-to-exasol/#comments</comments>
		<pubDate>Wed, 27 Feb 2008 01:08:32 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exasol]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[Relational database management systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/2008/02/26/introduction-to-exasol/</guid>
		<description><![CDATA[I had a non-technical introduction today to Exasol, a data warehouse specialist that has gotten a little buzz recently for publishing TPC-H results even faster than ParAccel&#8217;s. Here are some highlights: Exasol was founded back in 2000. Exasol is a German company, with 60 employees. While I didn&#8217;t ask, the vast majority are surely German. [...]]]></description>
			<content:encoded><![CDATA[<p>I had a non-technical introduction today to Exasol, a data warehouse specialist that has gotten a little buzz recently for publishing TPC-H results even faster than ParAccel&#8217;s.  Here are some highlights:</p>
<ul>
<li>Exasol was founded back in 2000.</li>
<li>Exasol is a German company, with 60 employees.  While I didn&#8217;t ask, the vast majority are surely German.</li>
<li>Exasol has two customers. 6-8 more are Coming Real Soon. Most or all of those are in Germany, although one may be in Asia.</li>
<li>Karstadt (big German retailer) has had Exasol deployed for 3 years. The other deployed customer is the German subsidiary of data provider IMS Health.</li>
<li>[Redacted for confidentiality] is a strategic investor in and partner of Exasol.  [Redacted for confidentiality]&#8217;s only competing partnership is with Oracle.</li>
<li>Exasol&#8217;s system is more completely written from scratch than many.  E.g., all they use from Linux are some drivers, and maybe a microkernel.</li>
<li>Exasol runs in-memory. There doesn&#8217;t seem to be a disk-centric mode.</li>
<li>Exasol&#8217;s data access methods are sort of like columnar, but not exactly.  I look forward to a more technical discussion to sort that out.</li>
<li>Exasol&#8217;s claimed typical compression is 5-7X. As in the Vertica story, database operations are carried out on compressed data.</li>
<li>Exasol says it has performed a very fast TPC-H in-house at the 30 terabyte level. However, its deployed sites are probably a lot smaller than that.  IMS Health is cited in its literature as 145 gigabytes.</li>
<li>Oracle and Microsoft are listed as Exasol partners, so there may be some kind of plug-compatibility or back-end processing story.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2008/02/26/introduction-to-exasol/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

