<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; ParAccel</title>
	<atom:link href="http://www.dbms2.com/category/products-and-vendors/paraccel/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Wed, 08 Feb 2012 12:22:57 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Eight kinds of analytic database (Part 1)</title>
		<link>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-1/</link>
		<comments>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-1/#comments</comments>
		<pubDate>Tue, 05 Jul 2011 08:17:44 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Buying processes]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Database diversity]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Infobright]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[MOLAP]]></category>
		<category><![CDATA[Microsoft and SQL*Server]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[Predictive modeling and advanced analytics]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[QlikTech and QlikView]]></category>
		<category><![CDATA[SAND Technology]]></category>
		<category><![CDATA[Scientific research]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[Web analytics]]></category>
		<category><![CDATA[Workload management]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4868</guid>
		<description><![CDATA[Analytic data management technology has blossomed, leading to many questions along the lines of &#8220;So which products should I use for which category of problem?&#8221; The old EDW/data mart dichotomy is hopelessly outdated for that purpose, and adding a third category for &#8220;big data&#8221; is little help. Let&#8217;s try eight categories instead. While no categorization [...]]]></description>
			<content:encoded><![CDATA[<p>Analytic data management technology has blossomed, leading to many questions along the lines of &#8220;So which products should I use for which category of problem?&#8221; The old EDW/data mart dichotomy is hopelessly outdated for that purpose, and adding a third category for &#8220;big data&#8221; is little help.</p>
<p>Let&#8217;s try eight categories instead. While <a href="http://www.strategicmessaging.com/no-market-categorization-is-ever-precise/2011/03/01/">no categorization is ever perfect</a>, these each have at least some degree of technical homogeneity. Figuring out which types of analytic database you have or need &#8212; and in most cases you&#8217;ll need several &#8212; is a great early step in your analytic technology planning.  <span id="more-4868"></span></p>
<p><strong><em>Enterprise data warehouse</em></strong> (Full or partial)</p>
<ul>
<li><em>Kinds of data likely to be included:</em> All, but especially operational</li>
<li><em>Likely use styles:</em> All</li>
<li><em>Canonical example:</em> Central EDW for a big enterprise</li>
<li><em>Stresses:</em> Concurrency, reliability, workload management</li>
</ul>
<p>The enterprise data warehouse (EDW) ideal says that you copy all your data into one place, and drive all decision-making from there. <a href="../../../../../2011/06/21/its-official-the-grand-central-edw-will-never-happen/">Full EDWs are pipedreams</a>. Still, a partial EDW makes sense for most large enterprises, and many indeed already have one. The first product lines to consider for classical EDWs are Teradata, DB2, Exadata, and maybe Microsoft SQL Server, especially if you&#8217;re going to stress concurrency and/or operational use cases.</p>
<p><strong><em>Traditional data mart</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included:</em> All</li>
<li><em>Likely use styles:</em> Business intelligence, budgeting/consolidation, investigative</li>
<li><em>Examples:</em> Reporting servers, planning/consolidation servers, anything MOLAP, etc.</li>
<li><em>Stresses:</em> Performance, concurrency, TCO</li>
</ul>
<p>Whether or not you have something like an enterprise data warehouse, it&#8217;s common to have lighter-weight data marts as well. A traditional data mart might drive reports and dashboards. Or it might be specialized for budgeting, planning, and/or consolidation.  Some <a href="../../../../../2011/03/03/investigative-analytics/">investigative analytics</a> may be in the mix as well.</p>
<p>Any DBMS that can support an EDW can also support a data mart, but it may not be the most cost-effective way to do so. Columnar DBMS might have more attractive performance and TCO (Total Cost of Ownership); the same goes for Netezza. Some of them &#8212; e.g. Sybase IQ and <a href="../../../../../2011/06/20/vertica-release-5/">Vertica</a> &#8212; have excellent track records in concurrent usage as well. <a href="../../../../../2011/05/29/when-to-use-relational-database-management-system/">Ted Codd</a> pushed what amounts to MOLAP (Multidimensional OnLine Analytic Processing) systems for these use cases. But relational DBMS commonly do a better job, which is one reason most major MOLAP products have wound up at RDBMS companies.</p>
<p><strong><em>Investigative data mart &#8212; agile</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included:</em> All, especially customer-centric</li>
<li><em>Likely use styles</em>: Investigative</li>
<li><em>Canonical example:</em> A few analysts getting a few TB to examine</li>
<li><em>Stresses:</em> Ease of setup/load, ease of admin, price/performance</li>
</ul>
<p>Besides the traditional data mart, there are at least two other kinds. Both are focused on investigative analytics, but they&#8217;re differentiated by database size.</p>
<p>If you have just a few analysts,* looking at no more than a few terabytes of data (perhaps even just some gigabytes) &#8212; and if that data is &#8220;single-subject&#8221; and fairly homogeneous &#8212; your watchwords should be &#8220;cheap&#8221;, &#8220;easy&#8221;, and &#8220;fast&#8221;. You don&#8217;t need to invest in much hardware, in expensive software, or in much administrative effort (the analysts can be their own DBAs), nor should you endure much set-up time. Just grab a product, grab some data, and start running queries (or extracts into the statistical tool of your choice).</p>
<p><em>*If you have dozens or even hundreds of analysts hitting the same database, you&#8217;re probably back to the more concurrency-oriented scenarios outlined above.</em></p>
<p>Infobright is often cost-effective among columnar analytic DBMS. Other vendors might cut you a price break as well. If you have multiple terabytes of data, don&#8217;t rule out Netezza&#8217;s lowest-end products (even if they&#8217;d really rather sell you something bigger). Or, if you&#8217;re in the sub-terabyte range, maybe you can get by with an in-memory BI tool such as QlikView, and not do anything special on the DBMS side at all.</p>
<p><strong><em>Investigative data mart &#8212; big</em></strong></p>
<ul>
<li><em>Kinds of data likely to be included:</em> All, especially customer-centric, logs, financial trade, scientific</li>
<li><em>Likely use styles</em>: Investigative</li>
<li><em>Canonical example:</em> Single-subject 20 TB &#8211; 20 PB relational database</li>
<li><em>Stresses:</em> Performance, scale-out, analytic functionality</li>
</ul>
<p>But if you&#8217;re looking at tens of terabytes of relational data, or even more, you really do have a &#8220;big data&#8221; problem. Performance and scalability are major challenges, usually best addressed by MPP (Massively Parallel Processing) systems, such as Netezza, Vertica, Aster Data, ParAccel, Teradata, or Greenplum. Performance POCs (Proofs Of Concept) are a big part of the buying process. Vendor price negotiations are crucial too.</p>
<p><em>Actually, in the low tens of terabytes you might be able to get away with a shared-disk system that has excellent compression &#8212; e.g., columnar products like Sybase IQ, Infobright, or SAND, rather than just Vertica and ParAccel.</em></p>
<p>Assuming you have affordable, scalable query performance, the competitive differentiator can switch to additional analytic functionality. Aster, Netezza, ParAccel, Vertica, and Greenplum either offer full <a href="../../../../../2011/02/24/analytic-platforms/">analytic platforms</a>, or seem to be on the path to doing so. Teradata, which now owns Aster Data, offers substantial built-in analytic capability in its traditional products as well, and the same goes for Sybase IQ.</p>
<p><em>Continued in <a href="http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-2/">Part 2</a>, where we cover some of the more difficult use cases.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/07/05/eight-kinds-of-analytic-database-part-1/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The Vertica story (with soundbites!)</title>
		<link>http://www.dbms2.com/2011/06/20/vertica-release-5/</link>
		<comments>http://www.dbms2.com/2011/06/20/vertica-release-5/#comments</comments>
		<pubDate>Mon, 20 Jun 2011 06:14:56 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4777</guid>
		<description><![CDATA[I&#8217;ve blogged separately that: Vertica has a bunch of customers, including seven with 1 or more petabytes of data each. Vertica has progressed down the analytic platform path, with Monday&#8217;s release of Vertica 5.0. And of course you know: Vertica (the product) is columnar, MPP, and fast.* Vertica (the company) was recently acquired by HP.** [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve blogged separately that:</p>
<ul>
<li><a href="../../../../../2011/06/20/columnar-dbms-vendor-customer-metrics/">Vertica has a bunch of customers</a>, including <strong>seven with 1 or more petabytes of data each.</strong></li>
<li><a href="http://www.dbms2.com/2011/06/20/vertica-as-an-analytic-platform/">Vertica has progressed down the analytic platform path</a>, with Monday&#8217;s release of Vertica 5.0.</li>
</ul>
<p>And of course you know:</p>
<ul>
<li>Vertica (the product) is columnar, MPP, and fast.*</li>
<li>Vertica (the company) was recently acquired by HP.**</li>
</ul>
<p><span id="more-4777"></span><em>*Similar things seem true of ParAccel, but most of the other serious columnar analytic DBMS aren&#8217;t actually MPP (Massively Parallel Processing) yet. More precisely, they have shared-everything architectures, especially at the storage level.</em></p>
<p><em>** Vertica says it has a &#8220;staggering&#8221; pipeline now that it&#8217;s been with HP for a few months.  I also gather that the post-merger HP/Vertica appliance product line formally rolled out last week.</em></p>
<p>As for product maturity:</p>
<ul>
<li><a href="../../../../../2010/02/22/vertica-4/">Vertica 4.0</a> cleaned up a lot of stuff.</li>
<li>Vertica 5.0 goes further in a variety of areas, notably clustering administration and database tuning/design.</li>
</ul>
<p>But here&#8217;s something I hadn&#8217;t fully realized &#8212; <strong>Vertica claims concurrent usage as a competitive strength</strong>. By this I mean:</p>
<ul>
<li>Vertica says that it has some customers with 1000s of users, in BI/dashboarding kinds of applications.</li>
<li>Vertica asserts it can support 1000 users on a single appliance rack.</li>
<li>Vertica tries to drive POCs (Proofs Of Concept) towards testing concurrency.</li>
</ul>
<p>This is all consistent with <a href="../../../../../2010/04/16/story-of-an-analytic-dbms-evaluation/">a user example I blogged about last year</a>.</p>
<p>That said, while Vertica introduced respectable workload management features in Vertica 4.0, its main claim to concurrency is simply speed &#8212; if each query ends quickly, you never have to execute all that many of them at once.</p>
<p>Anyhow, there will be (or at least should be) articles written about Vertica 5.0, and I may not be that easy to find for comment, what with <a href="../../../../../2011/06/19/investigative-analytics-derived-data/">Enzee Universe</a> and all. So here are a few <strong>Vertica soundbites:</strong></p>
<ul>
<li>Having seven petabyte-level commercial users is an impressive testament to Vertica&#8217;s scalability. I think only Teradata could best that number among analytic DBMS, unless you want to count Hadoop/Hive.</li>
<li>Vertica&#8217;s analytic platform capabilities are new, and initially not as rich as <a href="../../../../../2010/02/22/aster-data-ncluster-4-5/">Aster Data&#8217;s</a> or <a href="../../../../../2011/04/17/netezza-twinfin-i-class-overview/">Netezza&#8217;s</a>, especially in the area of language support. But they&#8217;re a good first step.</li>
<li>Judging by the examples of EMC/Greenplum and IBM/Netezza, Vertica&#8217;s honeymoon period at HP is likely to last for a while. <em>(Edit: That said, not all is peachy at <a href="http://www.dbms2.com/2011/04/16/unpacking-the-emc-greenplum-q1-sales-disaster-rumors/">EMC/Greenplum</a>.)</em></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/06/20/vertica-release-5/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Columnar DBMS vendor customer metrics</title>
		<link>http://www.dbms2.com/2011/06/20/columnar-dbms-vendor-customer-metrics/</link>
		<comments>http://www.dbms2.com/2011/06/20/columnar-dbms-vendor-customer-metrics/#comments</comments>
		<pubDate>Mon, 20 Jun 2011 05:41:54 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Games and virtual worlds]]></category>
		<category><![CDATA[Infobright]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[Market share and customer counts]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[SAND Technology]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4742</guid>
		<description><![CDATA[Last April, I asked some columnar DBMS vendors to share customer metrics. They answered, but it took until now to iron out a couple of details. Overall, the answers are pretty impressive.  Sybase said that Sybase IQ had &#62; 2000 direct customers and &#62;500 indirect customers (i.e., end customers of OEMs). That&#8217;s counting by customers; [...]]]></description>
			<content:encoded><![CDATA[<p>Last April, I asked some columnar DBMS vendors to share customer metrics. They answered, but it took until now to iron out a couple of details. Overall, the answers are pretty impressive.  <span id="more-4742"></span></p>
<p>Sybase said that <strong>Sybase IQ</strong> had <strong>&gt;2000 direct customers</strong> and <strong>&gt;500 indirect customers</strong> (i.e., end customers of OEMs). That&#8217;s counting by customers; I know from prior discussions that Sybase IQ is running at close to two installations per customer. I also believe that Sybase counts different divisions of the same large enterprise as separate customers.</p>
<p><strong>Vertica</strong> cited a figure of <strong>500 customers</strong> as of April (end Q1?), which is close to <strong>600</strong> now, about <strong>40% or a little more direct.</strong> The difference between this and a <a href="http://www.dbms2.com/2011/02/14/now-we-know-why-vertica-has-been-so-weirdly-evasive/">2010 year-end figure of 328</a> is not only new sales, but also slow reporting by OEMs.  One cool figure &#8212; a single OEM reported 82 end sales in a single (quarterly?) report. And a number of those direct customers are substantial; Vertica&#8217;s <a href="http://www.vertica.com/customers/">customer logo</a> page features lots of telcos, lots of internet companies, and the national operation of Blue Cross/Blue Shield.</p>
<p><em>Pay no attention to small inconsistencies in the number of Vertica direct customers (250 at year-end, no more than that now); Colin Mahony just estimates these numbers for me from memory, and minor inaccuracies are quite excusable.</em></p>
<p>Even cooler &#8212; <strong>Vertica</strong> reports <strong>7 customers with a petabyte or more of user data each.</strong> About 5 of the 7 are obvious-suspect big-name firms; but unsurprisingly, those big names are under NDA. I did secure permission to say that there are 2 telecom companies, a mobile gaming vendor, another internet company, and 3 financial services outfits of various kinds.</p>
<p><strong>SAND Technology</strong> reported <strong>&gt;600 total customers,</strong> including <strong>&gt;100 direct.</strong> Since SAND has been around since the 1990s, those aren&#8217;t great average annual figures, but they&#8217;re probably more than many people (including me) thought.</p>
<p><strong>Infobright</strong> reported around <strong>200 total paying customers, 130 direct.</strong> There are surely a lot more users of open source Infobright, but precise numbers are of course hard to come by.</p>
<p>If I asked <strong>ParAccel</strong> in the April go-round, I&#8217;ve misplaced their answer, but back in October the figure was &gt;30 customers, 2 of them over 100 terabytes. I&#8217;ve seen published figures of 40+ for ParAccel since.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/06/20/columnar-dbms-vendor-customer-metrics/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Updating our vendor client disclosures</title>
		<link>http://www.dbms2.com/2011/02/28/updating-our-vendor-client-disclosures/</link>
		<comments>http://www.dbms2.com/2011/02/28/updating-our-vendor-client-disclosures/#comments</comments>
		<pubDate>Mon, 28 Feb 2011 08:03:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[About this blog]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Infobright]]></category>
		<category><![CDATA[Intel]]></category>
		<category><![CDATA[MarkLogic]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[QlikTech and QlikView]]></category>
		<category><![CDATA[SAND Technology]]></category>
		<category><![CDATA[SAP AG]]></category>
		<category><![CDATA[Schooner Information Technology]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Tableau Software]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[dbShards and CodeFutures]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3906</guid>
		<description><![CDATA[From time to time, I disclose our vendor client lists. Another iteration is below. To be clear: This is a list of Monash Advantage members. All our vendor clients are Monash Advantage members, unless &#8230; &#8230; we work with them primarily in their capacity as technology users. (A large fraction of our user clients happen [...]]]></description>
			<content:encoded><![CDATA[<p>From time to time, I <a href="http://www.monashreport.com/2010/01/06/updating-our-disclosures/">disclose</a> our vendor client lists. Another iteration is below. To be clear:</p>
<ul>
<li>This is a list of <a href="http://www.monash.com/advantage.html"><strong><em>Monash Advantage</em></strong></a> members.</li>
<li>All our vendor clients are <strong><em>Monash Advantage</em></strong> members, unless &#8230;</li>
<li>&#8230; we work with them primarily in their capacity as technology users. (A large fraction of our user clients happen to be SaaS vendors.)</li>
<li>We do not usually disclose our user clients.</li>
<li>We do not usually disclose our venture capital clients, nor those who invest in publicly-traded securities.</li>
<li>Included in the list below are two expired <strong><em>Monash Advantage</em></strong> members who haven&#8217;t said they will renew, as mentioned in <a href="http://www.strategicmessaging.com/money-analyst-attention-and-implied-analyst-endorsement/2011/02/28/">my recent post on analyst bias</a>. (You can probably imagine a couple of reasons for that obfuscation.)</li>
</ul>
<p>With that said, our vendor client disclosures at this time are:</p>
<ul>
<li>Aster Data</li>
<li>Cloudera</li>
<li>CodeFutures/dbShards</li>
<li>Couchbase</li>
<li>EMC/Greenplum</li>
<li>Endeca</li>
<li>IBM/Netezza</li>
<li>Infobright</li>
<li>Intel</li>
<li>MarkLogic</li>
<li>ParAccel</li>
<li>QlikTech</li>
<li>salesforce.com/database.com</li>
<li>SAND Technology</li>
<li>SAP/Sybase</li>
<li>Schooner Information Technology</li>
<li>Skytide</li>
<li>Splunk</li>
<li>Teradata</li>
<li>Vertica</li>
</ul>
<p><span id="more-3906"></span>That list includes the two I&#8217;m obfuscating, plus one more who just emailed to say a signed renewal contract is arriving this week. It does not include others who, less concretely, have said they will sign up soon.</p>
<p>Also, I guess there&#8217;s a bit of a gray area for Tableau. As far as I&#8217;m concerned, I&#8217;m doing <a href="http://www.dbms2.com/2011/02/12/upcoming-webinar-on-investigative-analytics/">an upcoming co-sponsored webinar</a> just for <em><strong>Monash Advantage</strong></em> member Aster Data. Indeed, I declined to contract with or bill Tableau directly for its share, because I had no good way to do that paperwork. But even so, Tableau is a co-sponsor, was involved in the planning discussions and, behind the scenes, is surely footing part of the bill.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/02/28/updating-our-vendor-client-disclosures/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Comments on the Gartner 2010/2011 Data Warehouse Database Management Systems Magic Quadrant</title>
		<link>http://www.dbms2.com/2011/02/05/gartner-magic-quadrant-data-warehouse-database-management-2010/</link>
		<comments>http://www.dbms2.com/2011/02/05/gartner-magic-quadrant-data-warehouse-database-management-2010/#comments</comments>
		<pubDate>Sat, 05 Feb 2011 15:49:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[1010data]]></category>
		<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Infobright]]></category>
		<category><![CDATA[Ingres]]></category>
		<category><![CDATA[Microsoft and SQL*Server]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[SAND Technology]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Vertica Systems]]></category>
		<category><![CDATA[Workload management]]></category>
		<category><![CDATA[illuminate Solutions]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3744</guid>
		<description><![CDATA[The Gartner 2010 Data Warehouse Database Management Systems Magic Quadrant is out. I shall now comment, just as I did to varying degrees on the 2009, 2008, 2007, and 2006 Gartner Data Warehouse Database Management System Magic Quadrants. Note: Links to Gartner Magic Quadrants tend to be unstable. Please alert me if any problems arise; [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.gartner.com/technology/media-products/reprints/teradata/vol3/article1/article1.html">Gartner 2010 Data Warehouse Database Management Systems Magic Quadrant</a> is out. I shall now comment, just as I did to varying degrees on the <a href="../../../../../2010/02/10/gartner-magic-quadrant-data-warehouse-2009-2010/">2009</a>, <a href="../../../../../2009/01/12/gartners-2008-data-warehouse-database-management-system-magic-quadrant-is-out/">2008</a>, <a href="../../../../../2007/10/19/gartner-2007-magic-quadrant-for-data-warehouse-database-management-systems/">2007</a>, and <a href="../../../../../2006/10/03/vendor-segmentation-for-data-warehouse-dbms/">2006</a> Gartner Data Warehouse Database Management System Magic Quadrants.</p>
<p><em>Note: Links to Gartner Magic Quadrants tend to be unstable. Please alert me if any problems arise; I&#8217;ll edit accordingly.</em></p>
<p>In <a href="../../../../../2009/01/12/gartners-2008-data-warehouse-database-management-system-magic-quadrant-is-out/">my comments on the 2008 Gartner Data Warehouse Database Management Systems Magic Quadrant</a>, I observed that <strong>Gartner&#8217;s &#8220;completeness of vision&#8221; scores were generally pretty reasonable,</strong> but their<strong> &#8220;ability to execute&#8221; rankings were somewhat bizarre;</strong> the same remains true this year. For example, Gartner ranks Ingres higher by that metric than Vertica, Aster Data, ParAccel, or Infobright. Yet each of those companies is growing nicely and delivering products that meet serious cutting-edge analytic DBMS needs, neither of which has been true of Ingres since about 1987.  <span id="more-3744"></span></p>
<p>The general list of &#8220;market forces, end-user expectations and vendors&#8217; resulting solution approaches&#8221; at the top of the 2010 Gartner Data Warehouse Database Management System Magic Quadrant article is a mixed bag. Following Gartner&#8217;s order, I&#8217;ll address those first, and particular companies cited afterwards. Specific items and comments include:</p>
<ul>
<li><strong>&#8220;Increased demand for optimization techniques and performance enhancement.&#8221;</strong> Gartner seems to be saying that data warehouse DBMS buyers want lists of specific, esoteric performance features. Well, buyers always want their DBMS to run fast, and they&#8217;d like the products to be mature enough to have been through a few rounds of <a href="../../../../../2009/08/21/bottleneck-whack-a-mole/">Bottleneck Whack-A-Mole</a>, but otherwise I&#8217;m not sure I&#8217;d put that at the top of my list.</li>
<li><strong>&#8220;The argument made by purchasing departments that buying power increases when dealing with a single, incumbent vendor.&#8221;</strong> I agree that <a href="../../../../../2011/02/02/exadata-notes/">vendor consolidation and account control</a> are a huge part of the Oracle, Microsoft, IBM and even Teradata stories. (Vertica can prove it&#8217;s 10X more price-performant than Oracle and still not get the business.) But it&#8217;s not just about price negotiations; once annual maintenance is included, one has to squint pretty hard to see Oracle as a low-cost alternative. Also important is reducing the total number of product-specific skill sets needed on the IT staff.</li>
<li><strong>&#8220;Prepackaged, prebalanced warehouse environments delivered using data warehouse appliances.&#8221;</strong> Yep. To varying extents, Oracle, Microsoft, Teradata, and IBM are all committed to designed-hardware strategies.</li>
<li><strong>&#8220;Expectations for the delivery of on-site POCs.&#8221;</strong> Honestly, not as many buyers insist on on-site Proofs of Concept as should. Still, Oracle is shameful in its reluctance to do them. (Teradata tries to avoid them too, for obvious reasons of expense, but is much more gracious about capitulating when the buyer insists.)</li>
<li><strong>&#8220;Cost controls and data warehouse performance management.&#8221;</strong> See next comment.</li>
<li><strong>&#8220;Demands for delivering a fully mixed workload.&#8221;</strong> I&#8217;d have phrased the workload management and administrative tools points rather differently than this, but so be it.</li>
<li><strong>&#8220;Demands for departmental analytics delivered quickly via data marts.&#8221;</strong> Agreed. Data-mart-only installations are a huge part of the analytic DBMS market. <a href="../../../../../2009/06/08/the-future-of-data-marts/">Data mart spin-out</a> is also important.</li>
<li><strong>&#8220;Wider indexing and fast performance within clusters of data, delivered via column-based solutions.&#8221;</strong> This bizarrely seems to conflate column stores and parallel processing (both of which are of course highly important).</li>
<li><strong>&#8220;A wave of new data warehouse implementers seeking fast-track, low-risk delivery.&#8221;</strong> Well, yes. Netezza noticed that quite some years ago. And by now the <a href="../../../../../2010/04/12/enterprise-data-warehouse-edw-myt/">long-gestation EDW (Enterprise Data Warehouse)</a> is widely disliked.</li>
<li><strong>&#8220;Global organizations seeking distributed solutions as potential architecture.&#8221;</strong> If this is the MPP point, it&#8217;s oddly phrased. If this is a suggestion that data warehouses should be partitioned across wide-area networks, it&#8217;s just plain odd. If it&#8217;s a reiteration that departments like to control their own data marts, I agree. And if it&#8217;s a comment on keep-data-in-the-country privacy laws, it could be the most prescient thing Donald Feinberg has said in many years.</li>
</ul>
<p>Long though it is, that list of general items and issues for the 2010 Gartner Data Warehouse Database Management System Magic Quadrant has some gaps. Most glaringly, I don&#8217;t see any references to <a href="../../../../../2011/01/24/analytic-computing-system/">advanced analytics</a> in general, or even to the specific case of <a href="../../../../../2010/05/15/further-clarifying-in-database-mpp-sas/">integrated predictive analytics</a>. There&#8217;s also nothing about solid-state memory or other storage-technology considerations, although in fairness it&#8217;s still early days for much of what vendors conceive of as competitive differentiation in those respects.</p>
<p>Here are some vendor-specific comments on the 2010 Gartner Data Warehouse Database Management System Magic Quadrant:</p>
<ul>
<li>It&#8217;s pretty bizarre to compare <strong>1010data</strong> to database.com or Microsoft Azure. Kognitio would be a better choice. So would cloud-hosted instances of Vertica, Aster Data nCluster, or others.</li>
<li>Gartner&#8217;s comments on <strong>Aster Data</strong> and nCluster are actually pretty reasonable.</li>
<li>Gartner&#8217;s comments on <strong>EMC/Greenplum</strong> are a bit Kool-Aid-drinky, and don&#8217;t account for the inevitable flailing that occurs right after an acquisition. But otherwise they&#8217;re pretty reasonable.</li>
<li>I don&#8217;t take <strong>IBM&#8217;s</strong> super-comprehensive-all-inclusive architectural stories as seriously as Gartner does.</li>
<li>I don&#8217;t take <strong>Netezza&#8217;s</strong> small stable of OEM partners as seriously as Gartner does. I also don&#8217;t share Gartner&#8217;s optimism for the continuation of Netezza&#8217;s NEC partnership in the face of IBM&#8217;s Netezza ownership.</li>
<li>I&#8217;m even more skeptical about <a href="../../../../../2008/03/27/the-illuminate-guys-have-a-cto-blog/">illuminate</a> than Gartner is.</li>
<li>I&#8217;m delighted that Gartner has adopted my phrase <a href="../../../../../2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a> (<strong>Infobright</strong> is one of several firms pushing that one).</li>
<li>&#8220;Only open-source column-store DBMS&#8221; is a bit exaggerated, but Infobright is indeed the only one with serious traction, or offered by a serious analytic DBMS vendor.</li>
<li>What Gartner said in connection with <strong>Ingres</strong> is too inaccurate to deserve detailed attention.</li>
<li>While Gartner&#8217;s write-up of <strong>Kognitio</strong> is a bit confused, that&#8217;s excusable. Kognitio&#8217;s strategy changes often.</li>
<li>I&#8217;m not persuaded by the claim of low <strong>Microsoft</strong> TCO. The days when Microsoft&#8217;s tools were vastly better than the competition&#8217;s are long gone. And using an OLTP DBMS for data warehousing generally takes more people effort than using something more purpose-built.</li>
<li>Gartner is right to ding <strong>Oracle</strong> for high prices, high people costs, and unwillingness to do onsite POCs.</li>
<li>Gartner is right that <strong>Exadata</strong> is a huge improvement over non-Exadata Oracle data warehousing.</li>
<li>Gartner is right to suggest that Exadata can easily handle data warehouses over 20 terabytes in size, but wrong to suggest that software-only Oracle also can. Just because the pain is less than it was with earlier releases of Oracle doesn&#8217;t mean it isn&#8217;t still bad.</li>
<li>Gartner&#8217;s comments on <strong>ParAccel</strong> are pretty reasonable.</li>
<li>Gartner&#8217;s comments on compression in connection with <strong>SAND</strong> make no technical sense (tokenization is a key form of columnar compression, not an alternative to it). Also, SAP&#8217;s acquisition of Sybase is a business challenge for SAND, not a technical one.</li>
<li>Unless I&#8217;m forgetting something, <strong>Sybase IQ</strong> has no more in-database data mining than any other Fuzzy Logix partner does.</li>
<li>Gartner failed to note that, like other DBMS dating back to the 1990s and before, Sybase IQ is more complex to administer than some newer products are.</li>
<li>Gartner&#8217;s take on <strong>Teradata</strong> is pretty reasonable.</li>
<li>Gartner&#8217;s take on <strong>Vertica</strong>, while sloppy, is basically sensible. However, Gartner failed to note that Vertica is a laggard in non-query analytics. (I am sure those deficiencies are being addressed, but Vertica&#8217;s competitors are moving ahead as well.)</li>
</ul>
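<p>On the SAND point above, here is a minimal Python sketch of why tokenization counts as a form of columnar compression: each distinct value in a column is stored once in a dictionary, and the column itself becomes a list of small integer tokens. The function names and sample data are hypothetical, and this is not a description of how SAND, Sybase IQ, or any other product actually implements it.</p>

```python
def tokenize_column(values):
    """Dictionary-encode one column; return (dictionary, tokens)."""
    dictionary = []  # distinct values, in first-seen order
    index = {}       # value -> integer token
    tokens = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        tokens.append(index[value])
    return dictionary, tokens


def detokenize_column(dictionary, tokens):
    """Rebuild the original column from its dictionary and tokens."""
    return [dictionary[token] for token in tokens]


# Hypothetical sample column with heavily repeated values.
states = ["Massachusetts", "California", "Massachusetts", "California", "Texas"]
dictionary, tokens = tokenize_column(states)
restored = detokenize_column(dictionary, tokens)
```

<p>The more often values repeat within a column, the more space the token list saves relative to the raw strings, which is why the technique fits column stores so naturally.</p>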
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/02/05/gartner-magic-quadrant-data-warehouse-database-management-2010/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>ParAccel PADB technical notes</title>
		<link>http://www.dbms2.com/2011/02/03/paraccel-padb-technical-notes/</link>
		<comments>http://www.dbms2.com/2011/02/03/paraccel-padb-technical-notes/#comments</comments>
		<pubDate>Thu, 03 Feb 2011 06:13:32 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Storage]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3730</guid>
		<description><![CDATA[I posted last October about PADB (ParAccel Analytic DataBase), but held back on various topics since PADB 3.0 was still under NDA. By the time PADB 3.0 was released, I was on blogging hiatus. Let&#8217;s do a bit of ParAccel catch-up now. One big part of PADB 3.0 was an analytics extensibility framework. If we [...]]]></description>
			<content:encoded><![CDATA[<p>I posted last October about <a href="../../../../../2010/10/17/paraccel/">PADB (ParAccel Analytic DataBase)</a>, but held back on various topics since PADB 3.0 was still under NDA. By the time PADB 3.0 was released, I was on <a href="../../../../../2010/11/10/where-im-at-now/">blogging hiatus</a>. Let&#8217;s do a bit of ParAccel catch-up now.</p>
<p>One big part of PADB 3.0 was an analytics extensibility framework. Here is how PADB matches up against my recent <a href="../../../../../2011/01/24/analytic-computing-system/">analytic computing system checklist</a>: <span id="more-3730"></span></p>
<ul>
<li>ParAccel is proud of PADB&#8217;s coverage in analytics-oriented SQL standard capabilities.</li>
<li>I&#8217;m not aware of any PADB SQL goodies that go beyond the ANSI standards.</li>
<li>PADB has a pretty flexible framework for user-defined functions (UDFs). In particular, ParAccel asserts this framework is even better than MapReduce, because it lets you do more steps at once, although I have trouble convincing myself that that makes sense in an important way.</li>
<li>Anyhow &#8212; like Aster Data, ParAccel asserts that the same framework on which its DBMS is built has now been exposed to people wanting to write other kinds of analytic processes. (But Aster Data describes its framework as being pretty straight MapReduce.)</li>
<li>All of PADB&#8217;s analytic process execution capabilities are subsumed in the UDF framework.</li>
<li>PADB does not yet contain much in the way of fully parallelized analytic libraries. Exception: Like many of its competitors, ParAccel has a Fuzzy Logix partnership.</li>
<li>ParAccel hasn&#8217;t focused yet on analytic development ease of use. (And that&#8217;s putting it mildly.)</li>
<li>The only language now supported for PADB analytics is C++. ParAccel promises more language support, with (at least) Java and R coming in the summer.</li>
<li>In line with its extreme focus on speed, ParAccel for now offers only in-process analytics execution.</li>
<li>In a near-future release (just heading into QA now), ParAccel promises that PADB UDFs will be very flexible in terms of what kinds of memory structures they manage. However, if you want a structure to persist past the end of a query, you need to map it to a row architecture.</li>
<li>ParAccel&#8217;s workload management is still primitive &#8212; just a short-query bias, rather than any kind of explicit prioritization. Hence, the question as to whether workload management extends to analytic process execution is fairly moot.</li>
</ul>
<p>In other news, ParAccel&#8217;s <a href="http://paraccel.com/youre-the-boss-try-extensible-deployment/">Bala Narasimhan</a> wrote:</p>
<blockquote><p>Historically, an analyst who wants to spin up a new data mart with all of this data will have to wait for a number of days for the data copy to be made available. Instead, if you <a href="http://paraccel.com/technology/san-integration/">deploy PADB with a SAN</a> that has fast and efficient snapshot and cloning capabilities, you can spin up multi-TB data marts in seconds.</p></blockquote>
<p>That turns out to be not quite as ridiculous as it sounds. The scenario is:</p>
<ul>
<li>You&#8217;re using storage-area network technology with a copy-on-write option.</li>
<li>You use the SAN&#8217;s copy-on-write option to make a second virtual copy of the database in question (or of certain tables/files/blocks from it).</li>
<li>You point a separate instance of PADB at it, either on a separate cluster (&#8220;in seconds&#8221; &#8212; yeah, right) or else via virtualization (e.g. VMware &#8212; that sounds more plausible).</li>
</ul>
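<p>As a rough illustration of why a copy-on-write clone can appear almost instantly regardless of data volume, here is a minimal Python sketch. It is a toy model, not ParAccel&#8217;s or any SAN vendor&#8217;s implementation; the <code>Snapshot</code> and <code>CowClone</code> classes and the block contents are hypothetical, and the base snapshot is assumed to be frozen once taken.</p>

```python
class Snapshot:
    """Toy frozen volume: a mapping from block number to bytes."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)


class CowClone:
    """Hypothetical copy-on-write clone of a frozen snapshot.

    Creating the clone copies no data. Only blocks written through
    the clone are stored privately.
    """

    def __init__(self, base):
        self.base = base
        self.delta = {}  # privately owned blocks, keyed by block number

    def read(self, n):
        # Prefer the clone's private copy, else fall through to the base.
        if n in self.delta:
            return self.delta[n]
        return self.base.blocks[n]

    def write(self, n, data):
        # The base snapshot is never modified through the clone.
        self.delta[n] = data


warehouse = Snapshot({0: b"orders", 1: b"customers", 2: b"clicks"})
mart = CowClone(warehouse)      # creation cost does not depend on data size
mart.write(1, b"customers_v2")  # only this block is stored a second time
```

<p>The clone shares every untouched block with the original, so extra storage is consumed only as the data mart diverges. Standing up the separate PADB instance that reads the clone is a different matter, and is not modeled here.</p>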
<p>Hmm. I have no actual knowledge of this, but it sounds like a capability that EMC should also offer soon, given <a href="../../../../../2009/06/08/the-future-of-data-marts/">the historical Greenplum focus on data mart spin-out</a>.</p>
<div id="_mcePaste" style="position: absolute; left: -10000px; top: 0px; width: 1px; height: 1px; overflow: hidden;">
<p class="MsoNormal" style="margin-bottom: 0.0001pt;">I posted last October about <a href="../2010/10/17/paraccel/">PADB (ParAccel Analytic DataBase)</a>, but held back on various topics since PADB 3.0 was still under NDA. By the time PADB 3.0 was released, I was on <a href="../2010/11/10/where-im-at-now/">blogging hiatus</a>. Let&#8217;s do a bit of ParAccel catch-up now.</p>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;">
<p class="MsoNormal" style="margin-bottom: 0.0001pt;">One big part of PADB 3.0 was an analytics extensibility framework. Matching PADB against my recent <a href="../2011/01/24/analytic-computing-system/">analytic computing system checklist</a>:</p>
<ul>
<li>ParAccel is proud of PADB&#8217;s coverage of analytics-oriented SQL standard capabilities.</li>
<li>I&#8217;m not aware of any PADB SQL goodies that go beyond the ANSI standards.</li>
<li>PADB has a pretty flexible framework for user-defined functions (UDFs). In particular, ParAccel asserts this framework is even better than MapReduce, because it lets you do more steps at once, although I have trouble convincing myself that this matters in any important way.</li>
<li>Anyhow &#8212; like Aster Data, ParAccel asserts that the same framework on which its DBMS is built has now been exposed to people wanting to write other kinds of analytic processes. (But Aster Data describes its framework as being pretty straight MapReduce.)</li>
<li>All of PADB&#8217;s analytic process execution capabilities are subsumed in the UDF framework.</li>
<li>PADB does not yet contain much in the way of fully parallelized analytic libraries. Exception: Like many of its competitors, ParAccel has a Fuzzy Logix partnership.</li>
<li>ParAccel hasn&#8217;t focused yet on analytic development ease of use. (And that&#8217;s putting it mildly.)</li>
<li>The only language now supported for PADB analytics is C++. ParAccel promises more language support, with (at least) Java and R coming in the summer.</li>
<li>In line with its extreme focus on speed, ParAccel for now offers only in-process analytics execution.</li>
<li>In a near-future release (just heading into QA now), ParAccel promises that PADB UDFs will be very flexible in terms of what kinds of memory structures they manage. However, if you want a structure to persist past the end of a query, you need to map it to a row architecture.</li>
<li>ParAccel&#8217;s workload management is still primitive &#8212; just a short-query bias, rather than any kind of explicit prioritization. Hence, the question as to whether workload management extends to analytic process execution is fairly moot.</li>
</ul>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;">
<p class="MsoNormal" style="margin-bottom: 0.0001pt;">In other news, ParAccel&#8217;s <a href="http://paraccel.com/youre-the-boss-try-extensible-deployment/">Bala Narasimhan</a> wrote:</p>
<blockquote><p>Historically, an analyst who wants to spin up a new data mart with all of this data will have to wait for a number of days for the data copy to be made available. Instead, if you <a href="http://paraccel.com/technology/san-integration/">deploy PADB with a SAN</a> that has fast and efficient snapshot and cloning capabilities, you can spin up multi-TB data marts in seconds.</p></blockquote>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;">That turns out not to be quite as ridiculous as it sounds. The scenario is:</p>
<ul>
<li>You&#8217;re using storage-area network technology with a copy-on-write option.</li>
<li>You use the SAN&#8217;s copy-on-write option to make a second virtual copy of the database in question (or of certain tables/files/blocks from it).</li>
<li>You point a separate instance of PADB at it, either on a separate cluster (&#8220;in seconds&#8221; &#8212; yeah, right) or else via virtualization (e.g. VMware &#8211; that sounds more plausible).</li>
</ul>
<p class="MsoNormal" style="margin-bottom: 0.0001pt;">Hmm. I have no actual knowledge of this, but it sounds like a capability that EMC should also offer soon, given <a href="../2009/06/08/the-future-of-data-marts/">the historical Greenplum focus on data mart spin-out</a>.</p>
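<p>To see why a copy-on-write clone can be close to instant, here is a toy Python sketch of the general technique. It is only a model: the <code>Volume</code> class, its methods, and the sample block contents are all hypothetical, and nothing here describes how any particular SAN product or PADB itself is implemented.</p>
<pre><code class="language-python">class Volume:
    """Toy block volume with copy-on-write clones (hypothetical names throughout)."""

    def __init__(self, base=None):
        self._base = base    # frozen ancestor shared with other volumes, or None
        self._delta = {}     # blocks written since this volume was created or cloned

    def read(self, block_id):
        # Prefer our own copy of a block; otherwise fall back to the shared base.
        if block_id in self._delta:
            return self._delta[block_id]
        if self._base is not None:
            return self._base.read(block_id)
        raise KeyError(block_id)

    def write(self, block_id, data):
        # Writes touch only this volume's private delta.
        self._delta[block_id] = data

    def clone(self):
        # Freeze the current state as a shared base. No block data is copied,
        # so the cost does not depend on how much data the volume holds.
        frozen = Volume(self._base)
        frozen._delta = self._delta
        self._base = frozen
        self._delta = {}
        return Volume(frozen)

    def private_blocks(self):
        return len(self._delta)


warehouse = Volume()
warehouse.write(0, "sales 2009")
warehouse.write(1, "sales 2010")

mart = warehouse.clone()             # the "data mart" shares every block at first
mart.write(1, "sales 2010, scrubbed")

print(warehouse.read(1))             # the original is unaffected by the mart's write
print(mart.read(1))
print(mart.private_blocks())         # only the changed block is stored separately
</code></pre>
<p>In this model the clone stores only the blocks it later changes, so creating it costs the same for a small database as for a multi-terabyte one. Pointing a second DBMS instance at the clone is a separate step that the sketch does not cover.</p>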
<p></mce:style></div>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/02/03/paraccel-padb-technical-notes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Choices in analytic computing system design</title>
		<link>http://www.dbms2.com/2011/01/24/analytic-computing-system/</link>
		<comments>http://www.dbms2.com/2011/01/24/analytic-computing-system/#comments</comments>
		<pubDate>Mon, 24 Jan 2011 05:28:08 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Predictive modeling and advanced analytics]]></category>
		<category><![CDATA[Workload management]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3641</guid>
		<description><![CDATA[When I posted a long list of architectural options for analytic DBMS, I left a couple of IOUs in for missing parts. One was in the area of what is sometimes called advanced-analytics functionality, which roughly speaking means aspects of analytic database management systems that are not directly related to conventional* SQL queries. *Main examples [...]]]></description>
			<content:encoded><![CDATA[<p>When I posted a long list of <a href="http://www.dbms2.com/2011/01/18/architectural-options-for-analytic-database-management-systems/">architectural options for analytic DBMS</a>, I left a couple of IOUs in for missing parts. One was in the area of what is sometimes called <strong>advanced-analytics functionality,</strong> which roughly speaking means aspects of analytic database management systems that are <strong>not directly related to conventional* SQL queries.</strong></p>
<p><em>*Main examples of &#8220;conventional&#8221; = filtering, simple aggregations.</em></p>
<p>The point of such functionality is generally twofold. First, it helps you execute analytic algorithms with <strong>high performance,</strong> due to reducing data movement and/or executing the analytics in parallel. Second, it helps you create and execute sophisticated analytic processes with <strong>(relatively) little effort.</strong></p>
<p>For now, I&#8217;m going to refer to an analytic RDBMS that has been extended by advanced-analytics functionality as an <strong>analytic computing system,</strong> rather than as some kind of &#8220;platform,&#8221; although I suspect the latter term is more likely to wind up winning. So far, there have been five major categories of subsystem or add-on module that contribute to making an analytic DBMS a more fully-fledged analytic computing system:</p>
<ul>
<li><strong>SQL extensions. </strong>Examples include SQL-2003 analytics (notably windowing), or vendor-specific temporal functionality.</li>
<li>A <strong>framework for UDFs</strong> (User-Defined Functions) to further extend SQL. At its core, a relational DBMS is a big SQL interpreter. SQL, while  powerful, only does a limited number of things. User-Defined Functions are new predicates in the SQL language that do additional things.</li>
<li>An <strong>execution engine for analytic processes</strong> that is less coupled to the SQL engine than a pure UDF framework might be. The two main approaches are MapReduce (e.g. <a href="http://www.dbms2.com/2009/10/30/aster-data-application-server-ncluster/">Aster Data</a>) and general C++ libraries (<a href="http://www.dbms2.com/2010/02/22/netezza-twinfin/">Netezza</a>, ParAccel).</li>
<li><strong>Libraries</strong> of pre-built analytic processes. Commonly included are statistics and other machine learning, general linear algebra, and Monte Carlo analysis. Some of these functions are fully parallelized (perhaps tens per vendor). Others just play nicely with the vendor&#8217;s execution framework, in that a separate copy can be run on each node (up to thousands per vendor, for those who bring in open source statistics libraries).</li>
<li><strong>Development tools</strong> such as integrated development environments (IDEs). Aster keeps trying to convince me that having built a nice Eclipse IDE is a major competitive differentiation.</li>
</ul>
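<p>As a small-scale illustration of the UDF idea in the list above, the following sketch registers a Python function with SQLite through the standard <code>sqlite3</code> module and then calls it from SQL. SQLite is only a convenient stand-in, not one of the analytic DBMS discussed here, and the function <code>email_domain</code>, the <code>signups</code> table, and its rows are made up for the example.</p>
<pre><code class="language-python">import sqlite3


def email_domain(address):
    """Return the lower-cased domain of an e-mail address, or None for NULL input."""
    if address is None:
        return None
    return address.rsplit("@", 1)[-1].lower()


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it by name with one argument.
conn.create_function("email_domain", 1, email_domain)

conn.execute("CREATE TABLE signups (email TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?)",
    [("ann@Example.com",), ("bob@example.com",), ("cy@other.org",)],
)

# The UDF is used like any built-in function, including inside GROUP BY.
rows = conn.execute(
    "SELECT email_domain(email) AS domain, COUNT(*) "
    "FROM signups GROUP BY domain ORDER BY domain"
).fetchall()
print(rows)
</code></pre>
<p>The SQL engine still does the scanning, grouping, and counting; the UDF only supplies the one step that SQL lacked. A parallel analytic DBMS would additionally have to run such a function on every node, which this single-process sketch does not attempt to show.</p>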
<p><span id="more-3641"></span>The most structural or architectural are the UDF framework and the non-UDF analytic execution engine.  But even those are in essence add-on modules, which means that pretty much any vendor can do any part of them if they invest enough resources in the effort. So I expect considerable convergence over time as the industry and market discover which capabilities are or aren&#8217;t particularly useful.</p>
<p>When I&#8217;m being told about an analytic DBMS that supposedly has evolved into an analytic computing system, some of my top-of-mind questions are:</p>
<ul>
<li><strong>How does the execution work?</strong> UDFs? MapReduce? Something else? What forms can the inputs and outputs of a UDF take? And by the way, what&#8217;s your complete list of <a href="http://www.dbms2.com/2010/10/10/partnering-with-cloudera/">MapReduce integration</a> possibilities?</li>
<li>What <strong>languages</strong> are currently supported? The obvious choices  are C++ (if that&#8217;s the style  of execution engine); <a href="http://www.dbms2.com/2010/05/15/further-clarifying-in-database-mpp-sas/">SAS</a>, R, or other statistical languages; and anything  that is commonly associated with MapReduce (Java, Python, et al.).</li>
<li><strong>In-process, out-of-process, or both?</strong> In-process runs faster; out-of-process is more stable, in that the advanced-analytics part can crash without bringing down the whole DBMS. Even if other languages are available out-of-process, C++ might be the only in-process choice. If you don&#8217;t have out-of-process execution, you may not be serious about offering really broad analytic capabilities.</li>
<li>Is there anything special about your<strong> library</strong> of pre-built, fully parallel processes? For example, I like Netezza&#8217;s broad approach to linear algebra, <a href="http://www.dbms2.com/2010/10/10/emc-greenplum-notes/">Greenplum&#8217;s</a> sparse vector manipulation, and a number of <a href="http://www.dbms2.com/2010/06/27/lots-of-aster-data-analytic-packages/">Aster Data&#8217;s</a> packages. <a href="http://paraccel.com/wp-content/uploads/2010/11/PA_FL_DS.pdf">ParAccel&#8217;s</a> list looks interesting too, although I haven&#8217;t grilled them about what is or isn&#8217;t fully parallel.</li>
<li>How does the associated <strong>memory management</strong> work? Can you create <strong>temporary data structures</strong> that survive longer than the process that spawned them? (Those can be useful for various kinds of lookup table.) Do they have to be tabular? (Graphs and other alternatives can be useful.) And by the way, do UDFs or other processes have enough RAM under their control to run efficiently?</li>
<li>How is <strong>workload management</strong> handled? Hopefully, everything that runs on the same cluster is handled by one integrated workload management system. Vendors for whom that isn&#8217;t true today should have a clear road map for getting there, because the alternative is something of a mess.</li>
</ul>
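<p>The stability half of the in-process versus out-of-process tradeoff can be sketched in a few lines of Python. This is a rough analogy rather than any vendor&#8217;s mechanism: the script itself stands in for the DBMS host, the child interpreter stands in for an out-of-process analytic routine, and the helper name <code>run_out_of_process</code> is invented for the example.</p>
<pre><code class="language-python">import subprocess
import sys


def run_out_of_process(snippet):
    """Run Python source in a child interpreter; return (exit_code, stdout)."""
    done = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return done.returncode, done.stdout.strip()


# A well-behaved analytic routine: the host reads its result from stdout.
ok = run_out_of_process("print(sum(range(10)))")

# A routine that dies abruptly (os._exit skips all cleanup, simulating a crash).
crashed = run_out_of_process("import os; os._exit(3)")

print(ok)
print("host survived; child exit code:", crashed[0])
</code></pre>
<p>The host keeps running after the child dies, which is the stability argument. The price is a process launch plus passing inputs and results across the process boundary, which is the speed argument for running in-process.</p>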
<p>Please note what I&#8217;m not including in this discussion &#8212; the integration of DBMS and fairly ordinary business intelligence. That may have virtues, for reasons of price or performance, and the virtues may grow as in-memory BI and/or data management capabilities evolve. But for the foreseeable future, BI/DBMS integration is a fairly separate matter from the integration of analytic DBMS with sophisticated <a href="http://www.dbms2.com/2011/03/03/investigative-analytics/">investigative analytics</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/01/24/analytic-computing-system/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Notes and links October 22, 2010</title>
		<link>http://www.dbms2.com/2010/10/22/notes-and-links-october-22-2010/</link>
		<comments>http://www.dbms2.com/2010/10/22/notes-and-links-october-22-2010/#comments</comments>
		<pubDate>Fri, 22 Oct 2010 06:47:05 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Liberty and privacy]]></category>
		<category><![CDATA[Market share and customer counts]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Petabyte-scale data management]]></category>
		<category><![CDATA[SAS Institute]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[VoltDB and H-Store]]></category>
		<category><![CDATA[eBay]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3346</guid>
		<description><![CDATA[A number of recent posts have had good comments. This time, I won&#8217;t call them out individually. Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on &#8220;sensor data&#8221; &#8230; &#8230; and, even better, went on to [...]]]></description>
			<content:encoded><![CDATA[<p>A number of recent posts have had good comments. This time, I won&#8217;t call them out individually.</p>
<p>Evidently <a href="http://www.cscyphers.com/blog/2010/10/12/hadoop-world-2010/">Mike Olson of Cloudera is still telling the machine-generated data story</a>, exactly as he should be. The <a href="http://informationarbitrage.com/post/1359525958/big-ideas-around-big-problems-in-big-data">Information Arbitrage/IA Ventures</a> folks said something similar, focusing specifically on &#8220;sensor data&#8221; &#8230;</p>
<p>&#8230; and, even better, went on to say:  <span id="more-3346"></span></p>
<blockquote><p><strong>Privacy is dead</strong>.<br />
What do we consider to be the  boundaries of privacy, especially with respect to items like medical  data? In a data privacy-free world, should we be regulating data usage  instead? How do we deal with asymmetric access to our personal data,  e.g., how is it that insurance companies claim the right to our personal  information?</p></blockquote>
<p>Obviously, <a href="http://www.dbms2.com/2010/04/04/privacy-liberty-continued/">my answer to the second question is Yes!!!!</a></p>
<p>Also from Hadoop World &#8212; Dave Menninger, now an analyst, reports on <a href="http://www.ventanaresearch.com/blog/commentblog.aspx?id=4003">some Hadoop metrics</a>:</p>
<blockquote><p>How big is &#8220;big data&#8221;? In his opening remarks, Mike shared some statistics from a survey of attendees. The average Hadoop cluster among respondents was 66 nodes and 114 terabytes of data. However there is quite a range. The largest in the survey responses was a cluster of 1,300 nodes and more than 2 petabytes of data. (Presenters from eBay blew this away, describing their production cluster of 8,500 nodes and 16 petabytes of storage.) Over 60 percent of respondents had 10 terabytes or less, and half were running 10 nodes or less.</p></blockquote>
<p><a href="http://www.dbms2.com/2010/10/06/ebay-followup-greenplum-out-teradata-10-petabytes-hadoop-has-some-value-and-more/">That eBay comment was particularly interesting</a>. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>A while back, Doug Henschen noted that Netezza flagship reference Catalina Marketing is now at <a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/07/big_data_the_ea.html#more">2.5 petabytes</a>. Most of that is in one 600 billion row table. Oddly, the article talks of the Netezza/SAS partnership accelerating model-building via in-database scoring (not modeling) technology. Doug also wrote of a lot of <a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/08/whats_at_stake.html#more">analytic DBMS replacements</a>, including:</p>
<ul>
<li>Microsoft by ParAccel</li>
<li>Oracle by Aster Data, IBM, Oracle Exadata, probably Netezza, and probably Hadoop</li>
<li>Netezza by Greenplum</li>
<li>IBM by Teradata</li>
</ul>
<p>Carl Olofson pointed out on Twitter that <a href="http://www.oracle.com/us/corporate/Acquisitions/datascaler/index.html">DataScaler was an in-memory database technology just bought by Oracle</a>. This inspired me to google on them, and I found a sparse <a href="http://www.svadventure.com/">DataScaler CEO blog</a>. I link it because of an amusing juxtaposition &#8212; the second-to-last post says, in effect, &#8220;We make appliances and we recommend all these awesome technology design partners who helped us design the hardware,&#8221; while the very last post says &#8220;Designing our own hardware was a mistake.&#8221; <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><a href="http://www.dbms2.com/2010/07/23/some-interesting-links/">Fred Holahan</a> is now VP of Marketing at <a href="http://www.dbms2.com/2010/05/25/voltdb-finally-launches/">VoltDB</a>, which is a lesson to me about giving free consulting &#8230; Anyhow, Fred tells me that VoltDB has about a dozen users on their way to production, some of whom are headed to being VoltDB paying customers, some of whom are not.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/10/22/notes-and-links-october-22-2010/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Where ParAccel is at</title>
		<link>http://www.dbms2.com/2010/10/17/paraccel/</link>
		<comments>http://www.dbms2.com/2010/10/17/paraccel/#comments</comments>
		<pubDate>Sun, 17 Oct 2010 08:21:04 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3296</guid>
		<description><![CDATA[Until recently, I was extremely critical of ParAccel&#8217;s marketing. But there was an almost-clean sweep of the relevant ParAccel executives, and the specific worst practices I was calling out have for the most part been eliminated. So I was open to talking and working with ParAccel again, and that&#8217;s now happening. On my recent California [...]]]></description>
			<content:encoded><![CDATA[<p>Until recently, <a href="http://www.dbms2.com/2010/01/15/there-sure-seem-to-be-a-lot-of-inaccuracies-on-paraccels-website/">I was extremely critical of ParAccel&#8217;s marketing</a>. But there was an almost-clean sweep of the relevant ParAccel executives, and the specific worst practices I was calling out have for the most part been eliminated. So I was open to talking and working with ParAccel again, and that&#8217;s now happening. On my recent California trip, I chatted with three ParAccel folks for a few hours. Based on that and other conversation, here&#8217;s the current ParAccel story as I understand it.<br />
<span id="more-3296"></span><br />
I&#8217;ve already noted that <a href="http://www.dbms2.com/2010/08/09/links-and-observations/">PADB 3.0 is coming soon</a> (ParAccel Analytic DataBase), but pending its arrival, ParAccel&#8217;s technical story is primarily about <strong>query performance.</strong> More specifically:</p>
<ul>
<li>ParAccel asserts that PADB is much faster than other analytic DBMS &#8212; even close competitors such as Vertica &#8212; on <strong>especially complex queries. </strong>&#8220;60-way joins&#8221; were mentioned. So was the flattening of correlated subqueries.</li>
<li>ParAccel also claims industry-leading performance on simpler queries, but not by the same (or perhaps even particularly large) margins.</li>
<li>Mercifully, ParAccel no longer <a href="http://www.dbms2.com/2009/07/08/progress-in-figuring-out-what-paraccel-is-doing/">claims to have never, ever lost on performance in a customer evaluation</a>. But it still says that is very close to being true.</li>
<li>Major reasons ParAccel gives for PADB&#8217;s high performance include:
<ul>
<li>Like Vertica, Sybase IQ, and others, PADB uses a <strong>columnar</strong> architecture.</li>
<li>ParAccel thinks PADB&#8217;s newest <strong>query optimizer</strong> &#8212; fondly named <a href="http://paraccel.com/technology/omne-optimizer/">Omne</a> &#8212; is outstanding.</li>
<li>ParAccel&#8217;s PADB <strong>compiles its queries.</strong></li>
<li>In general, ParAccel is just performance-obsessed.</li>
</ul>
</li>
<li>One could also mention:
<ul>
<li>ParAccel&#8217;s PADB runs smoothly in-memory, if that&#8217;s what you want.</li>
<li>ParAccel also offers a Flash option for PADB.</li>
<li>Like many other analytic DBMS vendors, ParAccel has created a custom networking protocol. (ParAccel has talked about that <a href="http://www.dbms2.com/2010/04/16/story-of-an-analytic-dbms-evaluation/">altogether too much</a> in the past.)</li>
<li>Like Vertica, ParAccel&#8217;s PADB generally decompresses data as late as the  particular compression scheme used allows. (Well, actually, that&#8217;s not  one ParAccel mentions unless asked.)</li>
<li>ParAccel has long encouraged one to put part of one&#8217;s database on direct-attached storage as a kind of persistent cache, plus all of it on a storage-area network, because PADB can optimize its scans to go against both physical stores.</li>
<li>ParAccel&#8217;s PADB does encryption a block at a time, rather than a row at a time, so there&#8217;s very little overhead to using the encryption feature.</li>
</ul>
</li>
<li>ParAccel says that PADB has no indexes, materialized views, etc., notwithstanding that <a href="http://www.dbms2.com/2008/02/18/paraccel-technical-overview/">I heard something different from Barry Zane a few years ago</a>. This is the basis for ParAccel&#8217;s claim that <strong>no tuning</strong> (or at least very little) is required, or indeed even possible &#8230;</li>
<li>&#8230; and similarly, it is the reason ParAccel encourages prospects to do ad-hoc queries in their POCs (Proofs Of Concept), at least when Vertica is the competitor.</li>
<li>However, ParAccel&#8217;s PADB has rather <strong>complex initial set-up.</strong> This has been the basis for widespread skepticism about ParAccel&#8217;s &#8220;no tuning&#8221; claim. ParAccel is working to automate that away, but admits to being only part-way through the process.</li>
<li>Highlights of ParAccel&#8217;s data writing strategy include:
<ul>
<li>PADB sends data transactionally to disk.</li>
<li>PADB usually sends data to disk a block at a time, because it is coming in fast enough for that to work out (either due to bulk load or streaming).</li>
<li>PADB is <strong>append-only</strong> &#8230;</li>
<li>&#8230; so PADB has a garbage-collection mechanism called Vacuum. Right now Vacuum has to be started manually, but doesn&#8217;t block reads and writes; full background garbage collection is of course a roadmap feature.</li>
<li>As is natural for append-only systems, ParAccel&#8217;s PADB has MVCC (MultiVersion Concurrency Control) and snapshot isolation.</li>
</ul>
</li>
<li>Name a <strong>compression</strong> method, and PADB probably has it &#8212; 13 in all by ParAccel&#8217;s count, including dictionary/token, run-length encoding, Delta, LZ, and so on.</li>
</ul>
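<p>The link between append-only writes, snapshot reads, and garbage collection in the data writing bullets above can be shown with a toy Python model. It is a sketch of the general pattern only, under simplifying assumptions (one writer, integer transaction ids, no deletes); the class <code>AppendOnlyTable</code> and its methods are hypothetical and are not a description of PADB&#8217;s internals.</p>
<pre><code class="language-python">class AppendOnlyTable:
    """Toy append-only store with snapshot reads (hypothetical, not PADB's design)."""

    def __init__(self):
        self._log = []       # every version ever written, as (txn, key, value)
        self._last_txn = 0

    def put(self, key, value):
        # An update never overwrites; it appends a newer version of the key.
        self._last_txn += 1
        self._log.append((self._last_txn, key, value))
        return self._last_txn

    def snapshot(self):
        # A snapshot is simply the id of the latest write it is allowed to see.
        return self._last_txn

    def read(self, key, as_of):
        visible = range(1, as_of + 1)
        # Newest first: return the latest version the snapshot can see.
        for txn, k, value in reversed(self._log):
            if k == key and txn in visible:
                return value
        return None

    def vacuum(self, oldest_snapshot):
        """Drop versions that no snapshot at or after oldest_snapshot can read."""
        old = range(1, oldest_snapshot + 1)
        newest_old = {}      # key: newest version visible to the oldest snapshot
        recent = []          # versions newer than the oldest snapshot: always kept
        for version in self._log:
            if version[0] in old:
                newest_old[version[1]] = version
            else:
                recent.append(version)
        kept = sorted(list(newest_old.values()) + recent, key=lambda v: v[0])
        removed = len(self._log) - len(kept)
        self._log = kept
        return removed


table = AppendOnlyTable()
table.put("balance", 100)
reader = table.snapshot()            # a long-running query starts here
table.put("balance", 120)
table.put("balance", 150)

print(table.read("balance", reader))             # the old query still sees its version
print(table.read("balance", table.snapshot()))   # a new query sees the latest
print(table.vacuum(oldest_snapshot=table.snapshot()))  # run once the old query is done
</code></pre>
<p>Because nothing is overwritten, readers never block writers, and old versions pile up until something reclaims them. In the sketch, as with the manually started Vacuum described above, that reclamation is an explicit call.</p>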
<p>Tracking ParAccel&#8217;s customer success has long been difficult. The <a href="http://www.dbms2.com/2010/02/10/gartner-magic-quadrant-data-warehouse-2009-2010/">2009 Gartner Magic Quadrant</a> claim of ~20 ParAccel customers seems odd to everybody, including ParAccel. ParAccel&#8217;s own reporting of customer wins around then was <a href="http://www.dbms2.com/2010/01/15/there-sure-seem-to-be-a-lot-of-inaccuracies-on-paraccels-website/">quite confusing</a>. And ParAccel&#8217;s customer count a year before that was <a href="http://www.dbms2.com/2009/01/03/paraccels-market-momentum/">extremely low</a>. But ParAccel&#8217;s Michael Weir just rounded up some figures for me, namely:</p>
<ul>
<li>ParAccel has 30+ revenue-recognized customers, not counting OEMs, OEMs&#8217; customers, or paid POCs.</li>
<li>2 ParAccel customers have &gt; 100 TB of user data.</li>
<li>7 ParAccel customers have &gt; 10 TB of user data.</li>
<li>The largest ParAccel cluster is 28 nodes and growing.</li>
</ul>
<p>Naturally, Michael went on to note that even relatively small databases can have high value.</p>
<p>One last note: ParAccel has approximately 78 employees.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/10/17/paraccel/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Links and observations</title>
		<link>http://www.dbms2.com/2010/08/09/links-and-observations/</link>
		<comments>http://www.dbms2.com/2010/08/09/links-and-observations/#comments</comments>
		<pubDate>Tue, 10 Aug 2010 02:37:51 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Calpont]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[HP and Neoview]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[XtremeData]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2743</guid>
		<description><![CDATA[I&#8217;m back from a trip to the SF Bay area, with a lot of writing ahead of me. I&#8217;ll dive in with some quick comments here, then write at greater length about some of these points when I can. From my trip:  Aster Data showed me a lot of customer names and deal sizes, across [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m back from a trip to the SF Bay area, with a lot of writing ahead of me. I&#8217;ll dive in with some quick comments here, then write at greater length about some of these points when I can. From my trip:  <span id="more-2743"></span></p>
<ul>
<li>Aster Data showed me a lot of customer names and deal sizes, across a bunch of industries (mainly enterprise rather than web). Yes, Aster&#8217;s market success is for real. (But almost all those details are NDA.)</li>
<li>Sybase&#8217;s product plans for IQ are pretty impressive. (But the most interesting parts are, you guessed it, NDA.)</li>
<li>I&#8217;ve kissed and made up* with ParAccel, now that they&#8217;ve replaced their CEO, replaced their marketing chief, and stopped the worst of the <a href="http://www.dbms2.com/2010/01/15/there-sure-seem-to-be-a-lot-of-inaccuracies-on-paraccels-website/">marketing</a> <a href="http://www.dbms2.com/2009/06/22/the-tpc-h-benchmark-is-a-blight-upon-the-industry/">nonsense</a> I used to complain about. ParAccel has some interesting plans for ParAccel 3.0 which are, naturally, NDA.</li>
<li>The PeopleSoft guys are doing it over again at Workday. Only this time, their platform isn&#8217;t a relational DBMS. Rather, it&#8217;s an in-memory, completely object-oriented data model, with disk used only on a &#8220;Just in case the power ever goes out&#8221; basis. (Thankfully, nothing at all about our conversation was NDA.)</li>
<li>I spent considerable time with my clients at both Greenplum and EMC (if we ignore the fact that the deal has closed and they&#8217;re now the same company). I also had more of a hardcore engineering discussion than I&#8217;ve had with Greenplum for quite a while (I should have been pushier about that earlier). Takeaways included:
<ul>
<li>This is starting off as a honeymoon deal. Everything Greenplum was planning to do is being continued. Additional resources are being poured into Greenplum to do more.</li>
<li>Some Greenplum execs seem to envision staying long term, some seem to envision moving on to their next startups. The ones who envision moving on are, however, going to work hard first to make the merger a success.</li>
<li>Greenplum has, for quite a while, had more of an advanced analytics/embedded predictive modeling story than I realized. Bad on them for not fleshing it out more in marketing and product packaging alike.</li>
<li>Greenplum both denies the concurrency problems I previously noted and also has a very credible story as to how it will eliminate them. :) Seriously, Greenplum tells of one customer that routinely runs 150 simultaneous queries &#8212; on what I think is not a terribly big system &#8212; and a number of POCs (Proofs of Concept) that simulated similar levels of concurrency.</li>
</ul>
</li>
<li>I&#8217;m finally feeling good about Northscale&#8217;s memcached-compatible persistent store Membase. The main reason is that they showed me a near-term path to interfaces that are richer than key-value. Also, Todd Hoff reassured me that even pure persistent memcached has a place.</li>
<li>Rumor says that even the one app for which Facebook was using Cassandra &#8212; in-box search &#8212; has been decommissioned. On the other hand, numerous other scale-out DBMS (SQL or otherwise) seem to have Facebook footholds. But details are &#8212; all together now! &#8212; NDA.</li>
</ul>
<p><em>*If you know ParAccel&#8217;s new marketing exec Michael Weir, you  surely guessed I mean that only in a figurative sense.</em></p>
<p>From elsewhere:</p>
<ul>
<li>Daniel Abadi offered <a href="http://dbmsmusings.blogspot.com/2010/08/thoughts-on-kickfires-apparent-demise.html">his analysis</a> of <a href="http://www.dbms2.com/2010/07/27/kickfire-unlikely-to-survive/">Kickfire&#8217;s demise</a>. In general I agree, but Daniel neglected to mention one hugely important factor &#8212; the chicken-egg negative effect of Kickfire&#8217;s lack of market or marketing traction. Customers were extremely reluctant to buy from Kickfire because they perceived, correctly, that Kickfire&#8217;s survivability was far from assured.</li>
<li>While the <a href="http://infinidb.org/community/forums/11-general-infinidb/1000-strange-issue-with-drop-table">InfiniDB forums</a> suggest that there are at least a couple of production users of Calpont&#8217;s free InfiniDB, Calpont seemingly has a long way to go to be even as successful as Kickfire. But Calpont does have a bit of money to spend on lead generation; maybe some day they&#8217;ll even have actual customers.</li>
<li>In a response to a question I messaged over, <a href="http://www.dbms2.com/2010/03/18/xtremedata-update/">XtremeData</a> tells me they have actual customers now. Press releases to follow.</li>
<li>The <a href="http://news.cnet.com/8301-31021_3-20013111-260.html?part=rss&amp;subj=news&amp;tag=2547-1_3-0-20">admiration for the job Mark Hurd did at HP</a> is in my opinion overstated. Sure, the financial/operational management appeared to work, but HP did little on Hurd&#8217;s watch to strengthen its reputation or customers&#8217; loyalty. In particular:
<ul>
<li>HP&#8217;s analytics efforts have accomplished little.</li>
<li>HP&#8217;s data warehouse appliance efforts have failed pathetically.</li>
<li>From what I hear, HP&#8217;s execution in its Exadata partnership was not good.</li>
<li>HP&#8217;s server business in general is distinguished mainly by HP being a big company.</li>
<li>HP&#8217;s EDS acquisition has been rocky, not that EDS was sailing so smoothly on its own beforehand.</li>
<li>HP&#8217;s success in PCs amounts to &#8220;arguably, HP sucks a little less than the other guys&#8221;.</li>
<li>HP&#8217;s elite reputation is long gone (admittedly, for the most part that predates Hurd).</li>
</ul>
</li>
<li><a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/08/software_innova.html">Doug Henschen</a> evidently favors really strong intellectual property protection for software, even forbidding plug-compatible reverse engineering. I agree with Doug up to the point that <a href="http://www.monashreport.com/2010/07/19/my-view-of-intellectual-property/">it should be forbidden to copy proprietary software</a>, but I don&#8217;t see why he (or a court) would view plug-compatible reverse engineering as copying.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/09/links-and-observations/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

