<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS2 -- DataBase Management System Services &#187; Data warehouse appliances</title>
	<atom:link href="http://www.dbms2.com/category/database-management-system/data-warehouse-appliances/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 18 Mar 2010 05:19:19 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>XtremeData update</title>
		<link>http://www.dbms2.com/2010/03/18/xtremedata-update/</link>
		<comments>http://www.dbms2.com/2010/03/18/xtremedata-update/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 05:17:23 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[XtremeData]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1722</guid>
		<description><![CDATA[I talked with Geno Valente of XtremeData tonight. Highlights included:

XtremeData still hasn&#8217;t sold any 	dbX stuff (they&#8217;ve had a side business in generic 	FPGA-based boards paying the bills for years). Well, there may 	have been some paid POCs (proofs of concept) or something, but real 	sales haven&#8217;t come through yet.
XtremeData does have three 	prospects who [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked with Geno Valente of XtremeData tonight. Highlights included:</p>
<ul>
<li>XtremeData still hasn&#8217;t sold any 	dbX stuff (they&#8217;ve had a side business in <a href="../2009/06/29/xtreme-data-readies-a-different-kind-of-fpga-based-data-warehouse-appliance/">generic 	FPGA-based boards</a> paying the bills for years). Well, there may 	have been some paid POCs (proofs of concept) or something, but real 	sales haven&#8217;t come through yet.</li>
<li>XtremeData does have three 	prospects who have said “Yes”, and expects one order to come 	through this month.</li>
<li>XtremeData continues to believe it 	shines when:
<ul>
<li>Data models are complex</li>
<li>In particular, there are complex 	joins</li>
<li>In particular, two large tables 	have to be joined with each other, under circumstances where no 	product can avoid doing vast data redistribution</li>
</ul>
</li>
<li>XtremeData insists that none of the nice things Bill Inmon has said about it, including in webinars, were for pay or other similar business compensation. <a href="http://www.monashreport.com/2006/02/13/everybody-gets-paid-or-would-like-to/" onclick="javascript:pageTracker._trackPageview('/www.monashreport.com');">That&#8217;s quite unusual</a>.</li>
<li>XtremeData is coming out with a 	new product, codenamed the Personal Data Warehouse (PDW), which:
<ul>
<li>Is ready to go into beta test</li>
<li>Should be launched in a month and 	a half or so</li>
<li>Will have a different name when it 	is launched</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;">Naming aside,<span id="more-1722"></span></p>
<ul>
<li>The XtremeData PDW consists of 	XtremeData software running on a <a href="http://cray.com/Products/CX/Systems.aspx" onclick="javascript:pageTracker._trackPageview('/cray.com');">Cray 	CX1 box</a>.</li>
<li>Thus, the XtremeData PDW will plug 	into a 20 amp wall power socket. It consumes 1600 watts.</li>
<li>The XtremeData PDW also inherits 	the Cray CX1&#8217;s noise cancellation feature.</li>
<li>Bottom line on the form factor: 	<strong>The XtremeData PDW is meant to be stuck in the corner of a 	business analyst&#8217;s office, not a computer room.</strong></li>
<li>The XtremeData PDW will have 16 1 	TB disks (going up in size later), for 5 TB of uncompressed user 	data.</li>
<li>Pricing isn&#8217;t finalized for the 	XtremeData PDW, but it will be around XtremeData&#8217;s usual figure &#8212; 	$20K/TB of uncompressed user data.</li>
<li>XtremeData hasn&#8217;t “released” 	compression yet, but it&#8217;s “ready to go.”</li>
<li>The XtremeData PDW will not 	include FPGAs, <a href="../2009/07/27/xtremedata-announces-its-dbx-data-warehouse-appliance/">unlike 	other XtremeData dbX appliances</a>. It will just run the XtremeData 	dbX software on 8 Nehalem chips.</li>
<li>XtremeData calls this a “3-node” 	machine. I didn&#8217;t bother asking why it wasn&#8217;t 4-node. (Perhaps 	there&#8217;s a head node of some kind that properly isn&#8217;t counted.)</li>
</ul>
<p style="margin-bottom: 0in;">Some comparative notes:</p>
<ul>
<li>A <strong><a href="http://www.netezza.com/documents/skimmer_ds.pdf" onclick="javascript:pageTracker._trackPageview('/www.netezza.com');">Netezza 	Skimmer</a> has similar size and price</strong> to the XtremeData PDW, seems to draw less 	power, has less uncompressed user data capacity (but already has 	compression), is also in essence a three-node system (I think), and 	of course has a lot of software connectivity. If XtremeData can 	match Netezza&#8217;s compression, the XtremeData PDW will have a 2X or so 	price/TB advantage over Netezza Skimmer – but Netezza&#8217;s 	compression is of course a moving target. I don&#8217;t know how happy Skimmer is outside a computer room.</li>
<li><a href="http://www.kickfire.com/Products/Data-sheet" onclick="javascript:pageTracker._trackPageview('/www.kickfire.com');">Kickfire</a> manages similar amounts of data on a smaller box (5 rack units vs. 7), drawing less power (600 watts vs. 1600), also with a lot of BI and ETL tool connectivity.</li>
</ul>
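<p>As a rough check on the &#8220;2X or so&#8221; figure, here is a minimal sketch of the arithmetic. It assumes the XtremeData figures quoted above ($20K/TB, 5 TB uncompressed) and the Netezza Skimmer figures from the separate Skimmer post in this feed ($125K, roughly 2.8 TB before compression). The function and variable names are illustrative only, and real pricing may differ.</p>

```python
def price_per_tb(price_usd, user_tb):
    """Dollars per terabyte of uncompressed user data."""
    return price_usd / user_tb

# Figures quoted in the posts; treat them as approximate.
pdw_per_tb = 20_000.0                        # XtremeData's usual price point
pdw_price = pdw_per_tb * 5                   # 5 TB of uncompressed user data
skimmer_per_tb = price_per_tb(125_000, 2.8)  # Skimmer, before compression

# This ratio only matters if both products compress equally well.
advantage = skimmer_per_tb / pdw_per_tb

print(f"PDW ~${pdw_price:,.0f}; Skimmer ${skimmer_per_tb:,.0f}/TB; ratio {advantage:.1f}x")
```

<p>Under those assumptions the ratio comes out a little above 2, which matches the estimate in the text. It holds only if the two compression ratios end up equal.</p>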
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/03/18/xtremedata-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TwinFin(i) – Netezza&#8217;s version of a parallel analytic platform</title>
		<link>http://www.dbms2.com/2010/02/22/netezza-twinfin/</link>
		<comments>http://www.dbms2.com/2010/02/22/netezza-twinfin/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 08:21:13 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[SAS Institute]]></category>
		<category><![CDATA[Teradata]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1613</guid>
		<description><![CDATA[Much like Aster Data did in Aster 4.0 and now Aster 4.5, Netezza is announcing a general parallel big data analytic platform strategy. It is called Netezza TwinFin(i), it is a chargeable option for the Netezza TwinFin appliance, and many announced details are on the vague side, with Netezza promising more clarity at or before [...]]]></description>
			<content:encoded><![CDATA[<p>Much like Aster Data did in <a href="http://www.dbms2.com/2009/10/30/aster-data-application-server-ncluster/" >Aster 4.0</a> and now <a href="http://www.dbms2.com/2010/02/22/aster-data-ncluster-4-5/" >Aster 4.5</a>, Netezza is announcing a general parallel big data analytic platform strategy. It is called Netezza TwinFin(i), it is a chargeable option for the <a href="http://www.dbms2.com/2009/07/30/netezza-new-product-family/" >Netezza TwinFin</a> appliance, and many announced details are on the vague side, with Netezza promising more clarity at or before its Enzee Universe conference in June. At a high level, the Aster and Netezza approaches compare/contrast as follows:<span id="more-1613"></span></p>
<ul>
<li>Netezza&#8217;s software runs on well-designed proprietary hardware. Aster runs on hardware that&#8217;s more off-the-shelf.</li>
<li>Aster was first to ship, and will also be first to ship an IDE (Integrated Development Environment).</li>
<li>MapReduce is central to Aster&#8217;s approach. Netezza TwinFin(i) supports MapReduce too, specifically a Hadoop implementation, but I don&#8217;t get the sense that everything Netezza does is built on MapReduce underpinnings.</li>
<li>Both Aster and Netezza try to provide rich functionality for creating in-memory data structures parallel analytic programs can use. Both seem to let you escape from the pure relational-table paradigm more easily than, say, Teradata&#8217;s new persistent memory capabilities do.</li>
<li>Aster and Netezza have made different choices about what kinds of prebuilt analytic packages to offer. Netezza could actually leapfrog Aster in this regard, but let&#8217;s see where each vendor is by, say, mid-year. If you care about the details of built-in analytic functions, you really should consider executing non-disclosure agreements with both those companies.</li>
<li>Both Aster and Netezza stress that you can run analytic functions out-of-process, greatly reducing the chance that they crash the database. Netezza, and I&#8217;m pretty sure Aster as well, retains the option of running in-process, which provides maximum performance. (In Netezza&#8217;s case C++ is the only in-process language supported, and I think Aster has a similar limitation.)</li>
<li>Like Aster, Netezza is integrating SQL queries and other analytic processing under the same workload management rubric.</li>
<li>Much like Aster, Netezza is tap-dancing by implying much richer forthcoming SAS support than anything currently announced. (The crunch-per-paragraph ratio in either vendor&#8217;s SAS-related press releases to date is distressingly low.)</li>
</ul>
<p>More specifically, here are some highlights of what I know, am guessing, and/or am allowed to say about Netezza TwinFin(i) at this time.</p>
<ul>
<li>The foundation for the analytic add-ons in Netezza TwinFin(i) is some sort of low-level “analytic executables.” Not understanding exactly what these are is my biggest area of confusion in the whole TwinFin(i) stack. Are they all C++, with everything translated into same? Is there Java all the way down as an alternative? (E.g., Hadoop is written in Java.) Anyhow, whatever it is, it&#8217;s surely a big improvement on <a href="../../../../../2007/09/27/the-netezza-developer-network/">Netezza&#8217;s prior Verilog-based generation of analytic extensibility technology</a>.</li>
<li>The announced list of languages supported in Netezza TwinFin(i) is Java, Python, Fortran, R, and C/C++. More are coming.</li>
<li>Netezza has named a lot of analytic functions it is adding, and is hinting at more to come. It has named <a href="http://cran.r-project.org/" onclick="javascript:pageTracker._trackPageview('/cran.r-project.org');">CRAN/R</a> and GNU libraries, saying those have 1900 or more functions each. Netezza has also built its own linear algebra library for TwinFin(i), called nzMatrix. And as previously noted, TwinFin(i) also boasts a Hadoop implementation.</li>
<li>I haven&#8217;t heard about much in the way of TwinFin(i)-specific IDE support.</li>
<li>I don&#8217;t really have details as to what kinds of in-memory data structures Netezza TwinFin(i) does or doesn&#8217;t support.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/02/22/netezza-twinfin/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Comments on the Gartner 2009/2010 Data Warehouse Database Management System Magic Quadrant</title>
		<link>http://www.dbms2.com/2010/02/10/gartner-magic-quadrant-data-warehouse-2009-2010/</link>
		<comments>http://www.dbms2.com/2010/02/10/gartner-magic-quadrant-data-warehouse-2009-2010/#comments</comments>
		<pubDate>Wed, 10 Feb 2010 23:28:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[HP and Neoview]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Infobright]]></category>
		<category><![CDATA[Ingres]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[illuminate Solutions]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1553</guid>
		<description><![CDATA[At intervals of a little over a year, Gartner Group publishes a Data Warehouse Database Management System Magic Quadrant. Gartner&#8217;s 2009 data warehouse DBMS Magic Quadrant &#8212; actually, January 2010 &#8212; is now out.* For many reasons, including those I noted in my comments on Gartner&#8217;s 2008 Data Warehouse DBMS Magic Quadrant, the Gartner quadrant pictures [...]]]></description>
			<content:encoded><![CDATA[<p>At intervals of a little over a year, Gartner Group publishes a Data Warehouse Database Management System Magic Quadrant. <a href="http://www.gartner.com/technology/media-products/reprints/greenplum/173535.html" onclick="javascript:pageTracker._trackPageview('/www.gartner.com');">Gartner&#8217;s 2009 data warehouse DBMS Magic Quadrant</a> &#8212; actually, January 2010 &#8212; is now out.* For many reasons, including those I noted in <a href="http://www.dbms2.com/2009/01/12/gartners-2008-data-warehouse-database-management-system-magic-quadrant-is-out/" >my comments on Gartner&#8217;s 2008 Data Warehouse DBMS Magic Quadrant</a>, the Gartner quadrant pictures are a bad use of good research. Rather than rehash that this year, I&#8217;ll merely call out some points in the surrounding commentary that I find interesting or just plain strange.<span id="more-1553"></span></p>
<p><em>*Links to Gartner Magic Quadrants commonly break, but that one worked at the time of this posting.</em></p>
<ul>
<li>Gartner thinks that data warehouse appliances are on the rise, due to their simplicity.</li>
<li>Gartner correctly says that <a href="http://www.softwarememories.com/2008/09/15/database-machines/" onclick="javascript:pageTracker._trackPageview('/www.softwarememories.com');">Teradata has been a data warehouse appliance vendor from the get-go</a>.</li>
<li>Gartner characterizes IBM as being an appliance vendor as well.</li>
<li>Gartner suggests that HP is having trouble living up to its technical promises for Neoview.</li>
<li>Gartner further suggests &#8212; no surprise here &#8212; that HP Neoview has had very few new customers past its initial wave.</li>
<li>Gartner notes IBM&#8217;s difficulties in selling data warehouse installations of DB2, despite what on paper is great-sounding technology.</li>
<li>Gartner says &#8212; also no surprise &#8212; that illuminate &#8220;has seen little success in North America since opening its first office in the U.S. over two years ago.&#8221;</li>
<li>Ingres has evidently gotten a few BI-centric &#8220;appliance&#8221; deals, e.g. with Jaspersoft. But basically Ingres isn&#8217;t doing well in data warehousing.</li>
<li>Gartner does say Ingres has &#8220;the strongest open-source DBMS offering for data warehousing.&#8221; Being very literal about &#8220;open source,&#8221; that&#8217;s a defensible claim &#8212; but it&#8217;s pretty irrelevant in a world where <a href="http://www.dbms2.com/2009/10/19/greenplum-free-single-node-edition/" >Greenplum Single-Node Edition</a> can be had for free. It also waves away all the data mart use cases in which Infobright Community Edition shines.</li>
<li>Gartner says that Netezza is working out as a &#8220;complex workload&#8221; enterprise data warehouse provider, according to reference checks, in addition to its established success in data mart scenarios.</li>
<li>Gartner says Oracle&#8217;s offering has finally become &#8220;accepted&#8221; in the market for databases &gt;50 TB. I guess I can live with that fairly weak claim, but <a href="http://www.dbms2.com/2009/09/19/oracle-database-siz/" >I wouldn&#8217;t go much further than that</a>.</li>
<li>Gartner asserts that, unlike software-only Oracle, Oracle Exadata isn&#8217;t significantly harder to administer than &#8220;other mixed OLTP/OLAP DBMS vendors,&#8221; because Exadata is fast enough you don&#8217;t need to jump through all those hoops any more to get tolerable performance. The money quote is &#8220;one reference reported reducing the number of indexes by a factor of 100 to fewer than five.&#8221; Note, however, that Gartner does not seem to assert that Exadata&#8217;s ease of use rivals that of the newer analytic DBMS specialists.</li>
<li>Gartner confirms <a href="http://www.dbms2.com/2009/02/01/oracle-says-they-do-onsite-exadata-pocs-after-all/" >Oracle&#8217;s reluctance to do onsite Exadata POCs</a>, but says it is not absolute. This is roughly compatible with what I&#8217;m hearing elsewhere, and indeed with Oracle&#8217;s own claims to be ramping up availability of Exadata POC hardware.</li>
<li>Gartner&#8217;s criteria for inclusion include at least 10 different organizations having a product &#8220;in production.&#8221; Thus, the big surprise was ParAccel being included. The money quote there is &#8220;With approximately 20 customers in the pharmaceutical, retail, financial and media/advertising analytics sectors, ParAccel has a good reference base.&#8221; That assessment is difficult to reconcile with other information, but I&#8217;ve been told Gartner is sticking to its guns. That assessment would be even harder to believe if those 20 references were all alleged to be true production customers.</li>
<li>Gartner notes that you basically can&#8217;t run a 1 TB+ MySQL data warehouse without sharding. (Of course, Infobright has an alternative, and up to a small number of terabytes so does Kickfire.)</li>
<li>Gartner reports that at least some customers are pleased with Sybase IQ&#8217;s mixed workload/enterprise data warehouse capabilities.</li>
<li>Gartner correctly notes that <a href="http://www.dbms2.com/2009/10/05/oracle-exadata-2-capacity-pricing/" >Oracle Exadata is a price-competition challenge for Teradata</a>.</li>
<li>Gartner notes that 20% of Vertica&#8217;s customers are outside the US. While not shocking, that&#8217;s more than I realized.</li>
<li>Gartner notes something I don&#8217;t think I&#8217;ve posted yet, which is that Vertica has a customer with 300 TB of data. (The identity is a deep dark secret, but if I told you, you probably wouldn&#8217;t recognize the name anyway.)</li>
</ul>
<p>As does any such piece, the Gartner Data Warehouse DBMS Magic Quadrant also has outright errors.  For example:</p>
<ul>
<li>Aster Data isn&#8217;t really &#8220;the newest entrant to the DBMS data warehouse world.&#8221;</li>
<li>Aster&#8217;s SQL/MapReduce was not new in Release 4.0.</li>
<li>Greenplum isn&#8217;t yet pushing down code to the storage tier.</li>
<li>I&#8217;m not sure what kind of database-tier parallelism Gartner is claiming is new in Oracle in 11g Release 2 &#8212; but I doubt it&#8217;s really new. Rather, what Oracle has done recently is <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >make parallelism less administratively cumbersome</a>.</li>
<li>Vertica wasn&#8217;t really the first DBMS in the cloud. At most it was the first pure-play analytic DBMS to get there.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/02/10/gartner-magic-quadrant-data-warehouse-2009-2010/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Netezza Skimmer</title>
		<link>http://www.dbms2.com/2010/01/25/netezza-skimmer/</link>
		<comments>http://www.dbms2.com/2010/01/25/netezza-skimmer/#comments</comments>
		<pubDate>Mon, 25 Jan 2010 14:39:00 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data mart outsourcing]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Pricing]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1441</guid>
		<description><![CDATA[As I previously complained, last week wasn&#8217;t a very convenient time for me to have briefings. So when Netezza emailed to say it would release its new entry-level Skimmer appliance this morning, while I asked for and got a Friday afternoon briefing, I kept it quick and basic.
That said, highlights of my Netezza Skimmer briefing [...]]]></description>
			<content:encoded><![CDATA[<p>As I previously <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >complained</a>, last week wasn&#8217;t a very convenient time for me to have briefings. So when Netezza emailed to say it would release its new entry-level Skimmer appliance this morning, while I asked for and got a Friday afternoon briefing, I kept it quick and basic.</p>
<p>That said, highlights of my Netezza Skimmer briefing included:</p>
<ul>
<li>In essence, Netezza Skimmer is 1/3 of Netezza&#8217;s previously smallest appliance, for 1/3 the price.</li>
<li>I.e., Netezza Skimmer has 1 S-blade and 9 disks, vs. 3 S-blades and 24 disks on the Netezza TwinFin 3.</li>
<li>With 1 disk reserved as a hot spare, that boils down to a 1:1:1 ratio among CPU cores, FPGA cores, and 1-terabyte disks on Netezza Skimmer. The same could pretty much be said of Netezza TwinFin, the occasional hot-spare disk notwithstanding.</li>
<li>Netezza Skimmer costs $125K.</li>
<li>With 2.8 or so TB of space for user data before compression, that&#8217;s right in line with the <a href="http://www.dbms2.com/2009/07/30/the-netezza-price-point/" >Netezza price point</a> of slightly &lt;$20K/terabyte of user data.</li>
<li>That assumes Netezza&#8217;s usual 2.25X compression. I forgot to ask when 4X compression was actually being shipped.</li>
<li>I forgot to ask, but it seems obvious that Netezza Skimmer uses identical or substantially similar components to Netezza TwinFin&#8217;s.</li>
<li>Netezza Skimmer is 7 rack units high.</li>
<li>In place of the SMP hosts on TwinFin Systems, Netezza Skimmer has a host blade.</li>
<li>Netezza (specifically Phil Francisco) mentioned that when Kalido uses Netezza Skimmer for its appliance, there will be an additional host computer, but when it uses TwinFin for the same software, the built-in host will suffice. (Even so, I suspect it might be too strong to say that Skimmer&#8217;s built-in host computer is underpowered.)</li>
<li>Netezza also suggested that more appliance OEMs are coming down the pike specifically focused on the affordable Skimmer.</li>
</ul>
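<p>The price-point arithmetic in the list above can be sketched as follows. This is only an illustration using the figures quoted there ($125K, roughly 2.8 TB before compression, 2.25X compression). The function name is hypothetical, and &#8220;2.8 or so&#8221; is approximate, so the result is approximate too.</p>

```python
def price_per_effective_tb(price_usd, raw_user_tb, compression_ratio):
    """Price per terabyte of user data after applying a compression ratio."""
    effective_tb = raw_user_tb * compression_ratio
    return price_usd / effective_tb, effective_tb

# Figures quoted in the post; the 2.8 TB value is an approximation.
per_tb, effective_tb = price_per_effective_tb(125_000, 2.8, 2.25)

print(f"{effective_tb:.1f} TB effective, ${per_tb:,.0f}/TB")
```

<p>With those inputs the result lands just under $20K per terabyte, consistent with the price point cited above. Without the compression assumption the same box works out to well over twice that per terabyte.</p>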
<p><span id="more-1441"></span>Obviously, Netezza Skimmer isn&#8217;t breaking any new technical ground. If Netezza had just called Skimmer &#8220;TwinFin 1,&#8221; nobody should have objected. So the main news here is that you can buy a Netezza box for $125K, plug it in, load a few terabytes of data, and be good to go with a pretty solid data warehouse.  For enterprises and data mart outsourcers with databases of the appropriate size, that could be a pretty attractive deal.</p>
<p>Is Netezza Skimmer as cheap as buying your own hardware and putting (free) <a href="http://www.dbms2.com/2009/10/19/greenplum-free-single-node-edition/" >Greenplum Single-Node Edition</a> software on it? Not even close, especially since Greenplum&#8217;s free option limits you to lower overall compute power. Does Netezza Skimmer have as high availability as more expensive alternatives? In some cases, surely not. Skimmer is neither the cheapest thing around nor an utterly high-end product.</p>
<p>But Netezza Skimmer belongs on a lot of short lists even so.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/01/25/netezza-skimmer/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Two cornerstones of Oracle’s database hardware strategy</title>
		<link>http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/</link>
		<comments>http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 08:59:23 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Cache]]></category>
		<category><![CDATA[DBMS product categories]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Theory and architecture]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1429</guid>
		<description><![CDATA[After several months of careful optimization, Oracle managed to pick the most inconvenient* day possible for me to get an Exadata update from Juan Loaiza. But the call itself was long and fascinating, with the two main takeaways being:

Oracle      thinks flash memory is the most important hardware technology of the [...]]]></description>
			<content:encoded><![CDATA[<p>After several months of careful optimization, Oracle managed to pick the most inconvenient* day possible for me to get an Exadata update from Juan Loaiza. But the call itself was long and fascinating, with the two main takeaways being:</p>
<ul>
<li>Oracle      thinks <strong>flash memory is the most important hardware technology of the      decade,</strong> one that could lead to Oracle being “bumped off” if they don’t      get it right.</li>
<li>Juan      believes <strong>the “bulk” of Oracle’s business will move over to Exadata-like      technology over the next 5-10 years. </strong>Numbers-wise, this seems to be based more      on Exadata being a platform for consolidating an enterprise’s many Oracle databases than it is on Exadata running a few Especially Big Honking Database      management tasks.</li>
</ul>
<p>And by the way, Oracle doesn’t make its storage-tier software available to run on anything other than Oracle-designed boxes.  At the moment, that means Exadata Versions 1 and 2. Since Exadata is by far Oracle’s best DBMS offering (at least in theory), that means <strong>Oracle’s best database offering only runs on specific Oracle-sold hardware platforms.<span id="more-1429"></span></strong> <em></em></p>
<p><em>*E.g., I was sitting upstairs in my parents’ apartment in </em><em>Columbus</em><em>, </em><em>OH</em><em> having the call while their doctor, who I’ve never met, was visiting downstairs. He offered to make a special trip back Saturday afternoon because he missed me Wednesday, but he’s notorious for not coming when he says he will.</em> <em>Update: He didn&#8217;t come Saturday. On Saturday he said he&#8217;d come Sunday. He didn&#8217;t do that either. </em></p>
<p>Other high- and lowlights of our conversation included:</p>
<ul>
<li>Flash      is the main new hardware element in Exadata Version 2. Otherwise, Exadata      2 is just an annual refresh of Exadata Version 1 to include updated      components (Nehalem chips, bigger disk drives, etc.)</li>
<li>Juan      thinks it’s suboptimal to use flash memory through the bottleneck of disk      controllers, favoring PCIe cards instead. (I emphatically agree.)</li>
<li>Juan      resolutely ducked questions about <a href="../../../../../2009/09/25/the-hunt-for-oracle-exadata-production-references/">actual      Exadata production deployment</a>. Literally the only fact he shared in      that regard is that there are at least 2 Exadata production systems      running that each have 2 or more racks cabled together.</li>
<li>Juan      stressed that Exadata runs apps written over Oracle DBMS unchanged.</li>
<li>When      making mixed-workload claims for Exadata 2, Juan stressed consolidation of      multiple databases, some OLTP and some analytic. He didn’t really argue      with my skepticism about <a href="../../../../../2009/09/29/integration-oltp-data-warehousing-exadata-2/">integrating      OLTP and analytics in the same database</a>, with one exception:</li>
<li>Juan      pointed out that in major OLTP apps such as ERP systems, there often is      actually more processing going on in reporting and other batch stuff than      there is in true OLTP.</li>
<li>Exadata      2’s flash memory is designed as a disk cache, smarter than LRU (Least      Recently Used). The two examples Juan gave of “smarter than LRU” are that      backups and table scans don’t flush the cache.</li>
<li>I      forget whether this is new in Exadata 2 (I think it is), but anyhow –      Exadata has a “Storage Index” that’s a lot like a <a href="../../../../../2006/09/20/netezza-vs-conventional-data-warehousing-rdbms/">Netezza      zone map</a>. I.e., for each megabyte or so of data it stores the min and      max value of every column; if a query predicate rules out those ranges,      that megabyte is never retrieved.</li>
<li>Oracle      has long offered what sounds like flexible workload management capability,      and this has now been extended to specifically include I/O resources on      the storage tier.</li>
<li>This isn’t Exadata-specific, but Oracle has built a file system on top of its DBMS, optimized for speed, which helps with, e.g., ELT (Extract/Load/Transform). Evidently, it’s not at all the same thing as Marc Benioff’s 1990s Microsoft-annoying IFS (Internet File System) project, which seems to have morphed into a content management SDK.</li>
</ul>
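<p>To make the Storage Index / zone map idea from the list above concrete, here is a minimal sketch in Python. It is not Oracle&#8217;s or Netezza&#8217;s implementation. The names (<code>Region</code>, <code>build_regions</code>, <code>scan_range</code>) are hypothetical, and a fixed number of rows per region stands in for &#8220;each megabyte or so of data.&#8221; Each region keeps the min and max of every column, and a range predicate skips any region whose min/max rules it out.</p>

```python
from dataclasses import dataclass

@dataclass
class Region:
    """One storage region plus a min/max summary for each column."""
    rows: list
    mins: dict
    maxs: dict

def build_regions(rows, rows_per_region):
    """Split rows into fixed-size regions and record per-column min/max."""
    regions = []
    for start in range(0, len(rows), rows_per_region):
        chunk = rows[start:start + rows_per_region]
        cols = list(chunk[0].keys())
        mins = {c: min(r[c] for r in chunk) for c in cols}
        maxs = {c: max(r[c] for r in chunk) for c in cols}
        regions.append(Region(chunk, mins, maxs))
    return regions

def scan_range(regions, column, low, high):
    """Return rows with low <= row[column] <= high and the number of regions read."""
    hits, regions_read = [], 0
    for region in regions:
        # If the region's value range cannot overlap the predicate, never read it.
        if region.maxs[column] < low or region.mins[column] > high:
            continue
        regions_read += 1
        hits.extend(r for r in region.rows if low <= r[column] <= high)
    return hits, regions_read

# Toy table: 1,000 rows stored in order_id order, 100 rows per region.
rows = [{"order_id": i, "amount": i * 10} for i in range(1000)]
regions = build_regions(rows, rows_per_region=100)

hits, regions_read = scan_range(regions, "order_id", 250, 260)
print(len(hits), regions_read, len(regions))
```

<p>In this toy case only one of the ten regions is read. The benefit depends on data being at least roughly ordered on the filtered column. If values were scattered evenly across regions, every region&#8217;s min/max range would overlap the predicate and nothing would be skipped.</p>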
<p>Highlights specifically in the area of parallelization included:</p>
<ul>
<li>Juan      stressed that all databases consolidated onto an Exadata machine      are/should be striped across all storage units.</li>
<li>On the      other hand, Juan said that different databases should be confined to      specific cores or CPUs on the database tier.</li>
<li>But on      the third hand, Juan also stressed – in what could be called a “private      cloud” pitch – that there’s great elasticity as to which databases are      matched to which server CPUs.</li>
<li>Contrary      to what <a href="../../../../../2008/09/28/exadata-oracle-database-machine-parallelization/">I      thought he and/or his colleagues told me a year ago</a>, Juan said RAC      (Real Application Clusters) is a big part of Oracle’s data warehouse      processing.</li>
<li>However,      Juan says that what I regard(ed) as a major objection to Oracle’s      database-tier parallelization &#8212; the need to manually specify “degrees of      parallelism” &#8212; has now been obviated by automation. Juan thinks that few      data warehouse DBAs will now need to manually tune parallelism, with minor      exceptions. One exception he cites is that if a nightly report really is      non-urgent, it can just be forced to run on a single core with no chance      to grab more resources. (However, Juan thinks manual tuning of parallelism      will continue to play a greater role in OLTP.)</li>
</ul>
<p>OK. That’s all I can get done tonight (see above re: inconvenience of timing). Follow-on subjects I’d like to and indeed plan to post about include:</p>
<ul>
<li>What      Juan said about hybrid columnar compression</li>
<li>Oracle’s      delightfully non-confidential slide deck, and a few comments about same</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Comments on a fabricated press release quote</title>
		<link>http://www.dbms2.com/2009/11/23/fabricated-press-release-quote/</link>
		<comments>http://www.dbms2.com/2009/11/23/fabricated-press-release-quote/#comments</comments>
		<pubDate>Mon, 23 Nov 2009 21:54:03 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[About this blog]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Sybase]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1250</guid>
		<description><![CDATA[My clients at Kickfire put out a press release last week quoting me as saying things I neither said nor believe.  The press release is about a “Queen For A Day”  kind of contest announced way back in April, in which users were invited to submit stories of their data warehouse problems, with [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">My clients at Kickfire put out a press release last week quoting me as saying things I neither said nor believe.  The press release is about a “Queen For A Day”  kind of contest announced way back in April, in which users were invited to submit stories of their data warehouse problems, with the biggest sob stories winning free Kickfire appliances.  The fabricated “quote” reads:<span id="more-1250"></span></p>
<p style="margin-bottom: 0in;"><em>&#8220;As we went through the contest entries in detail, it was readily apparent that today&#8217;s data warehousing solutions are either massively expensive or non-existent,&#8221; said Curt Monash, Founder of Monash Research. &#8220;Clearly, there is major dual-market opportunity for a product such as the Kickfire appliance that can not only provide an affordable data warehousing solution to small companies; but can also target larger companies that have made an initial investment in high-end solutions, yet still need to add some affordable query processing power in other areas of the organization.&#8221;</em></p>
<p style="margin-bottom: 0in;">In reality:</p>
<ul>
<li>I spent a few minutes reviewing 	summaries of eight stories selected by Kickfire from the entrants, 	and emailed comments back to Kickfire about them.  I have no further 	role to play in the contest.</li>
<li>The part of the “quote” that 	slams Kickfire&#8217;s competitors is not reflective of my views.</li>
<li>The “market opportunity” is in 	line with the positioning I&#8217;ve encouraged Kickfire to adopt. A good 	shorthand for it is the “Sybase IQ market.” In essence I see 	Kickfire as an interesting Sybase IQ alternative. But Sybase IQ is a 	formidable competitor, and there are many other competitors as well. 	This is hardly an untapped market ripe for Kickfire&#8217;s plucking.</li>
</ul>
<p style="margin-bottom: 0in;">I&#8217;m satisfied that this is all a case of lousy marketing execution – something <a href="../2009/10/18/kickfire-capacity-and-pricing/">Kickfire has a history of</a> – rather than deliberate deception. Kickfire has recently turned over its VP of Marketing (twice) and PR resource (at least once). Scott Humphrey, Kickfire&#8217;s new outside PR guy, says he was incorrectly told by his predecessor that the press release and quote in question had been approved, and put it out without fact-checking. I believe him. I hope Kickfire CEO Bruce Armstrong will be able to add stronger marketing leadership soon. Bruce seems aware of the need, and is making reasonable marketing strategy decisions himself in the meantime, so there&#8217;s some basis for optimism.</p>
<p style="margin-bottom: 0in;">And by the way – <strong>I don&#8217;t let vendors write press release quotes for me </strong><span>anyway. I let them edit in precise product names and so on, but otherwise the words are mine. The last occasion on which I recall bending this policy was inadvertent and over a year ago, when Greenplum emailed something to me &#8212; which was genuinely similar to my opinion &#8212; while I was on the phone with Aster at <a href="../2008/08/25/mapreduce-sound-bites/">a particularly frenzied time</a>, and I didn&#8217;t immediately realize the words weren&#8217;t my own. </span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/11/23/fabricated-press-release-quote/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Boston Big Data Summit keynote outline</title>
		<link>http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/</link>
		<comments>http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/#comments</comments>
		<pubDate>Mon, 23 Nov 2009 06:25:50 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Archiving and information preservation]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[DBMS product categories]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Humor]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1227</guid>
		<description><![CDATA[Last month, Bob Zurek asked me to give a talk on “Big Data”, where “big” is anything from a few terabytes on up, then moderate a panel on cloud computing. We agreed that I could talk just from notes, without slides. So, since I have them typed up, I&#8217;m posting them below.

The top two points [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Last month, Bob Zurek asked me to give a talk on <a href="http://www.dbms2.com/2009/10/09/presentations-upcoming/" >“Big Data”, where “big” is anything from a few terabytes on up</a>, then moderate a panel on cloud computing. We agreed that I could talk just from notes, without slides. So, since I have them typed up, I&#8217;m posting them below.</p>
<p><span id="more-1227"></span></p>
<p style="margin-bottom: 0in;">The top two points from Q&amp;A probably were:</p>
<ul>
<li><strong>Big Data and the cloud actually 	have relatively little to do with each other,</strong> <a href="http://www.dbms2.com/2009/10/30/aster-data-application-server-ncluster/" >a few exceptions</a> notwithstanding, especially if the data is in a shared-nothing DBMS 	(as opposed to, say, a MapReduce-oriented file cluster). Two 	principal reasons are:
<ul>
<li>Redistributing data from node to 	node is a little slow, undermining some of the elasticity benefits 	of the cloud.</li>
<li><a href="http://www.dbms2.com/2009/05/29/sneakernet-to-the-cloud/" >Getting data into the cloud in the 	first place is a lot slow</a>.</li>
</ul>
</li>
<li><strong>The NoSQL movement is a lot like 	the Ron Paul campaign</strong> &#8212; it consists of people who are dissatisfied 	with the status quo, whose dissatisfaction has a lot to do with 	insufficient liberty and/or excessive expenditure, and who otherwise 	don&#8217;t have a whole lot in common with each other.</li>
</ul>
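<p style="margin-bottom: 0in;">To put a rough number on “a lot slow,” the sketch below estimates how long a bulk upload takes at a given sustained link speed. The 10 TB data size and the two link speeds are illustrative assumptions, not figures from the talk, and the function name <code>upload_days</code> is hypothetical.</p>

```python
def upload_days(terabytes, megabits_per_second):
    """Days needed to move `terabytes` of data over a link that sustains
    `megabits_per_second`, ignoring protocol overhead and retries."""
    bits = terabytes * 1e12 * 8          # decimal terabytes to bits
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 86400               # 86,400 seconds per day


# Assumed example: 10 TB over 100 Mbit/s and over 1 Gbit/s links.
for mbps in (100, 1000):
    print(f"10 TB at {mbps} Mbit/s: {upload_days(10, mbps):.1f} days")
```

<p style="margin-bottom: 0in;">Under those assumptions, the slower link needs a bit over nine days of uninterrupted transfer, which is why shipping disks can look attractive.</p>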
<p style="margin-bottom: 0in;">Anyhow, here are my notes for the talk, edited in just a couple of places for readability or linkage.</p>
<p style="margin-bottom: 0in;"><strong>Quick introduction</strong></p>
<ul>
<li>Big Data vs. cloud</li>
<li>How big is Big Data?</li>
<li>At the low end of that range, 	there&#8217;s little you can&#8217;t do with conventional technology if you 	have:
<ul>
<li>An unlimited budget for hardware</li>
<li>An unlimited budget for software</li>
<li>An unlimited budget for people, 	especially Oracle DBAs</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Big Data in OLTP</strong></p>
<ul>
<li>Hard-core OLTP
<ul>
<li>Focus of DBMS technology for a long time</li>
<li>Big budgets because each 	transaction has significant value</li>
<li>Tough to get users to change 	technologies</li>
</ul>
</li>
<li>Lighter-weight OLTP
<ul>
<li>Classic example = web companies
<ul>
<li>Big ones &#8212;  retail-oriented ones 	(eBay, Amazon) partially excepted &#8212; <a href="http://www.dbms2.com/2009/05/11/facebook-hadoop-and-hive/" >rolled their own technology 	stacks</a></li>
<li>Reluctant to give money to anybody
<ul>
<li>Open source, etc.</li>
</ul>
</li>
</ul>
</li>
<li>Difficulty finding market
<ul>
<li>Product vs. feature
<ul>
<li>Clustering/HA/DR/whatever</li>
<li>Ditto cloud enablement</li>
</ul>
</li>
<li>True products haven&#8217;t found much 	traction yet</li>
</ul>
</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Analytic Big Data use cases</strong></p>
<ul>
<li>Kinds of data for analytics
<ul>
<li>More of same != big</li>
<li>More detail and/or new kinds
<ul>
<li>Complete data sets</li>
<li>Transactions</li>
<li>Call details</li>
<li>Tick/trade history</li>
<li>Web clickstreams</li>
<li>Network event logs</li>
<li>Other machine-generated data</li>
<li>CAM bottom line
<ul>
<li>Anything human-generated should 	and will be retained in its entirety</li>
<li>Quantities of machine-generated 	data retained should and will grow roughly in line w/ computing cost 	reductions (Moore&#8217;s Law, etc.)</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>Analytic uses of Big Data
<ul>
<li>Analytics is mainly about three 	things
<ul>
<li>Problem detection</li>
<li>Customer relationship improvement
<ul>
<li>(Those overlap when the customer 	relationship is bad)</li>
</ul>
</li>
<li>Financial statements on steroids</li>
</ul>
</li>
</ul>
<ul>
<li>Main kinds of analytics
<ul>
<li>What BI vendors traditionally sell
<ul>
<li>General reporting and dashboards</li>
<li>Ad-hoc query (now driven from 	those reports and dashboards)</li>
<li>Planning (allegedly integrated 	with BI)</li>
</ul>
</li>
<li>Research
<ul>
<li>Ad hoc relational query (worth 	mentioning twice because it drives so much of the market)</li>
<li>Data mining</li>
<li>Most web search and web mining</li>
</ul>
</li>
<li>Operational/near-real-time</li>
<li>Archiving/compliance</li>
</ul>
</li>
<li>What gets Big?
<ul>
<li>Mainly research and archiving</li>
<li>But when reporting or operational 	get Big, you have really interesting computing problems</li>
</ul>
</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Technology issues and trends</strong></p>
<ul>
<li>Moore&#8217;s Law
<ul>
<li>CPUs &#8212; All about cores, hence 	parallelism is key</li>
<li>RAM</li>
<li>SSDs – hence replace disks</li>
<li>Sensors – hence generate lots 	more data</li>
</ul>
</li>
<li>Kryder&#8217;s Law
<ul>
<li>But <a href="http://www.dbms2.com/2005/11/13/breaking-the-disk-speed-barrier/" >rotational speeds up only 	12.5X since Eisenhower Administration</a></li>
<li>Hence solid-state memory (or RAM) 	will soon take over</li>
</ul>
</li>
<li>In the meantime, I/O bottlenecks have had to be beaten
<ul>
<li>Hence sequential scans</li>
<li>Hence <a href="http://www.dbms2.com/2007/03/26/index-light-mpp-data-warehouse-appliances/" >index-light</a> architectures</li>
<li>Hence columnar</li>
</ul>
</li>
<li>DBMS “overhead”
<ul>
<li>Raw license and maintenance fees – 	software increasing fraction of total</li>
<li>OLTP vestiges – locking and all 	that</li>
<li>DBAs
<ul>
<li>People costs = huge fraction of 	total</li>
<li>Index-lightness addresses</li>
<li>So does appliance</li>
</ul>
</li>
<li>Many people don&#8217;t really know how to 	write SQL</li>
</ul>
</li>
<li>Configuration
<ul>
<li>Appliance/tightly-balanced
<ul>
<li>Netezza</li>
<li>Teradata earlier</li>
<li>Greenplum/Sun</li>
<li>Oracle</li>
<li>IBM</li>
<li>Microsoft/Madison</li>
</ul>
</li>
<li>Commodity/do what you want
<ul>
<li>Vertica</li>
<li>Greenplum now</li>
<li>Infobright, Aster and others</li>
<li>MapReduce-oriented file systems</li>
</ul>
</li>
<li><a href="http://www.dbms2.com/2009/10/25/data-warehouse-balanced-hardware-configuration/" >Extreme rigidity is silly</a>
<ul>
<li><a href="http://www.dbms2.com/2009/10/25/teradata-hardware-strategy-and-tactics/" >Teradata, Oracle have both 	signaled moving to more modularity</a></li>
<li>Big driver of that = heterogeneous 	storage
<ul>
<li>Cheap disk</li>
<li>Expensive disk</li>
<li>Solid-state</li>
<li>RAM</li>
</ul>
</li>
</ul>
<ul>
<li>CPU/storage ratio is even more of a 	driver</li>
</ul>
</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Theoretically defensible ways to segment the market</strong></p>
<ul>
<li><a href="http://www.dbms2.com/2009/09/10/analytic-speed-latency/" >Latency requirements</a>
<ul>
<li>High availability and low latency 	go together</li>
</ul>
</li>
<li>Query types
<ul>
<li>Simultaneous users for same</li>
</ul>
</li>
<li>Database size</li>
<li>Budget</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Actual segments right now</strong></p>
<ul>
<li><a href="http://www.dbms2.com/2009/08/24/teradatas-active-enterprise-data-warehouse-story/" >Utter ADW/EDW</a></li>
<li>Data mart
<ul>
<li>Size</li>
<li>Naturally columnar vs. naturally 	row-based</li>
</ul>
</li>
<li>Operational/frontline</li>
<li>Less dramatic/smaller EDW</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Teradata hardware strategy and tactics</title>
		<link>http://www.dbms2.com/2009/10/25/teradata-hardware-strategy-and-tactics/</link>
		<comments>http://www.dbms2.com/2009/10/25/teradata-hardware-strategy-and-tactics/#comments</comments>
		<pubDate>Sun, 25 Oct 2009 04:12:09 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Teradata]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1171</guid>
		<description><![CDATA[In my opinion, the most important takeaways about Teradata&#8217;s hardware strategy from the Teradata Partners conference last week are:

Teradata&#8217;s future lies in 	solid-state memory. That&#8217;s in 	line with what Carson 	Schmidt told me six months ago.
To Teradata&#8217;s surprise, the 	solid-state future is imminent. Teradata is 6-9 months further along with solid-state drives (SSD) 	than it [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">In my opinion, the most important takeaways about Teradata&#8217;s hardware strategy from <a href="http://www.dbms2.com/2009/10/19/teradata-partners-2009/" >the Teradata Partners conference</a> last week are:</p>
<ul>
<li><strong>Teradata&#8217;s future lies in 	solid-state memory.</strong><span> That&#8217;s in 	line with what <a href="../2009/04/28/data-warehouse-storage-options-cheap-expensive-or-solid-state-disk-drives/">Carson 	Schmidt</a> told me six months ago.</span></li>
<li><strong>To Teradata&#8217;s surprise, the 	solid-state future is imminent.</strong><span> Teradata is 6-9 months further along with solid-state drives (SSD) 	than it thought a year ago it would be at this point.</span></li>
<li><strong>Short-term, Teradata is going 	to increase the number of appliance kinds it sells. </strong><span>I 	didn&#8217;t actually get details on anything but the new SSD-based Blurr, 	but it seems there will be others as well.</span></li>
<li><strong>Teradata&#8217;s eventual future is to mix and match parts (especially different kinds of storage) in a more modular product line.</strong><span style="font-style: normal;"><span> <a href="../2008/10/14/teradata-virtual-storage/">Teradata Virtual Storage</a> is of </span></span><span>pretty limited value otherwise. I probably believe Teradata will go modular more emphatically than Teradata itself does, because I think <a href="http://www.dbms2.com/2009/10/25/data-warehouse-balanced-hardware-configuration/" >doing so will meet users&#8217; needs more effectively</a> than if Teradata relies strictly on fixed appliance configurations.<br />
</span></li>
</ul>
<p style="margin-bottom: 0in;">In addition, some non-SSD componentry tidbits from Carson Schmidt include:</p>
<ul>
<li>Teradata really likes Intel&#8217;s 	Nehalem CPUs, with special reference to multi-threading, QuickPath 	interconnect, and integrated memory controller. Obviously, 	Nehalem-based Teradata boxes should be expected in the not too 	distant future.</li>
<li>Teradata really likes Nehalem&#8217;s 	successor Westmere too, and expects to be pretty fast to market with 	it (faster than with Nehalem) because Nehalem and Westmere are 	plug-compatible in motherboards.</li>
<li>Teradata will go to 10-gigabit 	Ethernet for external connectivity on all its equipment, which 	should improve load performance.</li>
<li>Teradata will also go to 	10-gigabit Ethernet to play the Bynet role on appliances. Tests are 	indicating this improves query performance.</li>
<li>What&#8217;s more, Teradata believes 	there will be no practical scale-out limitations with 10-gigabit 	Ethernet.</li>
<li>Teradata hasn&#8217;t decided yet what 	to do about 2.5” SFF (Small Form Factor) disk drives, but is 	leaning favorably. Benefits would include lower power consumption 	and smaller cabinets.</li>
<li>Also on Carson&#8217;s list of 	“exciting” future technologies is SAS 2.0, which at 6 	gigabits/second doubles the I/O bandwidth of SAS 1.0.</li>
<li>Carson is even excited about removing uninterruptible power supplies from the cabinets, increasing space for other components.</li>
<li>Teradata picked Intel&#8217;s Host Bus 	Adapters for 10-gigabit Ethernet. The switch supplier hasn&#8217;t been 	determined yet.</li>
</ul>
<p style="margin-bottom: 0in;">Let&#8217;s get back now to SSDs, because over the next few years they&#8217;re the potential game-changer. <span id="more-1171"></span>The big news on SSDs is that after last year&#8217;s Teradata Partners conference, a stealth supplier* introduced itself and convinced Teradata it offers really great SSD technology. For example, not a single SSD it has provided Teradata has ever failed. (In hardware, that is. There have of course been firmware bugs, suitably squashed.) I think SSD performance is also exceeding Teradata&#8217;s expectations. This supplier is where the 6-9 month time-to-market gain comes from.</p>
<p style="margin-bottom: 0in;"><em>*Based on how often the concept of “stealth” and “name is NDAed” came up, I do not believe this is the SSD company another vendor told me about that is going around claiming it has a Teradata relationship.</em></p>
<p style="margin-bottom: 0in;">Teradata SSD highlights include:</p>
<ul>
<li>I/O speeds on “random medium 	blocks” are 520 megabytes/second, vs. 15 MB/second on their 	fastest disks. And that&#8217;s limited by SAS 1.0, load-balanced across 	two devices, not the hardware itself. (2 x 300+ MB/sec turns out to 	be 520 MB/sec in this case.) No wonder Carson is excited about SAS 	2.0.</li>
<li>Teradata is using SAS interfaces 	for its SSDs, and believes that&#8217;s unusual, in that other companies 	are using SATA or Fibre Channel.</li>
<li>Never having had a part fail, 	Teradata has no real basis to make MTTF (Mean Time To Failure) 	estimates for its SSDs.</li>
<li>Teradata&#8217;s SSD appliance design 	includes no array controllers. The biggest reason is that right now 	array controllers can&#8217;t keep up with the SSDs&#8217; speed.</li>
<li>In its SSD appliance, Teradata has 	abandoned RAID, doing mirroring instead via a DBMS feature called 	Fallback that&#8217;s been around since Teradata&#8217;s earliest days. 	(However, <a href="../2008/09/28/oracle-database-machine-performance-and-compression/">unlike 	Oracle in Exadata</a>, Teradata continues to use RAID for disks.)</li>
<li>Useful life for Teradata&#8217;s SSDs is 	estimated at 5-7 years.</li>
<li>Teradata&#8217;s SSDs are SLC 	(Single-Level Cell), as opposed to MLC (Multi-Level Cell).</li>
</ul>
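<p style="margin-bottom: 0in;">The throughput figures above reduce to simple arithmetic. Here is a quick sketch using only the numbers quoted in this post. The one outside assumption is that “SAS 1.0” means the 3 gigabit/second generation, which is what “doubles” to 6 implies. The variable and function names are hypothetical.</p>

```python
# Figures quoted above.
ssd_mb_per_sec = 520        # "random medium blocks", two SAS 1.0 links
disk_mb_per_sec = 15        # Teradata's fastest disks, same workload
sas2_gbit_per_sec = 6       # SAS 2.0 line rate


def speedup(fast, slow):
    """How many times faster `fast` is than `slow`."""
    return fast / slow


# Assumption: SAS 1.0 runs at half the SAS 2.0 rate, per "doubles".
sas1_gbit_per_sec = sas2_gbit_per_sec / 2

print(f"SSD vs. disk: {speedup(ssd_mb_per_sec, disk_mb_per_sec):.1f}x")
print(f"Per-link share of the 520 MB/sec: {ssd_mb_per_sec / 2:.0f} MB/sec")
print(f"Implied SAS 1.0 rate: {sas1_gbit_per_sec:.0f} gigabits/second")
```

<p style="margin-bottom: 0in;">The ratio works out to roughly 35X, and that is with the interface rather than the drives as the limiting factor.</p>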
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/10/25/teradata-hardware-strategy-and-tactics/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Reports of perfectly-balanced hardware configurations are greatly exaggerated</title>
		<link>http://www.dbms2.com/2009/10/25/data-warehouse-balanced-hardware-configuration/</link>
		<comments>http://www.dbms2.com/2009/10/25/data-warehouse-balanced-hardware-configuration/#comments</comments>
		<pubDate>Sun, 25 Oct 2009 04:00:27 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Teradata]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1165</guid>
		<description><![CDATA[Data warehouse appliance and software appliance vendors like to claim that they&#8217;ve worked out just the right hardware configuration(s), and that a single configuration is correct for a fairly broad range of workloads. But there are a lot of reasons to be dubious about that. Specific vendor evidence includes:

Teradata ascribes 	considerable importance to a Virtual [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Data warehouse appliance and software appliance vendors like to claim that they&#8217;ve worked out just the right hardware configuration(s), and that a single configuration is correct for a fairly broad range of workloads. But there are a lot of reasons to be dubious about that. Specific vendor evidence includes:</p>
<ul>
<li><strong>Teradata</strong> ascribes 	considerable importance to a <a href="http://www.dbms2.com/2008/10/14/teradata-virtual-storage//" >Virtual 	Storage</a> technology whose main purpose is to allow mixing of 	heterogeneous storage devices in a single system. And the discussion 	rarely suggests that these parts will be in a rigid fixed 	relationship.</li>
<li><strong>Netezza</strong> &#8212; as Teradata 	keeps reminding me &#8212; often sells boxes with the expectation that 	they won&#8217;t be filled with data, so as to increase spindle count and hence performance.</li>
<li><strong>Oracle/Sun</strong> have dropped 	some comments about Exadata being more flexibly configured going 	forward.</li>
<li><strong>Kickfire&#8217;s</strong> <a href="../2009/10/18/kickfire-capacity-and-pricing/">new 	“high-end” appliance</a> lets you attach fairly arbitrary 	amounts of external storage.</li>
<li>And of course, <strong>software-only 	analytic DBMS vendors</strong> run their software in all sorts of 	hardware and storage environments.</li>
</ul>
<p style="margin-bottom: 0in;">What&#8217;s more, the claim never made a lot of sense anyway. With the rarest of exceptions, even a single data warehouse&#8217;s workload will contain different queries that strain different parts of the system in different ratios. Calculating the “ideal” hardware configuration for that single workload would be forbiddingly difficult. And even if one could calculate it, it almost surely would be different than another user&#8217;s “ideal” configuration. How a single hardware configuration can be “ideally balanced” for a broad class of use cases boggles the imagination.</p>
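<p style="margin-bottom: 0in;">A toy illustration of that argument, with every number invented for the purpose: each query is described by how much CPU work and how much I/O it needs, and the configuration by how much of each it supplies per second. The names <code>Query</code>, <code>Config</code>, and <code>bottleneck</code> are hypothetical, and nothing here models a real product.</p>

```python
from collections import namedtuple

# Made-up units: CPU work in core-seconds, I/O in gigabytes scanned.
Query = namedtuple("Query", "name cpu_work io_gb")
# A configuration supplies some number of cores and some scan rate.
Config = namedtuple("Config", "name cores io_gb_per_sec")


def bottleneck(query, config):
    """Return (resource, seconds) for whichever resource takes longer,
    assuming CPU and I/O overlap perfectly. Ties go to "cpu"."""
    cpu_time = query.cpu_work / config.cores
    io_time = query.io_gb / config.io_gb_per_sec
    return max(("cpu", cpu_time), ("io", io_time), key=lambda pair: pair[1])


queries = [
    Query("big table scan", cpu_work=40, io_gb=400),
    Query("complex join and aggregate", cpu_work=800, io_gb=50),
]
config = Config("one fixed appliance", cores=16, io_gb_per_sec=4)

for q in queries:
    resource, seconds = bottleneck(q, config)
    print(f"{q.name}: limited by {resource}, about {seconds:.0f} seconds")
```

<p style="margin-bottom: 0in;">With these invented numbers the scan is I/O-bound and the join is CPU-bound on the same box, so no single ratio of cores to I/O bandwidth is ideal for both.</p>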
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/10/25/data-warehouse-balanced-hardware-configuration/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>This week at the Teradata Partners user conference</title>
		<link>http://www.dbms2.com/2009/10/19/teradata-partners-2009/</link>
		<comments>http://www.dbms2.com/2009/10/19/teradata-partners-2009/#comments</comments>
		<pubDate>Mon, 19 Oct 2009 13:07:31 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data integration and middleware]]></category>
		<category><![CDATA[Data types]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[GIS and geospatial]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Theory and architecture]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1150</guid>
		<description><![CDATA[Teradata tells me that its press embargoes are ending at 9:00 this morning. Here are some highlights of what&#8217;s going on, although names, dates, and details will have to await conversations and press releases this week.

Teradata is productizing 	“private cloud,” under names including “Teradata 	Enterprise Analytics Cloud,” “Teradata Agile Analytics Cloud,” 	and “Teradata Elastic Mart [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Teradata tells me that its press embargoes are ending at 9:00 this morning. Here are some highlights of what&#8217;s going on, although names, dates, and details will have to await conversations and press releases this week.</p>
<ul>
<li><strong>Teradata is productizing 	“private cloud,”</strong> under names including “Teradata 	Enterprise Analytics Cloud,” “Teradata Agile Analytics Cloud,” 	and “Teradata Elastic Mart Builder.” I.e., Teradata hopes to 	leapfrog Greenplum in its “<a href="../2009/06/08/the-future-of-data-marts/">Enterprise 	Data Cloud</a>” strategy. This is only fair, in that Greenplum 	lifted the idea from Teradata and eBay in the first place. It also 	provides major support for what I think is an extremely sensible 	trend. Give or take issues of who announces and ships what a couple 	months before or after a competitor, my early thinking is that the 	main differences between Greenplum and Teradata in this regard will 	be:
<ul>
<li>Virtual as opposed to just 	physical data marts, based on robust workload management software. 	(Advantage: Teradata)</li>
<li>Pricing, deployment options. 	(Advantage: Greenplum)</li>
<li>Features that don&#8217;t directly 	relate to enterprise/private cloud. (Advantage: Either, often 	Teradata.)</li>
</ul>
</li>
<li><strong>Teradata is generally 	strengthening its data movement technology</strong>, e.g. for making 	various appliances work in sync. I&#8217;m not too clear yet on the 	details of that. I think this is what Teradata&#8217;s phrase “ecosystem 	management” refers to.</li>
<li><strong>Teradata is (pre-)announcing – 	at least as a statement of direction &#8212; an appliance based on 	solid-state drives (SSDs). </strong>I&#8217;ve thought for a while that 	Teradata was a leader in thinking through <a href="../2008/10/23/teradata-solid-state-drives-ssd/">the 	issues around solid-state memory in data warehousing</a>, so it 	makes sense that they&#8217;re among the leaders in actually coming to 	market as well. I plan to say more after meeting with, e.g., Carson 	Schmidt.</li>
<li><strong>Teradata has achieved a 300%ish 	speed-up in geospatial processing</strong>. I gather this is largely a 	byproduct of the parallel analytics work Teradata did around 	strengthening its SAS integration. However, there don&#8217;t seem to be a 	lot of Teradata geospatial users yet.</li>
<li><span>Teradata 	Express, </span><strong>Teradata&#8217;s free Windows-based crippleware, is being 	ported to Amazon EC2 and VMware</strong> as well. Presumably to avoid 	cannibalizing Teradata product sales, there are quite a few 	limitations on Teradata Express, including system capacity, database 	size, and “no production use.”</li>
<li><strong>Teradata continues to extend its optimizations to handle queries issued by business intelligence tools. </strong><span>Previously, the focus of what Teradata discussed in this regard was <a href="../2009/08/02/teradata-13-focuses-on-advanced-analytic-performance/">query rewrite</a>. But soon automatic recommendation and creation of Aggregate Join Indexes – i.e., materialized views – will be included as well.</span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/10/19/teradata-partners-2009/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
