<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS2 -- DataBase Management System Services &#187; Clustering</title>
	<atom:link href="http://www.dbms2.com/category/parallelization/database-clustering/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Fri, 30 Jul 2010 15:51:32 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Gear6 seems to have failed in the memcached market too</title>
		<link>http://www.dbms2.com/2010/04/27/gear6-memcache/</link>
		<comments>http://www.dbms2.com/2010/04/27/gear6-memcache/#comments</comments>
		<pubDate>Tue, 27 Apr 2010 04:33:22 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Northscale]]></category>
		<category><![CDATA[memcached]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1961</guid>
		<description><![CDATA[As previously noted, I&#8217;ve briefly cut back on blogging (and research) due to some family health issues. The first casualty was a post about memcached. One of the two companies to be featured was my new client Northscale. The other was Gear6. What they had in common was:

Both Northscale and Gear6 offered distributions of [...]]]></description>
			<content:encoded><![CDATA[<p>As previously noted, <a href="http://www.dbms2.com/2010/04/16/introduction-to-datameer/" >I&#8217;ve briefly cut back on blogging</a> (and research) due to some family health issues. The first casualty was a post about memcached. One of the two companies to be featured was my new client <a href="http://www.dbms2.com/2010/03/16/memcached-northscale-launc/" >Northscale</a>. The other was Gear6. What they had in common was:</p>
<ul>
<li>Both Northscale and Gear6 offered distributions of memcached.</li>
<li>Both Northscale and Gear6 also wanted to sell persistent versions of memcached &#8212; in essence, simple DBMS with the memcached API in place of a substantial DML (Data Manipulation Language).</li>
</ul>
<p><span id="more-1961"></span>Differences included:</p>
<ul>
<li>Gear6 hit the market earlier.</li>
<li>Gear6 forked away from open source memcached.</li>
<li>Gear6 sold its memcached distribution, while Northscale&#8217;s is free. (But you&#8217;re encouraged to pay for support.)</li>
<li>Gear6 is also in the caching appliance business.</li>
<li>Northscale and Gear6 had different approaches to beefing up the memcached API (especially for their respective persistent memcached stores).</li>
</ul>
<p>However, both <a href="http://gigaom.com/2010/04/26/gear6-rip/" onclick="javascript:pageTracker._trackPageview('/gigaom.com');">GigaOm</a> and <a href="http://blogs.eweek.com/storage_station/content/thought_leaders/gear6_reportedly_shutting_down_operations.html" onclick="javascript:pageTracker._trackPageview('/blogs.eweek.com');">eWeek</a> are reporting that Gear6 is headed for liquidation. Even though the source of the story seems to be Gear6&#8217;s caching appliance rival Schooner, the rumor sounds legit, not least due to Gear6&#8217;s conspicuous lack of denial. Of course, we can&#8217;t rule out some kind of last-minute transaction that either keeps Gear6 going in some capacity, or else gets its technology into some other vendor&#8217;s hands.</p>
<p>Metrics on Gear6 as of my briefing a few weeks ago included:</p>
<ul>
<li>A dozen or so announced (software?) customers</li>
<li>&#8220;Dozens&#8221; of paying customers in the Amazon cloud</li>
<li>&#8220;Tons&#8221; of free customers in the Amazon cloud</li>
<li>25-30 employees</li>
<li>Low-end pricing of $3500/year/server or, on Amazon, $0.80/hour</li>
<li>Biggest known installation = about 20 servers clustered together</li>
</ul>
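<p>As a rough aid to reading the two price points above, here is a small arithmetic sketch. It assumes (and this is only an assumption) that the $0.80/hour Amazon figure is billed for every hour of a non-leap year on an instance that is never stopped, and that the two offerings are comparable at all. The function names are made up for illustration.</p>
<pre>
```python
# Figures quoted in the list above.
ANNUAL_LICENSE_USD = 3500.0   # low-end price per server per year
AMAZON_HOURLY_USD = 0.80      # low-end price per hour on Amazon

# Assumption: a non-leap year, with the instance running the whole time.
HOURS_PER_YEAR = 24 * 365


def amazon_annual_cost(hourly_usd, hours=HOURS_PER_YEAR):
    """Cost of paying the hourly rate for the given number of hours."""
    return hourly_usd * hours


def break_even_hours(annual_usd, hourly_usd):
    """Hours of hourly billing that add up to the annual per-server price."""
    return annual_usd / hourly_usd


if __name__ == "__main__":
    print("Amazon, always on, per year:", amazon_annual_cost(AMAZON_HOURLY_USD))
    print("Break-even hours:", break_even_hours(ANNUAL_LICENSE_USD, AMAZON_HOURLY_USD))
```
</pre>
<p>Under those assumptions, an always-on Amazon instance comes to roughly twice the low-end annual per-server price, and the two meet at about half a year of running time.</p>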
<p>I of course still plan to write about Northscale and memcached, but first I wanted to get Gear6&#8217;s apparent death throes out of the way in a separate post.</p>
<p>Incidentally, memcached is by no means Gear6&#8217;s original business. Gear6 was founded in 2002 as Engineered Intelligence Corporation to do something in the area of general <a href="http://executiveventures.com/bio.php" onclick="javascript:pageTracker._trackPageview('/executiveventures.com');">high-performance clustering</a>. In 2006 Gear6 said it was &#8220;solely focused&#8221; on <a href="http://www.networkcomputing.com/other/gear6.php?p=2" onclick="javascript:pageTracker._trackPageview('/www.networkcomputing.com');">storage, specifically the &#8220;server/storage gap&#8221;</a>. By 2008 Gear6&#8217;s business was <a href="http://virtualization.com/funding/2008/03/27/gear6-raises-10-million-from-horizon-ventures-us-venture-partners-and-interwest-partners/" onclick="javascript:pageTracker._trackPageview('/virtualization.com');">high-performance caching appliances</a>, not necessarily memcached-specific. memcached, in software, appliance, and cloud-based formats, came more recently.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/27/gear6-memcache/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Memcached-based company NorthScale launches</title>
		<link>http://www.dbms2.com/2010/03/16/memcached-northscale-launc/</link>
		<comments>http://www.dbms2.com/2010/03/16/memcached-northscale-launc/#comments</comments>
		<pubDate>Tue, 16 Mar 2010 17:52:48 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Cache]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Northscale]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[memcached]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1717</guid>
		<description><![CDATA[NorthScale, a start-up based around memcached, has just launched, two weeks after Todd Hoff&#8217;s post arguing the MySQL/memcached combo is pass&#233;. NorthScale wouldn&#8217;t necessarily disagree with Todd, contending that what you really should use instead is NorthScale&#8217;s combo of memcached and MemBase, a memcached-like DBMS &#8230;
&#8230; or something like that. I don&#8217;t intend to [...]]]></description>
			<content:encoded><![CDATA[<p>NorthScale, a start-up based around memcached, has just launched, two weeks after Todd Hoff&#8217;s post arguing <a href="http://www.dbms2.com/2010/03/02/cassandra-nosql-scalable-oltp/" >the MySQL/memcached combo is pass&#233;</a>. NorthScale wouldn&#8217;t necessarily disagree with Todd, contending that what you really should use instead is NorthScale&#8217;s combo of memcached and MemBase, a memcached-like DBMS &#8230;</p>
<p>&#8230; or something like that. I don&#8217;t intend to write seriously about NorthScale until I have a better idea of what MemBase is.</p>
<p>In the meantime,</p>
<ul>
<li>VentureBeat put up a solid post on <a href="http://deals.venturebeat.com/2010/03/16/northscale-zynga-memcached/" onclick="javascript:pageTracker._trackPageview('/deals.venturebeat.com');">NorthScale&#8217;s company history</a> and so on</li>
<li>Om Malik bought into <a href="http://gigaom.com/2010/03/16/northscale/" onclick="javascript:pageTracker._trackPageview('/gigaom.com');">the NorthScale memcached pitch</a></li>
<li>TechCrunch has <a href="http://techcrunch.com/2010/03/16/northscales-data-management-technology-attracts-zynga-and-others/" onclick="javascript:pageTracker._trackPageview('/techcrunch.com');">a low-quality post about NorthScale</a> (although it wasn&#8217;t as error-riddled as the same author&#8217;s post about nStein, which <a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/02/open_text_buyin.html;jsessionid=T51GQFI1CCPL1QE1GHOSKHWATMY32JVN" onclick="javascript:pageTracker._trackPageview('/intelligent-enterprise.informationweek.com');">Seth Grimes properly blasted</a>)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/03/16/memcached-northscale-launc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Boston Big Data Summit keynote outline</title>
		<link>http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/</link>
		<comments>http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/#comments</comments>
		<pubDate>Mon, 23 Nov 2009 06:25:50 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Archiving and information preservation]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[DBMS product categories]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Humor]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1227</guid>
		<description><![CDATA[Last month, Bob Zurek asked me to give a talk on “Big Data”, where “big” is anything from a few terabytes on up, then moderate a panel on cloud computing. We agreed that I could talk just from notes, without slides. So, since I have them typed up, I&#8217;m posting them below.

The top two points [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Last month, Bob Zurek asked me to give a talk on <a href="http://www.dbms2.com/2009/10/09/presentations-upcoming/" >“Big Data”, where “big” is anything from a few terabytes on up</a>, then moderate a panel on cloud computing. We agreed that I could talk just from notes, without slides. So, since I have them typed up, I&#8217;m posting them below.</p>
<p><span id="more-1227"></span></p>
<p style="margin-bottom: 0in;">The top two points from Q&amp;A probably were:</p>
<ul>
<li><strong>Big Data and the cloud actually 	have relatively little to do with each other,</strong> <a href="http://www.dbms2.com/2009/10/30/aster-data-application-server-ncluster/" >a few exceptions</a> notwithstanding, especially if the data is in a shared-nothing DBMS 	(as opposed to, say, a MapReduce-oriented file cluster). Two 	principal reasons are:
<ul>
<li>Redistributing data from node to 	node is a little slow, undermining some of the elasticity benefits 	of the cloud.</li>
<li><a href="http://www.dbms2.com/2009/05/29/sneakernet-to-the-cloud/" >Getting data into the cloud in the 	first place is a lot slow</a>.</li>
</ul>
</li>
<li><strong>The NoSQL movement is a lot like 	the Ron Paul campaign</strong> &#8212; it consists of people who are dissatisfied 	with the status quo, whose dissatisfaction has a lot to do with 	insufficient liberty and/or excessive expenditure, and who otherwise 	don&#8217;t have a whole lot in common with each other.</li>
</ul>
<p style="margin-bottom: 0in;">Anyhow, here are my notes for the talk, edited in just a couple of places for readability or linkage.</p>
<p style="margin-bottom: 0in;"><strong>Quick introduction</strong></p>
<ul>
<li>Big Data vs. cloud</li>
<li>How big is Big Data?</li>
<li>At the low end of that range, 	there&#8217;s little you can&#8217;t do with conventional technology if you 	have:
<ul>
<li>An unlimited budget for hardware</li>
<li>An unlimited budget for software</li>
<li>An unlimited budget for people, 	especially Oracle DBAs</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Big Data in OLTP</strong></p>
<ul>
<li>Hard-core OLTP
<ul>
<li>Focus of DBMS technology for a long time</li>
<li>Big budgets because each 	transaction has significant value</li>
<li>Tough to get users to change 	technologies</li>
</ul>
</li>
<li>Lighter-weight OLTP
<ul>
<li>Classic example = web companies
<ul>
<li>Big ones &#8212;  retail-oriented ones 	(eBay, Amazon) partially excepted &#8212; <a href="http://www.dbms2.com/2009/05/11/facebook-hadoop-and-hive/" >rolled their own technology 	stacks</a></li>
<li>Reluctant to give money to anybody
<ul>
<li>Open source, etc.</li>
</ul>
</li>
</ul>
</li>
<li>Difficulty finding market
<ul>
<li>Product vs. feature
<ul>
<li>Clustering/HA/DR/whatever</li>
<li>Ditto cloud enablement</li>
</ul>
</li>
<li>True products haven&#8217;t found much 	traction yet</li>
</ul>
</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Analytic Big Data use cases</strong></p>
<ul>
<li>Kinds of data for analytics
<ul>
<li>More of same != big</li>
<li>More detail and/or new kinds
<ul>
<li>Complete data sets</li>
<li>Transactions</li>
<li>Call details</li>
<li>Tick/trade history</li>
<li>Web clickstreams</li>
<li>Network event logs</li>
<li>Other machine-generated data</li>
<li>CAM bottom line
<ul>
<li>Anything human-generated should 	and will be retained in its entirety</li>
<li>Quantities of machine-generated 	data retained should and will grow roughly in line w/ computing cost 	reductions (Moore&#8217;s Law, etc.)</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>Analytic uses of Big Data
<ul>
<li>Analytics is mainly about three 	things
<ul>
<li>Problem detection</li>
<li>Customer relationship improvement
<ul>
<li>(Those overlap when the customer 	relationship is bad)</li>
</ul>
</li>
<li>Financial statements on steroids</li>
</ul>
</li>
</ul>
<ul>
<li>Main kinds of analytics
<ul>
<li>What BI vendors traditionally sell
<ul>
<li>General reporting and dashboards</li>
<li>Ad-hoc query (now driven from 	those reports and dashboards)</li>
<li>Planning (allegedly integrated 	with BI)</li>
</ul>
</li>
<li>Research
<ul>
<li>Ad hoc relational query (worth 	mentioning twice because it drives so much of the market)</li>
<li>Data mining</li>
<li>Most web search and web mining</li>
</ul>
</li>
<li>Operational/near-real-time</li>
<li>Archiving/compliance</li>
</ul>
</li>
<li>What gets Big?
<ul>
<li>Mainly research and archiving</li>
<li>But when reporting or operational 	get Big, you have really interesting computing problems</li>
</ul>
</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Technology issues and trends</strong></p>
<ul>
<li>Moore&#8217;s Law
<ul>
<li>CPUs &#8212; All about cores, hence 	parallelism is key</li>
<li>RAM</li>
<li>SSDs – hence replace disks</li>
<li>Sensors – hence generate lots 	more data</li>
</ul>
</li>
<li>Kryder&#8217;s Law
<ul>
<li>But <a href="http://www.dbms2.com/2005/11/13/breaking-the-disk-speed-barrier/" >rotational speeds up only 	12.5X since Eisenhower Administration</a></li>
<li>Hence solid-state memory (or RAM) 	will soon take over</li>
</ul>
</li>
<li>In the meantime, I/O bottlenecks have had to be beaten
<ul>
<li>Hence sequential scans</li>
<li>Hence <a href="http://www.dbms2.com/2007/03/26/index-light-mpp-data-warehouse-appliances/" >index-light</a> architectures</li>
<li>Hence columnar</li>
</ul>
</li>
<li>DBMS “overhead”
<ul>
<li>Raw license and maintenance fees – 	software increasing fraction of total</li>
<li>OLTP vestiges – locking and all 	that</li>
<li>DBAs
<ul>
<li>People costs = huge fraction of 	total</li>
<li>Index-lightness addresses</li>
<li>So does appliance</li>
</ul>
</li>
<li>Many people don&#8217;t really know how to 	write SQL</li>
</ul>
</li>
<li>Configuration
<ul>
<li>Appliance/tightly-balanced
<ul>
<li>Netezza</li>
<li>Teradata earlier</li>
<li>Greenplum/Sun</li>
<li>Oracle</li>
<li>IBM</li>
<li>Microsoft/Madison</li>
</ul>
</li>
<li>Commodity/do what you want
<ul>
<li>Vertica</li>
<li>Greenplum now</li>
<li>Infobright, Aster and others</li>
<li>MapReduce-oriented file systems</li>
</ul>
</li>
<li><a href="http://www.dbms2.com/2009/10/25/data-warehouse-balanced-hardware-configuration/" >Extreme rigidity is silly</a>
<ul>
<li><a href="http://www.dbms2.com/2009/10/25/teradata-hardware-strategy-and-tactics/" >Teradata, Oracle have both 	signaled moving to more modularity</a></li>
<li>Big driver of that = heterogeneous 	storage
<ul>
<li>Cheap disk</li>
<li>Expensive disk</li>
<li>Solid-state</li>
<li>RAM</li>
</ul>
</li>
</ul>
<ul>
<li>CPU/storage ratio is even more of a 	driver</li>
</ul>
</li>
</ul>
</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Theoretically defensible ways to segment the market</strong></p>
<ul>
<li><a href="http://www.dbms2.com/2009/09/10/analytic-speed-latency/" >Latency requirements</a>
<ul>
<li>High availability and low latency 	go together</li>
</ul>
</li>
<li>Query types
<ul>
<li>Simultaneous users for same</li>
</ul>
</li>
<li>Database size</li>
<li>Budget</li>
</ul>
<p style="margin-bottom: 0in;"><strong>Actual segments right now</strong></p>
<ul>
<li><a href="http://www.dbms2.com/2009/08/24/teradatas-active-enterprise-data-warehouse-story/" >Utter ADW/EDW</a></li>
<li>Data mart
<ul>
<li>Size</li>
<li>Naturally columnar vs. naturally 	row-based</li>
</ul>
</li>
<li>Operational/frontline</li>
<li>Less dramatic/smaller EDW</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Martin Kersten on issues in scientific data management</title>
		<link>http://www.dbms2.com/2009/10/03/martin-kersten-on-issues-in-scientific-data-management/</link>
		<comments>http://www.dbms2.com/2009/10/03/martin-kersten-on-issues-in-scientific-data-management/#comments</comments>
		<pubDate>Sat, 03 Oct 2009 10:33:52 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[SciDB]]></category>
		<category><![CDATA[Scientific research]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=998</guid>
		<description><![CDATA[Martin Kersten emailed a response to my post on issues in scientific data management. With his permission, I&#8217;ve lightly edited it, and am posting it below.

Dear Curt,
Thanks for the very nice story and perception on the XLDB meeting. It is a balanced view.
More philosophically I would add a few points:
1) A data management system architecture [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;"><a href="http://www.dbms2.com/2009/08/04/vectorwise-ingres-and-monetdb/" >Martin Kersten</a> emailed a response to my post on <a href="http://www.dbms2.com/2009/10/03/issues-in-scientific-data-management/" >issues in scientific data management</a>. With his permission, I&#8217;ve lightly edited it, and am posting it below.<span id="more-998"></span></p>
<blockquote>
<p style="margin-bottom: 0in;">Dear Curt,</p>
<p style="margin-bottom: 0in;">Thanks for the very nice story and perception on the XLDB meeting. It is a balanced view.</p>
<p style="margin-bottom: 0in;">More philosophically I would add a few points:</p>
<p style="margin-bottom: 0in;">1) A data management system architecture is a large collection of compromises amongst a number of competing parameters.</p>
<p style="margin-bottom: 0in; padding-left: 30px;">data management (hardware, data structure, algorithms, optimizers, languages) &#8211;&gt; value-for-application</p>
<p style="margin-bottom: 0in;">Given the cost to develop/maintain a dbms, we see only a few parameter constellations in the current product offerings. And the scientists have a hard time to explore the uncharted land, because of effort required and uncertain benefits. (The same holds for researchers in R&amp;D labs of vendors.)</p>
<p style="margin-bottom: 0in;">2) The research community needs a focus to move ahead. The array-dbms is such a focus, because it identifies an omission in the type structure being managed at all levels of a system. Articulation of this in the community will help to steer effort.</p>
<p style="margin-bottom: 0in;">3) The recent &#8216;hype&#8217; for going to a HadoopDB like approach should be positioned carefully. It is so far a single point experiment for a limited query domain space, carefully carved out to avoid all the issues that plague a distributed dbms. Within this space the techniques come from a different operating system functionality. <em>[Not sure what he means by this.]</em> It does not change the DBMS itself and as such it is a repetition of middleware solutions to handle a cluster of independent MySQL instances.</p>
<p style="margin-bottom: 0in;">This paper might be worth having a look at <a href="http://ic2.epfl.ch/labos/publications/freenix2004.pdf" onclick="javascript:pageTracker._trackPageview('/ic2.epfl.ch');">http://ic2.epfl.ch/labos/publications/freenix2004.pdf</a></p>
<p style="margin-bottom: 0in;">To generalize it to a complete solution e.g. calls for massive replication, to avoid that you have to ship data around during query execution. This is paid for with more expensive updates. Feasible in certain domains. <em>[I'd frame that point just as saying Hadoop-based solutions are unlikely to do as well at reducing data shipping as the better MPP DBMS.]</em></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/10/03/martin-kersten-on-issues-in-scientific-data-management/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Xkoto Gridscale highlights</title>
		<link>http://www.dbms2.com/2009/09/11/xkoto-gridscale-highlights/</link>
		<comments>http://www.dbms2.com/2009/09/11/xkoto-gridscale-highlights/#comments</comments>
		<pubDate>Fri, 11 Sep 2009 18:36:03 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Market share]]></category>
		<category><![CDATA[Microsoft and SQL*Server]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Pricing]]></category>
		<category><![CDATA[Xkoto]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=881</guid>
		<description><![CDATA[I talked yesterday with cofounders Albert Lee and Ariff Kassam of Xkoto. Highlights included:

Xkoto sells Gridscale, a 	clustering server for DB2 and, more recently, MS SQL Server.
Xkoto Gridscale runs on a separate 	box, between the application and the database servers. This box is 	typically smaller and cheaper than the database server boxes.
Xkoto most typically sells [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked yesterday with cofounders Albert Lee and Ariff Kassam of Xkoto. Highlights included:<span id="more-881"></span></p>
<ul>
<li>Xkoto sells Gridscale, a 	clustering server for DB2 and, more recently, MS SQL Server.</li>
<li>Xkoto Gridscale runs on a separate 	box, between the application and the database servers. This box is 	typically smaller and cheaper than the database server boxes.</li>
<li>Xkoto most typically sells 	Gridscale into environments where there already are three database 	servers &#8212; one to do work, one for hot standby, and one for remote 	disaster recovery.</li>
<li>In such environments, Gridscale&#8217;s 	big benefit is that you can distribute the query workload among all 	three servers. Xkoto believes this big performance increase is the 	reason customers don&#8217;t get much past 3 database servers under Xkoto 	(they didn&#8217;t seem quite sure as to whether the all-time record was 4 	or 5).  Note that even if a remote server is a little too far away 	for OLTP query response, it can work fine for reporting.</li>
<li>Of course, if you don&#8217;t already have high/&#8220;continuous&#8221; availability and/or disaster recovery, then Xkoto would say those are core benefits of Gridscale as well.</li>
<li>Gridscale sends transactions (or just SQL statements?) to all servers in the cluster. Once any of them responds affirmatively, that update is reflected in queries. Gridscale maintains a small query log to make sure it gets the other database copies in sync. It also tries to make sure that queries always go to the most current copy of the database. (I didn&#8217;t ask what happens if Server A executes Transaction T but not U, while Server B executes Transaction U and not T &#8212; but that does seem like something of an edge case.)</li>
<li>Xkoto spun out of <a href="http://www.halcyoninc.com/" onclick="javascript:pageTracker._trackPageview('/www.halcyoninc.com');">Halcyon 	Monitoring</a> in 2006, starting with DB2 support. Microsoft SQL 	Server support was introduced in 2008.</li>
<li>Xkoto likes its partnerships with 	IBM and Microsoft. For example, IBM provides Level 1 and 2 support 	for Gridscale itself. Due in large part to this partnership 	strategy, Xkoto says it has no plans to support DBMS beyond DB2 and 	SQL Server.</li>
<li>Instead, Xkoto is pursuing partnerships with large application vendors and so on. (The figure &#8220;about 10&#8221; was mentioned.) I gather the idea is to make sure that neither the application support folks nor the app itself freak out from the fact that the app isn&#8217;t exactly talking to the DBMS any more.</li>
<li>Xkoto has done lab tests 	suggesting Gridscale offers near-linear scalability (in terms of SQL 	Server database throughput) on a query-only workload up to 10 	servers.</li>
<li>I gather that Xkoto and IBM have 	demos suggesting it&#8217;s a fine idea to have your disaster recovery 	server be in the Amazon cloud, but they haven&#8217;t yet made any sales 	based on that &#8212; er, based on that <em>premise.</em> <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </li>
<li>Gridscale pricing is measured in the same metrics as DB2 or SQL Server pricing, and in each case is around 1/3 what database pricing would be on the same box (I&#8217;m guessing that&#8217;s for enterprise editions without add-ons, but I didn&#8217;t probe). Specifically, Gridscale charges $12K per 100 PVUs for the DB2 edition, and $12K per socket for running with Microsoft SQL Server.</li>
<li>Gridscale typically runs on 	smaller boxes than the databases it talks to.</li>
<li>Xkoto has about 35 	revenue-recognized customers. Most are on DB2, the first environment 	Gridscale supported.</li>
<li>Average Gridscale selling prices 	are $180K on DB2, and $40-50K in the early going for SQL Server.</li>
<li>Xkoto has about 40 full-time 	employees, with engineering in Toronto and business operations in 	Waltham.</li>
</ul>
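<p>To make the replication behavior described above a bit more concrete, here is a toy sketch of a middle tier that sends each write to every copy, counts the write as done once at least one copy acknowledges it, keeps an ordered log so lagging copies can be brought back in sync, and routes queries to the most current copy. Everything in it (the <code>Replica</code> and <code>Router</code> classes, the key/value stand-in for SQL statements, the catch-up rule) is an assumption made for illustration, not a description of how Gridscale is actually built.</p>
<pre>
```python
class Replica:
    """Hypothetical stand-in for one database server in the cluster."""

    def __init__(self, name):
        self.name = name
        self.rows = {}      # key to value; stands in for the database contents
        self.applied = 0    # number of logged statements this copy has executed
        self.online = True

    def apply(self, statement):
        key, value = statement
        self.rows[key] = value
        self.applied += 1


class Router:
    """Toy middle tier sitting between the application and the replicas."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.log = []       # ordered statements, kept so lagging copies can catch up

    def write(self, key, value):
        """Send the statement to every copy; succeed once any copy acknowledges."""
        statement = (key, value)
        self.log.append(statement)
        acks = 0
        for replica in self.replicas:
            # Only a reachable, fully caught-up copy takes the new statement directly.
            if replica.online and replica.applied == len(self.log) - 1:
                replica.apply(statement)
                acks += 1
        if acks == 0:
            self.log.pop()
            raise RuntimeError("no replica acknowledged the write")
        return acks

    def catch_up(self, replica):
        """Replay, in order, every logged statement the replica has missed."""
        for statement in self.log[replica.applied:]:
            replica.apply(statement)

    def read(self, key):
        """Send the query to the most current reachable copy."""
        candidates = [r for r in self.replicas if r.online]
        best = max(candidates, key=lambda r: r.applied)
        return best.rows.get(key)


if __name__ == "__main__":
    primary, standby, remote = Replica("primary"), Replica("standby"), Replica("remote-dr")
    router = Router([primary, standby, remote])
    router.write("acct:1", 100)
    remote.online = False            # the remote copy misses the next update
    router.write("acct:1", 250)
    remote.online = True
    print(router.read("acct:1"))     # served by a copy that has both updates
    router.catch_up(remote)
    print(remote.rows["acct:1"])
```
</pre>
<p>Because this toy version funnels every write through one ordered log and only lets caught-up copies take new statements, the &#8220;Server A has T but not U, Server B has U but not T&#8221; case cannot arise in it. That is a property of the sketch, and says nothing about how the real product handles the question.</p>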
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/09/11/xkoto-gridscale-highlights/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Continuent on clustering</title>
		<link>http://www.dbms2.com/2009/09/03/continuent-on-clustering/</link>
		<comments>http://www.dbms2.com/2009/09/03/continuent-on-clustering/#comments</comments>
		<pubDate>Thu, 03 Sep 2009 13:46:56 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Continuent]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=876</guid>
		<description><![CDATA[Robert Hodges, CTO of my client Continuent, put up a blog post laying out his and Continuent&#8217;s views on database clustering. Continuent offers Tungsten, its third try at database clustering technology, targeted at MySQL, PostgreSQL, and perhaps Oracle. Unlike Continuent&#8217;s more ambitious second-generation product, Tungsten offers single-master replication, which in Robert&#8217;s view allows for great [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Robert Hodges, CTO of my client Continuent, put up <a href="http://scale-out-blog.blogspot.com/2009/09/future-of-database-clustering.html" onclick="javascript:pageTracker._trackPageview('/scale-out-blog.blogspot.com');">a blog post</a> laying out his and Continuent&#8217;s views on database clustering. Continuent offers Tungsten, its third try at database clustering technology, targeted at MySQL, PostgreSQL, and perhaps Oracle. Unlike Continuent&#8217;s more ambitious second-generation product, Tungsten offers single-master replication, which in Robert&#8217;s view allows for great ease of deployment and administration (he likes the phrase “bone-simple”).</p>
<p style="margin-bottom: 0in;">The downside to Continuent Tungsten&#8217;s stripped-down architecture is that it doesn&#8217;t solve the most extreme performance scale-out problems.  Instead, Continuent focuses on the other big benefits of keeping your data in more than one place, namely high availability and data loss prevention (i.e., backup).</p>
<p style="margin-bottom: 0in;">Continuent has been around for a number of years; it started out in Finland but is now based in Silicon Valley. For most purposes, however, it&#8217;s reasonable to think of Continuent and Tungsten as start-up efforts.</p>
<p style="margin-bottom: 0in;">As you might guess from the references to Finland and MySQL, Continuent&#8217;s products are open source, or at least have open source versions. I&#8217;m still a little fuzzy as to which features are open sourced and which are not. For that matter, I&#8217;m still unclear as to Tungsten&#8217;s feature list overall &#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/09/03/continuent-on-clustering/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>What are the best choices for scaling Postgres?</title>
		<link>http://www.dbms2.com/2009/07/29/scaling-postgres-choices/</link>
		<comments>http://www.dbms2.com/2009/07/29/scaling-postgres-choices/#comments</comments>
		<pubDate>Wed, 29 Jul 2009 06:16:02 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Cache]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Data mart outsourcing]]></category>
		<category><![CDATA[EnterpriseDB and Postgres Plus]]></category>
		<category><![CDATA[In-memory DBMS]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=849</guid>
		<description><![CDATA[I have a client who wants to build a new application with peak update volume of several million transactions per hour.  (Their base business is data mart outsourcing, but now they&#8217;re building update-heavy technology as well.) They have a small budget.  They&#8217;ve been a MySQL shop in the past, but would prefer to contract [...]]]></description>
			<content:encoded><![CDATA[<p>I have a client who wants to build a new application with peak update volume of several million transactions per hour.  (Their base business is data mart outsourcing, but now they&#8217;re building update-heavy technology as well.) They have a small budget.  They&#8217;ve been a MySQL shop in the past, but would prefer to contract (not eliminate) their use of MySQL rather than expand it.</p>
<p>My client actually signed a deal for EnterpriseDB&#8217;s Postgres Plus Advanced Server and GridSQL, but unwound the transaction quickly. (They say EnterpriseDB was very gracious about the reversal.) There seem to have been two main reasons for the flip-flop.  First, it seems that EnterpriseDB&#8217;s version of Postgres isn&#8217;t up to PostgreSQL&#8217;s 8.4 feature set yet, although EnterpriseDB&#8217;s timetable for catching up might have been tolerable. Second, GridSQL apparently is further behind yet, with no timetable for up-to-date PostgreSQL compatibility.  That was the dealbreaker.</p>
<p>The current base-case plan is to use generic open source PostgreSQL, with scale-out achieved via hand sharding, Hibernate, or &#8230; ??? Experience and thoughts along those lines would be much appreciated.</p>
<p>Another route to OLTP performance and scale-out is of course a memory-centric product such as <a href="http://www.dbms2.com/2009/06/22/h-store-horizontica-voltdb/" >VoltDB</a> or <a href="http://www.dbms2.com/2009/07/28/the-groovy-sql-switch/" >the Groovy SQL Switch</a>.  But this client&#8217;s database is terabyte-scale, so hardware costs could be an issue, as of course could be product maturity.</p>
<p>By the way, a large fraction of these updates will be actual changes, as opposed to new records, in case that matters.  I expect that the schema being updated will be very simple &#8212; i.e., clearly simpler than in a classic order entry scenario.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2009/07/29/scaling-postgres-choices/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Exadata and Oracle Database Machine parallelization clarified</title>
		<link>http://www.dbms2.com/2008/09/28/exadata-oracle-database-machine-parallelization/</link>
		<comments>http://www.dbms2.com/2008/09/28/exadata-oracle-database-machine-parallelization/#comments</comments>
		<pubDate>Mon, 29 Sep 2008 02:41:34 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Parallelization]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=580</guid>
		<description><![CDATA[Some kind Oracle development managers have reached out and helped me better understand where Oracle does or doesn&#8217;t stand in query and analytic parallelization.  This post supersedes prior discussions of the subject over the past week.
Let&#8217;s start with the part everybody pretty much knows already:

There are two parts to a parallelization story &#8212; how [...]]]></description>
			<content:encoded><![CDATA[<p>Some kind Oracle development managers have reached out and helped me better understand where Oracle does or doesn&#8217;t stand in query and analytic parallelization.  This post supersedes prior discussions of the subject over the past week.<span id="more-580"></span></p>
<p>Let&#8217;s start with the part everybody pretty much knows already:</p>
<ul>
<li>There are two parts to a parallelization story &#8212; how you get data off of disk, and what you do with it once you have it.</li>
<li>To a first approximation, the best way to get a lot of data off of disk is in parallel, specifically with different CPUs talking to different disk drives.   Until last week&#8217;s announcement of <a href="http://www.dbms2.com/2008/09/24/oracle-exadata/" >Exadata</a>, Oracle was the most prominent holdout against this view.  (That dubious honor now goes to Sybase.)</li>
<li>If processing units are working in parallel to get data off disk, it is optimal to, on the same <em>node</em> that receives the data:
<ul>
<li>Do your <em>projections. </em>I.e., whittle down the data to just the columns you need.  (That&#8217;s if you have a row-based system, as Oracle does; projections are moot in column-based systems.)</li>
<li>Do your <em>selections.</em> I.e., execute whichever WHERE clauses can be handled before the joins <em>start.</em></li>
<li>Do any <em>joins</em> you can.  For example, you may be joining two tables that have been partitioned on the same <em>hash key,</em> or you may be joining a large <em>fact table</em> to a small <em>dimension table, </em>where the latter has been replicated across every node.</li>
</ul>
</li>
<li>Exadata is the first mainstream Oracle product that follows this optimal storage-facing parallelization strategy. Admittedly, opinions differ as to <a href="http://www.dbms2.com/2008/09/06/sans-vs-das-in-mpp-data-warehousing/" >whether rigid coupling of processors to specific disks is actually necessary</a>.  But after supporting one extreme (the disk part of <em>shared-everything</em>), Oracle with Exadata has gone to the other extreme (the disk part of <em>shared-nothing</em>).  Other vendors taking this approach include Teradata and Netezza.  Greenplum and Vertica are less extreme.</li>
<li>After accessing data in parallel and filtering it to the extent possible on the nodes that retrieved it, Oracle then ships the data to a conventional <em>Oracle database server.</em> That server does any further query processing, along with any other analytics.</li>
</ul>
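<p>To make the division of labor concrete, here is a rough Python sketch of the flow described in the list above: each storage node does its own selection and projection, and only the survivors get shipped to the database server. This is a toy model and not Oracle code; the function names, the table, and the sample rows are all made up for illustration.</p>

```python
# Toy model of storage-side filtering: each "storage node" owns a private
# slice of the table, applies selection and projection locally, and ships
# only the surviving rows and columns to a single "database server".
# All names and data here are hypothetical, for illustration only.

def scan_node(rows, predicate, columns):
    """Run on one storage node: filter rows, then keep only needed columns."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]

def run_query(node_slices, predicate, columns):
    """Run on the database server: gather the pre-filtered partial results."""
    shipped = []
    for rows in node_slices:  # in a real system these scans run in parallel
        shipped.extend(scan_node(rows, predicate, columns))
    return shipped

node_slices = [
    [{"id": 1, "region": "EU", "amount": 50, "note": "a"},
     {"id": 2, "region": "US", "amount": 900, "note": "b"}],
    [{"id": 3, "region": "US", "amount": 20, "note": "c"},
     {"id": 4, "region": "US", "amount": 700, "note": "d"}],
]

# Roughly: SELECT id, amount FROM sales WHERE region = 'US' AND amount >= 500
result = run_query(
    node_slices,
    lambda r: r["region"] == "US" and r["amount"] >= 500,
    ["id", "amount"],
)
print(result)
```

<p>The point of the sketch is only that the database server never sees the rows or columns that were filtered out on the storage side; anything more complicated than that (further joins, aggregation, analytics) would happen after the gather step.</p>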
<p>All that has been pretty clear from the get-go.  Less obvious has been: <strong>How does the Oracle database server process the data it receives from the Exadata component?  In particular, how parallel is the Oracle database server&#8217;s processing?</strong></p>
<p>It turns out that the answer has little to do with <a href="http://www.oracle.com/technology/products/database/clustering/pdf/twp_rac11g.pdf" onclick="javascript:pageTracker._trackPageview('/www.oracle.com');">Oracle Real Application Clusters (RAC)</a>. Indeed, it has so little to do with RAC that I&#8217;m wondering what RAC does to justify its &gt;10% share of overall <a href="http://www.dbms2.com/2008/09/28/oracle-exadata-list-pricing/" >Oracle Database Machine pricing</a>.  In particular, different CPUs generally do not share RAM or cache when doing what Oracle refers to as DSS (Decision Support System &#8212; an old term) work.  Thus &#8212; while I&#8217;m still not clear on all the specifics or exceptions! &#8212; it is generally fair to say that Oracle&#8217;s architecture on the database server side is akin to <em>shared-nothing.</em></p>
<p>Please note that we&#8217;re talking here about two different pools of CPUs &#8212; the ones built into the Exadata part of the system, in charge of talking to their own private disk drives, and the ones in the RAC cluster, which do non-basic query execution, along with the rest of the analytics.  Indeed, those two pools of CPUs could be of completely different brands and configurations, although at the moment they are similarly-named HP servers using identical Intel chips.   I.e., Oracle has moved into the <a href="http://www.dbms2.com/2008/09/05/mpp-data-warehouse-nodes/" >node heterogeneity</a> camp.  By way of contrast, the usual-suspect MPP vendors &#8212; Teradata, Netezza, Greenplum, Vertica, Aster Data, Paraccel, Exasol, DATAllegro &#8212; do most or all of their subsequent processing on the same nodes that retrieve data.  Thus, Oracle is the first major vendor for whom it is important to remember that <strong>different parts of a query plan get parallelized across completely distinct sets of processors and processing units.*</strong></p>
<p><em>*Yes, I know that each Netezza SPU (Snippet Processing Unit) couples a PowerPC chip and an FPGA (Field-Programmable Gate Array).  But that&#8217;s a very different thing from having your data access occur on 14 servers and having the initial results sent to a different set of 8 servers.</em></p>
<p>So with all that background, I&#8217;m finally ready to lay out what I&#8217;ve gleaned about Oracle query and analytic parallelization, whether from public materials or private discussions.</p>
<ul>
<li>Commercial database parallelization started in the mid-1990s.  Indeed, I was writing about it back in 1994, and it was part of the story in my Sybase Sell recommendation that year.  Ken Jacobs et al. explained to me at the time that in the choice between &#8220;static&#8221; and &#8220;dynamic&#8221; partitioning and parallelization, Oracle had opted for the &#8220;dynamic.&#8221;  I.e., all CPUs could look at all data, but there would be a form of parallel processing anyway.</li>
<li>Oracle introduced a feature called <strong>Parallel Query</strong> long ago.  Then, with the release of 10g, Oracle said in effect &#8220;Now we&#8217;ve removed the prior limitations and gotten Parallel Query right!&#8221;</li>
<li>Oracle tables or queries have to be <strong>explicitly enabled for parallelization.</strong> Degrees of parallelization (minimum and maximum processors devoted to a task) are also explicitly declared.  Naturally, there are defaults and administrative tools that make that all pretty automatic or else easy.  The default default, as it were, is for most or all available processors to be used in parallel.</li>
<li>Oracle also says it has <strong>parallelized a broad range of analytic functionality</strong> &#8212; data mining, OLAP, and so on, including generic UDFs (User-Defined Functions).</li>
</ul>
<p>So how good is all this parallel technology?  On the one hand, we know Oracle has been shipping it for a long time, and has it widely deployed.  On the other, we also know that Oracle performance has been very problematic for large parallel queries.  Surely most of those problems were due to the shared-disk bottleneck.  But were they all (or so close to all as not to matter)?  I don&#8217;t yet know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2008/09/28/exadata-oracle-database-machine-parallelization/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Response to Rita Sallam of Oracle</title>
		<link>http://www.dbms2.com/2008/06/28/response-to-rita-sallam-of-oracle/</link>
		<comments>http://www.dbms2.com/2008/06/28/response-to-rita-sallam-of-oracle/#comments</comments>
		<pubDate>Sat, 28 Jun 2008 08:33:27 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Benchmarks and POCs]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Parallelization]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=445</guid>
		<description><![CDATA[In a comment thread on Seth Grimes&#8217; blog, Rita Sallam of Oracle engaged in a passionate defense of her data warehousing software.  I&#8217;d like to take it upon myself to respond to a few of her points here.
If a shared disk architecture is not scalable, then how is it that Oracle consistently has more [...]]]></description>
			<content:encoded><![CDATA[<p>In a <a href="http://www.intelligententerprise.com/blog/archives/2008/02/data_warehouse.html" onclick="javascript:pageTracker._trackPageview('/www.intelligententerprise.com');">comment thread</a> on Seth Grimes&#8217; blog, Rita Sallam of Oracle engaged in a passionate defense of her data warehousing software.  I&#8217;d like to take it upon myself to respond to a few of her points here.<span id="more-445"></span></p>
<blockquote><p>If a shared disk architecture is not scalable, then how is it that Oracle consistently has more customers in the Winter Group top ten data warehouses than any other vendor?</p></blockquote>
<p>Unfortunately, the <a href="http://www.wintercorp.com/VLDB/2005_TopTen_Survey/TopTenWinners_2005.asp" onclick="javascript:pageTracker._trackPageview('/www.wintercorp.com');">Winter Corporation list</a> is a joke, which may be why it hasn&#8217;t been updated since 2005.  (I mean it &#8212; Dick Winter seems like too good a guy to keep publishing something so misleading indefinitely.)  It counts not just user data, but indices, aggregates, and everything else.  Based on that, I&#8217;d guess the largest Oracle site listed there to be at 10-20 terabytes of user data, and all the others to be at 5 TB or less.  Even assuming 3-5X database growth since the list was compiled, that puts Oracle behind Teradata, DATAllegro, Netezza, Dataupia, and probably SAS &#8212; just counting ones I can think of quickly &#8212; none of whom are actually represented on the list.  Teradata in particular blows Oracle away on warehouse size.</p>
<p>And by the way &#8212; the largest Oracle warehouse by far on that list is at Yahoo.  But <a href="http://www.dbms2.com/2008/05/29/yahoo-scales-web-analytics-database-petabyte/" >Oracle isn&#8217;t Yahoo&#8217;s major data warehouse software provider</a>.</p>
<blockquote><p>If a shared disk architecture is not scalable, then how is it that Oracle is the leader in Data Warehouse Performance. It is the TPC-H leader in the 300GB, 1TB, 3TB, and 30TB categories.</p></blockquote>
<p>TPCs are a joke too.  Oracle&#8217;s third-longest-serving exec (or maybe second-longest &#8212; I always forget whether he or Ken Jacobs has been there longer) e-mailed me a few years ago, asking for my help in making them go away.  Be that as it may:</p>
<ul>
<li>Oracle probably won the 30 TB TPC-H because it&#8217;s the only vendor to submit a result.</li>
<li>Oracle is the &#8220;leader&#8221; on the 10 TB TPC-H by 10% in price-performance, using a system that hasn&#8217;t shipped yet, over a system that has already shipped.  5 months is worth more than 10% in this day and age.  Anyhow, the other contenders are Microsoft and IBM, which may be why Oracle finds it reasonably easy to keep up.</li>
<li>Oracle trails Exasol by a factor of 27 &#8212; that is not a typo &#8212; on the 3 TB TPC-H.</li>
<li>Oracle trails Exasol by a factor of 20 &#8212; also not a typo &#8212; on the 1 TB TPC-H.</li>
<li>Oracle trails Kickfire by a factor of 14 &#8212; also not a typo &#8212; on the 300 GB TPC-H.</li>
</ul>
<blockquote><p>A shared disk architecture (Oracle) is more flexible. Since all processing units can see all data the system can at runtime decide what the degree of parallelism should be. In addition, some queries may be more efficiently run in serial (simple index lookups) in which case parallelism isn&#8217;t even used.</p></blockquote>
<p>Data warehouse appliances (at least the row-based ones) excel at fast table scans.</p>
<blockquote><p>Also, if individual servers in a cluster contain many CPUs (or cores) the parallelism can be co-located on the node. Hence, statements may run in parallel but do not require the interconnect to ship data.</p></blockquote>
<p>Appliance makers use multicore systems too.  Everybody does, these days.</p>
<blockquote><p>Oracles Shared Everything architecture provides the ability to dynamically optimize each query requirement. The current workload is examined and the degree of parallelism is adjusted rather than blindly starting with the same degree of parallelism every time. Therefore, the degree of parallelism is optimized for every query and there is no requirement for a minimal degree of parallelism across all nodes. Operations can run in parallel using one, some or all nodes of a Real Application cluster depending on the current workload, the characteristics and importance of the query.</p></blockquote>
<p>That&#8217;s all irrelevant to the chief benefit of parallelism.  Parallelism isn&#8217;t about optimizing the use of CPUs.  Parallelism is about optimizing the system where it&#8217;s actually bottlenecked, which is getting data off of disk.</p>
<blockquote><p>Parallelism is not related to the partitioning strategy of the data as in a shared-nothing environment.</p></blockquote>
<p>Parallelism isn&#8217;t particularly related to partitioning strategy in a shared-nothing environment either.  Kognitio offers a competitive shared-nothing system with no partitioning whatsoever.  And many queries on most vendors&#8217; systems relate to partitioning only in that the data is distributed so that approximately equal amounts of data may be found on each node.</p>
<p>True, since that&#8217;s done by hash partitioning, you try to pick hash keys so that you get lucky as often as possible, and benefit from the hash key when doing a hash join. And further partitioning can be added as an optimization.  But that&#8217;s hardly a disadvantage for shared-nothing systems vs. Oracle.</p>
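<p>For anybody who hasn&#8217;t seen hash distribution up close, a minimal Python sketch of the idea follows. It assumes four nodes, a made-up pair of tables, and <code>zlib.crc32</code> as a stand-in hash function; no vendor&#8217;s actual implementation is implied. When two tables are distributed on the same key, matching rows land on the same node, so the join can run node by node with nothing crossing the interconnect.</p>

```python
# Toy sketch of hash distribution and a co-located join. The node count,
# the table contents, and the use of zlib.crc32 as the hash are assumptions.
import zlib

NUM_NODES = 4

def node_for(key):
    """Pick a node from a stable hash of the distribution key."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_NODES

def distribute(rows, key):
    """Spread rows across nodes by hashing the chosen key column."""
    nodes = [[] for _ in range(NUM_NODES)]
    for row in rows:
        nodes[node_for(row[key])].append(row)
    return nodes

def colocated_join(left_nodes, right_nodes, key):
    """Join node by node; rows with equal keys always share a node."""
    joined = []
    for left, right in zip(left_nodes, right_nodes):
        lookup = {}
        for rrow in right:
            lookup.setdefault(rrow[key], []).append(rrow)
        for lrow in left:
            for rrow in lookup.get(lrow[key], []):
                joined.append({**lrow, **rrow})
    return joined

orders = [{"cust_id": c, "order_id": i} for i, c in enumerate([7, 8, 7, 9, 10])]
customers = [{"cust_id": c, "name": "cust%d" % c} for c in [7, 8, 9, 10]]

order_nodes = distribute(orders, "cust_id")
customer_nodes = distribute(customers, "cust_id")
joined = colocated_join(order_nodes, customer_nodes, "cust_id")
print(len(joined))  # every order finds its customer without crossing nodes
```

<p>If the two tables had been distributed on different keys, the node-by-node join above would silently miss matches, which is why a real system would have to redistribute one side first. That is the sense in which you &#8220;get lucky&#8221; when the hash key happens to be the join key.</p>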
<blockquote><p>With Oracles shared disk approach, fail over is built-in and the configuration remains balanced.</p></blockquote>
<p>If your disks are failing often enough for that to be more than a tiny benefit, you might want to consider changing your storage supplier.</p>
<blockquote><p>Oracle does not require re-organization of data. Oracle&#8217;s hash partitioning is also automatic and does not require re-distribution of data. The Oracle Optimizer automatically tunes queries. In addition the Oracle Database 10g ADDM, (Automatic Database Diagnostic Monitor) runs automatically to make performance recommendations. Index management is very simple in Oracle. The ADDM tool recommends indexes, generates script to create indexes and will run them with the DBA&#8217;s approval. Oracle also supports all types of data including stars, normalized and de-normalized data. Oracle supports Join Indexes and Aggregate Join Indexes.</p></blockquote>
<p>Somebody please remind me to start an international Scrabulous tournament for Oracle DBAs, since they have nothing else to occupy their time.</p>
<blockquote><p>In addition, Oracle supports superior concurrency and parallelism, Oracle can execute several queries at the same time (in parallel) without performance degradation. With Oracle&#8217;s model, there are several checkout counters that customers can use, which parallelizes the process and provides a higher throughput. Even if a customer at one counter takes a long time to checkout, other customers are not affected. Once all checkout counters are full, Oracle queues the remaining customers (queries) until the next checkout counter opens up and sends the next customer in line to the open counter. If this starts occurring consistently, and the processing of customers (queries) slows down, Oracle allows for more checkout counters to be added dynamically using RAC or by simply adding more CPUs to an environment.</p></blockquote>
<p>That one&#8217;s probably real.</p>
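<p>The checkout-counter picture is essentially a fixed pool of workers with a queue in front of it. A hedged Python sketch of just that queueing behavior, using the standard library&#8217;s <code>ThreadPoolExecutor</code>, is below. The counter count, the query names, and the sleep-based &#8220;queries&#8221; are invented for illustration, and nothing here models how Oracle actually schedules work.</p>

```python
# Toy model of the "checkout counter" analogy: a fixed number of workers
# (counters) serve queries; extra queries wait in the pool's internal queue.
# Query names and durations are hypothetical.
from concurrent.futures import ThreadPoolExecutor
import time

COUNTERS = 2  # assumed number of concurrent query slots

def run_query(name, seconds):
    """Stand-in for a query: just sleeps, then reports which query ran."""
    time.sleep(seconds)
    return name

queries = [("slow", 0.2), ("fast1", 0.01), ("fast2", 0.01), ("fast3", 0.01)]

with ThreadPoolExecutor(max_workers=COUNTERS) as pool:
    # All four queries are accepted at once; only COUNTERS run at a time.
    futures = [pool.submit(run_query, name, secs) for name, secs in queries]
    # Results are collected in submission order, not completion order.
    finished = [f.result() for f in futures]

print(finished)
```

<p>While the slow query ties up one counter, the fast ones can cycle through the other, which is the behavior the quoted passage describes. Adding counters corresponds to raising <code>max_workers</code>.</p>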
<p><strong><em>Related links:</em></strong></p>
<ul>
<li><a href="http://www.dbms2.com/2008/06/28/oracle-optimized-warehouse-initiative/" >Oracle&#8217;s Optimized Warehouse Initiative</a></li>
<li><a href="http://www.networkworld.com/community/node/29358" onclick="javascript:pageTracker._trackPageview('/www.networkworld.com');">4 reasons to reduce your Oracle usage</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2008/06/28/response-to-rita-sallam-of-oracle/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>
