<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; IBM and DB2</title>
	<atom:link href="http://www.dbms2.com/category/products-and-vendors/ibm-and-db2/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 02 Sep 2010 09:06:44 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>DB2 workload management</title>
		<link>http://www.dbms2.com/2010/08/18/ibm-db2-workload-management/</link>
		<comments>http://www.dbms2.com/2010/08/18/ibm-db2-workload-management/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 08:47:09 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Workload management]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2819</guid>
		<description><![CDATA[DB2 has added a lot of workload management features in recent releases. So when we talked Tuesday afternoon, Tim Vincent and I didn&#8217;t bother going through every one. Even so, we covered some interesting subjects in the area of DB2 workload management, including:  

If your goal is to keep a certain 	class of queries from [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;"><a href="../2009/04/24/some-db2-highlights/">DB2 has added a lot of workload management features in recent releases</a>. So when we talked Tuesday afternoon, Tim Vincent and I didn&#8217;t bother going through every one. Even so, we covered some interesting subjects in the area of DB2 workload management, including:  <span id="more-2819"></span></p>
<ul>
<li>If your goal is to keep a certain 	class of queries from taking too many resources, Tim thinks a great 	way of doing that is to control how many of them are allowed to run 	concurrently.</li>
<li>By way of contrast, Tim is 	cautious about the common approach of just lowering a query&#8217;s 	priority. His concern is that a long-running query could linger even 	longer, creating a long-lasting bottleneck in, for example, <a href="http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/" >temp 	space</a>.</li>
<li>When running over (I believe) 	Linux and AIX, DB2 workload management is integrated with operating 	system workload management. I.e., the same “service class” or 	“workload class” (at a guess, the former is the official term 	and the latter is the term that makes sense) of queries and 	associated processes gets the same treatment in both DB2 and the OS.</li>
<li>DB2&#8217;s workload management extends 	to buffer pools, to inhibit low-priority queries from evicting a 	higher-priority query&#8217;s data from cache.</li>
<li>Sometimes, workload management doesn&#8217;t throttle a query, but just decides to collect stats for future analysis. (This is on the eminently reasonable theory that the best stats to collect are the ones that are live when performance problems are actually occurring.)</li>
</ul>
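<p>To make the first bullet concrete, here is a minimal sketch of capping how many queries of one class run at once, as opposed to lowering their priority. It is not DB2 syntax or DB2&#8217;s mechanism; the names <code>MAX_CONCURRENT_HEAVY</code>, <code>run_heavy_query</code> and <code>run_all</code> are made up for illustration, and a sleep stands in for real work.</p>

```python
import threading
import time

# Hypothetical cap: at most 2 "heavy" queries may run at the same time.
MAX_CONCURRENT_HEAVY = 2
heavy_slots = threading.BoundedSemaphore(MAX_CONCURRENT_HEAVY)

running = 0   # heavy queries currently executing
peak = 0      # highest concurrency observed so far
lock = threading.Lock()


def run_heavy_query(query_id):
    """Stand-in for a heavy query: it waits for a free slot, then runs at full speed."""
    global running, peak
    with heavy_slots:              # blocks while the cap is reached
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)           # placeholder for the real work
        with lock:
            running -= 1


def run_all(n):
    """Submit n heavy queries at once and return the peak concurrency seen."""
    threads = [threading.Thread(target=run_heavy_query, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

<p>The point of the design is that an admitted query finishes as quickly as it can and releases whatever it holds (temp space, for example), while the excess queries wait outside rather than lingering inside at low priority.</p>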
<p style="margin-bottom: 0in;">Finally, Tim spoke of what I regard as the weirdest workload management requirement, one I also heard about from <a href="http://www.dbms2.com/2009/07/18/netezza-on-concurrency-and-workload-management/" >Netezza</a> <span style="font-style: normal;">(but didn&#8217;t explicitly mention) in</span> June. Sometimes, it seems, you simply don&#8217;t want queries to finish too fast. Why? Because if you give great performance when the machine is lightly loaded, then business users might expect that performance too when the machine is heavily loaded and you can&#8217;t deliver it. Apparently, in some environments it&#8217;s better to never deliver great query performance than it is to do so only inconsistently.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/18/ibm-db2-workload-management/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>More on temp space, compression, and &#8220;random&#8221; I/O</title>
		<link>http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/</link>
		<comments>http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 05:44:59 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2805</guid>
		<description><![CDATA[My PhD was in a probability-related area of mathematics (game theory), so I tend to squirm when something is described as &#8220;random&#8221; that clearly is not. That said, a comment by Shilpa Lawande on our recent Flash/temp space discussion suggests the following way of framing a key point:

You really, really want to have multiple data [...]]]></description>
			<content:encoded><![CDATA[<p>My PhD was in a probability-related area of mathematics (game theory), so I tend to squirm when something is described as &#8220;random&#8221; that clearly is not. That said, <a href="http://www.dbms2.com/2010/08/16/vertica-flash-temp-space/#comment-181134" >a comment by Shilpa Lawande</a> on our recent <a href="http://www.dbms2.com/2010/08/16/vertica-flash-temp-space/" >Flash/temp space discussion</a> suggests the following way of framing a key point:</p>
<ul>
<li>You really, really want to have multiple data streams coming out of temp space, as close to simultaneously as possible.</li>
<li>The storage performance characteristics of such a workload are more reminiscent of &#8220;random&#8221; than &#8220;sequential&#8221; I/O.</li>
</ul>
<p>If everybody else is cool with it too, I can live with that. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Meanwhile, I talked again with Tim Vincent of IBM this afternoon. Tim endorsed the temp space/Flash fit, but with a different emphasis, which upon review I find I don&#8217;t really understand. The idea is:</p>
<ul>
<li>Analytic DBMS processing generally stresses reads over writes.</li>
<li>Temp space is an exception &#8212; read and write use of temp space is pretty balanced. (You spool data out once, you read it back in once, and that&#8217;s the end of that; next time it will be overwritten.)</li>
</ul>
<p>My problem with that is: Flash typically has lower write than read IOPS (I/O per second), so being (relatively) write-intensive would, to a first approximation, seem if anything to disfavor a workload for Flash.</p>
<p>On the plus side, I was reminded of something I should have noted when I wrote about <a href="http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/" >DB2 compression</a> before:</p>
<p>Much like Vertica, <strong>DB2 operates on compressed data all the way through, including in temp space. </strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>ANTs Software CEO insults Sybase, claims migration success</title>
		<link>http://www.dbms2.com/2010/08/04/ants-software-ceo-insults-sybase-claims-migration-success/</link>
		<comments>http://www.dbms2.com/2010/08/04/ants-software-ceo-insults-sybase-claims-migration-success/#comments</comments>
		<pubDate>Wed, 04 Aug 2010 10:43:12 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ANTs Software]]></category>
		<category><![CDATA[Emulation, transparency, portability]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Sybase]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2734</guid>
		<description><![CDATA[Jeff Pryslak of Sybase put up a post insulting ANTs Software and the general idea of ANTs-aided Sybase-to-DB2 migration. CEO Joe Kozak of ANTs hit back with a rambling diatribe, which came to my attention because he mentioned my name in it, making some rather fanciful remarks about the &#8220;long&#8221; relationship I used to have [...]]]></description>
			<content:encoded><![CDATA[<p>Jeff Pryslak of Sybase put up a post <a href="http://blogs.sybase.com/database/2010/07/elephants-and-ants-a-corporate-fable/" onclick="javascript:pageTracker._trackPageview('/blogs.sybase.com');">insulting ANTs Software and the general idea of ANTs-aided Sybase-to-DB2 migration</a>. CEO Joe Kozak of ANTs hit back with <a href="http://antsblog.typepad.com/ants-software-blogs/2010/08/sybases-jeff-pryslak.html" onclick="javascript:pageTracker._trackPageview('/antsblog.typepad.com');">a rambling diatribe</a>, which came to my attention because he mentioned my name in it, making some rather fanciful remarks about the &#8220;long&#8221; relationship I used to have with ANTs Software. (I do recall at least one briefing, plus some attempts from them to buy my services under the condition that I agree to a ridiculous NDA, which I refused to sign.)</p>
<p>This piqued my interest, so &#8212; recalling that ANTs is a public company &#8212; I decided to take a look at just how successful their software products business is. Well, for the quarter ended March 31, 2010, <a href="http://sec.gov/Archives/edgar/data/796655/000115752310003343/a6298515.htm" onclick="javascript:pageTracker._trackPageview('/sec.gov');">ANTs&#8217; 10-Q filing says</a> (emphasis mine):  <span id="more-2734"></span></p>
<blockquote><p><strong>The Company’s revenues for the three months ended March 31, 2010 and 2009 include service revenues representing managed and professional service fees for database and network maintenance and support services. </strong> Revenues for the three months ended March 31, 2010 were $1.5 million, an increase of $0.1 million compared to $1.4 million for the three months ended March 31, 2009.  <strong>For the three months ended March 31, 2010, two customers accounted for 96% of the Company’s gross revenues </strong>(Company A, 72% and Company B, 24%) <strong>compared to three customers that accounted for 97% of the Company’s gross revenues for the three months ended March 31, 2009 (Company A, 57%, Company B, 29% and Company C, 10%). </strong>The increase in revenues for the three months ended March 31, 2010 over the comparable period in 2009 is primarily attributable to professional service projects for Company A that were initiated during or subsequent to the three months ended March 31, 2009, partially offset by professional service projects for Company B and Company C that were completed subsequent to March 31, 2009.</p>
<p>Conditional on the Company’s technology developments being successful, the presence of customer demand and the Company having a competitive advantage, <strong>future revenues may include sales and licenses of its ANTs Compatibility Server (“ACS”) product and managed services revenue </strong>related to existing and new contracts and professional services revenue from pre- and post-sales consulting related to ACS and other database consolidation technologies. <strong>Sales of the Company’s first ACS product, which translates from Sybase to Oracle, have been limited </strong>due to the structure of the sales arrangement and go-to-market strategy. As such, the Company has structured the go-to-market strategy for the second ACS product differently via the use of an Original Equipment Manufacturer (“OEM”) agreement. Pursuant to the OEM agreement, ANTs is responsible for technology development specifically tailored to the OEM’s needs. The OEM will assume responsibility for marketing, sales and support of the technology on a worldwide basis, while ANTs will be the preferred service provider for migration projects. The Company is currently in the process of developing the second ACS product for a planned announcement and release in mid-2010. The Company intends to develop additional ACS products based on market demand and the availability of resources for development.</p></blockquote>
<p>In other words, <strong>as of four months ago ANTs had had $0 in business in what it says is its main product area, </strong>which is pretty much the range the company has been in throughout its <a href="http://www.dbms2.com/2007/04/11/ants-software-is-finally-making-some-sense/" >complicated</a> <a href="http://www.dbms2.com/2008/06/20/derek-rodner-blasts-ants-software/" >history</a>.  Kozak&#8217;s post did link to a claim that IBM has experienced over 300 migrations to DB2. However, that figure includes <a href="http://www.dbms2.com/2009/04/24/ibms-oracle-emulation-strategy-reconsidered/" >Oracle-to-DB2 migrations</a> that have nothing to do with ANTs. And by the way, IBM&#8217;s migration strategy is focused largely on ISVs, so the whole Sybase-ANTs dust-up may be about a type of business (direct capture by DB2 of Sybase ASE enterprise customers) nobody&#8217;s sales force is seriously pursuing.</p>
<p>True, the Sybase-to-DB2 emulation technology hadn&#8217;t been released as of then. Even so, I think it&#8217;s a wee bit early for ANTs to be acting as if there&#8217;s been any proof it ever has had or will have any significant market success.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/04/ants-software-ceo-insults-sybase-claims-migration-success/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Flash is coming, well &#8230;</title>
		<link>http://www.dbms2.com/2010/06/25/flash-is-coming-well/</link>
		<comments>http://www.dbms2.com/2010/06/25/flash-is-coming-well/#comments</comments>
		<pubDate>Fri, 25 Jun 2010 16:42:26 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data integration and middleware]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Theory and architecture]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2389</guid>
		<description><![CDATA[I really, really wanted to title this post &#8220;Flash is coming in a flash.&#8221; That seems a little exaggerated &#8212; but only a little.

Netezza now intends to come out with a Flash-based appliance earlier than it originally expected.
Indeed, Netezza has suspended &#8212; by which I mean &#8220;scrapped&#8221; &#8212; prior plans for a RAM-heavy disk-based appliance. [...]]]></description>
			<content:encoded><![CDATA[<p>I really, really wanted to title this post &#8220;Flash is coming in a flash.&#8221; That seems a little exaggerated &#8212; but only a little.</p>
<ul>
<li>Netezza now intends to come out with a Flash-based appliance earlier than it originally expected.</li>
<li>Indeed, Netezza has suspended &#8212; by which I mean &#8220;scrapped&#8221; &#8212; prior plans for a RAM-heavy disk-based appliance. It will use a RAM/Flash combo instead.*</li>
<li>Tim Vincent of IBM told me that customers seem ready to adopt solid-state memory. One interesting comment he made is that Flash isn&#8217;t really all that much more expensive than high-end storage area networks.</li>
</ul>
<p>Uptake of solid-state memory (i.e. Flash) for analytic database processing will probably stay pretty low in 2010, but in 2011 it should be a notable (b)leading-edge technology, and it should get mainstreamed pretty quickly after that.  <span id="more-2389"></span></p>
<p><em>*So far as I can tell, that&#8217;s one of the two significant roadmap changes between the 2009 and 2010 editions of <a href="http://www.dbms2.com/2010/06/23/my-talk-this-morning/" >Enzee Universe</a>. The other one is that </em><em>the robust form of</em><em> appliance-to-appliance replication technology is coming out later than Netezza had originally planned and hoped.</em></p>
<p>There also is increasing reason to think that the issues with Flash memory wearing out are overwrought.  (And by the way, the entire history of enterprise solid-state memory use is basically shorter than the time in which these products supposedly will wear out, so it&#8217;s not as if there have been a lot of real-life failures out there.)</p>
<ul>
<li>First, clever things are being done in the area of error correction codes, although for the most part I defer that part of the discussion to Petascan&#8217;s Camuel Gilyadov. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  E.g., this seems to be the idea behind Anobit.</li>
<li>Second, analytic DBMS are pretty much an ideal use case for Flash reliability. Suppose, as is the case for many products and implementations, you only write things in big blocks. Then you are, ipso facto, resetting the Flash bits only in big blocks. Thus, at least in theory, you automatically have pretty perfect wear leveling.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/25/flash-is-coming-well/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>What kinds of data warehouse load latency are practical?</title>
		<link>http://www.dbms2.com/2010/06/21/data-warehouse-load-latency/</link>
		<comments>http://www.dbms2.com/2010/06/21/data-warehouse-load-latency/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 12:15:17 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2319</guid>
		<description><![CDATA[I took advantage of my recent conversations with Netezza and IBM to discuss what kinds of data warehouse load latency were practical. In both cases I got the impression:

Subsecond load latency is 	substantially impossible. Doing that amounts to OLTP.
5 seconds or so is doable with 	aggressive investment and tuning.
Several minute load latency is 	pretty easy.
10-15 [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I took advantage of my recent conversations with <a href="http://www.dbms2.com/2010/06/21/netezza-database-software-technology-overview/" >Netezza</a> and <a href="http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/" >IBM</a> to discuss what kinds of data warehouse load latency were practical. In both cases I got the impression:</p>
<ul>
<li>Subsecond load latency is 	substantially impossible. Doing that amounts to OLTP.</li>
<li>5 seconds or so is doable with 	aggressive investment and tuning.</li>
<li>Several minute load latency is 	pretty easy.</li>
<li>10-15 minute latency or longer is 	now very routine.</li>
</ul>
<p style="margin-bottom: 0in;">There&#8217;s generally a throughput/latency tradeoff, so if you want very low latency with good throughput, you may have to throw a lot of hardware at the problem.</p>
<p style="margin-bottom: 0in;">I&#8217;d expect to hear similar things from any other vendor with reasonably mature analytic DBMS technology. Low-latency load is a problem for columnar systems, but both <a href="http://www.dbms2.com/2008/08/12/vertica-paraccel-exasol/" >Vertica <span style="font-style: normal;">and</span> ParAccel</a> designed in workarounds from the get-go. Aster Data probably didn&#8217;t meet these criteria until <a href="http://www.dbms2.com/2009/10/30/aster-data-application-server-ncluster/" >Version 4.0</a>, its old “<a href="http://www.dbms2.com/2008/10/22/aster-data-systems-ncluster/" >frontline</a>” positioning notwithstanding, but I think it does now.</p>
<p style="margin-bottom: 0in;"><em><strong>Related link</strong></em></p>
<ul>
<li>
<p style="margin-bottom: 0in;"><a href="http://www.dbms2.com/2009/09/10/analytic-speed-latency/" >Just what is your need for speed</a> anyway?</p>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/21/data-warehouse-load-latency/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The Netezza and IBM DB2 approaches to compression</title>
		<link>http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/</link>
		<comments>http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 12:05:47 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Microsoft and SQL*Server]]></category>
		<category><![CDATA[Netezza]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2320</guid>
		<description><![CDATA[Thursday, I spent 3 ½ hours talking with 10 of Netezza&#8217;s more senior engineers. Friday, I talked for 1 ½ hours with IBM Fellow and DB2 Chief Architect Tim Vincent, and we agreed we needed at least 2 hours more. In both cases, the compression part of the discussion seems like a good candidate to [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Thursday, <a href="http://www.dbms2.com/2010/06/21/netezza-database-software-technology-overview/" >I spent 3 ½ hours talking with 10 of Netezza&#8217;s more senior engineers</a>. Friday, I talked for 1 ½ hours with IBM Fellow and DB2 Chief Architect Tim Vincent, and we agreed we needed at least 2 hours more. In both cases, the compression part of the discussion seems like a good candidate to split out into a separate post. So here goes.</p>
<p style="margin-bottom: 0in;">When you sell a row-based DBMS, as Netezza and IBM do, there are a couple of approaches you can take to compression. First, you can compress the blocks of rows that your DBMS naturally stores. Second, you can compress the data in a column-aware way. Both Netezza and IBM have chosen completely column-oriented compression, with no block-based techniques entering the picture to my knowledge. But that&#8217;s about as far as the similarity between Netezza and IBM compression goes.  <span id="more-2320"></span></p>
<p style="margin-bottom: 0in;"><strong>IBM&#8217;s basic DB2 compression strategy</strong> is remarkably simple. In every table (not column) – or in each range partition in a range-partitioned table &#8212; <strong>the 4096 most common* values are identified; these are all encoded into 12-bit strings</strong>. And that&#8217;s that. This has been happening since DB2 9.1, released 4 ½ years ago. DB2&#8217;s compression persists through logs, buffer pools (i.e., RAM cache), and so on. In DB2 9.7, the most recent release, IBM extended the use of the compression to a few areas it hadn&#8217;t stretched before, such as log-based replication, native XML, or CLOBs (Character Large OBjects) that happen not to be too big.</p>
<p style="margin-bottom: 0in;"><em>*Actually, I&#8217;d presume it&#8217;s not exactly the “most common”; there surely is some minimum length of a value to be encoded, or some bias toward length. Also, the determination of what to encode is probably a little imprecise. E.g., I forgot to ask whether the choice of values ever changes as data gets updated.</em></p>
<p style="margin-bottom: 0in;">The sophisticated part of DB2&#8217;s simple compression strategy is its breadth of applicability; DB2 compression can apply to:</p>
<ul>
<li>Values in columns (numeric, 	character, whatever)</li>
<li>Substrings of values in columns</li>
<li>Groups of columns (e.g., 	city/state/zip code)</li>
</ul>
<p style="margin-bottom: 0in;">Except for the 4096 values limit, that sounds at least as flexible as the <a href="http://www.dbms2.com/2009/05/14/the-secret-sauce-to-clearpaces-compression/" >Rainstor/Clearpace compression approach</a>.</p>
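<p>For readers who want to see the shape of the idea, here is a rough sketch of a 4096-entry dictionary whose codes fit in 12 bits. It is my illustration only, not DB2&#8217;s actual algorithm: it treats whole values as the unit (no substrings or column groups), picks entries by raw frequency, and the names <code>build_dictionary</code>, <code>encode</code> and <code>decode</code> are invented.</p>

```python
from collections import Counter

DICT_SIZE = 4096  # 2**12 entries, so every code fits in 12 bits


def build_dictionary(values, size=DICT_SIZE):
    """Map the `size` most frequent values to small integer codes (0 = most frequent)."""
    most_common = Counter(values).most_common(size)
    return {value: code for code, (value, _count) in enumerate(most_common)}


def encode(values, dictionary):
    """Replace dictionary hits with ('code', n); pass other values through as literals."""
    return [("code", dictionary[v]) if v in dictionary else ("lit", v) for v in values]


def decode(encoded, dictionary):
    """Invert encode() using the same dictionary."""
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[x] if kind == "code" else x for kind, x in encoded]
```

<p>With a city/state/zip string such as <code>"Boston MA 02110"</code> repeated many times, each repeat shrinks to one 12-bit code, which is the kind of win the groups-of-columns case is after. Values that miss the dictionary stay as literals.</p>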
<p style="margin-bottom: 0in;"><strong>Netezza,</strong> unlike IBM, takes a grab-bag approach to compression – try out a bunch of techniques, see which work best, and incorporate those in the product. <a href="http://www.enzeecommunity.com/blogs/nzblog/2008/05/15/issue-19-the-compress-engine-the-netezza-philosophy" onclick="javascript:pageTracker._trackPageview('/www.enzeecommunity.com');">Netezza first introduced compression a couple of years ago,</a> for numeric columns only, especially integer.  Techniques  used in Netezza numeric compression include but are not limited to:</p>
<ul>
<li>Delta compression, wherein you 	store the increment between a value and its predecessor rather than 	a whole new value.</li>
<li>Ways of indicating that a value or 	increment was just the same as in the row before.</li>
</ul>
<p style="margin-bottom: 0in;">This was via something called Compress Engine,* now being renamed to Compress Engine 1. Netezza&#8217;s new Compress Engine 2 improves on what Netezza did in Compress Engine 1 for numeric data, most notably by trimming away excess field length. (Netezza says it got 28% better compression on a test data set with almost no character strings, primarily from that enhancement.) Further, Netezza Compress Engine 2 adds new compression techniques, allowing it to handle VARCHAR – i.e. character strings &#8212; as well.</p>
<p style="margin-bottom: 0in;"><em>*Fortunately, the original name or at least description of “Compiled Tables” is retreating ever more from view.</em></p>
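<p>As a reminder of what the numeric techniques listed above amount to, here is a bare-bones sketch of delta compression. It is not Netezza&#8217;s implementation; the function names are mine, and a real engine would also pack the small increments into fewer bits, which this sketch leaves out.</p>

```python
def delta_encode(values):
    """Keep the first value, then store each increment over its predecessor."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)   # 0 means "same as the row before"
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by accumulating the increments."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out
```

<p>On a sorted or slowly changing integer column the increments are small numbers and frequent zeros, which is what makes them cheap to store.</p>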
<p style="margin-bottom: 0in;">Netezza&#8217;s Compress Engine 2 has two ways to compress character fields/text strings – <strong>prefix compression </strong><span style="font-weight: normal;">and </span><strong>Huffman coding.</strong> By way of contrast, Netezza tested suffix compression and decided it wasn&#8217;t beneficial enough to bother messing with.</p>
<ul>
<li>The idea behind prefix compression 	is that if two strings start with the same characters, for the 	second one you only have to record the part that&#8217;s different. Prefix 	compression has a lot of the same merits as delta compression; like 	delta compression, it works best on sorted columns. (An example of 	where prefix compression makes obvious sense is URLs, which tend to 	all start in similar ways.)</li>
<li>In Netezza&#8217;s version of Huffman coding, the alphabet is encoded symbol-by-symbol, with more common characters getting codes of shorter length. These codes are chosen on a column-by-column basis. (I presume the “/” character gets a shorter code in a URL column than it would, for example, in one that stored addresses.)</li>
</ul>
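<p>Both ideas are easy to sketch. What follows is textbook prefix compression and textbook Huffman code-length assignment, computed per column. I am assuming nothing about how Netezza actually lays this out on disk, and all the function names are invented for the example.</p>

```python
import heapq
import os.path
from collections import Counter


def prefix_encode(strings):
    """Encode each string as (length of prefix shared with the previous string, remainder)."""
    out = []
    prev = ""
    for s in strings:
        # os.path.commonprefix compares character by character, despite its module.
        n = len(os.path.commonprefix([prev, s]))
        out.append((n, s[n:]))
        prev = s
    return out


def prefix_decode(pairs):
    """Invert prefix_encode()."""
    out = []
    prev = ""
    for n, suffix in pairs:
        prev = prev[:n] + suffix
        out.append(prev)
    return out


def huffman_code_lengths(column_values):
    """Return {character: code length in bits} for the characters of one column."""
    freq = Counter("".join(column_values))
    if not freq:
        return {}
    if len(freq) == 1:
        return {ch: 1 for ch in freq}
    # Heap entries are (weight, tie_breaker, {char: depth in the tree so far}).
    heap = [(w, i, {ch: 0}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)   # the two lightest subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {ch: depth + 1 for ch, depth in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]
```

<p>Feed <code>huffman_code_lengths</code> a URL-like column and the slash comes out with one of the shortest codes; feed <code>prefix_encode</code> a sorted list of URLs and most entries reduce to a count plus a short tail.</p>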
<p style="margin-bottom: 0in;">While I didn&#8217;t ask explicitly, it seems pretty obvious that Compress Engine 2&#8217;s functionality is a strict superset of Compress Engine 1&#8217;s. <a href="http://www.dbms2.com/2010/06/21/netezza-silicon-balance/" >Netezza is going to run Compress Engines 1 and 2 side by side</a>, but expects pages to move from Compress Engine 1&#8217;s purview to Compress Engine 2&#8217;s as part of the new “table grooming” process.</p>
<p><em><strong>Related links</strong></em></p>
<ul>
<li>IBM kindly permitted me to post some of <a href="http://www.monash.com/uploads/ibm-db2-compression-june-2010.pdf" onclick="javascript:pageTracker._trackPageview('/www.monash.com');">its slides in the area of compression</a></li>
<li><a href="http://msdn.microsoft.com/en-us/library/cc280464.aspx" onclick="javascript:pageTracker._trackPageview('/msdn.microsoft.com');">Microsoft SQL Server seems to rely on prefix and dictionary compression</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Various quick notes</title>
		<link>http://www.dbms2.com/2010/05/23/various-quick-notes/</link>
		<comments>http://www.dbms2.com/2010/05/23/various-quick-notes/#comments</comments>
		<pubDate>Sun, 23 May 2010 08:38:51 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[GIS and geospatial]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[SAP AG]]></category>
		<category><![CDATA[SAS Institute]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2173</guid>
		<description><![CDATA[As you might imagine, there are a lot of blog posts I&#8217;d like to write I never seem to get around to, or things I&#8217;d like to comment on that I don&#8217;t want to bother ever writing a full post about. In some cases I just tweet a comment or link and leave it at [...]]]></description>
			<content:encoded><![CDATA[<p>As you might imagine, there are a lot of blog posts I&#8217;d like to write I never seem to get around to, or things I&#8217;d like to comment on that I don&#8217;t want to bother ever writing a full post about. In some cases I just <a href="http://twitter.com/CurtMonash" onclick="javascript:pageTracker._trackPageview('/twitter.com');">tweet</a> a comment or link and leave it at that.</p>
<p>And it&#8217;s not going to get any better. Next week = the oft-postponed elder care trip. Then I&#8217;m back for a short week. Then I&#8217;m off on my quarterly visit to the SF area. Soon thereafter I&#8217;ll have a lot to do in connection with <a href="http://www.netezza.com/userconference/speakers.html" onclick="javascript:pageTracker._trackPageview('/www.netezza.com');">Enzee Universe</a>. And at that point another month will have gone by.</p>
<p>Anyhow:<span id="more-2173"></span></p>
<ul>
<li>Back in January, Oracle finally briefed me on <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >Exadata 2</a>. I also requested and got permission to post what I regarded as pretty interesting slides, then never got around to doing so. Well, <a href="http://www.monash.com/uploads/Exadata-slides-January-2010.pdf" onclick="javascript:pageTracker._trackPageview('/www.monash.com');">here they are</a>. (Pay no attention to the word &#8220;Confidential&#8221;.)</li>
<li>Two people I have a lot of respect for, <a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/05/sap_and_inmemor.html" onclick="javascript:pageTracker._trackPageview('/intelligent-enterprise.informationweek.com');">Cindi Howson</a> and <a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/05/quick_takes_on.html" onclick="javascript:pageTracker._trackPageview('/intelligent-enterprise.informationweek.com');">Doug Henschen</a>, seem bullish on SAP&#8217;s in-memory NewDB efforts. But for a variety of execution reasons, I&#8217;m skeptical that this will matter for anything except SAP&#8217;s analytics suite. I.e., I don&#8217;t think anybody much except SAP will write OLTP apps to it, and I don&#8217;t think that without OLTP apps being written to it it&#8217;s much more than Business Objects&#8217; answer to QlikView.</li>
<li>I just learned that <a href="http://www.thestreet.com/story/10640248/1/tech-rights-give-companies-upper-hand.html" onclick="javascript:pageTracker._trackPageview('/www.thestreet.com');">Netezza&#8217;s previous geospatial technology didn&#8217;t get ported to TwinFin</a>. However, <a href="http://www.netezza.com/releases/2010/release021710.htm" onclick="javascript:pageTracker._trackPageview('/www.netezza.com');">Netezza obviously found a geospatial alternative</a>.</li>
</ul>
<p>I&#8217;m beginning to make a habit of asking vendors for a postable version of their slide decks. <a href="http://www.dbms2.com/2010/05/23/sybase-iq-15/" >Sybase IQ</a> is another example.</p>
<ul>
<li>Google is doing something called <a href="http://googlecode.blogspot.com/2010/05/bigquery-and-prediction-api-get-more.html" onclick="javascript:pageTracker._trackPageview('/googlecode.blogspot.com');">BigQuery</a> that is &#8220;SQL-like&#8221; for big data analytics. I don&#8217;t know anything about it.</li>
<li>I also don&#8217;t know anything about <a href="http://www-01.ibm.com/software/ebusiness/jstart/bigsheets/" onclick="javascript:pageTracker._trackPageview('/www-01.ibm.com');">IBM BigSheets</a> yet. It sounds something like <a href="http://www.dbms2.com/2010/04/16/introduction-to-datameer/" >Datameer</a>, but that could be way off the mark.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/05/23/various-quick-notes/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>IBM puts Cast Iron Systems out of its misery</title>
		<link>http://www.dbms2.com/2010/05/03/ibm-puts-cast-iron-systems-out-of-its-misery/</link>
		<comments>http://www.dbms2.com/2010/05/03/ibm-puts-cast-iron-systems-out-of-its-misery/#comments</comments>
		<pubDate>Mon, 03 May 2010 16:02:28 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Cast Iron Systems]]></category>
		<category><![CDATA[Data integration and middleware]]></category>
		<category><![CDATA[EAI, EII, ETL, ELT, ETLT]]></category>
		<category><![CDATA[IBM and DB2]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2024</guid>
		<description><![CDATA[Long ago, the first enterprise application integration (EAI) vendors offered pairwise integrations between different specific packaged applications. That was, for example, what was going on at Katrina Garnett&#8217;s Crossworlds/Crossroads, which eventually became one of IBM&#8217;s first data integration software acquisitions. Years later, Cast Iron Systems tried what seemed to be pretty much the same thing, [...]]]></description>
			<content:encoded><![CDATA[<p>Long ago, the first enterprise application integration (EAI) vendors offered pairwise integrations between different specific packaged applications. That was, for example, what was going on at Katrina Garnett&#8217;s Crossworlds/Crossroads, which eventually became one of IBM&#8217;s first data integration software acquisitions. Years later, Cast Iron Systems tried what seemed to be pretty much the same thing, only <a href="http://www.dbms2.com/2007/04/26/more-on-cast-iron-systems/" >better implemented</a>. Recently, however, Cast Iron has been pretty hard to get a hold of, and I also couldn&#8217;t find anybody (competitor, friend of management, whatever) who believed Cast Iron was doing particularly well. So today&#8217;s news that <strong>IBM is acquiring Cast Iron Systems</strong> comes as no big surprise.</p>
<p><span id="more-2024"></span>Cast Iron sold an integration appliance, most focused on <a href="http://www.dbms2.com/2008/03/21/cast-iron-systems-focuses-on-saas-data-integration/" >integrations that involved SaaS applications such as Salesforce</a>, with an option for doing all this purely in the <a href="http://www.dbms2.com/2008/10/09/cloud-data-integration/" >cloud</a>. IBM is accordingly spinning Cast Iron as a major cloud player, which is something of an exaggeration.</p>
<p>IBM will surely get value from whatever specific connectors Cast Iron does a better job at than IBM&#8217;s current offerings do. What I&#8217;m more curious about is whether Cast Iron&#8217;s core technology will survive in a form that continues its core message of &#8220;simplicity, simplicity, simplicity.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/05/03/ibm-puts-cast-iron-systems-out-of-its-misery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on IBM&#8217;s anti-Oracle announcements</title>
		<link>http://www.dbms2.com/2010/04/07/ibm-anti-oracle-announcements/</link>
		<comments>http://www.dbms2.com/2010/04/07/ibm-anti-oracle-announcements/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 15:28:15 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Solid-state memory]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1854</guid>
		<description><![CDATA[IBM is putting out a couple of press releases today that are obviously directed competitively at Oracle/Sun, and more specifically at Oracle&#8217;s Exadata-centric strategy. I haven&#8217;t been briefed, so I just have those to go on.
On the whole, the releases look pretty lame. Highlights seem to include:

Maybe a claim of enhanced data compression.
Otherwise, no obvious [...]]]></description>
			<content:encoded><![CDATA[<p>IBM is putting out a couple of press releases today that are obviously directed competitively at Oracle/Sun, and more specifically at Oracle&#8217;s <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/" >Exadata-centric strategy</a>. I haven&#8217;t been briefed, so I just have those to go on.</p>
<p>On the whole, the releases look pretty lame. Highlights seem to include:</p>
<ul>
<li>Maybe a claim of enhanced data compression.</li>
<li>Otherwise, no obvious new technology except product packaging and bundling.</li>
<li>Aggressive plans to throw capital at the Sun channel to convert it to selling IBM gear. (A figure of $1/2 billion is mentioned, for financing.)</li>
</ul>
<p>Disappointingly, IBM shows a lot of confusion between:</p>
<ul>
<li>Text data</li>
<li>Machine-generated data such as that from sensors</li>
</ul>
<p>While both are highly important, those are <a href="http://www.dbms2.com/2010/01/17/three-broad-categories-of-data/" >very different things</a>. IBM has not in the past shown much impressive technology in either of those two areas, and based on these releases, I presume that trend is continuing.</p>
<p><em>Edits: </em></p>
<p><em>I see from press coverage that at least one new IBM model has some Fusion I/O solid-state memory boards in it. Makes sense.</em></p>
<p><em>A Twitter hashtag has a number of observations from the event. Not much substance I could detect except various kinds of <a href="http://twitter.com/#search?q=%23ibmsmartsys" onclick="javascript:pageTracker._trackPageview('/twitter.com');">Oracle bashing</a>.<br />
</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/07/ibm-anti-oracle-announcements/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Quick news, links, comments, etc.</title>
		<link>http://www.dbms2.com/2010/03/27/quick-news-links-comments-etc/</link>
		<comments>http://www.dbms2.com/2010/03/27/quick-news-links-comments-etc/#comments</comments>
		<pubDate>Sat, 27 Mar 2010 04:59:30 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Akiban]]></category>
		<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Fox and MySpace]]></category>
		<category><![CDATA[Games and virtual worlds]]></category>
		<category><![CDATA[Groovy Corporation]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[SAP AG]]></category>
		<category><![CDATA[Theory and architecture]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1775</guid>
		<description><![CDATA[Some notes based on what I&#8217;ve been reading recently:

Tom Foremski outlined the dire (at least in theory) privacy risks of geolocation services, going into a lot more detail on that point than I ever have. However, he topped that off with the odd claim that people pay tolls (rather than using an electronic service) to [...]]]></description>
			<content:encoded><![CDATA[<p>Some notes based on what I&#8217;ve been reading recently:<span id="more-1775"></span></p>
<ul>
<li>Tom Foremski outlined the dire (at least in theory) <a href="http://www.siliconvalleywatcher.com/mt/archives/2010/03/geo_loco_and_pr.php" onclick="javascript:pageTracker._trackPageview('/www.siliconvalleywatcher.com');">privacy risks of geolocation services</a>, going into a lot more detail on that point than <a href="http://www.dbms2.com/2010/01/31/data-based-snooping-threat-libert/" >I ever have</a>. However, he topped that off with the odd claim that people pay tolls (rather than using an electronic service) to cross the Bay Bridge because they fear being tracked, rather than for reasons of time or money.</li>
<li>Oracle had an earnings conference call. <a href="http://blogs.zdnet.com/BTL/?p=32389" onclick="javascript:pageTracker._trackPageview('/blogs.zdnet.com');">Larry Dignan</a> did a good job of covering the highlights; the gory details are on the <a href="http://seekingalpha.com/article/195696-oracle-f3q10-qtr-end-02-28-10-earnings-call-transcript?page=1" onclick="javascript:pageTracker._trackPageview('/seekingalpha.com');">Seeking Alpha</a> transcript, especially pp. 3-5.  Oracle now claims to be getting lots of multi-system deals for Exadata. (But I still haven&#8217;t seen much in the way of production customers named.) ULAs, which I presume are Unlimited License Agreements, are important on the software side. Besides picking on IBM and SAP, Oracle even touted a competitive win vs. EMC, which not coincidentally seems to be working on partnering with almost every Oracle competitor it can find.</li>
<li>Brian Prentice of Gartner basically <a href="http://blogs.gartner.com/brian_prentice/2010/03/23/open-sources-reality-distortion-field/" onclick="javascript:pageTracker._trackPageview('/blogs.gartner.com');">accused open source</a> of being Dotcom 2.0, in terms of dubious business models and the hype associated with same. I agree with many of his particulars, and indeed often steer vendor clients away from open source strategies. For marketing purposes, I do feel that sometimes <a href="http://www.dbms2.com/2009/10/19/greenplum-free-single-node-edition/" >free can be a real cool price</a>; but open source is not the only way to be free.</li>
<li><a href="http://www.dbms2.com/2010/03/22/akibanakiba/" >Akiban</a>, which I wrote about a couple of days ago, seems to be building out its <a href="http://akiban.com/" onclick="javascript:pageTracker._trackPageview('/akiban.com');">website</a>. As of this writing the website is still pretty raw, with bewildering messaging, carelessly repeated paragraphs, and a notable lack of clues as to who&#8217;s in company leadership. Even so, unless I missed some of the current stuff before, the site has come a long way in a few days, so maybe there&#8217;s hope.</li>
<li>Groovy Corporation, which introduced the <a href="../2009/07/28/the-groovy-sql-switch/">Groovy SQL Switch</a> just last summer, seems to be doing something different now. It&#8217;s merged into a company called uCirrus (where the u is really a mu), but uCirrus doesn&#8217;t have a meaningful website yet, whereas <a href="http://www.groovycorp.com/index.php" onclick="javascript:pageTracker._trackPageview('/www.groovycorp.com');">Groovy</a> does. There&#8217;s stuff there about a &#8220;push data cloud,&#8221; stressing the importance of not being a DBMS, under the name Cortex, whatever that all means. Groovy seems to have an online gaming deal for Cortex with MySpace, or maybe Cortex is just the name of a specific Groovy/MySpace project.</li>
<li>Mike Mooney offered a long rant on <a href="http://mooneyblog.mmdbsolutions.com/" onclick="javascript:pageTracker._trackPageview('/mooneyblog.mmdbsolutions.com');">the problems with database (design) version control</a>. He did concede that the most recent Microsoft Visual Studio might help, for those who are bought into (and can afford) the Microsoft stack. Frankly, I think that&#8217;s what views are for, updatable or otherwise. In many cases, they&#8217;ll let you build what you need, quickly and without breaking anything, and you can leave it to the DBAs to sort out database performance later.</li>
<li>I just discovered <a href="http://www.chadpluspl.us/" onclick="javascript:pageTracker._trackPageview('/www.chadpluspl.us');">Chad Stewart&#8217;s programming blog</a>. While he&#8217;s evidently a game programmer, a lot of his comments have broader applicability.</li>
<li>Chip Hazard offered a VC&#8217;s perspectives on <a href="http://hazard.typepad.com/hazard-lights/2010/02/quick-reminder-of-the-challenges-and-opportunities-in-enterprise-it.html" onclick="javascript:pageTracker._trackPageview('/hazard.typepad.com');">the difficulties facing enterprise IT startups</a>. (Hat tip to Miriam Tuerk for turning me on to him.) Although he didn&#8217;t phrase it this way, his bottom line (at least the part I agree with) is that the startup&#8217;s products have to be amazingly superior to the alternatives (big vendors or in-house).</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/03/27/quick-news-links-comments-etc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
