<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; Data warehousing</title>
	<atom:link href="http://www.dbms2.com/category/analytics-technologies/data-warehouse/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 02 Sep 2010 09:06:44 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>DB2 workload management</title>
		<link>http://www.dbms2.com/2010/08/18/ibm-db2-workload-management/</link>
		<comments>http://www.dbms2.com/2010/08/18/ibm-db2-workload-management/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 08:47:09 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Workload management]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2819</guid>
		<description><![CDATA[DB2 has added a lot of workload management features in recent releases. So when we talked Tuesday afternoon, Tim Vincent and I didn&#8217;t bother going through every one. Even so, we covered some interesting subjects in the area of DB2 workload management, including:  

If your goal is to keep a certain 	class of queries from [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;"><a href="../2009/04/24/some-db2-highlights/">DB2 has added a lot of workload management features in recent releases</a>. So when we talked Tuesday afternoon, Tim Vincent and I didn&#8217;t bother going through every one. Even so, we covered some interesting subjects in the area of DB2 workload management, including:  <span id="more-2819"></span></p>
<ul>
<li>If your goal is to keep a certain 	class of queries from taking too many resources, Tim thinks a great 	way of doing that is to control how many of them are allowed to run 	concurrently.</li>
<li>By way of contrast, Tim is 	cautious about the common approach of just lowering a query&#8217;s 	priority. His concern is that a long-running query could linger even 	longer, creating a long-lasting bottleneck in, for example, <a href="http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/" >temp 	space</a>.</li>
<li>When running over (I believe) 	Linux and AIX, DB2 workload management is integrated with operating 	system workload management. I.e., the same “service class” or 	“workload class” (at a guess, the former is the official term 	and the latter is the term that makes sense) of queries and 	associated processes gets the same treatment in both DB2 and the OS.</li>
<li>DB2&#8217;s workload management extends 	to buffer pools, to inhibit low-priority queries from evicting a 	higher-priority query&#8217;s data from cache.</li>
<li>Sometimes, workload management doesn&#8217;t throttle a query, but just decides to collect stats for future analysis. (This is on the eminently reasonable theory that the best stats to collect are the ones that are live when performance problems are actually occurring.)</li>
</ul>
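<p>To make the first bullet concrete, here is a minimal Python sketch of capping how many queries of one class run at once. It is an illustration only, not DB2&#8217;s actual mechanism; the <code>ConcurrencyGate</code> class and the <code>fake_query</code> stand-in are hypothetical names.</p>

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyGate:
    """Admit at most `limit` queries of one class at a time (hypothetical helper)."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0

    def run(self, query_fn, *args):
        # Block until a slot is free; a waiting query holds no temp space yet.
        with self._slots:
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return query_fn(*args)
            finally:
                with self._lock:
                    self.running -= 1


def fake_query(n):
    # Stand-in for real query work.
    return sum(range(n))


gate = ConcurrencyGate(limit=2)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda n: gate.run(fake_query, n), [10, 100, 1000, 10000]))
```

<p>The point of the design is that an admitted query runs at full speed and gets out of the way, whereas a merely deprioritized query keeps holding whatever it has already grabbed.</p>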
<p style="margin-bottom: 0in;">Finally, Tim spoke of what I regard as the weirdest workload management requirement, one I also heard about from <a href="http://www.dbms2.com/2009/07/18/netezza-on-concurrency-and-workload-management/" >Netezza</a> <span style="font-style: normal;">(but didn&#8217;t explicitly mention) in</span> June. Sometimes, it seems, you simply don&#8217;t want queries to finish too fast. Why? Because if you give great performance when the machine is lightly loaded, then business users might expect that performance too when the machine is heavily loaded and you can&#8217;t deliver it. Apparently, in some environments it&#8217;s better to never deliver great query performance than it is to do so only inconsistently.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/18/ibm-db2-workload-management/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>More on temp space, compression, and &#8220;random&#8221; I/O</title>
		<link>http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/</link>
		<comments>http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 05:44:59 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[IBM and DB2]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2805</guid>
		<description><![CDATA[My PhD was in a probability-related area of mathematics (game theory), so I tend to squirm when something is described as &#8220;random&#8221; that clearly is not. That said, a comment by Shilpa Lawande on our recent Flash/temp space discussion suggests the following way of framing a key point:

You really, really want to have multiple data [...]]]></description>
			<content:encoded><![CDATA[<p>My PhD was in a probability-related area of mathematics (game theory), so I tend to squirm when something is described as &#8220;random&#8221; that clearly is not. That said, <a href="http://www.dbms2.com/2010/08/16/vertica-flash-temp-space/#comment-181134" >a comment by Shilpa Lawande</a> on our recent <a href="http://www.dbms2.com/2010/08/16/vertica-flash-temp-space/" >Flash/temp space discussion</a> suggests the following way of framing a key point:</p>
<ul>
<li>You really, really want to have multiple data streams coming out of temp space, as close to simultaneously as possible.</li>
<li>The storage performance characteristics of such a workload are more reminiscent of &#8220;random&#8221; than &#8220;sequential&#8221; I/O.</li>
</ul>
<p>If everybody else is cool with it too, I can live with that. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Meanwhile, I talked again with Tim Vincent of IBM this afternoon. Tim endorsed the temp space/Flash fit, but with a different emphasis, which upon review I find I don&#8217;t really understand. The idea is:</p>
<ul>
<li>Analytic DBMS processing generally stresses reads over writes.</li>
<li>Temp space is an exception &#8212; read and write use of temp space is pretty balanced. (You spool data out once, you read it back in once, and that&#8217;s the end of that; next time it will be overwritten.)</li>
</ul>
<p>My problem with that is: Flash typically has lower write than read IOPS (I/O per second), so being (relatively) write-intensive would, to a first approximation, seem if anything to disfavor a workload for Flash.</p>
<p>On the plus side, I was reminded of something I should have noted when I wrote about <a href="http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/" >DB2 compression</a> before:</p>
<p>Much like Vertica, <strong>DB2 operates on compressed data all the way through, including in temp space. </strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/18/more-on-temp-space-compression-and-random-io/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Vertica&#8217;s innovative architecture for Flash, plus more about temp space than you perhaps wanted to know</title>
		<link>http://www.dbms2.com/2010/08/16/vertica-flash-temp-space/</link>
		<comments>http://www.dbms2.com/2010/08/16/vertica-flash-temp-space/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 08:07:33 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Columnar database management]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Database compression]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Vertica Systems]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2788</guid>
		<description><![CDATA[Vertica is announcing:

Technology it already has 	released*, but has not published any reference architectures 	for
A 	Barney partnership**

In other words, Vertica has succumbed to the common delusion that it&#8217;s a good idea to put out half-baked press releases the week of TDWI conferences. But if we look past that kind of all-too-common nonsense, Vertica is highlighting [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Vertica is announcing:</p>
<ul>
<li>Technology it already has 	released*, but has not published any reference architectures 	for</li>
<li><span style="font-style: normal;">A 	<a href="http://www.strategicmessaging.com/barney-partnerships/2010/08/12/" onclick="javascript:pageTracker._trackPageview('/www.strategicmessaging.com');">Barney</a> partnership**</span></li>
</ul>
<p style="margin-bottom: 0in;">In other words, Vertica has succumbed to the common delusion that it&#8217;s a good idea to put out half-baked press releases the week of TDWI conferences. But if we look past that kind of all-too-common nonsense, Vertica is highlighting an interesting technical story, about <strong>how the analytic DBMS industry can exploit solid-state memory technology.</strong></p>
<p style="margin-bottom: 0in;"><em>*Upgrades to <a href="../2009/08/04/flexstore-and-the-rest-of-vertica-35/">Vertica FlexStore</a> to handle Flash memory, actually released as part of <a href="../2010/02/22/vertica-4/">Vertica 4.0</a></em></p>
<p style="margin-bottom: 0in;"><em>** With Fusion I/O</em></p>
<p style="margin-bottom: 0in;">To set the context, let&#8217;s recall a few points I&#8217;ve noted in the past:</p>
<ul>
<li><a href="../2010/01/31/flash-pcmsolid-state-memory-disk/">Solid-state 	memory&#8217;s price/throughput tradeoffs obviously make it the future of 	database storage</a>.</li>
<li><a href="../2010/06/25/flash-is-coming-well/">The 	Flash future is coming soon</a>, in part because Flash&#8217;s propensity 	to wear out is overstated. This is especially true in the case of 	modern analytic DBMS, which tend to write to blocks all at once, and 	most particularly the case for append-only systems such as Vertica.</li>
<li><a href="../2010/08/12/teradata-future-product-strategy/">Being 	able to intelligently split databases among various cost tiers of 	storage – e.g. Flash and disk – makes a whole lot of sense</a>.</li>
</ul>
<p style="margin-bottom: 0in;">Taken together, those points tell us:</p>
<p style="margin-bottom: 0in;"><strong>For optimal price/performance, analytic DBMS should support databases that run part on Flash, part on disk.</strong></p>
<p style="margin-bottom: 0in;">While all this is a future for some other analytic DBMS vendors, Vertica is shipping it today.* What&#8217;s more, three aspects of Vertica&#8217;s architecture make it particularly well-suited for hybrid Flash/disk storage, in each case for a similar reason – you can get most of the performance benefit of all-Flash for a relatively low actual investment in Flash chips:  <span id="more-2788"></span></p>
<ul>
<li><strong>Vertica lets you split tables 	by column, </strong><span style="font-weight: normal;">and Vertica 	FlexStore is versatile enough to let you put only the most-used 	columns in Flash. (Vertica offers a figure that 85% of usage calls 	on only 15% of columns, but I don&#8217;t know how rigorously grounded 	those numbers are.)</span></li>
<li>To the extent that Vertica data is <a href="../2008/09/24/vertica-finally-spells-out-its-compression-claims/">more compressed</a> than many of Vertica&#8217;s competitors&#8217; (which it probably is, debates over the magnitude of Vertica&#8217;s advantage notwithstanding), the total storage-hardware cost of sticking stuff in Flash is less when you use Vertica than with other systems.</li>
<li>Vertica has <span style="font-weight: normal;">relatively 	less need for </span><strong>temp space</strong> than some other systems. 	(Vertica uses figures of &lt;20% of total storage, vs. 30%+ for some 	other systems.) If you want to use Flash for temp space, so as to 	accelerate your toughest queries, that can save you some cash …</li>
<li>… and by the way, <strong>temp space 	is an especially good use of Flash, </strong>because <strong>temp space is 	accessed in a less sequential manner than data storage is.</strong></li>
</ul>
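<p>The column-splitting argument can be sketched in a few lines of Python. This is a toy greedy placement under my own assumptions, not how FlexStore decides anything; the function name, the column names, and every number below are made up for illustration.</p>

```python
def choose_flash_columns(columns, flash_budget):
    """columns: list of (name, size, accesses). Returns (names placed in Flash, space used)."""
    # Rank by accesses per unit of storage, so small, hot columns go first.
    ranked = sorted(columns, key=lambda c: c[2] / c[1], reverse=True)
    chosen, used = [], 0
    for name, size, _accesses in ranked:
        if used + size > flash_budget:
            continue  # does not fit; leave this column on disk
        chosen.append(name)
        used += size
    return chosen, used


# Hypothetical table: (column name, size in arbitrary units, access count).
columns = [
    ("order_date", 40, 500),
    ("customer_id", 80, 300),
    ("comment_text", 600, 20),
    ("amount", 80, 350),
    ("ship_notes", 400, 5),
]
flash_columns, flash_used = choose_flash_columns(columns, flash_budget=200)

total_accesses = sum(c[2] for c in columns)
flash_accesses = sum(c[2] for c in columns if c[0] in flash_columns)
flash_share = flash_accesses / total_accesses
```

<p>With this made-up data, a Flash budget of one sixth of the table covers most of the accesses, which is the shape of the claim Vertica is making.</p>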
<p style="margin-bottom: 0in;">The least obvious of those points are about temp space; I only understood the particulars when Vertica development chief Shilpa Lawande explained them to me Thursday.</p>
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;"><em>* At least in theory; customer adoption may be a different matter.</em></p>
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;">But before drilling down on temp space, let me first note that there&#8217;s one offsetting factor to all those “We need somewhat less Flash than the other guys” Vertica advantages. Like all serious databases, a Vertica installation keeps two or more copies of all data, so that there&#8217;s no single point of failure in storage. In a flexible system like Vertica, you can put one copy on Flash and one on disk. But if you do that in Vertica, you forgo fully exploiting one possible benefit of Vertica&#8217;s architecture – the ability to store different copies of a column in different orders, which are beneficial for accelerating different groups of queries.*</p>
<p style="margin-bottom: 0in;"><em>*More precisely, you don&#8217;t get the full benefits of Flash acceleration for every query touching those columns.</em></p>
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;">OK. Back to temp space. There are four kinds of things you can put in storage if you&#8217;re running a database management system:</p>
<ul>
<li>The <strong>software</strong> itself.</li>
<li><span style="font-weight: normal;">Persistent </span><strong>data. </strong><span style="font-weight: normal;">(I.e., tables, 	if the DBMS you&#8217;re running is relational.)</span></li>
<li><strong>Metadata,</strong> especially the 	kind that lets you find data &#8211;<strong> indexes,</strong> zone maps, catalogs, 	etc.</li>
<li><strong>Temporary data constructs</strong> built as part of, say, a sort-merge join. These, by definition, are what populate temp space.</li>
</ul>
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;">Just to be clear, those constructs are NOT temporary tables of the sort created by, say, Microstrategy; such tables are handled like any other data. Rather, they are ephemeral creations and, so far as I can tell, not tables at all.</p>
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;">Vertica offered two theories as to why its DBMS requires less temp space than competitors do:</p>
<ul>
<li>To the extent data is decompressed before being operated on in memory by the DBMS, it would of course sit decompressed in temp space as well. Vertica prides itself on <strong>keeping data compressed</strong> all the way through, and seems to get away with smaller temp space allocations as a benefit.</li>
<li>Since Vertica can store columns in 	expedient sort orders, it does less sorting overall, and sorting is 	a big use of temp space.</li>
</ul>
<p style="margin-bottom: 0in;">Obviously, no matter which DBMS you use, the amount of temp space you need is surely workload-dependent. Even so, Vertica&#8217;s claim to something of an advantage seems legit.</p>
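<p>As a rough illustration of what &#8220;keeping data compressed all the way through&#8221; can mean, here is a small Python sketch that aggregates a run-length-encoded column without ever expanding it. Run-length encoding is my choice of example; I am not claiming it is the specific encoding at work in any given Vertica temp-space operation, and the function names are hypothetical.</p>

```python
def rle_encode(values):
    """Collapse consecutive repeats into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_sum(runs):
    # Aggregate directly on the runs: one multiply per run, not one add per row.
    return sum(value * length for value, length in runs)


# A sorted column compresses into few runs, so any intermediate copy stays small too.
column = [5, 5, 5, 5, 7, 7, 9, 9, 9]
runs = rle_encode(column)
total = rle_sum(runs)
```

<p>Nine rows become three runs here, and the sum never needs the nine-row form. The same example hints at the second bullet: the column only compresses this well because it is already stored in sorted order.</p>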
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;"><em>Truth be told, I&#8217;m not convinced the savings involved are great enough to </em>matter<em> a whole lot – but it&#8217;s a fun subject to think through. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </em></p>
<p style="margin-bottom: 0in;">
<p style="margin-bottom: 0in;">And finally: One of my biggest surprises since starting to look at analytic-DBMS-on-Flash has been the centrality of temp space. Talking to Vertica Thursday, I finally uncovered a key reason why: <strong>Temp space tends to be accessed via multiple streams of data at once.</strong> I&#8217;m still struggling with WHY that is true, with two reasons suggested being:</p>
<ul>
<li>Temp space can be accessed by 	multiple operations at once. (But isn&#8217;t that also true of the rest 	of storage?)</li>
<li>Merge sorts, a common use of temp 	space, read multiple streams of data. (Couldn&#8217;t you tweak your 	software to make that not be true?)</li>
</ul>
<p style="margin-bottom: 0in;">But if we grant that temp space naturally is accessed in multiple places at once – well, that&#8217;s a lot like random I/O, and <a href="../2005/11/13/breaking-the-disk-speed-barrier/">if you&#8217;re doing a lot of random reads, you&#8217;d love to use something other than spinning disk</a>.</p>
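<p>The merge-sort point is easy to see in a toy Python example. The sketch below merges three sorted runs (think of each as a separate spilled region of temp space) and records which run each output row came from. It uses the standard library&#8217;s <code>heapq.merge</code>; the <code>merge_runs</code> wrapper and the data are hypothetical.</p>

```python
import heapq


def merge_runs(runs):
    """Merge sorted runs; also record which run each output row was read from."""
    tagged = [[(value, run_id) for value in run] for run_id, run in enumerate(runs)]
    merged, read_order = [], []
    for value, run_id in heapq.merge(*tagged):
        merged.append(value)
        read_order.append(run_id)
    return merged, read_order


# Three sorted runs, as an external sort might have spilled them.
runs = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
merged, read_order = merge_runs(runs)
```

<p>Here <code>read_order</code> hops from run to run on every row. Each run is read sequentially on its own terms, but the device sees the reads interleaved across three locations, which is why the pattern looks more &#8220;random&#8221; than sequential. Real systems read in large buffered chunks rather than row by row, which softens this but does not remove it.</p>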
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/16/vertica-flash-temp-space/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Teradata&#8217;s future product strategy</title>
		<link>http://www.dbms2.com/2010/08/12/teradata-future-product-strategy/</link>
		<comments>http://www.dbms2.com/2010/08/12/teradata-future-product-strategy/#comments</comments>
		<pubDate>Thu, 12 Aug 2010 10:37:14 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[Microstrategy]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>
		<category><![CDATA[Teradata]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2769</guid>
		<description><![CDATA[I think Teradata&#8217;s future product strategy is coming into focus. I&#8217;ll start by outlining some particular aspects, and then show how I think it all ties together.

The immediate hook here is that I had a short conversation with Scott Gnau of Teradata yesterday, triggered by Teradata&#8217;s acquisition of Kickfire&#8217;s assets. Takeaways from that part included:

The [...]]]></description>
			<content:encoded><![CDATA[<p>I think Teradata&#8217;s future product strategy is coming into focus. I&#8217;ll start by outlining some particular aspects, and then show how I think it all ties together.<br />
<span id="more-2769"></span></p>
<p style="margin-bottom: 0in;">The immediate hook here is that I had a short conversation with Scott Gnau of Teradata yesterday, triggered by <a href="../2010/07/27/kickfire-unlikely-to-survive/">Teradata&#8217;s acquisition of Kickfire&#8217;s assets</a>. Takeaways from that part included:</p>
<ul>
<li>The acquisition is all about 	Kickfire&#8217;s <a href="../2009/08/21/kickfires-fpga-based-technical-strategy/">data 	pipelining</a> technology.</li>
<li>Scott (in my opinion rightly) 	thinks that isn&#8217;t particularly tied to Kickfire&#8217;s choice of 	particular DBMS architecture (fairly vanilla columnar).</li>
<li>No decision has been made about 	whether the right vehicle for this technology is an FPGA (Field 	Programmable Gate Array), conventional Intel CPU, RAM, etc.</li>
</ul>
<p style="margin-bottom: 0in;"><em>If you want to handicap Teradata&#8217;s future data pipelining strategy, you might note that:</em></p>
<ul>
<li><em>Kickfire&#8217;s own choice – and 	hence its existing implementation – is an FPGA.</em></li>
<li><em><a href="../2009/08/04/vectorwise-ingres-and-monetdb/">VectorWise&#8217;s 	approach to pipelining is Intel-based,</a> apparently at the cost of 	being closely tied to specific generations of Intel CPUs.</em></li>
<li><em><a href="../2009/07/27/xtremedata-announces-its-dbx-data-warehouse-appliance/">XtremeData&#8217;s 	approach to pipelining</a> is FPGA-based.</em></li>
<li><em>Teradata has a lot more 	development resources than any of those other companies, as well as 	important existing products, and hence has both means and motive to 	shoehorn new technology into older system designs.</em></li>
</ul>
<p style="margin-bottom: 0in;">While I had Scott on the phone, I brought up a few other subjects too. Highlights included:</p>
<ul>
<li>Teradata&#8217;s Flash-based appliance 	is doing just fine in beta test and customer POCs (Proofs of 	Concept).</li>
<li>Other kinds of Teradata appliance 	are not inconceivable.</li>
<li>Scott thinks <a href="http://www.dbms2.com/2010/07/31/teradata-xkoto-gridscale-rip-and-active-active-clustering/" >Michael McIntire&#8217;s 	condemnation of Active-Active architectures</a> is overstated. That 	said,
<ul>
<li>Scott does acknowledge a need for 	greater Active-Active scalability, and suggests that the reason 	Xkoto&#8217;s current products are being discontinued is their lack of 	scaling.</li>
<li>Scott seems quietly confident the 	scaling will get done.</li>
</ul>
</li>
<li>Scott is emphatic that Teradata is 	not going to go to <a href="../2009/04/20/calpont-update-you-read-it-here-first/">a 	two-tier architecture</a>. In particular, the point of splitting 	storage/lightweight database processing and heavyweight database 	processing on separate tiers is generally to save bandwidth, and 	Teradata&#8217;s BYNET is typically less than 10% loaded.</li>
<li>Scott didn&#8217;t dispute my claim that 	this all suggests <a href="../2008/10/14/teradata-virtual-storage/">Teradata 	Virtual Storage</a> is the future, at the expense of a rigid 	delineation among <a href="../2008/10/23/teradata-appliance-product-lines/">specific 	use-case-focused product lines</a>.</li>
<li>Unlike <a href="http://www.dbms2.com/2010/02/22/netezza-twinfin/" >Netezza</a> or <a href="http://www.dbms2.com/2010/02/22/aster-data-ncluster-4-5/" >Aster</a>, Teradata doesn&#8217;t seem to plan analytic capability that works outside 	the UDF (User Defined Function) framework. However, Scott noted that 	Teradata has long had the capability that Aster and Netezza now also 	have of letting you run analytic code either in “protected mode” 	(if the process fails the whole database doesn&#8217;t crash) or in the 	database kernel (best performance, if you&#8217;re sufficiently confident 	in the code&#8217;s stability to take the risk). Scott also spoke of the 	release later this quarter of Teradata FastPath, which will offer 	yet better performance (however, there&#8217;s a gotcha to Teradata 	FastPath that&#8217;s still NDA).</li>
</ul>
<p style="margin-bottom: 0in;">Putting all that together with the rest of what we know about Teradata, I&#8217;m going to call out<strong> three pillars of Teradata&#8217;s long-term product strategy:</strong></p>
<ul>
<li><strong>Same fundamentals as always.</strong> Teradata&#8217;s core product strategy is:
<ul>
<li>Single DBMS, capable of meeting 	all analytic needs while running in a single instance, usually 	running on &#8230;</li>
<li>… proprietary hardware …</li>
<li>… built from 	conservatively-chosen parts.</li>
</ul>
</li>
<li><strong>Selective vertical application 	stack.</strong> No matter how horizontally-oriented they are, many 	companies that have been in the analytic technology business for a 	while wind up with some vertical applications. It sort of just 	happens. Teradata is no exception. Teradata also likes to sell 	services to its product customers, and some of those are quite 	vertical-aware.</li>
<li><strong>Mutable, modular platform.</strong> This is what I highlighted above. Note that it&#8217;s philosophically 	attuned with the one-system-does-everything approach Teradata 	prefers. More subtly, please also note that it goes well with 	customer-by-customer price customization, which is almost a must for 	Teradata given the Innovator&#8217;s Dilemma kind of pricing box it finds 	itself in.</li>
</ul>
<p style="margin-bottom: 0in;">So far, that&#8217;s not too exciting, except in the details of how Teradata&#8217;s engineers make that all work. But there&#8217;s a <strong>fourth pillar to Teradata&#8217;s technical strategy</strong> as well, and it&#8217;s a wild card: <strong>tight partnerships.</strong> Every time I talk with Teradata hardware chief Carson Schmidt, he seems excited about some particular version of a part or other – sometimes from a reasonably established vendor (once it was LSI Logic), sometimes from a tiny one (notably <a href="../2009/10/25/teradata-hardware-strategy-and-tactics/">the “stealth” start-up on which Teradata bet its first solid-state product</a>). In the future, I expect tight business intelligence partnerships as well. Cognos BI will be increasingly integrated with IBM&#8217;s DBMS and hardware; Business Objects&#8217; BI will increasingly be integrated with SAP&#8217;s applications; and Oracle&#8217;s BI will eventually be integrated with everything. How do you compete with that if you&#8217;re Microstrategy? Well, you try to have a superior product, of course – but you also partner as closely with DBMS vendors as you can, an approach Microstrategy has already started. Predictive analytics stalwart <a href="http://www.dbms2.com/2010/05/15/further-clarifying-in-database-mpp-sas/" >SAS</a>, of course, is on a partnership binge as well.</p>
<p style="margin-bottom: 0in;">Teradata has a larger installed base than almost all its competitors, and enjoys richer third-party software and service support as a result. But I suspect that going forward, for Teradata to remain a leading competitor at price points it is willing to accept, Teradata&#8217;s “ecosystem” advantages will need to ratchet up one or several notches.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/12/teradata-future-product-strategy/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Big Data is Watching You!</title>
		<link>http://www.dbms2.com/2010/08/11/big-data-is-watching-you/</link>
		<comments>http://www.dbms2.com/2010/08/11/big-data-is-watching-you/#comments</comments>
		<pubDate>Wed, 11 Aug 2010 05:30:22 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Log analysis]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[RDF and graphs]]></category>
		<category><![CDATA[Specific users]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[Web analytics]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2760</guid>
		<description><![CDATA[There&#8217;s a boom in large-scale analytics. The subjects of this analysis may be categorized as:

People
Financial trades
Electronic networks
Everything else

The most varied, interesting, and valuable of those four categories is the first one.

That may change some day, with the growing importance of machine-generated data, and of big-data science in particular. But I think it&#8217;s a fair assessment [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">There&#8217;s a boom in large-scale analytics. The subjects of this analysis may be categorized as:</p>
<ul>
<li>People</li>
<li>Financial trades</li>
<li>Electronic networks</li>
<li>Everything else</li>
</ul>
<p style="margin-bottom: 0in;">The most varied, interesting, and valuable of those four categories is the first one.</p>
<p><span id="more-2760"></span></p>
<p style="margin-bottom: 0in;"><em>That may change some day, with the growing importance of <a href="http://www.dbms2.com/2010/04/08/machine-generated-data-example/" >machine-generated data</a>, and of <a href="http://www.dbms2.com/2009/10/03/issues-in-scientific-data-management/" >big-data science</a> in particular. But I think it&#8217;s a fair assessment at the present, and for at least the next few years.</em></p>
<p style="margin-bottom: 0in;">Some of the most interesting use cases are concentrated in the areas of identifying individuals, groups of people, or behaviors of (groups of) people. For example:</p>
<ul>
<li>comScore works hard to <strong>identify individual web surfers</strong> – i.e. to <strong>deanonymize</strong> them – even though they may have given incomplete or false personal information.</li>
<li>Other companies at least try to 	figure out <strong>which information in a user&#8217;s profile is unreliable,</strong> so as to classify them better. (Yes, there are 62-year-old 	video-game-obsessed Lady Gaga fans, but that&#8217;s generally not the way 	to bet.)</li>
<li>Multiple telecom vendors try to 	identify who their <strong>most influential customers</strong> are (to a first 	approximation, they&#8217;re the ones most often called by the most 	people, but it surely gets more sophisticated than that). This 	information is then used to reduce churn, either by working hard to 	retain those users, or – if they do churn – to move very fast to 	retain the business from their friends.</li>
<li>Other kinds of companies do 	similar kinds of analysis, to the extent that they have enough of a 	social graph to do so. (This application is a case where the term 	“<a href="http://www.dbms2.com/2010/06/08/profile-of-revealed-preferences/" >social graph</a>” is not a misnomer.)</li>
<li><strong>Turing detectives</strong> (I just 	coined that phrase) try to determine whether users are humans or 	bots.</li>
<li>Central to detecting <strong>insurance 	fraud</strong> is identifying suspiciously close connections between 	claimants, service providers, and so on.</li>
<li>Identifying groups of people is 	also important in flagging <strong>insider trading.</strong><span style="font-weight: normal;"> Even more important are other kinds of analysis, along the lines of 	“is this normal innocent trading behavior?” </span></li>
<li><span style="font-weight: normal;">Intelligence 	agencies try to detect networks of </span><strong>terrorists</strong><span style="font-weight: normal;"> and their sympathizers. They further try to identify unusual 	patterns of communication or meetings along those networks that 	might indicate terrorist acts are being planned. (Civilian law 	enforcement agencies can use similar techniques.)</span></li>
</ul>
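<p>The telecom example&#8217;s first approximation (the most influential customers are the ones called by the most people) fits in a few lines of Python. This is only that first approximation, with made-up call records and a hypothetical function name; as noted above, what vendors really do is surely more sophisticated.</p>

```python
from collections import defaultdict


def rank_by_distinct_callers(call_records):
    """call_records: iterable of (caller, callee) pairs.

    Returns (callee, distinct_caller_count) pairs, most-called first.
    """
    callers = defaultdict(set)
    for caller, callee in call_records:
        callers[callee].add(caller)  # a set ignores repeat calls from the same person
    # Sort by distinct callers (descending), then by name for a stable order.
    return sorted(
        ((callee, len(who)) for callee, who in callers.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical call detail records.
calls = [
    ("ann", "bob"), ("cat", "bob"), ("dan", "bob"),
    ("ann", "cat"), ("bob", "cat"),
    ("ann", "bob"),  # repeat call; does not add a new distinct caller
    ("dan", "eve"),
]
ranking = rank_by_distinct_callers(calls)
```

<p>Counting distinct callers rather than raw call volume is the design choice that matters: one chatty friend should not make somebody look influential.</p>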
<p style="margin-bottom: 0in; font-weight: normal;">In most cases, the analysis and/or run-time execution of the relevant models is done with the help of analytic DBMS. Other technologies that come into play include non-DBMS MapReduce (Hadoop), graph engines, and CEP (Complex Event Processing). The vendor most heavily represented on that list is probably Aster Data, because:</p>
<ul>
<li>Aster Data is 	focused on hard-core analytics.</li>
<li>I talk a lot 	with Aster Data, and in particular had a long, detailed use-cases 	discussion with them last week.</li>
<li><span style="font-weight: normal;">The 	comScore example happens to come from a speaker at </span><a href="http://www.dbms2.com/2010/05/07/implications-onew-analytic-technology/" ><span style="font-weight: normal;">an 	Aster event</span></a><span style="font-weight: normal;"> I also 	participated in.</span></li>
</ul>
<p style="margin-bottom: 0in;"><span style="font-weight: normal;">And by the way, all this only scratches the surface of what will be possible down the road. It&#8217;s based mainly on where you live, what you purchase, how you behave on websites, and who you communicate with. </span><span style="color: #000080;"><span lang="zxx"><span style="text-decoration: underline;"><a href="../2010/07/04/fair-data-use/"><span style="font-weight: normal;">Other kinds of data, which could be used to be yet more intrusive</span></a></span></span></span><span style="font-weight: normal;">, generally aren&#8217;t involved.</span></p>
<p style="margin-bottom: 0in;"><span style="font-weight: normal;">I actually have two points in drawing up this list. One is golly-gee-whiz about how a lot of analytically sophisticated applications are actually getting into production. The other is to highlight the privacy and liberty threats If This Goes On Unchecked (which is why I didn&#8217;t include some other less-people-focused examples). There&#8217;s also a related danger that, to the extent we don&#8217;t get some smart regulations to keep us safe(r), we&#8217;ll get a bunch of stupid regulations instead. </span></p>
<p style="margin-bottom: 0in;"><span style="font-weight: normal;">The Analytic Era has only just begun.<br />
</span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/11/big-data-is-watching-you/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Links and observations</title>
		<link>http://www.dbms2.com/2010/08/09/links-and-observations/</link>
		<comments>http://www.dbms2.com/2010/08/09/links-and-observations/#comments</comments>
		<pubDate>Tue, 10 Aug 2010 02:37:51 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Calpont]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[HP and Neoview]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Northscale]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[ParAccel]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[XtremeData]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2743</guid>
		<description><![CDATA[I&#8217;m back from a trip to the SF Bay area, with a lot of writing ahead of me. I&#8217;ll dive in with some quick comments here, then write at greater length about some of these points when I can. From my trip:  

Aster Data showed me a lot of customer names and deal sizes, across [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m back from a trip to the SF Bay area, with a lot of writing ahead of me. I&#8217;ll dive in with some quick comments here, then write at greater length about some of these points when I can. From my trip:  <span id="more-2743"></span></p>
<ul>
<li>Aster Data showed me a lot of customer names and deal sizes, across a bunch of industries (mainly enterprise rather than web). Yes, Aster&#8217;s market success is for real. (But almost all those details are NDA.)</li>
<li>Sybase&#8217;s product plans for IQ are pretty impressive. (But the most interesting parts are, you guessed it, NDA.)</li>
<li>I&#8217;ve kissed and made up* with ParAccel, now that they&#8217;ve replaced their CEO, replaced their marketing chief, and stopped the worst of the <a href="http://www.dbms2.com/2010/01/15/there-sure-seem-to-be-a-lot-of-inaccuracies-on-paraccels-website/" >marketing</a> <a href="http://www.dbms2.com/2009/06/22/the-tpc-h-benchmark-is-a-blight-upon-the-industry/" >nonsense</a> I used to complain about. ParAccel has some interesting plans for ParAccel 3.0 which are, naturally, NDA.</li>
<li>The Peoplesoft guys are doing it over again at Workday. Only this time, their platform isn&#8217;t a relational DBMS. Rather, it&#8217;s an in-memory, completely object-oriented data model, with disk used only on a &#8220;Just in case the power ever goes out&#8221; basis. (Thankfully, nothing at all about our conversation was NDA.)</li>
<li>I&#8217;m finally feeling good about Northscale&#8217;s memcached-compatible persistent store Membase. The main reason is that they showed me a near-term path to interfaces that are richer than key-value. Also, Todd Hoff reassured me that even pure persistent memcached has a place.</li>
<li>Rumor says that even the one app for which Facebook was using Cassandra &#8212; in-box search &#8212; has been decommissioned. On the other hand, numerous other scale-out DBMS (SQL or otherwise) seem to have Facebook footholds. But details are &#8212; all together now! &#8212; NDA.</li>
</ul>
<p><em>*If you know ParAccel&#8217;s new marketing chief Michael Weir, you  surely guessed I mean that only in a figurative sense.</em></p>
<p>From elsewhere:</p>
<ul>
<li>Daniel Abadi offered <a href="http://dbmsmusings.blogspot.com/2010/08/thoughts-on-kickfires-apparent-demise.html" onclick="javascript:pageTracker._trackPageview('/dbmsmusings.blogspot.com');">his  analysis</a> of <a href="../2010/07/27/kickfire-unlikely-to-survive/">Kickfire&#8217;s  demise</a>. In general I agree, but Daniel neglected to mention one  hugely important factor &#8212; the chicken-egg negative effect of Kickfire&#8217;s  lack of market or marketing traction. Customers were extremely reluctant to buy from Kickfire  because they perceived, correctly, that Kickfire&#8217;s survivability was far  from assured.</li>
<li>While the <a href="http://infinidb.org/community/forums/11-general-infinidb/1000-strange-issue-with-drop-table" onclick="javascript:pageTracker._trackPageview('/infinidb.org');">InfiniDB forums</a> suggest that there are at least a couple of production users of Calpont&#8217;s free InfiniDB, Calpont seemingly has a long way to go to be even as successful as Kickfire. But Calpont does have a bit of money to spend on lead generation; maybe some day they&#8217;ll even have actual customers.</li>
<li>In a response to a question I messaged over, <a href="http://www.dbms2.com/2010/03/18/xtremedata-update/" >XtremeData</a> tells me they have actual customers now. Press releases to follow.</li>
<li>The <a href="http://news.cnet.com/8301-31021_3-20013111-260.html?part=rss&amp;subj=news&amp;tag=2547-1_3-0-20" onclick="javascript:pageTracker._trackPageview('/news.cnet.com');">admiration for the job Mark Hurd did at HP</a> is in my opinion overstated. Sure, the financial/operational management appeared to work, but HP did little on Hurd&#8217;s watch to strengthen its reputation or customers&#8217; loyalty. In particular:
<ul>
<li>HP&#8217;s analytics efforts have accomplished little.</li>
<li>HP&#8217;s data warehouse appliance efforts have failed pathetically.</li>
<li>From what I hear, HP&#8217;s execution in its Exadata partnership was not good.</li>
<li>HP&#8217;s server business in general is distinguished mainly by HP being a big company.</li>
<li>HP&#8217;s EDS acquisition has been rocky, not that EDS was sailing so smoothly on its own beforehand.</li>
<li>HP&#8217;s success in PCs amounts to &#8220;arguably, HP sucks a little less than the other guys&#8221;.</li>
<li>HP&#8217;s elite reputation is long gone (admittedly, for the most part that predates Hurd).</li>
</ul>
</li>
<li><a href="http://intelligent-enterprise.informationweek.com/blog/archives/2010/08/software_innova.html" onclick="javascript:pageTracker._trackPageview('/intelligent-enterprise.informationweek.com');">Doug Henschen</a> evidently favors really strong intellectual property protection for software, even forbidding plug-compatible reverse engineering. I agree with Doug up to the point that <a href="http://www.monashreport.com/2010/07/19/my-view-of-intellectual-property/" onclick="javascript:pageTracker._trackPageview('/www.monashreport.com');">it should be forbidden to copy proprietary software</a>, but I don&#8217;t see why he (or a court) would view such behavior as copying.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/09/links-and-observations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Notes on EMC&#8217;s Greenplum subsidiary</title>
		<link>http://www.dbms2.com/2010/08/09/emc-greenplum/</link>
		<comments>http://www.dbms2.com/2010/08/09/emc-greenplum/#comments</comments>
		<pubDate>Tue, 10 Aug 2010 00:02:17 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Greenplum]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2744</guid>
		<description><![CDATA[I spent considerable time last week with my clients at both Greenplum and EMC  (if we ignore the fact that the deal has closed and they&#8217;re now the same  company). I also had more of  a hardcore engineering discussion than  I&#8217;ve had with Greenplum for quite a while (I should have been [...]]]></description>
			<content:encoded><![CDATA[<p>I spent considerable time last week with my clients at both Greenplum and EMC  (if we ignore the fact that the deal has closed and they&#8217;re now the same  company). I also had more of  a hardcore engineering discussion than  I&#8217;ve had with Greenplum for quite a while (I should have been pushier  about that earlier). Takeaways included:</p>
<ul>
<li>This is starting off as a honeymoon deal. Everything Greenplum was  planning to do is being continued. Additional resources are being  poured into Greenplum to do more.</li>
<li>Some Greenplum execs seem to envision staying long term, some seem  to envision moving on to their next startups. The ones who envision  moving on are, however, going to work hard first to make the merger a  success.</li>
<li>Greenplum has, for quite a while, had more of an advanced  analytics/embedded predictive modeling story than I realized. Bad on  them for not fleshing it out more in marketing and product packaging  alike.</li>
<li>Greenplum both denies the <a href="http://www.dbms2.com/2010/07/06/emc-is-buying-greenplum/" >concurrency  problems</a> I previously noted and also has a very credible story as  to how it will eliminate them. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Seriously, Greenplum tells of one  customer that routinely runs 150 simultaneous queries &#8212; on what I  think is not a terribly big system &#8212; and a number of POCs (Proofs of  Concept) that simulated similar levels of concurrency.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/08/09/emc-greenplum/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Teradata, Xkoto Gridscale (RIP), and active-active clustering</title>
		<link>http://www.dbms2.com/2010/07/31/teradata-xkoto-gridscale-rip-and-active-active-clustering/</link>
		<comments>http://www.dbms2.com/2010/07/31/teradata-xkoto-gridscale-rip-and-active-active-clustering/#comments</comments>
		<pubDate>Sat, 31 Jul 2010 08:23:57 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Continuent]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Theory and architecture]]></category>
		<category><![CDATA[Xkoto]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2708</guid>
		<description><![CDATA[Having gotten a number of questions about Teradata&#8217;s acquisition of Xkoto, I leaned on Teradata for an update, and eventually connected with Scott Gnau. Takeaways included:

Teradata is discontinuing  Xkoto&#8217;s existing product Gridscale, which 	Scott characterized as being too OLTP-focused to be a good fit for 	Teradata. Teradata hopes and expects that existing Xkoto Gridscale [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Having gotten a number of questions about Teradata&#8217;s acquisition of Xkoto, I leaned on Teradata for an update, and eventually connected with Scott Gnau. Takeaways included:</p>
<ul>
<li>Teradata is discontinuing <a href="http://www.dbms2.com/2009/09/11/xkoto-gridscale-highlights/" >Xkoto&#8217;s existing product Gridscale</a>, <span style="font-style: normal;">which Scott characterized as being too OLTP-focused to be a good fit for Teradata. Teradata hopes and expects that existing Xkoto Gridscale customers won&#8217;t renew maintenance. (I&#8217;m not sure</span> that they&#8217;ll even get the option to do so.)</li>
<li>The point of Teradata&#8217;s technology 	+ engineers acquisition of Xkoto is to enhance Teradata&#8217;s 	active-active or multi-active data warehousing capabilities, which 	it has had in some form for several years.</li>
<li>In particular, Teradata wants to 	tie together different products in the Teradata product line. (Note: 	Those typically all run pretty much the same Teradata database 	management software, except insofar as they might be on different 	releases.)</li>
<li>Scott rattled off all the 	plausible areas of enhancement, with multiple phrasings – 	performance, manageability, ease of use, tools, features, etc.</li>
<li>Teradata plans to have one or two 	releases based on Xkoto technology in 2011.</li>
</ul>
<p style="margin-bottom: 0in;">Frankly, I&#8217;m disappointed at the struggles of clustering efforts such as Xkoto Gridscale or <a href="http://www.dbms2.com/2009/09/03/continuent-on-clustering/" >Continuent&#8217;s pre-Tungsten products</a>, but if the DBMS vendors meet the same needs themselves, that&#8217;s OK too.</p>
<p style="margin-bottom: 0in;">The logic behind active-active database implementations actually seems pretty compelling:  <span id="more-2708"></span></p>
<ul>
<li>You may well be keeping a second 	copy of your database for high availability/hot standby.</li>
<li>You might even be keeping a third 	copy for off-site disaster recovery.</li>
<li>In some cases, you might have 	reasons beyond disaster recovery to distribute a database around the 	world.</li>
<li>So why not allow queries to be run 	against all the copies?</li>
<li>And by the way, splitting the 	workload up a bit by kinds (e.g., long-running vs. short query) 	might let you optimize the implementation of each copy of the 	database. (This last point becomes even more important with the rise 	of solid-state memory.)</li>
</ul>
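<p style="margin-bottom: 0in;">The last two points can be sketched as a tiny router that sends each query to a copy tuned for its kind, rotating among the copies available for that kind. Everything here is hypothetical (the class name <code>ActiveActiveRouter</code>, the copy names, the two workload kinds); it is not how Teradata, Xkoto, or anybody else actually does it, and it ignores the hard parts such as keeping the copies in sync and failing over.</p>

```python
import itertools

class ActiveActiveRouter:
    """Send each query to a copy tuned for its kind, round-robin within that kind."""

    def __init__(self, copies_by_kind):
        # copies_by_kind maps a workload kind to the database copies that serve it,
        # e.g. {"short": ["hq", "standby"], "long": ["dr_site"]}.
        self._cycles = {kind: itertools.cycle(copies)
                        for kind, copies in copies_by_kind.items()}

    def route(self, kind):
        # Pick the next copy in rotation for this kind of query.
        return next(self._cycles[kind])

# Hypothetical setup: short queries share two copies, long-running ones get the third.
router = ActiveActiveRouter({"short": ["hq", "standby"], "long": ["dr_site"]})
print([router.route("short") for _ in range(3)])  # ['hq', 'standby', 'hq']
print(router.route("long"))                       # dr_site
```

<p style="margin-bottom: 0in;">Splitting by kind is what would let each copy be optimized differently; the round-robin is just the simplest way to use every copy that serves a given kind.</p>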
<p style="margin-bottom: 0in;">Analytic DBMS vendors pretty much all need to offer this. (Possible exception: If they have a data-mart-only positioning so extreme that customers will never care about any form of failover.) That said, I must confess to not having done a good job of tracking who does or doesn&#8217;t have which features in this area to date; informative comments to this post in that regard would be much appreciated!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/07/31/teradata-xkoto-gridscale-rip-and-active-active-clustering/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Advice for some non-clients</title>
		<link>http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/</link>
		<comments>http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/#comments</comments>
		<pubDate>Fri, 30 Jul 2010 14:35:52 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[HP and Neoview]]></category>
		<category><![CDATA[Information Builders]]></category>
		<category><![CDATA[Ingres]]></category>
		<category><![CDATA[Kalido]]></category>
		<category><![CDATA[MarkLogic]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Objectivity and Infinite Graph]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[SenSage]]></category>
		<category><![CDATA[Tableau Software]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2699</guid>
		<description><![CDATA[Edit: Any further anonymous comments to this post will be deleted. Signed comments are permitted as always.

Most of what I get paid for is in some form or other consulting. (The same would be true for many other analysts.) And so I can be a bit stingy with my advice toward non-clients. But my non-clients [...]]]></description>
			<content:encoded><![CDATA[<p><em>Edit: Any further anonymous comments to this post will be deleted. Signed comments are permitted as always.<br />
</em></p>
<p>Most of what I get paid for is in some form or other consulting. (<a href="http://www.strategicmessaging.com/blurring-analyst-consultant-line/2010/07/28/" onclick="javascript:pageTracker._trackPageview('/www.strategicmessaging.com');">The same would be true for many other analysts</a>.) And so I can be a bit stingy with my advice toward non-clients. But my non-clients are a distinguished and powerful group, including in their number Oracle, IBM, Microsoft, and most of the BI vendors. So here&#8217;s a bit of advice for them too.</p>
<p><strong>Oracle. </strong>On the plus side, you guys have been making progress against your reputation for untruthfulness. Oh, I&#8217;ve dinged you for some <a href="http://www.dbms2.com/2008/09/30/oracle-crosses-the-line-on-integrity/" >past</a> <a href="http://www.dbms2.com/2008/06/28/response-to-rita-sallam-of-oracle/" >slip-ups</a>, but on the whole they&#8217;ve been no worse than other vendors&#8217;. But recently you pulled a doozy. The <a href="http://www.oracle.com/us/corporate/analystreports/infrastructure/index.html" onclick="javascript:pageTracker._trackPageview('/www.oracle.com');">analyst reports</a> section of your website fails to distinguish between unsponsored and sponsored work.* That is a horrible ethical stumble. Fix it fast. Then put processes in place to ensure nothing that dishonest happens again for a good long time.</p>
<p><em>*Merv Adrian&#8217;s &#8220;report&#8221; listed high on that page is actually a sponsored white paper. That Merv himself screwed up by not labeling it clearly as such in no way exonerates Oracle. Besides, I&#8217;m sure Merv won&#8217;t soon repeat the error &#8212; but for Oracle, this represents a whole pattern of behavior.</em></p>
<p><strong>Oracle.</strong> And while I&#8217;m at it, outright dishonesty isn&#8217;t your only unnecessary credibility problem. <a href="http://www.strategicmessaging.com/so-what-is-an-analyst-anyway/2010/07/25/" onclick="javascript:pageTracker._trackPageview('/www.strategicmessaging.com');">You&#8217;re also playing too many games in analyst relations</a>.</p>
<p><strong>HP.</strong> Neoview will never succeed. Admit it to yourselves. Go buy something that can.  <span id="more-2699"></span></p>
<p><strong>Smaller BI vendors.</strong> Analytic DBMS evaluations commonly include BI strategy and tool selection as well. If an analytic DBMS expert tells you he needs to learn more about your product line, don&#8217;t blow him off. In fact, you should be particularly embracing anybody who&#8217;s shown a fondness for small DBMS vendors; maybe he or his clients will like small BI vendors as well. That means (among others) you, <strong>Jaspersoft, Endeca, </strong>and <strong>Tableau.</strong></p>
<p><strong>Information Builders. </strong>Is there anything about your BI products that is in any way technologically differentiated? If so, you might want to mention some examples to somebody some time.</p>
<p><strong>Kalido.</strong> I&#8217;ve said this to you before, but it bears repeating &#8212; your positioning translates to &#8220;I-CASE for analytics,&#8221; and that&#8217;s not a good thing. If your product is not as cumbersome and entrapping as that sounds, you need to do a much better job of explaining why not.</p>
<p><strong>SenSage.</strong> You are what you are. Sell out while the selling is good. You don&#8217;t have the corporate personality to make it into the analytic DBMS mainstream on your own.</p>
<p><strong>Ingres. </strong>You need to be more engaged with analysts than you are. <a href="http://www.softwarememories.com/2010/07/25/ingres-history/" onclick="javascript:pageTracker._trackPageview('/www.softwarememories.com');">Ingres navel-gazed too much 25 years ago</a>, and evidently you haven&#8217;t outgrown it yet.</p>
<p><strong>TIBCO.</strong> You probably have a lot of cool analytic technology, but I don&#8217;t know of an influencer who has much relationship with or trust in you. Rethink how you&#8217;re approaching influencer relations top to bottom.</p>
<p><strong>Tableau.</strong> You had a lot of mindshare, but it&#8217;s fading. Do something.</p>
<p><strong>MarkLogic, graph DBMS vendors, etc.</strong> You&#8217;re clinging too hard to the NoSQL label. Nobody is out there deciding among Cassandra, neo4j, and MarkLogic. They might be deciding between MongoDB and MarkLogic, I guess, but if you admit to yourself that&#8217;s all it is you&#8217;ll probably change your messaging somewhat.</p>
<p><strong>Objectivity.</strong> Get real about marketing. Infinite Graph is a cool opportunity. But I didn&#8217;t even ping you for a meeting when I&#8217;m in your area next week, because I wouldn&#8217;t have known who to reach out to.</p>
<p><strong>Everybody (especially Objectivity).</strong> &#8220;First X deployed in the cloud&#8221; is almost surely an inaccurate claim. Don&#8217;t make it. And by the way, even if it were true, it probably wouldn&#8217;t be interesting.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/feed/</wfw:commentRss>
		<slash:comments>43</slash:comments>
		</item>
		<item>
		<title>Microstrategy technology notes</title>
		<link>http://www.dbms2.com/2010/07/29/microstrategy-technology-notes/</link>
		<comments>http://www.dbms2.com/2010/07/29/microstrategy-technology-notes/#comments</comments>
		<pubDate>Thu, 29 Jul 2010 17:51:42 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[Microstrategy]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2692</guid>
		<description><![CDATA[Earlier this week, Microstrategy made Mark LaRow available to talk about technology. The proximate reason was my recent mention of Microstrategy&#8217;s mobile BI emphasis, but we also touched on Microstrategy&#8217;s approach to in-memory business intelligence and some other subjects. We didn&#8217;t go into the depth of a similar conversation I had recently with Qlik Technologies, [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Earlier this week, Microstrategy made Mark LaRow available to talk about technology. The proximate reason was <a href="../2010/07/25/alerts-metrics-dashboards/">my recent mention of Microstrategy&#8217;s mobile BI emphasis</a>, but we also touched on <a href="../2009/02/19/microstrategy-tidbits/">Microstrategy&#8217;s approach to in-memory business intelligence</a> and some other subjects. We didn&#8217;t go into the depth of <a href="http://www.dbms2.com/2010/06/12/the-underlying-technology-of-qlikview/" >a similar conversation I had recently with Qlik Technologies</a>, but I found it quite interesting even so.</p>
<p style="margin-bottom: 0in;">Highlights of the <strong>in-memory BI discussion</strong> included:</p>
<ul>
<li>Microstrategy&#8217;s in-memory BI data 	structure is some kind of simple array, redundantly called a “vector 	array.” A more precise description was not available.</li>
<li>While early versions of the 	capability have been around since 2002, Microstrategy&#8217;s in-memory BI 	capability only got serious with Microstrategy 9, which was released 	in Q1 of 2009. In particular, Microstrategy 9 was the first time 	in-memory BI had full security.</li>
<li>Mark says a core reason for Microstrategy having its own in-memory BI is that Microstrategy has more smarts to predict which aggregates will or won&#8217;t be needed. Strictly speaking, that can&#8217;t be argued with. Vendors like Infobright would argue they come close enough to that ideal as to make little practical difference – but I&#8217;m also cheating by naming Infobright, which is particularly focused in that direction.</li>
<li>Microstrategy in-memory BI 	compresses data by about 2X. Mark didn&#8217;t know which compression 	algorithm was used.</li>
<li>The limitation on what&#8217;s in-memory 	is, of course, how much RAM you can fit on an SMP box. Microstrategy 	has seen up to ½ terabyte deployments.</li>
<li>In-memory Microstrategy data 	structures are typically built during the batch window, for 	performance reasons. This is not, strictly speaking, mandatory, but 	I didn&#8217;t get a sense that Microstrategy was being used for much that 	resembled <a href="../2008/10/20/coral8-proposes-cep-as-a-bi-data-platform/">real-time 	business intelligence</a>.</li>
<li>Mark said Microstrategy has no interest in using solid-state memory to expand the reach of its in-memory BI. Frankly, if Microstrategy doesn&#8217;t change that stance, its in-memory BI capabilities are unlikely to stay significant for too many years.</li>
</ul>
<p style="margin-bottom: 0in;">Another key subject we discussed was Microstrategy&#8217;s view of <strong>dashboards.</strong> <span id="more-2692"></span></p>
<ul>
<li>Microstrategy thinks that what 	customers really want is to have a whole lot of navigational 	drilldown options into a few big reports. (“50 pages, 50 columns” 	was mentioned as an example of “big”.) This has been 	Microstrategy&#8217;s approach for three years or so.</li>
<li>Microstrategy even offers a 	version of this in Flash, which can be drilled down on with no calls 	to the server whatsoever.</li>
<li>This is also Microstrategy&#8217;s 	paradigm on the iPhone and iPad, where it would seem to make 	particular sense since you aren&#8217;t exactly going to tile a portal 	page into 6 different charts anyway.</li>
<li>On the iPhone/iPad, this is all 	native code, with a simple local data structure. In a parallel 	project, Microstrategy is researching HTML 5.</li>
<li>Microstrategy would rather call 	all this “microapps” than “dashboards.”</li>
</ul>
<p style="margin-bottom: 0in;">We also discussed Microstrategy&#8217;s approach to <strong>alerting.</strong> Highlights included:</p>
<ul>
<li>Microstrategy 9 introduced an 	alerting capability that Microstrategy sees as differentiated enough 	to emphasize in lots of sales cycles.</li>
<li>Microstrategy&#8217;s alerting 	capability lets you set “thresholds”.</li>
<li>A typical Microstrategy threshold 	would be a percentage change in a variable vs. another time period. 	You get to specify the variable (duh), the percentage, and the time 	comparison.</li>
<li>When a threshold is crossed, 	Microstrategy sends you an alerting email. (There&#8217;s something native 	to Apple that&#8217;s an alternative for Apple platforms.)</li>
</ul>
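<p>A threshold of that kind boils down to a percentage-change check. Here is a minimal sketch, assuming the three inputs described above (the variable&#8217;s current value, its value in the comparison period, and the percentage). The function name <code>threshold_crossed</code> is made up, and treating rises and drops alike is my assumption, not something Microstrategy confirmed.</p>

```python
def threshold_crossed(current, previous, pct_threshold):
    """Return True when the percentage change vs. the comparison period reaches the threshold."""
    if previous == 0:
        # Percentage change is undefined here; this sketch simply never alerts.
        return False
    pct_change = (current - previous) / abs(previous) * 100.0
    # Assumption: a move in either direction counts as crossing the threshold.
    return abs(pct_change) >= pct_threshold

# Hypothetical weekly sales figures compared with the prior week, 10% threshold.
print(threshold_crossed(1150, 1000, 10))  # True  (up 15%)
print(threshold_crossed(1040, 1000, 10))  # False (up 4%)
print(threshold_crossed(800, 1000, 10))   # True  (down 20%)
```

<p>Whatever sends the email (or the Apple-native alternative) would sit behind a check like this one.</p>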
<p>We discussed one other subject as well, kicked off by my question &#8220;So why does Microstrategy spawn all those temporary tables anyway?&#8221; Mark and I more or less agreed:</p>
<ul>
<li>Microstrategy tries to do bigger queries than some of its competitors like to handle, by relying more on the DBMS for query execution.</li>
<li>Not coincidentally, Microstrategy is often the favorite BI vendor of analytic DBMS vendors (and even some Hadoop folks) who specialize in very large data sets.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/07/29/microstrategy-technology-notes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
