<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBMS 2 : DataBase Management System Services &#187; Presentations</title>
	<atom:link href="http://www.dbms2.com/category/presentations/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 01:51:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Notes for my March 10 Investigative Analytics webinar</title>
		<link>http://www.dbms2.com/2011/03/10/notes-for-my-march-10-investigative-analytics-webinar/</link>
		<comments>http://www.dbms2.com/2011/03/10/notes-for-my-march-10-investigative-analytics-webinar/#comments</comments>
		<pubDate>Thu, 10 Mar 2011 08:23:18 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=4005</guid>
		<description><![CDATA[It turns out that the slide deck I posted a couple of days ago underwent more changes than I expected. Here&#8217;s a more current version. A number of the changes arose when I thought more about how to categorize analytic business benefits; hence that blog post a few minutes ago with more detail on the [...]]]></description>
			<content:encoded><![CDATA[<p>It turns out that <a href="http://www.dbms2.com/2011/03/08/investigative-analytics-slide-deck-and-march-10-webinar/">the slide deck I posted a couple of days ago</a> underwent more changes than I expected. Here&#8217;s <a href="http://www.monash.com/uploads/Investigative-analytics-slides-late-draft-March-10-2011.ppt">a more current version</a>. A number of the changes arose when I thought more about <a href="http://www.dbms2.com/2011/03/10/the-three-principal-kinds-of-analytic-business-benefit/">how to categorize analytic business benefits</a>; hence that blog post a few minutes ago with more detail on the same subject.</p>
<p>Unchanged, however, is the more technical list of <a href="http://www.dbms2.com/2011/01/03/the-six-useful-things-you-can-do-with-analytic-technology/">six things you can do with analytic technology</a>, taken from a blog post late last year. Also unaltered are my definitions of <a href="http://www.dbms2.com/2011/03/03/investigative-analytics/">investigative analytics</a> and <a href="http://www.dbms2.com/2010/12/30/examples-and-definition-of-machine-generated-data/">machine-generated data</a>.</p>
<p>I write extensively on <a href="http://www.dbms2.com/category/liberty-privacy/">privacy</a>. This <a href="http://www.dbms2.com/2011/01/11/the-technology-of-privacy-threats/">technological overview of privacy threats</a> doubles as a survey of advanced investigative analytics techniques now coming into practical use.</p>
<p>And finally, on a happier note &#8212; if you enjoyed the xkcd cartoon, here are <a href="http://www.dbms2.com/2010/03/29/apocryphal-pranks/">two</a> <a href="http://www.dbms2.com/2009/03/27/what-you-learn-in-statistics-class/">links</a> to that one and a few more.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/03/10/notes-for-my-march-10-investigative-analytics-webinar/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Investigative analytics: Slide deck and March 10 webinar</title>
		<link>http://www.dbms2.com/2011/03/08/investigative-analytics-slide-deck-and-march-10-webinar/</link>
		<comments>http://www.dbms2.com/2011/03/08/investigative-analytics-slide-deck-and-march-10-webinar/#comments</comments>
		<pubDate>Tue, 08 Mar 2011 11:19:20 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3990</guid>
		<description><![CDATA[As previously noted, I&#8217;m doing a webinar on investigative analytics on Thursday, March 10, at 2 pm Eastern time. I&#8217;ve now uploaded a late-draft slide deck for same. It&#8217;s pretty concise; the deck is in no way a substitute for the webinar itself, which I urge you to attend (or catch a recording of after-the-fact). [...]]]></description>
			<content:encoded><![CDATA[<p>As previously noted, I&#8217;m doing a <a href="http://www.dbms2.com/2011/02/12/upcoming-webinar-on-investigative-analytics/">webinar on investigative analytics</a> on Thursday, March 10, at 2 pm Eastern time. I&#8217;ve now uploaded a <a href="http://www.monash.com/uploads/Investigative-analytics-slides-late-draft-March-8-2011.ppt">late-draft slide deck</a> for same. It&#8217;s pretty concise; the deck is in no way a substitute for the webinar itself, which I urge you to attend (or catch a recording of after-the-fact). But the slides &#8212; and in a couple of cases comments below them &#8212; may add some value to the <a href="http://www.dbms2.com/2011/03/03/investigative-analytics/">definition of investigative analytics</a> I recently posted.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/03/08/investigative-analytics-slide-deck-and-march-10-webinar/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Upcoming webinar on investigative analytics</title>
		<link>http://www.dbms2.com/2011/02/12/upcoming-webinar-on-investigative-analytics/</link>
		<comments>http://www.dbms2.com/2011/02/12/upcoming-webinar-on-investigative-analytics/#comments</comments>
		<pubDate>Sat, 12 Feb 2011 12:32:35 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Tableau Software]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=3846</guid>
		<description><![CDATA[I recently coined the phrase investigative analytics to conflate Statistics, data mining, machine learning, and/or predictive analytics.  The more research-oriented aspects of business intelligence tools: Ad-hoc query. Drilldown. Most things done by BI-using “business analysts” Most things within BI called “data exploration.” Analogous technologies as applied to non-tabular data types such as text or graph. [...]]]></description>
			<content:encoded><![CDATA[<p>I recently coined the phrase <a href="http://www.dbms2.com/2011/03/03/investigative-analytics/">investigative analytics</a> to conflate</p>
<blockquote>
<ul>
<li>Statistics, data mining, machine learning, and/or predictive analytics.</li>
<li>The more research-oriented aspects of business intelligence tools:
<ul>
<li>Ad-hoc query.</li>
<li>Drilldown.</li>
<li>Most things done by BI-using “business analysts.”</li>
<li>Most things within BI called “data exploration.”</li>
</ul>
</li>
<li>Analogous technologies as applied to non-tabular data types such as <a href="http://www.texttechnologies.com/2010/12/01/state-of-the-art-text-analytics-mining-applications/">text</a> or <a href="http://www.dbms2.com/2009/08/21/social-network-analysis-aka-relationship-analytics/">graph</a>.</li>
</ul>
</blockquote>
<p>This will be the basis for my part of <a href="http://www.asterdata.com/wc_110310-Monash-data-ninja/index.php">a webcast on March 10 at 11 am Pacific/2 pm Eastern time</a>. The other main part of the webcast will be a demo by the webcast&#8217;s joint sponsors Aster Data and Tableau Software.</p>
<p>Some of Aster&#8217;s verbiage in describing and titling the webinar is so hyperbolic that I do not want to give the impression of endorsing it. But I am very hopeful that the webinar itself will be interesting and informative, and will point people at least somewhat in the direction of the benefits Aster is claiming.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2011/02/12/upcoming-webinar-on-investigative-analytics/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>My talk this morning</title>
		<link>http://www.dbms2.com/2010/06/23/my-talk-this-morning/</link>
		<comments>http://www.dbms2.com/2010/06/23/my-talk-this-morning/#comments</comments>
		<pubDate>Wed, 23 Jun 2010 11:33:18 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2378</guid>
		<description><![CDATA[Netezza&#8217;s Enzee Universe conference is now almost over, and I still haven&#8217;t figured out what my gig as &#8220;conference blogger&#8221; entails. More precisely, I&#8217;m operating from our unspoken fallback plan, namely &#8220;If all else fails, do what you&#8217;d do anyway, but do more of it.&#8221; For me to live up to that, all Netezza had [...]]]></description>
			<content:encoded><![CDATA[<p>Netezza&#8217;s Enzee Universe conference is now almost over, and I still haven&#8217;t figured out what my gig as <a href="http://www.dbms2.com/2010/06/21/notes-on-a-spate-of-netezza-related-blog-posts/">&#8220;conference blogger&#8221;</a> entails. More precisely, I&#8217;m operating from our unspoken fallback plan, namely &#8220;If all else fails, do what you&#8217;d do anyway, but do more of it.&#8221; For me to live up to that, all Netezza had to do was find interesting things to write about &#8212; and as far as I&#8217;m concerned, they already did that last Thursday in spades; the five interesting meetings they set up for me with users and partners on Tuesday were just gravy.</p>
<p>Another part of the deal was that I&#8217;d give a talk this morning at 9:30 am. And when I give talks, I like to put up posts that cover whatever material I haven&#8217;t written up before, while also offering the talk&#8217;s listeners convenient links to material I have already covered at length.</p>
<p><span id="more-2378"></span>So anyway:</p>
<p>As I&#8217;ve been doing all year, I plan to start the talk with the subject of <strong>liberty and privacy.</strong> My most recent <a href="http://www.dbms2.com/2010/04/04/privacy-liberty-continued/">overview post on privacy and liberty</a> has a bunch of links to what I and other people have said before.</p>
<p>This year&#8217;s Enzee Universe keynote speaker was Stephen Baker, author of <em>Numerati. </em>His talk, the only one all week I&#8217;ve attended in its entirety (I do intend for mine to be the second one, however <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  ) reminded me that some of my ideas had been inspired by his book, specifically the part about <a href="http://www.dbms2.com/2010/04/20/big-brother-watching-our-parents/">sensors in our elderly relatives&#8217; homes tracking every movement</a>, for legitimate reasons of health care and physical safety.</p>
<p>One thing I&#8217;m reminded of when talking with users is that they tend to be a bit focused on their projects or areas, and almost never have the opportunity to consider the full range of possibilities open to them. So I&#8217;ve put in two slides to raise consciousness on the point.</p>
<ul>
<li>$ per user
<ul>
<li>$1000s or maybe $10,000s (perhaps a small team of analysts looking at Big Data)</li>
<li>$100s or maybe $1000s (perhaps conventional BI)</li>
<li>On the order of $.10 or $1 (<a href="http://www.dbms2.com/2010/05/15/stakeholder-facing-analytics/">stakeholder-facing analytics</a>)</li>
</ul>
</li>
<li>Three benefits of better price/performance
<ul>
<li>Do the same thing, cheaper</li>
<li>Do the same thing, better</li>
<li>Do something different</li>
</ul>
</li>
</ul>
<p>Then I put together a list of &#8220;cool technologies in analytics&#8221; people might want to think about, including:</p>
<ul>
<li>Solid-state memory (there&#8217;s a whole section here about that; see the sidebar)</li>
<li><a href="http://www.dbms2.com/2010/04/12/greenplumchorus/">Data mart spin-out</a></li>
<li>Exploratory BI (e.g., <a href="http://www.dbms2.com/2010/06/12/the-underlying-technology-of-qlikview/">QlikView</a>)</li>
<li>Advanced analytics (platforms)
<ul>
<li>DBMS-centric (more on that coming soon, but meanwhile <a href="http://www.dbms2.com/2010/02/22/netezza-twinfin/">February&#8217;s posts</a> can be a placeholder)</li>
<li>MapReduce-centric (there&#8217;s a whole section here on that too)</li>
</ul>
</li>
<li>Advanced analytics (UIs and algorithms)
<ul>
<li>SQL tasting (I just coined that to talk about the useful idea of getting fast, partial results on long queries)</li>
<li>Stats/predictive (I&#8217;ve really got to build a blog category for that)</li>
<li><a href="http://www.dbms2.com/2010/06/19/objectivity-infinite-graph/">Graph</a></li>
<li>Matrix/optimization</li>
</ul>
</li>
</ul>
<p>And finally, I listed three &#8220;aggravating analytic challenges&#8221; in areas where I&#8217;m disappointed with the progress of and/or prospects for technology, including:</p>
<ul>
<li><a href="http://www.monashreport.com/2006/10/05/dashboard-business-intelligence-bi-segmentation/">KPI management</a></li>
<li>Text/tabular integration</li>
<li><a href="http://www.dbms2.com/2010/06/08/profile-of-revealed-preferences/">Profile of Revealed Preferences aka “Social graph”</a></li>
</ul>
<p><em><strong>Posts on Netezza&#8217;s announcements around the time of Enzee Universe</strong></em></p>
<ul>
<li><a href="http://www.dbms2.com/2010/06/21/netezza-database-software-technology-overview/">A long discussion of Netezza’s technology, focusing on the database parts</a></li>
<li><a href="http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/">A discussion of Netezza’s and IBM’s compression strategies</a></li>
<li><a href="http://www.dbms2.com/2010/06/21/netezza-silicon-balance/">Notes on how Netezza balances its silicon and uses its FPGAs</a></li>
<li><a href="http://www.dbms2.com/2010/06/21/data-warehouse-load-latency/">A quickie on data warehouse loading latency</a></li>
<li><a href="http://www.dbms2.com/2010/06/26/netezza-migrator/">How Netezza Migrator works</a></li>
<li><a href="http://www.dbms2.com/2010/06/25/flash-is-coming-well/">Netezza&#8217;s strategy for RAM and Flash</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/23/my-talk-this-morning/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Notes on a spate of Netezza-related blog posts</title>
		<link>http://www.dbms2.com/2010/06/21/notes-on-a-spate-of-netezza-related-blog-posts/</link>
		<comments>http://www.dbms2.com/2010/06/21/notes-on-a-spate-of-netezza-related-blog-posts/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 11:55:33 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Data warehouse appliances]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2318</guid>
		<description><![CDATA[Fearing that last year&#8217;s tight travel budgets would hamper attendance, Netezza – like a number of other vendors – decided to forgo a traditional user conference. Instead, it took its Enzee Universe show on the road, essentially spreading the conference across eight cities. I was asked to keynote six of the installments. After the first [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">Fearing that last year&#8217;s tight travel budgets would hamper attendance, Netezza – like a number of other vendors – decided to forgo a traditional user conference. Instead, it took its Enzee Universe show on the road, essentially spreading the conference across eight cities. I was asked to <a href="http://www.dbms2.com/2009/07/30/netezza-enzee-universe/">keynote</a> six of the installments.</p>
<p style="margin-bottom: 0in;">After the first one, Netezza Marketing VP Tim Young took me aside for two pieces of constructive criticism. The surprising one* was that he felt I had been INSUFFICIENTLY critical of Netezza. Since then, every other conversation we&#8217;ve had about content creation has also featured ringing reassurances that Tim truly wants independent, non-pandering work.</p>
<p style="margin-bottom: 0in;"><em>*The unsurprising one was that I&#8217;d rushed. Well, duh. After months of telling me I had a 1 hour slot, Netezza cut me to ½ hour a few days beforehand. And my talk had been designed to be high-speed even in the longer time slot … </em></p>
<p style="margin-bottom: 0in;">As a result, I accepted a subsequent gig from Netezza that I would barely consider from most other vendors. Namely, for this year&#8217;s Enzee Universe – <a href="http://www.netezza.com/userconference/agenda.html">June 21-23, aka Monday-Wednesday of this week, at the Westin Waterfront Hotel in Boston</a> – I would do some contemporaneous blogging. The parameters we agreed on included:  <span id="more-2318"></span></p>
<ul>
<li>I would just blog here on <a href="http://www.dbms2.com">DBMS2</a>, with Netezza allowed to reuse posts in their entirety on its site(s).</li>
<li>I also would give a talk on the conference&#8217;s last day.</li>
<li>I wouldn&#8217;t say much about conference sessions, because:
<ul>
<li>I&#8217;m not a session-attending kind of guy. (I wasn&#8217;t particularly good at sitting still in class in 8<sup>th</sup> grade. I haven&#8217;t gotten much better since. And <a href="http://www.strategicmessaging.com/powerpoints/2008/02/02/">I have a huge aversion to other people&#8217;s uninterruptible PowerPoints</a>.)</li>
<li>I think Netezza&#8217;s sessions are just as hype-filled as anybody else&#8217;s. (Much as I enjoyed traveling around the world with Netezza last year, it was painful hearing Jim Baum claim in city after city that Netezza boasts a 50X performance advantage vs. the competition.)</li>
</ul>
</li>
<li>Rather, I&#8217;d base things much more on individual conversations and meetings.</li>
<li>Because I didn&#8217;t see how turnaround time could work otherwise, we&#8217;d have some of those meetings beforehand, and others early in the conference.</li>
</ul>
<p style="margin-bottom: 0in;">That last bit didn&#8217;t entirely work out; for the second consecutive year Netezza pulled a surprise schedule switch a few days beforehand. But:</p>
<ul>
<li>I did have extensive, fascinating meetings at Netezza&#8217;s offices on Thursday, which were the fodder for multiple posts going up today.</li>
<li>I have a nice meeting schedule set up for Tuesday.</li>
<li>There should be plenty of opportunity for hallway and exhibit-floor conversation as the conference progresses.</li>
<li>I even have my own private conference room, with a lovely name (the “Paine Room”).</li>
</ul>
<p style="margin-bottom: 0in;">So far as I know, the rest of the plan is still operative.</p>
<p style="margin-bottom: 0in;">Posts already written as I draft this one include:</p>
<ul>
<li><a href="http://www.dbms2.com/2010/06/21/netezza-database-software-technology-overview/">A long discussion of Netezza&#8217;s technology, focusing on the database parts</a></li>
<li><a href="http://www.dbms2.com/2010/06/21/netezza-ibm-db2-compression/">A discussion of Netezza&#8217;s and IBM&#8217;s compression strategies</a></li>
<li><a href="http://www.dbms2.com/2010/06/21/netezza-silicon-balance/">Notes on how Netezza balances its silicon and uses its FPGAs</a></li>
<li><a href="http://www.dbms2.com/2010/06/21/data-warehouse-load-latency/">A quickie on data warehouse loading latency</a></li>
</ul>
<p style="margin-bottom: 0in;">I still need to write one focusing on Netezza&#8217;s advanced analytics strategy, and plan to edit in a link to it when it&#8217;s up.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/06/21/notes-on-a-spate-of-netezza-related-blog-posts/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Notes and cautions about new analytic technology</title>
		<link>http://www.dbms2.com/2010/05/07/implications-onew-analytic-technology/</link>
		<comments>http://www.dbms2.com/2010/05/07/implications-onew-analytic-technology/#comments</comments>
		<pubDate>Sat, 08 May 2010 03:05:25 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Predictive modeling and advanced analytics]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=2070</guid>
		<description><![CDATA[As previously noted, I headlined Aster&#8217;s Big Data Summit in Washington, DC last Thursday. More than others, that talk did reuse material I&#8217;d presented before.  I promised the audience that when I got back I&#8217;d put up a blog post linking to supporting material for the talk. Part of the time, I talked about things [...]]]></description>
			<content:encoded><![CDATA[<p>As <a href="http://www.dbms2.com/2010/04/18/washington-dc-may-2010-big-data-summi/">previously noted</a>, I headlined Aster&#8217;s Big Data Summit in Washington, DC last Thursday. More than others, that talk did reuse material I&#8217;d presented before.  I promised the audience that when I got back I&#8217;d put up a blog post linking to supporting material for the talk.</p>
<p>Part of the time, I talked about things I&#8217;ve written about before. For example:<span id="more-2070"></span></p>
<ul>
<li><a href="http://www.dbms2.com/2010/04/04/privacy-liberty-continued/">Liberty and privacy</a>. That&#8217;s a link to my most recent overview post on <strong>the liberty and privacy implications of modern analytic technology. </strong>The notes I spoke from were actually posted previously, after I spoke from them at the <a href="http://www.dbms2.com/2010/01/31/data-based-snooping-threat-libert/">New England Database Summit</a> at MIT in January. I&#8217;m gratified that, at both events, I got very positive feedback on liberty and privacy issues.</li>
<li><a href="http://www.dbms2.com/2009/09/10/analytic-speed-latency/">Pick the right latency</a>. That&#8217;s a link to a post (also based on a previous talk, in this case the one I traveled around the world giving for Netezza last September) in which I laid out<strong> the different levels of speed and latency</strong> an analytic application might require. I counted 9 orders of magnitude between the slowest and fastest, which is pretty much the difference between the speed of a turtle (at least a small, slow one) and the speed of light.</li>
<li>On the more general point of <strong>operationalizing analytics,</strong> my best or at least most detailed writing to date may be in <a href="http://www.monash.com/3GABP.pdf">a 2004 whitepaper on analytic business processes</a> which, sadly, is still fairly futuristic today.</li>
<li>I offered a few ways to think about <strong>the different kinds of data that go into data warehouses. </strong>
<ul>
<li>Some of those were outlined in a post last January about <a href="http://www.dbms2.com/2010/01/17/three-broad-categories-of-data/">three broad categories of data</a>, distinguishing among <strong>human/tabular, human/non-tabular,</strong> and <strong>machine-generated</strong> data.</li>
<li>That was a kind of sequel to a post last December about <a href="http://www.dbms2.com/2009/12/07/data-warehouse-volume-growth/">three broad categories of data warehouse growth drivers</a>, namely <strong>more of the same</strong> vs. <strong>more detail</strong> vs. <strong>wholly new kinds of data.</strong></li>
<li>I gave some examples of <a href="http://www.monashreport.com/2006/10/04/data-mining-requires-data/">creating new data to analyze</a> back in 2005 and 2006.</li>
</ul>
</li>
<li>Comments I made at various points were foreshadowed in a post on <a href="http://www.dbms2.com/2009/05/30/reinventing-business-intelligence/">reinventing business intelligence</a>.</li>
</ul>
<p>I also raised a few points that I&#8217;m not finding good links for. I&#8217;ll try to cover those in future blog posts.</p>
<p><em><strong>Related link</strong></em></p>
<ul>
<li>Notes for my <a href="http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/">Boston Big Data Summit</a> (no relation to Aster Data&#8217;s Big Data Summit series) talk in October, 2009</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/05/07/implications-onew-analytic-technology/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>I&#8217;ll be speaking in Washington, DC on May 6</title>
		<link>http://www.dbms2.com/2010/04/18/washington-dc-may-2010-big-data-summi/</link>
		<comments>http://www.dbms2.com/2010/04/18/washington-dc-may-2010-big-data-summi/#comments</comments>
		<pubDate>Sun, 18 Apr 2010 21:48:15 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Archiving and information preservation]]></category>
		<category><![CDATA[Aster Data]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Liberty and privacy]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1937</guid>
		<description><![CDATA[My clients at Aster Data are putting on a sequence of conferences called &#8220;Big Data Summit(s)&#8221;, and wanted me to keynote one. I agreed to the one in Washington, DC, on May 6, on the condition that I would be allowed to start with the same liberty and privacy themes I started my New England [...]]]></description>
			<content:encoded><![CDATA[<p>My clients at Aster Data are putting on a sequence of conferences called &#8220;Big Data Summit(s)&#8221;, and wanted me to keynote one. I agreed to the one <a href="http://bigdatasummit.com/2010/dc/">in Washington, DC, on May 6</a>, on the condition that I would be allowed to start with the same liberty and privacy themes I started my <a href="http://www.dbms2.com/2010/01/31/data-based-snooping-threat-libert/">New England Database Summit keynote</a> with. Since I already knew Aster to be one of the multiple companies in this industry that is responsibly concerned about the liberty and privacy threats we&#8217;re all helping cause, I expected them to agree to that condition immediately, and indeed they did.</p>
<p>On a rough-draft basis, my talk concept is:</p>
<p style="margin-bottom: 0in;"><strong>Implications of New Analytic Technology in four areas:</strong></p>
<ul>
<li><strong>Liberty &amp; privacy</strong></li>
<li><strong>Data acquisition &amp; retention</strong></li>
<li><strong>Data exploration</strong></li>
<li><strong>Operationalized analytics</strong></li>
</ul>
<p>I haven&#8217;t done any work yet on the talk besides coming up with that snippet, and probably won&#8217;t until the week before I give it. Suggestions are welcome.</p>
<p>If anybody actually has a link to a clear discussion of legislative and regulatory data retention requirements, that would be cool. I know they&#8217;ve exploded, but I don&#8217;t have the details.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/18/washington-dc-may-2010-big-data-summi/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Liberty and privacy, once again</title>
		<link>http://www.dbms2.com/2010/04/04/privacy-liberty-continued/</link>
		<comments>http://www.dbms2.com/2010/04/04/privacy-liberty-continued/#comments</comments>
		<pubDate>Sun, 04 Apr 2010 04:49:54 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Liberty and privacy]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1820</guid>
		<description><![CDATA[I&#8217;ve long argued three points: It is inevitable* that governments and other constituencies will obtain huge amounts of information, which can be used to drastically restrict everybody&#8217;s privacy and freedom. To protect against this grave threat, multiple layers of defense are needed, technical and legal/regulatory/social/political alike. One particular layer is getting insufficient attention, namely restrictions [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve long argued three points:</p>
<ul>
<li>It is inevitable* that governments and other constituencies will obtain huge amounts of information, which can be used to drastically restrict everybody&#8217;s privacy and freedom.</li>
<li>To protect against this grave threat, multiple layers of defense are needed, technical and legal/regulatory/social/political alike.</li>
<li>One particular layer is getting insufficient attention, namely<strong> restrictions upon the use</strong> (as opposed to the acquisition or retention) <strong>of data</strong>.</li>
</ul>
<p><em>*And indeed in many ways even desirable</em></p>
<p>I surprised people by leading with the liberty/privacy subject at my <a href="http://www.dbms2.com/2010/01/31/data-based-snooping-threat-libert/">New England Database Summit keynote</a>; considerable discussion ensued, largely supportive. I hope for a similar outcome when I keynote the Aster Big Data Summit in Washington, DC in May. And I expect to do even more to advance the liberty/privacy discussion as 2010 unfolds.</p>
<p>Fortunately, I&#8217;m not the only one thinking or talking about these liberty/privacy issues. <span id="more-1820"></span>A group of Very Big Companies (Google, eBay, Microsoft, AOL, AT&amp;T, Intel, et al.) have backed something called <a href="http://www.digitaldueprocess.org/index.cfm">Digital Due Process</a>, to restore the Fourth Amendment&#8217;s teeth in the electronic age. (The Fourth Amendment to the US Constitution is the one that restricts and governs law enforcement activities in the areas of &#8220;search and seizure&#8221;.) That fits quite nicely with <a href="http://www.networkworld.com/community/node/35626">the privacy/liberty point I&#8217;m most emphasizing</a>, to wit:</p>
<blockquote><p><strong>I think that regulating use actually suffices.</strong> Look again at the <a rel="nofollow" href="http://www.usconstitution.net/const.html#Amends">Bill of Rights</a>. On a first reading, it seems that the Fourth and Fifth Amendments prevent the government from getting certain kinds of information. It can&#8217;t look inside our houses, and it can&#8217;t make us answer questions &#8230; wrong! Actually, the courts can compel you to testify on any subject they choose; if you refuse to answer, you can go to jail for contempt of court. But they can only compel you <em>if</em> you are given immunity, so that you can&#8217;t be convicted of the crimes the testimony reveals.  I.e., <strong>the government can <em>get</em> the information it wants; it just can&#8217;t <em>use</em> that information to harm you. </strong>(Fourth Amendment rights are a little murkier; if the government finagles its way into your house for any reasons, there are still actively litigated questions as to what kinds of information it can use that it may find there.)</p></blockquote>
<p>A week ago, Michael Arrington posted the social side of the same coin, saying <a href="http://www.texttechnologies.com/2010/03/28/online-reputation/">It&#8217;s Time To Overlook Our Indiscretions</a>, by which he seemed to mean: In a world where all kinds of embarrassing information will come out, <strong>we should not (always) penalize people for the existence of mildly embarrassing information</strong> about them. I am in vigorous agreement with that.</p>
<p>Progress is being made. But we&#8217;re just at the beginning of what will be a long, difficult, and hugely important process. The stakes, without exaggeration, are human freedom, on a national and indeed global scale.</p>
<p><em>Edit: <a href="http://www.bigdatanews.com/content/your-data-rules-world-part-3">Scott Yara of Greenplum</a> says it simply (emphasis mine):</em></p>
<blockquote><p><em><strong>We need laws to keep sensitive stuff</strong> such as credit records and medical histories <strong>from getting into the wrong hands.</strong> But even with such laws, it&#8217;s inevitable that information will leak out. So <strong>we also need rules about how personal information</strong> that might come into the hands of employers, police, lenders, and others <strong>can be used.</strong></em></p></blockquote>
<p><em><strong>Related links</strong></em></p>
<ul>
<li>Declan McCullagh wrote up the <a href="http://news.cnet.com/8301-13578_3-20001463-38.html">Digital Due Process</a> initiative</li>
<li>India is <a href="http://yro.slashdot.org/story/10/04/02/0050221/Indian-Census-To-Collect-Fingerprints-Photos">fingerprinting and photographing</a> its whole population</li>
<li>Finland, notwithstanding that it is a large forest products producer, is demonstrating that an <a href="http://tech.slashdot.org/story/10/04/02/2031254/Finland-To-Try-Scanning-Snail-MailDavid/Goliath">almost entirely paperless</a> society is realistic.</li>
<li><a href="http://research.microsoft.com/en-us/projects/DatabasePrivacy/">Microsoft Research</a> is trying to separate data-for-analysis from data-with-enough-detail-to-use-against-you</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/04/04/privacy-liberty-continued/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Open issues in database and analytic technology</title>
		<link>http://www.dbms2.com/2010/02/01/open-issues-in-database-and-analytic-technology/</link>
		<comments>http://www.dbms2.com/2010/02/01/open-issues-in-database-and-analytic-technology/#comments</comments>
		<pubDate>Mon, 01 Feb 2010 22:04:31 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[RDF and graphs]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Theory and architecture]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1507</guid>
		<description><![CDATA[The last part of my New England Database Summit talk was on open issues in database and analytic technology. This was closely intertwined with the previous section, and also relied on a lot that I&#8217;ve posted here. So I&#8217;ll just put up a few notes on that part, with lots of linkage to prior discussion [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">The last part of my <a href="http://www.dbms2.com/2009/11/25/new-england-database-summit-january-28-2010/">New England Database Summit</a> talk was on open issues in database and analytic technology. This was closely intertwined with the <a href="http://www.dbms2.com/2010/01/31/trends-database-aanalytic-technology/">previous section</a>, and also relied on a lot that I&#8217;ve posted here. So I&#8217;ll just put up a few notes on that part, with lots of linkage to prior discussion of the same points.<span id="more-1507"></span></p>
<ul>
<li>The most important issue in 	database and analytic technology, in my opinion, isn&#8217;t technological 	at all – rather, it&#8217;s the legal and political steps needed to <a href="http://www.dbms2.com/2010/01/31/data-based-snooping-threat-libert/"> preserve liberty</a> in the face of advancing, intrusive 	technology.</li>
<li>Another important issue for 	society – and this one does involve a lot of technology – is 	scientific number crunching. In particular, <a href="http://www.dbms2.com/2009/10/03/issues-in-scientific-data-management/">database technology for 	scientific computing</a> needs to be developed much further. I&#8217;ll have 	more to say on all this soon.</li>
<li>More generally, technology needs 	to keep advancing for parallel analytics. Fortunately, it is. Watch 	this space over the next few weeks.</li>
<li>Oracle has said, in effect, that <a href="http://www.dbms2.com/2010/01/22/oracle-database-hardware-strategy/"> its most important technological challenge of the decade</a> is getting 	<a href="http://www.dbms2.com/2010/01/31/flash-pcmsolid-state-memory-disk/">solid-state memory</a> right. I agree.</li>
<li>Data volumes will keep going up, 	up, up. Technology needs to keep evolving accordingly. Much of what 	I write is on that subject.</li>
<li>Data needs to be processed and analyzed at <a href="http://www.dbms2.com/2009/09/10/analytic-speed-latency/">very 	different latencies</a>. And there&#8217;s much further to go in integrating 	disparate latencies.</li>
<li>Analytic database management in 	the cloud hasn&#8217;t been solved yet, especially for Big Data. Among the 	reasons are the difficulty of moving data into the cloud (unless it 	originated there), the slowness of moving it from node to node in 	shared-nothing architectures (which reduces the elasticity benefit), 	and above all the long and unpredictable latencies of interprocessor 	communication while queries are running (a key subject of discussion 	at the <a href="http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/">Boston Big Data Summit</a>).</li>
<li>Better business intelligence user interfaces are increasingly available. I&#8217;m thinking particularly of approaches with buzzwords like <a href="http://www.dbms2.com/2008/08/04/qliktech-qlikview-update/">visualization/interactive exploration</a> or <a href="http://www.texttechnologies.com/2007/08/03/the-case-for-inxight-awareness-server/">faceted</a>. But they aren&#8217;t well integrated into the overall analytic stack, as big BI vendors are trailing the smaller ones in this regard. (Part of the problem relates to my previous point.)</li>
<li>Application development over text 	search isn&#8217;t in the same league as application development over 	relational DBMS. The choices are mainly XML (e.g., <a href="http://www.texttechnologies.com/2008/04/29/mark-logic-viewed-as-a-different-kind-of-text-search-technology-vendor/">MarkLogic</a>), SQL 	for text integrated into RDBMS (limited by the weakness of those 	integrations), and something like <a href="http://www.texttechnologies.com/2008/09/20/attivio-update/">Attivio&#8217;s Java SDK</a>. There&#8217;s a 	major conceptual barrier in building those apps, namely the 	unpredictability of query results. Still, it should be possible to 	do better.</li>
<li>Similarly, text analytics and 	conventional analytics exist well side by side. They can even be in 	the same database and/or dashboard, although in practice that is 	limited by the strong <a href="http://www.texttechnologies.com/2008/10/24/attensity-update-2/">SaaS focus of text mining vendors and users</a>. But analytic 	integration of them is really hard. Linguistic imprecision is, in my 	opinion, only the #2 reason for this difficulty. The #1 reason is 	that trends detected by text analytics are much less precise than 	trends on tabular data – e.g., a 50% increase in a certain kind of 	complaint may be no more significant than a 5% change in a revenue 	variable.</li>
<li>I&#8217;m increasingly persuaded that <a href="http://www.dbms2.com/2009/08/21/social-network-analysis-aka-relationship-analytics/"> graph analytics</a> can be handled without a graph-centric data model. 	But right now, it isn&#8217;t being handled well at all. Lots more needs 	to be done – although when it is, it will just exacerbate the 	privacy/liberty dangers that so concern me.</li>
</ul>
<p><em><strong>Other posts based on my January, 2010 New England Database Summit keynote address</strong></em></p>
<ul>
<li><a title="Data-based snooping — a huge threat to liberty that we’re all helping make worse" href="../2010/01/31/data-based-snooping-threat-libert/">Data-based snooping — a huge threat to liberty that we’re all helping make worse</a></li>
<li><a title="Flash, other solid-state memory, and disk" href="../2010/01/31/flash-pcmsolid-state-memory-disk/">Flash, other solid-state memory, and disk</a></li>
<li><a title="Interesting trends in database and analytic technology" href="../2010/01/31/trends-database-aanalytic-technology/">Interesting trends in database and analytic technology</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/02/01/open-issues-in-database-and-analytic-technology/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Interesting trends in database and analytic technology</title>
		<link>http://www.dbms2.com/2010/01/31/trends-database-aanalytic-technology/</link>
		<comments>http://www.dbms2.com/2010/01/31/trends-database-aanalytic-technology/#comments</comments>
		<pubDate>Mon, 01 Feb 2010 02:11:17 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data models and architecture]]></category>
		<category><![CDATA[Data warehousing]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Memory-centric data management]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Parallelization]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Solid-state memory]]></category>
		<category><![CDATA[Storage]]></category>

		<guid isPermaLink="false">http://www.dbms2.com/?p=1492</guid>
		<description><![CDATA[My project for the day is blogging based on my “Database and analytic technology: State of the union” talk of a few days ago. (I called it that because of when it was given, because it mixed prescriptive and descriptive elements, and because I wanted to call attention to the fact that I cover the [...]]]></description>
			<content:encoded><![CDATA[<p>My project for the day is blogging based on my “<a href="http://www.dbms2.com/2009/11/25/new-england-database-summit-january-28-2010/">Database and analytic technology: State of the union</a>” talk of a few days ago. (I called it that because of when it was given, because it mixed prescriptive and descriptive elements, and because I wanted to call attention to the fact that I cover the <em>union</em> of database and analytic technologies – the <em>intersection</em> of those two sectors is an area of particular focus, but is far from the whole of my coverage.)</p>
<p>One section covered recent/ongoing/near-future trends that I thought were particularly interesting, including:<span id="more-1492"></span></p>
<p><strong>Simpler database technology,</strong> by which I mean DBMS that are:</p>
<ul>
<li>Easier 	to administer than market-leading systems &#8230;</li>
<li>… even if at the cost of being special-purpose</li>
<li>E.g.,
<ul>
<li>MySQL and older mid-tier RDBMS such as Progress</li>
<li>Many analytic DBMS and appliances, most notably Netezza&#8217;s</li>
</ul>
</li>
</ul>
<p>For general purpose or OLTP uses, I&#8217;m not a big fan of MySQL (not enough progress in making it industrial-strength), PostgreSQL (no good company behind it – I&#8217;m a non-fan of EnterpriseDB), or Ingres (open source or not, it&#8217;s an antiquated system that hasn&#8217;t been invested in as much as Oracle, DB2 or SQL Server).</p>
<p>But I get the impression there are a lot of contenders among small startups, featuring very new architectures for OLTP or general-purpose database management. VoltDB comes to mind. NimbusDB is finally within range of getting funded. Dan Weinreb told me Friday he knows of a bunch of others as well. And that&#8217;s all before we even get into the <a href="http://www.dbms2.com/2009/12/12/legit-nosql-key-value-store/">NoSQL</a> kind of alternative.</p>
<p><strong>Flexible storage architectures.</strong> That&#8217;s starting out with an emphasis on hybrid columnar, as in the examples of <a href="http://www.dbms2.com/2009/08/04/pax-analytica-row-and-column-stores-begin-to-come-together/">Vertica</a> and <a href="http://www.dbms2.com/2009/10/14/greenplum-hybrid-columnar/">Greenplum</a>. Oracle (to whom I&#8217;m under no NDA obligation) and other vendors (to whom I am) are going that way as well.</p>
<p><strong>Multi-tier database architectures,</strong> by which I mean at least two things:</p>
<ul>
<li>The database tier/server tier split of Exadata</li>
<li>Hybrid RAM/disk architectures, examples of which include
<ul>
<li>Vertica&#8217;s RAM-based write-optimized store</li>
<li><a href="http://www.dbms2.com/2009/10/18/introduction-to-sensage/">Sensage&#8217;s CEP-in-the-DBMS</a></li>
<li>This in-memory analytics stuff we keep hearing about from the BI vendors</li>
<li>Any true in-memory/disk hybrid, such as the regrettably sidelined <a href="http://www.dbms2.com/2007/12/21/ibm-acquires-soliddb/">solidDB</a></li>
<li>Smart thinking by numerous DBMS vendors about optimizing the use of RAM and/or Level 2 cache</li>
</ul>
</li>
</ul>
<p>Netezza is particularly interesting to watch in this regard because it:</p>
<ul>
<li>Had a pretty strict storage/other processing split in prior product generations and &#8230;</li>
<li>… <a href="http://www.dbms2.com/2009/07/30/netezza-new-product-family/">ditched that in its latest generation</a> …</li>
<li>… which however is focused on optimizing the use of RAM cache</li>
</ul>
<p>Also noteworthy is Petascan, the stealth-mode – and therefore harder to watch right now <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> – company I keep teasing about, which makes a strong case for carrying the database/storage tier split into the flash/solid-state memory technology generation. <a href="../2009/04/20/calpont-update-you-read-it-here-first/">Calpont</a> also has a server/storage tier split, but that&#8217;s of mainly theoretical interest unless and until Calpont actually ships an MPP version of <a href="../2009/11/07/calponts-infinidb/">InfiniDB</a>.</p>
<p><strong>Cheaper parts,</strong> which have of course been a huge trend for decades. <a href="../2010/01/31/flash-pcmsolid-state-memory-disk/">Solid-state memory</a> will soon conquer the world. Meanwhile, cheaper sensors drive that <a href="../2010/01/17/three-broad-categories-of-data/">machine-generated data</a> I keep talking about.</p>
<p>An ever-better understanding of <strong>scale-out technology,</strong> in several respects, including:</p>
<ul>
<li>Query, notably data movement for MPP DBMS</li>
<li>Update, especially minimalistic DBMS approaches, be they sharded MySQL or more NoSQLish</li>
<li>Number-crunching, especially via MapReduce and/or parallel analytic libraries integrated into DBMS</li>
</ul>
<p>Cool trends I touched on more briefly include:</p>
<ul>
<li>More data being available for analysis. This was a core theme of my <a href="http://www.dbms2.com/2009/07/30/netezza-enzee-universe/">Enzee Universe keynote speeches</a>; there are also some notes on it in my 	post based on my <a href="http://www.dbms2.com/2009/11/23/boston-big-data-summit-keynote-outline/">Boston Big Data Summit</a> talk.</li>
<li>More users being served by analytics. Ditto.</li>
<li>Data exploration/visualization, à la QlikView, Spotfire, or Tableau, and also the faceted stuff.</li>
<li>The democratization of data mining. But I&#8217;m not as sure of that one as of the others&#8230;</li>
</ul>
<p>One area I flat-out forgot to mention is <a href="http://www.dbms2.com/2009/06/08/the-future-of-data-marts/">easy data mart spin-out</a>.</p>
<p><em><strong>Other posts based on my January, 2010 New England Database Summit keynote address</strong></em></p>
<ul>
<li><a title="Data-based snooping — a huge threat to liberty that we’re all helping make worse" href="../2010/01/31/data-based-snooping-threat-libert/">Data-based snooping — a huge threat to liberty that we’re all helping make worse</a></li>
<li><a title="Flash, other solid-state memory, and disk" href="../2010/01/31/flash-pcmsolid-state-memory-disk/">Flash, other solid-state memory, and disk</a></li>
<li><a title="Open issues in database and analytic technology" href="../2010/02/01/open-issues-in-database-and-analytic-technology/">Open issues in database and analytic technology</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dbms2.com/2010/01/31/trends-database-aanalytic-technology/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>

