<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How big are the intelligence agencies&#8217; data warehouses?</title>
	<atom:link href="http://www.dbms2.com/2009/05/21/how-big-are-the-intelligence-agencies-data-warehouses/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2009/05/21/how-big-are-the-intelligence-agencies-data-warehouses/</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 16:57:09 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: Joe Harris</title>
		<link>http://www.dbms2.com/2009/05/21/how-big-are-the-intelligence-agencies-data-warehouses/#comment-122795</link>
		<dc:creator>Joe Harris</dc:creator>
		<pubDate>Mon, 25 May 2009 20:10:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=789#comment-122795</guid>
		<description>Pawel,

Think you may have missed the intended humour in my comment… ;-)


Joe</description>
		<content:encoded><![CDATA[<p>Pawel,</p>
<p>Think you may have missed the intended humour in my comment… <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>Joe</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pawel Plaszczak</title>
		<link>http://www.dbms2.com/2009/05/21/how-big-are-the-intelligence-agencies-data-warehouses/#comment-122680</link>
		<dc:creator>Pawel Plaszczak</dc:creator>
		<pubDate>Sun, 24 May 2009 20:17:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=789#comment-122680</guid>
		<description>Curt, 
heh, so you are hinting back at the 4 EB hypothesis. That order of magnitude is obviously wrong though, as you said already. By the way, the idea that someone would store all the world&#039;s voice communication, in the amount of 12 EB (or even any visible fraction of it), seems absurd too.
Before going into any speculation on what government agencies are up to, there is a quicker way to impose a realistic upper bound here: how much storage has been produced worldwide in the past year?
I tried to estimate this:
http://bigdatamatters.com/bigdatamatters/2009/05/he-with-the-most-data-wins-.html
I&#039;d be very interested to see someone&#039;s more precise estimate than my 50 EB. 
Another way to impose an upper bound here would be to divide the federal budget by the cost of 1 GB of storage... Either way, there&#039;s little chance that in 2009 a federal agency, or anyone else, would own exabytes.</description>
		<content:encoded><![CDATA[<p>Curt,<br />
heh, so you are hinting back at the 4 EB hypothesis. That order of magnitude is obviously wrong though, as you said already. By the way, the idea that someone would store all the world&#8217;s voice communication, in the amount of 12 EB (or even any visible fraction of it), seems absurd too.<br />
Before going into any speculation on what government agencies are up to, there is a quicker way to impose a realistic upper bound here: how much storage has been produced worldwide in the past year?<br />
I tried to estimate this:<br />
<a href="http://bigdatamatters.com/bigdatamatters/2009/05/he-with-the-most-data-wins-.html" rel="nofollow">http://bigdatamatters.com/bigdatamatters/2009/05/he-with-the-most-data-wins-.html</a><br />
I&#8217;d be very interested to see someone&#8217;s more precise estimate than my 50 EB.<br />
Another way to impose an upper bound here would be to divide the federal budget by the cost of 1 GB of storage&#8230; Either way, there&#8217;s little chance that in 2009 a federal agency, or anyone else, would own exabytes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2009/05/21/how-big-are-the-intelligence-agencies-data-warehouses/#comment-122337</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Thu, 21 May 2009 18:26:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=789#comment-122337</guid>
		<description>Jeff,

I hate it when that happens. Thanks for stopping by with the clarification!  

Now, let&#039;s check orders of magnitude. In your correction, you&#039;re hypothesizing 5 TB/minute -- not per day! That reduces the 800,000-day figure I was mocking to something under 2 years, which makes a lot more sense. :)

Thanks,

CAM</description>
		<content:encoded><![CDATA[<p>Jeff,</p>
<p>I hate it when that happens. Thanks for stopping by with the clarification!  </p>
<p>Now, let&#8217;s check orders of magnitude. In your correction, you&#8217;re hypothesizing 5 TB/minute &#8212; not per day! That reduces the 800,000-day figure I was mocking to something under 2 years, which makes a lot more sense. <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Thanks,</p>
<p>CAM</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Jonas</title>
		<link>http://www.dbms2.com/2009/05/21/how-big-are-the-intelligence-agencies-data-warehouses/#comment-122336</link>
		<dc:creator>Jeff Jonas</dc:creator>
		<pubDate>Thu, 21 May 2009 17:58:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=789#comment-122336</guid>
		<description>For the record, the writer of the original article got a number of facts twisted.  Actually, in this case he simply misquoted me.  With respect to use of the word Exabyte … I suggested this verbiage to correct for the errors: 

=======================
Jonas got to thinking what if they had 4 exabytes (EB) of data in the basement, and some have said they get new data in through the pipes at 5 TB a minute! &quot;You sit there and realize you don’t get to Friday night and run a batch job to answer the question what does all this mean?,&quot; he says. &quot;You could use all the computing power and energy on Earth and you wouldn&#039;t be able to do it.&quot;
=======================

Note ... I never said in a database, nor did I even imply in one system - in my mind probably lots of piles of many different kinds of data and in many different forms.  I did use the term Exabytes ... but more as an expression of gobs of data.  Point being ... batch periodic processing ain’t going to cut it.  I think the smartest and fastest a system can be involves sensemaking on streams.  I blogged a bit more about this here: http://jeffjonas.typepad.com/jeff_jonas/2006/08/accumulating_co.html

Unfortunately for me, there were even more problematic discrepancies between what I said and the story.  I hate it when that happens.</description>
		<content:encoded><![CDATA[<p>For the record, the writer of the original article got a number of facts twisted.  Actually, in this case he simply misquoted me.  With respect to use of the word Exabyte … I suggested this verbiage to correct for the errors: </p>
<p>=======================<br />
Jonas got to thinking what if they had 4 exabytes (EB) of data in the basement, and some have said they get new data in through the pipes at 5 TB a minute! &#8220;You sit there and realize you don’t get to Friday night and run a batch job to answer the question what does all this mean?,&#8221; he says. &#8220;You could use all the computing power and energy on Earth and you wouldn&#8217;t be able to do it.&#8221;<br />
=======================</p>
<p>Note &#8230; I never said in a database, nor did I even imply in one system &#8211; in my mind probably lots of piles of many different kinds of data and in many different forms.  I did use the term Exabytes &#8230; but more as an expression of gobs of data.  Point being &#8230; batch periodic processing ain’t going to cut it.  I think the smartest and fastest a system can be involves sensemaking on streams.  I blogged a bit more about this here: <a href="http://jeffjonas.typepad.com/jeff_jonas/2006/08/accumulating_co.html" rel="nofollow">http://jeffjonas.typepad.com/jeff_jonas/2006/08/accumulating_co.html</a></p>
<p>Unfortunately for me, there were even more problematic discrepancies between what I said and the story.  I hate it when that happens.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joe Harris</title>
		<link>http://www.dbms2.com/2009/05/21/how-big-are-the-intelligence-agencies-data-warehouses/#comment-122329</link>
		<dc:creator>Joe Harris</dc:creator>
		<pubDate>Thu, 21 May 2009 15:13:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=789#comment-122329</guid>
		<description>I agree that he must have meant &quot;petabytes&quot; which is, to paraphrase, still pretty freakin&#039; huge.

However I did some digging around (i.e. Googling…) and came across some figures for the total digitized size of global voice communication being 12 exabytes per year.

Given the NSA&#039;s penchant for listening in to everyone&#039;s phone calls I can well imagine a *data store* of this size existing somewhere. 

It&#039;s a bit of a stretch to call it a database but it&#039;s still one hell of a lot of data.

As an aside, I understand that the NSA are the only agency who can decrypt Skype calls but it takes them a long time to do it. 

If that&#039;s true they would need to first store them somewhere while they decide which ones to decrypt.

{ What&#039;s that black helicopter doing over my house? ;-) }

Joe</description>
		<content:encoded><![CDATA[<p>I agree that he must have meant &#8220;petabytes&#8221; which is, to paraphrase, still pretty freakin&#8217; huge.</p>
<p>However I did some digging around (i.e. Googling…) and came across some figures for the total digitized size of global voice communication being 12 exabytes per year.</p>
<p>Given the NSA&#8217;s penchant for listening in to everyone&#8217;s phone calls I can well imagine a *data store* of this size existing somewhere. </p>
<p>It&#8217;s a bit of a stretch to call it a database but it&#8217;s still one hell of a lot of data.</p>
<p>As an aside, I understand that the NSA are the only agency who can decrypt Skype calls but it takes them a long time to do it. </p>
<p>If that&#8217;s true they would need to first store them somewhere while they decide which ones to decrypt.</p>
<p>{ What&#8217;s that black helicopter doing over my house? <img src='http://www.dbms2.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  }</p>
<p>Joe</p>
]]></content:encoded>
	</item>
</channel>
</rss>

