<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Notes on SciDB and scientific data management</title>
	<atom:link href="http://www.dbms2.com/2010/05/22/scidb-and-scientific-database-management/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dbms2.com/2010/05/22/scidb-and-scientific-database-management/</link>
	<description>Choices in data management and analysis</description>
	<lastBuildDate>Thu, 09 Feb 2012 16:57:09 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: Michael McIntire</title>
		<link>http://www.dbms2.com/2010/05/22/scidb-and-scientific-database-management/#comment-178412</link>
		<dc:creator>Michael McIntire</dc:creator>
		<pubDate>Sat, 31 Jul 2010 17:27:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=2178#comment-178412</guid>
		<description>What is driving the move to Hadoop and other non-relational platforms is the cost and culture of RDBMS implementations.


The culture problem is that data management systems force data to be transformed into a private, internal form, along with all the process that fronts it. Dimensional modeling is an example. Let&#039;s stop physicalizing dimensional designs just because that&#039;s what RDBMS products support.

On the cost front, the cost of generating data declines at roughly the inverse of Moore&#039;s law, not counting non-native per-transaction data growth (I&#039;m collecting more and more data about every event).

On the analytics side of this problem, getting a single metric takes many more scans of the full dataset, so the cost of this function grows non-linearly with the data size.

So: data costs are declining at the same rate as hardware costs. Data analytics costs are RISING per unit of data. Put quite simply, at the upper end of the data size spectrum, data owners cannot afford to buy data management software.</description>
		<content:encoded><![CDATA[<p>What is driving the move to Hadoop and other non-relational platforms is the cost and culture of RDBMS implementations.</p>
<p>The culture problem is that data management systems force data to be transformed into a private, internal form, along with all the process that fronts it. Dimensional modeling is an example. Let&#8217;s stop physicalizing dimensional designs just because that&#8217;s what RDBMS products support.</p>
<p>On the cost front, the cost of generating data declines at roughly the inverse of Moore&#8217;s law, not counting non-native per-transaction data growth (I&#8217;m collecting more and more data about every event).</p>
<p>On the analytics side of this problem, getting a single metric takes many more scans of the full dataset, so the cost of this function grows non-linearly with the data size.</p>
<p>So: data costs are declining at the same rate as hardware costs. Data analytics costs are RISING per unit of data. Put quite simply, at the upper end of the data size spectrum, data owners cannot afford to buy data management software.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.dbms2.com/2010/05/22/scidb-and-scientific-database-management/#comment-170644</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Thu, 03 Jun 2010 10:11:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=2178#comment-170644</guid>
		<description>Michael,

SciDB is for analytics; Cassandra is for OLTP, hold the &quot;T&quot;, which I called HVSP in http://www.dbms2.com/2010/03/13/the-naming-of-the-foo/.

Hadoop is a closer competitor, as are RDBMS, MapReduce-enabled or otherwise.</description>
		<content:encoded><![CDATA[<p>Michael,</p>
<p>SciDB is for analytics; Cassandra is for OLTP, hold the &#8220;T&#8221;, which I called HVSP in <a href="http://www.dbms2.com/2010/03/13/the-naming-of-the-foo/" rel="nofollow">http://www.dbms2.com/2010/03/13/the-naming-of-the-foo/</a>.</p>
<p>Hadoop is a closer competitor, as are RDBMS, MapReduce-enabled or otherwise.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael</title>
		<link>http://www.dbms2.com/2010/05/22/scidb-and-scientific-database-management/#comment-169953</link>
		<dc:creator>Michael</dc:creator>
		<pubDate>Wed, 26 May 2010 20:17:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.dbms2.com/?p=2178#comment-169953</guid>
		<description>Why has interest from &quot;web analytics users&quot; receded recently? Could this be due to the increased interest in Hadoop/Cassandra and similar products?</description>
		<content:encoded><![CDATA[<p>Why has interest from &#8220;web analytics users&#8221; receded recently? Could this be due to the increased interest in Hadoop/Cassandra and similar products?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

