Parallelization

Analysis of issues in parallel computing, especially parallelized database management. Related subjects include:

March 14, 2010

Toward a NoSQL taxonomy

I talked Friday with Dwight Merriman, founder of 10gen (the MongoDB company). He more or less convinced me of his definition of NoSQL systems, which in my adaptation goes:

NoSQL = HVSP (High Volume Simple Processing) without joins or explicit transactions

Within that realm, Dwight offered a two-part taxonomy of NoSQL systems, according to their data model and replication/sharding strategy. I’d be happier, however, with at least three parts to the taxonomy:

March 13, 2010

The Naming of the Foo

Let’s start from some reasonable premises. Read more

March 2, 2010

Cassandra and the NoSQL scalable OLTP argument

Todd Hoff put up a provocative post on High Scalability called MySQL and Memcached: End of an Era? The post itself focuses on observations like:

But in addition, he provides a lot of useful links, which DBMS-oriented folks such as myself might have previously overlooked. Read more

February 22, 2010

TwinFin(i) – Netezza’s version of a parallel analytic platform

Much like Aster Data did in Aster 4.0 and now Aster 4.5, Netezza is announcing a general parallel big data analytic platform strategy. It is called Netezza TwinFin(i), it is a chargeable option for the Netezza TwinFin appliance, and many announced details are on the vague side, with Netezza promising more clarity at or before its Enzee Universe conference in June. At a high level, the Aster and Netezza approaches compare/contrast as follows: Read more

February 11, 2010

More patent nonsense — Google MapReduce

Google recently received a patent for MapReduce. The first and most general claim is (formatting and emphasis mine): Read more

January 31, 2010

Interesting trends in database and analytic technology

My project for the day is blogging based on my “Database and analytic technology: State of the union” talk of a few days ago. (I called it that because of when it was given, because it mixed prescriptive and descriptive elements, and because I wanted to call attention to the fact that I cover the union of database and analytic technologies – the intersection of those two sectors is an area of particular focus, but is far from the whole of my coverage.)

One section covered recent/ongoing/near-future trends that I thought were particularly interesting, including: Read more

January 22, 2010

Two cornerstones of Oracle’s database hardware strategy

After several months of careful optimization, Oracle managed to pick the most inconvenient* day possible for me to get an Exadata update from Juan Loaiza. But the call itself was long and fascinating, with the two main takeaways being:

And by the way, Oracle doesn’t make its storage-tier software available to run on anything than Oracle-designed boxes.  At the moment, that means Exadata Versions 1 and 2. Since Exadata is by far Oracle’s best DBMS offering (at least in theory), that means Oracle’s best database offering only runs on specific Oracle-sold hardware platforms. Read more

December 30, 2009

Clearing up MapReduce confusion, yet again

I’m frustrated by a constant need — or at least urge 🙂 — to correct myths and errors about MapReduce. Let’s try one more time: Read more

December 2, 2009

Webinar on MapReduce for complex analytics (Thursday, December 3, 10 am and 2 pm Eastern)

The second in my two-webinar series for Aster Data will occur tomorrow, twice (both live), at 10 am and 2 pm Eastern time. The other presenters will be Jonathan Goldman, who was a Principal Scientist at LinkedIn but now has joined Aster himself, and Steve Wooledge of Aster (playing host). Key links are:

The main subjects of the webinar will be:

Arguably, aspects of data transformation fit into each of those three categories, which may help explain why data transformation has been so prominent among the early applications of MapReduce.

As you can see from Aster’s title for the webinar (which they picked while I was on vacation), at least their portion will be focused on customer analytics, e.g. web analytics.

November 23, 2009

Boston Big Data Summit keynote outline

Last month, Bob Zurek asked me to give a talk on “Big Data”, where “big” is anything from a few terabytes on up, then moderate a panel on cloud computing. We agreed that I could talk just from notes, without slides. So, since I have them typed up, I’m posting them below.

Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.