Hadoop

Discussion of open source MapReduce implementation Hadoop. Related subjects include:

February 8, 2012

Comments on SAS

A reporter interviewed me via IM about how CIOs should view SAS Institute and its products. Naturally, I have edited my comments (lightly) into a blog post. They turned out to be clustered into three groups, as follows:

February 7, 2012

Hadoop-related market categorization

I wasn’t the only one to be dubious about Forrester Research’s Hadoop taxonomy (or lack thereof). GigaOm’s Derrick Harris was as well, and offered a much superior approach of his own. In Derrick’s view, there’s Hadoop, Hadoop distributions, Hadoop management, and Hadoop applications. Taking those out of order, and recalling that no market categorization is ever precise:

Let’s drill down into that last one. Derrick refers to Hadoop distributions as “products” that:

package a set of Hadoop projects (MapReduce, Hive, Sqoop, Pig, etc.) in a way that in theory makes them integrate more naturally, and to run both smoothly and securely.

While that’s a reasonable recitation of the idea’s benefits, I’d rather say that a “distribution” of open source software comprises: Read more

February 6, 2012

Comments on the 2012 Forrester Wave: Enterprise Hadoop Solutions

Forrester has released its Q1 2012 Forrester Wave: Enterprise Hadoop Solutions. (Googling turns up a direct link, but in case that doesn’t prove stable, here also is a registration-required link from IBM’s Conor O’Mahony.) My comments include:

January 10, 2012

Notes on the Oracle Big Data Appliance

Oracle announced its Big Data Appliance. Specs may be found in the Oracle Big Data Appliance press release. Beyond that:

Read more

January 10, 2012

A couple of links explaining Cloudera Manager

Predictably, I wasn’t pre-briefed on the details of Oracle’s Big Data Appliance announcement today, and an inquiry to partner Cloudera doesn’t happen to have been immediately answered.* But anyhow, it’s clear from coverage by Larry Dignan and Derrick Harris that Oracle’s Big Data Appliance includes:

In other words, it’s a lot like getting Cloudera Enterprise,* plus some hardware, plus some other stuff.

*Edit: About 2 minutes after I posted this, I got email from Cloudera CEO Mike Olson. Yes, the Oracle Big Data Appliance bundles Cloudera Enterprise.

That raises an anyway recurring question: What exactly is Cloudera Manager? Read more

January 8, 2012

Big data terminology and positioning

Recently, I observed that Big Data terminology is seriously broken. It is reasonable to reduce the subject to two quasi-dimensions:

given that

But the conflation should stop there.

*Low-volume/high-velocity problems are commonly referred to as “event processing” and/or “streaming”.

When people claim that bigness and structure are the same issue, they oversimplify into mush. So I think we need four pieces of terminology, reflective of a 2×2 matrix of possibilities. For want of better alternatives, my suggestions are:

Read more

November 21, 2011

Some big-vendor execution questions, and why they matter

When I drafted a list of key analytics-sector issues in honor of look-ahead season, the first item was “execution of various big vendors’ ambitious initiatives”.  By “execute” I mean mainly:

Vendors mentioned here are Oracle, SAP, HP, and IBM. Anybody smaller got left out due to the length of this post. Among the bigger omissions were:

Read more

November 8, 2011

Hadapt is moving forward

I’ve talked with my clients at Hadapt a couple of times recently. News highlights include:

The Hadapt product story hasn’t changed significantly from what it was before. Specific points I can add include:   Read more

November 3, 2011

MarkLogic’s Hadoop connector

It’s time to circle back to a subject I skipped when I otherwise wrote about MarkLogic 5: MarkLogic’s new Hadoop connector.

Most of what’s confusing about the MarkLogic Hadoop Connector lies in two pairs of options it presents you:

Otherwise, the whole thing is just what you would think:

MarkLogic said that it wrote this Hadoop connector itself.

Read more

November 2, 2011

The cool aspects of Odiago WibiData

Christophe Bisciglia and Aaron Kimball have a new company.

WibiData is designed for management of, investigative analytics on, and operational analytics on consumer internet data, the main examples of which are web site traffic and personalization and their analogues for games and/or mobile devices. The core WibiData technology, built on HBase and Hadoop,* is a data management and analytic execution layer. That’s where the secret sauce resides. Also included are:

The whole thing is in beta, with about three (paying) beta customers.

*And Avro and so on.

The core ideas of WibiData include:

Read more

Next Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.