July 5, 2012

Introduction to Neo Technology and Neo4j

I’ve been talking some with the Neo Technology/Neo4j guys, including Emil Eifrem (CEO/cofounder), Johan Svensson (CTO/cofounder), and Philip Rathle (Senior Director of Products). Basics include:

Numbers and historical facts include:

Read more

July 2, 2012

Introduction to Yarcdata

Cray’s strategy these days seems to be:

At the moment, the main diversifications are:

The last of the three is what Cray subsidiary Yarcdata is all about. Read more

July 2, 2012

Catching up with Cray

Cray is a legendary name in supercomputing hardware. Cray CTO Bill Blake (Netezza’s early-rise VP Development) seem to be there in part because of Cray’s name and history. I’m now consulting to Cray largely because of Bill Blake, specifically to Cray subsidiary Yarcdata. Along the way, I’ve picked up enough about Cray in general — largely from Bill and from Cray president Pete Ungaro — to perhaps be worth splitting out as a separate post.

Cray business highlights include:

I haven’t sorted through all the details in Cray’s SEC filings, but huge government contracts play a big role, as do the associated revenue recognition delays.

At the highest level, Cray’s technical story looks like: Read more

June 27, 2012

Schooner got acquired by SanDisk

SanDisk has acquired my client Schooner Information Technology. Notes on that include:

That’s about all I have at this time.

June 26, 2012

Is salesforce.com going to stick with Oracle?

Surprisingly often, I’m asked “Is salesforce.com going to stick with Oracle?” So let me refer to and expand upon my previous post about salesforce.com’s database architecture by saying:

Some day, Marc Benioff will probably say “We turned off Oracle across most of our applications a while ago, and nobody outside the company even noticed.”

*in that

Note: This blog post is less readable than it would be if I’d found a better workaround to WordPress’ bugs in the area of nested bullet points. I’m sorry.

June 26, 2012

Teradata SQL-H, using HCatalog

When I grumbled about the conference-related rush of Hadoop announcements, one example of many was Teradata Aster’s SQL-H. Still, it’s an interesting idea, and a good hook for my first shot at writing about HCatalog. Indeed, other than the Talend integration bundled into Hortonworks’ HDP 1, Teradata SQL-H is the first real use of HCatalog I’m aware of.

The Teradata SQL-H idea is:

At least in theory, Teradata SQL-H lets you use a full set of analytic tools against your Hadoop data, with little limitation except price and/or performance. Teradata thinks the performance of all this can be much better than if you just use Hadoop (35X was mentioned in one particularly favorable example), but perhaps much worse than if you just copy/extract the data to an Aster cluster in the first place.

So what might the use cases be for something like SQL-H? Offhand, I’d say:

By way of contrast, the whole thing makes less sense for dashboarding kinds of uses, unless the dashboard users are very patient when they want to drill down.

June 25, 2012

Why I’m so forward-leaning about Hadoop features

In my recent series of Hadoop posts, there were several cases where I had to choose between recommending that enterprises:

I favored the more advanced features each time. Here’s why.

To a first approximation, I divide Hadoop use cases into two major buckets, only one of which I was addressing with my comments:

1. Analytic data management.* Here I favored features over reliability because they are more important, for Hadoop as for analytic RDBMS before it. When somebody complains about an analytic data store not being ready for prime time, never really working, or causing them to tear their hair out, what they usually mean is that:

Those complaints are much, much, more frequent than “It crashed”. So it was for Netezza, DATAllegro, Greenplum, Aster Data, Vertica, Infobright, et al. So it also is for Hadoop. And how does one address those complaints? By performance and feature enhancements, of the kind that the Hadoop community is introducing at high speed. Read more

June 19, 2012

Notes on HBase 0.92

This is part of a four-post series, covering:

As part of my recent round of Hadoop research, I talked with Cloudera’s Todd Lipcon. Naturally, one of the subjects was HBase, and specifically HBase 0.92. I gather that the major themes to HBase 0.92 are:

HBase coprocessors are Java code that links straight into HBase. As with other DBMS extensions of the “links straight into the DBMS code” kind,* HBase coprocessors seem best suited for very sophisticated users and third parties.** Evidently, coprocessors have already been used to make HBase security more granular — role-based, per-column-family/per-table, etc. Further, Todd thinks coprocessors could serve as a good basis for future HBase enhancements in areas such as aggregation or secondary indexing. Read more

June 19, 2012

“Enterprise-ready Hadoop”

This is part of a four-post series, covering:

The posts depend on each other in various ways.

Cloudera, Hortonworks, and MapR all claim, in effect, “Our version of Hadoop is enterprise-ready, unlike those other guys’.” I’m dubious.

That said, “enterprise-ready Hadoop” really is an important topic.

So what does it mean for something to be “enterprise-ready”, in whole or in part? Common themes in distinguishing between “enterprise-class” and other software include:

For Hadoop, as for most things, these concepts overlap in many ways. Read more

June 19, 2012

Hadoop distributions: CDH 4, HDP 1, Hadoop 2.0, Hadoop 1.0 and all that

This is part of a four-post series, covering:

The posts depend on each other in various ways.

My clients at Cloudera and Hortonworks have somewhat different views as to the maturity of various pieces of Hadoop technology. In particular:

*”CDH” stands, due to some trademarking weirdness, for “Cloudera’s Distribution including Apache Hadoop”. “HDP” stands for “Hortonworks Data Platform”.

Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.