Analytic technologies

Discussion of technologies related to information query and analysis. Related subjects include:

January 22, 2016

Cloudera in the cloud(s)

Cloudera released Version 2 of Cloudera Director, which is a companion product to Cloudera Manager focused specifically on the cloud. This led to a discussion about — you guessed it! — Cloudera and the cloud.

Making Cloudera run in the cloud has three major aspects:

Features new in this week’s release of Cloudera Director include:

I.e., we’re talking about some pretty basic/checklist kinds of things. Cloudera Director is evidently working for Amazon AWS and Google GCP, and planned for Windows Azure, VMware and OpenStack.

As for porting, let me start by noting: Read more

January 14, 2016

BI and quasi-DBMS

I’m on two overlapping posting kicks, namely “lessons from the past” and “stuff I keep saying so might as well also write down”. My recent piece on Oracle as the new IBM is an example of both themes. In this post, another example, I’d like to memorialize some points I keep making about business intelligence and other analytics. In particular:

Similarly, BI has often been tied to data integration/ETL (Extract/Transform/Load) functionality.* But I won’t address that subject further at this time.

*In the Hadoop/Spark era, that’s even truer of other analytics than it is of BI.

My top historical examples include:

Read more

December 10, 2015

Readings in Database Systems

Mike Stonebraker and Larry Ellison have numerous things in common. If nothing else:

I mention the latter because there’s a new edition of Readings in Database Systems, aka the Red Book, available online, courtesy of Mike, Joe Hellerstein and Peter Bailis. Besides the recommended-reading academic papers themselves, there are 12 survey articles by the editors, and an occasional response where, for example, editors disagree. Whether or not one chooses to tackle the papers themselves — and I in fact have not dived into them — the commentary is of great interest.

But I would not take every word as the gospel truth, especially when academics describe what they see as commercial market realities. In particular, as per my quip in the first paragraph, the data warehouse market has not yet gone to the extremes that Mike suggests,* if indeed it ever will. And while Joe is close to correct when he says that the company Essbase was acquired by Oracle, what actually happened is that Arbor Software, which made Essbase, merged with Hyperion Software, and the latter was eventually indeed bought by the giant of Redwood Shores.**

*When it comes to data warehouse market assessment, Mike seems to often be ahead of the trend.

**Let me interrupt my tweaking of very smart people to confess that my own commentary on the Oracle/Hyperion deal was not, in retrospect, especially prescient.

Mike pretty much opened the discussion with a blistering attack against hierarchical data models such as JSON or XML. To a first approximation, his views might be summarized as:  Read more

December 1, 2015

Machine learning’s connection to (the rest of) AI

This is part of a four post series spanning two blogs.

1. I think the technical essence of AI is usually:

Of course, a lot of non-AI software can be described the same way.

To check my claim, please consider:

To see why it’s true from a bottom-up standpoint, please consider the next two points.

2. It is my opinion that most things called “intelligence” — natural and artificial alike — have a great deal to do with pattern recognition and response. Examples of what I mean include:  Read more

November 19, 2015

The questionably named Cloudera Navigator Optimizer

I only have mixed success at getting my clients to reach out to me for messaging advice when they’re introducing something new. Cloudera Navigator Optimizer, which is being announced along with Cloudera 5.5, is one of my failures in that respect; I heard about it for the first time Tuesday afternoon. I hate the name. I hate some of the slides I saw. But I do like one part of the messaging, namely the statement that this is about “refactoring” queries.

All messaging quibbles aside, I think the Cloudera Navigator Optimizer story is actually pretty interesting, and perhaps not just to users of SQL-on-Hadoop technologies such as Hive (which I guess I’d put in that category for simplicity) or Impala. As I understand Cloudera Navigator Optimizer:

Read more

November 19, 2015

CDH 5.5

I talked with Cloudera shortly ahead of today’s announcement of Cloudera 5.5. Much of what we talked about had something or other to do with SQL data management. Highlights include:

While I had Cloudera on the phone, I asked a few questions about Impala adoption, specifically focused on concurrency. There was mention of: Read more

October 26, 2015

Differentiation in business intelligence

Parts of the business intelligence differentiation story resemble the one I just posted for data management. After all:

That said, insofar as BI’s competitive issues resemble those of DBMS, they are those of DBMS-lite. For example:

And full-stack analytic systems — perhaps delivered via SaaS (Software as a Service) — can moot the BI/data management distinction anyway.

Of course, there are major differences between how DBMS and BI are differentiated. The biggest are in user experience. I’d say: Read more

October 26, 2015

Differentiation in data management

In the previous post I broke product differentiation into 6-8 overlapping categories, which may be abbreviated as:

and sometimes also issues in adoption and administration.

Now let’s use this framework to examine two market categories I cover — data management and, in separate post, business intelligence.

Applying this taxonomy to data management:
Read more

October 26, 2015

Sources of differentiation

Obviously, a large fraction of what I write about involves technical differentiation. So let’s try for a framework where differentiation claims can be placed in context. This post will get through the generalities. The sequels will apply them to specific cases.

Many buying and design considerations for IT fall into six interrelated areas:  Read more

September 28, 2015

Introduction to Cloudera Kudu

This is part of a three-post series on Kudu, a new data storage system from Cloudera.

Cloudera is introducing a new open source project, Kudu,* which from Cloudera’s standpoint is meant to eventually become the single best underpinning for analytics on the Hadoop stack. I’ve spent multiple hours discussing Kudu with Cloudera, mainly with Todd Lipcon. Any errors are of course entirely mine.

*Like the impala, the kudu is a kind of antelope. I knew that, because I enjoy word games. What I didn’t know — and which is germane to the naming choice — is that the kudu has stripes. :)

For starters:

Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.