Data warehousing

Analysis of issues in data warehousing, with extensive coverage of database management systems and data warehouse appliances that are optimized to query large volumes of data. Related subjects include:

April 17, 2017

Interana

Interana has an interesting story, in technology and business model alike. For starters:

And to be clear — if we leave aside any questions of marketing-name sizzle, this really is business intelligence. The closest Interana comes to helping with predictive modeling is giving its ad-hoc users inspiration as to where they should focus their modeling attention.

Interana also has an interesting twist in its business model, which I hope can be used successfully by other enterprise software startups as well. Read more

April 13, 2017

Analyzing the right data

0. A huge fraction of what’s important in analytics amounts to making sure that you are analyzing the right data. To a large extent, “the right data” means “the right subset of your data”.

1. In line with that theme:

2. Business intelligence interfaces today don’t look that different from what we had in the 1980s or 1990s. The biggest visible* changes, in my opinion, have been in the realm of better drilldown, ala QlikView and then Tableau. Drilldown, of course, is the main UI for business analysts and end users to subset data themselves.

*I used the word “visible” on purpose. The advances at the back end have been enormous, and much of that redounds to the benefit of BI.

3. I wrote 2 1/2 years ago that sophisticated predictive modeling commonly fit the template:

That continues to be tough work. Attempts to productize shortcuts have not caught fire.

Read more

March 12, 2017

Introduction to SequoiaDB and SequoiaCM

For starters, let me say:

Also:

Unfortunately, SequoiaDB has not captured a lot of detailed information about unpaid open source production usage.

Read more

October 3, 2016

Notes on the transition to the cloud

1. The cloud is super-hot. Duh. And so, like any hot buzzword, “cloud” means different things to different marketers. Four of the biggest things that have been called “cloud” are:

Further, there’s always the idea of hybrid cloud, in which a vendor peddles private cloud systems (usually appliances) running similar technology stacks to what they run in their proprietary public clouds. A number of vendors have backed away from such stories, but a few are still pushing it, including Oracle and Microsoft.

This is a good example of Monash’s Laws of Commercial Semantics.

2. Due to economies of scale, only a few companies should operate their own data centers, aka true on-prem(ises). The rest should use some combination of colo, SaaS, and public cloud.

This fact now seems to be widely understood.

Read more

September 6, 2016

“Real-time” is getting real

I’ve been an analyst for 35 years, and debates about “real-time” technology have run through my whole career. Some of those debates are by now pretty much settled. In particular:

A big issue that does remain open is: How fresh does data need to be? My preferred summary answer is: As fresh as is needed to support the best decision-making. I think that formulation starts with several advantages:

Straightforward applications of this principle include: Read more

August 28, 2016

Are analytic RDBMS and data warehouse appliances obsolete?

I used to spend most of my time — blogging and consulting alike — on data warehouse appliances and analytic DBMS. Now I’m barely involved with them. The most obvious reason is that there have been drastic changes in industry structure:

Simply reciting all that, however, begs the question of whether one should still care about analytic RDBMS at all.

My answer, in a nutshell, is:

Analytic RDBMS — whether on premises in software, in the form of data warehouse appliances, or in the cloud – are still great for hard-core business intelligence, where “hard-core” can refer to ad-hoc query complexity, reporting/dashboard concurrency, or both. But they aren’t good for much else.

Read more

August 21, 2016

More about Databricks and Spark

Databricks CEO Ali Ghodsi checked in because he disagreed with part of my recent post about Databricks. Ali’s take on Databricks’ position in the Spark world includes:

Ali also walked me through customer use cases and adoption in wonderful detail. In general:

The story on those sectors, per Ali, is:  Read more

July 19, 2016

Notes from a long trip, July 19, 2016

For starters:

A running list of recent posts is:

Subjects I’d like to add to that list include:

Read more

December 10, 2015

Readings in Database Systems

Mike Stonebraker and Larry Ellison have numerous things in common. If nothing else:

I mention the latter because there’s a new edition of Readings in Database Systems, aka the Red Book, available online, courtesy of Mike, Joe Hellerstein and Peter Bailis. Besides the recommended-reading academic papers themselves, there are 12 survey articles by the editors, and an occasional response where, for example, editors disagree. Whether or not one chooses to tackle the papers themselves — and I in fact have not dived into them — the commentary is of great interest.

But I would not take every word as the gospel truth, especially when academics describe what they see as commercial market realities. In particular, as per my quip in the first paragraph, the data warehouse market has not yet gone to the extremes that Mike suggests,* if indeed it ever will. And while Joe is close to correct when he says that the company Essbase was acquired by Oracle, what actually happened is that Arbor Software, which made Essbase, merged with Hyperion Software, and the latter was eventually indeed bought by the giant of Redwood Shores.**

*When it comes to data warehouse market assessment, Mike seems to often be ahead of the trend.

**Let me interrupt my tweaking of very smart people to confess that my own commentary on the Oracle/Hyperion deal was not, in retrospect, especially prescient.

Mike pretty much opened the discussion with a blistering attack against hierarchical data models such as JSON or XML. To a first approximation, his views might be summarized as:  Read more

November 19, 2015

The questionably named Cloudera Navigator Optimizer

I only have mixed success at getting my clients to reach out to me for messaging advice when they’re introducing something new. Cloudera Navigator Optimizer, which is being announced along with Cloudera 5.5, is one of my failures in that respect; I heard about it for the first time Tuesday afternoon. I hate the name. I hate some of the slides I saw. But I do like one part of the messaging, namely the statement that this is about “refactoring” queries.

All messaging quibbles aside, I think the Cloudera Navigator Optimizer story is actually pretty interesting, and perhaps not just to users of SQL-on-Hadoop technologies such as Hive (which I guess I’d put in that category for simplicity) or Impala. As I understand Cloudera Navigator Optimizer:

Read more

Next Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.