May 13, 2015

Notes on analytic technology, May 13, 2015

1. There are multiple ways in which analytics is inherently modular. For example:

Also, analytics is inherently iterative.

If I’m right that analytics is or at least should be modular and iterative, it’s easy to see why people hate multi-year data warehouse creation projects. Perhaps it’s also easy to see why I like the idea of schema-on-need.

2. In 2011, I wrote, in the context of agile predictive analytics, that

… the “business analyst” role should be expanded beyond BI and planning to include lightweight predictive analytics as well.

I gather that a similar point is at the heart of Gartner’s new term “citizen data scientist”. I am told that the term resonates with at least some enterprises.

3. Speaking of Gartner, Mark Beyer tweeted

In data management’s future “hybrid” becomes a useless term. Data management is mutable, location agnostic and services oriented.

I replied

And that’s why I launched DBMS2 a decade ago, for “DataBase Management System SERVICES”. 🙂

A post earlier this year offers a strong clue as to why Mark’s tweet was at least directionally correct: The best structures for writing data are the worst for query, and vice-versa.

4. The foregoing notwithstanding, I continue to believe that there’s a large place in the world for “full-stack” analytics. Of course, some stacks are fuller than others, with SaaS (Software as a Service) offerings probably being the only truly complete-stack products.

5. Speaking of full-stack vendors, some of the thoughts in this post were sparked by a recent conversation with Platfora. Platfora, of course, is full-stack except for the Hadoop underneath. They’ve taken to saying “data lake” instead of Hadoop, because they believe:

6. Platfora is coy about metrics, but does boast of high growth, and had more than 100 employees earlier this year. However, they are refreshingly precise about competition, saying they primarily see four competitors: Tableau, SAS Visual Analytics, Datameer (“sometimes”), and Oracle Data Discovery (which they view as flatteringly imitative of them).

Platfora seems to have a classic BI “land-and-expand” kind of model, with initial installations commonly being a few servers and a few terabytes. Applications cited were the usual suspects — customer analytics, clickstream, and compliance/governance. But they do have some big customer/big database stories as well, including:

7. Another full-stack vendor, ScalingData, has been renamed to Rocana, for “root cause analysis”. I’m hearing broader support for their ideas about BI/predictive modeling integration. For example, Platfora has something similar on its roadmap.

Comments

2 Responses to “Notes on analytic technology, May 13, 2015”

  1. Ben Werther on May 14th, 2015 3:02 pm

    Thanks Curt, great to catch up. Fully on board with your comments here, and I wanted to re-emphasize that our customers are increasingly differentiating between ‘traditional data discovery’ (à la Tableau), which supports quick visualization against prepared SQL sources, and ‘big data discovery’ (à la Platfora), which allows regular analysts to look at patterns of behavior around customers, products, security threats, etc. across diverse large datasets in the data lake. The latter really requires superpowering the analyst with a platform that lets them interactively and visually connect the dots down to raw datasets in Hadoop, and that weaves together data prep, in-memory acceleration and visual analysis into one seamless end-to-end experience. Litmus test: can a non-technical analyst point at a petabyte of raw customer-related data in Hadoop (clickstream, social, loyalty, etc.) and answer meaningful multi-channel or behavioral/segmentation questions with interactive performance in an afternoon without IT involvement?

  2. John on May 14th, 2015 10:00 pm

    Ben, are you claiming that your product can be pointed at some bytes on Hadoop and that it magically understands the data and provides analytics? I am sorry, but let’s not turn this into a marketing forum. And no, you cannot put a petabyte of data on any platform and get interactive performance on every question, so let’s not go there; let’s talk facts.

    Please come with facts and describe a clear use case for a PB data set: which questions you answered, with what kind of hardware, etc.

    Thanks
