Analytic technologies

Discussion of technologies related to information query and analysis. Related subjects include:

September 27, 2012

Hoping for true columnar storage in Oracle12c

I was asked to clarify one of my July comments on Oracle12c,

I wonder whether Oracle will finally introduce a true columnar storage option, a year behind Teradata. That would be the obvious enhancement on the data warehousing side, if they can pull it off. If they can’t, it’s a damning commentary on the core Oracle codebase.

by somebody smart who however seemed to have half-forgotten my post comparing (hybrid) columnar compression to (hybrid) columnar storage.

In simplest terms:

September 26, 2012

When should analytics be in-memory?

I was asked today for rules or guidance regarding “analytical problems, situations, or techniques better suited for in-database versus in-memory processing”. There are actually two kinds of distinction to be drawn:

Let’s focus on the first part of that: what work, in principle, should be done in memory?

September 24, 2012

Notes on Hadoop adoption

I successfully resisted telephone consulting while on vacation, but I did do some by email. One was on the oft-recurring subject of Hadoop adoption. I think it’s OK to adapt some of that into a post.

Notes on past and current Hadoop adoption include:

Thoughts on how Hadoop adoption will look going forward include:

September 7, 2012

Integrated internet system design

What are the central challenges in internet system design? We probably all have similar lists, comprising issues such as scale, scale-out, throughput, availability, security, programming ease, UI, or general cost-effectiveness. Screw those up, and you don’t have an internet business.

Much new technology addresses those challenges, with considerable success. But the success is usually one silo at a time — a short-request application here, an analytic database there. When it comes to integration, unsolved problems abound.

The top integration and integration-like challenges for me, from a practical standpoint, are:

Other concerns that get mentioned include:

Let’s skip those latter issues for now, focusing instead on the first four.


August 19, 2012

In-database analytics — analytic glossary draft entry

This is a draft entry for the DBMS2 analytic glossary. Please comment with any ideas you have for its improvement!

Note: Words and phrases in italics will be linked to other entries when the glossary is complete.

“In-database analytics” is a catch-all term for analytic capabilities, beyond standard SQL, running on the same machine as and under the management of an analytic DBMS. These can run in one or both of two modes:

In-database analytics may offer great performance and scalability advantages versus the alternative of extracting data and having it be processed on a separate server. This is particularly likely to be the case in MPP (Massively Parallel Processing) analytic DBMS environments.
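As a rough illustration of the idea, the sketch below registers an analytic function that standard SQL lacks (a least-squares slope) as a user-defined aggregate, so the calculation runs where the data lives and only one row per group comes back to the caller. Everything here is an assumption made for illustration: the `sales` table, its columns, the `slope` function name, and the sample numbers are hypothetical, and SQLite is an embedded single-node engine used only as a convenient stand-in, not an MPP analytic DBMS.

```python
import sqlite3


class Slope:
    """Hypothetical user-defined aggregate: least-squares slope of y on x."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def step(self, x, y):
        # Called once per row by the database engine.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def finalize(self):
        # Called once per group; returns NULL if the slope is undefined.
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            return None
        return (self.n * self.sxy - self.sx * self.sy) / denom


conn = sqlite3.connect(":memory:")
conn.create_aggregate("slope", 2, Slope)

# Made-up sample data: units sold per region per week.
conn.execute("CREATE TABLE sales (region TEXT, week INTEGER, units REAL)")
rows = [("east", w, 10.0 + 2.0 * w) for w in range(1, 6)]
rows += [("west", w, 50.0 - 3.0 * w) for w in range(1, 6)]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# The trend is computed inside the query; only one row per region is returned.
result = dict(
    conn.execute(
        "SELECT region, slope(week, units) FROM sales GROUP BY region"
    ).fetchall()
)
print(result)
```

The alternative would be to select all ten rows, ship them to a separate process, and fit the trend there. With ten rows that makes no difference; the argument in the entry above concerns what happens when the table is very large and spread across many nodes.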

Examples of in-database analytics include:

Other common domains for in-database analytics include sessionization, time series analysis, and relationship analytics.

Notable products offering in-database analytics include:

August 19, 2012

Analytic platform — analytic glossary draft entry

This is a draft entry for the DBMS2 analytic glossary. Please comment with any ideas you have for its improvement!

Note: Words and phrases in italics will be linked to other entries when the glossary is complete.

In our usage, an “analytic platform” is an analytic DBMS with well-integrated in-database analytics, or a data warehouse appliance that includes one. The term is also sometimes used to refer to:

To varying extents, most major vendors of analytic DBMS or data warehouse appliances have extended their products into analytic platforms; see, for example, our original coverage of analytic platform versions of Aster, Netezza, or Vertica.


August 19, 2012

Data warehouse appliance — analytic glossary draft entry

This is a draft entry for the DBMS2 analytic glossary. Please comment with any ideas you have for its improvement!

Note: Words and phrases in italics will be linked to other entries when the glossary is complete.

A data warehouse appliance is a combination of hardware and software that includes an analytic DBMS (DataBase Management System). However, some observers incorrectly apply the term “data warehouse appliance” to any analytic DBMS.

The paradigmatic vendors of data warehouse appliances are:

Further, vendors of analytic DBMS commonly offer — directly or through partnerships — optional data warehouse appliance configurations; examples include:

Oracle Exadata is sometimes regarded as a data warehouse appliance as well, despite not being solely focused on analytic use cases.

Data warehouse appliances inherit marketing claims from the category of analytic DBMS, such as:

August 6, 2012

People’s facility with statistics — extremely difficult to predict

My recent post on broadening the usefulness of statistics presupposed two things about the statistical sophistication of business intelligence tool users:

Let me now say a little more on the subject. My basic message is — people’s facility with statistics is extremely difficult to predict.

If you DO have to make a point estimate, however, you could do worse than just putting quotation marks around the last four words of that sentence …

Suppose we measure people’s statistical understanding on a 5-point scale:

  1. People who haven’t a clue what a p-value is.
  2. People who think a p-value of .05 signifies a 95% chance of truth.
  3. People who know better than that, but who still think that “statistically significant” is pretty close to the same as “true”.
  4. People who know better yet, but aren’t fluent in using statistical techniques correctly.
  5. People who are fluent in statistics.

Just knowing somebody’s job description, can you confidently predict their ranking to within, say, +/- 1 point? I suggest you can’t. People differ wildly in general numeracy and in specific statistical knowledge.
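For anyone unsure why level 2 on that scale is a mistake, here is a minimal arithmetic sketch. It assumes a simplified setting that is not from the post: a pool of hypotheses of which some share (`prior_true`) are real effects, tests that detect a real effect with probability `power`, and a significance cutoff `alpha`. The function name and the specific numbers are made up for illustration.

```python
def share_of_significant_results_that_are_true(prior_true, power, alpha=0.05):
    """Fraction of 'statistically significant' findings that are real effects.

    prior_true: assumed share of tested hypotheses that are actually true
    power:      assumed probability that a real effect tests as significant
    alpha:      significance cutoff, i.e. the false-positive rate for null effects
    """
    true_positives = prior_true * power
    false_positives = (1.0 - prior_true) * alpha
    return true_positives / (true_positives + false_positives)


# Illustrative assumption: 1 hypothesis in 10 is real, and tests have 80% power.
ppv = share_of_significant_results_that_are_true(prior_true=0.10, power=0.80)
print(f"{ppv:.2f}")  # 0.64
```

Under those assumed inputs, about 64% of the significant results are true, not 95%. The answer moves with the prior and the power, which the p-value alone says nothing about.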

Even our guesses about average knowledge may be off, not least because education is changing things.

July 31, 2012

Integrating statistical analysis into business intelligence

Business intelligence tools have been around for two decades.* In that time, many people have had the idea of integrating statistical analysis into classical BI. Yet I can’t think of a single example that was more than a small, niche success.

*Or four decades, if you count predecessor technologies.

The first challenge, I think, lies in the paradigm. Three choices that come to mind are:

But the first of those approaches requires too much intelligence from the software, while the third requires too much numeracy from the users. So only the second option has a reasonable chance to work, and even that one may be hard to pull off unless vendors focus on one vertical market at a time.

The challenges in full automation start:

July 28, 2012

Some Vertica 6 features

Vertica 6 was recently announced, and so it seemed like a good time to catch up on Vertica features. The main topics I want to address are:

Also:

In general, the main themes of Vertica 6 appear to be:

Let’s do the analytic functionality first. Notes on that include:

I’ll also take this opportunity to expand on something I wrote about a few vendors, including Vertica, at the end of my post on approximate query results. When I probed how customers of Vertica and other RDBMS-based analytic platform vendors used vendor-proprietary advanced analytic SQL and other analytic capabilities, answers included:
