February 6, 2012

WibiData, derived data, and analytic schema flexibility

My clients at Odiago, vendors of WibiData, have changed their company name simply to WibiData. Even better, they blogged with more detail as to how WibiData works, in what is essentially a follow-on to my original WibiData post last October. Among other virtues, WibiData turns out to be a poster child for my views on derived data and the corresponding schema evolution.

Interesting quotes include:

WibiData is designed to store … transactional data side-by-side with profile and other derived data attributes.

… the ability to add new ad-hoc columns to a table enables more flexible analysis: output data that is the result of one analytic pipeline is stored adjacent to its input data, meaning that you can easily use this as input to second- or third-order derived data as well.

schemas can vary over time; you can easily add a field to a record, or delete a field. … But even though you start collecting that new data, your existing analysis pipelines can treat records like they always did; programs that don’t yet know about the new cookie are still compatible with both the old records already collected, and the new records with the additional field. New programs fill in default values for old data recorded before a field was added, applying the new schema at read time.

schemas for every column are stored in a data dictionary that matches column names with their schemas, as well as human-readable descriptions of the data.

Interesting aspects of the post that don’t lend themselves as well to being excerpted include:

November 2, 2011

The cool aspects of Odiago WibiData

Christophe Bisciglia and Aaron Kimball have a new company.

WibiData is designed for management of, investigative analytics on, and operational analytics on consumer internet data, the main examples of which are web site traffic and personalization and their analogues for games and/or mobile devices. The core WibiData technology, built on HBase and Hadoop,* is a data management and analytic execution layer. That’s where the secret sauce resides. Also included are:

The whole thing is in beta, with about three (paying) beta customers.

*And Avro and so on.

The core ideas of WibiData include:

Read more

← Previous Page

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.