Stuart Frost, of DATAllegro fame, has started a small family of companies, and they’ve become my clients sort of as a group. The first one that I’m choosing to write about is Cirro, for which the basics are:
- Cirro does data federation for analytics.
- Cirro has 10 full-time people plus 4 part-timers.
- Cirro launched its product in June.
- Cirro doesn’t have customers yet, but hopes to fix that soon.
Data federation stories are often hard to understand because, until you drill down, they sound implausibly as if they do everything for everybody. That said, it’s reasonable to think of Cirro as a layer between Hadoop and your BI tool that:
- Helps with data transformations.
- Helps join Hadoop data to relational tables, even if the joins are large ones.
In both cases, Cirro is calling on your data management software for help, RDBMS or Hadoop as the case may be.
More precisely, Cirro’s approach is:
- Read data from the relational database(s) of your choice and/or from Hadoop.
- Execute queries and transformations on whichever of those systems you permit, plus (optionally) Cirro’s own private Hadoop cluster and/or single-server MySQL.
- Have a cost-based optimizer smart enough to control work at that level of granularity. (Since an entire join or transformation is shipped off to a single data store, that’s much coarser-grained than what actual DBMS optimizers have to adjudicate.)
- Provide enough tools via Excel to support the usual claim “business analysts can do what they need without relying on IT”.
- Offer a library of simple analytic functions and transformations that:
  - Are part of what you can call from the Excel workbench.
  - Can be executed as Hadoop MapReduce jobs.
  - Can be pushed down to the SQL engines of RDBMS when the appropriate capabilities are there.
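To make the "coarse-grained optimizer" point concrete, here is a minimal sketch of what choosing a single data store for an entire join might look like. It is purely illustrative and not Cirro's actual algorithm: the `Table` and `Store` classes, the cost constants, and the linear cost formula are all assumptions made up for this example. The only ideas taken from the description above are that a whole join goes to one store, that only permitted stores are candidates, and that the choice is cost-based.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    store: str       # name of the store where the table currently lives
    size_gb: float


@dataclass(frozen=True)
class Store:
    name: str
    permitted: bool              # may the federation layer run work here?
    transfer_cost_per_gb: float  # hypothetical cost to load 1 GB into this store
    join_cost_per_gb: float      # hypothetical cost to process 1 GB of join input here


def placement_cost(store, tables):
    """Estimated cost of running the whole join on one store."""
    # Data that does not already live on the store has to be shipped to it.
    moved = sum(t.size_gb for t in tables if t.store != store.name)
    total = sum(t.size_gb for t in tables)
    return moved * store.transfer_cost_per_gb + total * store.join_cost_per_gb


def choose_store(stores, tables):
    """Pick the cheapest permitted store for the entire join."""
    candidates = [s for s in stores if s.permitted]
    if not candidates:
        raise ValueError("no permitted store to run the join on")
    return min(candidates, key=lambda s: placement_cost(s, tables))


# Made-up example: a large Hadoop table joined to a small relational one.
tables = [
    Table("clicks", store="hadoop", size_gb=500.0),
    Table("customers", store="rdbms", size_gb=2.0),
]
stores = [
    Store("hadoop", permitted=True, transfer_cost_per_gb=1.0, join_cost_per_gb=0.5),
    Store("rdbms", permitted=True, transfer_cost_per_gb=1.0, join_cost_per_gb=0.1),
    Store("scratch_mysql", permitted=True, transfer_cost_per_gb=1.0, join_cost_per_gb=0.2),
]

best = choose_store(stores, tables)
print(best.name)
```

With these invented numbers the join lands on Hadoop, because moving 500 GB out of Hadoop costs more than the faster join elsewhere saves. The decision is one choice per join or transformation, which is why it is so much simpler than what a DBMS optimizer has to weigh inside a single query plan.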
While you can get a Cirro result set into Excel if that meets your needs, Cirro also is partnering with Tableau for real business intelligence capabilities.
I view the proposition for Cirro as Hadoop-centric; if your problem is merely that you’re trying to combine information from a multitude of analytic RDBMS, there are many other solutions you could try. Indeed, Cirro is one of the examples I had in mind when I wrote last week that:
“One theme of the season is BI over Hadoop. I have at least 5 clients claiming they’re uniquely positioned to support that (most of whom partner with a 6th client, Tableau).”
That said, if an enterprise owns Cirro for other reasons, there’s something to be said for a desktop capability — the Excel-based tool — that lets you fetch and somewhat massage data in a simple and straightforward way.