July 7, 2015

Zoomdata and the Vs

Let’s start with some terminology biases:

So when my clients at Zoomdata told me that they’re in the business of providing “the fastest visual analytics for big data”, I understood their choice, but rolled my eyes anyway. And then I immediately started to check how their strategy actually plays against the “big data” Vs.

It turns out that:

*The HDFS/S3 aspect seems to be a major part of Zoomdata’s current story.

Core aspects of Zoomdata’s technical strategy include: 

*Apparently it doesn’t make sense in some major operational/general-purpose — as opposed to analytic — RDBMS. From those systems, Zoomdata may actually extract and pre-cube data.

The technology story for “data sharpening” starts:

The point of data sharpening, besides simply giving immediate gratification, is that hopefully the results for even a small sample will be enough for the user to determine:

I like this early drilldown story for a couple of reasons:

Aka “Honey, I shrunk the query!”

Zoomdata’s query execution strategy depends heavily on doing lots of “micro-queries” and unioning their result sets. In particular:

Even for not-so-micro queries, Zoomdata may find itself doing a lot of unioning, as data from different time periods may be in different stores.
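To make the micro-query idea concrete, here is a minimal sketch in Python. It is not Zoomdata's implementation, which the post does not show. The stores, the `(day, region, amount)` row layout, and the names `micro_query`, `sharpened_totals`, and `STORE_FOR_DAY` are all hypothetical. It assumes each micro-query covers one day, that different days may live in different stores, and that the partial results are merged as they arrive so the user sees an answer that sharpens with each step.

```python
from collections import Counter
from typing import Dict, Iterator, List, Tuple

Row = Tuple[str, str, int]  # (day, region, amount): hypothetical schema

# Two hypothetical stores holding different time periods.
HISTORICAL: List[Row] = [
    ("2015-07-01", "east", 10),
    ("2015-07-01", "west", 5),
    ("2015-07-02", "east", 7),
]
RECENT: List[Row] = [
    ("2015-07-03", "west", 4),
    ("2015-07-03", "east", 1),
]

# Assumed routing table: which store answers for which day.
STORE_FOR_DAY: Dict[str, List[Row]] = {
    "2015-07-01": HISTORICAL,
    "2015-07-02": HISTORICAL,
    "2015-07-03": RECENT,
}


def micro_query(store: List[Row], day: str) -> Counter:
    """Run one small query: total amount per region for a single day."""
    totals: Counter = Counter()
    for row_day, region, amount in store:
        if row_day == day:
            totals[region] += amount
    return totals


def sharpened_totals(days: List[str]) -> Iterator[Counter]:
    """Issue one micro-query per day and yield the merged result so far.

    Each yielded snapshot is the union (here, a sum per region) of all
    micro-query results received up to that point.
    """
    running: Counter = Counter()
    for day in days:
        running.update(micro_query(STORE_FOR_DAY[day], day))
        yield Counter(running)  # copy, so earlier snapshots stay unchanged


if __name__ == "__main__":
    # Most recent day first, so the first snapshot arrives quickly.
    for snapshot in sharpened_totals(["2015-07-03", "2015-07-02", "2015-07-01"]):
        print(dict(snapshot))
```

Note that the intermediate snapshots here are exact totals over a subset of days, not statistically scaled estimates from a sample. A real system would also have to decide how to extrapolate from partial data, which this sketch does not attempt.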

Architectural choices in support of all this include:

When a young company has good ideas, it’s natural to wonder how established or mature this all is. Well:

As for technological maturity:

Related link


9 Responses to “Zoomdata and the Vs”

  1. David Gruzman on July 9th, 2015 12:39 pm

    It sounds like Zoomdata is developing its own federated query engine over Spark, in some sense competing with Spark SQL. Is that true?

  2. Justin Langseth on July 9th, 2015 11:13 pm

    Not at all… Zoomdata is leveraging Spark and Spark SQL for the federation-like fusion operations. Where possible, we leverage Spark SQL’s External Data Connectors and prefer to execute the queries to the remote sources directly from Spark.

  3. David Gruzman on July 10th, 2015 2:50 am

    I thought that efficient federation required rework of the optimizer.

  4. Curt Monash on July 10th, 2015 12:18 pm

    Justin mentioned an optimizer to me as we talked, specifically when I asked about what provided performance. Indeed, that was his only answer other than micro-queries and approximate results, the latter of which is powered by micro-queries.

    I didn’t pursue details, because in my experience optimizers in new-ish products are usually rather primitive anyway.

  5. Naveen Michaud-Agrawal on July 10th, 2015 3:05 pm

    Hi Curt,

    You mention here that you hope vendors with a much longer track record have more nuances in their UIs. Would you care to expand on that? Some earlier posts discuss navigation as being more important than visualization (http://www.dbms2.com/2013/09/29/visualization-or-navigation/). By this, do you mean interaction with data through visualization? Or rather the ability to easily facet/slice/analyze a multidimensional dataset across various dimensions, and to see how subsets in one projection are reflected in another (so-called linking/brushing)? Thanks.

  6. Big Analytics Roundup (July 13, 2015) | The Big Analytics Blog on July 13th, 2015 12:51 pm

    […] Monash explains […]

  7. Data messes | DBMS 2 : DataBase Management System Services on August 3rd, 2015 5:58 am

    […] It’s been part of BI since the introduction of Business Objects’ “semantic layer”. (See, for example, my recent post on Zoomdata.) […]

  8. Narendran Thillaisthanam on August 23rd, 2015 10:26 am

    Justin – Zoomdata is great for real-time viz. I tested basic data ingestion of clickstream data from Kafka into ZD (using the upload API) via Spark Streaming. It was pretty easy to set up, and I had to write less than 100 lines of code end-to-end!

    I'm wondering if you guys are looking at Druid. Druid offers decent time-series-based OLAP and a single query interface for real-time and batch data. I think connecting ZD to Druid would provide real benefits.

    Any Druid connectors? Thoughts?

  9. Monitoring | DBMS 2 : DataBase Management System Services on March 26th, 2017 7:16 am

    […] via approximate query results. This can be done entirely via your BI tool (e.g. Zoomdata’s “query sharpening”) or more by your DBMS/platform software (the Snappy Data folks pitched me on that approach this […]
