June 26, 2012

Teradata SQL-H, using HCatalog

When I grumbled about the conference-related rush of Hadoop announcements, one example of many was Teradata Aster’s SQL-H. Still, it’s an interesting idea, and a good hook for my first shot at writing about HCatalog. Indeed, other than the Talend integration bundled into Hortonworks’ HDP 1, Teradata SQL-H is the first real use of HCatalog I’m aware of.

The Teradata SQL-H idea is:

At least in theory, Teradata SQL-H lets you use a full set of analytic tools against your Hadoop data, with little limitation except price and/or performance. Teradata thinks the performance of all this can be much better than if you just use Hadoop (35X was mentioned in one particularly favorable example), but perhaps much worse than if you just copy/extract the data to an Aster cluster in the first place.

So what might the use cases be for something like SQL-H? Offhand, I’d say:

By way of contrast, the whole thing makes less sense for dashboarding kinds of uses, unless the dashboard users are very patient when they want to drill down.

Comments

10 Responses to “Teradata SQL-H, using HCatalog”

  1. Vlad Rodionov on June 26th, 2012 11:20 am

    I think the days of good old MPP databases are over (at least when we talk about “big data analytics”). All attempts to marry Teradata, Aster, Greenplum, etc. with Hadoop look unnatural. A combination of Hive and R probably covers more than 90% of all possible use cases in analytical data processing: extract a sample of data from the Hadoop cluster using Hive and run R scripts on that sample. I am not aware of a single use case where processing 100% of the data (terabytes and petabytes) is a MUST requirement.

  2. Michael Mcintire on June 27th, 2012 12:08 pm

    HCatalog is a “Dictionary” or “Catalog”. It’s used to store the metadata about the structure of data, not the data itself. In this way, Pig and all other implementations can map structure at runtime. Say what you want about “Unstructured” data, but the vast majority of applications bind a structure to the underlying data so it can be consumed… this just makes that declaration portable across platforms. And who is using it? Ask the guys at Yahoo. Indispensable.

  3. Cesar Rojas on July 2nd, 2012 2:47 am

    Hi Vlad,

    I work at Teradata Aster and I appreciate your comments.

    We are very customer driven. We’ve talked to many Hadoop customers before developing the SQL-H functionality.

    Extracting samples and using R may work for some use cases, but the majority of enterprise Hadoop customers want a scalable way to do SQL & BI processing on their Hadoop data. Also, not everyone is willing to go to R, due to the large adoption of SQL-based tools.

    I also understand that R breaks down in the gigabyte range, which is too small (let me know if you have heard otherwise).

    Thanks,
    Cesar

  4. Curt Monash on July 2nd, 2012 4:21 am

    Hi Cesar,

    Thanks for commenting!

    Your “gigabyte range” figure for R breaking down sounds very odd to me. R assumes all data is in memory, which might be what you’re thinking of. But various vendors try to work around even that limitation.

  5. Cesar Rojas on July 2nd, 2012 1:21 pm

    Thanks Curt for the info, it makes sense. Thanks also for writing this note. Regards.

  6. HCatalog — yes, it matters | DBMS 2 : DataBase Management System Services on August 13th, 2012 12:02 pm

    [...] DBMS integrations such as Teradata Aster’s SQL-H. [...]

  7. The Teradata Aster Big Analytics Aster/Hadoop appliance | DBMS 2 : DataBase Management System Services on October 17th, 2012 8:04 am

    [...] A central part of Teradata’s strategy is that Aster and Hadoop nodes can work together via SQL-H. [...]

  8. Hadoop/RDBMS integration: Aster SQL-H and Hadapt | DBMS 2 : DataBase Management System Services on October 17th, 2012 8:05 am

    [...] Hadoop and MapReduce with relational DBMS come from my clients at Teradata Aster (via SQL/MR and SQL-H) and Hadapt. In both cases, the story [...]

  9. Teradata SQL-H | DBMS 2 : DataBase Management System Services on April 15th, 2013 2:46 am

    [...] vendors so often do, Teradata has caused itself some naming confusion. SQL-H was introduced as a facility of Teradata Aster, to complement SQL-MR.* But while SQL-MR is in essence a set of SQL extensions, SQL-H is not. [...]

  10. Aster 6, graph analytics, and BSP | DBMS 2 : DataBase Management System Services on October 10th, 2013 7:42 am

    [...] 6, aka the Teradata Aster Discovery Platform, includes HDFS compatibility, native MapReduce and ways of invoking Hadoop MapReduce on non-Aster nodes or clusters — but even so, you can’t run Hadoop MapReduce within Aster over Aster’s version [...]
