August 2, 2009

Teradata 13 focuses on advanced analytic performance

Last October I wrote about the Teradata 13 release of Teradata’s database management software. Teradata 13, which will be used across the various Teradata product lines, has now been announced for GCA (General Customer Availability)*. So far as I can tell, there were two main points of emphasis for Teradata 13:

To put it even more concisely, the focus of Teradata 13 is on advanced analytic performance, although there of course are some enhancements in simple query performance and in analytic functionality as well.

*Teradata development chief Scott Gnau said a couple of customers have already received Teradata 13, although this was recent enough that presumably nobody has it in production. But let’s not take all that too literally, since — for example — I heard nothing about the length or breadth of the beta cycle.

As just one example, when I asked Scott what was different between Teradata 13 as it is shipping now vs. Teradata 13 as it was foreshadowed back in October, he cited:

But the parts of Teradata 13 that Scott already discussed back in October, 2008 largely boil down to performance and/or UDFs as well.

Scott also foreshadowed an area of emphasis for future Teradata releases — temporal data analysis. Teradata 13 offers a new PERIOD datatype, which Scott thinks is a “sleeper” on its own for the value customers will find in it. And Scott made it clear that Teradata plans much more functionality for temporal data analysis in the future.

As I understand it, PERIOD works like this: Suppose you have a table that maintains, say, address or employment status. When you update it, you naturally create a Start_Date and End_Date for the validity of certain information. Teradata’s PERIOD datatype automagically uses this to maintain a Period where information was true, even when that period is wholly in the past. Thus when you update a row with new information, you wind up with two rows — the newly changed row, and also a second row with the old information and an effectiveness period for same.

Note: I have no further detail about Teradata’s PERIOD datatype at this time. Even what I said includes enough guesswork that there are probably at least small errors in it.
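To make the update behavior described above concrete, here is a minimal Python sketch. It is not Teradata syntax and says nothing about how PERIOD is actually implemented; the `Row` class and the `update` and `value_as_of` helpers are hypothetical names invented for illustration, and the half-open validity interval (start inclusive, end exclusive) is my assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Row:
    """One version of a fact, with the period during which it was true."""
    key: str
    value: str
    start: date
    end: Optional[date]  # None means "still true"; assumed convention


def update(table: List[Row], key: str, new_value: str, as_of: date) -> None:
    """Close the currently valid row for `key`, then append the new version.

    After the call the table holds both the old row (with a closed
    period) and the new row (with an open-ended period).
    """
    for row in table:
        if row.key == key and row.end is None:
            row.end = as_of
    table.append(Row(key, new_value, as_of, None))


def value_as_of(table: List[Row], key: str, when: date) -> Optional[str]:
    """Return the value that was valid for `key` on the date `when`."""
    for row in table:
        still_open = row.end is None or when < row.end
        if row.key == key and row.start <= when and still_open:
            return row.value
    return None


# Hypothetical example data: one employee's address history.
addresses = [Row("emp1", "Boston", date(2008, 1, 1), None)]
update(addresses, "emp1", "San Diego", date(2009, 8, 2))
```

After the update, `addresses` holds two rows for `emp1`, and `value_as_of` can answer questions about a period that is wholly in the past as well as about the present.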

The Teradata 13 UDF, in-database data mining, and SAS integration stories seem to go something like this:

Besides UDFs, the other performance focus in Teradata 13 seems to be aggregations and OLAP. One Teradata 13 performance boost lies in aggressive query rewriting. Business intelligence tools, written to support multiple analytic DBMS (including non-current versions), can produce very messy SQL queries. Teradata 13 takes an optimizing compiler mindset to those, and in some cases can get significant speedup as a result. I get the impression there was work on other OLAP and aggregation speed-ups as well.

Also, Teradata 13 added a feature for load performance that Scott cites as being useful in the cases of heavy ETL (actually, it sounded more like ELT — Extract/Load/Transform) and OLAP aggregate-building. Namely, for the first time Teradata lets you turn off hash distribution. Teradata still wants you to hash-distribute whatever you’re going to persist to disk. But if you’re just creating a temporary table that will be dropped as soon as the load process completes, you’re now allowed to skip the hash distribution step. Scott says this can lead to >30% improvements in load performance.
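The trade-off in skipping hash distribution can be illustrated with a toy Python model. This is only a sketch under my own assumptions, not a description of Teradata internals: the number of parallel units, the use of CRC32 as the hash, the round-robin arrival of rows, and the function names are all hypothetical.

```python
import zlib
from typing import List, Tuple

Record = Tuple[str, int]  # (key, payload); hypothetical row shape


def load_with_hash(rows: List[Record], num_units: int) -> Tuple[List[List[Record]], int]:
    """Place each row on the unit chosen by hashing its key.

    Rows are assumed to arrive round-robin (row i lands on unit
    i % num_units). Returns the per-unit contents and the number of
    rows that had to move to a different unit than the one they
    arrived on.
    """
    units: List[List[Record]] = [[] for _ in range(num_units)]
    moved = 0
    for i, (key, payload) in enumerate(rows):
        target = zlib.crc32(key.encode("utf-8")) % num_units
        if target != i % num_units:
            moved += 1  # this row must be shipped to another unit
        units[target].append((key, payload))
    return units, moved


def load_without_hash(rows: List[Record], num_units: int) -> Tuple[List[List[Record]], int]:
    """Leave each row on the unit where it arrived; nothing is moved."""
    units: List[List[Record]] = [[] for _ in range(num_units)]
    for i, row in enumerate(rows):
        units[i % num_units].append(row)
    return units, 0


# Hypothetical staging data for a temporary table.
staging = [(f"k{i}", i) for i in range(100)]
hashed_units, rows_moved = load_with_hash(staging, num_units=4)
plain_units, rows_not_moved = load_without_hash(staging, num_units=4)
```

In this model the hashed load pays for moving rows between units but leaves every key on a predictable unit, which is what you want for data that will be persisted and queried. The unhashed load moves nothing, which is the appeal for a temporary table that will be dropped when the load completes.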

Comments

6 Responses to “Teradata 13 focuses on advanced analytic performance”

  1. SAS on Netezza and other Netezza extensibility | DBMS2 -- DataBase Management System Services on September 3rd, 2009 6:40 am

    [...] data mining scoring offering. My impression is that this is very similar to SAS’ current Teradata support, notwithstanding SAS’ and Teradata’s apparent original intention of offering [...]

  2. Clarifying the state of MPP in-database SAS | DBMS2 -- DataBase Management System Services on May 7th, 2010 2:23 am

    [...] SAS Institute’s multi-year effort to get SAS integrated into various MPP DBMS, specifically Teradata, Netezza Twinfin(i), and Aster Data [...]

  3. jitendra on September 29th, 2010 7:46 am

    is tpt included in terradata 13 express?

  4. harinath on October 11th, 2010 4:47 am

    sir/madam 2 days back am joined teradata course it is going in s/w market very fastly/not it is good decision /not plz tell me

  5. Teradata announcements made very simple | DBMS 2 : DataBase Management System Services on October 25th, 2010 10:58 pm

    [...] Teradata signaled a year ago that its software focus was on adding analytic functionality, including…. [...]

  6. Temporal data, time series, and imprecise predicates | DBMS 2 : DataBase Management System Services on June 20th, 2011 1:11 am

    [...] validity duration. A Wikipedia article seems to cover the subject pretty well, and I touched on Teradata’s bitemporal plans back in [...]
