November 15, 2008

High-performance analytics

For the past few months, I’ve collected a lot of data points to the effect that high-performance analytics (i.e., beyond straightforward query) is becoming increasingly important. And I’ve written about some of them at length. For example:

Ack. I can’t decide whether “analytics” should be a singular or plural noun. Thoughts?

Another area that’s come up which I haven’t blogged about so much is data mining in the database. Data mining accounts for a large part of data warehouse use. The traditional way to do data mining is to extract data from the database and dump it into SAS. But there are problems with this scenario, including:

Various interesting fixes have been tried.

Vendors who are putting considerable marketing emphasis on parallel analytics include:

I’m sure others would say they belong on the list as well. It’s an important area of competitive differentiation.
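To make the in-database idea a bit more concrete, here is a minimal sketch of scoring rows with SQL where the data lives, rather than extracting it to an outside tool first. Everything specific in it is an assumption for illustration: SQLite (via Python’s standard library) stands in for a real data warehouse DBMS, the `readings` table and its columns are hypothetical, and the 1.5x “spike” threshold is arbitrary. It also assumes an SQLite build with window function support (3.25 or later).

```python
import sqlite3


def score_in_database(rows):
    """Label each reading 'first', 'spike', or 'normal' using SQL only.

    rows: iterable of (sensor, ts, value) tuples (hypothetical schema).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)

    # ROW_NUMBER() gives each reading a position within its sensor's series,
    # the self-join on rn - 1 fetches the previous reading, and CASE/WHEN
    # plays the role of an IF/THEN. The two subqueries are inline views.
    query = """
        SELECT cur.sensor,
               cur.ts,
               CASE WHEN prev.value IS NULL THEN 'first'
                    WHEN cur.value > 1.5 * prev.value THEN 'spike'
                    ELSE 'normal'
               END AS label
        FROM (SELECT sensor, ts, value,
                     ROW_NUMBER() OVER (PARTITION BY sensor ORDER BY ts) AS rn
              FROM readings) AS cur
        LEFT JOIN (SELECT sensor, value,
                          ROW_NUMBER() OVER (PARTITION BY sensor ORDER BY ts) AS rn
                   FROM readings) AS prev
               ON prev.sensor = cur.sensor AND prev.rn = cur.rn - 1
        ORDER BY cur.sensor, cur.ts
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("a", 1, 10.0), ("a", 2, 11.0), ("a", 3, 30.0),
              ("b", 1, 5.0), ("b", 2, 4.0)]
    for row in score_in_database(sample):
        print(row)
```

The point of the sketch is only that the comparison logic runs as one set-based statement inside the database, so nothing has to be dumped and reloaded elsewhere. A real warehouse would add its own parallelism and would likely use different syntax in places.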

Comments

6 Responses to “High-performance analytics”

  1. Beyond query | DBMS2 -- DataBase Management System Services on November 15th, 2008 3:40 pm

    [...] the only way in which data warehousing issues go “beyond query”; another important subject is high-performance analytics. [...]

  2. Dave on November 24th, 2008 2:11 pm

    You should mention leveraging SQL analytic functions and other SQL capabilities. If one can code complex “in database” SQL, it will often blow the pajamas off the time needed for an equivalent SAS -> data dump -> crunch data approach. IF one can code equivalents to FOR/NEXT loops (e.g. via row_number() with logic), IF/THEN constructs (via CASE/WHEN) and procedural flow (via nested inline views), there are many set-based approaches where one can take on problems previously in the SAS/SPSS/“R” domain. -D

  3. Curt Monash on November 24th, 2008 3:15 pm

    Dave,

    Exactly!

    Would you care to elaborate further? :)

    Best,

    CAM

  4. Dave on November 24th, 2008 4:49 pm

    CAM,
    Sadly, I cannot elaborate much, since most of our SQL-based techniques are IP and can’t be shared in a public forum. I can say that many signal detection, scoring, interpolation, and fuzzy matching techniques can be coded with creative SQL.
    -D

  5. Another dubious “end of computer history” argument | DBMS2 -- DataBase Management System Services on November 26th, 2008 12:04 am

    [...] Kognitio, and Greenplum each have run on configurations with over 100 processors or cores.* Other analytic processing – data mining, geospatial analysis, etc. — benefits from massive parallelization as well. [...]

  6. Gartner’s 2008 data warehouse database management system Magic Quadrant is out | DBMS 2 : DataBase Management System Services on February 5th, 2011 6:17 am

    [...] SQL 2003 and further features in integrated analytics. [...]
