September 27, 2007

The Netezza Developer Network

Netezza has officially announced the Netezza Developer Network. Associated with that is a set of technical capabilities, which basically boil down to programming user-defined functions or other capabilities straight onto the Netezza nodes (aka SPUs). And this is specifically onto the FPGAs, not the PowerPC processors. In C. Technically, I think what this boils down to is:

The applications mentioned in the NDN press release, and I quote directly, are:

  • Multi-dimensional geospatial analytics on comprehensive data sets for risk management
  • Predictive model scoring for customer segmentation, enabling real-time offer provisioning for customers
  • Iterative modeling and analytics on billions of call detail records (CDRs) for telco price optimization
  • Real-time Monte Carlo simulations on terabytes of detail-level data for risk management
  • “Fingerprinting” with hashing algorithms for chain-of-custody document fingerprinting and to ensure that files transferred are intact
  • Fuzzy text search analysis uses algorithms that provide a “best guess” of most likely results

Netezza says that the greatest interest has come from usual-suspect sophisticated users, specifically intelligence agencies and perhaps also financial services firms. But naturally, the partners actually trotted out at Netezza’s user conference were mainly hopeful small-company ISVs. The biggest stir was made by not-so-small SAS, which evidently believes this new capability will provide massive improvements to SAS/Netezza combined performance.

In principle, there are four different ways this new programmability could be a big win:

  • Code might just run faster on FPGAs — or on an MPP system in general — than on standard processors. I don’t currently have an opinion as to whether this situation is likely to arise in practice to any significant degree. (Note to self: Talk with one or both of Netezza partners SAS and SPSS on this subject soon.)
  • A communication bottleneck is eliminated, whereby query result sets currently have to be sent to an application box via gigabit Ethernet (or whatever) to be processed. I’m sure that’s a biggie. Rival vendors, who run on (more) standard hardware, have this problem to a much lesser extent.
  • Network traffic internal to the appliance is also reduced, as data can be massaged right on the node rather than shipped off for processing elsewhere. For some kinds of applications, such as scoring or certain kinds of data reduction, this is surely a big deal. Once again, other MPP data warehouse specialists can and should offer such capabilities too.
  • Non-tabular datatypes can now be supported. E.g., there are small outfits offering XML and geospatial, and Netezza has done some internal work to show off its ability to store and load images. I’ll say more about this in another post, not necessarily tonight.
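
To make the second and third points concrete, here is a toy sketch in Python of why scoring on the node cuts traffic. It is not Netezza's API or anything from the NDN materials. The row layout, the `score` weights, and the function names `ship_then_score` and `score_on_node` are all made up for illustration, and "nodes" are just in-memory lists. It only counts how many items would have to leave the nodes under each approach.

```python
from typing import List, Tuple

# Hypothetical row layout: (customer_id, recency_days, spend)
Row = Tuple[int, float, float]


def score(row: Row) -> float:
    """Toy linear scoring model; the weights are invented for this sketch."""
    _, recency, spend = row
    return 0.7 * spend - 0.3 * recency


def ship_then_score(nodes: List[List[Row]], threshold: float) -> Tuple[List[int], int]:
    """Baseline: every row leaves its node and is scored on the application box."""
    shipped = [row for node in nodes for row in node]
    hits = [row[0] for row in shipped if score(row) > threshold]
    return hits, len(shipped)


def score_on_node(nodes: List[List[Row]], threshold: float) -> Tuple[List[int], int]:
    """Pushdown: each node scores its own rows and ships only qualifying ids."""
    hits: List[int] = []
    shipped = 0
    for node in nodes:
        local_hits = [row[0] for row in node if score(row) > threshold]
        shipped += len(local_hits)
        hits.extend(local_hits)
    return hits, shipped


# Two simulated nodes, two rows each (made-up data).
nodes = [
    [(1, 10.0, 100.0), (2, 50.0, 5.0)],
    [(3, 1.0, 20.0), (4, 30.0, 8.0)],
]

baseline_hits, baseline_shipped = ship_then_score(nodes, threshold=10.0)
pushdown_hits, pushdown_shipped = score_on_node(nodes, threshold=10.0)
print(baseline_hits, baseline_shipped)
print(pushdown_hits, pushdown_shipped)
```

Both approaches return the same customers, but the pushdown version ships only the qualifying ids instead of every row. On a real MPP system the savings depend on how selective the function is and on how wide the rows are, which this sketch does not model.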

    Comments

    8 Responses to “The Netezza Developer Network”

    1. Stuart Frost on September 27th, 2007 10:07 am

      Curt,

      Technically, this looks the same as regular User Defined Functions (UDFs), which we (and some other appliance vendors) already support.

      As you indicate, there can be huge advantages to using UDFs on an MPP system, due to the reduced network traffic and sheer processing power available.

      However, I’ll admit that it’s an interesting marketing spin.

      Stuart

    2. Tom Briggs on September 27th, 2007 1:55 pm

      For the record, I do not believe that NZ’s engine is related to PostgreSQL; they use it on the front end, but I think the actual query processing is an entirely separate beast.

    3. Stuart Frost on September 27th, 2007 2:45 pm

      My understanding is that they started with PostgreSQL and then rewrote the back-end to embed in the FPGA.

      Query processing on a SPU is split between the general purpose CPU and the FPGA, with the latter mostly responsible for restricting rows and projecting columns.

      I’m not sure how much of PostgreSQL is left and I don’t believe they contribute to or benefit from the open source community. Effectively, it’s a proprietary DBMS engine that Netezza develops and supports themselves. Nothing particularly wrong with that, but it’s different to our model.

      Stuart
      CEO, DATAllegro

    4. Tom Briggs on September 27th, 2007 3:58 pm

      So is your model the same as Greenplum’s then?

    5. Curt Monash on September 27th, 2007 6:58 pm

      Well, DATAllegro uses Ingres rather than PostgreSQL, claiming the latter didn’t offer enough support for partitioning. And they’re optimized for a lot less index use than Greenplum is. Not coincidentally, they have less support for exotic indices or datatypes than Greenplum seems to.

      Those are a few differences that come to mind.

      CAM

    6. Stuart Frost on October 1st, 2007 1:15 pm

      Tom,

      Our business model is a little different to Greenplum’s. They offer Bizgres as an open source variant of PostgreSQL and then sell Bizgres MPP under a software license.

      We embed a set of Ingres licenses under our own commercial MPP layer and sell the solution as an appliance on Dell/EMC/Cisco hardware (and Bull/EMC/Cisco in Continental Europe). We contribute most of our changes to Ingres to the open source version, but we don’t use the GPL version, so we can be selective.

      In effect, our model is a hybrid of Netezza’s appliance and Greenplum’s use of an open source, commodity database.

      Stuart
      CEO, DATAllegro

    7. DBMS2 — DataBase Management System Services » Blog Archive » SAS goes MPP on Teradata first on April 25th, 2008 12:08 am

      [...] is more than a theoretical question — well, both SAS and SPSS are disclosed members of the Netezza Developers Network. As for SMP DBMS — well, some of the work certainly could be replicated, but other important [...]

    8. DBMS2 — DataBase Management System Services » Blog Archive » Open source DBMS as a business model on April 25th, 2008 12:10 am

      [...] one example: The Netezza Development Network seems to consist mainly of ISVs and classified-agency government users. Or to be even more [...]
