September 27, 2007

The Netezza Developer Network

Netezza has officially announced the Netezza Developer Network. Associated with that is a set of technical capabilities, which basically amount to programming user-defined functions or other logic straight onto the Netezza nodes (aka SPUs). And this is specifically onto the FPGAs, not the PowerPC processors. In C. Technically, I think what this boils down to is:

Extending Netezza’s SQL via user-defined functions (which probably wasn’t too hard, especially since the Netezza engine is related to PostgreSQL).
Providing a C-to-Verilog compiler.
Providing an application development environment and associated tools. (Presumably rather primitive, but I haven’t really checked it out.)

The applications mentioned in the NDN press release, and I quote directly, are:

Multi-dimensional geospatial analytics on comprehensive data sets for risk management

Predictive model scoring for customer segmentation, enabling real-time offer provisioning for customers

Iterative modeling and analytics on billions of call detail records (CDRs) for telco price optimization

Real-time Monte Carlo simulations on terabytes of detail-level data for risk management

“Fingerprinting” with hashing algorithms for chain-of-custody document fingerprinting and to ensure that files transferred are intact

Fuzzy text search analysis uses algorithms that provide a “best guess” of most likely results

Netezza says that the greatest interest has come from usual-suspect sophisticated users, specifically intelligence agencies and perhaps also financial services firms. But naturally, the partners actually trotted out at Netezza’s user conference were mainly hopeful small-company ISVs. The biggest stir was made by not-so-small SAS, which evidently believes this new capability will provide massive improvements to SAS/Netezza combined performance.

In principle, there are four different ways this new programmability could be a big win:

Code might just run faster on FPGAs — or on an MPP system in general — than on standard processors. I don’t currently have an opinion as to whether this situation is likely to arise in practice to any significant degree. (Note to self: Talk with one or both of Netezza partners SAS and SPSS on this subject soon.)
A communication bottleneck is eliminated, whereby query result sets currently have to be sent to an application box via gigabit Ethernet (or whatever) to be processed. I’m sure that’s a biggie. Rival vendors, who run on (more) standard hardware, have this problem to a much lesser extent.
Network traffic internal to the appliance is also reduced, as data can be massaged right on the node rather than shipped off for processing elsewhere. For some kinds of applications, such as scoring or certain kinds of data reduction, this is surely a big deal. Once again, other MPP data warehouse specialists can and should offer such capabilities too.
Non-tabular datatypes can now be supported. E.g., there are small outfits offering XML and geospatial, and Netezza has done some internal work to show off its ability to store and load images. I’ll say more about this in another post, not necessarily tonight.
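To make the second and third points concrete, here is a minimal sketch in Python (not Netezza's C API, which I haven't examined). Everything in it is an assumption for illustration: the `Row` fields, the toy `score` model, and the idea of counting "values shipped" as a stand-in for network traffic. It compares shipping every row to an application box for scoring against scoring on each node and shipping back only a per-node count.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Row:
    # Hypothetical customer record; the field names are illustrative only.
    customer_id: int
    spend: float
    visits: int


def score(row: Row) -> float:
    """Toy linear scoring model standing in for a user-defined function."""
    return 0.01 * row.spend + 0.5 * row.visits


def make_partitions(num_nodes: int, rows_per_node: int) -> List[List[Row]]:
    """Build deterministic fake data, one list of rows per node."""
    partitions = []
    for node in range(num_nodes):
        part = [
            Row(customer_id=node * rows_per_node + i,
                spend=float(i % 100),
                visits=i % 7)
            for i in range(rows_per_node)
        ]
        partitions.append(part)
    return partitions


def ship_then_score(partitions: List[List[Row]], threshold: float) -> Tuple[int, int]:
    """Every row crosses the network; the application box does the scoring."""
    rows_shipped = 0
    hits = 0
    for part in partitions:
        for row in part:
            rows_shipped += 1  # one row sent off the node
            if score(row) >= threshold:
                hits += 1
    return hits, rows_shipped


def score_on_node(partitions: List[List[Row]], threshold: float) -> Tuple[int, int]:
    """Each node scores its own rows and ships back a single count."""
    values_shipped = 0
    hits = 0
    for part in partitions:
        local_hits = sum(1 for row in part if score(row) >= threshold)
        values_shipped += 1  # only the per-node count leaves the node
        hits += local_hits
    return hits, values_shipped


if __name__ == "__main__":
    data = make_partitions(num_nodes=4, rows_per_node=1000)
    print("ship then score:", ship_then_score(data, threshold=2.0))
    print("score on node:  ", score_on_node(data, threshold=2.0))
```

Both functions return the same number of hits; what differs is how much has to leave the nodes (every row versus one value per node). Real savings would of course depend on the workload and on how much the function actually reduces the data.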

Categories: Data types, Data warehouse appliances, Data warehousing, GIS and geospatial, Netezza, OLTP, Open source, PostgreSQL, SAS Institute, Structured documents, Theory and architecture


Comments

9 Responses to “The Netezza Developer Network”

Stuart Frost on September 27th, 2007 10:07 am

Curt,

Technically, this looks the same as regular User Defined Functions (UDFs), which we (and some other appliance vendors) already support.

As you indicate, there can be huge advantages to using UDFs on an MPP system, due to the reduced network traffic and sheer processing power available.

However, I’ll admit that it’s an interesting marketing spin.

Stuart

Tom Briggs on September 27th, 2007 1:55 pm

For the record, I do not believe that NZ’s engine is related to PostgreSQL; they use it on the front end, but I think the actual query processing is an entirely separate beast.

Stuart Frost on September 27th, 2007 2:45 pm

My understanding is that they started with PostgreSQL and then rewrote the back-end to embed in the FPGA.

Query processing on a SPU is split between the general purpose CPU and the FPGA, with the latter mostly responsible for restricting rows and projecting columns.

I’m not sure how much of PostgreSQL is left and I don’t believe they contribute to or benefit from the open source community. Effectively, it’s a proprietary DBMS engine that Netezza develops and supports themselves. Nothing particularly wrong with that, but it’s different to our model.

Stuart
CEO, DATAllegro

Tom Briggs on September 27th, 2007 3:58 pm

So is your model the same as Greenplum’s then?

Curt Monash on September 27th, 2007 6:58 pm

Well, DATAllegro uses Ingres rather than PostgreSQL, claiming the latter didn’t offer enough support for partitioning. And they’re optimized for a lot less index use than Greenplum is. Not coincidentally, they have less support for exotic indices or datatypes than Greenplum seems to.

Those are a few differences that come to mind.

CAM

Stuart Frost on October 1st, 2007 1:15 pm

Tom,

Our business model is a little different to Greenplum’s. They offer Bizgres as an open source variant of PostgreSQL and then sell Bizgres MPP under a software license.

We embed a set of Ingres licenses under our own commercial MPP layer and sell the solution as an appliance on Dell/EMC/Cisco hardware (and Bull/EMC/Cisco in Continental Europe). We contribute most of our changes to Ingres to the open source version, but we don’t use the GPL version, so we can be selective.

In effect, our model is a hybrid of Netezza’s appliance and Greenplum’s use of an open source, commodity database.

Stuart
CEO, DATAllegro

DBMS2 — DataBase Management System Services » Blog Archive » SAS goes MPP on Teradata first on April 25th, 2008 12:08 am

[…] is more than a theoretical question — well, both SAS and SPSS are disclosed members of the Netezza Developers Network. As for SMP DBMS — well, some of the work certainly could be replicated, but other important […]
DBMS2 — DataBase Management System Services » Blog Archive » Open source DBMS as a business model on April 25th, 2008 12:10 am

[…] one example: The Netezza Development Network seems to consist mainly of ISVs and classified-agency government users. Or to be even more […]
What does Netezza do in the FPGAs anyway, and other questions | DBMS2 -- DataBase Management System Services on August 8th, 2009 3:39 pm

[…] Netezza’s form of UDFs (User-Defined Functions) […]
