Much like Aster Data did in Aster 4.0 and now Aster 4.5, Netezza is announcing a general parallel big data analytic platform strategy. It is called Netezza TwinFin(i), it is a chargeable option for the Netezza TwinFin appliance, and many announced details are on the vague side, with Netezza promising more clarity at or before its Enzee Universe conference in June. At a high level, the Aster and Netezza approaches compare/contrast as follows:
- Netezza’s software runs on well-designed proprietary hardware. Aster runs on hardware that’s more off-the-shelf.
- Aster was first to ship, and will also be first to ship an IDE (Integrated Development Environment).
- MapReduce is central to Aster’s approach. Netezza TwinFin(i) supports MapReduce too, specifically a Hadoop implementation, but I don’t get the sense that everything Netezza does is built on MapReduce underpinnings.
- Both Aster and Netezza try to provide rich functionality for creating in-memory data structures that parallel analytic programs can use. Both seem to let you escape from the pure relational-table paradigm more easily than, say, Teradata’s new persistent memory capabilities do.
- Aster and Netezza have made different choices about what kinds of prebuilt analytic packages to offer. Netezza could actually leapfrog Aster in this regard, but let’s see where each vendor is by, say, mid-year. If you care about the details of built-in analytic functions, you really should consider executing non-disclosure agreements with both those companies.
- Both Aster and Netezza stress that you can run analytic functions out-of-process, greatly reducing the chance that they crash the database. Netezza, and I’m pretty sure Aster as well, also retains the option of running in-process, which provides maximum performance. (In Netezza’s case C++ is the only in-process language supported, and I think Aster has a similar limitation.)
- Like Aster, Netezza is integrating SQL queries and other analytic processing under the same workload management rubric.
- Much like Aster, Netezza is tap-dancing by implying much richer forthcoming SAS support than anything currently announced. (The crunch-per-paragraph ratio in either vendor’s SAS-related press releases to date is distressingly low.)
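To make the out-of-process point above concrete, here is a minimal sketch of the general idea in Python. It is not Netezza's or Aster's actual mechanism, which neither vendor has detailed; the helper name `run_out_of_process` and the two snippets are hypothetical, made up for illustration. The host runs each analytic function in a child process, so a function that dies takes down only the child, and the host just sees a non-zero exit code.

```python
import subprocess
import sys


def run_out_of_process(snippet: str, timeout: float = 10.0):
    """Run a Python snippet in a separate interpreter process.

    Hypothetical helper for illustration only. Returns (ok, message):
    ok is False if the child timed out or exited with a non-zero code.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode != 0:
        # The child died, but the host process (standing in for the
        # database) keeps running and can report the failure.
        return False, f"child exited with code {proc.returncode}"
    return True, proc.stdout.strip()


# A well-behaved "analytic function": sum of squares of 0..9.
good = "print(sum(x * x for x in range(10)))"

# A misbehaving one: the process exits abruptly, simulating a crash.
bad = "import os; os._exit(3)"

if __name__ == "__main__":
    print(run_out_of_process(good))
    print(run_out_of_process(bad))
```

The price of that isolation is process startup and the cost of passing data across the process boundary, which is why an in-process option is still attractive when performance matters most.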
More specifically, here are some highlights of what I know, am guessing, and/or am allowed to say about Netezza TwinFin(i) at this time.
- The foundation for the analytic add-ons in Netezza TwinFin(i) is some sort of low-level “analytic executables.” Not understanding exactly what these are is my biggest area of confusion in the whole TwinFin(i) stack. Are they all C++, with everything translated into same? Is there Java all the way down as an alternative? (E.g., Hadoop is written in Java.) Anyhow, whatever it is, it’s surely a big improvement on Netezza’s prior Verilog-based generation of analytic extensibility technology.
- The announced list of languages supported in Netezza TwinFin(i) is Java, Python, Fortran, R, and C/C++. More are coming.
- Netezza has named a lot of analytic functions it is adding, and is hinting at more to come. It has named CRAN/R and GNU libraries, saying those have 1900 or more functions each. Netezza has also built its own linear algebra library for TwinFin(i), called nzMatrix. And as previously noted, TwinFin(i) also boasts a Hadoop implementation.
- I haven’t heard about much in the way of TwinFin(i)-specific IDE support.
- I don’t really have details as to what kinds of in-memory data structures Netezza TwinFin(i) does or doesn’t support.