Teradata 13 focuses on advanced analytic performance
Last October I wrote about the Teradata 13 release of Teradata’s database management software. Teradata 13, which will be used across the various Teradata product lines, has now been announced for GCA (General Customer Availability)*. So far as I can tell, there were two main points of emphasis for Teradata 13:
- Performance (of course, performance is a point of emphasis for almost any release of any analytic DBMS product), especially but not only in the areas of aggregates, ETL (Extract/Transform/Load), and UDFs.
- UDFs (User Defined Functions), especially but not only in the areas of data mining and geospatial analysis.
To put it even more concisely, the focus of Teradata 13 is on advanced analytic performance, although there of course are some enhancements in simple query performance and in analytic functionality as well. Read more
“The Netezza price point”
Over the past couple of years, quite a few data warehouse appliance or DBMS vendors have talked to me directly in terms of “Netezza’s price point,” or some similar phrase. Some have indicated that they’re right around the Netezza price point, but think their products are superior to Netezza’s. Others have stressed the large gap between their price and Netezza’s. But one way or the other, “Netezza’s price” has been an industry metric.
One reason everybody talks about the “Netezza (list) price” is that it hasn’t been changing much, seemingly staying stable at $50-60K/terabyte for a long time. And thus Teradata’s 2550 and Oracle’s larger-disk Exadata configuration — both priced more or less in the same range — have clearly been price-competitive with Netezza since their respective introductions.
That just changed. Netezza is cutting its pricing to the $20K/terabyte range imminently, with further cuts to come. So where does that leave competitors?
- The Teradata 2550 is in the Netezza price range (still a little below, actually).
- Oracle basically has nothing price-competitive with Netezza.
- Microsoft has said it plans to introduce Madison below the old DATAllegro price points; conceivably, that could be competitive with Netezza’s new pricing. I haven’t checked how much it now costs simply to buy a lot of SQL Server licenses, which presumably would set a lower bound on Madison’s price (and, hardware aside, might be the whole price, since Microsoft likes to create large product bundles).
- XtremeData just launched in the new Netezza price range.
- Troubled Dataupia is hard to judge. While on the surface Dataupia’s prices sound very low, you can’t use a Dataupia box unless you also have a brand-name DBMS (license and hardware) alongside it. That obviously affects total cost significantly.
- Kickfire seems unaffected, as it doesn’t and most likely won’t compete with Netezza (different database size ranges).
- For the most part, software-only vendors are free to adapt or not as they choose. Hardware prices generally don’t need to be over $10K/terabyte, and in some cases could be a lot less. So the question is how far they’re willing to discount their software.
Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Dataupia, Exadata, Kickfire, Oracle, Pricing, Teradata, XtremeData | 14 Comments |
Netezza’s worldwide show-and-tell
In this economy, conference attendance is way down. Accordingly, a number of vendors have reevaluated whether it makes sense to have a traditional big-bang user conference, or whether it might make more sense to do a tour, bringing their message to multiple geographical areas. Netezza has opted for the latter course, something I’ve been well aware of for two reasons:
- Planning for the conferences and for Netezza’s product roll-out is of course coordinated, and product roll-out is something I advise my clients on.
- Netezza engaged me to speak at six different versions of the event (i.e., America and Europe, but not the Far East). There’s still time to contribute suggestions about my talk here.
Apparently, I’ll be talking late morning each time. My dates are:
- September 2, Boston
- September 9, Washington, DC
- September 15, Milan
- September 17, London
- September 24, San Francisco
- September 29, Chicago
The brand name of the events is Enzee Universe. Locations, registration information, and other particulars may be found on the Enzee Universe website.
Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Netezza, Presentations | 2 Comments |
Netezza is changing its hardware architecture and slashing prices accordingly
Netezza is about to make its biggest product announcement in years. In particular:
- Netezza is cutting prices to under $20K/terabyte of user data, with even lower numbers promised for the near future.
- Netezza is replacing its PowerPC chips with Intel-based IBM blades.
- There will be substantial changes in how data flows between the various parts of a Netezza node.
- Netezza claims this will all produce an immediate 10-15X increase in price-performance, based on a 3X cut in price/terabyte and a 3-5X improvement in mixed workload performance. (Edit: Netezza now agrees that it shouldn’t have phrased things that way.)
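As a quick sanity check on those multipliers (my arithmetic, not Netezza’s): compounding a 3X price cut with a 3-5X performance improvement yields 9-15X, not quite the "10-15X" originally claimed.

```python
# Price-performance gain = (price cut) x (performance improvement).
price_cut = 3.0            # 3X cut in price/terabyte
perf_low, perf_high = 3.0, 5.0  # 3-5X mixed-workload speedup

low = price_cut * perf_low   # low end of the compound improvement
high = price_cut * perf_high # high end of the compound improvement
print(low, high)  # → 9.0 15.0
```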
Allow me to explain. Read more
Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Netezza, Pricing, Theory and architecture | 35 Comments |
Groovy Corp puts out a ridiculous press release
I knew Groovy Corp’s press release today would be bad, as it was pitched in advance as being about an awe-inspiring benchmark. That part met my very low expectations, emphasizing how the Groovy SQL Switch massively outperformed MySQL* in a benchmark, and how this supposedly shows the Groovy SQL Switch would outperform every other competitive RDBMS by at least similar margins.
*While a few use cases are exceptions, being “better than MySQL” for a DBMS is basically like being “better than Pabst Blue Ribbon” for a beer. Unless price is your top consideration, why are you even making the comparison?
Even worse, the press release, from its subhead and very first sentence, emphasizes “the Groovy SQL Switch’s ability to significantly outperform relational databases.” As CEO Joe Ward quickly agreed by email, that’s not accurate. As you would expect from the “SQL” in its name, the Groovy SQL Switch is just as relational as the products it’s being contrasted with. Unfortunately for Joe, who I gather aspires to edit the release into something more sensible, it is already out in multiple places.
More favorably, Renee Blodgett has a short, laudatory post about Groovy, with some kind of embedded video.
Categories: Groovy Corporation, In-memory DBMS, Memory-centric data management, MySQL, OLTP | 17 Comments |
What are the best choices for scaling Postgres?
March, 2011 edit: In its quaintness, this post is a reminder of just how fast Short Request Processing DBMS technology has been moving ahead. If I had to do it all over again, I’d suggest they use one of the high-performance MySQL options like dbShards, Schooner, or both together. I actually don’t know what they finally decided on in that area. (I do know that for analytic DBMS they chose Vertica.)
I have a client who wants to build a new application with peak update volume of several million transactions per hour. (Their base business is data mart outsourcing, but now they’re building update-heavy technology as well.) They have a small budget. They’ve been a MySQL shop in the past, but would prefer to contract (not eliminate) their use of MySQL rather than expand it.
My client actually signed a deal for EnterpriseDB’s Postgres Plus Advanced Server and GridSQL, but unwound the transaction quickly. (They say EnterpriseDB was very gracious about the reversal.) There seem to have been two main reasons for the flip-flop. First, it seems that EnterpriseDB’s version of Postgres isn’t up to PostgreSQL’s 8.4 feature set yet, although EnterpriseDB’s timetable for catching up might have been tolerable. But GridSQL apparently is further behind still, with no timetable for up-to-date PostgreSQL compatibility. That was the dealbreaker.
The current base-case plan is to use generic open source PostgreSQL, with scale-out achieved via hand sharding, Hibernate, or … ??? Experience and thoughts along those lines would be much appreciated.
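To make the “hand sharding” option concrete, here is a minimal sketch of the kind of router such a design implies: each row is sent to one of N PostgreSQL instances based on a hash of its key. The names (`SHARD_DSNS`, `shard_for`) and the shard count are my illustrative assumptions, not the client’s actual design; a real deployment would also need a resharding story and a strategy for cross-shard queries.

```python
import hashlib

# Hypothetical list of PostgreSQL instances, one connection string per shard.
SHARD_DSNS = [
    "host=pg0 dbname=app",
    "host=pg1 dbname=app",
    "host=pg2 dbname=app",
    "host=pg3 dbname=app",
]

def shard_for(key: str) -> int:
    """Map a key to a shard index; md5 keeps the mapping stable across runs."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % len(SHARD_DSNS)

# All updates for a given entity consistently land on the same instance:
dsn = SHARD_DSNS[shard_for("customer:12345")]
```

The same routing function would sit in front of whatever persistence layer is chosen (Hibernate Shards took roughly this approach at the ORM level).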
Another option for OLTP performance and scale-out is of course memory-centric options such as VoltDB or the Groovy SQL Switch. But this client’s database is terabyte-scale, so hardware costs could be an issue, as of course could be product maturity.
By the way, a large fraction of these updates will be actual changes, as opposed to new records, in case that matters. I expect that the schema being updated will be very simple — i.e., clearly simpler than in a classic order entry scenario.
The Groovy SQL Switch
I’ve now had a chance to talk with Groovy Corporation CEO Joe Ward, and can add to what Groovy advisor Tony Bain wrote about Groovy Corp and its SQL Switch DBMS. Highlights include: Read more
Categories: Groovy Corporation, In-memory DBMS, Memory-centric data management, OLTP | 2 Comments |
Oops, I didn’t have caching turned on
My blogs, especially this one, haven’t been very robust in the face of increasing traffic volume. A few minutes ago DBMS2 was down again, and my hosting company called me out for being a resource hog and asked me to optimize.
It turns out that while I’d installed and activated the WP-Cache plug-in, I’d never actually turned caching on. This is now changed. But that means you may see cached pages instead of live ones, e.g. missing responses to your comments. The cache is currently configured to flush every 10 minutes, but that setting could of course change. I plan to make the same change on all five blogs.
Anyhow, please let me know if you have any problems that seem related to caching. (Also, if anybody has experience comparing WP Super Cache, the other main option, with WP-Cache, I’d love to hear about it.)
Thanks! And thanks also for causing these query volume problems in the first place! 🙂
Categories: About this blog | 1 Comment |
Initial reactions to IBM acquiring SPSS
IBM is acquiring SPSS. My initial thoughts (questions by Eric Lai of Computerworld) include:
1) good buy for IBM? why or why not?
Yes. The integration of predictive analytics with other analytic or operational technologies is still ahead of us, so there was a lot of value to be gained from SPSS beyond what it had standalone. (That said, I haven’t actually looked at the numbers, so I have no comment on the price.)
By the way, SPSS coined the phrase “predictive analytics”, with the rest of the industry then coming around to use it. As with all successful marketing phrases, it’s somewhat misleading, in that it’s not wholly focused on prediction.
2) how does it position IBM vs. competitors?
IBM’s ownership immediately makes SPSS a stronger competitor to SAS. Any advantage to the rest of IBM depends on the integration roadmap and execution.
3) How does this particularly affect SAP and SAS and Oracle, IBM’s closest competitors by revenue according to IDC’s figures?
If one of Oracle or SAP had bought SPSS, it would have given them a competitive advantage against the other, in the integration of predictive analytics with packaged operational apps. That’s a missed opportunity for each.
One notable point is that SPSS is more SQL-oriented than SAS. Thus, SPSS has gotten performance benefits from Oracle’s in-database data mining technology that SAS apparently hasn’t.
IBM’s done a good job of keeping its acquired products working well with Oracle and other competitive DBMS in the past, and SPSS will surely be no exception.
Obviously, if IBM does a good job of Cognos/SPSS integration, that’s bad for competitors, starting with Oracle and SAP/Business Objects. So far business intelligence/predictive analytics integration has been pretty minor, because nobody’s figured out how to do it right, but some day that will change. Hmm — I feel another “Future of … ” post coming on.
4) Do you predict further M&A?
Always. 🙂
Related links
- Official word from SPSS and IBM
- Blog posts from Larry Dignan and James Taylor
- James Kobielus‘s post, which includes the obvious point that Oracle — unlike SAP — has pretty decent data mining of its own
- Eric Lai‘s actual article
Categories: Analytic technologies, Cognos, IBM and DB2, Oracle, SAP AG, SAS Institute | 8 Comments |
XtremeData announces its DBx data warehouse appliance
XtremeData is announcing its DBx data warehouse appliance today. Highlights include: Read more