Groovy Corp puts out a ridiculous press release
I knew Groovy Corp’s press release today would be bad, as it was pitched in advance as being about an awe-inspiring benchmark. That part met my very low expectations, emphasizing how the Groovy SQL Switch massively outperformed MySQL* in a benchmark, and how this supposedly shows the Groovy SQL Switch would outperform every other competitive RDBMS by at least similar margins.
*While a few use cases are exceptions, being “better than MySQL” for a DBMS is basically like being “better than Pabst Blue Ribbon” for a beer. Unless price is your top consideration, why are you even making the comparison?
Even worse, the press release, from its subhead and very first sentence, emphasizes the claim “the Groovy SQL Switch’s ability to significantly outperform relational databases.” As CEO Joe Ward quickly agreed by email, that’s not accurate. As you would expect from the “SQL” in its name, the Groovy SQL Switch is just as relational as the products it’s being contrasted to. Unfortunately for Joe, who I gather aspires to edit it to say something more sensible, the press release is out already in multiple places.
More favorably, Renee Blodgett has a short, laudatory post about Groovy, with some kind of embedded video.
Categories: Groovy Corporation, In-memory DBMS, Memory-centric data management, MySQL, OLTP | 17 Comments |
What are the best choices for scaling Postgres?
March, 2011 edit: In its quaintness, this post is a reminder of just how fast Short Request Processing DBMS technology has been moving ahead. If I had to do it all over again, I’d suggest they use one of the high-performance MySQL options like dbShards, Schooner, or both together. I actually don’t know what they finally decided on in that area. (I do know that for analytic DBMS they chose Vertica.)
I have a client who wants to build a new application with peak update volume of several million transactions per hour. (Their base business is data mart outsourcing, but now they’re building update-heavy technology as well.) They have a small budget. They’ve been a MySQL shop in the past, but would prefer to contract (not eliminate) their use of MySQL rather than expand it.
My client actually signed a deal for EnterpriseDB’s Postgres Plus Advanced Server and GridSQL, but unwound the transaction quickly. (They say EnterpriseDB was very gracious about the reversal.) There seem to have been two main reasons for the flip-flop. First, it seems that EnterpriseDB’s version of Postgres isn’t up to PostgreSQL’s 8.4 feature set yet, although EnterpriseDB’s timetable for catching up might have been tolerable. But GridSQL apparently is even further behind, with no timetable for up-to-date PostgreSQL compatibility. That was the dealbreaker.
The current base-case plan is to use generic open source PostgreSQL, with scale-out achieved via hand sharding, Hibernate, or … ??? Experience and thoughts along those lines would be much appreciated.
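For what it’s worth, here’s a minimal sketch of what “hand sharding” can look like in application code, assuming hash-on-key routing across a few PostgreSQL nodes and psycopg2 as the driver. All names — the DSNs, the readings table, the device_id column — are hypothetical, purely for illustration:

```python
# Minimal hand-sharding sketch: route each row to one of several PostgreSQL
# nodes by hashing the shard key. DSNs, table, and columns are made up;
# error handling, connection pooling, and resharding are all omitted.
import hashlib
import psycopg2

SHARD_DSNS = [
    "host=pg-shard0 dbname=app user=app",
    "host=pg-shard1 dbname=app user=app",
    "host=pg-shard2 dbname=app user=app",
]

def shard_for(key: str) -> int:
    """Stable hash of the shard key -> index of the owning shard."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % len(SHARD_DSNS)

def upsert_reading(device_id: str, value: float) -> None:
    """Update (or insert) one row on whichever shard owns device_id."""
    dsn = SHARD_DSNS[shard_for(device_id)]
    with psycopg2.connect(dsn) as conn:      # commits on clean exit
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE readings SET value = %s WHERE device_id = %s",
                (value, device_id),
            )
            if cur.rowcount == 0:            # no existing row -> insert
                cur.execute(
                    "INSERT INTO readings (device_id, value) VALUES (%s, %s)",
                    (device_id, value),
                )
```

The catch, and the reason the question above is worth asking, is everything the sketch leaves out: cross-shard queries, rebalancing when a node fills up, and keeping the routing logic consistent across every application that touches the data.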
Another option for OLTP performance and scale-out is of course memory-centric options such as VoltDB or the Groovy SQL Switch. But this client’s database is terabyte-scale, so hardware costs could be an issue, as of course could be product maturity.
By the way, a large fraction of these updates will be actual changes, as opposed to new records, in case that matters. I expect that the schema being updated will be very simple — i.e., clearly simpler than in a classic order entry scenario.
The Groovy SQL Switch
I’ve now had a chance to talk with Groovy Corporation CEO Joe Ward, and can add to what Groovy advisor Tony Bain wrote about Groovy Corp and its SQL Switch DBMS. Highlights include: Read more
Categories: Groovy Corporation, In-memory DBMS, Memory-centric data management, OLTP | 2 Comments |
Oops, I didn’t have caching turned on
My blogs, especially this one, haven’t been very robust in the face of increasing traffic volume. A few minutes ago DBMS2 was down again, and my hosting company called me out for being a resource hog and asked me to optimize.
It turns out that while I’d installed and activated the WP-Cache plug-in, I’d never actually turned caching on. This is now changed. But that means you may see cached pages instead of live ones, e.g. missing responses to your comments. The cache is currently configured to flush every 10 minutes, but that setting could of course change. I plan to make the same change on all five blogs.
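In case it helps anyone picture what the plugin is doing, here’s a rough, generic sketch of time-based page caching — emphatically not WP-Cache’s actual code, just the idea of serving a stored copy of a page until it’s more than 10 minutes old:

```python
# Generic time-based page-cache sketch (illustrative only; not WP-Cache's
# implementation). Serve a stored copy of a page unless it has aged past
# the flush interval, in which case regenerate and re-store it.
import time

CACHE_TTL_SECONDS = 10 * 60   # flush/expire after 10 minutes
_cache = {}                   # url -> (rendered_html, stored_at)

def render_page(url: str) -> str:
    """Stand-in for the expensive work: database queries, templating, etc."""
    return f"<html><body>Fresh content for {url}</body></html>"

def serve(url: str) -> str:
    cached = _cache.get(url)
    if cached is not None:
        html, stored_at = cached
        if time.time() - stored_at < CACHE_TTL_SECONDS:
            return html       # cached copy -- may lag recent comments
    html = render_page(url)
    _cache[url] = (html, time.time())
    return html
```

The practical consequence is exactly the one noted above: a reader can get a copy up to 10 minutes stale, so a comment reply may not show up immediately.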
Anyhow, please let me know if you have any problems that seem related to caching. (Also, if anybody has any experience differentiating between WP-Super Cache (the other main option) and WP-Cache, I’d love to hear about it.)
Thanks! And thanks also for causing these query volume problems in the first place! 🙂
Categories: About this blog | 1 Comment |
Initial reactions to IBM acquiring SPSS
IBM is acquiring SPSS. My initial thoughts (in response to questions from Eric Lai of Computerworld) include:
1) good buy for IBM? why or why not?
Yes. The integration of predictive analytics with other analytic or operational technologies is still ahead of us, so there was a lot of value to be gained from SPSS beyond what it had standalone. (That said, I haven’t actually looked at the numbers, so I have no comment on the price.)
By the way, SPSS coined the phrase “predictive analytics”, with the rest of the industry then coming around to using it. As with all successful marketing phrases, it’s somewhat misleading, in that the category it names is not wholly focused on prediction.
2) how does it position IBM vs. competitors?
IBM’s ownership immediately makes SPSS a stronger competitor to SAS. Any advantage to the rest of IBM depends on the integration roadmap and execution.
3) How does this particularly affect SAP and SAS and Oracle, IBM’s closest competitors by revenue according to IDC’s figures?
If one of Oracle or SAP had bought SPSS, it would have given them a competitive advantage against the other, in the integration of predictive analytics with packaged operational apps. That’s a missed opportunity for each.
One notable point is that SPSS is more SQL-oriented than SAS. Thus, SPSS has gotten performance benefits from Oracle’s in-database data mining technology that SAS apparently hasn’t.
IBM’s done a good job of keeping its acquired products working well with Oracle and other competitive DBMS in the past, and SPSS will surely be no exception.
Obviously, if IBM does a good job of Cognos/SPSS integration, that’s bad for competitors, starting with Oracle and SAP/Business Objects. So far business intelligence/predictive analytics integration has been pretty minor, because nobody’s figured out how to do it right, but some day that will change. Hmm — I feel another “Future of … ” post coming on.
4) Do you predict further M&A?
Always. 🙂
Related links
- Official word from SPSS and IBM
- Blog posts from Larry Dignan and James Taylor
- James Kobielus’s post, which includes the obvious point that Oracle — unlike SAP — has pretty decent data mining of its own
- Eric Lai’s actual article
Categories: Analytic technologies, Cognos, IBM and DB2, Oracle, SAP AG, SAS Institute | 8 Comments |
XtremeData announces its DBx data warehouse appliance
XtremeData is announcing its DBx data warehouse appliance today. Highlights include: Read more
Categories: Benchmarks and POCs, Data warehouse appliances, Data warehousing, Pricing, XtremeData | 34 Comments |
Not-so-great moments in planning
Categories: Analytic technologies, Fun stuff, Humor, Microsoft and SQL*Server | Leave a Comment |
Netezza on concurrency and workload management
I visited Netezza Friday for what was mainly an NDA meeting. But while I was there I asked where Netezza stood on concurrency, workload management, and rapid data mart spin-out. Netezza’s claims in those regards turned out to be surprisingly strong.
In the biggest surprise, Netezza claimed at least one customer had >5,000 simultaneous users, and a second had >4,000. Both are household names. Other unspecified Netezza customers apparently also have >1,000 simultaneous users. Read more
Categories: Data warehouse appliances, Data warehousing, Netezza, Teradata, Theory and architecture, Workload management | 13 Comments |
Vertica customer notes
Dave Menninger of Vertica called to discuss NDA product futures, as vendors tend to do in the weeks before a TDWI conference. So we also talked a bit about the Vertica customer base. That’s listed as 86 at the end of Q2, up from 74 in Q1. That’s pretty small growth compared with Q1, which Dave didn’t fully explain. But then, off the top of his head, he was recalling Q1 numbers as being lower than that 74, so maybe there’s a reporting glitch in the loop somewhere.
Vertica’s two biggest customer segments are telecommunications and financial services, and Dave drew an interesting distinction between what the two groups care about. Telecom companies care about data warehouses that are big and 24/7 reliable, but don’t do particularly complex analytics. Financial services — by which he presumably means mainly proprietary traders — are most focused on complex and competitively innovative analytics.
Also mentioned in various contexts were web-based outfits such as data mart outsourcers, social networkers, and open-source software providers.
Vertica also offers customer win stories in other segments, but most actual discussion about what Vertica does revolves around the application areas mentioned above, just as it has in the past.
Similar (not necessarily identical) generalizations would be true of many other analytic DBMS vendors.
Update on Microsoft’s Madison and Fast Track data warehouse products
I chatted with Stuart Frost of Microsoft yesterday. Stuart is and remains GM of Microsoft’s data warehouse product unit, covering $1 billion or so of revenue. While rumors of Stuart’s departure from Microsoft are clearly exaggerated, it does seem that his role is more one of coordination than actual management.
Microsoft Madison availability remains scheduled for H1 2010. Nothing new there. Tangible progress includes a few customer commitments of various sorts, including one outright planned purchase (due to some internal customer considerations around using up a budget). At the moment various Microsoft Madison technology “previews” are going on, which seem to amount to proofs-of-concept that:
- Start with actual customer data (some from Microsoft, some from outside)
- Generate larger synthesized data sets based on those (database size seems to be 10-100 TB; see the rough sketch after this list)
- Run in Microsoft data centers or “technology centers”, rather than on customer premises.
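To illustrate the “generate larger synthesized data sets” step — this is purely a guess at the general technique, not Microsoft’s actual preview tooling — here’s a rough sketch that inflates a sample of real rows by cloning them with fresh keys and lightly perturbed values:

```python
# Rough sketch of scaling a real data sample up into a larger synthetic set
# by cloning rows with new keys and perturbed numeric values. Illustrative
# only; not Microsoft's actual tooling.
import random

def synthesize(real_rows, scale_factor, jitter=0.05):
    """Yield roughly scale_factor synthetic rows per real row."""
    next_id = max(row["id"] for row in real_rows) + 1
    for row in real_rows:
        for _ in range(scale_factor):
            clone = dict(row)
            clone["id"] = next_id
            next_id += 1
            # Perturb numeric fields a little so the copies aren't identical.
            for key, value in row.items():
                if key != "id" and isinstance(value, (int, float)):
                    clone[key] = value * (1 + random.uniform(-jitter, jitter))
            yield clone

sample = [{"id": 1, "amount": 120.0}, {"id": 2, "amount": 75.5}]
bigger = list(synthesize(sample, scale_factor=1000))  # ~2,000 synthetic rows
```

Presumably the real previews do something considerably more careful about preserving distributions and join keys, but the basic move — inflating a real sample up to the 10-100 TB range — is the same.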
The basic Microsoft Madison product distribution strategy seems to be: Read more