August 21, 2009

Bottleneck Whack-A-Mole

Developing a good software product is often a process of incremental improvement. Obviously, that can happen in the case of feature addition or bug-fixing. Less obviously, there’s also great scope for incremental improvement in how the product works at its core.

And it goes even further. For example, I was told by a guy who is now a senior researcher at Attivio: “How do you make a good speech recognition product? You start with a bad one and keep incrementally improving it.”

In particular, I’ve taken to calling the process of enhancing a product’s performance across multiple releases “Bottleneck Whack-A-Mole” (rhymes with guacamole)*. This is a reference to the Whack-A-Mole arcade game, the core idea of which is:

You can see Whack-A-Mole in a great picture here.

Improving performance in, for example, a database management system has a lot in common with Whack-A-Mole. Unclog your worst performance bottleneck(s), and what do you get? You get better performance, limited by other bottlenecks, which may not even have been apparent while the first ones were still in place. For example, Oracle is surely going through that now with Exadata. In its very first release, Exadata probably solved the basic I/O problem that had been limiting Oracle’s analytic query performance – edge cases perhaps aside. With that out of the way, Oracle now gets to:

When I spoke with Oracle’s development managers last fall, they didn’t really know how many development iterations would be needed to get the product truly unclogged. Of course, they professed optimism – which seemed quite sincere – that it wouldn’t be many iterations at all. But they confessed, as well they should have, to not truly knowing.

*In one way, the metaphor falls short – in the game, you have to whack a mole quickly or else you lose your opportunity entirely, while in software the problems just linger until you fix them. Well – who ever said games were PERFECT mirrors of reality? :)

Netezza is an even better example. Originally, Netezza had a “fat head,” in which a lot of query processing was done at a single master node. They fixed that, whereupon they had to get data redistribution right. Now Netezza’s performance focus is in yet different areas.
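The pattern in these examples can be sketched as a toy model. What follows is only an illustration, assuming a system whose throughput is capped by its slowest stage and a release cycle that speeds up just the current worst stage. The stage names, capacities, and the 4x speedup are all hypothetical and are not measurements of any vendor's product.

```python
def bottleneck(stages):
    """Return (name, capacity) of the slowest stage, which caps throughput."""
    name = min(stages, key=stages.get)
    return name, stages[name]


def whack(stages, speedup=4.0, releases=3):
    """Simulate releases that each fix only the current worst bottleneck.

    Returns one tuple per release:
    (release number, stage fixed, next bottleneck, resulting throughput).
    """
    stages = dict(stages)  # work on a copy; leave the caller's dict untouched
    history = []
    for release in range(1, releases + 1):
        fixed, capacity = bottleneck(stages)
        stages[fixed] = capacity * speedup  # "whack" the worst mole
        next_name, throughput = bottleneck(stages)  # a different one pops up
        history.append((release, fixed, next_name, throughput))
    return history


# Hypothetical stage capacities in arbitrary "queries per hour" units.
stages = {
    "disk_io": 100.0,
    "single_node_merge": 250.0,
    "data_redistribution": 300.0,
    "concurrency_control": 800.0,
}

history = whack(stages)
for release, fixed, next_name, throughput in history:
    print(f"release {release}: fixed {fixed}, "
          f"now limited by {next_name} at {throughput:.0f}")
```

In this sketch a 4x fix to the first stage yields only a 2.5x overall gain, because the next-slowest stage takes over as the limit. By the third release the stage fixed in the first release is the bottleneck again, which is why the number of iterations is hard to predict in advance.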

And in line with this theory – if you plotted a graph comparing analytic DBMS product age vs. maximum number of concurrent users supported, you could get a strong fit to a monotonically increasing curve. Evidently, concurrent performance is another of those things that takes multiple product revisions to get right.

Comments

19 Responses to “Bottleneck Whack-A-Mole”

  1. Jerome Pineau on August 26th, 2009 6:41 pm

    I think this is the one and only CAM post I’ve ever seen without comments :)

    So as mine was too large to fit, I posted it on my blog instead at http://jeromepineau.blogspot.com

    Apologies for the plagiarism.

  2. Curt Monash on August 27th, 2009 12:45 am

    I’m glad you didn’t drop a direct link to the post here, Jerome. It wasn’t one of your smarter ones.

    If you’re claiming that your employer precisely predicted every aspect of product performance in every release, multiple releases and years in advance, I find it hard to believe you.

    If you’re not claiming that, the whole premise of your post is wrong.

  3. Jerome Pineau on August 27th, 2009 10:54 am

    @Curt, no, I’m not implying anything about my employer besides the fact that their engineering practices might, apparently from your post, be superior to those of a company that has been around for 30 years, spent 100s of millions of dollars on development, claims market leadership, costs millions of dollars to buy, but is apparently still “putzing” around with performance issues. If I were a customer reading this, I’d have to think twice about dropping $6M on a product which “probably solved the I/O problem” – I might want a little better reassurance. As far as I know, Oracle has some fairly impressive benchmarks too, so I’m not sure why their own people would express doubt about their capabilities. It’s puzzling to me.

    That is the premise of my post and nothing else.

  4. Curt Monash on August 27th, 2009 12:36 pm

    Oracle expressed doubt because they were honest when I pressed them.

    I am sorry that you do not live up to the same standard in this matter.

    CAM

  5. Jerome Pineau on August 27th, 2009 1:40 pm

    And that’s all in their honor clearly! But I don’t see how I am being dishonest in the least by simply pointing out what you yourself have written about and drawing conclusions. Everyone is free to interpret your paraphrasing of Oracle as they wish.

  6. Justin Swanhart on August 27th, 2009 2:20 pm

    You’ll notice a similar trend with InnoDB recently. As computers get faster, new bottlenecks show up that were not as evident before, so you whack ‘em. When we whack those, new ones show up. Remember that these software products were produced in a time when hardware technology was significantly different, so it is understandable that as time goes by, incremental improvements can be made.

    As far as Exadata goes, didn’t Oracle buy them? It is reasonable to expect that it will take some time for Oracle engineers to fully dig into every last bit of the code.

    It is also possible that due to time constraints, certain performance features or optimizations are not put into development, because it is unclear how useful such optimization will be in the real world, since every database faces a multitude of workload scenarios. Some of these become big bottlenecks which are whack’ed in the next release.

    It is simply the nature of a database product with a long lifecycle.

  7. Justin Swanhart on August 27th, 2009 2:23 pm

    Sorry, my bad, it is homegrown.

  8. Jerome Pineau on August 27th, 2009 2:31 pm

    I totally get that, and in no way am I implying it’s possible or realistic to get everything right on the first try, but at least a plan/goal is nice. Now, clearly you improve release by release (hopefully), but to simply go into such an endeavor strategizing with “oh well, we’ll cross the bridge when we get there, it’s an iterative process anyway” is a little shocking to me from a company of such size and resources. You’d think by now they would have pretty much figured out all this stuff, no? If not, maybe they’ve hit a wall? I mean, correct me if I’m wrong, but this Exadata is basically a storage layer designed to feed faster I/O to the same old RAC database, isn’t it? I find it less than reassuring (thinking in a customer’s shoes) that they would then say hey, we’re not sure how many other rounds we’ll need to “unclog” this thing (not my term, mind you). If that’s the case, then be upfront about it, I say – no shame there. I’m sure Oracle’s engineering teams are probably some of the best in the world. You don’t get to where they are by sucking at doing this :)

  9. Jerome Pineau on August 27th, 2009 2:36 pm

    “Fix whatever other bottlenecks are next-worst in the highly engineered, highly complex Oracle DBMS.”

    And highly expensive, I might add. I guess herein lies the problem. When you have this level of complexity/engineering, you tend to lose control. Inherently, this is the message being put out here from where I stand. I don’t think Oracle’s complexity is a big secret and as you point out, their lifecycle is quite long, which also explains the issue I suppose – It’s hard to control something so huge and so old.

  10. Curt Monash on August 27th, 2009 8:06 pm

    Jerome,

    No, you’re not free to read my paraphrase very differently than I read it.

    Oracle views its development operation through rosy glasses similar to those through which you view yours. But, as I said, they’re honest enough to admit that they could be mistaken.

    I continue to find it regrettable that you are (were) punishing them for their honesty in what looks like an attempt to score cheap marketing points at their expense. Hence my vigorous defense of them against your misrepresentation.

    CAM

  11. Jerome Pineau on August 28th, 2009 10:07 am

    Curt,

    I addressed the same from Greg on my blog so I won’t reiterate here (too much echo). The day I am either competent enough or powerful enough to “punish” anything the size/success of Oracle is not likely to come :)

  12. RC on August 28th, 2009 1:21 pm

    Users will come up with new ways to use a certain technology, and hardware vendors come up with new technology too. So there will always be new and unexpected moles that need to be whacked.

  13. Jerome Pineau on August 28th, 2009 4:16 pm

    @RC: I think I’ll stick to Kevin Closson’s last comment on my blog and leave it at that :)
    Thanks.

  14. Three kinds of software innovation, and whether patents could possibly work for them | DBMS2 -- DataBase Management System Services on March 23rd, 2010 4:19 am

    [...] may be such simple algorithms that they’re not patentable. What’s left over is incremental enhancement. Once again, O’Grady is [...]

  15. Greenplum Chorus and Greenplum 4.0 | DBMS2 -- DataBase Management System Services on April 13th, 2010 10:56 am

    [...] the most part, Greenplum 4.0 is focused on general robustness catch-up and Bottleneck Whack-A-Mole, much like the latest releases from fellow analytic DBMS vendors Vertica and Aster [...]

  16. Infobright’s Release 3.4 | DBMS2 -- DataBase Management System Services on June 27th, 2010 11:09 am

    [...] Performance and bottleneck cleanup. [...]

  17. The One-Hoss Shay | DBMS2 -- DataBase Management System Services on July 6th, 2010 11:11 pm

    [...] often write of Bottleneck Whack-A-Mole, an engineering approach that ensues when parts of a system are out of balance. Well, the flip side [...]

  18. Couchbase technical update | DBMS 2 : DataBase Management System Services on August 15th, 2011 4:26 am

    [...] improvement can indeed be made, given how few resources CouchDB has been able to devote to date to Bottleneck Whack-A-Mole. Categories: Cache, Clustering, Couchbase, Memory-centric data management, MySQL, [...]

  19. An execution worksheet for enterprise IT vendors | Strategic Messaging on January 30th, 2012 1:22 pm

    [...] management, than is needed for initial creation of something cool-but-fragile. What’s more, the schedule of problem-fixing can be hard to predict — if you knew everything about how to fix your product problems, you wouldn’t have [...]
