Developing a good software product is often a process of incremental improvement. Obviously, that happens through feature additions and bug fixes. Less obviously, there’s also great scope for incremental improvement in how the product works at its core.
And it goes even further. For example, I was told by a guy who is now a senior researcher at Attivio: “How do you make a good speech recognition product? You start with a bad one and keep incrementally improving it.”
In particular, I’ve taken to calling the process of enhancing a product’s performance across multiple releases “Bottleneck Whack-A-Mole” (rhymes with guacamole)*. This is a reference to the Whack-A-Mole arcade game, the core idea of which is:
- An annoying mole pops its head up
- You whack it with a mallet
- Another pops its head up
- You whack that one
- Repeat, as mole_count increments to a fairly large integer.
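The loop above is also, more or less, the shape of iterative performance work: profile, whack the worst offender, repeat. Here is a toy Python sketch of that process; the stage names, the halving factor, and `whack_a_mole` itself are all invented for illustration, not taken from any real product:

```python
def whack_a_mole(stage_costs, target_total, improvement=0.5):
    """Repeatedly cut the cost of the worst stage until total cost
    drops below target_total. Returns the number of 'whacks' taken.

    stage_costs: dict mapping stage name -> cost (e.g. seconds per query).
    improvement: fraction of the old cost remaining after each fix
                 (0.5 = each fix halves that stage's cost -- a made-up
                 assumption to keep the toy model simple).
    """
    whacks = 0
    while sum(stage_costs.values()) > target_total:
        worst = max(stage_costs, key=stage_costs.get)  # the mole that popped up
        stage_costs[worst] *= improvement              # whack it
        whacks += 1
    return whacks
```

Note that the loop never whacks the same mole forever: once the worst stage is beaten down, a different stage becomes the maximum, which is exactly the Whack-A-Mole dynamic.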
Improving performance in, for example, a database management system has a lot in common with Whack-A-Mole. Unclog your worst performance bottleneck(s), and what do you get? You get better performance, limited by other bottlenecks, which may not even have been apparent while the first ones were still in place. For example, Oracle is surely going through that now with Exadata. In its very first release, Exadata probably solved the basic I/O problem that had been limiting Oracle’s analytic query performance – edge cases perhaps aside. With that out of the way, Oracle now gets to:
- Attend to the edge cases
- Fix whatever other bottlenecks are next-worst in the highly engineered, highly complex Oracle DBMS.
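A little arithmetic shows why the next-worst bottlenecks "may not even have been apparent" before the first one was removed. The numbers below are invented for illustration only, not actual Oracle/Exadata measurements:

```python
# Hypothetical per-query stage timings, in seconds.
before = {"io": 9.0, "cpu": 0.8, "coordination": 0.2}   # 10.0 s total

def share(timings, stage):
    """Fraction of total query time spent in one stage."""
    return timings[stage] / sum(timings.values())

# With I/O at 90% of runtime, CPU barely registers in a profile.
io_share_before = share(before, "io")    # 0.90
cpu_share_before = share(before, "cpu")  # 0.08

# Whack the I/O mole: suppose smarter storage cuts I/O to 0.5 s.
after = dict(before, io=0.5)             # 1.5 s total

# CPU, formerly invisible at 8%, is now over half the runtime --
# it has become the next bottleneck to whack.
cpu_share_after = share(after, "cpu")    # ~0.53
```

The total speedup (10.0 s down to 1.5 s) is real, but the profile has completely changed, which is why nobody can confidently predict in advance how many such iterations a product will need.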
When I spoke with Oracle’s development managers last fall, they didn’t really know how many development iterations would be needed to get the product truly unclogged. Of course, they professed optimism – which seemed quite sincere – that it wouldn’t be many iterations at all. But they confessed, as well they should have, to not truly knowing.
*In one way, the metaphor falls short – in the game, you have to whack a mole quickly or else you lose your opportunity entirely, while in software the problems just linger until you fix them. Well – who ever said games were PERFECT mirrors of reality?
Netezza is an even better example. Originally, Netezza had a “fat head,” in which a lot of query processing was done at a single master node. They fixed that, whereupon they had to get data redistribution right. Now Netezza’s performance focus is in yet other areas.
And in line with this theory – if you plotted a graph comparing analytic DBMS product age vs. maximum number of concurrent users supported, you could get a strong fit to a monotonically increasing curve. Evidently, concurrent performance is another of those things that takes multiple product revisions to get right.