When I’m asked to talk to academics, the requested subject is usually a version of “What should we know about what’s happening in the actual market/real world?” I then try to figure out what the scholars could stand to hear that they perhaps don’t already know.
In the current case (Berkeley next Tuesday), I’m using the title “Necessary complexity”. I actually mean three different but related things by that, namely:
- No matter how cool an improvement you have in some particular area of technology, it’s not very useful until you add a whole bunch of me-too features and capabilities as well.
- Even beyond that, however, the simple(r) stuff has already been built. Most new opportunities are in the creation of complex integrated stacks, in part because …
- … users are doing ever more complex things.
While everybody already knows all this on some level, I think it still bears calling out.
I previously encapsulated the first point in the cardinal rules of DBMS development:
Rule 1: Developing a good DBMS requires 5-7 years and tens of millions of dollars.
That’s if things go extremely well.
Rule 2: You aren’t an exception to Rule 1.
- Concurrent workloads benchmarked in the lab are poor predictors of concurrent performance in real life.
- Mixed workload management is harder than you’re assuming it is.
- Those minor edge cases in which your Version 1 product works poorly aren’t minor after all.
My recent post about MongoDB is just one example of the same.
Examples of the second point include but are hardly limited to:
- Hadoop and its ecosystem.
- The general trend of supporting multiple data paradigms in one system …
- … sometimes via schema-on-need.
- DBMS vendors’ work to exploit multiple kinds of storage in one system, from Microsoft to MemSQL.
- WibiData and Kiji.
BDAS and Spark make a splendid example as well.
As to the third point:
- In ever more use cases, the essential simplicity of the relational data model is fundamentally obsolete.
- It’s been generally accepted for a couple of years that analytic data topologies often need to be complex.
- Predictive models are getting markedly more complex as well.
Bottom line: Serious software has been built for over 50 years. Very little of it is simple any more.
Related links:
- An excellent recent write-up of SQL Server and Hekaton.
- A survey of specialized business intelligence stacks.