I sometimes describe database management systems as “big SQL interpreters,” because that’s the core of what they do. But it’s not all they do, which is why I describe them as “electronic file clerks” too. File clerks don’t just store and fetch data; they also put a lot of work into neatening, culling, and generally managing the health of their information hoards.
Already 15 years ago, online backup was as big a competitive differentiator in the database wars as any particular SQL execution feature. Security became important in some market segments. Reliability and availability have been important from the get-go. And manageability has been crucial ever since Microsoft lapped Oracle in that regard, back when SQL Server had little else to recommend it except price.*
*Before Oracle10g, the SQL Server vs. Oracle manageability gap was big.
Now data warehousing is demanding the same kinds of infrastructure richness.* When you’re loading data nightly or weekly, and using it to run canned reports, the system can burp for a few hours without anybody getting sacrificially fired. But we’re entering an era of “operational BI,” in which analytics gets integrated tightly into (for example) customer-facing systems, websites and call centers alike. All of a sudden, data warehouses need OLTP-like reliability. Surely not coincidentally, I’ve recently found myself noting new data warehouse backup strategies from Oracle and Aster Data alike.
*Obviously, there have been a few cases where this was needed all along. But my sense is that the numbers are now growing a lot.
In no way do I want to deny that data warehouse performance is crucial. I’m just saying that the other stuff now matters a lot too.
OLTP-like robustness isn’t the only way in which data warehousing issues go “beyond query”; another important subject is high-performance analytics.