Most of my recent data warehouse engine research has been with the specialists. But over the past couple of days I caught up with Oracle and Microsoft (IBM is scheduled for Friday). In at least three ways, it makes sense to lump those vendors together, and contrast them with the newer data warehouse appliance startups:
- Shared-everything architecture
- End-to-end solution story
- OLTP industrial-strengthness carried over to data warehousing
In other ways, of course, their positions are greatly different. Oracle may have a full order-of-magnitude lead on Microsoft in warehouse sizes, for example, and has a broad range of advanced features that Microsoft either hasn’t matched yet or has only just released in SQL Server 2005. Microsoft was earlier in pushing DBA ease as a major product design emphasis, although Oracle has played vigorous catch-up in Oracle10g.
Shared everything. Oracle and Microsoft take a shared-everything approach, as opposed to the MPP/shared-nothing approach favored by Teradata, Netezza, and DATAllegro. This has one big theoretical advantage (all resources are available at all points in the process) and one big theoretical disadvantage (there’s inherent overhead in the synchronization). I agree with the emerging consensus that blades/grids/shared-nothing is the wave of the future, but I know of no simple and obvious proof that this view is correct.
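To make that trade-off a bit more concrete, here is a toy Python sketch of the two styles applied to one small aggregation. It is my own illustration, not a description of how any of these vendors' engines work: the sample rows, the function names, and the use of threads and a lock to stand in for shared resources are all assumptions. In the shared-nothing version each "node" owns a disjoint partition of the rows and aggregates it with no coordination; in the shared-everything version every worker can touch every row and every total, but has to synchronize on the shared state to do so.

```python
import threading
import zlib
from collections import defaultdict

# Hypothetical sample data: (region, sale_amount) rows.
ROWS = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 8)]


def shared_nothing_totals(rows, num_nodes=3):
    """Each node owns a disjoint partition and aggregates it independently."""
    partitions = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        # Hash-partition by key so all rows for one key land on one node.
        node = zlib.crc32(key.encode("utf-8")) % num_nodes
        partitions[node].append((key, amount))

    merged = {}
    for partition in partitions:
        local = defaultdict(int)  # private to this node, so no locking
        for key, amount in partition:
            local[key] += amount
        # Keys never span nodes, so the final merge is a plain union.
        merged.update(local)
    return merged


def shared_everything_totals(rows, num_workers=3):
    """All workers update one shared table, paying for synchronization."""
    totals = defaultdict(int)
    lock = threading.Lock()

    def worker(chunk):
        for key, amount in chunk:
            with lock:  # the synchronization overhead lives here
                totals[key] += amount

    # Any worker may be handed any row; nothing is pre-partitioned by key.
    chunks = [rows[i::num_workers] for i in range(num_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(totals)


if __name__ == "__main__":
    print(shared_nothing_totals(ROWS))
    print(shared_everything_totals(ROWS))
```

Both functions return the same totals. What differs is where the cost lands: the shared-nothing version pays up front to route rows to the right node, while the shared-everything version pays on every update to the shared table.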
End-to-end integration. Proud as they may be of their respective data warehouse DBMS technology, Oracle and Microsoft would ideally like to shift the discussion away from data servers alone and toward end-to-end integration. This could comprise RDBMS, MOLAP capabilities, ETL, BI/reporting, and maybe the whole transactional technology stack as well. Metadata integration gets mentioned a lot in such discussions.
Industrial-strengthness. Although neither vendor framed the argument to me in exactly that way, both gain a lot of real and perceived advantages from offering single products used for both OLTP and data warehousing. Lots of industrial-strength stuff that is required for OLTP (security, high availability, etc.) is also very nice for data warehousing. Indeed, Oracle explicitly tries to tie such things to compliance and, for that and other reasons, to convince customers that such features are must-haves. Similarly, robust concurrency can be assumed for systems with OLTP credentials. And Oracle and Microsoft automatically have operational data warehouse stories, which the smaller appliance vendors are not yet well-positioned to compete with. (Teradata, of course, is a different matter.)