I’ve long argued that:
- Oracle and Microsoft are doomed in the data warehouse market unless they acquire MPP/shared-nothing data warehouse DBMS and/or data warehouse appliances.
- DATAllegro is the ideal acquisition for either of them.
Microsoft has now validated my claim by agreeing to buy DATAllegro. As you probably know, we’ve been covering DATAllegro extensively, as per the links listed below.
Basic deal highlights include:
- A definitive agreement has been signed.
- Deal closing is expected in a few weeks.
- My impression is that the undisclosed price is a nice step-up from the Series D round that closed a few months ago.
- DATAllegro CEO Stuart Frost will run an engineering division, based at DATAllegro’s current headquarters, reporting into Microsoft’s SQL Server division. He seems to be locked into staying at Microsoft for at least a couple of years.
- DATAllegro’s software will be ported from its current Linux/Ingres stack to Windows/SQL Server.
- The DATAllegro brand name will probably go away.
- Everything else is either undisclosed or truly not yet decided. In particular, there’s no word as to whether Stuart will run any parts of what is now Microsoft.
To understand how DATAllegro will fit into Microsoft’s SQL Server product line, let’s start by reviewing aspects of DATAllegro’s product architecture:
- Each DATAllegro node except the head runs a full copy of Ingres, an OLTP DBMS, over Linux.
- Thus, both data and SQL are shipped from node to node. Like vendors of comparable systems, DATAllegro does a better job each release of either resolving queries on one node or shipping data from peer to peer, rather than sending all intermediate results up to the head node for further processing.
- The whole thing runs on standard blades and EMC storage, plus Cisco InfiniBand.
- The DATAllegro head node runs DATAllegro’s own SQL optimizer. Thus in some ways each query is optimized twice – once overall in the head node, then again as different pieces of the query are run on different nodes. (Actually, it’s generally more accurate to say that each piece of the query is run once per node, but sometimes that isn’t literally true due to considerations of partitioning.)
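To make the "optimized twice" idea concrete, here is a minimal sketch of the general pattern, not DATAllegro's actual code. It assumes in-memory SQLite databases as stand-ins for the per-node Ingres instances, and the `sales` table, the round-robin partitioning, and all function names are hypothetical. It also uses the simplest possible flow, in which every partial result goes back up to the head node, which is exactly what DATAllegro increasingly avoids.

```python
import sqlite3

# Assumption: each "worker node" is an in-memory SQLite database, standing in
# for the full copy of Ingres that each non-head node runs.
def make_node(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn

def partition(rows, n_nodes):
    # Hypothetical round-robin distribution of rows across nodes.
    parts = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        parts[i % n_nodes].append(row)
    return parts

def head_node_query(nodes):
    # Stage 1 (head node): decide which SQL fragment each node should run
    # for the global query "total amount per region".
    local_sql = "SELECT region, SUM(amount) FROM sales GROUP BY region"

    # Stage 2 (worker nodes): each node's own DBMS plans and executes the
    # fragment against its local slice of the data.
    partials = []
    for conn in nodes:
        partials.extend(conn.execute(local_sql).fetchall())

    # Head node merges the per-node partial sums into the final answer.
    totals = {}
    for region, subtotal in partials:
        totals[region] = totals.get(region, 0) + subtotal
    return dict(sorted(totals.items()))

ROWS = [("east", 10), ("west", 5), ("east", 7),
        ("north", 3), ("west", 8), ("east", 1)]
NODES = [make_node(part) for part in partition(ROWS, 3)]
RESULT = head_node_query(NODES)
print(RESULT)
```

The head node here only chooses the fragment and merges results; how well each fragment runs depends on the local DBMS, which is why the capabilities of the single-node engine matter to the overall design.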
Feasibility work has already been done on the port to SQL Server. Stuart reports that the work so far indicates a significant speed-up, which he attributes to data warehouse performance optimizations present in SQL Server that are lacking in the less well-funded Ingres. (Specifically mentioned were star joins and some sort of memory-centric capability.) One interesting implication is that when DATAllegro’s optimizer is rewritten for the port, it will do less than it has been doing to date, since SQL Server needs less “help” in optimizing the single-node parts of queries than Ingres does. The port will also of course involve changes to the file structures, due to the change of both DBMS and operating system; I got the sense that in this area, final decisions truly haven’t yet been made.
And yes – Stuart now confesses that DATAllegro was designed for acquisition from the get-go, e.g. in the choice to incorporate a third-party OLTP DBMS.
Related links (some about to go up)
- Our best cut at DATAllegro’s numbers
- Two white papers (dated March and May of 2007) focusing on DATAllegro’s product architecture
- Why MPP/shared-nothing architectures (which neither Oracle nor Microsoft SQL Server has) will win in the data warehouse market
- Why Oracle’s counterarguments don’t hold water
- Three ways Oracle and Microsoft could go MPP
- Deal prospects in the data warehouse DBMS market
- A general link to all our DATAllegro coverage