I spent about six hours at Oracle today — talking with Andy Mendelsohn, Ray Roccaforte, Juan Loaiza, Cetin Ozbutun, et al. — and plan to write more later. For now, let me pass along a few quick comments.
- The key philosophical point that I had perhaps been missing is that Oracle thinks there is and should be a storage (server) tier, just as there also are database (server), application (server), and web (server) tiers.
- Exadata cells are designed to never talk with each other. Instead, they talk to a set of InfiniBand switches, which then talk to a grid of servers on the database tier. Oracle thinks this has solved its I/O bandwidth problem once and for all. It’s hard to see why that wouldn’t be the case.
- What Exadata does on the storage tier in query execution is throw stuff away. Mainly, this is projection and restriction/SELECT. But if a join has been resolved on a small dimension table, and Oracle is now filtering a fact table to match a value or set of values, the storage tier can do that too.
- Backups are now done (or soon will be?) on the storage tier. I presume the same goes for restore, but I didn’t ask.
- Oracle says that RAC (Real Application Clusters) no longer has much to do with locking (RAC’s antecedents include Distributed Lock Manager, which is why the question arises). Generally, Oracle denies that RAC creates overhead or bottleneck problems.
- Generic UDFs (User-Defined Functions) aren’t automatically parallel on Oracle, unless it’s clear that they only affect single rows (EDIT: or are otherwise assured to be free of side effects). However, Oracle has itself implemented and shipped long lists of parallel UDFs.
- Oracle insists that even without Exadata, it has lots of users in the 10+ terabyte range, and a few over 100 terabytes. Oracle further insists that this is with the most conservative kind of counting: single database, true user data, etc. Oracle estimates that Teradata Petabyte Power Player Dell would only be at 300 terabytes by this kind of counting.
- I didn’t focus on admittedly worthy questions like “Across how many cores do these parallel analytics really scale?” or “How easy is it to provision a new node in a RAC cluster if one goes down?”
- The SAS drive option has been increased from 300-gigabyte to 450-gigabyte drives. Presumably, this will take our estimate of high-end Exadata list pricing down from $198K/TB of user data to $132K. Competing vendors should show similar improvements, however, if they also use new-generation drives.
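
For anybody who wants to check the drive-size arithmetic in the last bullet, here is a minimal sketch. It assumes, and this is my assumption rather than anything Oracle said, that list price and drive count stay fixed, so user-data capacity scales linearly with raw drive size. The function name is purely illustrative.

```python
# Assumption: total list price and number of drives are unchanged, so the
# price per terabyte of user data scales inversely with drive capacity.
OLD_DRIVE_GB = 300
NEW_DRIVE_GB = 450
OLD_PRICE_PER_TB = 198_000  # USD per TB of user data (the earlier estimate)


def scaled_price_per_tb(price_per_tb: float, old_gb: float, new_gb: float) -> float:
    """Return the price per TB after a drive swap, total price held constant."""
    return price_per_tb * old_gb / new_gb


new_price = scaled_price_per_tb(OLD_PRICE_PER_TB, OLD_DRIVE_GB, NEW_DRIVE_GB)
print(f"${new_price / 1000:.0f}K per TB of user data")
```

If the larger drives carry a price premium, or usable capacity does not grow in step with raw capacity, the real figure would come out somewhat higher.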
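
To make the "throw stuff away" point about the storage tier concrete, here is a toy sketch of the idea. It is not Oracle's implementation or API; every name in it (`cell_scan`, the table layouts, the column names) is hypothetical. It only illustrates the three kinds of filtering mentioned above: restriction, projection, and matching a fact table against keys already resolved from a small dimension table.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Set

Row = Dict[str, object]  # hypothetical row format: column name -> value


def cell_scan(
    rows: Iterable[Row],
    columns: List[str],
    predicate: Callable[[Row], bool],
    join_column: Optional[str] = None,
    join_keys: Optional[Set[object]] = None,
) -> Iterator[Row]:
    """Toy storage-tier scan: discard rows and columns before shipping data."""
    for row in rows:
        # Restriction: drop rows that fail the WHERE-style predicate.
        if not predicate(row):
            continue
        # Join filter: drop fact rows whose key is not in the resolved key set.
        if join_keys is not None and row[join_column] not in join_keys:
            continue
        # Projection: ship only the requested columns upstream.
        yield {c: row[c] for c in columns}


# Hypothetical data: a small dimension table and a (tiny) fact table.
stores = [
    {"store_id": 10, "region": "West"},
    {"store_id": 20, "region": "East"},
    {"store_id": 30, "region": "West"},
]
sales = [
    {"sale_id": 1, "store_id": 10, "amount": 25.0, "note": "a"},
    {"sale_id": 2, "store_id": 20, "amount": 5.0, "note": "b"},
    {"sale_id": 3, "store_id": 10, "amount": 7.5, "note": "c"},
    {"sale_id": 4, "store_id": 30, "amount": 40.0, "note": "d"},
]

# The database tier resolves the dimension side first, then pushes the keys down.
west_ids = {s["store_id"] for s in stores if s["region"] == "West"}

shipped = list(
    cell_scan(
        sales,
        columns=["sale_id", "amount"],
        predicate=lambda r: r["amount"] >= 10,
        join_column="store_id",
        join_keys=west_ids,
    )
)
print(shipped)
```

In this toy case only two narrow rows leave the "cell" out of four wide ones, which is the whole point: less data crosses the interconnect to the database tier.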