When I went to Oracle in October, the main purpose of the visit was to discuss Exadata. And so my initial post based on the visit was focused accordingly. But there were a number of other interesting points I’ve never gotten around to writing up. Let me now remedy that, at least in part.
- Oracle has offered compression since 9i Release 2. It’s token/dictionary on columns or groups of columns. Oracle says 2-3X compression is common and 8X not unheard of. Also, Oracle has always had variable-length rows, which can make its data somewhat more compact to start with than some other vendors’.
- Interestingly, Oracle executes SQL statements directly on compressed data.
- In general, Oracle says its compression has very little overhead.
- Oracle has offered transparent encryption since 10g Release 2, on whole columns or individual tables. I didn’t ask about encryption performance.
- In 11g Release 1, Oracle rearchitected its LOB (Large OBject) structure for performance, and perhaps functionality as well. Andy Mendelsohn believes performance now is the same as that of raw file systems. Supported features include encryption, compression, and deduplication. Applications for this new functionality include content management such as video or images and, which I found surprising, spatial data. But it’s not so relevant to text and OLAP data, even though technically those are stored in LOBs as well.
- Actually, Oracle stores OLAP in a true MOLAP array. (I recall that that integration took years.)
- Speaking of Oracle’s geospatial functionality, it’s used heavily in ERP. (I didn’t probe for details.) Oracle also says “all” the mapping vendors are Oracle customers (partners?).
- In Ray Roccaforte’s view, the two greatest drivers of data warehouse growth are consolidation (merger-related or otherwise) and web data.
- Oracle believes that, at least among general-purpose DBMS, its main product is the only one whose bitmaps are actually stored on disk, and that it has by far the best star optimizations.
- While conceding that table scans are important, and that Oracle had bottlenecks in that regard which Exadata is designed to fix, Ray insists there are some applications for which star optimizations really are needed.
- Query pipelining is limited in Oracle, and the optimizer is not geared to optimize streams of queries.
- However, Oracle’s optimizer is sufficiently self-aware to notice when a query runs long and try to do things differently “next time.” For example, it might do more sampling if the statistics proved unreliable, or might take the time to search a bigger solution space.
- Oracle says it already does a lot of CEP-like things to refresh overlapping materialized views simultaneously. (That makes sense.) Oracle is working on truer complex event processing.
- Complexity data point: Oracle’s and SAP’s application suites each have >100,000 tables and >1,000,000 distinct SQL statements.
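
To make the token/dictionary compression point from the top of the list more concrete, here is a toy sketch in Python. It is my own illustration of the general technique, not Oracle's implementation; the function names (`dictionary_encode`, `count_equal`) and the sample column are made up for the example. The idea it shows is that an equality predicate can be checked against the small integer tokens, so the column never has to be expanded back to its original values.

```python
from typing import Hashable, Sequence


def dictionary_encode(values: Sequence[Hashable]) -> tuple[list, list[int]]:
    """Replace each value with a small integer token.

    Returns (dictionary, tokens), where dictionary[token] is the original value.
    """
    dictionary: list = []
    token_of: dict = {}
    tokens: list[int] = []
    for v in values:
        if v not in token_of:
            # First time we see this value: assign it the next token.
            token_of[v] = len(dictionary)
            dictionary.append(v)
        tokens.append(token_of[v])
    return dictionary, tokens


def count_equal(dictionary: list, tokens: list[int], target) -> int:
    """Evaluate `column = target` without decompressing the column."""
    try:
        wanted = dictionary.index(target)  # one lookup in the (small) dictionary
    except ValueError:
        return 0  # the value never occurs in the column
    # Compare integer tokens only; the original values are never rebuilt.
    return sum(1 for t in tokens if t == wanted)


# Hypothetical low-cardinality column, the kind that compresses well this way.
regions = ["EMEA", "APAC", "EMEA", "AMER", "EMEA", "APAC"]
dictionary, tokens = dictionary_encode(regions)
matches = count_equal(dictionary, tokens, "EMEA")
```

In this toy case the six strings shrink to a three-entry dictionary plus six small integers. How close a real system gets to the 2-3X figure above depends on how repetitive the data is and on storage details this sketch ignores.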
There were a couple of notes about text analytics as well, but I’ll blog about those separately.