September 21, 2009

Notes on the Oracle Database 11g Release 2 white paper

The Oracle Database 11g Release 2 white paper I cited a couple of weeks ago has evidently been edited, given that a phrase I quoted last month is no longer to be found. Anyhow, here are some quotes from and comments on what evidently is the latest version.

The In-Memory Database Cache (IMDB Cache) option of Oracle Database 11g Release 2, allows data to be cached and processed in the memory of the applications themselves, off-loading the data processing to middle tier resources. Any network latency between the middle tier and the back-end database is removed from the transaction path, with the result that individual transactions can often be executed up to 10 times faster. This is particularly useful where very high rates of transaction processing is required, such as those found under market trading systems, Telco switching systems, and Real Time manufacturing environments. All data in the middle tier is fully protected through local recovery, and asynchronous posting to the back end Oracle Database. With Oracle Database 11g Release 2, the ability to transparently deploy IMDB Cache with existing Oracle applications becomes much easier – with common data types, SQL and PL/SQL support, and native support for the Oracle Call Interface (OCI).

At a guess, this sounds like it’s based on Oracle’s TimesTen acquisition.

Oracle Database 11g Release 2 adds further optimizations, including capabilities to automatically determine the most optimal degree of parallelization for a query, based on available resources. With this comes automated parallel statement queuing, where the database determines that, based on current resource availability, it is more effective to queue a query for later execution once required resources have freed up.

Sounds like a kind of automatic workload management — i.e., the kind of optimization vendors of mature products get around to putting into their systems. It does not sound like query pipelining, however.

Oracle Database 11g Release 2 will automatically distribute a large compressed table (or a smaller non-compressed table), into the available memory across all the servers in the Grid, and will then localize parallel query processing to the data in memory on the individual nodes. This dramatically improves query performance, and is especially useful where large tables can be entirely compressed into the available memory using compression capabilities.

So Oracle caches compressed data. Not stated is which compression techniques are covered.

Each Exadata Storage Server stores up to 7 Terabyte [sic] of uncompressed user data, and also comes enabled with 384 GB of solid-state Flash cache. This Flash Cache automatically caches active data of the magnetic disks in the Oracle Exadata Storage Server, delivering a 10x performance gain for read and write operations under OLTP applications.

Sounds like the Flash memory is positioned for OLTP use.

In the past, Database Administrators and System Administrators have spent a great deal of time determining to how best place data across these disk arrays, to get maximum performance and availability. The best procedure for data placement is to simply Stripe And Mirror Everything; stripe data blocks equally across all disks in an array, and then mirror the blocks on at least two disks. This approach provides the perfect balance between performance, disk utilization, and ease of use.

This is a big part of what could be called the “Administering Oracle doesn’t suck nearly as badly as it used to” pitch. (Mitchell Kertzman, who was Sybase CEO after the mid-1990s meltdown, told me his motto was “We suck less every day.” But I digress …)

Automatic Storage Management (ASM), a feature of Oracle Database 11g automates the striping and mirroring of database without the need to purchase third party volume management software. As data volumes increase, additional disks can be added, and ASM will automatically restripe and rebalance the data across available disks to ensure optimal performance. Similarly, disks that report errors can be removed from the disk array, and ASM will re-adjust accordingly.

I.e., you can add nodes without taking the system down. That’s becoming a pretty standard feature for serious parallel DBMS.

Oracle Database 11g Release 2 improves ASM in significant areas. New intelligent data placement capabilities store infrequently accessed data on the inner rings of the physical disks, while frequently accessed data is placed on the outer rings, offering better performance optimization.

Also pretty standard.

Oracle has been enhancing partitioning capabilities for over ten years. Oracle Partitioning, an option of Oracle Database 11g Release 2, allows very large tables (and their associated indexes) to be partitioned into smaller, more manageable units, providing a “divide and conquer” approach to very large database management. Partitioning also improves performance, as the optimizer will prune queries to only use the relevant partitions of a table or index in a lookup. Oracle Database 11g Release 2 provides multiple methods for partitioning data, and also allows different levels of partitioning on the same table, so that a single partitioning strategy can be used to improve both performance and manageability.

Even better might be a system that doesn’t lean heavily on complex partitioning to achieve good performance.

Oracle Partitioning can also manage the lifecycle of information. Typically, all databases have active data – the information being processed this month or quarter, and historical data that is primarily read-only. Organizations can take advantage of the inherent lifecycle of data to implement a multi-tiered storage solution and lower their overall storage costs. For example, a large table within an order-entry system could contain all the orders processed in the last 7 years. Oracle Partitioning can be used to set up monthly partitions, with the current last four months of order data partitioned onto a high-end storage array, with all the other partitions placed on a lower-cost storage solution, often 2-3 times less cost than the high end storage environment.

This is becoming a standard feature for any parallel DBMS that can support multiple kinds of storage in one system.

Oracle Database 11g also provides advanced compression techniques to further reduce storage requirements. Using Oracle Advanced Compression, an option to Oracle Database 11g, all data in a table can be compressed using a continuous table compression capability that achieves a 2-4 times compression ratio with little performance impact on OLTP or Data Warehousing workloads. This compression technology replaces duplicate values in a table with a single value, and continuously adapts to data changes over time, so compression ratios are always maintained.

Sounds like dictionary/token compression.

With Oracle Database 11g Release 2, the Exadata Storage Servers in the Sun Oracle Database Machine also enable new hybrid columnar compression technology that provides up to a 10 times compression ratio, with corresponding improvements in query performance. And, for pure historical data, a new archival level of hybrid columnar compression can be used that provides up to 50 times compression ratios.

I thought they said 40X before. But even if my memory isn’t playing tricks regarding that, single-point compression ratio estimates are always very approximate.

Any hardware component in an Oracle Grid can be dynamically added or removed as required. Disks can be added or removed online with ASM, with the data automatically rebalanced across the new disk infrastructure. Additional servers can also be easily added or removed to a Real Application Cluster with users connected to these nodes rebalanced across the infrastructure. This ability to migrate users from one server to another in a RAC cluster also enables rolling patching of the database software. If a patch needs to be applied, then a server can be removed from the cluster, patched, and then put back into the cluster. The same operation can be repeated for the next server in the cluster, and so on.

Nice. And the paper goes on in that vein for quite a while.

Oracle Total Recall, an option to Oracle Database 11g, provides a solution for the retention of historical information. With Oracle Total Recall, all changes made to data are kept to provide a complete change history of information. This means that auditors can not only see who did what when, but they can also see what the actual information was at the time – something that previously has only be [sic] available by building into the application, or by expensive backup retention policies.

Timestamping/time-travel/whatever is increasingly becoming a standard feature as well, especially given the number of PostgreSQL-based DBMS on the market.

New internal control requirements found in regulations can be difficult and expensive to implement in an environment with multiple applications. Oracle Database Vault, an option to Oracle Database 11g, allows access controls to be transparently applied underneath existing applications. Users can be prevented from accessing specific application data, or from accessing the database outside of normal hours; separation-of-duty requirements can be enforced for different Database Administrators without a costly least privilege exercise. And Oracle Advanced Security, an option to Oracle Database 11g, can be used to transparently encrypt data at all levels – data in transit on the network; data at rest on physical storage and in backups. Similarly, the Data Masking pack can be used to obfuscate data as it moves from production to development, reducing the potential violation of privacy regulations or risking sensitive data leaks.

Oracle is the gold standard in database security.

Oracle’s self-management approach takes two tacks. Firstly, wherever possible, repeatable, labor intensive and error prone tasks that can be fully automated in the database have been. For example, Storage Management, Memory Management, Statistics collection, Backup and Recovery, and SQL Tuning have all been automated. Secondly, where operations cannot be fully automated, intelligent advisors are built into the database to mentor Database Administrators on how to get the best out of their systems. Advisors are provided for Configuration Management, Patching, Indexing, Partitioning, Performance Diagnostics, Data Recovery, and, new in Oracle Database 11g Release 2, Compression and Maximum Availability.

And boy are they needed.

Recent studies performed by an independent research company shows that Database Administrators can expect to spend 26% less time managing their 11g environments over their 10g environments, and as much as 50% when compared to older Oracle9i deployments.

50% of way too much is still way too much.

Comments

8 Responses to “Notes on the Oracle Database 11g Release 2 white paper”

  1. Daniel Lemire on September 21st, 2009 4:47 pm

    If you are using something like dictionary compression, then it is entirely meaningful to talk about some kind of average compression ratio. After all, we all know how well zip or png compresses.

    However, if you are doing anything fancier (as the terminology suggests “Oracle Advanced Compression”) then it is very likely that your compression rate will depend strongly on the size of the tables. It is just a lot easier to massively compress a multiterabyte table (assuming you have such a thing) than a 1-million-row table.

    There is got to be a better way to report the compression rates!

  2. JA on September 22nd, 2009 3:45 am

    “50% of way too much is still way too much.” couldn’t agree more.

  3. Glenn Davis on September 22nd, 2009 6:16 am

    It’s regrettable that an organization as large and experienced as Oracle cannot do better on compression. By their own admission, ‘Oracle Advanced Compression’ achieves only 38 percent compression factor on the TPC-H ‘lineitem’ benchmark table.

    If that’s ‘Advanced’, I’d hate to see ‘Retarded’! My (i.e., Xtreme Compression’s) compressor reaches 85 percent on the same data.

  4. Daniel Weinreb on September 22nd, 2009 8:18 am

    In this IMDB Cache option, what happens if some other client tries to access the data? That is, if there are two processes running the Oracle client, and they both open the same Oracle database, and at least one of them is using the IMDB cache, so that data from the shared database is living in one of the clients, and the other client tries to access that data, what happens?

    The ObjectStore database (which I co-architected) handled this by having the server talk to the client and revoke control of the data that was requested by the other client (unless they both only needed to read the data rather than write it). Does Oracle DB really do this? Or can you only use this option if only one middle-tier process is using the data?

  5. Curt Monash on September 22nd, 2009 11:49 am

    Dan,

    If IMDB is just TimesTen, it lives in the middle tier, and it presumably knows how to talk to more than one user at once.

  6. Greg Rahn on September 22nd, 2009 2:00 pm

    @Glenn Davis

    For your information, the “Advanced Compression” name actually refers to a database option (not a compression algorithm) that includes a number of compression capabilities that utilize compression in various places (table/RMAN/SecureFiles/etc). You can familiarize yourself with the specifics in this whitepaper.

    How much overhead is there for reading/decompression with said Xtreme Compression’s algorithm and what databases use that algorithm today?

    BTW, there is no adverse impact on read operations with Oracle’s BASIC or FOR OLTP table compression options (both use the same algorithm, they just apply it differently). With databases, it is not just about compression reducing the size, it’s about making queries run faster without significant overhead in doing so. Exadata’s Hybrid Columnar Compression yields even better compression ratios but does introduce slight CPU overhead for the read operations in QUERY mode, or you can apply “max” compression with ARCHIVE mode but in doing so consume a bit more CPU for read operations. It’s all about trade-offs for query performance and space savings, not just how small one can make the data.

  7. The hunt for Oracle Exadata production references | DBMS2 -- DataBase Management System Services on September 25th, 2009 2:22 pm

    [...] much of a chance to sit down, analyze the tea leaves, and write about Exadata 2. (Small exception: Highlights from and remarks on the Oracle Database 11g Release 2 white paper.) I hope to do that [...]

  8. Thoughts on the integration of OLTP and data warehousing, especially in Exadata 2 | DBMS2 -- DataBase Management System Services on October 5th, 2009 8:14 am

    [...] has repeatedly emphasized that the Flash memory in Exadata 2 is meant to speed up OLTP. By way of contrast, I’ve only noticed one vague claim that Flash memory helps data [...]

Leave a Reply




Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.