I had a good chat with IBM about IBM BLU, aka BLU Accelerator or Acceleration. BLU basics start:
- BLU is a part of DB2.
- BLU works like a columnar analytic DBMS.
- If you want to do a join combining BLU and non-BLU tables, all the BLU tables are joined first, and the result set is joined to the other tables by the rest of DB2.
And yes — that means Oracle is now the only major relational DBMS vendor left without a true columnar story.
BLU’s maturity and scalability basics start:
- BLU is coming out in IBM DB2 10.5, this quarter.
- BLU will initially be single-server, but …
- … IBM claims “near-linear” scalability up to 64 cores, and further says that …
- … scale-out for BLU is coming “soon”.
- IBM already thinks all your analytically-oriented DB2 tables should be in BLU.
- IBM describes the first version of BLU as being optimized for 10 TB databases, but capable of handling 20 TB.
BLU technical highlights include:
- “Complete” pipelining of queries, so that scans can be shared.
- Data skipping, to reduce I/O.
- Vectorization based on SIMD (Single Instruction Multiple Data). It turns out that SIMD is available on Intel and PowerPC chips alike, and allows you to operate at once on as much data as fits into a register. At load time, BLU packs data into vector lengths appropriate for the available silicon.
- Probabilistic caching rather than pure LRU (Least Recently Used). Blocks are more likely to stick in memory if they’ve been referenced more often before. There’s some extra randomization beyond that, for reasons I didn’t wholly grasp.
- “Automatic workload management”, an option that puts an automagically selected cap on the number of simultaneous queries. This cap will rarely be more than one query per core; not coincidentally, IBM believes that contention for resources among queries can be very wasteful. Otherwise, BLU’s “concurrency” model is similar to regular DB2’s.
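Data skipping is generally done with small per-block summaries of the data. Here is a minimal sketch of the idea, assuming zone-map-style min/max synopses; the function and variable names are illustrative, not BLU internals:

```python
# Hypothetical sketch of data skipping: per-block min/max summaries
# ("synopses") let a scan skip blocks that cannot match a predicate.

def build_synopses(blocks):
    """For each block of column values, record its min and max."""
    return [(min(b), max(b)) for b in blocks]

def scan_greater_than(blocks, synopses, threshold):
    """Return matching values, skipping blocks whose max <= threshold."""
    hits = []
    for block, (lo, hi) in zip(blocks, synopses):
        if hi <= threshold:
            continue  # whole block skipped: no I/O or scan work needed
        hits.extend(v for v in block if v > threshold)
    return hits

blocks = [[1, 3, 5], [2, 4, 6], [90, 95, 99]]
syn = build_synopses(blocks)
print(scan_greater_than(blocks, syn, 50))  # only the last block is scanned
```

The win comes when blocks are clustered on the filtered column, so most blocks fail the min/max test and are never read at all.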
Like any other columnar RDBMS or RDBMS option, BLU has a compression story:
- BLU compression options include approximate Huffman coding (which I gather is a form of tokenization), prefix encoding, and something to do with offsets (I don’t know whether that’s straightforward delta compression).
- BLU compression strategies are automagically chosen, one column segment at a time.
- BLU operates on compressed data.
IBM said all the compression algorithms were order-preserving, and hence range predicates can be executed on compressed data. Unfortunately, I neglected to ask how “approximate Huffman coding” could preserve order.
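One common way an encoding can be order-preserving is to assign dictionary codes in sorted value order, so comparisons on codes agree with comparisons on values. A minimal sketch of that general idea (my assumption for illustration, not IBM’s actual algorithm):

```python
# Sketch of an order-preserving dictionary encoding. Codes are assigned
# in sorted value order, so code comparisons agree with value comparisons
# and range predicates can be evaluated on codes directly.

def build_dictionary(values):
    distinct = sorted(set(values))
    return {v: code for code, v in enumerate(distinct)}

values = ["apple", "pear", "fig", "apple", "kiwi"]
dictionary = build_dictionary(values)
encoded = [dictionary[v] for v in values]   # [0, 3, 1, 0, 2]

# Range predicate value >= "fig", evaluated entirely on compressed codes:
lo = dictionary["fig"]
matches = [values[i] for i, c in enumerate(encoded) if c >= lo]
print(matches)  # ['pear', 'fig', 'kiwi']
```

The point is that the predicate never decompresses anything; only qualifying rows need their codes translated back to values.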
As with any columnar system, one has to wonder about how BLU writes data. IBM observed:
- It is recommended that you have BLU commit 10,000 rows at a time.
- Many queries can read uncommitted data.
And so IBM doesn’t think load latency is much of a problem for BLU.
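To make the 10,000-rows-per-commit recommendation concrete, here is a sketch of batched commits, using Python’s sqlite3 module as a stand-in for any SQL database (BLU itself would of course be reached through a DB2 client):

```python
import sqlite3

# Illustration of committing rows in large batches (here 10,000, per the
# recommendation above) so that commit overhead is amortized across rows.

BATCH = 10_000
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id INTEGER, val REAL)")

pending = 0
for i in range(25_000):
    conn.execute("INSERT INTO facts VALUES (?, ?)", (i, i * 0.5))
    pending += 1
    if pending == BATCH:
        conn.commit()   # one commit covers 10,000 inserts
        pending = 0
conn.commit()           # commit the final partial batch

print(conn.execute("SELECT COUNT(*) FROM facts").fetchone()[0])  # 25000
```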
But that’s about all I figured out about how BLU writes data. So IBM kindly followed up with a lengthy email, plus permission to copy it — lightly edited — below:
Getting new data into/out of the database (in memory and on disk) is one of BLU’s strengths. The methods we support include:
We support LOAD (replace or append), SQL INSERT, UPDATE and DELETE, as well as three SQL-based utilities INGEST, IMPORT and EXPORT (not to mention BACKUP and RESTORE, which are also ways of getting data in and out).
As in previous versions of DB2, the SQL-based operations (INSERT, UPDATE, DELETE, INGEST, IMPORT) are in-memory first and hardened to disk asynchronously. LOAD operations are bulk operations to disk (though they tend to be CPU-bound, due to the cost of parsing and formatting data during load). As you suspected, UPDATE operations are not done in place.
Because the syntax and semantics of getting data into the database remain unchanged (an ongoing theme for us), ELT tools like IBM DataStage and partner products work without modification.
- Other columnar vendors have tried to achieve insert performance using delta areas of the table: fast inserts go into a delta area (usually a delta tree or a row-based staging table), and the row or delta data is then asynchronously moved into the columnar area. We consider that a weak strategy, since it means the data has to be loaded twice (once into the staging area and then again into the mainline table). Our approach is to add data directly to the main table, using bulk transformation (i.e. thousands of records at a time) to amortize any latency that would normally come from columnar processing. We believe that by bulk-processing the data we largely eliminate the overhead inherent in columnar processing, and entirely avoid the dual processing that other companies are suffering from.
- We’ve invented a new columnar-specific logging method for BLU Acceleration. Externally it looks exactly like DB2’s traditional log-based transaction recovery (for crash and rollforward recovery), but the bytes within the log are organized in columnar fashion, so we log buffers of column data rather than logging rows. This, in combination with XOR logging (logging only the XOR delta bits), results in a great reduction in log space. As a result, the recovery log is actually much smaller than row-based logging in many cases. Heck, even we were happily surprised by it. Again, as with our storage story (BLU Accelerated tables can co-exist in the same storage and buffer pool as traditional row tables), BLU Accelerated data is logged to the same log files as row-table operations. The integration is completely seamless.
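A minimal sketch of what “logging only the XOR delta bits” could mean, assuming the log records old-image XOR new-image of a buffer (a simplified illustration, not IBM’s actual log format):

```python
# XOR delta logging sketch: instead of logging the full after-image of a
# changed buffer, log old XOR new. Recovery re-derives either image by
# XOR-ing the delta with the other one.

def xor_delta(old: bytes, new: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(old, new))

old = bytes([0x10, 0x20, 0x30, 0x40])
new = bytes([0x10, 0x21, 0x30, 0x40])   # one byte changed

delta = xor_delta(old, new)             # mostly zero bytes: compresses well
assert xor_delta(old, delta) == new     # redo: old ^ delta == new
assert xor_delta(new, delta) == old     # undo: new ^ delta == old
print(delta.hex())  # 00010000
```

Since unchanged bytes XOR to zero, a mostly-unchanged buffer yields a delta that is mostly zeros, which is where the log-space savings come from.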
Finally, last and probably also least:
- “BLU” doesn’t stand for anything, except insofar as it’s a reference to IBM’s favorite color. It’s just a code name that stuck.