IBM sent over a bunch of success stories recently, with DB2’s new aggressive compression prominently mentioned. Mike Stonebraker made a big point of Vertica’s compression when last we talked; other column-oriented data warehouse/mart software vendors (e.g.
Kognitio, SAP, Sybase) get strong compression benefits as well. Other data warehouse/mart specialists are doing a lot with compression too, although some of that is governed by please-don’t-say-anything-good-about-us NDA agreements.
Compression is important for at least three reasons:
- It saves disk space, which is a major cost issue in data warehousing.
- It saves I/O, which is the major performance issue in data warehousing.
- In well-designed systems, it can actually make on-chip execution faster, because the gains in memory speed and movement can exceed the cost of actually packing/unpacking the data. (Or so I’m told; I haven’t aggressively investigated that claim.)
When evaluating data warehouse/mart software, take a look at the vendor’s compression story. It’s important stuff.
EDIT: DATAllegro claims in a note to me that they get 3-4x storage savings via compression. They also make the observation that fewer disks ==> fewer disk failures, and spin that — as it were — into a claim of greater reliability.