Word of the day: “Compression”
IBM sent over a bunch of success stories recently, with DB2’s new aggressive compression prominently mentioned. Mike Stonebraker made a big point of Vertica’s compression when last we talked; other column-oriented data warehouse/mart software vendors (e.g. Kognitio, SAP, Sybase) get strong compression benefits as well. Other data warehouse/mart specialists are doing a lot with compression too, although some of that is governed by please-don’t-say-anything-good-about-us NDAs.
Compression is important for at least three reasons:
- It saves disk space, which is a major cost issue in data warehousing.
- It saves I/O, which is the major performance issue in data warehousing.
- In well-designed systems, it can actually make on-chip execution faster, because the gains in memory speed and movement can exceed the cost of actually packing/unpacking the data. (Or so I’m told; I haven’t aggressively investigated that claim.)
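To make the third point concrete, here is a minimal sketch using run-length encoding (RLE) on a sorted, low-cardinality column. RLE is only one illustrative technique, chosen for simplicity; nothing here is taken from any vendor's actual implementation, and the function names and sample data are made up for the example. The idea is that a query can sometimes be answered from the packed form directly, touching far fewer values than the raw column holds.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in runs for _ in range(length)]


def count_equal(runs, target):
    """Count rows equal to target without unpacking the column."""
    return sum(length for value, length in runs if value == target)


# Hypothetical sorted column with few distinct values.
column = ["east"] * 4 + ["north"] * 3 + ["west"] * 5
runs = rle_encode(column)

print(len(column), "raw values stored as", len(runs), "runs")
print("rows where region == 'north':", count_equal(runs, "north"))
```

The count scans three runs rather than twelve values, which is the flavor of saving the claim is about. Whether that outweighs packing and unpacking costs in a real system depends on the data and the engine.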
When evaluating data warehouse/mart software, take a look at the vendor’s compression story. It’s important stuff.
EDIT: DATAllegro claims in a note to me that they get 3-4x storage savings via compression. They also make the observation that fewer disks ==> fewer disk failures, and spin that — as it were 🙂 — into a claim of greater reliability.
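The fewer-disks argument is simple arithmetic, sketched below. Only the 3-4x compression range comes from DATAllegro's note; the data volume, disk size, and annual failure rate are hypothetical numbers I picked for illustration, and the model ignores RAID, mirroring, and spares.

```python
import math


def disks_needed(data_tb, disk_tb, compression_ratio=1.0):
    """Disks required to hold data_tb after compression (no redundancy)."""
    return math.ceil(data_tb / compression_ratio / disk_tb)


def expected_failures_per_year(disks, annual_failure_rate):
    """Expected disk failures per year, assuming independent failures."""
    return disks * annual_failure_rate


DATA_TB = 100.0   # hypothetical warehouse size
DISK_TB = 0.5     # hypothetical disk capacity
AFR = 0.03        # hypothetical 3% annual failure rate per disk

for ratio in (1.0, 3.0, 4.0):
    disks = disks_needed(DATA_TB, DISK_TB, ratio)
    failures = expected_failures_per_year(disks, AFR)
    print(f"{ratio:.0f}x compression: {disks} disks, ~{failures:.1f} failures/year")
```

Under these assumptions, expected failures fall roughly in proportion to the compression ratio. That says nothing about how well a system tolerates each failure, which is the other half of any reliability claim.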
| Categories: Data warehouse appliances, Data warehousing, Database compression, DATAllegro, IBM and DB2, SAP AG, Vertica Systems | 3 Comments |
EnterpriseDB tries PostgreSQL-based Oracle plug-compatibility
Like Greenplum, EnterpriseDB is a PostgreSQL-based DBMS vendor with an interesting story, whose technical merits I don’t yet know enough to judge. In particular, CEO Andy Astor:
- Confirms that EnterpriseDB is OLTP-focused, unlike Greenplum. That said, they are also used for some reporting and so on. But they don’t run 10s-of-terabytes sized data marts.
- Claims EnterpriseDB has a high level of Oracle compatibility – SQL, datatypes, stored procedures (so that would be PL/SQL too), packages, functions, etc.
- Claims ANTs isn’t nearly as Oracle-compatible.
- Claims 50-100% better OLTP performance out of the box than vanilla PostgreSQL, due to auto-tuning.
Also, EnterpriseDB has added a bunch of tools to PostgreSQL – debugging, DBA, etc. And it provides actual-company customer support, something that seems desirable when using a DBMS. It should also be noted that the product is definitely closed-source, notwithstanding EnterpriseDB’s open-source-like business model and its close ties to the open source community.
| Categories: Actian and Ingres, ANTs Software, Data warehousing, Emulation, transparency, portability, EnterpriseDB and Postgres Plus, Mid-range, OLTP, Open source, Oracle, PostgreSQL | 2 Comments |
The five flavors of DB2
I asked Jeff Jones of IBM to explain the various DB2 code lines to me. His answer was so clear that I asked further permission to post it verbatim. Here it is. The main takeaway is that one shouldn’t confuse the shared-everything z/OS (mainframe) version with the more loosely-coupled Unix/Linux/Windows version.
1. DB2 9 for z/OS (CAM note: i.e., mainframe) is a unique code base designed in cooperation with and integrated tightly with the operating system (z/OS) and the hardware (System z). That said, our development and administration tools (the externals of the product), as well as the SQL language supported, are built to be nearly the same across DB2 platforms. DB2 9 for z/OS has a shared-resource architecture similar to Oracle RAC. Parallel Sysplex and other specialized System z hardware enable this high performance, high reliability scenario (that even Oracle has said is well built). Born in 1983.
2. DB2 9 for Linux, UNIX and Windows is a second unique code base. (CAM note: i.e., “open systems”) Roughly 10% of that code base is reserved for platform-specific code to optimize to threading, security, clustering etc. across Linux (quite a few), UNIX (AIX, Solaris, HP-UX) and Windows (many versions). This code base is designed for portability given that we don’t own the underlying hardware in all cases (as we do for DB2 on System z). Much tooling is shared across the other DB2 platforms. Born in 1993.
http://ibm.com/db2/9
http://ibm.com/software/data/db2/linux/validate <--- Linux platforms supported
NOTE: DB2 for Linux runs on all four IBM servers (System z, System p, System i and System x), same code base. Read more
| Categories: IBM and DB2 | 2 Comments |
Greenplum’s strategy
I talked with Greenplum honchos Bill Cook and Scott Yara yesterday. Bill is the new CEO, formerly head of Sun’s field operations. Scott is president, and in effect the marketing-guy co-founder. I still don’t know whether I really believe their technical story. But I do think I have a feel for what they’re trying to do. Key aspects of the Greenplum strategy include:
- Greenplum rewrote a lot of PostgreSQL to parallelize it, in the correct belief that MPP is the best way to go for high-end data warehousing.
- Indeed, Greenplum claims to have a general solution to DBMS parallelization. Unlike Netezza, DATAllegro, Vertica, and Kognitio, Greenplum offers a row-oriented data store with a fairly full set of indexing techniques. You want star indices or bitmaps? They have them. (They even claimed to be used for some text management when last we talked, although that was for O’Reilly and Mark Logic seems to be O’Reilly’s main text-indexing vendor.)
- Greenplum’s main sales strategy is to be part of Sun’s product line, bundled into Thumper boxes as single-part-number Sun offerings. They certainly could add other hardware OEMs, just like Checkpoint sells firewalls through multiple appliance vendors. But at least for now it’s all about Sun.
| Categories: Data warehouse appliances, Data warehousing, Greenplum, Open source, PostgreSQL | 5 Comments |
Ingres tries to become relevant again
Ingres has non-trivial resources – 300 employees, 10,000 “real” customers, and some additional large number of installations embedded in CA products. It has a fairly pure support-only open source revenue model, although there may be exceptions to that in cases such as the DATAllegro relationship.
Should anybody care?
Yes and no. To compete effectively in the mid-range OLTP relational database management system market, you need a product that’s much easier to administer than Oracle, and preferably easier even than Microsoft SQL*Server. Ingres doesn’t meet that standard. Until it does, it probably won’t have much of a market outside its current installed base. But some of Ingres’s strategies and directions are pretty clever, and may be interesting to people who’d never actually consider using Ingres technology. Specifically, Ingres has plans in the areas of appliances and database services, two subjects that are close to my heart. Read more
| Categories: Actian and Ingres, DATAllegro | 2 Comments |
DBMS market competitive overview (Part 1)
Monash Advantage members just received an exclusive nine-page Monash Letter with a competitive overview of the DBMS industry. The full analysis is exclusive to them, but I’ll give some highlights here.
1. As per my recent “deck-clearing” posts, there’s a lot more competitive opportunity in the DBMS industry than many observers recognize.
2. One reason is the considerable number of separate niches in the DBMS space.
3. Oracle is a classical Geoffrey Moore “gorilla” only in the market for high-end OLTP and mixed-use DBMS. Everything else is up for grabs.
4. As discussed here extensively, simpler appliance-like architectures are beating the overly complex general-purpose DBMS vendors’ solutions for VLDB data warehousing.
5. MPP/shared-nothing architectures are deservedly beating SMP/shared-everything approaches for VLDB data warehousing.
That’s not the only Monash Letter recently released; another one covered online marketing strategy and tactics.
| Categories: Data warehouse appliances, Data warehousing, Database diversity, Oracle, Theory and architecture | Leave a Comment |
Why Oracle and Microsoft will lose in VLDB data warehousing
I haven’t been as clear as I could have been in explaining why I think MPP/shared-nothing beats SMP/shared-everything. The answer is in a short white paper, currently bottlenecked at the sponsor’s end of the process. Here’s an excerpt from the latest draft:
There are two ways to make more powerful computers:
1. Use more powerful parts – processors, disk drives, etc.
2. Just use more parts of the same power.
Of the two, the more-parts strategy is much more cost-effective. Smaller* parts are much more economical, since the bigger the part, the harder and more costly it is to avoid defects, in manufacturing and initial design alike. Consequently, all high-end computers rely on some kind of parallel processing.
*As measured in terms of capacity, transistor count, etc., not physical size. Read more
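One way to see the more-parts economics is a toy defect model, sketched below. It assumes a simple Poisson yield curve, in which the fraction of defect-free parts falls exponentially with part size. That model, the defect density, and the function names are my illustrative assumptions, not something taken from the white paper.

```python
import math


def yield_fraction(size, defect_density):
    """Fraction of parts with zero defects under a Poisson defect model."""
    return math.exp(-defect_density * size)


def cost_per_good_part(size, defect_density, cost_per_unit_size=1.0):
    """Build cost of one part divided by the chance that it works."""
    return size * cost_per_unit_size / yield_fraction(size, defect_density)


def cost_for_capacity(total_size, parts, defect_density):
    """Cost of reaching total_size of capacity using `parts` equal parts."""
    return parts * cost_per_good_part(total_size / parts, defect_density)


DEFECT_DENSITY = 0.5  # hypothetical defects per unit of size

one_big = cost_for_capacity(4.0, 1, DEFECT_DENSITY)
four_small = cost_for_capacity(4.0, 4, DEFECT_DENSITY)
print(f"one big part:     {one_big:.2f}")
print(f"four small parts: {four_small:.2f}")
```

With these made-up numbers the single large part costs several times more than four small ones of the same total capacity, because most large parts come out defective. The sketch leaves out the cost of connecting the small parts, which is where the design choices for a parallel architecture come in.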
| Categories: Data warehouse appliances, Data warehousing, DATAllegro, Microsoft and SQL*Server, Netezza, Oracle, Parallelization, Teradata, Theory and architecture, Vertica Systems | 7 Comments |
How Hyperion will change Oracle
Oracle is evidently buying Hyperion Software. Much like Gaul, Hyperion can be divided into three parts:
- Budgeting and consolidation applications, descended from the original Hyperion and Pillar.
- Essbase, the definitive MOLAP engine, descended from Arbor Software.
- A business intelligence suite, descended from Brio.
The most important part is budgeting/planning, because it could help Oracle change the rules for application software. But Essbase could be just the nudge Oracle needs to finally renounce its one-server-fits-all dogma.
| Categories: Analytic technologies, Data warehousing, Microsoft and SQL*Server, MOLAP, Oracle | 17 Comments |
Opportunities for disruption in the OLTP database management market (deck-clearing post #2)
The standard Clayton Christensen “Innovator’s Dilemma” disruption narrative goes something like this:
- Market leaders have many advantages, including top technology.
- Followers come up with good technology too.
- The leaders stay ahead by making their products ever better and more complex.
- The followers sell into new or non-mainstream markets, at prices the leaders can’t match. So they dominate new markets.
- Old markets turn into low-margin commodity-fests.
- Old leaders are screwed.
And it’s really hard for market leaders to avert this sad fate, because the short- and intermediate-term margin hit would be too great.
I think the OLTP DBMS market is ripe for that kind of disruption – riper than commentators generally realize. Here are some key potential drivers:
Read more
OLTP database management system market – the consensus isn’t ALL wrong (deck-clearing post #1)
Most of what I’ve written lately about database management seems to have been focused on analytic technologies. But I have a lot to say on the OLTP (OnLine Transaction Processing) side too. So let’s start by clearing the decks. Here’s a list of some consensus views that I in essence agree with:
- Oracle is the top of the line, and has nothing wrong with it other than cost of ownership and the non-joys of doing business with Oracle Corporation.
- DB2/mainframe is a fine product, but only if you like IBM mainframes.
- DB2/open systems is another fine product, but it’s hard to think of reasons to use it over Oracle.
- Microsoft SQL Server has great cost of ownership if you’re a Windows (server) shop anyway, especially on the administrative side. It does most but not all of what Oracle does.
- Sybase Adaptive Server Enterprise is a lot like SQL Server, but without the Windows dependence or the great Microsoft tools. If you have it installed or are Chinese, you should strongly consider using it, but otherwise there are better alternatives.
- Progress’ DBMS is great if you don’t need any of the features it’s missing. Administration, for example, is a super-low-cost breeze. But why use it unless you’re also using the Progress development tools?
- InterSystems’ Caché is another fine mid-range product that involves buying into the vendor’s whole tool set – all the more so because it isn’t relational.
- Small-footprint embedded DBMS, from vendors such as Sybase’s iAnywhere division or Solid Information Technologies, are off in their own little world. Mainly, that world is telecom, with a satellite in medical devices, although other kinds of networked equipment also sometimes use these products.
- IBM’s non-DB2 database management products – IMS, Informix, etc. – are fine things to stick with until you have to change. Ditto products from Software AG, Computer Associates, Cincom, etc.
- MySQL Version 4 is an OLTP joke, but it’s a joke many people share. (Hey — a lot of blogs, including mine, run on WordPress and MySQL 4.)
- Until Ingres is meaningfully marketed and sold outside its installed base, it’s not worth worrying about.
- PostgreSQL is more significant as the underpinning of other products — mainly EnterpriseDB in the OLTP space — than it is in its own right.
