Automatic redistribution of data warehouse data
In a recent Oracle Exadata FAQ, Kevin Closson writes:
Q. […] don’t some of the DW vendors split the data up in a shared nothing method. Thus when the data has to be repartitioned it gets expensive. Whereas here you just add another cell and ASM goes to work in the background. (depending upon the ASM power level you set.)
A. All the DW Appliance vendors implement shared-nothing so, yes, the data is chopped up into physical partitions. If you add hardware to increase performance of queries against your current dataset the data will have to be reloaded into the new partitioning scheme. As has always been the case with ASM, adding new disks, and therefore Exadata Storage Server cells, will cause the existing data to be redistributed automatically over all (including the new) drives. This ASM data redistribution is an online function.
Hmm. That sounds much like the story I’ve heard from various other data warehousing DBMS vendors as well.
Rather than try to speak for them, however, I’ll just post this and see whether they choose to add anything to the comment thread.
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle | 7 Comments |
Greenplum pricing
Edit: Actually, this post is completely incorrect. The $20K/terabyte is for software only. So far, my attempts to get Greenplum to estimate hardware costs have been unsuccessful.
Greenplum’s Scott Yara was recently quoted citing a $20K/terabyte figure for Greenplum pricing. That naturally raises the question:
Greenplum charges around $20K/terabyte of what?
Categories: Data warehouse appliances, Data warehousing, Greenplum, Pricing | 4 Comments |
Oracle crosses the line on integrity :(
Dana Gardner did a puff interview with Oracle and HP regarding Exadata, and clearly disclosed sponsorship up top. So far, so good. My sponsored work is a lot more independent than that, but I’m probably an outlier at the other extreme. Gardner’s view of what’s ethical here is a common one, and the point of this post isn’t to argue with his choices, nor with those of the people who hired him.
Where things went badly awry is on an Oracle corporate blog, which said: Read more
Categories: Exadata, Oracle | 6 Comments |
Oracle Database Machine and Exadata pricing: Part 2
My Oracle Database Machine and Exadata pricing spreadsheet has been updated. Specifically:
- The first page has been modestly altered to accommodate more chargeable software options, as per the discussion below.
- Accordingly, my new estimate for HP Oracle Database Machine list price is $5,546,000. Per-terabyte prices (user data) are $60K and $198K for the two configurations.
- There’s a whole new second page, for Exadata configurations smaller than a full Oracle Database Machine. Most of the work on that was done by Bence Arató of BI Consulting (Hungary), who graciously gave me permission to post it.
- The lowest per-terabyte Exadata price estimates are about 20% lower than for the full Oracle Database Machine. The difference is due mainly to eliminating Real Application Clusters for a single-node SMP machine, and secondarily to rounding down slightly on server hardware capacity. But these are rough estimates, as neither Bence nor I is a hardware pricing guy.
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle, Pricing | 11 Comments |
Eric Lai on Oracle Exadata, and some addenda
Eric Lai offers a detailed FAQ on Oracle Exadata, including a good selection of links and quotes. I’d like to offer a few comments in response: Read more
Categories: Data warehouse appliances, Data warehousing, Exadata, Greenplum, Netezza, Oracle, Pricing | 4 Comments |
Has there been any progress on SAP over Postgres?
Peter Eisentraut discouragingly reported in January:
What I hear from my acquaintances at SAP, however, is this:
- SAP doesn’t need fancy database features, since the software doesn’t use them.
- Those who don’t want to buy Oracle can use MaxDB; it’s free.
- PostgreSQL doesn’t support in-place upgrades, which makes it unsuitable for multiple terabyte installations typically used by SAP customers.
Has anything changed since then?
And as a trivia challenge, does anybody recognize my science fiction reference in the comment thread there? 🙂 Hint: The dialogue referenced did not occur on the planet Arrakis.
Categories: PostgreSQL | 2 Comments |
Exadata and Oracle Database Machine parallelization clarified
Some kind Oracle development managers have reached out and helped me better understand where Oracle does or doesn’t stand in query and analytic parallelization. This post supersedes prior discussions of the subject over the past week. Read more
Categories: Clustering, Data warehouse appliances, Data warehousing, Exadata, Oracle, Parallelization | 10 Comments |
Oracle Database Machine performance and compression
Greg Rahn was kind enough to recite in his blog what Oracle has disclosed about the first Exadata testers. I don’t track hardware model details, so I don’t know how the testers’ respective current hardware environments compare to that of the Oracle Database Machine.
Each of the customers cited below received “half” an Oracle Database Machine. As I previously noted, an Oracle Database Machine holds either 14.0 or 46.2 terabytes of uncompressed data. This suggests the 220 TB customer listed below — LGR Telecommunications — got compression of a little under 10:1 for a CDR (Call Detail Record) database. By comparison, Vertica claims 8:1 compression on CDRs.
Greg also writes of POS (Point Of Sale) data being used for the demo. If you do the arithmetic on the throughput figures (13.5 vs. a little over 3), compression was a little under 4.5:1. I don’t know what other vendors claim for POS compression.
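For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the 220 TB figure is compared against half of the larger (46.2 TB) configuration, which is what the "a little under 10:1" estimate implies, and it treats "a little over 3" as exactly 3 to get an upper bound on the POS ratio. The function and variable names are made up for illustration.

```python
# Rough check of the compression estimates, using only figures quoted in the post.
# Assumption: the CDR customer's half machine is half of the 46.2 TB configuration.

FULL_MACHINE_TB = 46.2   # uncompressed user data, larger configuration
CUSTOMER_DB_TB = 220.0   # LGR Telecommunications CDR database


def compression_ratio(logical_size: float, physical_size: float) -> float:
    """Return how many units of logical data fit in one unit of physical space."""
    return logical_size / physical_size


half_machine_tb = FULL_MACHINE_TB / 2
cdr_ratio = compression_ratio(CUSTOMER_DB_TB, half_machine_tb)

# Throughput figures from the demo: 13.5 vs. "a little over 3".
# Dividing by exactly 3 therefore gives an upper bound on the ratio.
pos_ratio_upper_bound = compression_ratio(13.5, 3.0)

print(f"CDR compression: roughly {cdr_ratio:.1f}:1")
print(f"POS compression: a little under {pos_ratio_upper_bound:.1f}:1")
```

If the half machine were instead based on the 14.0 TB configuration, the same calculation would give a far higher CDR ratio, so the choice of configuration matters a great deal here.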
Here are the details Greg posted about the four most open Oracle Database Machine tests: Read more
Categories: Data warehouse appliances, Data warehousing, Database compression, Exadata, Oracle, Telecommunications | 9 Comments |
Oracle Exadata list pricing
The figures in this post have now been updated. There’s a new spreadsheet at that link as well.
I’ve been trying to figure out how much Oracle Exadata actually costs. My first cut comes up with prices of $58-190K/TB (user data), based on a total system price of $5,322,000, and user data figures of 28 and 92.4 TB for the two available sizes of disk drive. But of course there are a lot of uncertainties in these figures. You can use this spreadsheet (Edit: That’s the old one) to see where the final numbers come from, and to modify the estimates as you see fit. Read more
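The per-terabyte range follows directly from dividing the total system price by the two user-data capacities. A minimal sketch of that division, using only the figures quoted above (the function name is hypothetical, and the result is rounded to the nearest $1K):

```python
# Per-terabyte list price, from the total system price and user-data capacity.
# All inputs are the estimates quoted in the post; nothing here is an Oracle figure.

TOTAL_SYSTEM_PRICE = 5_322_000        # first-cut estimate, in dollars
USER_DATA_TB = (92.4, 28.0)           # the two available disk drive sizes


def price_per_tb_in_thousands(total_price: float, user_tb: float) -> int:
    """Return the price per terabyte of user data, rounded to the nearest $1K."""
    return round(total_price / user_tb / 1000)


for capacity in USER_DATA_TB:
    per_tb = price_per_tb_in_thousands(TOTAL_SYSTEM_PRICE, capacity)
    print(f"{capacity:5.1f} TB of user data: about ${per_tb}K/TB")
```

Changing `TOTAL_SYSTEM_PRICE` is the quickest way to see how sensitive the per-terabyte figures are to any one assumption in the spreadsheet.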
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle, Pricing | 10 Comments |
Oracle Exadata Smart Scan Join Processing
Oracle has put up an Exadata white paper (hat tip to Kevin Closson’s Exadata FAQ). There’s a section on Smart Scan Join Processing. Sounds exciting, huh? It reads, in its entirety:
Exadata performs joins between large tables and small lookup tables, a very common scenario for data warehouses with star schemas. This is implemented using Bloom Filters, which are a very efficient probabilistic method to determine whether a row is a member of the desired result set.
Jeez. That almost sounds as if Exadata is an immature, Release 1 data warehouse appliance!
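For readers who haven't met Bloom filters, here is a generic sketch of how one can pre-filter a large table before a join against a small lookup table. This is not Oracle's implementation, about which the white paper says nothing more than the passage quoted above. The class, the function names, the `store_id` column, and the filter sizing are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions per key by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def bloom_prefilter(fact_rows, lookup_keys, key_column):
    """Keep only fact rows whose join key might be among the lookup keys."""
    bloom = BloomFilter()
    for key in lookup_keys:
        bloom.add(key)
    return [row for row in fact_rows if bloom.might_contain(row[key_column])]


def exact_join_filter(candidate_rows, lookup_keys, key_column):
    """Remove any false positives the Bloom filter let through."""
    return [row for row in candidate_rows if row[key_column] in lookup_keys]


# Hypothetical data: a large fact table and a few qualifying lookup keys.
lookup_keys = {3, 7, 42}
fact_rows = [{"store_id": i % 100, "amount": i} for i in range(1000)]

candidates = bloom_prefilter(fact_rows, lookup_keys, "store_id")
joined = exact_join_filter(candidates, lookup_keys, "store_id")
print(f"{len(fact_rows)} rows scanned, {len(candidates)} passed the filter, "
      f"{len(joined)} actually join")
```

The point of the technique is that the filter is tiny compared with the lookup table, so it is cheap to ship to wherever the big table is scanned. Because it can return false positives, an exact join still has to run on whatever rows survive.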
Categories: Data warehouse appliances, Data warehousing, Exadata, Oracle | 14 Comments |