Comments on: One vendor’s trash is another’s treasure

By: Kevin Closson

Kevin Closson — Tue, 03 Feb 2009 21:47:14 +0000

Curt,

So now that we’ve finished trudging through the terminology, it should make a lot more sense why I said that the effectiveness of Oracle hash partitioning to handle data skew is between 0% and 100%.

The moral of the story is that how well hash partitioning handles skew depends entirely on the data being loaded. For example, it isn’t that good at evenly loading, say, 42 partitions if the partition key is something like a gender column, because a key with only two or three distinct values can populate at most two or three partitions. I know you are perfectly aware of how a hash function works, but I wanted to put that example out for the casual reader. I hope you’ll indulge me on that…
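For the casual reader, the gender-column example can be sketched in a few lines of Python. This is only an illustration: MD5 stands in for the database's hash function (it is not what Oracle uses), and the names `partition_of`, `partition_counts`, and the sample keys are made up for the sketch.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 42  # the partition count from the example above


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition. MD5 is an assumed stand-in hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_counts(keys, num_partitions):
    """Count how many rows land in each partition."""
    return Counter(partition_of(k, num_partitions) for k in keys)


# Low-cardinality key: only two distinct values across 100,000 rows.
gender_keys = ["M", "F"] * 50_000
# High-cardinality key: every row has its own value.
order_keys = [f"order-{i}" for i in range(100_000)]

gender_counts = partition_counts(gender_keys, NUM_PARTITIONS)
order_counts = partition_counts(order_keys, NUM_PARTITIONS)

print("partitions used (gender key):", len(gender_counts))
print("partitions used (order key): ", len(order_counts))
print("largest partition (gender key):", max(gender_counts.values()))
print("largest partition (order key): ", max(order_counts.values()))
```

With the two-valued key, no more than two of the 42 partitions can ever receive rows, however good the hash function is. The unique key spreads out roughly evenly.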

P.S., On a tangent regarding terminology… I recall that in Informix DSA version 6, table partitions were called fragments. The default placement was round-robin. I found it strange then, as I do now, that it was considered a good thing to have a database residing in “round-robin fragmented storage” considering the generally negative connotation of the word fragment in database-land.

By: Curt Monash

Curt Monash — Tue, 03 Feb 2009 20:39:39 +0000

Ghassan,

Actually, what would help me avert errors would be to post in less haste. As to WHY I rushed this post and several others up between the time I submitted the first draft of the article and the time it was finally posted … well, in the interest of peace and comity, I won’t spell that out.

By: Curt Monash

Curt Monash — Tue, 03 Feb 2009 20:35:34 +0000

Greg,

I think you nailed it. Hash distribution is exactly what I meant. Silly me!

CAM

By: ghassan salem

ghassan salem — Tue, 03 Feb 2009 20:29:04 +0000

Curt,
Oracle has something called partition-wise join, that works when you join 2 equi-hash-partitioned tables (i.e. same key partitioning, as well as same number of partitions) on the partitioning key. And in a parallel query, this is done in parallel. So, you can get the benefits of hash-join as you might get in a shared-nothing system.
Also, bear in mind that in Oracle, you can range-partition a table on some column(s), and hash-partition it on another column. So you get the benefits of range partitioning (e.g. ILM, fast purge of old data, partition pruning, …) as well as partition-wise joins.
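The idea behind a partition-wise join can be sketched outside the database. The Python below is a toy model, not Oracle's implementation: `hash_partition`, `hash_join`, and `partition_wise_join` are hypothetical helpers, and the tables are invented. It shows that when both tables are hashed on the join key into the same number of partitions, joining partition i of one table with partition i of the other gives the same rows as joining the whole tables, so each pair can be handled by a separate worker.

```python
from collections import defaultdict


def hash_partition(rows, num_partitions):
    """Split rows into partitions by hashing the first column (the join key)."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[hash(row[0]) % num_partitions].append(row)
    return parts


def hash_join(left, right):
    """Plain in-memory hash join on the first column of each row."""
    build = defaultdict(list)
    for row in left:
        build[row[0]].append(row)
    return [l + r[1:] for r in right for l in build.get(r[0], [])]


def partition_wise_join(left_parts, right_parts):
    """Join partition i with partition i only; pairs are independent of each other."""
    assert len(left_parts) == len(right_parts), "tables must be equi-partitioned"
    result = []
    for left_part, right_part in zip(left_parts, right_parts):
        result.extend(hash_join(left_part, right_part))
    return result


# Invented sample data: 100 customers, 1,000 orders keyed by customer id.
customers = [(i, f"cust-{i}") for i in range(100)]
orders = [(i % 100, f"order-{i}") for i in range(1000)]

NUM_PARTITIONS = 8
pairwise = partition_wise_join(
    hash_partition(customers, NUM_PARTITIONS),
    hash_partition(orders, NUM_PARTITIONS),
)
full = hash_join(customers, orders)

print(len(pairwise), len(full))
```

The loop over partition pairs runs serially here to keep the sketch short. Because no pair needs rows from any other pair, the same loop could be spread across parallel workers with no data exchanged between them.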

Have a look at the doc; it will make your posts about Oracle less error-prone.

rgds

By: Greg Rahn

Greg Rahn — Tue, 03 Feb 2009 20:15:32 +0000

Just an observation from the sideline...I think the confusion stems from your use of the phrase hash partitioning. I think you should have used the phrase hash distribution as you are discussing the physical locality of data vs. the logical grouping of data. For example, in Netezza you distribute data to the SPUs by using the DDL phrase distribute on random or you can use a phrase of distribute on [hash] (column(s)). Likewise DB2 has DDL clauses to do both data distribution (DISTRIBUTE BY) and logical grouping (PARTITION BY & ORGANIZE BY).

By: David Aldridge

David Aldridge — Tue, 03 Feb 2009 14:57:55 +0000

Curt,

Yes, I think that understanding the role of ASM is critical in some ways to understanding the reason why we use hash partitioning in Oracle, or rather it helps to explain what we do not use it for. Because, as you say, ASM spreads the data over all the available devices, hash partitioning is not used to associate data with particular storage devices in the way that it is for other platforms.

Rather, it is a logical method of subdividing the data to allow more efficient processing. As I mentioned above, it is used to reduce intra-slave messaging on hash joins, but Daniel’s comment on indexing reminds me that it also enables parallel index range scans. To take a simple example, one might partition sales transactions according to the day of the sale, and then hash each day of data into 64 subpartitions. A local index on “transaction dollar amount” could then be scanned in parallel to isolate all sales with a transaction dollar amount in a particular range of values, for example “more than $1,000”, _if_ the optimizer estimated that to be more efficient than scanning the table partition for the entire day of sales.
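A rough Python model of that example may help. It covers a single day of sales only, treats each subpartition's local index as a sorted list of (amount, transaction id) pairs, and uses a thread pool in place of parallel query slaves. All names and the sample data are assumptions made for the sketch, and hashing is reduced to a modulo on the transaction id.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor

NUM_SUBPARTITIONS = 64  # hash subpartitions per day, as in the example


def build_subpartitions(day_rows):
    """Hash one day's (txn_id, amount) rows into subpartitions.

    Each subpartition is stored as its local index: (amount, txn_id) sorted by amount.
    """
    subparts = [[] for _ in range(NUM_SUBPARTITIONS)]
    for txn_id, amount in day_rows:
        subparts[txn_id % NUM_SUBPARTITIONS].append((amount, txn_id))
    for local_index in subparts:
        local_index.sort()
    return subparts


def range_scan(local_index, threshold):
    """Return index entries with amount strictly greater than threshold."""
    start = bisect_right(local_index, (threshold, float("inf")))
    return local_index[start:]


def parallel_range_scan(subparts, threshold, workers=8):
    """Scan every local index independently and concatenate the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda idx: range_scan(idx, threshold), subparts)
        return [entry for chunk in chunks for entry in chunk]


# Invented data for one day: amounts cycle through 0..1999 dollars.
day_rows = [(txn_id, (txn_id * 37) % 2000) for txn_id in range(10_000)]
subparts = build_subpartitions(day_rows)
big_sales = parallel_range_scan(subparts, 1000)

print("sales over $1,000:", len(big_sales))
```

Each worker touches only its own subpartition's index, which is what lets the range scan be split 64 ways. Whether that beats a full scan of the day's partition is the optimizer's call, as noted above.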

By: Curt Monash

Curt Monash — Tue, 03 Feb 2009 09:33:52 +0000

Daniel,

Oracle said that even hash partitions are striped across disks, courtesy of ASM. That was in my notes right next to the observation that hash partitions don’t have their usual somewhat-pseudo-random distribution benefit in the case of Oracle, because the data’s already pseudo-randomly distributed via other mechanisms. (Those are, of course, the notes I should have checked before incorrectly saying Oracle doesn’t do hash partitioning in Exadata at all.)

CAM

By: Daniel Abadi

Daniel Abadi — Tue, 03 Feb 2009 00:02:25 +0000

One thing to be aware of is that “partition pruning” sometimes means “no parallelism”.

SELECT sum(sales)
FROM table
WHERE store_id = 5

If you hash on store_id, only one partition is involved in answering the query, which means the query runs at the speed of the one disk which contains that partition. If you use round robin partitioning and have an index on store_id on each partition, the query runs at the speed of all the disks reading (in parallel) the (much smaller) number of ‘store_id = 5’ tuples from the index.

Of course, if you don’t have an index on store_id on each node, you’d probably prefer to just get one disk involved in the query, even if all disks can be scanned in parallel.
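The trade-off can be made concrete with a small Python sketch. Everything in it is assumed for illustration: 8 partitions (one per disk), 21 stores, 8,400 rows, and a modulo in place of a real hash function. It counts how many ‘store_id = 5’ rows each partition holds under the two layouts.

```python
NUM_PARTITIONS = 8  # assumed: one partition per disk

# Invented data: (store_id, sales) rows spread over 21 stores.
rows = [(i % 21, 10) for i in range(8400)]


def hash_layout(rows):
    """Place each row by hashing store_id (modulo stands in for the hash)."""
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for store_id, sales in rows:
        parts[store_id % NUM_PARTITIONS].append((store_id, sales))
    return parts


def round_robin_layout(rows):
    """Place rows on partitions in arrival order, ignoring their content."""
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for i, row in enumerate(rows):
        parts[i % NUM_PARTITIONS].append(row)
    return parts


def matches_per_partition(parts, store_id):
    """Number of rows per partition that satisfy store_id = <store_id>."""
    return [sum(1 for s, _ in part if s == store_id) for part in parts]


hash_hits = matches_per_partition(hash_layout(rows), 5)
rr_hits = matches_per_partition(round_robin_layout(rows), 5)

print("hash on store_id:", hash_hits)
print("round robin:     ", rr_hits)
```

Under the hash layout all 400 matching rows sit in one partition, so pruning leaves a single disk doing the work. Under round robin each of the 8 partitions holds 50 of them, so with a local index on store_id every disk reads a small slice in parallel.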

By: Curt Monash

Curt Monash — Mon, 02 Feb 2009 20:03:30 +0000

As for the other — yeah, I was using the term “hash partitioning” too narrowly. Once again, I’m sorry for the excitement.

By: Curt Monash

Curt Monash — Mon, 02 Feb 2009 19:56:49 +0000

Between 0% and 100%. Wow. Way to not go out on a limb there, Kevin.