Comments on: Logless, lockless Netezza more carefully explained

By: Jerry Leichter

Jerry Leichter — Sun, 09 Aug 2009 11:19:02 +0000

re: asdf
This comment sat for months without the obvious reply: You’re ignoring commits. Oversimplifying: While the two transactions are both active, they will each think they incremented the counter from 0 to 1. There’s no problem with that; *uncommitted* transactions aren’t serializable and don’t need to maintain global consistency. When it comes time to commit, the rule is: You can only *commit* a value if the timestamp on it matches the timestamp it had when you read it. So the first transaction will succeed in committing, while the second will abort. ACID is maintained.

The logic here is that a transaction acts as if it’s working on a snapshot of the database as it existed at the time the transaction began. It can neither read nor write anything that is more recent than the snapshot. Attempted reading can be handled by looking at history; attempted writing – “changing history” – forces an abort.
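To make the commit rule concrete, here is a minimal sketch in Python. It is a toy model, not Netezza's implementation: the names `Store`, `Transaction` and `AbortError` are invented for illustration, and it models only the commit-time timestamp check, not snapshot reads from history.

```python
class AbortError(Exception):
    """Raised when a written key has a newer timestamp than the one first seen."""


class Store:
    def __init__(self):
        self.clock = 0   # logical commit counter (assumed, not a wall clock)
        self.data = {}   # key -> (value, commit timestamp)

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_ts = {}  # key -> timestamp seen on first access
        self.writes = {}   # key -> pending, uncommitted value

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        value, ts = self.store.data.get(key, (0, 0))
        self.read_ts.setdefault(key, ts)
        return value

    def write(self, key, value):
        if key not in self.read_ts:
            self.read_ts[key] = self.store.data.get(key, (0, 0))[1]
        self.writes[key] = value

    def commit(self):
        # The rule from the comment: a value may be committed only if its
        # timestamp still matches the timestamp it had when it was read.
        for key in self.writes:
            current_ts = self.store.data.get(key, (0, 0))[1]
            if current_ts != self.read_ts[key]:
                raise AbortError(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.clock)


def run_two_increments():
    """Two overlapping increments: the first commit wins, the second aborts."""
    store = Store()
    a, b = store.begin(), store.begin()
    a.write("counter", a.read("counter") + 1)
    b.write("counter", b.read("counter") + 1)
    a.commit()
    try:
        b.commit()
        b_aborted = False
    except AbortError:
        b_aborted = True
    return store, b_aborted
```

If the aborted transaction is retried as a fresh transaction, it reads the committed value and its increment goes through, so the counter ends at 2 rather than staying at 1.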

Three things to consider:

– The work done by the aborted transaction is lost – as is the case with *any* aborted transaction. A clash of timestamps like this is the analogue of a deadlock in a lock-based system. MVCC systems are called “optimistic” because they assume such clashes will be rare enough not to matter – just as a lock-based system assumes deadlocks are rare. Both systems remain correct if their assumptions are violated, but performance disintegrates.

– There are optimizations to avoid wasting too much work. If we read the counter’s timestamp “with intent to lock”, we can mark the counter to say that, and a second “read with intent to lock” can abort right away. Or the system can abort the newer of the two transactions, or the one that has fewer pending writes, or whatever. Whether these approaches help or hurt has to be determined.

– There are more subtle interactions: Transaction A reads and increments counter; transaction B reads counter and writes it to counter2. If A commits before B does, B won’t abort, but we lose global serialization with respect to the timestamps (and if you have another counter and another transaction, there may be no consistent serial order at all). Fully developed MVCC systems have to deal with this kind of thing – or you can decide that the really strong “global semantic consistency” being violated here is more than you care about. (You don’t always get it with locking systems either.)
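The third point can be sketched with a toy model as well. The code below is illustrative only: `commit` and `copy_anomaly` are invented names, and the assumption is that commit validation checks only the keys a transaction writes, as in the rule described above.

```python
def commit(db, clock, read_ts, writes):
    """Write-set validation only: abort if a written key changed since it was read."""
    for key in writes:
        if db[key][1] != read_ts[key]:
            return clock, False
    clock += 1
    for key, value in writes.items():
        db[key] = (value, clock)
    return clock, True


def copy_anomaly():
    db = {"counter": (0, 0), "counter2": (0, 0)}  # key -> (value, commit timestamp)
    clock = 0

    # Both transactions start from the same initial state.
    a_seen = db["counter"]
    b_seen = db["counter"]
    b_target_ts = db["counter2"][1]

    # A increments counter and commits first.
    clock, a_ok = commit(db, clock, {"counter": a_seen[1]},
                         {"counter": a_seen[0] + 1})

    # B copies the value it read into counter2 and commits second.
    # B never writes counter, so the timestamp check does not catch the clash.
    clock, b_ok = commit(db, clock, {"counter2": b_target_ts},
                         {"counter2": b_seen[0]})
    return a_ok, b_ok, db
```

Both commits succeed, yet counter2 holds the value counter had before A's increment while carrying a later commit timestamp than A's write. Replaying the transactions in timestamp order (A, then B) would have copied the incremented value, so the only equivalent serial order is B before A, which is the opposite of the timestamp order.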

By: What does Netezza do in the FPGAs anyway, and other questions | DBMS2 -- DataBase Management System Services

Sat, 08 Aug 2009 09:18:57 +0000

[…] which for now seems to mean recognizing which rows are and aren’t valid under Netezza’s form of MVCC (MultiVersion Concurrency […]

By: asdf

asdf — Mon, 13 Oct 2008 20:59:58 +0000

“A query only returns rows that had been committed before the query began.”
So if transaction A updates a counter (from 0 to 1), and before it manages to commit, transaction B (which started after A) updates the counter again (from 0 to 1), the counter is still 1. Great! But in true ACID the counter would be 2.

By: DBMS2 -- DataBase Management System Services » Blog Archive » Are row-oriented RDBMS obsolete?

Mon, 22 Jan 2007 11:23:53 +0000

[…] Timestamps are used for inserts and deletes; otherwise, there are no data changes. (Without that kind of approach, the update strategy in Point #2 couldn’t be viable.) A big benefit to these timestamps is that you can assure integrity via “snapshot isolation”; i.e., by a virtual rollback to a recent point in time. Thus, Vertica can get away without any kind of locks or, for that matter, transaction/redo logs. Row-oriented Netezza uses a similar logless, lockless approach. […]

By: Stuart Frost

Stuart Frost — Sat, 21 Oct 2006 18:51:05 +0000

Curt,

If I understand this right, there’s no real way to have multiple ‘transactions’ at the same time, which would seem to be a significant limitation if multiple users want to do updates etc. at the same time (which is remarkably common, even in DW systems).

Also, does this mean that an updated row effectively changes position on the disk? If so, how does this affect Netezza’s zone maps? Zone maps only work well in systems that load data in strict date order (thereby providing a kind of date partitioning). If the row order is changed later, the performance benefits of zone maps will degrade over time.

Stuart
DATAllegro
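The zone-map concern raised above can be illustrated with a small sketch. This is a generic min/max zone map, not Netezza's actual implementation: the function names, the extent size, and the assumption that an updated row is rewritten at the end of the table are all hypothetical.

```python
def build_zone_map(rows, extent_size):
    """Return a (min, max) pair for each fixed-size extent of rows."""
    zones = []
    for start in range(0, len(rows), extent_size):
        extent = rows[start:start + extent_size]
        zones.append((min(extent), max(extent)))
    return zones


def extents_to_scan(zones, low, high):
    """Count extents whose [min, max] range overlaps the query range [low, high]."""
    return sum(1 for zmin, zmax in zones if zmin <= high and zmax >= low)


# Dates modeled as integers, loaded in strict date order.
ordered = list(range(100))

# Assumed update model: every 10th row is updated and moves to the tail.
moved = ([d for d in ordered if d % 10 != 0]
         + [d for d in ordered if d % 10 == 0])

scans_ordered = extents_to_scan(build_zone_map(ordered, 10), 20, 29)
scans_moved = extents_to_scan(build_zone_map(moved, 10), 20, 29)
```

Under these assumptions, the same date-range query has to scan more extents after the updates than before, because the relocated rows widen the min/max range of the extent they land in and the remaining rows no longer align with extent boundaries.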