I say “sequential”, you say …
I talked with Teradata today, and they called me on my use of the term “sequential.” Basically, if there’s any head movement for disk seeks, some computer science researchers wouldn’t call it “sequential.” I didn’t know that; I was just familiar with the less precise usage of the term in some vendors’ marketing and discussions.* OK, I’ll make up a new, more precise term instead. How about “coarse-grained”?
*And so we have another instance of Monash’s First Law of Commercial Semantics: Bad jargon drives out good.
| Categories: Teradata, Theory and architecture | 8 Comments |
No locks, no logs — no problem?
There’s another cool-sounding part to the Netezza story, which straddles their chips and their software: The FPGA takes over the work of assuring database consistency. If the system attempts to read and write a record at the same time, the FPGA keeps things straight. This eliminates the need for locks (at least if you don’t care about transactional integrity) and some of the reason for logs. (I guess that in lieu of any kind of rollback/rollforward they just rely on failover to mirrored disks.)
This isn’t exactly the way one would want to do OLTP, and in general my head is shaking as I write this — but it sure seems to suffice for some rather demanding data warehouse users.
| Categories: Data warehouse appliances, Netezza, Theory and architecture | 2 Comments |
Netezza’s chip story
In addition to its software story, Netezza of course has a rather unique chip story. Where other vendors might have standard disk controllers and high-powered microprocessors, Netezza has an FPGA (Field-Programmable Gate Array) and a lesser microprocessor (PowerPC), respectively. Netezza claims that two major advantages of these choices are:
- 5X throughput/performance improvement
- Much lower heat and power consumption.
The main function of the FPGA, other than generically getting data on and off disk, is to restrict and project tables (i.e., apply single-table WHERE clauses and return only the requested columns). Netezza claims that their FPGAs can perform these operations on the streaming data at least as quickly as an expensive, hot, power-hungry top-end microprocessor would, and indeed faster. The key word is “streaming”, which they contrast to the microprocessor’s need to get the data in and then back out of RAM (cache or otherwise).
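For readers who want the restrict-and-project idea made concrete, here is a minimal Python sketch. It is only an illustration of filtering and trimming rows one at a time as they stream past, and it says nothing about how Netezza’s FPGA actually works. The function name, the `orders`-style rows, and the column names are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def restrict_and_project(
    rows: Iterable[Row],
    predicate: Callable[[Row], bool],
    columns: List[str],
) -> Iterator[Row]:
    """Filter and trim rows one at a time as they stream past."""
    for row in rows:
        if predicate(row):                      # restrict: the WHERE clause
            yield {c: row[c] for c in columns}  # project: keep only needed columns


# Hypothetical rows standing in for data streaming off disk.
disk_stream = iter([
    {"order_id": 1, "region": "EAST", "amount": 120, "notes": "..."},
    {"order_id": 2, "region": "WEST", "amount": 75, "notes": "..."},
    {"order_id": 3, "region": "EAST", "amount": 310, "notes": "..."},
])

# Roughly: SELECT order_id, amount FROM orders WHERE region = 'EAST'
result = list(restrict_and_project(
    disk_stream,
    predicate=lambda r: r["region"] == "EAST",
    columns=["order_id", "amount"],
))
print(result)
```

The point of the generator is that no row is held any longer than it takes to test and trim it, so whatever sits downstream only ever sees the rows and columns the query needs.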
I’ll be interested to see whether somebody can muster a ringing refutation to Netezza’s claims.
| Categories: Data warehouse appliances, Netezza | 12 Comments |
Netezza vs. conventional data warehousing RDBMS
For various reasons, I’m not going to try to give a comprehensive overview of the Netezza story. But I’d like to highlight four points that illustrate a lot of the difference between Netezza’s architecture and that of more conventional data warehousing DBMS.
Read more
| Categories: Data warehouse appliances, Data warehousing, DATAllegro, Netezza | 3 Comments |
Dealing with Netezza has not been easy
Over the past year, Netezza has exhibited the squirreliest question-dodging behavior I’ve seen from a DBMS vendor since – actually, since Sybase tried to conceal the System 10 fiasco in 1993-5. To its credit, however, Netezza finally decided to open the kimono. Specifically, they invited me to their user conference, which I attended today, and indeed were quite helpful in FINALLY getting my questions addressed, and in offering more access as needed.
Read more
| Categories: Data warehouse appliances, Netezza | 2 Comments |
Is data warehousing now all about sequential access?
A lot of evidence is pointing to a major paradigm shift in data warehouse RDBMS, along the lines of:
Old way: Assume I/O is random; lower total execution time by improving selectivity and thus lowering the amount of I/O.
New way: Drive the amount of random I/O to near zero, and do as much sequential I/O as necessary to achieve this goal.
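A back-of-envelope cost model shows why the trade can make sense. The Python sketch below is purely illustrative. The 10 ms seek time, the 50 MB/s transfer rate, the one-seek-per-row assumption, and the 100-million-row table are all my assumptions, not measurements from any vendor, and the model ignores caching, parallelism, and multiple disks.

```python
def random_io_seconds(rows_fetched: int, seek_ms: float = 10.0) -> float:
    """Old way: assume one disk seek per row fetched (an assumption)."""
    return rows_fetched * seek_ms / 1000.0


def sequential_scan_seconds(table_bytes: int, transfer_mb_per_s: float = 50.0) -> float:
    """New way: read the whole table at the disk's streaming rate."""
    return table_bytes / (transfer_mb_per_s * 1_000_000)


def breakeven_rows(table_bytes: int, seek_ms: float = 10.0,
                   transfer_mb_per_s: float = 50.0) -> float:
    """Number of random row fetches that cost as much as one full scan."""
    return sequential_scan_seconds(table_bytes, transfer_mb_per_s) * 1000.0 / seek_ms


# Hypothetical table: 100 million rows of 100 bytes each (10 GB).
total_rows = 100_000_000
table_bytes = total_rows * 100

scan_s = sequential_scan_seconds(table_bytes)
one_percent_s = random_io_seconds(total_rows // 100)
cutoff = breakeven_rows(table_bytes)

print(f"full sequential scan:     {scan_s:,.0f} s")
print(f"random fetch of 1% rows:  {one_percent_s:,.0f} s")
print(f"break-even selectivity:   {cutoff / total_rows:.4%}")
```

Under these assumed figures, random access only wins when a query touches a tiny fraction of the table. Past that point, scanning everything sequentially is cheaper than seeking, which is the intuition behind the “new way.”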
Examples include:
- Data warehouse appliances (see especially this discussion of DATAllegro’s architecture)
- Columnar systems (see Nathan Myer’s first comment in this discussion of the much-hyped Required Technologies prototype)
- Memory-centric systems, notably SAP’s BI Accelerator
| Categories: Data warehouse appliances, DATAllegro, Memory-centric data management, SAP AG, Theory and architecture, TransRelational | 4 Comments |
Overflowing spam catcher
I use Akismet as a spam-catcher. On the whole it’s good, but it has one annoying deficiency: you can only review the 150 most recent suspected spam comments. This time around, however, I had 766 of them. If you had a valid comment among the 616 I couldn’t review, I’m sorry. Please be so kind as to resubmit it.
Thanks,
CAM
EDIT: The attack continues. Today I deleted 245 real or imagined spam. A couple of days ago it was 135, all real.
| Categories: About this blog | 1 Comment |
