September 20, 2006

I say “sequential”, you say …

I talked with Teradata today, and they called me on my use of the term “sequential.” Basically, if there’s any head movement for disk seeks, some computer science researchers wouldn’t call it “sequential.” I didn’t know that; I was just familiar with the less precise usage of the term in some vendors’ marketing and discussions.* OK, I’ll make up a new, more precise term instead. How about “coarse-grained”?

*And so we have another instance of Monash’s First Law of Commercial Semantics: Bad jargon drives out good.

Comments

8 Responses to “I say “sequential”, you say …”

  1. DBMS2 — DataBase Management System Services » Blog Archive » Teradata vs. the new appliance vendors, technically on September 20th, 2006 7:32 pm

    [...] Since 2002, Teradata has had a “cylinder read” option, allowing coarse-grained reads in the 2 megabyte size, comparable to what DATAllegro or Netezza do all the time. Users evidently find this extremely valuable. [...]

  2. Stuart Frost on September 21st, 2006 12:23 am

    Well OK, I guess we do move the disk heads between reads. But that’s because we have sophisticated partitioning to minimize the amount of data read, rather than scanning the whole disk (who would want to do that?). Our appliance is very carefully optimized to minimize I/O waits due to head movement while reading enough data in each access to max out the disk arrays.

    Let’s just do a simple calculation to see if this is effective. In a DATAllegro appliance running a complex mix of concurrent queries, we typically see up to 800MBps per node of twelve disks. At close to 70MBps per disk, that’s around the maximum sequential read speed as quoted by the manufacturer (where caching on the disk or controller is not involved). Hard for a computer scientist to argue with that, eh?

    In contrast, random I/O reading one 32k page at a time will max out at around 300 transactions per second with even the most expensive disks. That’s only 9.6MBps.

    Stuart
    DATAllegro
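
The arithmetic in Stuart's comment can be checked with a short sketch. It uses only the figures quoted there (800MBps per node, twelve disks, 300 random reads per second, 32k pages); the function names and the decimal kilobyte-to-megabyte conversion are assumptions made for illustration.

```python
# Figures as quoted in the comment; nothing here is measured.
NODE_THROUGHPUT_MBPS = 800   # per node of twelve disks
DISKS_PER_NODE = 12
RANDOM_IOPS = 300            # random page reads per second, per disk
PAGE_SIZE_KB = 32


def per_disk_throughput(node_mbps, disks):
    """Average throughput each disk must sustain to reach the node total."""
    return node_mbps / disks


def random_io_throughput(iops, page_kb):
    """Throughput of page-at-a-time random reads (assumes 1 MB = 1000 KB)."""
    return iops * page_kb / 1000


per_disk = per_disk_throughput(NODE_THROUGHPUT_MBPS, DISKS_PER_NODE)
random_mbps = random_io_throughput(RANDOM_IOPS, PAGE_SIZE_KB)

print(f"large reads, per disk: {per_disk:.1f} MBps")
print(f"random 32k reads:      {random_mbps:.1f} MBps")
print(f"ratio:                 {per_disk / random_mbps:.1f}x")
```

On these numbers each disk sustains roughly 67MBps, about seven times what page-at-a-time random reads would deliver.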

  3. David Aldridge on October 3rd, 2006 12:30 am

    I’m wondering whether the ability that appliances have to keep disks near their theoretical limit is challenged by the anticipatory scheduler introduced in the Linux 2.6 kernel. Some tests that I have started with Oracle parallel query suggest that it might be.

    The anticipatory scheduler takes a very small pause, on the order of one millisecond, after satisfying a large read request to see whether another read request contiguous with the first is forthcoming. If so, a head movement is avoided. The first test I’ve tried showed that close-to-maximum performance is sustained with eight simultaneous query slaves, with a throughput benefit of 60% over other schedulers.

    I should add that the read requests were only of 256kb, but it seems to me that this scheduler makes the read size very much less relevant.

    I posted the first test results here … http://oraclesponge.wordpress.com/2006/10/02/linux-26-kernel-io-schedulers-for-oracle-data-warehousing-part-ii/

    Any comments, pro or con, by those experienced with appliances or other technologies are very welcome, of course.
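
The pause-and-wait idea David describes can be illustrated with a toy model. This is a sketch under stated assumptions, not the Linux implementation: every name and parameter (`count_seeks`, `think_ms`, `wait_ms`, `service_ms`, `max_batch`) is hypothetical, each stream is assumed to read its own contiguous region, and a head movement is counted whenever the disk switches streams.

```python
def count_seeks(num_streams, reads_per_stream, think_ms, wait_ms,
                service_ms=2.0, max_batch=16):
    """Count head movements for streams that each read contiguous blocks.

    Each stream issues its next read think_ms after its previous read
    completes. With wait_ms > 0 the disk idles briefly for the stream it
    just served instead of moving to another stream straight away.
    max_batch is a hypothetical cap so one stream cannot hold the disk
    indefinitely.
    """
    remaining = [reads_per_stream] * num_streams
    ready_at = [0.0] * num_streams   # when each stream's next read arrives
    now = 0.0
    last = None                      # stream served most recently
    batch = 0                        # consecutive reads served for `last`
    seeks = 0

    while any(remaining):
        anticipate = (
            last is not None
            and remaining[last] > 0
            and batch < max_batch
            and ready_at[last] <= now + wait_ms
        )
        if anticipate:
            stream = last
        else:
            # Otherwise serve the request that has been waiting longest.
            candidates = [s for s in range(num_streams) if remaining[s] > 0]
            stream = min(candidates, key=lambda s: (ready_at[s], s))
        now = max(now, ready_at[stream])

        if stream != last:
            seeks += 1               # head moves to another stream's region
            batch = 0

        now += service_ms
        remaining[stream] -= 1
        ready_at[stream] = now + think_ms
        batch += 1
        last = stream

    return seeks


no_wait = count_seeks(num_streams=2, reads_per_stream=8, think_ms=0.5, wait_ms=0.0)
with_wait = count_seeks(num_streams=2, reads_per_stream=8, think_ms=0.5, wait_ms=1.0)
print(no_wait, with_wait)
```

In this model the disk moves its head on every read when it never waits, and only once per stream when it waits long enough for the follow-up read to arrive. The model ignores the cost of the idle time itself, which is the trade-off a real scheduler has to manage.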

  4. DBMS2 — DataBase Management System Services » Blog Archive » IBM and Teradata too on October 3rd, 2006 2:32 am

    [...] By way of contrast, DATAllegro would endorse 1, 2, and 5, but argue that table scans via sequential reads (I’ve happily given up the “coarse-grained” terminology, since almost nobody cares) obviate most or all of the need for 3 and 4. And Netezza – well, I guess I shouldn’t comment on their views, because of their strict NDA policy. [...]

  5. Mark Morris on October 5th, 2006 10:31 am

    Rather than ‘sequential’ or ‘coarse grained’, I suggest the terminology of
    ‘pseudo-sequential’. This captures the idea that near sequential rates are
    being achieved regardless of the method – i.e., scheduling and reordering,
    combining, large transfer sizes, etc.

  6. Curt Monash on October 5th, 2006 12:15 pm

    Mark,

    That was actually the first thing I thought of. But I imagined questions like “Is that anything like ‘pseudo-conversational’?” and changed direction.

    Now I’ve gone back to just using “sequential”, purism be damned.

    Best,

    CAM

  7. merovinio on November 23rd, 2006 8:57 am

    Can someone help me with an Oracle 10g RAC installation on Windows Server 2003?
    I would like to leave documentation on http://www.merovingio.it

    thank you

  8. Kevin on March 14th, 2007 7:18 am

    Very helpful !!!

    thanks
