January 31, 2008

5 kinds of data structure and 16 kinds of data access method

My recent post about datatype extensibility zoomed over at least one head, as per the comment thread. Since then I’ve googled, and come to suspect that part of what I was assuming as common knowledge may not be so common after all. So I’m going to back up and explain a bit about data access methods, as well as the sub-topic of data structures. If you take nothing else away from this post, I hope it will at least remind of you of the sheer variety of ways data can be stored on disk or in RAM.

First, let’s define the concept of data access method in three steps:

  1. The basic mission of a database management system is to, upon request, return part of the database to the requester. In a relational database, what’s returned will be a record or set of records; in an object-oriented database it will be an object or set of objects; and so on.

  2. In a few major brands of data warehouse appliance, the DBMS may call pretty much the whole database into memory, then find what it wants from there. Most other DBMS are more finicky, and try to retrieve from disk only the data requested. More precisely, they usually retrieve data blocks of a fixed size, and try to bring back only the blocks that may contain the information being looked for.

  3. How they make these selections is their data access method. More precisely, it’s the part of the data access method that deals with getting data off of disk. Often, there will be other parts that govern what happens to the data once it’s in memory.

At their core, most data access methods consist of two things:

There are many kinds of data structure, with some of the most important being:

There’s a huge number of possible data access methods, and each major kind can have a broad variety of tweaks. Discussing them was meant to be the main subject of this post, but it seems to have gotten plenty long enough already. And in my flu-like-symptoms funk, I’m temporarily lacking in the enthusiasm to keep going.

So I’ll just finish with a list of some data method alternatives. Please feel free to add others in the comment thread. Some time when I’m more dug out from my backlog, I’ll try to circle back and actually describe how some or all of them work.

  1. Simple file manager.

  2. Classic relational b-tree.

  3. Hash index.

  4. Bitmap.

  5. Other columnar.

  6. Star.

  7. Geospatial r-tree.

  8. Hierarchy-of-arrays MOLAP.

  9. Inverted index.

  10. Classic full-text index.

  11. Classic linked-list.

  12. Classic object-oriented.

  13. Native XML (all).

  14. Trie.

  15. File plus calculated metadata (e.g., for still image).

  16. Simple key/value lookup.

Comments

3 Responses to “5 kinds of data structure and 16 kinds of data access method”

  1. Jack Orenstein on January 31st, 2008 9:59 pm

    You missed three of my favorites.

    1) skip-list: Neat alternative to balanced binary trees.

    2) z-order (I did original research on this one): data transformation that turns any data structure supporting random and sequential access into a spatial data structure.

    3) linear hashing.

  2. Complete Rewrite on October 30th, 2008 12:32 pm

    Models and Efficiency, part 2: Databases…

    There are endless possibilities for data structures and access methods. The implementers of early database systems, such as System R, chose to use a simple one-to-one mapping between the model and the physical storage. It was a natural choice as a fi…..

  3. Gopi Nathan on September 19th, 2010 9:10 am

    Well nice topic. But there are some half-true statements <>

    The original CODASYL model did not have index. But Cullinet’s IDMS did give full support for index in 1984 itself. And in 1992 gave full SQL support – against the NONSQL (sic) database as well as native table definitions. Today’s CA-IDMS is a database supporting two models – CODASYL and Relational in one DBMS.

Leave a Reply




Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers


Warning: include(): php_network_getaddresses: getaddrinfo failed: Name or service not known in /home/dbms2cm/public_html/wp-content/themes/monash/static_sidebar.php on line 29

Warning: include(http://www.monash.com/blog-promo.php): failed to open stream: php_network_getaddresses: getaddrinfo failed: Name or service not known in /home/dbms2cm/public_html/wp-content/themes/monash/static_sidebar.php on line 29

Warning: include(): Failed opening 'http://www.monash.com/blog-promo.php' for inclusion (include_path='.:/usr/lib/php:/usr/local/lib/php') in /home/dbms2cm/public_html/wp-content/themes/monash/static_sidebar.php on line 29