July 31, 2010

Nested data structures keep coming up, especially for log files

Nested data structures have come up several times now, almost always in the context of log files.

I don’t have a grasp yet on what exactly is happening here, but it’s something.

Comments

7 Responses to “Nested data structures keep coming up, especially for log files”

  1. Neil Hepburn on July 31st, 2010 10:19 pm

    Nested data requires a new BI engine?!?

    This sounds like a data modeling challenge rather than a query engine challenge.

    I’ve worked with graph data models, and apart from cyclic graphs, pretty much any graph can be flattened into a form that can be queried using basic SQL syntax. Cyclic graphs can be rationalized too, but with trade-offs.

    Common Table Expressions in SQL have made it possible to perform these transformations in a single declarative query.

    E.g. for hierarchical data models, creating a reflexive transitive closure can do wonders.

    Graphs can be tamed in many different ways. But I find most developers don’t see data modeling as part of their repertoire, and tend to look for algorithmic solutions.

    This is part of the reason why I feel Computer Science education doesn’t pay enough attention to data modeling.

    As they say, when you’ve got a hammer…
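A minimal sketch of the reflexive transitive closure idea mentioned above, using a recursive Common Table Expression. It assumes SQLite (through Python's standard `sqlite3` module) and an acyclic parent/child table; the table name `edge`, its columns, and the sample data are hypothetical and are not taken from the comment.

```python
import sqlite3

# Hypothetical parent/child edge list for a small hierarchy (illustrative data only).
EDGES = [("root", "a"), ("root", "b"), ("a", "a1"), ("a", "a2"), ("a1", "a1x")]


def reflexive_transitive_closure(edges):
    """Return (ancestor, descendant, depth) rows, including depth-0 self pairs.

    Assumes the edges form an acyclic hierarchy; with a cycle the depth
    column would keep growing and the recursion would not terminate.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (parent TEXT, child TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE
        node(id) AS (
            -- every node that appears on either side of an edge
            SELECT parent FROM edge UNION SELECT child FROM edge
        ),
        closure(ancestor, descendant, depth) AS (
            -- reflexive part: each node reaches itself at depth 0
            SELECT id, id, 0 FROM node
            UNION
            -- transitive part: extend every known path by one edge
            SELECT c.ancestor, e.child, c.depth + 1
            FROM closure AS c
            JOIN edge AS e ON e.parent = c.descendant
        )
        SELECT ancestor, descendant, depth
        FROM closure
        ORDER BY ancestor, depth, descendant
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for ancestor, descendant, depth in reflexive_transitive_closure(EDGES):
        print(ancestor, descendant, depth)
```

Once the closure is materialized, a question such as "all descendants of `a`" becomes a plain filter on the `ancestor` column, with no recursion at query time.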

  2. Curt Monash on July 31st, 2010 10:40 pm

    Three issues that aren’t the same:

    1. Can you represent something logically in SQL at all?

    2. Can you represent it logically and fairly concisely?

    3. Can you get good performance in a fairly conventional SQL DBMS?

  3. Dan Nile on August 1st, 2010 9:54 am

    COBOL is back? Should I unpack my lava lamp and pet rock?

    Hope nobody asks you to access the nested structure in an unanticipated way. I heard E.F. Codd is working on something to solve that problem.

  4. Neil Hepburn on August 2nd, 2010 1:17 pm

    Indeed.
    The NoSQL guys would do well to learn about IMS, IDS, Total, System 2000, etc. and why Codd proposed the relational model.

    History repeats itself.

  5. Jeff on August 2nd, 2010 3:57 pm

    Hey Curt,

    For what it’s worth, Hive was designed to work with nested data structures, though support for some obvious operations like EXPLODE (https://issues.apache.org/jira/browse/HIVE-510) is not yet implemented.

    Later,
    Jeff
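To illustrate what an explode-style operation does to nested log data, here is a rough Python sketch. It is not Hive's implementation or syntax; the function name `explode`, the dotted column naming, and the sample log records are all assumptions made for illustration.

```python
def explode(records, array_field):
    """Yield one flat row per element of each record's nested array.

    Hypothetical helper: scalar fields of the parent record are copied onto
    every output row. Records whose array is empty or missing produce no
    rows, which is a design choice of this sketch.
    """
    for record in records:
        parent = {k: v for k, v in record.items() if k != array_field}
        for element in record.get(array_field, []):
            row = dict(parent)
            if isinstance(element, dict):
                # Prefix nested keys so they cannot collide with parent keys.
                row.update({f"{array_field}.{k}": v for k, v in element.items()})
            else:
                row[array_field] = element
            yield row


# Illustrative log records: one session holding a nested list of events.
LOG = [
    {"session": "s1", "events": [{"type": "view", "ms": 12},
                                 {"type": "click", "ms": 40}]},
    {"session": "s2", "events": []},
]

if __name__ == "__main__":
    for row in explode(LOG, "events"):
        print(row)
```

The flattened rows can then be loaded into an ordinary table and queried with basic SQL, at the cost of repeating the parent fields on every row.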

  6. Further thoughts on previous posts | DBMS 2 : DataBase Management System Services on September 27th, 2010 7:30 am

    […] Hammerbacher has made various comments to the effect “Yes indeedy! Hadoop does that too!” (My wording, not his. […]

  7. What those nested data structures are about | DBMS 2 : DataBase Management System Services on October 19th, 2011 12:30 pm

    […] I’ve noted before, the very big web companies have an issue with nested data structures. The subject came up in XLDB talks yesterday too, so my big goal for lunch was to finally […]
