August 4, 2013

Data model churn

Perhaps we should remind ourselves of the many ways data models can be caused to churn. Here are some examples that are top-of-mind for me. They do overlap a lot, and the whole discussion overlaps with my post about schema complexity last January, and more generally with what I’ve written about dynamic schemas for the past several years.

Just to confuse things further — some of these examples show the importance of RDBMS, while others highlight the relational model’s limitations.

The old standbys

Product and service changes. Simple changes to your product line may not require any changes to the databases recording their production and sale. More complex product changes, however, probably will.

A big help in MCI’s rise in the 1980s was its new Friends and Family service offering. AT&T couldn’t respond quickly, because it couldn’t get the programming done, where by “programming” I mainly mean database integration and design. If all that was before your time, this link seems like a fairly contemporaneous case study.

Organizational changes. A common source of hassle, especially around databases that support business intelligence or planning/budgeting, is organizational change. Kalido’s whole business was based on accommodating that, last I checked, as were a lot of BI consultants’.

That ability was also the most noteworthy feature of PeopleSoft’s application development technology, back in the 1990s, at least the way I remember Rick Bergquist explaining PeopleTools to me.

Mergers & acquisitions. Obviously, accommodating a business combination has a huge effect on data management, especially if you follow the usual path of starting with separate legacy systems and combining them where possible over time. And it plays merry hell with the trend-tracking parts of your accounting and BI systems.

Application replacement. Replace your third-party apps, for whatever reason, and you almost surely get a new database structure too. The same, of course, goes when you deploy entirely new apps. And when things get either more integrated (e.g. by replacing silos with an application suite) or less so (e.g. by introducing selective SaaS apps), special fun ensues.

Refactoring and MDM. There are numerous ways it can make sense to refactor your apps, including your custom/in-house ones. One important reason of many is to increase your adoption of master data management.


The new stuff

Marketing campaign data. Marketers are full of creative ideas, many of which involve generating responses or other data about targets. As I’ve noted before, this data can come in a variety of structures.

Further confusing matters:

Social data. In particular, marketers like “social data”, whether through direct interaction with consumers, or from scraping online discussions. That includes a lot of text data, and representing text data in ways that work well with analytic tools is a never-ending battle.

Third-party data. Enterprises are making ever more use of data supplied by third parties. That data typically shows up whenever the customer chooses to pay for it, in whatever form the data vendor chooses to supply.

Internet log data. Website logs are a mess, and the same goes for many mobile-app equivalents. Part of the reason is nested data structures. But even leaving those aside, it’s a best practice to extract and directly store different fields at different points in time.

The examples I’ve written about explicitly are eBay and Zynga. Satisfying a similar need is one of the pillars of the Splunk value proposition.
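To make the "different fields at different points in time" idea concrete, here is a minimal sketch in Python. It assumes raw events are kept as JSON text and that fields are pulled out only when somebody needs them. The event layout, the field names and the extract helper are all hypothetical; this is not how eBay, Zynga or Splunk actually do it.

```python
import json

# Hypothetical raw log events, kept verbatim. Note the nesting and the
# fact that the two events do not share the same set of fields.
RAW_EVENTS = [
    '{"ts": "2013-08-01T12:00:00", "user": {"id": 7, "geo": {"country": "US"}}, "url": "/home"}',
    '{"ts": "2013-08-01T12:00:05", "user": {"id": 9}, "url": "/buy", "cart": {"items": 3}}',
]


def extract(raw_line, paths):
    """Pull the requested dotted paths out of one raw JSON event.

    A path that is absent from the event yields None instead of an error.
    """
    event = json.loads(raw_line)
    row = {}
    for path in paths:
        value = event
        for key in path.split("."):
            value = value.get(key) if isinstance(value, dict) else None
            if value is None:
                break
        row[path] = value
    return row


# At first, only a few fields are extracted and stored as columns.
fields_v1 = ["ts", "user.id", "url"]
table_v1 = [extract(line, fields_v1) for line in RAW_EVENTS]

# Later, new questions come up, so more fields are extracted from the
# same raw events. The stored structure has just changed.
fields_v2 = fields_v1 + ["user.geo.country", "cart.items"]
table_v2 = [extract(line, fields_v2) for line in RAW_EVENTS]
```

The point of the sketch is that the raw events never change, while the set of extracted columns does. That is data model churn with no change at all in the source system.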

Machine-generated data. Besides the points I’ve already noted about log data, there are other issues with machine-generated data. In particular:

Derived data. As per some of the cases cited above — whatever its reason for existing, derived data leads to schema change.
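As a rough illustration of how derived data leads to schema change, the following sketch uses Python's built-in sqlite3 module. The customers table, the segment column and the spend threshold are made-up assumptions, chosen only to show that storing a newly derived value means altering the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base table holding only directly recorded data.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, total_spend REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, 1200.0), (2, 80.0)])


def columns(connection, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


before = columns(conn, "customers")

# Somebody derives a customer segment (the threshold is invented) and wants
# it stored next to the raw data. That requires a schema change.
conn.execute("ALTER TABLE customers ADD COLUMN segment TEXT")
conn.execute(
    "UPDATE customers "
    "SET segment = CASE WHEN total_spend >= 1000 THEN 'high' ELSE 'low' END"
)

after = columns(conn, "customers")
segments = dict(conn.execute("SELECT id, segment FROM customers"))
```

Each new score, segment or aggregate that gets persisted repeats this pattern, which is one reason the schema rarely stays put for long.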

Bottom line: Any data-related business processes you have — for example data governance — should assume that your data models will be in perpetual, rapid flux.
