May 20, 2013

Some stuff I’m working on

1. I have some posts up on Strategic Messaging. The most recent are overviews of messaging, pricing, and positioning.

2. Numerous vendors are blending SQL and JSON management in their short-request DBMS. It will take some more work for me to have a strong opinion about the merits/demerits of various alternatives.

The default implementation — one example would be Clustrix’s — is to stick the JSON into something like a BLOB/CLOB field (Binary/Character Large Object), index on individual values, and treat those indexes just like any others for the purpose of SQL statements. Drawbacks include:
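As a rough illustration of the approach just described, here is a minimal sketch that uses SQLite as a stand-in. It is not Clustrix's implementation or any other vendor's. The table name `docs`, the index name `idx_docs_city`, and the `city` field are all hypothetical, and the sketch assumes a SQLite build that includes the JSON functions (`json_extract`). The JSON goes into a plain text column, one value inside it gets an index, and ordinary SQL filters on that value.

```python
import json
import sqlite3


def build_store(docs):
    """Store each JSON document as text and index one value inside it."""
    conn = sqlite3.connect(":memory:")
    # The document lives in an ordinary character column (the CLOB stand-in).
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO docs (body) VALUES (?)",
        [(json.dumps(d),) for d in docs],
    )
    # Expression index on a single value extracted from the JSON text.
    conn.execute(
        "CREATE INDEX idx_docs_city ON docs (json_extract(body, '$.city'))"
    )
    return conn


def ids_in_city(conn, city):
    """Plain SQL lookup; the WHERE expression matches the indexed expression."""
    rows = conn.execute(
        "SELECT id FROM docs WHERE json_extract(body, '$.city') = ? ORDER BY id",
        (city,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    store = build_store(
        [
            {"name": "a", "city": "Boston"},
            {"name": "b", "city": "Acton", "tags": ["x"]},
            {"name": "c"},  # no "city" key; the extracted value is NULL
        ]
    )
    print(ids_in_city(store, "Boston"))
```

Note that documents lacking the indexed key are still stored; they simply never match an equality test on that value.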

IBM DB2 is one recent arrival to the JSON party. Unfortunately, I forgot to ask whether IBM’s JSON implementation was based on IBM DB2 pureXML when I had the chance, and IBM hasn’t gotten around to answering my followup query.

3. Nor has IBM gotten around to answering my followup queries on the subject of BLU, an interesting-sounding columnar option for DB2.

4. Numerous clients have asked me whether they should be active in DBaaS (DataBase as a Service). After all, Amazon, Google, Microsoft, Rackspace and salesforce.com are all in that business in some form, and other big companies have dipped toes in as well.

I’m skeptical that one can succeed both in that market and in selling database software, for reasons including:

I’m also skeptical about service-only DBaaS strategies, because users will naturally resist vendor lock-in.

But despite all my skepticism, DBaaS is an area I should probably learn more about.

5. I plan to spend more time looking at machine learning and other advanced analytics. I doubt they’ll soon match the past few years’ hype about “big data analytics”, but even the reality of modern analytics looks like it’s getting more interesting. Ditto if somebody has an interesting twist on more traditional predictive analytics.

6. Three years ago, I wrote:

  • It is inevitable* that governments and other constituencies will obtain huge amounts of information, which can be used to drastically restrict everybody’s privacy and freedom.
  • To protect against this grave threat, multiple layers of defense are needed, technical and legal/regulatory/social/political alike.
  • One particular layer is getting insufficient attention, namely restrictions upon the use (as opposed to the acquisition or retention) of data.

*And indeed in many ways even desirable

It is now frighteningly obvious that the US is becoming a high-surveillance society. The Boston Marathon bombing added three new elements to an already snowballing trend:

I need to write more about privacy.

Comments

5 Responses to “Some stuff I’m working on”

  1. aaron on May 20th, 2013 3:31 pm

    Standard fixed JSON *with a fixed simple schema* is not different from what generic RDBMS already does (possibly including functional indices for performance). The hard part is generally not save/retrieve by field (unless fields are huge – where it hits the general LOB issues) – it is more around flexibility.

    If you have complex schemas – cycles, deep nesting, or loose, dynamic, or evolving schemas – the game changes. You have now changed from a projection database to a graph database. That is hard to support.

  2. Curt Monash on May 20th, 2013 4:31 pm

    Aaron,

    I agree it would be bad if the indexing into the JSON object assumed a fixed schema.

    But whose implementation actually works like that?

  3. aaron on May 21st, 2013 9:10 am

    [Please correct me if you know of something out there – I could use it!]

    It depends what you need to do. Searching for a field works everywhere (though LOB stores that need to parse/query each field each time, often scanning with no index, are miserably slow). Walking through dynamic relationships really requires specialized optimization – I’ve been forced to shred things like AVRO into triple stores to do clever things that perform.

    Pretty much all of the current RDBMS support of JSON I’ve seen is either:
    – static, similar to XML support
    – flexible but weak, LOB based, with perhaps some limited indexing

    My point is that doing full scale deep JSON shred store with fast effective query for things like cycling schemas requires OODBMS or graph optimizations.

    Net-net – graph DBs can do cool and fast things with evolving JSON. Doc store DBs can do some things well, but not much better than RDBMS.

  4. Sébastien Derivaux on May 22nd, 2013 11:12 am

    For JSON in PostgreSQL you can see the following presentation that is quite interesting.
    http://nosql.mypopescu.com/post/47692111874/posgresql-as-a-schemaless-database

  5. JSON in DB2 | DBMS 2 : DataBase Management System Services on September 24th, 2013 4:43 am

    […] for multiple data manipulation languages (DMLs) or APIs — and there’s a special boom in JSON support, MongoDB-compatible or otherwise. So I talked earlier tonight with IBM’s Bobbie Cochrane […]
