2. Numerous vendors are blending SQL and JSON management in their short-request DBMS. It will take some more work for me to have a strong opinion about the merits/demerits of various alternatives.
The default implementation — one example would be Clustrix’s — is to stick the JSON into something like a BLOB/CLOB field (Binary/Character Large Object), index on individual values, and treat those indexes just like any others for the purpose of SQL statements. Drawbacks include:
- You have to store or retrieve the JSON a whole document at a time.
- If you are spectacularly careless, you could write JOINs with odd results.
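To make the pattern concrete, here is a minimal sketch in Python, using SQLite purely as a stand-in. It is not Clustrix's or anybody else's actual implementation, and the table names, the `city` field, and the helper functions are all hypothetical. The JSON sits in a text column as an opaque document, one value is copied out into an ordinary indexed table, and SQL queries use that index like any other.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# The JSON document is stored whole, as opaque text (a stand-in for a CLOB).
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# Hypothetical side table holding one extracted value per document,
# with an ordinary index on it.
conn.execute("CREATE TABLE docs_by_city (city TEXT, doc_id INTEGER)")
conn.execute("CREATE INDEX idx_docs_by_city ON docs_by_city (city)")


def put_document(doc_id, doc):
    """Write the whole document and refresh its extracted index entry."""
    conn.execute(
        "INSERT OR REPLACE INTO docs (id, body) VALUES (?, ?)",
        (doc_id, json.dumps(doc)),
    )
    conn.execute("DELETE FROM docs_by_city WHERE doc_id = ?", (doc_id,))
    if "city" in doc:
        conn.execute(
            "INSERT INTO docs_by_city (city, doc_id) VALUES (?, ?)",
            (doc["city"], doc_id),
        )


def find_by_city(city):
    """Plain SQL over the extracted value; returns whole documents."""
    rows = conn.execute(
        "SELECT d.body FROM docs d "
        "JOIN docs_by_city i ON i.doc_id = d.id "
        "WHERE i.city = ? ORDER BY d.id",
        (city,),
    )
    return [json.loads(body) for (body,) in rows]


def update_field(doc_id, key, value):
    """Changing one field still means reading and rewriting the whole document."""
    (body,) = conn.execute(
        "SELECT body FROM docs WHERE id = ?", (doc_id,)
    ).fetchone()
    doc = json.loads(body)
    doc[key] = value
    put_document(doc_id, doc)


put_document(1, {"name": "Ann", "city": "Boston"})
put_document(2, {"name": "Bob", "city": "Austin"})
update_field(2, "city", "Boston")
boston_names = [d["name"] for d in find_by_city("Boston")]
```

Note that `update_field` has to pull the entire document out and write it all back just to change one value, which is the first drawback listed above.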
IBM DB2 is one recent arrival to the JSON party. Unfortunately, I forgot to ask whether IBM’s JSON implementation was based on IBM DB2 pureXML when I had the chance, and IBM hasn’t gotten around to answering my followup query.
3. Nor has IBM gotten around to answering my followup queries on the subject of BLU, an interesting-sounding columnar option for DB2.
4. Numerous clients have asked me whether they should be active in DBaaS (DataBase as a Service). After all, Amazon, Google, Microsoft, Rackspace and salesforce.com are all in that business in some form, and other big companies have dipped toes in as well.
I’m skeptical that one can succeed both in that market and in selling database software, for reasons including:
- Nobody I can think of has done so.
- The value propositions are different.
  - DBaaS is about having administration be so easy that you, the customer, don't need to worry about it.
  - Database software is about one or more of:
    - Development ease.
    - Big-enterprise/legacy-vendor considerations.
I’m also skeptical about service-only DBaaS strategies, because users will naturally resist vendor lock-in.
But despite all my skepticism, DBaaS is an area I should probably learn more about.
5. I plan to spend more time looking at machine learning and other advanced analytics. I doubt they’ll soon match the past few years’ hype about “big data analytics”, but even the reality of modern analytics looks like it’s getting more interesting. Ditto if somebody has an interesting twist on more traditional predictive analytics.
6. Three years ago, I wrote:
- It is inevitable* that governments and other constituencies will obtain huge amounts of information, which can be used to drastically restrict everybody’s privacy and freedom.
- To protect against this grave threat, multiple layers of defense are needed, technical and legal/regulatory/social/political alike.
- One particular layer is getting insufficient attention, namely restrictions upon the use (as opposed to the acquisition or retention) of data.
*And indeed in many ways even desirable.
It is now frighteningly obvious that the US is becoming a high-surveillance society. The Boston Marathon bombing added three new elements to an already snowballing trend:
- A revelation that the FBI could track Tamerlan Tsarnaev’s communication content without any known warrant.
- A further revelation that the police know how to put on large paramilitary displays of force (and that the public generally approves).
- An increased belief that widespread video surveillance of public places is a Good Thing.
I need to write more about privacy.