August 11, 2010

Big Data is Watching You!

There’s a boom in large-scale analytics. The subjects of this analysis may be categorized as:

The most varied, interesting, and valuable of those four categories is the first one.

That may change some day, with the growing importance of machine-generated data, and of big-data science in particular. But I think it’s a fair assessment at present, and will remain one for at least the next few years.

Some of the most interesting use cases are concentrated in the areas of identifying individuals, groups of people, or behaviors of (groups of) people. For example:

In most cases, the analysis and/or run-time execution of the relevant models is done with the help of analytic DBMS. Other technologies that come into play include non-DBMS MapReduce (Hadoop), graph engines, and CEP (Complex Event Processing). The vendor most heavily represented on that list is probably Aster Data, because:

And by the way, all this only scratches the surface of what will be possible down the road. It’s based mainly on where you live, what you purchase, how you behave on websites, and who you communicate with. Other kinds of data, which could be used to be yet more intrusive, generally aren’t involved.

I actually have two points in drawing up this list. One is golly-gee-whiz about how a lot of analytically sophisticated applications are actually getting into production. The other is to highlight the privacy and liberty threats If This Goes On Unchecked (which is why I didn’t include some other less-people-focused examples). There’s also a related danger that, to the extent we don’t get some smart regulations to keep us safe(r), we’ll get a bunch of stupid regulations instead.

The Analytic Era has only just begun.


6 Responses to “Big Data is Watching You!”

  1. dave on August 11th, 2010 9:56 am

    well, i’m not so sure that it’s just begun – i mean, weren’t those former anthropologists doing sociograms for world war 2 to id ranking officers and deployments? at any rate, some very interesting applied cases on the part 1 above (people) can be seen at the site of valdis krebs – the same guy who used publicly available info and basic sna software to model the 9/11 terrorism network (was in the times i believe):

  2. Jeff on August 11th, 2010 9:17 pm

    Hey Curt,

    Nice piece. I like your classification of data sources, and the applications certainly look familiar. To see some examples of Cloudera customer use cases in the same domain, check out the slides from our recent webinar at


  3. unholyguy on August 11th, 2010 9:24 pm

    Wondering what % of the economy flows thru the capital markets of the world? Trades might trump people…

  4. Further thoughts on previous posts | DBMS 2 : DataBase Management System Services on September 27th, 2010 7:29 am

    […] Hammerbacher has made various comments to the effect “Yes indeedy! Hadoop does that too!” (My wording, not his. […]

  5. The technology of privacy threats | DBMS 2 : DataBase Management System Services on January 22nd, 2011 3:47 pm

    […] This post is the second of a series. The first one was an overview of privacy dangers, replete with specific examples of kinds of data that are stored for good reasons, but can also be repurposed for more questionable uses. More on this subject may be found in my August, 2010 post Big Data Is Watching You! […]

  6. Chris Bird on February 2nd, 2011 10:50 am

    For me the interesting part of all of this is when you “join up” stuff. Putting together the little digital “tells” and clues in order to make something more interesting/useful.

    Putting it together just for the sake of knowing is fun but not necessarily valuable (although it could become so – ask Google). Putting things together so action can be taken is immediately valuable.

    Also it isn’t necessarily about drawing positive inferences. For example, being able to predict that you won’t take a flight that you are booked on is something the airlines have been doing for years. If you make 2 reservations that are impossible to action together, one of them should be cancelled (ah yes but which? That’s Revenue Integrity secret sauce!). The beneficiaries here? The airline, which has placed unsellable inventory back into the market, and other possible travelers, for whom something sold out becomes available.

    Take that a step further – imagine that you are booked SFO – Dallas today and your geo location device says you are in Seattle with no hope of making the flight, now what? Joining current location with transaction knowledge….
