I talk with a lot of companies, and repeatedly hear some of the same application themes. This post is my attempt to collect some of those ideas in one place.
1. So far, the buzzword of the year is “real-time analytics”, generally with “operational” or “big data” included as well. I hear variants of that positioning from NewSQL vendors (e.g. MemSQL), NoSQL vendors (e.g. Aerospike), BI stack vendors (e.g. Platfora), application-stack vendors (e.g. WibiData), log analysis vendors (led by Splunk), data management vendors (e.g. Cloudera), and of course the CEP industry.
Yeah, yeah, I know — not all the named companies are in exactly the right market category. But that’s hard to avoid.
Why this gold rush? On the demand side, there’s a real or imagined need for speed. On the supply side, I’d say:
- There are vast numbers of companies offering data-management-related technology. They need ways to differentiate.
- Doing analytics at short-request speeds is an obvious data-management-related challenge, and not yet comprehensively addressed.
2. More generally, most of the applications I hear about are analytic, or have a strong analytic aspect. The three biggest areas — and these overlap — are:
- Customer interaction
- Network and sensor monitoring
- Game and mobile application back-ends
Also arising fairly frequently are:
- Algorithmic trading
- Risk measurement
- Law enforcement/national security
- Stakeholder-facing analytics
I’m hearing less about quality, defect tracking, and equipment maintenance than I used to, but those application areas have been ebbing and flowing for decades anyway.
3. Much of customer interaction revolves around recommendation and personalization. In connection with that I’ll remind you:
- Multiple sources say that 5 millisecond response is a real need. Srini Srinivasan explained why in a January comment.
- The results of the recommendation and personalization can be delivered in many different ways — product recommendations, ads, special offers, email, snail mail, call center scripts and more. This is the paradigmatic example for my skepticism about complete analytic applications.
4. Networks and sensors emit the epitome of machine-generated data. Data sources include web logs, network logs (in the IT sense), telecommunication networks, other utilities (e.g. electric), vehicle fleets, and more. Application themes include:
- Human monitoring, via some kind of real-time business intelligence view. I hear about that a lot.
- Various kinds of automated response. (Security is an obvious example.)
- Integration with other kinds of application, data source, or use case.
As one example of the last point, Oliver Ratzesberger told me years ago that eBay had up-to-the-minute BI cubes integrating customer response and log data, for the purpose of quickly detecting technology problems. Acunu recently told me that similar applications are one of their sales focuses.
5. In another example, games and mobile applications can be a lot like websites in terms of the analytics that support them (all the more so if we’re talking about games with in-app purchases). Two special features come up repeatedly, however — leaderboards for games, and geospatial data sent by mobile devices.
6. Algorithmic trading is flashy because of the sums of money involved, and because of what is often hyper-low latency; I’ve even heard 50 microseconds, and that’s a slightly out-of-date figure for a sequence of several atomic operations. But otherwise it’s not one of the more interesting areas to me, for at least two reasons:
- It depends on a lot of latency-specific stuff, such as hand-crafted hardware.
- The participants are secretive (understandably so, as they’re literally in a race with each other) and don’t reveal much.
Another reason I don’t study it much is that high-frequency trading could be devastated at any time by some simple regulatory changes.
7. I finally figured out one of the big drivers for better risk analysis. Banks need to keep capital lying around to cover a fraction of the risk they take on. If they can estimate the risk more precisely, and come up with a lower number, then they need to keep less capital. That’s a lot like finding large bags of money.
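To make that arithmetic concrete, here is a minimal sketch. Every number in it is made up for illustration (the capital ratio and both risk estimates are assumptions, not regulatory values), and `required_capital` is a hypothetical helper rather than anything a bank actually runs.

```python
def required_capital(risk_estimate: float, capital_ratio: float) -> float:
    """Capital to hold, modeled as a fixed fraction of estimated risk."""
    return risk_estimate * capital_ratio


# All figures below are hypothetical, chosen only to show the mechanism.
CAPITAL_RATIO = 0.08       # assumed fraction of risk that must be covered
coarse_estimate = 50e9     # conservative risk estimate, in dollars
refined_estimate = 45e9    # tighter estimate after better risk analysis

# Capital released by moving from the coarse estimate to the refined one.
freed = (required_capital(coarse_estimate, CAPITAL_RATIO)
         - required_capital(refined_estimate, CAPITAL_RATIO))

print(f"Capital freed: ${freed:,.0f}")
```

Under these assumed numbers, trimming the risk estimate by 10% releases roughly $400 million of capital, which is the "large bags of money" effect in miniature.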
8. Anti-fraud applications arise in many industries, with many different kinds of data and latency requirements. For example:
- Insurers don’t want to pay bogus claims. They usually have weeks to think about that problem.
- Telcos don’t want to provision services for customers who will defraud them. They have to decide at call-center speed.
- Similarly, retailers don’t want to accept bogus returns.
- Stockbrokers don’t want rogue traders to defeat their controls. A lot of data and analysis go into that mission, as billions of dollars — literally — can be at stake.
9. And finally, the recent Boston Marathon bombing has brought law-enforcement/anti-terrorism applications to the fore. The Boston Globe criticized difficulties in information sharing, but the money quote is:
The FBI followed up by checking government databases and looking for things such as “derogatory telephone communications, possible use of online sites associated with the promotion of radical activity, associations with other persons of interest, travel history and plans, and education history,” according to FBI Supervisory Agent Jason J. Pack. “The FBI also interviewed Tamerlan Tsarnaev and family members. The FBI did not find any terrorism activity.”
Neither the telephone intercept nor the web-surfing tracking is a capability the government routinely admits, unless there was something like a wiretap order that I so far haven’t seen reported.
Related links:
- Government surveillance is even more inevitable than when I wrote in 2010 that freedom can only be preserved by limiting government USES of data.
- Stakeholder-facing analytics isn’t much better understood than when I wrote about it in 2010.
- I wrote up a different list of analytic use cases back in 2006.
- The continued drop in high-frequency trading latency strengthens my 2009 contrast between the speed of a turtle and the speed of light; we’re now over a 3 * 10^10 difference between the speed of trading and the speed of generic planning, and many turtles walk well faster than 1 cm/sec.
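As a rough check on the last comparison, the sketch below works out the light-to-turtle speed ratio and what a slowdown of that size does to the 50-microsecond trading figure mentioned earlier. The 1 cm/sec turtle comes from the text; applying the ratio to a 50-microsecond latency is my own illustrative assumption, since the post does not say how long "generic planning" takes.

```python
SPEED_OF_LIGHT_CM_PER_S = 2.99792458e10  # exact, by definition of the meter
TURTLE_CM_PER_S = 1.0                    # the deliberately slow turtle in the text
TRADE_LATENCY_S = 50e-6                  # the 50-microsecond figure quoted earlier
SECONDS_PER_DAY = 86_400

# How many times faster light is than the turtle.
ratio = SPEED_OF_LIGHT_CM_PER_S / TURTLE_CM_PER_S

# Latency of a process slower than the trade by that same ratio (an assumption
# made purely for illustration), expressed in days.
slow_latency_days = TRADE_LATENCY_S * ratio / SECONDS_PER_DAY

print(f"speed ratio: {ratio:.2e}")
print(f"equivalent slow-process latency: {slow_latency_days:.1f} days")
```

In other words, a process about 3 * 10^10 times slower than a 50-microsecond trade takes a bit over 17 days, so any planning cycle longer than that exceeds the turtle-versus-light gap.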