I’ve already suggested that several apparent issues in predictive analytic agility can be dismissed by straightforwardly applying best-of-breed technology, for example in analytic data management. At first blush, the same could be said about the actual analysis, which comprises:
- Data preparation, which is tedious unless you do a good job of automating it.
- Running the actual algorithms.
Numerous statistical software vendors (or open source projects) help you with the second part; some make strong claims in the first area as well (e.g., my clients at KXEN). Even so, large enterprises typically have statistical silos, commonly featuring expensive annual SAS licenses and seemingly slow-moving SAS programmers.
As I see it, the predictive analytics workflow goes something like this:
- Business-knowledgeable people develop a theory as to what kinds of information and segmentation could be valuable in making better business micro-decisions.
- Statistics-knowledgeable people determine a structure for modeling that reflects this theory.
- Statistics-knowledgeable people tweak the model over time, within a fixed general structure, as new data comes in.
- (Optional) Somebody sees to acquiring whatever data is needed that the organization doesn’t already have (and won’t get in the ordinary course of ongoing business).
The optional last part can be a purchase of third-party information (relatively fast and easy) or the development of a business process (and if necessary associated software) to capture the information (not always so easy). But even when that step is handled (or isn't needed), there are at least two hand-offs where agility can be lost:
- Businesspeople may throw a request “over the wall” to the statisticians, who then work on it as their schedule permits.
- Once created, a model may be so set in stone that even small changes are as hard as building a new model from scratch.
The second problem can be solved by the statisticians themselves, without outside involvement. Model research and model refinement should be separate processes. You can recheck your clustering on one schedule, but recalibrate your regressions against each cluster more frequently. If that all sounds forbiddingly difficult, perhaps your model recalibration process needs another level of automation.
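To make the research/refinement split concrete, here is a minimal sketch of what automated recalibration within a fixed structure might look like. Everything in it is illustrative: the fixed centroids stand in for the "model research" output, and only the per-cluster regression coefficients are refit as new data arrives.

```python
# A sketch of separating model structure (fixed clustering) from model
# recalibration (per-cluster regression refit). All names and data are
# hypothetical, chosen just to illustrate the workflow.
import numpy as np

def assign_clusters(X, centroids):
    """Assign each row of X to its nearest fixed centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def recalibrate(X, y, centroids):
    """Refit a least-squares line per cluster without touching the clustering."""
    labels = assign_clusters(X, centroids)
    coefs = {}
    for k in range(len(centroids)):
        mask = labels == k
        if mask.sum() < 2:
            continue  # too little data to refit this cluster
        A = np.column_stack([X[mask], np.ones(mask.sum())])  # add intercept
        coefs[k], *_ = np.linalg.lstsq(A, y[mask], rcond=None)
    return coefs

# Toy data: two well-separated clusters with different linear relationships.
rng = np.random.default_rng(0)
centroids = np.array([[0.0], [10.0]])  # fixed by the "research" phase
X = np.concatenate([rng.normal(0, 1, (50, 1)), rng.normal(10, 1, (50, 1))])
y = np.concatenate([2 * X[:50, 0] + 1, -3 * X[50:, 0] + 5])
coefs = recalibrate(X, y, centroids)
# Rerunning recalibrate() on fresh data updates coefficients on whatever
# schedule you like, while the cluster structure stays put.
```

The point of the design is the same as in the text: rechecking the clustering is a separate, slower-cadence decision, while the regression refit is cheap enough to automate and run frequently.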
So I’ve finally gotten to the point of saying what may have been obvious from the start: The only excusable impediment to predictive analytic agility is the hand-off from the people who know the business to the people who know the math. So let’s examine ways that difficulty can be resolved.
At big internet companies, the usual answer is something like
Hey, it’s just data. From web logs. And network event logs. The data scientists know how to handle that.
In financial trading firms, the answer is more
The traders and analysts work closely together. Very closely. In fact, when the traders rip out their phones and throw them across the room, the analysts need to duck to avoid getting clobbered.
In credit card or telecom marketing or insurance actuarial organizations, the answer may be
Don’t worry; the stats geeks have been at this for a long time; they really do understand our business.
All three approaches work.
But what about conventional enterprises, where line-of-business people may not be as math-savvy as internet developers or financial traders, and where the math experts may not have the business issues down cold? My flippant answer is that businesspeople should know some math too.* My more serious answer is that the “business analyst” role should be expanded beyond BI and planning to include lightweight predictive analytics as well.
*I wasn’t being entirely flippant, of course. Statistics is even being taught in high school these days. And when I got a PhD in game theory, 2/3 of my thesis committee was at the Harvard Business School.
For example, at retailers:
- Market basket analysis is pretty simplistic (it only looks at small subsets of a basket at a time).
- Seasonality is tricky. (Weather and so on can skew it.)
- Each store or region can be its own universe.
- Some of the results of analytics are rather coarse-grained — e.g., merchandise adjacencies — so precision in statistical analysis may not matter much anyway.
And so truly rigorous statistical analysis may be both infeasible and unnecessary; a lot of business-informed seat-of-the-pants reasoning needs to be mixed in. Consequently, there’s a lot to be said for pushing at least some retail predictive analytics pretty close to the merchandising department(s).
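The "simplistic" nature of market basket analysis is easy to see in code: the classic approach just counts small itemsets (here, pairs) and ratios of counts. The baskets below are made up for the example.

```python
# Toy market basket analysis: pairwise support and confidence only.
# Higher-order structure within a basket is ignored, which is exactly
# the limitation noted above.
from itertools import combinations
from collections import Counter

baskets = [
    {"beer", "chips", "salsa"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"beer", "salsa"},
    {"milk", "bread", "chips"},
]

pair_counts = Counter()
item_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Only pairs are examined, one small subset of the basket at a time.
    pair_counts.update(frozenset(p) for p in combinations(sorted(basket), 2))

def confidence(a, b):
    """P(b in basket | a in basket), estimated from the counts above."""
    return pair_counts[frozenset((a, b))] / item_counts[a]

# e.g. confidence("beer", "chips") = 2/3: beer appears in three baskets,
# two of which also contain chips.
```

This is the kind of lightweight analysis a math-comfortable business analyst could plausibly own, which is the broader argument here.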
Similar stories could be told in many other industries and pursuits, including but emphatically not limited to:
- Event marketing.
- College admissions.
- Political campaigning.
- Field maintenance at utility companies.
- Price-setting (across many industries).
In each case, it’s easy to see how statistical and predictive analytic techniques could add real value to the business. But it’s hard to imagine how the enterprise could support the kind of large, experienced, business-knowledge analytic operation one might find in hedge fund investing or telecom churn analysis. And absent that, it’s tough to see why the only people doing predictive analytics for the organization should sit in some silo of statistical expertise.