November 2, 2014

Notes on predictive modeling, November 2, 2014

Following up on my notes-on-predictive-modeling post from three weeks ago, I’d like to tackle some areas of recurring confusion.

Why are we modeling?

Ultimately, there are two reasons to model some aspect of your business:

* To predict what will happen, so as to guide decisions and actions (e.g., whom to include in a mailing).
* To understand why things are happening (e.g., root cause analysis).

How precise do models need to be?

Use cases vary greatly with respect to the importance of modeling precision. If you’re doing an expensive mass mailing, 1% additional accuracy is a big deal. But if you’re doing root cause analysis, a 10% error may be immaterial.
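
To make the mailing arithmetic concrete, here is a back-of-the-envelope sketch; every number in it is invented purely for illustration, but it shows why small modeling gains matter at mass-mailing scale.

    # Hypothetical mass-mailing economics; all figures are made up for illustration.
    pieces_mailed = 1_000_000
    cost_per_piece = 0.50        # printing + postage per piece
    value_per_response = 40.00   # profit from one responding customer

    def campaign_profit(response_rate):
        revenue = pieces_mailed * response_rate * value_per_response
        cost = pieces_mailed * cost_per_piece
        return revenue - cost

    baseline = campaign_profit(0.020)   # current model: 2.0% of recipients respond
    improved = campaign_profit(0.021)   # slightly better targeting: 2.1% respond
    print(baseline, improved, improved - baseline)   # 300000.0 340000.0 40000.0
    # A small improvement in targeting is worth $40,000 on this one campaign,
    # while a 10% error in, say, a root-cause estimate may not change any decision.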

Who is doing the work?

It is traditional to have a modeling department of “data scientists” or SAS programmers, as the case may be. While it seems cool to put predictive modeling straight into the hands of business users (some business users, at least), it’s rare for them to use predictive modeling tools more sophisticated than Excel. For example, KXEN never did all that well.

That said, I support the idea of putting more modeling in the hands of business users. Just be aware that doing so is still a small business at this time.

“Operationalizing” predictive models

The topic of “operationalizing” models arises often, and it turns out to be rather complex. Usually, to operationalize a model, you need:

* A program that evaluates the model, i.e. that computes scores for individual cases.
* A program that acts on those scores, e.g. by making or guiding business decisions.

In some cases, the two programs might be viewed as different modules of the same system.

While it is not strictly necessary for there to be a numerical score (or scores) in the process, it is pretty common that there is one. Certainly the score calculation can serve as a loose-coupling boundary between model evaluation and the rest of the system.
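
As a minimal sketch of that loose coupling (the model, coefficients, field names, and thresholds below are invented purely for illustration), the scoring side and the decision side only have to agree on the score itself:

    import math

    # Scoring side: evaluate the predictive model and emit a numeric score.
    def score_customer(features):
        """Toy logistic-style model; real coefficients would come from training."""
        weights = {"recency_days": -0.01, "past_purchases": 0.30}
        z = sum(weights[k] * features.get(k, 0.0) for k in weights)
        return 1.0 / (1.0 + math.exp(-z))   # probability-like score in [0, 1]

    # Decision side: business rules act on the score; they know nothing about the model.
    def decide(score):
        if score >= 0.7:
            return "send premium offer"
        elif score >= 0.4:
            return "send standard offer"
        return "no contact"

    customer = {"recency_days": 30, "past_purchases": 4}
    s = score_customer(customer)
    print(round(s, 3), "->", decide(s))   # 0.711 -> send premium offer

Either side can then change independently: the model can be retrained without touching the decision rules, and the rules can be tuned without redeploying the model.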

That said:

In any case, operationalizing a predictive model can or should include:

Traditional IT considerations, such as testing and versioning, apply.
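
One way the testing point can play out for scoring code (a sketch under assumed names, reusing the toy score_customer function above): record reference inputs and expected scores for each approved model version, and check a deployment against them before routing traffic to it.

    # Sketch: regression-test a deployed scoring function against reference cases
    # captured when the model version was approved. Names and values are illustrative.
    REFERENCE_CASES = {
        "churn-model-1.3": [
            ({"recency_days": 30, "past_purchases": 4}, 0.711),
            ({"recency_days": 90, "past_purchases": 0}, 0.289),
        ],
    }

    def check_deployment(score_fn, version, tolerance=0.005):
        ok = True
        for features, expected in REFERENCE_CASES.get(version, []):
            got = score_fn(features)
            if abs(got - expected) > tolerance:
                print(f"{version}: expected {expected}, got {got:.3f} for {features}")
                ok = False
        return ok

    # e.g. check_deployment(score_customer, "churn-model-1.3") before going live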

What do we call it anyway?

The term “predictive analytics” was coined by SPSS. It basically won. However, some folks — including whoever named PMML — like the term “predictive modeling” better. I’m in that camp, since “modeling” seems to be a somewhat more accurate description of what’s going on, but I’m fine with either phrase.

Some marketers now use the term “prescriptive analytics”. In theory that makes sense, since:

* Predictive models tell you what is likely to happen.
* “Prescriptive” suggests going a step further, to recommending what you should do about it.

Edit: Ack! I left the final paragraph out of the post, namely:

In practice, however, the term “prescriptive analytics” is a strong indicator of marketing nonsense. Predictive modeling has long been used to — as it were — prescribe business decisions; marketers who use the term “prescriptive analytics” are usually trying to deny that very obvious fact.

Comments

4 Responses to “Notes on predictive modeling, November 2, 2014”

  1. Thomas W. Dinsmore on November 3rd, 2014 10:13 pm

    Kurt,

    A few comments.

    (1) Precision and accuracy are not the same thing; predictive models can be precise but not accurate, accurate but not precise, both or neither.

    (2) Organizations do not let “business users” deliver high value “money” analytics for applications like credit risk, fraud, trading and so forth. The people who work in these areas aren’t afraid to code.

    (3) A scoring engine simply computes a numerical score; a decision engine implements rules based on scores and other criteria. Separating them into modules makes sense because the tasks are often asynchronous; we may want to score in batch, store the results and use the score in a real-time decision.

    It also makes sense to separate the scoring engine from the model training or learning engine because (a) the tasks are asynchronous; (b) scoring is embarrassingly parallel and can be implemented inside MPP databases; (c) scoring is a production application; and (d) scoring does not require highly trained analytic specialists.

    PMML is a fine standard, but only works if the organization has aligned the data models for the deployment environment and the model development environment. Many haven’t.

    The process for operationalizing a model also requires a facility to catalogue deployed models and to track performance over time.

    (4) There’s a bit more to prescriptive analytics than marketing:

    http://en.wikipedia.org/wiki/Prescriptive_analytics

    Regards,

    Thomas

  2. Curt Monash on November 3rd, 2014 10:42 pm

    Thomas,

    That Wikipedia article is my top example for holding the opposite opinion from you. But I didn’t realize until you sent me back to the article that the term “Prescriptive analytics” is trademarked by one obscure firm, and hence should probably be ignored by the rest of the industry entirely.

  3. Thomas W. Dinsmore on November 4th, 2014 5:45 am

    Kurt,

    Ayata is obscure? No more than — let’s say — Nutonian. Obviously, Ayata punches above its weight if their thoughtware is part of the lexicon.

    Regards,

    Thomas

  4. Curt Monash on November 8th, 2014 7:33 am

    Thomas,

    What’s the relevance of Nutonian here? This post contradicts some of their marketing claims too. Even more to the point, it is approximately true that nobody except a trademark holder should use a trademarked term.

    Also — I somehow forgot the last paragraph of the post. It’s been added in now.
