June 5, 2016

Challenges in anomaly management

As I observed yet again last week, much of analytics is concerned with anomaly detection, analysis and response. I don’t think anybody understands the full consequences of that fact,* but let’s start with some basics.

*me included

An anomaly, for our purposes, is a data point (or, more likely, a data aggregate) that is notably different from the trend or norm. If I may oversimplify, there are three kinds of anomalies:
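As a minimal, hypothetical illustration of "notably different from the trend or norm" (my sketch, not any vendor's method), a z-score test flags points that sit too many standard deviations from the mean:

```python
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Return (index, value) pairs more than `threshold` standard
    deviations from the mean: a deliberately naive notion of
    'notably different from the norm'."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # perfectly flat data has no outliers by this rule
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

# A single large outlier inflates both the mean and the standard
# deviation, partially masking itself; that is one reason real systems
# prefer robust statistics over this naive version.
readings = [10, 11, 9, 10, 12, 10, 11, 95, 10, 9]
print(zscore_anomalies(readings, threshold=2.5))
```

Even this toy version hints at why the problem is hard: the choice of baseline, window, and threshold already determines what counts as "anomalous."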

Two major considerations are:

What I mean by the latter point is:

Anyhow, the Holy Grail* of anomaly management is a system that sends the right alerts to the right people, and never sends them wrong ones. And the quest seems about as hard as that for the Holy Grail, although this one uses more venture capital and fewer horses.

*The Holy Grail, in legend, was found by 1-3 knights: Sir Galahad (in most stories), Sir Percival (in many), and Sir Bors (in some). Leading vendors right now are perhaps around the level of Sir Kay.

Difficulties in anomaly management technology include:

Consequences of the last point include:

Donald Rumsfeld’s distinction between “known unknowns” and “unknown unknowns” is relevant here, although it feels wrong to mention Rumsfeld and Sir Galahad in the same post.

And so a reasonable summary of my views might be:

Anomaly management is an important and difficult problem. So far, vendors have done a questionable job of solving it.

But there’s a lot of activity, which I look forward to writing about in considerable detail.

8 Responses to “Challenges in anomaly management”

  1. Chris Martins on June 6th, 2016 8:53 am

    Interesting post.
    I wonder if you could speak to the issues of:
    – timeliness of response, as I would think that is part of the “holy grail.” This is part of the argument for complex event processing: when the anomalies are in streaming data, you use CEP to detect and act quickly.
    – also, the extent to which machine learning helps systems improve their ability to 1) detect anomalies and 2) distinguish between signal and noise.
    – is there not a class of anomalies where we indeed know what they’ll be, but we don’t know when they’ll happen? (Hence your Rumsfeld reference.)
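On the timeliness point, a CEP-style detector in the streaming spirit this comment describes might look like the following sketch (illustrative only; `alpha` and `threshold` are made-up tuning knobs): each event is scored the moment it arrives against an exponentially weighted baseline, so alerting latency is one event, not one batch.

```python
class StreamingDetector:
    """Score each arriving event against an exponentially weighted
    running mean/variance, so detection happens on arrival."""

    def __init__(self, alpha=0.1, threshold=3.0):
        self.alpha = alpha          # smoothing factor for the baseline
        self.threshold = threshold  # alert if |x - mean| > threshold * std
        self.mean = None
        self.var = 0.0

    def observe(self, x):
        if self.mean is None:       # first event just initializes the baseline
            self.mean = x
            return False
        diff = x - self.mean
        std = self.var ** 0.5
        is_anomaly = std > 0 and abs(diff) > self.threshold * std
        # Update the baseline (even on anomalies, in this naive sketch).
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return is_anomaly
```

A detector like this answers each event immediately, at the cost of a noisy warm-up period while the baseline settles.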

  2. David Gruzman on June 8th, 2016 10:55 am

    Machine learning is very important, since we can have many different subsets of the data, each with different behavior. We have to “learn” that behavior, rather than configure it.
    Finding anomalies that we are seeing for the first time can also be addressed by various kinds of “single-class” classification algorithms. I grant that this is hard to do with ML, but I believe it is even harder to solve without it…
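The "single-class" idea can be sketched very crudely (a hypothetical centroid-distance rule, not a real one-class SVM): learn a profile from normal data alone, with no anomaly examples, and flag anything that falls outside it.

```python
def fit_normal_profile(normal_points):
    """Learn a profile from normal examples only: their centroid and the
    largest distance any normal point sits from it. No anomaly examples
    are needed, which is the point of single-class methods."""
    n = len(normal_points)
    dim = len(normal_points[0])
    centroid = tuple(sum(p[i] for p in normal_points) / n for i in range(dim))

    def dist(p):
        return sum((p[i] - centroid[i]) ** 2 for i in range(dim)) ** 0.5

    radius = max(dist(p) for p in normal_points)
    return dist, radius

def is_novel(point, dist, radius, slack=1.5):
    """Flag points farther from the normal centroid than any training
    point, with some slack for sampling noise."""
    return dist(point) > slack * radius
```

With normal points clustered around (1.5, 1.5), a faraway point such as (5, 5) is flagged as novel even though nothing like it was ever seen in training.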

  3. Curt Monash on June 14th, 2016 2:47 am

    Re timeliness:

    I’d say that the detection should be as timely as possible. But what’s possible depends on, for example, whether the anomaly is a single event or a deviation from a “typical” number of events per minute.
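To make that distinction concrete (a toy sketch with made-up numbers): a deviation from a typical events-per-minute rate can only be confirmed after the aggregation window has accumulated, which puts a floor under detection latency that a single smoking-gun event does not have.

```python
from collections import Counter

def rate_anomalies(event_times, expected_per_minute, factor=3.0):
    """Flag minutes whose event count is far from a 'typical' rate.
    Note the built-in latency: a minute can only be judged once it has
    been observed, unlike a single smoking-gun event.

    (Silent minutes never appear in the counter, so total outages would
    need separate handling in anything beyond this sketch.)"""
    counts = Counter(int(t // 60) for t in event_times)  # events per minute
    return sorted(minute for minute, c in counts.items()
                  if c > factor * expected_per_minute
                  or c < expected_per_minute / factor)
```

With a typical rate of 2 events per minute, a minute containing 20 events is flagged, but only after that minute's events have been counted.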

  4. David Gruzman on June 14th, 2016 9:26 am

    I would rephrase that: detection should happen as soon as there are statistical (or other) grounds for it.

  5. Dinesh Vadhia on June 16th, 2016 7:36 am

    The problem with machine learning in general, and anomaly detection specifically, is the necessary training/re-training cycle. The end result of training an ML algorithm on the data is a static (or batch) model. Once new data arrives, the model has to be re-trained. Lather, rinse, and repeat. There is an obvious disconnect between data pouring in and a detection system that was trained on “old” data.

    This is also the reason why ML cannot be integrated into enterprise software systems. You cannot stop the ERP system while the ML system is re-trained.
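One common mitigation (my sketch, not something the comment proposes) is to use models that update incrementally with each arriving record, so there is no stop-the-world retraining step at all. Welford's online algorithm for mean and variance is the classic example:

```python
class OnlineMeanVar:
    """Welford's online algorithm: the 'model' (a running mean and
    variance) is updated with each arriving record, so there is never a
    batch retraining step that would require stopping the system."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance of everything seen so far."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
```

Richer models (trees, neural networks) are harder to update this way, which is part of why the batch-retraining objection has force.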

  6. David Gruzman on June 16th, 2016 1:17 pm

    I completely agree that the practical application of ML is challenging, but it is still possible for anomaly detection. One way is to build models general enough to work well even when a bit outdated.
    A better way is to have daily, hourly, and more fine-grained models, and to aggregate and use combinations of them “on demand.”
    This makes particular sense when there is a strong bias toward specific time frames, such as weekdays, rush hours, etc.
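The bucketing idea could be sketched like this (hypothetical helper names; one trivial "model", a per-bucket mean): Monday 9am traffic gets compared with other Monday 9ams rather than with a single global norm.

```python
from collections import defaultdict

def build_bucket_baselines(observations):
    """Fit one simple baseline (a mean) per (weekday, hour) bucket from
    (weekday, hour, value) triples, so seasonal bias such as weekdays
    and rush hours is baked into the comparison."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for weekday, hour, value in observations:
        key = (weekday, hour)
        sums[key] += value
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}

def deviation(weekday, hour, value, baselines):
    """Relative deviation from the matching bucket's baseline,
    or None if that bucket was never observed."""
    base = baselines.get((weekday, hour))
    if base is None or base == 0:
        return None
    return (value - base) / base
```

The `None` case is itself interesting: a bucket with no history is exactly the "unknown unknown" situation the post alludes to.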

  7. Notes on anomaly management | DBMS 2 : DataBase Management System Services on October 10th, 2016 3:35 am

    […] In June I wrote about why anomaly management is hard. Well, not only is it hard to do; it’s hard to talk about as well. One reason, I think, is […]

  8. Notes on anomaly management | DBMS 2 : DataBase Management System Services – Cloud Data Architect on October 11th, 2016 1:24 am

    […] In June I wrote about why anomaly management is hard. Well, not only is it hard to do; it’s hard to talk about as well. One reason, I think, is that […]
