Log analysis

Discussion of how data warehousing and analytic technologies are applied to logfile analysis. Related subjects include:

September 6, 2016

“Real-time” is getting real

I’ve been an analyst for 35 years, and debates about “real-time” technology have run through my whole career. Some of those debates are by now pretty much settled. In particular:

A big issue that does remain open is: How fresh does data need to be? My preferred summary answer is: As fresh as is needed to support the best decision-making. I think that formulation starts with several advantages:

Straightforward applications of this principle include: Read more

May 30, 2016

Adversarial analytics and other topics

Five years ago, in a taxonomy of analytic business benefits, I wrote:

A large fraction of all analytic efforts ultimately serve one or more of three purposes:

  • Marketing
  • Problem and anomaly detection and diagnosis
  • Planning and optimization

That continues to be true today. Now let’s add a bit of spin.

1. A large fraction of analytics is adversarial. In particular: Read more

October 15, 2015

Basho and Riak

Basho was on my (very short) blacklist of companies with whom I refuse to speak, because they have lied about the contents of previous conversations. But Tony Falco et al. are long gone from the company. So when Basho’s new management team reached out, I took the meeting.

For starters:

Basho’s product line has gotten a bit confusing, but as best I understand things the story is:

Technical notes on some of that include:  Read more

September 17, 2015

Rocana’s world

For starters:

Rocana portrays itself as offering next-generation IT operations monitoring software. As you might expect, this has two main use cases:

Rocana’s differentiation claims boil down to fast and accurate anomaly detection on large amounts of log data, including but not limited to:

Read more

August 3, 2015

Data messes

A lot of what I hear and talk about boils down to “data is a mess”. Below is a very partial list of examples.

To a first approximation, one would expect operational data to be rather clean. After all, it drives and/or records business transactions. So if something goes awry, the result can be lost money, disappointed customers, or worse, and those are outcomes to be strenuously avoided. Up to a point, that’s indeed true, at least at businesses large enough to be properly automated. (Unlike, for example — :) — mine.)

Even so, operational data has some canonical problems. First, it could be inaccurate; somebody can just misspell or otherwise botch an entry. Further, there are multiple ways data can be unreachable, typically because it’s:

Inconsistency can take multiple forms, including:  Read more

May 26, 2015

IT-centric notes on the future of health care

It’s difficult to project the rate of IT change in health care, because:

Timing aside, it is clear that health care change will be drastic. The IT part of that starts with vastly comprehensive electronic health records, which will be accessible (in part or whole as the case may be) by patients, care givers, care payers and researchers alike. I expect elements of such records to include:

These vastly greater amounts of data cited above will allow for greatly changed analytics.
Read more

May 13, 2015

Notes on analytic technology, May 13, 2015

1. There are multiple ways in which analytics is inherently modular. For example:

Also, analytics is inherently iterative.

If I’m right that analytics is or at least should be modular and iterative, it’s easy to see why people hate multi-year data warehouse creation projects. Perhaps it’s also easy to see why I like the idea of schema-on-need.

2. In 2011, I wrote, in the context of agile predictive analytics, that

… the “business analyst” role should be expanded beyond BI and planning to include lightweight predictive analytics as well.

I gather that a similar point is at the heart of Gartner’s new term citizen data scientist. I am told that the term resonates with at least some enterprises.  Read more

March 5, 2015

Cask and CDAP

For starters:

Also:

So far as I can tell:

Read more

February 22, 2015

Data models

7-10 years ago, I repeatedly argued the viewpoints:

Since then, however:

So it’s probably best to revisit all that in a somewhat organized way.

Read more

December 31, 2014

Notes on machine-generated data, year-end 2014

Most IT innovation these days is focused on machine-generated data (sometimes just called “machine data”), rather than human-generated. So as I find myself in the mood for another survey post, I can’t think of any better idea for a unifying theme.

1. There are many kinds of machine-generated data. Important categories include:

That’s far from a complete list, but if you think about those categories you’ll probably capture most of the issues surrounding other kinds of machine-generated data as well.

2. Technology for better information and analysis is also technology for privacy intrusion. Public awareness of privacy issues is focused in a few areas, mainly: Read more

Next Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.