February 6, 2012

Sumo Logic and UIs for text-oriented data

I talked with the Sumo Logic folks for an hour Thursday. Highlights included:

Sumo Logic does SaaS (Software as a Service) log management.
Sumo Logic is text indexing/Lucene-based. Thus, it is reasonable to think of Sumo Logic as “Splunk-like”. (However, Sumo Logic seems to have a stricter security/troubleshooting orientation than Splunk, which is trying to branch out.)
Sumo Logic has hacked Lucene for faster indexing, and says 10-30 second latencies are typical.
Sumo Logic’s main differentiation is automated classification of events.
There’s some kind of streaming engine in the mix, to update counters and drive alerts.
Sumo Logic has around 30 “customers,” free (mainly) or paying (around 5) as the case may be.
A truly typical Sumo Logic customer has single to low double digits of gigabytes of log data per day. However, Sumo Logic seems highly confident in its ability to handle a terabyte per customer per day, give or take a factor of 2.
When I asked about the implications of shipping that much data to a remote data center, Sumo Logic observed that log data compresses really well.
Sumo Logic recently raised a bunch of venture capital.
Sumo Logic’s founders are out of ArcSight, a log management company HP paid a bunch of money for.
Sumo Logic coined a marketing term “LogReduce”, but it has nothing to do with “MapReduce”. Sumo Logic seems to find this amusing.
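
To make the compression point concrete, here is a minimal Python sketch that generates repetitive, log-like text and measures how much zlib shrinks it. Everything in it is an assumption for illustration: the log templates, the field names, and the helper names synthetic_log and compression_ratio are made up, and nothing here reflects how Sumo Logic actually ships or compresses data. Real logs will compress more or less depending on how much of each line is boilerplate.

```python
import random
import zlib


def synthetic_log(n_lines, seed=0):
    """Build n_lines of fake, log-like text (hypothetical templates)."""
    rng = random.Random(seed)
    templates = [
        "INFO  user={user} action=login status=ok latency_ms={ms}",
        "WARN  user={user} action=login status=failed attempts={n}",
        "INFO  host=web-{host} GET /api/items/{item} 200 {ms}ms",
        "ERROR host=web-{host} upstream timeout after {ms}ms",
    ]
    lines = []
    for i in range(n_lines):
        minute, second = (i // 60) % 60, i % 60
        body = rng.choice(templates).format(
            user=rng.randint(1, 500),
            ms=rng.randint(1, 900),
            n=rng.randint(1, 5),
            host=rng.randint(1, 20),
            item=rng.randint(1, 10000),
        )
        lines.append(f"2012-02-06T12:{minute:02d}:{second:02d}Z {body}")
    return "\n".join(lines).encode("utf-8")


def compression_ratio(data, level=6):
    """Return uncompressed size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, level))


if __name__ == "__main__":
    data = synthetic_log(5000)
    print(f"raw bytes: {len(data)}")
    print(f"compression ratio: {compression_ratio(data):.1f}x")
```

The repeated timestamps, field names, and message templates are what a general-purpose compressor exploits, which is why shipping log data to a remote data center is less daunting than the raw volume suggests.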

What interests me about Sumo Logic is that automated classification story. I thought I heard Sumo Logic say:

It’s largely unsupervised machine learning.
It’s specific to a particular user/data set.
It can be up and running and classifying things effectively almost instantly (i.e., on seconds’ or minutes’ worth of data).
It’s informed by what different users tag as false positives. (Or maybe that is planned for future versions.)

I have a little trouble seeing how all those points fit exactly together, so perhaps I got some details wrong.

The payoff is that machine learning directly informs the Sumo Logic user interface. In particular, large numbers of events are bundled into a small number of categories, hopefully making it much easier for network operations types to scan the UI and pick out what’s important.
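
The bundling idea can be sketched in a few lines of Python. To be clear, this is a deliberately crude stand-in and not Sumo Logic's method: it uses fixed regular expressions to mask the variable parts of each log line (IP addresses, hex values, numbers) and then counts lines by the resulting signature, whereas the company describes largely unsupervised machine learning. The function names signature and summarize and the sample messages are hypothetical.

```python
import re
from collections import Counter

# Order matters: mask IPs and hex values before bare numbers,
# otherwise the digits inside them would be masked piecemeal.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def signature(line):
    """Replace variable-looking tokens with placeholders and normalize spaces."""
    for pattern, token in _MASKS:
        line = pattern.sub(token, line)
    return " ".join(line.split())


def summarize(lines):
    """Group lines by signature; return (signature, count) pairs, largest first."""
    counts = Counter(signature(line) for line in lines)
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        "Failed login for user 1042 from 10.0.0.7",
        "Failed login for user 77 from 192.168.1.20",
        "Disk usage at 91% on volume 3",
        "Failed login for user 5 from 10.0.0.9",
        "Disk usage at 87% on volume 1",
    ]
    for sig, count in summarize(sample):
        print(f"{count:>4}  {sig}")
```

Even this naive version turns five events into two categories, which is the kind of reduction that makes a screen of events scannable. A learned approach would presumably discover the variable fields from the data rather than from hand-written patterns.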

In general, the idea of machine learning informing analytic UIs via some sort of classification is common in text-oriented technologies, notably in:

Good ol’ text search.
Text mining vendors’ approaches to clustering hits on words or phrases that say substantially the same thing.

But otherwise it seems kind of rare, if we stipulate that ad-serving/general internet personalization isn’t really an analytic UI — but I’d love to hear of any interesting examples I’ve overlooked.

Categories: Log analysis, Market share and customer counts, Predictive modeling and advanced analytics, Software as a Service (SaaS), Text

Comments

7 Responses to “Sumo Logic and UIs for text-oriented data”

Jim on February 6th, 2012 3:37 pm

Curt,

What is the unit for “single to low double digits of log data per day”? Is it GB?
Curt Monash on February 6th, 2012 3:59 pm

Jim,

Ack! Thanks! Yes! Fixed.
Applications of an analytic kind : DBMS 2 : DataBase Management System Services on February 11th, 2012 8:32 pm

[…] analyzer Sumo Logic probably doesn’t rely on an off-the-shelf machine learning […]
Andrew Lee on November 30th, 2012 3:42 pm

“Sumo Logic’s main differentiation is automated classification of events.”

– Is this a comparison to Splunk?

How else does it differ? More dependence on machine learning techniques?
Curt Monash on November 30th, 2012 5:53 pm

I haven’t talked with Sumo Logic for a while. Their last PR pitch was a generic “Golly gee whiz big data SaaS cloud” piece of nonsense; if they actually enhanced the offering in interesting ways, they did a good job of covering it up.
El Peralta on July 2nd, 2013 4:40 pm

If you’re interested in Sumo you can always contact them directly 🙂 Sumo has more differentiators on their backend–“elastic log processing” for scaling without performance implications, machine learning and native anomaly detection technology, dashboards which run off of continuous queries for auto-updating, etc. The cloud marketing “nonsense” has room for improvement 🙂
Splunk engages in stupid lawyer tricks | DBMS 2 : DataBase Management System Services on November 25th, 2015 9:14 am

[…] are apt to backfire instead. Splunk seems to actually have had some limited success intimidating Sumo Logic. But it tried something similar against Rocana, and I was set up to potentially be collateral […]
