Glassbeam checked in recently, and they turn out to exemplify quite a few of the themes I’ve been writing about. For starters:
- Glassbeam has an analytic technology stack focused on poly-structured machine-generated data.
- Glassbeam partially organizes that data into event series …
- … in a schema that is modified as needed.
Glassbeam basics include:
- Founded in 2009.
- Based in Santa Clara. Back-end engineering in Bangalore.
- $6 million in angel money; no other VC.
- High single-digit customer count, …
- … plus another high single-digit number of end customers for an OEM offering a limited version of their product.
All Glassbeam customers except one are SaaS/cloud (Software as a Service), and even that one was only offered a subscription (as opposed to perpetual license) price.
So what does Glassbeam’s technology do? Glassbeam says it is focused on “machine data analytics,” specifically for the “Internet of Things”, which it distinguishes from IT logs.* In practice, Glassbeam sells to manufacturers of complex devices, such as IT (most of its sales so far), medical, and automotive (aspirational to date), and helps them analyze “phone home” data, for both support/customer service and marketing kinds of use cases. As of a recent release, the Glassbeam stack can:
- Parse, process, manage, and — if needed — export data.
- Provide business intelligence, search and — you guessed it! — parametric search interfaces, with drilldown into the underlying raw data.
- Not yet do predictive modeling.
*To a first approximation, this could be translated as “We help with machine-generated data, but we’d prefer not to compete with Splunk.” That said, 1% of Splunk’s customer base is more than 100% of Glassbeam’s, so I conjecture that Splunk has more customers in Glassbeam’s target segments than Glassbeam itself yet has.
The event-series part of the story is that Glassbeam ingests various kinds of text data, especially:
- Utterly static system configuration (e.g. ID/serial numbers)
- Configuration that changes occasionally
- Logs/time series
and “stitches them together”. Until this month’s “SCALAR” release, Glassbeam ingested data in batch form from the customers who collected it, but Glassbeam now wants to collect the “phone home” data itself.
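To make the "stitching" idea concrete, here is a minimal sketch of one way the three kinds of data could be combined into an event series: each log event is enriched with the static configuration plus whatever changeable configuration was in effect at that event's timestamp. This is an illustration under my own assumptions, not Glassbeam's implementation; the names `ConfigChange`, `config_as_of` and `stitch`, and all the sample data, are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class ConfigChange:
    ts: int          # time the new settings took effect (hypothetical epoch seconds)
    settings: dict   # only the settings that changed at that time


def config_as_of(changes, ts):
    """Merge every config change with timestamp <= ts, oldest first.

    `changes` must already be sorted by timestamp.
    """
    times = [c.ts for c in changes]
    merged = {}
    for change in changes[:bisect_right(times, ts)]:
        merged.update(change.settings)
    return merged


def stitch(static, changes, log_events):
    """Yield one enriched event per log entry, in time order."""
    changes = sorted(changes, key=lambda c: c.ts)
    for ts, message in sorted(log_events):
        yield {"ts": ts, "message": message, **static, **config_as_of(changes, ts)}


# Hypothetical data for a single device
static = {"serial": "SN-001"}
changes = [ConfigChange(100, {"firmware": "1.0"}),
           ConfigChange(300, {"firmware": "1.1"})]
logs = [(150, "disk warning"), (350, "disk failure"), (50, "boot")]

series = list(stitch(static, changes, logs))
```

With this layout, a question such as "which firmware version was running when the disk failed?" becomes a lookup on a single enriched event rather than a join across three sources.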
Glassbeam technical notes include:
- It’s all built in Scala.
- Like Splunk, Glassbeam revolves around a proprietary search language, which it calls “Semiotic” something-or-other. (I’ve hated that word since the company Semio was overhyped in the late 90s.)
- Technology underneath Glassbeam includes:
  - Solr and Lucene.
  - Not Storm, although it was considered.
- Glassbeam offers ODBC/JDBC interfaces, as well as APIs targeted at direct enterprise application integration.
- “Apps” don’t touch Cassandra directly — except I imagine in the case of drill down — but rather a custom in-memory layer. Therefore, it is not surprising that …
- … Glassbeam expects to add Spark into the mix as well.
- Much as is the case for WibiData, Glassbeam can manage a huge number of columns for a single entity (10s of 1000s were mentioned). Unlike WibiData, however, Glassbeam doesn’t smush all that into a single row.
- Glassbeam uses as little MapReduce as possible, and in particular doesn’t use MapReduce for parsing.
That’s about as much as I have. In particular, our discussion of data management wasn’t thorough, and I don’t know what those in-memory data structures are like nor exactly how schema-on-need is implemented.
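For readers unfamiliar with the general idea, the sketch below shows one generic way schema-on-need can work: raw text is stored untouched, and a field is added to the schema (and backfilled from the raw lines) only when somebody needs it. To be clear, this is my own toy illustration and not a description of Glassbeam's design; the class `SchemaOnNeedStore`, its methods, and the sample log lines are all hypothetical.

```python
import re


class SchemaOnNeedStore:
    """Keeps raw lines as-is; fields are defined and extracted only when needed."""

    def __init__(self):
        self.raw = []        # raw log lines, never modified
        self.schema = {}     # field name -> compiled regex with one capture group
        self.columns = {}    # field name -> extracted values, aligned with self.raw

    def ingest(self, line):
        self.raw.append(line)
        # Keep already-defined fields up to date for newly arriving data.
        for name, pattern in self.schema.items():
            self.columns[name].append(self._extract(pattern, line))

    def add_field(self, name, regex):
        """Extend the schema and backfill the new column from the stored raw lines."""
        pattern = re.compile(regex)
        self.schema[name] = pattern
        self.columns[name] = [self._extract(pattern, line) for line in self.raw]

    @staticmethod
    def _extract(pattern, line):
        match = pattern.search(line)
        return match.group(1) if match else None


# Hypothetical "phone home" lines
store = SchemaOnNeedStore()
store.ingest("day1 fan_rpm=4200 temp_c=41")
store.ingest("day2 fan_rpm=4300")

# The schema grows only when a question requires the field.
store.add_field("temp_c", r"temp_c=(\d+)")
store.ingest("day3 temp_c=45")
```

The design choice worth noticing is that the raw lines remain the system of record, so adding a field later loses nothing; the cost is a backfill pass over stored data each time the schema changes.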