April 17, 2017

Interana

Interana has an interesting story, in technology and business model alike. For starters:

Interana does ad-hoc event series analytics, which they call “interactive behavioral analytics solutions”.
Interana has a full-stack analytic offering, including:
- Its own columnar DBMS …
- … which has a non-SQL DML (Data Manipulation Language) meant to handle event series a lot more fluently than SQL does, but which the user is never expected to learn because …
- … there also are BI-like visual analytics tools that support plenty of drilldown.
Interana sells all this to “product” departments rather than marketing, because marketing doesn’t sufficiently value Interana’s ad-hoc query flexibility.
Interana boasts >40 customers, with annual subscription fees ranging from high 5 figures to low 7 figures.

And to be clear — if we leave aside any questions of marketing-name sizzle, this really is business intelligence. The closest Interana comes to helping with predictive modeling is giving its ad-hoc users inspiration as to where they should focus their modeling attention.

Interana also has an interesting twist in its business model, which I hope can be used successfully by other enterprise software startups as well. Read more

Categories: Business intelligence, Columnar database management, Data models and architecture, Data warehousing, Database compression, Log analysis, Market share and customer counts, Petabyte-scale data management, Pricing, Solid-state memory, Splunk, StreamBase, Vertica Systems, Web analytics

Transitioning to the cloud(s)

There’s a lot of talk these days about transitioning to the cloud, by IT customers and vendors alike. Of course, I have thoughts on the subject, some of which are below.

1. The economies of scale of not running your own data centers are real. That’s the kind of non-core activity almost all enterprises should outsource. Of course, those considerations taken alone argue equally for true cloud, co-location or SaaS (Software as a Service).

2. When the (Amazon) cloud was newer, I used to hear that certain kinds of workloads didn’t map well to the architecture Amazon had chosen. In particular, shared-nothing analytic query processing was necessarily inefficient. But I’m not hearing nearly as much about that any more.

3. Notwithstanding the foregoing, not everybody loves Amazon pricing.

4. Infrastructure vendors such as Oracle would like to also offer their infrastructure to you in the cloud. As per the above, that could work. However:

Is all your computing on Oracle’s infrastructure? Probably not.
Do you want to move the Oracle part and the non-Oracle part to different clouds? Ideally, no.
Do you like the idea of being even more locked in to Oracle than you are now? [Insert BDSM joke here.]
Will Oracle do so much better of a job hosting its own infrastructure that you use its cloud anyway? Well, that’s an interesting question.

Actually, if we replace “Oracle” by “Microsoft”, the whole idea sounds better. While Microsoft doesn’t have a proprietary server hardware story like Oracle’s, many folks are content in the Microsoft walled garden. IBM has fiercely loyal customers as well, and so may a couple of Japanese computer manufacturers.

5. Even when running stuff in the cloud is otherwise a bad idea, there’s still: Read more

Categories: Amazon and its cloud, Cloud computing, Emulation, transparency, portability, IBM and DB2, Microsoft and SQL*Server, Oracle, Pricing

6 Comments

October 26, 2015

Differentiation in business intelligence

Parts of the business intelligence differentiation story resemble the one I just posted for data management. After all:

Both kinds of products query and aggregate data.
Both are offered by big “enterprise standard” behemoth companies and also by younger, nimbler specialists.
You really, really, really don’t want your customer data to leak via a security breach in either kind of product.

That said, insofar as BI’s competitive issues resemble those of DBMS, they are those of DBMS-lite. For example:

BI is less mission-critical than some other database uses.
BI has done a lot less than DBMS to deal with multi-structured data.
Scalability demands on BI are less than those on DBMS — indeed, they’re the ones that are left over after the DBMS has done its data crunching first.

And full-stack analytic systems — perhaps delivered via SaaS (Software as a Service) — can moot the BI/data management distinction anyway.

Of course, there are major differences between how DBMS and BI are differentiated. The biggest are in user experience. I’d say: Read more

Categories: Business intelligence, Buying processes, ClearStory Data, Data mart outsourcing, Pricing, QlikTech and QlikView, Rocana, Tableau Software

Differentiation in data management

In the previous post I broke product differentiation into 6-8 overlapping categories, which may be abbreviated as:

Scope
Accuracy
(Other) trustworthiness
Speed
User experience
Cost

and sometimes also issues in adoption and administration.

Now let’s use this framework to examine two market categories I cover: data management and, in a separate post, business intelligence.

Applying this taxonomy to data management:
Read more

Categories: Buying processes, Clustering, Data warehousing, Database diversity, Microsoft and SQL*Server, Predictive modeling and advanced analytics, Pricing

2 Comments

October 26, 2015

Sources of differentiation

Obviously, a large fraction of what I write about involves technical differentiation. So let’s try for a framework where differentiation claims can be placed in context. This post will get through the generalities. The sequels will apply them to specific cases.

Many buying and design considerations for IT fall into six interrelated areas: Read more

Categories: Buying processes, Predictive modeling and advanced analytics, Pricing, Text

1 Comment

October 15, 2015

Basho and Riak

Basho was on my (very short) blacklist of companies with whom I refuse to speak, because they have lied about the contents of previous conversations. But Tony Falco et al. are long gone from the company. So when Basho’s new management team reached out, I took the meeting.

For starters:

Basho management turned over significantly 1-2 years ago. The main survivors from the old team are 1 each in engineering, sales, and services.
Basho moved its headquarters to Bellevue, WA. (You get one guess as to where the new CEO lives.) Engineering operations are very distributed geographically.
Basho claims that it is much better at timely product shipments than it used to be. Its newest product has a planned (or at least hoped-for) 8-week cadence for point releases.
Basho’s revenue is ~90% subscription.
Basho claims >200 enterprise clients, vs. 100-120 when new management came in. Unfortunately, I forgot to ask the usual questions about divisions vs. whole organizations, OEM sell-through vs. direct, etc.
Basho claims an average contract value of >$100K, typically over 2-3 years. $9 million of that (which would be close to half the total, actually) comes from 2 particular deals of >$4 million each.
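
The "close to half the total" remark can be sanity-checked with a rough calculation. The sketch below treats each ">" lower bound quoted above as a point estimate, which is an assumption; the real figures are presumably somewhat higher. The variable names are just illustrative.

```python
# Back-of-envelope check of the Basho figures quoted above.
# Assumption: each ">" lower bound is treated as an exact value.
clients = 200                 # ">200 enterprise clients"
avg_contract_value = 100_000  # ">$100K" average contract value, in dollars
big_deals_total = 9_000_000   # "$9 million" from the 2 largest deals

# Implied total contract value across all clients
total_contract_value = clients * avg_contract_value

# Fraction of that total attributable to the 2 big deals
big_deal_share = big_deals_total / total_contract_value

# Average contract value for everybody except the 2 big deals
rest_avg = (total_contract_value - big_deals_total) / (clients - 2)

print(f"Implied total contract value: ${total_contract_value:,}")
print(f"Share from the 2 big deals: {big_deal_share:.0%}")
print(f"Average for the other {clients - 2} clients: ${rest_avg:,.0f}")
```

Under those assumptions the two large deals account for a bit under half of the implied total, and the average for the remaining clients drops well below the headline figure.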

Basho’s product line has gotten a bit confusing, but as best I understand things the story is:

There’s something called Riak Core, which isn’t even a revenue-generating product. However, it’s an open source project with some big users (e.g. Goldman Sachs, Visa), and included in pretty much everything else Basho promotes.
Riak KV is the key-value store previously known as Riak. It generates the lion’s share of Basho’s revenue.
Riak S2 is an emulation of Amazon S3. Basho thinks that Riak KV loses efficiency when objects get bigger than 1 MB or so, and that’s when you might want to use Riak S2 in addition or instead.
Riak TS is for time series, and just coming out now.
Also in the mix are some (extra charge) connectors for Redis and Spark. Presumably, there are more of these to come.
There’s an umbrella marketing term of “Basho Data Platform”.

Technical notes on some of that include: Read more

Categories: Aerospike, Basho and Riak, Cassandra, Clustering, Couchbase, Databricks, Spark and BDAS, DataStax, HBase, Health care, Log analysis, MapR, Market share and customer counts, MongoDB, NoSQL, Pricing, Specific users, Splunk

1 Comment

September 17, 2015

Rocana’s world

For starters:

My client Rocana is the renamed ScalingData, where Rocana is meant to signify ROot Cause ANAlysis.
Rocana was founded by Omer Trajman, who I’ve referenced numerous times in the past, and who I gather is a former boss of …
… cofounder Eric Sammer.
Rocana recently told me it had 35 people.
Rocana has a very small number of quite large customers.

Rocana portrays itself as offering next-generation IT operations monitoring software. As you might expect, this has two main use cases:

Actual operations — figuring out exactly what isn’t working, ASAP.
Security.

Rocana’s differentiation claims boil down to fast and accurate anomaly detection on large amounts of log data, including but not limited to:

The sort of network data you’d generally think of — “everything” except packet-inspection stuff.
Firewall output.
Database server logs.
Point-of-sale data (at a retailer).
“Application data”, whatever that means. (Edit: See Tom Yates’ clarifying comment below.)

Categories: Business intelligence, Hadoop, Kafka and Confluent, Log analysis, Market share and customer counts, Petabyte-scale data management, Predictive modeling and advanced analytics, Pricing, Rocana, Splunk, Web analytics

1 Comment

June 10, 2015

Hadoop generalities

Occasionally I talk with an astute reporter — there are still a few left 🙂 — and get led toward angles I hadn’t considered before, or at least hadn’t written up. A blog post may then ensue. This is one such post.

There is a group of questions going around that includes:

Is Hadoop overhyped?
Has Hadoop adoption stalled?
Is Hadoop adoption being delayed by skills shortages?
What is Hadoop really good for anyway?
Which adoption curves for previous technologies are the best analogies for Hadoop?

To a first approximation, my responses are: Read more

Categories: Application areas, Data warehousing, Databricks, Spark and BDAS, EAI, EII, ETL, ELT, ETLT, Hadoop, Hortonworks, MapR, MapReduce, Market share and customer counts, Open source, Pricing

6 Comments

May 20, 2015

MemSQL 4.0

I talked with my clients at MemSQL about the release of MemSQL 4.0. Let’s start with the reminders:

MemSQL started out as an in-memory OLTP (OnLine Transaction Processing) DBMS …
… but quickly positioned with “We also do ‘real-time’ analytic processing” …
… and backed that up by adding a flash-based column store option …
… before Gartner ever got around to popularizing the term HTAP (Hybrid Transaction and Analytic Processing).
There’s also a JSON option.

The main new aspects of MemSQL 4.0 are:

Geospatial indexing. This is for me the most interesting part.
A new optimizer and, I suppose, query planner …
… which in particular allow for serious distributed joins.
Some rather parallel-sounding connectors to Spark, Hadoop and Amazon S3.
Usual-suspect stuff including:
- More SQL coverage (I forgot to ask for details).
- Some added or enhanced administrative/tuning/whatever tools (again, I forgot to ask for details).
- Surely some general Bottleneck Whack-A-Mole.

There’s also a new free MemSQL “Community Edition”. MemSQL hopes you’ll experiment with this but not use it in production. And MemSQL pricing is now wholly based on RAM usage, so the column store is quasi-free from a licensing standpoint as well.

Categories: Amazon and its cloud, Columnar database management, Databricks, Spark and BDAS, GIS and geospatial, Hadoop, Investment research and trading, Kafka and Confluent, Market share and customer counts, MemSQL, NewSQL, Pricing, Structured documents

10 Comments

February 18, 2015

Greenplum is being open sourced

While I don’t find the Open Data Platform thing very significant, an associated piece of news seems cooler — Pivotal is open sourcing a bunch of software, with Greenplum as the crown jewel. Notes on that start:

Greenplum has been an on-again/off-again low-cost player since before its acquisition by EMC, but open source is basically a commitment to having low license cost be permanently on.
In most regards, “free like beer” is what’s important here, not “free like speech”. I doubt non-Pivotal employees are going to do much hacking on the long-closed Greenplum code base.
That said, Greenplum forked PostgreSQL a long time ago, and the general PostgreSQL community might gain ideas from some of the work Greenplum has done.
The only other bit of newly open-sourced stuff I find interesting is HAWQ. Redis was already open source, and I’ve never been persuaded to care about GemFire.

Greenplum, let us recall, is a pretty decent MPP (Massively Parallel Processing) analytic RDBMS. Various aspects of it were oversold at various times, and I’ve never heard that they actually licked concurrency. But Greenplum has long had good SQL coverage and petabyte-scale deployments and a columnar option and some in-database analytics and so on; i.e., it’s legit. When somebody asks me about open source analytic RDBMS to consider, I expect Greenplum to consistently be on the short list.

Further, the low-cost alternatives for analytic RDBMS are adding up. Read more

Categories: Amazon and its cloud, Citus Data, Data warehouse appliances, EAI, EII, ETL, ELT, ETLT, EMC, Greenplum, Hadoop, Infobright, MonetDB, Open source, Pricing

6 Comments

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.
