August 21, 2016

More about Databricks and Spark

Databricks CEO Ali Ghodsi checked in because he disagreed with part of my recent post about Databricks. Ali’s take on Databricks’ position in the Spark world includes:

Ali also walked me through customer use cases and adoption in wonderful detail. In general:

The story on those sectors, per Ali, is:  Read more

August 7, 2016

Notes on DataStax and Cassandra

I visited DataStax on my recent trip. That was a tipping point leading to my recent discussions of NoSQL DBAs and misplaced fear of vendor lock-in. But of course I also learned some things about DataStax and Cassandra themselves.

On the customer side:

Customers in large numbers want cloud capabilities, as a potential future if not a current need.

One customer example was a large retailer, who in the past was awful at providing accurate inventory information online, but now uses Cassandra for that. DataStax brags that its queries come back in 20 milliseconds, but that strikes me as a bit beside the point; what really matters is that data accuracy has gone from “batch” to some version of real-time. Also, Microsoft is a DataStax customer, using Cassandra (and Spark) for the Office 365 backend, or at least for the associated analytics.

Per Patrick McFadin, the four biggest things in DataStax Enterprise 5 are: Read more

July 19, 2016

Notes on vendor lock-in

Vendor lock-in is an important subject. Everybody knows that. But few of us realize just how complicated the subject is, nor how riddled it is with paradoxes. Truth be told, I wasn’t fully aware either. But when I set out to write this post, I found that it just kept growing longer.

1. The most basic form of lock-in is:

2. Enterprise vendor standardization is closely associated with lock-in. The core idea is that you have a mandate or strong bias toward having different apps run over the same platforms, because:

3. That last point is double-edged; you have more power over suppliers to whom you give more business, but they also have more power over you. The upshot is often an ELA (Enterprise License Agreement), which commonly works:

Read more

July 19, 2016

Notes from a long trip, July 19, 2016

For starters:

A running list of recent posts is:

Subjects I’d like to add to that list include:

Read more

January 25, 2016

Kafka and Confluent

For starters:

At its core Kafka is very simple:

So it seems fair to say:

Read more

December 1, 2015

Machine learning’s connection to (the rest of) AI

This is part of a four post series spanning two blogs.

1. I think the technical essence of AI is usually:

Of course, a lot of non-AI software can be described the same way.

To check my claim, please consider:

To see why it’s true from a bottom-up standpoint, please consider the next two points.

2. It is my opinion that most things called “intelligence” — natural and artificial alike — have a great deal to do with pattern recognition and response. Examples of what I mean include:  Read more

October 15, 2015

Cassandra and privacy requirements

For starters:

But when I made that connection and checked in accordingly with my client Patrick McFadin at DataStax, I discovered that I’d been a little confused about how multi-data-center Cassandra works. The basic idea holds water, but the details are not quite what I was envisioning.

The story starts:

In particular, a remote replication factor for Cassandra can = 0. When that happens, then you have data sitting in one geographical location that is absent from another geographical location; i.e., you can be in compliance with laws forbidding the export of certain data. To be clear (and this contradicts what I previously believed and hence also implied in this blog):

Read more

October 15, 2015

Basho and Riak

Basho was on my (very short) blacklist of companies with whom I refuse to speak, because they have lied about the contents of previous conversations. But Tony Falco et al. are long gone from the company. So when Basho’s new management team reached out, I took the meeting.

For starters:

Basho’s product line has gotten a bit confusing, but as best I understand things the story is:

Technical notes on some of that include:  Read more

September 14, 2015

DataStax and Cassandra update

MongoDB isn’t the only company I reached out to recently for an update. Another is DataStax. I chatted mainly with Patrick McFadin, somebody with whom I’ve had strong consulting relationships at a user and vendor both. But Rachel Pedreschi contributed the marvelous phrase “twinkling dashboard”.

It seems fair to say that in most cases:

Those generalities, in my opinion, make good technical sense. Even so, there are some edge cases or counterexamples, such as:

*And so a gas company is doing lightweight analysis on boiler temperatures, which it regards as hot data. :)

While most of the specifics are different, I’d say similar things about MongoDB, Cassandra, or any other NoSQL DBMS that comes to mind: Read more

June 8, 2015

Teradata will support Presto

At the highest level:

Now let’s make that all a little more precise.

Regarding Presto (and I got most of this from Teradata)::

Daniel Abadi said that Presto satisfies what he sees as some core architectural requirements for a modern parallel analytic RDBMS project:  Read more

Next Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:


Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.