April 30, 2014

The Intel investment in Cloudera

Intel recently made a huge investment in Cloudera, stated facts about which start:

Give or take stock preferences, etc., that’s around a $4.1 billion valuation post-money, but Cloudera does say it now has “most of $1 billion” in the bank.

Cloudera further told me when I visited last Friday that the majority of the Intel investment is net new money. (I presume that the rest of the round is net-new as well.) Hence, I conclude that previous investors sold in the aggregate less than 10% of total holdings to Intel. While I’m pretty sure Mike Olson is buying himself a couple of nice toys, in most respects it’s business-as-usual at Cloudera, with the same investors, directors and managers they had before. By way of contrast, many of the “cashing-out” rumors going around are OBVIOUSLY absurd, unless you think Intel acquired a much larger fraction of Cloudera than it actually did.

That said, Intel spent a lot of money, and in connection with the investment there’s a tight Cloudera/Intel partnership. In particular, Read more

April 30, 2014

Cloudera, Impala, data warehousing and Hive

There’s much confusion about Cloudera’s SQL plans and beliefs, and the company has mainly itself to blame. That said, here’s what I think is going on.

And of course, as vendors so often do, Cloudera generally overrates both the relative maturity of Impala and the relative importance of the use cases in which its offerings – Impala or otherwise – shine.

Related links

April 30, 2014

Spark on fire

Spark is on the rise, to an even greater degree than I thought last month.

*Yes, my fingerprints are showing again.

The most official description of what Spark now contains is probably the “Spark ecosystem” diagram from Databricks. However, at the time of this writing it is slightly out of date, as per some email from Databricks CEO Ion Stoica (quoted with permission):

… but if I were to redraw it, SparkSQL will replace Shark, and Shark will eventually become a thin layer above SparkSQL and below BlinkDB.

With this change, all the modules on top of Spark (i.e., SparkStreaming, SparkSQL, GraphX, and MLlib) are part of the Spark distribution. You can think of these modules as libraries that come with Spark.

Read more

April 19, 2014

Necessary complexity

When I’m asked to talk to academics, the requested subject is usually a version of “What should we know about what’s happening in the actual market/real world?” I then try to figure out what the scholars could stand to hear that they perhaps don’t already know.

In the current case (Berkeley next Tuesday), I’m using the title “Necessary complexity”. I actually mean three different but related things by that, namely:

  1. No matter how cool an improvement you have in some particular area of technology, it’s not very useful until you add a whole bunch of me-too features and capabilities as well.
  2. Even beyond that, however, the simple(r) stuff has already been built. Most new opportunities are in the creation of complex integrated stacks, in part because …
  3. … users are doing ever more complex things.

While everybody on some level already knows all this, I think it bears calling out even so.

I previously encapsulated the first point in the cardinal rules of DBMS development:

Rule 1: Developing a good DBMS requires 5-7 years and tens of millions of dollars.

That’s if things go extremely well.

Rule 2: You aren’t an exception to Rule 1. 

In particular:

  • Concurrent workloads benchmarked in the lab are poor predictors of concurrent performance in real life.
  • Mixed workload management is harder than you’re assuming it is.
  • Those minor edge cases in which your Version 1 product works poorly aren’t minor after all.

My recent post about MongoDB is just one example of same.

Examples of the second point include but are hardly limited to: Read more

April 17, 2014

MongoDB is growing up

I caught up with my clients at MongoDB to discuss the recent MongoDB 2.6, along with some new statements of direction. The biggest takeaway is that the MongoDB product, along with the associated MMS (MongoDB Management Service), is growing up. Aspects include:

Read more

April 16, 2014

The worst database developers in the world?

If the makers of MMO RPGs (Massive Multi-Player Online Role-Playing Games) aren’t quite the worst database application developers in the world, they’re at least on the short list for consideration. The makers of Guild Wars didn’t even try to have decent database functionality. A decade later, when they introduced Guild Wars 2, the database-oriented functionality (auction house, real-money store, etc.) would crash for days at a time. Lord of the Rings Online evidently had multiple issues with database functionality. Now I’m playing Elder Scrolls Online, which on the whole is a great game, but which may have the most database screw-ups of all.

ESO has been live for less than 3 weeks, and in that time:

1. There’s been a major bug in which players’ “banks” shrank, losing items and so on. Days later, the data still hasn’t been recovered. After a patch, the problem if anything worsened.

2. Guild functionality has at times been taken down while the rest of the game functioned.

3. Those problems aside, bank and guild bank functionality are broken, via what might be considered performance bugs. Problems I repeatedly encounter include:

In general, it seems like that what should be a collection of database records is really just a list, parsed each time an update occurs, periodically flushed in its entirety to disk, with all the performance problems you’d expect from that kind of choice.

Read more

March 28, 2014

NoSQL vs. NewSQL vs. traditional RDBMS

I frequently am asked questions that boil down to:

The details vary with context — e.g. sometimes MySQL is a traditional RDBMS and sometimes it is a new kid — but the general class of questions keeps coming. And that’s just for short-request use cases; similar questions for analytic systems arise even more often.

My general answers start:

In particular, migration away from legacy DBMS raises many issues:  Read more

March 23, 2014

DBMS2 revisited

The name of this blog comes from an August, 2005 column. 8 1/2 years later, that analysis holds up pretty well. Indeed, I’d keep the first two precepts exactly as I proposed back then:

I’d also keep the general sense of the third precept, namely appropriately-capable data integration, but for that one the specifics do need some serious rework.

For starters, let me say: Read more

March 23, 2014

Wants vs. needs

In 1981, Gerry Chichester and Vaughan Merlyn did a user-survey-based report about transaction-oriented fourth-generation languages, the leading application development technology of their day. The report included top-ten lists of important features during the buying cycle and after implementation. The items on each list were very similar — but the order of the items was completely different. And so the report highlighted what I regard as an eternal truth of the enterprise software industry:

What users value in the product-buying process is quite different from what they value once a product is (being) put into use.

Here are some thoughts about how that comes into play today.

Wants outrunning needs

1. For decades, BI tools have been sold in large part via demos of snazzy features the CEO would like to have on his desk. First it was pretty colors; then it was maps; now sometimes it’s “real-time” changing displays. Other BI features, however, are likely to be more important in practice.

2. In general, the need for “real-time” BI data freshness is often exaggerated. If you’re a human being doing a job that’s also often automated at high speed — for example network monitoring or stock trading — there’s a good chance you need fully human real-time BI. Otherwise, how much does a 5-15 minute delay hurt? Even if you’re monitoring website sell-through — are your business volumes really high enough that 5 minutes matters much? eBay answered “yes” to that question many years ago, but few of us work for businesses anywhere near eBay’s scale.

Even so, the want for speed keeps growing stronger. 🙂

3. Similarly, some desires for elastic scale-out are excessive. Your website selling koi pond accessories should always run well on a single server. If you diversify your business to the point that that’s not true, you’ll probably rewrite your app by then as well.

4. Some developers want to play with cool new tools. That doesn’t mean those tools are the best choice for the job. In particular, boring old SQL has merits — such as joins! — that shiny NoSQL hasn’t yet replicated.

5. Some developers, on the other hand, want to keep using their old tools, on which they are their employers’ greatest experts. That doesn’t mean those tools are the best choice for the job either.

6. More generally, some enterprises insist on brand labels that add little value but lots of expense. Yes, there are many benefits to vendor consolidation, and you may avoid many headaches if you stick with not-so-cutting-edge technology. But “enterprise-grade” hardware failure rates may not differ enough from “consumer-grade” ones to be worth paying for.

Read more

March 18, 2014

Real-time analytics for everybody, uniquely from us!!

In my latest post, I noted that

The “real-time analytics” gold rush I called out last year continues.

I also recently mocked the slogan

Analytics for everybody!

So when I saw today an email subject line

[Vendor X] to announce real-time analytics for everyone …

I laughed. Indeed, I snorted so loudly that Linda — who was on a different floor of our house — called to check that I was OK. 🙂

As the day progressed, I had a consulting call with a client, and what did I see on the first substantive slide? There were references to:

broader audience

and

real-time data analysis

The trends — real or imaginary — are melting into each other!

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.