November 29, 2010

MarkLogic and its document DBMS

This post has been long in the writing for several reasons, the biggest being that I stopped working for almost a month due to family issues. Please forgive its particularly choppy writing style; having waited this long already, I now lack the patience to further clean it up.

MarkLogic:

Is an ACID-compliant, document-oriented, non-SQL, XML-based scale-out DBMS vendor of non-trivial size and momentum.
Still has the same technical approach I previously described.
Recently posted an internally-written white paper with a lot of technical detail.
Recently had a point release — MarkLogic 4.2 — a lot of which seems to be “Oh, you didn’t have that before?” kinds of stuff.
Has given me permission to post most of the slides from same, the first few of which give a nice overview of the MarkLogic story.
Claims 200+ each of customers and employees (that’s from a slide MarkLogic did ask me to remove from the deck).
Is a client again.
Not coincidentally, is interested in branching out past the vertical markets of media and government/intelligence, in particular to the financial services market.
Has finally rationalized its company and product names so that both are now “MarkLogic.” 🙂
Has finally grasped that if it is proud of its ACID-compliance it probably shouldn’t be trying to market itself as “NoSQL”. 🙂

Categories: Investment research and trading, MarkLogic, Structured documents

7 Comments

October 25, 2010

Teradata announcements made very simple

For reasons of health,* I very regretfully canceled my trip to what is the first conference to go on my schedule every year — Teradata Partners. From afar, I’m not plugged into the details of Teradata’s announcement/embargo schedule. But what you need to know starts with this:

Teradata signaled a year ago that its software focus was on adding analytic functionality, including specifically in the temporal area.
Teradata likes to refresh its hardware annually, with a 50%+ price/performance improvement. (This year Teradata is going to 6-core Xeon processors.)

*Just a cough, but I’m both exhausted and potentially contagious, and this wasn’t a trip on which I had any truly urgent obligations (speeches, packed-room consulting sessions, whatever).

Categories: Analytic technologies, Data warehousing, Teradata

Notes and links October 22, 2010

A number of recent posts have had good comments. This time, I won’t call them out individually.

Evidently Mike Olson of Cloudera is still telling the machine-generated data story, exactly as he should be. The Information Arbitrage/IA Ventures folks said something similar, focusing specifically on “sensor data” …

… and, even better, went on to say: Read more

Categories: Analytic technologies, Aster Data, Cloudera, eBay, Greenplum, Hadoop, IBM and DB2, In-memory DBMS, Market share and customer counts, Netezza, Open source, Oracle, ParAccel, Petabyte-scale data management, SAS Institute, Surveillance and privacy, Teradata, VoltDB and H-Store

1 Comment

October 19, 2010

Introduction to Kaminario

At its core, the Kaminario story is simple:

Throw out your disks and replace them with, not Flash, but actual DRAM.
Your IOPS (Input/Output Operations Per Second) are so high* that you get the performance you need without any further system changes.
The whole thing is very fast to set up.

In other words, Kaminario pitches a value proposition something like (my words, not theirs) “A shortcut around your performance bottlenecks.”

*1 million or so on the smallest Kaminario K2 appliance.

Kaminario asserts that both analytics and OLTP (OnLine Transaction Processing) are represented in its user base. Even so, the use cases Kaminario mentioned seemed to be concentrated on the analytic side. I suspect there are two main reasons:

As Kaminario points out, OLTP apps commonly are designed to perform in the face of regrettable I/O wait.
Also, analytic performance problems tend to arise more suddenly than OLTP ones do.*

*Somebody can think up a new analytic query overnight that takes 10 times the processing of anything they’ve ever run before. Or they can get the urge to run the same queries 10 times as often as before. Both of those kinds of things happen less often in the OLTP world.

Accordingly, Kaminario likes to sell against the alternative of getting a better analytic DBMS, stressing that you can get a Kaminario K2 appliance into production a lot faster than you can move your processing to even the simplest data warehouse appliance. Kaminario is probably technically correct in saying that; even so, I suspect it would often make more sense to view Kaminario K2 appliances as a transition technology, by which I mean:

You have an annoying performance problem.
Kaminario K2 could solve it very quickly.
That buys you time for a more substantive fix.*
If you want, you can redeploy your Kaminario K2 storage to solve your next-worst performance bottleneck.

On that basis, I could see Kaminario-like devices eventually getting to the point that every sufficiently large enterprise should have some of them, whether or not that enterprise has an application it believes should run permanently against DRAM block storage. Read more

Categories: Investment research and trading, Kaminario, Solid-state memory, Storage, Telecommunications, Web analytics

7 Comments

October 18, 2010

More notes on Membase and memcached

As a companion to my post about Membase last week, the company has graciously allowed me to post a rather detailed Membase slide deck. (It even has pricing.) Also, I left one point out.

Membase announced a Cloudera partnership. I couldn’t detect anything technically exciting about that, but it serves to highlight what I do find to be an interesting usage trend. A couple of big Web players (AOL and ShareThis) are using Hadoop to crunch data and derive customer profile data, then feed that back into Membase. Why Membase? Because it can serve up the profile in a millisecond, as part of a bigger 40-millisecond-latency request.

And why Hadoop, rather than Aster Data nCluster, which ShareThis also uses? Umm, I didn’t ask.

When I mentioned this to Colin Mahony, he said Vertica had similar stories. However, I don’t recall whether they were about Membase or just memcached, and he hasn’t had a chance to get back to me with clarification. (Edit: As per Colin’s comment below, it’s both.)
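
To make the Hadoop-to-Membase pattern above a bit more concrete, here is a minimal sketch of its two halves: a batch step that boils raw events down to per-user profiles, and a serving step that does a single key lookup inside a larger request budget. This is purely illustrative and uses no real Membase or Hadoop API. The plain dict stands in for the key-value store, and `build_profiles`, `handle_request`, and the event format are hypothetical names I made up for the example. Only the 40-millisecond budget comes from the text.

```python
import time
from collections import Counter, defaultdict

def build_profiles(events):
    """Batch step (stand-in for a Hadoop job): reduce raw (user, topic)
    events to each user's most frequent topics."""
    counts = defaultdict(Counter)
    for user_id, topic in events:
        counts[user_id][topic] += 1
    # Keep the top two topics per user as that user's profile.
    return {user: [t for t, _ in c.most_common(2)] for user, c in counts.items()}

def handle_request(user_id, store, budget_ms=40.0):
    """Serving step: one key lookup, timed against the request budget."""
    start = time.perf_counter()
    profile = store.get(user_id, [])  # single key-value read; empty if unknown
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {
        "user": user_id,
        "interests": profile,
        "within_budget": elapsed_ms <= budget_ms,
    }

# Hypothetical sample data.
events = [("u1", "sports"), ("u1", "sports"), ("u1", "news"), ("u2", "music")]
profile_store = dict(build_profiles(events))  # stand-in for the cache
response = handle_request("u1", profile_store)
```

The point of the split is that the expensive work happens offline, so the online path is nothing more than a read by key.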

Categories: Aster Data, Cache, Cloudera, Couchbase, Hadoop, memcached, Memory-centric data management, NoSQL, Pricing, Specific users, Vertica Systems, Web analytics

7 Comments

October 17, 2010

Where ParAccel is at

Until recently, I was extremely critical of ParAccel’s marketing. But there was an almost-clean sweep of the relevant ParAccel executives, and the specific worst practices I was calling out have for the most part been eliminated. So I was open to talking and working with ParAccel again, and that’s now happening. On my recent California trip, I chatted with three ParAccel folks for a few hours. Based on that and other conversation, here’s the current ParAccel story as I understand it.
Read more

Categories: Benchmarks and POCs, Columnar database management, Database compression, Investment research and trading, Memory-centric data management, ParAccel, Solid-state memory, Storage, Vertica Systems

10 Comments

October 15, 2010

Notes on data warehouse appliance prices

I’m not terribly motivated to do a detailed analysis of data warehouse appliance list prices, in part because:

Everybody knows that in practice data warehouse appliances tend to be deeply discounted from list price.
The only realistic metric to use for pricing data warehouse appliances is price-per-terabyte, and people have gotten pretty sick of that one.

That said, here are some notes on data warehouse appliance prices. Read more
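
For reference, the price-per-terabyte arithmetic can be sketched as below. This is my own rough formulation, not any vendor's pricing method: the function name, its parameters, and every figure in the example are made up for illustration, and it assumes compression and discounting are the only adjustments worth modeling.

```python
def price_per_terabyte(list_price, raw_tb, compression_ratio=1.0, discount=0.0):
    """Effective price per terabyte of user data.

    compression_ratio: user data size / stored size (1.0 = no compression).
    discount: fraction off list price, at least 0 and below 1.
    """
    if raw_tb <= 0 or compression_ratio <= 0:
        raise ValueError("capacity and compression ratio must be positive")
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    street_price = list_price * (1.0 - discount)
    return street_price / (raw_tb * compression_ratio)

# Made-up figures, for illustration only.
at_list = price_per_terabyte(1_000_000, raw_tb=50)
discounted = price_per_terabyte(
    1_000_000, raw_tb=50, compression_ratio=2.0, discount=0.5
)
```

Even in this toy form, the same list price yields very different per-terabyte numbers depending on the discount and compression assumed, which is part of why the metric is hard to compare across vendors.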

Categories: Data warehouse appliances, Data warehousing, Database compression, EMC, Exadata, Greenplum, Netezza, Oracle, Pricing

8 Comments

October 13, 2010

Notes on the EMC Greenplum Data Computing Appliance

The big confidential part of my visit last week to EMC’s Data Computing Division, née Greenplum, was of course this week’s announcement of the first EMC/Greenplum “Data Computing Appliance.” Basics include: Read more

Categories: Analytic technologies, Data warehousing, EMC, Exadata, Greenplum, Oracle, Parallelization, Storage

1 Comment

October 12, 2010

Vertica-Hadoop integration

DBMS/Hadoop integration is a confusing subject. My post on the Cloudera/Aster Data partnership awaits some clarification in the comment thread. A conversation with Vertica left me unsure about some Hadoop/Vertica Year 2 details as well, although I’m doing better after a follow-up call. On the plus side, we also covered some rather cool Hadoop/Vertica product futures, and those seemed easier to understand. 🙂

I say “Year 2” because Hadoop/Vertica integration has been going on since last year. Indeed, Vertica says that there are now over 25 users of the Hadoop/Vertica combination and hence Vertica’s Hadoop connector. Vertica is now introducing — for immediate GA — a new version of its Hadoop connector. So far as I understood: Read more

Categories: Analytic technologies, Cloudera, EAI, EII, ETL, ELT, ETLT, Hadoop, MapReduce, Market share and customer counts, SQL/Hadoop integration, Text, Vertica Systems

6 Comments

October 11, 2010

Membase simplifies name, goes GA

The company Northscale that makes the product Membase is now the company Membase that makes the product Membase. Good. Also, the product Membase has now gone GA.

I wrote back in August about Membase, and that covers most of what I think, with perhaps a couple of exceptions: Read more

Categories: Basho and Riak, Cache, Couchbase, memcached, Memory-centric data management, NoSQL

4 Comments

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.
