March 20, 2009

The CEP guys are getting a bit chippy

In a thread responding to my post Independent CEP vendors continue to flounder, Paul Vincent wrote:

I’m not aware of anyone claiming “CEP is an alternative to a relational RDBMS” – except maybe as an application platform for processing events (where RDBMS could be seen as a square hole for an event round peg).

Huh?

Actually, it’s hard to think of an application for off-the-shelf CEP where the alternative technologies aren’t:

  1. Custom CEP
  2. Just write it to an RDBMS and query it

What’s more, except where super-low latency is needed, #2 is apt to be the primary alternative. Read more
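To make alternative #2 concrete, here's a minimal sketch of the "just write it to an RDBMS and query it" approach: append events to a table as they arrive, then run periodic windowed queries against it. The table layout, source names, and time window are all invented for illustration.

```python
import sqlite3

# Hypothetical events table standing in for an RDBMS-based alternative to CEP:
# events are appended as they arrive, and a periodic query computes
# per-source counts over a recent time window.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, source TEXT, value REAL)")

sample = [(100, "sensor_a", 1.5), (101, "sensor_b", 2.0),
          (105, "sensor_a", 3.0), (160, "sensor_a", 0.5)]
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", sample)

# "Query it": events per source in the window [100, 110)
rows = conn.execute(
    "SELECT source, COUNT(*) FROM events "
    "WHERE ts >= 100 AND ts < 110 GROUP BY source ORDER BY source"
).fetchall()
print(rows)  # [('sensor_a', 2), ('sensor_b', 1)]
```

The point is that for many workloads this plain insert-then-query loop is good enough; dedicated CEP engines earn their keep mainly when latency requirements make periodic polling untenable.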

March 20, 2009

More on Greenplum, Fox/MySpace, and load speeds

Eric Lai offers more facts, figures, explanation, and competitive insight than I did on Greenplum’s loading of the Fox/MySpace database, including that Greenplum is being loaded with data at the 4 TB/hour rate only for half an hour at a time.

Also, Eric cites the Greenplum Fox Interactive Media database as being only 200 TB in size.  Surely there is some confusion somewhere, since Greenplum described it as being 400 TB back in August.

March 20, 2009

Notes from the Oracle conference call

Chris Kanaracus reports two tidbits from the Oracle conference call:

Seeking Alpha, as usual, has a full transcript, some typos aside.  There were plenty of comments on other sales, just not Exadata ones. On the other hand, Oracle execs did repeat several times how wonderful they think Exadata is.

One question about the transcript — it sort of reads like there was a big text-oriented deal at Bank of America, but there’s clearly a typo in the reference.  Does anybody who actually listened to the call know for sure whether that’s what was said? (Edit: Answered in the comments below.)

March 20, 2009

Greenplum claims very fast load speeds, and Fox still throws away most of its MySpace data

Data warehouse load speeds are a contentious issue.  Vertica contrived a benchmark with a 5 1/2 terabyte/hour load rate.  Oracle has gotten dinged for very low load speeds, which then are hotly debated.  I was told recently of a Greenplum partner’s salesman steering a prospect who needed rapid load speeds away from Greenplum, which seemed odd to me.

Now Greenplum has come out swinging, claiming “consistent” load speeds of 4 terabytes/hour at its Fox Interactive Media account, and armed with a customer quote saying just that.  Note however that load speeds tend to be proportional to the number of disks, and there are a LOT of disks at that installation.

One way to think about load speeds is — how long would it take to load the entire database? It seems as if the Fox database could be loaded, perhaps not in one week, but certainly in less than two. Flipping that around, the Fox site only has enough capacity to hold less than 2 weeks of detailed data. (This is not uncommon in network event kinds of databases.) And a corollary of that is — worldwide storage sales are still constrained by cost, not by absolute limits on the amounts of data enterprises would like to store.
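The back-of-envelope arithmetic behind that estimate runs as follows, using the 400 TB figure and the claimed 4 TB/hour rate; the duty-cycle adjustment at the end is purely an illustrative assumption.

```python
# Back-of-envelope: time to load the full Fox database at the claimed rate.
size_tb = 400          # database size (the August figure)
rate_tb_per_hour = 4   # Greenplum's claimed load rate

hours_continuous = size_tb / rate_tb_per_hour
print(hours_continuous / 24)  # ~4.2 days if loading ran nonstop

# But the 4 TB/hour rate is reportedly sustained only in half-hour bursts.
# If loads effectively run, say, a third of the time (an invented duty
# cycle, for illustration only), elapsed calendar time roughly triples:
print(hours_continuous / 24 * 3)  # ~12.5 days -- under two weeks
```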

March 18, 2009

Database implications if IBM acquires Sun

Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion today (some links below). Here are some quick thoughts around the subject of how the IBM/Sun deal — if it happens — might affect the database management system industry. Read more

March 17, 2009

Pervasive DataRush today

In my first post-fire briefing, I had a long-scheduled dinner with the Pervasive DataRush folks.  Much of DataRush’s positioning, feature evolution, and so on remain To Be Determined.  Most existing customers and applications remain To Be Disclosed.  What’s more, DataRush is a technology to accelerate applications that

  1. Need to be parallelized
  2. Should run on SMP rather than shared-nothing hardware

and Pervasive hasn’t done a great job of explaining where #2 applies.

That said, there’s at least one use case for which DataRush should clearly be considered today.  Suppose you have a messy ETL/data transformation task that requires custom code.  Then I see three main choices:

In some cases, DataRush may be the best possibility.

March 9, 2009

Independent CEP vendors continue to flounder

Independent CEP (Complex Event Processing) vendors continue to flounder, at least outside the financial services and national intelligence markets.

CEP’s penetration outside of its classical markets isn’t quite zero. Customers include several transportation companies (various vendors), Sallie Mae (Coral8), a game vendor or two (StreamBase, if I recall correctly), Verizon (Aleri, I think), and more. But I just wrote that list from memory — based mainly on not-so-recent deals — and a quick tour of the vendors’ web sites hasn’t turned up much I overlooked. (Truviso does have a recent deal with Technorati, but that’s not exactly a blue chip customer these days.)

So far as I can tell, this is a new version of a repeated story. Read more

March 7, 2009

Three Greenplum customers’ applications of MapReduce

Greenplum (and Truviso) advisor Joseph Hellerstein offers a few examples of MapReduce applications (specifically Greenplum MapReduce), namely:

The big aha moment occurred for me during our panel discussion, which included Luke Lonergan from Greenplum, Roger Magoulas from O’Reilly, and Brian Dolan from Fox Interactive Media (which runs MySpace among other web properties).

Roger talked about using MapReduce to extract structured entities from text for doing tech trend analyses from billions of rows of online job postings.  Brian (who is a mathematician by training) was talking about implementing conjugate gradient and Support Vector Machines in parallel SQL to support “hypertargeting” for advertisers.  I mentioned how Jonathan Goldman at LinkedIn was using SQL and MapReduce to do graph algorithms for social network analysis.

Incidentally: While it’s been some months since I asked, my sense is that the O’Reilly text extraction is home-grown, and primitive compared to what one could do via commercial products. That said, if the specific application is examining job postings, I’m not sure how much value more sophisticated products would add. After all, tech job listings are generally written in a style explicitly designed to ensure that most or all of their meaning is conveyed simply by a bag of keywords. And by the way, this effort has been underway for quite some time.
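The bag-of-keywords point can be illustrated in a few lines. This is a toy sketch, not anything O'Reilly actually runs, and the skill lexicon is invented: even naive token matching against a keyword list recovers most of a tech job posting's meaning.

```python
import re

# Invented skill lexicon; a real system would use a curated taxonomy.
SKILLS = {"java", "sql", "hadoop", "mapreduce", "python", "linux"}

def extract_skills(posting: str) -> set:
    # Lowercase, split on non-letters, intersect with the lexicon.
    words = set(re.findall(r"[a-z]+", posting.lower()))
    return words & SKILLS

posting = "Seeking engineer with Java, SQL, and Hadoop/MapReduce experience."
print(sorted(extract_skills(posting)))  # ['hadoop', 'java', 'mapreduce', 'sql']
```

More sophisticated entity extraction mainly adds value where meaning depends on context, which job listings are written to avoid.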

Related link

March 7, 2009

Greenplum discloses a bit of pricing

Getting information about Greenplum pricing is not always easy.  However, a bit was disclosed in a recent Greenplum blog post, which said:

… roughly $200k … For that amount you get the hardware, software and services to stand up around a 4TB (usable) Greenplum DW …

No doubt there are large quantity discounts for much bigger systems.

March 5, 2009

DATAllegro sales price: $275 million

According to a press release announcing a venture capitalist’s job change,

Microsoft purchased DATAllegro for $275 million

Technically, that needn’t shut down the rumor mill altogether, since given the way deals are structured and reported, it’s unlikely that Microsoft actually cut checks to DATAllegro stockholders in the aggregate amount of $275 million promptly after the close of the acquisition.

Still, it’s a data point of some weight.

Hat tip to Mark Myers.
