Data warehousing

Analysis of issues in data warehousing, with extensive coverage of database management systems and data warehouse appliances that are optimized to query large volumes of data.

October 18, 2009

Three big myths about MapReduce

Once again, I find myself writing and talking a lot about MapReduce. But I suspect that MapReduce-related conversations would go better if we overcame three fairly common MapReduce myths:

Read more
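As background for that discussion, here is a minimal single-process sketch of the MapReduce pattern itself, using the canonical word-count example. It is purely illustrative; real frameworks such as Hadoop distribute the map, shuffle, and reduce phases across a cluster.

```python
from collections import defaultdict
from itertools import chain

# Minimal, single-process illustration of the MapReduce pattern.
# Real frameworks (e.g. Hadoop) distribute these phases across machines.

def map_phase(doc):
    # Emit (key, value) pairs -- here, (word, 1) for each word.
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # Group values by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Combine all values for a key -- here, a simple count.
    return key, sum(values)

docs = ["mapreduce is not new", "mapreduce is not a dbms"]
pairs = chain.from_iterable(map_phase(d) for d in docs)
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'mapreduce': 2, 'is': 2, 'not': 2, 'new': 1, 'a': 1, 'dbms': 1}
```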

October 18, 2009

Introduction to SenSage

I visited with SenSage on my two most recent trips to San Francisco. Both visits were, through no fault of SenSage’s, hasty. Still, I think I have enough of a handle on SenSage basics to be worth writing up.

General SenSage highlights include:

Read more

October 18, 2009

Kickfire capacity and pricing

Kickfire’s marketing communication efforts are still a work in progress. Kickfire did finally relax its secrecy about FPGA-vs.-custom-silicon – not coincidentally during Netezza’s recent publicity cycle. That wise choice helped Kickfire get some favorable attention recently for its technical and market strategy, e.g. from Daniel Abadi, Merv Adrian and, kicking things off — as it were — me. Weeks after a recent Kickfire product release, there’s finally a fairly accurate data sheet up, although there’s still one self-defeatingly misleading line I’ll comment on below. Pricing is a whole other area of confusion, although it seems that current list prices have been inadvertently* leaked in Merv’s post linked above, with only one inaccuracy that I can detect.**

*I gather from the company that they forgot to tell Merv pricing was NDA.

** Merv cited a price as “starting” that I believe to be top-of-the-line. No criticism of Merv is implied in that; Kickfire has not been very clear in communicating hard numbers.

All that said, if one takes Kickfire’s marketing statements literally, Kickfire list pricing is around $20-50K per terabyte for a few small, fixed, high-performance configurations. That’s all-in, for plug-and-play appliances. What’s more, that range is based on the actual published user data capacity numbers for various Kickfire models, which I think are low for several reasons:
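Whatever the capacity caveats, the per-terabyte arithmetic itself is straightforward. The sketch below shows the calculation, and how understated capacity ratings shift the result; all prices and capacities in it are hypothetical placeholders, not actual Kickfire figures.

```python
# Hypothetical appliance configurations -- illustrative numbers only,
# not actual Kickfire list prices or capacities.
configs = [
    # (name, all-in list price in dollars, rated user-data capacity in TB)
    ("small appliance", 100_000, 3),
    ("large appliance", 300_000, 6),
]

for name, price, capacity_tb in configs:
    print(f"{name}: ${price / capacity_tb / 1000:,.0f}K per TB of user data")

# If the rated capacities are understated (e.g., via conservative
# compression assumptions), effective price per terabyte drops in step:
understatement = 1.5  # assume true capacity is 1.5x the rated figure
print(f"small appliance, adjusted: "
      f"${100_000 / (3 * understatement) / 1000:,.0f}K per TB")
```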

October 14, 2009

Infobright notes

I had lunch with Bob Zurek and Susan Davis of Infobright today. This wasn’t primarily a briefing, but a few takeaways are:

October 14, 2009

Greenplum is going hybrid columnar as well

Over the past summer, Vertica, VectorWise, and Oracle all announced flavors of hybrid row/columnar storage. Now it’s Greenplum’s turn. Greenplum is actually offering true columnar storage, as opposed to Oracle’s PAX-like scheme — and also as opposed to the kind of Frankencolumn storage Daniel Abadi decries. For example, you don’t have to do a join to retrieve multiple columns; you just ask for them and there they are. Similarly, Greenplum doesn’t maintain explicit row IDs – whether in row-oriented or column-oriented append-only storage – relying instead on block-level header information.

Read more
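To see the practical difference in miniature, here is a toy sketch of row-oriented versus column-oriented layout in plain Python. It is a conceptual illustration only, not a depiction of Greenplum internals: fetching several columns from a true column store is a matter of positional reads, with the position serving as the implicit row ID.

```python
# Toy illustration of row- vs. column-oriented layouts.
# Conceptual only -- this does not depict Greenplum internals.

# Row-oriented: each record is stored together.
rows = [
    ("acme", 2009, 1200.0),
    ("apex", 2009, 3400.0),
]

# Column-oriented: one array per column. A record is recovered by
# reading the same position in each column of interest. No explicit
# row IDs are stored; position within the column is the implicit ID.
columns = {
    "customer": ["acme", "apex"],
    "year":     [2009, 2009],
    "revenue":  [1200.0, 3400.0],
}

# Fetch (customer, revenue) for record 1: two positional reads, no join.
i = 1
print(columns["customer"][i], columns["revenue"][i])  # apex 3400.0
```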

October 10, 2009

How 30+ enterprises are using Hadoop

MapReduce is definitely gaining traction, especially but by no means only in the form of Hadoop. In the aftermath of Hadoop World, Jeff Hammerbacher of Cloudera walked me quickly through 25 customers he pulled from Cloudera’s files. Facts and metrics ranged widely, of course:

Read more

October 10, 2009

Scientific data sharing

I’ve been posting recently about some issues in scientific data management. One topic I haven’t addressed yet is policies around data sharing. Generally:

On the other hand, it’s blindingly obvious that the world as a whole would be better off with widespread scientific data sharing, provided that making data “free” doesn’t significantly undermine scientists’ incentives to capture it in the first place. And institutions such as funding agencies are taking note. Thus:

Scientific data management technology should be suitable for either of these scenarios:

Read more

October 6, 2009

Oracle Exadata customers presenting at Oracle Open World

Greg Rahn tweeted a list of Exadata-focused sessions at Oracle Open World next week. As Oracle employees and supporters have been foreshadowing, there will be Exadata users and user-like folks presenting. I identified what look like half a dozen (not counting any who, for example, will make surprise appearances at keynote addresses), specifically:

Read more

October 6, 2009

Oracle and Vertica on compression and other physical data layout features

In my recent post on Exadata pricing, I highlighted the importance of Oracle’s compression figures to the discussion, and the uncertainty about same. This led to a Twitter discussion featuring Greg Rahn* of Oracle and Dave Menninger and Omer Trajman of Vertica. I also followed up with Omer on the phone.

Read more
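A quick sketch of why the compression figure dominates the discussion: user data capacity, and hence effective price per terabyte, scales directly with it. All numbers below are illustrative assumptions, not Oracle’s or Vertica’s figures.

```python
# Effective price per TB of user data scales inversely with compression.
# All numbers below are illustrative assumptions, not vendor figures.
list_price = 1_000_000   # total system price, dollars
raw_tb = 100             # raw disk capacity, TB
usable_fraction = 0.4    # after mirroring, logs, work space, etc.

for compression in (1.0, 2.0, 4.0):
    user_tb = raw_tb * usable_fraction * compression
    print(f"{compression:.0f}x compression: "
          f"${list_price / user_tb / 1000:,.1f}K per user TB")
```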

October 5, 2009

Oracle Exadata 2 capacity pricing

Summary of Oracle Exadata 2 capacity pricing

Analyzing Oracle Exadata pricing is always harder than one would first think. But I’ve finally gotten around to doing an Oracle Exadata 2 pricing spreadsheet. The main takeaways are:

Longer version

When Oracle introduced Exadata last year it was, well, expensive. Exadata 2 has now been announced, and it is significantly cheaper than Exadata 1 per terabyte of user data, based on:

Read more
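For readers who want to replicate the spreadsheet logic, here is the general shape of such a calculation. Every number in it is a hypothetical placeholder rather than an actual Oracle list price; the point is only that total cost divided by user data capacity, itself a function of raw disk, redundancy, and assumed compression, is what any per-terabyte comparison rests on.

```python
# Skeleton of a price-per-user-TB calculation, in the spirit of a
# pricing spreadsheet. Every figure is a hypothetical placeholder,
# not an actual Oracle list price.

hardware_price = 2_000_000      # servers, storage, networking (assumed)
db_license = 1_500_000          # database software licenses (assumed)
storage_sw_license = 500_000    # storage-layer software licenses (assumed)
total = hardware_price + db_license + storage_sw_license

raw_disk_tb = 336               # raw disk across storage servers (assumed)
usable_after_redundancy = 0.35  # mirroring, logs, scratch space (assumed)
compression = 3.0               # assumed average compression ratio

user_tb = raw_disk_tb * usable_after_redundancy * compression
print(f"user data capacity: {user_tb:.0f} TB")
print(f"price per user TB: ${total / user_tb / 1000:,.1f}K")
```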
