Data warehousing

Analysis of issues in data warehousing, with extensive coverage of database management systems and data warehouse appliances that are optimized to query large volumes of data.

February 10, 2009

Aster Data nPath

Edit: Unfortunately, this post and its sequel rely on Aster Data posts that Aster’s buyer Teradata no longer makes easily available.

At the same time as it rolled out its cloud story, Aster Data told of nPath, a MapReduce-based feature in nCluster. As best I understand it, the core idea of nPath is that it preprocesses sequential data via MapReduce so that you can then do ordinary SQL on it. (Steve Wooledge’s blog post about nPath outlines why that might be needed. Point 1 in Mayank Bawa’s August, 2008 post is much more concise. 😉 ) Now, that might seem to contradict the syntax, which is all about MapReduce being invoked via SQL — still, it’s what’s really going on.

That leads to two obvious questions: What is nPath used (or useful) for? and How is the preprocessing done anyway? Read more
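To make the core idea concrete, here is a minimal sketch of "preprocess sequential data in a MapReduce style, then run ordinary SQL on the result." It is an illustration only, not Aster's implementation or nPath's actual syntax. The clickstream rows, the function names (`map_phase`, `reduce_phase`, `build_paths`, `query_paths`), the `paths` table and the use of SQLite are all assumptions made up for the example.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# Hypothetical clickstream rows: (user_id, timestamp, page).
events = [
    ("u1", 3, "checkout"),
    ("u2", 1, "home"),
    ("u1", 1, "home"),
    ("u1", 2, "product"),
    ("u2", 2, "product"),
]


def map_phase(rows):
    """Key each event by user so that one user's events can be grouped."""
    for user_id, ts, page in rows:
        yield user_id, (ts, page)


def reduce_phase(keyed_rows):
    """Per user, order events by time and emit one flat row holding the path."""
    # Sorting by key stands in for the shuffle/sort step of a real framework.
    ordered = sorted(keyed_rows, key=itemgetter(0))
    for user_id, group in groupby(ordered, key=itemgetter(0)):
        pages = [page for _, page in sorted(value for _, value in group)]
        yield user_id, ">".join(pages), len(pages)


def build_paths(rows):
    """Run the toy map and reduce phases and collect the flattened rows."""
    return list(reduce_phase(map_phase(rows)))


def query_paths(paths, pattern):
    """Load the flattened rows into a table and query it with plain SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE paths (user_id TEXT, path TEXT, steps INTEGER)")
    conn.executemany("INSERT INTO paths VALUES (?, ?, ?)", paths)
    rows = conn.execute(
        "SELECT user_id FROM paths WHERE path LIKE ? ORDER BY user_id",
        (pattern,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    paths = build_paths(events)
    print(paths)
    print(query_paths(paths, "%product>checkout%"))
```

The point of the sketch is the division of labor: the ordering-sensitive work happens once, outside SQL, and what comes out is an ordinary table that any SQL query can filter or aggregate.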

February 10, 2009

Aster Data in the cloud

Aster Data is in the news, bragging about a cloud version of nCluster, and providing both a press release and a blog post on the subject. It seems there are three actual customers, two of which have been publicly named. One of them, ShareThis, is in production. (2 terabytes of data on 9 nodes, planning to scale to 10-18 TB on 24 or so nodes by year-end.) All seem to be doing something in the area of internet marketing, web analytics or otherwise — which makes sense, as the same could be said of almost all Aster customers overall. That said, it seems that these customers are doing their primary analytic processing remotely, which makes Aster’s experience in that regard more akin to Kognitio’s than to Vertica’s. Read more

February 7, 2009

Analytics’ role in a frightening economy

I chatted yesterday with the general business side (as opposed to the trading operation) of a household-name brokerage firm, one that’s in no immediate financial peril. It seems their #1 analytic-technology priority right now is changing planning from an annual to a monthly cycle.* That’s a smart idea. While it’s especially important in their business, larger enterprises of all kinds should consider following suit. Read more

February 6, 2009

Final (for now) slides on how to select a data warehouse DBMS

I’ve now posted a final version of the slide deck* I first posted Wednesday.  And I do mean final; TDWI likes its slide decks locked down weeks in advance, because they go to the printer to be memorialized on dead trees.  I added or fleshed out notes on quite a few slides vs. the prior draft. Actual changes to the slides themselves, however, were pretty sparse, and mainly were based on comments to the prior post.  Thanks for all the help!

*That’s a new URL.  The old deck is still up too, for those morbidly curious as to what I did or didn’t change.

February 4, 2009

Draft slides on how to select an analytic DBMS

I need to finalize an already-too-long slide deck on how to select an analytic DBMS by late Thursday night.  Anybody see something I’m overlooking, or just plain got wrong?

Edit: The slides have now been finalized.

February 3, 2009

Winter Corporation on Exadata

The most ridiculous analyst study I can recall — at least since Aberdeen pulled back from the “You pay; we say” business — is Winter Corporation’s list of large data warehouses. (Failings include that it only lists warehouses run by software from certain vendors; it doesn’t even list most of the largest warehouses from those vendors; and its size metrics are in my opinion fried.) So it was with some trepidation that I approached what appears to be an Oracle-sponsored Winter Corporation white paper about Exadata.* Read more

February 3, 2009

EMC’s take on data warehousing and BI

I just ran across a December 10 blog post by Chuck Hollis outlining some of EMC’s — or at least Chuck’s — views on data warehousing and business intelligence. It’s worth scanning, a certain “Where you stand depends upon where you sit” flavor to it notwithstanding.  In a contrast to my usual blogging style, Chuck’s post is excerpted at length below, with comments from me interspersed. Read more

February 2, 2009

One vendor’s trash is another’s treasure

A few months ago, CEO Mayank Bawa of Aster Data commented to me on his surprise at how “profound” the relationship was between design choices in one aspect of a data warehouse DBMS and choices in other parts. The word choice in that was all Mayank, but the underlying thought is one I’ve long shared, and that I’m certain architects of many analytic DBMS share as well.

For that matter, the observation is no doubt true in many other product categories as well. But in the analytic database management arena, where there are literally 10-20+ competitors with different, non-stupid approaches, it seems most particularly valid. Here are some examples of what I mean. Read more

February 1, 2009

Oracle says they do onsite Exadata POCs after all

When I first asked Oracle about Netezza’s claim that Oracle doesn’t do onsite Exadata POCs, they blew off the question. Then I showed Oracle an article draft saying they don’t do onsite Exadata proofs-of-concept. At that point, Oracle denied Netezza’s claim, and told me there indeed have been onsite Exadata POCs. Oracle has not yet been able to provide me with any actual examples of same, but perhaps that will change soon. In the meantime, I continue with the assumption that Oracle is, at best, reluctant to do Exadata POCs at customer sites.

I do understand multiple reasons for vendors to prefer POCs be done on their own sites, both innocent (cost) and nefarious (excessive degrees of control). Read more

February 1, 2009

Simplicity in analytic database management systems

Ideally, administering a relational database management system should be simple — describe the tables, load the data, and rely on the system to take care of everything else. Complexity comes primarily in two (somewhat overlapping) forms:

Vendors whose products shine in one of those areas but not in both tend to claim greater advantages in “simplicity” than they actually possess. And the list of such vendors is long, because there’s something of a negative correlation between excellence in the two metrics, often because:

Read more
