Analytic technologies

Discussion of technologies related to information query and analysis.

October 10, 2010

Notes and links October 10 2010

More quick-hit notes, links, and so on:

October 10, 2010

EMC/Greenplum notes

I dropped by the former Greenplum for my quarterly consulting visit (scheduled for the first week of Q4 for a couple of reasons, one of them XLDB4). Much of what we discussed was purely advisory and/or confidential — duh! — but there were real, nonconfidential takeaways in two areas.

First, feelings about the EMC acquisition are still very positive.

October 10, 2010

It can be hard to analyze analytics

When vendors talk about the integration of advanced analytics into database technology, confusion tends to ensue. For example:

October 6, 2010

eBay followup — Greenplum out, Teradata > 10 petabytes, Hadoop has some value, and more

I chatted with Oliver Ratzesberger of eBay around a Stanford picnic table yesterday (the XLDB 4 conference is being held at Jacek Becla’s home base of SLAC, which used to stand for “Stanford Linear Accelerator Center”). Todd Walter of Teradata also sat in on the latter part of the conversation. Things I learned included:

September 20, 2010

Some thoughts on the announcement that IBM is buying Netezza

As you’ve probably read, IBM and Netezza announced a deal today for IBM to buy Netezza. I didn’t sit in on the conference call, but I’ve seen the reporting. Naturally, I have some quick thoughts, which I’ve broken up into several sections below:

September 15, 2010

Aster Data nCluster Version 4.6

The main thing in Aster Data nCluster Version 4.6 is Aster’s version of hybrid row-column store technology. Technical highlights include:

So Aster Data has now joined Greenplum/EMC among row-based analytic DBMS vendors with hybrid row-column stores. Oracle will join them some day, and the same probably applies to other row-based vendors as well. Similarly, Aster Data will probably join Oracle some day in having columnar compression. And so this all fits the model:

August 22, 2010

The Workday architecture — a new kind of OLTP software stack

One of my coolest company visits in some time was to SaaS (Software as a Service) vendor Workday, Inc., earlier this month. Reasons included:

Workday kindly allowed me to post this Workday slide deck. Otherwise, I’ve split out a quick Workday, Inc. company overview into a separate post.

The biggie for me was the data and object management part. Specifically:

August 21, 2010

The substance of Pentaho’s Hadoop strategy

Pentaho has been talking about a Hadoop-related strategy. Unfortunately, in support of its Hadoop efforts, Pentaho has been — quite insistently — saying things that don’t make a lot of sense to people who know anything about Hadoop.

That said, I think I found four sensible points in Pentaho’s Hadoop strategy, namely:

  1. If you use an ETL tool like Pentaho’s to move things in and out of HDFS, you may be able to orchestrate two more steps in the ETL process than if you used Hadoop’s native orchestration tools.
  2. A lot of what you want to do in MapReduce consists of things that can be graphically specified in an ETL tool like Pentaho’s. (That would include tokenization or regex.)
  3. If you have some really lightweight BI requirements (ad hoc, reporting, or whatever) against HDFS data, you might be content to do it straight against HDFS, rather than moving the data into a real DBMS. If so, BI tools like Pentaho’s might be useful.
  4. Somebody might want to use a screwy version of MapReduce, where by “screwy” I mean anything that isn’t Cloudera Enterprise, Aster Data SQL/MapReduce, or some other implementation/distribution with a lot of supporting tools. In that case, they might need all the tools they can get.

The first of those points is, in the grand scheme of things, pretty trivial.

The third one makes sense. While Hadoop’s Hive client means you could roll your own integration with your own favorite BI tool in any case, having a vendor certify the integration for you could be nice. So if Pentaho ships something that works before other vendors do, good on them. (Target date seems to be October.)

The fourth one is kind of sad.

But if there’s any shovel-meet-pony aspect to all this (or indeed a reason for writing this blog post), it would be the second point. If one understands data management, but is in the “Oh no! Hadoop wants me to PROGRAM!” crowd, then being able to specify one’s MapReduce graphically might be a really nice alternative to actually coding it.
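
For a sense of what "actually coding it" involves, here is a minimal sketch of a hand-written tokenize-and-count job, the kind of logic the second point suggests could be specified graphically instead. It is plain Python that only imitates the map, shuffle, and reduce phases in memory. It does not use Hadoop's or Pentaho's real APIs, and the function names, the tokenization regex, and the sample records are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Assumed tokenization rule, chosen only for illustration.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def map_phase(record):
    """Emit a (token, 1) pair for every token found in one input record."""
    for token in TOKEN_RE.findall(record.lower()):
        yield token, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one token."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce over an in-memory list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["Hadoop wants me to PROGRAM", "me? program Hadoop?"]
    print(run_job(sample))
```

Even in this toy form, the regex and the summing step are the only parts specific to the task; the rest is plumbing, which is roughly the part a graphical tool could plausibly hide.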

August 18, 2010

DB2 workload management

DB2 has added a lot of workload management features in recent releases. So when we talked Tuesday afternoon, Tim Vincent and I didn’t bother going through every one. Even so, we covered some interesting subjects in the area of DB2 workload management, including:

August 18, 2010

More on temp space, compression, and “random” I/O

My PhD was in a probability-related area of mathematics (game theory), so I tend to squirm when something is described as “random” that clearly is not. That said, a comment by Shilpa Lawande on our recent flash/temp space discussion suggests the following way of framing a key point:

If everybody else is cool with it too, I can live with that. 🙂

Meanwhile, I talked again with Tim Vincent of IBM this afternoon. Tim endorsed the temp space/Flash fit, but with a different emphasis, which upon review I find I don’t really understand. The idea is:

My problem with that is: Flash typically has lower write than read IOPS (I/O operations per second), so being (relatively) write-intensive would, to a first approximation, seem if anything to make a workload a worse fit for flash.
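
That first approximation can be put into numbers with a back-of-the-envelope sketch. The model below assumes operations are serviced one at a time, so the average time per operation is a weighted mix of read and write service times. The IOPS figures are hypothetical and not taken from any real device, and the model ignores caching, parallelism, and write amplification.

```python
def effective_iops(read_iops, write_iops, write_fraction):
    """Blended IOPS for a mixed workload under a simple serial-service model.

    Average time per operation = (1 - f) / read_iops + f / write_iops,
    where f is the fraction of operations that are writes.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    avg_seconds_per_op = (
        (1.0 - write_fraction) / read_iops + write_fraction / write_iops
    )
    return 1.0 / avg_seconds_per_op


if __name__ == "__main__":
    # Hypothetical device: 30,000 read IOPS and 10,000 write IOPS.
    for f in (0.1, 0.5, 0.9):
        print(f"write fraction {f:.1f}: {effective_iops(30_000, 10_000, f):,.0f} IOPS")
```

Under these assumed figures, the blended rate falls steadily as the write share rises, which is the sense in which write intensity would seem to count against flash rather than for it.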

On the plus side, I was reminded of something I should have noted when I wrote about DB2 compression before:

Much like Vertica, DB2 operates on compressed data all the way through, including in temp space.
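
As a generic illustration of what "operating on compressed data" can mean, here is a small sketch using dictionary encoding, where an equality predicate is evaluated against integer codes without expanding the column back to its original values. This is a textbook-style example under my own assumptions. It is not a description of how DB2 or Vertica actually implement compression, and the function names and sample data are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup dictionary."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def count_equal(dictionary, codes, target):
    """Count rows equal to target by comparing codes, never decoding rows."""
    try:
        # Translate the predicate constant into a code once, up front.
        target_code = dictionary.index(target)
    except ValueError:
        return 0  # Value absent from the dictionary, so no row can match.
    return sum(1 for code in codes if code == target_code)


if __name__ == "__main__":
    states = ["CA", "NY", "CA", "TX", "CA"]
    dictionary, codes = dictionary_encode(states)
    print(dictionary, codes, count_equal(dictionary, codes, "CA"))
```

The same encoded form could in principle be spilled to temp space as-is, so intermediate results stay small as well.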
