Data warehouse appliances – fact and fiction
Borrowing the “Fact or fiction?” meme from the sports world:
- Data warehouse appliances have to have specialized hardware. Fiction. Indeed, most contenders except Teradata and Netezza — for example, DATAllegro, Vertica, ParAccel, Greenplum, and Infobright — offer Type 2 appliances. (Dataupia is another exception.)
- Specialized hardware is a dead-end for data warehouse appliances. Fiction. If it were easy for Teradata to replace its specialized switch technology, it would have done so a decade ago. And Netezza’s strategy has a lot of appeal.
- Data warehouse appliances are nothing new, and failed long ago. Fiction, but only because of Teradata. 1980s appliance pioneer Britton-Lee didn’t do so well (it was actually bought by Teradata). IBM and ICL (Britain’s national-champion hardware company) had content-addressable data store technology that went nowhere.
- Since data warehouse appliances failed long ago, they’ll fail now too. Fiction. Shared-nothing MPP is a fundamental advantage of appliances. So are various index-light strategies. (There’s a toy sketch of the shared-nothing idea just after this list.) Data warehouse appliances are here to stay.
- Data warehouse appliances only make sense if your main database management system can’t handle the job. Fiction. There are dozens of data warehouse appliances managing under 5 terabytes of user data, if not under 1 terabyte. True, some of them are legacy installations, dating back to when Oracle couldn’t handle that much data well itself. But new ones are still going in. Even if Oracle or Microsoft SQL Server can do the job, a data warehouse appliance is often a far superior — cheaper, easier to deploy and keep running, and/or better performing — alternative.
- Data warehouse appliances are just for data marts. For your full enterprise data warehouse, use a conventional DBMS. Part fact, part fiction. It depends on the appliance, and on the complexity of your needs. Teradata systems can do pretty much everything. Netezza and DATAllegro, two of the oldest data warehouse appliance startups, have worked hard on their concurrency issues and now can support fairly large user or reporting loads. They also can handle reasonable volumes of transactional or trickle-feed updates, and probably can support full EDW requirements for decent-sized organizations. Even so, there are some warehouse use cases for which they’re ill-suited. Newer appliance vendors are more limited yet.
- Analytic appliances are just renamed data warehouse appliances. Fact, even if misleading. Netezza is using the term “analytic appliance” to highlight additional things one can do on its boxes beyond answering queries. But those are still operations on a data mart or data warehouse.
- Teradata is the leading data warehouse appliance vendor. More fact than fiction. Some observers say that Teradata systems aren’t data warehouse appliances. But I think they are. Competitors may be superior to Teradata in one or another characteristic of appliances – e.g., speed of installation – but it’s hard to define “appliances” in an objective way that excludes Teradata.
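To make the shared-nothing MPP point concrete, here’s a toy Python sketch of hash-partitioned storage with a scatter-gather scan. It’s purely illustrative of the architecture (all the names are mine), not any vendor’s actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 4
# Each inner list stands in for one node's local storage.
nodes = [[] for _ in range(NUM_NODES)]

def load(row):
    # Hash partitioning: each row lives on exactly one node.
    nodes[hash(row["customer_id"]) % NUM_NODES].append(row)

def node_scan(partition, predicate):
    # "Index-light": each node just does a fast scan of its own partition.
    return [r for r in partition if predicate(r)]

def query(predicate):
    # Scatter the scan to all nodes in parallel, then gather the partials.
    with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
        partials = pool.map(lambda p: node_scan(p, predicate), nodes)
    return [row for part in partials for row in part]

for i in range(1000):
    load({"customer_id": i, "spend": i % 97})

big_spenders = query(lambda r: r["spend"] > 90)  # runs on every node at once
```

The point of the sketch: adding nodes adds both storage and scan bandwidth, which is why the architecture scales for big-query workloads.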
If you liked this post, you might also like one on text mining fact and fiction.
Amazon Dynamo — when primary key access is enough
Amazon has a very decentralized technical operation. But even the individual pieces have interestingly huge scale. Thus, various things they’re doing are of interest.
They recently presented a research paper on a high-performance transactional system called Dynamo. (Hat tip to Dare Obasanjo.) A key point is the following:
There are many services on Amazon’s platform that only need primary-key access to a data store. For many services, such as those that provide best seller lists, shopping carts, customer preferences, session management, sales rank, and product catalog, the common pattern of using a relational database would lead to inefficiencies and limit scale and availability. Dynamo provides a simple primary-key only interface to meet the requirements of these applications.
Now, I don’t think too many organizations beyond Amazon are going to decide that they can’t afford the overhead of an RDBMS for such OLTP-like applications. But I do think it will become increasingly common to find other reasons to eschew traditional OLTP relational architectures. Maybe you’ll want the schema flexibility of XML. Or perhaps you’ll be happy with a fixed relational schema, but will want to optimize for analytic performance.
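For flavor, here’s a minimal sketch of what a primary-key-only interface looks like, echoing the get/put style the paper describes. The class and method names are my own invention; a real Dynamo-style store adds partitioning, replication, and versioning behind this facade.

```python
class KeyValueStore:
    """Toy stand-in for a primary-key-only data store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # No schema, no secondary indexes, no joins: just the key.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

# A shopping cart keyed by session ID, one of the paper's example workloads:
store = KeyValueStore()
store.put("session:12345", {"cart": ["book-001", "dvd-042"]})
cart = store.get("session:12345")
```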
Monash Research in 2008
DBMS2, obviously, has a parent company — Monash Research. It’s time to fill you all in on some of the exciting things we have going on.
We’ve upgraded our whole line of vendor services, adding attractive new consulting packages, starting the new Monash Research webcast series, and sharpening our white paper services as well. Most important, we enhanced our flagship Monash Advantage executive program, based on how members have actually used it in the inaugural year. Monash Advantage membership now includes significantly more consulting than before. Membership also remains the only way to get access to our Monash Letter analyst reports — such as our blockbuster guide to strategic marketing (coming soon) — and to our webcast and white paper sponsorship opportunities. Over half the companies listed in the sidebar are clients, and at this time of year more are joining every week.
We also updated our main website at www.monash.com. It’s now even easier to keep up with all our research, or just with our most important news. We added to our already stellar lists of customers and testimonials. We redesigned the users’ guide to our white papers. And of course we updated the descriptions of our services. We even changed our name, for the first time in 17 years, although we’ll continue using “Monash Information Services” for financial dealings only.
Of course, we’re not stopping there. For example, there will be further changes when the Monash Research webcasts start being announced, held, and archived. User-oriented services will continue to be expanded, just as the vendor-oriented ones have been. And we plan to redesign DBMS2 and our other blog sites, sometime in early 2008.
I look forward to working with you all over the next year.
Netezza has another big October quarter
Netezza reported a big October quarter, ahead of expectations. And official guidance for next quarter is essentially flat quarter-over-quarter, suggesting Q3 was indeed surprisingly big. However, Netezza’s year-over-year growth for Q3 was a little under 50%, suggesting the quarter wasn’t so remarkable after all. (Netezza has a January fiscal year.)
Tentative conclusion: Netezza just tends to have big October quarters, perhaps by timing sales cycles to finish soon after the late September user conference. If Netezza’s user conference ever moves to later in the fall, expect Q3 to be weak that year.
Netezza reported 18 new customers, double last year’s figure.
One of the funniest fake press releases ever
About an extended outage in Lord of the Rings Online.
Edited July 2, 2008: New URL that works at least for now.
OK, now I get it — the guys at Ab Initio have something to spin or hide
According to the comments on this blog post, Ab Initio has been throwing analysts out of their trade show booths and being otherwise rude for at least two years, and probably a lot longer. That goes beyond marketing strategy or quirkiness. It means Ab Initio has some secrets it desperately doesn’t want found out, or at least that it wants to conceal unless there are Ab Initio salespeople present to spin the prospects’ response to the news.
Does Ab Initio need to be taken seriously?
Users and vendors occasionally mention Ab Initio to me. But when I inquired about details at their booth at the Teradata conference, I was told to go away by a “gentleman” who seemed quite amused with himself for doing so. And when I checked Ab Initio’s website just now, I found it to be rather content-free.
Is Ab Initio all flash — and Flash — and no substance? Do they rely on selling to a few enterprises they can bamboozle, free from interference by prying analysts? Obviously, I don’t know for sure, but that’s how my guesses are leaning right now.
One of the coolest visualizations I’ve seen
An obscure little company called Ward Analytics was displaying a Teradata performance management tool at the recent Teradata Partners conference, and I found the visualization to be very cool. Yes, it’s full-screen, but there’s a LOT of information on the screen — basically four graphs or charts, each of them complex. Plus there are lots of widgets to adjust what you see. And I don’t think full-screen is much of a drawback; you just have to be smart about which simpler elements you put in a portal-based UI that then blow up into complex full-screen ones on demand.
This screenshot doesn’t do the product — called Visual Edge — full justice, but it gives a pretty good taste. The weirdest part is that Ward rolled its own technology to create Visual Edge, feeling there were no generally suitable visualizations out there in the market for it to adopt.
Coral8 highlights some key issues with dashboards
Coral8 today is rolling out the Coral8 Portal, offering some BI basics for CEP (Complex Event Processing) filters and queries. In Release 1, this is primitive compared with other BI portals, and of direct interest only to organizations that have already decided they’re using CEP technology. Even so, it serves as a useful illustration of several important issues in dashboarding.
The simplest is that real-time dashboards require different visualizations than other dashboards do. Most obvious is the ever-popular graph marching from right to left across the screen as time advances along the x-axis. There are also differences in style between reports and tables that you actually read, vs. read-outs that you merely watch for flickers of change. (Of course, those two examples hardly make for a complete list.)
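A marching graph of that kind is naturally backed by a fixed-size rolling window, so each new point pushes the oldest one off the left edge. A minimal sketch, where the names and window size are my assumptions:

```python
from collections import deque

class MarchingSeries:
    def __init__(self, window=300):  # e.g., the last 5 minutes at 1 reading/second
        self.points = deque(maxlen=window)

    def append(self, timestamp, value):
        # The newest point enters on the right; once the window is full,
        # the oldest falls off the left, which is what makes the chart "march."
        self.points.append((timestamp, value))

    def snapshot(self):
        # What the chart redraws on each refresh tick, oldest to newest.
        return list(self.points)
```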
More interesting are the flexibility and parameterization. While Coral8 sells to multiple markets, the design point for the portal is clearly financial trading. So, for example, a query may be registered with one ticker symbol, and an end user can easily customize it to slot in another one instead. In a way, this is a step toward the much greater flexibility that dashboards need overall.
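Here’s a sketch of that register-once, parameterize-per-user pattern. To be clear, this is not Coral8’s actual API, just the general shape of the idea:

```python
from string import Template

# The registered query, written once by whoever publishes it:
registered_query = Template(
    "SELECT ts, price FROM trades "
    "WHERE symbol = '$symbol' AND price > $threshold"
)

# Two end users personalize the same registered query for themselves:
q_ibm = registered_query.substitute(symbol="IBM", threshold=100)
q_msft = registered_query.substitute(symbol="MSFT", threshold=30)
```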
Truth be told, if you put all such Coral8 flexibility features together, they’re not yet very impressive. So what’s even more interesting is the overall architecture, which could support much greater flexibility in the future. If dashboards gain the flexibility they need, and queries continue to be done in the conventional manner, query volumes will increase enormously. If, furthermore, the results are refreshed in some near-real-time manner, that’s another huge increase.
How huge? Well, I can make a case that it could be well over three orders of magnitude.
The key problem with dashboard functionality
I keep hinting — or saying outright 🙂 — that I think dashboards need to be revolutionized. It’s probably time to spell that point out a little further.
The key issue, in my opinion, is that dashboards need to be much more personalizable than they are now. This isn’t just me talking. I’ve raised the subject with a lot of users recently, and am getting close to 100% agreement with my viewpoint.
One part of the problem is personalizing what to see, how to visualize it, and how all that’s arranged on the screen. No one product yet fully combines best-of-breed ideas from mainstream BI, specialized visualization tools, and flexible personalized web portals. But that’s not my biggest concern, as I think the BI industry is on a pretty good path in those respects.
Rather, the real issue is that dashboards don’t adequately reflect personal opinions as to what is important. Indeed, that lack is often portrayed as a virtue, because supposedly top management can dictate, through a few simple metrics, what a whole company of subordinates will think and think about. (Balanced scorecard theology is a particularly silly form of this.) But actually that lack is a serious impediment to dashboard success, or indeed to a general analytic/numerate enterprise culture overall.
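To illustrate, personalization could be as simple as storing per-user metric choices, alert thresholds, and layouts as data, instead of one company-wide scorecard. A purely hypothetical sketch, with all names invented:

```python
# Each user declares which metrics matter to them and at what thresholds,
# rather than inheriting one scorecard dictated from the top.
user_dashboards = {
    "alice": {
        "metrics": ["gross_margin", "churn_rate"],
        "alerts": {"churn_rate": {"above": 0.05}},
        "layout": ["churn_rate", "gross_margin"],  # her ordering, not HQ's
    },
    "bob": {
        "metrics": ["pipeline_coverage", "gross_margin"],
        "alerts": {"pipeline_coverage": {"below": 3.0}},
        "layout": ["pipeline_coverage", "gross_margin"],
    },
}
```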
“One version of the truth” can be a gross oversimplification.
