May 22, 2006

Data warehouse appliances

If we define a “data warehouse appliance” as “a special-purpose computer system, with appliance administrability, that manages a data warehouse,” then there are two major contenders: Netezza and DATAllegro, both startups, both with a small number of disclosed customers. Past contenders would include Teradata and White Cross (which seems to have just merged into Kognitio), but neither would admit to being in that market today. (I suspect this is a mistake on Teradata’s part, but so be it.) IBM with DB2 on the z-Series wouldn’t be properly regarded as an appliance player either, although IBM is certainly conscious of appliance competition. And SAP’s BI Accelerator does not persist data at this time.

In principle, the Netezza and DATAllegro stories are similar — take an established open source RDBMS*, build optimized hardware to run it, and optimize the software configuration as well. Much of the optimization is focused on getting data on and off disk sequentially, minimizing any random accesses. This is why I often refer to data warehouse appliances as being the best alternative to memory-centric data management. Beyond that, the optimizations by the two vendors differ considerably.
*Netezza uses PostgreSQL; DATAllegro uses Ingres.

Hmm. I don’t feel like writing more on this subject at this very moment, yet I want to post something urgently because there’s an IOU in my Computerworld column today for it. OK. More later.

Categories: Actian and Ingres, Companies and products, Data warehouse appliances, DATAllegro, DBMS product categories, IBM and DB2, Memory-centric data management, Open source, SAP AG

May 22, 2006

Introduction to Cogito

In my Computerworld column appearing today, I promised to post here about Cogito. Let me start with a disclosure and a confession: Read more

Categories: Cogito and 7 Degrees, RDF and graphs

8 Comments

May 15, 2006

Philip Howard likes Viper

Philip Howard likes DB2’s Viper release. Truth be told, Philip Howard seems to like most products, whether they deserve it or not. But in this case, I think his analysis is spot-on.

Categories: IBM and DB2, OLTP, Structured documents

May 13, 2006

Hot times at Intersystems

About a year ago, I wrote a very favorable column focusing on Intersystems’ OODBMS Cache’. Cache’ appears to be the one OODBMS product that has good performance even in a standard disk-centric configuration, notwithstanding that random pointer access seems to be antithetical to good disk performance.

Intersystems also has a hot new Cache’-based integration product, Ensemble. They attempted to brief me on it (somewhat belatedly, truth be told) last Wednesday. Through no fault of the product, however, the briefing didn’t go so well. I still look forward to learning more about Ensemble.

Categories: EAI, EII, ETL, ELT, ETLT, Humor, Intersystems and Cache', Object, OLTP

May 10, 2006

White paper on memory-centric data management — excerpt

Here’s an excerpt from the introduction to my new white paper on memory-centric data management. I don’t know why WordPress insists on showing the table gridlines, but I won’t try to fix that now. Anyhow, if you’re interested enough to read most of this excerpt, I strongly suggest downloading the full paper.

	Introduction
Conventional DBMS don’t always perform adequately.	Ideally, IT managers would never need to think about the details of data management technology. Market-leading, general-purpose DBMS (DataBase Management Systems) would do a great job of meeting all information management needs. But we don’t live in an ideal world. Even after decades of great technical advances, conventional DBMS still can’t give your users all the information they need, when and where they need it, at acceptable cost. As a result, specialty data management products continue to be needed, filling the gaps where more general DBMS don’t do an adequate job.
Memory-centric technology is a powerful alternative.	One category on the upswing is memory-centric data management technology. While conventional DBMS are designed to get data on and off disk quickly, memory-centric products (which may or may not be full DBMS) assume all the data is in RAM in the first place. The implications of this design choice can be profound. RAM access speeds are up to 1,000,000 times faster than random reads on disk. Consequently, whole new classes of data access methods can be used when the disk speed bottleneck is ignored. Sequential access is much faster in RAM, too, allowing yet another group of efficient data access approaches to be implemented.
It does things disk-based systems can’t.	If you want to query a used-book database a million times a minute, that’s hard to do in a standard relational DBMS. But Progress’ ObjectStore gets it done for Amazon. If you want to recalculate a set of OLAP (OnLine Analytic Processing) cubes in real-time, don’t look to a disk-based system of any kind. But Applix’s TM1 can do just that. And if you want to stick DBMS instances on 99 nodes of a telecom network, all persisting data to a 100th node, a disk-centric system isn’t your best choice – but Solid’s BoostEngine should get the job done.
Memory-centric data managers fill the gap, in various guises.	Those products are some leading examples of a diverse group of specialist memory-centric data management products. Such products can be optimized for OLAP or OLTP (OnLine Transaction Processing) or event-stream processing. They may be positioned as DBMS, quasi-DBMS, BI (Business Intelligence) features, or some utterly new kind of middleware. They may come from top-tier software vendors or from the rawest of startups. But they all share a common design philosophy: Optimize the use of ever-faster semiconductors, rather than focusing on (relatively) slow-spinning disks.
They have a rich variety of benefits.	For any technology that radically improves price/performance (or any other measure of IT efficiency), the benefits can be found in three main categories: Doing the same things you did before, only more cheaply; Doing the same things you did before, only better and/or faster; Doing things that weren’t technically or economically feasible before at all. For memory-centric data management, the “things that you couldn’t do before at all” are concentrated in areas that are highly real-time or that use non-relational data structures. Conversely, for many relational and/or OLTP apps, memory-centric technology is essentially a much cheaper/better/faster way of doing what you were already struggling through all along.
Memory-centric technology has many applications.	Through both OEM and direct purchases, many enterprises have already adopted memory-centric technology. For example:
	Financial services vendors use memory-centric data management throughout their trading systems.
	Telecom service vendors use memory-centric data management in multiple provisioning, billing, and routing applications.
	Memory-centric data management is used to accelerate web transactions, including in what may be the most demanding OLTP app of all, Amazon.com’s online bookstore.
	Memory-centric data management technology is OEMed in a variety of major enterprise network management products, including HP Openview.
	Memory-centric data management is used to accelerate analytics across a broad variety of industries, especially in such areas as planning, scenarios, customer analytics, and profitability analysis.

Categories: Data types, Memory-centric data management, MOLAP, Object, OLTP, Open source, Progress, Apama, and DataDirect

3 Comments

May 8, 2006

Memory-centric data management whitepaper

I have finally finished and uploaded the long-awaited white paper on memory-centric data management.

This is the project for which I originally coined the term “memory-centric data management,” after realizing that the prevalent “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:

Applix, vendors of in-memory/memory-centric MOLAP tool TM1
Progress Software, vendors of ObjectStore, an OODBMS that has more impressive references in-memory or otherwise memory-centric than it does in classical disk-based configurations, and also of the Apama stream processing products
SAP, vendors of the BI Accelerator functionality of SAP NetWeaver, or whatever tortured name they want to give it this month — basically, that’s a very cool in-memory columnar data mart technology
Solid Information Technology, vendors of a hybrid in-memory/disk-based OLTP RDBMS. Historically focused on the embedded systems market, especially telecom and networking, they’ve recently been in the news because of a deal with MySQL that is designed to extend their reach.
Intel, makers of the processors used to run a lot of the other sponsors’ products (including all BI Accelerator installations to date).

If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.
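To put those two candidate bottlenecks in perspective, here is a back-of-the-envelope sketch of how long one full pass over an in-memory data set would take at each rate. The bandwidth figures are the rough ones quoted above. The 100 GB data set size and the function name are illustrative assumptions, not measurements of any real system.

```python
def scan_seconds(data_gb, bandwidth_gb_per_sec):
    """Seconds needed to stream data_gb gigabytes once at the given bandwidth."""
    return data_gb / bandwidth_gb_per_sec


# Hypothetical data set size, chosen only for illustration.
DATA_GB = 100

# Approximate rates quoted in the text (GB/sec).
RATES = {
    "processor interconnect": 1.0,
    "processor-to-cache": 5.0,
}

for name, rate in RATES.items():
    print(f"{name}: {scan_seconds(DATA_GB, rate):.0f} s per full scan")
```

Whichever link is the real limit, the arithmetic suggests a fivefold difference in the time needed to sweep the same data.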

And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.
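For readers who don’t think in powers of two, the following minimal sketch converts the limits mentioned above into more familiar units. It assumes binary units (1 GiB = 2^30 bytes), and the helper name is made up for this illustration.

```python
def human(n_bytes):
    """Render a byte count in binary units (KiB, MiB, ...), rounding down."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    i = 0
    while n_bytes >= 1024 and i < len(units) - 1:
        n_bytes //= 1024
        i += 1
    return f"{n_bytes} {units[i]}"


for exponent in (40, 48, 64):
    print(f"2^{exponent} bytes = {human(2 ** exponent)}")

# 256 gigabytes expressed as a power of two (2^38 bytes).
print(human(256 * 2 ** 30))
```

In these units, 2^40 bytes is 1 TiB, 2^48 is 256 TiB, and 2^64 is 16 EiB, while 256 gigabytes is 2^38 bytes, which sits below even the current 2^40 limit.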

Categories: Cognos, Companies and products, In-memory DBMS, Intel, Memory-centric data management, MOLAP, Open source, Progress, Apama, and DataDirect, SAP AG, solidDB

2 Comments

May 2, 2006

DBMS2 at IBM

I had a chat a couple of weeks ago with Bob Picciano, who runs servers (i.e., DBMS) for IBM. I came away feeling that, while they don’t use that name, they’re well down the DBMS2 path. By no means is this SAP’s level of commitment; after all, IBM has to cater to traditional technology strategies as well. But they definitely seem to be getting there.

Why do I say that? Well, in no particular order:

They have a huge commitment to a data integration business, with an increasing XML focus.
Their favorite buzzword these days is “information-intensive,” which seems to amount to semi-composite apps that may talk in part to unstructured/semi-structured data.
They’re serious about their XML data server.
Unprompted – well, OK, he’s clearly read my stuff, but other than that it was unprompted – Bob referred to one of the key benefits (real and perceived) of XML storage as being “schema flexibility.”
By accident or design, IBM has a multi-server, horses-for-courses DBMS strategy: DB2 in two important flavors, XML server, Multivalue/Pick (that’s growing, by the way), and so on.

The big piece of a DBMS2 strategy that IBM seems to be lacking is a data-oriented services repository. IBM has had disasters in the past with over-grand repository plans, so they’re treading cautiously this time around. There also might be an organizational issue; DBMS and integration technology sit in separate divisions, and I doubt it’s yet appreciated throughout IBM how central data is to an SOA strategy.

But that not-so-minor detail aside, IBM definitely seems to be developing a DBMS2-like technology vision.

Categories: EAI, EII, ETL, ELT, ETLT, IBM and DB2, OLTP, Structured documents


