November 14, 2005

So how robust is Ingres?

CA is spinning off Ingres, more or less, to an investment fund led by Terry Garnett, who will also be interim CEO. Now, I’ve given Terry a lot of grief over the decades. It started by accident, when I bashed his presentation of Lightyear at a 1984 party in Rosann Stach’s house (where we also used Jerry Kaplan as a subject for the Mindprober psychological analysis product — those were the days of goofy software!). Years later, I didn’t even recall that had been Terry until I was reminded. But in the early 1990s, when Terry and Jerry Baker were dueling at Oracle, I was firmly in the Jerry Baker camp, and to this day I believe I was right. Still — be all that as it may, Terry knows DBMS and knows promotion, and if the company falls flat it won’t be because he screwed it up. He’s no dunce, and he’s been around DBMS a loooong time.

But how stands the product? Let’s flash back a decade, to when CA bought it. Ingres was a solid general-purpose RDBMS. But it was beginning to fall behind the technology power curve, especially on the data warehousing side. (For more detail, see my Ingres history post over in the Software Memories blog.) And then product development slowed to a crawl. Tony Gaughan, who ran the product for CA before the latest move, claims that they’ve actually done a good job of advancing the product on the OLTP side, perhaps to the point of comparability with Oracle9i, and certainly ahead of MySQL 5.0. I’m inclined to believe him, after applying some reasonable discount factor for expected puffery, in part because this wasn’t a high hurdle to cross. Over the past decade, the main action in high-end DBMS product enhancement has been in data warehousing and nontabular datatypes, not in OLTP.

Where Ingres definitely seems to lag is in data warehousing. E.g., there are no materialized views, and I bet that even if they have some of the specialized warehousing features such as bitmap indexes, star-schema optimizations, etc., the implementation, optimizer support, administrative support, and so on lag far behind those of Oracle and IBM. So again, the proper comparison for Ingres isn’t Oracle and IBM; it’s fellow open source vendor MySQL. Only — deserved or not, MySQL has a ton of momentum for such a small company, including an attractive product plan partially fueled by SAP.

Appliance vendor DATAllegro makes a plausibility argument that Ingres can be adapted for nontrivial data warehouse uses as well. But while that’s cool, and might even become persuasive once DATAllegro has some happy, disclosed customers, it’s not the same as saying you want to put a big data warehouse into off-the-shelf Ingres.

So basically, I’m afraid that Ingres is going to appeal mainly to users who either already are making major use of it, or else have a huge problem with paying the license fees demanded by other vendors. I wish them well, and hope they kindle a spark somehow; but right now I don’t see where it would be coming from.

November 14, 2005

Defining and surveying “Memory-centric data management”

I’m writing more and more about memory-centric data management technology these days, including in my latest Computerworld column. You may be wondering what that term refers to. Well, I’ve basically renamed what are commonly called “in-memory DBMS,” for what I think is a very good reason: Most of the products in the category aren’t true DBMS, aren’t wholly in-memory, or both! Indeed, if you catch me in a grouchy mood I might argue that “in-memory DBMS” is actually a contradiction in terms.

I’ll give a quick summary of the vendors and products I’m focusing on in this newly named category, and it should then be clearer what I mean:

So there you have it. There are a whole lot of technologies out there that manage data in RAM, in ways that would make little or no sense if disks were more intimately involved. Conventional DBMS also try to exploit RAM and limit disk access, via caching; but generally the data access methods they use in RAM are pretty similar to those they use when going out to disk. So memory-centric systems can have a major advantage.
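To make that concrete, here’s a toy sketch (mine alone, resembling no particular vendor’s internals) of the difference between page-oriented access and a truly memory-centric alternative:

```python
# Toy illustration, not any vendor's actual code. A disk-style system
# keeps rows in fixed-size pages and reaches them through an index,
# even when everything happens to be cached in RAM.
pages = [[(1, "a"), (2, "b"), (3, "c"), (4, "d")],
         [(5, "e"), (6, "f"), (7, "g"), (8, "h")]]
page_index = {1: 0, 5: 1}  # lowest key on each page -> page number

def page_oriented_lookup(key):
    # Walk the index to a page, then scan within the page -- the same
    # steps a cached disk page would require.
    page_no = page_index[max(lo for lo in page_index if lo <= key)]
    for k, v in pages[page_no]:
        if k == key:
            return v

# A memory-centric system, knowing the data never round-trips to disk,
# can use a plain hash table (or direct pointers) and skip all that.
by_key = {k: v for page in pages for k, v in page}

def memory_centric_lookup(key):
    return by_key[key]

assert page_oriented_lookup(6) == memory_centric_lookup(6) == "f"
```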

November 13, 2005

Breaking through the disk speed barrier

Most aspects of computer performance and capacity grow at Moore’s Law kinds of speeds. Doubling times may be anywhere from 9 months to 2 years, but in any case speeds and storage capacities grow exponentially quickly. Not so, however, with disk rotation speeds. The very first disk drives, almost 50 years ago, rotated 1,200 times per minute. Today’s top disk rotation speed is around 15,000 RPM. Indeed, while I recall seeing a reference to one at 15,600 RPM, I can’t now go back and find it. Yes, folks; disk rotational speed in the entire history of computing has increased by a measly factor of 12.5.

Why does this matter to DBMS design? Simply put, disk rotation speed is an absolute limit on the speed of random disk-based data access. Today’s fastest disks take 4 milliseconds to rotate once. Thus, multiple heads aside, getting something from a known but random location on the disk will take, on average, 2 milliseconds of rotational delay alone, before seek time is even considered. And a naive data management algorithm will, for a single query, result in dozens or even hundreds of random accesses.
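If you want to check my arithmetic, here’s a quick back-of-envelope sketch (my own, in Python):

```python
# Back-of-envelope arithmetic for rotational latency.
def rotational_latency_ms(rpm):
    # Time for one full rotation, and the expected (half-rotation)
    # wait before a random sector passes under the head.
    full_rotation_ms = 60000.0 / rpm  # 60,000 ms per minute
    return full_rotation_ms, full_rotation_ms / 2

print(rotational_latency_ms(1200))   # 1950s-era drive: (50.0, 25.0)
print(rotational_latency_ms(15000))  # today's fastest: (4.0, 2.0)

# So a naive plan doing 100 scattered single-row reads pays roughly
# 100 * 2 ms = 200 ms in rotational delay alone, before seek time.
```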

Thus, for a DBMS to run at acceptable speed, it needs to get data off disk not randomly, but rather a page at a time (i.e., in large blocks of predetermined size) or, better yet, sequentially (i.e., in continuous streams of indeterminate size). The indexes needed to meet these goals had best be sized to fit entirely in RAM. Clustering also plays an increasingly large role, so that data needed at the same time is likely to be on the same page, or at least in the same part of the disk.
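Here’s a crude cost model of why that matters; the constants are numbers I made up for illustration, not measurements of any product or disk:

```python
# Crude, illustrative cost model; the constants are assumptions, not
# benchmarks of any real disk or DBMS.
AVG_RANDOM_ACCESS_MS = 6.0     # assumed seek + rotational latency
SEQUENTIAL_MB_PER_SEC = 50.0   # assumed streaming transfer rate
ROW_BYTES = 200
ROWS_PER_PAGE = 40             # roughly an 8 KB page

def random_reads_ms(rows):
    return rows * AVG_RANDOM_ACCESS_MS       # one disk hit per row

def paged_reads_ms(rows):
    num_pages = -(-rows // ROWS_PER_PAGE)    # ceiling division
    return num_pages * AVG_RANDOM_ACCESS_MS  # one disk hit per page

def sequential_scan_ms(rows):
    mb = rows * ROW_BYTES / 1e6
    return AVG_RANDOM_ACCESS_MS + 1000.0 * mb / SEQUENTIAL_MB_PER_SEC

rows = 100000
print(random_reads_ms(rows))     # 600000.0 ms -- ten minutes
print(paged_reads_ms(rows))      # 15000.0 ms  -- fifteen seconds
print(sequential_scan_ms(rows))  # 406.0 ms    -- well under a second
```

The exact constants don’t matter much; the orders-of-magnitude spread among the three approaches is the point.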

Right there I’ve described some of the toughest ongoing challenges facing DBMS engineers. The big vendors all do a great job at meeting them (if they didn’t, they’d be out of business). Even so, some small companies find themselves able to beat the big guys, by some egregious cheating.

Data warehouse appliance vendors such as Netezza and especially DATAllegro optimize their systems to stream data sequentially off disk. In doing so, they go deeper into the operating systems, hardware, etc. than Oracle could ever allow itself to do. And the results seem pretty good. But I’ll write about that another time. Instead, I’m focusing right now on memory-centric data management; please see my other posts in that topic category.

November 13, 2005

Gartner on “The Death of the Database”

Gartner had a recent conference session on “The Death of the Database,” as described in David Berlind’s and Kathy Somebodyorother’s blogs. The core idea was that in the future data might be stored close to where it needs to be used, which might not be in a traditional DBMS.

Before getting to the real meat of that, let me push back against some of the extremist boobirds. First, I doubt the analysts really talked about “the intersection of a row and a tuple”; it’s much more likely that that is a misquote due to a reporting error. Second, their claim that BI will switch from being an “application” to a “service” is not at all unreasonable. BI should never have been viewed as an application; it’s much more a collection of application-enabling technologies. And the analysts explicitly said that DBMS will continue to be useful for analytics. As for their claim that some data needs to be only briefly persistent — they’re absolutely right, but let me defer that point to a separate post on memory-centric OLTP.

All that said — while a lot of their points ring true, it sounds as if they overstated their case in one important area. They’re making it sound as if some of today’s OLTP databases will no longer be needed, and as if tomorrow’s new kinds of OLTP data won’t need to be at least partly persisted to conventional DBMS. Wrong and wrong. Every important transaction needs to wind up in a DBMS. Those DBMS may not be as centralized as previously thought. The data may be copied to non-DBMS data stores (or, more likely, kept in a lightweight local DBMS and copied from there to a serious OLTP database). These DBMS may use native XML rather than traditional tabular data structures. But at the end of the day, transactional databases will continue to be needed for all the reasons they’ve been necessary in the past.

November 12, 2005

TransRelational(TM) — The final debunking

In prior posts, I’ve mentioned the essential dishonesty behind the hoohah around TransRelational(TM) technology from Required Technologies, Inc., and Chris Date’s highly regrettable promotion of same. Now I’ve been able to get more detail from another former executive of the company. Unsurprisingly, it corroborates what I wrote before, and utterly contradicts some of the myths spread by Date and his acolytes. This executive, while requesting that his name be withheld because of the acrimony between the CEO and just about every other company insider, otherwise gave me permission to report fully on what he told me. Read more

October 29, 2005

Oh, dear — Chris Date is displeased with me

Chris Date is quite annoyed with me, and has taken issue with various things I’ve written. Some of his reasoning is hard to follow. For example, he said something to the effect that it would be silly for him to ever say anything misleading, because he’d immediately be caught out. Uh, Chris – you’re the guy who’s decrying the terrible level of education and understanding in a field for which YOU WROTE THE DEFINITIVE TEXTBOOK (which has sold “over 700,000 copies”). If your readers can’t even understand the correct things you say in your book, why should they be able to instantly spot the errors? Read more

October 18, 2005

EII marketing soup

In the comments to another thread, the subject of EII (Enterprise Information Integration) came up. It’s a tricky one, for several reasons.

First, it’s a marketing construction — a blend of ETL (Extract, Transform, Load) and EAI (Enterprise Application Integration). It’s a legitimate category; all those things are getting smushed together as near-real-time apps become more prominent. Still, it’s also an attempt to grab marketing turf.

Second, it’s commonly associated with a marketing overreach — the claim that an EII “platform” or “suite” will do everything a DBMS does (almost), but fully and heterogeneously distributed as well. Yeah, right.

Third, two of the sharpest proponents have been acquired by behemoths that tend to obscure their acquirees’ marketing pitches — Ascential by IBM and SeeBeyond by Sun.

Fourth, some of the grandest integrated EII suites (at least the ones that started as ETL, which is the side I’m more familiar with) aren’t complete yet. So vendors don’t want to be too clear, for fear of freezing current sales. I’m referring here mainly to Ascential and Informatica. They’ve told analysts of their grand plans, but they haven’t been so eager to openly publicize the full details.

Fifth, the area is getting integrated with development tools for composite applications. Good examples there are SeeBeyond and InterSystems’ Caché.

Sixth, no EII vendor’s plans fully work without full relational and XML integration, and nobody has really been doing a great job on that; vendors are typically strong in one area or the other.

Obviously, this is an area I have to research actively; EII is the neuromuscular system that holds DBMS2 together. But all the research in the world won’t change the fact that as of now it’s the weak spot in the story. There’s lots of great database management technology, and lots of excellent reasons to use a variety of kinds of that technology in your enterprise. But the tools to knit the resulting heterogeneous databases together are still sadly deficient.

October 13, 2005

It’s not about a single database

Critics of the DBMS2 idea generally are focused on the design of a single database. That’s somewhat missing the point.

Here are some excerpts and paraphrases from a discussion over on TDAN.

October 10, 2005

Limitations of the Relational Model

In my October Computerworld column, I tried to explain some of the reasons why I don’t think the pure Relational Model should be as absolutely dominant as its most fervent proponents assert.

The key points were:

1. Logical and physical modeling will never be completely separable.
2. “True relational” DBMSs are very unlikely ever to be practically useful, except perhaps in narrow niches.
3. Enterprises don’t fully control their data models.
4. Duplicated data is not inherently bad.
5. Saying that the relational model (RM) is based on mathematics proves almost nothing.
6. IT isn’t just concerned with facts.

For details see the link above.

And while I’m at it, here’s a link to my September Computerworld column, on three life-and-death apps that won’t get built with a relational architecture.

October 10, 2005

TransRelational(TM) nonsense

Database guru Christopher J. Date is apparently accepting money from attendees to his seminars on TransRelational(TM) database architecture, so that he can tell them about an as-yet-unreleased product from Required Technologies, Inc.

This is regrettable on multiple levels.

1. Required Technologies shut down product development in 2002, after running through $30 million; there’s great acrimony between investors and the CEO; and lawsuits are likely.

2. Required’s product never did most of what Date seems to be claiming it now does. It was a read-oriented columnar data store, much like Sybase IQ or a number of other products from younger companies. Read more
