Intersystems and Cache’
Analysis of Intersystems, its object-oriented DBMS Cache’, and its composite application tool Ensemble. Related subjects include:
Philip Howard went to at least one conference this month I didn’t, namely IBM’s, and wrote up some highlights. As usual, he seems to have been favorably impressed.
In one note, he says that IBM is claiming a 2-5X XML performance improvement. This is a good step, since one of my clients who evaluated such engines dismissed IBM early on for being an order of magnitude too slow. That client ultimately chose Marklogic, with Cache’ having been the only other choice to make the short list.
Speaking of IBM, I flew back from the Business Objects conference next to a guy who supports IMS. He told me that IBM has bragged of an actual new customer win for IMS within the past couple of years (a large bank in China). Read more
I’ve been implying that the short list for native XML database engine vendors should be MarkLogic, IBM, and maybe Microsoft, on the theory that Progress and Intersystems tried the market and pulled back. Well, add Intersystems to the list, and not necessarily in last place. They’ve long had a very fast nonrelational engine in Cache’. Perhaps building Ensemble on it has induced them to sharpen up the XML capabilities again.
Anyhow, while I’m not at liberty to explain more of my reasoning (i.e., to disclose my evidence) — Cache’ should be taken seriously as an XML DBMS alternative … even if I never can seem to get a proper DBMS briefing from them (which is far from entirely being their fault).
|Categories: IBM and DB2, Intersystems and Cache', MarkLogic, Microsoft and SQL*Server, Progress, Apama, and DataDirect, Structured documents||1 Comment|
Edit: This post has largely been superseded by this more recent one defining mid-range relational DBMS.
I find myself defining a new product category – midrange OLTP/multipurpose DBMS. (Or just midrange DBMS for brevity.) Nothing earthshaking here; I’m simply referring to those products that: Read more
|Categories: Actian and Ingres, EnterpriseDB and Postgres Plus, IBM and DB2, Intersystems and Cache', Microsoft and SQL*Server, Mid-range, MySQL, OLTP, Open source, Oracle, Progress, Apama, and DataDirect, solidDB, Sybase||8 Comments|
The standard Clayton Christensen “Innovator’s Dilemma” disruption narrative goes something like this:
- Market leaders have many advantages, including top technology.
- Followers come up with good technology too.
- The leaders stay ahead by making their products ever better and more complex.
- The followers sell into new or non-mainstream markets, at prices the leaders can’t match. So they dominate new markets.
- Old markets turn into low-margin commodity-fests.
- Old leaders are screwed.
And it’s really hard for market leaders to avert this sad fate, because the short- and intermediate-term margin hit would be too great.
I think the OLTP DBMS market is ripe for that kind of disruption – riper than commentators generally realize. Here are some key potential drivers:
Most of what I’ve written lately about database management seems to have been focused on analytic technologies. But I have a lot to say on the OLTP (OnLine Transaction Processing) side too. So let’s start by clearing the decks. Here’s a list of some consensus views that I in essence agree with:
- Oracle is the top of the line, and has nothing wrong with it other than cost of ownership and the non-joys of doing business with Oracle Corporation.
- DB2/mainframe is a fine product, but only if you like IBM mainframes.
- DB2/open systems is another fine product, but it’s hard to think of reasons to use it over Oracle.
- Microsoft SQL Server has great cost of ownership if you’re a Windows (server) shop anyway, especially on the administrative side. It does most but not all of what Oracle does.
- Sybase Adaptive Server Enterprise is a lot like SQL Server, but without the Windows dependence or the great Microsoft tools. If you have it installed or are Chinese, you should strongly consider using it, but otherwise there are better alternatives.
- Progress’ DBMS is great if you don’t need any of the features it’s missing. Administration, for example, is a super-low-cost breeze. But why use it unless you’re also using the Progress development tools?
- Intersystems’ Cache’ is another fine mid-range product that involves buying into the vendors’ whole tool set – all the more so because it isn’t relational.
- Small-footprint embedded DBMS, from vendors such as Sybase’s iAnywhere division or Solid Information Technologies, are off in their own little world. Mainly, that world is telecom, with a satellite in medical devices, although other kinds of networked equipment also sometimes use these products.
- IBM’s non-DB2 database management products – IMS, Informix, etc. – are fine things to stick with until you have to change. Ditto products from Software AG, Computer Associates, Cincom, etc.
- MySQL Version 4 is an OLTP joke, but it’s a joke many people share. (Hey — a lot of blogs, including mine, run on WordPress and MySQL 4.)
- Until Ingres is meaningfully marketed and sold outside its installed base, it’s not worth worrying about.
- PostgreSQL is more significant as the underpinning of other products — mainly EnterpriseDB in the OLTP space — than it is in its own right.
About a year ago, I wrote a very favorable column focusing on Intersystems’ OODBMS Cache’. Cache’ appears to be the one OODBMS product that has good performance even in a standard disk-centric configuration, notwithstanding that random pointer access seems to be antithetical to good disk performance.
Intersystems also has a hot new Cache’-based integration product, Ensemble. They attempted to brief me on it (somewhat belatedly, truth be told) last Wednesday. Through no fault of the product, however, the briefing didn’t go so well. I still look forward to learning more about Ensemble.
The introduction and technical-implementation part of this discussion was in Part 1.
It seems likely that widespread adoption of native XML storage is, at best, several years off, if for no other reason than that the DML (Data Manipulation Language) situation is still rather primitive. But looking beyond that nontrivial problem, it does seem as if there are broad classes of application that might go better in native XML. Here’s a survey.
First of all, there’s what might be called custom document composition – technical publishing, customized technical manuals, etc. If you make complex products, or sell information, this is obviously an important specialty application for you. Otherwise, it probably is rather peripheral, at least for now. If you do have an interest in this area, by the way, you shouldn’t only look at the big guys’ XML offerings; you should even talk to specialists like Mark Logic. (Mark Logic sells an XML-only DBMS with a strong text-search orientation.)
Second, there are complex documents with low update rates. Medical records are a prime example – and, by the way, may of those are stored in InterSystems’ OODBMS Cache rather than in a relational system. Other examples might include insurance claims, media assets, etc. – basically, the areas that have been thought of as the purview of document management systems. In many cases, these apps ain’t broke and shouldn’t be fixed, such as when they exist mainly to satisfy slow-changing regulatory requirements. Besides, it’s not obvious that native XML is particularly useful for these apps anyway. Often, the information is in a DBMS for three main reasons: General manageability (e.g., backup), ad-hoc searchability, and management of metadata. If the metadata is simple enough to fit comfortably into a tabular structure, extended-relational DBMS may be satisfactory as underpinnings for these apps indefinitely.
Third, and here’s where it really begins to get interesting, is complex transactional documents. One of the flagship apps in Viper’s alpha test was financial derivatives trading, with complex, number-laden, term-laden contracts being processed very quickly, and it’s easy to envision that kind of functionality spreading across the trading sector. Governments – wisely or not – may want to require new complex forms to be filled out, or to make older ones easier to process. (E.g., tax returns, or applications for various kinds of permits.) If privacy concerns allow, medical information might be collected and processed centrally by governments or large insurance providers. Complex service-level agreements could be negotiated for a broad variety of product and service categories. Customers might demand radically faster processing of insurance claims than has historically been necessary. Indeed, it’s hard to think of an industry sector where complex transactional documents might not gain a foothold. And if you’re looking for high performance access to portions of documents, native XML may well be the best storage choice.
Finally, there’s a fourth category, which I’ll give the trendy-looking name Profiles 2.0, in imitation of Web 2.0, Identity 2.0, and so on. Here’s what I mean by it. A number of the hottest buzzconcepts in computing focus on collecting, organizing, and using information about individual people – presence, identity, personalization/customization, narrowcasting/market-of-one, data mining/predictive analytics, weblog analysis, social software, and so on. Put all those together, and you have a humongous hairball of a user profile that no current systems come close to handling properly.
Let’s think about some characteristics of this data. Some of it is transient. Some of it is unreliable. Some of it indeed is guesswork – albeit educated guesswork – rather than fact (e.g., the results of data mining analyses). Much of it exists for some profilees but not others. Much of it is naturally tree- or graph-shaped (e.g., information about website traversal, product category interests, relationship networks, role-based authorizations, etc.) There are many kinds of it; pulling it all together relationally can lead to Joins From Hell.
And this isn’t just for individuals; similar kinds of stories can be told for information about organizations, battleships, and so on. Those are objects with rich internal structures. True, those can usually be modeled hierarchically – but at each node, some of the complications mentioned in the prior paragraph occur. Profiling an enterprise is even messier than profiling a single individual who shops or works there.
Applications using this kind of information are typically extremely primitive, even though the beginnings of the personalization hype are now 7-8 years in the past. I don’t think we’re going to get these systems kind right until we take a true, holistic view of individuals and their profiles – and until we learn how to think about apps whose fundamental objects keep changing in shape. But as hard as the problem is, it has to be worked on immediately, because what I’m talking about here are some of the major classes of competitive-advantage app.
So Profiles 2.0 isn’t something we can just ignore. And when we do pay attention to it, I don’t think we’ll find that it looks very natural dressed in rows and columns.
In the comments to another thread, the subject of EII (Enterprise Information Integration) came up. It’s a tricky one, for several reasons.
First, it’s a marketing construction — a blend between between ETL (Extract, Transform, Load) and EAI (Enterprise Application Integration). It’s a legitimate category; all those things are getting smushed together as near-real-time apps become more prominent. Still, it’s also an attempt to grab marketing turf.
Second, it’s commonly associated with a marketing overreach — the claim that an EII “platform” or “suite” will do everything a DBMS does (almost), but fully and heterogeneously distributed as well. Yeah, right.
Third, two of the sharpest proponents have been acquired by behemoths that tend to obscure their acquirees marketing pitches — Ascential by IBM and SeeBeyond by Sun.
Fourth, some of the best grand integrated EII suites (at least the ones that started as ETL, which is the side I’m more familiar with) aren’t complete yet. So vendors didn’t want to be too clear for fear of freezing current sales. I’m referring here mainly to Ascential and Informatica. They told analysts of their grand plans, but they haven’t been so eager to openly publicize the full details.
Fifth, the area is getting integrated with development tools for composite applications. Good examples there are SeeBeyond and Intersystems’ Cache’.
Sixth, no EII vendors’ plans fully work unless they have full relational and XML integration, and nobody really has been doing a great job on that, typically being strong in one area or the other.
Obviously, this is an area I have to research actively; EII is the neuromuscular system that holds DBMS2 together. But all the research in the world won’t change the fact that as of now it’s the weak spot in the story. There’s lots of great database management technology, and lots of excellent reasons to use a variety of kinds of that technology in your enterprise. But the tools to knit the resulting heterogeneous databases together are still sadly deficient.