Progress, Apama, and DataDirect

Analysis of Progress Software and its various product lines, including Apama, DataDirect, and OpenEdge. Related subjects include:

February 1, 2008

Dan Weinreb on ObjectStore

Dan Weinreb was one of the key techies at Object Design, the company that made the object-oriented database management system ObjectStore. (Object Design later merger into Excelon, which was eventually sold to Progress, which has deemphasized but still supports ObjectStore.) Recently he wrote a pair of long and fascinating articles about Object Design, ObjectStore, and OODBMS, the first of which makes the case that “object-oriented database management systems succeeded.” Read more

January 22, 2008

What leading DBMS vendors don’t want you to realize

For very high-end applications, the list of viable database management systems is short. Scalability can be a problem. (The rankings of most scalable alternatives differ in the OLTP and data warehouse realms.) Extreme levels of security can be had from only a few DBMS. (Oracle would have you believe there’s only one choice.) And if you truly need 99.99% uptime, there only are a few DBMS you even should consider.

But for most applications at any enterprise – and for all applications at most enterprises – super high-end DBMS aren’t required. There are relatively few applications that wouldn’t run perfectly well on PostgreSQL or EnterpriseDB today. Ingres and Progress OpenEdge aren’t far behind (they’re a little lacking in datatype support). Ditto Intersystems Cache’, although the nonrelational architecture will be off-putting to many. And to varying degrees, you can also do fine with MySQL, Pervasive PSQL, MaxDB, or a variety of other products – or for that matter with the cheap or free crippled versions of Oracle, SQL Server, DB2, and Informix.

What’s more, these mid-range database management systems can have significant advantages over their high-end brethren. Read more

August 10, 2007

Coral8 versus StreamBase

Besides talking about what Coral8 and StreamBase (and other CEP vendors) have in common, Mark Tsimelzon and I talked quite a bit about what he sees as some of the important differences. There were a lot, of course, but three in particular stood out.

1. Mark believes Coral8 has significantly lower latency than StreamBase. E.g., the Wombat/Coral8 combo achieves sub-millisecond latency, with Coral8 itself consuming less than a tenth of that. The best comparable figures from StreamBase that I currently know of are almost an order of magnitude slower.

Top-end speed aside, Mark believes that Coral8 is fundamentally better suited for complex queries and pattern recognition, while StreamBase works well with simpler queries. For example, his other performance claims notwithstanding, he concedes that StreamBase is at least comparable to Coral8 in its throughput for huge numbers of simple queries. (The number he mentioned was ½ million queries/second.) Indeed, while we barely talked about customer/marketing issues, Mark asserts that the companies’ respective customer bases reflect this complex/simple distinction.*
Read more

August 3, 2007

Competitive claims in CEP

For the most part, the vendors I talk with in complex event/stream processing like and speak well of each other (most of the exceptions seem to involve StreamBase). Even so, there are a lot of interesting competitive claims and counterclaims in this market. Prior posts and comment threads have covered Apama/StreamBase jousting on the subjects of who has more business and how many financial data feeds StreamBase supports. Other areas that generate interesting sparks are performance, parallelism, and determinism. Read more

August 3, 2007

A deeper dive into Apama

My recent non-technical Apama briefing has now had a much more technical sequel, with charming founder and former Cambridge professor John Bates. He still didn’t fully open the kimono – trade secrets and all that — but here’s the essence of what’s going on.

Complex event/stream processing (CEP) is all about looking for many patterns at once. Reality – the stream(s) of data – is checked against these patterns for matches. In Apama, these patterns are kept in a kind of tree – they call it a hypertree — and John says the work to check them is only logarithmic in the number of patterns.

Since patterns commonly have multiple parts — and usually also take time to unfold — what really goes on is that partial matches are found, after which what’s being matched against is the REMAINDER of the pattern. Thus, there’s constant pruning and rebalancing of the tree. What’s more, a large fraction of all patterns – at least in the financial trading market — involve a short time window, which again creates a need for ongoing, rapid tree modification. Read more

July 26, 2007

An era of easier database portability?

More and more, I find myself addressing questions of database portability and transparency, most particularly in the cases of EnterpriseDB, Ants Software, and now also Dataupia. None of those three efforts is very large yet, but so far I’d rate their respective buzzes to be very encouraging in the case of EnterpriseDB, non-discouraging or better in the case of Ants, and too early to judge for Dataupia. On the whole, it definitely seems like a matter worthy of attention.

With that as backdrop, where is all this compatibility/portability/transparency stuff going to lead? Read more

July 18, 2007

StreamBase rebuts

In my post Monday about Apama, I complained that StreamBase hadn’t offered a rebuttal to some of Apama’s claims. This has now been fixed. :) Bill Hobbib, StreamBase’s VP of Marketing wrote in. Part of what he had to say was the following.

Adapters to Data Feeds

Your blog comment that adapters doesn’t seem like a key competitive differentiator is accurate, and since adapters are so straightforward to develop with StreamBase as part of a customer engagement, we’ve never found adapters to be a key competitive differentiator. The comment by a competitor that their advantage over StreamBase comes from their having developed more adapters suggests they cannot distinguish themselves based on the other functional capabilities that are important to customers. In reality, our speed/performance and scalability are orders of magnitude superior to competitors, as is the speed with which StreamBase applications are developed, deployed, and modified when business needs change. (If it were easy to develop applications with certain competitive systems, then one might assume they would make free evaluation versions of their product available for download from their websites!)

That being said, StreamBase offers adapters to a broad array of data feeds. Most of these are offered out-of-the-box by StreamBase, including the following:
* Financial Market Data: processes data from Reuters® RMDS™ and Reuters Triarch™
* TIBCO® Rendezvous™: converts Rendezvous message into StreamBase tuples and vice versa.
* StreamBase Adapter for JDBC: connects StreamBase to enterprise databases, allowing submission of SQL queries to external resources such as IBM® DB2™, Oracle®, Microsoft® SQLServer™, and Sybase®.
* StreamBase Adapter for JMS: integrates StreamBase with any JMS-compliant message bus.
* StreamBase Adapter for Microsoft Excel™: allows applications to publish data to Excel or read data from Excel.
* StreamBase CSV Adapters: allow applications to read data from, and write data to, comma-separated value (CSV) files.
* StreamBase SMTP adapter: taps into the IP stack on a running system to process live data, converts the IP packets into a TCP data stream, or reads IP packets from captured files.
* StreamBase XML Adapter: streams XML-formatted data records into and out of StreamBase applications

We also can connect to financial exchanges either using our own adapters or through a third-party partnership. Below you’ll find a listing of those.

Read more

July 16, 2007

Progress Apama

I finally got my promised briefing with Progress Apama. Unfortunately, nobody particularly technical was able to attend, but I came away with a better understanding even so.

Unlike StreamBase or Truviso, Apama has a rules-based architecture. In essence, the rules engine maintains state of various kinds, and matches that state against desired patterns, called “scenarios.” They can handle 100s or possibly even 1000s of scenarios at once. Read more

June 14, 2007

Native XML engine short list

I’ve been implying that the short list for native XML database engine vendors should be Mark Logic, IBM, and maybe Microsoft, on the theory that Progress and Intersystems tried the market and pulled back. Well, add Intersystems to the list, and not necessarily in last place. They’ve long had a very fast nonrelational engine in Cache’. Perhaps building Ensemble on it has induced them to sharpen up the XML capabilities again.

Anyhow, while I’m not at liberty to explain more of my reasoning (i.e., to disclose my evidence) — Cache’ should be taken seriously as an XML DBMS alternative … even if I never can seem to get a proper DBMS briefing from them (which is far from entirely being their fault).

April 28, 2007

Progress Software progress report

For the past 20+ years – all the way back to when it was still privately held — I’ve periodically gotten up to speed on Progress Software. I’m trying again now, and to that end dropped by yesterday for a chat with Jeff Stamen. I’ll give a brief overview now – which is probably all I’m qualified to do right now anyway – and then loop back with more detailed info after I get it.

After a reorganization at the beginning of this (November) fiscal year, the vast majority of Progress’ products fall into one of five buckets, which I shall glibly refer to in decreasing order of size as “Progress Classic,” “SOA,” “drivers,” “memory-centric,” and “EasyAsk.” Here’s a quick overview of each. Read more

April 18, 2007

Naming the DBMS disruptors

Edit: This post has largely been superseded by this more recent one defining mid-range relational DBMS.

I find myself defining a new product category – midrange OLTP/multipurpose DBMS. (Or just midrange DBMS for brevity.) Nothing earthshaking here; I’m simply referring to those products that: Read more

March 25, 2007

Oracle, Tangosol, objects, caching, and disruption

Oracle made a slick move in picking up Tangosol, a leader in object/data caching for all sorts of major OLTP apps. They do financial trading, telecom operations, big web sites (Fedex, Geico), and other good stuff. This is a reminder that the list of important memory-centric data handling technologies is getting fairly long, including:

And that’s just for OLTP; there’s a whole other set of memory-centric technologies for analytics as well.

When one connects the dots, I think three major points jump out:

  1. There’s a lot more to high-end OLTP than relational database management.
  2. Oracle is determined to be the leader in as many of those areas as possible.
  3. This all fits the market disruption narrative.

I write about Point #1 all the time. So this time around let me expand a little more on #2 and #3.

Read more

February 27, 2007

Opportunities for disruption in the OLTP database management market (deck-clearing post #2)

The standard Clayton Christensen “Innovator’s Dilemma” disruption narrative goes something like this:

And it’s really hard for market leaders to avert this sad fate, because the short- and intermediate-term margin hit would be too great.

I think the OLTP DBMS market is ripe for that kind of disruption – riper than commentators generally realize. Here are some key potential drivers.

Read more

February 27, 2007

OLTP database management system market – the consensus isn’t ALL wrong (deck-clearing post #1)

Most of what I’ve written lately about database management seems to have been focused on analytic technologies. But I have a lot to say on the OLTP (OnLine Transaction Processing) side too. So let’s start by clearing the decks. Here’s a list of some consensus views that I in essence agree with:

May 10, 2006

White paper on memory-centric data management — excerpt

Here’s an excerpt from the introduction to my new white paper on memory-centric data management. I don’t know why Wordpress insists on showing the table gridlines, but I won’t try to fix that now. Anyhow, if you’re interested enough to read most of this excerpt, I strongly suggest downloading the full paper.

Introduction

Conventional DBMS don’t always perform adequately.

Ideally, IT managers would never need to think about the details of data management technology. Market-leading, general-purpose DBMS (DataBase Management Systems) would do a great job of meeting all information management needs. But we don’t live in an ideal world. Even after decades of great technical advances, conventional DBMS still can’t give your users all the information they need, when and where they need it, at acceptable cost. As a result, specialty data management products continue to be needed, filling the gaps where more general DBMS don’t do an adequate job.

Memory-centric technology is a powerful alternative.

One category on the upswing is memory-centric data management technology. While conventional DBMS are designed to get data on and off disk quickly, memory-centric products (which may or may not be full DBMS) assume all the data is in RAM in the first place. The implications of this design choice can be profound. RAM access speeds are up to 1,000,000 times faster than random reads on disk. Consequently, whole new classes of data access methods can be used when the disk speed bottleneck is ignored. Sequential access is much faster in RAM, too, allowing yet another group of efficient data access approaches to be implemented.

It does things disk-based systems can’t.

If you want to query a used-book database a million times a minute, that’s hard to do in a standard relational DBMS. But Progress’ ObjectStore gets it done for Amazon. If you want to recalculate a set of OLAP (OnLine Analytic Processing) cubes in real-time, don’t look to a disk-based system of any kind. But Applix’s TM1 can do just that. And if you want to stick DBMS instances on 99 nodes of a telecom network, all persisting data to a 100th node, a disk-centric system isn’t your best choice – but Solid’s BoostEngine should get the job done.

Memory-centric data managers fill the gap, in various guises.

Those products are some leading examples of a diverse group of specialist memory-centric data management products. Such products can be optimized for OLAP or OLTP (OnLine Transaction Processing) or event-stream processing. They may be positioned as DBMS, quasi-DBMS, BI (Business Intelligence) features, or some utterly new kind of middleware. They may come from top-tier software vendors or from the rawest of startups. But they all share a common design philosophy: Optimize the use of ever-faster semiconductors, rather than focusing on (relatively) slow-spinning disks.

They have a rich variety of benefits.

For any technology that radically improves price/performance (or any other measure of IT efficiency), the benefits can be found in three main categories:

  • Doing the same things you did before, only more cheaply;
  • Doing the same things you did before, only better and/or faster;
  • Doing things that weren’t technically or economically feasible before at all.

For memory-centric data management, the “things that you couldn’t do before at all” are concentrated in areas that are highly real-time or that use non-relational data structures. Conversely, for many relational and/or OLTP apps, memory-centric technology is essentially a much cheaper/better/faster way of doing what you were already struggling through all along.

Memory-centric technology has many applications.

Through both OEM and direct purchases, many enterprises have already adopted memory-centric technology. For example:

  • Financial services vendors use memory-centric data management throughout their trading systems.
  • Telecom service vendors use memory-centric data management in multiple provisioning, billing, and routing applications.
  • Memory-centric data management is used to accelerate web transactions, including in what may be the most demanding OLTP app of all — Amazon.com’s online bookstore.
  • Memory-centric data management technology is OEMed in a variety of major enterprise network management products, including HP Openview.
  • Memory-centric data management is used to accelerate analytics across a broad variety of industries, especially in such areas as planning, scenarios, customer analytics, and profitability analysis.

May 8, 2006

Memory-centric data management whitepaper

I have finally finished and uploaded the long-awaited white paper on memory-centric data management.

This is the project for which I origially coined the term “memory-centric data management,” after realizing that the prevalent “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:

If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.

And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.

January 26, 2006

Progress DataDirect discovers XML

As a general rule, if you want DBMS drivers, your first call should be to Progress DataDirect. They’ve been the dominant vendor (under multiple names and ownerships) of both ODBC and JDBC drivers, essentially since those standards’ respective inventions. (Persistent Systems Private Ltd. — better known as PSPL — wouldn’t be a terrible choice for your second call).

DataDirect seems to have introduced XQuery drivers last fall. I don’t have a lot of detail on those, however, because the DataDirect guy who contacted me did so mainly to show off a nice toy, Stylus Studio. StylusS tudio is an XML query-building toolkit, available for online purchase for $800 or less. A lot of the users seem to be system integrators. Sales are split 50-50 between the DataDirect regular salesforce and online, apparently mainly from their own store, but I got the sense we’re not talking about huge numbers yet.

In usability when they demoed it to me it looked on a par with Cognos Improptu (a SQL query-building tool) circa the mid-1990s. But they do claim all the right things in round-trip code generation and so on.

Applications seem to be concentrated in intercompany information exchange, based on both legacy EDI (Electronic Data Interchange) and more modern web services. Other uses they cited were parsing web server logs and publishing relational data to a web page.

The technology/product seems to have bounced around for a while, from Object Design (OODBMS pioneer that took a premature shot at the XML database business, and the source of the ObjectStore technology I keep writing about in this blog) to eXcelon (merger partner for ODS, eventually bought by Progress), to Progress’s Sonic Software Division, and now to DataDirect after Progress bought them. Apparently none of those companies have or had top-end UI expertise …

If you want to get a better feel for XQuery, you could do worse than to play with this tool. For example, it’s what I think I’ll use in the unlikely case I ever get around to parsing the SpamAssassin add-ins to my email messages and trying to understand what SpamAssassin is and isn’t doing.

January 11, 2006

Another OLTP success for memory-centric OO

Computerworld published a Progress ObjectStore OLTP success story.

Hotel reservations system, this time. Not as impressive as the Amazon store — what is? — but still nice.

November 14, 2005

Defining and surveying “Memory-centric data management”

I’m writing more and more about memory-centric data management technology these days, including in my latest Computerworld column. You may be wondering what that term refers to. Well, I’ve basically renamed what are commonly called “in-memory DBMS,” for what I think is a very good reason: Most of the products in the category aren’t true DBMS, aren’t wholly in-memory, or both! Indeed, if you catch me in a grouchy mood I might argue that “in-memory DBMS” is actually a contradiction in terms.

I’ll give a quick summary of the vendors and products I am focusing on in this newly-named category, and it should be clearer what I mean:

So there you have it. There are a whole lot of technologies out there that manage data in RAM, in ways that would make little or no sense if disks were more intimately involved. Conventional DBMS also try to exploit RAM and limit disk access, via caching; but generally the data access methods they use in RAM are pretty similar to those they use when going out to disk. So memory-centric systems can have a major advantage.

October 10, 2005

The Amazon.com bookstore is a huge, modern OLTP app. So is it relational?

I don’t know for a fact that the Amazon.com bookstore is the world’s biggest OLTP application — but if it isn’t, it’s close.

And the thing is — that’s never been an entirely relational application. Oh, the ordering part surely is. But the inventory lookup is currently driven by an OODBMS (from Progress). The personalization used to be done in Red Brick (I knew which software replaced it, but I’m forgetting at the moment — it may even be one of the relational warehouse appliance vendors). And of course the full-text search is a custom in-house system.

August 13, 2005

The end of the single-server DBMS vendor

For all practical purposes, there are no DBMS vendors left advocating single-server strategies. Oracle was the last one, but it just acquired in-memory data management vendor TimesTen, which will be used as a cache in front of high-performance Oracle databases. (It will also continue to be sold for stand-alone uses, especially in the financial trading and defense/intelligence markets.)

IBM’s Viper is a server-and-a-half story, with lots of integration over a dual-server (one relational, one native XML) base. IBM also is moving aggressively in data integration/federation, with Ascential and many other acquisitions. It also sells a broad range of database products itself, including two DB2s, several Informix products, and so on.

Microsoft also has a multi-server strategy. In its case, relational, text, and MOLAP storage are more separate than in Oracle’s or even IBM’s products; again, there’s a thick layer of technology on top integrating them. An eventual move to native XML storage will, one must imagine, be handled in the same way.

Smaller vendors Sybase and Progress also offer multiple DBMS each.

Teradata is a pretty big player with only one DBMS — but it’s specialized for data warehousing. Teradata is the first to tell you you should use something else for your classical transaction processing.

The Grand Unified Integrated Database theory is, so far as I can tell, quite dead. Some people just refuse to admit that fact.

Feed including blog about database management, data warehousing, and business intelligence Subscribe to the Monash Research feed via RSS or email:

Login

Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Recent white paper

The Explosion in DBMS Choice

August, 2008

Recent webcast

What leading database vendors don't want you to know

Originally broadcast April 9, 2008

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.