Analysis of MOLAP (Multidimensional OnLine Analytic Processing) products and vendors. Related subjects include:
If I weren’t on a snorkeling vacation,* this might be a good time to write about why I once called Cognos “The Gang That Couldn’t Shoot Straight,” how Ron Zambonini used that label to help him gain the company’s top spot, why he’s such a big fan of mine, why I got my highest ever per-minute speaking fee to attend a Cognos sales kickoff event, why I went for a midnight touristing stroll in downtown Ottawa in zero degree Fahrenheit weather, or how I managed, while attending the aforementioned Cognos sales kickoff, to get snowed in for three days in, of all places, Dallas, Texas. But the wrasses and jacks await, so I’ll get straight to the point.
*Albeit fairly snorkel-free so far, thanks to Hurricane Felix.
As I discussed at considerable length in a white paper, Applix’s core technology is fully-featured, memory-centric MOLAP. This is certainly cool technology, and I think it is actually unique. That it’s historically been positioned as the engine for a mid-range set of performance management tools is a travesty, a shame, the result of a prior merger – and also the quite understandable consequence of RAM limitations. However, RAM is ever cheaper and Applix’s technology is now 64-bit, so the RAM barriers have been relaxed. Cognos can take Applix’s TM1 engine high-end if it wants to. And boy, should Cognos ever want to. Indeed, there are three different great ways Cognos could package and position TM1:
- As a no-data-warehouse-design quick-start analytics engine analogous to QlikView (the fastest-growing and most important newish BI suite, open source perhaps excepted);
- As the most sophisticated and versatile planning tool this side of SAP’s APO (and while APO’s sophistication is not in dispute, its versatility is questionable anyway);
- As the processing hub for dashboards-done-right.
|Categories: Analytic technologies, Business intelligence, Cognos, Memory-centric data management, MOLAP||6 Comments|
Oracle is evidently buying Hyperion Software. Much like Gaul, Hyperion can be divided into three parts:
- Budgeting and consolidation applications, descended from the original Hyperion and Pillar.
- Essbase, the definitive MOLAP engine, descended from Arbor Software.
- A business intelligence suite, descended from Brio.
The most important part is budgeting/planning, because it could help Oracle change the rules for application software. But Essbase could be just the nudge Oracle needs to finally renounce its one-server-fits-all dogma.
|Categories: Analytic technologies, Data warehousing, Microsoft and SQL*Server, MOLAP, Oracle||6 Comments|
SAS has its own data store, called SAS Intelligence Storage. It’s a relational system running on SMP boxes, whose unique feature is that it has fixed-length records and hence is a perfect array, for speedy lookup. This is highly analogous to classical MOLAP systems. However, SAS reports that customers store up to several hundred terabytes of data in SAS Intelligence Storage, which is definitely not very analogous to what goes on in the MOLAP world.
It sounds as if the product is optimized for data mining and generic OLAP alike. Indeed, SAS Intelligence Storage is used to power both SAS’s data mining and other advanced analytics, and also its more conventional BI suite.
I’ve been posting a lot recently about the diverse database technologies used to support data warehousing. With the marketplace supporting such a broad range of architectures, it seems clear that a lot of those architectures actually deserve to thrive, presumable each in a different kind of usage scenario. So in this post I’ll take a pass at dividing up use cases for data warehouses, and suggesting which kinds of data warehouse management technologies might do the best job of supporting them. To start with, I’ve divided things into a number of buckets:
- Pinpoint data lookup
- Constrained query and reporting
- Cube-filling calculations
- Hardcore tabular data crunching
- Text and media search
- Specialty areas, such as relationship analytics
|Categories: Data warehouse appliances, Data warehousing, DATAllegro, IBM and DB2, MOLAP, Netezza, Teradata||1 Comment|
Here’s an excerpt from the introduction to my new white paper on memory-centric data management. I don’t know why WordPress insists on showing the table gridlines, but I won’t try to fix that now. Anyhow, if you’re interested enough to read most of this excerpt, I strongly suggest downloading the full paper.
Conventional DBMS don’t always perform adequately.
Ideally, IT managers would never need to think about the details of data management technology. Market-leading, general-purpose DBMS (DataBase Management Systems) would do a great job of meeting all information management needs. But we don’t live in an ideal world. Even after decades of great technical advances, conventional DBMS still can’t give your users all the information they need, when and where they need it, at acceptable cost. As a result, specialty data management products continue to be needed, filling the gaps where more general DBMS don’t do an adequate job.
Memory-centric technology is a powerful alternative.
One category on the upswing is memory-centric data management technology. While conventional DBMS are designed to get data on and off disk quickly, memory-centric products (which may or may not be full DBMS) assume all the data is in RAM in the first place. The implications of this design choice can be profound. RAM access speeds are up to 1,000,000 times faster than random reads on disk. Consequently, whole new classes of data access methods can be used when the disk speed bottleneck is ignored. Sequential access is much faster in RAM, too, allowing yet another group of efficient data access approaches to be implemented.
It does things disk-based systems can’t.
If you want to query a used-book database a million times a minute, that’s hard to do in a standard relational DBMS. But Progress’ ObjectStore gets it done for Amazon. If you want to recalculate a set of OLAP (OnLine Analytic Processing) cubes in real-time, don’t look to a disk-based system of any kind. But Applix’s TM1 can do just that. And if you want to stick DBMS instances on 99 nodes of a telecom network, all persisting data to a 100th node, a disk-centric system isn’t your best choice – but Solid’s BoostEngine should get the job done.
Memory-centric data managers fill the gap, in various guises.
Those products are some leading examples of a diverse group of specialist memory-centric data management products. Such products can be optimized for OLAP or OLTP (OnLine Transaction Processing) or event-stream processing. They may be positioned as DBMS, quasi-DBMS, BI (Business Intelligence) features, or some utterly new kind of middleware. They may come from top-tier software vendors or from the rawest of startups. But they all share a common design philosophy: Optimize the use of ever-faster semiconductors, rather than focusing on (relatively) slow-spinning disks.
They have a rich variety of benefits.
For any technology that radically improves price/performance (or any other measure of IT efficiency), the benefits can be found in three main categories:
For memory-centric data management, the “things that you couldn’t do before at all” are concentrated in areas that are highly real-time or that use non-relational data structures. Conversely, for many relational and/or OLTP apps, memory-centric technology is essentially a much cheaper/better/faster way of doing what you were already struggling through all along.
Memory-centric technology has many applications.
Through both OEM and direct purchases, many enterprises have already adopted memory-centric technology. For example:
|Categories: Data types, Memory-centric data management, MOLAP, Object, OLTP, Open source, Progress, Apama, and DataDirect||3 Comments|
I have finally finished and uploaded the long-awaited white paper on memory-centric data management.
This is the project for which I origially coined the term “memory-centric data management,” after realizing that the prevalent “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:
- Applix, vendors of in-memory/memory-centric MOLAP tool TM1
- Progress Software, vendors of ObjectStore, an OODBMS that has more impressive references in-memory or otherwise memory-centric than it does in classical disk-based configurations, and also of the Apama stream processing products
- SAP, vendors of the BI Accelerator functionality of SAP NetWeaver, or whatever tortured name they want to give it this month — basically, that’s a very cool in-memory columnar data mart technology
- Solid Information Technology, vendor of hybrid in-memory/disk-based OLTP RDBMS. Historically focused on the embedded systems market, especially telecom and networking, they’ve recently been in the news because of a deal with MySQL that is designed to extend their reach.
- Intel, makers of the processors used to run a lot of the other sponsors’ products (including all BI Accelerator installations to date).
If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.
And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.
|Categories: Cognos, Companies and products, In-memory DBMS, Intel, Memory-centric data management, MOLAP, Open source, Progress, Apama, and DataDirect, SAP AG, solidDB||2 Comments|
“MOLAP” stands for “Multidimensional OLAP.” It’s almost exactly what Ted Codd was referring to in the white paper where he introduced the term “OLAP.” Relational advocates correctly point out that relational tables are NOT “two-dimensional;” rather, every column in a table represents a dimension.
(If that’s not obvious, think of rows in a table as n-tuples, and n-tuples as akin to vectors. Then think back to the linear algebra segment at the beginning of your Calculus of Several Variables class. Vector spaces? Dimensions? I rest my case.)
Despite all that, I’m comfortable with the “M” in MOLAP, because a dimension in a MOLAP hypercube is a lot more complex than a dimension in a relational table. The latter is itself — well, if there’s a sort order, it’s typically one dimensional. But the analog in a MOLAP cube can be a whole rich and complex hierarchy.
So yes — MOLAP is inherently more multidimensional than ROLAP, atlhough one can of course do something equivalent to a single hypercube by creating a whole lot of different tables.
I did a webinar on memory-centric data management for Applix. It was the standard hour in length, but they had me do the vast majority of the talking, so I laid out my ideas in some detail.
In line with their business focus, I emphasized OLAP in general and MOLAP in particular. But I did have a chance to lay out pretty much the whole story.
There’s a lot of material in it I haven’t published yet in written form, and some nuances I may never get around to writing down. So if you’re sufficiently interested in the area, I recommend watching the webinar.
What I’ve written so far in this blog (and in Computerworld) about memory-centric data management technology is just the tip of the iceberg. A detailed white paper is forthcoming, sponsored by most of the industry leaders: Applix, Progress, SAP, Intel (in association with SAP), and Solid. (But for some odd reason Oracle declined to participate …)
A lot of the material will be rolled out publically for the first time in a webinar on Wednesday, January 25, at 11 EST. Applix is the host. To participate, please follow this link.
I’m also holding forth online, in webinars and even video, on other subjects these days. More details may be found over in the Monash Report.
I just spent a couple of days at SAP’s analyst meeting, and realized something I’d somewhat forgotten – much of the DBMS2 concept was inspired by SAP’s technical strategy. That’s not to say that SAP’s techies necessarily agree with me on every last point. But I do think it is interesting to review SAP’s version of DBMS2, to the extent I understand it.
1. SAP’s Enterprise Services Architecture (ESA) is meant to be, among other things, an abstraction layer over relational DBMS. The mantra is that they’re moving to a “message-based architecture” as opposed to a “database architecture.” These messages are in the context of a standards-based SOA, with a strong commitment to remaining open and standards-based, at least on the data and messaging levels. (The main limitation on openness that I’ve detected is that they don’t think much of standards such as BPEL in the business process definition area, which aren’t powerful enough for them.)
2. One big benefit they see to this strategy is that it reduces the need to have grand integrated databases. If one application manages data for an entity that is also important to another application, the two applications can exchange messages about the entity. Anyhow, many of their comments make it clear that, between partner company databases (a bit of a future) and legacy app databases (a very big factor in the present day), SAP is constantly aware of situations in which a single integrated database in infeasible.
3. SAP is still deeply suspicious of redundant transactional data. They feel that with redundant data you can’t have a really clean model – unless, of course, you code up really rigorous synchronization. However, if for some reason synchronization is preferred – e.g., for performance reasons — it can be hidden from users and most developers.
4. One area where SAP definitely favors redundancy and synchronization is data warehousing. Indeed, they have an ever more elaborate staging system to move data from operational to analytic systems.
5. In general, they are far from being relational purists. For example, Shai Agassi referred to doing things that you can’t do in a pure relational approach. And Peter Zencke reminded me that this attitude is nothing new. SAP has long had complex business objects, and even done some of its own memory management to make them performant, when they were structured in a manner that RDBMS weren’t well suited for. (I presume he was referring largely to BAPI.)
6. That said, they’re of course using relational data stores today for most things. One exception is text/content, which they prefer to store in their own text indexing/management system TREX. Another example is their historical support for MOLAP, although they seem to be edging as far away from that as they can without offending the MOLAP-loving part of their customer base.
Incidentally, the whole TREX strategy is subject to considerable doubt too. It’s not a state-of-the-art product, and they currently don’t plan to make it into one. In particular, they have a prejudice against semi-automated ontology creation, and that has clearly become a requirement for top-tier text technologies.
7. One thing that Peter said which confused me a bit is when we were talking about nonrelational data retrieval. The example he used was retrieving information on all of a specific sales reps’ customers, or perhaps on several sales reps’ customers. I got the feeling he was talking about the ability to text search on multiple columns and/or multiple tables/objects/whatever at once, but I can’t honestly claim that I connected all the dots.
And of course, the memory-centric ROLAP tool BI Accelerator — technology that’s based on TREX — is just another example of how SAP is willing to go beyond passively connecting to a single RDBMS. And while their sponsorship of MaxDB isn’t really an example of that, it is another example of how SAP’s strategy is not one to gladden the hearts of the top-tier DBMS vendors.
|Categories: EAI, EII, ETL, ELT, ETLT, Memory-centric data management, MOLAP, OLTP, SAP AG, Theory and architecture||9 Comments|