MOLAP
Analysis of MOLAP (Multidimensional OnLine Analytic Processing) products and vendors. Related subjects include:
- Cognos (which now owns TM1)
- Oracle (which now owns Essbase)
- Microsoft (which offers Microsoft Analysis Services)
- Data warehousing
EMC’s take on data warehousing and BI
I just ran across a December 10 blog post by Chuck Hollis outlining some of EMC’s (or at least Chuck’s) views on data warehousing and business intelligence. It’s worth scanning, a certain “Where you stand depends upon where you sit” flavor to it notwithstanding. In contrast to my usual blogging style, Chuck’s post is excerpted at length below, with comments from me interspersed.
| Categories: Analytic technologies, Data warehousing, EMC, MOLAP, Solid-state memory, Storage | 2 Comments |
More Oracle notes
When I went to Oracle in October, the main purpose of the visit was to discuss Exadata. And so my initial post based on the visit was focused accordingly. But there were a number of other interesting points I’ve never gotten around to writing up. Let me now remedy that, at least in part.
| Categories: Complex event processing (CEP), Data types, Data warehousing, Database compression, GIS and geospatial, MOLAP, Oracle, SAP AG, Theory and architecture, Web analytics | 9 Comments |
Other notes on Oracle data warehousing
Obviously, the big news this week is Exadata, and its parallelization or lack thereof. But let’s not forget the rest of Oracle’s data warehousing technology.
- Frankly, I’ve come to think that disk-based OLAP cubes and materialized views are both cop-outs, indicative of a relational data warehouse architecture that can’t answer queries quickly enough straight-up. But if you disagree, then you might like Oracle’s new OLAP cube materialized views, which sound like a worthy competitor to Microsoft Analysis Services. (Further confusing things, I’ve seen reports that Oracle is increasing its commitment to Essbase, a separate MOLAP engine. I hope those are incorrect.)
- A few weeks ago, I came to realize that Oracle’s data mining database features actually mattered — perhaps not quite as much as Charlie Berger might think, but to say that is to praise with faint damns.
SPSS seems to be getting large performance gains from leveraging the scoring part, and perhaps the transformation part as well. I haven’t focused on getting my details right yet, so I haven’t been writing about it. But heck, with all the other Oracle data warehousing discussion, it seems right to at least mention this part too.
Applix – Three huge opportunities Cognos will probably ignore
If I weren’t on a snorkeling vacation,* this might be a good time to write about why I once called Cognos “The Gang That Couldn’t Shoot Straight,” how Ron Zambonini used that label to help him gain the company’s top spot, why he’s such a big fan of mine, why I got my highest ever per-minute speaking fee to attend a Cognos sales kickoff event, why I went for a midnight touristing stroll in downtown Ottawa in zero degree Fahrenheit weather, or how I managed, while attending the aforementioned Cognos sales kickoff, to get snowed in for three days in, of all places, Dallas, Texas. But the wrasses and jacks await, so I’ll get straight to the point.
*Albeit fairly snorkel-free so far, thanks to Hurricane Felix.
As I discussed at considerable length in a white paper, Applix’s core technology is fully-featured, memory-centric MOLAP. This is certainly cool technology, and I think it is actually unique. That it’s historically been positioned as the engine for a mid-range set of performance management tools is a travesty, a shame, the result of a prior merger – and also the quite understandable consequence of RAM limitations. However, RAM is ever cheaper and Applix’s technology is now 64-bit, so the RAM barriers have been relaxed. Cognos can take Applix’s TM1 engine high-end if it wants to. And boy, should Cognos ever want to. Indeed, there are three different great ways Cognos could package and position TM1:
- As a no-data-warehouse-design quick-start analytics engine analogous to QlikView (the fastest-growing and most important newish BI suite, open source perhaps excepted);
- As the most sophisticated and versatile planning tool this side of SAP’s APO (and while APO’s sophistication is not in dispute, its versatility is questionable anyway);
- As the processing hub for dashboards-done-right.
| Categories: Analytic technologies, Business intelligence, Cognos, Memory-centric data management, MOLAP | 6 Comments |
How Hyperion will change Oracle
Oracle is evidently buying Hyperion Software. Much like Gaul, Hyperion can be divided into three parts:
- Budgeting and consolidation applications, descended from the original Hyperion and Pillar.
- Essbase, the definitive MOLAP engine, descended from Arbor Software.
- A business intelligence suite, descended from Brio.
The most important part is budgeting/planning, because it could help Oracle change the rules for application software. But Essbase could be just the nudge Oracle needs to finally renounce its one-server-fits-all dogma.
| Categories: Analytic technologies, Data warehousing, Microsoft and SQL*Server, MOLAP, Oracle | 4 Comments |
SAS Intelligence Storage
SAS has its own data store, called SAS Intelligence Storage. It’s a relational system running on SMP boxes, whose unique feature is that it has fixed-length records and hence is a perfect array, for speedy lookup. This is highly analogous to classical MOLAP systems. However, SAS reports that customers store up to several hundred terabytes of data in SAS Intelligence Storage, which is definitely not very analogous to what goes on in the MOLAP world.
It sounds as if the product is optimized for data mining and generic OLAP alike. Indeed, SAS Intelligence Storage is used to power both SAS’s data mining and other advanced analytics, and also its more conventional BI suite.
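The fixed-length-record point can be illustrated with a minimal sketch. This is not SAS's implementation; the record layout, field names, and helper functions below are hypothetical. The idea is simply that when every record has the same size, record number `i` lives at byte offset `i * record_size`, so a lookup is plain array arithmetic rather than a search.

```python
import struct

# Hypothetical layout (an assumption for illustration):
# 4-byte int id, 8-byte float amount, 8-byte fixed-width region name.
RECORD = struct.Struct("<id8s")  # "<" means no padding, so every record is 20 bytes


def build_store(rows):
    """Pack rows into one contiguous buffer of equal-sized records."""
    return b"".join(
        RECORD.pack(rid, amount, name.encode("ascii")) for rid, amount, name in rows
    )


def lookup(store, index):
    """Fetch record number `index` by computing its byte offset directly."""
    offset = index * RECORD.size
    rid, amount, raw_name = RECORD.unpack_from(store, offset)
    # Short names are padded with NUL bytes by struct; strip them on the way out.
    return rid, amount, raw_name.rstrip(b"\x00").decode("ascii")


rows = [(1, 9.5, "east"), (2, 12.0, "west"), (3, 7.25, "north")]
store = build_store(rows)
print(lookup(store, 2))
```

The trade-off is the usual one for arrays: lookups by position are cheap, but variable-length values have to be padded or truncated to fit the fixed width.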
| Categories: Data warehousing, MOLAP, SAS Institute | 4 Comments |
Data warehouse and mart uses – a tentative taxonomy
I’ve been posting a lot recently about the diverse database technologies used to support data warehousing. With the marketplace supporting such a broad range of architectures, it seems clear that a lot of those architectures actually deserve to thrive, presumably each in a different kind of usage scenario. So in this post I’ll take a pass at dividing up use cases for data warehouses, and suggesting which kinds of data warehouse management technologies might do the best job of supporting them. To start with, I’ve divided things into a number of buckets:
- Pinpoint data lookup
- Constrained query and reporting
- Cube-filling calculations
- Hardcore tabular data crunching
- Text and media search
- Specialty areas, such as relationship analytics
| Categories: Data warehouse appliances, Data warehousing, DATAllegro, IBM and DB2, MOLAP, Netezza, Teradata | 1 Comment |
White paper on memory-centric data management — excerpt
Here’s an excerpt from the introduction to my new white paper on memory-centric data management. I don’t know why WordPress insists on showing the table gridlines, but I won’t try to fix that now. Anyhow, if you’re interested enough to read most of this excerpt, I strongly suggest downloading the full paper.
Introduction
Conventional DBMS don’t always perform adequately.
Ideally, IT managers would never need to think about the details of data management technology. Market-leading, general-purpose DBMS (DataBase Management Systems) would do a great job of meeting all information management needs. But we don’t live in an ideal world. Even after decades of great technical advances, conventional DBMS still can’t give your users all the information they need, when and where they need it, at acceptable cost. As a result, specialty data management products continue to be needed, filling the gaps where more general DBMS don’t do an adequate job.
Memory-centric technology is a powerful alternative.
One category on the upswing is memory-centric data management technology. While conventional DBMS are designed to get data on and off disk quickly, memory-centric products (which may or may not be full DBMS) assume all the data is in RAM in the first place. The implications of this design choice can be profound. RAM access speeds are up to 1,000,000 times faster than random reads on disk. Consequently, whole new classes of data access methods can be used when the disk speed bottleneck is ignored. Sequential access is much faster in RAM, too, allowing yet another group of efficient data access approaches to be implemented.
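As a back-of-envelope sketch of where a ratio like 1,000,000 can come from: the two latency figures below are illustrative assumptions (a random disk read of roughly 10 milliseconds and a RAM access of roughly 10 nanoseconds), not measurements from the paper, and real hardware varies widely.

```python
# Assumed, illustrative latencies; these are NOT figures from the white paper.
DISK_RANDOM_READ_NS = 10_000_000  # about 10 ms per random disk read (assumption)
RAM_ACCESS_NS = 10                # about 10 ns per RAM access (assumption)
LOOKUPS = 1_000_000               # number of random lookups to compare

ratio = DISK_RANDOM_READ_NS // RAM_ACCESS_NS


def total_seconds(per_access_ns, accesses):
    """Total time for `accesses` strictly sequential lookups, in seconds."""
    return per_access_ns * accesses / 1e9


disk_seconds = total_seconds(DISK_RANDOM_READ_NS, LOOKUPS)
ram_seconds = total_seconds(RAM_ACCESS_NS, LOOKUPS)

print(f"RAM is about {ratio:,}x faster per random access")
print(f"{LOOKUPS:,} random lookups: disk {disk_seconds:,.0f} s, RAM {ram_seconds:.2f} s")
```

Under these assumptions, a workload that is hopeless on disk (hours of seek time) fits in a fraction of a second in RAM, which is why access methods that would be unthinkable on disk become reasonable.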
It does things disk-based systems can’t.
If you want to query a used-book database a million times a minute, that’s hard to do in a standard relational DBMS. But Progress’ ObjectStore gets it done for Amazon. If you want to recalculate a set of OLAP (OnLine Analytic Processing) cubes in real-time, don’t look to a disk-based system of any kind. But Applix’s TM1 can do just that. And if you want to stick DBMS instances on 99 nodes of a telecom network, all persisting data to a 100th node, a disk-centric system isn’t your best choice – but Solid’s BoostEngine should get the job done.
Memory-centric data managers fill the gap, in various guises.
Those products are some leading examples of a diverse group of specialist memory-centric data management products. Such products can be optimized for OLAP or OLTP (OnLine Transaction Processing) or event-stream processing. They may be positioned as DBMS, quasi-DBMS, BI (Business Intelligence) features, or some utterly new kind of middleware. They may come from top-tier software vendors or from the rawest of startups. But they all share a common design philosophy: Optimize the use of ever-faster semiconductors, rather than focusing on (relatively) slow-spinning disks.
They have a rich variety of benefits.
For any technology that radically improves price/performance (or any other measure of IT efficiency), the benefits can be found in three main categories:
For memory-centric data management, the “things that you couldn’t do before at all” are concentrated in areas that are highly real-time or that use non-relational data structures. Conversely, for many relational and/or OLTP apps, memory-centric technology is essentially a much cheaper/better/faster way of doing what you were already struggling through all along.
Memory-centric technology has many applications.
Through both OEM and direct purchases, many enterprises have already adopted memory-centric technology. For example:
| Categories: Data types, Memory-centric data management, MOLAP, Object, OLTP, Open source, Progress, Apama, and DataDirect | 3 Comments |
Memory-centric data management whitepaper
I have finally finished and uploaded the long-awaited white paper on memory-centric data management.
This is the project for which I originally coined the term “memory-centric data management,” after realizing that the prevalent “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:
- Applix, vendors of in-memory/memory-centric MOLAP tool TM1
- Progress Software, vendors of ObjectStore, an OODBMS that has more impressive references in-memory or otherwise memory-centric than it does in classical disk-based configurations, and also of the Apama stream processing products
- SAP, vendors of the BI Accelerator functionality of SAP NetWeaver, or whatever tortured name they want to give it this month — basically, that’s a very cool in-memory columnar data mart technology
- Solid Information Technology, vendor of hybrid in-memory/disk-based OLTP RDBMS. Historically focused on the embedded systems market, especially telecom and networking, they’ve recently been in the news because of a deal with MySQL that is designed to extend their reach.
- Intel, makers of the processors used to run a lot of the other sponsors’ products (including all BI Accelerator installations to date).
If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.
And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.
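For readers who want the powers of two spelled out, here is a small arithmetic sketch. The `human` helper is a hypothetical convenience function, and binary units (1 KiB = 1024 bytes) are an assumption about how to read the figures above.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def human(nbytes):
    """Express an exact power-of-two byte count in the largest binary unit."""
    i = 0
    while nbytes >= 1024 and i < len(UNITS) - 1:
        nbytes //= 1024
        i += 1
    return f"{nbytes} {UNITS[i]}"


for bits in (40, 48, 64):
    print(f"2^{bits} bytes = {human(2 ** bits)}")

# For comparison, 256 gigabytes expressed as a power of two.
print(f"256 GiB = 2^{(256 * 2 ** 30).bit_length() - 1} bytes")
```

So 2^40 bytes is 1 TiB, 2^48 bytes is 256 TiB, and 2^64 bytes is 16 EiB, while 256 GiB corresponds to 2^38 bytes.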
| Categories: Cognos, Companies and products, In-memory DBMS, Intel, Memory-centric data management, MOLAP, Open source, Progress, Apama, and DataDirect, SAP AG, solidDB | 2 Comments |
Why I use the word “MOLAP”
“MOLAP” stands for “Multidimensional OLAP.” It’s almost exactly what Ted Codd was referring to in the white paper where he introduced the term “OLAP.” Relational advocates correctly point out that relational tables are NOT “two-dimensional;” rather, every column in a table represents a dimension.
(If that’s not obvious, think of rows in a table as n-tuples, and n-tuples as akin to vectors. Then think back to the linear algebra segment at the beginning of your Calculus of Several Variables class. Vector spaces? Dimensions? I rest my case.)
Despite all that, I’m comfortable with the “M” in MOLAP, because a dimension in a MOLAP hypercube is a lot more complex than a dimension in a relational table. The latter is itself — well, if there’s a sort order, it’s typically one dimensional. But the analog in a MOLAP cube can be a whole rich and complex hierarchy.
So yes, MOLAP is inherently more multidimensional than ROLAP, although one can of course do something equivalent to a single hypercube by creating a whole lot of different tables.
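To make the "dimension as a hierarchy" point concrete, here is a toy sketch. It is not how any particular MOLAP engine works; the hierarchies, member names, and functions are hypothetical. Each dimension is a tree of members, and the cube holds a cell for every combination of members at every level, not just for the leaf-level values a flat relational column would carry.

```python
from collections import defaultdict

# Hypothetical hierarchies: each member maps to its parent (None at the top).
TIME = {"2007": None, "Q1": "2007", "Q2": "2007",
        "Jan": "Q1", "Feb": "Q1", "Apr": "Q2"}
GEO = {"World": None, "Americas": "World", "Europe": "World",
       "US": "Americas", "Canada": "Americas", "France": "Europe"}


def ancestors(hierarchy, member):
    """Return the member followed by all of its ancestors, leaf first."""
    chain = []
    while member is not None:
        chain.append(member)
        member = hierarchy[member]
    return chain


def build_cube(facts):
    """Add each leaf-level fact into every (time, geo) cell above it."""
    cube = defaultdict(float)
    for time_leaf, geo_leaf, value in facts:
        for t in ancestors(TIME, time_leaf):
            for g in ancestors(GEO, geo_leaf):
                cube[(t, g)] += value
    return cube


facts = [("Jan", "US", 100.0), ("Feb", "US", 50.0),
         ("Jan", "Canada", 30.0), ("Apr", "France", 20.0)]
cube = build_cube(facts)
print(cube[("Q1", "Americas")])
print(cube[("2007", "World")])
```

A relational design can produce the same answers, but typically by joining a fact table to separate dimension tables for each hierarchy, which is the "whole lot of different tables" alternative.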
