Analysis of MOLAP (Multidimensional OnLine Analytic Processing) products and vendors.
An enterprise user wrote in with a question that boils down to:
What are reasonable MDX performance expectations?
MDX doesn’t come up in my life very much, and I don’t have much intuition about it. E.g., I don’t know whether one can slap an MDX-to-SQL converter on top of a fast analytic RDBMS and go to town. What’s more, I’m heading off on vacation and don’t feel like researching the matter myself in the immediate future.
So here’s the long form of the question. Any thoughts?
I have a general question on assessing the performance of an OLAP technology using a set of MDX queries. I would be interested to know if there are any benchmark MDX performance tests/results comparing different OLAP technologies (which may be based on different underlying DBMS’s if appropriate) on similar hardware setup, or even comparisons of complete appliance solutions. More generally, I want to determine what performance limits I could reasonably expect on what I think are fairly standard servers.
In my own work, I have set up a star schema model centered on a Fact table of 100 million rows (approx 60 columns), with dimensions ranging in cardinality from 5 to 10,000. In ad hoc analytics, is it expected that any query against such a dataset should return a result within a minute or two (i.e. before a user gets impatient), regardless of whether that query returns 100 cells or 50,000 cells (without relying on any aggregate table or caching mechanism)? Or is that level of performance only expected with a high end massively parallel software/hardware solution? The server specs I’m testing with are: 32-bit 4 core, 4GB RAM, 7.2k RPM SATA drive, running Windows Server 2003; 64-bit 8 core, 32GB RAM, 3 Gb/s SAS drive, running Windows Server 2003 (x64).
I realise that caching of query results and pre-aggregation mechanisms can significantly improve performance, but I’m coming from the viewpoint that in purely exploratory analytics, it is not possible to have all combinations of dimensions calculated in advance, in addition to being maintained.
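To put a rough scale on the question, here is a back-of-envelope sketch of how long it might take just to read the data off disk. It is only an illustration: the 8 bytes per value, the two throughput figures, and the idea of reading only 5 of the 60 columns are assumptions of this sketch, not numbers from the question, and the function name is made up for the example. It ignores CPU cost, compression, indexes, and caching.

```python
def scan_seconds(rows, columns_read, bytes_per_value, throughput_mb_per_s):
    """Estimate the seconds needed to read the given columns sequentially."""
    bytes_to_read = rows * columns_read * bytes_per_value
    return bytes_to_read / (throughput_mb_per_s * 1_000_000)


ROWS = 100_000_000    # fact table size, from the question
COLUMNS = 60          # approximate column count, from the question
BYTES_PER_VALUE = 8   # ASSUMPTION: average uncompressed width of one value

# ASSUMPTION: sustained sequential read rates in MB/s. 300 MB/s is the
# ceiling of a 3 Gb/s link; a single drive typically sustains less.
for throughput in (100, 300):
    all_columns = scan_seconds(ROWS, COLUMNS, BYTES_PER_VALUE, throughput)
    # ASSUMPTION: a query that touches only 5 columns, on a store that
    # can avoid reading the other 55.
    five_columns = scan_seconds(ROWS, 5, BYTES_PER_VALUE, throughput)
    print(f"{throughput} MB/s: all columns {all_columns:.0f} s, "
          f"5 columns {five_columns:.0f} s")
```

Under these assumptions the uncompressed table is about 48 GB, so a query that has to read every column of every row would spend minutes on I/O alone, while one that reads only a handful of columns might fit within the "minute or two" budget. Different assumptions will move the numbers a lot.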
I have a large number of posts still in backlog. For starters, there are ones based on recent visits with Aster, Greenplum, Sybase, Vertica, and a Very Large User. I suspect I’ll write more soon on Oracle as well. Plus there’s my whole future-of-online-media area. And quite a bit more will grow out of planned research.
So there are a whole lot of other worthy subjects I doubt I’ll be getting to any time soon. In some cases, of course, other people are doing great jobs of writing about same. Here are pointers to a few links that I am glad to recommend:
- I wrote recently that I’ve discovered a number of different in-memory OLAP engines. Cindi Howson far outdid that, writing at length for Intelligent Enterprise on in-memory analytics, in an article that itself seems to be a teaser for a longer, free white paper on the subject.
- CouchDB posted an eye-catching, risqué slide presentation promoting CouchDB and, more generally, key-value stores, at least for internet applications. And yes, they’ve integrated MapReduce.
- Merv Adrian posted favorably about Birst, with special reference to its OEM efforts. As previously noted, I was highly unimpressed with Birst’s end-user BI story at the time of its September roll-out, and Jerome Pineau’s recent examination did nothing to reassure me. But perhaps OEM is a different matter.
- Merv also offers an interesting post about data integration upstart Expressor, and a highly favorable one about “visualization” vendor Tableau.
- Ann All interviewed Nigel Pendse, who grumped that BI features are overrated, and what end users really want is great query performance. I’m not so sure about the features side of that, but I’m hugely in agreement about the performance. That’s a big part of why the analytic DBMS industry is so vibrant. It’s also why in-memory OLAP is suddenly so hot.
Categories: Analytic technologies, Business intelligence, CouchDB, Data warehousing, EAI, EII, ETL, ELT, ETLT, Expressor, MapReduce, Memory-centric data management, MOLAP, Presentations, Tableau Software, Theory and architecture
My skeptical remarks on the Aleri/Coral8 merger generated some pushback. Today I actually got around to talking with John Morell, who was marketing chief at Coral8 and has remained with the combined company. First, some quick metrics:
- The combined Aleri has around 100 employees, 60-40 from Aleri vs. Coral8.
- The combined Aleri has around 80 customers. All of Aleri’s, with one sort-of exception at Banks.com, were in financial services. A large minority of Coral8’s were in financial services too.
- However, half of Aleri’s marketing spend going forward is budgeted outside the financial services markets. Not unreasonably, John presents this as a proof point Aleri is serious about selling to other markets.
- Aleri had 12-14 people in the UK pre-merger. Coral8 had none in Europe.
- Coral8 had 15 OEMs pre-merger, some actually generating revenue. Aleri had substantially none.
- Coral8 had been closing a “couple” of customers/quarter in online commerce. But recently, that rate ramped up to a “few.”
- Aleri’s engine is used to handle “many” hundreds of thousands of messages per second. Coral8’s highest-throughput user processes 100,000-150,000 messages/second.
John is sticking by the company line that there will be an integrated Aleri/Coral8 engine in around 12 months, with all the performance optimization of Aleri and flexibility of Coral8, that compiles and runs code from any of the development tools either Aleri or Coral8 now has. While this is a lot faster than, say, the Informix/Illustra or Oracle/IRI Express integrations, John insists that integrating CEP engines is a lot easier. We’ll see.
I focused most of the conversation on Aleri’s forthcoming efforts outside the financial services market. John sees these as being focused around Coral8’s old “Continuous (Business) Intelligence” message, enhanced by Aleri’s Live OLAP. Aleri Live OLAP is an in-memory OLAP engine, real-time/event-driven, fed by CEP. Queries can be submitted via ODBO/MDX today. XMLA is coming. John reports that quite a few Coral8 customers are interested in Live OLAP, and positions the capability as one Coral8 would have had to develop had the company remained independent.
Categories: Aleri and Coral8, Analytic technologies, Application areas, Complex event processing (CEP), Games and virtual worlds, Investment research and trading, MOLAP, Web analytics
I chatted yesterday with the general business side (as opposed to the trading operation) of a household-name brokerage firm, one that’s in no immediate financial peril. It seems their #1 analytic-technology priority right now is changing planning from an annual to a monthly cycle.* That’s a smart idea. While it’s especially important in their business, larger enterprises of all kinds should consider following suit.
Categories: Analytic technologies, Application areas, Business intelligence, Cognos, Data warehousing, IBM and DB2, MOLAP
I just ran across a December 10 blog post by Chuck Hollis outlining some of EMC’s, or at least Chuck’s, views on data warehousing and business intelligence. It’s worth scanning, a certain “Where you stand depends upon where you sit” flavor to it notwithstanding. In a contrast to my usual blogging style, Chuck’s post is excerpted at length below, with comments from me interspersed.
Categories: Analytic technologies, Data warehousing, EMC, MOLAP, Solid-state memory, Storage
When I went to Oracle in October, the main purpose of the visit was to discuss Exadata. And so my initial post based on the visit was focused accordingly. But there were a number of other interesting points I’ve never gotten around to writing up. Let me now remedy that, at least in part.
Categories: Complex event processing (CEP), Data types, Data warehousing, Database compression, GIS and geospatial, MOLAP, Oracle, SAP AG, Theory and architecture, Web analytics
- Frankly, I’ve come to think that disk-based OLAP cubes and materialized views are both cop-outs, indicative of a relational data warehouse architecture that can’t answer queries quickly enough straight-up. But if you disagree, then you might like Oracle’s new OLAP cube materialized views, which sound like a worthy competitor to Microsoft Analysis Services. (Further confusing things, I’ve seen reports that Oracle is increasing its commitment to Essbase, a separate MOLAP engine. I hope those are incorrect.)
- A few weeks ago, I came to realize that Oracle’s data mining database features actually mattered — perhaps not quite as much as Charlie Berger might think, but to say that is to praise with faint damns. SPSS seems to be getting large performance gains from leveraging the scoring part, and perhaps the transformation part as well. I haven’t focused on getting my details right yet, so I haven’t been writing about it. But heck, with all the other Oracle data warehousing discussion, it seems right to at least mention this part too.
If I weren’t on a snorkeling vacation,* this might be a good time to write about why I once called Cognos “The Gang That Couldn’t Shoot Straight,” how Ron Zambonini used that label to help him gain the company’s top spot, why he’s such a big fan of mine, why I got my highest ever per-minute speaking fee to attend a Cognos sales kickoff event, why I went for a midnight touristing stroll in downtown Ottawa in zero degree Fahrenheit weather, or how I managed, while attending the aforementioned Cognos sales kickoff, to get snowed in for three days in, of all places, Dallas, Texas. But the wrasses and jacks await, so I’ll get straight to the point.
*Albeit fairly snorkel-free so far, thanks to Hurricane Felix.
As I discussed at considerable length in a white paper, Applix’s core technology is fully-featured, memory-centric MOLAP. This is certainly cool technology, and I think it is actually unique. That it’s historically been positioned as the engine for a mid-range set of performance management tools is a travesty, a shame, the result of a prior merger – and also the quite understandable consequence of RAM limitations. However, RAM is ever cheaper and Applix’s technology is now 64-bit, so the RAM barriers have been relaxed. Cognos can take Applix’s TM1 engine high-end if it wants to. And boy, should Cognos ever want to. Indeed, there are three different great ways Cognos could package and position TM1:
- As a no-data-warehouse-design quick-start analytics engine analogous to QlikView (the fastest-growing and most important newish BI suite, open source perhaps excepted);
- As the most sophisticated and versatile planning tool this side of SAP’s APO (and while APO’s sophistication is not in dispute, its versatility is questionable anyway);
- As the processing hub for dashboards-done-right.
Categories: Analytic technologies, Business intelligence, Cognos, Memory-centric data management, MOLAP
Oracle is evidently buying Hyperion Software. Much like Gaul, Hyperion can be divided into three parts:
- Budgeting and consolidation applications, descended from the original Hyperion and Pillar.
- Essbase, the definitive MOLAP engine, descended from Arbor Software.
- A business intelligence suite, descended from Brio.
The most important part is budgeting/planning, because it could help Oracle change the rules for application software. But Essbase could be just the nudge Oracle needs to finally renounce its one-server-fits-all dogma.
Categories: Analytic technologies, Data warehousing, Microsoft and SQL*Server, MOLAP, Oracle
SAS has its own data store, called SAS Intelligence Storage. It’s a relational system running on SMP boxes, whose unique feature is that it has fixed-length records and hence is a perfect array, for speedy lookup. This is highly analogous to classical MOLAP systems. However, SAS reports that customers store up to several hundred terabytes of data in SAS Intelligence Storage, which is definitely not very analogous to what goes on in the MOLAP world.
It sounds as if the product is optimized for data mining and generic OLAP alike. Indeed, SAS Intelligence Storage is used to power both SAS’s data mining and other advanced analytics, and also its more conventional BI suite.
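The "fixed-length records, hence a perfect array" point can be illustrated with a minimal sketch. This is not SAS's implementation; the record layout, the names, and the in-memory buffer standing in for a file are all assumptions made for the example. The idea is only that when every record has the same size, record number n starts at byte offset n times the record size, so a lookup is a single seek and no scan is needed.

```python
import io
import struct

# HYPOTHETICAL layout: two 32-bit integer keys and one 64-bit float measure.
# "<" means little-endian with no padding, so every record is 16 bytes.
RECORD = struct.Struct("<iid")


def write_records(buf, rows):
    """Append each row to the buffer as one fixed-length record."""
    for row in rows:
        buf.write(RECORD.pack(*row))


def read_record(buf, index):
    """Fetch record number `index` by computing its byte offset directly."""
    buf.seek(index * RECORD.size)
    return RECORD.unpack(buf.read(RECORD.size))


# An in-memory buffer stands in for a data file in this sketch.
store = io.BytesIO()
write_records(store, [(i, i * 2, i * 0.5) for i in range(1000)])

print(read_record(store, 742))
```

With variable-length records the same lookup would need either a scan or a separate index of offsets, which is presumably why the fixed-length design is described as speedy.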