MOLAP
Analysis of MOLAP (Multidimensional OnLine Analytic Processing) products and vendors. Related subjects include:
- Cognos (which now owns TM1)
- Oracle (which now owns Essbase)
- Microsoft (which offers Microsoft Analysis Services)
- Data warehousing
The Ted Codd guarantee
I write a lot about whether or not to use relational DBMS. For example:
- In May I surveyed relational vs. non-relational pros and cons at some length.
- Last November I mused about when it might be OK to do without joins.
- The question is implicit in a variety of posts about, say, document-oriented or object-oriented DBMS.
Before going further in that vein, I’d like to do a quick review of what E. F. “Ted” Codd was getting at with the relational model in the first place. Read more
| Categories: Data models and architecture, IBM and DB2, MOLAP, NoSQL | 1 Comment |
Eight kinds of analytic database (Part 2)
In Part 1 of this two-part series, I outlined four variants on the traditional enterprise data warehouse/data mart dichotomy, and suggested what kinds of DBMS products you might use for each. In Part 2 I’ll cover four more kinds of analytic database — even newer, for the most part, with a use case/product short list match that is even less clear. Read more
Eight kinds of analytic database (Part 1)
Analytic data management technology has blossomed, leading to many questions along the lines of “So which products should I use for which category of problem?” The old EDW/data mart dichotomy is hopelessly outdated for that purpose, and adding a third category for “big data” is little help.
Let’s try eight categories instead. While no categorization is ever perfect, these each have at least some degree of technical homogeneity. Figuring out which types of analytic database you have or need — and in most cases you’ll need several — is a great early step in your analytic technology planning. Read more
When it’s still best to use a relational DBMS
There are plenty of viable alternatives to relational database management systems. For short-request processing, both document stores and fully object-oriented DBMS can make sense. Text search engines have an important role to play. E. F. “Ted” Codd himself once suggested that relational DBMS weren’t best for analytics.* Analysis of machine-generated log data doesn’t always have a naturally relational aspect. And I could go on with more examples yet.
*Actually, he didn’t admit that what he was advocating was a different kind of DBMS, namely a MOLAP one — but he was. And he was wrong anyway about the necessity for MOLAP. But let’s overlook those details.
Nonetheless, relational DBMS dominate the market. As I see it, the reasons for relational dominance cluster into four areas (which of course overlap):
- Data re-use. Ted Codd’s famed original paper referred to shared data banks for a reason.
- The benefits of normalization, which include:
  - You only have to do programming work of writing something once …
  - … and you don’t have to do the programming work of keeping multiple versions of the information consistent.
  - You only have to do processing work of writing something once.
  - You only have to buy storage to hold each fact once.
- Separation of concerns.
  - Different people can worry about programming and “database stuff.”
  - Indeed, even performance optimization can sometimes be separated from programming (i.e., when all you have to do to get speed is implement the correct indexes).
- Maturity and momentum, as reflected in the availability of:
  - People.
  - A broad variety of mature relational DBMS.
  - Vast amounts of packaged software that “talks” SQL.
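To make the normalization point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer and orders tables, their columns, and the sample values are all hypothetical, chosen only for illustration. Each customer's city is stored once, so a single UPDATE is reflected in every order that joins to that customer.

```python
import sqlite3


def build_normalized_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            id   INTEGER PRIMARY KEY,
            name TEXT,
            city TEXT              -- the fact "where the customer is" lives only here
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id),
            amount      REAL
        );
        INSERT INTO customer VALUES (1, 'Acme', 'Boston');
        INSERT INTO orders VALUES (10, 1, 100.0);
        INSERT INTO orders VALUES (11, 1, 250.0);
        """
    )
    return conn


def order_cities(conn):
    """Return (order id, customer city) pairs by joining the two tables."""
    query = """
        SELECT o.id, c.city
        FROM orders o
        JOIN customer c ON c.id = o.customer_id
        ORDER BY o.id
    """
    return conn.execute(query).fetchall()


conn = build_normalized_db()
# One write to one row; no per-order copies of the city to keep consistent.
conn.execute("UPDATE customer SET city = 'Chicago' WHERE id = 1")
print(order_cities(conn))
```

Had the city been copied into every order row instead, the same change would have meant one write per order, plus application code to make sure no copy was missed.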
Generally speaking, I find the reasons for sticking with relational technology compelling in cases such as: Read more
| Categories: Analytic technologies, Data models and architecture, Database diversity, MOLAP, NoSQL, Object, Theory and architecture | 17 Comments |
Evolving definitions and technology categories for 2011
It seems my prediction of a limited blogging schedule in December came emphatically true. I shall re-start with a collection of quick thoughts, clearing the decks for more detailed posts to follow. Read more
| Categories: Analytic technologies, DBMS product categories, Data types, Data warehousing, MOLAP, Theory and architecture | 6 Comments |
Ray Wang on SAP
Ray Wang made a terrific post based on SAP’s annual influencer love-in, an event which I no longer attend. Ray believes SAP has been in a “crisis”, and sums up his views as:
The Bottom Line – SAP’s Turning The Corner
Credit must be given to SAP for charting a new course. A shift in the management philosophy and product direction will take years to realize, however, its not too late for change. SAP must remember its roots and become more German and less American. The renewed focus must put customer requests and priorities ahead of SAP’s bureaucracy. The emphasis must focus on the relationship. When that reemerges in how SAP works with customers, partners, influencers, and its own employees, SAP will be back in good graces. In the meantime, its time to get to work and deliver. Oracle’s Fusions Apps are coming soon and competitors such as IBM, Microsoft, Epicor, IFS, and SalesForce.com will not relent.
I recall the 1980s, when SAP’s main differentiator, at least in the English-speaking US, was a total commitment to customer success, and when it could be taken for granted that SAP would do business ethically. Things change, and not always for the better.
Anyhow, the reason I’m highlighting Ray’s post is that he makes reference to a number of interesting SAP-centric technology trends or initiatives. Read more
| Categories: Analytic technologies, Business intelligence, MOLAP, Memory-centric data management, SAP AG, Solid-state memory | 1 Comment |
A question on MDX performance
An enterprise user wrote in with a question that boils down to:
What are reasonable MDX performance expectations?
MDX doesn’t come up in my life very much, and I don’t have much intuition about it. E.g., I don’t know whether one can slap an MDX-to-SQL converter on top of a fast analytic RDBMS and go to town. What’s more, I’m heading off on vacation and don’t feel like researching the matter myself in the immediate future.
So here’s the long form of the question. Any thoughts?
I have a general question on assessing the performance of an OLAP technology using a set of MDX queries. I would be interested to know if there are any benchmark MDX performance tests/results comparing different OLAP technologies (which may be based on different underlying DBMS’s if appropriate) on similar hardware setup, or even comparisons of complete appliance solutions. More generally, I want to determine what performance limits I could reasonably expect on what I think are fairly standard servers.
In my own work, I have set up a star schema model centered on a Fact table of 100 million rows (approx 60 columns), with dimensions ranging in cardinality from 5 to 10,000. In ad hoc analytics, is it expected that any query against such a dataset should return a result within a minute or two (i.e. before a user gets impatient), regardless of whether that query returns 100 cells or 50,000 cells (without relying on any aggregate table or caching mechanism)? Or is that level of performance only expected with a high end massively parallel software/hardware solution? The server specs I’m testing with are: 32-bit 4 core, 4GB RAM, 7.2k RPM SATA drive, running Windows Server 2003; 64-bit 8 core, 32GB RAM, 3 Gb/s SAS drive, running Windows Server 2003 (x64).
I realise that caching of query results and pre-aggregation mechanisms can significantly improve performance, but I’m coming from the viewpoint that in purely exploratory analytics, it is not possible to have all combinations of dimensions calculated in advance, in addition to being maintained.
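For a rough sense of scale, here is a back-of-envelope sketch of how long one unaggregated pass over a fact table like that might take. Only the 100 million rows and 60 columns come from the question. The 8 bytes per value, the 100 MB/second sustained read rate, and the 5-column case are assumptions picked purely for illustration, and the estimate ignores compression, indexes, caching, and parallelism.

```python
def full_scan_seconds(rows, columns, bytes_per_value, read_mb_per_sec):
    """Estimate the time to read rows x columns values sequentially from disk."""
    total_bytes = rows * columns * bytes_per_value
    return total_bytes / (read_mb_per_sec * 1_000_000)


ROWS = 100_000_000       # from the question
COLUMNS = 60             # from the question
BYTES_PER_VALUE = 8      # assumption: average stored width of one value
READ_MB_PER_SEC = 100    # assumption: sustained sequential read rate of one drive

# Reading every column of every row, as a naive row-oriented full scan would.
all_columns = full_scan_seconds(ROWS, COLUMNS, BYTES_PER_VALUE, READ_MB_PER_SEC)

# Reading only 5 of the 60 columns (assumption: the query touches just those,
# and the storage layout lets the rest be skipped).
five_columns = full_scan_seconds(ROWS, 5, BYTES_PER_VALUE, READ_MB_PER_SEC)

print(f"all 60 columns: {all_columns:.0f} s")
print(f"5 columns only: {five_columns:.0f} s")
```

Under those assumptions the full-width scan takes several minutes on one drive, while a scan of a handful of columns fits within the minute or two the question mentions. Real numbers depend heavily on the engine and storage layout, which is why the question is worth asking.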
| Categories: Analytic technologies, Benchmarks and POCs, Data warehousing, MOLAP | 16 Comments |
Clearing some of my buffer
I have a large number of posts still in backlog. For starters, there are ones based on recent visits with Aster, Greenplum, Sybase, Vertica, and a Very Large User. I suspect I’ll write more soon on Oracle as well. Plus there’s my whole future-of-online-media area. And quite a bit more will grow out of planned research.
So there are a whole lot of other worthy subjects I doubt I’ll be getting to any time soon. In some cases, of course, other people are doing great jobs of writing about same. Here are pointers to a few links that I am glad to recommend:
- I wrote recently that I’ve discovered a number of different in-memory OLAP engines. Cindi Howson far outdid that, writing at length for Intelligent Enterprise on in-memory analytics, in an article that seems to itself be a teaser for a longer, free white paper on the subject.
- CouchDB posted an eye-catching, risque slide presentation promoting CouchDB and, more generally, key-value stores, at least for internet applications. And yes, they’ve integrated MapReduce.
- Merv Adrian posted favorably about Birst, with special reference to its OEM efforts. As previously noted, I was highly unimpressed with Birst’s end-user BI story at the time of its September roll-out, and Jerome Pineau’s recent examination did nothing to reassure me. But perhaps OEM is a different matter.
- Merv also offers an interesting post about data integration upstart Expressor, and a highly favorable one about “visualization” vendor Tableau.
- Ann All interviewed Nigel Pendse, who grumped that BI features are overrated, and what end users really want is great query performance. I’m not so sure about the features side of that, but I’m hugely in agreement about the performance. That’s a big part of why the analytic DBMS industry is so vibrant. It’s also why in-memory OLAP is suddenly so hot.
Aleri update
My skeptical remarks on the Aleri/Coral8 merger generated some pushback. Today I actually got around to talking with John Morell, who was marketing chief at Coral8 and has remained with the combined company. First, some quick metrics:
- The combined Aleri has around 100 employees, 60-40 from Aleri vs. Coral8.
- The combined Aleri has around 80 customers. All of Aleri’s, with one sort-of exception at Banks.com, were in financial services. A large minority of Coral8’s were in financial services too.
- However, half of Aleri’s marketing spend going forward is budgeted outside the financial services markets. Not unreasonably, John presents this as a proof point that Aleri is serious about selling to other markets.
- Aleri had 12-14 people in the UK pre-merger. Coral8 had none in Europe.
- Coral8 had 15 OEMs pre-merger, some actually generating revenue. Aleri had substantially none.
- Coral8 had been closing a “couple” of customers/quarter in online commerce. But recently, that rate ramped up to a “few.”
- Aleri’s engine is used to handle “many” hundreds of thousands of messages per second. Coral8’s highest-throughput user processes 100,000-150,000 messages/second.
John is sticking by the company line that there will be an integrated Aleri/Coral8 engine in around 12 months, with all the performance optimization of Aleri and flexibility of Coral8, that compiles and runs code from any of the development tools either Aleri or Coral8 now has. While this is a lot faster than, say, the Informix/Illustra or Oracle/IRI Express integrations, John insists that integrating CEP engines is a lot easier. We’ll see.
I focused most of the conversation on Aleri’s forthcoming efforts outside the financial services market. John sees these as being focused around Coral8’s old “Continuous (Business) Intelligence” message, enhanced by Aleri’s Live OLAP. Aleri Live OLAP is an in-memory OLAP engine, real-time/event-driven, fed by CEP. Queries can be submitted via ODBO/MDX today. XMLA is coming. John reports that quite a few Coral8 customers are interested in Live OLAP, and positions the capability as one Coral8 would have had to develop had the company remained independent. Read more
| Categories: Aleri and Coral8, Analytic technologies, Application areas, Complex event processing (CEP), Games and virtual worlds, Investment research and trading, MOLAP, Web analytics | 4 Comments |
Analytics’ role in a frightening economy
I chatted yesterday with the general business side (as opposed to the trading operation) of a household-name brokerage firm, one that’s in no immediate financial peril. It seems their #1 analytic-technology priority right now is changing planning from an annual to a monthly cycle.* That’s a smart idea. While it’s especially important in their business, larger enterprises of all kinds should consider following suit. Read more
