SAP AG
Analysis of SAP AG, and most especially its memory-centric BI Accelerator technology. Also covered are SAP’s overall database, connectivity, and analytics strategies. Related subjects include:
- SAP’s Business Objects business intelligence subsidiary
- Memory-centric data management
- Columnar database management
- (in Text Technologies) SAP’s TREX search engine and Inxight text analytics technology
- (in The Monash Report) Strategic issues for SAP
- (in Software Memories) Historical notes on SAP
Memory-centric data management whitepaper
I have finally finished and uploaded the long-awaited white paper on memory-centric data management.
This is the project for which I originally coined the term “memory-centric data management,” after realizing that the prevalent term “in-memory DBMS” creates all sorts of confusion about how and whether data persists on disk. The white paper clarifies and updates points I have been making about memory-centric data management since last summer. Sponsors included:
- Applix, vendors of in-memory/memory-centric MOLAP tool TM1
- Progress Software, vendors of ObjectStore, an OODBMS whose in-memory or otherwise memory-centric references are more impressive than its classical disk-based ones, and also of the Apama stream processing products
- SAP, vendors of the BI Accelerator functionality of SAP NetWeaver, or whatever tortured name they want to give it this month — basically, that’s a very cool in-memory columnar data mart technology
- Solid Information Technology, vendor of a hybrid in-memory/disk-based OLTP RDBMS. Historically focused on the embedded systems market, especially telecom and networking, they’ve recently been in the news because of a deal with MySQL that is designed to extend their reach.
- Intel, makers of the processors used to run a lot of the other sponsors’ products (including all BI Accelerator installations to date).
If there’s one area in my research I’m not 100% satisfied with, it may be the question of where the true hardware bottlenecks to memory-centric data management lie (it’s obvious that the bottleneck to disk-centric data management is random disk access). Is it processor interconnect (around 1 GB/sec)? Is it processor-to-cache connections (around 5 GB/sec)? My prior pronouncements, the main body of the white paper, and the Intel Q&A appendix to the white paper may actually have slightly different spins on these points.
And by the way — the current hard limit on RAM/board isn’t 2^64 bytes, but a “mere” 2^40. But don’t worry; it will be up to 2^48 long before anybody actually puts 256 gigabytes under the control of a single processor.
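For anyone who wants the arithmetic behind those exponents, here is a minimal sketch that converts an address width into a capacity in binary units. The helper name `human_size` is my own invention for illustration, not anything taken from the white paper or from Intel.

```python
# Capacities implied by address widths, expressed in binary units.
UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def human_size(num_bytes: int) -> str:
    """Render a byte count in the largest binary unit that divides it evenly."""
    unit = 0
    while num_bytes >= 1024 and num_bytes % 1024 == 0 and unit < len(UNITS) - 1:
        num_bytes //= 1024
        unit += 1
    return f"{num_bytes} {UNITS[unit]}"


for bits in (40, 48, 64):
    print(f"2^{bits} bytes = {human_size(2 ** bits)}")
```

So 2^40 bytes is 1 TiB, 2^48 is 256 TiB, and 2^64 is 16 EiB. For comparison, 256 GiB is 2^38 bytes.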
| Categories: Cognos, Companies and products, In-memory DBMS, Intel, Memory-centric data management, MOLAP, Open source, Progress, Apama, and DataDirect, SAP AG, solidDB | 2 Comments | 
SAP on SAP
Dan Farber’s blog from SAP’s developer conference isn’t, frankly, his best piece of work, since the quotes are sometimes so garbled as to be a bit unreadable. Still, it helps flesh out what we already knew about SAP’s strategy.
Basically, they claim to be reengineering their whole product line for the new services-based architecture I keep writing about. And they insist this truly is a new platform architecture. In that regard, I buy into and agree with their pitch.
They further insist that the mid-market will be a big part of their business going forward, but SaaS will not. I don’t buy into that as fully.
I’ll spell out why in another post, but not until Monday at the earliest. Watch the comments section on this one for trackbacks.
| Categories: SAP AG, Theory and architecture | Leave a Comment | 
SAP, MaxDB, and MySQL, updated
I’ve had a chance to clarify and correct my understanding of the relationship between SAP, MaxDB, and MySQL. The story is this:
- MySQL has the right to sell MaxDB, but apparently isn’t focusing much on that.
- The MySQL and MaxDB code lines are NOT merging, for technical reasons. For example, the older MaxDB does a lot of its own thread management, while MySQL relies on the operating system for that.
- When SAP thinks a DBMS is capable of running SAP’s apps, it adds the DBMS to its product catalog and resells it. Yes, even Oracle. That’s why all my discussions with SAP of MySQL’s enterprise-readiness quickly come back to an exhaustive multi-year certification process.
- My personal best guess as to when MySQL will be in SAP’s product catalog is 1 1/2 – 3 years from now.
And by the way, MaxDB’s share in SAP’s user base is about the same as DB2’s (at least DB2 for open systems). MaxDB is being aggressively supported, and nobody should get any ideas to the contrary!
| Categories: IBM and DB2, MySQL, Open source, Oracle, SAP AG | 6 Comments | 
SAP’s version of DBMS2
I just spent a couple of days at SAP’s analyst meeting, and realized something I’d somewhat forgotten – much of the DBMS2 concept was inspired by SAP’s technical strategy. That’s not to say that SAP’s techies necessarily agree with me on every last point. But I do think it is interesting to review SAP’s version of DBMS2, to the extent I understand it.
1. SAP’s Enterprise Services Architecture (ESA) is meant to be, among other things, an abstraction layer over relational DBMS. The mantra is that they’re moving to a “message-based architecture” as opposed to a “database architecture.” These messages are in the context of a standards-based SOA, with a strong commitment to remaining open and standards-based, at least on the data and messaging levels. (The main limitation on openness that I’ve detected is that they don’t think much of standards such as BPEL in the business process definition area, which aren’t powerful enough for them.)
2. One big benefit they see to this strategy is that it reduces the need to have grand integrated databases. If one application manages data for an entity that is also important to another application, the two applications can exchange messages about the entity. Anyhow, many of their comments make it clear that, between partner company databases (a bit of a future) and legacy app databases (a very big factor in the present day), SAP is constantly aware of situations in which a single integrated database is infeasible.
3. SAP is still deeply suspicious of redundant transactional data. They feel that with redundant data you can’t have a really clean model – unless, of course, you code up really rigorous synchronization. However, if for some reason synchronization is preferred – e.g., for performance reasons — it can be hidden from users and most developers.
4. One area where SAP definitely favors redundancy and synchronization is data warehousing. Indeed, they have an ever more elaborate staging system to move data from operational to analytic systems.
5. In general, they are far from being relational purists. For example, Shai Agassi referred to doing things that you can’t do in a pure relational approach. And Peter Zencke reminded me that this attitude is nothing new. SAP has long had complex business objects, and even done some of its own memory management to make them performant, when they were structured in a manner that RDBMS weren’t well suited for. (I presume he was referring largely to BAPI.)
6. That said, they’re of course using relational data stores today for most things. One exception is text/content, which they prefer to store in their own text indexing/management system TREX. Another exception is their historical support for MOLAP, although they seem to be edging as far away from that as they can without offending the MOLAP-loving part of their customer base.
Incidentally, the whole TREX strategy is subject to considerable doubt too. It’s not a state-of-the-art product, and they currently don’t plan to make it into one. In particular, they have a prejudice against semi-automated ontology creation, and that has clearly become a requirement for top-tier text technologies.
7. One thing Peter said that confused me a bit came up when we were talking about nonrelational data retrieval. The example he used was retrieving information on all of a specific sales rep’s customers, or perhaps on several sales reps’ customers. I got the feeling he was talking about the ability to text search on multiple columns and/or multiple tables/objects/whatever at once, but I can’t honestly claim that I connected all the dots.
And of course, the memory-centric ROLAP tool BI Accelerator — technology that’s based on TREX — is just another example of how SAP is willing to go beyond passively connecting to a single RDBMS. And while their sponsorship of MaxDB isn’t really an example of that, it is another example of how SAP’s strategy is not one to gladden the hearts of the top-tier DBMS vendors.
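To make point 2 above a little more concrete, here is a minimal sketch of two applications exchanging messages about an entity instead of sharing one integrated database. Everything in it is a hypothetical illustration: the `MessageBus`, `CustomerApp`, and `BillingApp` classes and the topic name are assumptions of mine, not SAP's ESA interfaces. The billing side never touches the customer application's store and keeps no copy of its data, which also sidesteps the redundancy worry in point 3.

```python
class MessageBus:
    """Toy in-process stand-in for a messaging layer (hypothetical)."""

    def __init__(self):
        self._handlers = {}

    def register(self, topic, handler):
        # One owning application answers each topic.
        self._handlers[topic] = handler

    def request(self, topic, payload):
        # Deliver the message and hand back the owner's reply.
        return self._handlers[topic](payload)


class CustomerApp:
    """Owns the customer entity in its own private store."""

    def __init__(self, bus):
        self._customers = {}
        bus.register("customer.get_credit_limit", self._on_credit_limit_request)

    def add_customer(self, customer_id, name, credit_limit):
        self._customers[customer_id] = {"name": name, "credit_limit": credit_limit}

    def _on_credit_limit_request(self, payload):
        customer = self._customers.get(payload["customer_id"])
        if customer is None:
            return {"found": False}
        return {"found": True, "credit_limit": customer["credit_limit"]}


class BillingApp:
    """Needs customer facts but learns them only through messages."""

    def __init__(self, bus):
        self._bus = bus

    def can_invoice(self, customer_id, amount):
        reply = self._bus.request(
            "customer.get_credit_limit", {"customer_id": customer_id}
        )
        return reply["found"] and amount <= reply["credit_limit"]


bus = MessageBus()
crm = CustomerApp(bus)
billing = BillingApp(bus)
crm.add_customer("C1", "Acme", 1000)
print(billing.can_invoice("C1", 500))   # within the credit limit
print(billing.can_invoice("C1", 5000))  # over the credit limit
```

A real deployment would put a network, a message format standard, and failure handling between the two sides. The point of the sketch is only that the contract is the message, not a shared table.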
| Categories: EAI, EII, ETL, ELT, ETLT, Memory-centric data management, MOLAP, OLTP, SAP AG, Theory and architecture | 9 Comments | 
Defining and surveying “Memory-centric data management”
I’m writing more and more about memory-centric data management technology these days, including in my latest Computerworld column. You may be wondering what that term refers to. Well, I’ve basically renamed what are commonly called “in-memory DBMS,” for what I think is a very good reason: Most of the products in the category aren’t true DBMS, aren’t wholly in-memory, or both! Indeed, if you catch me in a grouchy mood I might argue that “in-memory DBMS” is actually a contradiction in terms.
I’ll give a quick summary of the vendors and products I am focusing on in this newly-named category, and it should be clearer what I mean:
- TimesTen (now owned by Oracle): TimesTen is the quintessential “in-memory DBMS.” It’s a fairly full relational DBMS, but if you want to persist memory to disk it has to be handed off to a conventional DBMS. Historically, that has usually been MySQL or Oracle. TimesTen’s biggest market penetration has been in financial trading.
- Solid Information Technology’s BoostEngine: Solid is a Finnish company (or was; it’s pretty American now) specializing in embedded DBMS sold mainly for telecommunication uses. Big OEM customers include several well-known telecom equipment manufacturers and HP (for OpenView). “Embedded” often means no DBA, no monitor, no keyboard: the box manufacturer installs it and there it stays for the life of the product. Solid has to offer strong replication capabilities, since its products are often used in highly distributed (e.g., multiblade, multibox) environments. So it’s taken the next step and exploited the replication by allowing customers to use some instances of the product disklessly.
- Event-stream products from Streambase and Progress: The canonical application for event-stream products is automating financial trading decisions based on the flow of market information. Mike Stonebraker, the brains behind Streambase, has recently popularized the idea; Progress bought Apama, who actually have been in the business longer. These applications require even more speed than the financial trading apps that TimesTen handles, and they discard most of the information they look at. In-memory is the only way to go.
- Progress’s ObjectStore: ObjectStore comes from the company Object Design, which merged into Excelon, which was acquired by Progress. It’s really a toolkit for building DBMS and similar systems, which is why it’s at various times been marketed as an OODBMS and an XML DBMS, without a lot of success either way. But there have been a few sterling apps built in ObjectStore even so, including a key part of the Amazon bookstore. Despite this limited market success, a significant fraction of Progress’s best engineering talent has moved over to the Real-Time Division to focus on ObjectStore and other memory-centric products. The memory-centric aspect of ObjectStore is this: ObjectStore’s big virtue is that it gets objects from disk to memory and vice-versa very efficiently, then distributes and caches them around a network as needed. This was originally invented for client/server processing, but works fine in a multi-server thin client setup as well. And object processing, of course, relies on a whole lot of pointers. And pointer-chasing is pretty much the worst way to deal with the disk speed barrier, unless you do it in main memory.
- Applix’s TM1: Like many companies in the analytics area, Applix has had trouble deciding whether it sells applications, BI system software, or both. But in any case its core technology is TM1, a memory-centric MOLAP offering. Traditional MOLAP products are caught on the horns of a nasty dilemma: They rely on precalculation to give good performance, but that causes ghastly database explosion. Applix gets out of this problem by doing no precalculation whatsoever, loading the data into main memory, and executing all queries on the fly.
- SAP’s BI Accelerator: SAP is building out an elaborate technology stack with NetWeaver, especially in the BI area. One important aspect is that the full data warehouse is logically broken (or copied) into a series of data marts called “InfoCubes.” BI Accelerator takes the logical next step, loading an entire InfoCube into main memory. Almost every query is executed via a full table scan, which would be insane on disk but makes perfect sense when the data is already in RAM.
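The precalculation dilemma mentioned in the TM1 item above can be made concrete with some rough arithmetic. This is a minimal sketch under loudly stated assumptions: the dimension sizes and row count are hypothetical, and the formula is only an upper bound for a fully precalculated cube. Real MOLAP products precalculate selectively and store sparsely, so actual explosion factors are smaller.

```python
from math import prod


def full_cube_cells(cardinalities):
    """Upper bound on cells if every possible aggregate were precalculated.

    Each dimension contributes its distinct values plus one "all values"
    rollup, so the bound is the product of (cardinality + 1).
    """
    return prod(c + 1 for c in cardinalities)


# Hypothetical dimension sizes: products, customers, days, regions.
dims = [1_000, 5_000, 365, 20]
base_rows = 2_000_000  # hypothetical number of stored fact rows

cells = full_cube_cells(dims)
print(f"potential precalculated cells: {cells:,}")
print(f"cells per stored fact row:     {cells / base_rows:,.0f}")
```

The gap between stored facts and potential aggregate cells is what makes "no precalculation, compute on the fly in RAM" attractive once the base data fits in memory.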
So there you have it. There are a whole lot of technologies out there that manage data in RAM, in ways that would make little or no sense if disks were more intimately involved. Conventional DBMS also try to exploit RAM and limit disk access, via caching; but generally the data access methods they use in RAM are pretty similar to those they use when going out to disk. So memory-centric systems can have a major advantage.
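As an illustration of the BI Accelerator item above, here is a minimal sketch of answering a query by scanning in-memory columns, with no index and no precomputed aggregate. It is a toy under my own assumptions: the `ColumnarTable` class, its column names, and the integer region codes are hypothetical, and it says nothing about how SAP actually implements the product.

```python
from array import array


class ColumnarTable:
    """Toy in-memory column store: one contiguous array per column."""

    def __init__(self, region_codes, years, revenues):
        assert len(region_codes) == len(years) == len(revenues)
        self.region = array("H", region_codes)  # small unsigned integer codes
        self.year = array("H", years)
        self.revenue = array("d", revenues)     # double-precision floats

    def sum_revenue(self, region_code, year):
        """Full scan of the three columns the query touches."""
        total = 0.0
        for r, y, v in zip(self.region, self.year, self.revenue):
            if r == region_code and y == year:
                total += v
        return total


table = ColumnarTable(
    region_codes=[1, 2, 1, 1],
    years=[2005, 2005, 2006, 2005],
    revenues=[100.0, 250.0, 75.0, 40.0],
)
print(table.sum_revenue(region_code=1, year=2005))  # 140.0
```

The same access pattern on disk would mean reading every row for every query. In RAM it is a sequential pass over a few compact arrays, which is why the full-scan approach stops being insane.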
Down with database consolidation!
As with all changes in information technology, the move to DBMS2 will largely be one of evolution. But it does have a couple of revolutionary aspects.
Short-term, the biggest change is a renunciation of database and DBMS vendor consolidation. Consolidation never has worked, it never will work, and as data integration technologies keep improving it’s not that important anyway.
IBM and Oracle offer really great, brilliantly complex data warehousing technology. But if you want the most bang for the buck, forget about them, and go instead with a specialty vendor. Depending on the specifics of your situation, Teradata, Netezza, DATAllegro, WhiteCross, or SAP may offer the best choice, and that list could be even longer.
Similarly, for generic OLTP data management, cheap and/or open source options are getting ever more attractive. Microsoft is a serious contender for applications that previously only Oracle and IBM could handle, while MySQL and maybe Ingres are moving up the food chain right behind.
In many cases, these alternative technologies are lower-cost across the board: Lower purchase price, lower ongoing maintenance fees, and lower administrative costs.
So what, again, is the case for consolidation?
| Categories: Actian and Ingres, Analytic technologies, Data warehouse appliances, Database diversity, IBM and DB2, Kognitio, Memory-centric data management, Microsoft and SQL*Server, MOLAP, MySQL, Netezza, Open source, Oracle, SAP AG, Theory and architecture | Comments Off on Down with database consolidation! | 
MySQL, SAP, and MaxDB
MySQL is like a star high school athlete — impressive skills and potential, but it still only excels at a limited range of mainly simple things. Will it grow into a robust, adult star? I think so, and here’s a big part of the reason why: MaxDB and SAP certification.
MaxDB is a database product that bounced among all the major German computer hardware and software companies: Nixdorf, Siemens, Software AG, and SAP. (What little fame it ever had was primarily under the name Adabas-D.) SAP eventually shipped MaxDB as the underlying DBMS at many R/3 installations. This is a huge sign of OLTP industrial-strengthness; if a DBMS can run SAP’s apps, it can run pretty much anything. OK, not necessarily retail banking, airline reservations, and so on, but pretty much anything else.
Well, two years ago MySQL (the company) and SAP agreed to what amounts to a slow-motion merge between MySQL (the product) and MaxDB. The resulting joint product (currently still quite separate from MySQL 5.0) is undergoing a multi-year process of achieving SAP certification. Everybody involved clearly expects this certification to eventually succeed — in 2-3 years, probably, or perhaps less if they were being really coy with me.
And when that happens, there will be a version of MySQL that is unquestionably fit for rigorous OLTP.
Technorati Tags: Database, DBMS, DBMS2, MySQL, SAP, Software
| Categories: MySQL, SAP AG | 4 Comments | 
