SAP HANA has gotten much attention, mainly for its potential. I finally got briefed on HANA a few weeks ago. While we didn’t have time for all that much detail, it still might be interesting to talk about where SAP HANA stands today.
SAP HANA is positioned as an “appliance”. So far as I can tell, that really means it’s a software product for which there are a variety of emphatically-recommended hardware configurations (Intel-only) from what are currently eight usual-suspect hardware partners. Anyhow, the core of SAP HANA is an in-memory DBMS. Particulars include:
- Mainly, HANA is an in-memory columnar DBMS, based on SAP’s confusingly-renamed BI Accelerator/BW Accelerator. Analytics and most OLTP (OnLine Transaction Processing) go against the columnar part of HANA.
- The HANA DBMS also has an in-memory row storage option, used to store metadata, small tables, and so on.
- SAP HANA talks both SQL and MDX.
- The HANA DBMS is shared-nothing across blades or rack servers. I imagine that within an individual blade it’s shared everything. The usual-suspect data distribution or partitioning strategies are available — hash, range, round-robin.
- SAP HANA has what sounds like a natural disk-based persistence strategy: logs, snapshots, and so on. SAP says that this is synchronous enough to give ACID compliance. For some hardware partners, those “disks” are actually Fusion-io cards.
- HANA is fault-tolerant “across servers”.
- Text support is “coming soon”, which makes sense, given that BI Accelerator was based on the TREX search engine in the first place. Inxight is also in the HANA text mix.
- You can put data into SAP HANA in a variety of obvious ways:
  - Writing it directly.
  - Trigger-based replication (perhaps from the DBMS that runs your SAP apps).
  - Log-based replication (based on Sybase Replication Server).
  - SAP Business Objects’ ETL tool.
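For readers who want the three data distribution strategies mentioned above (hash, range, round-robin) made concrete, here is a minimal Python sketch. It is purely illustrative and says nothing about how HANA itself implements them; the function names, the node numbering, and the choice of CRC32 as the hash are all hypothetical.

```python
import zlib
from bisect import bisect_right
from itertools import count


def hash_partition(key: str, num_nodes: int) -> int:
    """Pick a node by hashing the key, so equal keys always land together."""
    # CRC32 is an arbitrary choice for this sketch; it is stable across runs,
    # unlike Python's built-in hash() for strings, which is salted per process.
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def range_partition(value: int, upper_bounds: list[int]) -> int:
    """Pick a node by value range.

    upper_bounds holds the exclusive upper limit of every node except the
    last, in ascending order. [100, 200] means node 0 gets values < 100,
    node 1 gets 100..199, and node 2 gets everything from 200 up.
    """
    return bisect_right(upper_bounds, value)


def make_round_robin(num_nodes: int):
    """Return a function that hands out nodes in rotation, ignoring the data."""
    counter = count()

    def assign() -> int:
        return next(counter) % num_nodes

    return assign
```

The trade-offs are the usual ones. Hash keeps rows with the same key on the same node, which helps joins on that key. Range keeps neighboring values together, which helps range scans but can skew load. Round-robin balances load evenly but gives the query planner nothing to prune on.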
SAP says that the row-store part is based both on P*Time, an acquisition from Korea some time ago, and also on SAP’s own MaxDB. The IBM white paper mentions only the MaxDB aspect. (Edit: Actually, see the comment thread below.) Based on a variety of clues, I conjecture that this was an aspect of SAP HANA development that did not go entirely smoothly.
Other SAP HANA components include:
- A hierarchical planning/consolidation engine. SAP says that what’s hard about hierarchical planning is the complex versioning/locking it entails.
- A scoring-only predictive analytics library, based on R. Of course, SAP is working on adding predictive modeling to the mix.
- Other libraries, for “business functions”. The example SAP offered was allocations across cost centers.
- At least three modeling pieces, one each for developers, business analysts, and administrators.
- Application server run-time services, whatever that means.
Those pieces sound a lot like what’s in SAP BW (Business Warehouse), which is surely not a coincidence.
In one quick conversation with SAP, it’s hard to sort out where SAP HANA has actually been used in production, and where people are just building something on HANA that they hope will work well. That said:
- The SAP Business Objects BI on Demand SaaS (Software as a Service) offering now runs “mostly” on HANA.
- Also running on HANA is the analytic part of SAP’s Business ByDesign general SaaS offering. (Even though Business ByDesign is widely regarded as having had disappointing adoption, SAP does say it has >1000 customers at this point.)
- Certain specific SAP apps require or strongly benefit from HANA today.
- SAP BW has been ported to HANA, as a deployment option. (That’s in controlled release, with full GA coming some months down the road.)
- SAP plans for the same to be true soon of its various application suites, more or less in their entirety.
- SAP says that in the market where its users are most likely to build their own custom apps — financial services — there’s been some custom HANA app building as well.
- At least one third party (called Medidata) is building at least one SaaS application on HANA.
The example SAP gave for a new app built on HANA happened to be in the machine-generated data area — Smart Meter Analytics, which polls electric meters every 15 minutes or so. SAP’s named reference has pretty low data volumes (commercial meters only), but SAP expects deployment soon across commercial and residential meters alike for a large US electric utility.
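For a rough sense of why adding residential meters changes the data volume picture, a back-of-the-envelope calculation helps. The sketch below takes the 15-minute polling interval from SAP's description; the meter counts are made up for illustration and are not figures from SAP or any utility.

```python
MINUTES_PER_DAY = 24 * 60


def readings_per_day(num_meters: int, poll_interval_minutes: int = 15) -> int:
    """Total meter readings generated per day across all meters."""
    polls_per_meter = MINUTES_PER_DAY // poll_interval_minutes
    return num_meters * polls_per_meter


# Hypothetical meter counts, chosen only to show the difference in scale.
scenarios = [
    ("commercial meters only", 50_000),
    ("commercial plus residential meters", 5_000_000),
]

for label, meters in scenarios:
    print(f"{label}: {readings_per_day(meters):,} readings per day")
```

At a 15-minute interval each meter produces 96 readings a day, so total volume scales directly with the number of meters covered.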
In another example, SAP took an existing application — a fairly analytic one called “CO-PA”, for Profitability Analysis — and accelerated part of it via HANA, specifically in the area of allocations. The port only required a few lines of code, reflecting that certain tables were moved from the RDBMS to HANA in a schema-preserving way.
Putting all that together, the analytic case for SAP HANA seems decently substantiated — there are years of experience with the technology and its antecedents, and column stores (including in-memory) are well-established for analytics via multiple vendors. The OLTP case for HANA, however, remains largely unproven. It will be interesting to see how it plays out.