September 29, 2009

Thoughts on the integration of OLTP and data warehousing, especially in Exadata 2

Oracle is pushing Exadata 2 as being a great system for any of OLTP (OnLine Transaction Processing), data warehousing or, presumably, the integration of same. This claim rests on a few premises, namely:

Exadata is great for data warehousing. At this time, that’s a claim much better supported by marketing and theory than by practice.
Exadata 2 is a suitable annual improvement over last year’s Exadata 1. That’s quite plausible.
Oracle is outstanding for OLTP. That’s borne out by vast amounts of experience, especially if by “outstanding” you mean “Gets the job done really, really well at a very high cost in terms of both licenses and labor.”
The Flash memory in Exadata 2 makes Oracle even better for OLTP.* That’s plausible too. Worst-case is probably that Flash support doesn’t really work well in this release, but will be cleaned up soon.**
OLTP and data warehousing uses for Exadata don’t interfere with each other. That one bears some discussion.

*Oracle has repeatedly emphasized that the Flash memory in Exadata 2 is meant to speed up OLTP. By way of contrast, I’ve only noticed one vague claim that Flash memory helps data warehousing – a reference to a doubling in “user scan rates”, which perhaps was a slip of the marketing pen.

**Oracle probably has been working on Flash memory support for a long time. But it’s likely that Oracle didn’t have a strategic commitment to Sun’s specific technology until April of this year. After all, back in March it looked as if IBM would wind up owning Sun.

The integration-versus-separation argument for OLTP and analytic databases is an old one. In the early 1980s, IBM pushed both the “Information Center” (precursor to the data warehouse) and relational DBMS (portrayed as good for query and maybe for OLTP as well). In the early 1990s, Ted Codd opined that relational DBMS were good for OLTP but not analytics, instead favoring “OLAP” systems like Arbor Software’s Essbase (which, ironically, is now owned by Oracle). As the 1990s progressed, a consensus emerged that most large* enterprises should have at least one relational data warehouse separate from the core OLTP DBMS, a view that has persisted to this day. Until the announcement of Exadata 2, Oracle hadn’t seriously disputed this consensus, although of course it always has wanted its DBMS software to run your OLTP and analytic databases alike.

*At a sufficiently small enterprise, one DBMS suffices. If a single commodity server has enough power to do all your processing, without even requiring you to have the expertise to tune very seriously, that’s probably the right way to go.

Assuming one DBMS has plenty of functionality for OLTP and analytics alike – as Oracle certainly does – the main arguments for separating OLTP and data warehousing revolve around performance. Reasons to split out a separate analytic database include:

You might just want to run a separate brand of DBMS for your OLTP and data warehousing. Oracle thinks this is a terrible idea. (I disagree, as do a whole lot of analytic DBMS vendors – Teradata, Netezza, Greenplum, Sybase, Vertica, Aster Data, Infobright, Kognitio, et al. — and their customers.)
You may want to lay out or index your tables differently for OLTP and data warehousing. Materialized view capabilities as flexible as Oracle’s should let you do that in a single database.
You may want to lay out your files differently for OLTP and data warehousing (e.g., in terms of block sizes). Oracle might claim that ASM (Automatic Storage Management) and, in particular, the “Stripe and Mirror Everything” option obviate that point. I’m far from convinced.
OLTP and analytic workloads step on each other’s toes, Part 1. For example, analytic queries that call for table scans often don’t mix well with OLTP operations that call for random reads and (especially) writes.* In principle, Flash memory could greatly reduce the problem, if the OLTP workload talks mainly to Flash, while Flash talks to disk mainly via microbatches. But I’ll be quite surprised if Oracle has aced that challenge on the first try. More likely, a longish stretch of Bottleneck Whack-A-Mole lies ahead.
OLTP and analytic workloads step on each other’s toes, Part 2. Even more fundamentally: If you don’t have sufficiently good workload management tools, combining OLTP and analytic workloads is a ghastly performance idea, with OLTP slowing to a crawl while analytic queries rumble to completion. However, I’d think Oracle is in pretty good shape in that area.

*If this weren’t a terribly difficult problem, Oracle, IBM, and/or Teradata – all of which can do a reasonably decent job of mixing long and short queries in the same workload — would probably have solved it years ago.
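
The workload-management point can be made concrete with a toy sketch. The Python below is an illustration only, not anything Oracle ships: it assumes a single non-preemptive server, and the names `Job` and `mean_oltp_wait`, the arrival times, and the service times are all made up. It compares first-come-first-served scheduling against a policy that always picks waiting OLTP requests ahead of analytic scans.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    arrival: float   # time the request arrives
    service: float   # time the request needs on the single server
    kind: str        # "oltp" or "analytic"


def mean_oltp_wait(jobs, prioritize_oltp):
    """Run jobs on one non-preemptive server; return the mean OLTP queueing delay."""
    pending = sorted(jobs, key=lambda j: j.arrival)
    ready = []      # heap of (rank, arrival, sequence, job)
    clock = 0.0
    i = 0
    waits = []
    while i < len(pending) or ready:
        # Admit every request that has arrived by the current time.
        while i < len(pending) and pending[i].arrival <= clock:
            job = pending[i]
            # Rank 0 jumps the queue; with prioritization off, everything is rank 1 (plain FIFO).
            rank = 0 if (prioritize_oltp and job.kind == "oltp") else 1
            heapq.heappush(ready, (rank, job.arrival, i, job))
            i += 1
        if not ready:
            clock = pending[i].arrival   # server idles until the next arrival
            continue
        _, _, _, job = heapq.heappop(ready)
        if job.kind == "oltp":
            waits.append(clock - job.arrival)
        clock += job.service             # the job runs to completion
    return sum(waits) / len(waits) if waits else 0.0


# Hypothetical workload: two long scans arrive first, then three short transactions.
jobs = [
    Job(0.0, 10.0, "analytic"),
    Job(0.0, 10.0, "analytic"),
    Job(1.0, 0.1, "oltp"),
    Job(2.0, 0.1, "oltp"),
    Job(3.0, 0.1, "oltp"),
]

fifo_wait = mean_oltp_wait(jobs, prioritize_oltp=False)
priority_wait = mean_oltp_wait(jobs, prioritize_oltp=True)
print(f"mean OLTP wait, FIFO:     {fifo_wait:.1f}")
print(f"mean OLTP wait, priority: {priority_wait:.1f}")
```

With these made-up numbers the mean OLTP wait drops from about 18 time units to about 8. Even with priority, the OLTP requests still sit behind the scan that is already running, which hints at why real workload managers also have to throttle or slice long-running work rather than merely reorder the queue.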

Bottom line: Some day, Oracle Exadata may be a great system for integrated OLTP and data warehousing – but probably not in the current release.

Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Exadata, OLTP, Oracle, Solid-state memory, Theory and architecture

Comments

36 Responses to “Thoughts on the integration of OLTP and data warehousing, especially in Exadata 2”

RC on September 29th, 2009 3:55 am

“If a single commodity server has enough power to do all your processing, without even requiring you to have the expertise to tune very seriously, that’s probably the right way to go.”…

Why probably..? No need to use the word probably in this case. I would say:

“If a single commodity server has enough power to do all your processing, without even requiring you to have the expertise to tune very seriously, that’s the right way to go.”
Tony Bain on September 29th, 2009 4:17 am

It will be interesting to see how far Oracle goes with the mixed workload message. I don’t think it has a lot of relevance for true warehouses and specialized analytical workloads; no doubt the specialized vendors will keep the price/performance benefit here.

But I have interest in mixed workloads on Exadata. When scaling traditional enterprise core business applications (ERP, financials) these are all mixed workload apps (transaction processing, lots of history, many aggregate query reports etc). When you hit a wall in scalability, what do you do? You start trying to off-load reporting requirements to replicas & standby servers. Then issues such as data latency and consistency start to crop up (as well as the issue that many ERPs are not designed to work with read only replicas, so running a report still involves a write transaction). So many data marts are created simply because existing systems cannot cope with the dual reporting/transaction workload. Not because you want to consolidate many sources of data, or process high intensity/complicated queries. Just so you can run your standard reports without nailing your transaction processing. This is a lot of effort and expense to go to in order to run your reports at scale, of course.

If you can plug in Exadata and get much better scalability of both requirements in a single database without thinking too much about it, that would be massive. The cost of doing so is going to limit it a bit of course, but if you are a big Oracle Apps user then the cost of Exadata might not appear that great.
RC on September 29th, 2009 4:27 am

@Tony Bain

No one wants to store the data twice, the only reason that data warehouses exist is performance.
Alistair Wall on September 29th, 2009 6:58 am

The major conflict is that you might want to load the data warehouse with no logging, but for OLTP you would require logging to keep the standby database and backups valid.
Shawn Fox on September 29th, 2009 7:18 am

Any reasonably big company is going to have 10s or 100s of applications and will thus have a need to consolidate data from them into a single system.

The challenge of building a single application which does everything a company needs is vastly more unobtainable than building a single database which can handle both transactional and analytic workloads at the same time.

Throw in the fact that a star schema model can improve query performance by an order of magnitude or more in some situations and it becomes clear that the need for a separate data warehouse is going to remain for the foreseeable future.

Over time the need will become less, especially for smaller organizations, but it isn’t going away any time soon.

The use case that will see a great deal of increase is “active data warehousing” where the data warehouse is updated in near real time. The data warehouse will also be used more as the operational data source for master data (generally via web services) since it really doesn’t make sense to keep this type of data in multiple places.
Serge Rielau on September 29th, 2009 9:04 am

I think one often forgotten benefit of separating OLTP and warehouse is risk. Software has bugs.
OLTP workloads are oftentimes quite static with very predictable and testable(!) code paths.
Warehouse queries on the other hand can and do throw arbitrarily complex queries at the DBMS. That is not only a challenge for the optimizer and workload management, but it also results in execution of unanticipated code paths.
And the last thing you want is some power user at head office bringing down the OLTP system that is bringing home the bacon.
Now, HA systems help some, but if a query had a bad effect on that one node it will shoot down the backup just as happily. Whether it’s RAC, HACMP, HADR… all those systems replicate the same code.
So… putting all eggs into one basket…. not so sure that is a universally good idea.
Cheers
Serge
Curt Monash on September 29th, 2009 9:35 am

@Serge — Good point! Oft-forgotten indeed. 🙂

@Shawn — In PRINCIPLE all that DW consolidation, star schema recopying, and so on can be done in the same database as runs central OLTP. The question is how that works out in PRACTICE.
Daniel Abadi on September 29th, 2009 9:43 am

@Serge, @Shawn — I totally agree.
Daniel Abadi on September 29th, 2009 9:48 am

Sorry Curt — didn’t see your comment before I posted. This is a really great and useful discussion.
Rob Klopp on September 29th, 2009 11:09 am

A great many companies can’t even get all of their data warehouse workload to run on a single platform… they may have multiple data warehouses and very often have numerous data marts. This makes me more doubtful still that OLTP and all DW workload can coexist.

While it may be possible to index your way around the performance problems, every index impacts OLTP performance. Indexes can be built offline, but the index build is another significant workload to be handled in a single system.

But maybe Exadata can remove the requirement for a separate ODS?
Curt Monash on September 29th, 2009 11:19 am

@Rob,

It sounds as if you’re suggesting materialized views might be an intolerable performance burden for OLTP, while higher-latency data warehouse loads to the same logical effect are not.

Good point.

I don’t actually know whether materialized view update policies are sufficiently tunable to obviate that point. But I would guess they are not.
Peter Mooshammer on September 29th, 2009 2:37 pm

May I add another angle?

Many customers are worried that a DW project will end up being an investment without return. So you end up having a big pile of idle HW. Claiming that your HW runs OLTP great as well, lets you use your HW/SW licenses even when the DW project goes nowhere. Or at least lets you run a smaller DW project next to your OLTP and go from there.

Oracle always told their customers to put all their eggs in one basket. For years now they have preached the advantages of consolidation. (From 100s of databases down to 2 or 3 Oracle RAC DBs.) That way they can get rid of a few smaller competitors …

Oracle is working on the next gen mainframe – it is called the Grid… Physically it might not look like a mainframe, but organizationally it is. One big department running the IT of a company – everything provided and supported by Oracle. IBM out – Oracle in, it’s between these two – mostly.
Bence Arató on September 29th, 2009 5:24 pm

I don’t think that Oracle actually advises running OLTP and DW workloads on the same physical Exadata setup, but they do want to serve both market segments (DW and OLTP) with the v2 product.

I see their strategy in the following way:
– With Exadata V1, Oracle has been able to offer a fast, scalable, intelligent storage solution for existing big Oracle-based DWs. This was quite important, because there was serious (and growing) competitive pressure from the MPP vendors.
– But this was a market with limited potential, as there are only so many really large Oracle DWs where the price of Exadata would be justified.
– With Exadata V2, now they can offer similar benefits to big OLTP sites. Of which there are plenty…

It is also worth mentioning that between v1 and v2, Oracle more than doubled the price of the Exadata hardware. (Exadata v1 price: 24.000 USD, Exadata v2 price: 60.000 USD). I’m sure that the inclusion of the Flash RAM cards to accelerate the OLTP performance has something to do with it 🙂
Noons on September 29th, 2009 10:50 pm

There is one aspect of the argument that “DW requires long analytical queries on lots of data” and therefore has a different “load pattern” that I totally disagree with.

First: long analytical queries act on large amounts of data.

Second: to do that, the large amounts of data must be there in the first place!

Third: how do you think the large amount of data gets there? Spontaneous generation perhaps?

No. The large amount of data has to be *written* first, before it can be read.

Writing is an expensive operation in *any* database. Much more expensive than reads.

And the large amount of data must be kept up-to-date or else it means nothing. That’s what ETL deals with. Translate: writing even *more* large amounts of data.

To give you an idea: our DW does 2.5TB writes per day, versus 4.5TB reads. And ours is not atypical, or too large, by any stretch of the imagination.

That’s roughly 2:1 read/write ratio, not the usual “DW is mostly reads” marketing nonsense.

So next time someone says: “DWs do lots of reads in long queries”, stop and think for a moment.

They do a *lot* more than just reads. They also do more writes than just about any other database type – certainly than OLTP, possible exception some data marts and DSS types.

And this is where hardware tools like Exadata, which can do *both* very large amounts of reads *and* large amounts of writes while still maintaining the ACID and recovery qualities of a DB, become incredibly important.
Leandro Tubia on September 30th, 2009 5:10 am

Hi Noons

Just because writes almost doubles reads ratios in your case (I agree with you), and because writes are more costly in terms of resource usage due to locking and locking escalation, I don’t see the benefits of having both OLTP and DW in the same box.
In our installations, most of the writes are due to lots of transformation (new tables) that must be applied to the relational tables to solve business questions that cannot be solved through materialized views.
You can say that we have a design problem: yes we may have, but some business queries cannot be solved in a single step because the OLTP data model was designed specifically for other purposes.
So business logic from the analytic point of view may be spread over different tables and entity instances, making it unavoidable to build new tables.
That constant data movement cannot be good for OLTP activity that expects resources to be released as soon as possible.
David Aldridge on September 30th, 2009 6:46 am

With regard to mixed workloads I’d imagine that the advice from Oracle would be this…

“In your 8 database server ODM, run OLTP on 5 of them and DW on the other 3.”

Thus you’d choose the ratio of DW:OLTP servers that suits you, and even vary it during the day or week (switching a subset of OLTP servers to ETL functionality at night, for example).

It would make sense to me anyway. You’d probably want the OLTP and DW instances to have different memory configurations (in terms of SGA vs PGA) anyway.
Leandro Tubia on September 30th, 2009 7:43 am

Hi @David

That sounds good. But in our case OLTP activity and DW batch processes overlap most of the time. Besides, despite having 5 servers dedicated to OLTP they finally would access the same disk resources as the other 3 ones. Unless there’s something about Exadata architecture that I’m not considering. I will read more details about it.
Noons on September 30th, 2009 9:57 am

@Leandro

We do a lot of transformations as well in the DW. And a lot of cross-pollination between fact data and the Hyperion analysis tool, which contrary to a lot of advice is running in the same db. Saving us a lot of disk and interface processing overhead, while showing excellent performance. Don’t use Exadata yet, but I can see it coming once we add even more subsystems to it.

That is not a design problem IMHO, neither is yours: each IT installation is different, trying to apply and fit generic “patterns” is what I’m not in favour of.

The perspective I want to put forward is that Exadata is not a “normal” piece of hardware, to try and “pigeon-hole” it under classic lines of IT thought is perhaps not the best approach.

Now, if you imagine that as David A. said (100% agreed, David!), we could separate the OLTP side to a set of nodes while balancing another to DW, it is quite possible to have this going in a single instance/database.

It is very rare indeed that both OLTP and DW would always be using the same set of volatile tables. And I mean exactly the same tables.

If they are not, then with hardware like Exadata it is a piece of cake (well not exactly, but a lot easier than with other solutions) to isolate workloads and make sure they do not clash.

Overall, a very high I/O rate capacity both in latency and throughput would be required.

This is what I think makes Exadata unique: as far as I know, it’s the only one capable of such, together with the necessary partitioning of workloads.

I do agree entirely that all this is still a bit “green” but the path ahead is extraordinarily exciting and full of promise. Better than anything else ever tried in this field.

This is a great blog post and the comments are excellent. Thanks everyone!
Leandro Tubia on September 30th, 2009 12:07 pm

Hi @Noons

I agree with you that it’s rare that OLTP and DW use the same volatile tables, and in the case they use the same one, partitioning could be applied so as to separate highly volatile data from stable data.
That leads me to think that column oriented physical organization of tables could be problematic within this type of architecture.
However the last one is very useful when reaching the dozens-of-TBs barrier, just where relational DBs are not able to perform as well as analytical oriented DBs.
So, as you’ve said, it depends on each case.
Michael McIntire on September 30th, 2009 2:52 pm

I’m going to add some color to @Rob’s comments…

The underlying problem is that managing systems like these is about meeting SLAs. OLTP systems tend to have fixed SLAs for each query – which have known and predictable access paths and times. DW systems have variable and contextual SLAs, access paths, and runtimes – basically in all DW cases SLA performance is “it depends” on the environment and priorities of the moment.

As a result, these two platforms are typically managed in very different ways. OLTP systems are managed for stability of individual transaction throughput – and have idle time to guarantee that requirement. DW systems run at 100% all the time – they are managed by throughput, essentially in the “headroom” of system performance.

None of the database vendors does even close to an adequate job of the balance of these fundamentally competing objectives.
Ben Werther on September 30th, 2009 8:10 pm

Curt — very insightful post. I touched on some similar points about Exadata 2 on the Greenplum blog:
http://www.greenplum.com/news/245/231/When-New-is-Old—Part-2/d,blog/

The bottom line is that Oracle is a fine mainstream OLTP database. However, the very architectural decisions that allowed it to succeed in that market are why it has struggled so badly for so many years at high scale and highly parallel workloads. Lacking a true MPP shared-nothing architecture, customers quickly learn that the system requires an endless amount of ‘black magic’ tuning and partitioning to keep things from unraveling. This is a world apart from Greenplum, where, for example, we were able to deploy a 6.5 Petabyte database across 96 nodes at eBay that supports a wide class of users and queries with little fanfare or tuning required.
RC on October 1st, 2009 3:34 am

@Ben Werther,

Either you can’t count or I can’t count, mister.

You ‘state’ that you have made 4 blog entries in September 2009 (in the right panel) but I count only 3. Or do you have some hidden blog entries that only logged in people can see?

Please correct me if I’m wrong!!

RC
David Aldridge on October 1st, 2009 9:12 am

@Michael,

It looks like the concerns about SLA’s and guaranteeing OLTP performance are being addressed (at last) in Exadata 2 with the I/O Resource Manager mentioned here: http://www.oracle.com/technology/products/bi/db/exadata/pdf/exadata-technical-whitepaper.pdf

“The DBRM and I/O resource management capabilities of Exadata storage can prevent one class of work, or one database, from monopolizing disk resources and bandwidth and ensures user defined SLAs are met when using Exadata storage. The DBRM enables the coordination and prioritization of I/O bandwidth consumed between databases, and between different users and classes of work. By tightly integrating the database with the storage environment, Exadata is aware of what types of work and how much I/O bandwidth is consumed. Users can therefore have the Exadata system identify various types of workloads, assign priority to these workloads, and ensure the most critical workloads get priority.”
Joe Harris on October 1st, 2009 9:22 am

@RC re “No one wants to store the data twice, the only reason that data warehouses exist is performance.”

You are dead wrong on this point. I *absolutely* want to store my data twice. Once where the source system can screw it up however it wants and once where I keep *the truth*.

In a modern environment most of my apps are packaged, I can’t touch the schema without massive costs and I don’t want to. Moreover each app handles updating and time-stamping differently (but always badly IMHO).

Looking forward, I’m going to have an increasing number of SaaS apps both on and off premises. I’ll have no choice but to take a copy of their data if I want to do something with it.

Data isolation is the first step to data quality and integrity in the warehouse.

@Peter Mooshammer re “Many customers are worried that a DW project will end up being an investment without return. So you end up having a big pile of idle HW.”

This sounds like old thinking to me from the time when DW projects were being done as big-bang, consultancy-led marathons (which did indeed fail at an alarming rate).

Modern projects are being done based on single path ROI in short time frames in a central DW. Subsequent projects can then tap the existing data and therefore need incrementally less ROI.
RC on October 1st, 2009 10:59 am

@Joe Harris

When the source system screws something up you can restore a backup.
Joe Harris on October 1st, 2009 12:23 pm

@RC Don’t take this the wrong way but: Have you worked on a data warehouse?

OLTP systems are designed to deal with *now*, in a small number of areas they will “track history”, occasionally they do it correctly.

However, most changes in OLTP take the form of an update. I feel a deep sense of joy when an app has a “DateUpdated” field. If I have to rely on the app to tell me the previous value then I’m SOL.

I’m going to quote myself here: “Data isolation is the first step to data quality and integrity in the warehouse.”

And add: A PITA (point in time architecture) is the second step to data quality and integrity in the warehouse.

…and yes I do relish the double entendre of that last acronym. 🙂
RC on October 1st, 2009 2:00 pm

@Joe Harris

In my previous job we stored both the now and the history of our data in our OLTP system. That was possible because the amount of data was small, no need to export the data to a data warehouse and delete history in the OLTP system. By the way just storing present and past is hard but storing data with a startdate in the future makes it much, much more complicated. The past will always be the past but the future will become the present first before becoming the past.

Most changes in this system were a combination of update and insert. You could query how ‘life’ was 200 or 600 days ago or how it will be in 50 days.

The business logic was very complicated and I think that splitting the data between an ‘OLTP’ database and a ‘data warehouse’ database would have made it even more complicated and it would be costly and error-prone. Should that data warehouse only contain history data or maybe future data too? When it has to contain future data too you will have to copy that data to the OLTP system one day.

I don’t understand how a data warehouse somehow improves data that have become inconsistent in the OLTP environment. I think it is better to explore the possibilities of stuff like unit tests to improve applications. Or explore the possibilities of unique function based indexes or fast refresh materialized views to check the data consistency if you want.

In my current job the amount of data of my customers is much larger and because of the large amount it is indeed needed to separately store the now data and the old data. Luckily we don’t have to deal with future data:)
Stray__Cat on October 1st, 2009 4:11 pm

Why you need a database for business data analysis (i.e. a data warehouse) EVEN IF you are a small company, have little data and your server can handle all the workload.

1) Your history is there even if your OLTP system blows up. It’s not a matter of backups: the vendor goes belly up and there’s no further support, your business changes too much to keep using the old system, your new CEO loves a different SW etc.

2) Building such a database forces you to think about an analysis model for your business. A stable model makes comparisons with the past possible. The need for changes in a DW mirrors the need to change the company strategy; changing too often means that your company does not know where it’s going.

3) Building such a database forces you to think about key performance indicators for your business. No business, no matter how small it is, is best managed with the bank bottom line only.

4) Likely you have different systems with different master data. Likely you had them reconciled in your database or you have generated best practices for them.

5) Often business people think by categories not implemented in a business application. Your DW may be the only place where some data may reside and be applied to your analysis model.

6) If you have a decent Business Intelligence system in place, you can cope quickly with unexpected, one-shot requests and be ready when the request is issued again for another one shot.

7) Having a different DB on a different technology makes the DW users feel special compared to all those data entry people, and makes it more acceptable to upper management.

It’s not a technical issue at all. Separating the two worlds makes sense from a business perspective.
Curt Monash on October 1st, 2009 4:20 pm

Stray Cat,

Only your #1 and #7 really seem to require separate databases for DW and OLTP. And #7 seems pretty silly to me.

#1 could actually be addressed in a single DBMS by materialized views as well. But it’s an excellent point even so.
Winston Chen on October 1st, 2009 4:54 pm

The debate here is whether the same DB platform can serve multiple purposes, different load and query patterns, etc. This is the old generalist versus specialist debate. There’re parallels in biology and evolution, in finance about the merits of diversification, in management strategy about why conglomerates exist. Back to the IT world: Do you want your DB platform to specialize in solving a specific class of problems, but have a hard time moving beyond that, or do you want your platform to handle whatever you throw at it reasonably well but not excel at any one thing? My personal opinion is that there is a place for both.
David Aldridge on October 2nd, 2009 2:34 am

@Joe,

Yes, I agree that the data should be isolated into some kind of separate structures for data warehousing. As you say, historical data analysis is not well served by 3NF systems and PeopleSoft or SAP of course are not designed to do that.

However I don’t think that claiming that Exadata 2 is suitable for mixed DW/OLTP workloads is the same thing as saying that you don’t transform and move the data. I think it’s very likely that you would. Whether that’s in the same database or a different one housed on the same machine would be an interesting decision, and I wonder how co-hosting on the same hardware would affect that. Maybe it’s the old-skool in me but I’d look at separate, “co-located” databases first and look to be convinced that they can be part of the same database. Oracle’s CDC solutions or transportable tablespaces would be interesting when running with the same Exadata 2 hardware as both source and target.
David Aldridge on October 2nd, 2009 2:57 am

@Curt

>> You might just want to run a separate brand of DBMS for your OLTP and data warehousing.

Oracle’s philosophy for a long time has been to take the code to the data, not to take the data to the code (eg. integration of OLAP, Data Mining, Rules Management etc), so Exadata 2 as a combined platform is very much in line with that. Having worked on extracting data to data mining and OLAP applications I’m very much inclined to agree with them — data movement is a pain.

However, running analytics against a live OLTP schema is a formidable problem. That doesn’t preclude hosting both types of database on the same machine though, with all the advantages that would bring. It looks like they’ve put some serious thought into resolving DW and OLTP SLA’s also.

As for the opinion of Teradata, Greenplum etc. … “Man holding hammer says screws will never catch on”. They rather would say that, wouldn’t they? The technical problems of creating a world-class OLTP RDBMS rather exceed those of creating an analytic platform, I’d say.

>>You may want to lay out or index your tables differently for OLTP and data warehousing

Yes. I don’t think that anyone at Oracle Corp. would disagree with that. But they would probably suggest that Exadata 2 as a combined platform helps with that by making the process faster and reducing the complexity of data movement.

>> You may want to lay out your files differently for OLTP and data warehousing (e.g., in terms of block sizes).

Maybe so. It’s a trade off between complexity of management and technical advantages. Oracle have the reputation, somewhat unjustly IMHO, of being high-maintenance and ASM’s SAME approach is obviously intended to address that by offering to take away a decision point about these issues. That decision can still be addressed if you want to though, I would guess.

>> OLTP and analytic workloads step on each other’s toes

In Oracle that tends to be a disk issue. It looks like that is really being addressed though with more advanced query scheduling and I/O balancing (I/O resource management has not been addressed at all as far as I can think in pre-Exadata 2 Oracle), and OLTP-on-flash can only help.

Well I think you’re right to be cautious, but I also think that Oracle are right to pursue this line. They’re obviously putting a lot of effort into whacking the moles of the problem and I think the discussion above shows that the primary concerns people have about it have been anticipated. Whether the solution is right on the first pass is still to be proven, but it’s an exciting step in the right direction.

I think that if mixed workloads on a single machine are possible then Oracle have positioned themselves to be the only ones capable of reaching it in the near future.
Leandro Tubia on October 2nd, 2009 7:26 am

@David

I totally agree with you that it’s not a matter of deciding if it’s better having one or two data models (I prefer two for all reasons enumerated above), but if it’s feasible to have both models in the same box.
Actually, we have a similar dilemma at another level when designing disk assignment to OLTP and DW servers from the same HP EVA cabin: it’s supposed that data distribution along disk and virtual arrays is dynamically assigned by cabin intelligence according to usage patterns, so as to optimize throughput.
I imagine that Oracle suggests to bubble up this philosophy to the server node level.
In the case of Exadata it could be easier as the architecture should be totally balanced from CPU, memory to HBAs and disk arrays.
In the classic Server+SAN architecture it’s quite different because, even without considering the mission the server would be assigned to, heterogeneous server configurations generate hierarchized usage: newer powerful servers (many CPUs with much, much RAM) monopolize disk consumption, leaving other older servers waiting for access.
Stray__Cat on October 2nd, 2009 12:04 pm

@Curt

You are perfectly right in a technical frameset, but my point is that choosing on a technical basis is NOT the right way to go. Separate systems with different technologies make sense from a business perspective.

About #7, I’ve seen too many CEOs thinking that, if the technology is the same as the ERP, BI is something operational and not strategic.
In other words, placing the DW together with the OLTP disqualifies the DW and makes it harder to get resources and visibility.
Oracle Exadata 2 capacity pricing | DBMS2 -- DataBase Management System Services on October 6th, 2009 8:20 am

[…] Issues in integrating OLTP and data warehousing in a single system […]
Multi-model database managers | DBMS 2 : DataBase Management System Services on September 30th, 2015 9:56 pm

[…] in 2009 integrating OLTP and data warehousing was clearly a bad […]

