Theory and architecture
Analysis of design choices in databases and database management systems. Related subjects include:
- Any subcategory
- Database diversity
- Explicit support for specific data types
- (in Text Technologies) Text search
Amazon’s version of DBMS2
Last year, I pointed out that Amazon has a highly diversified DBMS strategy. Now Mike Vizard has a great interview with Werner Vogels, Amazon’s CTO, in which he unearths a lot more detail. And it turns out that Amazon has been a hardcore adopter of DBMS2 since long before DBMS2 was named.
Read more
Categories: Amazon and its cloud, Database diversity, NoSQL, Specific users, Theory and architecture | Leave a Comment |
How and where to deploy business rules
James Taylor can be something of an extremist in his advocacy of inference engines, but I think this post about how to deploy business rules is spot-on. The three points I particularly liked were:
- “Don’t underestimate how much change you might actually want in apparently ‘fixed’ rules.” Rules should be managed by a flexible tool for specification, development, and/or maintenance. Relational purists sometimes advocate putting them under direct control of the DBMS; I disagree strongly.
- “An effective way to combine business rules and BPEL model-driven rules is to use decision services in conditions.”
- “Using templates to control which parts of which rules can be edited by a given group of business users is key to delivering agility.” (A sketch of this idea follows below.)
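To make the template point concrete, here is a minimal Python sketch. All the names (CreditLimitRule and so on) are invented for illustration; the idea is simply a rule whose numeric threshold is open to business editing while everything else stays locked down:

```python
# Hypothetical sketch of a rule template: only the fields the template
# exposes can be edited by business users; the logic itself is locked.

class CreditLimitRule:
    """Approval-threshold rule; only 'threshold' is business-editable."""

    EDITABLE_FIELDS = {"threshold"}  # the template's whitelist

    def __init__(self, threshold=10_000):
        self.threshold = threshold

    def update(self, field, value):
        if field not in self.EDITABLE_FIELDS:
            raise PermissionError(f"{field} is not open to business editing")
        setattr(self, field, value)

    def needs_approval(self, amount):
        return amount > self.threshold


rule = CreditLimitRule()
rule.update("threshold", 25_000)    # allowed: a business user tunes the number
print(rule.needs_approval(30_000))  # True
# rule.update("needs_approval", None) would raise PermissionError
```

The template embodies exactly Taylor’s first point: the “fixed” part of a rule turns out to change often, so expose it for editing, but only it.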
Categories: Theory and architecture | Leave a Comment |
Tom Kyte said it more concisely
Mark Whitehorn had a good article on the importance of horses-for-courses context. But Tom Kyte said the same thing more concisely:
I believe strongly – and more strongly every day – that there are only two possible answers to a “first question”. They are:
- Why
- It Depends
That said, I suspect that I agree with Tom more emphatically than he himself does. 😉 At least when it comes to the relative superiority of various data models …
Categories: Theory and architecture | Leave a Comment |
Multivalue in Access triggers religious war
The Register managed to inflame the faithful on all sides with its comments on the addition of multivalue datatypes to Access. Trying to soothe(?) matters, Mark Whitehorn makes some astute comments about data models in general. One of my favorite parts is some armchair psychology about people who, having learned one complex system, grow attached to it, regard it as the One True Way, and regard all alternatives as the work of the Devil. My other favorite part is this analogy:
Categories: Theory and architecture | 1 Comment |
SAP on SAP
Dan Farber’s blog from SAP’s developer conference isn’t, frankly, his best piece of work, since the quotes are sometimes so garbled as to be a bit unreadable. Still, it helps flesh out what we already knew about SAP’s strategy.
Basically, they claim to be reengineering their whole product line for the new services-based architecture I keep writing about. And they insist this truly is a new platform architecture. In that regard, I buy their pitch.
They further insist that the mid-market will be a big part of their business going forward, but SaaS will not. I don’t buy into that as fully.
I’ll spell out why in another post, but not until Monday at the earliest. Watch the comments section on this one for trackbacks.
Categories: SAP AG, Theory and architecture | Leave a Comment |
Business rules, business process
Alf Pedersen’s blog has yet another long discussion of putting business rules in the database versus putting them in the application. (Since IT Toolbox trackbacks seem to be, as usual, broken, this is the best link I have.)
What’s getting forgotten as usual in this debate, I think, is the direct automation of business processes. Business rules of the sort “No credit granted can exceed $10,000” are silly wherever they’re put. Rather, the business rule should be something like “An attempt to grant credit in excess of $10,000 is not successful until it has been approved by a VP-level manager.” And the natural way to implement that kind of rule is NOT via database constraints (you need all sorts of other logic around it for usability).
The only “business rules” that belong in the database are precisely those that aren’t really business rules at all.
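To make that concrete, here is a toy Python sketch (the threshold, names, and return values are all invented). A database constraint could only reject the large transaction; the process-oriented rule instead leaves it pending until a VP approves:

```python
# Toy sketch of a process-oriented business rule (all names invented).
# Note that this cannot be expressed as a simple database constraint:
# the large grant isn't invalid, it's pending until a VP approves it.

APPROVAL_THRESHOLD = 10_000

def grant_credit(amount, vp_approved=False):
    if amount <= APPROVAL_THRESHOLD:
        return "granted"
    return "granted" if vp_approved else "pending VP approval"

print(grant_credit(5_000))                     # granted
print(grant_credit(50_000))                    # pending VP approval
print(grant_credit(50_000, vp_approved=True))  # granted
```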
Categories: Theory and architecture | 10 Comments |
Another OLTP success for memory-centric OO
Computerworld published a Progress ObjectStore OLTP success story.
Hotel reservations system, this time. Not as impressive as the Amazon store — what is? — but still nice.
Categories: Cache, Memory-centric data management, Object, OLTP, Progress, Apama, and DataDirect, Theory and architecture | 5 Comments |
What needs to be updated anyway?
Shayne Nelson is posting some pretty wild ideas on data architecture and redundancy. In the process of doing so, he’s reopening an old discussion topic:
Why would data ever need to be erased?
and the natural follow-on
If it doesn’t need to be erased, what exactly do we have to update?
Here are some quick cuts at answering the second question:
- “Primary” data usually doesn’t really need to be updated, exactly. But it does need to be stored in such a way that it can immediately be found again and correctly identified as the most recent information.
- Analytic data usually doesn’t need to be updated with full transactional integrity; slight, temporary errors do little harm.
- “Derived” data such as bank balances (derived from deposits and withdrawals) and inventory levels (derived from purchases and sales) commonly needs to be updated with full industrial-strength protections.
- Certain kinds of primary transactions, such as travel reservations, need the same treatment as “derived” data. When the item sold is unique, the primary/derived distinction largely goes away.
- Notwithstanding the foregoing, it must be possible to update anything for error-correction purposes (something Nelson seems to have glossed over to date).
Respondents to Nelson’s blog generally argue that it’s better to store data once and have redundant subcopies of it in the form of indexes. I haven’t yet seen any holes in those arguments. Still, it’s a discussion worth looking at and noodling over.
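For concreteness, here is an illustrative Python sketch (my own toy model, not Nelson’s design) of the primary/derived split from the list above. Primary events are appended and never erased; the balance is derived, and it is the only thing that really gets “updated”:

```python
# Toy model: append-only primary data, recomputed derived data.

ledger = []  # primary data: append-only list of (account, delta) events

def record(account, delta):
    ledger.append((account, delta))  # never erased, only appended

def balance(account):
    # Derived data. A real system would maintain this incrementally,
    # under full industrial-strength transactional protections.
    return sum(delta for acct, delta in ledger if acct == account)

record("alice", 100)
record("alice", -30)
print(balance("alice"))  # 70

# Error correction also appends, via a compensating event,
# rather than erasing history:
record("alice", 30)      # reverse the erroneous withdrawal
print(balance("alice"))  # 100
```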
Categories: Theory and architecture | 2 Comments |
Application logic in the database
I’m highly in favor of modularity in application development, but suspicious of folks who promote it to extremes as a panacea. (Perhaps another legacy of my exaggerated infatuation with LISP in the 1980s?) Thus, I was one of the chief drumbeaters for OO programming before Java made it de rigueur, but I also was one of the chief mockers of Philippe Kahn’s claims that Borland would outdevelop Microsoft in office productivity tools just because it used OO tools. (Analyst Michelle Preston bought that pitch lock, stock, and barrel, and basically was never heard from again.)
I’ve held similar views on stored procedures. A transactional DBMS without stored procedures is for many purposes not a serious product. CASE tools that use stored procedures to declaratively implement integrity constraints have been highly valuable for a decade. But more general use of stored procedures has been very problematic, due to the lack of development support for writing and maintaining them in any comprehensive way. Basically, stored procedures have been database-resident spaghetti.
Microsoft claims to have changed all this with the relationship between the new releases of SQL Server and Visual Studio, and has touted this as one of the few “game changers” in SQL Server 2005. I haven’t actually looked at their offering, but I’m inclined to give them the benefit of the doubt — i.e., absent verification, I tentatively believe they are making it almost as practical from a team development standpoint to implement code in the database as it is on the middle tier.
Between the Microsoft announcement and the ongoing rumblings of the business rules folks, there’s considerable discussion of putting application logic in the database, including by the usual suspects over on Alf Pedersen’s blog. (Eric’s response in that thread is particularly good.) Here are some of my thoughts:
1. As noted above, putting logic in the database, to the extent the tools are good, has been a good thing. If the tools are indeed better now, it may become a better thing.
2. The myth that an application is just database-logic-plus-the-obvious-UI has been with us for a LONG time. It’s indeed a myth, for several reasons. There’s business process, for one thing. For another, UIs aren’t as trivial as that story would make them sound. (I keep promising to write on the UI point and never get around to it. I will. Stay tuned. For one thing, I have a white paper in the works on portals. For another, I’m not writing enough about analytics, and UI is one of the most interesting things going in analytics these days.) Plus there are many apps for which a straightforward relational/tabular database design doesn’t make sense anyway. (That’s a primary theme of this blog.)
3. It’s really regrettable that the term “business rules” is used so carelessly. It conflates integrity constraints with general application logic. And within application logic, it conflates rules that are well served by a development and/or implementation paradigm along the lines of a rules engine with those for which a rules engine would make little sense. It’s just bad semantics. (The sketch after this list illustrates the first conflation.)
4. Besides everything else, I mainly agree with SAP’s belief that the DBMS is the wrong place to look for module interfaces.
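To illustrate point 3, here is a toy Python sketch (names invented) of the two things the term conflates. The first is an integrity constraint, a property the data must always satisfy; the second is application logic, which no constraint mechanism naturally expresses:

```python
# Toy contrast (all names invented) between an integrity constraint
# and application logic, both of which get called "business rules".

def check_integrity_constraint(order):
    # A property the data must always satisfy.
    assert order["quantity"] > 0, "quantity must be positive"

def apply_application_logic(order):
    # Behavior, potentially the entry point to a whole approval process.
    if order["quantity"] * order["unit_price"] > 10_000:
        return "route to manager for approval"
    return "auto-approve"

order = {"quantity": 500, "unit_price": 25}
check_integrity_constraint(order)
print(apply_application_logic(order))  # route to manager for approval
```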
Categories: Microsoft and SQL*Server, Theory and architecture | 2 Comments |
Two kinds of DBMS extensibility
Microsoft took slight exception to my claim that they lack fully general DBMS extensibility. The claim is actually correct, but perhaps it could lead to confusion. And anyhow there’s a distinction here worth drawing, namely:
There are two different kinds of DBMS extensibility.
The first kind, which Microsoft has introduced in SQL Server 2005 (but which other vendors have had for many years), is UDTs (User-Defined Types), sometimes called user-defined functions in other systems. These are in essence datatypes that are calculated functions of existing datatypes. You could use a UDT, for example, to make the NULLs in SQL go away, if you hate them. Or you could calculate bond interest according to the industry-standard “360-day year” (sketched below). Columns of these datatypes can be treated just like other columns — one can use them in joins, one can index on them, the optimizer can be aware of them, etc.
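For concreteness, here is a rough Python sketch of the 360-day-year computation such a UDT might encapsulate. This is the simplified 30/360 day-count convention, skipping the end-of-month edge cases real bond math involves:

```python
# Rough sketch of a UDT-style computed value: accrued bond interest
# on the simplified 30/360 convention (every month counts as 30 days,
# the year as 360). End-of-month edge cases are deliberately ignored.

from datetime import date

def days_30_360(start: date, end: date) -> int:
    d1, d2 = min(start.day, 30), min(end.day, 30)
    return ((end.year - start.year) * 360
            + (end.month - start.month) * 30
            + (d2 - d1))

def accrued_interest(principal: float, annual_rate: float,
                     start: date, end: date) -> float:
    return principal * annual_rate * days_30_360(start, end) / 360

# Half a 360-day year at 5% on $1,000,000:
print(accrued_interest(1_000_000, 0.05, date(2006, 1, 15), date(2006, 7, 15)))
# 25000.0
```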
The second kind, commonly known by the horrible name of abstract datatypes (ADTs), is found mainly in Oracle, DB2, and previously the Informix/Illustra products. Also, if my memory is accurate, Ingres has a very partial capability along those lines, and PostgreSQL is said to be implementing them too. ADTs offer a way to add totally new datatypes to a relational system, with their own data access methods (e.g., index structures). That’s how a DBMS can incorporate a full-text index, or a geospatial datatype. It can also be a way to implement more efficiently something that would also work as a UDT.
In theory, Oracle et al. expose the capability for users to create ADTs. In practice, you need to be a professional DBMS developer to write them, and they are written either by the DBMS vendors themselves or by specialist DBMS companies. E.g., much geospatial data today is stored in ESRI add-ons to Oracle; ESRI of course offered a specialty geospatial DBMS before ADTs were on the market.
Basically, implementing a general ADT capability is a form of modularity that lets new datatypes be added more easily than if you don’t have it. But it’s not a total requirement for new datatypes. E.g., I was wrong about Microsoft’s native XML implementation; XML is actually managed in the relational system. (More on that in a subsequent post.)
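To show what “their own data access methods” means, here is a conceptual Python sketch. The interface is entirely invented, not any vendor’s actual API; it just illustrates the kind of thing an ADT implementer supplies, namely a new datatype plus an index structure the DBMS wouldn’t otherwise have:

```python
# Conceptual sketch (invented interface, not a real vendor API) of an
# ADT-style access method: a toy full-text inverted index, standing in
# for the B-trees a relational DBMS would otherwise use.

class FullTextIndex:
    def __init__(self):
        self.postings = {}  # word -> set of row ids

    def insert(self, row_id, text):
        for word in text.lower().split():
            self.postings.setdefault(word, set()).add(row_id)

    def search(self, word):
        # A DBMS with this ADT would route a CONTAINS()-style
        # predicate here rather than scanning the table.
        return self.postings.get(word.lower(), set())


idx = FullTextIndex()
idx.insert(1, "The quick brown fox")
idx.insert(2, "A lazy brown dog")
print(idx.search("brown"))  # {1, 2}
```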