Progress DataDirect discovers XML
As a general rule, if you want DBMS drivers, your first call should be to Progress DataDirect. They’ve been the dominant vendor (under multiple names and ownerships) of both ODBC and JDBC drivers, essentially since those standards’ respective inventions. (Persistent Systems Private Ltd. — better known as PSPL — wouldn’t be a terrible choice for your second call).
DataDirect seems to have introduced XQuery drivers last fall. I don’t have a lot of detail on those, however, because the DataDirect guy who contacted me did so mainly to show off a nice toy, Stylus Studio. Stylus Studio is an XML query-building toolkit, available for online purchase for $800 or less. A lot of the users seem to be systems integrators. Sales are split 50-50 between the DataDirect regular salesforce and online, apparently mainly from their own store, but I got the sense we’re not talking about huge numbers yet.
When they demoed it to me, its usability looked on a par with Cognos Impromptu (a SQL query-building tool) circa the mid-1990s. But they do claim all the right things in round-trip code generation and so on.
Applications seem to be concentrated in intercompany information exchange, based on both legacy EDI (Electronic Data Interchange) and more modern web services. Other uses they cited were parsing web server logs and publishing relational data to a web page.
The technology/product seems to have bounced around for a while, from Object Design (OODBMS pioneer that took a premature shot at the XML database business, and the source of the ObjectStore technology I keep writing about in this blog) to eXcelon (merger partner for ODS, eventually bought by Progress), to Progress’s Sonic Software Division, and now to DataDirect after Progress bought them. Apparently none of those companies have or had top-end UI expertise …
If you want to get a better feel for XQuery, you could do worse than to play with this tool. For example, it’s what I think I’ll use in the unlikely case I ever get around to parsing the SpamAssassin add-ins to my email messages and trying to understand what SpamAssassin is and isn’t doing.
| Categories: Progress, Apama, and DataDirect | Leave a Comment |
More on the inventory database example
In my recent column on XML storage, I referenced a Microsoft-provided example of an inventory database. A retailer (I think an online one) wanted to manage books and DVDs and so on, and search across attributes that were common to the different entity kinds, such as title.
Obviously, there are relational alternatives. Items have unique SKU numbers and belong to one of a limited number of kinds; a set of integrity constraints could mandate that an item be listed in the appropriate table for its kind and no other; and common attributes could then be searched via views that amount to unions (or via derived tables kept synchronized through their own integrity constraints).
I pushed back at Microsoft — which is, you may recall, not just an XML advocate but also one of the largest RDBMS vendors — with this kind of reasoning, and they responded with the following, which I just decided to (with permission) post verbatim.
“If all you ever do is manage books and DVDs, then managing them relationally works well, especially if their properties do not change. However, you may want to add CDs and MP3 on memory cards and many other items that all have different properties. Then you quickly run into an administration overhead and may not be able to keep up with your schema evolution (and you need an additional DBA for managing the complex relational schema). Even if you use a relational approach that stores common properties in joint tables, the recomposition costs of the information for one item may become too expensive to bear.”
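To make the relational alternative concrete, here’s a toy sketch in Python with SQLite. All table and column names are hypothetical, invented for illustration; the point is simply that per-kind tables plus a union view let you search common attributes (like title) across kinds:

```python
import sqlite3

# Hypothetical schema: one table per item kind, plus a view that
# unions the common attributes (SKU, title) for cross-kind search.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (sku TEXT PRIMARY KEY, title TEXT NOT NULL,
                    author TEXT, isbn TEXT);
CREATE TABLE dvds  (sku TEXT PRIMARY KEY, title TEXT NOT NULL,
                    director TEXT, region INTEGER);
CREATE VIEW items (sku, kind, title) AS
    SELECT sku, 'book', title FROM books
    UNION ALL
    SELECT sku, 'dvd',  title FROM dvds;
""")
conn.execute("INSERT INTO books VALUES ('B1', 'Dune', 'Herbert', '0441172717')")
conn.execute("INSERT INTO dvds  VALUES ('D1', 'Dune', 'Lynch', 1)")

# Search on the common 'title' attribute across all item kinds:
rows = conn.execute(
    "SELECT sku, kind FROM items WHERE title = 'Dune' ORDER BY sku"
).fetchall()
print(rows)  # [('B1', 'book'), ('D1', 'dvd')]
```

Microsoft’s objection maps directly onto this sketch: every new item kind means a new table plus an amended view, and that schema-evolution burden is exactly what they claim the XML approach avoids.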
| Categories: Microsoft and SQL*Server, Structured documents | 3 Comments |
And now a moment of humor
A classic hacker jest, and also the best blonde joke ever.
| Categories: Humor | 2 Comments |
Finally a column on XML storage
After several months of headfakes, I finally did a column on XML storage this month. There turned out to be room for application discussion, but not for much technical nitty-gritty.
The app discussion is pretty consistent with what I’d already posted here, although I wish I’d gone into more detail on the inventory database example. (Stay tuned for followup here!)
I also intend to post soon with some technical detail about how XML storage is actually handled.
I also got some good insight from Marklogic about what customers wanted in their text-centric markets. More on that soon too.
And by the way — I didn’t pick the Oracle-bashing title. I also didn’t pick the Oracle-bashing title for my Network World “Hot Seat” video. But somehow, the Oracle-doubting parts of my views are of special interest to my friends in the media. And it’s not as if the titles say anything I actually disagree with …
| Categories: OLTP, Oracle, Structured documents | 3 Comments |
Memory-centric research — hear the latest!
What I’ve written so far in this blog (and in Computerworld) about memory-centric data management technology is just the tip of the iceberg. A detailed white paper is forthcoming, sponsored by most of the industry leaders: Applix, Progress, SAP, Intel (in association with SAP), and Solid. (But for some odd reason Oracle declined to participate …)
A lot of the material will be rolled out publicly for the first time in a webinar on Wednesday, January 25, at 11 EST. Applix is the host. To participate, please follow this link.
I’m also holding forth online, in webinars and even video, on other subjects these days. More details may be found over in the Monash Report.
| Categories: Memory-centric data management, MOLAP | Leave a Comment |
Another OLTP success for memory-centric OO
Computerworld published a Progress ObjectStore OLTP success story.
Hotel reservations system, this time. Not as impressive as the Amazon store — what is? — but still nice.
| Categories: Cache, Memory-centric data management, Object, OLTP, Progress, Apama, and DataDirect, Theory and architecture | 5 Comments |
A possibly useful resource
It’s not that easy to find detailed, vendor-neutral explanations of XML storage in RDBMS. One reason may be that there isn’t much vendor-neutral reality to talk about yet; each implementation is different.
Anyhow, while it’s not overwhelming, I found one book chapter online that’s fairly useful for reviewing some of the murkier areas of the technology. Here’s a link to the section on shredding.
The book in question is a collection of chapters by various XQuery experts, a couple of whom have made strong, direct contributions to my research for this blog. I’m not sure I see the point in buying ANY book about a technology area so ill-defined and fast-changing, especially one over a year old. But if I did want a book, it would be very high on my list of ones to consider.
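For readers who haven’t run into the term: shredding means decomposing an XML document into rows in relational tables. Here’s a toy sketch in Python, with a made-up document and invented table names; real products do this far more generally (and each implementation differs, as noted above):

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = """<order id="17">
  <customer>Acme</customer>
  <line sku="B1" qty="2"/>
  <line sku="D1" qty="1"/>
</order>"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE order_lines (order_id INTEGER, sku TEXT, qty INTEGER);
""")

# "Shred" the document: elements become rows, with the parent's key
# repeated in the child table to preserve the hierarchy.
root = ET.fromstring(doc)
order_id = int(root.get("id"))
conn.execute("INSERT INTO orders VALUES (?, ?)",
             (order_id, root.findtext("customer")))
for line in root.findall("line"):
    conn.execute("INSERT INTO order_lines VALUES (?, ?, ?)",
                 (order_id, line.get("sku"), int(line.get("qty"))))

# Once shredded, ordinary SQL applies:
total = conn.execute(
    "SELECT SUM(qty) FROM order_lines WHERE order_id = ?", (order_id,)
).fetchone()[0]
print(total)  # 3
```

The hard parts the book chapter covers — mixed content, recursive schemas, round-tripping the original document — are exactly what this toy example sidesteps.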
| Categories: Structured documents | Leave a Comment |
What needs to be updated anyway?
Shayne Nelson is posting some pretty wild ideas on data architecture and redundancy. In the process of doing so, he’s reopening an old discussion topic:
Why would data ever need to be erased?
and the natural follow-on
If it doesn’t need to be erased, what exactly do we have to update?
Here are some quick cuts at answering the second question:
- “Primary” data usually doesn’t really need to be updated, exactly. But it does need to be stored in such a way that it can immediately be found again and correctly identified as the most recent information.
- Analytic data usually doesn’t need to be updated with full transactional integrity; slight, temporary errors do little harm.
- “Derived” data such as bank balances (derived from deposits and withdrawals) and inventory levels (derived from purchases and sales) commonly needs to be updated with full industrial-strength protections.
- Certain kinds of primary transactions, such as travel reservations, need the same treatment as “derived” data. When the item sold is unique, the primary/derived distinction largely goes away.
- Notwithstanding the foregoing, it must be possible to update anything for error-correction purposes (something Nelson seems to have glossed over to date).
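The primary-vs-derived distinction in the list above can be sketched in code. In this toy Python/SQLite example (hypothetical table names; it assumes a SQLite version with upsert support, 3.24+), primary ledger entries are only ever appended, while the derived balance is updated in the same transaction with full integrity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ledger (account TEXT, amount INTEGER);                 -- primary: append-only
CREATE TABLE balances (account TEXT PRIMARY KEY, balance INTEGER);  -- derived
""")

def post(account, amount):
    # Primary data is inserted, never updated; the derived balance
    # is maintained in the same transaction ("with conn" commits
    # both statements atomically, or rolls both back).
    with conn:
        conn.execute("INSERT INTO ledger VALUES (?, ?)", (account, amount))
        conn.execute("""
            INSERT INTO balances VALUES (?, ?)
            ON CONFLICT(account) DO UPDATE
            SET balance = balance + excluded.balance
        """, (account, amount))

post("alice", 100)   # deposit
post("alice", -30)   # withdrawal
bal = conn.execute(
    "SELECT balance FROM balances WHERE account = 'alice'"
).fetchone()[0]
print(bal)  # 70
```

Note that nothing in the ledger ever needs erasing; error correction would be another appended entry, which is consistent with the last bullet above.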
Respondents to Nelson’s blog generally argue that it’s better to store data once and have redundant subcopies of it in the form of indexes. I haven’t yet seen any holes in those arguments. Still, it’s a discussion worth looking at and noodling over.
| Categories: Theory and architecture | 2 Comments |
Solid state (Flash) memory vs. RAM vs. disks
I just wrote a column and a blog post on the potential for diskless PCs based on flash drives. It was a fun exercise, and I think I kept it general enough that my lack of knowledge about hardware technology details didn’t lead me into significant error.
The first vendor response I got was from Bit Micro Networks, who seem to sell such drives for PCs and enterprise storage alike. One of their press releases touts an Oracle implementation. Interesting idea. It’s far from a substitute for full memory-centric data management, but it’s kind of an intermediate way of getting some of the benefits without altering your traditional software setup much at all.
| Categories: Memory-centric data management, Oracle, Solid-state memory | 1 Comment |
Application logic in the database
I’m highly in favor of modularity in application development, but suspicious of folks who promote it to extremes as a panacea. (Perhaps another legacy of my exaggerated infatuation with LISP in the 1980s?) Thus, I was one of the chief drumbeaters for OO programming before Java made it de rigueur, but I also was one of the chief mockers of Philippe Kahn’s claims that Borland would outdevelop Microsoft in office productivity tools just because it used OO tools. (Analyst Michelle Preston bought that pitch lock, stock, and barrel, and basically was never heard from again.)
I’ve held similar views on stored procedures. A transactional DBMS without stored procedures is for many purposes not a serious product. CASE tools that use stored procedures to declaratively implement integrity constraints have been highly valuable for a decade. But more general use of stored procedures has been very problematic, due to the lack of development support for writing and maintaining them in any comprehensive way. Basically, stored procedures have been database-resident spaghetti.
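By “declaratively implement integrity constraints” I mean database-resident rules along these lines — sketched here as a SQLite trigger driven from Python, with invented table and trigger names (real CASE tools generated such code for the big commercial DBMS, not SQLite):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (sku TEXT PRIMARY KEY, on_hand INTEGER);
-- A declarative, database-resident integrity rule: reject any
-- update that would drive inventory negative, no matter which
-- application issues it.
CREATE TRIGGER no_negative_stock
BEFORE UPDATE ON inventory
WHEN NEW.on_hand < 0
BEGIN
    SELECT RAISE(ABORT, 'inventory cannot go negative');
END;
""")
conn.execute("INSERT INTO inventory VALUES ('B1', 2)")

try:
    conn.execute("UPDATE inventory SET on_hand = on_hand - 5 WHERE sku = 'B1'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

A rule like this is easy to state and hard to get wrong, which is why this narrow use of database-resident logic worked well long before general-purpose stored-procedure development did.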
Microsoft claims to have changed all this with the relationship between the new releases of SQL Server and Visual Studio, and has touted this as one of the few “game changers” in SQL Server 2005. I haven’t actually looked at their offering, but I’m inclined to give them the benefit of the doubt — i.e., absent verification I tentatively believe they are making it almost as practical from a team development standpoint to implement code in the database as it is on the middle tier.
Between the Microsoft announcement and the ongoing rumblings of the business rules folks, there’s considerable discussion of putting application logic in the database, including by the usual suspects over on Alf Pedersen’s blog. (Eric’s response in that thread is particularly good.) Here are some of my thoughts:
1. As noted above, putting logic in the database, to the extent the tools are good, has been a good thing. If the tools are indeed better now, it may become a better thing.
2. The myth that an application is just database-logic-plus-the-obvious-UI has been with us for a LONG time. It’s indeed a myth, for several reasons. There’s business process, for one thing. For another, UIs aren’t as trivial as that story would make them sound. (I keep promising to write on the UI point and never get around to it. I will. Stay tuned. For one thing, I have a white paper in the works on portals. For another, I’m not writing enough about analytics, and UI is one of the most interesting things going in analytics these days.) Plus there are many apps for which a straightforward relational/tabular database design doesn’t make sense anyway. (That’s a primary theme of this blog.)
3. It’s really regrettable that the term “business rules” is used so carelessly. It conflates integrity constraints and general application logic. Within application logic, it conflates those which are well served by a development and/or implementation paradigm along the line of a rules engine, and those for which a rules engine would make little sense. It’s just bad semantics.
4. Besides everything else, I mainly agree with SAP’s belief that the DBMS is the wrong place to look for module interfaces.
| Categories: Microsoft and SQL*Server, Theory and architecture | 2 Comments |
