April 5th, 2008 Curt Monash
There now are four hardware vendors that each offer or seem about to announce two different tiers of data warehouse appliances: Sun, HP, EMC, and Teradata. Specifically:
Read the rest of this entry »
Posted in Analytics and analytic technologies, DATAllegro, Data warehouse appliances, Data warehousing, Dataupia, Greenplum, HP and Neoview, IBM and DB2, Infobright and Brighthouse, Kognitio and WX2, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, Relational database management systems, Sybase, Teradata | 4 Comments »
March 14th, 2008 Curt Monash
An interesting part of my conversation with Dataupia’s CTO John O’Brien came when we talked about data warehousing in general. On the one hand, he endorsed the view that using Oracle probably isn’t a good idea for data warehouses larger than 10 terabytes, with SQL Server’s limit being well below that. On the other hand, he said he’d helped build 50-60 terabyte warehouses in Oracle years ago.
The point is that to build warehouses that big in Oracle or other traditional DBMS, you have to pull out a large bag of tricks. Read the rest of this entry »
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Microsoft and SQL*Server, Oracle, Relational database management systems | 16 Comments »
March 6th, 2008 Curt Monash
As usual, Microsoft forgot to brief me, but Mary Jo Foley reports on Microsoft SQL Server Data Services. A look at the official site clarifies that this database-in-a-cloud offering uses “Microsoft SQL Server as a data storage node.” However, there seems to be a software layer on top of SQL Server providing scale-out and appropriate management.
In addition to the more-than-SQL-Server layer, there seems to be a less-than-SQL-Server aspect as well. In particular, Microsoft SQL Server Data Services boasts “Support for simple types: string, numeric, datetime, boolean.” XML is the “primary wire format,” and hints dropped about the schema philosophy sound XMLish too.
Interestingly, Foley reports that Microsoft plans to offer an on-premises version of Microsoft SQL Server Data Services as well.
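To make the "simple types plus XML wire format" idea concrete, here is a minimal sketch of a property bag limited to those four types and serialized as XML. The element names, attribute names, and the `to_xml` helper are all hypothetical; this is not the actual Microsoft SQL Server Data Services format, which the official site doesn't spell out in enough detail for me to reproduce.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical mapping from Python types to the four simple types named above.
TYPE_NAMES = {
    str: "string",
    int: "numeric",
    float: "numeric",
    datetime: "datetime",
    bool: "boolean",
}

def to_xml(entity_id, properties):
    """Serialize a flat property bag to XML (illustrative layout only)."""
    root = ET.Element("entity", id=entity_id)
    for name, value in properties.items():
        # type(value) is exact, so True/False map to "boolean", not "numeric".
        type_name = TYPE_NAMES[type(value)]
        prop = ET.SubElement(root, "property", name=name, type=type_name)
        if isinstance(value, bool):
            prop.text = "true" if value else "false"
        elif isinstance(value, datetime):
            prop.text = value.isoformat()
        else:
            prop.text = str(value)
    return ET.tostring(root, encoding="unicode")

doc = to_xml("order-1", {
    "customer": "Acme",
    "total": 19.5,
    "shipped": False,
    "placed": datetime(2008, 3, 6),
})
print(doc)
```

The point of the sketch is only that each entity carries its own property names and types, which is what makes the schema philosophy feel XMLish rather than relational.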
Posted in Cloud computing, Microsoft and SQL*Server, Native XML | No Comments »
February 18th, 2008 Curt Monash
I recently caught up with ParAccel’s CTO Barry Zane and Marketing VP Kim Stanick for a long technical discussion, which they have graciously continued by email. It would be impolitic in the extreme to comment on what led up to that. Let’s just note that many things I’ve previously written about ParAccel are now inoperative, and go straight to the highlights.
Read the rest of this entry »
Posted in Columnar architectures, Data warehousing, Microsoft and SQL*Server, ParAccel, Portability, transparency, and plug-compatibility | 4 Comments »
January 28th, 2008 Curt Monash
Question of the day #2
Who is actually using native XML?
Mark Logic is having a fine time using its native XML engine for custom publishing. One outfit I know of is using a native XML engine for something like web analytics, but is driving me crazy by never coming through on permission to divulge details. There’s a bit of native XML use out there supporting the insurance industry’s ACORD standard.
And after that I quickly run out of examples of native XML use. Read the rest of this entry »
Posted in Data types, IBM and DB2, Mark Logic, Microsoft and SQL*Server, Native XML, Oracle | 1 Comment »
January 24th, 2008 Curt Monash
I may argue for the use of open source and other mid-range database management systems, but a lot of industry sentiment remains on the other side. Vendors of high-end RDBMS naturally advocate enterprise-wide single-vendor adoption. Many CIOs and industry analysts, overwhelmed by product proliferation, think that’s a neat idea as well.
And in fairness, they’re not entirely wrong. Here are 14 reasons for using high-end relational database management systems, even on applications for which mid-range DBMS would suffice. Read the rest of this entry »
Posted in Microsoft and SQL*Server, Mid-range DBMS, MySQL, OLTP database management, Open source RDBMS, Oracle, PostgreSQL, Relational database management systems | 17 Comments »
January 22nd, 2008 Curt Monash
For very high-end applications, the list of viable database management systems is short. Scalability can be a problem. (The rankings of most scalable alternatives differ in the OLTP and data warehouse realms.) Extreme levels of security can be had from only a few DBMS. (Oracle would have you believe there’s only one choice.) And if you truly need 99.99% uptime, there only are a few DBMS you even should consider.
But for most applications at any enterprise – and for all applications at most enterprises – super high-end DBMS aren’t required. There are relatively few applications that wouldn’t run perfectly well on PostgreSQL or EnterpriseDB today. Ingres and Progress OpenEdge aren’t far behind (they’re a little lacking in datatype support). Ditto Intersystems Cache’, although the nonrelational architecture will be off-putting to many. And to varying degrees, you can also do fine with MySQL, Pervasive PSQL, MaxDB, or a variety of other products – or for that matter with the cheap or free crippled versions of Oracle, SQL Server, DB2, and Informix.
What’s more, these mid-range database management systems can have significant advantages over their high-end brethren. Read the rest of this entry »
Posted in EnterpriseDB and Postgres Plus, IBM and DB2, Ingres, Intersystems and Cache', Microsoft and SQL*Server, Mid-range DBMS, MySQL, Open source RDBMS, Oracle, Pervasive Software, PostgreSQL, Progress, Apama, and DataDirect, Relational database management systems, SAP, BI Accelerator, and MaxDB | 14 Comments »
January 14th, 2008 Curt Monash
I’m getting a flood of press releases today, because many of the companies I write about were selected to Intelligent Enterprise’s list of 12 most influential vendors plus 36 more to watch in the areas Intelligent Enterprise covers (which seems to be pretty much the analytics-related parts of what I write about here and on Text Technologies). It looks like a pretty reasonable list, although I think they forced the issue in some of the small analytics vendors they selected, and of course anybody can quibble with some of the omissions.
Among the companies they cited, you can find topical categories here for IBM (and Cognos), Informatica, Microsoft, Netezza, Oracle, SAP/Business Objects (both), SAS, and Teradata; QlikTech; Cast Iron, Coral8, DATAllegro, HP, ParAccel, and StreamBase; and Software AG. On Text Technologies you’ll find categories for some of the same vendors, plus Attensity, Clarabridge, and Google. There also are categories for some of these vendors on the Monash Report.
Posted in Business Objects, Cast Iron Systems, Coral8, DATAllegro, HP and Neoview, IBM and DB2, Informatica, Microsoft and SQL*Server, Netezza, Oracle, ParAccel, QlikTech and QlikView, SAP, BI Accelerator, and MaxDB, SAS Institute, Software AG and ADABAS, StreamBase, Teradata | No Comments »
October 29th, 2007 Curt Monash
Please do not rely on the parts of this post that draw a distinction between in-memory and disk-based operation. See our February 18, 2008 post about ParAccel instead. It turns out that communication with ParAccel was yet worse than I had realized.
Officially launched today at the TDWI conference, ParAccel is out to compete with Netezza. Right out of the chute, ParAccel may have surpassed Netezza in at least one area: pointlessly annoying secrecy. (In other regards I love them dearly, but that paranoia can be a real pain.) As best I can remember, here are some things about ParAccel that I both am allowed to say and find interesting:
- ParAccel offers a columnar, MPP data warehouse DBMS, called the ParAccel Analytic Database.
- ParAccel’s product runs in two main modes. “Maverick” is normal, stand-alone mode. “Amigo” mode amounts to a plug-compatible accelerator for Oracle or Microsoft SQL*Server. Early sales and marketing were concentrated on SQL*Server Amigo mode.
- ParAccel’s product also runs in another pair of modes – in-memory and disk-based. Early sales and marketing were concentrated on in-memory mode. Hybrid memory-centric processing sounds like something for a future release.
- Sun has a reseller partnership with ParAccel, focused on in-memory mode.
- Sun and ParAccel published record-shattering 100 gigabyte, 300 gigabyte, and 1 terabyte TPC-H benchmarks today, based on in-memory mode. (If you’d like to throw 13 terabytes of disk at 1 terabyte of user data, running simple and repetitive queries, that benchmark might be a useful guide to your own experience. But hey – that’s a big improvement on the prior champion, who used 40 terabytes of disk. To ParAccel’s credit, they’re not pretending that this is a bigger deal than it is.)
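For anybody who wants the disk-to-data arithmetic from that last point spelled out, here is a quick sketch. It uses only the figures quoted above (13 and 40 terabytes of disk, each against 1 terabyte of user data); the function name is my own.

```python
def disk_per_user_tb(disk_tb, user_data_tb):
    """Terabytes of disk configured per terabyte of user data."""
    return disk_tb / user_data_tb

paraccel_ratio = disk_per_user_tb(13, 1)   # ParAccel's 1 terabyte TPC-H result
prior_ratio = disk_per_user_tb(40, 1)      # the prior champion

# How many times less disk per terabyte of user data the newer result needed.
improvement = prior_ratio / paraccel_ratio

print(f"{paraccel_ratio:.0f}:1 vs. {prior_ratio:.0f}:1, about {improvement:.1f}x less disk")
```

Either way, both ratios are far from what most production warehouses would configure, which is why the benchmark is a limited guide.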
Read the rest of this entry »
Posted in Analytics and analytic technologies, Columnar architectures, Data warehouse appliances, Data warehousing, Microsoft and SQL*Server, Oracle, ParAccel, Portability, transparency, and plug-compatibility, Relational database management systems | No Comments »
October 12th, 2007 Curt Monash
I’ve been arguing for a while that Oracle and Microsoft are screwed in high-end data warehousing. The reason is that they’re stuck with SMP (Symmetric Multi-Processing) architectures, while Teradata, Netezza, DATAllegro, and many others enjoy the benefits of MPP (Massively Parallel Processing). Thus, Teradata and DATAllegro boast installations in the hundreds of terabytes each, while Oracle and Microsoft users usually have to perform unnatural acts of hard-coded partitioning even to reach the 10 terabyte level.
That said, there are at least three ways Oracle and/or Microsoft could get out of this technical box:
1. They could buy or just partner with MPP vendors such as Dataupia, who offer plug-compatibility with their respective main DBMS.
2. They could buy whoever they want, plug-compatibility be damned. Presumably, they’d quickly add a light-weight data federation front-end to give the appearance of integration, then merge the products more closely over time.
3. They could develop or buy technology like DATAllegro’s, which essentially federates instances of an ordinary SMP DBMS across nodes of an MPP grid (Greenplum does something similar). I imagine that, for example, ripping Ingres out of DATAllegro and slotting in Oracle instead would be a pretty straightforward exercise; even without dramatic change to any of the optimizations, the resulting port would be something that ran pretty quickly on Day 1.
Bottom line: Oracle and Microsoft are hemorrhaging at the data warehouse high end now. But there are ways they could stanch the bleeding.
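As a rough illustration of the third option, here is a toy sketch of federating ordinary single-node engines across a shared-nothing grid: rows are hash-distributed, each node aggregates only its own slice, and a coordinator merges the partial results. The `Node` class and function names are hypothetical, and this is not a description of how DATAllegro or Greenplum actually implement it.

```python
import zlib
from collections import defaultdict

class Node:
    """Stand-in for one ordinary DBMS instance that owns its own rows."""
    def __init__(self):
        self.rows = []

    def local_sum_by_key(self):
        # Each node aggregates only the rows it stores locally.
        totals = defaultdict(int)
        for key, amount in self.rows:
            totals[key] += amount
        return dict(totals)

def load(nodes, rows):
    # Hash-distribute rows by key so no node needs another node's data.
    for key, amount in rows:
        target = zlib.crc32(key.encode()) % len(nodes)
        nodes[target].rows.append((key, amount))

def scatter_gather_sum(nodes):
    # The coordinator merges the small per-node partial results.
    merged = defaultdict(int)
    for node in nodes:
        for key, subtotal in node.local_sum_by_key().items():
            merged[key] += subtotal
    return dict(merged)

nodes = [Node() for _ in range(4)]
calls = [("east", 10), ("west", 5), ("east", 7), ("north", 1), ("west", 3)]
load(nodes, calls)
totals = scatter_gather_sum(nodes)
print(totals)
```

The per-node step is exactly what a stock SMP DBMS already does well, which is why swapping one engine for another underneath such a layer looks plausible.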
Posted in Analytics and analytic technologies, DATAllegro, Data warehouse appliances, Data warehousing, Dataupia, Greenplum, Microsoft and SQL*Server, Oracle, Portability, transparency, and plug-compatibility, Relational database management systems, Teradata | 1 Comment »
October 9th, 2007 Curt Monash
At the Teradata show today, I talked with Mike Weber of Scorecard Systems Inc. Scorecard’s business is vertical BI for telecommunications companies to analyze call data. They support Teradata (obviously), Oracle, and Microsoft SQL*Server, with Netezza coming soon. But not DB2.
Mike says that, in ten years in this business, he’s never seen DB2. Read the rest of this entry »
Posted in Analytics and analytic technologies, Business intelligence, Data warehousing, IBM and DB2, Microsoft and SQL*Server, Oracle, Teradata | No Comments »
October 5th, 2007 Curt Monash
I’ve been talking a lot to text mining vendors this week, as per a series of posts over on Text Technologies. Specifically, I’ve focused on the two with exhaustive extraction strategies, namely Attensity and Clarabridge. (Exhaustive extraction is Attensity’s term for separating the linguistic-analysis part of text mining from the DBMS-based BI/analytics part.)
So I asked each of Attensity and Clarabridge the side question as to which data warehouse software or appliances they were seeing. The answers were almost identical: Oracle, Microsoft SQL*Server, Teradata, and Netezza. One also mentioned MySQL and two HP prospects, but the HP sites were running NonStop SQL, not Neoview. Amazingly, there were no mentions of DB2. There also weren’t any mentions of the smaller specialist startups, such as DATAllegro, Greenplum, or Vertica.
Posted in Analytics and analytic technologies, Business intelligence, Data warehouse appliances, Data warehousing, Greenplum, HP and Neoview, IBM and DB2, Microsoft and SQL*Server, MySQL, Oracle, Relational database management systems, Teradata | 7 Comments »
September 24th, 2007 Curt Monash
Pervasive Software has a long history – 25 years, in fact, as they’re emphasizing in some current marketing. Ownership and company name have changed a few times, as the company went from being an independent startup to being owned by Novell to being independent again. The original product, and still the cash cow, was a linked-list DBMS called Btrieve, eventually renamed Pervasive PSQL as it gained more and more relational functionality.
Pervasive Summit PSQL v10 has just been rolled out, and I wrote a nice little white paper to commemorate the event, describing some of the main advances over v9, primarily for the benefit of current Pervasive PSQL developers. In one major advance, Pervasive made the SQL functionality much stronger. In particular, you now can have a regular SQL data dictionary, so that the database can be used for other purposes – BI, additional apps, whatever. Apparently, that wasn’t possible before, although it had been possible in yet earlier releases. Pervasive also added view-based security permissions, which is obviously a Very Good Thing.
There also are some big performance boosts. Read the rest of this entry »
Posted in Database compression, Hierarchies, networks, graphs, and trees, Memory-centric data management, Microsoft and SQL*Server, Mid-range DBMS, OLTP database management, Pervasive Software, Portability, transparency, and plug-compatibility, Relational database management systems | No Comments »
June 14th, 2007 Curt Monash
I’ve been implying that the short list for native XML database engine vendors should be Mark Logic, IBM, and maybe Microsoft, on the theory that Progress and Intersystems tried the market and pulled back. Well, add Intersystems to the list, and not necessarily in last place. They’ve long had a very fast nonrelational engine in Cache’. Perhaps building Ensemble on it has induced them to sharpen up the XML capabilities again.
Anyhow, while I’m not at liberty to explain more of my reasoning (i.e., to disclose my evidence) — Cache’ should be taken seriously as an XML DBMS alternative … even if I never can seem to get a proper DBMS briefing from them (which is far from entirely being their fault).
Posted in Hierarchies, networks, graphs, and trees, IBM and DB2, Intersystems and Cache', Mark Logic, Microsoft and SQL*Server, Native XML, Progress, Apama, and DataDirect | 1 Comment »
June 9th, 2007 Curt Monash
I have the enviable task of researching online game and virtual world technology for an upcoming Network World column. My first interview, quite naturally, was with the lead developers of a game I actually play – Guild Wars. The overview is in another post; that may provide context for this one, which focuses on the database technology. (I also did a short post just on the implications for Guild Wars players.) It also has a brief description of what Guild Wars is – namely, an MMORPG (Massively Multiplayer Online Role-Playing Game) with the unusual feature that most of the game world is instanced rather than utterly shared.
First, some scope. ArenaNet (Guild Wars’ developer, now a subsidiary of NCsoft) runs Microsoft SQL Server, mainly Enterprise Edition, having switched to SQL Server 2005 just four months ago. They run 1500-2500 transactions/second all day, spiking up to 5000 in their busiest periods. They have no full-time DBA, and when the developers started this project they didn’t know SQL. They’ve only had one major SQL Server failure in the 2+ years the game has been running, and that was (like most of their bugs) a network driver problem more than an issue with the core system.
As for what’s going on — there are a few different kinds of database things that happen in an instanced MMORPG. Read the rest of this entry »
Posted in Application areas, Games and virtual worlds, Microsoft and SQL*Server, OLTP database management | 11 Comments »
April 18th, 2007 Curt Monash
Edit: This post has largely been superseded by this more recent one defining mid-range relational DBMS.
I find myself defining a new product category – midrange OLTP/multipurpose DBMS. (Or just midrange DBMS for brevity.) Nothing earthshaking here; I’m simply referring to those products that: Read the rest of this entry »
Posted in EnterpriseDB and Postgres Plus, IBM and DB2, Ingres, Intersystems and Cache', Microsoft and SQL*Server, Mid-range DBMS, MySQL, OLTP database management, Open source RDBMS, Oracle, Progress, Apama, and DataDirect, Relational database management systems, Sybase, solidDB | 7 Comments »
March 6th, 2007 Curt Monash
I haven’t been as clear as I could have been in explaining why I think MPP/shared-nothing beats SMP/shared-everything. The answer is in a short white paper, currently bottlenecked at the sponsor’s end of the process. Here’s an excerpt from the latest draft:
There are two ways to make more powerful computers:
1. Use more powerful parts – processors, disk drives, etc.
2. Just use more parts of the same power.
Of the two, the more-parts strategy is much more cost-effective. Smaller* parts are much more economical, since the bigger the part, the harder and more costly it is to avoid defects, in manufacturing and initial design alike. Consequently, all high-end computers rely on some kind of parallel processing.
*As measured in terms of capacity, transistor count, etc., not physical size.
Read the rest of this entry »
Posted in DATAllegro, Data warehouse appliances, Data warehousing, Database theory and practice, Microsoft and SQL*Server, Netezza, Oracle, Relational database management systems, Teradata, Vertica Systems | 6 Comments »
March 1st, 2007 Curt Monash
Oracle is evidently buying Hyperion Software. Much like Gaul, Hyperion can be divided into three parts:
- Budgeting and consolidation applications, descended from the original Hyperion and Pillar.
- Essbase, the definitive MOLAP engine, descended from Arbor Software.
- A business intelligence suite, descended from Brio.
The most important part is budgeting/planning, because it could help Oracle change the rules for application software. But Essbase could be just the nudge Oracle needs to finally renounce its one-server-fits-all dogma.
Read the rest of this entry »
Posted in Data warehousing, Hierarchies, networks, graphs, and trees, MOLAP, Microsoft and SQL*Server, Oracle | 2 Comments »
February 27th, 2007 Curt Monash
The standard Clayton Christensen “Innovator’s Dilemma” disruption narrative goes something like this:
- Market leaders have many advantages, including top technology.
- Followers come up with good technology too.
- The leaders stay ahead by making their products ever better and more complex.
- The followers sell into new or non-mainstream markets, at prices the leaders can’t match. So they dominate new markets.
- Old markets turn into low-margin commodity-fests.
- Old leaders are screwed.
And it’s really hard for market leaders to avert this sad fate, because the short- and intermediate-term margin hit would be too great.
I think the OLTP DBMS market is ripe for that kind of disruption – riper than commentators generally realize. Here are some key potential drivers.
Read the rest of this entry »
Posted in ANTs Software, Data warehousing, EnterpriseDB and Postgres Plus, IBM and DB2, Intersystems and Cache', Microsoft and SQL*Server, Mid-range DBMS, MySQL, OLTP database management, Open source RDBMS, Oracle, Progress, Apama, and DataDirect, Relational database management systems | 5 Comments »
February 27th, 2007 Curt Monash
Most of what I’ve written lately about database management seems to have been focused on analytic technologies. But I have a lot to say on the OLTP (OnLine Transaction Processing) side too. So let’s start by clearing the decks. Here’s a list of some consensus views that I in essence agree with:
- Oracle is the top of the line, and has nothing wrong with it other than cost of ownership and the non-joys of doing business with Oracle Corporation.
- DB2/mainframe is a fine product, but only if you like IBM mainframes.
- DB2/open systems is another fine product, but it’s hard to think of reasons to use it over Oracle.
- Microsoft SQL Server has great cost of ownership if you’re a Windows (server) shop anyway, especially on the administrative side. It does most but not all of what Oracle does.
- Sybase Adaptive Server Enterprise is a lot like SQL Server, but without the Windows dependence or the great Microsoft tools. If you have it installed or are Chinese, you should strongly consider using it, but otherwise there are better alternatives.
- Progress’ DBMS is great if you don’t need any of the features it’s missing. Administration, for example, is a super-low-cost breeze. But why use it unless you’re also using the Progress development tools?
- Intersystems’ Cache’ is another fine mid-range product that involves buying into the vendor’s whole tool set – all the more so because it isn’t relational.
- Small-footprint embedded DBMS, from vendors such as Sybase’s iAnywhere division or Solid Information Technologies, are off in their own little world. Mainly, that world is telecom, with a satellite in medical devices, although other kinds of networked equipment also sometimes use these products.
- IBM’s non-DB2 database management products – IMS, Informix, etc. – are fine things to stick with until you have to change. Ditto products from Software AG, Computer Associates, Cincom, etc.
- MySQL Version 4 is an OLTP joke, but it’s a joke many people share. (Hey, a lot of blogs, including mine, run on WordPress and MySQL 4.)
- Until Ingres is meaningfully marketed and sold outside its installed base, it’s not worth worrying about.
- PostgreSQL is more significant as the underpinning of other products — mainly EnterpriseDB in the OLTP space — than it is in its own right.
Posted in EnterpriseDB and Postgres Plus, IBM and DB2, Ingres, Intersystems and Cache', Microsoft and SQL*Server, Mid-range DBMS, MySQL, OLTP database management, Open source RDBMS, Oracle, PostgreSQL, Products and vendors, Progress, Apama, and DataDirect, Relational database management systems, solidDB | 1 Comment »
February 1st, 2007 Curt Monash
Jim Allchin’s farewell blog post is a hoot. There’s even a bit of database stuff in it.
Posted in Humor, Microsoft and SQL*Server | No Comments »
January 31st, 2007 Curt Monash
While scattering his mother’s ashes, ironically. Let’s hope this isn’t as bad as it sounds, and that he comes home safely, soon.
Edit: As of Thursday morning, the news is bad.
Posted in Microsoft and SQL*Server | No Comments »
October 3rd, 2006 Curt Monash
Several vendors are offering links to Gartner’s new Magic Quadrant report on data warehouse DBMS. (Edit: This is now a much better link to the 2006 MQ.) Somewhat atypically for Gartner, there’s a strict hierarchy among most of the vendors, with Teradata > IBM > Oracle > Microsoft > Sybase > Kognitio > MySQL > Sand, in each case on both axes of the matrix. The only two exceptions are Netezza and DATAllegro, which are depicted as outvisioning Microsoft somewhat even as they trail both Microsoft and Sybase in execution.
Gartner Magic Quadrants tend to annoy me, and I’m not going to critique the rankings in detail. But I do think this particular MQ is helpful in framing a vendor segmentation, namely:
- Big full-spectrum MPP/shared-nothing vendors: Teradata and IBM.
- MPP/shared-nothing appliance upstarts: Netezza and DATAllegro.
- Big SMP/shared-everything vendors who also are apt to be your OLTP incumbent, and who want to integrate your software stack soup-to-nuts: Oracle and Microsoft.
- Niche vendors: Pretty much everybody else.
Read the rest of this entry »
Posted in DATAllegro, Data warehouse appliances, Data warehousing, IBM and DB2, Microsoft and SQL*Server, Netezza, Oracle, Relational database management systems, Teradata | 4 Comments »
September 27th, 2006 Curt Monash
Most of my recent data warehouse engine research has been with the specialists. But over the past couple of days I caught up with Oracle and Microsoft (IBM is scheduled for Friday). In at least three ways, it makes sense to lump those vendors together, and contrast them with the newer data warehouse appliance startups:
- Shared-everything architecture
- End-to-end solution story
- OLTP industrial-strengthness carried over to data warehousing
In other ways, of course, their positions are greatly different. Oracle may have a full order-of-magnitude lead on Microsoft in warehouse sizes, for example, and has a broad range of advanced features that Microsoft either hasn’t matched yet, or else just released in SQL Server 2005. Microsoft was earlier in pushing DBA ease as a major product design emphasis, although Oracle has played vigorous catch-up in Oracle10g.
Read the rest of this entry »
Posted in DATAllegro, Data warehouse appliances, EII, ETL, and/or EAI, IBM and DB2, Microsoft and SQL*Server, Netezza, Oracle, Relational database management systems, Teradata | 1 Comment »
July 9th, 2006 Curt Monash
A Slashdot thread tonight on the possibility of Oracle directly supporting Linux got me thinking – integration of DBMS and OS is much more common than one might at first realize, especially in high-end data warehousing.
Think about it.
- Mainframe DB2 has OS/DBMS integration.
- Teradata has OS/DBMS integration.
- Oracle, unlike other open system DBMS vendors, has always had a lot of careful integration with (or at least interfacing to) each individual OS it supports.
- Microsoft of course integrates DBMS and OS.
- The data warehousing appliance vendors integrate DBMS and OS. Stuart Frost of DATAllegro made some excellent, detailed comments in this thread laying out that case.
This trend isn’t quite universal, of course. Open systems DB2 and Sybase and Progress and MySQL and so on are quite OS-independent, and of course you could dispute my characterization of Oracle as being “integrated” with the underlying OS. But in performance-critical environments, DBMS are often intensely OS-aware.
And of course this dovetails with a point I noted in another thread – DBMS are (or need to become) increasingly aware of chip architecture details as well.
Posted in Data warehouse appliances, Microsoft and SQL*Server, Oracle, Relational database management systems | 1 Comment »