December 29th, 2007 Curt Monash
I’ve been a DBMS analyst since before there were cost-based optimizers or, for that matter, a whole lot of relational DBMS. And in the 20 years that optimizers have been around, I’ve never fully understood why they’re so simple-minded. Even so, I think they’re pretty cool, as per the fanboyish discussion in this 2004 Computerworld column.
So I’m delighted to see that the Oracle folks have started a hardcore blog on optimizer details. If you want to get a sense of how smart a leading DBMS is or isn’t, I encourage you to check it out.
Posted in OLTP database management, Oracle, Relational database management systems | 1 Comment »
December 21st, 2007 Curt Monash
IBM is acquiring Solid Information Technology, makers of solidDB. Some quick comments:
- solidDB is actually a very interesting hybrid disk/in-memory memory-centric database management system. However, the press release announcing the deal makes it sound as if solidDB is in-memory only.
- That strongly suggests that IBM is buying Solid mainly to compete with Oracle TimesTen. As of last June, solidDB was already IBM’s TimesTen answer via a partnership; this deal just solidifies that arrangement.
- This probably isn’t good news for Solid’s MySQL engine. That’s a pity, since solidDB technically has the potential to be the best MySQL engine around.
- Notwithstanding IBM’s presumed intentions, Solid’s main market success historically is as an embedded system in telecommunications equipment, network software, and similar systems.
- Last year I wrote a white paper on memory-centric data management, showcasing four products. IBM now has bought two of them, namely Solid’s and Applix’s (via Cognos).
- Comparisons to IBM’s embedded Java DBMS Cloudscape are pointless. That’s just a failed product vs. solidDB or Sybase SQL Anywhere, and IBM long ago cut its losses.
Read the rest of this entry »
Posted in Cognos and Applix TM1, IBM and DB2, Memory-centric data management, OLTP database management, Sybase, solidDB | 3 Comments »
December 18th, 2007 Curt Monash
Elastra is a startup offering MySQL and PostgreSQL SaaS instances in the Amazon S3/EC2 cloud. On their board is John Hummer, which I generally regard as a good thing, although it’s hardly a guarantee of success.* High Scalability raises some doubts about Elastra’s pricing, but I think that may be missing the point. Read the rest of this entry »
Posted in Amazon, SimpleDB, and S3, Cloud computing, Elastra, MySQL, OLTP database management, Open source RDBMS, PostgreSQL, SaaS | 2 Comments »
December 18th, 2007 Curt Monash
I’ve posted several times about Amazon as an innovative, super-high-end user: doing transactional object caching with ObjectStore, building an in-house less-than-DBMS called Dynamo, or just generally adopting a very DBMS2-like approach to data management. Now Amazon is bringing the Dynamo idea to the public, via a SaaS offering called SimpleDB. (Hat tip to Tim Anderson.)
SimpleDB is obviously meant to be a data server for online applications. There are no joins, and queries don’t run over 5 seconds, so serious analytics are out of the question. Domains are limited to 10GB for now, so extreme media file serving also isn’t what’s intended; indeed, Amazon encourages one to use SimpleDB to store pointers to larger objects stored as files in Amazon S3.
On the other hand, if you think of SimpleDB as an OLTP DBMS, your head might explode. There’s no sense of transaction, no mechanisms to help with integrity, no way to do arithmetic, and indeed no assurance that writes will be immediately reflected in reads. Read the rest of this entry »
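To make the last two points concrete, here is a minimal Python sketch of a store in which writes are not immediately visible to reads, and in which an item holds a pointer to a large object rather than the object itself. It is a toy model, not SimpleDB's API: the class `ToyEventuallyConsistentStore`, its methods, and the `s3_key` attribute name are all hypothetical names chosen for illustration.

```python
class ToyEventuallyConsistentStore:
    """Toy model: writes land on one copy, reads are served from another."""

    def __init__(self):
        self._primary = {}  # receives writes immediately
        self._replica = {}  # serves reads; lags until propagate() runs

    def put(self, item_name, attributes):
        self._primary[item_name] = dict(attributes)

    def get(self, item_name):
        # May return stale data (or None) until propagation has happened.
        value = self._replica.get(item_name)
        return dict(value) if value is not None else None

    def propagate(self):
        # Stand-in for background replication between copies.
        self._replica = {k: dict(v) for k, v in self._primary.items()}


store = ToyEventuallyConsistentStore()
# Store a pointer to the large file (hypothetical key), not the file itself.
store.put("video-42", {"title": "Demo", "s3_key": "media/video-42.mp4"})
before = store.get("video-42")  # the write is not yet visible to readers
store.propagate()
after = store.get("video-42")   # now the read reflects the write
```

An application built on such a store has to tolerate the gap between `put` and a later `get`, which is exactly why it makes a poor fit for classic OLTP.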
Posted in Amazon, SimpleDB, and S3, Cloud computing, Database theory and practice, OLTP database management, SaaS | 2 Comments »
December 17th, 2007 Curt Monash
Every few months I try to make contact with InterSystems. Sometimes they graciously respond, promising to schedule a briefing, which then never happens. Other times they don’t even bother. Now, on one level I can’t blame them, based on what happened at my last briefing. Read the rest of this entry »
Posted in Hierarchies, networks, graphs, and trees, Objects | 5 Comments »
December 17th, 2007 Curt Monash
Maybe I was just in an odd mood, but I laughed for a LONG time at this cautionary tale.
Hat tip to Dan Weinreb, who introduced me to the site by sending over a link to this funny cartoon, in response to this filksong.
Posted in Humor | No Comments »
December 14th, 2007 Curt Monash
There are at least 16 different vendors offering appliances and/or software that do database management primarily for analytic purposes.* That’s a lot to keep up with, so I’ve thrown together a little overview of the analytic data management landscape, liberally salted with links to information about specific vendors, products, or technical issues. In some ways, this is a companion piece to my prior post about data warehouse appliance myths and realities.
*And that’s just the tabular/alphanumeric guys. Add in text search and you run the total a lot higher.
Numerous data warehouse specialists offer traditional row-based relational DBMS architectures, but optimize them for analytic workloads. These include Teradata, Netezza, DATAllegro, Greenplum, Dataupia, and SAS. All of those except SAS are wholly or primarily vendors of MPP/shared-nothing data warehouse appliances. EDIT: See the comment thread for a correction re Kognitio.
Numerous data warehouse specialists offer column-based relational DBMS architectures. These include Sybase (with the Sybase IQ product, originally from Expressway), Vertica, ParAccel, Infobright, Kognitio (formerly White Cross), and Sand. Read the rest of this entry »
Posted in Analytics and analytic technologies, Cognos and Applix TM1, DATAllegro, Data warehouse appliances, Data warehousing, Dataupia, Greenplum, IBM and DB2, Kognitio and WX2, Netezza, Oracle, ParAccel, Relational database management systems, SAS Institute, Sybase, Teradata, Vertica Systems | 10 Comments »
December 8th, 2007 Curt Monash
Since I was researching Software AG anyway, I took the opportunity to ask about Software AG’s native XML DBMS Tamino, which certainly has some fans. Jim Fowler, Software AG’s Director of Market Development, Enterprise Transaction Systems, was kind enough to write up the following for me:
As you know, when Tamino was released in the late 1990s it was one of the first – if not the first – commercially available native XML database. We now have several hundred Tamino customers worldwide, and Software AG is fully committed to supporting our customers.
At the same time, we recognize that XML has matured and evolved in many different directions during the past decade;
Read the rest of this entry »
Posted in Data types, Native XML, Software AG and ADABAS | No Comments »
December 8th, 2007 Curt Monash
The two oldest major software products companies may well both be German – SAP and Software AG. They’re both a little older than CA (which, directly or indirectly, has bought most of the other pioneers), Information Builders, or SAS, none of which – if I recall correctly – was founded before 1975-6.
In its current configuration, Software AG is based in Germany, publicly traded, and divided into two divisions:
- ETS (Enterprise Transaction Systems), perhaps better thought of as “Software AG Classic.” This is a 350 million Euros business, solidly profitable and still growing, albeit slowly.
- WebMethods, a SOA/integration division named after the biggest of the acquisitions it’s built from. This is a 100 million Euros business growing Very Fast.
The ETS folks briefed me last week. Highlights follow. I also posted about Software AG’s history over on Software Memories, which may provide some useful detail and context. Read the rest of this entry »
Posted in OLTP database management, Software AG and ADABAS | 3 Comments »
December 7th, 2007 Curt Monash
The proximate cause for today’s flurry of Netezza-related posts is that the company has finally rolled out its compression story. In a nutshell, Netezza has developed its own version of columnar delta compression, slated to ship in May 2008. It compresses 2-5X, with the factor sometimes going up into double digits. Netezza estimates this produces a 2-3X improvement in overall performance, with the core marketing claim being that performance will “double” from compression alone. Read the rest of this entry »
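For readers unfamiliar with the general idea, here is a minimal Python sketch of delta encoding applied to one column of values. It illustrates the generic technique only and makes no claim about how Netezza's version works; the function names and the sample data are assumptions made up for the example.

```python
def delta_encode(values):
    """Keep the first value as a base, then store successive differences."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original column with a running sum over the deltas."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Hypothetical column of nearly sequential IDs.
column = [1000000, 1000001, 1000003, 1000004]
encoded = delta_encode(column)
decoded = delta_decode(encoded)

# Bits needed for the widest raw value vs. the widest delta after the base.
raw_bits = max(v.bit_length() for v in column)
delta_bits = max(d.bit_length() for d in encoded[1:])
```

When neighboring values in a column are close together, the deltas fit in far fewer bits than the raw values, which is where the space savings come from. Actual ratios depend heavily on the data.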
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Database compression, Database theory and practice, Netezza, Relational database management systems | No Comments »
December 7th, 2007 Curt Monash
In 1993, Ted Codd introduced the term OLAP (OnLine Analytic Processing) to describe data management that wasn’t optimized for OLTP (OnLine Transaction Processing). Later in the 1990s, Henry Morris of IDC introduced the term analytic applications to describe apps that weren’t transactional. Since then, no better word than “analytic” has emerged to cover the broad class of IT apps and technologies that aren’t focused on transactional processing.
In the latest incarnation, analytic appliances are coming to the fore. Read the rest of this entry »
Posted in Analytics and analytic technologies, Data warehouse appliances, Netezza, Relational database management systems, Vertica Systems | No Comments »
December 7th, 2007 Curt Monash
I’ve bashed Netezza repeatedly for secrecy and obscurity about its technology and technical plans. Well, they’re getting a lot better. The latest post in a Netezza company blog, by marketing exec Phil Francisco, lays out their story clearly and concisely. And it’s backed up by a white paper that does more of the same. In particular, Page 11 of that white paper spells out possible future directions for enhancement, such as better compression, encryption, join filtering, and Netezza Developer Network stuff. Read the rest of this entry »
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Netezza, Relational database management systems | 2 Comments »
December 7th, 2007 Curt Monash
I talked with Netezza today, and finally understand better why they don’t have node-to-node data shipping problems with only 1-gigabit (gigE) interconnects:
- Netezza boxes have lots of relatively small nodes, so all else being equal, each individual node has less communicating to do than, say, a DATAllegro node does.
- It’s not just 1-gigabit. There’s a hierarchical communications architecture, and at one level in the hierarchy switches are talking to each other through 32 parallel 1-gigabit channels at a time.
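As a back-of-the-envelope check on the second point, the aggregate bandwidth of those parallel channels is simple arithmetic. The Python sketch below assumes every channel runs at its full nominal rate and ignores protocol overhead, so treat the result as an upper bound rather than a measured figure.

```python
CHANNELS = 32              # parallel channels between switches (from the text)
GBIT_PER_CHANNEL = 1       # nominal gigabit Ethernet rate per channel

# Upper bound on switch-to-switch throughput at that level of the hierarchy.
aggregate_gbit_per_sec = CHANNELS * GBIT_PER_CHANNEL

# 8 bits per byte; decimal gigabytes.
aggregate_gbyte_per_sec = aggregate_gbit_per_sec / 8

print(f"{aggregate_gbit_per_sec} Gbit/s, about {aggregate_gbyte_per_sec} GB/s")
```

So the switch-level links offer up to 32 times the bandwidth that a single gigE connection would suggest.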
Posted in Data warehouse appliances, Netezza | No Comments »
December 5th, 2007 Curt Monash
Quite a bit of DBMS plug-compatibility is being claimed these days. Lewis Cunningham’s post on a few new EnterpriseDB features illustrates just how picky compatibility features can get. One can run Oracle code but not get around to handling comments properly? Sheesh.
Posted in EnterpriseDB and Postgres Plus, Oracle, Portability, transparency, and plug-compatibility, Relational database management systems | No Comments »
December 5th, 2007 Curt Monash
I’m going to praise EnterpriseDB’s marketing communications twice in two blog posts, because I really liked some of the crunch they put into a press release announcing a MySQL replacement at FortiusOne. To wit (emphasis mine):
The PostGIS geospatial extensions to PostgreSQL played a key role in FortiusOne’s selection of EnterpriseDB Advanced Server, a PostgreSQL-based solution, and dramatically improved performance. FortiusOne needed to run complex spatial queries against large datasets quickly and efficiently, and found the MySQL spatial extensions to be far less complete and comprehensive than PostGIS. EnterpriseDB Advanced Server processes some of GeoCommons’ database-intensive rendering requests in one-thirtieth of the time required by MySQL. During peak loads, GeoCommons processes more than one hundred thousand complex requests per hour, requiring true enterprise-class performance and scalability.
Another major factor in FortiusOne’s replacement of MySQL with EnterpriseDB Advanced Server was the company’s need for advanced partitioning, custom triggers, and functional indexing. EnterpriseDB’s advanced partitioning capabilities instantly enabled linear performance, even with tables having billions of rows.
Read the rest of this entry »
Posted in Data types, EnterpriseDB and Postgres Plus, GIS and geospatial, MySQL | 10 Comments »
December 5th, 2007 Curt Monash
Ashlee Vance discovered that EnterpriseDB had shot its field sales force, and opined that EnterpriseDB might generally be in trouble. EnterpriseDB CEO Andy Astor and marketing exec Derek Rodner responded quickly in their respective blogs. Andy and I also talked on the phone.
As best as I can tell, here’s what’s actually going on: Read the rest of this entry »
Posted in EnterpriseDB and Postgres Plus, Open source RDBMS, Relational database management systems | 2 Comments »
December 3rd, 2007 Curt Monash
Borrowing the “Fact or fiction?” meme from the sports world:
- Data warehouse appliances have to have specialized hardware. Fiction. Indeed, most contenders except Teradata and Netezza — for example, DATAllegro, Vertica, ParAccel, Greenplum, and Infobright — offer Type 2 appliances. (Dataupia is another exception.)
- Specialized hardware is a dead-end for data warehouse appliances. Fiction. If it were easy for Teradata to replace its specialized switch technology, it would have done so a decade ago. And Netezza’s strategy has a lot of appeal.
- Data warehouse appliances are nothing new, and failed long ago. Fiction, but only because of Teradata. 1980s appliance pioneer Britton-Lee didn’t do so well (it was actually bought by Teradata). IBM and ICL (Britain’s national-champion hardware company) had content-addressable data store technology that went nowhere.
- Since data warehouse appliances failed long ago, they’ll fail now too. Fiction. Shared-nothing MPP is a fundamental advantage of appliances. So are various index-light strategies. Data warehouse appliances are here to stay.
- Data warehouse appliances only make sense if your main database management system can’t handle the job. Fiction. There are dozens of data warehouse appliances managing under 5 terabytes of user data, if not under 1 terabyte. True, some of them are legacy installations, dating back to when Oracle couldn’t handle that much data well itself. But new ones are still going in. Even if Oracle or Microsoft SQL Server can do the job, a data warehouse appliance is often a far superior — cheaper, easier to deploy and keep running, and/or better performing — alternative.
- Data warehouse appliances are just for data marts. For your full enterprise data warehouse, use a conventional DBMS. Part fact, part fiction. It depends on the appliance, and on the complexity of your needs. Teradata systems can do pretty much everything. Netezza and DATAllegro, two of the oldest data warehouse appliance startups, have worked hard on their concurrency issues and now can support fairly large user or reporting loads. They also can handle reasonable volumes of transactional or trickle-feed updates, and probably can support full EDW requirements for decent-sized organizations. Even so, there are some warehouse use cases for which they’re ill-suited. Newer appliance vendors are more limited yet.
- Analytic appliances are just renamed data warehouse appliances. Fact, even if misleading. Netezza is using the term “analytic appliance” to highlight additional things one can do on its boxes beyond answering queries. But those are still operations on a data mart or data warehouse. And Vertica is using the term “analytic appliance” to mean exactly what “data warehouse” means.
- Teradata is the leading data warehouse appliance vendor. More fact than fiction. Some observers say that Teradata systems aren’t data warehouse appliances. But I think they are. Competitors may be superior to Teradata in one or the other characteristic trait of appliances – e.g., speed of installation – but it’s hard to define “appliances” in an objective way that excludes Teradata.
If you liked this post, you might also like one on text mining fact and fiction.
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Relational database management systems | 3 Comments »
December 2nd, 2007 Curt Monash
Amazon has a very decentralized technical operation. But even the individual pieces have interestingly huge scale. Thus, various different things they’re doing are of interest.
They recently presented a research paper on a high-performance transactional system called Dynamo. (Hat tip to Dare Obasanjo.) A key point is the following:
There are many services on Amazon’s platform that only need primary-key access to a data store. For many services, such as those that provide best seller lists, shopping carts, customer preferences, session management, sales rank, and product catalog, the common pattern of using a relational database would lead to inefficiencies and limit scale and availability. Dynamo provides a simple primary-key only interface to meet the requirements of these applications.
Now, I don’t think too many organizations past Amazon are going to decide that they can’t afford the overhead of an RDBMS for such OLTP-like applications. But I do think it will become increasingly common to find other reasons to eschew traditional OLTP relational architectures. Maybe you’ll want the schema flexibility of XML. Or perhaps you’ll be happy with a fixed relational schema, but will want to optimize for analytic performance.
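To show how little interface the quoted passage is asking for, here is a minimal Python sketch of a primary-key-only store. It is not Dynamo's actual API; the class `PrimaryKeyStore`, its two methods, and the sample keys are hypothetical, and the sketch ignores replication, versioning, and everything else that makes the real system interesting.

```python
from typing import Any, Dict, Optional


class PrimaryKeyStore:
    """Toy store exposing only lookups and writes by primary key."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Whole-value write; there are no partial updates, joins, or queries.
        self._items[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Returns None when the key is absent.
        return self._items.get(key)


carts = PrimaryKeyStore()
carts.put("customer:1001", {"items": ["book-17", "dvd-3"]})
cart = carts.get("customer:1001")
missing = carts.get("customer:9999")
```

A shopping cart or session record fits this shape naturally: the application already knows the key, and it reads or replaces the whole value in one step.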
Posted in Amazon, SimpleDB, and S3, Cloud computing, Database diversity, Database theory and practice, OLTP database management | No Comments »