February 18, 2008

Mike Stonebraker calls for the complete destruction of the old DBMS order

Last week, Dan Weinreb tipped me off to something very cool: Mike Stonebraker and a group of MIT/Brown/Yale colleagues are calling for a complete rewrite of OLTP DBMS. And they have a plan for how to do it, called H-Store, as per a paper and an associated slide presentation.


February 16, 2008

Mike Stonebraker’s DBMS taxonomy

In a response to my recent five-part series on DBMS diversity, Mike Stonebraker has proposed his own taxonomy of data management technologies over on Vertica’s Database Column blog. (Edit: Some good stuff disappeared when Vertica nuked that blog.)

  1. OLTP DBMSs focused on fast, reliable transaction processing
  2. Analytic/Data Warehouse DBMSs focused on efficient load and ad-hoc query performance
  3. Science DBMSs — after all MatLab does not scale to disk-sized arrays
  4. RDF stores focused on efficiently storing semi-structured data in this format
  5. XML stores focused on semi-structured data in this format
  6. Search engines — the big players all use proprietary engines in this area
  7. Stream Processing Engines focused on real-time StreamSQL
  8. “Lean and Mean,” less-than-a-database engines focused on doing a small number of things very well (embedded databases are probably in this category)
  9. MapReduce and Hadoop — after all Google has enough “throw weight” to define a category

He goes on to say that each will be architected differently, except that, as he already convinced me back in July, RDF will be well-managed by specialty data warehouse DBMS.

February 15, 2008

Database management system choices – beyond relational

This is the fifth of a five-part series on database management system choices. For the first post in the series, please click here.

Relational database management systems have three essential elements:

  1. Rows and columns. Theoretically, rows and columns may be inessential to the relational model. But in reality, they are built into the design of every real-world relational product. If you don’t have rows and columns, you’re not using the product to do what it was well-designed for.
  2. Predicate logic. Theoretically, everything can be fitted into a predicate Procrustean bed. But if you’re looking for relevancy rankings on a text search, binary logic is a highly convoluted way to get them.
  3. Fixed schemas. Database theorists commonly assume that databases have fixed schemas. If this means that 90%+ of all information is null or missing, they have elegant ways of dealing with that. Even so, as computing gets ever more concerned with individuals — each with his/her/its unique “profile(s)” — fixed schemas get ever harder to maintain.

If any of these three elements is missing or inappropriate, then a traditional relational database management system may not be the best choice.


February 15, 2008

Database management system choices — mid-range-relational

This is the fourth of a five-part series on database management system choices. For the first post in the series, please click here.

The other threat to the high-end relational DBMS vendors aims squarely at the heart of their business. It’s the mid-range relational database management systems, which are doing an ever-larger fraction of what their high-end cousins can. That said, different products do different things well. So if you’re not blindly paying up for the security of an all-things-to-all-people high-end DBMS, there are a number of factors you might want to consider.


February 15, 2008

Database management system choices – relational data warehouse

This is the third of a five-part series on database management system choices. For the first post in the series, please click here.

High-end OLTP relational database management system vendors try to offer one-stop shopping for almost all data management needs. But as I noted in my prior post, their product category is facing two major competitive threats. One comes from specialty data warehouse database management system products. I’ve covered those extensively in this blog, with key takeaways including:

Let me expand on that last point. Different features may or may not be important to you, depending on whether your precise application needs include:

February 15, 2008

Database management system choices – 4 categories of relational

This is the second of a five-part series on database management system choices. For the first post in the series, please click here.

For the most part, relational database management systems divide into four major classes:


February 15, 2008

Database management system choices — overview

This is the first in a 5-part series of posts on data management product choices. By pre-arrangement, Mike Stonebraker is responding on The Database Column, starting with his own taxonomy of DBMS types.

In the 1990s, most database management experts believed that a single general-purpose DBMS could meet substantially all needs. If you just kept adding in enough datatypes and data access methods (e.g., specialized indexes), your DBMS could eventually do a good job of meeting almost any requirement. And so, from the late 1990s into the beginning of this decade, it seemed that technology was supporting business trends, and the DBMS industry was inexorably consolidating. There was an oligopoly of high-end vendors, who sold increasingly similar super-sophisticated database management systems. Nothing else in database management seemed to matter.

Well, we were wrong. The big thing we overlooked is that database optimizations go down to the level of actual storage.

February 14, 2008

EnterpriseDB on Elastra, early stages

I finally caught up with Bob Zurek about EnterpriseDB’s foray into the Elastra cloud. Here are some highlights:

February 11, 2008

eBay is over 5 petabytes now

Single largest database >1.4 petabytes.

From Oliver Ratzesberger’s LinkedIn profile:

Our systems process in excess of 10 billion records per day, serving thousands of users and delivering hundreds of millions of queries per month in a true global 24×7 operation with distributed teams around the globe on systems over 5 PB in size (largest single system >1.4PB).

February 8, 2008

Load speeds and related issues in columnar DBMS

Please do not rely on the parts of the post below that are about ParAccel. See our February 18 post about ParAccel instead.

I’ve already posted about a chat I had yesterday with Mike Stonebraker regarding Vertica. I naturally raised the subject of load speed, unaware that Mike’s colleague Stan Zdonik had posted at length about load speed the day before. Given that post, it seems timely to go into a bit more detail, and in particular to address three questions:

  1. Can columnar DBMS do operational BI?
  2. Can columnar DBMS do ELT (Extract-Load-Transform, as opposed to ETL)?
  3. Are columnar DBMS’ load speeds a problem other than in issues #1 and #2?

