Analysis of open source database management system PostgreSQL and other products in the PostgreSQL ecosystem. Related subjects include:

September 13, 2009


Despite a thoughtful heads-up from Daniel Abadi at the time of his original posting about HadoopDB, I’m just getting around to writing about it now. HadoopDB is a research project carried out by a couple of Abadi’s students. Further research is definitely planned. But it seems too early to say that HadoopDB will ever get past the “research and oh by the way the code is open sourced” stage and become a real code line — whether commercialized, open source, or both.

The basic idea of HadoopDB is to put copies of a DBMS at different nodes of a grid, and use Hadoop to parcel work among them. Major benefits when compared with massively parallel DBMS are said to be:

HadoopDB has actually been built with PostgreSQL. That version achieved performance well below that of a commercial DBMS “DBX”, where X=2. Column-store guru Abadi has repeatedly signaled his intention to try out HadoopDB with VectorWise at the nodes instead. (Recall that VectorWise is shared-everything.) It will be interesting to see how that configuration performs.

The real opportunity for HadoopDB, however, in my opinion may lie elsewhere. Read more

September 10, 2009

What could or should make Oracle/MySQL antitrust concerns go away?

When the Oracle/MySQL deal was first announced, I wrote:

I can probably come up with business practices that could make things very hard on Oracle/MySQL competitors … but I haven’t found a compelling antitrust trigger on my first pass over the subject.

Subsequently, there’s been a lot of discussion about whether or not Oracle can use control of MySQL to make life difficult for third-party MySQL storage engine vendors.

Now that the European Commission is delaying the Oracle/Sun deal, explicitly because of Oracle/MySQL antitrust fears.  That is, the European Commission wants to be reassured that an Oracle takeover of MySQL won’t unduly impinge upon the future availability of open source/low cost DBMS alternatives.  This raises that natural question:

What could Oracle do to assure concerned parties that its ownership of MySQL won’t unduly hamper open-source-based DBMS competition?

I think that’s indeed the crucial question. The Oracle/Sun deal has enough momentum at this point that it both should and will be allowed to happen — perhaps with safeguards — rather than banned outright. If  you have concerns about Oracle’s pending acquisition of MySQL, you should speak up and outline what kinds of regulatory safeguards would alleviate the problems you foresee.

More or less obvious possibilities include:

September 3, 2009

Continuent on clustering

Robert Hodges, CTO of my client Continuent, put up a blog post laying out his and Continuent’s views on database clustering. Continuent offers Tungsten, its third try at database clustering technology, targeted at MySQL, PostgreSQL, and perhaps Oracle. Unlike Continuent’s more ambitious. second-generation product, Tungsten offers single-master replication, which in Robert’s view allows for great ease of deployment and administration (he likes the phrase “bone-simple”).

The downside to Continuent Tungsten ‘s stripped down architecture is that it doesn’t solve the most extreme performance scale-out problems. Instead, Continuent focuses on the other big benefits of keeping your data in more than one place, namely high availability and data loss prevention (i.e., backup).

Continuent has been around for a number of years, starting out in Finland but now being based in Silicon Valley. For most purposes, however, it’s reasonable to think of Continuent and Tungsten as start-up efforts.

As you might guess from the references to Finland and MySQL, Continuent’s products are open source, or at least have open source versions. I’m still a little fuzzy as to which features are open sourced and which are not. For that matter, I’m still unclear as to Tungsten’s feature list overall …

July 29, 2009

What are the best choices for scaling Postgres?

March, 2011 edit: In its quaintness, this post is a reminder of just how fast Short Request Processing DBMS technology has been moving ahead.  If I had to do it all over again, I’d suggest they use one of the high-performance MySQL options like dbShards, Schooner, or both together.  I actually don’t know what they finally decided on in that area. (I do know that for analytic DBMS they chose Vertica.)

I have a client who wants to build a new application with peak update volume of several million transactions per hour.  (Their base business is data mart outsourcing, but now they’re building update-heavy technology as well. ) They have a small budget.  They’ve been a MySQL shop in the past, but would prefer to contract (not eliminate) their use of MySQL rather than expand it.

My client actually signed a deal for EnterpriseDB’s Postgres Plus Advanced Server and GridSQL, but unwound the transaction quickly. (They say EnterpriseDB was very gracious about the reversal.) There seem to have been two main reasons for the flip-flop.  First, it seems that EnterpriseDB’s version of Postgres isn’t up to PostgreSQL’s 8.4 feature set yet, although EnterpriseDB’s timetable for catching up might have tolerable. But GridSQL apparently is further behind yet, with no timetable for up-to-date PostgreSQL compatibility.  That was the dealbreaker.

The current base-case plan is to use generic open source PostgreSQL, with scale-out achieved via hand sharding, Hibernate, or … ??? Experience and thoughts along those lines would be much appreciated.

Another option for OLTP performance and scale-out is of course memory-centric options such as VoltDB or the Groovy SQL Switch.  But this client’s database is terabyte-scale, so hardware costs could be an issue, as of course could be product maturity.

By the way, a large fraction of these updates will be actual changes, as opposed to new records, in case that matters.  I expect that the schema being updated will be very simple — i.e., clearly simpler than in a classic order entry scenario.

June 5, 2009

Greenplum update — Release 3.3 and so on

I visited Greenplum in early April, and talked with them again last night. As I noted in a separate post, there are a couple of subjects I won’t write about today. But that still leaves me free to cover a number of other points about Greenplum, including: Read more

May 22, 2009

Yet more on MySQL forks and storage engines

The issue of MySQL forks and their possible effect on closed-source storage engine vendors continues to get attention.  The underlying question is:

Suppose Oracle wants to make life difficult for third-party storage engine vendors via its incipient control of MySQL?  Can the storage engine vendors insulate themselves from this risk by working with a MySQL fork?

Read more

April 20, 2009

First thoughts on Oracle acquiring Sun

More later.  I have a radio interview in a few minutes on a very different subject.

April 2, 2009

Ingres update

I talked with Ingres today. Much of the call was fluff — open-source rah-rah, plus some numbers showing purported success, but so finely parsed as to be pretty meaningless. (To Ingres’ credit, they did offer to let me talk w/ their CFO, even if they offered no promises as to whether he’d offer any more substantive information.) Highlights included: Read more

March 18, 2009

Database implications if IBM acquires Sun

Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion today (some links below). Here are some quick thoughts around the subject of how the IBM/Sun deal — if it happens — might affect the database management system industry. Read more

December 29, 2008

ParAccel actually uses relatively little PostgreSQL code

I often find it hard to write about ParAccel’s technology, for a variety of reasons:

ParAccel is quick, however, to send email if I post anything about them they think is incorrect.

All that said, I did get careless when I neglected to doublecheck something I already knew. Read more

← Previous PageNext Page →

Feed: DBMS (database management system), DW (data warehousing), BI (business intelligence), and analytics technology Subscribe to the Monash Research feed via RSS or email:


Search our blogs and white papers

Monash Research blogs

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.