Infobright

Analysis of Infobright and its MySQL-based data warehouse DBMS formerly known as Brighthouse. Related subjects include:

June 25, 2009

My current customer list among the analytic DBMS specialists

(This is an updated version of an August, 2008 post.)

One of my favorite pages on the Monash Research website is the list of many current and a few notable past customers. (Another favorite page is the one for testimonials.) For a variety of reasons, I won’t undertake to be more precise about my current customer list than that. But I don’t think it would hurt anything to list the analytic/data warehouse DBMS/appliance specialists in the group. They are:

All of those are Monash Advantage members.

If you care about all this, you may also be interested in the rest of my standards and disclosures.

June 7, 2009

Daniel Abadi on Kickfire and related subjects

Daniel Abadi has a new blog, whose first post centers around Kickfire.  The money quote is (emphasis mine):

In order for me to get excited about Kickfire, I have to ignore Mike Stonebraker’s voice in my head telling me that DBMS hardware companies have been launched many times in the past and ALWAYS fail (the main reasoning is that Moore’s law allows for commodity hardware to catch up in performance, eventually making the proprietary hardware overpriced and irrelevant). But given that Moore’s law is transforming into increased parallelism rather than increased raw speed, maybe hardware DBMS companies can succeed now where they have failed in the past.

Good point.

More generally, Abadi speculates about the market for MySQL-compatible data warehousing.  My responses include:

Anyhow, as previously noted, I’m a big Daniel Abadi fan. I look forward to seeing what else he posts in his blog, and am optimistic he’ll live up to or exceed its stated goals.

April 20, 2009

This week is a REALLY good time to actively strengthen the MySQL forkers

As my first three posts on the Oracle/Sun merger suggested, I think Oracle will do a better job with MySQL product development than Sun has.  But of course that’s a low hurdle.  And so it leaves open the questions:

What should and/or will be the most widely adopted code lines of MySQL (or other open source DBMS), especially for the types of users and vendors who are engaged with MySQL (as opposed to principal alternative PostgreSQL) today?

As much as I’ve bashed MySQL/MyISAM and MySQL/InnoDB for being low-quality general-purpose DBMS, I’d still hate to see MySQL-based development stall out. There are a number of MySQL engine providers with rather unique technology, that deserve a good front-end partner to build their products with.  The high-volume sharding guys deserve the chance to continue down their current path as well.  And so does the low-end mass market — although I’m least worried about them, as I can’t imagine any realistic scenario in which Oracle doesn’t offer a version of MySQL fully suited to support 10s of millions of WordPress and Joomla installations.

So far as I can tell, there are only four real and currently active candidates for MySQL code coordinator:

Patrick Galbraith and Steven Vaughan-Nichols did good jobs of illustrating the turmoil.

Oracle isn’t a very comfortable partner long term for the storage engine vendors, and Drizzle doesn’t seem to be what they need. So I think that Infobright, Kickfire, Tokutek, Calpont, et al. need to get aligned in a hurry with an outside MySQL provider such as Percona or MariaDB or a newcomer, preferably all with the same one.  Yes, I understand that Infobright is getting a lot of marketing help from Sun these days, that Kickfire just got a nice-sounding Sun marketing announcement as well, and so on. But the time to start working toward the inevitable future is now.

And by “now” I mean “right now,” since the MySQL community is at this moment gathered together for its annual conference.

April 20, 2009

MySQL storage engine round-up, with Oracle-related thoughts

Here’s what I know about MySQL storage engines, more or less.

April 20, 2009

Infobright update

For the past couple of quarters, Infobright has been MySQL’s partner of choice for larger data warehousing applications. Infobright’s stated business metrics, and I quote, include:

  • > 50 Customers in 7 Countries

  • > 25 Partners on 4 continents

  • A vibrant open source community

    • +1 million visitors

    • Approaching 10,000 downloads

    • 2,000 active community participants

These may be compared with analogous metrics Infobright offered in February.

Infobright has also made or promised a variety of technological enhancements. Ones that are either shipping now or promised soon include:

Read more

March 18, 2009

Database implications if IBM acquires Sun

Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion today (some links below). Here are some quick thoughts around the subject of how the IBM/Sun deal — if it happens — might affect the database management system industry.

Read more

February 12, 2009

Infobright update

Infobright briefed me, and I thought it would be best to invite them to provide a write-up themselves of what customer and other information they did and didn’t want to disclose, for me to publish. Read more

February 4, 2009

Draft slides on how to select an analytic DBMS

I need to finalize an already-too-long slide deck on how to select an analytic DBMS by late Thursday night.  Anybody see something I’m overlooking, or just plain got wrong?

Edit: The slides have now been finalized.

January 27, 2009

Introduction to Pentaho

I finally caught up with Pentaho, which along with Jaspersoft is one of the two most visible open source business intelligence companies, Actuate perhaps excepted. Highlights included:

Read more

September 22, 2008

Database compression is heavily affected by the kind of data

I’ve written often of how different kinds or brands of data warehouse DBMS get very different compression figures. But I haven’t focused enough on how much compression figures can vary among different kinds of data. This was really brought home to me when Vertica told me that web analytics/clickstream data can often be compressed 60X in Vertica, while at the other extreme — some kind of floating point data, whose details I forget for now — they could only do 2.5X. Edit: Vertica has now posted much more accurate versions of those numbers. Infobright’s 30X compression reference at TradeDoubler seems to be for a clickstream-type app. Greenplum’s customer getting 7.5X — high for a row-based system — is managing clickstream data and related stuff. Bottom line:

When evaluating compression ratios — especially large ones — it is wise to inquire about the nature of the data.
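The effect is easy to reproduce on a small scale. The following is a minimal sketch, not a benchmark of any product mentioned here: it uses Python's general-purpose zlib compressor (none of these vendors' actual schemes), and the clickstream-like rows and the random floating point column are both synthetic assumptions made up for illustration.

```python
import random
import struct
import zlib


def compression_ratio(raw: bytes) -> float:
    """Return uncompressed size divided by zlib-compressed size."""
    return len(raw) / len(zlib.compress(raw, 9))


def synthetic_clickstream(rows: int, rng: random.Random) -> bytes:
    """Build repetitive, low-cardinality log lines (hypothetical schema)."""
    pages = ["/home", "/search", "/product", "/cart", "/checkout"]
    agents = ["Mozilla/5.0 (Windows)", "Mozilla/5.0 (Macintosh)"]
    lines = [
        f"2008-09-22,{rng.choice(pages)},{rng.choice(agents)},200"
        for _ in range(rows)
    ]
    return "\n".join(lines).encode("ascii")


def synthetic_floats(rows: int, rng: random.Random) -> bytes:
    """Pack uniformly random doubles, whose mantissa bits are close to noise."""
    return struct.pack(f"<{rows}d", *(rng.random() for _ in range(rows)))


rng = random.Random(0)  # fixed seed so repeated runs see the same data
clickstream_ratio = compression_ratio(synthetic_clickstream(20_000, rng))
float_ratio = compression_ratio(synthetic_floats(20_000, rng))

print(f"clickstream-like: {clickstream_ratio:.1f}X")
print(f"random floats:    {float_ratio:.1f}X")
```

The absolute numbers mean little, since real clickstream data and real columnar compressors both differ from this toy. The gap between the two figures is the point: the same compressor does far better on repetitive, low-cardinality values than on near-random floating point values.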

September 22, 2008

Web analytics — clickstream and network event data

It should surprise nobody that web analytics – and specifically clickstream data — is one of the biggest areas for high-end data warehousing. For example:

Read more

September 15, 2008

Infobright update

In connection with the announcements that:

I got my first real Infobright update since January. Highlights included:

Read more

September 15, 2008

Infobright’s open source move has a lot of potential

Infobright announced today that it’s going full-bore into open source – specifically in the MySQL ecosystem — with the licensing approach, pricing, distribution strategy, and VC money from Sun that such a move naturally entails. I think this is a great idea, for a number of reasons: Read more

September 15, 2008

Infobright goes open source — sound bites

As has recently become my custom when there is industry news, I herewith provide quotable sound bites about Infobright and its move to an open source strategy. Weather permitting, I’ll be on a plane to the Netezza conference this afternoon. And I’ve only slept about 10 hours since Thursday. So I hope these suffice, although if they don’t and you email me I’ll try to respond by some time Tuesday morning.

Posts today on open source DBMS

August 24, 2008

My current customer list among the data warehouse specialists

One of my favorite pages on the Monash Research website is the list of many current and a few notable past customers. (Another favorite page is the one for testimonials.) For a variety of reasons, I won’t undertake to be more precise about my current customer list than that. But I don’t think it would hurt anything to list the data warehouse DBMS/appliance specialists in the group. They are:

All of those are Monash Advantage members.

If you care about all this, you may also be interested in the rest of my standards and disclosures.

July 10, 2008

How is MySQL’s join performance these days?

In a comment thread on a recent post comparing MySQL to Postgres, Jonathon Moore chimed in based on experience with both products. His characterization of some MySQL problems: Read more

May 8, 2008

Outsourced data marts

Call me slow on the uptake if you like, but it’s finally dawned on me that outsourced data marts are a nontrivial segment of the analytics business. For example:

To a first approximation, here’s what I think is going on. Read more

April 5, 2008

Positioning the data warehouse appliances and specialty DBMS

There now are four hardware vendors that each offer or seem about to announce two different tiers of data warehouse appliances: Sun, HP, EMC, and Teradata. Specifically:

Read more

January 21, 2008

Will Brighthouse become the MySQL data warehouse of choice?

As I’ve previously noted:

Talking with Infobright today, I was again struck by how close their relationship with MySQL (the company) is. Stay tuned.

January 21, 2008

Infobright is gearing up for a press push

There’s another TDWI conference coming up, so it’s time for data warehouse-related press rollouts. Infobright (one of my many clients in this area) will be doing one of them, and ran an early version by me. Customer announcements, vendor partnerships, and so on are still being finalized, but anyhow Infobright has 7 revenue-recognized customers and a bunch more that are sold and in the implementation cycle. There’s a Release 3 of Brighthouse coming up. As one would expect, Release 3’s major claims to fame are the general addition of features (including some which elicit a “You didn’t have that already?” reaction), plus huge performance improvements in some queries (i.e., the biggest bottlenecks in Brighthouse Release 2).

On that level, it’s all standard stuff, as is Infobright’s core pitch — ease, simplicity, low cost, etc., and the benefits of same. But drilling down, there are some rather unique technical claims. Read more

January 16, 2008

Things could get interesting for Infobright

Of the many new specialty data warehouse DBMS and appliances, Infobright’s BrightHouse is the only leading one based on MySQL. I expect Sun and Infobright to have some interesting conversations now. Conversely, I wouldn’t be optimistic about any partnering discussions Infobright might have with, say, HP.

The most directly competitive relationship Sun now has to any future Infobright partnership is with ParAccel.

October 28, 2007

Infobright responds

An Infobright employee posted something quite reasonable-looking in response to my inaugural post about BrightHouse. Even so, Infobright asked if they could substitute something with a slightly different tone. I agreed. Here’s what they sent in.

Curt, thanks for the write-up and the opportunity to talk about our customer success stories. As you say, our customer story is definitely “more than zero.” We are addressing a number of critical customer issues with our unique approach to data warehousing.

Infobright currently has 5 customers - customers that have bucked the trend of throwing hardware at the problem. To be perfectly braggadocio about this, we have never lost a competitive proof of concept in which we’ve been engaged. This is accomplished with the horsepower of one box (though for redundancy customers may deploy multiple boxes with a load balancer).

Read more

October 22, 2007

Infobright BrightHouse — columnar, VERY compressed, simple, and related to MySQL

To a first approximation, Infobright – maker of BrightHouse — is yet another data warehouse DBMS specialist with a columnar architecture, boasting great compression and running on commodity hardware, emphasizing easy set-up, simple administration, great price-performance, and hence generally low TCO. BrightHouse isn’t actually MPP yet, but Infobright confidently promises a generally available MPP version by the end of 2008. The company says that experience shows >10:1 compression of user data is realistic – i.e., an expansion ratio that’s fractional, and indeed better than 1/10:1. Accordingly, despite the lack of shared-nothing parallelism, Infobright claims a sweet spot of 1-10 terabyte warehouses, and makes occasional references to figures up to 30 terabytes or so of user data.

BrightHouse is essentially a MySQL storage engine, and hence gets a lot of connectivity and BI tool support features from MySQL for “free.” Beyond that, Infobright’s core technical idea is to chop columns of data into 64K chunks, called data packs, and then store concise information about what’s in the packs. The more basic information is stored in data pack nodes,* one per data pack. If you’re familiar with Netezza zone maps, data pack nodes sound like zone maps on steroids. They store maximum values, minimum values, and (where meaningful) aggregates, and also encode information as to which intervals between the min and max values do or don’t contain actual data values. Read more
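To make the data pack idea concrete, here is a minimal sketch of how per-pack summaries can let a range query skip most of a column. It is an illustration under stated assumptions, not Infobright's implementation: it assumes "64K" means 65,536 values per pack, keeps only min, max, sum, and count in each node, and leaves out the information about which intervals between min and max are populated. The names `DataPackNode`, `classify`, and `sum_between`, and the three category labels, are hypothetical.

```python
from dataclasses import dataclass
from typing import List

PACK_SIZE = 65_536  # assumed meaning of "64K": values per data pack


@dataclass
class DataPackNode:
    """Simplified summary of one data pack."""
    minimum: int
    maximum: int
    total: int
    count: int


def build_nodes(column: List[int], pack_size: int = PACK_SIZE) -> List[DataPackNode]:
    """Chop a column into packs and summarize each one."""
    nodes = []
    for start in range(0, len(column), pack_size):
        pack = column[start:start + pack_size]
        nodes.append(DataPackNode(min(pack), max(pack), sum(pack), len(pack)))
    return nodes


def classify(node: DataPackNode, low: int, high: int) -> str:
    """Decide from the summary alone how a pack relates to low <= v <= high."""
    if node.maximum < low or node.minimum > high:
        return "irrelevant"  # no value can match, so skip the pack
    if low <= node.minimum and node.maximum <= high:
        return "relevant"    # every value matches, so the stored sum suffices
    return "suspect"         # partial overlap, so the pack must be read


def sum_between(column: List[int], nodes: List[DataPackNode],
                low: int, high: int, pack_size: int = PACK_SIZE) -> int:
    """SUM(v) WHERE low <= v <= high, reading only the suspect packs."""
    total = 0
    for i, node in enumerate(nodes):
        kind = classify(node, low, high)
        if kind == "relevant":
            total += node.total
        elif kind == "suspect":
            pack = column[i * pack_size:(i + 1) * pack_size]
            total += sum(v for v in pack if low <= v <= high)
    return total


column = list(range(200_000))  # synthetic, sorted column for illustration
nodes = build_nodes(column)
result = sum_between(column, nodes, 60_000, 140_000)
print([classify(n, 60_000, 140_000) for n in nodes], result)
```

On this sorted toy column, one pack is answered from its node alone and one is skipped outright, so only two of the four packs are actually read. Unsorted data would leave more packs in the partial-overlap category, which is where the extra information about populated intervals would matter.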
