MySQL
Analysis of open source DBMS vendor MySQL (recently acquired by Sun Microsystems), its products, and other products in the MySQL ecosystem. Related subjects include:
MySQL Query Analyzer
Given how the product’s rollout has been handled, it seems necessary to comment on MySQL’s recently released MySQL Query Analyzer without actually having much information on the subject. Mark Callaghan offers a good take — he’s generally very favorable, but notes that MySQL has some limitations that Query Analyzer has trouble getting around.
| Categories: MySQL | Leave a Comment |
MySQL is being used in an IBM Lotus appliance
Apparently, IBM is rolling out an appliance for small businesses. MySQL is under the covers. The appliance won’t have a keyboard or monitor, so there won’t be a lot of database administration going on.
Before Solid and solidDB were acquired by IBM, one of the things Solid was proudest of was some embedded apps in which solidDB ran for years in boxes without keyboards or monitors.
I still think it’s a pity that IBM isn’t using solidDB as broadly as the technology deserves. Even so, this is a nice endorsement of MySQL for reliable zero-DBA mid-range use.
| Categories: DBMS product categories, IBM and DB2, Mid-range, MySQL, solidDB | Leave a Comment |
Introduction to Kickfire
I’ve spent a few hours visiting or otherwise talking with my new clients at Kickfire recently, so I think I have a better feel for their story. A few details are still missing, however, either because I didn’t get around to asking about them, or because an unexplained accident corrupted my notes (and I wasn’t even using Office 2007). Highlights include:
| Categories: Columnar database management, Data warehouse appliances, Data warehousing, Kickfire, MySQL, Theory and architecture | Leave a Comment |
Infobright update
In connection with the announcements that:
- Infobright is open sourcing its analytical DBMS product (which is a really good idea)
- Infobright raised a $10 million VC round, with Sun as a new investor
I got my first real Infobright update since January. Highlights included:
| Categories: Data warehousing, Infobright, MySQL, Open source | 2 Comments |
Infobright’s open source move has a lot of potential
Infobright announced today that it’s going full-bore into open source, specifically in the MySQL ecosystem, with the licensing approach, pricing, distribution strategy, and VC money from Sun that such a move naturally entails. I think this is a great idea, for a number of reasons:
| Categories: Data warehousing, Infobright, MySQL, Open source, Uncategorized | 4 Comments |
Infobright goes open source — sound bites
As has recently become my custom when there is industry news, I herewith provide quotable sound bites about Infobright and its move to an open source strategy. Weather permitting, I’ll be on a plane to the Netezza conference this afternoon. And I’ve only slept about 10 hours since Thursday. So I hope these suffice, although if they don’t and you email me I’ll try to respond by some time Tuesday morning.
- For almost anybody in the MySQL world who needs high-performance analytics, Infobright is the first good solution.
- Infobright’s product strengths and use cases are a great match for open source.
- Most leading analytic DBMS have open source roots, but they generally haven’t been open sourced themselves. Infobright immediately becomes one of the premier open source analytic database offerings. The only serious open source rival that’s coming to mind is MonetDB.
- Storage engines are MySQL’s Achilles heel. Each good MySQL storage engine is precious.
- Infobright has enough production references to show that it can get the job done for many data mart uses. It won’t meet everybody’s needs, but it’s well worth an experimental download.
- If you want to build a little data mart and run it yourself, most good products are too complicated or expensive. But in the right use cases, Infobright pretty much runs itself, and there’s no arguing with the Community Edition price (free).
- So Infobright is a great fit for the individual downloader – i.e., for the stereotypical open source user.
- Netezza, Vertica, ParAccel, Greenplum, and Aster Data are all based in one way or another on PostgreSQL (even though Vertica includes no PostgreSQL code). DATAllegro was based on Ingres. Infobright and Kickfire are based on MySQL.
- If Infobright doesn’t get the job done, try downloading Vertica, which – while closed source – is also free for download and development.
- The “rough set” part of Infobright’s story is a lot of mumbo-jumbo, but the “knowledge grid” part is more real.
- When you compare Infobright to Teradata, Netezza, Greenplum, or even Vertica, it’s kind of a toy. But when you compare it to generic MySQL, it’s more like rocket science.
- Infobright was too little, too late in the mainstream analytic DBMS market. They had to do something different. Kudos to them for recognizing that.
- The Infobright product has some serious limitations. If you want a market that’s willing to adopt a DBMS with serious limitations, the MySQL world is the place for you.
Posts today on open source DBMS
- Infobright’s smart move to open source
- General Infobright update
- Infobright sound bites
- The many faces of open source DBMS
| Categories: Data warehousing, Infobright, MySQL, Open source | 3 Comments |
Top DBMS on Linux
I was looking up George Crump’s blogs in connection with his recent post on SSDs, and I stumbled upon one that outlines at great length what features Linux backup systems should have. I won’t claim to have read it word for word, but what did catch my eye were a couple of comments on DBMS market share, which boiled down to:
- Oracle
- MySQL
- PostgreSQL
| Categories: IBM and DB2, Market share, MySQL, Oracle, PostgreSQL | Leave a Comment |
Sun’s Rock chip is going to revolutionize OLTP? Yeah, right.
Ted Dziuba offers a profane and passionate screed to the effect that it would be really, really wonderful if Sun’s forthcoming Rock chip magically revolutionized OLTP. His idea — if I may dignify it with that term — seems to be that by solving some programming issues in multithreading, Sun will achieve orders of magnitude performance improvements in DBMS processing, with MySQL as the beneficiary.
Frankly, I don’t know what in the world Dziuba is talking about, and I strongly suspect that neither does he. Wikipedia wasn’t terribly enlightening, except to point out that some of the ideas originated with Tom Knight, which is encouraging. Ars Technica has a decent article about the Rock chip, but it’s hard to find support for Dziuba’s enthusiasm in their more sober discussion.
| Categories: MySQL, OLTP | 4 Comments |
Microsoft is buying DATAllegro
I’ve long argued that:
- Oracle and Microsoft are doomed in the data warehouse market unless they acquire MPP/shared-nothing data warehouse DBMS and/or data warehouse appliances.
- DATAllegro is the ideal acquisition for either of them.
Microsoft has now validated my claim by agreeing to buy DATAllegro. As you probably know, we’ve been covering DATAllegro extensively, as per the links listed below.
Basic deal highlights include:
Pushback on the PostgreSQL vs. MySQL comparison
It should come as no surprise that not everybody agrees with EnterpriseDB’s views on the PostgreSQL/MySQL comparison. In particular, the High Availability MySQL blog offers a detailed rebuttal post, with more in the comment thread. According to MySQL fans, EnterpriseDB got its facts wrong on several matters regarding MySQL and InnoDB, especially in the areas of triggers and locking. And of course they disagree with EnterpriseDB’s general conclusion.
| Categories: MySQL, Open source, PostgreSQL | Leave a Comment |
How is MySQL’s join performance these days?
In a comment thread on a recent post comparing MySQL to Postgres, Jonathon Moore chimed in based on experience with both products. His characterization of some MySQL problems:
| Categories: Infobright, MySQL, Open source | 6 Comments |
PostgreSQL vs. MySQL, as per EnterpriseDB
EnterpriseDB put out a white paper arguing for the superiority of PostgreSQL over MySQL, even without EnterpriseDB’s own Postgres Plus extensions. Highlights of EnterpriseDB’s opinion include:
- EnterpriseDB asserts that MyISAM is the only MySQL storage engine with decent performance.
- EnterpriseDB then bashes MyISAM for all sorts of well-deserved reasons, especially ACID-noncompliance.
- EnterpriseDB asserts that row-level triggers, lacking in MySQL but present in PostgreSQL, are the most important kind of trigger.
- EnterpriseDB claims PostgreSQL is superior in procedural language support to MySQL.
- EnterpriseDB claims PostgreSQL is superior in authentication support to MySQL.
| Categories: EnterpriseDB and Postgres Plus, Mid-range, MySQL, Open source, PostgreSQL | 10 Comments |
Unreliable web MySQL application (Technorati/Wordpress)
Technorati yesterday exposed an application error, to wit (in what presumably should be a blog content region):
| Categories: MySQL | 6 Comments |
Yahoo scales its web analytics database to petabyte range
Information Week has an article with details on what sounds like Yahoo’s core web analytics database. Highlights include:
- The Yahoo web analytics database is over 1 petabyte. They claim it will be in the 10s of petabytes by 2009.
- The Yahoo web analytics database is based on PostgreSQL. So much for MySQL fanboys’ claims of Yahoo validation for their beloved toy … uh, let me rephrase that. The highly-regarded MySQL, although doing a great job for some demanding and impressive applications at Yahoo, evidently wasn’t selected for this one in particular. OK. That’s much better now.
- But the Yahoo web analytics database doesn’t actually use PostgreSQL’s storage engine. Rather, Yahoo wrote something custom and columnar.
- Yahoo is processing 24 billion “events” per day. The article doesn’t clarify whether these are sent straight to the analytics store, or whether there’s an intermediate storage engine. Most likely the system fills blocks in RAM and then just appends them to the single persistent store. If commodity boxes occasionally crash and lose a few megs of data — well, in this application, that’s not a big deal at all.
- Yahoo thinks commercial column stores aren’t ready yet for more than 100 terabytes of data.
- Yahoo says it got great performance advantages from a custom system by optimizing for its specific application. I don’t know exactly what that would be, but I do know that database architectures for high-volume web analytics are still in pretty bad shape. In particular, there’s no good way yet to analyze the specific, variable-length paths users take through websites.
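The guess above about how events reach the store (fill blocks in RAM, then append them to a single persistent store) can be sketched roughly as follows. This is a minimal illustration of that guess, not Yahoo's actual design: the `BlockAppender` class, the `block_size` knob, and the text-file store are all hypothetical. The helper at the end just works out what 24 billion events per day means as an average rate.

```python
import os
import tempfile


class BlockAppender:
    """Buffer events in RAM and append them to one file a block at a time.

    Hypothetical sketch: names and the text-file format are assumptions.
    """

    def __init__(self, path, block_size):
        self.path = path
        self.block_size = block_size  # events per block (assumed knob)
        self.buffer = []

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.block_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # Append-only write: blocks already on disk are never rewritten.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write("\n".join(self.buffer) + "\n")
        self.buffer = []


def events_per_second(events_per_day):
    """Average event rate implied by a daily total."""
    return events_per_day / (24 * 60 * 60)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "events.log")
    store = BlockAppender(path, block_size=3)
    for i in range(7):
        store.add(f"event-{i}")
    # Two full blocks are on disk; the seventh event is still only in RAM
    # and would be lost if the process crashed at this point.
    with open(path, encoding="utf-8") as f:
        print(len(f.read().splitlines()), "persisted,", len(store.buffer), "buffered")
    print(round(events_per_second(24_000_000_000)), "events/second on average")
```

The trade-off is the one described above: anything still in the buffer at crash time is gone, which is tolerable only when losing a small slice of events doesn't matter.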
| Categories: Analytic technologies, Columnar database management, Data warehousing, MySQL, PostgreSQL, Specific users, Theory and architecture, Yahoo | 9 Comments |
Relational purists should root for ScaleDB
I just put up a long post about a small development-stage company, ScaleDB. The punchline is that ScaleDB has a data access method — an extension of Patricia tries — that gives referential integrity and updatable views for free.
People who think current “relational” DBMS aren’t relational enough often suggest that’s the kind of foundation DBMS should have. And unlike Required Technologies’ TransRelational (TM) shtick, ScaleDB’s really is an OLTP-oriented approach.
| Categories: MySQL, Theory and architecture, TransRelational | Leave a Comment |
ScaleDB presents The Revenge of the Pointer
The MySQL user conference is upon us, and hence so are MySQL-related product announcements, including storage engines. One such is Kickfire. ScaleDB — smaller and earlier-stage — is another.
In a nutshell, ScaleDB’s proposition is:
- Innovative approach to indexing relational DBMS, providing performance advantages.
- Shared-everything scale-up that ScaleDB believes will leapfrog the MySQL engine competition already in Release 1. (In my opinion, this is the least plausible part of the ScaleDB story.)
- State-of-the-art me-too facilities for locking, logging, replication/fail-over, etc., also already in Release 1.
Like many software companies with non-US roots, ScaleDB seems to have started with a single custom project, using a Patricia trie indexing system. Then they decided Patricia tries might be really useful for relational OLTP as well. The ScaleDB team now features four developers, plus half-time or so “Chief Architect” involvement from Vern Watts. Watts seems to pretty much have been Mr. IMS for the past four decades, and thus surely knows a whole lot about pointer-based database management systems; presumably, he’s responsible for the generic DBMS design features that are being added to the innovative indexing scheme. On ScaleDB’s advisory board is PeopleSoft veteran Rick Berquist, about whom I’ve had fond thoughts ever since he talked me into focusing on consulting as the core of my business.*
*More precisely, Rick pretty much tricked me into doing a day of consulting for $15K, then revealed that’s what he’d done, expressing the thought that he’d very much gotten his money’s worth. But I digress …
ScaleDB has no customers to date, but hopes to be in beta by the end of this year. Angels and a small VC firm have provided bridge loans; otherwise, ScaleDB has no outside investment. ScaleDB’s business model thoughts include:
| Categories: Data models and architecture, Mid-range, MySQL, OLTP, Open source, ScaleDB, Theory and architecture | Leave a Comment |
Supporting evidence for the DBMS disruption story
As previously announced, I did a webcast this afternoon, discussing database diversity. The title of the talk was taken directly from a post – What leading DBMS vendors don’t want you to realize — that argued mid-range DBMS are suitable for a broad variety of tasks. The overriding theme was a Clayton Christensen-style “disruption” narrative.
The sponsor was EnterpriseDB, which is fitting. While not the biggest DBMS industry disrupter in terms of revenue or visible impact (MySQL and Netezza say “Hi”), the Postgres family in general and EnterpriseDB in particular epitomize the disruption threat like nobody else, because of how broadly they substitute for market-leading database managers.
As I promised on the call, below is a post with links to further research backing up the points made. They’re numbered to match some of the presentation slides, which you can find at this link.
3. Much of the discussion of database diversity comes from a series of posts I coordinated with Mike Stonebraker.
4. At various times, starting on Slide 4, I made reference to datatype extensibility, a key feature of Oracle and DB2 – and a key advantage of Postgres over MySQL.
10. Capping off the database diversity discussion, Slide 10 mirrors this 11-point version of a data management software taxonomy.
13-14. I’ve posted many times about data warehousing DBMS and related technologies, including this overview of major analytic DBMS products, another recent overview of data warehouse specialty technologies, and an attempt to distinguish between data warehouse appliance myths and realities. Of particular interest for further research may be our sections on data warehouse appliances and columnar DBMS.
15. I do most of my posting about text search over on Text Technologies, specifically in the search category. Vendors I specifically mentioned as blending search with other kinds of data retrieval were Mark Logic and Attivio.
16. There’s a section here on native XML database management.
17. We also have a section on managing RDF and other graphical data models.
18. Ditto complex event/stream processing.
19. The only embeddable DBMS I’ve written much about recently is solidDB. And frankly, even in that case I’ve focused more on mid-tier caching uses, the now-canceled MySQL relationship, or general technology than I did specifically on embedded uses.
22-24. Back in February, 2007 I made what is probably still my clearest post explaining why I think market-leading DBMS vendors are in the process of getting disrupted.
| Categories: EnterpriseDB and Postgres Plus, Mid-range, MySQL, Open source, Oracle, PostgreSQL | Leave a Comment |
Kickfire is de-cloaking
Kickfire, the renamed C2, is doing one of those buzz-building rollouts in which they make sure the first word comes from people on their payroll golly-gee-whizzing. You can see those at Xarpb and Diamond Notes, as well as a forthcoming article in MySQL magazine. Farhan Mashraqi also appears to be involved. Kickfire is also sponsoring the MySQL user conference next week.
I plan to write more after I get some substance, but a few things seem clear:
1. Kickfire’s product is an appliance that functions as a MySQL storage engine.
2. There’s a custom chip involved.
3. Kickfire plans to throw around the “stream processing” buzzphrase a lot.
Now, “stream processing” means a lot of different things to different people. E.g., Netezza uses the phrase just because their FPGA throws away a lot of data before ever routing it to more conventional SQL processing. But pending a briefing, I’m guessing that Kickfire’s sense is similar to what underlies the case for using CEP in BI.
Edit: Here’s an update after an actual Kickfire briefing.
| Categories: Analytic technologies, Data warehouse appliances, Data warehousing, Kickfire, MySQL | 7 Comments |
XML versus sparse columns in variable schemas
Simon Sabin makes an interesting point: If you can have 30,000 columns in a table without sparsity management blowing up, you can handle entities with lots of different kinds of attributes. (And in SQL Server you can now do just that.) The example he uses is products — different products can have different sets of possible colors, different kinds of sizes, and so on. An example I’ve used in the past is marketing information — different prospects can reveal different kinds of information, which may have been gathered via non-comparable marketing programs.
I’ve suggested this kind of variability as a reason to actually go XML — you’re constantly adding not just new information, but new kinds of information, so your fixed schema is never up to date. But I haven’t detected many actual application designers who agree with me …
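To make the trade-off concrete, here is a small sketch of the same entities held two ways: as rows in one wide fixed schema (where most cells end up NULL, which is what sparse-column support has to cope with) and as XML elements that carry only the attributes each entity actually has. The product records and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical product records; each has its own set of attributes.
products = [
    {"sku": "shirt-1", "color": "blue", "collar_size": "16"},
    {"sku": "shoe-1", "color": "black", "shoe_size": "10", "width": "E"},
    {"sku": "lamp-1", "wattage": "60"},
]


def to_wide_rows(records):
    """Fixed-schema view: one column per attribute seen anywhere, None where absent."""
    columns = sorted({key for record in records for key in record})
    rows = [[record.get(col) for col in columns] for record in records]
    return columns, rows


def null_fraction(rows):
    """Share of cells in the wide table that hold no value."""
    cells = [cell for row in rows for cell in row]
    return sum(cell is None for cell in cells) / len(cells)


def to_xml(record):
    """Variable-schema view: each entity carries only the attributes it has."""
    element = ET.Element("product")
    for key, value in record.items():
        ET.SubElement(element, key).text = value
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    columns, rows = to_wide_rows(products)
    print(columns)
    print(f"{null_fraction(rows):.0%} of cells are NULL")
    print(to_xml(products[2]))
```

With just three products, half the cells in the wide table are already empty, and a new kind of attribute means a new column there but only a new element name in the XML form.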
| Categories: MySQL, Native XML, Theory and architecture | 3 Comments |
EnterpriseDB unveils Postgres Plus
EnterpriseDB is making a series of moves and announcements. Highlights include:
- Renaming/repositioning the product as “Postgres Plus.” The free product is now Postgres Plus, while the version you pay EnterpriseDB for is now Postgres Plus Advanced Server.
- Repackaging the products, so that Postgres Plus Advanced Server is a strict superset of Postgres Plus.
- New features added to Postgres Plus Advanced Server.
- Features newly migrated from Advanced Server down to Postgres Plus.
- A strategic investment by IBM.
- Stressing Postgres in EnterpriseDB marketing, and dropping the tag-line defining themselves as “the Oracle-compatible database company.”
So far as I can tell, most of the technical differences between Advanced Server and regular Postgres Plus lie in three areas:
| Categories: Cache, Emulation, transparency, portability, EnterpriseDB and Postgres Plus, Mid-range, MySQL, OLTP, Open source, PostgreSQL | 1 Comment |
More Twitter weirdness
Twitter commonly has the problem of duplicate tweets. That is, if you post a message, it shows up twice. After a little while, the dupe disappears, but if you delete the dupe manually, the original is gone too.
I presume what’s going on is that tweets are cached, the tweets are eventually batched to disk, and they don’t always get deleted from cache until some time after they’re persisted. If you happen to check the page of your recent tweets in between: boom, you get two hits. But what I don’t understand is why the two versions have different timestamps.
Presumably, this could be explained at a MySQL User Conference session next month, one of whose topics will be Intelligent caching strategies using a hybrid MemCache / MySQL approach. I’m so glad they don’t use stupid strategies to do this …
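The presumed mechanism can be sketched as a toy write-behind cache. This is purely an illustration of the guess above, not Twitter's actual code: the `TweetStore` class and all its method names are hypothetical, and both the cache and the "disk" are plain Python lists. It reproduces the duplicate and the delete behavior, but not the differing timestamps.

```python
class TweetStore:
    """Toy write-behind cache in front of a persistent store (hypothetical)."""

    def __init__(self):
        self.cache = []  # recently posted tweets, newest last
        self.disk = []   # stand-in for the database

    def post(self, tweet_id, text):
        self.cache.append((tweet_id, text))

    def persist_batch(self):
        # Batch the cached tweets to disk, but leave them in the cache for now.
        self.disk.extend(self.cache)

    def evict_persisted(self):
        # Runs some time later: drop cache entries that are already on disk.
        persisted = set(self.disk)
        self.cache = [t for t in self.cache if t not in persisted]

    def recent_naive(self):
        # The problem being illustrated: the page concatenates both sources.
        return self.disk + self.cache

    def recent_deduped(self):
        # One possible fix: show each tweet id only once.
        seen, page = set(), []
        for tweet_id, text in self.disk + self.cache:
            if tweet_id not in seen:
                seen.add(tweet_id)
                page.append((tweet_id, text))
        return page

    def delete(self, tweet_id):
        # Deleting by id removes every copy, so deleting the "dupe"
        # also removes the original.
        self.cache = [t for t in self.cache if t[0] != tweet_id]
        self.disk = [t for t in self.disk if t[0] != tweet_id]


if __name__ == "__main__":
    store = TweetStore()
    store.post(1, "hello")
    store.persist_batch()
    print(store.recent_naive())    # same tweet listed twice until eviction
    print(store.recent_deduped())  # listed once
    store.delete(1)
    print(store.recent_naive())    # both copies gone
```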
| Categories: Cache, MySQL, OLTP, Specific users | 3 Comments |
IBM discontinues the solidDB MySQL engine
Last year, I thought that solidDB could at least potentially be an outstanding MySQL engine. But as per news posted on SourceForge last week, that’s not going to happen. At least, it’s not going to happen via any development efforts from IBM.
| Categories: IBM and DB2, Mid-range, MySQL, Open source, solidDB | 4 Comments |
Database management system choices — mid-range-relational
This is the fourth of a five-part series on database management system choices. For the first post in the series, please click here.
The other threat to the high-end relational DBMS vendors aims squarely at the heart of their business. It’s the mid-range relational database management systems, which are doing an ever-larger fraction of what their high-end cousins can. That said, different products do different things well. So if you’re not blindly paying up for the security of an all-things-to-all-people high-end DBMS, there are a number of factors you might want to consider.
| Categories: Database diversity, EnterpriseDB and Postgres Plus, Mid-range, MySQL, OLTP, PostgreSQL, Theory and architecture | 2 Comments |
What hard-core transactional applications have actually been built in MySQL, PostgreSQL, EnterpriseDB, or FileMaker?
And here’s the biggie.
Question of the day #3
What complex, high-volume transactional applications have actually been built in mid-range DBMS such as MySQL, PostgreSQL, FileMaker, or EnterpriseDB?
I’ve been flamed for suggesting that MySQL or FileMaker aren’t fully equal to Oracle and DB2 in supporting hard-core transactional applications. (Which is ironic, because I’ve also been flamed for suggesting hard-core transactional support isn’t as big a deal for DBMS selection as some relational purists insist. But I digress …) So I’m putting the question out there: what impressive transactional applications do the stand-alone mid-range DBMS actually support?
| Categories: EnterpriseDB and Postgres Plus, FileMaker, Mid-range, MySQL, OLTP, Open source, PostgreSQL | 20 Comments |
A high write-volume MySQL user
Spinn3r crawls and indexes blogs. It says it covers 1 million blogs and 25K posts/hour, doing thousands of write transactions per second. And it does this into federated MySQL, but with a lot of software built on top. To wit:
| Categories: MySQL, Specific users | 1 Comment |
