Stonebraker, DeWitt, et al. compare MapReduce to DBMS
Along with five other coauthors (the lead author seems to be Andy Pavlo), famous MapReduce non-fans Mike Stonebraker and David DeWitt have posted a SIGMOD 2009 paper called “A Comparison of Approaches to Large-Scale Data Analysis.” The heart of the paper is benchmarks of Hadoop, Vertica, and “DBMS-X” on identical clusters of 100 low-end nodes, across a series of tests including (if I understood correctly):
- A couple of different flavors of a Grep task originally proposed in a Google MapReduce paper.
- A database query on simulated clickstream data.
- A join on the same clickstream data.
- Two aggregations on the clickstream data.
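For readers who haven't seen the MapReduce style, here is a rough single-process Python sketch of how a Grep task and an aggregation look when written as map and reduce functions. The function names and the `(source_ip, revenue)` record layout are my own illustrative assumptions, not the paper's code or schema, and nothing here is distributed or tuned.

```python
import re
from collections import defaultdict


def grep_map(record, pattern):
    """Map step for Grep: emit the record only if it contains the pattern."""
    if re.search(pattern, record):
        yield record, 1


def run_grep(records, pattern):
    # Grep needs no real reduce step: matching records pass through unchanged.
    return [key for record in records for key, _ in grep_map(record, pattern)]


def agg_map(visit):
    """Map step for the aggregation: key each visit by its source IP."""
    source_ip, revenue = visit  # hypothetical record layout
    yield source_ip, revenue


def agg_reduce(key, values):
    """Reduce step: total the revenue seen for one key."""
    return key, sum(values)


def run_aggregation(visits):
    # "Shuffle" phase: group mapped values by key before reducing.
    groups = defaultdict(list)
    for visit in visits:
        for key, value in agg_map(visit):
            groups[key].append(value)
    return dict(agg_reduce(key, values) for key, values in groups.items())
```

In a SQL DBMS the aggregation would be roughly a one-line `SUM` with `GROUP BY` on the source IP, which is a big part of why the comparison is interesting.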
Categories: Analytic technologies, Hadoop, MapReduce, Michael Stonebraker, Parallelization, Vertica Systems | 6 Comments |
Amazon Elastic MapReduce
Amazon is introducing a beta of Amazon Elastic MapReduce. What it boils down to is cheap, on-demand Hadoop.
This seems like a great way to experiment with MapReduce and see if you like it. But for serious use, I don’t know why you wouldn’t prefer MapReduce more closely integrated into a DBMS.
Categories: Amazon and its cloud, Cloud computing, MapReduce | 1 Comment |
CSQL: Yet another in-memory DBMS for caching
A few of you care about obscure in-memory DBMS products. Well, I was just e-mailed about another one, apparently called CSQL or CSQLcache. As of now, CSQL has a SourceForge website, a Wikipedia entry, and a blog.
One interesting thing on that blog is a taxonomy of caches — Level 1 cache, Level 2 cache, RAM, disk, etc., with some approximate figures for lookup times. Edit: However, Kevin Closson emailed me to say it’s way out of date. Stay tuned to his blog for more on the subject.
Categories: Cache, In-memory DBMS, Memory-centric data management | 3 Comments |
Ingres update
I talked with Ingres today. Much of the call was fluff — open-source rah-rah, plus some numbers showing purported success, but so finely parsed as to be pretty meaningless. (To Ingres’ credit, they did offer to let me talk w/ their CFO, even if they offered no promises as to whether he’d offer any more substantive information.) Highlights included: Read more
Categories: Actian and Ingres, Data warehousing, EnterpriseDB and Postgres Plus, MySQL, Open source, Oracle, PostgreSQL, Sybase | 6 Comments |
Donald Farmer knocks the April Fool 8-ball out of the park
Donald Farmer has an excellently-crafted April Fool post about a revolution in business intelligence. Look at the character names, for example.
I wonder whether Donald learned operations research from that textbook where two main decision-making characters were Mark Off and his father Pop, an example company was Edifice Wrecks, and an example CEO was Dawn Shirley Light …
Categories: Analytic technologies, Business intelligence, Humor | 1 Comment |
April Fool’s Day highlights
Amazon says it’s taking “cloud” computing to new heights, as it were.
Derivative funds and large government-subsidized entities will be especially interested in FACE’s transmodal operation. They can allocate a dedicated FACE, load it up with data, and then send it out to sea to perform advanced processing in safety. The government will have absolutely no chance of acting against them, because they will be too busy trying to decide which Federal Air Regulation (FAR) was violated, not to mention scheduling news conferences.
First excellent April Fool’s joke I saw this year was from The Guardian. The best so far is from Expedia. Others are linked in my Twitter feed. And personally, I’m encouraging the concept of April No-Fooling Day.
Categories: Amazon and its cloud, Cloud computing, Humor | 1 Comment |
Business intelligence notes and trends
I keep not finding the time to write as much about business intelligence as I’d like to. So I’m going to do one omnibus post here covering a lot of companies and trends, then circle back in more detail when I can. Top-level highlights include:
- Jaspersoft has a new v3.5 product release. Highlights include multi-tenancy-for-SaaS and another in-memory OLAP option. Otherwise, things sound qualitatively much as I wrote last September.
- Inforsense has a cool composite-analytical-applications story. More precisely, they said my phrase “analytics-oriented EAI” was an “exceptionally good” way to describe their focus. Inforsense’s biggest target market seems to be health care, research and clinical alike. Financial services is next in line.
- Tableau Software “gets it” a little bit more than other BI vendors about the need to decide for yourself how to define metrics. (Of course, it’s possible that other “exploration”-oriented new-style vendors are just as clued-in, but I haven’t asked in the right way.)
- Jerome Pineau’s favorable view of Gooddata and unfavorable view of Birst are in line with other input I trust. I’ve never actually spoken with the Gooddata folks, however.
- Seth Grimes suggests the qualitative differences between open-source and closed-source BI are no longer significant. He has a point, although I’d frame it more as being about the difference between the largest (but acquisition-built) BI product portfolios and the smaller (but more home-grown) ones, counting open source in the latter group.
- I’ve discovered about five different in-memory OLAP efforts recently, and no doubt that’s just the tip of the iceberg.
- I’m hearing ever more about public-facing/extranet BI. Information Builders is a leader here, but other vendors are talking about it too.
Categories: Application areas, Business intelligence, Information Builders, Inforsense, Jaspersoft, QlikTech and QlikView, Scientific research, Tableau Software | 8 Comments |
Lots of analytic DBMS vendors are hiring
After writing about a Twitter jobs page, it occurred to me to check out whether analytic DBMS vendors are still hiring. Based on the Careers pages on their websites, I determined that Aster, Greenplum, Kickfire, and ParAccel all evidently are, in various mixes of (mainly) technical and field positions. At that point I got bored and stopped.
I didn’t choose those vendors entirely at random. If I had to name three vendors who are said to have had small layoffs at some point over the past few quarters, it would be ParAccel, Greenplum, and Kickfire. So if even they are hiring, the analytic DBMS sector is still pretty healthy … or at least thinks it is. 😉
Categories: Aster Data, Data warehousing, Greenplum, Kickfire, ParAccel | 5 Comments |
Somebody is spreading Teradata acquisition rumors again
A mass email from Tom Coffing was forwarded to me today that starts:
I have heard from reliable sources that both HP and SAP have purchased more than 5% of Teradata stock. My sources tell me that both companies appear to be positioning themselves for a bid.
I got my version of the same email from Coffing yesterday with a different introduction but otherwise the same substance (he’s pushing a new product of his). It also had a different From address.
Possible explanations include but are not limited to:
- Coffing knows something (seems unlikely, but I haven’t actually checked www.sec.gov to confirm or disconfirm)
- Coffing thinks he knows something
- Coffing just made this up (I hope not)
- There’s an April Fool’s Day prank going on (not by me — after my bizarre March, I’m recusing myself from April Fool’s pranks this year)
Categories: Data warehousing, HP and Neoview, SAP AG, Teradata | 4 Comments |
Twitter is considering using MapReduce
From a Twitter job listing (formatting mine). The most interesting section is “Additional preferred experience.”
Categories: Analytic technologies, Data warehousing, MapReduce, Specific users, Web analytics | 6 Comments |