Greenplum (and Truviso) advisor Joseph Hellerstein offers a few examples of MapReduce applications (specifically Greenplum MapReduce), namely:
The big aha moment occurred for me during our panel discussion, which included Luke Lonergan from Greenplum, Roger Magoulas from O’Reilly, and Brian Dolan from Fox Interactive Media (which runs MySpace among other web properties).
Roger talked about using MapReduce to extract structured entities from text for doing tech trend analyses from billions of rows of online job postings. Brian (who is a mathematician by training) was talking about implementing conjugate gradient and Support Vector Machines in parallel SQL to support “hypertargeting” for advertisers. I mentioned how Jonathan Goldman at LinkedIn was using SQL and MapReduce to do graph algorithms for social network analysis.
Incidentally: While it’s been some months since I asked, my sense is that the O’Reilly text extraction is home-grown, and primitive compared to what one could do via commercial products. That said, if the specific application is examining job postings, I’m not sure how much value more sophisticated products would add. After all, tech job listings are generally written in a style explicitly designed to ensure that most or all of their meaning is conveyed simply by a bag of keywords. And by the way, this effort has been underway for quite some time.
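To make the bag-of-keywords point concrete, here is a minimal sketch in plain Python of how keyword counting over job postings fits the map/reduce pattern. It is not O’Reilly’s or Greenplum’s actual code; the keyword list, the sample postings, and the function names `map_posting` and `reduce_counts` are all invented for illustration.

```python
import re
from collections import Counter
from functools import reduce

# Hypothetical vocabulary; a real trend analysis would use a far larger list.
KEYWORDS = {"python", "sql", "hadoop", "java", "mapreduce"}

def map_posting(text):
    """Map step: count each tracked keyword at most once per posting."""
    tokens = set(re.findall(r"[a-z+#]+", text.lower()))
    return Counter(tokens & KEYWORDS)

def reduce_counts(left, right):
    """Reduce step: merge two partial keyword tallies."""
    return left + right

# Made-up sample postings, standing in for billions of real rows.
postings = [
    "Senior engineer: Java, SQL, Hadoop experience required.",
    "Data analyst with SQL and Python skills.",
    "MapReduce developer (Java or Python).",
]

totals = reduce(reduce_counts, map(map_posting, postings), Counter())
print(totals.most_common())
```

Because each posting is mapped independently and the merge is associative, the same two functions could in principle be spread across many nodes, which is what makes this kind of keyword extraction a natural MapReduce workload.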
- Greenplum has a page on the O’Reilly relationship. However, the part that isn’t behind a registration barrier is trivial — and I wouldn’t know one way or the other about the registration-required part.