I am annoyed with my former friends at Greenplum, who took umbrage at a brief sentence I wrote in October, namely “eBay has thrown out Greenplum”. Their reaction included:
- EMC Greenplum no longer uses my services.
- EMC Greenplum no longer briefs me.
- EMC Greenplum reneged on a commitment to fund an effort in the area of privacy.
The last one really hurt, because in trusting them, I put in quite a bit of effort, and discussed their promise with quite a few other people.
Yes, that five-word sentence really seems to have been the problem. I’ve heard that from more than one source.
I think the rest is overwrought too, and not just because I regret the loss of revenue, or of what seemed to be a warm, friendly, hug-laden, and sushi-intensive relationship with Scott Yara and some other folks. At various times, on the subject of its eBay installation:
- Greenplum overoptimistically told me that eBay’s Teradata installation would be replaced with Greenplum gear.
- Greenplum exaggerated the pace of its eBay installation; unfortunately, I believed them, and later had to publish a retraction.
- Greenplum neglected to tell me when eBay had its Greenplum equipment removed.
Now the same Scott Yara who hovered over me for months in marketing micromanagement before I broke the news of the Greenplum and Teradata eBay installations — he could do that because the whole discussion started out under NDA — doesn’t answer my email. Evidently, Greenplum thinks it’s OK to repeatedly be misleading, but doesn’t think it’s OK if my nuance is one they disagree with.
The most entertaining example I recall of Greenplum BS was when CTO Luke Lonergan told 50+ academics at the 2009 XLDB that Greenplum had 10 customers with half a petabyte each of data. I followed him out of the room and said “10 customers — half a petabyte each — I presume that’s for sufficiently small values of ‘one half’?” We eventually settled on a value of “one half” in the 0.2 range — which is actually a pretty impressive claim in itself.
Be all that as it may, EMC Greenplum has a couple of press releases out on which I’ve been asked to comment. One is a deal with SAS, less impressive than SAS’ deals with Teradata and Aster Data in that it offers no actual in-database modeling. Yes, it sounds like modeling on the same nodes where the data sits, but it sounds less desirable than true in-database modeling in that:
- You can only get great performance if the amount of data modeled is small enough to fit into RAM.
- Integration with other database processing, MapReduce, etc. may be limited.
Also, EMC Greenplum expanded its line of appliances to include one that seems optimized for price-per-terabyte and one with solid-state drives. So far, that’s very standard stuff. There’s also a new data loading appliance, which seems to catch up with Aster Data’s 2008 strategy of having separate nodes for bulk loading.
Ironically, when Aster moved away from a total reliance on that strategy, it was becoming more Greenplum-like. As is so often the case, it seems that different vendors’ feature sets are converging.
Meanwhile, the last I heard about Greenplum’s previously very strategic Chorus effort is that it’s being revamped. I don’t get the impression it’s nearly as central to Greenplum’s strategy as it used to be.