As previously hinted, Teradata has now announced 4 of the 5 members of its “Petabyte Power Players” club. These are enterprises with 1+ petabyte of data on Teradata equipment. As is commonly the case when Teradata discusses such figures, there’s some confusion as to how they’re actually counting. But as best I can tell, Teradata is counting:
- True user data (as opposed to spinning disk or whatever). I believe this part because I asked multiple times and got a consistent answer, and also because elsewhere in its presentations Teradata drew a clear distinction between user data and spinning disk.
- All systems the user has, including for redundancy, test, development, whatever. I believe this part because it’s what Eric Lai quoted Darryl MacDonald as saying in an article I can’t now find, and also because Oliver Ratzesberger of eBay put up slides suggesting he has two different 2+ PB Teradata systems, but no 5 PB one.
Teradata’s five Petabyte Power Players are:
- eBay, listed with 5 petabytes. Note: eBay is also the largest, multi-petabyte customer for one of the smaller data warehouse DBMS/appliance vendors, but that system is just going in, and hence has yet to prove itself.
- Walmart, listed with 2.5 petabytes. Note: To the best of my knowledge (and no thanks to the friendly but uncommunicative Walmart representative at the conference), that’s an order of magnitude more data than HP Neoview is targeted to manage at Walmart.
- Bank of America, listed with 1.5 petabytes.
- An unnamed financial services company, listed with 1.4 petabytes.
- Dell, listed with 1 petabyte. Note: Dell is one of the three known customers for another data warehouse DBMS/appliance vendor.
eBay got the most attention, giving several talks. eBay is relatively mum on the actual benefits of analytics (competitive advantage and all that), but Oliver Ratzesberger did share a few points:
- eBay heavily tests every aspect of its web sites, even tiny ones.
- However, personalization based on true real-time analytics isn’t practical.
- eBay’s direct marketing gets 20-30% more efficient every year, with analytics as the driver of that improvement.
- By applying analytics to its own operations, eBay increased parallel efficiency across its servers from under 50% to over 80% in less than a year. The only server count figure Oliver disclosed was >10,000, but I suspect it’s actually well over that. So that was like getting thousands or tens of thousands of free servers, with the associated floor space and (at least to some extent) power savings.
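To put rough numbers on the "free servers" point, here is a back-of-the-envelope sketch in Python. The function name and the inputs are my own illustration, not anything eBay presented: it plugs in the disclosed floor of 10,000 servers and treats 50% and 80% as exact, whereas the real figures were only given as "under," "over," and ">".

```python
def equivalent_extra_servers(server_count, eff_before, eff_after):
    """Servers you would have had to add, at the OLD efficiency,
    to match the useful capacity gained by the efficiency improvement.

    Hypothetical helper for illustration only.
    """
    useful_before = server_count * eff_before   # effective capacity before
    useful_after = server_count * eff_after     # effective capacity after
    # Extra useful capacity, expressed in old-efficiency servers.
    return (useful_after - useful_before) / eff_before


# Assumed inputs: the disclosed lower bound on servers, and the
# efficiency figures taken at face value.
extra = equivalent_extra_servers(10_000, 0.50, 0.80)
print(f"Roughly {round(extra):,} servers' worth of capacity")
```

With those inputs it comes out to about 6,000 servers. A larger fleet or a lower starting efficiency pushes the figure higher, which is how you get from thousands toward tens of thousands.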
Oliver also said that eBay’s vendor partners might get access to the analytics at some point for their own use. A more overwrought version of the same statement may be found in the headline here.