March 25th, 2008 Curt Monash
Oliver Ratzesberger and his crew have started a blog, focusing on xldb analytics. Naturally, one of the early posts gives a quick overview of their system stats. Highlights include:
Incoming data volumes exceed 40TB per day, with more than 10^11 new items/lines/records being added per day. Our analytical processing infrastructure exceeds 6PB of physical storage with over 2.9PB(1.4+1.5) in our largest cluster.
We leverage compression technologies wherever possible and are achieving compression ratios as high as 99% on our highest volume data feeds.
On any given day our massive parallel systems process more than 27PB of data, not factoring in various levels of caches that serve similar activities or processes and reduce the amount of physical IOs significantly.
We execute millions of requests on a daily basis, spanning from near realtime highly localized access to enormous jobs that span 100s of TB in a single or series of models.
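For a rough sense of scale, the quoted figures can be combined in a few lines of arithmetic. This is only a back-of-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes), treats the "more than" figures as exact lower bounds, and reads "99% compression" as a 99% size reduction that applies only to the best-compressing feeds. The variable names are mine, not eBay's.

```python
# Back-of-envelope arithmetic on the quoted eBay figures.
# Assumptions: decimal units, and the quoted lower bounds taken as exact.
TB = 10**12
PB = 10**15

incoming_bytes_per_day = 40 * TB      # "exceed 40TB per day"
new_records_per_day = 10**11          # "more than 10^11 new items/lines/records"

# Average size of one incoming record, in bytes.
avg_bytes_per_record = incoming_bytes_per_day / new_records_per_day


def stored_bytes(raw_bytes, size_reduction):
    """Bytes left on disk after removing the given fraction of the raw size."""
    return raw_bytes * (1 - size_reduction)


# Best case only: the 99% figure is quoted for the highest-volume feeds,
# not for all incoming data.
best_case_stored = stored_bytes(incoming_bytes_per_day, 0.99)

# Data processed per day relative to total physical storage.
processed_per_day = 27 * PB
physical_storage = 6 * PB
processed_to_storage_ratio = processed_per_day / physical_storage

print(f"average record size: {avg_bytes_per_record:.0f} bytes")
print(f"best-case stored per day: {best_case_stored / TB:.2f} TB")
print(f"processed/storage ratio: {processed_to_storage_ratio:.1f}x")
```

On these assumptions the average incoming record is a few hundred bytes, and the daily processing volume is several times the total physical storage.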
Posted in Specific users, eBay | No Comments »
February 27th, 2008 Curt Monash
I’ve posted a couple of times about eBay’s analytics side. As a companion, Don Burleson pointed me at a fascinating November 2006 slide presentation outlining eBay’s transactional architecture and evolution. Highlights include:
- A whole lot of manual slicing of Oracle databases, so as not to exceed their capacity.
- A whole lot of careful design and ordering of transactions.
- Putting all the business logic in the application tier, with a custom O/R mapper. There’s lots of caching there, but very little state.
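To make "manual slicing" concrete, here is a minimal sketch of how an application tier might route a request to one of several databases by key range. It is an illustration under my own assumptions, not eBay's actual scheme: the range boundaries, the host names, and the `host_for_user` function are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical slicing table: each database holds the user ids below its
# upper bound (exclusive) and at or above the previous bound.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
DATABASE_HOSTS = ["userdb-01", "userdb-02", "userdb-03"]


def host_for_user(user_id: int) -> str:
    """Return the (hypothetical) database host that stores this user id."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # bisect_right counts how many upper bounds are <= user_id,
    # which is the index of the slice containing it.
    index = bisect_right(RANGE_UPPER_BOUNDS, user_id)
    if index >= len(DATABASE_HOSTS):
        raise LookupError(f"no database slice covers user_id {user_id}")
    return DATABASE_HOSTS[index]


print(host_for_user(42))
print(host_for_user(1_000_000))
```

Keeping the table in the application tier means a slice can be split by editing the boundaries and moving rows, with no state held in the routing code itself.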
The presentation has a bunch of specific numbers, in case anybody wants to dive in.
Posted in OLTP database management, Specific users, eBay | No Comments »
February 26th, 2008 Curt Monash
There’s been some confusion over my post about eBay’s multiple petabytes of data. So to clarify, let me say:
- eBay’s figure of >1.4 petabytes of data — for its largest single analytic database — counts disks or something, not raw user data.
- I previously published a strong conjecture that the database vendor in question was Teradata, which is definitely an eBay supplier. In particular, it is definitely not an Oracle data warehouse.
- While eBay isn’t saying who it is either — not even off-the-record — the 50%ish compression figures they experience just happen to map well to Teradata’s usual range.
- Edit: Just to be clear (not that there was any doubt), I have reconfirmed that eBay is a Teradata user, in or including eBay’s PayPal division.
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Relational database management systems, Specific users, Teradata, eBay | No Comments »
February 11th, 2008 Curt Monash
Single largest database >1.4 petabytes.
From Oliver Ratzesberger’s LinkedIn profile:
Our systems process in excess of 10 billion records per day, serving thousands of users and delivering hundreds of millions of queries per month in a true global 24×7 operation with distributed teams around the globe on systems over 5 PB in size (largest single system >1.4PB).
Posted in Specific users, eBay | 3 Comments »
October 8th, 2007 Curt Monash
According to a hurried conversation I had with Chief Marketing Officer Darryl MacDonald, Teradata has customers with over 1 petabyte of user data in a single instance. He wouldn’t disclose any names, but I’d guess one is eBay, which he did confirm is a customer. The intelligence area is another one where I’d speculate there are Very Large Databases.
However, since Darryl mentioned testing systems internally up to 4 petabytes, I’d guess the upper limit of Teradata deployments is in the 1-2 petabyte range.
EDIT: I’m now guessing that Teradata’s largest classified database — which previously was the largest overall — isn’t much over a petabyte in size. And there’s a strong chance this is larger than any unclassified one.
Update: That wasn’t really 1+ petabyte of user data.
Posted in Analytics and analytic technologies, Data warehouse appliances, Data warehousing, Specific users, Teradata, eBay | No Comments »
August 8th, 2006 Curt Monash
Every sufficiently large or agile enterprise needs to follow the DBMS2 approach. The following is from an article on eBay’s version:
“eBay has built a software-based Integration Tier. This contains both a data access layer (DAL) and a services framework. The Integration Tier acts as an abstraction layer for software engineers to work with many disparate back-end data sources through a consistent set of abstractions.”
Posted in EII, ETL, and/or EAI, Specific users, eBay | No Comments »