Facebook’s Hadoop and Hive Data Mining
Friday, October 24th, 2008
The second cloud computing track was on massive data processing in clusters and clouds, including a presentation by Facebook on their use of Hadoop and their custom development of Hive to support their own operations.
Data mining the 180 TB of Facebook data, which grows at 2 TB a day, is no small task, so the team uses a cluster of 350 8-core machines to crunch the data and work out which popular features deserve further investment, what the user demographics look like, and whatever else they can fish out of the sea of personal information users provide.

