Ronnie05's Blog

Facebook, Big Data and Project Prism

Posted in Big Data by Manas Ganguly on August 24, 2012

Facebook processes 2.5 billion pieces of content and more than 500 terabytes of data each day. It pulls in 2.7 billion Like actions and 300 million photos per day, and it scans roughly 105 terabytes of data every half hour. The rate of data ingestion keeps increasing, and the world keeps getting hungrier for data. Facebook’s latest effort is about putting all this data in perspective: mining it for insights across different storage clusters while using resources efficiently and keeping costs down, with the aim of real-time performance management on data outputs. To integrate data seamlessly across its huge data centers, Facebook has put in place initiatives such as Project Prism and Corona.

Project Prism will allow Facebook to maintain data in multiple data centers around the globe while letting company engineers keep a holistic view of it, thanks to tools such as automatic replication. Corona makes Facebook’s Apache Hadoop clusters less crash-prone while increasing the number of tasks that can be run on the infrastructure.

So while Google indexes information from around the world, Facebook indexes user behavior and reactions to a wide range of stimuli around the world. The one thing Facebook would now ideally want to fix is its ability to sell this data and get a good price for its share.

