A Day in the Life of Facebook Operations

2010-06-29 11:14

[facebook]

How big is Facebook?

And how many servers? Around 60K?

Rapid Growth



A third data center is being built in Oregon.

Architecture



HipHop for PHP

memcached: 300+ TB of live data in RAM.
All services are independent.

Configuration Management

We use CFEngine 3.

Deployment

We focus on web push.
We use BitTorrent to deploy source code. It's very fast.
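To make the BitTorrent point concrete, here is a minimal sketch of the idea that makes peer-to-peer distribution safe: the payload is split into pieces, each piece has a published SHA-1 hash, and a server can accept pieces from any peer in any order as long as they verify. This is not Facebook's tooling; the function names and the tiny piece size are assumptions for illustration only.

```python
import hashlib
from typing import Dict, List

# Hypothetical piece size, kept tiny for the example; real torrents use much larger pieces.
PIECE_SIZE = 4


def split_pieces(payload: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [payload[i:i + piece_size] for i in range(0, len(payload), piece_size)]


def piece_hashes(pieces: List[bytes]) -> List[str]:
    """SHA-1 digest per piece, as a torrent's metadata would list them."""
    return [hashlib.sha1(p).hexdigest() for p in pieces]


def verify_piece(index: int, data: bytes, hashes: List[str]) -> bool:
    """Check one received piece against the published hash for that index."""
    return hashlib.sha1(data).hexdigest() == hashes[index]


def reassemble(received: Dict[int, bytes], hashes: List[str]) -> bytes:
    """Rebuild the payload once every piece is present and verified."""
    if set(received) != set(range(len(hashes))):
        raise ValueError("missing pieces")
    for index, data in received.items():
        if not verify_piece(index, data, hashes):
            raise ValueError("piece %d failed verification" % index)
    return b"".join(received[i] for i in range(len(hashes)))
```

Because each piece is verified on its own, a web server can fetch different pieces from different peers at the same time, which is where the speed comes from.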





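The usual development model might be Engineering/QA/Operations, but we have no QA, because the communication cost is too high.
Our engineers write and deploy their own code, so they can quickly pay attention to performance problems, traffic, and so on.
Operations staff are 'embedded' into engineering teams, so that they can make better architecture decisions and understand the product better.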



Change logging
Every operation performed is logged. Most updates happen around 1 a.m.
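As a rough illustration of change logging, the sketch below keeps an append-only list of JSON records (who, what, when) and counts changes per hour of day, which is the kind of query that would show most updates landing around 1 a.m. The record fields and function names are assumptions made for this example, not the actual system described in the talk.

```python
import json
from datetime import datetime, timezone


def record_change(log, who, action, when=None):
    """Append one change record to the log as a JSON line and return the entry."""
    timestamp = when or datetime.now(timezone.utc)
    entry = {"time": timestamp.isoformat(), "who": who, "action": action}
    log.append(json.dumps(entry, sort_keys=True))
    return entry


def changes_by_hour(log):
    """Count logged changes per hour of day, using each record's own timestamp."""
    counts = {}
    for line in log:
        hour = datetime.fromisoformat(json.loads(line)["time"]).hour
        counts[hour] = counts.get(hour, 0) + 1
    return counts
```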


Monitor and Metrics

Ganglia is very fast.

Another graphing tool: ODS.



We basically use Nagios.

Aggregation



How we do it

Constant Growth
Constant Failure
Logical Units
servers, racks, clusters, datacenters
Constant Communication
Everyone uses IRC.


Small Teams
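The "Logical Units" point above (servers, racks, clusters, datacenters) pairs naturally with "Constant Failure": if failures are expected, it helps to see them rolled up per unit rather than per host. The sketch below is only an assumed illustration of that roll-up; the tuple layout and the function name are hypothetical.

```python
from collections import Counter


def unhealthy_by_level(servers):
    """Count unhealthy servers per datacenter, cluster, and rack.

    `servers` is an iterable of (datacenter, cluster, rack, host, healthy) tuples.
    """
    counts = {"datacenter": Counter(), "cluster": Counter(), "rack": Counter()}
    for datacenter, cluster, rack, host, healthy in servers:
        if healthy:
            continue
        counts["datacenter"][datacenter] += 1
        counts["cluster"][(datacenter, cluster)] += 1
        counts["rack"][(datacenter, cluster, rack)] += 1
    return counts
```

A rack with many failures at once points to a shared cause (power, switch), while scattered single failures are just background noise at this scale.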



Recap


Screenshots from YouTube.