System Design for Ads/Recommendation Summary
Basics
Basic Components
- DNS (Domain Name System): resolves a domain name to an IP address.
- web server: processes requests and returns responses.
- database: stores data in its own tier, separate from the computation (web) tier.
- load balancer: distributes traffic across multiple servers.
- cache: sits between the web server and the DB and keeps frequently accessed data in memory to speed up reads.
- CDN (content delivery network): serves static content from edge locations, which offloads the web servers and improves response time.
- data center: each has its own web servers, DBs, and caches. The data centers share a NoSQL store.
- message queue: separates producer and consumer logic so that web servers and specific services can scale independently.
- logging: aggregates error messages and monitoring messages in one place.
- metrics: host-level metrics, aggregated metrics (DB and cache performance), and business metrics such as DAU and revenue.
- monitoring: tracks the health status of the system.
- automation: for example, CI/CD.
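To make the cache bullet concrete, here is a minimal sketch of one common way to place a cache between the web server and the DB (often called cache-aside). The class name `CacheAside`, the `load_from_db` callback, and the TTL value are hypothetical names chosen for illustration; a real deployment would more likely use an external cache such as Redis or Memcached.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Check the in-memory cache first; fall back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]],
                 ttl_seconds: float = 60.0) -> None:
        self._load_from_db = load_from_db  # hypothetical DB accessor
        self._ttl = ttl_seconds            # assumed expiry policy
        # key -> (expiry timestamp, cached value)
        self._store: Dict[str, Tuple[float, Optional[str]]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1                 # served from memory
            return entry[1]
        self.misses += 1                   # missing or expired: go to the DB
        value = self._load_from_db(key)
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying row changes."""
        self._store.pop(key, None)


if __name__ == "__main__":
    fake_db = {"ad:1": "banner"}           # stand-in for a real database
    cache = CacheAside(fake_db.get)
    cache.get("ad:1")                      # miss, loads from fake_db
    cache.get("ad:1")                      # hit, served from memory
    print(cache.hits, cache.misses)
```

The trade-off in this sketch is staleness: until the TTL expires or `invalidate` is called, readers may see an old value.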
Items
- Vertical & horizontal scaling: upgrading a single machine vs. adding more machines. Horizontal scaling raises its own issues: resharding data (consistent hashing), the celebrity problem (hot keys), and joins across shards (de-normalize so that a query can be served from a single table).
- Data replication: replicate the database to improve its stability/availability.
- Stateless web tier (stateless architecture): all servers share state kept in common storage (NoSQL).
- Stateful architecture: each server is responsible for a specific subset of users, which makes failure cases hard to handle.
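The resharding item above mentions consistent hashing. The following is a minimal sketch of a hash ring with virtual nodes, not a production implementation. The class name `HashRing`, the shard names, the choice of SHA-256, and the default of 100 virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring; each node is placed at `vnodes` positions."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._ring: Dict[int, str] = {}   # ring position -> node name
        self._sorted: List[int] = []      # sorted ring positions
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            h = _hash(f"{node}#{i}")
            if h in self._ring:           # skip the (unlikely) collision
                continue
            self._ring[h] = node
            bisect.insort(self._sorted, h)

    def remove_node(self, node: str) -> None:
        for i in range(self._vnodes):
            h = _hash(f"{node}#{i}")
            if self._ring.get(h) == node:
                del self._ring[h]
                self._sorted.remove(h)

    def get_node(self, key: str) -> str:
        """Return the first node clockwise from the key's position."""
        if not self._sorted:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._sorted, _hash(key)) % len(self._sorted)
        return self._ring[self._sorted[idx]]


if __name__ == "__main__":
    ring = HashRing(["db-a", "db-b", "db-c"])  # hypothetical shard names
    print(ring.get_node("user:42"))
```

The property this is meant to show: when a shard is added, the only keys that change owner are the ones that move to the new shard, so most data stays where it is. Virtual nodes spread each shard around the ring, which evens out load but does not by itself solve the hot-key (celebrity) problem.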
How to scale
- Stateless architecture: keep web tier stateless
- Replication: build redundancy at every tier
- Cache: as much as you can
- Multiple data centers: support them
- Sharding: scale the DB
- Message queue: separate the producer/consumers, split tiers into individual services
- Monitoring/logging/automation/metrics
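As a rough illustration of the message queue item in the list above, the sketch below decouples a producer from a pool of consumers using Python's in-process `queue.Queue`. This is only a stand-in for a real broker; the function names, the event strings, and the "processing" step (upper-casing) are hypothetical.

```python
import queue
import threading
from typing import List

_STOP = object()  # sentinel telling a consumer to shut down


def producer(q: "queue.Queue", events: List[str]) -> None:
    """Publish events; the producer knows nothing about who consumes them."""
    for event in events:
        q.put(event)


def consumer(q: "queue.Queue", processed: List[str]) -> None:
    """Pull events until the stop sentinel arrives."""
    while True:
        item = q.get()
        if item is _STOP:
            q.task_done()
            break
        processed.append(item.upper())  # placeholder for real work
        q.task_done()


def run_pipeline(events: List[str], num_workers: int = 2) -> List[str]:
    q: "queue.Queue" = queue.Queue(maxsize=100)  # bounded: applies backpressure
    processed: List[str] = []
    workers = [threading.Thread(target=consumer, args=(q, processed))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    producer(q, events)
    for _ in workers:                   # one stop sentinel per consumer
        q.put(_STOP)
    for w in workers:
        w.join()
    return processed


if __name__ == "__main__":
    print(sorted(run_pipeline(["click", "impression", "view"])))
```

Because the two sides only share the queue, `num_workers` can be raised without touching the producer, which is the scaling benefit the bullet refers to. Output order is not guaranteed once there is more than one consumer.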
System Design Steps
- Understand and clarify the scope
- Propose high level design
- Design deep dive
- Wrap up: further discussion. Monitoring/rollback/error logs/scaling up