System Design for Ads/Recommendation Summary

2023-03-27

Basics

Basic Component

  1. DNS(domain network system): domain -> ip address.
  2. web server: process query and return response.
  3. database: separate data tier and computation tier.
  4. load balancer: split the traffic to different server.
  5. cache: between web server and DB to improve the data access in memory.
  6. CDN: content delivery network, deliver the static content to save the web server capability and improve the response time.
  7. data center: each has its own web servers, Db, caches. Share NoSQL.
  8. message queue: seperate the producer/consumers logic, to make web servers/specific service more scalable.
  9. logging: aggregate the errors message and monitor message together.
  10. metrics: host level metrics, aggregated level metrics(DB, cache performance), DAU, revenue.
  11. monitoring: monitor the health status of the system.
  12. automation: like CICD.

Items

  1. Vertical & horizontal scaling: single machine boost vs multi-machine boost. For the horizontal scaling, resharding data(consistent hashing), celebrity problem(hot key), join and de-normalization(join the normalized data to one).
  2. Data replication: replicate the database to improve the stability/availibility of DB.
  3. Stateless web tier(stateless architecture): all servers share storage NoSQL.
  4. Stateful architecture: each server is responsible for specific part of users. Hard to handle the failure cases.

How to scale

  1. Stateless architecture: keep web tier stateless
  2. Replication: build redundancy at every tier
  3. Cache: as much you can
  4. Multiple data center: support
  5. Sharding: scale the DB
  6. Message queue: separate the producer/consumers, split tiers into individual services
  7. Monitor/logging/automation/metrics

System Design Steps

  1. Understand and clarify the scope
  2. Propose high level design
  3. Design deep dive
  4. Wrap: further discussion.Monitor/Rollback/Error logs/Scal up