Domain Knowledge for Ads/Recommendation Summary

2022-01-01

Basics

Recommendation Metric

Overfitting

Cold start

Exploration & exploitation

Two tower

Items recall

Items ranking

Items re-ranking

Session modeling

Sequence model

Multi-task

Cross domain

Feature cross

Contrastive learning

Bias

Embedding

NAS

Distillation

Pre-train

Reinforcement learning

Issue 1: online vs offline performance discrepancy.

  1. Evaluate offline with GAUC; 2. Update the model more frequently; 3. Wait for more (delayed) labels.
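GAUC (group AUC) is commonly computed as the impression-weighted average of per-user AUC. The sketch below assumes that definition, assumes users with only one label class are skipped, and uses hypothetical names (`auc`, `gauc`); other weightings, such as by clicks, are also used in practice.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC for one group; ties count as 0.5.

    Returns None when the group has only one label class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gauc(user_ids, labels, scores):
    """Impression-weighted average of per-user AUC (assumed definition)."""
    groups = defaultdict(lambda: ([], []))
    for u, y, s in zip(user_ids, labels, scores):
        groups[u][0].append(y)
        groups[u][1].append(s)

    weighted_sum = 0.0
    total_weight = 0
    for ys, ss in groups.values():
        user_auc = auc(ys, ss)
        if user_auc is None:
            continue  # single-class users carry no ranking signal
        weighted_sum += user_auc * len(ys)
        total_weight += len(ys)
    return weighted_sum / total_weight if total_weight else None
```

Because ads are ranked per user online, a per-user metric like this tends to track online behaviour more closely than a global AUC that mixes scores across users.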

Issue 2: one epoch overfitting

Issue 3: feature skew

Issue 4: feature leakage (time travel)

The feature has a strong correlation with the label, so the train and eval metrics diverge.
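One rough way to screen for this is to flag features whose correlation with the label is suspiciously close to perfect. The sketch below is only an illustration: the helper names, the example feature names, and the 0.95 threshold are all assumptions, and Pearson correlation will miss non-linear leakage.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either side is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def suspicious_features(columns, labels, threshold=0.95):
    """Names of features whose |correlation| with the label meets the threshold.

    `threshold` is an arbitrary illustrative value, not a recommended setting.
    """
    return sorted(
        name
        for name, values in columns.items()
        if abs(pearson(values, labels)) >= threshold
    )


# Hypothetical data: dwell time is only non-zero after a click, so it leaks the label.
labels = [1, 0, 1, 0, 1, 0]
columns = {
    "post_click_dwell": [5, 0, 7, 0, 6, 0],
    "ad_age_days": [1, 2, 3, 4, 5, 6],
}
flagged = suspicious_features(columns, labels)
```

A flagged feature is only a candidate for manual review; the real check is whether its value could have been known at serving time.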

Issue 5: model blow-up

Issue 6: feature drift

This happens during special periods, such as Black Friday.

Issue 7: model stale/feature stale

Issue 8: traffic dominance

By the time the user sees the ad, the user has already converted.

Issue 9: strong bias features

Time- and position-related features. Either rerun the pCTR model with the position set to 0 to get the eval score, or use the logged position as a feature offline and set it to 0 online.
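The second option (logged position offline, position 0 online) can be sketched as a single feature-building step shared by training and serving. Everything here is hypothetical: the feature names, the weights, and the toy logistic scorer stand in for a real pCTR model.

```python
import math

DEFAULT_POSITION = 0  # value forced at serving time, as described above


def build_features(raw, serving):
    """Keep the logged position for training; overwrite it when serving."""
    feats = dict(raw)
    if serving:
        feats["position"] = DEFAULT_POSITION
    return feats


def predict_ctr(feats, weights, bias):
    """Toy logistic scorer standing in for the pCTR model."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical learned weights: lower slots (larger position) get fewer clicks.
weights = {"relevance": 2.0, "position": -0.5}
bias = -1.0

logged = {"relevance": 0.8, "position": 3}
offline_score = predict_ctr(build_features(logged, serving=False), weights, bias)
online_score = predict_ctr(build_features(logged, serving=True), weights, bias)
```

With the position fixed online, candidates are compared on their non-position features only, so the score no longer depends on where an item happened to be shown in the logs.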

Courses

Books/Papers

  1. On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models.
  2. DCN
  3. SE-Net

Blogs/Website