How do we use perplexity to choose between models?

"Perplexity guides our decisions on model scaling." — Alec Radford

How It Works:

Evaluate each candidate model on the same held-out dataset. Lower perplexity means the model assigns higher probability to the unseen text, i.e., predicts it better. Then select the model with the best trade-off between low perplexity and inference speed/cost.
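The core computation can be sketched as follows. Perplexity is the exponential of the average negative log-likelihood per token; the per-token log-probabilities below are made-up illustrative numbers, not outputs of any real model.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token log-probabilities from two candidate models
# scored on the same held-out text (closer to 0 = more confident).
small_model_logprobs = [-2.1, -1.8, -2.4, -1.9]
large_model_logprobs = [-1.4, -1.2, -1.6, -1.3]

print(perplexity(small_model_logprobs))  # higher perplexity: worse fit
print(perplexity(large_model_logprobs))  # lower perplexity: better fit
```

A sanity check on the formula: if a model assigns probability 1/2 to every token, its perplexity is exactly 2.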

Key Benefits:

  • Objective metric: Replaces subjective comparisons with a quantitative score.
  • Cost-effective: Avoid overspending on marginal perplexity gains.
  • Performance balance: Aligns accuracy with latency.

Real-World Use Cases:

  • Chatbots: Pick the smallest model that meets perplexity targets.
  • Summarization: Choose a model that balances coherence and speed.
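The chatbot use case above, picking the smallest model that meets a perplexity target, reduces to a simple filter-and-select. The candidate names, parameter counts, perplexities, and target below are all hypothetical placeholders.

```python
# Hypothetical candidates: (name, parameter count, measured held-out perplexity)
candidates = [
    ("tiny",   125_000_000,   28.4),
    ("small",  350_000_000,   19.7),
    ("medium", 1_300_000_000, 15.2),
]

TARGET_PPL = 20.0  # assumed quality bar for the application

# Keep only models that meet the target, then take the smallest by size.
eligible = [c for c in candidates if c[2] <= TARGET_PPL]
best = min(eligible, key=lambda c: c[1])
print(best[0])  # "small" meets the target at the lowest parameter count
```

In this sketch "medium" also meets the target, but "small" wins because it does so at lower inference cost.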

FAQs

How large should the evaluation set be?
Can perplexity predict user satisfaction?