Perplexity measures how surprised a language model is by the text it is asked to predict.
How It Works:
Perplexity quantifies a model's uncertainty over a text sequence: lower values mean the model predicts the next token more confidently.
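To make this concrete, here is a minimal sketch using the standard definition of perplexity as the exponential of the average negative log-probability per token. The function name `perplexity` and the probability lists are illustrative assumptions, not taken from any particular library; in practice the probabilities would come from the model's output for each actual next token.

```python
import math

def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to each actual token.

    token_probs is a hypothetical input: one probability in (0, 1] per token.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural log).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating turns the average back into an "effective number of choices".
    return math.exp(avg_nll)

# Made-up probabilities for illustration only.
confident = [0.9, 0.8, 0.95, 0.85]
uncertain = [0.2, 0.1, 0.25, 0.15]

print(perplexity(confident))  # lower: the model was rarely surprised
print(perplexity(uncertain))  # higher: the model was often surprised
```

A model that assigns probability 0.25 to every correct token has a perplexity of 4, as if it were choosing uniformly among four options at each step.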
Key Benefits:
Real-World Use Cases:
Lower perplexity is generally better, but very low perplexity may indicate overfitting.
Typical perplexity depends on the corpus: 20 to 50 for English news, higher for diverse web text.