Tokens are the atoms of language models.
How It Works:
A token is a chunk of text (word, subword, or character) that models process individually; tokenization breaks input into these units before inference.
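As a minimal sketch, tokenization can be modeled as greedy longest-match lookup against a fixed vocabulary. Real tokenizers (e.g., BPE) learn their vocabularies from data; the toy vocabulary below is an assumption for illustration only:

```python
# Toy greedy longest-match tokenizer over a hypothetical vocabulary.
# Production tokenizers (BPE, WordPiece, SentencePiece) learn merges from data.
VOCAB = {"run", "ning", "the", "cat", "token", "ize"}

def tokenize(text: str) -> list[str]:
    """Split text into the longest matching vocabulary entries, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match first.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            # Fall back to a single character for out-of-vocabulary input.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("running"))  # ['run', 'ning']
```

The greedy strategy here is a simplification; real BPE applies learned merge rules rather than longest-match, but the output shape is the same: a sequence of subword units the model processes one by one.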
Key Benefits:
Subword tokenization balances vocabulary size against sequence length: common words map to a single token, while rare or unseen words are composed from known pieces instead of being dropped.
Real-World Use Cases:
Token counts vary by tokenizer; "running" may be one token or split into "run" + "ning," so the same text can cost different amounts across models.
Context-window budgeting: models cap inputs at a fixed token count (e.g., 8,192 tokens), so long prompts must be measured and trimmed to fit.
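A hedged sketch of budgeting against that fixed cap, assuming you already have a token count from your tokenizer (the 512-token output reserve is an illustrative assumption, not a standard value):

```python
MAX_TOKENS = 8192  # example context window from the text above

def fits_in_context(prompt_tokens: int, reserved_for_output: int = 512) -> bool:
    """Check whether a prompt leaves room for the model's response.

    `reserved_for_output` is a hypothetical budget for generated tokens;
    tune it to your application's expected response length.
    """
    return prompt_tokens + reserved_for_output <= MAX_TOKENS

print(fits_in_context(7000))  # True: 7000 + 512 fits in 8192
print(fits_in_context(8000))  # False: 8000 + 512 exceeds 8192
```

In practice you would count tokens with the same tokenizer the target model uses, since counts differ between tokenizers.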