Token
TOH-kun
The smallest unit of text an LLM processes. Roughly 4 characters or 3/4 of a word. Tokens determine cost and context limits.
A token is the smallest unit of text an LLM processes. It is not a word. It is not a character. It is something in between. The word "hamburger" is three tokens: "ham," "bur," "ger." The word "the" is one token. A typical English sentence of 15 words is about 20 tokens.
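Exact counts depend on the model's tokenizer (OpenAI publishes tiktoken for its models), but the 4-characters-per-token rule of thumb is enough for a quick estimate. A minimal sketch:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the common rule of thumb:
    one token is roughly 4 characters of English text."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("hamburger"))  # → 2 (a real tokenizer may differ)
print(estimate_tokens("the"))        # → 1
```

This is a heuristic, not ground truth: for billing or hard context-window limits, count with the model's actual tokenizer.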
Tokens matter because everything in AI is priced and measured in tokens. OpenAI charges per million input and output tokens; Anthropic prices Claude the same way. Your context window is measured in tokens. The cost of a single API call depends on how many tokens go in and how many come out.
GPT-4o charges $2.50 per million input tokens and $10 per million output tokens. If you send a 2,000-token prompt and get a 500-token response, that call costs $0.01. One penny. Do that a million times and it is $10,000. Token economics determine whether your AI feature is profitable.
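The arithmetic generalizes to any model: multiply token counts by the per-million-token price. A small helper, using the GPT-4o prices quoted above:

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars for one API call, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# 2,000 tokens in, 500 tokens out at $2.50/$10 per million:
print(call_cost(2_000, 500, 2.50, 10.00))  # → 0.01
```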
Examples
Estimating API costs for a product feature.
Your AI chatbot handles 10,000 conversations per day. Average conversation: 3,000 input tokens, 800 output tokens. Using Claude 3.5 Sonnet at $3/$15 per million tokens, daily cost is $90 + $120 = $210 per day. About $6,300 per month.
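The same chatbot math as a quick script, with the volumes and Claude 3.5 Sonnet prices taken from the scenario above:

```python
conversations_per_day = 10_000

# 3,000 input tokens per conversation at $3 per million tokens.
input_cost = conversations_per_day * 3_000 / 1_000_000 * 3    # $90/day
# 800 output tokens per conversation at $15 per million tokens.
output_cost = conversations_per_day * 800 / 1_000_000 * 15    # $120/day

daily = input_cost + output_cost
print(daily)       # → 210.0
print(daily * 30)  # → 6300.0 per month
```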
A developer hits the context window limit.
A developer pastes their entire codebase (50,000 tokens) into Claude's 200k context window along with a 2,000-token prompt. It works. They then try a 150,000-token codebase with GPT-4o's 128k window. It fails: the input exceeds the context window.
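A pre-flight check catches this before the API call does. A sketch, using the window sizes quoted above:

```python
def fits_context(input_tokens: int, window: int, reserved_output: int = 0) -> bool:
    """Check whether a prompt fits a model's context window,
    optionally reserving room for the response."""
    return input_tokens + reserved_output <= window

# The two scenarios from the text:
print(fits_context(50_000 + 2_000, 200_000))   # Claude's 200k window → True
print(fits_context(150_000 + 2_000, 128_000))  # GPT-4o's 128k window → False
```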
Tokenization differences across languages.
English text tokenizes efficiently: roughly 1.3 tokens per word. Chinese, Japanese, and Korean text can use 2-3x more tokens for the same semantic content. A 1,000-word English document might be 1,300 tokens. The equivalent content in Japanese might be 3,000 tokens.
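The multipliers below are rough illustrations taken from the figures above, not measured tokenizer output; real counts vary by tokenizer and text:

```python
# Approximate tokens per English-equivalent word, per the rough
# figures above. Illustrative only.
TOKENS_PER_WORD = {"english": 1.3, "japanese": 3.0}

def estimate_doc_tokens(word_count: int, language: str) -> int:
    """Rough token estimate for a document, by language."""
    return round(word_count * TOKENS_PER_WORD[language])

print(estimate_doc_tokens(1_000, "english"))   # → 1300
print(estimate_doc_tokens(1_000, "japanese"))  # → 3000
```

The practical upshot: budgets and context-window planning tuned on English content can be off by 2-3x for CJK languages.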
Related terms
Context window
The maximum amount of text an LLM can process in a single request. Measured in tokens. Bigger windows handle more information at once.
Large language model (LLM)
A neural network trained on massive text data to generate and understand language. The technology behind ChatGPT, Claude, and Gemini.
Prompt engineering
Writing instructions that get the best output from an AI model. The difference between a useless response and a useful one.