LLM Token Counter & Cost Estimator
Estimate tokens and API cost for GPT-4o, Claude, Gemini, and more — paste any prompt
- Input tokens: 36
- Context used: 0.0%
- Input cost: $0.000090
- Total cost: $0.002090
| | Tokens | Rate (per 1M) | Cost |
|---|---|---|---|
| Input | 36 | $2.50 | $0.000090 |
| Output | 200 | $10.00 | $0.002000 |
| Total | 236 | | $0.002090 |
Token counts are approximations based on a characters-per-token heuristic, not a true BPE run. For exact counts, use the official tokenizer for each model. Prices are as of Q2 2025; verify current rates on each provider's pricing page.
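The cost figures in the table are plain per-million-token arithmetic. A minimal sketch (the $2.50 / $10.00 rates are the GPT-4o-style rates shown in the table, not universal constants):

```typescript
// Cost in USD = tokens / 1,000,000 * rate-per-million-tokens.
function apiCost(inputTokens: number, outputTokens: number,
                 inputRatePerM: number, outputRatePerM: number): number {
  const inputCost = (inputTokens / 1_000_000) * inputRatePerM;
  const outputCost = (outputTokens / 1_000_000) * outputRatePerM;
  return inputCost + outputCost;
}

// Reproduce the table: 36 input + 200 output tokens at $2.50 / $10.00 per 1M.
const total = apiCost(36, 200, 2.5, 10.0);
console.log(total.toFixed(6)); // "0.002090"
```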
Pasting a long document into an LLM API and wondering how many tokens it will use, and what it will cost? Paste your text, pick a model, and instantly see an estimated token count, the percentage of the context window it occupies, and the estimated input cost in USD. Supports GPT-4o, GPT-4 Turbo, Claude Opus, Claude Sonnet, Claude Haiku, Gemini 1.5 Pro, and more.
How to Count Tokens and Estimate Cost
Paste your text, pick your model, and read the estimate instantly.
Paste your text or prompt
Paste any text — a system prompt, a document, a conversation, or a code snippet — into the input area. The character count updates immediately.
Select a model
Choose the target model from the dropdown. Each model uses a slightly different tokenizer and has different pricing; the estimate updates with every selection.
Read the token estimate and cost
The panel shows the estimated token count, context window usage as a percentage, and the estimated input cost in USD. Use this to optimize prompts before making expensive API calls.
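The three numbers in the panel reduce to a few lines of arithmetic. This is an illustrative sketch, not the tool's actual source; the 4-chars-per-token factor, 128,000-token window, and $2.50-per-1M rate are assumed GPT-4o-style values:

```typescript
interface Estimate {
  tokens: number;       // estimated token count
  contextPct: number;   // percentage of the context window used
  inputCostUsd: number; // estimated input cost in USD
}

// charsPerToken, contextWindow, and inputRatePerM are per-model
// calibration values selected from the model dropdown.
function estimate(text: string, charsPerToken: number,
                  contextWindow: number, inputRatePerM: number): Estimate {
  const tokens = Math.ceil(text.length / charsPerToken);
  return {
    tokens,
    contextPct: (tokens / contextWindow) * 100,
    inputCostUsd: (tokens / 1_000_000) * inputRatePerM,
  };
}

const e = estimate("Summarize the following report in three bullet points.",
                   4, 128_000, 2.5);
console.log(e.tokens); // 14 (54 chars / 4, rounded up)
```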
Features
Supports GPT-4o, GPT-4 Turbo, GPT-3.5, o1, o3-mini
Supports Claude Opus, Sonnet, and Haiku (Anthropic)
Supports Gemini 1.5 Pro and Gemini Flash
Token estimate with configurable chars-per-token calibration
Context window usage shown as a percentage
Input cost estimated in USD
Free, no API key required, no data sent to any server
Related Tools
Frequently Asked Questions
How are tokens counted?
Token count is estimated using a characters-per-token calibration factor specific to each model family. GPT models average roughly 4 characters per token; Claude models average roughly 3.8. This is an approximation; for exact counts, use the official tokenizer library (tiktoken for OpenAI, or the Anthropic token counting API).
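A minimal sketch of that calibration-based estimate, using the approximate factors quoted above (real tokenizer output will differ, especially for code and non-English text):

```typescript
// Approximate characters-per-token by model family.
// These are the rough calibration values quoted above, not exact figures.
const CHARS_PER_TOKEN: Record<"gpt" | "claude", number> = {
  gpt: 4.0,
  claude: 3.8,
};

function estimateTokens(text: string, family: "gpt" | "claude"): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN[family]);
}

// The same 400-character text yields different estimates per family.
const sample = "x".repeat(400);
console.log(estimateTokens(sample, "gpt"));    // 100 (400 / 4.0)
console.log(estimateTokens(sample, "claude")); // 106 (400 / 3.8, rounded up)
```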
Why does token count differ between models?
Each model family uses a different tokenizer (BPE vocabulary). GPT models use tiktoken encodings; Claude models use a different vocabulary. The calibration factors in this tool reflect typical English prose; code and non-English text may deviate significantly.
Does this tool send my text to any API?
No. All calculation is done in the browser using simple arithmetic. Your text is never sent to OpenAI, Anthropic, Google, or any server.
What is a context window?
The context window is the maximum number of tokens a model can process in a single request — including both the input (prompt + history) and the output (completion). Exceeding it causes the request to fail or the oldest messages to be truncated.
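A simple fit check follows directly from that definition: input tokens plus the output tokens you reserve must not exceed the window. The 128,000-token window below is a hypothetical GPT-4o-class value:

```typescript
// A request fits only if prompt + history + reserved completion tokens
// all fit inside the model's context window.
function fitsContext(inputTokens: number, maxOutputTokens: number,
                     contextWindow: number): boolean {
  return inputTokens + maxOutputTokens <= contextWindow;
}

console.log(fitsContext(120_000, 4_096, 128_000)); // true  (124,096 <= 128,000)
console.log(fitsContext(126_000, 4_096, 128_000)); // false (130,096 > 128,000)
```

Checking this before sending a request avoids a failed call or silent truncation of the oldest messages.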