LLM Token Counter & Cost Estimator
Estimate tokens and API cost for GPT-4o, Claude, Gemini, and more — paste any prompt
- Input tokens: 36
- Context used: 0.0%
- Input cost: $0.000090
- Total cost: $0.002090
| | Tokens | Rate (per 1M) | Cost |
|---|---|---|---|
| Input | 36 | $2.50 | $0.000090 |
| Output | 200 | $10.00 | $0.002000 |
| Total | 236 | | $0.002090 |
Token counts are approximations based on a characters-per-token heuristic, not a true BPE run. For exact counts, use the official tokenizer for each model. Prices are as of Q2 2025; verify current rates on each provider's pricing page.
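The cost figures in the table are plain per-million-token arithmetic. A minimal sketch (the $2.50 / $10.00 rates are the GPT-4o-style rates shown in the table, not universal constants):

```typescript
// Cost in USD = tokens / 1,000,000 * rate-per-million-tokens.
function apiCost(inputTokens: number, outputTokens: number,
                 inputRatePerM: number, outputRatePerM: number): number {
  const inputCost = (inputTokens / 1_000_000) * inputRatePerM;
  const outputCost = (outputTokens / 1_000_000) * outputRatePerM;
  return inputCost + outputCost;
}

// Reproduce the table: 36 input + 200 output tokens at $2.50 / $10.00 per 1M.
const total = apiCost(36, 200, 2.5, 10.0);
console.log(total.toFixed(6)); // "0.002090"
```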
Pasting a long document into an LLM API and wondering how many tokens it will use, and what it will cost? Paste your text, pick a model, and instantly see an estimated token count, the percentage of the context window it occupies, and the estimated input cost in USD. Supports GPT-4o, GPT-4 Turbo, Claude Opus, Claude Sonnet, Claude Haiku, Gemini 1.5 Pro, and more.
How to Count Tokens and Estimate Cost
Paste your text, pick your model, and read the estimate instantly.
Paste your text or prompt
Paste any text — a system prompt, a document, a conversation, or a code snippet — into the input area. The character count updates immediately.
Select a model
Choose the target model from the dropdown. Each model uses a slightly different tokenizer and has different pricing; the estimate updates with every selection.
Read the token estimate and cost
The panel shows the estimated token count, context window usage as a percentage, and the estimated input cost in USD. Use this to optimize prompts before making expensive API calls.
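The three numbers in the panel reduce to a few lines of arithmetic. This is an illustrative sketch, not the tool's actual source; the 4-chars-per-token factor, 128,000-token window, and $2.50-per-1M rate are assumed GPT-4o-style values:

```typescript
interface Estimate {
  tokens: number;       // estimated token count
  contextPct: number;   // percentage of the context window used
  inputCostUsd: number; // estimated input cost in USD
}

// charsPerToken, contextWindow, and inputRatePerM are per-model
// calibration values selected from the model dropdown.
function estimate(text: string, charsPerToken: number,
                  contextWindow: number, inputRatePerM: number): Estimate {
  const tokens = Math.ceil(text.length / charsPerToken);
  return {
    tokens,
    contextPct: (tokens / contextWindow) * 100,
    inputCostUsd: (tokens / 1_000_000) * inputRatePerM,
  };
}

const e = estimate("Summarize the following report in three bullet points.",
                   4, 128_000, 2.5);
console.log(e.tokens); // 14 (54 chars / 4, rounded up)
```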
Features
Supports GPT-4o, GPT-4 Turbo, GPT-3.5, o1, o3-mini
Supports Claude Opus, Sonnet, and Haiku (Anthropic)
Supports Gemini 1.5 Pro and Gemini Flash
Token estimate with configurable chars-per-token calibration
Context window usage shown as a percentage
Input cost estimated in USD
Free, no API key required, no data sent to any server
Related Tools
Frequently Asked Questions
How are tokens counted?
Token count is estimated using a characters-per-token calibration factor specific to each model family. GPT models average roughly 4 characters per token; Claude models average roughly 3.8. This is an approximation; for exact counts, use the official tokenizer library (tiktoken for OpenAI, or the Anthropic token counting API).
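A minimal sketch of that calibration-based estimate, using the approximate factors quoted above (real tokenizer output will differ, especially for code and non-English text):

```typescript
// Approximate characters-per-token by model family.
// These are the rough calibration values quoted above, not exact figures.
const CHARS_PER_TOKEN: Record<"gpt" | "claude", number> = {
  gpt: 4.0,
  claude: 3.8,
};

function estimateTokens(text: string, family: "gpt" | "claude"): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN[family]);
}

// The same 400-character text yields different estimates per family.
const sample = "x".repeat(400);
console.log(estimateTokens(sample, "gpt"));    // 100 (400 / 4.0)
console.log(estimateTokens(sample, "claude")); // 106 (400 / 3.8, rounded up)
```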
Why does token count differ between models?
Each model family uses a different tokenizer (BPE vocabulary). GPT models use tiktoken encodings; Claude models use a different vocabulary. The calibration factors in this tool reflect typical English prose; code and non-English text may deviate significantly.
Does this tool send my text to any API?
No. All calculation is done in the browser using simple arithmetic. Your text is never sent to OpenAI, Anthropic, Google, or any server.
What is a context window?
The context window is the maximum number of tokens a model can process in a single request — including both the input (prompt + history) and the output (completion). Exceeding it causes the request to fail or the oldest messages to be truncated.
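A simple fit check follows directly from that definition: input tokens plus the output tokens you reserve must not exceed the window. The 128,000-token window below is a hypothetical GPT-4o-class value:

```typescript
// A request fits only if prompt + history + reserved completion tokens
// all fit inside the model's context window.
function fitsContext(inputTokens: number, maxOutputTokens: number,
                     contextWindow: number): boolean {
  return inputTokens + maxOutputTokens <= contextWindow;
}

console.log(fitsContext(120_000, 4_096, 128_000)); // true  (124,096 <= 128,000)
console.log(fitsContext(126_000, 4_096, 128_000)); // false (130,096 > 128,000)
```

Checking this before sending a request avoids a failed call or silent truncation of the oldest messages.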