▶ Interactive Lab

Tokenizer Compression Comparison

Same text, different tokenizers.

Advertisement
GPT-4 tiktoken ~4 chars/token. Llama ~3.8. Gemma ~3.6 (huge vocab).

What you're seeing

Compression rate affects context window utilization and inference cost.

★ KEY TAKEAWAY
Tokenization compression varies: GPT-4 ~4 chars/token, Llama ~3.8, Gemma ~3.5 (huge vocab). Affects context window utilization and cost.
▶ WHAT TO TRY
  • Paste your own text and tokenize.
  • Bigger vocabs compress better but inflate the embedding matrix.