What to cache

Long system prompts. Few-shot examples. RAG documents reused across queries. Tool definitions.

Advertisement

Cache breakpoints

Anthropic: up to 4 cache breakpoints per prompt. Content up to each breakpoint cached separately.

Advertisement

TTL

Typically 5 minutes since last hit. Some providers offer longer TTL for higher price.