LLMs are surprisingly good at calling tools — but only if the tool descriptions are written for them. Bad tool descriptions cause silent failures; good ones produce reliable agents. Five rules cover most production needs.
Tool name = verb_object
Use names like get_weather, send_email, create_invoice. The model treats the name as its primary signal for when to call a tool. Avoid generic names (process, handle): the model can't infer when to use them.
Description: what + when
A good description has two parts: 'Get current weather for a city.' (what) and 'Use when the user asks about weather, temperature, or precipitation.' (when). Most teams write only the WHAT and wonder why the model misroutes queries. The WHEN is what disambiguates similar tools.
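A minimal sketch of the first two rules together. The dict layout below is an assumption rather than any specific vendor's schema, and get_forecast and missing_when are hypothetical names added only to show how the WHEN clause separates two similar tools.

```python
# Hypothetical tool definitions. The "name"/"description" layout is an
# assumption for illustration, not a particular provider's format.
TOOLS = [
    {
        "name": "get_weather",  # verb_object
        "description": (
            "Get current weather for a city. "  # what
            "Use when the user asks about weather, temperature, "
            "or precipitation right now."  # when
        ),
    },
    {
        "name": "get_forecast",  # hypothetical sibling tool
        "description": (
            "Get a multi-day weather forecast for a city. "
            "Use when the user asks about future days, e.g., 'this weekend'."
        ),
    },
]


def missing_when(tools):
    """Return the names of tools whose description has no 'Use when' clause."""
    return [t["name"] for t in tools if "use when" not in t["description"].lower()]
```

A check like missing_when can run in CI so that a description with only the WHAT is caught before it reaches the model.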
Argument descriptions: examples > types
A type alone ('string') is barely useful. A description such as location: string, e.g., 'Bangalore' or 'New York, NY' tells the model what format works. For enums, list all values: unit: 'celsius' or 'fahrenheit'.
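One way to write this down is a JSON-Schema-style parameter block. This is a sketch under assumptions: the exact wrapper a given model API expects may differ, marking location as required is a choice made here, and validate_args is a hypothetical helper.

```python
# JSON-Schema-style parameters for get_weather. Each description carries
# an example, not just a type. "required" is an assumption for illustration.
GET_WEATHER_PARAMETERS = {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "City name, e.g., 'Bangalore' or 'New York, NY'.",
        },
        "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],  # list every allowed value
            "description": "Temperature unit: 'celsius' or 'fahrenheit'.",
        },
    },
    "required": ["location"],
}


def validate_args(args, schema):
    """Return a list of readable problems; an empty list means the args pass."""
    problems = []
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument '{name}'")
    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown argument '{name}'")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"'{name}' must be one of {spec['enum']}, got {value!r}")
    return problems
```

Returning the problems as plain sentences lets the agent read them and correct its next call.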
Timeout + retry
Most tool calls involve a network round-trip that can hang, so set an explicit timeout (5-10s for fast APIs, 30-60s for LLM-as-tool). On timeout, return a clear error to the agent: 'Tool timed out; try a simpler query'. Retry only idempotent tools.
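A minimal sketch of a timeout-and-retry wrapper, using only the standard library. The name call_tool, its parameters, and the result dict shape are assumptions for illustration. One caveat: a thread cannot be killed, so a timed-out call keeps running in the background. A real HTTP client's own timeout setting is usually the better first line of defense.

```python
import concurrent.futures


def call_tool(fn, args, timeout_s=10.0, idempotent=False, max_retries=2):
    """Run fn(**args) with a timeout. Retry on timeout only if idempotent.

    Hypothetical wrapper: returns {"ok": True, "result": ...} on success or
    {"ok": False, "error": ...} once all attempts have timed out.
    Exceptions raised by fn itself are not caught here.
    """
    attempts = 1 + (max_retries if idempotent else 0)
    for _ in range(attempts):
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = executor.submit(fn, **args)
        try:
            return {"ok": True, "result": future.result(timeout=timeout_s)}
        except concurrent.futures.TimeoutError:
            pass  # fall through to the next attempt, if any
        finally:
            # Do not block on the stuck worker; it finishes in the background.
            executor.shutdown(wait=False)
    # A clear, actionable message the agent can read and act on.
    return {"ok": False, "error": "Tool timed out. Try a simpler query."}
```

Non-idempotent tools get exactly one attempt, so a slow send_email is reported as a timeout rather than silently sent twice.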
Idempotency
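If a tool sends an email, never auto-retry blindly: duplicates are user-visible. Instead, include an idempotency_key argument that the agent fills in, and have the tool dedupe on it server-side, so a repeated call with the same key has no additional effect. This is the same pattern payment processors use.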
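The dedupe step can be sketched as follows. EmailTool, its in-memory store, and the result fields are hypothetical. A real service would persist keys in a shared store and expire them after some window.

```python
import uuid


class EmailTool:
    """Hypothetical email tool that dedupes on idempotency_key.

    The in-memory dict stands in for a durable server-side store.
    """

    def __init__(self):
        self._results = {}  # idempotency_key -> result of the first call
        self.sent = []      # stand-in for actual delivery

    def send_email(self, to, subject, body, idempotency_key):
        # A repeated key returns the stored result and sends nothing new.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.sent.append((to, subject, body))
        result = {"status": "sent", "message_id": str(uuid.uuid4())}
        self._results[idempotency_key] = result
        return result


# Usage: the agent generates one key per intended email and reuses it on retry.
tool = EmailTool()
key = str(uuid.uuid4())
first = tool.send_email("a@example.com", "Hi", "Hello there", key)
retry = tool.send_email("a@example.com", "Hi", "Hello there", key)
```

With the key in place, a retry after a timeout is safe: the second call returns the first call's result instead of delivering a duplicate.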