Model Context Protocol (MCP) servers expose tools to agents. The basic spec is simple; the design choices that separate toy servers from production-grade ones live in tool surface design, error semantics, and observability.

Tool surface design

Narrow tools beat fat ones. A pair such as 'search_documents' and 'get_document' beats a single 'documents(verb='search'|'get')' tool, because each narrow tool has one clear contract. Document that contract precisely; the model reads it.
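As a rough illustration of the narrow-tool approach, the sketch below declares the two tools as separate entries, each with its own description and JSON Schema. The 'name', 'description', and 'inputSchema' fields follow the shape MCP uses when listing tools; the in-memory DOCS corpus, the argument names, and the handler bodies are assumptions made up for this example, and no particular MCP SDK is used.

```python
# Hypothetical in-memory corpus; a real server would query a document store.
DOCS = {
    "doc-1": "Quarterly revenue report",
    "doc-2": "Onboarding guide for new engineers",
}

# Two narrow tools, each with its own contract, instead of one tool with a 'verb' switch.
TOOLS = [
    {
        "name": "search_documents",
        "description": (
            "Search documents by keyword (case-insensitive substring match). "
            "Returns at most `limit` document IDs."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Keyword to look for."},
                "limit": {"type": "integer", "minimum": 1, "maximum": 100, "default": 10},
            },
            "required": ["query"],
        },
    },
    {
        "name": "get_document",
        "description": "Fetch the full text of one document by its ID.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "document_id": {"type": "string", "description": "ID from search_documents."},
            },
            "required": ["document_id"],
        },
    },
]


def search_documents(query: str, limit: int = 10) -> list:
    """Return the IDs of documents whose text contains the query."""
    needle = query.lower()
    matches = [doc_id for doc_id, text in DOCS.items() if needle in text.lower()]
    return matches[:limit]


def get_document(document_id: str) -> str:
    """Return the text of a single document."""
    return DOCS[document_id]
```

Because each schema lists only the arguments that one operation needs, the model never has to work out which arguments apply to which verb.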

Error semantics

Errors should teach. 'Invalid argument' is useless. "Argument 'limit' must be 1-100; received 500" lets the model retry correctly. The same applies to upstream service failures: report what failed in terms the model can act on.
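A minimal sketch of a teaching error, assuming a hand-rolled validator rather than any specific MCP SDK. ToolInputError, validate_limit, and upstream_error_message are hypothetical names introduced here; the point is only that the message names the argument, the allowed range, and the value received.

```python
from typing import Optional


class ToolInputError(ValueError):
    """Hypothetical error type whose message is returned to the model verbatim."""


def validate_limit(arguments: dict, minimum: int = 1, maximum: int = 100) -> int:
    """Return a valid 'limit', or raise an error that says how to fix the call."""
    value = arguments.get("limit")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ToolInputError(
            f"Argument 'limit' must be an integer from {minimum} to {maximum}; "
            f"received {value!r}"
        )
    if not minimum <= value <= maximum:
        raise ToolInputError(
            f"Argument 'limit' must be {minimum}-{maximum}; received {value}"
        )
    return value


def upstream_error_message(service: str, status: int,
                           retry_after_s: Optional[int] = None) -> str:
    """Describe an upstream failure so the model knows what failed and when to retry."""
    message = f"Upstream service '{service}' returned HTTP {status}."
    if retry_after_s is not None:
        message += f" Retry after {retry_after_s} seconds."
    return message
```

Calling validate_limit({"limit": 500}) raises ToolInputError with the message "Argument 'limit' must be 1-100; received 500", which the server can pass back as the tool's error text.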

Observability

Log each call with the caller (agent ID), the arguments (with PII scrubbed), the result, and the latency. Correlate traces across the multiple tool calls in one conversation. Track metrics per tool: call count, error rate, and p99 latency.
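One way to sketch this with only the Python standard library is a wrapper that every tool call passes through. Everything named here is an assumption for illustration: call_tool, the PII_KEYS list, the trace_id argument (standing in for a per-conversation correlation ID), and the in-process counters, which a production server would likely replace with a real metrics backend.

```python
import json
import logging
import time
from collections import defaultdict

logger = logging.getLogger("mcp.tools")

# Assumed list of argument names to redact; adjust to your own data.
PII_KEYS = {"email", "phone", "ssn"}

# In-process metrics, keyed by tool name.
CALLS = defaultdict(int)
ERRORS = defaultdict(int)
LATENCIES_MS = defaultdict(list)


def scrub(arguments: dict) -> dict:
    """Replace the values of known PII keys before logging."""
    return {k: "[REDACTED]" if k in PII_KEYS else v for k, v in arguments.items()}


def call_tool(name, handler, arguments, agent_id, trace_id):
    """Run a tool handler and record one structured log line plus metrics."""
    start = time.perf_counter()
    status = "ok"
    result = None
    try:
        result = handler(**arguments)
        return result
    except Exception:
        status = "error"
        ERRORS[name] += 1
        raise
    finally:
        latency_ms = (time.perf_counter() - start) * 1000
        CALLS[name] += 1
        LATENCIES_MS[name].append(latency_ms)
        logger.info(json.dumps({
            "tool": name,
            "agent_id": agent_id,
            "trace_id": trace_id,  # same ID for every call in one conversation
            "arguments": scrub(arguments),
            "result": repr(result)[:200],  # truncated; results may also need scrubbing
            "status": status,
            "latency_ms": round(latency_ms, 3),
        }, default=str))


def p99(samples):
    """Nearest-rank 99th percentile; returns None when there are no samples."""
    if not samples:
        return None
    ordered = sorted(samples)
    rank = (99 * len(ordered) + 99) // 100  # integer form of ceil(0.99 * n)
    return ordered[rank - 1]


def error_rate(name):
    """Fraction of calls to a tool that raised an exception."""
    return ERRORS[name] / CALLS[name] if CALLS[name] else 0.0
```

A call such as call_tool("get_document", get_document, {"document_id": "doc-1"}, agent_id, trace_id) then yields one JSON log line per invocation, and filtering logs by trace_id reconstructs the sequence of tool calls in a conversation.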

Narrow tools, teaching errors, full observability: without them, the 'works on my machine' MCP server fails in production.