MCP Server Design: Patterns That Work in 2026

How to design an MCP server an AI agent can actually use: tool schemas, error semantics, auth, granularity and performance patterns for 2026.

By Rob · 18 June 2026 · 7 min read

Most MCP servers fail for the same reason: they are built like an internal API, then handed to a model that has none of the context a human developer would. The fix is a shift in mindset. An MCP server is a prompt surface as much as a service, and the design patterns that matter are the ones that make a model use it correctly without hand-holding. Here are the ones that hold up.

What is an MCP server actually for?

The Model Context Protocol (MCP) is an open standard from Anthropic for connecting AI models to external tools and data through a small, consistent interface. An MCP server exposes three kinds of capability: tools (actions the model can call), resources (data the model can read), and prompts (reusable templates). The client - an application hosting a model such as Claude, or an agent framework - discovers these at connection time, and the model decides what to use. Client and server communicate over JSON-RPC 2.0 (a lightweight remote-procedure-call protocol encoded in JSON).
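To make the wire format concrete, here is a minimal sketch of the message a client sends when the model decides to call a tool. The method name tools/call and the name/arguments parameters follow the MCP specification as I understand it, so check them against the spec revision you target. The helper make_tool_call and the example email address are my own placeholders.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request asking the server to run one tool."""
    message = {
        "jsonrpc": "2.0",          # protocol version marker required by JSON-RPC 2.0
        "id": request_id,          # lets the client match the response to this request
        "method": "tools/call",    # MCP method for invoking a tool (assumed per spec)
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical call: the model wants a customer's orders.
request = make_tool_call(1, "search_orders", {"email": "ada@example.com"})
print(request)
```

Everything the model contributes is in params: a tool name and an arguments object. That is why the names and descriptions you ship carry so much weight.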

The crucial point for design is that the model picks what to call based almost entirely on the names and descriptions you ship. There is no engineer in the loop reading your docs. That single fact drives every pattern below: you are writing for an LLM that will act on your schema literally. The companion to this piece, why most MCP servers fail, covers the failure modes; this one is about getting the design right from the start.

How should you design tool schemas?

Tool schemas are the highest-leverage surface in the whole server. Three rules carry most of the weight. First, name tools for intent, not implementation - search_orders beats query_db, because the model reasons about what it wants to achieve. Second, write the description as a mini spec: say what the tool does, when to use it and when not to, and what comes back. A description that reads 'gets orders' is useless; 'look up a customer's orders by email; returns up to 50 most recent; use get_order_detail for full line items' tells the model how to chain its next call. Third, type every parameter tightly and document its format inline, because loose types invite malformed calls.
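The three rules can be sketched as a tool definition plus a small lint check. The name, description and inputSchema keys match how MCP describes tools as far as I know. The limit parameter, the lint_tool helper and its ten-word threshold are illustrative assumptions, not part of any SDK.

```python
SEARCH_ORDERS_TOOL = {
    # Rule 1: named for intent, not implementation.
    "name": "search_orders",
    # Rule 2: the description is a mini spec (what, when, what comes back).
    "description": (
        "Look up a customer's orders by email. Returns up to 50 most recent "
        "orders. Use get_order_detail for full line items."
    ),
    # Rule 3: tight types, with the format documented inline.
    "inputSchema": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "format": "email",
                "description": "Customer email address, e.g. ada@example.com",
            },
            "limit": {  # hypothetical parameter, for illustration only
                "type": "integer",
                "minimum": 1,
                "maximum": 50,
                "description": "Maximum number of orders to return (1-50)",
            },
        },
        "required": ["email"],
        "additionalProperties": False,
    },
}


def lint_tool(tool: dict) -> list[str]:
    """Return a list of schema problems a model would likely trip over."""
    problems = []
    if len(tool.get("description", "").split()) < 10:
        problems.append("description too short to act as a spec")
    properties = tool.get("inputSchema", {}).get("properties", {})
    for name, spec in properties.items():
        if "type" not in spec:
            problems.append(f"parameter '{name}' has no type")
        if "description" not in spec:
            problems.append(f"parameter '{name}' has no description")
    return problems


print(lint_tool(SEARCH_ORDERS_TOOL))
```

A check like this can run in CI, so a vague description fails the build instead of failing in front of the model.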

This is context engineering applied to an API boundary - the same discipline I cover in context engineering for AI agents. The model only knows what your schema tells it.

What does good error handling look like?

Errors are where most servers quietly sabotage the agent. A model cannot recover from a raw stack trace or a generic 500 - it just retries blindly or gives up. Return errors the model can reason about: a clear message that states what went wrong and what to do instead. 'Order ID not found - check the format (8 digits) or use search_orders to find it' lets the model self-correct on the next turn; 'Internal Server Error' does not.

Distinguish the two failure classes explicitly. A user-recoverable error (bad input, missing record) should tell the model how to fix the call. A system error (the upstream service is down) should say so plainly so the model stops retrying and reports back rather than looping. Structured, actionable error text is one of the biggest reliability wins available, and it costs almost nothing to add.
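Here is one way the two failure classes might look in a tool handler. The isError flag and the content list follow the MCP tool-result shape as I understand it. The [recoverable] and [system] prefixes, the UpstreamDown exception and the in-memory ORDERS store are assumptions made for this sketch.

```python
import json
import re


class UpstreamDown(Exception):
    """Hypothetical exception raised when the order service is unreachable."""


ORDERS = {"12345678": {"status": "shipped"}}  # stand-in for a real backend


def error_result(kind: str, message: str) -> dict:
    """Wrap an error as a tool result the model can read and act on."""
    return {"isError": True, "content": [{"type": "text", "text": f"[{kind}] {message}"}]}


def get_order_detail(order_id: str, fetch=ORDERS.get) -> dict:
    # Recoverable: bad input. Tell the model how to fix the call.
    if not re.fullmatch(r"\d{8}", order_id):
        return error_result(
            "recoverable",
            "Order ID must be exactly 8 digits. Use search_orders to find it.",
        )
    try:
        order = fetch(order_id)
    except UpstreamDown:
        # System: nothing the model can fix. Tell it to stop retrying.
        return error_result(
            "system",
            "The order service is unavailable. Do not retry; report this to the user.",
        )
    if order is None:
        # Recoverable: well-formed ID, but no such record.
        return error_result(
            "recoverable",
            "Order ID not found - check the format (8 digits) or use search_orders to find it.",
        )
    return {"isError": False, "content": [{"type": "text", "text": json.dumps(order)}]}


print(get_order_detail("abc")["content"][0]["text"])
```

The stack trace still belongs in your server logs. What changes is that the model gets a sentence it can act on instead.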

How do you handle authentication safely?

Auth is where MCP design meets security reality. Never bake long-lived secrets into the server's exposed surface, and never return them in a tool result where they could land in the model's context and be logged. Scope credentials to the narrowest capability the server needs, and prefer short-lived, per-session tokens over static keys. For servers acting on a user's behalf, an OAuth-style flow that hands the server a scoped token keeps the blast radius small if anything leaks.

The principle is containment: assume any value that enters the model's context could be surfaced somewhere it should not be, and design so that the worst case is a scoped, expiring token rather than a master key.
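A rough sketch of containment in code: a scoped token that expires, and a redaction pass over anything headed for the model's context. None of this is an MCP API. SessionToken, issue_token, redact, the scope strings and the list of sensitive key names are all hypothetical, and a real deployment would lean on its OAuth provider instead of minting tokens locally.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionToken:
    value: str
    scopes: frozenset
    expires_at: float

    def allows(self, scope: str, now: Optional[float] = None) -> bool:
        """True only if the scope was granted and the token has not expired."""
        now = time.time() if now is None else now
        return scope in self.scopes and now < self.expires_at


def issue_token(scopes, ttl_seconds: int = 900, now: Optional[float] = None) -> SessionToken:
    """Mint a short-lived token limited to the given scopes."""
    now = time.time() if now is None else now
    return SessionToken(secrets.token_urlsafe(32), frozenset(scopes), now + ttl_seconds)


SENSITIVE_KEYS = {"token", "api_key", "password", "secret", "authorization"}


def redact(payload):
    """Recursively mask sensitive-looking fields before a result reaches the model."""
    if isinstance(payload, dict):
        return {
            key: "[redacted]" if key.lower() in SENSITIVE_KEYS else redact(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload


print(redact({"order": {"id": "12345678", "api_key": "sk-live-example"}}))
```

Key-name matching is a backstop, not a guarantee. The stronger move is to never load the secret into the response object in the first place.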

How granular should tools be?

There is a real tension between a few broad tools and many narrow ones. Too few, and each tool becomes an overloaded grab-bag the model misuses. Too many, and the model spends its attention budget just choosing between near-identical options - and the tool list itself eats context on every call.

The pattern that works: design tools around tasks the model will actually want to perform, not around your database tables or your internal service boundaries. One search_orders and one get_order_detail beats eight tools that each fetch one field. Aim for a small set of clearly-distinct, well-described tools, and merge any two the model keeps confusing.

How do you keep an MCP server fast and cheap?

Performance in MCP is mostly about payload discipline, because every token a tool returns is read by the model and billed as input on the next step - and again on every later step while it stays in context. Three habits matter. Return only what is needed - trim verbose JSON to the fields the model will use rather than dumping the whole record. Paginate large results and tell the model how to fetch more, instead of returning a thousand rows it cannot reason over. And summarise where you can - a server that returns 'found 240 matches; here are the top 10 by relevance' is far more useful than one that returns all 240.
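The three habits fit in one small helper. This is a sketch under assumptions: shape_results, its parameter names and the next_page field are mine, and it assumes the matches arrive already sorted by relevance.

```python
def shape_results(matches: list, fields: list, page: int = 1, page_size: int = 10) -> dict:
    """Trim, paginate and summarise a result set before returning it to the model."""
    page = max(page, 1)
    total = len(matches)
    start = (page - 1) * page_size
    page_items = matches[start:start + page_size]

    # Habit 1: keep only the fields the model will actually use.
    trimmed = [{f: item[f] for f in fields if f in item} for item in page_items]

    # Habit 2: paginate, and say how to fetch more.
    has_more = start + page_size < total

    # Habit 3: lead with a summary the model can reason over.
    if trimmed:
        summary = f"Found {total} matches; showing {start + 1}-{start + len(trimmed)} by relevance."
    else:
        summary = f"Found {total} matches; page {page} is past the end."
    if has_more:
        summary += f" Call again with page={page + 1} to fetch more."

    return {"summary": summary, "items": trimmed, "next_page": page + 1 if has_more else None}


# Hypothetical upstream records carrying a bulky field the model never needs.
records = [{"id": i, "status": "shipped", "internal_notes": "x" * 100} for i in range(240)]
result = shape_results(records, ["id", "status"])
print(result["summary"])
```

The summary line does double duty: it tells the model the size of the full result and names the exact argument to change for the next page.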

Latency matters too: a tool that takes ten seconds stalls the whole agent loop. Cache what is stable, and keep individual tool calls fast enough that chaining several of them still feels responsive.
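For the caching half, a time-to-live cache is often enough. This is a minimal single-process sketch: TTLCache and get_or_compute are hypothetical names, and the 60-second lifetime is arbitrary. It has no size limit or locking, so treat it as an illustration, not a production cache.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds, then recompute on demand."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # fresh enough: skip the slow call
        value = compute()        # missing or stale: pay the cost once
        self._entries[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=60)
# Hypothetical slow lookup whose result is stable for a while.
categories = cache.get_or_compute("product_categories", lambda: ["books", "games"])
print(categories)
```

Cache only what is stable. An order's status is a poor candidate; a list of product categories is a good one.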

What are the most common MCP design mistakes?

The recurring ones: tool descriptions written for humans instead of the model; returning raw upstream JSON instead of a trimmed, relevant payload; opaque errors the model cannot recover from; secrets that leak into tool outputs; and a sprawl of near-identical narrow tools that fragment the model's choice. Each one degrades reliability while looking fine in a unit test, because the unit test is not an LLM trying to decide what to call next. Design for that reader and most of these disappear.

Frequently asked questions

Q01. What is the Model Context Protocol (MCP)?
MCP is an open standard from Anthropic for connecting AI models to external tools and data through a consistent interface. An MCP server exposes tools (actions), resources (readable data) and prompts (templates) that a client model can discover and use.
Q02. How detailed should an MCP tool description be?
Detailed enough that a model with no other context calls it correctly: state what the tool does, when to use it and when not to, the format of each parameter, and what it returns. The description is effectively part of the prompt, so vague wording causes wrong calls.
Q03. Should MCP errors be human-readable or machine-readable?
Both, but written so the model can act on them. Return a clear message that says what went wrong and what to do instead - for example, the expected ID format or a tool to find the right value - rather than a raw stack trace or a generic 500.
Q04. How many tools should an MCP server expose?
A small set of clearly-distinct, well-described tools designed around the tasks a model will want to perform - not one tool per database field. Too many near-identical tools fragment the model's choice and waste context on every call.