Tools are the most operationally sensitive part of many MCP servers because they execute work on demand. Good tool design combines clear naming, explicit schemas, predictable outputs, and runtime validation that does not trust the host or model blindly.
This page is about the execution contract of tools: how they are described, called, validated, and evolved safely.
Treat a tool like a public API operation that just happens to be callable through an MCP host. If the schema, error model, or permission model would be unacceptable in an API review, it is also unacceptable here.
In current MCP revisions, tool definitions include a programmatic name, optional human-friendly title metadata, a description, an input schema, and optionally an output schema and annotations. Those annotations can communicate important hints such as read-only, destructive, idempotent, or open-world behavior.
That metadata helps both host UX and model decision-making. A host can render clearer labels, show warning badges for destructive tools, or deprioritize tools that are obviously unrelated to the current request.
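As a rough illustration of how a host might use those hints, the sketch below maps annotations to a UI badge. The `badgeFor` helper and the badge labels are hypothetical, and treating a missing `destructiveHint` as potentially destructive is a cautious assumption of this sketch, not a requirement stated here.

```typescript
// Hint names used for tool annotations; every field is optional and advisory.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

type Badge = 'read-only' | 'destructive' | 'writes';

// Hypothetical host-side helper: choose a badge to show next to a tool.
function badgeFor(annotations: ToolAnnotations = {}): Badge {
  if (annotations.readOnlyHint === true) return 'read-only';
  // Cautious assumption: unless the server says otherwise, a write may be destructive.
  if (annotations.destructiveHint !== false) return 'destructive';
  return 'writes';
}
```

A host built this way only adjusts presentation. The server still has to enforce permissions itself.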
Tool validation should happen before backend execution and should cover required fields, types, ranges, enum values, and semantic rules. Schema validation alone is not enough. If a query can only target approved repositories or a date range must stay within policy, those checks still belong in your handler.
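A minimal sketch of such semantic checks, run after schema validation and before any backend call, might look like this. The approved repository names, the 90-day limit, and the `checkSearchPolicy` helper are all assumptions made for illustration.

```typescript
// Hypothetical policy values for this sketch.
const APPROVED_REPOS = new Set(['docs', 'handbook']);
const MAX_RANGE_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

interface SearchArgs {
  repo: string;
  from: string; // ISO date such as '2024-01-01'
  to: string;
}

// Returns a list of policy violations; an empty list means the call may proceed.
function checkSearchPolicy(args: SearchArgs): string[] {
  const problems: string[] = [];
  if (!APPROVED_REPOS.has(args.repo)) {
    problems.push(`repo "${args.repo}" is not on the approved list`);
  }
  const from = Date.parse(args.from);
  const to = Date.parse(args.to);
  if (Number.isNaN(from) || Number.isNaN(to)) {
    problems.push('from and to must be valid dates');
  } else if (to < from) {
    problems.push('to must not be earlier than from');
  } else if ((to - from) / MS_PER_DAY > MAX_RANGE_DAYS) {
    problems.push(`date range exceeds ${MAX_RANGE_DAYS} days`);
  }
  return problems;
}
```

A handler could call this first and return an error result listing the problems whenever the list is non-empty.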
When a tool fails, the server should return a clear, structured error rather than a raw stack trace. In the official TypeScript SDK, a tool handler can return `isError: true` with explanatory content and still produce a valid tool result.
Tool outputs should be useful to both humans and hosts. Text content is often necessary, but structured output is equally valuable because it lets clients reason over fields rather than scraping prose.
When a tool result is large, attach a resource or resource link instead of embedding everything directly. That keeps tool calls responsive and reduces unnecessary context-window pressure.
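The sketch below shows one way to make that choice inside a handler: small results are returned inline, and larger ones become a short summary plus a resource link. The types are simplified local stand-ins for the text and `resource_link` content shapes, and the size threshold, link name, and `buildReportResult` helper are hypothetical.

```typescript
// Simplified local types for this sketch; a real server would use the SDK's types.
type TextContent = { type: 'text'; text: string };
type ResourceLink = { type: 'resource_link'; uri: string; name: string; mimeType?: string };
type ToolResult = { content: Array<TextContent | ResourceLink> };

const MAX_INLINE_CHARS = 2000; // hypothetical threshold

function buildReportResult(report: string, reportUri: string): ToolResult {
  if (report.length <= MAX_INLINE_CHARS) {
    // Small enough to embed directly in the tool result.
    return { content: [{ type: 'text', text: report }] };
  }
  // Too large: return a bounded summary and point at the full resource.
  return {
    content: [
      { type: 'text', text: `Report is ${report.length} characters long; see the linked resource for the full text.` },
      { type: 'resource_link', uri: reportUri, name: 'Full report', mimeType: 'text/plain' },
    ],
  };
}
```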
A good MCP tool is boring in the best way: its name is obvious, its schema is narrow, its side effects are documented, its errors are predictable, and its output is easy for a host or model to consume. The model may be flexible, but the tool boundary should not be. Flexible schemas create ambiguous calls, fragile validation, and surprising backend behavior.
Use small tools that match actual operations. Instead of one `manage_ticket` tool with an action string, create separate tools such as `search_tickets`, `draft_ticket_reply`, and `update_ticket_status` when authorization and side effects differ. This makes tool selection easier for the model and safer for the server.
Every tool should define what happens on invalid input, missing permissions, backend timeout, partial success, and duplicate calls. Idempotency is especially important for write tools. If a user or host retries after a network failure, the server should avoid creating duplicate tickets, double-sending messages, or repeating payments.
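One common way to get that behavior is an idempotency key supplied with each write request. The following is a minimal in-memory sketch under that assumption. The `createTicketOnce` and `createTicketInBackend` names are hypothetical, the backend call is synchronous only to keep the example short, and a real server would need a durable store with expiry.

```typescript
type Ticket = { id: string; title: string };

// Hypothetical in-memory record of completed requests, keyed by idempotency key.
const completed = new Map<string, Ticket>();
let nextId = 1;

// Stand-in for the real backend call (an assumption of this sketch).
function createTicketInBackend(title: string): Ticket {
  const ticket = { id: `T-${nextId}`, title };
  nextId += 1;
  return ticket;
}

// Creates a ticket at most once per idempotency key; retries get the original result.
function createTicketOnce(idempotencyKey: string, title: string): Ticket {
  const previous = completed.get(idempotencyKey);
  if (previous !== undefined) return previous;
  const ticket = createTicketInBackend(title);
  completed.set(idempotencyKey, ticket);
  return ticket;
}
```

With this shape, a retry that carries the same key returns the original ticket instead of creating a second one.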
Tool results become the model's next input, so output formatting affects both usefulness and safety. Return only what the next step needs, include source identifiers when facts matter, and avoid dumping raw backend objects full of hidden fields. If the model needs a summary, provide a bounded summary plus references rather than a massive payload.
For sensitive domains, treat output as a data product. Mask secrets, remove internal stack traces, limit row counts, and make truncation explicit. A safe tool call can still become a data leak if the result contains more than the user or model should see.
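A small sketch of that idea follows. It uses a field allowlist rather than masking individual secrets, which is a stricter variant of the advice above, and it caps the row count while reporting truncation explicitly. The field names, the row limit, and the `toSafeOutput` helper are hypothetical.

```typescript
type Row = Record<string, unknown>;

// Hypothetical policy: only these fields may reach the host or model.
const ALLOWED_FIELDS = ['id', 'title', 'status'];
const MAX_ROWS = 3;

function toSafeOutput(rows: Row[]): { rows: Row[]; totalRows: number; truncated: boolean } {
  const kept = rows.slice(0, MAX_ROWS).map((row) => {
    const safe: Row = {};
    for (const field of ALLOWED_FIELDS) {
      // Copy only allowlisted fields; everything else is dropped.
      if (field in row) safe[field] = row[field];
    }
    return safe;
  });
  return {
    rows: kept,
    totalRows: rows.length,
    truncated: rows.length > MAX_ROWS, // make truncation explicit
  };
}
```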
A production MCP tool should be boring, explicit, and difficult to misuse. Start with the operation name. A model should understand from the name and description when the tool is appropriate and when it is not. Then review the schema. Required fields, enums, typed objects, and clear descriptions reduce ambiguity before the request reaches backend code.
Next review side effects. Read tools, draft tools, write tools, destructive tools, and privileged tools should not be mixed into one broad command. They have different authorization, approval, retry, audit, and rollback requirements. If a tool can do many unrelated things based on an action string, split it.
Finally, review output. Tool results become model context, so they should be bounded, structured, source-aware, and scrubbed of sensitive fields. A successful backend response can still be unsafe if it includes secrets, internal stack traces, hidden identifiers, or too much irrelevant data.
Create a test matrix for every tool. Include valid input, missing required fields, wrong types, unknown fields, unauthorized caller, unauthorized target object, backend timeout, backend error, oversized result, and duplicate write attempt. If a tool cannot pass this matrix, it is not ready for model-driven use.
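The input-related rows of such a matrix can be written as a table of cases. The sketch below does this for a ticket-creation input, using a small hand-written validator as a stand-in for the tool's real input check. The validator, the case names, and the sample inputs are assumptions, and the authorization, timeout, and oversized-result rows would need the real handler and backend fakes.

```typescript
const SEVERITIES = ['low', 'medium', 'high'];

// Minimal hand-written validator standing in for the tool's real input check.
// Returns an error message, or null when the input is acceptable.
function validateTicketInput(input: unknown): string | null {
  if (typeof input !== 'object' || input === null) return 'input must be an object';
  const record = input as Record<string, unknown>;
  for (const key of Object.keys(record)) {
    if (key !== 'title' && key !== 'severity') return `unknown field: ${key}`;
  }
  const { title, severity } = record;
  if (typeof title !== 'string') return 'title must be a string';
  if (title.length < 10) return 'title must be at least 10 characters';
  if (typeof severity !== 'string' || !SEVERITIES.includes(severity)) {
    return 'severity must be low, medium, or high';
  }
  return null;
}

// One row per matrix cell: a name, an input, and whether validation should pass.
const matrix: Array<{ name: string; input: unknown; ok: boolean }> = [
  { name: 'valid input', input: { title: 'Printer on floor 3 is offline', severity: 'low' }, ok: true },
  { name: 'missing required field', input: { severity: 'low' }, ok: false },
  { name: 'wrong type', input: { title: 12345678901, severity: 'low' }, ok: false },
  { name: 'unknown field', input: { title: 'Printer on floor 3 is offline', severity: 'low', admin: true }, ok: false },
  { name: 'invalid enum value', input: { title: 'Printer on floor 3 is offline', severity: 'urgent' }, ok: false },
];

// Names of the cases whose outcome differs from the expectation.
const failures = matrix
  .filter((testCase) => (validateTicketInput(testCase.input) === null) !== testCase.ok)
  .map((testCase) => testCase.name);
```

An empty `failures` array means every case in the table behaved as expected.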
For write tools, add idempotency tests. The same request may be retried after a network failure, user refresh, or worker restart. Without idempotency keys or backend reconciliation, the agent may create duplicate tickets, repeated notifications, or repeated financial actions.
Also test tool descriptions. Give the model several similar tools and ask which one it would call for representative tasks. If selection is unreliable, improve names, descriptions, schemas, or split confusing capabilities. Tool usability is part of tool safety.
The following example follows the official TypeScript SDK registration style and models a read-only documentation search tool.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const server = new McpServer({ name: 'docs-server', version: '1.0.0' });

// runApprovedSearch is an application-defined helper (not part of the SDK)
// that queries only approved documentation sources.
server.registerTool(
  'search_docs',
  {
    title: 'Search Internal Docs',
    description: 'Search approved internal documentation by keyword.',
    inputSchema: {
      query: z.string().min(3),
      limit: z.number().int().min(1).max(10).default(5),
    },
    outputSchema: {
      results: z.array(
        z.object({
          uri: z.string(),
          title: z.string(),
          snippet: z.string(),
        })
      ),
    },
    annotations: {
      readOnlyHint: true,
      idempotentHint: true,
    },
  },
  async ({ query, limit }) => {
    const rows = await runApprovedSearch(query, limit);
    // Copy only the fields declared in outputSchema; drop everything else.
    const output = {
      results: rows.map((row) => ({
        uri: row.uri,
        title: row.title,
        snippet: row.snippet,
      })),
    };
    return {
      // Short text for humans, structured data for hosts.
      content: [
        {
          type: 'text',
          text: `Found ${output.results.length} matching documents.`,
        },
      ],
      structuredContent: output,
    };
  }
);

const transport = new StdioServerTransport();
await server.connect(transport);
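A write tool can report a handled failure with `isError: true` while still returning a valid tool result. This second example registers `create_ticket` on the same `server` instance.

// createApprovedTicket is an application-defined helper (not part of the SDK).
server.registerTool(
  'create_ticket',
  {
    description: 'Create a support ticket in the approved queue.',
    inputSchema: {
      title: z.string().min(10),
      severity: z.enum(['low', 'medium', 'high']),
    },
  },
  async ({ title, severity }) => {
    try {
      const id = await createApprovedTicket(title, severity);
      return {
        content: [{ type: 'text', text: `Created ticket ${id}` }],
      };
    } catch (error) {
      // Keep details in server-side logs (stderr); return a sanitized message.
      console.error('create_ticket failed:', error);
      return {
        content: [
          {
            type: 'text',
            text: 'Ticket creation failed. Check queue permissions and payload policy.',
          },
        ],
        isError: true,
      };
    }
  }
);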
Not every tool needs an output schema, but output schemas are valuable whenever hosts or downstream logic rely on `structuredContent`.
Annotations are hints to clients, not enforcement controls.