Debugging & Inspector: Tracing Capability Discovery, Calls, and Failures

MCP Debugging

MCP debugging is most effective when you debug the protocol session before debugging the model experience. A broken server can fail during process startup, initialize negotiation, capability discovery, schema validation, authorization, backend execution, transport framing, or host rendering. Each layer has different evidence.

The MCP Inspector is the fastest way to isolate server behavior because it lets you connect to a server, inspect advertised capabilities, call tools, read resources, fetch prompts, and observe protocol-level failures without the noise of a full host application.

This page teaches a repeatable troubleshooting workflow for local stdio servers, remote HTTP servers, host integrations, and production deployments. The goal is to move from “the MCP server is broken” to a precise diagnosis such as “the host is running an old build,” “tools/list does not include the expected capability,” “stdout is polluted before initialize,” or “the remote server rejects the token scope during invocation.”

Inspector Commands You Will Actually Use

Run the Inspector against the same command, build output, environment variables, and remote URL that your real host uses. Otherwise you may prove only that a different server works.

# Local stdio server from built JavaScript
npx @modelcontextprotocol/inspector node dist/index.js

# Local TypeScript server during development
npx @modelcontextprotocol/inspector npx tsx src/index.ts

# Pass environment variables required by the server
npx @modelcontextprotocol/inspector -e API_BASE_URL=https://internal.example.com -e LOG_LEVEL=debug node dist/index.js

# Rebuild when the host points at dist/
npx tsc
npx @modelcontextprotocol/inspector node dist/index.js

If Inspector cannot initialize the server, no host will be able to either, so fix the server first.
If Inspector works but the host fails, compare command path, working directory, environment variables, build output, and host logs.
For stdio servers, any regular console output on stdout can corrupt protocol traffic. Write diagnostics to stderr.
Keep one known-good Inspector command in the project README so every developer can reproduce the same baseline.

Failure Layer Map

Layer	Typical Symptom	Best Evidence
Startup	Server never appears or exits immediately.	Manual command run, exit code, stderr, host process logs.
Initialize	Client connects but session fails before tools/resources/prompts appear.	Initialize request/response, protocol version, capability metadata.
Discovery	Expected tool, resource, or prompt is missing.	tools/list, resources/list, prompts/list output and permission filters.
Validation	Tool rejects input that looks correct from the model conversation.	Input schema, parsed arguments, field-level validation errors.
Authorization	Capability exists but call is denied or hidden for one user.	Token scopes, tenant, object-level policy decision, audit logs.
Backend execution	MCP call starts but external API, database, filesystem, or queue fails.	Dependency logs, timeout metrics, retry behavior, sanitized exception class.
Transport	Works locally but not remotely, or messages are truncated/interleaved.	HTTP status, headers, session IDs, proxies, stdout/stderr discipline.
Host rendering	Inspector result is correct but the host displays or selects it poorly.	Host logs, capability descriptions, response shape, UX constraints.

Debugging Order

Start with the narrowest reproduction that still shows the failure. If a server cannot start from the same command the host uses, do not debug model behavior. If it starts but cannot initialize in Inspector, focus on protocol setup. If initialize and discovery work in Inspector but the host fails, compare host configuration and runtime environment.

This order saves hours because MCP failures often look mysterious from the chat surface while being obvious at the protocol layer. A missing build, wrong absolute path, polluted stdout stream, invalid schema, unsupported protocol version, or missing token scope can all appear to the user as “the assistant cannot use the tool.”

A good workflow produces evidence at every step. You should be able to say which layer passed, which layer failed, and what exact request, response, log line, or status code proves it.

Run the exact server command outside the host and confirm it stays alive.
Connect with Inspector and verify initialize completes.
Inspect tools/list, resources/list, prompts/list, and resource templates before invoking anything.
Call one minimal valid tool and one intentionally invalid tool input.
Read at least one resource and fetch at least one prompt if the server exposes them.
Test authorization paths with an allowed identity and a denied identity for remote servers.
Only after those checks pass, debug host-specific selection, rendering, caching, or UX behavior.
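
The ordering above can be sketched as a small runner that stops at the first failing layer and records evidence for each step. This is an illustrative sketch, not part of any MCP SDK: the layer names, the `run_in_order` helper, and the lambda checks are placeholders for whatever real probes you use.

```python
from typing import Callable

# Each check returns (passed, evidence). The lambdas below are placeholders
# for real probes such as a startup test or an Inspector session.
Check = Callable[[], tuple[bool, str]]

def run_in_order(checks: list[tuple[str, Check]]) -> list[dict]:
    """Run checks in layer order and stop at the first failure."""
    report = []
    for layer, check in checks:
        passed, evidence = check()
        report.append({"layer": layer, "passed": passed, "evidence": evidence})
        if not passed:
            break  # later layers cannot be trusted until this one passes
    return report

checks = [
    ("startup", lambda: (True, "process alive after grace period")),
    ("initialize", lambda: (True, "initialize result received")),
    ("discovery", lambda: (False, "tools/list is missing search_runbooks")),
    ("invocation", lambda: (True, "never reached in this run")),
]

for row in run_in_order(checks):
    print(row)
```

Stopping early is deliberate: a passing invocation check means little if discovery is already returning the wrong contract.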

Inspector Evidence

Inspector is not just a convenience UI. It is a protocol isolation tool. It removes the model, conversation state, host tool-selection heuristics, and product UI from the debugging loop so you can test the server contract directly.

Use Inspector to prove the server advertises what you think it advertises. Tool names, descriptions, input schemas, resource URIs, prompt arguments, and capability metadata are not implementation details; they are the contract the host sees. If that contract is wrong, the model will make bad choices even if the handler code is perfect.

Initialize succeeds with the protocol version and capabilities you expect.
Discovery returns the exact tools, resources, prompts, and templates you intended.
Tool schemas reject invalid input before backend calls.
Tool results are structured, bounded, and useful for a model response.
Resource reads return the right MIME type, content shape, and size.
Prompt arguments are discoverable and produce reusable workflows.
Server logs and user-facing errors tell different audiences the right amount of detail.

stdio Startup

Local MCP servers usually run over stdio. That means the host starts a child process and speaks JSON-RPC over stdin and stdout. The most important debugging rule is simple: stdout is protocol traffic. Do not print banners, debug text, stack traces, progress messages, or ordinary logs to stdout.
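
For a Python server, one way to follow that rule is to route every diagnostic through a logger bound to stderr. The sketch below uses only the standard `logging` module; the logger name `mcp_server` and the message text are illustrative assumptions.

```python
import logging
import sys

def configure_stderr_logging(level: int = logging.INFO) -> logging.Logger:
    """Route diagnostics to stderr so stdout carries only protocol messages."""
    logger = logging.getLogger("mcp_server")  # hypothetical logger name
    logger.setLevel(level)
    # Keep records away from root handlers that might write elsewhere.
    logger.propagate = False
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger

log = configure_stderr_logging()
log.info("server starting")  # written to stderr, never to stdout
```

Third-party libraries that call `print` directly are not covered by this; those still need to be found and silenced separately.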

Many “server not found” bugs are not protocol bugs at all. They are process bugs: wrong executable path, missing dependency, stale compiled file, incorrect working directory, missing environment variable, permission problem, or a command that works in your terminal but not in the host environment.

Symptom	Likely Cause	First Check
Server exits immediately.	Unhandled startup exception, missing env var, bad import, missing dependency.	Run the exact command manually and inspect stderr plus exit code.
Inspector hangs during connect.	Process started but never entered MCP server loop.	Add startup diagnostics to stderr and confirm server transport is connected.
Initialize fails with parse errors.	Non-protocol text printed to stdout.	Search for console.log, print, echo, or framework startup banners.
Works with tsx but not host.	Host points at stale dist build.	Run the configured dist command and rebuild with tsc.
Works in terminal but not host.	Different PATH, cwd, env vars, or permissions.	Use absolute paths and log safe environment diagnostics to stderr.
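
The first row of the table, running the exact command and inspecting stderr plus exit code, can be scripted. The following is a minimal sketch using only `subprocess`; `check_startup` and the stand-in crashing command are assumptions for illustration, and you would pass your real server command instead.

```python
import subprocess
import sys

def check_startup(command: list[str], grace_seconds: float = 1.0) -> dict:
    """Start the command and report whether it survives a short grace period."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,   # keep stdin open, as a host would
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still running after the grace period: startup survived. Stop it.
        proc.kill()
        _, stderr = proc.communicate()
        return {"alive": True, "exit_code": None, "stderr": stderr.strip()}
    # Exited early: collect exit code and stderr as evidence.
    _, stderr = proc.communicate()
    return {"alive": False, "exit_code": proc.returncode, "stderr": stderr.strip()}

# Stand-in for a server that crashes on a missing environment variable.
crashing = [
    sys.executable,
    "-c",
    "import sys; print('missing API_BASE_URL', file=sys.stderr); sys.exit(3)",
]
print(check_startup(crashing))
```

A process that stays alive has only passed the startup layer. It may still be stuck before the MCP server loop, which is what the Inspector connection test is for.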

Initialization Failures

Initialize is the first real protocol checkpoint. During initialize, the client and server establish that they can speak the protocol and exchange capability metadata. If initialize fails, tool handlers, resource readers, and prompts are irrelevant because the session never reached normal operation.

Version or capability mismatches should be handled explicitly. A robust server does not assume every client supports every optional behavior, and a robust client does not assume a server exposes tools, resources, prompts, roots, logging, or subscriptions unless discovery confirms them.

Record the protocol version used by the client and server during initialize.
Check advertised server capabilities against the features your host expects.
Fail with a clear initialization error if required configuration is missing.
Avoid lazy startup failures that appear only after the first tool call when they could be detected at initialize time.
Do not advertise capabilities whose handlers are not fully registered and ready.
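
To make the first two checks concrete, here is a hedged sketch that builds a JSON-RPC `initialize` request and compares a server's reply with what the host expects. The helper names, the client name and version, the example protocol version string, and the sample reply are all assumptions for illustration. A differing version in the reply is recorded as a finding to review, not automatically treated as an error.

```python
import json

def build_initialize_request(request_id: int, protocol_version: str) -> str:
    """Serialize an initialize request as one line of JSON."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            # Illustrative client identity, not a real product.
            "clientInfo": {"name": "debug-client", "version": "0.1.0"},
        },
    })

def check_initialize_result(result: dict, requested_version: str,
                            required: set[str]) -> list[str]:
    """Return findings about the negotiated version and advertised capabilities."""
    findings = []
    answered = result.get("protocolVersion")
    if answered != requested_version:
        findings.append(f"version differs: asked {requested_version}, got {answered}")
    advertised = set(result.get("capabilities", {}))
    for name in sorted(required - advertised):
        findings.append(f"capability not advertised: {name}")
    return findings

# Hypothetical reply from a server that only exposes tools.
sample_result = {
    "protocolVersion": "2024-11-05",
    "capabilities": {"tools": {}},
    "serverInfo": {"name": "example-server", "version": "1.0.0"},
}
print(build_initialize_request(1, "2024-11-05"))
print(check_initialize_result(sample_result, "2024-11-05", {"tools", "resources"}))
```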

Discovery Failures

Discovery is where the host learns what the server can do. If discovery output is unclear, missing, overbroad, or permission-blind, the host may hide capabilities, select the wrong tool, or show a confusing user experience.

Treat discovery responses as a primary debugging artifact. The tool name, description, input schema, resource URI, prompt arguments, and metadata are what the host and model actually see. The code you intended to write matters less than the contract that was advertised.

Discovery Problem	Consequence	Fix
Tool name is vague.	Model chooses it in the wrong situations.	Rename with action plus domain noun, such as search_runbooks or create_incident_update.
Description omits constraints.	Host cannot explain safe usage clearly.	State what the tool does, what it does not do, and important permission constraints.
Schema accepts generic strings or objects.	Model sends ambiguous or unsafe arguments.	Use enums, formats, minimums, maximums, and explicit required fields.
Resources list inaccessible objects.	User learns data exists even when reads are denied.	Filter discovery by identity and object-level permissions.
Prompt variables are unclear.	Reusable workflow becomes brittle.	Use named arguments with domain-specific descriptions and examples.
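
Several rows of this table can be checked mechanically against a saved tools/list response. The sketch below is one possible lint, assuming tool entries carry `name`, `description`, and `inputSchema` fields; the naming pattern and the 40-character description threshold are arbitrary choices for illustration, not protocol rules.

```python
import re

# Assumed convention: snake_case with at least two words, such as search_runbooks.
NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)+$")
STRING_CONSTRAINTS = {"enum", "format", "pattern", "maxLength"}

def lint_tool(tool: dict) -> list[str]:
    """Flag vague names, thin descriptions, and loosely typed schemas."""
    findings = []
    name = tool.get("name", "")
    if not NAME_PATTERN.match(name):
        findings.append(f"{name}: name is not action plus noun in snake_case")
    if len(tool.get("description", "")) < 40:  # arbitrary threshold
        findings.append(f"{name}: description too short to state constraints")
    schema = tool.get("inputSchema", {})
    if not schema.get("required"):
        findings.append(f"{name}: no required fields declared")
    for prop, spec in schema.get("properties", {}).items():
        if spec.get("type") == "string" and not STRING_CONSTRAINTS & set(spec):
            findings.append(f"{name}.{prop}: unconstrained string")
    return findings

tools = [
    {"name": "search", "description": "Search.",
     "inputSchema": {"type": "object", "properties": {"q": {"type": "string"}}}},
    {"name": "search_runbooks",
     "description": "Search published runbooks by keyword. Read-only; returns at most 20 matches.",
     "inputSchema": {"type": "object",
                     "properties": {"query": {"type": "string", "maxLength": 200},
                                    "severity": {"type": "string", "enum": ["low", "high"]}},
                     "required": ["query"]}},
]

for tool in tools:
    print(tool["name"], lint_tool(tool))
```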

Tool Failures

Tool debugging should separate input parsing, schema validation, authorization, backend execution, result formatting, and host rendering. If these are collapsed into one catch-all error, every failure looks the same and every incident takes longer to diagnose.

Run three cases for every important tool: a minimal valid request, an invalid request that should fail before backend access, and an unauthorized request that should fail before sensitive work. These tests prove the server contract, not just the happy path.

Log the tool name, request ID, safe user or tenant identifier, validation status, authorization decision, backend duration, and final status.
Return user-safe errors to the host while keeping stack traces and backend details in logs.
Make validation errors specific enough for the model to correct input on the next attempt.
Keep result shapes stable so hosts and downstream model behavior do not break unexpectedly.
Add explicit limits for pagination, result count, response size, and backend timeout.
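
One way to keep those stages separate is to give each its own status and let the handler record which stage failed. The sketch below is a simplified illustration, not an SDK handler: the `forecast:read` scope, the `days` argument, and the injected `backend` callable are hypothetical.

```python
import time

class ToolError(Exception):
    """Carries a stage-specific status plus a user-safe message."""
    def __init__(self, status: str, message: str):
        super().__init__(message)
        self.status = status

def call_tool(arguments: dict, scopes: set[str], backend) -> dict:
    started = time.monotonic()
    status, payload = "ok", None
    try:
        # Stage 1: validation, before any sensitive work.
        days = arguments.get("days")
        if not isinstance(days, int) or isinstance(days, bool) or not 1 <= days <= 14:
            raise ToolError("validation_error", "days must be an integer from 1 to 14")
        # Stage 2: authorization, before the backend is touched.
        if "forecast:read" not in scopes:
            raise ToolError("authorization_denied", "missing scope forecast:read")
        # Stage 3: backend execution; details stay out of the user-facing message.
        try:
            payload = backend(arguments)
        except TimeoutError:
            raise ToolError("backend_timeout", "forecast service timed out") from None
        except Exception:
            raise ToolError("backend_error", "forecast service failed") from None
    except ToolError as err:
        status, payload = err.status, str(err)
    duration_ms = round((time.monotonic() - started) * 1000)
    return {"status": status, "result": payload, "duration_ms": duration_ms}

print(call_tool({"days": "three"}, {"forecast:read"}, lambda args: {"ok": True}))
print(call_tool({"days": 3}, set(), lambda args: {"ok": True}))
print(call_tool({"days": 3}, {"forecast:read"}, lambda args: {"ok": True}))
```

The three calls at the end correspond to the invalid, unauthorized, and minimal valid cases described above.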

Resource and Prompt Failures

Resources fail differently from tools. A resource may be discoverable but unreadable, readable but too large, correctly addressed but stale, or returned with the wrong MIME type. Debug resource URI construction and permission checks as carefully as tool schemas.

Prompts fail when their template contract is unclear. Missing variables, weak descriptions, hard-coded context, or hidden assumptions make prompts look reusable while behaving inconsistently across hosts and users.

Capability	Failure Mode	Debugging Move
Resource	URI cannot be resolved.	Log parsed URI parts and verify template variables.
Resource	Content is too large for useful model context.	Add summaries, pagination, section reads, or range selection.
Resource	User can list but cannot read.	Align discovery filtering with read authorization.
Prompt	Required variable is missing or ambiguous.	Inspect prompts/list metadata and prompts/get arguments.
Prompt	Template depends on hidden server state.	Expose required inputs explicitly or pair the prompt with resources/tools.
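
For the first row, logging parsed URI parts is easier when template matching is explicit. The sketch below is a simplified matcher for `{variable}` placeholders, assuming each variable spans one path segment; the `runbooks://` scheme and variable names are invented for the example, and real resource templates may use richer syntax than this handles.

```python
import re
from typing import Optional

def match_template(template: str, uri: str) -> Optional[dict]:
    """Match a URI against a {variable} template and return the captured parts."""
    pattern = ""
    # Splitting on a capturing group keeps the {variable} pieces in the result.
    for piece in re.split(r"(\{[a-z_]+\})", template):
        if piece.startswith("{") and piece.endswith("}"):
            pattern += f"(?P<{piece[1:-1]}>[^/]+)"  # one path segment per variable
        else:
            pattern += re.escape(piece)
    found = re.fullmatch(pattern, uri)
    return found.groupdict() if found else None

TEMPLATE = "runbooks://{team}/{runbook_id}"  # hypothetical resource template
print(match_template(TEMPLATE, "runbooks://payments/rb-42"))
print(match_template(TEMPLATE, "runbooks://payments"))
```

A `None` result tells you the URI never matched the template, which is a different failure from a matched URI whose object does not exist or is not readable.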

Remote Authorization

Remote MCP debugging adds HTTP, identity, session, proxy, and deployment concerns. A server may be correct locally but fail remotely because the request lacks a bearer token, the token has the wrong audience, the reverse proxy strips headers, CORS policy blocks a web host, or session state is not preserved.

For enterprise deployments, debug authorization at discovery and invocation. A capability that is hidden during discovery may be a permission issue, not a registration issue. A tool that appears but fails during call may be hitting object-level authorization or scope checks.

Verify TLS, endpoint path, HTTP method, headers, and response status before inspecting MCP payloads.
Check token issuer, audience, expiry, scopes, tenant, and subject mapping.
Confirm discovery is filtered for the current identity and not accidentally cached across users.
Log authorization allow/deny decisions with policy reason codes and correlation IDs.
Test expired, missing, wrong-scope, and cross-tenant tokens deliberately.
Inspect proxy and load balancer behavior when streaming, session affinity, or long-running requests are involved.
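
The token checks above can return policy reason codes instead of a single allow or deny. This sketch assumes the token signature has already been verified and works on the decoded claims only; the `tenant` claim name and the reason-code strings are assumptions, and your identity provider may name things differently.

```python
def check_claims(claims: dict, *, issuer: str, audience: str,
                 required_scopes: set[str], tenant: str, now: float) -> list[str]:
    """Return reason codes for every failed check; an empty list means allow."""
    reasons = []
    if claims.get("iss") != issuer:
        reasons.append("wrong_issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a string or a list
    if audience not in audiences:
        reasons.append("wrong_audience")
    if claims.get("exp", 0) <= now:
        reasons.append("expired")
    granted = set(claims.get("scope", "").split())  # space-delimited scope string
    missing = required_scopes - granted
    if missing:
        reasons.append("missing_scope:" + ",".join(sorted(missing)))
    if claims.get("tenant") != tenant:  # hypothetical tenant claim
        reasons.append("cross_tenant")
    return reasons

claims = {"iss": "https://idp.example.com", "aud": "mcp-server",
          "exp": 1000, "scope": "runbooks:read", "tenant": "acme"}

print(check_claims(claims, issuer="https://idp.example.com", audience="mcp-server",
                   required_scopes={"runbooks:read"}, tenant="acme", now=500))
print(check_claims(claims, issuer="https://idp.example.com", audience="mcp-server",
                   required_scopes={"runbooks:read", "runbooks:write"},
                   tenant="globex", now=2000))
```

Collecting every reason, not just the first, makes it easier to tell a stale token from a misconfigured scope mapping in one log line.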

Server Logging

Logging should tell an operator what happened without leaking what the model saw or what the user owns. Log identifiers, capability names, decisions, durations, counts, and sanitized error classes. Avoid raw prompts, tokens, secrets, full resource bodies, and large customer records unless you have an explicit controlled audit requirement.

For stdio servers, send logs to stderr. For remote services, use normal structured service logging plus metrics and traces. MCP logging notifications can be useful for client-visible diagnostics, but they should not replace durable server-side logs.

Use a request or correlation ID from initialize through discovery and invocation where possible.
Log capability registration counts at startup and discovery counts per identity during debug sessions.
Separate validation_error, authorization_denied, backend_timeout, backend_error, and result_truncated statuses.
Record response size and truncation decisions for large tool or resource outputs.
Emit audit events for sensitive reads, writes, exports, deletes, and administrative actions.
Redact tokens, API keys, cookies, raw prompts, and confidential resource content by default.
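
A small structured logger with default redaction covers several of these points at once. The sketch below writes one JSON object per line to stderr; the `SENSITIVE_KEYS` set and the field names in the usage example are illustrative and should be adapted to your own payloads.

```python
import json
import sys

# Keys whose values are never written to logs (illustrative list).
SENSITIVE_KEYS = {"token", "authorization", "api_key", "cookie", "prompt", "content"}

def redact(value):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value

def log_event(event: str, **fields) -> str:
    """Write one structured, redacted log line to stderr and return it."""
    line = json.dumps({"event": event, **redact(fields)}, sort_keys=True)
    print(line, file=sys.stderr)
    return line

log_event(
    "tool_call",
    tool="search_runbooks",
    request_id="req-123",
    status="validation_error",
    duration_ms=4,
    request={"token": "secret-value", "query": "disk full"},
)
```

Redaction by key name is a floor, not a guarantee: secrets embedded inside free-text values still need separate handling.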

Host Diagnostics

After Inspector passes, move to the host. Host failures usually involve configuration, stale builds, restart behavior, environment variables, connector permissions, or the host deciding not to surface a capability. The server may be healthy while the host is using a different command, an old binary, or a filtered capability list.

For Claude Desktop-style local integrations, server status and host logs are often enough to find startup and config issues. For custom clients or web hosts, inspect client-side MCP session handling, auth token flow, transport implementation, and how the host maps discovered capabilities into model instructions.

Restart the host fully after changing server config or compiled output when hot reload is not supported.
Use absolute paths for commands, args, and config references.
Compare the host command exactly against the known-good Inspector command.
Check whether the host has cached discovery output from an earlier server version.
Inspect host logs before changing server code that already passes Inspector.
For custom clients, trace initialize, discovery, invocation, error handling, and reconnect behavior.
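
Comparing the host command with the known-good Inspector command is less error-prone when done field by field. The sketch below assumes both launches can be written as dictionaries with `command`, `args`, and `env`, loosely modeled on Claude Desktop-style server entries; the `cwd` field and the `diff_launch` helper are hypothetical additions for the comparison.

```python
def diff_launch(inspector: dict, host: dict) -> list[str]:
    """List differences between the Inspector launch and the host launch."""
    differences = []
    for key in ("command", "args", "cwd"):
        if inspector.get(key) != host.get(key):
            differences.append(
                f"{key}: inspector={inspector.get(key)!r} host={host.get(key)!r}"
            )
    inspector_env = inspector.get("env", {})
    host_env = host.get("env", {})
    for name in sorted(set(inspector_env) | set(host_env)):
        # Report names only, so secret values never reach the output.
        if name not in host_env:
            differences.append(f"env {name}: set for Inspector only")
        elif name not in inspector_env:
            differences.append(f"env {name}: set for host only")
        elif inspector_env[name] != host_env[name]:
            differences.append(f"env {name}: values differ")
    return differences

inspector_launch = {
    "command": "node",
    "args": ["dist/index.js"],
    "cwd": "/srv/app",
    "env": {"API_BASE_URL": "https://internal.example.com", "LOG_LEVEL": "debug"},
}
host_launch = {
    "command": "node",
    "args": ["/srv/app/dist/index.js"],
    "env": {"API_BASE_URL": "https://internal.example.com"},
}

for difference in diff_launch(inspector_launch, host_launch):
    print(difference)
```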

Incident Response

Production debugging should not begin with asking a user to reproduce the issue in chat. A production MCP server needs enough observability to identify failing capabilities, affected tenants, dependency failures, latency spikes, auth denials, payload truncation, and protocol errors from telemetry.

Define incident runbooks around failure layers. If tools/list latency spikes, investigate discovery dependencies and permission filters. If one tool has a high validation error rate, inspect schema descriptions and host prompts. If authorization denials jump after a deploy, compare scope mapping and policy changes.

Metric or Signal	Why It Matters
Initialize success rate	Detects broken deployments and version/capability negotiation failures.
Discovery latency and capability counts	Finds slow permission filters and accidental capability exposure/removal.
Tool call rate by tool and status	Shows which workflows are active and which are failing.
Validation error rate	Reveals schema or description problems causing bad model arguments.
Authorization denial rate	Highlights policy issues, missing scopes, or attempted misuse.
Backend dependency latency	Separates MCP server health from downstream system health.
Response size and truncation count	Catches unbounded outputs before they degrade model interactions.
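
Two of these signals, tool call rate by status and validation error rate, can be derived from the same call log. The sketch below is a minimal in-memory aggregation over hypothetical call records; a production service would more likely emit counters to its metrics system instead.

```python
from collections import Counter

def summarize(calls: list[dict]) -> dict:
    """Count calls by (tool, status) and compute a per-tool validation error rate."""
    by_tool_status = Counter((call["tool"], call["status"]) for call in calls)
    totals = Counter(call["tool"] for call in calls)
    validation_error_rate = {
        tool: by_tool_status[(tool, "validation_error")] / total
        for tool, total in totals.items()
    }
    return {
        "by_tool_status": dict(by_tool_status),
        "validation_error_rate": validation_error_rate,
    }

# Hypothetical call records, as they might be read back from structured logs.
calls = [
    {"tool": "search_runbooks", "status": "ok"},
    {"tool": "search_runbooks", "status": "validation_error"},
    {"tool": "search_runbooks", "status": "validation_error"},
    {"tool": "search_runbooks", "status": "ok"},
    {"tool": "create_incident_update", "status": "authorization_denied"},
]

summary = summarize(calls)
print(summary["by_tool_status"])
print(summary["validation_error_rate"])
```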

Protocol Debugging Examples

Correlate Requests, Responses, and Failures

This small trace analyzer detects a response with no matching request and a request that never received a response.

# Captured traffic: "out" is a request the client sent, "in" is a response it received.
events = [
    {"direction": "out", "id": 10, "method": "tools/list"},
    {"direction": "out", "id": 11, "method": "tools/call"},
    {"direction": "in", "id": 10, "result": {"tools": []}},
    {"direction": "in", "id": 99, "error": {"code": -32602}},
]

# Requests still waiting for a response, keyed by JSON-RPC id.
pending = {}
for event in events:
    if event["direction"] == "out":
        pending[event["id"]] = event["method"]
        continue

    # Match each response to its request; pop so answered requests leave the table.
    method = pending.pop(event["id"], None)
    if method is None:
        print("ORPHAN RESPONSE:", event["id"])
    elif "error" in event:
        print("FAILED:", method, event["error"]["code"])
    else:
        print("COMPLETED:", method)

# Anything left in pending never received a response.
for request_id, method in pending.items():
    print("MISSING RESPONSE:", request_id, method)

Correlation IDs distinguish protocol failures from handler failures.
The example reports response ID 99 as orphaned and request ID 11 as incomplete.
Real traces should also include server identity, duration, and safe error details.

Diagnose Invalid Tool Arguments

The validator reports precise field-level problems before backend code runs, making Inspector output actionable.

# Simplified stand-in for a tool input schema: required fields and their Python types.
schema = {"required": {"city": str, "days": int}, "allow_unknown": False}
arguments = {"city": "Delhi", "days": "three", "units": "metric"}

errors = []
# Check presence and type of every required field.
for field, expected_type in schema["required"].items():
    if field not in arguments:
        errors.append(f"missing field: {field}")
    elif not isinstance(arguments[field], expected_type):
        errors.append(f"{field} must be {expected_type.__name__}")

# Reject fields the schema does not declare.
if not schema["allow_unknown"]:
    unknown = set(arguments) - set(schema["required"])
    errors.extend(f"unknown field: {field}" for field in sorted(unknown))

print({"valid": not errors, "errors": errors})

The result identifies the incorrect days type and unexpected units field.
Precise validation errors are safer and faster to debug than backend stack traces.
Production validators should use the declared JSON Schema rather than a parallel hand-written schema.

Before you move on

Debugging Runbook

Run the exact configured server command manually before opening the host.
Use Inspector to verify initialize, discovery, valid calls, invalid calls, resource reads, and prompts.
Keep stdout protocol-clean for stdio servers and send diagnostics to stderr.
Compare Inspector command, host command, working directory, environment variables, and build output.
Inspect tools/list, resources/list, and prompts/list as contracts, not incidental metadata.
Separate validation, authorization, backend, transport, and host-rendering failures in logs.
Return model-usable errors while keeping sensitive diagnostic details in server logs.
Add production metrics for initialize success, discovery latency, tool status, auth denials, and backend latency.

Debugging Questions

What should I debug first when a server does not appear in a host?

Start with process startup. Run the exact configured command manually, use absolute paths, inspect stderr, check for missing environment variables, and confirm the process stays alive before debugging protocol behavior.

Should I log full prompts and resource content?

Usually no. Log identifiers, capability names, counts, durations, sanitized error classes, and correlation IDs instead. Raw prompts, tokens, secrets, and resource bodies should be redacted unless a controlled audit policy explicitly allows them.

If Inspector works but Claude Desktop or another host does not, what is most likely wrong?

The server contract is probably healthy, so compare host configuration: command path, args, working directory, environment variables, build output, restart behavior, host logs, and any cached discovery state.
