MCP Best Practices: Optimization and Performance
You have seven MCP servers connected. The AI takes 15 seconds to respond to simple questions because it is processing tool descriptions from servers you are not even using. When it does call a tool, the response floods the context window with 50,000 characters of raw JSON, pushing your earlier conversation out of memory. The database MCP returns stale schema information because it cached the metadata an hour ago. And when something breaks, you have no idea which server failed or why.
MCP is powerful, but it needs tuning. This guide covers the patterns that separate a frustrating MCP setup from a productive one.
What You’ll Walk Away With
- Strategies for managing context window usage across multiple MCP servers
- Guidelines for which servers to connect when, and how many is too many
- Caching and performance patterns for MCP server responses
- A systematic approach to debugging MCP failures
- Team configuration patterns for shared development environments
How Many Servers Is Too Many?
Every connected MCP server adds its tool descriptions to the AI’s system prompt. A typical server adds 500-2,000 tokens of tool descriptions. With seven servers, that is roughly 3,500-14,000 tokens spent on tool descriptions before the conversation even starts.
The practical limit is 3-5 active servers per session. Beyond that, you hit two problems:
- Context window pressure. Tool descriptions compete with your conversation history and file contents for space in the context window.
- Tool selection confusion. With 40+ tools available, the AI may call the wrong tool or hesitate about which one to use.
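To see how quickly tool descriptions add up, the arithmetic can be sketched in a few lines. The 500-2,000 token range per server is the estimate quoted above; the 200,000-token context window is an assumed figure for illustration only, so substitute your model's actual limit.

```python
# Rough estimate of the context spent on tool descriptions alone.
TOKENS_PER_SERVER_LOW = 500     # lower bound quoted in this guide
TOKENS_PER_SERVER_HIGH = 2_000  # upper bound quoted in this guide


def tool_description_overhead(server_count: int) -> tuple:
    """Return the (low, high) token estimate for `server_count` connected servers."""
    return (server_count * TOKENS_PER_SERVER_LOW,
            server_count * TOKENS_PER_SERVER_HIGH)


def share_of_window(tokens: int, context_window: int) -> float:
    """Fraction of the context window consumed by `tokens`."""
    return tokens / context_window


if __name__ == "__main__":
    ASSUMED_CONTEXT_WINDOW = 200_000  # assumption, not tied to any specific model
    for count in (3, 5, 7):
        low, high = tool_description_overhead(count)
        worst = share_of_window(high, ASSUMED_CONTEXT_WINDOW)
        print(f"{count} servers: {low:,}-{high:,} tokens (up to {worst:.1%} of the window)")
```

Real servers vary widely, so treat the output as an order-of-magnitude check rather than a measurement.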
Before you can prune, you need to see what is actually loaded. Each tool exposes this differently:
In Cursor, open Settings > Tools & Integrations > MCP. Each server shows a status dot and an expandable list of the tools it contributes. Toggle a server off to drop its tool descriptions from the next request without deleting the config.

In Claude Code, list and inspect servers from the terminal:

    # List every configured server and its scope
    claude mcp list
    # Inspect one server's command, env, and resolved status
    claude mcp get github

Inside the REPL, /mcp shows connection status and lets you authenticate or reconnect a server interactively.

In Codex, servers live in ~/.codex/config.toml under [mcp_servers.<name>]. List and manage them with codex mcp list (and codex mcp get <name>) from the terminal, or run /mcp inside the TUI to see which servers are active and how many tools each loaded.
Server Selection by Task
Instead of connecting every server at startup, connect what you need for the current task:
| Task | Recommended Servers |
|---|---|
| Feature implementation | GitHub MCP + Context7 + Database MCP |
| Bug investigation | GitHub MCP + Playwright MCP + Database MCP |
| Design-to-code | Figma MCP + shadcn/ui MCP |
| Ticket-driven development | Atlassian MCP + GitHub MCP |
| Infrastructure work | Cloudflare MCP (Workers + KV + D1) |
| Documentation writing | Context7 + Filesystem MCP |
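The table can also be applied mechanically: keep one full server catalog and emit a trimmed config per task. The sketch below is one way to do that. The task keys and server names are hypothetical identifiers chosen for this example, and the `mcpServers` wrapper follows the JSON configs shown elsewhere in this guide.

```python
# Task-to-server mapping taken from the table above (identifiers are hypothetical).
SERVERS_BY_TASK = {
    "feature": ["github", "context7", "database"],
    "bug": ["github", "playwright", "database"],
    "design": ["figma", "shadcn"],
    "tickets": ["atlassian", "github"],
    "infra": ["cloudflare"],
    "docs": ["context7", "filesystem"],
}


def select_servers(all_servers: dict, task: str) -> dict:
    """Return a config containing only the servers recommended for `task`."""
    wanted = SERVERS_BY_TASK[task]
    missing = [name for name in wanted if name not in all_servers]
    if missing:
        raise KeyError(f"servers not configured: {missing}")
    return {"mcpServers": {name: all_servers[name] for name in wanted}}
```

Writing the result to the client's config file before a session keeps unused tool descriptions out of the prompt without deleting any server definitions.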
Context Window Management
The Problem: MCP Response Overflow
A database schema inspection can return 20,000+ characters. A GitHub code search can return dozens of file excerpts. A Confluence page can be 10,000 words. Each MCP tool response consumes context window space, and that space does not come back until the conversation resets.
The Solution: Scoped Prompts
Instead of broad prompts, be specific about what you need:
Too broad: “Show me the database schema.” (Returns entire schema, possibly 30+ tables)
Better: “Show me the schema for the users and orders tables only, including their foreign key relationships.”
Too broad: “Search GitHub for authentication code.” (Returns dozens of files)
Better: “Search GitHub for files in the src/auth/ directory that handle JWT token validation.”
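The effect of scoping is easy to quantify. The sketch below filters a schema down to the requested tables and compares serialized sizes. The schema is invented purely for illustration and does not come from any real server.

```python
import json


def scope_schema(schema: dict, tables: list) -> dict:
    """Keep only the requested tables from a {table_name: definition} mapping."""
    return {name: schema[name] for name in tables if name in schema}


def serialized_size(payload: dict) -> int:
    """Characters the payload would occupy if returned as JSON text."""
    return len(json.dumps(payload))


if __name__ == "__main__":
    # Made-up schema: 30 tables with a handful of columns each.
    columns = {"id": "bigint", "created_at": "timestamptz", "name": "text"}
    schema = {f"table_{i}": dict(columns) for i in range(28)}
    schema["users"] = dict(columns)
    schema["orders"] = dict(columns, user_id="bigint references users(id)")

    scoped = scope_schema(schema, ["users", "orders"])
    print(serialized_size(schema), "characters unscoped")
    print(serialized_size(scoped), "characters scoped")
```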
Caching and Performance
MCP Server Startup Time
STDIO-based servers start a new process for each connection. If the server has heavy dependencies (a Python server importing ML libraries, a Node server bundling a large SDK), startup can take 5-10 seconds.
Mitigation strategies:
- Keep dependencies minimal. Your MCP server should be a thin wrapper, not a monolith.
- Pre-build your server. Ship compiled JavaScript instead of transpiling TypeScript at startup.
- Use npx -y and ensure the package is cached locally by running it once before your work session. Pin a version in args (for example @upstash/context7-mcp@1.0.0) so the startup cost is predictable and reproducible across the team.
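To check whether startup is actually the bottleneck, you can time how long a STDIO server takes to answer its first request. This is a minimal sketch, assuming the server speaks newline-delimited JSON-RPC over stdin/stdout as the MCP stdio transport does. The protocol version string and client name are illustrative placeholders, and the call blocks until the server replies or exits.

```python
import json
import subprocess
import sys
import time


def time_first_response(command: list):
    """Start `command`, send an MCP initialize request, and time the first reply line."""
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # placeholder; use the version your client speaks
            "capabilities": {},
            "clientInfo": {"name": "startup-probe", "version": "0.0.0"},  # hypothetical name
        },
    }
    start = time.monotonic()
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL, text=True)
    try:
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        line = proc.stdout.readline()  # empty string if the server exited without replying
        elapsed = time.monotonic() - start
    finally:
        proc.kill()
        proc.wait()
    return elapsed, (json.loads(line) if line.strip() else None)


if __name__ == "__main__" and len(sys.argv) > 1:
    seconds, reply = time_first_response(sys.argv[1:])
    print(f"first response after {seconds:.2f}s; replied: {reply is not None}")
```

Run it twice in a row: a large gap between the first and second run suggests the cost is package download or cache warm-up rather than the server's own imports.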
Schema and Metadata Caching
Database MCP servers often cache schema metadata at startup. If you run a migration, the MCP server will still report the old schema until restarted.
To restart an MCP server in Cursor:
- Open Settings > Tools & Integrations > MCP
- Find the server with the stale data
- Click the refresh icon to restart it
In Claude Code, remove and re-add the server:

    # Remove and re-add the server (use a maintained Postgres MCP, e.g. postgres-mcp)
    claude mcp remove database-server
    claude mcp add database-server -- npx -y postgres-mcp

Or restart all MCP servers by exiting and restarting Claude Code. Inside the REPL, run /mcp to see which servers reconnected.
Codex reads servers from ~/.codex/config.toml ([mcp_servers.<name>]) at launch, so quit and restart the session to reinitialize a server. There is no per-server hot-restart. Confirm the reload with /mcp inside the TUI, or codex mcp list from the terminal.
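If you maintain the database server yourself, one way to avoid stale schema data is to give the cached metadata an expiry and an explicit invalidation hook. The sketch below is a generic time-to-live cache, not the behavior of any particular MCP server. `load_schema` stands in for whatever function queries the database.

```python
import time
from typing import Callable, Optional


class SchemaCache:
    """Cache schema metadata for `ttl_seconds`, reloading on expiry or invalidation."""

    def __init__(self, load_schema: Callable[[], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._load = load_schema
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._value: Optional[dict] = None
        self._loaded_at = 0.0

    def get(self) -> dict:
        """Return the cached schema, reloading it if missing or older than the TTL."""
        now = self._clock()
        if self._value is None or now - self._loaded_at >= self._ttl:
            self._value = self._load()
            self._loaded_at = now
        return self._value

    def invalidate(self) -> None:
        """Drop the cached schema, for example right after a migration runs."""
        self._value = None
```

A short TTL trades a few extra metadata queries for fresher answers; calling `invalidate()` from the migration path removes the staleness window entirely.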
Rate Limiting
External API-backed servers (GitHub, Atlassian, Cloudflare) have rate limits. GitHub allows 5,000 API requests per hour for authenticated users. A single prompt like “analyze all open PRs” could burn through hundreds of requests.
Protect yourself:
- Limit the scope of your prompts. “Analyze the last 5 PRs” instead of “analyze all open PRs.”
- Use local alternatives when possible. The Git MCP server works against your local repo without API calls.
- Monitor your rate limit status. “Check the GitHub API rate limit status” is a valid prompt with the GitHub MCP server.
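If you call the GitHub REST API directly from scripts, the same budget check can be automated from the rate-limit headers GitHub returns on each response (`x-ratelimit-limit`, `x-ratelimit-remaining`, `x-ratelimit-reset`). The sketch below only parses headers you already have. The `reserve` of 100 requests is an arbitrary safety margin chosen for illustration.

```python
import time
from typing import Optional


def rate_limit_status(headers: dict, now: Optional[float] = None) -> dict:
    """Summarize GitHub rate-limit headers; `x-ratelimit-reset` is epoch seconds."""
    now = time.time() if now is None else now
    lowered = {key.lower(): value for key, value in headers.items()}
    reset_at = int(lowered["x-ratelimit-reset"])
    return {
        "limit": int(lowered["x-ratelimit-limit"]),
        "remaining": int(lowered["x-ratelimit-remaining"]),
        "seconds_until_reset": max(0, reset_at - now),
    }


def can_afford(status: dict, planned_requests: int, reserve: int = 100) -> bool:
    """True if `planned_requests` still leaves at least `reserve` requests in the budget."""
    return status["remaining"] - planned_requests >= reserve
```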
Debugging MCP Failures
Systematic Diagnosis
When an MCP server is not working, follow this sequence:
- Check server status. Is the server running? Can it start at all?
- Check authentication. Are the credentials valid? Did the OAuth token expire?
- Check the tool call. Is the AI calling the right tool with valid parameters?
- Check the response. Is the server returning data, an error, or nothing?
- Check the transport. Is the connection between client and server healthy?
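The point of the sequence is to stop at the first failing layer instead of investigating everything at once. A minimal sketch of that control flow follows. The check functions are placeholders you would replace with real probes, such as starting the server or validating a token.

```python
from typing import Callable, Optional


def first_failure(checks: list) -> Optional[str]:
    """Run (name, check) pairs in order and return the name of the first that fails."""
    for name, check in checks:
        if not check():
            return name
    return None


if __name__ == "__main__":
    # Placeholder probes; each returns True when its layer looks healthy.
    checks = [
        ("server status", lambda: True),
        ("authentication", lambda: True),
        ("tool call", lambda: False),
        ("response", lambda: True),
        ("transport", lambda: True),
    ]
    failed = first_failure(checks)
    print("all checks passed" if failed is None else f"investigate: {failed}")
```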
Common Failures and Fixes
“Server disconnected” or “Connection reset.”
The MCP server process crashed. Check the server’s stderr output for error messages. Common causes: unhandled promise rejections, missing environment variables, or dependency installation failures.

    # Test the server directly to see errors
    node /path/to/mcp-server/index.mjs 2>&1

“Tool not found” after server connects.
The server connected, but the tool you are trying to use is not registered. This happens when an npx-launched server is resolving to a stale cached version. npm update does nothing here, because nothing is installed in a package.json tree — npx resolves from its own cache. Force the latest version, or pin one in args:

    # Refresh the npx-cached server to the latest release
    npx -y postgres-mcp@latest
    # Or clear the npx cache entirely, then re-launch
    npx clear-npx-cache

Pinning a version in your config args (for example postgres-mcp@1.0.5) is the reliable fix: the tool set is then deterministic.
“Timeout” on tool calls.
The tool is taking longer than the client’s timeout. This usually means the underlying API is slow or the query is too broad. Narrow your request first. If the operation is legitimately slow, raise the tool-call timeout in the client’s configuration where it exposes one; asking for more time in the prompt (“allow up to 60 seconds”) does not change the client’s timeout.
AI calls the wrong server’s tool.
With multiple servers providing similar tools, the AI can pick the wrong one. Be explicit: “Using the GitHub MCP server (not the Git MCP server), search for files containing ‘validateToken’.”
“Permission denied” from the server.
The credentials have insufficient scope. See the MCP Security guide for token scoping recommendations.
Team Configuration Patterns
Shared Base Configuration
Create a base MCP configuration that every developer on the team uses. Use GitHub’s official remote server (https://api.githubcopilot.com/mcp/) rather than the old @modelcontextprotocol/server-github package, which is now archived and deprecated on npm:

    {
      "mcpServers": {
        "github": {
          "url": "https://api.githubcopilot.com/mcp/",
          "headers": { "Authorization": "Bearer ${GITHUB_TOKEN}" }
        },
        "context7": {
          "command": "npx",
          "args": ["-y", "@upstash/context7-mcp@latest"]
        }
      }
    }

Commit this to the repository. Each developer sets GITHUB_TOKEN in their local environment. The @latest tag is convenient for a first setup; replace it with a pinned version once the team agrees on one, so every developer loads the same tool set. In Claude Code the equivalent one-liner is claude mcp add --transport http github https://api.githubcopilot.com/mcp/; in Codex, add a [mcp_servers.github] table with url and bearer_token_env_var = "GITHUB_TOKEN".
Project-Specific Overrides
Add project-specific servers in a .mcp.json or project-scoped settings file. The original @modelcontextprotocol/server-postgres is an archived reference implementation; prefer a maintained Postgres MCP such as postgres-mcp:

    {
      "mcpServers": {
        "project-db": {
          "command": "npx",
          "args": ["-y", "postgres-mcp"],
          "env": { "DATABASE_URL": "${DATABASE_URL}" }
        }
      }
    }

Onboarding Checklist
Document the MCP setup in your project’s contributing guide:
- Install the required MCP servers (list them)
- Create tokens with the documented scopes (link to the guide)
- Set environment variables (list them)
- Test each server connection (provide test prompts)
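Part of this checklist can be automated. The sketch below scans a config file's text for `${VAR}` references, the placeholder syntax used in the JSON examples in this guide, and reports any that are unset in the current environment. How a given client actually expands these placeholders is outside the scope of the sketch.

```python
import os
import re

# Matches ${NAME} placeholders such as ${GITHUB_TOKEN}.
ENV_REFERENCE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text: str, environ=None) -> list:
    """Return referenced variable names that are unset or empty, sorted by name."""
    environ = os.environ if environ is None else environ
    referenced = sorted(set(ENV_REFERENCE.findall(config_text)))
    return [name for name in referenced if not environ.get(name)]
```

A new team member can run this against the committed config and get a concrete list of variables to set before the first session.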
When This Breaks
Everything slows down after adding a new server. The new server’s tool descriptions are large or the server takes a long time to start. Disconnect servers you are not actively using.
AI stops using MCP tools and starts guessing. The context window is full. Start a new conversation or disconnect unused servers to free up space.
MCP servers work locally but fail in CI. CI environments typically do not have the same environment variables, tokens, or network access. Use mock MCP servers in CI or skip MCP-dependent tests.
Different team members get different results. MCP server versions may differ. Pin versions in your shared configuration and document the expected version.
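Version drift of this kind can be caught in review with a small check over the shared config. The sketch below flags npx-launched servers whose package has no version or uses the `latest` tag. It assumes the first non-flag entry in `args` is the package spec, which matches the configs in this guide but may not hold for every server.

```python
def package_version(spec: str) -> tuple:
    """Split 'name@version' or '@scope/name@version'; version is None when absent."""
    at = spec.rfind("@")
    if at <= 0:  # no '@' at all, or only the leading scope marker
        return spec, None
    return spec[:at], spec[at + 1:]


def unpinned_servers(config: dict) -> list:
    """Names of npx-launched servers that are not pinned to an exact version."""
    flagged = []
    for name, server in config.get("mcpServers", {}).items():
        if server.get("command") != "npx":
            continue  # remote (url) servers and other launchers are out of scope here
        packages = [arg for arg in server.get("args", []) if not arg.startswith("-")]
        if not packages:
            continue
        _, version = package_version(packages[0])
        if version is None or version == "latest":
            flagged.append(name)
    return flagged
```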