MCP Optimization and Troubleshooting

You have seven MCP servers connected. The AI takes 15 seconds to respond to simple questions because it is processing tool descriptions from servers you are not even using. When it does call a tool, the response floods the context window with 50,000 characters of raw JSON, pushing your earlier conversation out of memory. The database MCP returns stale schema information because it cached the metadata an hour ago. And when something breaks, you have no idea which server failed or why.

MCP is powerful, but it needs tuning. This guide covers the patterns that separate a frustrating MCP setup from a productive one.

  • Strategies for managing context window usage across multiple MCP servers
  • Guidelines for which servers to connect when, and how many is too many
  • Caching and performance patterns for MCP server responses
  • A systematic approach to debugging MCP failures
  • Team configuration patterns for shared development environments

Every connected MCP server adds its tool descriptions to the AI’s system prompt. A typical server adds 500-2,000 tokens of tool descriptions. With seven servers, you might burn 10,000 tokens on tool descriptions before the conversation even starts.

The practical limit is 3-5 active servers per session. Beyond that, you hit two problems:

  1. Context window pressure. Tool descriptions compete with your conversation history and file contents for space in the context window.
  2. Tool selection confusion. With 40+ tools available, the AI may call the wrong tool or hesitate about which one to use.
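A back-of-envelope calculation makes the pressure concrete. The per-server figure below is the midpoint of the 500-2,000 token range above; the 200k context window is illustrative and varies by model:

```python
# Rough estimate of context window overhead from MCP tool descriptions.
# 1,500 tokens/server is the midpoint of the 500-2,000 range; the
# 200,000-token window is illustrative and varies by model.
def description_overhead(servers: int, tokens_per_server: int = 1500) -> int:
    """Tokens consumed by tool descriptions before the conversation starts."""
    return servers * tokens_per_server

overhead = description_overhead(7)
print(f"{overhead} tokens of tool descriptions")        # prints "10500 tokens of tool descriptions"
print(f"{overhead * 100 // 200_000}% of a 200k window")  # prints "5% of a 200k window"
```

That budget is spent on every single turn of the conversation, before any file contents or history.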

Instead of connecting every server at startup, connect what you need for the current task:

  • Feature implementation: GitHub MCP + Context7 + Database MCP
  • Bug investigation: GitHub MCP + Playwright MCP + Database MCP
  • Design-to-code: Figma MCP + shadcn/ui MCP
  • Ticket-driven development: Atlassian MCP + GitHub MCP
  • Infrastructure work: Cloudflare MCP (Workers + KV + D1)
  • Documentation writing: Context7 + Filesystem MCP
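One way to make switching cheap is to keep a small configuration per task profile and load only the one you need. A sketch for the bug-investigation row, assuming a client that reads an `mcpServers` map like the examples later in this guide (package names are the commonly published ones; verify them against your client's documentation):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    },
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp"]
    },
    "project-db": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"]
    }
  }
}
```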

Tool descriptions are not the only context drain; tool responses are often larger. A database schema inspection can return 20,000+ characters. A GitHub code search can return dozens of file excerpts. A Confluence page can run to 10,000 words. Each MCP tool response consumes context window space, and that space does not come back until the conversation resets.

Instead of broad prompts, be specific about what you need:

Too broad: “Show me the database schema.” (Returns entire schema, possibly 30+ tables)

Better: “Show me the schema for the users and orders tables only, including their foreign key relationships.”

Too broad: “Search GitHub for authentication code.” (Returns dozens of files)

Better: “Search GitHub for files in the src/auth/ directory that handle JWT token validation.”

Context is not the only cost; startup time matters too. STDIO-based servers start a new process for each connection. If the server has heavy dependencies (a Python server importing ML libraries, a Node server bundling a large SDK), startup can take 5-10 seconds.

Mitigation strategies:

  • Keep dependencies minimal. Your MCP server should be a thin wrapper, not a monolith.
  • Pre-build your server. Ship compiled JavaScript instead of transpiling TypeScript at startup.
  • Use npx with --yes and ensure the package is cached locally by running it once before your work session.
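For a server you maintain, the pre-build advice means the client configuration should launch the compiled output directly rather than a build tool. A sketch (the path and server name are illustrative):

```json
{
  "mcpServers": {
    "my-server": {
      "command": "node",
      "args": ["/path/to/mcp-server/dist/index.js"]
    }
  }
}
```

Launching `node` on a prebuilt file skips both dependency resolution and TypeScript transpilation on every connection.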

Database MCP servers often cache schema metadata at startup. If you run a migration, the MCP server will still report the old schema until restarted.

To restart an MCP server in Cursor:

  1. Open Settings > Tools & Integrations > MCP
  2. Find the server with the stale data
  3. Click the refresh icon to restart it

External API-backed servers (GitHub, Atlassian, Cloudflare) have rate limits. GitHub allows 5,000 API requests per hour for authenticated users. A single prompt like “analyze all open PRs” could burn through hundreds of requests.

Protect yourself:

  • Limit the scope of your prompts. “Analyze the last 5 PRs” instead of “analyze all open PRs.”
  • Use local alternatives when possible. The Git MCP server works against your local repo without API calls.
  • Monitor your rate limit status. “Check the GitHub API rate limit status” is a valid prompt with the GitHub MCP server.
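You can also check the quota outside the AI loop: GitHub's `/rate_limit` endpoint reports the numbers and does not itself count against the primary rate limit. A sketch that parses the relevant fields from a sample response; in practice you would fetch the JSON with `curl` or an authenticated HTTP client:

```python
import json

# Sample of the JSON shape returned by GitHub's /rate_limit endpoint;
# a real check would fetch it with an authenticated request.
sample = """
{
  "resources": {
    "core": {"limit": 5000, "remaining": 4321, "reset": 1735689600}
  }
}
"""

core = json.loads(sample)["resources"]["core"]
used = core["limit"] - core["remaining"]
print(f"{core['remaining']} of {core['limit']} requests remaining ({used} used)")
```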

When an MCP server is not working, follow this sequence:

  1. Check server status. Is the server running? Can it start at all?
  2. Check authentication. Are the credentials valid? Did the OAuth token expire?
  3. Check the tool call. Is the AI calling the right tool with valid parameters?
  4. Check the response. Is the server returning data, an error, or nothing?
  5. Check the transport. Is the connection between client and server healthy?

“Server disconnected” or “Connection reset.”

The MCP server process crashed. Check the server’s stderr output for error messages. Common causes: unhandled promise rejections, missing environment variables, or dependency installation failures.

```shell
# Test the server directly to see errors
node /path/to/mcp-server/index.mjs 2>&1
```
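To go one level deeper, you can hand-drive the stdio transport: MCP sessions begin with a JSON-RPC `initialize` request, so piping one into the server process verifies it can respond at all. A sketch that builds the request (the field shape follows the MCP `initialize` message; the protocol version string is one published revision):

```python
import json

# Minimal JSON-RPC "initialize" request for a stdio MCP server.
# Pipe the printed line into the server process, e.g.:
#   python make_init.py | node /path/to/mcp-server/index.mjs
# (make_init.py is an illustrative filename for this script.)
# A healthy server answers with an "initialize" result on stdout.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # one published MCP revision
        "capabilities": {},
        "clientInfo": {"name": "debug-client", "version": "0.0.1"},
    },
}
print(json.dumps(request))
```

If the server prints nothing in response, the problem is in the server itself, not in your client configuration.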

“Tool not found” after server connects.

The server connected, but the tool you are trying to use is not registered. This happens when the server version does not match the documentation. Update the server:

```shell
npm update @modelcontextprotocol/server-github
```

“Timeout” on tool calls.

The tool is taking longer than the client’s timeout. This usually means the underlying API is slow or the query is too broad. Narrow your request or increase the timeout in your prompt: “Query the database, allowing up to 60 seconds for the response.”

AI calls the wrong server’s tool.

With multiple servers providing similar tools, the AI can pick the wrong one. Be explicit: “Using the GitHub MCP server (not the Git MCP server), search for files containing ‘validateToken’.”

“Permission denied” from the server.

The credentials have insufficient scope. See the MCP Security guide for token scoping recommendations.

Create a base MCP configuration that every developer on the team uses:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    },
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp@latest"]
    }
  }
}
```

Commit this to the repository. Each developer sets GITHUB_TOKEN in their local environment.

Add project-specific servers in a .mcp.json or project-scoped settings file:

```json
{
  "mcpServers": {
    "project-db": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": {
        "DATABASE_URL": "${DATABASE_URL}"
      }
    }
  }
}
```

Document the MCP setup in your project’s contributing guide:

  1. Install the required MCP servers (list them)
  2. Create tokens with the documented scopes (link to the guide)
  3. Set environment variables (list them)
  4. Test each server connection (provide test prompts)
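Step 3 is the one that fails silently most often, so it is worth automating. A small check a contributing guide could ship (the variable names match the shared configuration above):

```python
import os

# Variable names from the shared MCP configuration in this guide.
REQUIRED_VARS = ["GITHUB_TOKEN", "DATABASE_URL"]

def missing_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

# Example with an explicit environment; call missing_vars(REQUIRED_VARS)
# against the real one during onboarding.
print(missing_vars(REQUIRED_VARS, {"GITHUB_TOKEN": "ghp_example"}))
# prints ['DATABASE_URL']
```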

Everything slows down after adding a new server. The new server’s tool descriptions are large or the server takes a long time to start. Disconnect servers you are not actively using.

AI stops using MCP tools and starts guessing. The context window is full. Start a new conversation or disconnect unused servers to free up space.

MCP servers work locally but fail in CI. CI environments typically do not have the same environment variables, tokens, or network access. Use mock MCP servers in CI or skip MCP-dependent tests.

Different team members get different results. MCP server versions may differ. Pin versions in your shared configuration and document the expected version.
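Pinning means replacing a floating tag such as @latest with an exact version in the shared configuration. A sketch using the Context7 server from the base config (the version number is a placeholder; use the one your team has validated):

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp@1.0.0"]
    }
  }
}
```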