How MCP Servers Make Coding Agents Expensive
MCP gives coding agents useful tools, but tool names, descriptions, schemas, default availability, and response payloads all draw on a limited context budget.
MCP is one of the better things to happen to coding agents.
It gives Claude Code, Cursor, Codex, and custom agents a standard way to discover tools and call into real systems. Browsers, databases, ticket trackers, deployment systems, design tools, logs, spreadsheets, and internal APIs can become part of the agent's working environment instead of something a human has to copy and paste.
That is the good part.
The part teams miss is that every enabled tool has a carrying cost somewhere.
That cost is not identical in every client. Claude Code can defer MCP tool definitions behind tool search. Cursor lets you disable MCP tools so they are not loaded into context or available to Agent. Codex gives you configuration controls for MCP server enablement, tool allowlists and denylists, approval behavior, output token limits, and profiles.
So the useful claim is not:
Every byte of every MCP tool is always stuffed into the prompt.
The useful claim is narrower:
Every tool that is visible, searchable, enabled by default, or likely to return large payloads becomes part of the agent's context economy.
The cost may show up as tool names at startup, tool-search decisions, schema detail after a tool is selected, output history after a call, or compaction pressure later in the session.
The expensive mistake is treating MCP servers as free capability.
They are not free. They are capability plus context tax.
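To make the carrying cost concrete, here is a minimal sketch that estimates how much text a tool catalog contributes, split by names, descriptions, and schemas. It assumes a rough four-characters-per-token heuristic, which is an approximation and not the tokenizer of any particular client. The two tool definitions are hypothetical; only the `name`, `description`, and `inputSchema` field names follow the shape of an MCP tool definition.

```python
import json

# Assumption: roughly 4 characters per token. Real tokenizers differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a coarse token estimate (rounded up) for a piece of text."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def catalog_footprint(tools: list[dict]) -> dict[str, int]:
    """Estimate tokens spent on names, descriptions, and schemas."""
    totals = {"names": 0, "descriptions": 0, "schemas": 0}
    for tool in tools:
        totals["names"] += estimate_tokens(tool["name"])
        totals["descriptions"] += estimate_tokens(tool.get("description", ""))
        totals["schemas"] += estimate_tokens(json.dumps(tool.get("inputSchema", {})))
    return totals


# Hypothetical tool definitions, used only to exercise the estimate.
HYPOTHETICAL_TOOLS = [
    {
        "name": "find_recent_failed_jobs",
        "description": "Use to list failed jobs for one service in the last day.",
        "inputSchema": {
            "type": "object",
            "properties": {"service_name": {"type": "string"}},
            "required": ["service_name"],
        },
    },
    {
        "name": "query_database",
        "description": "Run arbitrary SQL against the warehouse.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "sql": {"type": "string"},
                "timeout": {"type": "integer"},
                "role": {"type": "string"},
                "warehouse": {"type": "string"},
                "format": {"type": "string", "enum": ["json", "csv"]},
                "include_metadata": {"type": "boolean"},
                "max_rows": {"type": "integer"},
            },
            "required": ["sql"],
        },
    },
]

footprint = catalog_footprint(HYPOTHETICAL_TOOLS)
print(footprint)
```

The absolute numbers are not meaningful. The split is the useful part: it shows whether names, descriptions, or schemas dominate the footprint of a given catalog.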
What actually costs you
When teams talk about MCP overhead, they usually think about latency.
The browser tool takes time. The database query takes time. The GitHub lookup takes time. The external API may fail. Those costs are real, but they are visible.
The subtler cost appears before and after the call.
Before a tool is called, the agent has to route:
- Which tools exist?
- Which one is relevant?
- What arguments does it need?
- What risk does the call carry?
- Is the response worth adding to the session?
After a tool is called, the agent has to carry or summarize the result.
That result can be more expensive than the tool definition. A log search, browser snapshot, database query, ticket search, or docs lookup can flood the next turn with text the agent did not need.
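One way to keep a result from flooding the next turn is to cap it on the server side and tell the agent what was left out. The sketch below is a hypothetical helper, not part of any MCP SDK; the function name, the character budget, and the sample log lines are all assumptions for illustration.

```python
def cap_rows(rows: list[str], max_chars: int = 2000) -> dict:
    """Keep the leading rows that fit in max_chars and report what was dropped."""
    kept: list[str] = []
    used = 0
    for row in rows:
        if used + len(row) > max_chars:
            break
        kept.append(row)
        used += len(row)
    return {
        "rows": kept,
        "omitted": len(rows) - len(kept),
        "truncated": len(kept) < len(rows),
    }


# Hypothetical log search result: 60 similar lines the agent rarely needs in full.
log_lines = [f"ERROR timeout in worker {i}" for i in range(60)]
result = cap_rows(log_lines, max_chars=500)
print(len(result["rows"]), "rows kept,", result["omitted"], "omitted")
```

Reporting the omitted count matters as much as the cap itself: the agent can decide whether to ask for more instead of silently reasoning over a partial result.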
This is why a team can add useful MCP servers and still make sessions feel slower, noisier, or more expensive.
The tools work. The tool surface is the problem.
The current-client nuance
Modern coding clients already have mitigations. Use them.
| Client | Useful controls | What still needs design |
|---|---|---|
| Claude Code | MCP tool search can defer tool definitions and load them on demand. /mcp shows connected servers and tool counts. MCP output limits can warn when a response is too large. | Tool names and server instructions still need to route well. Used tools and large outputs still enter the session. |
| Codex | MCP servers are configured in ~/.codex/config.toml or trusted project config. Config supports enabled, enabled_tools, disabled_tools, per-tool approval, profiles, compaction thresholds, and tool_output_token_limit. /mcp lists available MCP tools. | Enabled tools still expand the action surface. Tool results still need output discipline. Profiles only help if teams actually use them. |
| Cursor | MCP tools appear under available tools and can be toggled. Cursor docs say disabled tools are not loaded into context or available to Agent. | Enabled tools still need clear names, short descriptions, narrow schemas, and small default responses. |
That nuance matters. The right answer is not "MCP is bad."
The right answer is that MCP needs interface design.
API-shaped tools are the trap
Most teams design tools like APIs.
That usually means broad methods, flexible filters, many optional parameters, generic descriptions, complete response objects, and lots of enum values.
This is natural. Engineers are trained to expose power and flexibility.
For agents, the better interface is often narrower.
Instead of:
    query_database(sql, timeout, role, warehouse, format, include_metadata, max_rows)
you may want:
    find_recent_failed_jobs(service_name)
Instead of:
    github_search(query, type, sort, order, page, per_page)
you may want:
    find_open_prs_touching_file(path)
The low-level tool may still exist. But if it is exposed in every coding session, the agent has to reason about it every time.
Good agent tools are not just API wrappers. They are task affordances.
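As a rough illustration of the difference, the sketch below wraps a broad, API-shaped query function in a narrow task affordance. Everything here is hypothetical: the `Job` record, the in-memory `JOBS` data, the `query_jobs` helper, and the 60-minute and five-row defaults are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    service_name: str
    status: str
    job_id: str
    age_minutes: int


# Hypothetical in-memory stand-in for the data a broad query tool would reach.
JOBS = [
    Job("billing", "failed", "job-101", age_minutes=12),
    Job("billing", "succeeded", "job-102", age_minutes=8),
    Job("billing", "failed", "job-090", age_minutes=900),
    Job("search", "failed", "job-201", age_minutes=30),
]


def query_jobs(
    service_name: Optional[str] = None,
    status: Optional[str] = None,
    max_age_minutes: Optional[int] = None,
    max_rows: int = 100,
) -> list[Job]:
    """Broad, API-shaped access: the caller has to choose every filter."""
    rows = [
        job
        for job in JOBS
        if (service_name is None or job.service_name == service_name)
        and (status is None or job.status == status)
        and (max_age_minutes is None or job.age_minutes <= max_age_minutes)
    ]
    return rows[:max_rows]


def find_recent_failed_jobs(service_name: str) -> list[str]:
    """Narrow task affordance: one argument, fixed filters, small result."""
    jobs = query_jobs(service_name, status="failed", max_age_minutes=60, max_rows=5)
    return [job.job_id for job in jobs]


print(find_recent_failed_jobs("billing"))
```

The broad function still exists for the cases that need it. The agent-facing surface is the one-argument function, so routing depends on a single decision rather than four filters.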
A practical MCP audit
Start with one normal coding workflow, then list every MCP server and tool available to the agent during that workflow.
For each tool, answer:
| Question | Why it matters |
|---|---|
| Can the use case be inferred from the name? | Bad names force the model to read or retrieve more detail. |
| Does the description say when to use and avoid the tool? | Descriptions should route, not document the whole API. |
| Is the tool default, task-profile, manual, or disabled? | Useful does not mean always available. |
| How large is the schema? | Optional parameters and huge enums create irrelevant choices. |
| What does the response return by default? | Full payloads can be more expensive than the call itself. |
| What happens if the model calls it at the wrong time? | Write-capable and high-risk tools need stricter availability and verification. |
| Could a shell command, script, or narrower tool do this better? | MCP is not automatically better than deterministic primitives. |
This audit usually finds obvious waste:
- tools that are always available but rarely used
- tools with overlapping names or descriptions
- tools that expose a whole platform API for one common workflow
- tools that return complete records when the agent needs status, count, or link
- tools that duplicate shell commands with worse output
- write-capable tools that do not report how success was verified
The goal is not to remove useful tools. The goal is to stop paying for irrelevant ones in every session.
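Parts of this audit can be automated. The following is a minimal sketch, assuming each tool is described by a dictionary with MCP-style `name`, `description`, and `inputSchema` fields plus two hypothetical audit fields, `availability` and `calls_last_30_days`, that a team would have to collect itself. The thresholds are arbitrary starting points, and the keyword check on descriptions is a crude proxy for real review.

```python
import json


def audit_tool(tool: dict, max_schema_chars: int = 600, max_enum_values: int = 20) -> list[str]:
    """Return human-readable findings for one tool definition."""
    findings: list[str] = []
    description = tool.get("description", "").lower()
    schema = tool.get("inputSchema", {})

    # Crude routing check: does the description say when not to use the tool?
    if "avoid" not in description and "do not use" not in description:
        findings.append("description does not say when to avoid the tool")

    if len(json.dumps(schema)) > max_schema_chars:
        findings.append("schema is large")

    for name, prop in schema.get("properties", {}).items():
        if len(prop.get("enum", [])) > max_enum_values:
            findings.append(f"parameter '{name}' has a large enum")

    # 'availability' and 'calls_last_30_days' are hypothetical audit fields.
    if tool.get("availability", "default") == "default" and tool.get("calls_last_30_days", 0) == 0:
        findings.append("always available but unused")

    return findings


# Hypothetical tools used to exercise the audit.
broad_tool = {
    "name": "query_database",
    "description": "Run SQL.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "warehouse": {"type": "string", "enum": [f"wh_{i}" for i in range(80)]},
        },
    },
    "availability": "default",
    "calls_last_30_days": 0,
}
narrow_tool = {
    "name": "find_recent_failed_jobs",
    "description": "Use for failed jobs in one service. Avoid for ad hoc SQL.",
    "inputSchema": {"type": "object", "properties": {"service_name": {"type": "string"}}},
    "availability": "task-profile",
    "calls_last_30_days": 14,
}

for tool in (broad_tool, narrow_tool):
    print(tool["name"], audit_tool(tool))
```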
Five design rules
Good MCP hygiene is mostly boring interface work.
- Name tools for the task. find_open_prs_touching_file is easier for an agent to route than github_search.
- Make descriptions directional. A description should answer: when should the agent use this, when should it avoid it, what question does it answer, and what should the agent do with the result?
- Narrow schemas to the workflow. If a parameter is almost never used, remove it from the default tool. If an enum has 80 values, consider search. If advanced cases matter, make an advanced follow-up tool.
- Return summaries first. A test-history tool should not return every run, every log line, and every artifact URL by default. It should return latest status, failing test names, likely owner or package, a link or command for details, and the next diagnostic action.
- Make write tools prove success. A deployment tool should not only say it triggered a deploy. It should say what environment changed, which commit is running, whether health checks passed, and where to inspect failures.
A good default response answers the next decision. It does not satisfy every possible investigation.
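The summary-first rule can be sketched as a small reduction step between the raw data and the tool response. This is an illustrative sketch only: the run record fields (`run`, `status`, `failed_tests`, `log`), the sample data, and the `details_hint` argument are hypothetical, and a real tool would add the owner and next-action fields from its own data.

```python
def summarize_test_history(runs: list[dict], details_hint: str) -> dict:
    """Reduce full run records to the fields needed for the next decision."""
    if not runs:
        return {"latest_status": "unknown", "failing_tests": [], "details": details_hint}
    # Assumption: a higher 'run' number means a more recent run.
    latest = max(runs, key=lambda run: run["run"])
    return {
        "latest_status": latest["status"],
        "failing_tests": sorted(latest.get("failed_tests", [])),
        "runs_considered": len(runs),
        "details": details_hint,  # where to look if the summary is not enough
    }


# Hypothetical raw records; the 'log' field stands in for a large payload.
RUNS = [
    {"run": 1, "status": "passed", "failed_tests": [], "log": "x" * 5000},
    {"run": 2, "status": "failed", "failed_tests": ["test_refund", "test_checkout"], "log": "x" * 5000},
]

summary = summarize_test_history(RUNS, details_hint="call the detail tool with run=2")
print(summary)
```

The raw logs never reach the agent by default. The `details` field points at the follow-up path, so the full payload is one deliberate call away rather than the starting point.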
Available-by-default is the expensive default
Not every useful MCP server should be available in every session.
For coding work, tool access should match the task mode.
| Task mode | Useful tools | Defer |
|---|---|---|
| Local code edit | Filesystem, shell, focused browser smoke checks. | Deployment, database admin, broad web research. |
| UI verification | Browser snapshot, screenshot, console and network checks. | Issue tracker, production control-plane tools. |
| Production incident | Logs, metrics, deployment status, rollback commands. | Design tools, content tools. |
| Source-backed content | Filesystem, markdown preview, official docs lookup. | Database admin, deployment tools. |
The point is attention management.
Every extra enabled tool asks the agent to consider another path. In a small task, that can make the session slower, less decisive, or more likely to pick a generic tool when a narrower one would have been better.
Task profiles solve this by exposing the right cluster of tools for the work at hand.
A small example
Imagine an agent has access to a browser MCP server with tools for navigation, snapshots, screenshots, JavaScript evaluation, clicking, filling forms, console messages, network requests, performance traces, and heap snapshots.
That is a useful tool set. It is also too broad for many tasks.
For a basic landing-page smoke test, the agent probably needs:
- navigate to the URL
- inspect the rendered tree
- check console errors
- evaluate one assertion
- take a screenshot only when visual evidence matters
It probably does not need heap snapshots or performance tracing unless the task is specifically about memory or performance.
The server can stay powerful. The task profile should be narrower.
UI smoke profile:
- navigate_page
- take_snapshot
- list_console_messages
- evaluate_script
- take_screenshot when visual evidence is needed
Now the agent has a shorter decision tree. It reaches for performance tooling when the task is about performance, not because those tools happened to sit next to screenshot and console checks.
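A task profile like the one above can be expressed as a plain allowlist that filters whatever the server exposes. The sketch below is one possible shape, not any client's actual configuration format. The UI smoke tool names come from the profile above; the two performance tool names and the `performance` profile are hypothetical placeholders.

```python
# Tools a hypothetical browser server exposes. The last two names are
# placeholders for performance tracing and heap snapshot tools.
ALL_BROWSER_TOOLS = [
    "navigate_page",
    "take_snapshot",
    "take_screenshot",
    "evaluate_script",
    "list_console_messages",
    "start_performance_trace",
    "take_heap_snapshot",
]

TASK_PROFILES: dict[str, frozenset[str]] = {
    "ui_smoke": frozenset(
        {
            "navigate_page",
            "take_snapshot",
            "list_console_messages",
            "evaluate_script",
            "take_screenshot",
        }
    ),
    "performance": frozenset(
        {"navigate_page", "start_performance_trace", "take_heap_snapshot"}
    ),
}


def tools_for_task(task_mode: str, available: list[str]) -> list[str]:
    """Return the available tools allowed for a task mode, in server order.

    Unknown task modes get no tools, so a typo fails closed.
    """
    allowed = TASK_PROFILES.get(task_mode, frozenset())
    return [name for name in available if name in allowed]


print(tools_for_task("ui_smoke", ALL_BROWSER_TOOLS))
```

Failing closed on an unknown task mode is a deliberate choice in this sketch: an empty tool list is easier to notice and fix than an accidentally full one.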
What good looks like
You know the MCP setup is improving when sessions feel more directed.
Signals:
- The first tool call is usually relevant.
- The agent picks the right tool without trial and error.
- Tool responses are short enough to reason about.
- The agent asks fewer clarifying questions about tool purpose.
- Write-capable tools include verification in the workflow.
- Engineers stop pasting the same MCP guidance into prompts.
- Review comments shift away from "wrong tool" or "missed obvious state."
The best MCP setup is not the one with the most tools.
It is the one where the agent sees the right tool at the moment it can use it well.
The fix is not to avoid MCP. The fix is to treat MCP as an agent interface, not an API dump.
Name tools for the task. Describe them for routing. Narrow schemas to the workflow. Return summaries before raw payloads. Use each client's context controls. Expose expensive or rare tools only when the task needs them.
That is how MCP stops being a context tax and starts becoming leverage.
Primary references
Product behavior changes quickly, so the tool-specific examples above rely on each client's official documentation as the source of record.