Vorp Labs//MCP

May 24, 2026

Technical

How MCP Servers Make Coding Agents Expensive

MCP gives coding agents useful tools, but tool names, descriptions, schemas, default availability, and response payloads all draw on a finite context budget.


MCP is one of the better things to happen to coding agents.

It gives Claude Code, Cursor, Codex, and custom agents a standard way to discover tools and call into real systems. Browsers, databases, ticket trackers, deployment systems, design tools, logs, spreadsheets, and internal APIs can become part of the agent's working environment instead of something a human has to copy and paste.

That is the good part.

The part teams miss is that every enabled tool has a carrying cost somewhere.

That cost is not identical in every client. Claude Code can defer MCP tool definitions behind tool search. Cursor lets you disable MCP tools so they are not loaded into context or available to Agent. Codex gives you configuration controls for MCP server enablement, tool allowlists and denylists, approval behavior, output token limits, and profiles.

So the useful claim is not:

Every byte of every MCP tool is always stuffed into the prompt.

The useful claim is narrower:

Every tool that is visible, searchable, enabled by default, or likely to return large payloads becomes part of the agent's context economy.

The cost may show up as tool names at startup, tool-search decisions, schema detail after a tool is selected, output history after a call, or compaction pressure later in the session.

The expensive mistake is treating MCP servers as free capability.

They are not free. They are capability plus context tax.

What actually costs you

When teams talk about MCP overhead, they usually think about latency.

The browser tool takes time. The database query takes time. The GitHub lookup takes time. The external API may fail. Those costs are real, but they are visible.

The subtler cost appears before and after the call.

Before a tool is called, the agent has to route:

- Which tools exist?
- Which one is relevant?
- What arguments does it need?
- What risk does the call carry?
- Is the response worth adding to the session?

After a tool is called, the agent has to carry or summarize the result.

That result can be more expensive than the tool definition. A log search, browser snapshot, database query, ticket search, or docs lookup can flood the next turn with text the agent did not need.

This is why a team can add useful MCP servers and still make sessions feel slower, noisier, or more expensive.

The tools work. The tool surface is the problem.
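To make the before-and-after cost concrete, here is a minimal sketch that compares the rough size of one tool definition with the rough size of one unfiltered response. The `search_logs` tool, its schema, and the log lines are hypothetical, and the four-characters-per-token rule is only a heuristic, not a real tokenizer.

```python
import json


def rough_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (heuristic only)."""
    return max(1, len(text) // 4)


# Hypothetical tool definition, shaped like an MCP tool entry
# (name, description, inputSchema).
tool_definition = {
    "name": "search_logs",
    "description": "Search service logs by query string and time range.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "since_minutes": {"type": "integer"},
        },
        "required": ["query"],
    },
}

# Hypothetical response: 500 raw log lines returned by default.
raw_response = "\n".join(
    f"2026-05-24T10:{i % 60:02d}:00Z api-gateway WARN request {i} timed out after 30s"
    for i in range(500)
)

definition_cost = rough_tokens(json.dumps(tool_definition))
response_cost = rough_tokens(raw_response)

print(f"definition   ~ {definition_cost} tokens")
print(f"one response ~ {response_cost} tokens")
print(f"ratio        ~ {response_cost // definition_cost}x")
```

The exact numbers depend on the tokenizer and the client, but the shape of the result is the point: a single unfiltered response can cost far more than the definition that produced it.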

The current-client nuance

Modern coding clients already have mitigations. Use them.

Client	Useful controls	What still needs design
Claude Code	MCP tool search can defer tool definitions and load them on demand. `/mcp` shows connected servers and tool counts. MCP output limits can warn when a response is too large.	Tool names and server instructions still need to route well. Used tools and large outputs still enter the session.
Codex	MCP servers are configured in `~/.codex/config.toml` or trusted project config. Config supports `enabled`, `enabled_tools`, `disabled_tools`, per-tool approval, profiles, compaction thresholds, and `tool_output_token_limit`. `/mcp` lists available MCP tools.	Enabled tools still expand the action surface. Tool results still need output discipline. Profiles only help if teams actually use them.
Cursor	MCP tools appear under available tools and can be toggled. Cursor docs say disabled tools are not loaded into context or available to Agent.	Enabled tools still need clear names, short descriptions, narrow schemas, and small default responses.

That nuance matters. The right answer is not "MCP is bad."

The right answer is that MCP needs interface design.

API-shaped tools are the trap

Most teams design tools like APIs.

That usually means broad methods, flexible filters, many optional parameters, generic descriptions, complete response objects, and lots of enum values.

This is natural. Engineers are trained to expose power and flexibility.

For agents, the better interface is often narrower.

Instead of:

query_database(sql, timeout, role, warehouse, format, include_metadata, max_rows)

you may want:

find_recent_failed_jobs(service_name)

Instead of:

github_search(query, type, sort, order, page, per_page)

you may want:

find_open_prs_touching_file(path)

The low-level tool may still exist. But if it is exposed in every coding session, the agent has to reason about it every time.

Good agent tools are not just API wrappers. They are task affordances.
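As a sketch of the difference, the narrow tool can be a thin wrapper that fixes the query, the filters, and the row limit, leaving the agent one argument to supply. The `jobs` table, its columns, and the sample rows are assumptions made up for this illustration; the function names come from the examples above.

```python
import sqlite3


def query_database(conn, sql, params=(), max_rows=1000):
    """Broad, API-shaped primitive: arbitrary SQL in, raw rows out."""
    cur = conn.execute(sql, params)
    return cur.fetchmany(max_rows)


def find_recent_failed_jobs(conn, service_name, limit=5):
    """Narrow task tool: one required argument, fixed query, small answer."""
    rows = query_database(
        conn,
        "SELECT job_id, finished_at, error FROM jobs "
        "WHERE service = ? AND status = 'failed' "
        "ORDER BY finished_at DESC LIMIT ?",
        (service_name, limit),
    )
    return [{"job_id": r[0], "finished_at": r[1], "error": r[2]} for r in rows]


# Demo with in-memory data (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (job_id TEXT, service TEXT, status TEXT, "
    "finished_at TEXT, error TEXT)"
)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?, ?)",
    [
        ("j1", "billing", "failed", "2026-05-24T09:00:00Z", "timeout"),
        ("j2", "billing", "ok", "2026-05-24T09:05:00Z", None),
        ("j3", "search", "failed", "2026-05-24T09:10:00Z", "oom"),
        ("j4", "billing", "failed", "2026-05-24T09:15:00Z", "bad config"),
    ],
)

failed = find_recent_failed_jobs(conn, "billing")
print(failed)
```

Only `find_recent_failed_jobs` would need to be exposed in a normal coding session; `query_database` can stay available behind a manual or advanced profile.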

A practical MCP audit

Start with one normal coding workflow, then list every MCP server and tool available to the agent during that workflow.

For each tool, answer:

Question	Why it matters
Can the use case be inferred from the name?	Bad names force the model to read or retrieve more detail.
Does the description say when to use and avoid the tool?	Descriptions should route, not document the whole API.
Is the tool default, task-profile, manual, or disabled?	Useful does not mean always available.
How large is the schema?	Optional parameters and huge enums create irrelevant choices.
What does the response return by default?	Full payloads can be more expensive than the call itself.
What happens if the model calls it at the wrong time?	Write-capable and high-risk tools need stricter availability and verification.
Could a shell command, script, or narrower tool do this better?	MCP is not automatically better than deterministic primitives.

This audit usually finds obvious waste:

- tools that are always available but rarely used
- tools with overlapping names or descriptions
- tools that expose a whole platform API for one common workflow
- tools that return complete records when the agent needs status, count, or link
- tools that duplicate shell commands with worse output
- write-capable tools that do not report how success was verified

The goal is not to remove useful tools. The goal is to stop paying for irrelevant ones in every session.
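Parts of this audit can be mechanized once the tool inventory is written down. The sketch below assumes a hand-built inventory; the `ToolRecord` fields, the usage counts, and every threshold are hypothetical choices for illustration, not values taken from any client.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolRecord:
    name: str
    description: str
    input_schema: dict
    availability: str        # "default", "task-profile", "manual", or "disabled"
    calls_last_30_days: int  # hypothetical count taken from your own logs
    writes: bool = False


def audit(tool, max_schema_chars=600, max_properties=6, min_description_words=5):
    """Return a list of findings for one tool. Thresholds are arbitrary."""
    findings = []
    props = tool.input_schema.get("properties", {})
    if tool.availability == "default" and tool.calls_last_30_days == 0:
        findings.append("always available but unused")
    if len(props) > max_properties:
        findings.append(f"{len(props)} parameters; consider a narrower tool")
    if len(json.dumps(tool.input_schema)) > max_schema_chars:
        findings.append("large schema")
    if len(tool.description.split()) < min_description_words:
        findings.append("description too short to route on")
    if tool.writes and tool.availability == "default":
        findings.append("write-capable tool enabled by default")
    return findings


def string_schema(*names):
    """Build a simple object schema with string properties."""
    return {"type": "object", "properties": {n: {"type": "string"} for n in names}}


inventory = [
    ToolRecord(
        "github_search", "Search GitHub.",
        string_schema("query", "type", "sort", "order", "page", "per_page", "repo"),
        "default", 0,
    ),
    ToolRecord(
        "find_open_prs_touching_file",
        "Use when you need open pull requests that modify a given file path.",
        string_schema("path"), "task-profile", 42,
    ),
    ToolRecord(
        "trigger_deploy",
        "Use to deploy a commit to an environment after tests pass.",
        string_schema("environment", "commit"), "default", 3, writes=True,
    ),
]

report = {tool.name: audit(tool) for tool in inventory}
for name, findings in report.items():
    print(name, "->", findings or ["ok"])
```

A script like this only catches the mechanical findings. Overlapping names, wrong-time calls, and whether a shell command would do better still need a human read.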

Five design rules

Good MCP hygiene is mostly boring interface work.

- Name tools for the task. `find_open_prs_touching_file` is easier for an agent to route than `github_search`.
- Make descriptions directional. A description should answer: when should the agent use this, when should it avoid it, what question does it answer, and what should the agent do with the result?
- Narrow schemas to the workflow. If a parameter is almost never used, remove it from the default tool. If an enum has 80 values, consider search. If advanced cases matter, make an advanced follow-up tool.
- Return summaries first. A test-history tool should not return every run, every log line, and every artifact URL by default. It should return latest status, failing test names, likely owner or package, a link or command for details, and the next diagnostic action.
- Make write tools prove success. A deployment tool should not only say it triggered a deploy. It should say what environment changed, which commit is running, whether health checks passed, and where to inspect failures.

A good default response answers the next decision. It does not satisfy every possible investigation.
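The summary-first rule can be sketched for the test-history case as follows. The run record shape and the `get_test_run_logs` follow-up tool are hypothetical; the idea is only that raw logs stay behind a second, explicit call.

```python
def summarize_test_history(runs, max_failing=10):
    """Collapse full run records into the fields needed for the next decision."""
    if not runs:
        return {"latest_status": "no runs found", "failing_tests": []}
    # ISO 8601 UTC timestamps in the same format sort correctly as strings.
    latest = max(runs, key=lambda run: run["finished_at"])
    failing = sorted(t["name"] for t in latest["tests"] if t["status"] == "failed")
    return {
        "latest_status": latest["status"],
        "latest_run_id": latest["run_id"],
        "failing_tests": failing[:max_failing],
        "failing_test_count": len(failing),
        # Hypothetical follow-up tool the agent can call if it needs raw logs.
        "details": f"get_test_run_logs(run_id={latest['run_id']!r})",
    }


# Hypothetical full records, including bulky logs the agent rarely needs.
runs = [
    {
        "run_id": "r1",
        "finished_at": "2026-05-23T18:00:00Z",
        "status": "passed",
        "tests": [{"name": "test_checkout", "status": "passed", "log": "ok\n" * 1000}],
    },
    {
        "run_id": "r2",
        "finished_at": "2026-05-24T08:30:00Z",
        "status": "failed",
        "tests": [
            {"name": "test_refund", "status": "failed", "log": "trace\n" * 1000},
            {"name": "test_checkout", "status": "failed", "log": "trace\n" * 1000},
            {"name": "test_login", "status": "passed", "log": "ok\n" * 1000},
        ],
    },
]

summary = summarize_test_history(runs)
print(summary)
```

The summary carries no log text at all. If the agent decides the failure is worth investigating, it pays for the logs then, not on every call.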

Available-by-default is the expensive default

Not every useful MCP server should be available in every session.

For coding work, tool access should match the task mode.

Task mode	Useful tools	Defer
Local code edit	Filesystem, shell, focused browser smoke.	Deployment, database admin, broad web research.
UI verification	Browser snapshot, screenshot, console and network checks.	Issue tracker, production control-plane tools.
Production incident	Logs, metrics, deployment status, rollback commands.	Design tools, content tools.
Source-backed content	Filesystem, markdown preview, official docs lookup.	Database admin, deployment tools.

The point is attention management.

Every extra enabled tool asks the agent to consider another path. In a small task, that can make the session slower, less decisive, or more likely to pick a generic tool when a narrower one would have been better.

Task profiles solve this by exposing the right cluster of tools for the work at hand.
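One way to picture a task profile is as an allowlist keyed by task mode. The sketch below mirrors the table above; the tool identifiers are hypothetical names, and real clients express the same idea through their own configuration rather than through code like this.

```python
# Hypothetical tool identifiers grouped by the task modes from the table above.
PROFILES = {
    "local_code_edit": {"filesystem", "shell", "browser_smoke"},
    "ui_verification": {"browser_snapshot", "screenshot", "console_check", "network_check"},
    "production_incident": {"logs", "metrics", "deployment_status", "rollback"},
    "source_backed_content": {"filesystem", "markdown_preview", "docs_lookup"},
}

# Everything the connected servers could expose (hypothetical).
ALL_TOOLS = [
    "filesystem", "shell", "browser_smoke", "browser_snapshot", "screenshot",
    "console_check", "network_check", "logs", "metrics", "deployment_status",
    "rollback", "markdown_preview", "docs_lookup", "database_admin",
    "design_export", "issue_tracker",
]


def tools_for(task_mode, all_tools):
    """Return only the tools the given task mode should see, in original order."""
    if task_mode not in PROFILES:
        raise ValueError(f"unknown task mode: {task_mode!r}")
    allowed = PROFILES[task_mode]
    return [tool for tool in all_tools if tool in allowed]


for mode in PROFILES:
    visible = tools_for(mode, ALL_TOOLS)
    print(f"{mode}: {len(visible)} of {len(ALL_TOOLS)} tools -> {visible}")
```

Tools such as `database_admin` stay installed but appear in none of these profiles, so they are reachable only when someone deliberately enables them.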

A small example

Imagine an agent has access to a browser MCP server with tools for navigation, snapshots, screenshots, JavaScript evaluation, clicking, filling forms, console messages, network requests, performance traces, and heap snapshots.

That is a useful tool set. It is also too broad for many tasks.

For a basic landing-page smoke test, the agent probably needs:

- navigate to the URL
- inspect the rendered tree
- check console errors
- evaluate one assertion
- take a screenshot only when visual evidence matters

It probably does not need heap snapshots or performance tracing unless the task is specifically about memory or performance.

The server can stay powerful. The task profile should be narrower.

UI smoke profile:
- navigate_page
- take_snapshot
- list_console_messages
- evaluate_script
- take_screenshot when visual evidence is needed

Now the agent has a shorter decision tree. It reaches for performance tooling when the task is about performance, not because those tools happened to sit next to screenshot and console checks.

What good looks like

You know the MCP setup is improving when sessions feel more directed.

Signals:

- The first tool call is usually relevant.
- The agent picks the right tool without trial and error.
- Tool responses are short enough to reason about.
- The agent asks fewer clarifying questions about tool purpose.
- Write-capable tools include verification in the workflow.
- Engineers stop pasting the same MCP guidance into prompts.
- Review comments shift away from "wrong tool" or "missed obvious state."
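The first and third signals can be tracked roughly if tool calls are logged. This is a minimal sketch under stated assumptions: the log entry shape, the human-assigned `relevant` label, and the 2,000-token response budget are all hypothetical.

```python
def session_signals(calls, response_budget=2000):
    """Compute two rough signals from a chronological list of tool-call records."""
    first_by_session = {}
    for call in calls:
        # setdefault keeps the first call seen for each session.
        first_by_session.setdefault(call["session"], call)

    sessions = len(first_by_session)
    relevant_first = sum(1 for c in first_by_session.values() if c["relevant"])
    oversized = sum(1 for c in calls if c["response_tokens"] > response_budget)

    return {
        "sessions": sessions,
        "first_call_relevant_rate": relevant_first / sessions if sessions else 0.0,
        "oversized_response_rate": oversized / len(calls) if calls else 0.0,
    }


# Hypothetical log; "relevant" is a label a reviewer assigned after the fact.
calls = [
    {"session": "s1", "tool": "take_snapshot", "relevant": True, "response_tokens": 300},
    {"session": "s1", "tool": "search_logs", "relevant": True, "response_tokens": 5000},
    {"session": "s2", "tool": "github_search", "relevant": False, "response_tokens": 800},
    {"session": "s2", "tool": "find_open_prs_touching_file", "relevant": True, "response_tokens": 400},
]

signals = session_signals(calls)
print(signals)
```

The other signals, such as fewer clarifying questions or changed review comments, are qualitative and are better judged by reading sessions than by counting.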

The best MCP setup is not the one with the most tools.

It is the one where the agent sees the right tool at the moment it can use it well.

The fix is not to avoid MCP. The fix is to treat MCP as an agent interface, not an API dump.

Name tools for the task. Describe them for routing. Narrow schemas to the workflow. Return summaries before raw payloads. Use each client's context controls. Expose expensive or rare tools only when the task needs them.

That is how MCP stops being a context tax and starts becoming leverage.

Primary references

Product behavior changes quickly. The tool-specific examples above treat each client's official documentation as the source of record.
