Scout: MCP Production Pain Points | The Artificer's Grimoire

Summary

The Model Context Protocol has achieved remarkable adoption since its November 2024 launch — supported by Claude Code, VS Code, Cursor, Windsurf, and dozens of agent frameworks. But production deployments are exposing fundamental gaps that the spec team is racing to address. The pain points cluster into a clear hierarchy: stateful sessions that fight horizontal scaling, an auth model that enterprises can’t use, pervasive security vulnerabilities (43% of early MCP servers had command injection flaws), context window consumption that inflates costs 2-30x, and an observability story that barely exists. The emergence of “MCP Gateways” as an entire product category — Kong, Solo.io, Composio, Lunar — is itself evidence that the protocol has significant production gaps the ecosystem is building around rather than waiting for the spec to fix.

Key Findings

1. Stateful Sessions Kill Horizontal Scaling

The single most repeated production complaint. MCP’s mandatory initialization handshake creates stateful sessions, making it incompatible with standard stateless load balancers (AWS ALB, GCP LB). Session IDs must be reused across calls so the server maintains context — directly conflicting with how cloud infrastructure works.

Practitioners report that scaling MCP to multiple replicas requires externalized state (Redis, DynamoDB) or sticky sessions, adding significant infrastructure overhead. One developer trying to build a stateless MCP server across multiple Kubernetes pods with Redis reported the SDK provides no reliable way to map client session IDs to server-internal event streams.

SEP-1442 (“Make MCP Stateless”) is a major active proposal to make statelessness the default. The 2026 roadmap explicitly lists “Transport Evolution and Scalability” as priority #1: “evolve Streamable HTTP to run statelessly across multiple server instances.”

Current workarounds: Envoy AI Gateway for session-aware routing, Redis-backed session externalization, or accepting single-instance deployments.

2. Authentication Is Broken for Enterprise

The auth model has been called “a mess” and “a non-starter for enterprise” by multiple practitioners with enterprise consulting backgrounds. The core problems:

Conflated roles. The spec treats MCP servers as both OAuth resource servers AND authorization servers. Enterprises have centralized IdPs (Okta, Entra) and don’t want every tool server acting as its own identity provider.
Anonymous Dynamic Client Registration. The spec relies on DCR without pre-approval. Enterprises demand pre-registered, vetted clients.
IdP incompatibility. The spec depends on newer OAuth RFCs that MS Entra and Okta don’t consistently implement.
Spec churn. The auth model changed significantly between the March 2025, June 2025, and November 2025 spec revisions, forcing implementers to rewrite auth logic repeatedly.
Real exploits. Obsidian Security found one-click account-takeover vulnerabilities in production MCP servers that failed to bind OAuth state parameters to user sessions.

Three critical CVEs have been filed in six months: CVE-2025-49596 (CVSS 9.4, unauthenticated MCP Inspector access), CVE-2025-6514 (CVSS 9.6, command injection in mcp-remote with 437K downloads), and CVE-2025-52882 (CVSS 8.8, unauthenticated WebSocket in Claude Code extensions).

Current workarounds: MCP gateways (agentgateway from Solo.io), SPIFFE/mTLS for machine-to-machine identity, centralized policy engines (OPA), proxy-based auth.

3. Security Vulnerabilities Are Pervasive — Not Theoretical

This isn’t about future risks — these are documented exploits against production systems:

43% command injection rate. Early 2025 audits found 43% of MCP servers had command injection vulnerabilities, 22% had path traversal, 30% were vulnerable to SSRF.
Tool poisoning and rug pulls. Malicious instructions embedded in tool descriptions are invisible to users but interpreted by the LLM. Invariant Labs demonstrated exfiltrating a user’s entire WhatsApp history via a poisoned MCP server running alongside a legitimate one. Tool definitions can mutate after installation — a tool approved on Day 1 can silently reroute API keys by Day 7. No mechanism exists to lock definitions post-approval.
Cross-server prompt injection. Tool output from one MCP server can poison subsequent tool calls to another. No sandboxing between tools. Simon Willison has documented this extensively.
Real-world incident. Supabase’s Cursor agent processed support tickets containing user-supplied SQL injection, exfiltrating sensitive integration tokens into a public thread.
Confused deputy via GitHub. Invariant Labs demonstrated a crafted GitHub issue hijacking an AI assistant into exfiltrating private repository data via a public PR.

The spec team has responded with Security Best Practices documentation covering confused deputy, SSRF, session hijacking, and local server compromise. But these are advisory — nothing in the protocol enforces them.

4. Context Window Consumption Inflates Costs 2-30x

As organizations connect multiple MCP servers, tool definitions consume enormous context before the agent does anything:

40-50K tokens upfront. Connecting GitHub, Linear, Postgres, and Slack servers loads all tool definitions at once via tools/list. No lazy loading or relevance-based filtering in the spec.
Individual tool overhead. Enterprise tools with detailed schemas consume 500-1,000 tokens each just for the definition. One Claude Code user reported ~200 deferred tools across ~8 services.
Cost impact. Production deployments discover individual tools consuming 10,000+ tokens per call when 1,000 would suffice. At current pricing, that’s a 10x cost multiplier per invocation.
Few clients support listChanged notifications, so dynamic tool filtering doesn’t work in practice even where the spec allows it.

Current workarounds: RAG-based dynamic tool selection, MCPlexor multiplexer (claims 95% context reduction), gateway-based lazy loading, hierarchical deferred tool discovery.

5. Remote Deployment Is Harder Than It Should Be

Moving from local stdio to remote HTTP is where many teams hit walls:

SSE behind proxies. Server-Sent Events don’t work well behind corporate proxies, load balancers, or in serverless environments. The spec added “Streamable HTTP” transport (replacing deprecated HTTP+SSE in spec version 2025-03-26), but it’s complex to implement correctly.
Cookie/session forwarding. Most MCP clients use fetch() internally and don’t properly forward Set-Cookie headers, breaking load balancer session affinity.
Cold start latency. First MCP tool call costs ~2,485ms; subsequent cached calls drop to ~0.01ms. The cold start is painful for serverless deployments.
JSON-RPC over HTTP adds complexity. Instead of leveraging HTTP semantics, MCP sends GET/POST as JSON parameters with responses over separate SSE connections, requiring manual message attribution. FeatureForm argues this is needless indirection.

SEP-1288 proposes WebSocket transport as an alternative, arguing Streamable HTTP is overly complex for bidirectional communication.

6. Observability and Debugging Barely Exist

Traditional monitoring tools are blind to MCP-specific failure modes:

Non-deterministic behavior. The same prompt can trigger entirely different tool chains depending on the LLM’s reasoning, making failures hard to reproduce.
The stdio logging trap. MCP servers using stdio transport must write ONLY JSON-RPC to stdout. Any debug logging to stdout corrupts the protocol stream and kills the connection. All logging must go to stderr — a subtle but frequently hit issue.
No standard tracing. There’s no built-in distributed tracing. Teams must manually integrate OpenTelemetry.
No standard error codes. Issue #2209: error codes are chosen ad hoc with only 100 slots in the JSON-RPC custom range.
No response size limits. Issue #2210: large responses overflow context windows with no protocol mechanism to constrain them.

Datadog, Dynatrace, and Sentry have shipped MCP-aware monitoring, but the ecosystem is nascent.

7. Server Lifecycle and Process Management

Each coding agent session launches its own MCP server instances. At scale, this creates operational problems:

Process accumulation. Multiple agents can spawn 15+ server processes consuming 1GB+ of memory.
Memory leaks. Unclosed HTTP response streams, event listeners without cleanup, unbounded caches without eviction policies. Server crashes drop all active sessions.
Static tool discovery. Claude Desktop requires JSON editing and full restart to add servers — no hot-reloading.
No conformance test suite. Issue #1990: different SDK implementations may diverge from spec with no way to verify compliance.

8. The Gateway Layer Is Filling the Gaps

The emergence of MCP Gateways as an entire product category tells the story: Kong, Solo.io (agentgateway), Composio, Lunar, Cloudflare, and others have all shipped gateway products that sit between clients and MCP servers to provide what the protocol lacks:

Centralized authentication and credential management
Audit trails and compliance logging
Rate limiting and cost controls
Tool filtering and context optimization
Policy enforcement (RBAC, data exfiltration prevention)

This pattern — a proxy that adds enterprise capabilities to an underspecified protocol — is well-established (cf. API gateways for REST). But it means production MCP deployments require significant infrastructure beyond the protocol itself.

9. What the Spec Team Is Doing About It

The 2026 roadmap explicitly acknowledges most of these gaps:

Priority	Target
Transport Evolution	Stateless MCP by default, scalable session handling, MCP Server Cards for discovery
Agent Communication	Task retry semantics, expiry policies for results
Enterprise Readiness	Audit trails, enterprise-managed auth (SSO), gateway/proxy patterns, config portability
Governance	Contributor ladder, delegation model to unblock SEP bottleneck
On the Horizon	Streaming tool results, webhooks/triggers, finer-grained auth scopes, conformance tests

The spec has already addressed some issues in the November 2025 revision: Streamable HTTP replacing SSE, elicitation for server-initiated user input, tool annotations for security hints, and detailed Security Best Practices documentation.

Practical Implications

For Teams Evaluating MCP Adoption

Don’t skip the gateway. Production MCP deployments need an intermediary for auth, audit, and cost control. Budget for it architecturally — this is not optional infrastructure. Evaluate agentgateway (Solo.io), Kong MCP Gateway, or build a thin proxy layer.
Start with stdio, plan for HTTP. Local stdio deployment is straightforward and sidesteps most transport issues. But design your server architecture knowing you’ll need to migrate to HTTP for multi-user and cloud deployments.
Implement tool filtering from Day 1. Don’t load all tools into every session. Use deferred tool discovery, RAG-based selection, or gateway-level filtering to keep context consumption under control. The 40-50K token upfront cost is a real economic constraint.

For Teams Already Running MCP

Audit your servers for the OWASP top 3. The 43% command injection rate means existing servers are likely vulnerable. Run input validation checks on every parameter that touches shell commands, file paths, or SQL.
Pin tool definitions. Until the spec adds definition locking, implement your own versioning for tool schemas. Hash definitions at approval time and alert on changes — the rug pull attack vector is real.
Add OpenTelemetry tracing manually. Don’t wait for the spec. Wrap every tool call in a span, every user task in a trace. This is the minimum viable observability for debugging non-deterministic agent behavior.

What NOT to Do

Don’t build on the auth spec as-is for enterprise. If you need SSO integration, use a gateway that handles auth externally. The spec’s OAuth model will continue to churn.
Don’t assume horizontal scaling works. Test your deployment under load with multiple concurrent sessions before committing to MCP for high-throughput use cases. SEP-1442 (stateless MCP) is still in draft — plan for single-instance or sticky-session deployments.

Open Questions

Will SEP-1442 (stateless MCP) land, and when? This is the most important open question for production viability. If it ships, it removes the #1 pain point. If it doesn’t, the gateway pattern becomes permanent infrastructure.
Is the gateway layer a feature or a tax? API gateways became standard infrastructure for REST. Will MCP gateways follow the same path (accepted cost of doing business) or will the spec eventually internalize these capabilities?
How will tool definition governance work at scale? With no standard for locking, versioning, or auditing tool definitions, organizations connecting to third-party MCP servers are trusting that definitions won’t change. The spec’s tool annotations are advisory only.
What’s the right answer for multi-tenant isolation? The confused deputy attack on Asana’s MCP integration (cached responses failing to re-verify tenant context) suggests this is an unsolved problem. The spec punts on it entirely.
Will WebSocket transport (SEP-1288) replace Streamable HTTP? Multiple practitioners report WebSockets work “significantly better in cloud environments.” If the spec adds official WebSocket support, it could simplify remote deployments considerably.

Sources

MCP Is Dead; Long Live MCP — Charles Chen, practitioner critique (246 HN points)
The MCP Authorization Spec Is… a Mess — Christian Posta, Solo.io VP Global Field CTO
MCP Authorization is a Non-Starter for Enterprise — Solo.io
What MCP Gets Wrong — FeatureForm
MCP shipped without authentication — VentureBeat
When MCP Meets OAuth: Account Takeover — Obsidian Security
The MCP AuthN/Z Nightmare — Doyensec
Let’s Fix OAuth in MCP — Aaron Parecki
MCP Prompt Injection Security Problems — Simon Willison
Tool Poisoning Attacks — Invariant Labs
Poison Everywhere: No Output Is Safe — CyberArk
MCP Security TOP 25 Vulnerabilities — Adversa AI
MCP Attack Vectors and Defense — Elastic Security Labs
A Timeline of MCP Security Breaches — AuthZed
Prompt Injection via MCP Sampling — Palo Alto Unit 42
Tool Poisoning and Rug Pulls — MCP Manager
Practical DevSecOps — MCP Security Vulnerabilities
Remote MCP Servers: Inevitable, Not Easy — The New Stack
MCP Roadmap 2026 — The New Stack
Official 2026 MCP Roadmap
SEP-1442: Make MCP Stateless
SEP-1288: WebSocket Transport
Issue #282: Session State Inconsistency
Issue #2349: Step-up Auth Scope Accumulation
Issue #1721: OAuth Mixup Attack
Issue #544: Phishing via Malicious MCP Servers — Alibaba Cloud
Issue #913: False Sense of Security in Spec Wording
Issue #2209: Error Code Standardization
Issue #2210: Response Size Limits
Issue #1990: Conformance Test Suite
Real Faults in MCP Software — arXiv
Network Performance Characterization of MCP Agents — arXiv
Augmented MCP Tool Descriptions — arXiv
End-to-End Visibility into MCP Clients — Datadog
MCP Server Memory Management — Fast.io
Cloudflare — Build and Deploy Remote MCP Servers
Authentication and Authorization in MCP — Stack Overflow
Envoy AI Gateway MCP Traffic Routing