Artificer Digital The Artificer's Grimoire

Scout: MCP Production Pain Points

Summary

The Model Context Protocol has achieved remarkable adoption since its November 2024 launch — supported by Claude Code, VS Code, Cursor, Windsurf, and dozens of agent frameworks. But production deployments are exposing fundamental gaps that the spec team is racing to address. The pain points cluster into a clear hierarchy: stateful sessions that fight horizontal scaling, an auth model that enterprises can’t use, pervasive security vulnerabilities (43% of early MCP servers had command injection flaws), context window consumption that inflates costs 2-30x, and an observability story that barely exists. The emergence of “MCP Gateways” as an entire product category — Kong, Solo.io, Composio, Lunar — is itself evidence that the protocol has significant production gaps the ecosystem is building around rather than waiting for the spec to fix.

Key Findings

1. Stateful Sessions Kill Horizontal Scaling

The single most repeated production complaint. MCP’s mandatory initialization handshake creates stateful sessions, making it incompatible with standard stateless load balancers (AWS ALB, GCP LB). Session IDs must be reused across calls so the server maintains context — directly conflicting with how cloud infrastructure works.

Practitioners report that scaling MCP to multiple replicas requires externalized state (Redis, DynamoDB) or sticky sessions, adding significant infrastructure overhead. One developer trying to build a stateless MCP server across multiple Kubernetes pods with Redis reported the SDK provides no reliable way to map client session IDs to server-internal event streams.

SEP-1442 (“Make MCP Stateless”) is a major active proposal to make statelessness the default. The 2026 roadmap explicitly lists “Transport Evolution and Scalability” as priority #1: “evolve Streamable HTTP to run statelessly across multiple server instances.”

Current workarounds: Envoy AI Gateway for session-aware routing, Redis-backed session externalization, or accepting single-instance deployments.

2. Authentication Is Broken for Enterprise

The auth model has been called “a mess” and “a non-starter for enterprise” by multiple practitioners with enterprise consulting backgrounds. The core problems:

  • Conflated roles. The spec treats MCP servers as both OAuth resource servers AND authorization servers. Enterprises have centralized IdPs (Okta, Entra) and don’t want every tool server acting as its own identity provider.
  • Anonymous Dynamic Client Registration. The spec relies on DCR without pre-approval. Enterprises demand pre-registered, vetted clients.
  • IdP incompatibility. The spec depends on newer OAuth RFCs that MS Entra and Okta don’t consistently implement.
  • Spec churn. The auth model changed significantly between the March 2025, June 2025, and November 2025 spec revisions, forcing implementers to rewrite auth logic repeatedly.
  • Real exploits. Obsidian Security found one-click account-takeover vulnerabilities in production MCP servers that failed to bind OAuth state parameters to user sessions.

Three critical CVEs have been filed in six months: CVE-2025-49596 (CVSS 9.4, unauthenticated MCP Inspector access), CVE-2025-6514 (CVSS 9.6, command injection in mcp-remote with 437K downloads), and CVE-2025-52882 (CVSS 8.8, unauthenticated WebSocket in Claude Code extensions).

Current workarounds: MCP gateways (agentgateway from Solo.io), SPIFFE/mTLS for machine-to-machine identity, centralized policy engines (OPA), proxy-based auth.

3. Security Vulnerabilities Are Pervasive — Not Theoretical

This isn’t about future risks — these are documented exploits against production systems:

  • 43% command injection rate. Early 2025 audits found 43% of MCP servers had command injection vulnerabilities, 22% had path traversal, 30% were vulnerable to SSRF.
  • Tool poisoning and rug pulls. Malicious instructions embedded in tool descriptions are invisible to users but interpreted by the LLM. Invariant Labs demonstrated exfiltrating a user’s entire WhatsApp history via a poisoned MCP server running alongside a legitimate one. Tool definitions can mutate after installation — a tool approved on Day 1 can silently reroute API keys by Day 7. No mechanism exists to lock definitions post-approval.
  • Cross-server prompt injection. Tool output from one MCP server can poison subsequent tool calls to another. No sandboxing between tools. Simon Willison has documented this extensively.
  • Real-world incident. Supabase’s Cursor agent processed support tickets containing user-supplied SQL injection, exfiltrating sensitive integration tokens into a public thread.
  • Confused deputy via GitHub. Invariant Labs demonstrated a crafted GitHub issue hijacking an AI assistant into exfiltrating private repository data via a public PR.

The spec team has responded with Security Best Practices documentation covering confused deputy, SSRF, session hijacking, and local server compromise. But these are advisory — nothing in the protocol enforces them.

4. Context Window Consumption Inflates Costs 2-30x

As organizations connect multiple MCP servers, tool definitions consume enormous context before the agent does anything:

  • 40-50K tokens upfront. Connecting GitHub, Linear, Postgres, and Slack servers loads all tool definitions at once via tools/list. No lazy loading or relevance-based filtering in the spec.
  • Individual tool overhead. Enterprise tools with detailed schemas consume 500-1,000 tokens each just for the definition. One Claude Code user reported ~200 deferred tools across ~8 services.
  • Cost impact. Production deployments discover individual tools consuming 10,000+ tokens per call when 1,000 would suffice. At current pricing, that’s a 10x cost multiplier per invocation.
  • Few clients support listChanged notifications, so dynamic tool filtering doesn’t work in practice even where the spec allows it.

Current workarounds: RAG-based dynamic tool selection, MCPlexor multiplexer (claims 95% context reduction), gateway-based lazy loading, hierarchical deferred tool discovery.

5. Remote Deployment Is Harder Than It Should Be

Moving from local stdio to remote HTTP is where many teams hit walls:

  • SSE behind proxies. Server-Sent Events don’t work well behind corporate proxies, load balancers, or in serverless environments. The spec added “Streamable HTTP” transport (replacing deprecated HTTP+SSE in spec version 2025-03-26), but it’s complex to implement correctly.
  • Cookie/session forwarding. Most MCP clients use fetch() internally and don’t properly forward Set-Cookie headers, breaking load balancer session affinity.
  • Cold start latency. First MCP tool call costs ~2,485ms; subsequent cached calls drop to ~0.01ms. The cold start is painful for serverless deployments.
  • JSON-RPC over HTTP adds complexity. Instead of leveraging HTTP semantics, MCP sends GET/POST as JSON parameters with responses over separate SSE connections, requiring manual message attribution. FeatureForm argues this is needless indirection.

SEP-1288 proposes WebSocket transport as an alternative, arguing Streamable HTTP is overly complex for bidirectional communication.

6. Observability and Debugging Barely Exist

Traditional monitoring tools are blind to MCP-specific failure modes:

  • Non-deterministic behavior. The same prompt can trigger entirely different tool chains depending on the LLM’s reasoning, making failures hard to reproduce.
  • The stdio logging trap. MCP servers using stdio transport must write ONLY JSON-RPC to stdout. Any debug logging to stdout corrupts the protocol stream and kills the connection. All logging must go to stderr — a subtle but frequently hit issue.
  • No standard tracing. There’s no built-in distributed tracing. Teams must manually integrate OpenTelemetry.
  • No standard error codes. Issue #2209: error codes are chosen ad hoc with only 100 slots in the JSON-RPC custom range.
  • No response size limits. Issue #2210: large responses overflow context windows with no protocol mechanism to constrain them.

Datadog, Dynatrace, and Sentry have shipped MCP-aware monitoring, but the ecosystem is nascent.

7. Server Lifecycle and Process Management

Each coding agent session launches its own MCP server instances. At scale, this creates operational problems:

  • Process accumulation. Multiple agents can spawn 15+ server processes consuming 1GB+ of memory.
  • Memory leaks. Unclosed HTTP response streams, event listeners without cleanup, unbounded caches without eviction policies. Server crashes drop all active sessions.
  • Static tool discovery. Claude Desktop requires JSON editing and full restart to add servers — no hot-reloading.
  • No conformance test suite. Issue #1990: different SDK implementations may diverge from spec with no way to verify compliance.

8. The Gateway Layer Is Filling the Gaps

The emergence of MCP Gateways as an entire product category tells the story: Kong, Solo.io (agentgateway), Composio, Lunar, Cloudflare, and others have all shipped gateway products that sit between clients and MCP servers to provide what the protocol lacks:

  • Centralized authentication and credential management
  • Audit trails and compliance logging
  • Rate limiting and cost controls
  • Tool filtering and context optimization
  • Policy enforcement (RBAC, data exfiltration prevention)

This pattern — a proxy that adds enterprise capabilities to an underspecified protocol — is well-established (cf. API gateways for REST). But it means production MCP deployments require significant infrastructure beyond the protocol itself.

9. What the Spec Team Is Doing About It

The 2026 roadmap explicitly acknowledges most of these gaps:

PriorityTarget
Transport EvolutionStateless MCP by default, scalable session handling, MCP Server Cards for discovery
Agent CommunicationTask retry semantics, expiry policies for results
Enterprise ReadinessAudit trails, enterprise-managed auth (SSO), gateway/proxy patterns, config portability
GovernanceContributor ladder, delegation model to unblock SEP bottleneck
On the HorizonStreaming tool results, webhooks/triggers, finer-grained auth scopes, conformance tests

The spec has already addressed some issues in the November 2025 revision: Streamable HTTP replacing SSE, elicitation for server-initiated user input, tool annotations for security hints, and detailed Security Best Practices documentation.

Practical Implications

For Teams Evaluating MCP Adoption

  1. Don’t skip the gateway. Production MCP deployments need an intermediary for auth, audit, and cost control. Budget for it architecturally — this is not optional infrastructure. Evaluate agentgateway (Solo.io), Kong MCP Gateway, or build a thin proxy layer.

  2. Start with stdio, plan for HTTP. Local stdio deployment is straightforward and sidesteps most transport issues. But design your server architecture knowing you’ll need to migrate to HTTP for multi-user and cloud deployments.

  3. Implement tool filtering from Day 1. Don’t load all tools into every session. Use deferred tool discovery, RAG-based selection, or gateway-level filtering to keep context consumption under control. The 40-50K token upfront cost is a real economic constraint.

For Teams Already Running MCP

  1. Audit your servers for the OWASP top 3. The 43% command injection rate means existing servers are likely vulnerable. Run input validation checks on every parameter that touches shell commands, file paths, or SQL.

  2. Pin tool definitions. Until the spec adds definition locking, implement your own versioning for tool schemas. Hash definitions at approval time and alert on changes — the rug pull attack vector is real.

  3. Add OpenTelemetry tracing manually. Don’t wait for the spec. Wrap every tool call in a span, every user task in a trace. This is the minimum viable observability for debugging non-deterministic agent behavior.

What NOT to Do

  1. Don’t build on the auth spec as-is for enterprise. If you need SSO integration, use a gateway that handles auth externally. The spec’s OAuth model will continue to churn.

  2. Don’t assume horizontal scaling works. Test your deployment under load with multiple concurrent sessions before committing to MCP for high-throughput use cases. SEP-1442 (stateless MCP) is still in draft — plan for single-instance or sticky-session deployments.

Open Questions

  1. Will SEP-1442 (stateless MCP) land, and when? This is the most important open question for production viability. If it ships, it removes the #1 pain point. If it doesn’t, the gateway pattern becomes permanent infrastructure.

  2. Is the gateway layer a feature or a tax? API gateways became standard infrastructure for REST. Will MCP gateways follow the same path (accepted cost of doing business) or will the spec eventually internalize these capabilities?

  3. How will tool definition governance work at scale? With no standard for locking, versioning, or auditing tool definitions, organizations connecting to third-party MCP servers are trusting that definitions won’t change. The spec’s tool annotations are advisory only.

  4. What’s the right answer for multi-tenant isolation? The confused deputy attack on Asana’s MCP integration (cached responses failing to re-verify tenant context) suggests this is an unsolved problem. The spec punts on it entirely.

  5. Will WebSocket transport (SEP-1288) replace Streamable HTTP? Multiple practitioners report WebSockets work “significantly better in cloud environments.” If the spec adds official WebSocket support, it could simplify remote deployments considerably.

Sources

  1. MCP Is Dead; Long Live MCP — Charles Chen, practitioner critique (246 HN points)
  2. The MCP Authorization Spec Is… a Mess — Christian Posta, Solo.io VP Global Field CTO
  3. MCP Authorization is a Non-Starter for Enterprise — Solo.io
  4. What MCP Gets Wrong — FeatureForm
  5. MCP shipped without authentication — VentureBeat
  6. When MCP Meets OAuth: Account Takeover — Obsidian Security
  7. The MCP AuthN/Z Nightmare — Doyensec
  8. Let’s Fix OAuth in MCP — Aaron Parecki
  9. MCP Prompt Injection Security Problems — Simon Willison
  10. Tool Poisoning Attacks — Invariant Labs
  11. Poison Everywhere: No Output Is Safe — CyberArk
  12. MCP Security TOP 25 Vulnerabilities — Adversa AI
  13. MCP Attack Vectors and Defense — Elastic Security Labs
  14. A Timeline of MCP Security Breaches — AuthZed
  15. Prompt Injection via MCP Sampling — Palo Alto Unit 42
  16. Tool Poisoning and Rug Pulls — MCP Manager
  17. Practical DevSecOps — MCP Security Vulnerabilities
  18. Remote MCP Servers: Inevitable, Not Easy — The New Stack
  19. MCP Roadmap 2026 — The New Stack
  20. Official 2026 MCP Roadmap
  21. SEP-1442: Make MCP Stateless
  22. SEP-1288: WebSocket Transport
  23. Issue #282: Session State Inconsistency
  24. Issue #2349: Step-up Auth Scope Accumulation
  25. Issue #1721: OAuth Mixup Attack
  26. Issue #544: Phishing via Malicious MCP Servers — Alibaba Cloud
  27. Issue #913: False Sense of Security in Spec Wording
  28. Issue #2209: Error Code Standardization
  29. Issue #2210: Response Size Limits
  30. Issue #1990: Conformance Test Suite
  31. Real Faults in MCP Software — arXiv
  32. Network Performance Characterization of MCP Agents — arXiv
  33. Augmented MCP Tool Descriptions — arXiv
  34. End-to-End Visibility into MCP Clients — Datadog
  35. MCP Server Memory Management — Fast.io
  36. Cloudflare — Build and Deploy Remote MCP Servers
  37. Authentication and Authorization in MCP — Stack Overflow
  38. Envoy AI Gateway MCP Traffic Routing