Summary
The March 15 production-pain-points scout ended with a prediction: MCP gateways were becoming a product category because the protocol had significant gaps the ecosystem was building around. Four weeks later, the MCP Dev Summit North America (April 2–3, ~1,200 attendees under the Linux Foundation’s Agentic AI Foundation) confirmed the pattern as industry consensus. Amazon and Uber went on record with production deployments that look architecturally identical — a centralized gateway paired with a registry as the control plane, with everything else (auth, observability, tool fingerprinting, PII redaction, sandboxed execution) layered on top. Kong, Solo.io, Zuplo, Cloudflare, Docker, and Microsoft all shipped competing gateway products in the last two quarters. Google open-sourced a Colab MCP Server as the reference for offloaded sandboxed execution; Arcade brought 7,500+ governed tools through LangSmith Fleet as a gateway-as-catalog play. The decision has moved from “do we need a gateway?” (yes) to “which slots do we fill with vendors, which do we build, and which do we skip for now?”
Key Findings
1. The Middle Tier Is Now a Six-Slot Architecture
The prior scout described gateways as a monolithic category filling protocol gaps. The Dev Summit made clear that “MCP gateway” is actually a stack of distinct concerns, and each has its own vendor landscape forming around it:
| Slot | Function | Leaders / reference implementations |
|---|---|---|
| Registry | Tool discovery, server cards, cryptographic signing | Official MCP Registry, Kong MCP Registry, agentregistry (Solo.io / CNCF) |
| Auth proxy | OAuth termination, SSO/IdP integration, credential brokering | Solo Enterprise for agentgateway, Zuplo, Kong |
| Policy engine | RBAC, tool allow-lists, PII redaction, prompt-injection blocking | Zuplo, Solo.io, Enkrypt, Uber’s internal GenAI Gateway |
| Observability | OpenTelemetry spans per tool call, audit trails, cost attribution | Datadog, Honeycomb via OpenLLMetry, LangSmith, Kong analytics |
| Transport / session router | Session-affine routing, HTTP-to-MCP translation, stateful handling | Microsoft mcp-gateway (K8s StatefulSet–aware), Envoy AI Gateway |
| Execution sandbox | Isolated runtime for untrusted code, GPU offload | Google Colab MCP Server, E2B, Modal, Daytona, Arcade |
Two things are worth noting. First, no single vendor credibly fills all six slots today — Solo.io, Kong, and Zuplo each cover three or four well and gesture at the others. Second, the slots are not independent: the registry signs the tools the policy engine trusts; the auth proxy establishes the identity the policy engine evaluates; the observability layer needs the session router’s trace IDs to stitch spans. Decisions cluster.
2. Amazon and Uber Validate the Pattern — With Very Different Postures
Both shared production deployments at the Dev Summit (InfoQ). The architecture is the same; the implementation philosophy is not.
Uber is running tens of thousands of agent executions per week through an MCP Gateway and Registry that automatically exposes thousands of internal Thrift, Protobuf, and HTTP endpoints as MCP tools. A Go-based GenAI Gateway sits in front of external model calls performing PII redaction and identifier scrubbing before any request leaves the perimeter. First-party MCP servers are exposed as a single, consistent interface; third-party MCP servers are proxied through the same gateway. The design philosophy is that every agent interaction — tool call or model call — is routed through a central control plane that can enforce policy, attribute cost, and redact data.
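The perimeter-redaction posture is simple to illustrate. Below is a minimal scrubber sketch; the patterns and replacement tokens are invented for illustration and are not Uber's actual ruleset, which is not public:

```python
import re

# Illustrative patterns only. A production scrubber covers many more
# identifier classes and is tuned to the organization's own ID formats.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<SSN>"),
    (re.compile(r"\buser_\d+\b"), "<USER_ID>"),  # internal identifier scrubbing
]

def redact(payload: str) -> str:
    """Scrub PII before the request crosses the perimeter to an external model."""
    for pattern, token in PII_PATTERNS:
        payload = pattern.sub(token, payload)
    return payload
```

The architectural claim is in where this runs, not in the regexes: it sits in the gateway, so no agent or tool author can forget to call it.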
Amazon built an internal MCP discovery infrastructure and formalized bundling of tools with agent skills, context files, and Standard Operating Procedures into shareable configurations (reflected in their open-source agent-sop project). Their contribution at the summit was less about runtime control and more about tool lifecycle governance: how a tool gets vetted, bundled, versioned, and distributed internally before any agent is allowed to call it. This is the “what does a tool supply chain look like?” problem, which the protocol punts on entirely.
The two postures are complementary. Uber’s gateway is the data-plane enforcer; Amazon’s registry is the approval pipeline. Every enterprise deploying MCP will need both, and most teams are underestimating the Amazon half.
3. The Vendor Landscape Has Sorted by Starting Point
By the spring of 2026, four distinct origin stories have converged on “MCP gateway” from different directions, and the starting point still predicts the feature shape:
- From API gateways — Kong, Zuplo, Cloudflare. Strength: mature auth, rate limiting, observability, edge deployment. Weakness: MCP-specific concerns like tool fingerprinting and semantic tool selection are bolted on. Kong’s Enterprise MCP Gateway ships an “MCP Server Generation” feature that auto-converts REST APIs into remote MCP servers — which is exactly what you’d expect from a company that sees MCP as a new protocol to front-proxy rather than a new thing entirely.
- From service mesh — Solo.io (agentgateway, now part of the Linux Foundation). Strength: the mesh-native concerns (mTLS, identity propagation, ambient waypoints, tool-server sandboxing) are native, not retrofitted. Solo Enterprise for agentgateway specifically ships tool fingerprinting and registration workflows as a tool-poisoning defense — the attack vector called out in the prior scout is now a productized countermeasure.
- From platform/Kubernetes infra — Microsoft mcp-gateway, Docker MCP Gateway. Strength: solves the stateful-session-vs-horizontal-scaling problem the prior scout flagged (SEP-1442 still pending) by doing session-aware routing with K8s StatefulSets and headless services. Weakness: operational complexity — you are running a platform, not consuming one.
- From the agent/LLM-tooling side — Arcade (via LangSmith Fleet), Composio, Bifrost. Strength: the tool catalog itself is the product (7,500+ pre-built agent-optimized tools at Arcade; 500+ managed integrations at Composio). Weakness: you rent someone else’s catalog rather than governing your own.
The practical consequence: you rarely pick “an MCP gateway.” You pick which origin story your team already has infrastructure for, and you extend from there.
4. Sandboxed Execution Is the New Slot
The prior scout did not cover execution isolation as a gateway concern because in March the conversation was still about HTTP transport and auth. Four weeks later it is clearly the sixth slot. Two forces are converging:
Google’s Colab MCP Server established the pattern at the low end: any MCP-speaking agent (Gemini CLI, Claude Code) can offload execution to a Colab session instead of running untrusted code on the developer laptop. The isolation is “it runs in a different process on Google’s infrastructure” — adequate for individual developers, insufficient for regulated enterprise use.
At the high end, purpose-built sandbox platforms — E2B (Firecracker microVMs, hardware-level isolation), Modal (gVisor system-call interception), Daytona (Docker-based, sub-90ms startup, $24M Series A in February 2026) — are adding MCP server interfaces so they become drop-in execution backends for any gateway. Arcade’s role in LangSmith Fleet fits the same shape: a catalog of tools where each tool runs in sandboxed execution rather than on the agent’s host.
The new design question: where in your six-slot stack does untrusted code run? Three live answers:
1. In the tool server itself (the pre-2026 default; every vulnerability scout has documented why this is dangerous).
2. In a sandbox that the tool server wraps (E2B/Modal/Daytona behind an MCP facade).
3. In a sandbox the gateway routes to (Arcade/Colab-style — execution isolation is a platform service, not a tool-level concern).
Approach (3) is where the Dev Summit’s trajectory points. If untrusted execution is a platform service, tool authors stop being responsible for isolation — and the attack surface narrows to the sandbox provider rather than every MCP server in the catalog.
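The gateway-routes-to-sandbox shape reduces to a dispatch decision the gateway makes from tool metadata. A hedged sketch, with illustrative backend names and an invented metadata shape:

```python
from dataclasses import dataclass

@dataclass
class ToolMeta:
    name: str
    executes_code: bool   # does this tool run model-generated code?
    read_only: bool

def pick_backend(tool: ToolMeta) -> str:
    """Gateway-side routing: isolation is a platform decision, not the
    tool author's. Backend names are illustrative stand-ins."""
    if tool.executes_code:
        return "sandbox"      # e.g. a microVM or gVisor backend behind an MCP facade
    if tool.read_only:
        return "in-process"   # low risk: no mutation, no untrusted execution
    return "tool-server"      # scoped mutation; runs in the tool server as before
```

Once this decision lives in the gateway, tool authors ship no isolation code at all, and the attack surface is the sandbox provider's, not the catalog's.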
5. The Spec Is Catching Up, Slowly, in the Right Places
The prior scout noted the 2026 roadmap acknowledged most of the gaps. The Dev Summit revealed what’s actually moving:
- SEP-1649 / SEP-2127 (Server Cards via .well-known/mcp.json) — standardized static metadata about a server’s capabilities that registries and gateways can discover without connecting. This is the primitive tool registries have been hand-rolling for six months. Backward compatible; optional.
- SEP-1442 (Stateless MCP) — still pending. Microsoft’s mcp-gateway exists partly because this hasn’t landed.
- SEP-1686 (Tasks primitive) — long-running agentic communication with durable handles, experimental.
- MCP Apps — released January 26, 2026. Official extension for sandboxed iframe UIs with JSON-RPC communication.
- Gateway semantics — the WorkOS roadmap analysis enumerates the three open problems: authorization propagation (token forwarding vs. claim rewriting), session semantics behind intermediaries, and gateway visibility boundaries (what a gateway is allowed to inspect). These are pre-RFC — no Enterprise Working Group yet. The gap the prior scout identified around authorization proxying is still formally open.
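For concreteness, a server card might look like the following. The field names here are invented for illustration; SEP-1649/SEP-2127 define the actual schema, which this sketch does not reproduce:

```python
# Hypothetical shape of a card served at /.well-known/mcp.json.
# Field names are illustrative -- consult SEP-1649/SEP-2127 for the real schema.
EXAMPLE_CARD = {
    "name": "internal-billing-tools",
    "version": "1.4.0",
    "transport": ["streamable-http"],
    "capabilities": {"tools": True, "resources": False},
    "signature": "sha256:<digest>",  # placeholder, not a real digest
}

def registry_can_index(card: dict) -> bool:
    """The point of server cards: a registry can index a server from static
    metadata alone, without ever connecting. Checks only the minimal
    illustrative fields above."""
    return all(k in card for k in ("name", "version", "capabilities"))
```

This is the primitive registries have been hand-rolling: discovery becomes an HTTP GET plus schema validation instead of a live MCP handshake.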
The thing to notice: the spec is moving fastest on the items that vendors can’t solve without protocol support (server cards, tasks primitive, stateless transport) and slowest on items where vendors have already shipped workarounds (gateway semantics, audit trails). This is a predictable equilibrium — but it means the gateway layer solidifies as permanent infrastructure rather than temporary scaffolding.
6. Observability Is Closing, With Caveats
The prior scout called the observability story “nascent.” Four weeks later, three shifts have happened:
- OpenTelemetry semantic conventions for MCP agent telemetry are being drafted (mentioned at the Dev Summit, not yet stable).
- Every serious MCP gateway now ships OpenTelemetry exporters by default — Kong captures tool usage, prompt/completion sizes, latency, error rates; agentgateway and Zuplo do the same.
- The LLM-observability category consolidated. Traceloop was acquired by ServiceNow in March 2026 (~$60–80M); Langfuse was acquired by ClickHouse in January 2026. Datadog, Honeycomb (via OpenLLMetry), and LangSmith now all cover MCP tool invocations as first-class spans.
The caveat is the same as in every observability category: the standard only works when everyone emits the same semantic attributes. OpenLLMetry exists precisely because MCP-aware OpenTelemetry attributes were inconsistent. Expect this to shake out over 2026 — but don’t expect out-of-the-box cross-vendor tracing yet.
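The semantic-attribute problem is concrete enough to sketch without any OTel dependency. The attribute keys below are assumptions, which is precisely the point: until the conventions stabilize, every vendor picks its own:

```python
import time

def tool_call_span(tool: str, server: str, fn):
    """Wrap a tool call in a span-shaped record. The attribute keys are
    assumed: the draft MCP semantic conventions may name them differently,
    which is exactly the cross-vendor-tracing caveat."""
    start = time.monotonic()
    error, result = None, None
    try:
        result = fn()
    except Exception as exc:
        error = type(exc).__name__
    span = {
        "mcp.tool.name": tool,      # assumed attribute key
        "mcp.server.name": server,  # assumed attribute key
        "duration_ms": (time.monotonic() - start) * 1000,
        "error.type": error,
    }
    return result, span
```

If one gateway emits `mcp.tool.name` and another emits `mcp.method.name` for the same thing, cross-vendor dashboards and alerts silently break. That is the shake-out to watch through 2026.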
Practical Implications
A Build-vs-Buy-vs-Skip Decision Framework
Every team adopting MCP makes the same six decisions. Here’s a framework for each slot, sized to team profile.
Slot 1 — Registry (tool discovery, signing, server cards)
- Skip if you’re running fewer than five MCP servers internally and have no third-party servers in the mix.
- Buy if you have third-party MCP servers in production (tool poisoning defense requires cryptographic signing and fingerprinting; don’t roll your own — use Kong’s MCP Registry, Solo.io’s agentregistry, or Arcade’s governed catalog).
- Build only if you are a large enterprise with existing internal API catalog infrastructure you can extend. Amazon built theirs; you probably shouldn’t. Default: buy.
Slot 2 — Auth proxy (OAuth termination, IdP integration)
- Skip is not an option for enterprise deployments. The prior scout documented why the native MCP auth model is a non-starter.
- Buy if you already have an enterprise IdP (Okta, Entra) and want it to be the source of truth — Solo Enterprise for agentgateway, Zuplo MCP Gateway, and Kong all handle this properly. Solo.io’s “Cross-App Access” direction is closest to where the spec is headed.
- Build is a mistake for almost everyone. The OAuth spec churn referenced in the prior scout is still churning. Let a vendor absorb the compliance surface. Default: buy.
Slot 3 — Policy engine (RBAC, PII redaction, tool allow-lists)
- Skip if you are pre-production or running low-sensitivity data only.
- Buy if you have regulatory exposure (healthcare, finance, EU data residency). Zuplo, Solo.io, and Enkrypt all ship this. Uber’s GenAI Gateway is an existence proof that this is worth doing even at hyperscale.
- Build a thin policy layer if your organization already has a centralized policy engine (OPA, Cedar) and you just need an adapter. Default: buy if you have compliance scope; skip otherwise.
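The "thin adapter" option amounts to translating MCP call context into the input document your centralized policy engine already evaluates. A local sketch with a stubbed decision; a real adapter would delegate to OPA's data API or Cedar's authorizer instead of evaluating in-process, and the document shape is invented here:

```python
def to_policy_input(identity: str, tool: str, args: dict) -> dict:
    """Translate an MCP tool call into a policy-engine input document
    (OPA/Cedar style). The shape is illustrative."""
    return {
        "subject": identity,
        "action": "tools/call",
        "resource": tool,
        "context": {"arg_keys": sorted(args)},
    }

def evaluate(policy: dict, doc: dict) -> bool:
    """Stand-in for the policy engine; a real adapter would POST `doc`
    to OPA or call Cedar rather than consult a local dict."""
    allowed = policy.get(doc["resource"], set())
    return doc["subject"] in allowed

# Illustrative allow-list policy: who may call which tool.
policy = {"query_db": {"analyst@corp"}}
```

The adapter stays thin because the decision logic, audit trail, and policy lifecycle all remain in the engine you already operate.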
Slot 4 — Observability (tracing, audit, cost attribution)
- Skip is a short-term-only answer and a debugging disaster waiting to happen. Don’t.
- Buy if you already use Datadog, Honeycomb, or LangSmith — the MCP coverage is in the box now. Add OpenLLMetry if your platform doesn’t natively trace LLM calls.
- Build only means wiring OpenTelemetry yourself, which you’ll do anyway. Default: buy, but budget instrumentation work even if you buy.
Slot 5 — Transport/session routing
- Skip if you’re running single-instance stdio MCP servers for local agents. This is the right answer for most coding-agent deployments.
- Buy by choosing a vendor gateway that handles this (Microsoft mcp-gateway, Envoy AI Gateway, Solo.io all do session-affine routing). Don’t make this your own problem until SEP-1442 lands.
- Build is actively discouraged — the session-affinity-vs-horizontal-scaling problem is exactly the kind of thing that eats a quarter of platform-engineering time. Default: skip for local; buy for remote.
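The session-affinity problem those vendors solve can be sketched in a few lines: pin each MCP session to a stable backend, which is what StatefulSet-backed routing provides. Plain modular hashing here, purely illustrative; a production router would use consistent hashing so scaling events move fewer sessions:

```python
import hashlib

def route_session(session_id: str, pods: list[str]) -> str:
    """Session-affine routing sketch: the same MCP session always lands on
    the same backend pod. StatefulSet pods have stable DNS names, which is
    what makes this mapping meaningful in Kubernetes."""
    digest = int(hashlib.sha256(session_id.encode()).hexdigest(), 16)
    return pods[digest % len(pods)]
```

The hard part is not the hash: it is what happens when `pods` changes mid-session, which is why this eats platform-engineering quarters and why SEP-1442 (stateless MCP) would dissolve the problem.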
Slot 6 — Execution sandbox
- Skip if your MCP servers only expose read operations or narrowly scoped mutations with strong validation. Many coding-assistant setups fit here.
- Buy the moment you need to run model-generated code against anything beyond a scratch environment. E2B (microVMs) for strong isolation, Daytona or Modal for speed, Colab for dev-local GPU offload, Arcade for “I want the catalog and the sandbox together.”
- Build only if you have specialized compute needs (HPC, specialized GPUs, on-prem regulatory). Default: buy; this is where the newest tooling is.
The Decision Shape
In practical terms, a team committing to production MCP in mid-2026 should expect to make one “buy everything” choice (auth, policy, transport, registry from a single vendor — Solo.io, Kong, or Zuplo are the defensible picks) plus two separable choices: an observability vendor (your existing APM if it covers MCP; otherwise LangSmith or OpenLLMetry-to-Datadog/Honeycomb) and an execution sandbox (E2B, Modal, Daytona, or Arcade’s catalog, chosen on the isolation-vs-speed axis).
Teams doing fewer than five internal MCP servers with no compliance scope can skip most of this and run Microsoft’s mcp-gateway or Docker MCP Gateway as a thin reverse proxy. Teams doing more than twenty internal MCP servers or any third-party servers cannot skip any of the six slots.
The architectural shift the Edition 7 digest named — “point agent at server” → “consume a vetted tool gateway” — is now operationally concrete. Budget accordingly: a production MCP gateway stack is not a weekend build. It is infrastructure on the same order as an API gateway, with a vendor ecosystem and pricing tier to match.
Open Questions
- Does the Enterprise Working Group form in time to matter? Gateway semantics (authorization propagation, session boundaries, visibility rules) are pre-RFC. If the group forms and converges on answers by late 2026, vendors can interoperate. If it doesn’t, the gateway pattern fragments along vendor lines and MCP gateways become as siloed as API gateways ever were.
- Will “tool supply chain” become a governance category? Amazon’s agent-sop project is the first serious “how does a tool get approved, signed, bundled, versioned, and distributed internally” implementation. This looks less like API governance and more like package-manager governance (npm registries, Docker Hub). It needs a standard.
- Does Arcade’s “7,500 governed tools” model generalize? If the answer to “what tools do our agents use?” becomes “whatever’s in the catalog we subscribe to,” the build vs. buy calculus for every individual tool shifts — you don’t write a GitHub MCP server anymore, you use the one in your gateway’s catalog. That’s a concentration risk and a velocity win simultaneously.
- Can OpenTelemetry semantic conventions for MCP stabilize before vendor-specific attributes ossify? The race between the standards body and the installed base is real. Every month of delay is another set of instrumented tools that emit slightly different trace attributes.
- Where does the execution sandbox settle — as a feature of the gateway, or as an independent platform? Arcade and Colab argue for the former; E2B/Modal/Daytona argue for the latter. The answer determines whether “MCP infrastructure” is a single vendor category or two.
Sources
- AAIF’s MCP Dev Summit: Gateways, gRPC, and Observability Signal Protocol Hardening — InfoQ
- Announcing the Colab MCP Server — Google Developers Blog
- Arcade Dev Tools Now in LangSmith Fleet — LangChain
- MCP Dev Summit 2026: AAIF Sets A Clear Direction With Disciplined Guardrails — Futurum Group
- Introducing Kong’s Enterprise MCP Gateway — Kong
- Kong MCP Server Registry — Kong
- Zuplo MCP Gateway: Enterprise Governance for MCP at Scale — Zuplo
- Solo.io Launches Solo Enterprise for agentgateway and Solo Labs for MCP — Solo.io
- Agentgateway: The AI-Native Gateway — Solo.io
- Prevent MCP Tool Poisoning With a Registration Workflow — Solo.io
- microsoft/mcp-gateway — Microsoft, session-aware stateful routing for MCP on K8s
- The 2026 MCP Roadmap — official MCP blog
- MCP’s 2026 Roadmap Makes Enterprise Readiness a Top Priority — WorkOS
- SEP-1649: MCP Server Cards — HTTP Server Discovery via .well-known
- SEP-2127: MCP Server Cards PR
- Official MCP Registry
- 5 Best MCP Gateways for Developers in 2026 — Maxim AI
- 10 Best MCP Gateways for Developers in 2026 — Composio
- agentic-community/mcp-gateway-registry — open-source Keycloak/Entra-integrated MCP gateway + registry
- Stainless: Generate MCP Servers from OpenAPI Specs
- Daytona vs E2B in 2026: which sandbox for AI code execution? — Northflank
- Prior scout: MCP Production Pain Points — builds on, does not duplicate