MCP servers are straightforward to build and secure for a single application. That’s by design; it’s a dev-forward protocol that is still rapidly evolving. The challenge with this design is that, once you need to offer agent connections to dozens or hundreds of B2B customers, the one-size-fits-all approach falls apart. Each of these enterprise clients needs its own credentials, consent screens, and integrations further downstream.
It’s not a new problem in SaaS, but it is a new problem with MCP. Routing and policy enforcement across dozens or hundreds of customers is untenable without some extra scaffolding. Meanwhile, MCP’s auth spec (OAuth 2.1, DCR/CIMD, scopes) makes multi-tenancy significantly more complex than traditional API-based scenarios.
This is why MCP gateway products are emerging as a partial solution to some (but not all) of these issues. Descope’s Agentic Identity Hub and its support for MCP gateways provide the managed identity infrastructure for multi-tenant MCP connections. It lets you model each tenant’s MCP server as a managed resource with its own client registration behavior, consent flows, connections, and access policies, all from a centralized control plane.
This guide covers the architectural decisions behind this approach, and whether an MCP gateway is right for you. You’ll learn:
When an MCP gateway is right for your multi-tenant needs, and when simpler patterns hold up
What popular MCP gateway products offer at the enforcement layer, and where they leave you to DIY
How the identity provider layer (which issues tokens, registers clients, handles consent flows, and manages credentials) fits behind the gateway
A practical implementation walkthrough using Golf.dev and Descope
When you need a gateway (and when you don’t)
Not every multi-tenant MCP deployment requires a gateway. The right architecture depends on how much your tenants diverge from each other, how many of them you need to support, and how much of the identity surface you want to consolidate.
Single MCP server with tenant context in the token
This is the simplest approach. One MCP server resolves tenant context per request and routes internally, determining which downstream credentials to use, which tools to expose, and which data to return. Tenant context can come from a claim in the token or from a server-side lookup against the user's identity.
The latter is generally preferable for MCP because the client holds the bearer token for the entire session with no mechanism to force a refresh, which means any tenant state baked into the JWT is frozen at login. Server-side resolution lets the same token serve multiple tenant contexts over the life of a session.
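Server-side resolution can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the `MEMBERSHIPS` and `TENANTS` stores, the `TenantContext` fields, and the tenant names are all hypothetical stand-ins for whatever lookup your server performs against the user's identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    tenant_id: str
    allowed_tools: frozenset
    credential_ref: str  # pointer to a stored downstream credential, not the secret itself


# Hypothetical server-side store: user -> tenants the user currently belongs to.
MEMBERSHIPS = {"user-1": {"acme", "globex"}}

# Hypothetical per-tenant settings the server routes on.
TENANTS = {
    "acme": TenantContext("acme", frozenset({"get_invoice", "list_projects"}), "cred/acme/billing"),
    "globex": TenantContext("globex", frozenset({"list_projects"}), "cred/globex/billing"),
}


def resolve_tenant(user_id: str, requested_tenant: str) -> TenantContext:
    """Look up tenant context on every request instead of trusting a claim frozen at login."""
    if requested_tenant not in MEMBERSHIPS.get(user_id, set()):
        raise PermissionError(f"{user_id} is not a member of {requested_tenant}")
    return TENANTS[requested_tenant]
```

Because the membership check runs per request, removing a user from a tenant takes effect on the next tool call, even though the bearer token held by the client has not changed.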
This works well when all tenants share the same session configuration, the same consent flow, and the same credential type for downstream services. A project management SaaS MCP server with 20 customers who all connect to the same internal API through the same OAuth scopes can operate this way without issue.
It falls apart when tenants need different session parameters (timeouts, refresh behavior), different scope mappings to downstream APIs, or isolated credential storage. It also breaks when you need to front both internal and third-party MCP servers from a single entrypoint.
At that point, tenant differentiation lives entirely in your application code, and the MCP server itself becomes the identity management layer by default. That is a maintenance headache that grows with every new customer onboarding and requirement.
If the single-server approach works for your situation, use it. You do not need a gateway.
Multiple independent MCP servers per tenant
This is explicit, highly regimented isolation. Each customer gets their own MCP server with its own .well-known endpoint, its own client registration configuration, its own scopes, and its own credential set.
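To make the per-tenant surface concrete, here is a rough sketch of the metadata each server might publish, assuming the `.well-known` document in question is OAuth protected resource metadata (RFC 9728), whose field names are used below. The hostnames, authorization server URLs, and scope names are hypothetical.

```python
from typing import List


def protected_resource_metadata(tenant: str, scopes: List[str]) -> dict:
    """Build the document a tenant's server could serve at
    /.well-known/oauth-protected-resource (field names from RFC 9728)."""
    base = f"https://{tenant}.mcp.example.com"  # hypothetical per-tenant host
    return {
        "resource": f"{base}/mcp",
        "authorization_servers": [f"https://auth.example.com/{tenant}"],  # hypothetical issuer
        "scopes_supported": scopes,
    }


acme = protected_resource_metadata("acme", ["invoices:read", "ledger:write"])
globex = protected_resource_metadata("globex", ["invoices:read"])
```

Every tenant added this way brings another document, another issuer, and another scope list to keep in sync.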
Fly.io prefers this single-tenant model (one app per customer) for MCP server deployments, citing compute isolation and billing efficiency through the auto-start/stop of idle machines. Framework-level projects like MCP Plexus and Sage MCP exist solely because enough teams have hit this multi-tenant ceiling, justifying orchestration. Meanwhile, there’s an open issue on the official MCP servers repository from a team trying to serve AWS, Azure, and Jira integrations across multiple tenants without running a separate server per tenant.
The per-tenant pattern is effective for compute and data isolation. What scales poorly is the identity management around it. Every new customer means a new endpoint to configure, new DCR or CIMD settings, new credential sets, and new consent flows.
The MCP servers themselves stay clean, but the identity scaffolding around them sprawls outward. At 50 tenants, this is a staffing and time problem. At 500, it is untenable.
There is also a distribution constraint. If you publish your MCP server as a Claude Connector or ChatGPT App, the platform expects a single endpoint with one .well-known URL. Per-tenant servers with independent endpoints do not fit that model without an aggregation layer in front of them.
MCP gateway without a dedicated identity provider
The MCP gateway market has matured about as quickly as you might expect given the rapid adoption of MCP itself. What is substantially different about MCP gateways as a product, however, is that they are not simply generic API gateways repurposed for a new protocol.
We’ll look at two offerings to better understand what they can do, and where they leave off: Kong AI Gateway and Traefik Hub MCP Gateway.
The Kong AI Gateway ships with an AI MCP Proxy plugin that bridges MCP and HTTP, letting MCP clients call existing APIs or interact with upstream MCP servers through Kong. The AI MCP OAuth2 plugin implements OAuth enforcement for MCP servers.
The Traefik Hub MCP Gateway provides TBAC, or Task-Based Access Control. This governs access across three dimensions (tasks, tools, and transactions). There’s JWT integration to inject claims into policy invocations at runtime, and optional On-Behalf-Of (OBO) authentication can forward the client's token into the MCP server for downstream token exchange when enabled.
Both of these MCP gateway offerings solve the first half of the problem: enforcement, routing, observability, and token validation. But they do not solve the second half of the problem: the identity infrastructure that provides token issuance, client registration, consent flows, credential isolation, and a centralized control plane.
Without an identity layer, a gateway can validate a token and enforce scopes, but it cannot resolve tenant context. Which consent flow issued this token? Which downstream credential backs this tool call? Those answers live in the identity provider, not in the token.
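A gateway-side check might look something like the sketch below. It assumes the token's signature has already been verified and its claims decoded into a dict; the function name and example values are hypothetical, while `iss`, `aud`, `exp`, and `scope` are standard JWT and OAuth claim names.

```python
import time
from typing import Optional


def gateway_check(claims: dict, *, issuer: str, audience: str,
                  required_scope: str, now: Optional[float] = None) -> bool:
    """Validate already-verified claims and enforce a single scope.

    Signature verification is assumed to have happened before this call.
    """
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # "aud" may be a string or a list
    scopes = str(claims.get("scope", "")).split()        # space-delimited scope string
    return (
        claims.get("iss") == issuer
        and audience in audiences
        and claims.get("exp", 0) > now
        and required_scope in scopes
    )
```

Nothing in this check says which consent flow produced the token or which downstream credential should back the tool call. Those lookups need a system that holds that state.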
Without one, that resolution falls back to your MCP server, which is exactly where you did not want it. You end up building the identity layer yourself inside or alongside the gateway, which is the same problem you were trying to centralize away from your MCP servers in the first place.
MCP gateway with a dedicated identity layer
This pattern pairs two components. The MCP gateway is a purpose-built routing layer that understands MCP’s specific semantics, handles enforcement, and layers in observability. A dedicated identity provider handles token issuance, client registration, consent flows, credential management, and per-tenant configuration. Standard gateways enforce just fine on their own, but they can’t make the identity decisions behind the tokens they check. That’s what the IdP coupling is for.
This separation matters when you have many tenants, mixed MCP server types (internal and third-party), per-tenant credential and policy requirements, and the need for the control plane to be managed by a centralized platform rather than hand-maintained per customer.
Here’s why the identity layer is essential for enterprise MCP servers:
The gateway validates tokens and enforces policies at request time
The identity provider issues tokens and evaluates policies at issuance time
Both layers apply policy
Neither may be sufficient alone
An agent whose token passes issuance checks can still be denied at the gateway based on context (e.g., rate limits). Layering these two together natively, in one solution, is what makes the architecture more defensible.
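The two policy points can be sketched side by side. This is an illustrative toy, not how any particular gateway or identity provider implements policy: `TENANT_SCOPES`, `issue_scopes`, and the `Gateway` class with its simple call counter are all hypothetical.

```python
from collections import defaultdict

# Hypothetical issuance policy: the scopes each tenant is allowed to receive.
TENANT_SCOPES = {
    "acme": {"invoices:read", "ledger:write"},
    "globex": {"invoices:read"},
}


def issue_scopes(tenant: str, requested: list) -> set:
    """Issuance-time policy: grant only the requested scopes the tenant permits."""
    return set(requested) & TENANT_SCOPES.get(tenant, set())


class Gateway:
    """Request-time policy: scope enforcement plus a per-agent call budget."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.calls = defaultdict(int)

    def allow(self, agent_id: str, granted: set, required_scope: str) -> bool:
        if required_scope not in granted:
            return False
        if self.calls[agent_id] >= self.max_calls:
            return False  # valid token, but denied on request-time context
        self.calls[agent_id] += 1
        return True


granted = issue_scopes("globex", ["invoices:read", "ledger:write"])
gateway = Gateway(max_calls=2)
```

Here the issuance step already narrows what the agent can ask for, and the gateway can still refuse a call that carries a perfectly valid token once the budget is spent.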
MCP multi-tenant scaling solutions compared
| | Single server | Per-tenant servers | Gateway without IdP | Gateway with IdP |
|---|---|---|---|---|
| Tenant isolation | Token-level | Compute-level | Gateway-level | Gateway + identity |
| Identity management | Tenant routing logic lives in your app code and grows with every new customer | Each server carries its own auth config; no shared control plane | You build it yourself inside the gateway or alongside it | One control plane manages all tenant configurations |
| Credential isolation | Your code decides which credential to use per tenant | Each server holds its own credentials | Gateway can forward tokens but can't scope credentials per tenant | Each tenant's downstream credentials are stored and resolved in isolation |
| Consent customization | Uniform | Per-server | Not addressed | Per-tenant flows |
| Operational scaling | Low overhead, limited flexibility | Linear overhead | Routing scales, identity doesn’t | Both routing and identity scale together |
How Descope supports MCP gateway implementations
The two gateway patterns for multi-tenant MCP map to different identity requirements. Both are documented in the Descope MCP Gateways guide.
Mixed mode (internal and third-party MCP servers)
In this pattern, the gateway fronts MCP servers you build alongside third-party servers like Notion, Linear, or Slack. The identity challenge splits along the same line. Internal servers need resource tokens scoped to your APIs. Third-party servers need connection tokens scoped to external OAuth providers. The identity provider must issue and manage both token types without leaking credentials across tenant boundaries.
A single agent request might invoke a tool on your internal billing server (requiring a Descope-issued resource token), and then a tool on a customer’s connected Slack instance (requiring a tenant-specific OAuth token for the Slack API). As the identity provider, Descope brokers both, and the gateway routes both, but neither token type should be visible to the other server.
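The split between the two token types can be modeled roughly as follows. This is a sketch under stated assumptions, not Descope's API: the `Server` and `TokenBroker` classes, the token strings, and the tenant names are hypothetical, and a real broker would fetch tokens from the identity provider rather than from in-memory dicts.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Server:
    name: str
    kind: str  # "internal" or "third_party"


@dataclass
class TokenBroker:
    # Both stores are keyed by (tenant, server name) so lookups never cross tenants.
    resource_tokens: dict = field(default_factory=dict)    # for internal servers
    connection_tokens: dict = field(default_factory=dict)  # for third-party servers

    def token_for(self, tenant: str, server: Server) -> str:
        store = self.resource_tokens if server.kind == "internal" else self.connection_tokens
        key = (tenant, server.name)
        if key not in store:
            raise LookupError(f"no {server.kind} token for tenant {tenant!r} on {server.name!r}")
        return store[key]


billing = Server("billing", "internal")
slack = Server("slack", "third_party")
broker = TokenBroker(
    resource_tokens={("tenant-a", "billing"): "resource-token-a"},
    connection_tokens={("tenant-a", "slack"): "slack-token-a"},
)
```

Each server only ever receives the token type meant for it, and a tenant with no stored Slack connection gets an error rather than another tenant's credential.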

Per-customer MCP servers
In this pattern, each customer gets a logically separate MCP server, each backed by the same runtime code but modeled as a distinct resource in the identity provider. Each server can have its own `.well-known` configuration, client registration behavior, session settings, and branding. Unlike the Fly.io model presented above, this does not require separate compute instances. Isolation is achieved at the identity layer through per-tenant configuration of the same runtime.
The identity challenge is configuration management at scale. Each customer may require distinct session timeouts, different consent flows (e.g., extra MFA for financial services tenants, simplified consent for internal tools), and different downstream credential sets. As the identity provider, Descope resolves tenant context and issues appropriately scoped tokens before the gateway routes a single tool call. The MCP Server Management API allows programmatic creation and configuration of these per-tenant server definitions.
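One way to picture per-tenant server definitions is as a shared set of defaults plus overrides. The sketch below is purely illustrative and does not reflect the MCP Server Management API: `ServerDefinition`, its fields, and the default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerDefinition:
    tenant: str
    session_timeout_minutes: int = 60   # hypothetical default
    require_mfa: bool = False
    consent_flow: str = "standard"
    credential_set: str = "default"


def define_server(tenant: str, **overrides) -> ServerDefinition:
    """Same runtime, different identity configuration per tenant."""
    return ServerDefinition(tenant=tenant, **overrides)


# A financial services tenant gets stricter settings; an internal tool keeps the defaults.
bank = define_server("bank", session_timeout_minutes=15, require_mfa=True, consent_flow="mfa")
internal = define_server("internal-tools", consent_flow="simplified")
```

Keeping the definitions declarative like this is what makes it practical to create and update them programmatically instead of by hand.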

Implementation example of an MCP gateway with Descope and Golf.dev
Descope’s integration with Golf.dev provides a concrete example of this architecture in action. Golf.dev is an MCP firewall and gateway that acts as a runtime enforcement layer for identity policy. It sits in front of MCP servers, validates tokens, reads roles and scopes, and applies access policies before requests reach the server. Descope acts as the identity provider behind it, managing authentication, authorization, client registration, and credential storage.
In summary, here’s what is happening:
An agent requests access to Tenant A's billing MCP server.
Descope handles client registration (DCR or CIMD), runs the tenant-specific consent flow (SSO, MFA, branded consent screen with tool-level scopes), evaluates the issuance policy, and mints a scoped token.
Golf.dev validates the token and routes the request.
The MCP server retrieves Tenant A's downstream credentials from Descope's Connections and executes the tool.
Tenant B enters with a different consent flow (additional attribute collection), different downstream credentials (a different billing provider), and a narrower scope mapping. The gateway and the identity provider stay the same.
Descope resolves the tenant context and issues the right token with the right scopes backed by the right credentials.
As this all happens, there’s no leakage, and the two organizations get total tenant-aware isolation. Descope roles map to Golf.dev gateway groups, enabling tool-level RBAC: a user with an analyst role can get invoices but not write to the ledger.
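The analyst example can be sketched as a role-to-group-to-tool lookup. The role names, group names, and tool names below are hypothetical, and the mapping tables stand in for whatever configuration the gateway and identity provider actually share.

```python
# Hypothetical mapping from identity provider roles to gateway groups.
ROLE_TO_GROUP = {
    "analyst": "billing-readers",
    "accountant": "billing-writers",
}

# Hypothetical tools each gateway group may invoke.
GROUP_TOOLS = {
    "billing-readers": {"get_invoice"},
    "billing-writers": {"get_invoice", "write_ledger"},
}


def can_call(roles: list, tool: str) -> bool:
    """Return True if any of the user's roles maps to a group that allows the tool."""
    return any(
        tool in GROUP_TOOLS.get(ROLE_TO_GROUP.get(role, ""), set())
        for role in roles
    )
```

With these tables, `can_call(["analyst"], "get_invoice")` is allowed while `can_call(["analyst"], "write_ledger")` is not, and an unknown role grants nothing.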
The integration requires only a Descope Project ID, issuer URL, and audience identifiers. For full configuration details, see the Golf.dev integration guide.
Descope as the identity layer inside an MCP gateway product
Cequence Security processes nearly 10 billion API transactions daily across telecom, financial services, and retail. Their AI Gateway product turns any API into an MCP-compatible endpoint without custom server code, with built-in monitoring, agent personas, and enterprise access controls. Descope powers the identity infrastructure behind it.
Cequence CTO Shreyans Mehta said:
“Descope is a key under-the-hood component of the Cequence AI Gateway. Descope’s flexible, developer-friendly handling of MCP auth has played an important role in helping Cequence AI Gateway customers securely connect their applications, APIs, and data to AI agents.”
This is the same architectural split covered throughout this guide:
Cequence owns enforcement, routing, observability, and the no-code MCP server generation layer.
Descope owns token issuance, client registration, consent, and credential management.
The result of this combination is a gateway product that ships with enterprise-grade identity from day one. Any MCP gateway product that needs to offer multi-tenancy to its customers faces the same build-or-buy tension on the identity layer. Descope provides that layer as managed infrastructure.
For more on how Cequence uses Descope, read the Cequence Security case study.
MCP gateways with built-in identity at the core
MCP gateways are not a universal solution, and not every organization will need one. For multi-tenancy at scale, especially with mixed MCP server types and per-customer identity requirements, the identity layer is the part that determines whether the architecture holds or not. Existing gateway solutions handle routing and coarse-grained authorization competently. But even Traefik’s MCP gateway product lacks the ability to act as your identity provider, only offering the scaffolding that solves the first part (and not the second) of the multi-tenancy problem.
What these solutions need behind them is an identity provider that can issue the right tokens, manage client registration across tenants, orchestrate consent flows that vary per customer, and store credentials without leaking them across boundaries. That’s Descope.
Get started by exploring the MCP Gateways documentation, signing up for a Free Forever Descope account, or booking a demo with the Descope team. For building MCP servers with Descope auth, see the Agentic Identity Hub and Python MCP SDK.


