Anthropic introduced the Model Context Protocol on November 25, 2024, as an open standard for connecting AI agents to external tools and data sources. It addressed what engineers call the M-times-N problem: the combinatorial explosion of connecting M different AI models with N different tools, each pairing requiring custom integration code. MCP reduces this to M plus N: each model implements the client protocol once, each tool implements the server protocol once, and everything connects.
The protocol gained traction quickly. OpenAI adopted MCP across its Agents SDK, Responses API, and ChatGPT desktop application in March 2025. Google DeepMind confirmed Gemini support in April 2025. Microsoft announced MCP integration for Windows 11, Azure, and Foundry at Build 2025. In December 2025, Anthropic donated MCP to the newly formed Agentic AI Foundation under the Linux Foundation, co-founded with OpenAI and Block, with backing from AWS, Google, Microsoft, Cloudflare, and Bloomberg. By then the protocol had reached 97 million monthly SDK downloads and over 10,000 active servers.
But beneath the adoption numbers, the specification itself has undergone significant evolution. Four spec versions shipped in just over a year: the initial release (2024-11-05), a major transport and auth overhaul (2025-03-26), a refinement pass (2025-06-18), and a one-year anniversary release (2025-11-25). The transport layer changed. Authentication was standardized. One of the three core primitives drifted into practical irrelevance. And a new challenge — context window bloat — emerged as teams pushed MCP into production at scale.
These are not cosmetic updates. They reflect hard lessons about how AI agents actually use protocols in production and carry direct implications for anyone building or consuming MCP servers today.
The Original Design: Three Primitives
MCP launched with three core primitives, each designed for a different interaction pattern:
Tools are model-controlled functions the agent can invoke. Create a ticket, send an email, query a database. Each tool has a name, description, and parameter schema. The LLM reads tool descriptions and decides when to call them. This is the active, agentic primitive — the one that lets AI systems take actions in the world.
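Concretely, a tool definition is just a name, a natural-language description, and a JSON Schema for its parameters, advertised in response to a `tools/list` request. A minimal sketch of a hypothetical ticket-creation tool (the tool name and fields are illustrative, not from any real server):

```python
# A hypothetical MCP tool definition: name, description, and a JSON Schema
# for its parameters. The LLM reads the description to decide when to call it.
create_ticket_tool = {
    "name": "create_ticket",
    "description": "Create a support ticket with a title and priority.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Short summary of the issue"},
            "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        },
        "required": ["title"],
    },
}

# Servers advertise their tools in response to a tools/list request
# (MCP messages are JSON-RPC 2.0).
tools_list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [create_ticket_tool]},
}
```

Everything the model knows about the tool comes from this structure, which is also why tool descriptions later become a context-budget concern.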
Resources are application-controlled data sources. Documents, records, configuration files. Unlike tools, the application decides when to fetch resources and pass them as context to the model. Resources were designed as a browseable, read-only mechanism for exposing information without requiring function calls.
Prompts are predefined instruction templates that MCP servers expose to structure agent behavior for specific tasks. They standardize how models approach common operations and ensure consistency across teams using the same server.
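As a sketch, a server advertises a prompt via `prompts/list` as a name, description, and argument list, and `prompts/get` expands it into concrete chat messages. The prompt name and rendering logic below are hypothetical:

```python
# A hypothetical prompt definition as a server would advertise it via
# prompts/list: a name, a description, and the arguments it accepts.
summarize_prompt = {
    "name": "summarize_ticket",
    "description": "Produce a handoff summary of a support ticket.",
    "arguments": [
        {"name": "ticket_id", "description": "Ticket to summarize", "required": True},
    ],
}

# prompts/get expands the template into concrete messages for the model.
def render_prompt(ticket_id: str) -> dict:
    return {
        "messages": [
            {"role": "user",
             "content": {"type": "text",
                         "text": f"Summarize ticket {ticket_id} for handoff."}},
        ]
    }
```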
The initial transport used HTTP with Server-Sent Events (SSE) for server-to-client streaming. Authentication was left to individual implementations — typically API tokens stored in environment variables.
Shift 1: The Resources Primitive that Builders Bypassed
The Resources primitive was conceptually sound. Instead of wrapping every data access in a tool call, agents could discover and read resources like files in a filesystem. The specification drew a clear architectural line: tools are model-controlled (the LLM decides when to invoke them), while resources are application-controlled (the host application decides when to fetch them).
In practice, the distinction proved less useful than expected. A data query — “get the customer record for ID 12345” — maps just as naturally onto a tool call with parameters. The Tools primitive is flexible enough to handle both actions (create, update, delete) and queries (read, search, list). Routing everything through tools gives the LLM a single, consistent interface for interacting with external systems.
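The two shapes side by side make the redundancy obvious. Both are JSON-RPC requests; the tool form simply reuses the interface the model already has (the tool name, arguments, and URI scheme below are illustrative):

```python
# A read-only lookup expressed as a tools/call request — the same JSON-RPC
# shape used for actions like create or delete.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "get_customer",
        "arguments": {"customer_id": "12345"},
    },
}

# The equivalent Resources read is a resources/read request against a URI —
# a second, parallel interface for reaching the same data.
read_request = {
    "jsonrpc": "2.0",
    "id": 3,
    "method": "resources/read",
    "params": {"uri": "crm://customers/12345"},
}
```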
This is visible in the MCP server ecosystem. Examine community implementations, cloud-hosted services, and open-source projects: tool-heavy designs dominate. Resources exist in the specification and are supported by conforming SDKs, but the community has largely converged on tools as the universal primitive.
The pattern is familiar from protocol history. Initial specifications cast a wide net — multiple primitives covering different theoretical interaction patterns. Practice simplifies. The community discovers which abstractions carry their weight in production and which add surface area without proportional value.
Resources are not deprecated. They remain in every spec version through 2025-11-25 and are documented with full SDK support. But their practical irrelevance is an important signal for builders: when designing MCP servers, invest in tools.
Shift 2: Streamable HTTP Replaces SSE
The original transport used Server-Sent Events — a unidirectional streaming protocol requiring the server to maintain a persistent connection for pushing updates to the client. This worked for locally running agents but created serious problems at scale.
Why SSE Failed in Production
SSE requires persistent connections. In serverless environments — AWS Lambda, Cloudflare Workers, Vercel Functions — maintaining persistent connections is either impossible or prohibitively expensive. Organizations wanting to deploy MCP servers as serverless functions were forced into workarounds or could not use MCP at all.
Even in traditional server environments, persistent connections create operational burden: connection pooling, timeout management, reconnection logic, and load balancing all become harder when connections must stay alive indefinitely. The 2026 MCP roadmap explicitly acknowledges this, noting that running streamable HTTP at scale surfaced gaps where stateful sessions fight with load balancers and horizontal scaling requires workarounds.
What Changed
The specification version 2025-03-26, released in March 2025, deprecated HTTP+SSE and introduced the streamable HTTP transport. The TypeScript SDK added support in version 1.10.0 on April 17, 2025.
The new transport supports both stateful and stateless operation. A simple MCP server can handle each request independently — no persistent connection required, fully compatible with serverless deployment. More complex servers that need streaming or server-initiated messages can optionally upgrade the connection to use SSE within the new transport framework.
The server provides a single HTTP endpoint (the MCP endpoint) supporting POST and GET methods. Session management is optional — servers may assign a session ID via the Mcp-Session-Id header, but stateless servers can skip sessions entirely. This means MCP servers can now be deployed as:
- Serverless functions — Lambda, Cloud Functions, Vercel
- Traditional web services — Express, FastAPI, Spring Boot
- Container microservices — Kubernetes, Cloud Run, ECS
- Edge functions — Cloudflare Workers, Vercel Edge
The constraint of persistent connections is gone. MCP servers can be deployed wherever HTTP works — which is everywhere.
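A stateless server reduces to a pure request handler: one JSON-RPC message in, one response out, nothing retained between calls. A minimal sketch (handler and field values are illustrative; a real server would dispatch many more methods):

```python
import json

# Sketch of a stateless streamable HTTP endpoint: each POST carries one
# JSON-RPC message and is handled independently — no persistent connection,
# no server-side session, so it runs unchanged on Lambda or Workers.
def handle_mcp_post(body: bytes) -> dict:
    msg = json.loads(body)
    if msg.get("method") == "initialize":
        return {"jsonrpc": "2.0", "id": msg["id"],
                "result": {"protocolVersion": "2025-03-26",
                           "capabilities": {"tools": {}},
                           "serverInfo": {"name": "demo", "version": "0.1"}}}
    if msg.get("method") == "tools/list":
        return {"jsonrpc": "2.0", "id": msg["id"], "result": {"tools": []}}
    return {"jsonrpc": "2.0", "id": msg.get("id"),
            "error": {"code": -32601, "message": "Method not found"}}
```

A stateful server would differ only in issuing an Mcp-Session-Id header at initialization and requiring it on subsequent requests.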
Backward Compatibility
Servers wanting to support older clients can host both the deprecated SSE endpoints and the new streamable HTTP endpoint simultaneously. This allows gradual migration without breaking existing integrations.
Practical Impact
For builders, this change is immediately actionable. New MCP servers should target streamable HTTP from day one. Even for initial local deployment, stateless HTTP design enables future migration to serverless or edge environments without rearchitecting.
For consumers evaluating MCP servers, transport support is now a compatibility question. Servers still using SSE-only transport will create deployment constraints in cloud-native environments.
Shift 3: OAuth 2.1 Closes the Authentication Gap
The initial MCP specification (2024-11-05) had no standardized authentication mechanism. Developers implemented their own schemes, typically passing API tokens in environment variables. This was acceptable for locally running servers where the developer controlled the environment but unacceptable for multi-tenant cloud deployments.
The Shared Credential Problem
Consider a company deploying an MCP server for GitHub access. With the old model, the server needs a GitHub token — either a personal access token scoped to one user or a service account token with broader permissions. Neither option is secure for a multi-user service.
A personal token means all requests go through one user’s permissions, with no audit trail distinguishing who did what. A service account token typically has more access than any individual user should have, violating the principle of least privilege.
How OAuth 2.1 Fixes It
The specification version 2025-03-26 introduced a comprehensive authorization framework based on OAuth 2.1. The architecture maps cleanly onto established OAuth roles: MCP servers act as OAuth 2.1 Resource Servers, MCP clients act as OAuth 2.1 clients, and a separate authorization server handles token issuance.
The flow works as follows:
- A user’s agent connects to a cloud-hosted MCP server
- The server responds with HTTP 401 Unauthorized, triggering the OAuth flow
- The client redirects the user to the authorization server’s consent screen
- The user authorizes access with specific scopes
- The MCP server receives an access token scoped to that specific user
- Subsequent tool calls use the user’s token, not a shared credential
The spec mandates several security measures. Clients must implement Resource Indicators (RFC 8707) to prevent malicious servers from obtaining access tokens meant for other services. Servers must implement OAuth 2.0 Protected Resource Metadata (RFC 9728) to indicate their authorization server locations. Tokens must be validated for intended audience.
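The discovery handshake these RFCs define can be sketched as data: the 401 challenge points at the server's protected resource metadata (RFC 9728), which names the authorization server, and the client binds its token request to the resource (RFC 8707). All URLs below are illustrative:

```python
# Step 1: an unauthenticated request gets a 401 whose WWW-Authenticate
# header points at the server's protected resource metadata (RFC 9728).
challenge = {
    "status": 401,
    "headers": {
        "WWW-Authenticate":
            'Bearer resource_metadata='
            '"https://mcp.example.com/.well-known/oauth-protected-resource"'
    },
}

# Step 2: the metadata document names the authorization server(s).
resource_metadata = {
    "resource": "https://mcp.example.com",
    "authorization_servers": ["https://auth.example.com"],
}

# Step 3: per RFC 8707, the client includes a resource indicator in its
# token request, so the issued token is audience-bound to this server and
# cannot be replayed against a different service.
token_request_params = {
    "grant_type": "authorization_code",
    "code": "<authorization-code>",
    "resource": "https://mcp.example.com",
}
```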
Each user controls their own access. Tokens can be scoped to minimum necessary permissions. There is a clear audit trail per user. No shared credentials sit in environment variables.
Why This Unlocks Enterprise Adoption
The most common enterprise objection to MCP was security. OAuth 2.1 eliminates this objection. Each user authenticates independently through flows the enterprise already trusts. IT teams can enforce access policies per user. Compliance teams can audit per-user actions.
The November 2025 spec release (2025-11-25) further refined the authorization model, and the 2026 roadmap lists deeper security and SSO-integrated auth among its enterprise readiness priorities. This is the shift that moves MCP from developer tool to enterprise infrastructure.
The Emerging Challenge: Context Window Bloat
As MCP moved into production at scale, a new problem surfaced that the original designers did not anticipate: tool definitions consume context window tokens.
Every tool description, parameter schema, and response format eats into the model’s working memory. For agents connecting to multiple MCP servers, the overhead compounds rapidly. Industry reports indicate that connecting six MCP servers with 84 total tools consumes approximately 15,500 tokens at session start — before the agent performs any actual work. Some teams report MCP consuming 40 to 50 percent of available context windows.
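The arithmetic from those reported figures is sobering. At roughly 185 tokens per tool definition, overhead grows linearly with the tool catalog and is paid on every session, before any work happens:

```python
# Back-of-envelope context budgeting using the figures reported above:
# six servers, 84 tools, ~15,500 tokens of schema overhead at session start.
tools = 84
schema_tokens = 15_500
per_tool = schema_tokens / tools  # ~185 tokens per tool definition

# Against a 200K-token context window this is modest, but the same per-tool
# cost applied to a few hundred tools dominates smaller windows.
context_window = 200_000
overhead_pct = schema_tokens / context_window * 100  # 7.75%
```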
The problem is significant enough that Perplexity’s CTO announced in March 2026 that the company was moving away from MCP in favor of traditional APIs and CLIs, citing token bloat and authentication friction as core issues.
The community is responding with several approaches. A code execution pattern demonstrated token reductions of up to 98 percent by letting agents write and execute code rather than making individual tool calls. Progressive disclosure sends a minimal manifest at connection time — just tool names and one-line descriptions — loading full schemas only when needed. The mcp2cli project converts MCP servers into CLI interfaces that agents call on demand, avoiding upfront schema injection entirely.
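Progressive disclosure, for instance, can be sketched as splitting the catalog into a cheap manifest and lazily loaded schemas (the tool names, descriptions, and helper functions here are illustrative):

```python
# Progressive disclosure, sketched: at connect time the client injects only
# a minimal manifest (names plus one-line descriptions) into the context;
# full parameter schemas are fetched per tool, only when the model selects one.
FULL_SCHEMAS = {
    "create_ticket": {"type": "object",
                      "properties": {"title": {"type": "string"}}},
    "get_customer": {"type": "object",
                     "properties": {"customer_id": {"type": "string"}}},
}

def manifest() -> list[dict]:
    # The only part paid for at session start.
    return [
        {"name": "create_ticket", "description": "Create a support ticket."},
        {"name": "get_customer", "description": "Fetch a customer record by ID."},
    ]

def load_schema(tool_name: str) -> dict:
    # Loaded on demand, for the one tool the model actually chose.
    return FULL_SCHEMAS[tool_name]
```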
Anthropic itself published engineering guidance on the code execution pattern, acknowledging the problem and providing a reference implementation. The 2026 roadmap lists transport scalability and session handling improvements that should help address context overhead at the protocol level.
This is worth watching. Context efficiency may become as important as transport and auth in determining MCP’s production viability.
The Maturation Pattern
MCP’s evolution follows a pattern familiar from protocol history. HTTP started simple, then grew to handle persistent connections, streaming, and multiplexing. REST APIs were initially free-form, then consolidated around OpenAPI specifications. GraphQL launched with all features exposed, then the community converged on a practical subset.
MCP is going through the same maturation. The initial spec cast a wide net — tools, resources, prompts, SSE transport, no standard auth. Four specification versions later, the community has narrowed the effective surface area: tools over streamable HTTP with OAuth 2.1, governed by the Linux Foundation through the Agentic AI Foundation.
The next specification release is tentatively slated for June 2026, focusing on transport scalability for load-balanced deployments, agent-to-agent communication semantics, and enterprise features like audit trails and configuration portability.
What to Do Now
If You Are Building MCP Servers
- Build tools, not resources. Route all functionality — actions and data queries — through the Tools primitive. This is where the community has converged.
- Design for streamable HTTP. Build stateless where possible, even for initial local deployment. This keeps future deployment options open.
- Implement OAuth 2.1 for any server that accesses user-specific data in multi-tenant scenarios. The spec mandates RFC 8707 Resource Indicators and RFC 9728 Protected Resource Metadata.
- Minimize tool descriptions. Context window overhead is a real production concern. Keep descriptions concise and parameter schemas tight.
If You Are Consuming MCP Servers
- Evaluate tool catalogs. The tools a server exposes are its real capabilities. Resource support is irrelevant for most use cases.
- Check transport compatibility. Servers using SSE-only transport will constrain your deployment options. Prefer streamable HTTP.
- Require OAuth for any server touching sensitive data. Shared API keys in environment variables are a security liability.
- Watch token overhead. Multiple MCP connections compound context consumption. Budget for schema overhead in your context window planning.
If You Are Evaluating MCP for Your Organization
- The protocol is production-ready and backed by all major AI providers under Linux Foundation governance.
- Start with consuming, not building. Use existing MCP servers for common integrations (over 10,000 are available) before investing in custom server development.
- Plan for multi-tenant. Even if your first deployment is single-user, OAuth-ready architecture avoids painful retrofitting later.
Conclusion
MCP in early 2026 is a more focused, deployable, and secure protocol than what launched in November 2024. Four specification versions have refined the protocol based on production experience. The community has converged on tools as the primary primitive, streamable HTTP as the transport, and OAuth 2.1 as the authentication standard. The Resources primitive remains in the spec but is effectively dormant. And a new challenge — context window bloat — is driving the next wave of protocol innovation.
For builders, the path is clear: build tools, deploy anywhere, authenticate properly, and keep your tool schemas lean. The protocol has found its production-ready form, and the ecosystem is maturing fast.
Frequently Asked Questions
Is the MCP Resources API deprecated?
Not officially. The Resources primitive remains in every specification version through 2025-11-25 and is supported by conforming SDKs. However, community adoption has been minimal — most MCP servers route all functionality through the Tools primitive. The key distinction is that Resources are application-controlled (the host app decides when to fetch) while Tools are model-controlled (the LLM decides when to invoke). In practice, this distinction has not proven valuable enough to justify a separate primitive. For new MCP server development, investing in Resources is generally not recommended.
Why did MCP switch from SSE to streamable HTTP?
Server-Sent Events required persistent connections, making MCP servers incompatible with serverless platforms like AWS Lambda and Cloudflare Workers where connections are short-lived. The specification version 2025-03-26 deprecated SSE and introduced streamable HTTP, which supports both stateful and stateless operation. Simple servers handle each request independently with no persistent connection, while complex servers can optionally upgrade to streaming. This opened MCP deployment to serverless functions, edge computing, containers, and traditional servers.
How does OAuth 2.1 change MCP security for enterprises?
Before OAuth 2.1, MCP servers accessing user-specific services needed shared credentials — API tokens in environment variables. This prevented per-user access control and created security risks in multi-tenant environments. The specification version 2025-03-26 introduced OAuth 2.1 with MCP servers acting as Resource Servers and clients implementing Resource Indicators (RFC 8707). Each user authenticates independently through standard OAuth consent flows. The result: per-user token scoping, clear audit trails, and no shared credentials — exactly what enterprise compliance teams require.
Sources & Further Reading
- Model Context Protocol — Official Specification
- MCP Streamable HTTP Transport Specification (2025-03-26)
- MCP Authorization Specification — OAuth 2.1
- Why MCP Deprecated SSE and Went with Streamable HTTP — fka.dev
- The 2026 MCP Roadmap — Model Context Protocol Blog
- Anthropic Donates MCP to the Agentic AI Foundation
- MCP Authentication and Authorization — Stack Overflow Blog
- Code Execution with MCP: Building More Efficient AI Agents — Anthropic Engineering