Architecture, Implementation, and Strategic Use of the Model Context Protocol (MCP)

Learn how MCP standardizes AI context and tool access, from protocol primitives to secure deployment across Claude, Zed, and custom servers.

The interoperability problem in modern AI

LLMs are impressive at reasoning and generation, but operationally isolated. Claude, GPT-4, and Llama can explain an architecture diagram, yet they cannot query your production inventory, inspect a private repo, or read a live dashboard without custom glue code. That gap creates a hidden scalability tax.

Every new model-to-tool connection becomes a bespoke integration with its own authentication, error handling, and data transformation logic. If you run three assistants and need to connect four data sources, you are looking at 12 unique connectors. That N x M complexity locks teams into fragile systems and slows experimentation.

The Model Context Protocol (MCP) solves this by turning integrations into reusable infrastructure. Instead of building a connector per assistant, you build one MCP server per data source or tool, and any MCP-compatible client can use it. The result is a one-to-many model of interoperability, similar to how USB-C standardized physical device connections.

MCP in one sentence

MCP is an open, JSON-RPC-based protocol that standardizes how AI clients discover and consume context and execute actions through four primitives: resources, tools, prompts, and sampling.

That definition matters because it moves AI from “chat plus manual copy/paste” to “systems that can safely act.” With MCP, models gain hands and eyes: they can read files, query databases, or open tickets through controlled interfaces while preserving governance at the host layer.

Architecture: Host, Client, Server

MCP is a client-server protocol, but the roles are specific:

  1. Host: The user-facing app (Claude Desktop, Zed, Cursor). It controls session lifecycle, authentication, and user consent. Hosts decide how tool calls and retrieved data are presented to the LLM.
  2. Client: The MCP implementation within the host. Each client maintains a dedicated one-to-one connection with a single server; the host runs one client per server and merges their capabilities into a unified toolset for the model.
  3. Server: A lightweight process that exposes a single capability or data source (filesystem, Git, Slack, Postgres). Servers are isolated from each other and receive only the data required for the requested task.

This division is deliberate. The host is responsible for safety and user control. Servers are focused and composable. The host's clients stitch their capabilities together so the model sees one cohesive toolkit.
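Every session starts with a capability handshake. A simplified sketch of the initialize exchange (the protocol version string varies by spec revision):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": { "sampling": {} },
    "clientInfo": { "name": "example-host", "version": "1.0.0" }
  }
}

The server replies with what it supports, so the client knows which primitives it can use:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-03-26",
    "capabilities": { "tools": {}, "resources": { "subscribe": true } },
    "serverInfo": { "name": "example-server", "version": "0.1.0" }
  }
}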

Transports: stdio, SSE (legacy), and Streamable HTTP

MCP is transport-agnostic, but current practice converges on three modes.

Stdio (local, safest)

The host spawns the server as a local process and communicates through stdin/stdout. It is the lowest-latency and most secure option because nothing is exposed over the network. The tradeoff is that it does not scale to remote deployments.

Operational note: when using stdio, never write debug logs to stdout. Use stderr for logs; stdout is reserved for JSON-RPC messages only.
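For example, in Python you can route all logging to stderr up front:

import logging
import sys

# Send logs to stderr; stdout must carry only JSON-RPC frames.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logging.info("server starting")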

SSE over HTTP (legacy)

Server-Sent Events provide a one-way stream from server to client, with client-to-server messages sent as separate HTTP POST requests. It works, but the two channels complicate session handling and advanced features. Most modern MCP stacks treat SSE as compatibility-only.

Streamable HTTP (preferred for remote)

Streamable HTTP uses a single endpoint that supports bidirectional streaming. It is simpler to deploy, friendlier to load balancers, and better for reconnection logic. Newer MCP SDKs, including FastMCP, support it for remote deployments.
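As a sketch, recent FastMCP releases let you select the transport when starting the server (the exact transport name, "http" or "streamable-http", and the host/port values depend on your SDK version and deployment):

from fastmcp import FastMCP

mcp = FastMCP("RemoteServer")

if __name__ == "__main__":
    # Serve over Streamable HTTP instead of the stdio default.
    mcp.run(transport="http", host="0.0.0.0", port=8000)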

JSON-RPC 2.0: the message contract

MCP uses JSON-RPC 2.0 for tool calls, resource requests, and notifications. The structure is explicit and machine-validated, which eliminates the ambiguity of text-only integrations.

Example: tools/call request

{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": {
      "city": "Santiago",
      "country": "CL"
    }
  }
}

Response

{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [
      { "type": "text", "text": "Clear skies, 26 °C" }
    ],
    "isError": false
  }
}

Because tools advertise their JSON schemas up front, models can construct valid calls without ad-hoc prompt engineering.
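For instance, a tools/list response declares each tool's input schema before any call is made (abridged sketch):

{
  "jsonrpc": "2.0",
  "id": 7,
  "result": {
    "tools": [
      {
        "name": "get_weather",
        "description": "Current weather for a city",
        "inputSchema": {
          "type": "object",
          "properties": {
            "city": { "type": "string" },
            "country": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    ]
  }
}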

MCP primitives: resources, tools, prompts, sampling

MCP’s power comes from four primitives that cover most context workflows.

Resources (read-only context)

Resources are passive data objects, identified by URIs. Think file:///logs/system.log or postgres://db/tables/orders/schema. Servers can also expose resource templates like postgres://db/tables/{table}/schema to support parameterized access. Clients may subscribe to resources and receive notifications when they change.
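A read is a plain JSON-RPC call. A sketch of resources/read:

{
  "jsonrpc": "2.0",
  "id": 12,
  "method": "resources/read",
  "params": { "uri": "file:///logs/system.log" }
}

and a matching result:

{
  "jsonrpc": "2.0",
  "id": 12,
  "result": {
    "contents": [
      { "uri": "file:///logs/system.log", "mimeType": "text/plain", "text": "..." }
    ]
  }
}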

Tools (actions and queries)

Tools are executable functions. They can query a database, write a file, or create a ticket. Each tool exposes a JSON schema so the LLM can call it with structured parameters. Tool errors are returned explicitly with isError: true so the model can recover and retry with different inputs.
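An illustrative error result (the message text is invented); the failure reaches the model as data it can act on, not as a transport exception:

{
  "jsonrpc": "2.0",
  "id": 43,
  "result": {
    "content": [
      { "type": "text", "text": "Unknown city 'Sntiago'. Closest match: 'Santiago'." }
    ],
    "isError": true
  }
}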

Prompts (guided flows)

Prompts are server-defined interactions that appear as slash commands. They package context and a system instruction for consistent workflows, such as “Generate a commit message” or “Summarize this file.” Prompts can request arguments from the user and then send a pre-filled conversation to the model.
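A minimal sketch using FastMCP's prompt decorator (the prompt name and wording are illustrative):

from fastmcp import FastMCP

mcp = FastMCP("GitHelper")

@mcp.prompt()
def commit_message(diff: str) -> str:
    """Exposed to the host as a slash-command-style prompt."""
    return (
        "Write a concise, imperative-mood commit message "
        f"for the following diff:\n\n{diff}"
    )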

Sampling (server asks the model)

Sampling reverses control: a server can request the host’s LLM to complete a task. For example, a server reads a long document, asks the model to summarize it, and then stores the summary back into a database. The host typically prompts the user for permission before fulfilling the sampling request.
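A sketch of that flow, assuming a recent FastMCP release where Context exposes a sample helper (the in-memory summaries store stands in for a real database):

from fastmcp import FastMCP, Context

mcp = FastMCP("Summarizer")

summaries: dict[str, str] = {}  # in-memory store for the sketch

@mcp.tool()
async def summarize_and_store(path: str, ctx: Context) -> str:
    """Read a document, ask the host's LLM for a summary, keep the result."""
    with open(path) as f:
        text = f.read()
    # The host typically asks the user before fulfilling this sampling request.
    result = await ctx.sample(f"Summarize in five bullet points:\n\n{text}")
    summaries[path] = result.text
    return result.text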

Building a server with FastMCP

FastMCP (Python) streamlines server development with decorators and automatic schema generation. The following example exposes a tool and a resource template.

from fastmcp import FastMCP

mcp = FastMCP("FinanceOps")

@mcp.tool()
def compound_interest(principal: float, rate: float, years: int) -> str:
    """Compute compound interest; rate is a decimal (e.g., 0.05 for 5%)."""
    total = principal * ((1 + rate) ** years)
    return f"Total after {years} years: ${total:.2f}"

@mcp.resource("finance://rates/{country}")
def reference_rate(country: str) -> str:
    """Resolve the template parameter to a reference rate for a country code."""
    rates = {"US": "5.25%", "EU": "4.50%", "CL": "7.25%"}
    return rates.get(country, "Rate not available")

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport

Key development tips:

  • Use stderr for logs to avoid corrupting the stdio stream.
  • Pin dependencies and point the host at a stable interpreter path (for example, a virtualenv's python binary) when running in desktop hosts.
  • Keep servers small and single-purpose to reduce blast radius.
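Before wiring a server into a host, you can exercise it interactively with the MCP Inspector (assuming Node.js is available):

npx @modelcontextprotocol/inspector python server.py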

Connecting MCP servers to clients

Claude Desktop

Claude Desktop reads a JSON config that declares MCP servers. Use absolute paths and environment variables for secrets.

{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "<token>"
      }
    },
    "financeops": {
      "command": "/Users/you/finance/.venv/bin/python",
      "args": ["/Users/you/finance/server.py"],
      "env": {
        "DB_URL": "postgres://user:pass@localhost:5432/db"
      }
    }
  }
}

Zed Editor

Zed supports MCP servers via context_servers in settings.

{
  "context_servers": {
    "docker-server": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "mcp/server-image"],
      "env": {}
    }
  }
}

Docker MCP Toolkit

Docker Desktop’s MCP Toolkit lets you run servers as containers and connect them to Claude without managing local dependencies. It is ideal for teams that want reproducible, sandboxed servers with OAuth flows handled by Docker.

Security and governance

MCP makes AI more powerful, which means security has to be explicit.

  • Human-in-the-loop: hosts gate sensitive tools and request confirmation for destructive actions.
  • Least privilege: servers should enforce user-level access, even if the request comes from an LLM.
  • Confused deputy risks: prompt injection in retrieved content can steer the model into acting on an attacker's behalf. Always validate permissions on the server side.
  • OAuth for remote servers: modern MCP specs emphasize OAuth 2.1 instead of static API keys for production deployments.

Governance is not an add-on. It is the difference between a safe agent and an autonomous attack surface.
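To make the least-privilege point concrete, here is a minimal sketch of a server-side permission gate (the ALLOWED_WRITERS allowlist and the user parameter are assumptions for illustration; a real deployment would resolve identity from the OAuth token, not from a tool argument):

from fastmcp import FastMCP

mcp = FastMCP("TicketOps")

ALLOWED_WRITERS = {"alice@example.com"}  # assumption: static allowlist for the sketch

@mcp.tool()
def close_ticket(ticket_id: str, user: str) -> str:
    """Close a ticket only if the requesting user is authorized."""
    # Validate on the server: never trust that the host or model pre-checked.
    if user not in ALLOWED_WRITERS:
        return f"Permission denied: {user} may not close tickets."
    return f"Ticket {ticket_id} closed."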

Strategic adoption: where to start

If you are new to MCP, start with one high-value, low-risk server:

  1. Read-only data source (Postgres, analytics warehouse, docs).
  2. Low-risk tools (search, summarization, metadata extraction).
  3. A single host (Claude Desktop or Zed) to validate workflows.

Once you see consistent value, formalize a small internal catalog of MCP servers and standardize deployment. MCP’s main strategic benefit is portability: when you switch models or hosts, your context layer stays intact.

Where MCP fits in your architecture

MCP is not a replacement for APIs or service boundaries. It is a context layer that makes existing systems accessible to AI hosts in a standardized way. A good rule: keep business logic in your services and expose the minimum context and actions through MCP.

Practical patterns that work well:

  • Read vs write separation: expose read-only data through one server and write actions through another. This allows you to gate destructive tools without slowing safe lookups.
  • Domain-based servers: one server per domain (tickets, incidents, analytics, repos) keeps ownership clear and makes audits easier.
  • Context shaping: do not stream raw tables to the model. Add server-side filters and summaries so the model receives the smallest useful slice.
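A sketch of context shaping: the tool aggregates server-side and returns a compact summary instead of raw rows (the ORDERS list is stubbed for illustration):

from collections import Counter
from fastmcp import FastMCP

mcp = FastMCP("OrdersAnalytics")

# Stub data standing in for a real orders table.
ORDERS = [
    {"status": "shipped", "total": 120.0},
    {"status": "pending", "total": 40.0},
    {"status": "shipped", "total": 75.5},
]

@mcp.tool()
def orders_overview() -> str:
    """Return counts and revenue, not the raw table."""
    by_status = Counter(o["status"] for o in ORDERS)
    revenue = sum(o["total"] for o in ORDERS)
    lines = [f"{status}: {n}" for status, n in by_status.items()]
    return f"Revenue: ${revenue:.2f}\n" + "\n".join(lines)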

Here is a simple decision matrix:

Pattern               | Benefit                            | Risk
Read-only server      | Fast adoption, low risk            | Limited automation
Read + write split    | Safe actions with clear approvals  | More servers to manage
One server per domain | Clear ownership                    | More deployment overhead

If you already have internal APIs, MCP can wrap them instead of duplicating logic. That gives you AI access without replatforming core services.

Operational checklist for production

Before you roll MCP into production workflows, treat servers like any other internal service:

  • Authentication and authorization: validate end-user permissions in the server, not just in the host UI.
  • Secrets management: load tokens from environment variables or a secret manager, never hardcode them.
  • Rate limits and timeouts: protect upstream systems from runaway tool calls.
  • Observability: log tool usage, latency, and error types. Add trace IDs when possible.
  • Caching and pagination: limit payload size and reduce load on data sources.
  • Fallback behavior: return clear errors with suggested next steps so the model can retry.
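A sketch combining two of these items, timeouts and model-friendly fallback errors (fetch_orders stands in for any upstream call; asyncio.timeout requires Python 3.11+):

import asyncio
from fastmcp import FastMCP

mcp = FastMCP("OrdersGateway")

async def fetch_orders(customer_id: str) -> list[str]:
    """Stand-in for a real upstream call."""
    await asyncio.sleep(0.1)
    return [f"order-1 for {customer_id}"]

@mcp.tool()
async def list_orders(customer_id: str) -> str:
    try:
        async with asyncio.timeout(5):  # cap upstream latency
            orders = await fetch_orders(customer_id)
    except TimeoutError:
        # Clear, actionable error so the model can adjust and retry.
        return "Upstream timed out after 5s; retry with a narrower filter."
    return "\n".join(orders)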

MCP reduces integration complexity, but it does not remove operational discipline. Treat every server as production infrastructure.

Conclusion

MCP is more than another integration API. It is a protocol that reorganizes AI systems around interoperability, explicit contracts, and composability. It reduces the N x M integration problem to a clean one-to-many model, enabling agents that can read, reason, and act without brittle glue code.

For teams building AI systems in production, MCP is the most practical path from “smart chat” to “operational intelligence.” Adopt it early, keep servers focused, and invest in governance. The payoff is an AI stack that can evolve without reengineering every connection.
