MCP Mesh Architecture and Design¶

Understanding the core architecture, design principles, and usage patterns of MCP Mesh

Overview¶

MCP Mesh is a distributed service orchestration framework built on top of the Model Context Protocol (MCP) that enables seamless dependency injection, service discovery, and inter-service communication. The architecture combines familiar FastMCP development patterns with powerful mesh orchestration capabilities.

Glossary¶

Key terms used throughout MCP Mesh documentation:

Term	Definition
Agent	A Python application that registers with the mesh and provides one or more tools. Each agent runs as an independent service.
Tool	A function decorated with `@app.tool()` and `@mesh.tool()` that can be discovered and called by other agents. Tools are the building blocks of MCP Mesh.
Capability	A unique identifier (string) that describes what a tool provides. Other tools declare dependencies on capabilities, not specific agents. Example: `"redis_store_memory"`, `"analyze_emotion"`.
Dependency	A capability that a tool requires. MCP Mesh automatically discovers and injects dependencies at runtime.
Registry	The central service that tracks all agents, their capabilities, and health status. Agents register on startup and send periodic heartbeats. Important: The registry is a facilitator, not a proxy—it helps agents find each other, but actual tool calls go directly between agents.
Heartbeat	Periodic signal sent by agents to the registry to indicate they're alive. Default interval is 15 seconds.
Proxy	An injected object (`McpMeshAgent`) that transparently handles communication with remote tools. You call it like a function; MCP Mesh handles the rest.
Tag	Metadata attached to tools for filtering during dependency resolution. Supports `+` (prefer) and `-` (avoid) operators.
MCP (Model Context Protocol)	The underlying protocol used for tool communication. MCP Mesh tools are standard MCP tools and can be invoked via HTTP using MCP's `tools/list` and `tools/call` endpoints.

About FastMCP¶

MCP Mesh uses FastMCP under the hood to handle MCP protocol details. You don't need to learn FastMCP separately—just use the @app.tool() decorator to define tools and MCP Mesh handles everything else.

What FastMCP provides (handled automatically):

MCP protocol serialization/deserialization
HTTP server for tool endpoints
Tool schema generation from Python type hints

What MCP Mesh adds:

Service discovery and registration
Automatic dependency injection
Health monitoring and heartbeats
LLM integration (@mesh.llm)
Multi-agent orchestration

MCP Compatibility:

MCP Mesh tools are standard MCP tools. They can be invoked using the meshctl CLI or any MCP client:

# List registered agents
meshctl list

# Call a tool (discovers agent via registry)
meshctl call my_tool '{"param":"value"}'

# Call with explicit agent
meshctl call my-agent:my_tool '{"param":"value"}'

Using curl directly (JSON-RPC)

# List available tools
curl -s -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# Call a tool
curl -s -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"my_tool","arguments":{"param":"value"}}}'

Core Architecture¶

High-Level Components¶

┌─────────────────────────────────────────────────────────────────┐
│                        MCP Mesh Ecosystem                       │
├─────────────────────────────────────────────────────────────────┤
│              ┌───────────────────────────────────┐              │
│              │           Redis                   │              │
│              │      (Session Storage)            │              │
│              │   session:* keys for stickiness   │              │
│              └─────────────┬─────────────────────┘              │
│                            │                                    │
│         ┌──────────────────┼──────────────────┐                 │
│         │                  │                  │                 │
│         ▼                  ▼                  ▼                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
│  │   Agent A   │  │   Agent B   │  │   Agent C   │              │
│  │             │◄─┼─────────────┼─►│             │              │
│  │ ┌─────────┐ │  │ ┌─────────┐ │  │ ┌─────────┐ │              │
│  │ │FastMCP  │◄┼──┼►│FastMCP  │◄┼──┼►│FastMCP  │ │              │
│  │ │Server   │ │  │ │Server   │ │  │ │Server   │ │              │
│  │ └─────────┘ │  │ └─────────┘ │  │ └─────────┘ │              │
│  │ ┌─────────┐ │  │ ┌─────────┐ │  │ ┌─────────┐ │              │
│  │ │Mesh     │ │  │ │Mesh     │ │  │ │Mesh     │ │              │
│  │ │Runtime  │ │  │ │Runtime  │ │  │ │Runtime  │ │              │
│  │ │(Inject) │ │  │ │(Inject) │ │  │ │(Inject) │ │              │
│  │ └─────────┘ │  │ └─────────┘ │  │ └─────────┘ │              │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘              │
│         │                │                │                     │
│         │ Heartbeat      │ Heartbeat      │ Heartbeat           │
│         │ + Discovery    │ + Discovery    │ + Discovery         │
│         │                │                │                     │
│         └────────────────┼────────────────┘                     │
│                          ▼                                      │
│                  ┌─────────────┐                                │
│                  │   Registry  │                                │
│                  │ (Background)│                                │
│                  │ ┌─────────┐ │                                │
│                  │ │Service  │ │                                │
│                  │ │Discovery│ │                                │
│                  │ │         │ │                                │
│                  │ │SQLite DB│ │                                │
│                  │ └─────────┘ │                                │
│                  │ ┌─────────┐ │                                │
│                  │ │Health   │ │                                │
│                  │ │Monitor  │ │                                │
│                  │ └─────────┘ │                                │
│                  └─────────────┘                                │
│                                                                 │
│  Direct MCP JSON-RPC calls between FastMCP servers              │
│  ◄──────────────────────────────────────────────────────────►   │
│  Registry for discovery, Redis for session stickiness           │
└─────────────────────────────────────────────────────────────────┘

Component Responsibilities¶

1. Agents (Python Runtime)¶

FastMCP Integration: Native MCP protocol support for direct agent-to-agent communication
Mesh Runtime: Background dependency injection and proxy creation
Auto-Discovery: Automatic capability registration with registry
Health Monitoring: Periodic heartbeats to registry (background process)

2. Registry (Go Service)¶

Service Discovery: Centralized capability and endpoint registry (background coordination)
Health Tracking: Agent health monitoring and failure detection
Dependency Resolution: Smart capability matching with tags at startup
Load Balancing: Multiple providers for same capability selection

3. meshctl CLI (Go Binary)¶

Lifecycle Management: Start, stop, monitor agents
Development Tools: File watching, auto-restart, debugging
Registry Operations: Query services, check health, troubleshoot

4. Redis (Session Storage)¶

Session Affinity: Maps session IDs to pod IPs for stateful operations
Distributed State: Enables session stickiness across multiple pod replicas
Graceful Fallback: Agents fall back to in-memory storage if Redis unavailable
TTL Management: Automatic session cleanup and expiration

Key Insight: Background Orchestration¶

MCP Mesh operates as background infrastructure:

Discovery Phase: Registry helps agents find each other during startup
Runtime Phase: Direct FastMCP-to-FastMCP communication (no proxy)
Monitoring Phase: Continuous health checks and capability updates in background

MCP Mesh Agents and Tools¶

MCP Mesh agents are simple Python scripts or modules that define one or more MCP tools. While tools communicate via the MCP protocol, all networking and communication complexity is abstracted away by MCP Mesh.

What is an MCP Mesh Agent?¶

An agent is a Python file that:

Creates a FastMCP app
Defines tools using the dual decorator pattern
Registers with the mesh using @mesh.agent

import mesh
from fastmcp import FastMCP

app = FastMCP("My Agent")

@app.tool()           # FastMCP decorator - defines MCP tool
@mesh.tool(           # Mesh decorator - adds orchestration
    capability="greeting",
    dependencies=["time_service"]
)
async def greet(name: str, time_service: mesh.McpMeshAgent = None):
    """Greet someone with the current time."""
    current_time = await time_service() if time_service else "unknown"
    return f"Hello {name}! The time is {current_time}"

@mesh.agent(
    name="greeting-agent",
    http_port=8080,
    enable_http=True,
    auto_run=True,
)
class GreetingAgent:
    pass

That's all the code you need. MCP Mesh handles:

Service discovery - Finding other agents in the network
Dependency injection - Automatically wiring dependencies
Health monitoring - Heartbeats and failure detection
HTTP exposure - Tools callable via REST API

Types of Tools¶

MCP Mesh supports four types of tools, each with its own decorator:

Decorator	Purpose	Use Case
`@mesh.tool`	Standard MCP tool with dependency injection	Data processing, storage, utilities
`@mesh.llm`	LLM-powered tool with agentic capabilities	AI analysis, content generation, reasoning
`@mesh.llm_provider`	Zero-code LLM vendor integration	Claude, GPT-4, Gemini providers
`@mesh.route`	FastAPI route with tool injection	REST APIs, webhooks, external interfaces

1. Standard Tools (`@mesh.tool`)¶

Standard tools are the building blocks of MCP Mesh. They expose capabilities that other agents can discover and call.

@app.tool()
@mesh.tool(
    capability="redis_store_memory",
    description="Store a memory in Redis",
    version="1.0.0",
    tags=["redis", "storage", "write"],
)
async def store_memory(
    user_email: str,
    content: str,
    importance: int,
) -> dict:
    """Store a new memory in Redis."""
    key = f"memory:{user_email}:{timestamp}"
    await redis_client.set(key, json.dumps({
        "content": content,
        "importance": importance,
    }))
    return {"status": "success", "key": key}

Key concepts:

Capability: A unique identifier that other tools use to declare dependencies
Tags: Metadata for filtering during dependency resolution
Version: Semantic versioning for compatibility matching
Dependencies: List of capabilities this tool requires

Dependency Injection¶

When a tool declares dependencies, MCP Mesh automatically resolves and injects them:

@app.tool()
@mesh.tool(
    capability="update_emotion",
    dependencies=["analyze_emotion"],  # Depends on another capability
)
async def update_emotion(
    user_email: str,
    avatar_id: str,
    analyze: mesh.McpMeshAgent = None,  # Injected automatically
) -> dict:
    """Update emotion state using LLM analysis."""

    # Call the injected dependency like a function
    result = await analyze(
        user_email=user_email,
        avatar_id=avatar_id,
    )

    # Use the result
    return {"emotion": result.emotion, "intensity": result.intensity}

How it works:

Agent starts and registers its capabilities with the registry
MCP Mesh discovers agents that provide required dependencies
Proxy objects (mesh.McpMeshAgent) are injected at runtime
Calling the proxy transparently invokes the remote tool via MCP
You don't manage URLs, connections, retries, or failures

Advanced Tag Matching¶

Tags support prefix operators for fine-grained control:

@mesh.tool(
    capability="analyze",
    dependencies=[
        "storage",              # Must have "storage" tag
        {"capability": "llm", "tags": ["+claude", "-gpt"]},  # Prefer Claude, avoid GPT
    ]
)

Operator	Meaning	Example
(none)	Must have tag	`"storage"`
`+`	Prefer if available	`"+claude"`
`-`	Avoid if possible	`"-expensive"`

2. LLM Tools (`@mesh.llm`)¶

LLM tools add intelligence to any MCP tool. They can:

Execute agentic loops with tool calling
Use dynamic system prompts (Jinja2 templates)
Filter available tools for the LLM
Return structured JSON or plain text

from pydantic import BaseModel, Field

class EmotionAnalysis(BaseModel):
    """Structured response from emotion analysis."""
    emotion: str = Field(..., description="Primary emotion")
    intensity: float = Field(..., ge=0.0, le=1.0)
    reasoning: str = Field(..., description="Brief explanation")

@app.tool()
@mesh.llm(
    provider={"capability": "llm", "tags": ["llm", "+gpt"]},  # Prefer GPT for speed
    max_iterations=1,                                          # Single LLM call
    system_prompt="file://prompts/emotion_analysis.jinja2",   # Dynamic prompt
    context_param="emotion_ctx",                               # Context injection
    response_format="json",                                    # Structured output
)
@mesh.tool(
    capability="analyze_emotion",
    tags=["emotion", "analysis", "llm"],
)
async def analyze_emotion(
    emotion_ctx: EmotionContext,
    llm: mesh.MeshLlmAgent = None,
) -> EmotionAnalysis:
    """Analyze avatar's emotional state from conversation."""
    return await llm("Analyze the avatar's emotional state")

LLM Decorator Options¶

Option	Description	Example
`provider`	Which LLM to use (capability + tags)	`{"capability": "llm", "tags": ["+claude"]}`
`max_iterations`	Max tool-calling loops	`10` for agentic, `1` for single call
`system_prompt`	Static string or Jinja2 file	`"file://prompts/system.jinja2"`
`context_param`	Parameter name for template context	`"emotion_ctx"`
`response_format`	Output format	`"json"`, `"text"`
`filter`	Filter available tools for LLM	`{"tags": ["memory-agent"]}`
`filter_mode`	How to apply filter	`"all"`, `"any"`

Agentic Tool Calling¶

LLM tools can call other tools in a loop:

@app.tool()
@mesh.llm(
    filter={"tags": ["memory-agent", "recall"]},  # Only memory tools available
    filter_mode="all",
    provider={"capability": "llm", "tags": ["+claude"]},
    max_iterations=10,  # Allow up to 10 tool calls
    system_prompt="file://prompts/avatar.jinja2",
    context_param="avatar_ctx",
    response_format="text",
)
@mesh.tool(
    capability="generic_avatar_respond",
    tags=["avatar", "companion"],
)
async def generic_respond(
    avatar_ctx: AvatarContext,
    llm: mesh.MeshLlmAgent = None,
) -> str:
    """Generate avatar response with memory recall capability."""
    # LLM can call memory_recall tool during its reasoning
    return await llm(avatar_ctx.conversation_history)

The LLM sees filtered tools and can call them autonomously. MCP Mesh handles the tool execution loop.

3. LLM Providers (`@mesh.llm_provider`)¶

LLM providers make any AI vendor available to the mesh with zero code:

@mesh.llm_provider(
    model="anthropic/claude-sonnet-4-5",
    capability="llm",
    tags=["llm", "claude", "anthropic", "sonnet", "provider"],
    version="1.0.0",
)
def claude_provider():
    """Claude provider - automatically creates process_chat endpoint."""
    pass  # Implementation is in the decorator

@mesh.agent(
    name="claude-provider",
    http_port=9110,
    enable_http=True,
    auto_run=True,
    health_check=claude_health_check,
)
class ClaudeProviderAgent:
    pass

What it provides:

Automatic process_chat(request: MeshLlmRequest) tool
LiteLLM integration for unified API
Error handling and retries
Mesh registration with capability and tags

Multiple Providers¶

Deploy multiple providers and let tools choose at runtime:

# claude_provider.py
@mesh.llm_provider(
    model="anthropic/claude-sonnet-4-5",
    tags=["llm", "claude", "anthropic", "expensive", "smart"],
)
def claude_provider():
    pass

# openai_provider.py
@mesh.llm_provider(
    model="openai/gpt-4o",
    tags=["llm", "openai", "gpt", "fast", "cheap"],
)
def openai_provider():
    pass

Tools select providers using tag matching:

@mesh.llm(
    # Use Claude for complex reasoning
    provider={"capability": "llm", "tags": ["+claude", "+smart"]},
)
async def complex_analysis(...): ...

@mesh.llm(
    # Use GPT for fast, simple tasks
    provider={"capability": "llm", "tags": ["+gpt", "+fast"]},
)
async def quick_extraction(...): ...

4. FastAPI Routes (`@mesh.route`)¶

The @mesh.route decorator lets FastAPI endpoints call MCP tools as regular Python functions:

from fastapi import APIRouter, Request
from mesh.types import McpMeshAgent

router = APIRouter(prefix="/chat", tags=["chat"])

@router.post("/completions")
@mesh.route(dependencies=["avatar_chat"])
async def chat_completions(
    request: Request,
    chat_request: ChatCompletionRequest,
    avatar_id: str = "maya-creative",
    avatar_agent: McpMeshAgent = None,  # Injected by @mesh.route
):
    """OpenAI-compatible chat endpoint."""

    # Extract user from JWT
    user_info = require_user_from_request(request)

    # Call MCP tool like a regular function
    result = await avatar_agent(
        user_email=user_info["email"],
        message=chat_request.messages[-1].content,
        avatar_id=avatar_id,
    )

    return ChatCompletionResponse(
        choices=[{"message": {"content": result["message"]}}]
    )

Benefits:

No network code: MCP Mesh handles connections
Type safety: Pydantic models work transparently
Dependency injection: Tools injected like FastAPI dependencies
Decoupled architecture: Routes don't know where tools run

Putting It All Together¶

Here's how these components work together in a real system (Maya):

┌──────────────────────────────────────────────────────────────────┐
│                     FastAPI Backend                               │
│  @mesh.route(dependencies=["avatar_chat"])                       │
│  POST /chat/completions → avatar_agent()                         │
└──────────────────────────────────┬───────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────┐
│                    Orchestrator Agent                             │
│  @mesh.tool(capability="avatar_chat",                            │
│             dependencies=["generic_avatar_respond",              │
│                          "memory_recall", "update_emotion"])     │
└──────────┬───────────────────────┬───────────────────────────────┘
           │                       │
           ▼                       ▼
┌─────────────────────┐  ┌─────────────────────────────────────────┐
│   Avatar Agent      │  │          Memory LLM Agent               │
│  @mesh.llm(         │  │  @mesh.llm(provider={"+gpt"})           │
│    provider={"+claude"}│  │  @mesh.tool(capability="memory_recall")│
│  )                  │  │                                         │
└─────────┬───────────┘  └─────────────────┬───────────────────────┘
          │                                │
          ▼                                ▼
┌─────────────────────┐           ┌─────────────────────┐
│   Claude Provider   │           │   OpenAI Provider   │
│  @mesh.llm_provider │           │  @mesh.llm_provider │
│  model="claude..."  │           │  model="gpt-4o"     │
└─────────────────────┘           └─────────────────────┘

Data flow:

HTTP request hits FastAPI route
@mesh.route injects avatar_agent (Orchestrator)
Orchestrator calls Avatar Agent with injected dependencies
Avatar Agent uses Claude via @mesh.llm
Memory LLM uses GPT for fast extraction
Response flows back through the chain

All of this happens with:

Zero network code in your tools
Automatic service discovery
Runtime dependency injection
Transparent LLM provider selection

Design Principles¶

1. True Resilient Architecture¶

MCP Mesh implements a fundamentally resilient architecture where agents operate independently and enhance each other when available:

Core Resilience Principles:

Standalone Operation: Agents function as vanilla FastMCP servers without any dependencies
Registry as Facilitator: Registry enables discovery and wiring, but agents don't depend on it
Dynamic Enhancement: Agents get enhanced capabilities when other agents are available
Graceful Degradation: Loss of registry or other agents doesn't break existing functionality
Self-Healing: Agents automatically reconnect and refresh when components return

Architecture Flow:

Agent Startup → Works Standalone (FastMCP mode)
       ↓
Registry Available → Agents Get Wired → Enhanced Capabilities
       ↓
Registry Down → Agents Continue Working → Direct MCP Communication Preserved
       ↓
Registry Returns → Agents Refresh → Topology Updates Resume

2. Dual Decorator Pattern¶

MCP Mesh uses a dual decorator approach that preserves FastMCP familiarity while adding mesh orchestration:

@app.tool()      # ← FastMCP: MCP protocol handling
@mesh.tool(      # ← Mesh: Dependency injection + orchestration
    capability="weather_data",
    dependencies=["time_service"]
)
def get_weather(time_service: Any = None) -> dict:
    # Business logic here

Benefits:

Familiar Development: Developers keep using FastMCP patterns
Enhanced Capabilities: Mesh adds dependency injection seamlessly
Zero Boilerplate: No manual server management or configuration
Gradual Adoption: Can add mesh features incrementally

3. Enhanced Proxy System¶

MCP Mesh provides automatic proxy configuration from decorator kwargs:

@mesh.tool(
    capability="enhanced_service",
    timeout=60,                    # Auto-configures proxy timeout
    retry_count=3,                 # Auto-configures retry policy
    streaming=True,                # Auto-selects streaming proxy
    auth_required=True             # Auto-enables authentication
)
def enhanced_tool():
    pass

Proxy Types:

EnhancedMCPClientProxy: Timeout, retry, auth auto-configuration
EnhancedFullMCPProxy: Streaming auto-selection + session management
Standard Proxies: Backward compatibility for simple tools

Implementation: See src/runtime/python/_mcp_mesh/engine/ for proxy classes

4. Session Management and Stickiness¶

For stateful operations, MCP Mesh provides automatic session affinity:

@mesh.tool(
    capability="stateful_counter",
    session_required=True,         # Enables session stickiness
    stateful=True,                 # Marks as stateful operation
    auto_session_management=True   # Automatic session lifecycle
)
def increment_counter(session_id: str, increment: int = 1):
    # Automatically routed to same pod for this session

Session Features:

Redis-Backed Storage: Distributed session affinity across pods
Automatic Routing: Requests with same session_id go to same pod
Graceful Fallback: In-memory storage when Redis unavailable
TTL Management: Automatic session cleanup

Implementation: See src/runtime/python/_mcp_mesh/engine/http_wrapper.py

5. Fast Heartbeat Architecture¶

Optimized health monitoring with dual-heartbeat system:

HEAD /heartbeat    # Lightweight timestamp update (5s intervals)
POST /heartbeat    # Full registration when triggered by HEAD response

Benefits:

Fast Failure Detection: Sub-20s failure detection
Network Efficiency: Minimal bandwidth usage
On-Demand Registration: Full updates only when needed

Implementation: See cmd/registry/ for heartbeat handling

Implementation Architecture¶

Two-Pipeline Design¶

MCP Mesh uses a sophisticated two-pipeline architecture that separates initialization from runtime operations:

Startup Pipeline (One-time Execution)¶

DecoratorCollectionStep → ConfigurationStep → HeartbeatPreparationStep →
FastMCPServerDiscoveryStep → HeartbeatLoopStep → FastAPIServerSetupStep

Purpose: Initialize agent, collect decorators, prepare for mesh integration Trigger: Agent startup, decorator debounce completion Outcome: Agent ready with capabilities registered and dependency injection configured

Implementation: See src/runtime/python/_mcp_mesh/pipeline/startup/

Heartbeat Pipeline (Continuous Loop)¶

RegistryConnectionStep → HeartbeatSendStep → DependencyResolutionStep

Purpose: Maintain mesh connectivity, update dependency topology Trigger: Periodic execution (30s intervals) Outcome: Updated dependency proxies, health status maintained

Implementation: See src/runtime/python/_mcp_mesh/pipeline/heartbeat/

Decorator Processing and Debounce Coordination¶

Challenge: Decorators are processed as Python imports the module, but we need to wait for all decorators before starting mesh processing.

Solution: Debounce coordinator with configurable delay

@mesh.tool()  # Triggers debounce timer
def tool1(): pass

@mesh.tool()  # Resets debounce timer
def tool2(): pass

# After MCP_MESH_DEBOUNCE_DELAY (default 1.0s) with no new decorators:
# → Startup pipeline begins

Design Benefits:

Handles dynamic decorator registration during import
Prevents race conditions in multi-decorator modules
Configurable timing via MCP_MESH_DEBOUNCE_DELAY

Implementation: See src/runtime/python/_mcp_mesh/engine/debounce_coordinator.py

Dependency Resolution and Proxy Architecture¶

Function Caching Strategy¶

Key Insight: Mesh decorators must process BEFORE FastMCP decorators to cache original functions.

@mesh.tool()  # ← Processes first, caches original function
@app.tool()   # ← FastMCP processes wrapped function
def hello(): pass

Implementation:

Mesh decorator caches func._mesh_original_func = func
Creates dependency injection wrapper
FastMCP receives wrapper (not original)
Runtime calls cached original with injected dependencies

Proxy Selection Logic¶

Registry-Driven: Heartbeat response determines proxy type based on dependency location and configuration.

if current_agent_id == target_agent_id:
    # Same agent - direct local call
    proxy = SelfDependencyProxy(original_func, function_name)
else:
    # Different agent - MCP JSON-RPC call
    if has_enhanced_config:
        proxy = EnhancedMCPClientProxy(endpoint, func_name, kwargs_config)
    else:
        proxy = MCPClientProxy(endpoint, func_name)

Enhanced Proxy Auto-Selection:

streaming=True → EnhancedFullMCPProxy
session_required=True → EnhancedFullMCPProxy with session management
Custom timeout/retry → EnhancedMCPClientProxy
Simple tools → Standard MCPClientProxy

Implementation: See src/runtime/python/_mcp_mesh/pipeline/heartbeat/dependency_resolution.py

Hash-Based Change Detection¶

Performance Optimization: Only update dependency injection when topology actually changes.

def _hash_dependency_state(dependency_state):
    state_str = json.dumps(dependency_state, sort_keys=True)
    return hashlib.sha256(state_str.encode()).hexdigest()

# In heartbeat pipeline:
current_hash = _hash_dependency_state(response.dependencies)
if current_hash == _last_dependency_hash:
    return  # Skip expensive dependency injection update

Benefits:

Eliminates unnecessary proxy recreation
Reduces CPU overhead in stable topologies
Enables high-frequency heartbeats without performance penalty

Registry as Facilitator Pattern¶

Design Philosophy: Registry coordinates but never controls agent execution.

Registry Responsibilities:

Accept agent registrations via heartbeat
Store capability metadata in SQLite
Resolve dependencies and return topology
Monitor health and mark unhealthy agents
Generate audit events

What Registry NEVER Does:

Make calls to agents (only agents call registry)
Control agent lifecycle
Proxy or intercept agent-to-agent communication
Store business logic or state

Agent Autonomy: Agents poll registry for updates but operate independently. Registry failure doesn't break existing agent-to-agent connections.

Implementation: See cmd/registry/ for Go registry service

Usage Patterns¶

Basic Agent Development¶

1. Simple Tool Creation:

from fastmcp import FastMCP
import mesh

app = FastMCP("My Service")

@app.tool()
@mesh.tool(capability="greeting")
def say_hello(name: str) -> str:
    return f"Hello, {name}!"

@mesh.agent(name="greeting-service")
class GreetingAgent:
    pass

2. With Dependencies:

@app.tool()
@mesh.tool(
    capability="time_greeting",
    dependencies=["time_service"]
)
def time_greeting(name: str, time_service=None) -> str:
    current_time = time_service() if time_service else "unknown time"
    return f"Hello, {name}! Current time: {current_time}"

Enhanced Configuration¶

3. Production-Ready Tool:

@app.tool()
@mesh.tool(
    capability="secure_data_processor",
    timeout=120,                           # 2 min timeout
    retry_count=3,                         # Retry on failure
    auth_required=True,                    # Require authentication
    custom_headers={"X-Service": "data"},  # Custom headers
    streaming=True                         # Enable streaming
)
async def process_large_dataset(data_url: str):
    # Auto-configured with enhanced proxy features

4. Stateful Operations:

@app.tool()
@mesh.tool(
    capability="user_session",
    session_required=True,        # Enable session stickiness
    stateful=True,               # Mark as stateful
    timeout=30
)
def update_user_state(session_id: str, updates: dict):
    # Automatically routed to same pod for session consistency

Deployment Patterns¶

Local Development:

# Terminal 1: Start registry
meshctl registry start

# Terminal 2: Start your agent
python my_agent.py

Docker Compose:

version: "3.8"
services:
  registry:
    image: mcpmesh/registry:latest
    ports: ["8000:8000"]

  my-service:
    build: .
    environment:
      MCP_MESH_REGISTRY_URL: http://registry:8000
    depends_on: [registry]

Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: agent
          image: my-service:latest
          env:
            - name: MCP_MESH_REGISTRY_URL
              value: "http://mcp-registry:8000"
            - name: REDIS_URL
              value: "redis://redis:6379" # For session storage

Performance Characteristics¶

Scalability Metrics¶

Registry: 1000+ agents, 10,000+ capabilities, 100+ heartbeats/sec
Agent: 1000+ tool calls/sec, <2s startup time
Discovery: <10ms lookup time, <1s update propagation

Network Overhead¶

HEAD Heartbeat: ~200B per agent every 5 seconds
POST Heartbeat: ~2KB per agent when topology changes
Tool Calls: Standard MCP JSON-RPC (varies by payload)

Security Model¶

Current Architecture¶

Trusted Network Model: Assumes secure network environment
Service-to-Service: Direct HTTP communication between agents
No Built-in Auth: Authentication via proxy configuration

Production Recommendations¶

Network Segmentation: Use private networks or VPNs
Service Mesh: Deploy with Istio/Linkerd for mTLS
API Gateway: Use gateway for external access control
Enhanced Proxies: Use auth_required=True with bearer tokens

Extension Points¶

Custom Dependency Resolvers¶

class CustomDependencyResolver(DependencyResolver):
    async def resolve_capability(self, capability_spec):
        # Custom logic for finding capabilities
        candidates = await super().find_candidates(capability_spec)
        return self.apply_custom_selection(candidates)

Custom Health Checks¶

@mesh.tool(health_check=custom_health_check)
def database_tool():
    pass

async def custom_health_check():
    return {"status": "healthy", "connections": db.pool.size}

See src/runtime/python/_mcp_mesh/ for extension interfaces

Future Roadmap¶

Planned Features¶

Multi-Registry Federation: Cross-cluster service discovery
Circuit Breakers: Automatic failure isolation
Request Tracing: Distributed tracing integration
Metrics Collection: Prometheus/OpenTelemetry integration
Configuration Management: Dynamic configuration updates

Performance Optimizations¶

gRPC Support: Binary protocol for high-throughput scenarios
Connection Pooling: Efficient connection reuse between agents
Edge Caching: CDN-like caching for static capabilities

This architecture enables MCP Mesh to provide a seamless, scalable, and developer-friendly service orchestration platform that preserves the simplicity of FastMCP while adding powerful distributed system capabilities.

For Implementation Details: See source code in src/runtime/python/_mcp_mesh/ and cmd/registry/ For Examples: See examples/ directory for complete working examples For Configuration: See Environment Variables for all configuration options

MCP Mesh Architecture and Design¶

Overview¶

Glossary¶

About FastMCP¶

Core Architecture¶

High-Level Components¶

Component Responsibilities¶

1. Agents (Python Runtime)¶

2. Registry (Go Service)¶

3. meshctl CLI (Go Binary)¶

4. Redis (Session Storage)¶

Key Insight: Background Orchestration¶

MCP Mesh Agents and Tools¶

What is an MCP Mesh Agent?¶

Types of Tools¶

1. Standard Tools (@mesh.tool)¶

Dependency Injection¶

Advanced Tag Matching¶

2. LLM Tools (@mesh.llm)¶

LLM Decorator Options¶

Agentic Tool Calling¶

3. LLM Providers (@mesh.llm_provider)¶

Multiple Providers¶

4. FastAPI Routes (@mesh.route)¶

Putting It All Together¶

Design Principles¶

1. True Resilient Architecture¶

2. Dual Decorator Pattern¶

3. Enhanced Proxy System¶

4. Session Management and Stickiness¶

5. Fast Heartbeat Architecture¶

Implementation Architecture¶

Two-Pipeline Design¶

Startup Pipeline (One-time Execution)¶

Heartbeat Pipeline (Continuous Loop)¶

Decorator Processing and Debounce Coordination¶

Dependency Resolution and Proxy Architecture¶

Function Caching Strategy¶

Proxy Selection Logic¶

Hash-Based Change Detection¶

Registry as Facilitator Pattern¶

Usage Patterns¶

Basic Agent Development¶

Enhanced Configuration¶

Deployment Patterns¶

Performance Characteristics¶

Scalability Metrics¶

Network Overhead¶

Security Model¶

Current Architecture¶

Production Recommendations¶

Extension Points¶

Custom Dependency Resolvers¶

Custom Health Checks¶

Future Roadmap¶

Planned Features¶

Performance Optimizations¶

1. Standard Tools (`@mesh.tool`)¶

2. LLM Tools (`@mesh.llm`)¶

3. LLM Providers (`@mesh.llm_provider`)¶

4. FastAPI Routes (`@mesh.route`)¶