Hot Reload and Development Workflow¶

Accelerate development with automatic code reloading and efficient workflows

Overview¶

Hot reload automatically restarts your agents when you change code, eliminating the tedious cycle of stopping and restarting services manually. meshctl does not yet reload on its own, so this guide covers setting up hot reload with external file watchers, optimizing your development workflow, and best practices for rapid iteration.

Combined with proper tooling, hot reload can reduce your development cycle from minutes to seconds, allowing you to test changes instantly and maintain flow state.

Key Concepts¶

File Watching: Monitor Python files for changes
Graceful Restart: Agents restart without losing registry connection
Selective Reload: Only restart changed agents, not the entire mesh
State Preservation: Maintain debugging context across reloads
Workflow Optimization: Tools and practices for maximum productivity
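
The file-watching concept can be illustrated with a minimal polling sketch that uses only the standard library. The names `snapshot` and `changed_files` are hypothetical helpers, not part of MCP Mesh; tools such as entr or watchdog rely on operating-system notifications rather than polling, so treat this as an illustration of the idea only.

```python
from pathlib import Path


def snapshot(root: str, suffix: str = ".py") -> dict:
    """Map each matching file under root to its (mtime_ns, size) signature."""
    result = {}
    for path in Path(root).rglob(f"*{suffix}"):
        if path.is_file():
            stat = path.stat()
            result[str(path)] = (stat.st_mtime_ns, stat.st_size)
    return result


def changed_files(before: dict, after: dict) -> set:
    """Return paths that were added, removed, or modified between snapshots."""
    added_or_removed = before.keys() ^ after.keys()
    modified = {p for p in before.keys() & after.keys() if before[p] != after[p]}
    return set(added_or_removed) | modified
```

A watcher loop would take a snapshot, sleep briefly, take another, and restart the agent whenever `changed_files` returns a non-empty set.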

Step-by-Step Guide¶

Step 1: Enable Hot Reload¶

meshctl does not currently implement hot reload. Start the agent, then restart it manually after each change:

# Note: Hot reload is not currently implemented in meshctl
# For development, manually restart agents after changes

./bin/meshctl start examples/simple/my_agent.py

# To restart after changes:
# Ctrl+C to stop, then restart
./bin/meshctl start examples/simple/my_agent.py
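
The stop-and-restart cycle above can also be scripted. The sketch below is a hypothetical `ManagedProcess` helper built on the standard `subprocess` module; it is not an MCP Mesh API. It asks the child to terminate, waits briefly, and only kills it if it does not exit. The demo uses a sleeping Python process as a stand-in for the agent command.

```python
import subprocess
import sys


class ManagedProcess:
    """Start, stop, and restart one child process (hypothetical helper)."""

    def __init__(self, command, stop_timeout=5.0):
        self.command = command
        self.stop_timeout = stop_timeout
        self.process = None  # set by start()

    def start(self):
        self.process = subprocess.Popen(self.command)

    def stop(self):
        if self.process is None or self.process.poll() is not None:
            return  # never started, or already exited
        self.process.terminate()  # polite request first (SIGTERM on POSIX)
        try:
            self.process.wait(timeout=self.stop_timeout)
        except subprocess.TimeoutExpired:
            self.process.kill()  # force it if the request was ignored
            self.process.wait()

    def restart(self):
        self.stop()
        self.start()


if __name__ == "__main__":
    # Stand-in command; in practice this might be
    # ["./bin/meshctl", "start", "examples/simple/my_agent.py"].
    agent = ManagedProcess([sys.executable, "-c", "import time; time.sleep(60)"])
    agent.start()
    agent.restart()
    agent.stop()
```

Waiting for the old process to exit before starting the new one avoids two copies of the agent competing for the same port.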

Automatic restarts with external tools:

# For development workflow, use entr or similar tools
# Watch files and restart automatically
ls examples/simple/*.py | entr -r ./bin/meshctl start examples/simple/my_agent.py

# Or use Docker Compose for auto-reload
cd examples/docker-examples
docker-compose up --build

Step 2: Configure Watch Patterns¶

Control what triggers a restart through the arguments you pass to an external file watcher:

# Use external tools for file watching
# Example with entr:
find examples/simple -name "*.py" | entr -r ./bin/meshctl start examples/simple/my_agent.py

# Example with watchexec:
watchexec -e py -r "./bin/meshctl start examples/simple/my_agent.py"

# Or use nodemon (if you have Node.js):
nodemon --exec "./bin/meshctl start examples/simple/my_agent.py" --ext py
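
If you write your own watcher instead of using the tools above, the include and exclude decision can be sketched with the standard `fnmatch` module. The pattern lists and the function name `should_trigger_reload` are assumptions chosen for illustration; adjust them to your project layout.

```python
from fnmatch import fnmatch
from pathlib import PurePath

# Hypothetical defaults, not an MCP Mesh configuration format
INCLUDE_NAMES = ["*.py"]
EXCLUDE_NAMES = ["test_*.py", "*_test.py"]
EXCLUDE_DIRS = {"__pycache__", ".venv", ".git"}


def should_trigger_reload(path, include=INCLUDE_NAMES,
                          exclude_names=EXCLUDE_NAMES,
                          exclude_dirs=EXCLUDE_DIRS):
    """Return True when a changed path should restart the agent."""
    p = PurePath(path)
    # Skip anything that lives inside an excluded directory
    if any(part in exclude_dirs for part in p.parts[:-1]):
        return False
    # Skip file names that match an exclude pattern
    if any(fnmatch(p.name, pattern) for pattern in exclude_names):
        return False
    # Otherwise the file name must match at least one include pattern
    return any(fnmatch(p.name, pattern) for pattern in include)
```

Checking excludes before includes keeps the rule simple: an excluded file never triggers a restart, regardless of its extension.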

Step 3: Optimize Your Editor for Hot Reload¶

VS Code Settings¶

{
  "files.autoSave": "afterDelay",
  "files.autoSaveDelay": 1000,
  "python.linting.lintOnSave": true,
  "editor.formatOnSave": true,
  "[python]": {
    "editor.formatOnSaveMode": "file"
  }
}

PyCharm Settings¶

Settings → Appearance & Behavior → System Settings
Enable "Save files automatically"
Set delay to 1 second
Enable "Save files on frame deactivation"

Step 4: Set Up Development Scripts¶

Create dev.sh for quick development:

#!/bin/bash
# dev.sh - Development helper script

# Colors for output
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

# Function to start development environment
start_dev() {
    echo -e "${GREEN}Starting MCP Mesh development environment...${NC}"

    # Start with an empty log file so tail -f below has something to follow
    : > agent_debug.log

    # Set development environment
    export MCP_MESH_LOG_LEVEL=DEBUG
    export MCP_MESH_DEBUG_MODE=true

    # Track every background process and stop them all when the script exits
    PIDS=()
    trap 'kill "${PIDS[@]}" 2>/dev/null' EXIT
    trap 'exit 130' INT TERM

    # Start registry
    ./bin/meshctl start-registry &
    PIDS+=($!)

    # Wait for registry
    sleep 3

    # Start agents
    ./bin/meshctl start examples/simple/system_agent.py &
    PIDS+=($!)
    ./bin/meshctl start examples/simple/hello_world.py &
    PIDS+=($!)

    echo -e "${GREEN}Development environment started!${NC}"
    echo -e "${YELLOW}Press Ctrl+C to stop the registry and agents.${NC}"
    echo "Process IDs: ${PIDS[*]}"

    # Monitor logs in another terminal
    if command -v gnome-terminal &> /dev/null; then
        gnome-terminal -- bash -c "tail -f agent_debug.log | grep -E 'ERROR|WARNING|Reloading'"
    fi

    # Block until the registry and all agents have exited
    wait "${PIDS[@]}"
}

# Function to run quick tests
quick_test() {
    echo -e "${GREEN}Running quick tests...${NC}"
    pytest tests/ -v --tb=short -x
}

# Main menu
case "$1" in
    start)
        start_dev
        ;;
    test)
        quick_test
        ;;
    *)
        echo "Usage: ./dev.sh {start|test}"
        exit 1
        ;;
esac

Configuration Options¶

Option	Description	Default	Example
`MCP_MESH_LOG_LEVEL`	Logging verbosity	INFO	DEBUG, ERROR
`MCP_MESH_DEBUG_MODE`	Enable debug output	false	true
`MCP_MESH_AUTO_RUN`	Auto-start agent HTTP server	true	false
`HOST`	Agent bind address	0.0.0.0	127.0.0.1
`MCP_MESH_HTTP_PORT`	Agent HTTP port (0=auto)	0	8080

Examples¶

Example 1: Development Workflow Script¶

# agents/dev_utils.py
import os
import time
from datetime import datetime
from functools import wraps

def development_mode(func):
    """Decorator for development-only features"""
    @wraps(func)
    def wrapper(*args, **kwargs):
        if os.getenv('MCP_MESH_DEBUG_MODE') == 'true':
            print(f"[DEV] {func.__name__} called at {datetime.now()}")
            start = time.time()
            result = func(*args, **kwargs)
            print(f"[DEV] {func.__name__} took {time.time() - start:.3f}s")
            return result
        return func(*args, **kwargs)
    return wrapper

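# In-process cache: shared across calls, but reset whenever the process restarts
# (see "Pitfall 2: State Loss on Reload" for state that must outlive a restart)
_persistent_cache = {}

def get_persistent_cache():
    """Access the module-level cache shared by all calls in this process"""
    return _persistent_cache

# `tool` is the MCP Mesh tool decorator; import it as your SDK version requires
@tool(capability="dev_agent")
@development_mode
def process_with_cache(key: str, value: str | None = None):
    """Example using the in-process cache"""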
    cache = get_persistent_cache()

    if value is not None:
        cache[key] = value
        return f"Stored {key}={value}"

    return cache.get(key, "Not found")

Example 2: Watch Configuration for Complex Project¶

# watch_config.py
import os
import time

import watchdog.events
import watchdog.observers

class CustomReloadHandler(watchdog.events.FileSystemEventHandler):
    """Custom handler for fine-grained reload control"""

    def __init__(self, reload_callback):
        super().__init__()
        self.reload_callback = reload_callback
        self.last_reload = 0
        self.reload_delay = 0.5  # seconds

    def should_reload(self, event):
        """Determine if file change should trigger reload"""
        # Skip temporary files
        if event.src_path.endswith('.tmp') or '~' in event.src_path:
            return False

        # Skip test files during normal development
        if 'test_' in event.src_path and not os.getenv('RELOAD_TESTS'):
            return False

        # Debounce rapid changes
        current_time = time.time()
        if current_time - self.last_reload < self.reload_delay:
            return False

        self.last_reload = current_time
        return True

    def on_modified(self, event):
        if not event.is_directory and self.should_reload(event):
            print(f"Detected change in {event.src_path}")
            self.reload_callback(event.src_path)
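
The debounce step in the handler above can be isolated and tested without touching the file system. The sketch below is a hypothetical `Debouncer` class with an injectable clock; it uses `time.monotonic` by default because, unlike `time.time`, it cannot jump backwards when the system clock is adjusted.

```python
import time
from typing import Callable


class Debouncer:
    """Let an event through only if enough time has passed since the last one."""

    def __init__(self, delay: float, clock: Callable[[], float] = time.monotonic):
        self.delay = delay
        self.clock = clock
        self._last_accepted = None  # no event accepted yet

    def accept(self) -> bool:
        now = self.clock()
        if self._last_accepted is not None and now - self._last_accepted < self.delay:
            return False  # too soon after the previous accepted event
        self._last_accepted = now
        return True
```

In a test, pass a fake clock such as `lambda: fake_now[0]` and advance `fake_now[0]` by hand, so the behaviour can be checked without real sleeps.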

Best Practices¶

Save Frequently: Configure auto-save in your editor
Small Changes: Make incremental changes for faster feedback
Watch Logs: Keep a terminal open with filtered logs
State Management: Design agents to handle restart gracefully
Exclude Tests: Don't reload on test file changes during development

Common Pitfalls¶

Pitfall 1: Reload Loops¶

Problem: Agent crashes on startup, causing infinite reload loop

Solution: Add startup validation and circuit breaker:

# At the top of your agent file
import sys
import time

# Simple circuit breaker
startup_file = '.startup_attempts'
max_attempts = 3
window_seconds = 10

try:
    with open(startup_file, 'r') as f:
        # Skip blank lines so a stray newline does not raise ValueError
        attempts = [float(line.strip()) for line in f if line.strip()]

    recent_attempts = [t for t in attempts if time.time() - t < window_seconds]

    if len(recent_attempts) >= max_attempts:
        print(f"Too many startup attempts ({len(recent_attempts)}) in {window_seconds}s")
        sys.exit(1)
except FileNotFoundError:
    recent_attempts = []

# Record this startup attempt
with open(startup_file, 'a') as f:
    f.write(f"{time.time()}\n")
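
The same circuit breaker can be wrapped in a function so it is easier to test and so the attempts file does not grow without bound. This is a sketch under stated assumptions: `record_startup` and its `now` parameter are hypothetical names introduced here, and old timestamps are pruned on every call.

```python
import time
from pathlib import Path


def record_startup(path, max_attempts=3, window_seconds=10.0, now=None):
    """Record a startup attempt; return False if too many happened recently."""
    now = time.time() if now is None else now
    file = Path(path)
    try:
        lines = file.read_text().splitlines()
    except FileNotFoundError:
        lines = []

    recent = []
    for line in lines:
        try:
            stamp = float(line)
        except ValueError:
            continue  # ignore blank or corrupted lines
        if now - stamp < window_seconds:
            recent.append(stamp)

    allowed = len(recent) < max_attempts
    if allowed:
        recent.append(now)
    # Rewrite the file with only the timestamps still inside the window
    file.write_text("".join(f"{t}\n" for t in recent))
    return allowed
```

An agent would call `record_startup('.startup_attempts')` first thing and exit with a non-zero status when it returns `False`.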

Pitfall 2: State Loss on Reload¶

Problem: In-memory state lost when agent reloads

Solution: Use external state storage:

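import atexit
import pickle
import signal
import sys
import time

STATE_FILE = '.agent_state.pkl'

def load_state():
    """Restore state after reload"""
    try:
        with open(STATE_FILE, 'rb') as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {'cache': {}, 'counters': {}}

# Load on startup
state = load_state()
cache_data = state['cache']
counters = state['counters']

def save_state():
    """Save state before reload"""
    with open(STATE_FILE, 'wb') as f:
        pickle.dump({
            'cache': cache_data,
            'counters': counters,
            'timestamp': time.time()
        }, f)

# Register save on exit
atexit.register(save_state)

# atexit handlers do not run on an unhandled SIGTERM (the signal most file
# watchers send), so turn SIGTERM into a normal interpreter exit
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))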
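
One variation on the pickle approach above stores state as JSON and writes it atomically, so a restart that lands in the middle of a save cannot leave a half-written file behind. This swaps in a different technique: JSON limits state to JSON-serializable values, but loading it never executes code, which unpickling can. The function names `save_json_state` and `load_json_state` are hypothetical.

```python
import json
import os
import tempfile


def save_json_state(path: str, state: dict) -> None:
    """Write state to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes reach the disk
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_json_state(path: str) -> dict:
    """Return saved state, or an empty default if missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {"cache": {}, "counters": {}}
```

Creating the temp file in the same directory as the target is what keeps `os.replace` atomic.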

Testing¶

Unit Test Example¶

# tests/test_hot_reload.py
import os
import tempfile

def test_file_change_triggers_reload():
    """Test that modifying a file is detected as a change"""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create test agent
        agent_file = os.path.join(tmpdir, 'test_agent.py')
        with open(agent_file, 'w') as f:
            f.write('# version 1\n')

        # Start monitoring (mock): record the file's initial signature
        changes_detected = []
        before = os.stat(agent_file)

        # Modify file
        with open(agent_file, 'a') as f:
            f.write('# version 2\n')

        # Minimal watcher: compare size and mtime against the first snapshot
        after = os.stat(agent_file)
        if (after.st_size, after.st_mtime_ns) != (before.st_size, before.st_mtime_ns):
            changes_detected.append(agent_file)

        # Verify reload triggered
        assert len(changes_detected) == 1

Integration Test Example¶

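# tests/test_reload_integration.py
import os
import subprocess
import time

import requests

HEALTH_URL = 'http://localhost:8888/health'

def test_agent_survives_reload():
    """Test agent remains accessible after reload"""
    # Pin the agent port so the health URL is predictable (default 0 = auto)
    env = dict(os.environ, MCP_MESH_HTTP_PORT='8888')

    # Start agent under an external file watcher, since meshctl does not
    # reload on its own
    proc = subprocess.Popen([
        'watchexec', '-e', 'py', '-r', '--',
        './bin/meshctl', 'start', 'test_agent.py'
    ], env=env)

    try:
        time.sleep(3)

        # Verify agent is running
        response = requests.get(HEALTH_URL, timeout=5)
        assert response.status_code == 200

        # Modify agent file
        with open('test_agent.py', 'a') as f:
            f.write('\n# trigger reload\n')

        time.sleep(2)  # Wait for reload

        # Verify agent still accessible
        response = requests.get(HEALTH_URL, timeout=5)
        assert response.status_code == 200
    finally:
        # Always stop the watcher, even when an assertion fails
        proc.terminate()
        proc.wait(timeout=10)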
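
Fixed sleeps make tests like this either slow or flaky. A polling helper that returns as soon as the agent answers is one alternative; the sketch below uses a hypothetical `wait_until` function that takes any zero-argument probe, so it stays independent of the HTTP library.

```python
import time
from typing import Callable


def wait_until(probe: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.2) -> float:
    """Poll probe until it returns True; return the seconds waited.

    Raises TimeoutError if the probe never succeeds within timeout.
    """
    start = time.monotonic()
    while True:
        try:
            if probe():
                return time.monotonic() - start
        except Exception:
            pass  # treat errors (e.g. connection refused) as "not ready yet"
        if time.monotonic() - start >= timeout:
            raise TimeoutError(f"probe did not succeed within {timeout}s")
        time.sleep(interval)
```

With `requests`, the probe could be `lambda: requests.get(url, timeout=1).status_code == 200`. The returned duration doubles as a rough measurement of how long a restart takes.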

Monitoring and Debugging¶

Logs to Check¶

# Watch reload events
tail -f ~/.mcp-mesh/logs/mcp-mesh.log | grep -i reload

# Monitor file system events (Linux only; requires inotify-tools)
inotifywait -m -r agents/ -e modify

# Track reload performance
grep "Reload completed" ~/.mcp-mesh/logs/mcp-mesh.log | tail -20

Metrics to Monitor¶

Reload Time: Should be < 2 seconds for most agents
Reload Frequency: Too many reloads may indicate editor issues
Memory Growth: Check for leaks across reloads

🔧 Troubleshooting¶

Issue 1: Changes Not Triggering Reload¶

Symptoms: File saved but agent doesn't restart

Cause: File not in watch pattern or save not detected

Solution:

# Terminal 1: run the agent under an external file watcher
ls examples/simple/*.py | entr -r ./bin/meshctl start examples/simple/my_agent.py

# Terminal 2: append a comment (still valid Python) to trigger a change
echo "# test" >> examples/simple/my_agent.py

# If nothing restarts, check that file system events arrive at all (Linux, inotify-tools)
inotifywait -m examples/simple/my_agent.py

Issue 2: Slow Reload Performance¶

Symptoms: Reload takes > 5 seconds

Cause: Large imports or initialization code

Solution:

# Move slow imports inside functions
def process_data():
    import pandas as pd  # Import only when needed
    return pd.DataFrame()

# Cache expensive initialization
_model = None

def get_model():
    global _model
    if _model is None:
        _model = load_expensive_model()
    return _model
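
Before moving imports around, it helps to measure which ones are slow. The sketch below is a hypothetical `timed_import` helper built on the standard `importlib` module. A module that is already loaded returns almost instantly from `sys.modules`, so measure in a fresh process; the interpreter's built-in `python -X importtime` flag gives a more detailed breakdown.

```python
import importlib
import time


def timed_import(module_name: str):
    """Import a module and return (module, seconds spent importing)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


if __name__ == "__main__":
    # Standard-library modules used as stand-ins; substitute your agent's imports
    for name in ("json", "decimal", "email"):
        _, seconds = timed_import(name)
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Imports that dominate the total are the best candidates for the function-level import pattern shown above.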