Implementing Custom Middleware for Request Tracing in FastAPI
Debugging asynchronous API calls in production requires precise request correlation. This guide details how to implement custom middleware for request tracing within the broader Core Architecture & Routing Patterns ecosystem, ensuring every inbound request receives a unique, traceable identifier that propagates through logs, dependencies, and external service calls.
Understanding Request Tracing in Async Environments
In synchronous Python, developers historically relied on thread-local storage for request-scoped state. In an asyncio event loop, multiple coroutines share the same OS thread, making thread-locals unsafe and highly prone to cross-request contamination. Starlette’s middleware stack processes requests sequentially through a chain of callables, requiring a concurrency-safe mechanism for state isolation. Aligning your tracing strategy with established Middleware Implementation patterns ensures predictable execution order, minimal event-loop blocking, and clean separation of concerns.
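The contamination risk described above can be reproduced with the standard library alone. The following is a minimal sketch, not part of the middleware itself; the names `thread_state`, `request_id_ctx`, and `handle` are hypothetical and exist only for this demonstration. Three coroutines run on one thread, and each writes its ID to both a thread-local and a `ContextVar` before yielding to the event loop.

```python
import asyncio
import threading
from contextvars import ContextVar

# Hypothetical names, used only for this demonstration.
thread_state = threading.local()
request_id_ctx: ContextVar[str] = ContextVar("request_id", default="")


async def handle(request_id: str) -> tuple[str, str, str]:
    thread_state.request_id = request_id  # shared by every coroutine on this thread
    request_id_ctx.set(request_id)        # private to this task's context
    await asyncio.sleep(0)                # yield so the other coroutines run
    return request_id, thread_state.request_id, request_id_ctx.get()


async def main() -> list[tuple[str, str, str]]:
    # gather() wraps each coroutine in its own task, each with its own context copy.
    return await asyncio.gather(handle("a"), handle("b"), handle("c"))


results = asyncio.run(main())
for expected, from_thread_local, from_contextvar in results:
    print(expected, from_thread_local, from_contextvar)
# a c a
# b c b
# c c c
```

The thread-local column shows the last writer's value for every request, while the `ContextVar` column stays correct per task.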
Building the Custom Middleware Class
For production FastAPI applications, extending starlette.middleware.base.BaseHTTPMiddleware provides the cleanest interface for request/response interception. While raw ASGI middleware offers marginally lower overhead, BaseHTTPMiddleware safely handles request body streaming and response wrapping without manual scope management.
The following implementation extracts an existing trace ID from upstream proxies (e.g., API gateways, load balancers) or generates a cryptographically secure UUIDv4. It binds the identifier to an async-safe context variable and injects it into the response headers.
import uuid
from contextvars import ContextVar
from typing import Awaitable, Callable

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response

# Request-scoped context variable for trace ID propagation
trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")


class RequestTracingMiddleware(BaseHTTPMiddleware):
    async def dispatch(
        self, request: Request, call_next: Callable[[Request], Awaitable[Response]]
    ) -> Response:
        # Extract from upstream proxy or generate new UUIDv4
        incoming_trace_id = request.headers.get("X-Trace-Id")
        trace_id = incoming_trace_id or str(uuid.uuid4())

        # Bind trace ID to the async execution context
        token = trace_id_ctx.set(trace_id)
        try:
            response = await call_next(request)
            response.headers["X-Trace-Id"] = trace_id
            return response
        finally:
            # CRITICAL: Reset context to prevent state leakage across concurrent requests
            trace_id_ctx.reset(token)
Production Constraints:
- The finally block is non-negotiable. Omitting it causes stale trace IDs to leak into subsequent requests handled by the same worker process.
- uuid.uuid4() is highly optimized in CPython and adds negligible overhead (<0.05ms).
Injecting Trace IDs into Request Context
Once the middleware binds the trace ID to contextvars, downstream dependencies and route handlers can access it without explicit parameter passing. This eliminates signature pollution and maintains clean separation between routing logic and observability concerns.
from fastapi import Depends, FastAPI, HTTPException
from starlette.status import HTTP_200_OK

app = FastAPI()


def get_active_trace_id() -> str:
    """FastAPI dependency to safely retrieve the active request trace ID."""
    trace_id = trace_id_ctx.get()
    if not trace_id:
        raise HTTPException(status_code=500, detail="Trace context missing")
    return trace_id


@app.get("/api/v1/health", status_code=HTTP_200_OK)
async def health_check(trace_id: str = Depends(get_active_trace_id)) -> dict:
    # trace_id is automatically injected via DI
    return {"status": "healthy", "trace_id": trace_id}
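The same context variable can feed log records, which is how the trace ID reaches the logs without being passed to every logging call. The sketch below is one possible approach using only the standard `logging` module; `TraceIdFilter` and the logger name `tracing_demo` are hypothetical, and `trace_id_ctx` is redefined here only so the example runs on its own.

```python
import io
import logging
from contextvars import ContextVar

# Stand-in for the trace_id_ctx defined alongside the middleware.
trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every record that passes through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_ctx.get() or "-"
        return True


stream = io.StringIO()  # in-memory sink so the example is self-contained
handler = logging.StreamHandler(stream)
handler.addFilter(TraceIdFilter())
handler.setFormatter(logging.Formatter("%(levelname)s [%(trace_id)s] %(message)s"))

logger = logging.getLogger("tracing_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Simulate what the middleware does around a single request.
token = trace_id_ctx.set("abc-123")
try:
    logger.info("order created")
finally:
    trace_id_ctx.reset(token)
logger.info("outside any request")

print(stream.getvalue(), end="")
# INFO [abc-123] order created
# INFO [-] outside any request
```

Attaching the filter to the handler rather than to a single logger means records from any logger routed through that handler pick up the ID.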
Context Propagation Rules:
- contextvars propagate automatically to child tasks spawned via asyncio.create_task(): each task receives a copy of the context as it exists when the task is created, so the task must be created after the middleware has set the trace ID.
- For BackgroundTasks or external thread pools, explicitly capture trace_id_ctx.get() before dispatch and pass it as an argument.
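These rules can be checked with a short standard-library experiment. The sketch below is illustrative only: `trace_id_ctx` is redefined as a stand-in and the helper names are hypothetical. It compares a child task, a plain `run_in_executor` call, and an executor call wrapped in an explicit `copy_context()` snapshot.

```python
import asyncio
from contextvars import ContextVar, copy_context

trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")  # stand-in


def read_trace_id() -> str:
    return trace_id_ctx.get()


async def read_trace_id_async() -> str:
    return trace_id_ctx.get()


async def main() -> dict[str, str]:
    loop = asyncio.get_running_loop()
    trace_id_ctx.set("req-1")

    # create_task snapshots the current context when the task is created.
    child = asyncio.create_task(read_trace_id_async())
    trace_id_ctx.set("req-1-changed")  # later changes are not seen by the child

    # run_in_executor itself does not copy the context into the worker thread,
    # so the worker normally starts from an empty context...
    bare = await loop.run_in_executor(None, read_trace_id)
    # ...unless the callable is run inside an explicit snapshot.
    ctx = copy_context()
    wrapped = await loop.run_in_executor(None, ctx.run, read_trace_id)

    return {"child_task": await child, "executor": bare, "executor_with_copy": wrapped}


result = asyncio.run(main())
print(result)
```

The child task reports `req-1` and the wrapped executor call reports `req-1-changed`. On a standard CPython build the plain executor call should report the default empty string, which is why the ID has to be passed or copied explicitly for thread pools.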
Configuration & Environment Overrides
Hardcoding middleware behavior violates twelve-factor principles. Use Pydantic BaseSettings (provided by the pydantic-settings package in Pydantic v2) to toggle tracing, adjust header names, and implement sampling strategies via environment variables.
from pydantic_settings import BaseSettings
from pydantic import Field


class TracingConfig(BaseSettings):
    enabled: bool = Field(default=True, description="Toggle request tracing globally")
    header_name: str = Field(default="X-Trace-Id", description="HTTP header used for trace propagation")
    sampling_rate: float = Field(default=1.0, ge=0.0, le=1.0, description="Probability of tracing a request (0.0-1.0)")

    model_config = {"env_prefix": "TRACE_"}


config = TracingConfig()

# Conditional middleware registration
if config.enabled:
    app.add_middleware(RequestTracingMiddleware)
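Environment Overrides:
- TRACE_ENABLED=false disables tracing in local development or staging.
- TRACE_SAMPLING_RATE=0.1 is intended to trace 10% of traffic, reducing log volume in high-throughput production environments.
- TRACE_HEADER_NAME=X-Correlation-Id points the application at whichever header your upstream proxy (e.g., NGINX, AWS ALB) uses for correlation.
These settings take effect only when dispatch() reads them; the RequestTracingMiddleware shown earlier hardcodes X-Trace-Id and traces every request.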
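One way to wire `header_name` and `sampling_rate` into the middleware is a small helper that `dispatch()` could call. The sketch below is a minimal, stdlib-only illustration: `TracingOptions` and `resolve_trace_id` are hypothetical names, the dataclass merely mirrors the relevant `TracingConfig` fields, and the decision to always honour an upstream ID is an assumption rather than something the configuration above prescribes.

```python
import random
import uuid
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class TracingOptions:
    """Hypothetical stand-in mirroring the fields of TracingConfig."""

    header_name: str = "X-Trace-Id"
    sampling_rate: float = 1.0


def resolve_trace_id(
    headers: Mapping[str, str],
    options: TracingOptions,
    rand: Callable[[], float] = random.random,
) -> Optional[str]:
    """Return the trace ID to bind, or None when the request is not sampled."""
    # Starlette's request.headers is case-insensitive; a plain dict is not.
    incoming = headers.get(options.header_name)
    if incoming:
        # Assumption: an upstream ID is always honoured so that a trace
        # started at the gateway is never cut short by local sampling.
        return incoming
    # random.random() returns values in [0.0, 1.0), so a rate of 1.0 always
    # samples and a rate of 0.0 never does.
    if rand() >= options.sampling_rate:
        return None
    return str(uuid.uuid4())


opts = TracingOptions(header_name="X-Correlation-Id", sampling_rate=0.1)
print(resolve_trace_id({"X-Correlation-Id": "gw-42"}, opts))  # gw-42
print(resolve_trace_id({}, opts, rand=lambda: 0.5))           # None
```

Inside `dispatch()`, a `None` result would mean calling `call_next(request)` without binding the context variable. Injecting `rand` keeps the sampling branch deterministic in tests.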
Debugging & Performance Validation
1. Header Presence Testing
Use fastapi.testclient.TestClient to assert trace ID generation and propagation without spinning up a live server.
from fastapi.testclient import TestClient


def test_trace_id_generation() -> None:
    client = TestClient(app)
    response = client.get("/api/v1/health")
    assert response.status_code == 200
    assert "X-Trace-Id" in response.headers
    assert len(response.headers["X-Trace-Id"]) == 36  # UUIDv4 format
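A length check accepts any 36-character string, so a stricter assertion may be worth having when the test is meant to prove that the middleware generated the ID itself. The helper below is a hypothetical sketch (`is_uuid4` is not part of FastAPI or Starlette) built on the standard `uuid` module.

```python
import uuid


def is_uuid4(value: str) -> bool:
    """Return True only for a canonical, hyphenated UUIDv4 string."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # uuid.UUID also accepts braces, "urn:uuid:" prefixes and un-hyphenated hex,
    # so compare against the canonical form as well as checking the version.
    return parsed.version == 4 and str(parsed) == value.lower()


print(is_uuid4(str(uuid.uuid4())))  # True
print(is_uuid4("x" * 36))           # False
```

In the test above, `assert is_uuid4(response.headers["X-Trace-Id"])` could replace the length comparison. It would not apply to requests that forward an upstream ID in another format.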
2. Latency Profiling
Middleware overhead should remain sub-millisecond. Profile with pyinstrument to isolate blocking calls:
pyinstrument -r html -o profile.html -m uvicorn main:app --host 0.0.0.0 --port 8000
If P99 latency spikes, verify that no synchronous I/O (e.g., requests, sqlite3, logging.FileHandler) executes inside dispatch().
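When a synchronous call cannot be avoided, it can be moved off the event loop. The sketch below uses `asyncio.to_thread` (Python 3.9+) as one option; `blocking_lookup` and `heartbeat` are hypothetical stand-ins, with the heartbeat task serving only to show that the loop keeps running while the blocking call is in progress.

```python
import asyncio
import time


def blocking_lookup(key: str) -> str:
    """Hypothetical stand-in for a synchronous call (e.g. a legacy client)."""
    time.sleep(0.05)
    return key.upper()


async def heartbeat(ticks: list[float]) -> None:
    # Records a timestamp roughly every 10 ms while the loop is free to run it.
    for _ in range(3):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)


async def main() -> tuple[str, int]:
    ticks: list[float] = []
    beat = asyncio.create_task(heartbeat(ticks))
    # asyncio.to_thread runs the call in a worker thread and copies the
    # current contextvars context into it, so a bound trace ID stays visible.
    value = await asyncio.to_thread(blocking_lookup, "abc")
    ticks_during_call = len(ticks)
    await beat
    return value, ticks_during_call


value, ticks_during_call = asyncio.run(main())
print(value, ticks_during_call)
```

Calling `blocking_lookup("abc")` directly inside the coroutine would leave `ticks_during_call` at zero, because the heartbeat could not run until the sleep finished.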
3. Background Task & WebSocket Caveats
- Background Tasks: do not rely on contextvars being visible inside BackgroundTasks. They run after the response has been sent, and whether the value set in dispatch() is still visible can depend on the Starlette version and the middleware in use. Capture the ID explicitly: bg_task_id = trace_id_ctx.get(); background_tasks.add_task(worker, bg_task_id)
- WebSockets: BaseHTTPMiddleware only intercepts HTTP requests. For WebSocket tracing, implement a raw ASGI middleware that checks scope["type"] == "websocket" and attaches metadata to scope["state"].
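A raw ASGI middleware along the lines of the WebSocket caveat might look like the following. This is a sketch under stated assumptions, not a drop-in replacement: `ASGITracingMiddleware` is a hypothetical class, `trace_id_ctx` is redefined as a stand-in, and the type aliases are local conveniences. It relies only on the ASGI calling convention (`scope`, `receive`, `send`), so it needs no third-party imports.

```python
import uuid
from contextvars import ContextVar
from typing import Any, Awaitable, Callable, MutableMapping

Scope = MutableMapping[str, Any]
Message = MutableMapping[str, Any]
Receive = Callable[[], Awaitable[Message]]
Send = Callable[[Message], Awaitable[None]]
ASGIApp = Callable[[Scope, Receive, Send], Awaitable[None]]

trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")  # stand-in


class ASGITracingMiddleware:
    """Hypothetical raw ASGI middleware covering http and websocket scopes."""

    def __init__(self, app: ASGIApp, header_name: str = "X-Trace-Id") -> None:
        self.app = app
        self.header = header_name.lower().encode("latin-1")

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        if scope["type"] not in ("http", "websocket"):
            await self.app(scope, receive, send)  # e.g. lifespan events
            return

        # ASGI delivers headers as (lower-cased name, value) pairs of bytes.
        incoming = dict(scope.get("headers") or []).get(self.header)
        trace_id = incoming.decode("latin-1") if incoming else str(uuid.uuid4())
        scope.setdefault("state", {})["trace_id"] = trace_id

        async def send_with_trace_id(message: Message) -> None:
            # Only HTTP response starts carry headers this sketch modifies.
            if message["type"] == "http.response.start":
                headers = list(message.get("headers") or [])
                headers.append((self.header, trace_id.encode("latin-1")))
                message = {**message, "headers": headers}
            await send(message)

        token = trace_id_ctx.set(trace_id)
        try:
            await self.app(scope, receive, send_with_trace_id)
        finally:
            trace_id_ctx.reset(token)
```

Because `add_middleware` accepts any ASGI middleware class plus keyword arguments, this could be registered with `app.add_middleware(ASGITracingMiddleware, header_name="X-Trace-Id")`. The sketch does not add the header to the WebSocket handshake response; it only records the ID in the context variable and in `scope["state"]`.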
Common Production Pitfalls
| Anti-Pattern | Impact | Production Fix |
|---|---|---|
| Using module-level globals for trace state | Race conditions corrupt logs across concurrent requests | Always use contextvars.ContextVar for async-safe isolation |
| Omitting trace_id_ctx.reset(token) in finally | Memory leaks and stale IDs persist across worker lifecycles | Wrap call_next in try/finally to guarantee context cleanup |
| Executing blocking I/O in dispatch() | Event loop starvation, P99 latency spikes >500ms | Use await for all network/DB calls; offload sync work to run_in_executor |
Frequently Asked Questions
Does custom request tracing middleware impact API latency?
Minimal overhead (<1ms) when using contextvars and avoiding blocking I/O. The primary cost is header parsing and UUID generation, both highly optimized in CPython.
How do I ensure trace IDs propagate to background tasks?
Capture the trace ID before dispatching the task using trace_id_ctx.get() and explicitly pass it as an argument. Inside the task, set it in a new context scope if downstream dependencies rely on contextvars.
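The capture-and-rebind pattern can be packaged as a decorator so that worker functions keep reading the ID from the context. The sketch below is one possible shape for synchronous workers; `bind_trace_id` and `send_receipt` are hypothetical names, and `trace_id_ctx` is redefined as a stand-in so the example is self-contained.

```python
from contextvars import ContextVar
from functools import wraps
from typing import Callable, TypeVar

T = TypeVar("T")
trace_id_ctx: ContextVar[str] = ContextVar("trace_id", default="")  # stand-in


def bind_trace_id(func: Callable[..., T]) -> Callable[..., T]:
    """Hypothetical decorator: the wrapped worker takes trace_id as its first
    argument and exposes it through trace_id_ctx for the duration of the call."""

    @wraps(func)
    def wrapper(trace_id: str, *args, **kwargs) -> T:
        token = trace_id_ctx.set(trace_id)
        try:
            return func(*args, **kwargs)
        finally:
            trace_id_ctx.reset(token)

    return wrapper


@bind_trace_id
def send_receipt(order_id: int) -> str:
    # Downstream code reads the ID from the context, as it would in a request.
    return f"[{trace_id_ctx.get()}] receipt for order {order_id}"


# In a route handler the call would be scheduled as:
#   background_tasks.add_task(send_receipt, trace_id_ctx.get(), order_id)
print(send_receipt("req-7", 42))  # [req-7] receipt for order 42
```

An `async def` worker would need an equivalent wrapper that awaits `func` inside the same try/finally.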
Can this middleware coexist with CORS and authentication middleware?
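Yes. Starlette wraps middleware in reverse order of registration: the middleware added last becomes the outermost layer, so it sees each request first and each response last. Register the tracing middleware last so that it wraps the CORS and authentication middleware; every response, including CORS preflight replies and authentication failures, then carries a trace ID.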