Request lifecycle

What happens between an HTTP POST /api/chat and the SSE chunk landing in the client.

Figure 10 — Twelve-step request lifecycle.

Stage details

HTTP entry — routes.py validates the X-Ghost-Key header. Streaming responses set the X-Request-ID response header for correlation with the agent log.
Context setup — the request_id is bound to a contextvars.ContextVar so log lines from inside tool calls inherit it.
Sampling decision — get_sampling_params picks the coding or general profile; classify_thinking_budget picks a tight, extended, or selfplay budget.
Hydration — see memory hydration (RRF).
Tool list — built per-turn so semantic skill routing reflects the current query.
Compression — ContextManager decides how aggressively to summarise, escalating beyond L0 only if the budget is more than 60% full.
Upstream call — the chat completion is streamed back through LLMClient; the circuit breaker skips nodes in the OPEN state.
Tool dispatch — XML tool blocks are parsed; the registry's bound async lambda runs the tool inside the Docker sandbox.
Failure handling — tool_failure.classify categorises the failure, which is then handled by retry, replan, or diagnostic; fallback_chains may surface alternative tools.
Verification — invoked only on substantive tool output; bookkeeping tools are skipped.
Final response — risk summary from UncertaintyTracker appended if any unknown reaches impact ≥ 4.
Background — events are appended to the journal; the biological watchdog may then trigger a dream cycle.
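
The Context setup stage can be illustrated with a minimal sketch. It assumes the request ID is attached to log records through a logging filter. The names request_id_var, RequestIdFilter, run_tool, and handle_request are hypothetical and chosen for illustration; the source only states that the request_id is bound to a contextvars.ContextVar. Because asyncio copies the current context into every new task, a tool call started from a request handler sees that request's ID, and concurrent requests do not overwrite each other.

```python
import asyncio
import contextvars
import logging

# Hypothetical variable name; "-" is logged when no request is active.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


logger = logging.getLogger("agent.sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
_handler = logging.StreamHandler()
_handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
_handler.addFilter(RequestIdFilter())
logger.addHandler(_handler)


async def run_tool(name: str) -> str:
    """Stand-in for a tool call; it never receives the request ID explicitly."""
    await asyncio.sleep(0)  # yield so concurrent requests interleave
    logger.info("tool %s finished", name)
    return request_id_var.get()


async def handle_request(request_id: str) -> str:
    """Bind the ID, run a tool in a child task, then restore the old value."""
    token = request_id_var.set(request_id)
    try:
        # The child task gets a copy of the current context, ID included.
        return await asyncio.create_task(run_tool("search"))
    finally:
        request_id_var.reset(token)


async def main() -> list:
    # gather wraps each coroutine in its own task with its own context copy.
    return await asyncio.gather(
        handle_request("req-a"),
        handle_request("req-b"),
    )


seen = asyncio.run(main())  # each tool call saw its own request's ID
```

Resetting the token in a finally block keeps the binding scoped to one request even if the tool raises.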