core / llm.py
Multi-pool LLM orchestrator. Owns httpx async clients, round-robin scheduling, and per-node circuit breakers.
Pool topology
Six logical pools, each backed by zero-or-more (url, model) nodes plus the always-present foreground client.
| Pool | Used for | Selector |
|---|---|---|
| foreground | Default chat completion | chat_completion() with no flags |
| swarm | Parallel inference / fan-out | chat_completion(use_swarm=True) |
| worker | Cheap classifier / verifier sub-tasks | route(task, ...) |
| visual | Multi-modal (image / PDF) | chat_completion(use_vision=True) |
| coding | Code-specialist generation | chat_completion(use_coding=True) |
| image_gen | SDXL image generation | generate_image() |
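The selector column can be read as a mapping from call flags to a pool name. The sketch below is illustrative only: the `POOLS` dictionary, the placeholder URLs and model names, the flag precedence, and the fallback to foreground when a pool has no nodes are all assumptions, not behaviour confirmed by llm.py.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, str]  # (url, model)

# Hypothetical topology: URLs and model names are placeholders.
POOLS: Dict[str, List[Node]] = {
    "swarm": [("http://10.0.0.2:8000", "model-a"), ("http://10.0.0.3:8000", "model-a")],
    "worker": [("http://10.0.0.4:8000", "model-small")],
    "visual": [],  # a pool may have zero nodes
    "coding": [("http://10.0.0.5:8000", "model-code")],
    "image_gen": [],
}


def select_pool(
    use_swarm: bool = False,
    use_worker: bool = False,
    use_vision: bool = False,
    use_coding: bool = False,
) -> str:
    """Return the pool name for a call, or "foreground" if none applies.

    The precedence order below is an assumption made for this sketch.
    """
    flags = [
        ("visual", use_vision),
        ("coding", use_coding),
        ("swarm", use_swarm),
        ("worker", use_worker),
    ]
    for pool, enabled in flags:
        # Assumed fallback: an empty pool defers to the foreground client.
        if enabled and POOLS.get(pool):
            return pool
    return "foreground"
```

With this layout, `select_pool(use_coding=True)` picks the coding pool, while `select_pool(use_vision=True)` falls through to foreground because the visual pool is empty.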
Routing tasks
The RoutingTask enum at line 71 advertises the labels worker pools can fulfil:
- VALIDATE_TOOL_ARGS: quick JSON-schema sanity check.
- EXPAND_QUERY: sub-query decomposition for memory hydration.
- CLASSIFY_INTENT: factual vs procedural vs contextual.
- SCORE_RELEVANCE: re-rank candidate documents.
- REPAIR_JSON: coerce a malformed payload back to valid JSON.
- VERIFY: used by Verifier.
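As a rough illustration, the six labels could be modelled as a string-valued enum. The member names come from the list above; the string values and the `parse_task` helper are hypothetical and may not match the real definition in llm.py.

```python
from enum import Enum


class RoutingTask(str, Enum):
    """Labels a worker pool can fulfil (string values are assumed)."""

    VALIDATE_TOOL_ARGS = "validate_tool_args"
    EXPAND_QUERY = "expand_query"
    CLASSIFY_INTENT = "classify_intent"
    SCORE_RELEVANCE = "score_relevance"
    REPAIR_JSON = "repair_json"
    VERIFY = "verify"


def parse_task(label: str) -> RoutingTask:
    """Look up a member by name, case-insensitively (hypothetical helper)."""
    return RoutingTask[label.strip().upper()]
```

Mixing in `str` keeps members JSON-serialisable, which may be convenient if task labels travel inside request payloads.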
Circuit breaker
NodeCircuitBreaker(failure_threshold=3, cooldown_seconds=60.0) tracks per-node state:
Figure 3 — NodeCircuitBreaker state diagram.
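A minimal sketch of a breaker with these two parameters follows. It assumes the conventional CLOSED / OPEN / HALF_OPEN cycle, which may differ from the states shown in Figure 3; the class name, method names, and injectable `clock` argument are hypothetical.

```python
import time
from typing import Callable, Optional


class NodeCircuitBreakerSketch:
    """Illustrative per-node breaker; not the real NodeCircuitBreaker."""

    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            return "HALF_OPEN"  # cooldown elapsed: allow a probe request
        return "OPEN"

    def allow_request(self) -> bool:
        return self.state != "OPEN"

    def record_success(self) -> None:
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe, or reaching the threshold, (re)starts the cooldown.
        if self.state == "HALF_OPEN" or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

Under these assumptions, three consecutive failures open the circuit, requests are refused for 60 seconds, and the next success after the cooldown closes it again.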
HTTP client
One httpx.AsyncClient per node with:
- tor_proxy set in constructor.
- X-Ghost-Key when calling other Ghost Agent instances.

Public methods
| Method | Purpose |
|---|---|
| async chat_completion(payload, use_swarm, use_worker, use_vision, use_coding, timeout) | Pool-aware dispatch. Falls back across nodes when a circuit is OPEN. |
| async route(task, payload, max_tokens=128, temperature=0.0, fallback=None) | Send a small classification / repair task to the worker pool with a low max_tokens to keep cost down. |
| async generate_image(payload) | POST to a node from the image_gen pool; returns base64 PNG. |
| get_*_node() | Round-robin selection helpers (worker / vision / coding / image_gen). |
| async close() | Closes all underlying httpx clients on lifespan shutdown. |
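Round-robin selection combined with fallback across nodes might look like the sketch below. `RoundRobinPool`, `next_node`, and the `is_available` callback (standing in for a per-node breaker check) are hypothetical names, not the actual get_*_node() helpers.

```python
from typing import Callable, List, Optional


class RoundRobinPool:
    """Rotate through nodes, skipping any whose circuit is not available."""

    def __init__(self, nodes: List[str], is_available: Callable[[str], bool]) -> None:
        self._nodes = list(nodes)
        self._is_available = is_available
        self._cursor = 0

    def next_node(self) -> Optional[str]:
        # Try each node at most once per call, starting at the cursor.
        for _ in range(len(self._nodes)):
            node = self._nodes[self._cursor % len(self._nodes)]
            self._cursor += 1
            if self._is_available(node):
                return node
        return None  # every node is unavailable, or the pool is empty
```

A caller that receives `None` would then need its own fallback, for example the foreground client; that choice is left out of the sketch.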
Concurrency model
- Fully async; asyncio.Semaphore(3) caps concurrent background routing tasks.
- Per-node httpx AsyncClient means a single failing node cannot starve the others.
- The breaker is process-local; a restart resets its state.
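The semaphore cap can be demonstrated in isolation. In this sketch `run_capped` and `routed_task` are hypothetical names, and `asyncio.sleep` stands in for a real routing call; it records the peak number of tasks running at once.

```python
import asyncio


async def run_capped(n_tasks: int, limit: int = 3) -> int:
    """Run n_tasks dummy jobs under a semaphore; return peak concurrency."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def routed_task() -> None:
        nonlocal in_flight, peak
        async with semaphore:  # at most `limit` tasks pass this point
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a routing request
            in_flight -= 1

    await asyncio.gather(*(routed_task() for _ in range(n_tasks)))
    return peak
```

Calling `asyncio.run(run_capped(10))` schedules ten tasks, but the observed peak never exceeds the limit of three.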