CLI Reference

Every flag accepted by python -m src.ghost_agent.main, sourced from main.py:61-138.

Networking & identity

FlagTypeDefaultEffect
--hoststr0.0.0.0uvicorn bind address.
--portint8000uvicorn listen port.
--upstream-urlstrhttp://127.0.0.1:8080OpenAI-compatible LLM backend (llama.cpp / Ollama / vLLM).
--modelstr$GHOST_MODEL or qwen-3.6-35b-a3Model identifier returned on Ollama compatibility routes (/api/show, /api/tags).
--api-keystr$GHOST_API_KEY or ghost-secret-123Value the agent expects in the X-Ghost-Key header.
--default-dbstr$GHOST_DEFAULT_DB or local Postgres DSNDefault DSN for the postgres_admin tool.

Logging & verbosity

FlagTypeEffect
-d / --daemonflagSuppress stdout logging (file-only).
--debugflagSet log level to DEBUG.
-v / --verboseflagDisable per-line truncation; raises LOG_TRUNCATE_LIMIT from 60 chars to 1M.

Memory behaviour

FlagDefaultEffect
--no-memoryoffSkip vector / graph / profile / episodic / adaptive-threshold / contradiction-log initialisation. Useful for regression tests.
--smart-memory0.0Initial value for the AdaptiveThreshold recall gate. The store self-tunes from observations once recording begins (window=100, MIN_OBSERVATIONS=20).
--max-context65536Maximum tokens before ContextManager escalates compression (L0..L4).

Tool surface

FlagDefaultEffect
--native-tools / --no-native-toolsonWhether to advertise OpenAI-style tools array on outbound LLM calls. When on, the equivalent XML <tool_def> schema is suppressed from the prompt to avoid double-shipping the same definitions (~7,800 token saving per turn). The XML format scaffolding stays so the parser still accepts the legacy <tool_call> shape as fallback. Off forces XML-only tool dispatch (no native channel; full schema in prompt). See Context compaction in core.agent.
--anonymousonRoute web search through Tor / DuckDuckGo (with identity rotation on 401/403/503).
--deep-reasonoffInitialise MCTSReasoner (max_candidates=3, max_depth=2) and HypothesisTester.
--perfect-itoffAppend a proactive optimisation pass after a session completes.

Stage-1 self-improvement pipeline

Local-only trajectory logging, self-critique reflection, and complexity-routed dispatch. All three default to ON (opt-out shape) when --no-memory is not set; the pipeline is fully local — no external teacher, no hosted embedder. See self-improvement pipeline for the architecture.

FlagTypeDefaultEffect
--no-trajectoriesflagoffDisable the JSONL trajectory log at $GHOST_HOME/system/trajectories/. Also implicitly disables reflection (which reads from the log).
--no-reflectionflagoffDisable the reflection biological phase (2.5) even if trajectory logging is on. Trajectories still write to disk but the Reflector never fires.
--router-modelpathunsetPath to a persisted ComplexityClassifier JSON. When unset, the dispatcher acts as a pass-through that always escalates to the full swarm pool list — never less capable, just never cheaper. Train a classifier via trajectories collected with --no-trajectories off.
--router-confidence-thresholdfloat0.3Minimum router confidence required to route a request to the cheap path. Below this, the dispatcher escalates to the full swarm (fail-safe).
--prm-modelpathunsetPath to a persisted PRM (Process Reward Model) JSON checkpoint. When set, the scorer loads on startup and plugs into the MCTS reasoner so plan candidates are scored in microseconds instead of paying a worker-LLM simulation per candidate. Unset → no-op scorer returning 0.5 for every candidate (call sites stay branch-free). See PRM algorithms doc.
--prm-train-cooldownint (seconds)10800Cooldown for the idle-time PRM retrain pass (biological phase 2.7). Default 3 hours. No effect when --prm-model is unset.
--frontier-selfplay / --no-frontier-selfplayflagonBiological-watchdog phase-3 self-play picks the next cluster by (PRM uncertainty × trajectory rarity) instead of only the brittle-pool score. Surfaces clusters the agent has barely tried — which the outcomes-only signal misses, because "never tried" looks the same as "solved instantly". Degrades gracefully when the PRM is untrained or trajectory store is empty (strict isinstance gate; transparent fallback to pick_seed). See core / frontier_selection.
--frontier-uniform-sample-probfloat0.2Probability per self-play tick that frontier-aware selection is bypassed in favour of legacy pick_seed — sanity floor so a systematically-wrong PRM can't lock self-play onto one cluster forever (the PRM is itself learned from trajectories self-play produces).

Helper scripts

Python entry points under scripts/:

ScriptPurpose
scripts/eval_baseline.py freeze|compareRun the offline eval suite (--suite {default,post_learning}) via a stub runner or HTTP against a running agent. Freeze the result as a baseline or diff a subsequent run. Flags: --runner {stub,http}, --base-url, --api-key, --model, --timeout N (default 300s — template tasks on a local Qwen-scale model commonly run 80–250s).
scripts/run_gepa.py --signature <name>Run DSPy / GEPA prompt optimisation on one of the allow-listed signatures (planning.decompose, tool_selection.pick, reflection.critique). Reads trajectories from $GHOST_HOME/trajectories, uses Ghost's own upstream as the optimiser LM (no external teacher), writes the tuned instruction JSON to $GHOST_HOME/system/optim/.
scripts/build_sandbox_image.shBuild ghost-agent-base:latest from sandbox/Dockerfile — bakes apt deps, Python stack, and Playwright Chromium (with --with-deps) into the image. Runs a Chromium smoke test at the end. One-shot per Ghost version; the runtime sandbox wrapper picks up the freshly-built image on next ensure_running.

Swarm topology

Each value is a comma-separated list of url|model pairs. Pools are independent and routed by LLMClient.

FlagUsed forSelector
--swarm-nodesParallel inference / planning fan-outchat_completion(use_swarm=True)
--worker-nodesCheap classifier / verifier sub-tasksroute(task=...) dispatch
--visual-nodesMultimodal vision (PDF + image)chat_completion(use_vision=True)
--coding-nodesCode generation specialistschat_completion(use_coding=True)
--image-gen-nodesSDXL image generationgenerate_image()

Utility modules

The src/ghost_agent/utils/ package centralises cross-cutting helpers:

ModuleHighlights
helpers.pyrequest_new_tor_identity() rotates Tor circuits; helper_fetch_url_content() wraps curl_cffi/httpx with SSRF guards, 5 MB body cap, 20 s timeout, and 3 retry attempts; recursive_split_text + semantic_split_text chunk text for ingestion; get_utc_timestamp / parse_utc_timestamp for ISO timestamps.
logging.pysetup_logging configures rotating file + stdout sinks; pretty_log() emits structured icons-and-tags log lines with per-request tagging via the request_id_context ContextVar; truncation defaults to 60 chars unless --verbose.
sanitizer.pyextract_code_from_markdown, fix_python_syntax (AST-driven repair loop, max 20 retries), and sanitize_code chain that scrubs control characters and heals partial Python before exec.
token_counter.pyload_tokenizer caches Qwen3 35B tokenizer locally with a 15 s download timeout fallback; estimate_tokens uses a bounded LRU cache; check_budget reports per-message token usage.