sandbox / docker.py — DockerSandbox
Long-lived Docker container that hosts every tool call. Workspace bind-mount, optional Tor, resource limits, output capping.
Container topology
Figure 8 — Workspace mount and container provisioning.
Lifecycle
| Step | Behaviour |
|---|---|
| __init__ (line 13-54) | Container name suffix = md5(workspace)[:8] for parallel-session isolation. Auto-detects Docker/OrbStack socket on macOS. |
| ensure_running (108-282) | Liveness probe (mount sync test + echo OK). Removes stale containers. --network host on Linux, bridge on macOS/Windows. Memory cap from GHOST_SANDBOX_MEM (default 4 g). CPU quota from GHOST_SANDBOX_CPU_QUOTA (default 200000 µs ≈ 2 vCPU). Lazy provisioning: commits provisioned container as ghost-agent-base:latest for instant reboot. |
| execute(cmd, timeout=300, memory_limit=None) (284-347) | Wraps with timeout -k 5s {timeout}s. Output cap 256 KB (head + tail; middle dropped). Returns (output_string, exit_code). Runs as host uid:gid on Linux, root on macOS. |
| close(remove=False) (349-391) | Stops the container by default (fast resume); remove=True deletes it. Idempotent and exception-safe. The lifespan shutdown hook calls it with remove=False. |
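The naming, timeout-wrapping, and output-capping behaviour in the table can be sketched roughly as follows. This is an illustration rather than the code in docker.py: the helper names, the sh -c wrapping, the byte-based limit, and the truncation marker text are all assumptions; only the md5(workspace)[:8] suffix, the timeout -k 5s prefix, and the 256 KB head + tail cap come from the table.

```python
import hashlib
import shlex


def container_suffix(workspace: str) -> str:
    """First 8 hex chars of md5(workspace), per the __init__ row."""
    return hashlib.md5(workspace.encode("utf-8")).hexdigest()[:8]


def wrap_with_timeout(cmd: str, timeout: int = 300) -> str:
    """Prefix cmd with coreutils timeout.

    -k 5s sends SIGKILL 5 s after the initial SIGTERM if the command
    is still alive. Passing cmd through `sh -c` is an assumption.
    """
    return f"timeout -k 5s {timeout}s sh -c {shlex.quote(cmd)}"


def cap_output(text: str, limit: int = 256 * 1024) -> str:
    """Keep the head and tail of oversized output and drop the middle."""
    data = text.encode("utf-8", errors="replace")
    if len(data) <= limit:
        return text
    half = limit // 2
    # errors="ignore" drops a multi-byte character split at the cut point.
    head = data[:half].decode("utf-8", errors="ignore")
    tail = data[-half:].decode("utf-8", errors="ignore")
    dropped = len(data) - 2 * half
    # Hypothetical marker; the real wording is not given in the source.
    return f"{head}\n... [{dropped} bytes truncated] ...\n{tail}"
```

Keeping both ends is a common choice for tool output, since build and test logs tend to put the invocation at the top and the error summary at the bottom. Note that in this sketch the marker is added on top of the limit, so the returned string is slightly larger than 256 KB.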
Provisioned packages
Preferred: build ghost-agent-base:latest in one shot from the authoritative Dockerfile; the runtime wrapper then only verifies and reuses it.
scripts/build_sandbox_image.sh
# reads sandbox/Dockerfile, runs a Chromium smoke test at the end,
# ~5 min on a warm docker cache
- apt: sudo · git · nodejs · npm · postgresql-client · tor · ripgrep · sqlite3 · libpq-dev.
- pip: numpy · pandas · scipy · matplotlib · seaborn · plotly · scikit-learn · yfinance · beautifulsoup4 · networkx · requests · pylint · black · mypy · bandit · dill · ipykernel · jupyter_client · pytest · pytest-asyncio · psycopg2-binary · asyncpg · sqlalchemy · tabulate · sqlglot · playwright · html2text · lxml · pysocks.
- browser: Chromium installed via python3 -m playwright install --with-deps chromium. The --with-deps flag is the key: it installs the system libs (libnss3, libatk1.0-0, libdrm2, …) that headless_shell dynamically links against. Without them, the binary exists but can't launch; self-play would rediscover this at task time and burn ~100 s re-installing via playwright install chromium (no deps) before failing again. Baked into the Dockerfile so it can't be skipped.
Provision marker is versioned (/root/.supercharged.v2)
The runtime gate in ensure_running checks TWO things, not one:
- test -f /root/.supercharged.v2: the version-bumped marker. Legacy .supercharged (v1) images, which may have been committed before --with-deps was in the bootstrap, are treated as un-provisioned, forcing a clean reinstall on next boot.
- _chromium_binary_present(): find /root/.cache/ms-playwright -type f \( -name headless_shell -o -name chrome \) -print -quit. Defends against the silent-failure mode where a prior install exited 0 but the binary isn't actually on disk (network flake, mid-extract disk-full, etc.).
If the marker is present but the binary is missing, the gate logs ⚠ Provision marker present but Chromium binary missing. Reinstalling... and re-runs the full install. Post-install, _chromium_binary_present() is checked AGAIN before the marker is touched — fail-loud: the v2 marker is never set unless the binary actually verifies on disk, so a failed install cannot silently poison future boots.
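The two-part gate and the post-install check can be sketched as below. This is a minimal illustration, not the code in ensure_running: the function names ensure_provisioned and chromium_binary_present, the injected run and install callables, and the RuntimeError are assumptions made so the logic can be exercised without Docker. The marker path, the find command, and the warning text are taken from the description above.

```python
from typing import Callable, Tuple

MARKER = "/root/.supercharged.v2"
FIND_CHROMIUM = (
    "find /root/.cache/ms-playwright -type f "
    r"\( -name headless_shell -o -name chrome \) -print -quit"
)

# Hypothetical executor type: takes a shell command, returns (output, exit_code).
Exec = Callable[[str], Tuple[str, int]]


def chromium_binary_present(run: Exec) -> bool:
    """True only if find actually printed a path to a Chromium binary."""
    out, code = run(FIND_CHROMIUM)
    return code == 0 and bool(out.strip())


def ensure_provisioned(run: Exec, install: Callable[[], None]) -> bool:
    """Run the gate; return True if an install was performed."""
    _, code = run(f"test -f {MARKER}")
    marker_ok = code == 0
    if marker_ok and chromium_binary_present(run):
        return False  # both checks pass: reuse the image as-is
    if marker_ok:
        print("⚠ Provision marker present but Chromium binary missing. Reinstalling...")
    install()
    # Fail loud: verify again BEFORE touching the marker.
    if not chromium_binary_present(run):
        raise RuntimeError("Chromium binary missing after install; marker not set")
    run(f"touch {MARKER}")
    return True
```

Because the marker is written only after the second probe succeeds, an install that exits cleanly but leaves no binary on disk raises instead of leaving a marker behind.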
Bump the marker to .v3 (in both sandbox/Dockerfile AND sandbox/docker.py) any time the install surface genuinely changes. The meta-test tests/test_sandbox_chromium_gate.py::test_v2_marker_gate_name_pinned asserts the two paths stay in sync.
Covered by tests/test_sandbox_chromium_gate.py (12 cases including binary-present/absent probes, legacy-marker handling, post-install verification refusing to mark a broken image, and Dockerfile-level assertions about --with-deps).
Resource caps
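The memory, CPU, and network settings listed in the Lifecycle table map onto docker run flags roughly as in this sketch. The function names and the "4g" string format for GHOST_SANDBOX_MEM are assumptions; --memory, --cpu-quota, and --network are standard docker run options. The "≈ 2 vCPU" figure follows from Docker's default CFS period of 100000 µs, assuming the period is not overridden.

```python
import os
import sys
from typing import List, Mapping, Optional


def approx_vcpus(quota_us: int, period_us: int = 100_000) -> float:
    """CPU quota divided by the CFS period gives the effective core count."""
    return quota_us / period_us


def resource_flags(
    env: Optional[Mapping[str, str]] = None,
    platform: str = sys.platform,
) -> List[str]:
    """Build docker run flags from the GHOST_SANDBOX_* environment variables."""
    env = os.environ if env is None else env
    mem = env.get("GHOST_SANDBOX_MEM", "4g")  # value format is an assumption
    quota = int(env.get("GHOST_SANDBOX_CPU_QUOTA", "200000"))
    flags = ["--memory", mem, "--cpu-quota", str(quota)]
    # Host networking on Linux, bridge on macOS/Windows, per the table.
    network = "host" if platform.startswith("linux") else "bridge"
    flags += ["--network", network]
    return flags
```

For example, resource_flags({}, "linux") yields the defaults with host networking, and approx_vcpus(200000) evaluates to 2.0.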
Tor
If tor_proxy is set in the constructor, the container starts a tor daemon and outbound HTTP from inside the sandbox is routed through it. The native browser tool additionally forces DNS-over-SOCKS via Chromium's --host-resolver-rules="MAP * ~NOTFOUND , EXCLUDE localhost" and disables non-proxied WebRTC (--webrtc-ip-handling-policy=disable_non_proxied_udp), so the browser path cannot leak DNS or the host IP even if the LLM forgets to configure the proxy correctly.
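As a rough sketch of how the browser flags above might be assembled: the function name chromium_tor_args, the socks5:// URL format for tor_proxy, and the use of Chromium's --proxy-server switch are assumptions; the other two flags are copied from the paragraph above. The shell quotes around the resolver rule are omitted because the values are passed as an argv list rather than through a shell.

```python
from typing import List, Optional


def chromium_tor_args(tor_proxy: Optional[str]) -> List[str]:
    """Return extra Chromium launch args when a Tor proxy is configured."""
    if not tor_proxy:
        return []
    return [
        # Assumed: route browser traffic through the SOCKS proxy.
        f"--proxy-server={tor_proxy}",
        # From the text: fail every local DNS lookup except localhost,
        # so name resolution has to go through the proxy.
        "--host-resolver-rules=MAP * ~NOTFOUND , EXCLUDE localhost",
        # From the text: block WebRTC UDP that would bypass the proxy.
        "--webrtc-ip-handling-policy=disable_non_proxied_udp",
    ]
```

A caller could pass the result as the args list when launching Chromium, for example chromium_tor_args("socks5://127.0.0.1:9050"), where the address is a placeholder.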