tools / memory.py — memory tool wrappers
LLM-facing wrappers around the underlying memory subsystem, plus the self-play control surface.
| Tool | Backend | Behaviour |
|---|---|---|
| knowledge_base (insert_fact) | vector + optionally graph | Insert fact with MD5 dedup; if memory_bus is present, asynchronously extracts (subject, PREDICATE, object) triplets via the worker LLM. |
| knowledge_base (ingest_document) | vector | Ingest URL or local file (PDF / text). Hard caps: 100 MB on disk, 5 MB extracted text, 1000 PDF pages. Chunked via semantic_split_text or recursive_split_text. |
| knowledge_base (forget / list_docs / reset_all) | vector | Library management. |
| recall | vector | Semantic search: memory_system.collection.query(query_texts=[query], n_results=...). |
| update_profile | profile | Persist user facts via the canonical merge. |
| learn_skill | skills | Store task / mistake / solution triplet. |
| scratchpad | scratchpad | set / get / list / clear KV. |
| dream_mode | dream | Triggers an active consolidation pass. |
| self_play | dream + frontier | Runs one self-play cycle picked by the FrontierTracker. 600 s wall-clock cap. |
| self_play_loop(max_cycles=0, model="") | dream + frontier | Spawns a background asyncio.Task running self-play cycles back-to-back. Stops when the user sends any message (handle_chat sets the stop event) or stop_self_play is called. Not persisted across restarts. Optional model override; max_cycles=0 is unbounded. |
| stop_self_play | - | Signals the running loop to exit after its current cycle. No-op if no loop is running. |
| list_lessons(scope="today", limit=20) | skills | Surface learned LESSONS (mistakes-and-fixes) filtered by local-time window. scope ∈ {today, week, all, self_play_only}. Routed automatically when the user asks "what did you learn today / so far?", "show me your lessons", "show me the lesson playbook". Distinct from manage_skills: a SKILL is a tool, a LESSON is a fix; "show me your skills" routes to manage_skills, NOT here. |
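
The MD5 dedup mentioned for insert_fact can be pictured roughly as follows. This is a minimal sketch, not the module's actual code: the DedupFactStore class, its in-memory dict, and the whitespace normalisation before hashing are all assumptions made for illustration; only the idea of keying facts by an MD5 digest comes from the table.

```python
import hashlib


class DedupFactStore:
    """Toy stand-in for the vector backend; only the dedup logic is shown."""

    def __init__(self):
        self._facts = {}  # md5 hex digest -> fact text

    @staticmethod
    def fact_id(text):
        # Assumption: whitespace is normalised before hashing.
        # The real wrapper may hash the raw text instead.
        normalised = " ".join(text.split())
        return hashlib.md5(normalised.encode("utf-8")).hexdigest()

    def insert_fact(self, text):
        """Return True if the fact was stored, False if it was a duplicate."""
        key = self.fact_id(text)
        if key in self._facts:
            return False
        self._facts[key] = text
        return True


store = DedupFactStore()
print(store.insert_fact("The sky is blue."))    # stored
print(store.insert_fact("The sky   is blue."))  # same digest after normalisation
```

Keying on a content digest keeps the check cheap, at the cost of only catching exact (here, whitespace-insensitive) duplicates; paraphrased facts would still be stored twice.
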
Intent guard on self_play / self_play_loop
Both tools refuse to run unless the current turn's user text explicitly asks for self-play, a practice round, or a training cycle. handle_chat stashes the turn's message on context.last_user_content right after parsing the request body, and _user_asked_for_self_play(context) matches it against a conservative allow-list of phrases ("run self-play", "train until stopped", "practice cycle", "synthetic self-play", etc.; see _SELF_PLAY_INTENT_PHRASES). If no phrase matches, the tool returns _SELF_PLAY_INTENT_REFUSAL without invoking the Dreamer.
Rationale: in the 2026-04-24 webOS incident, the 30B-A3 model hallucinated "The user wants me to run self-play" 33 minutes into a webOS-building session in which the user had never mentioned self-play. The tool description says "Use this EVERY TIME the user asks to practice, train, or do self-play", but that is only LLM-side prose; the guard makes it enforceable.
The biological watchdog bypasses this check: phase 3 of _biological_tick calls Dreamer.synthetic_self_play directly, not through tool_self_play, so legitimate background self-play fires unaffected. Covered by tests/test_self_play_intent_guard.py.
Self-play loop internals
- Cool-off between cycles is adaptive (5–180 s clamped) via FrontierTracker.adaptive_cooldown. Falls back to a 30 s baseline if no tracker is available.
- The loop task + stop event are stashed on context.selfplay_loop_task and context.selfplay_loop_stop. handle_chat checks these on every user message and sets the stop event; the loop checks it at each cycle boundary and during the cool-off asyncio.wait_for.
- Between cycles the loop explicitly drains the short-term journal via context.agent.process_journal_queue() so hippocampus doesn't fall behind during long runs. Cheap no-op when the journal is empty.
- Inside the isolated sub-agent's context, selfplay_loop_* attributes are stripped to prevent the inner solver's handle_chat from tripping the user-message interrupt on the outer loop.
Memory bus pattern
When a MemoryBus is attached, insert_fact calls publish_fact("insert_fact", ...) rather than writing to the vector store directly. The bus fans the write out to vector + graph (and other tiers) in parallel, and a dedup-LRU gate suppresses re-publishes of any event already seen within the last 256 unique events.
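
One way to picture the gate and the fan-out is the sketch below. It is not the real MemoryBus: the class layout, the MD5 event key, the tier callables, and the choice to refresh recency on a duplicate hit are all assumptions; only the 256-event window, the publish_fact name, and the parallel fan-out come from the text.

```python
import asyncio
import hashlib
from collections import OrderedDict


class DedupLRU:
    """Remembers the last `capacity` unique event keys."""

    def __init__(self, capacity=256):
        self.capacity = capacity
        self._seen = OrderedDict()

    def first_time(self, key):
        """Return True if `key` is new within the window, and record it."""
        if key in self._seen:
            self._seen.move_to_end(key)  # assumption: duplicates refresh recency
            return False
        self._seen[key] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest key
        return True


class MemoryBus:
    """Hypothetical bus; `tiers` are async callables taking (kind, text)."""

    def __init__(self, tiers, capacity=256):
        self.tiers = list(tiers)
        self.gate = DedupLRU(capacity)

    async def publish_fact(self, kind, text):
        key = hashlib.md5(f"{kind}:{text}".encode("utf-8")).hexdigest()
        if not self.gate.first_time(key):
            return False  # duplicate within the window: skip the fan-out
        # Fan the write out to every tier concurrently.
        await asyncio.gather(*(tier(kind, text) for tier in self.tiers))
        return True
```

A bounded LRU keeps the gate's memory constant, so a fact republished after more than 256 other unique events would pass through again.
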