
Scott Idem 0afa135ce9 docs: document System block and OTR mode in ARCH__PERSONA.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-09 14:40:21 -04:00

5.3 KiB


Architecture: Persona System & Memory

How Inara (and other personas) know who they are and what they remember.

Last updated: 2026-05-09

Filesystem Layout

Each persona lives in home/{username}/persona/{name}/:

home/scott/persona/inara/
  IDENTITY.md       Who Inara is — role, name, origin
  SOUL.md           Values, personality, voice, what she cares about
  PROTOCOLS.md      Behavioral rules — how she responds, what she avoids
  CONTEXT_TIERS.md  Documents which files load at each tier
  USER.md           Scott's profile — loaded into context so she knows who she's talking to
  HELP.md           Persona-specific help content (appended to shared HELP.md in UI)
  MEMORY_SHORT.md   Recent session digest (auto-distilled daily)
  MEMORY_MID.md     Mid-term summary (auto-distilled weekly)
  MEMORY_LONG.md    Long-term memory (auto-distilled monthly)
  REMINDERS.md      Pending reminders (auto-surfaced at tier 2+)
  SCRATCH.md        Ephemeral scratchpad (read/write via tools)
  TASKS.json        Personal task list (managed via tools)
  CRONS.json        Scheduled jobs (managed via tools)
  sessions/         Session turn logs — YYYY-MM-DD.md, one file per day

ContextVars: persona.py sets _user and _persona ContextVars per request. Everything downstream calls persona_path() to resolve the right directory — no globals, no thread-local state.
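The per-request lookup described above can be sketched as follows. This is a minimal illustration, not the actual persona.py: the `HOME_ROOT` constant, the `set_request_context()` helper, and the optional `filename` argument are assumptions made for the example. Only `_user`, `_persona`, and `persona_path()` are named in this document.

```python
from contextvars import ContextVar
from pathlib import Path

# Names taken from this document; the real definitions may differ.
_user = ContextVar("_user")
_persona = ContextVar("_persona")

HOME_ROOT = Path("home")  # assumed root of the per-user tree


def set_request_context(user, persona):
    """Hypothetical helper: bind the user and persona for the current request."""
    _user.set(user)
    _persona.set(persona)


def persona_path(filename=""):
    """Resolve a path inside the active persona's directory.

    Raises LookupError if no request context has been set.
    """
    base = HOME_ROOT / _user.get() / "persona" / _persona.get()
    return base / filename if filename else base
```

Because each asyncio task or thread sees its own ContextVar values, two concurrent requests for different personas resolve different directories without passing the user and persona through every call.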

Context Tiers

Each chat request specifies a tier (default: 2). Higher tiers load more context — slower but richer.

Tier	Loaded Files	Use case
1	IDENTITY.md	Minimal — lightweight tasks
2	+ SOUL.md, PROTOCOLS.md, USER.md, MEMORY_SHORT.md, MEMORY_MID.md, REMINDERS.md	Standard chat
3	+ MEMORY_LONG.md, CONTEXT_TIERS.md	Deep sessions, long tasks
4	+ SCRATCH.md, TASKS.json	Full state — agent mode

context_loader.py assembles the system prompt from these files in order. The resulting prompt is passed to whichever LLM backend handles the request.
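As a rough sketch of how cumulative tiers could be assembled: the file lists below are copied from the table, but the function names, the `--- NAME ---` section separator, and the choice to skip missing files are assumptions, not the behavior of the real context_loader.py.

```python
from pathlib import Path

# Files added at each tier, copied from the tier table.
TIER_FILES = {
    1: ["IDENTITY.md"],
    2: ["SOUL.md", "PROTOCOLS.md", "USER.md",
        "MEMORY_SHORT.md", "MEMORY_MID.md", "REMINDERS.md"],
    3: ["MEMORY_LONG.md", "CONTEXT_TIERS.md"],
    4: ["SCRATCH.md", "TASKS.json"],
}


def files_for_tier(tier=2):
    """Return the cumulative file list for a tier, in load order."""
    names = []
    for level in sorted(TIER_FILES):
        if level <= tier:
            names.extend(TIER_FILES[level])
    return names


def build_prompt(persona_dir, tier=2):
    """Concatenate the tier's files, skipping any that do not exist yet."""
    sections = []
    for name in files_for_tier(tier):
        path = Path(persona_dir) / name
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"--- {name} ---\n{body}")  # separator is assumed
    return "\n\n".join(sections)
```

Skipping missing files keeps a brand-new persona usable before the distiller has produced any MEMORY_*.md files.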

System Block

Before any persona files, context_loader.py prepends a --- System --- block with per-request metadata:

--- System ---
Current date and time: Saturday, 2026-05-09 at 02:34 PM EDT
Current mode: Off The Record — this conversation is private and will not be logged or included in memory distillation

The date/time line is present unless the role sets inject_datetime: false. The mode line is added only when the session is Off The Record; normal Chat mode adds nothing, so the block stays minimal. This follows the same principle as the mode indicator in the UI: signal only when something non-default is in effect.
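The rules for the System block might look like this in code. The function name, its parameters, and the separate `tz_label` argument are hypothetical; only the `--- System ---` header, the two line prefixes, and the `inject_datetime` flag come from this document.

```python
from datetime import datetime


def build_system_block(now, tz_label, mode_line=None, inject_datetime=True):
    """Build the per-request metadata block that precedes the persona files.

    mode_line is passed only when a non-default mode (such as Off The Record)
    is active, so the default case adds no mode line at all.
    """
    lines = ["--- System ---"]
    if inject_datetime:
        # %A and %p follow the process locale; English is assumed here.
        stamp = now.strftime("%A, %Y-%m-%d at %I:%M %p")
        lines.append(f"Current date and time: {stamp} {tz_label}")
    if mode_line:
        lines.append(f"Current mode: {mode_line}")
    return "\n".join(lines)
```

Calling it with `mode_line=None` and `inject_datetime=False` yields only the header line, which matches the "stay minimal" intent described above.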

Memory Distillation

Three-tier rolling memory system, run by APScheduler:

sessions/YYYY-MM-DD.md  ← raw session logs (written by session_logger.py)
        ↓ daily 03:00
MEMORY_SHORT.md         ← recent session digest (no LLM — pure aggregation)
        ↓ weekly Sun 03:30
MEMORY_MID.md           ← concise summary (LLM)
        ↓ monthly 1st 04:00
MEMORY_LONG.md          ← integrated long-term memory (LLM)

Short distill — reads the most recent session files that fit within the token budget, writes them in chronological order. No LLM involved — fast and cheap.
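A minimal sketch of the short distill selection, under two assumptions that this document does not confirm: tokens are estimated at roughly four characters each, and the budget is treated as a hard stop rather than a soft cap. The function names and the `## date` header format are also illustrative.

```python
from pathlib import Path


def estimate_tokens(text):
    """Rough heuristic: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def select_sessions(sessions_dir, budget=3000):
    """Pick the newest session logs that fit the budget, returned oldest first."""
    chosen = []
    used = 0
    # YYYY-MM-DD.md names sort chronologically, so reverse order is newest first.
    for path in sorted(Path(sessions_dir).glob("*.md"), reverse=True):
        cost = estimate_tokens(path.read_text(encoding="utf-8"))
        if used + cost > budget:
            break
        chosen.append(path)
        used += cost
    return list(reversed(chosen))


def distill_short(sessions_dir, out_file, budget=3000):
    """Write the selected logs to the digest file, one dated section per day."""
    parts = []
    for path in select_sessions(sessions_dir, budget):
        body = path.read_text(encoding="utf-8").strip()
        parts.append(f"## {path.stem}\n{body}")
    Path(out_file).write_text("\n\n".join(parts) + "\n", encoding="utf-8")
```

Walking newest-first and reversing at the end gives priority to recent sessions while still producing a digest that reads in chronological order.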

Mid distill — LLM summarizes MEMORY_SHORT into a concise digest. Prompt asks for recurring themes, decisions, ongoing projects, Scott's current state and priorities. Written in first person as Inara.

Long distill — LLM integrates MEMORY_MID into MEMORY_LONG. Rules: preserve historical facts, update stale info, absorb new themes, remove irrelevant entries.

Distill notifications — after mid and long runs, notification.py sends a message to the user's configured NC Talk notification room (if notification_room is set in channels.json).

Controls in .env:

AUTO_DISTILL=true
AUTO_DISTILL_SHORT=true
AUTO_DISTILL_MID=true
AUTO_DISTILL_LONG=true          # default is off (shown enabled here); first run warrants manual review
DISTILL_BACKEND_MID=local       # use local model to save API credits
DISTILL_BACKEND_LONG=           # empty = primary backend (claude recommended)
MEMORY_BUDGET_SHORT=3000        # token budgets (soft caps)
MEMORY_BUDGET_MID=2000
MEMORY_BUDGET_LONG=2000
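One way these settings could be read once they are loaded into the process environment is sketched below. The helper names are hypothetical, and the sketch assumes that AUTO_DISTILL acts as a master switch over the per-tier flags, which this document implies but does not state.

```python
import os


def env_flag(name, default=False):
    """Interpret an environment variable as a boolean."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def distill_enabled(tier):
    """Assumed rule: a tier runs only if the master and tier switches are both on."""
    return env_flag("AUTO_DISTILL") and env_flag(f"AUTO_DISTILL_{tier.upper()}")


def distill_backend(tier, primary="primary"):
    """An empty or unset DISTILL_BACKEND_<TIER> falls back to the primary backend."""
    value = os.environ.get(f"DISTILL_BACKEND_{tier.upper()}", "").strip()
    return value or primary
```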

Manual distill via API:

POST /distill/short
POST /distill/mid
POST /distill/long
GET  /distill/status
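A manual trigger could be scripted with the standard library, as in the sketch below. The base URL is a placeholder, and any authentication the server requires is not covered here because this document does not describe it.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical; use your deployment's address


def distill_request(tier):
    """Build (but do not send) a POST request for one distill tier."""
    if tier not in {"short", "mid", "long"}:
        raise ValueError(f"unknown distill tier: {tier!r}")
    return urllib.request.Request(
        f"{BASE_URL}/distill/{tier}", data=b"", method="POST"
    )
```

To actually run a distill, pass the result to `urllib.request.urlopen()`; building the request separately makes it easy to add headers first.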

Adding a New Persona

persona_template.py bootstraps a new persona directory from string templates. The onboarding flow (/setup/persona) calls this when a new user creates their first persona.

To add one manually:

1. Create home/{username}/persona/{name}/
2. Copy and edit the files from an existing persona (e.g. home/scott/persona/inara/)
3. At minimum: IDENTITY.md, SOUL.md, PROTOCOLS.md, USER.md
4. The distiller will create the MEMORY_*.md files on first run
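The manual steps above can be scripted. The sketch below copies only the minimum file set from an existing persona; `clone_persona()` is a hypothetical helper and is unrelated to how persona_template.py generates files from string templates.

```python
import shutil
from pathlib import Path

# Minimum file set listed in the manual steps.
REQUIRED = ["IDENTITY.md", "SOUL.md", "PROTOCOLS.md", "USER.md"]


def clone_persona(source, target):
    """Copy the minimum files from an existing persona into a new directory.

    Fails if the target already exists, so an existing persona is never
    overwritten. Returns the names that were copied.
    """
    source, target = Path(source), Path(target)
    target.mkdir(parents=True, exist_ok=False)
    copied = []
    for name in REQUIRED:
        src = source / name
        if src.is_file():
            shutil.copy2(src, target / name)
            copied.append(name)
    return copied
```

The copied files still describe the source persona, so each one needs editing by hand before the new persona is used.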

Session Search

Past sessions are searchable via GET /sessions/search?q=...&user=...&persona=....

Available in the UI via the search box at the bottom of the Files panel (open with the Files button). Results are grouped by date with highlighted excerpts.
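When calling the search endpoint from a script, the query string should be URL-encoded. In the sketch below the base URL is a placeholder and `search_url()` is a hypothetical helper; the three parameter names come from the endpoint shown above.

```python
from urllib.parse import urlencode


def search_url(base_url, query, user, persona):
    """Build the session search URL with properly encoded parameters."""
    params = urlencode({"q": query, "user": user, "persona": persona})
    return f"{base_url}/sessions/search?{params}"
```

The response format is not documented here, so the sketch stops at building the URL.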

Active Personas

User	Persona	Description
scott	inara	Scott's primary assistant
scott	developer	Dev-focused persona
holly	tina	Holly's primary assistant
brian	wintermute	Brian's primary assistant
