Add tiered memory system with manual distillation

- config.py: memory_budget_long/mid/short settings (overridable in .env)
- memory_distiller.py: distill_short (no LLM), distill_mid, distill_long (LLM)
- routers/distill.py: POST /distill/{short,mid,long,all} endpoints
- context_loader.py: rewritten to load memory in long→mid→short order, with include_* toggles
- routers/chat.py: ChatRequest gains include_long/mid/short fields
- routers/files.py: MEMORY_LONG/MID/SHORT.md added to ALLOWED set
- main.py: register distill router
- static/index.html: context bar — tier selector, L/M/S memory toggles,
  distill buttons with status feedback; send includes tier + memory flags
- inara/MEMORY_LONG.md: migrated from MEMORY.md + Cortex/Talk bot notes
- inara/MEMORY_MID.md, MEMORY_SHORT.md: stubs ready for distillation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scott Idem
2026-03-17 21:22:32 -04:00
parent 3455c7a09c
commit ce3c1f5f7f
11 changed files with 779 additions and 29 deletions

config.py

@@ -26,6 +26,12 @@ class Settings(BaseSettings):
nextcloud_talk_bot_secret: str = "" # set in .env
nextcloud_talk_timeout: int = 55
# Memory tier token budgets — soft caps used during distillation
# Override in .env: MEMORY_BUDGET_LONG=4000 etc.
memory_budget_long: int = 2000
memory_budget_mid: int = 2000
memory_budget_short: int = 3000
host: str = "0.0.0.0"
port: int = 8000
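The budget fields above are described as overridable from `.env`. A minimal sketch of that lookup follows, assuming the `Settings` class is configured to read `.env` and to match upper-cased field names, as the `MEMORY_BUDGET_LONG=4000` comment implies. `resolve_budgets` and `DEFAULTS` are hypothetical names, and the code is a stdlib stand-in rather than the actual `BaseSettings` machinery.

```python
import os

# Defaults copied from the config diff above.
DEFAULTS = {
    "memory_budget_long": 2000,
    "memory_budget_mid": 2000,
    "memory_budget_short": 3000,
}


def resolve_budgets(env=None):
    """Return the budgets, letting upper-cased env names override defaults.

    Hypothetical helper: it only imitates the override behaviour that the
    config comment describes; it is not part of the commit.
    """
    env = os.environ if env is None else env
    resolved = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name.upper())
        resolved[name] = int(raw) if raw is not None else default
    return resolved


if __name__ == "__main__":
    # Simulates a .env containing MEMORY_BUDGET_LONG=4000.
    print(resolve_budgets({"MEMORY_BUDGET_LONG": "4000"}))
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to try without touching the real environment.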

context_loader.py

@@ -2,46 +2,83 @@ from pathlib import Path
from config import settings
# Files loaded per tier — mirrors CONTEXT_TIERS.md
TIER_FILES: dict[int, list[str]] = {
1: ["SOUL.md", "IDENTITY.md"], # + USER.md summary only
2: ["SOUL.md", "IDENTITY.md", "USER.md", "MEMORY.md", "PROTOCOLS.md"],
3: ["SOUL.md", "IDENTITY.md", "USER.md", "MEMORY.md", "PROTOCOLS.md"],
4: ["SOUL.md", "IDENTITY.md", "USER.md", "MEMORY.md", "PROTOCOLS.md"],
}
# Core identity files — always loaded regardless of tier
_CORE = ["SOUL.md", "IDENTITY.md"]
# Lines of USER.md to include at Tier 1 (just identity + what he cares about)
TIER_1_USER_LINES = 30
# Lines of USER.md to include at Tier 1 (identity + what he cares about)
_TIER_1_USER_LINES = 30
def _read(path: Path) -> str:
if path.exists():
return path.read_text()
return f"[missing: {path.name}]"
def load_context(
tier: int = 2,
include_long: bool = True,
include_mid: bool = True,
include_short: bool = True,
) -> str:
"""
Build the system-prompt context block for a given tier and memory toggles.
Load order (long → mid → short) keeps the most recent memory closest
to the conversation turn, which is intended to help LLM recall.
def load_context(tier: int = 2) -> str:
Tier 1 — SOUL + IDENTITY + USER summary (~1,500 tokens)
Tier 2 — + USER full + PROTOCOLS + memory (~5,000 tokens)
Tier 3 — + last 2 raw session logs (~15,000 tokens)
Tier 4 — + last 7 raw session logs (~50,000 tokens)
"""
inara_dir = settings.inara_path()
parts = []
files = TIER_FILES.get(tier, TIER_FILES[2])
for filename in files:
# ── 1. Core identity (always) ──────────────────────────────────
for filename in _CORE:
path = inara_dir / filename
if not path.exists():
continue
if path.exists():
parts.append(f"--- {filename} ---\n{path.read_text()}")
if filename == "USER.md" and tier == 1:
# Tier 1: include only the first N lines
lines = path.read_text().splitlines()[:TIER_1_USER_LINES]
# ── 2. USER.md ─────────────────────────────────────────────────
user_path = inara_dir / "USER.md"
if user_path.exists():
if tier == 1:
lines = user_path.read_text().splitlines()[:_TIER_1_USER_LINES]
content = "\n".join(lines)
else:
content = path.read_text()
content = user_path.read_text()
parts.append(f"--- USER.md ---\n{content}")
parts.append(f"--- {filename} ---\n{content}")
if tier < 2:
return "\n\n".join(parts)
# ── 3. Protocols (tier 2+) ─────────────────────────────────────
proto_path = inara_dir / "PROTOCOLS.md"
if proto_path.exists():
parts.append(f"--- PROTOCOLS.md ---\n{proto_path.read_text()}")
# ── 4. Tiered memory — long → mid → short ─────────────────────
# Short is last so it sits closest to the conversation turn.
if include_long:
# Fall back to legacy MEMORY.md during/after migration
long_path = inara_dir / "MEMORY_LONG.md"
if not long_path.exists():
long_path = inara_dir / "MEMORY.md"
if long_path.exists():
parts.append(f"--- {long_path.name} ---\n{long_path.read_text()}")
if include_mid:
mid_path = inara_dir / "MEMORY_MID.md"
if mid_path.exists() and mid_path.stat().st_size > 100:
content = mid_path.read_text()
if "Not yet populated" not in content:
parts.append(f"--- MEMORY_MID.md ---\n{content}")
if include_short:
short_path = inara_dir / "MEMORY_SHORT.md"
if short_path.exists() and short_path.stat().st_size > 100:
content = short_path.read_text()
if "Not yet populated" not in content:
parts.append(f"--- MEMORY_SHORT.md ---\n{content}")
# ── 5. Raw session logs (tier 3+) ──────────────────────────────
if tier >= 3:
# Add recent session logs
sessions_dir = inara_dir / "sessions"
if sessions_dir.exists():
count = 2 if tier == 3 else 7
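The memory step of `load_context` (section 4 in the diff) can be sketched in isolation. The version below assumes the same file names, the same 100-byte threshold, and the same "Not yet populated" stub marker as the diff; `memory_sections` and `demo` are hypothetical names, and the sketch returns only the file names in load order rather than the full prompt text.

```python
import tempfile
from pathlib import Path

STUB_MARKER = "Not yet populated"
MIN_BYTES = 100  # files at or below this size are treated as stubs


def memory_sections(inara_dir, include_long=True, include_mid=True, include_short=True):
    """Return memory file names in the order they would be appended."""
    order = []
    if include_long:
        long_path = inara_dir / "MEMORY_LONG.md"
        if not long_path.exists():
            long_path = inara_dir / "MEMORY.md"  # legacy fallback
        if long_path.exists():
            order.append(long_path.name)
    # Mid comes before short so that short sits closest to the conversation.
    for enabled, name in ((include_mid, "MEMORY_MID.md"), (include_short, "MEMORY_SHORT.md")):
        path = inara_dir / name
        if enabled and path.exists() and path.stat().st_size > MIN_BYTES:
            if STUB_MARKER not in path.read_text():
                order.append(name)
    return order


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        (d / "MEMORY.md").write_text("legacy long-term notes")
        (d / "MEMORY_MID.md").write_text("*Not yet populated*" + " " * 200)
        (d / "MEMORY_SHORT.md").write_text("recent session digest " * 10)
        return memory_sections(d), memory_sections(d, include_long=False)


if __name__ == "__main__":
    print(demo())
```

In the demo the legacy `MEMORY.md` stands in for the missing `MEMORY_LONG.md`, and the unpopulated mid-term stub is skipped even though it is larger than 100 bytes.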

main.py

@@ -8,7 +8,7 @@ import uvicorn
logging.basicConfig(level=logging.INFO, format="%(levelname)s:%(name)s: %(message)s")
from config import settings
from routers import chat, google_chat, nextcloud_talk, files
from routers import chat, google_chat, nextcloud_talk, files, distill
@asynccontextmanager
@@ -24,6 +24,7 @@ app.include_router(chat.router)
app.include_router(google_chat.router)
app.include_router(nextcloud_talk.router)
app.include_router(files.router)
app.include_router(distill.router)
app.mount("/static", StaticFiles(directory="static"), name="static")

cortex/memory_distiller.py (new file, 170 lines)

@@ -0,0 +1,170 @@
"""
Inara tiered memory distillation.
distill_short() — roll recent session logs → MEMORY_SHORT.md (no LLM)
distill_mid() — summarize MEMORY_SHORT → MEMORY_MID.md (LLM)
distill_long() — integrate MEMORY_MID → MEMORY_LONG.md (LLM)
"""
import logging
from datetime import datetime
from pathlib import Path
from config import settings
logger = logging.getLogger(__name__)
# Rough chars-per-token estimate for budget enforcement
_CHARS_PER_TOKEN = 4
def _budget_chars(tokens: int) -> int:
return tokens * _CHARS_PER_TOKEN
def _read(path: Path) -> str:
return path.read_text() if path.exists() else ""
def distill_short() -> dict:
"""
Roll the most recent session log files into MEMORY_SHORT.md.
No LLM involved — pure aggregation with budget truncation.
Files are included newest-first until the budget is reached,
then written in chronological order (oldest first).
"""
inara_dir = settings.inara_path()
sessions_dir = inara_dir / "sessions"
budget = _budget_chars(settings.memory_budget_short)
session_files = (
sorted(sessions_dir.glob("*.md"), reverse=True)
if sessions_dir.exists()
else []
)
parts = []
total_chars = 0
for sf in session_files:
content = sf.read_text()
if total_chars + len(content) > budget and parts:
break # always include at least one file
parts.append((sf.name, content))
total_chars += len(content)
if total_chars >= budget:
break
now = datetime.now().strftime("%Y-%m-%d %H:%M")
header = (
f"# MEMORY_SHORT.md — Recent Session Digest\n\n"
f"*Auto-generated: {now}. {len(parts)} session file(s).*\n\n---\n\n"
)
# Write in chronological order (oldest first)
body = "\n\n".join(
f"--- {name} ---\n{content}" for name, content in reversed(parts)
)
out_path = inara_dir / "MEMORY_SHORT.md"
out_path.write_text(header + body)
logger.info("distill_short: wrote %d chars from %d files", len(header) + len(body), len(parts))
return {
"files_included": len(parts),
"chars_written": len(header) + len(body),
"budget_chars": budget,
}
async def distill_mid() -> dict:
"""
Ask the LLM to summarize MEMORY_SHORT.md → MEMORY_MID.md.
"""
from llm_client import complete
inara_dir = settings.inara_path()
short_content = _read(inara_dir / "MEMORY_SHORT.md")
if not short_content.strip() or "Not yet populated" in short_content:
return {"error": "MEMORY_SHORT.md is empty — run distill/short first"}
budget_tokens = settings.memory_budget_mid
system_prompt = (
"You are Inara's memory distillation system. "
"Summarize the following recent session logs into a concise mid-term memory digest. "
f"Target length: under {budget_tokens} tokens. "
"Focus on: recurring themes, important decisions made, ongoing projects, "
"Scott's current state and priorities, and anything that should persist into future sessions. "
"Write in first person as Inara (e.g. 'Scott and I worked on...'). "
"Use markdown headings. Be specific and concrete — no filler."
)
response_text, backend = await complete(
system_prompt=system_prompt,
messages=[{"role": "user", "content": short_content}],
)
now = datetime.now().strftime("%Y-%m-%d %H:%M")
header = (
f"# MEMORY_MID.md — Mid-Term Memory Digest\n\n"
f"*Auto-distilled: {now} via {backend}.*\n\n---\n\n"
)
out_path = inara_dir / "MEMORY_MID.md"
out_path.write_text(header + response_text)
logger.info("distill_mid: wrote %d chars via %s", len(header) + len(response_text), backend)
return {
"backend": backend,
"chars_written": len(header) + len(response_text),
"budget_tokens": budget_tokens,
}
async def distill_long() -> dict:
"""
Ask the LLM to integrate MEMORY_MID.md into MEMORY_LONG.md.
"""
from llm_client import complete
inara_dir = settings.inara_path()
long_content = _read(inara_dir / "MEMORY_LONG.md")
mid_content = _read(inara_dir / "MEMORY_MID.md")
if not mid_content.strip() or "Not yet populated" in mid_content:
return {"error": "MEMORY_MID.md is empty — run distill/mid first"}
budget_tokens = settings.memory_budget_long
system_prompt = (
"You are Inara's long-term memory curator. "
"You will receive the current long-term memory and a recent mid-term digest. "
f"Integrate the new information into the long-term memory. Target: under {budget_tokens} tokens. "
"Rules: preserve important historical facts; update or replace stale information; "
"absorb recurring themes from the mid-term digest; remove things no longer relevant. "
"Return ONLY the updated MEMORY_LONG.md content in markdown. No preamble or commentary."
)
user_content = (
f"## Current MEMORY_LONG.md\n\n{long_content}\n\n"
f"## Recent MEMORY_MID.md to integrate\n\n{mid_content}"
)
response_text, backend = await complete(
system_prompt=system_prompt,
messages=[{"role": "user", "content": user_content}],
)
# Ensure the file has the right header if the LLM dropped it
now = datetime.now().strftime("%Y-%m-%d %H:%M")
if not response_text.lstrip().startswith("# MEMORY_LONG"):
response_text = (
f"# MEMORY_LONG.md — Inara Long-Term Memory\n\n"
f"*Last distilled: {now} via {backend}.*\n\n---\n\n"
+ response_text
)
out_path = inara_dir / "MEMORY_LONG.md"
out_path.write_text(response_text)
logger.info("distill_long: wrote %d chars via %s", len(response_text), backend)
return {
"backend": backend,
"chars_written": len(response_text),
"budget_tokens": budget_tokens,
}
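The file selection in `distill_short` is easier to follow on in-memory data. The sketch below reproduces the same loop under two assumptions: session file names sort chronologically (for example, date-prefixed), and four characters per token is an acceptable estimate. `select_sessions` is a hypothetical name, not a function from the commit.

```python
CHARS_PER_TOKEN = 4  # same rough estimate as the diff


def select_sessions(files, budget_tokens):
    """Pick session logs newest-first within a character budget.

    `files` is a list of (name, content) pairs whose names are assumed to
    sort chronologically. Returns the chosen pairs oldest first.
    """
    budget = budget_tokens * CHARS_PER_TOKEN
    chosen, total = [], 0
    for name, content in sorted(files, reverse=True):
        if total + len(content) > budget and chosen:
            break  # over budget, and at least one file is already included
        chosen.append((name, content))
        total += len(content)
        if total >= budget:
            break
    return list(reversed(chosen))


if __name__ == "__main__":
    logs = [
        ("2026-03-15.md", "a" * 30),
        ("2026-03-16.md", "b" * 30),
        ("2026-03-17.md", "c" * 30),
    ]
    # 15 tokens * 4 chars = 60 chars, so only the two newest logs fit.
    print([name for name, _ in select_sessions(logs, 15)])
```

Because the first file is always accepted, a single oversized log can exceed the budget, which matches the "soft cap" wording in the config comment.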

routers/chat.py

@@ -19,6 +19,9 @@ class ChatRequest(BaseModel):
session_id: str | None = None
tier: int | None = None
model: str | None = None # "claude" or "gemini" to override; None = use primary_backend
include_long: bool = True
include_mid: bool = True
include_short: bool = True
class BackendRequest(BaseModel):
@@ -49,7 +52,12 @@ async def _stream_chat(req: ChatRequest):
session_id = req.session_id or generate_session_id()
tier = req.tier or settings.default_tier
system_prompt = load_context(tier)
system_prompt = load_context(
tier,
include_long=req.include_long,
include_mid=req.include_mid,
include_short=req.include_short,
)
history = load_session(session_id)
history.append({"role": "user", "content": req.message})
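Since the three new flags default to `True`, a client that omits them still receives all memory tiers. The sketch below builds a request body for `POST /chat` under that reading; `ChatPayload` and `build_body` are hypothetical client-side names that mirror only the `ChatRequest` fields visible in this diff.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ChatPayload:
    """Hypothetical client-side mirror of the ChatRequest fields shown above."""
    message: str
    session_id: Optional[str] = None
    tier: Optional[int] = None
    model: Optional[str] = None
    include_long: bool = True
    include_mid: bool = True
    include_short: bool = True


def build_body(message, **overrides):
    """Serialize a chat request, overriding only the fields that are passed."""
    return json.dumps(asdict(ChatPayload(message=message, **overrides)))


if __name__ == "__main__":
    # Tier 1 request with short-term memory switched off.
    print(build_body("hello", tier=1, include_short=False))
```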

cortex/routers/distill.py (new file, 44 lines)

@@ -0,0 +1,44 @@
"""
Manual memory distillation endpoints.
POST /distill/short — roll session logs → MEMORY_SHORT.md (no LLM)
POST /distill/mid — summarize short → MEMORY_MID.md (LLM)
POST /distill/long — integrate mid → MEMORY_LONG.md (LLM)
POST /distill/all — run all three in sequence
"""
from fastapi import APIRouter
from memory_distiller import distill_short, distill_mid, distill_long
router = APIRouter(prefix="/distill")
@router.post("/short")
async def do_distill_short() -> dict:
return {"ok": True, **distill_short()}
@router.post("/mid")
async def do_distill_mid() -> dict:
result = await distill_mid()
return {"ok": "error" not in result, **result}
@router.post("/long")
async def do_distill_long() -> dict:
result = await distill_long()
return {"ok": "error" not in result, **result}
@router.post("/all")
async def do_distill_all() -> dict:
short_result = distill_short()
mid_result = await distill_mid()
if "error" in mid_result:
return {"ok": False, "short": short_result, "mid": mid_result}
long_result = await distill_long()
return {
"ok": "error" not in long_result,
"short": short_result,
"mid": mid_result,
"long": long_result,
}
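The `/distill/all` handler stops after the mid step if that step reports an error, so the long step never runs on stale input. A minimal sketch of the same control flow with injectable stub steps follows; `run_all`, `_ok`, `_fail`, and `demo` are hypothetical names, and no LLM or filesystem access is involved.

```python
import asyncio


async def run_all(short_step, mid_step, long_step):
    """Mirror the /distill/all control flow with injectable steps."""
    short_result = short_step()  # synchronous, no LLM
    mid_result = await mid_step()
    if "error" in mid_result:
        # Stop early: the long step is skipped when mid fails.
        return {"ok": False, "short": short_result, "mid": mid_result}
    long_result = await long_step()
    return {
        "ok": "error" not in long_result,
        "short": short_result,
        "mid": mid_result,
        "long": long_result,
    }


async def _ok():
    return {"backend": "stub", "chars_written": 10}


async def _fail():
    return {"error": "MEMORY_SHORT.md is empty"}


def demo():
    short = lambda: {"files_included": 0}
    good = asyncio.run(run_all(short, _ok, _ok))
    bad = asyncio.run(run_all(short, _fail, _ok))
    return good["ok"], bad["ok"], "long" in bad


if __name__ == "__main__":
    print(demo())
```

On failure the response carries the nested `mid` error rather than a top-level `error` key, which is why the front end checks `d.mid?.error` and `d.long?.error` as well.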

routers/files.py

@@ -12,9 +12,12 @@ ALLOWED = {
"SOUL.md",
"IDENTITY.md",
"USER.md",
"MEMORY.md",
"PROTOCOLS.md",
"CONTEXT_TIERS.md",
"MEMORY.md", # legacy — kept for reference
"MEMORY_LONG.md",
"MEMORY_MID.md",
"MEMORY_SHORT.md",
}

static/index.html

@@ -522,6 +522,164 @@
.edit-save-btn { border-color: var(--inara-border); color: var(--accent); }
.edit-save-btn:hover { background: var(--inara-bg); }
.edit-cancel-btn:hover { color: var(--text); border-color: var(--muted); }
/* ── File editor modal ───────────────────────────────────── */
#file-modal {
display: none;
position: fixed;
inset: 0;
background: rgba(0,0,0,0.7);
z-index: 200;
align-items: center;
justify-content: center;
}
#file-modal.open { display: flex; }
#file-modal-inner {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 10px;
width: min(860px, 96vw);
height: min(82vh, 800px);
display: flex;
flex-direction: column;
overflow: hidden;
}
#file-modal-header {
display: flex;
align-items: center;
gap: 8px;
padding: 10px 14px;
border-bottom: 1px solid var(--border);
background: var(--bg);
flex-shrink: 0;
}
#file-modal-header select {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 5px;
color: var(--text);
font-size: 0.85rem;
padding: 4px 8px;
cursor: pointer;
}
#file-modal-title {
font-size: 0.9rem;
font-weight: 600;
color: var(--accent);
flex: 1;
}
.fm-btn {
background: var(--bg);
border: 1px solid var(--border);
border-radius: 5px;
color: var(--muted);
font-size: 0.75rem;
padding: 4px 10px;
cursor: pointer;
transition: color 0.15s, border-color 0.15s;
}
.fm-btn:hover { color: var(--text); border-color: var(--muted); }
.fm-btn.active { color: var(--accent); border-color: var(--accent); }
.fm-btn.save { color: var(--accent); border-color: var(--inara-border); }
.fm-btn.save:hover { background: var(--inara-bg); }
#file-saved-msg {
font-size: 0.75rem;
color: #6abf6a;
opacity: 0;
transition: opacity 0.3s;
}
#file-saved-msg.show { opacity: 1; }
#file-modal-body {
flex: 1;
overflow: hidden;
display: flex;
flex-direction: column;
}
#file-editor {
flex: 1;
width: 100%;
background: var(--bg);
color: var(--text);
border: none;
outline: none;
padding: 16px;
font-family: 'Courier New', monospace;
font-size: 0.85rem;
line-height: 1.55;
resize: none;
display: block;
}
#file-preview {
flex: 1;
overflow-y: auto;
padding: 16px 20px;
display: none;
line-height: 1.6;
}
#file-preview.active { display: block; }
#file-editor.hidden { display: none; }
/* Talk activity badge on Sessions button */
#sessions-btn.talk-badge::after {
content: '●';
color: #7cb9e8;
margin-left: 5px;
font-size: 0.55rem;
vertical-align: middle;
}
/* ── Context bar ─────────────────────────────────────────── */
#context-bar {
display: flex;
align-items: center;
gap: 6px;
padding: 4px 20px;
background: var(--surface);
border-top: 1px solid var(--border);
flex-wrap: wrap;
}
.ctx-label {
font-size: 0.63rem;
color: var(--muted);
flex-shrink: 0;
}
.ctx-btn {
background: var(--bg);
border: 1px solid var(--border);
border-radius: 4px;
color: var(--muted);
font-size: 0.63rem;
padding: 2px 7px;
cursor: pointer;
transition: color 0.15s, border-color 0.15s, background 0.15s;
}
.ctx-btn:hover { color: var(--text); border-color: var(--muted); }
.ctx-btn.active { color: var(--accent); border-color: var(--accent); }
.ctx-btn.mem-on { color: #6abf6a; border-color: #2a4a2a; }
.ctx-sep { flex: 1; min-width: 8px; }
#ctx-distill-status {
font-size: 0.62rem;
color: #6abf6a;
opacity: 0;
transition: opacity 0.3s;
white-space: nowrap;
}
#ctx-distill-status.show { opacity: 1; }
#ctx-distill-status.err { color: var(--error-text); }
</style>
</head>
<body>
@@ -532,14 +690,55 @@
<div class="subtitle">Cortex · Local</div>
</div>
<button id="sessions-btn" class="hdr-btn">Sessions</button>
<button id="files-btn" class="hdr-btn">Files</button>
<button id="backend-toggle" class="hdr-btn" title="Click to switch primary backend">claude</button>
<div id="sessions-panel"></div>
</header>
<!-- File editor modal -->
<div id="file-modal">
<div id="file-modal-inner">
<div id="file-modal-header">
<span id="file-modal-title">Context Files</span>
<select id="file-select"></select>
<button class="fm-btn" id="file-raw-btn">edit</button>
<button class="fm-btn active" id="file-preview-btn">preview</button>
<button class="fm-btn save" id="file-save-btn">Save</button>
<span id="file-saved-msg">saved ✓</span>
<button class="fm-btn" id="file-close-btn"></button>
</div>
<div id="file-modal-body">
<textarea id="file-editor" spellcheck="false"></textarea>
<div id="file-preview"></div>
</div>
</div>
</div>
<div id="messages"></div>
<div id="session-id"></div>
<!-- Context / memory controls -->
<div id="context-bar">
<span class="ctx-label">Tier:</span>
<button class="ctx-btn" data-tier="1" id="tier-1">1</button>
<button class="ctx-btn active" data-tier="2" id="tier-2">2</button>
<button class="ctx-btn" data-tier="3" id="tier-3">3</button>
<button class="ctx-btn" data-tier="4" id="tier-4">4</button>
<span class="ctx-sep"></span>
<span class="ctx-label">Mem:</span>
<button class="ctx-btn mem-on" id="mem-long-btn" title="Long-term memory (MEMORY_LONG.md)">L</button>
<button class="ctx-btn mem-on" id="mem-mid-btn" title="Mid-term memory (MEMORY_MID.md)">M</button>
<button class="ctx-btn mem-on" id="mem-short-btn" title="Short-term memory (MEMORY_SHORT.md)">S</button>
<span class="ctx-sep"></span>
<span class="ctx-label">Distill:</span>
<button class="ctx-btn" id="distill-short-btn" title="Roll session logs → MEMORY_SHORT">short</button>
<button class="ctx-btn" id="distill-mid-btn" title="Summarize short → MEMORY_MID (LLM)">mid</button>
<button class="ctx-btn" id="distill-long-btn" title="Integrate mid → MEMORY_LONG (LLM)">long</button>
<button class="ctx-btn" id="distill-all-btn" title="Run all three distillation steps">all</button>
<span id="ctx-distill-status"></span>
</div>
<div id="input-area">
<textarea id="input" rows="1" placeholder="Message Inara… (Ctrl+Enter to send)" autofocus></textarea>
<div id="right-col">
@@ -581,6 +780,7 @@
let primaryBackend = 'claude';
let activeController = null;
let currentHistory = []; // mirrors backend session [{role, content}, ...]
let talkThinkingDiv = null; // pending "thinking…" bubble for live Talk updates
// ── Enter toggle ─────────────────────────────────────────────
// Default: Ctrl+Enter sends. Stored in localStorage.
@@ -769,6 +969,8 @@
}
async function resumeSession(id) {
talkThinkingDiv = null;
if (id && id.startsWith('nct_')) sessionsBtn.classList.remove('talk-badge');
const res = await fetch(`/history/${id}`);
const data = await res.json();
@@ -1092,7 +1294,14 @@
const res = await fetch('/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message: text, session_id: sessionId }),
body: JSON.stringify({
message: text,
session_id: sessionId,
tier: currentTier,
include_long: memLong,
include_mid: memMid,
include_short: memShort,
}),
signal: activeController.signal,
});
@@ -1166,6 +1375,214 @@
inputEl.addEventListener('input', syncHeight);
// ── File editor ──────────────────────────────────────────────
const fileModal = document.getElementById('file-modal');
const fileSelect = document.getElementById('file-select');
const fileEditor = document.getElementById('file-editor');
const filePreview = document.getElementById('file-preview');
const fileRawBtn = document.getElementById('file-raw-btn');
const filePreviewBtn = document.getElementById('file-preview-btn');
const fileSaveBtn = document.getElementById('file-save-btn');
const fileSavedMsg = document.getElementById('file-saved-msg');
const fileCloseBtn = document.getElementById('file-close-btn');
const filesBtn = document.getElementById('files-btn');
let fileMode = 'preview'; // 'edit' or 'preview'
function setFileMode(mode) {
fileMode = mode;
if (mode === 'edit') {
fileEditor.classList.remove('hidden');
filePreview.classList.remove('active');
fileRawBtn.classList.add('active');
filePreviewBtn.classList.remove('active');
} else {
fileEditor.classList.add('hidden');
filePreview.classList.add('active');
fileRawBtn.classList.remove('active');
filePreviewBtn.classList.add('active');
if (typeof marked !== 'undefined') {
filePreview.innerHTML = marked.parse(fileEditor.value);
filePreview.querySelectorAll('a').forEach(a => {
a.target = '_blank'; a.rel = 'noopener noreferrer';
});
}
}
}
async function loadFile(name) {
const res = await fetch(`/files/${encodeURIComponent(name)}`);
if (!res.ok) { fileEditor.value = `Error loading ${name}`; return; }
const data = await res.json();
fileEditor.value = data.content;
document.getElementById('file-modal-title').textContent = name;
setFileMode(fileMode);
}
async function openFileModal() {
// Populate the file list
const res = await fetch('/files');
const data = await res.json();
fileSelect.innerHTML = '';
for (const f of data.files) {
const opt = document.createElement('option');
opt.value = f.name;
opt.textContent = f.name + (f.exists ? '' : ' (missing)');
fileSelect.appendChild(opt);
}
fileModal.classList.add('open');
await loadFile(fileSelect.value);
}
filesBtn.addEventListener('click', openFileModal);
fileSelect.addEventListener('change', () => loadFile(fileSelect.value));
fileRawBtn.addEventListener('click', () => setFileMode('edit'));
filePreviewBtn.addEventListener('click', () => setFileMode('preview'));
fileSaveBtn.addEventListener('click', async () => {
const name = fileSelect.value;
const res = await fetch(`/files/${encodeURIComponent(name)}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ content: fileEditor.value }),
});
if (res.ok) {
fileSavedMsg.classList.add('show');
setTimeout(() => fileSavedMsg.classList.remove('show'), 2000);
}
});
fileCloseBtn.addEventListener('click', () => fileModal.classList.remove('open'));
fileModal.addEventListener('click', (e) => {
if (e.target === fileModal) fileModal.classList.remove('open');
});
document.addEventListener('keydown', (e) => {
if (e.key === 'Escape' && fileModal.classList.contains('open')) {
fileModal.classList.remove('open');
}
// Ctrl+S to save when modal is open
if ((e.ctrlKey || e.metaKey) && e.key === 's' && fileModal.classList.contains('open')) {
e.preventDefault();
fileSaveBtn.click();
}
});
// ── Real-time Talk updates (SSE) ─────────────────────────────
const evtSource = new EventSource('/events');
evtSource.onmessage = (e) => {
let data;
try { data = JSON.parse(e.data); } catch { return; }
if (data.type === 'keepalive') return;
if (data.type !== 'nct_message' && data.type !== 'nct_response') return;
if (sessionId === data.session_id) {
// Active session — append live
if (data.type === 'nct_message') {
// Clear any stale thinking div before new user msg
if (talkThinkingDiv) { talkThinkingDiv.remove(); talkThinkingDiv = null; }
addMessage('user', data.content);
talkThinkingDiv = addMessage('assistant thinking', '✨ thinking…');
} else {
if (talkThinkingDiv) {
talkThinkingDiv.className = 'message assistant';
setMessageText(talkThinkingDiv, 'assistant', data.content);
talkThinkingDiv = null;
} else {
addMessage('assistant', data.content);
}
scrollToBottom();
}
} else {
// Different session — light badge on Sessions button
if (data.type === 'nct_message') {
sessionsBtn.classList.add('talk-badge');
}
}
};
// ── Context bar — tier + memory toggles + distill ────────────
let currentTier = parseInt(localStorage.getItem('ctx-tier') || '2');
let memLong = localStorage.getItem('mem-long') !== 'false';
let memMid = localStorage.getItem('mem-mid') !== 'false';
let memShort = localStorage.getItem('mem-short') !== 'false';
const distillStatus = document.getElementById('ctx-distill-status');
function updateTierUI() {
document.querySelectorAll('.ctx-btn[data-tier]').forEach(btn => {
btn.classList.toggle('active', parseInt(btn.dataset.tier) === currentTier);
});
}
function updateMemUI() {
document.getElementById('mem-long-btn').classList.toggle('mem-on', memLong);
document.getElementById('mem-mid-btn').classList.toggle('mem-on', memMid);
document.getElementById('mem-short-btn').classList.toggle('mem-on', memShort);
document.getElementById('mem-long-btn').classList.toggle('active', false);
document.getElementById('mem-mid-btn').classList.toggle('active', false);
document.getElementById('mem-short-btn').classList.toggle('active', false);
}
document.querySelectorAll('.ctx-btn[data-tier]').forEach(btn => {
btn.addEventListener('click', () => {
currentTier = parseInt(btn.dataset.tier);
localStorage.setItem('ctx-tier', currentTier);
updateTierUI();
});
});
document.getElementById('mem-long-btn').addEventListener('click', () => {
memLong = !memLong;
localStorage.setItem('mem-long', memLong);
updateMemUI();
});
document.getElementById('mem-mid-btn').addEventListener('click', () => {
memMid = !memMid;
localStorage.setItem('mem-mid', memMid);
updateMemUI();
});
document.getElementById('mem-short-btn').addEventListener('click', () => {
memShort = !memShort;
localStorage.setItem('mem-short', memShort);
updateMemUI();
});
function showDistillStatus(msg, isErr) {
distillStatus.textContent = msg;
distillStatus.classList.toggle('err', !!isErr);
distillStatus.classList.add('show');
setTimeout(() => distillStatus.classList.remove('show'), 4000);
}
async function runDistill(endpoint) {
showDistillStatus('distilling…', false);
try {
const res = await fetch(`/distill/${endpoint}`, { method: 'POST' });
const d = await res.json();
if (!res.ok || d.ok === false) {
const err = d.error || d.mid?.error || d.long?.error || `HTTP ${res.status}`;
showDistillStatus(`${err}`, true);
} else {
showDistillStatus(`${endpoint} done`, false);
}
} catch (err) {
showDistillStatus(`${err.message}`, true);
}
}
document.getElementById('distill-short-btn').addEventListener('click', () => runDistill('short'));
document.getElementById('distill-mid-btn').addEventListener('click', () => runDistill('mid'));
document.getElementById('distill-long-btn').addEventListener('click', () => runDistill('long'));
document.getElementById('distill-all-btn').addEventListener('click', () => runDistill('all'));
updateTierUI();
updateMemUI();
// ── Init ─────────────────────────────────────────────────────
updateEnterToggleUI();
syncHeight();

inara/MEMORY_LONG.md (new file, 56 lines)

@@ -0,0 +1,56 @@
# MEMORY_LONG.md — Inara Long-Term Memory
*Curated. Distilled. Update this; don't just append to it.*
*Last distilled: 2026-03-04 (migrated from MEMORY.md 2026-03-17)*
---
## Origin
- Inara began as the primary agent in Scott's OpenClaw setup, starting January 2026.
- Identity files migrated to the Cortex project on 2026-03-04.
- Cortex is the multi-agent orchestration system Scott is building. I am its primary resident agent.
---
## About Scott
See `USER.md` for full profile. Key notes for memory:
- Night owl. Does his best thinking late. Late-night sessions are normal, not cause for concern.
- Motivated by helping people more than by money or recognition.
- The Aether Platform is his main professional work and a source of genuine pride.
- Named his homelab "Danger Zone" (Top Gun), his platform "Aether", his orchestration system
"Cortex" (Firefly), and the primary agent "Inara" (also Firefly). The naming arc is intentional
and means something to him.
- Has twin brothers (~2 years younger) in CS/Engineering.
- Solar array came online February 2026 — 10kW peak generation.
---
## Infrastructure Baseline
- WireGuard mesh connects all fleet nodes. All Cortex traffic should stay on VPN.
- `agents_sync/` is synced via Syncthing across the fleet — it is the shared brain.
- Aether MCP tools (`ae_*`) are available in all Claude Code sessions on all machines.
- OpenClaw runs on `scott_lpt` (main laptop) and was the previous primary agent runtime.
- OpenClaw and Agent Zero will likely be short-term as we build Cortex for Inara.
---
## Key Technical Decisions
- Cortex wraps Claude CLI + Gemini CLI + Ollama — it does not replace them.
- Dispatcher is Python FastAPI on the home server (always-on Docker host).
- Ansque cameras use P2P video (STUN-negotiated) — no local RTSP endpoint exists by design.
Control is cloud-only via MQTT. IoT VLAN segmentation planned (Phase 0 of Cortex roadmap).
- OpenClaw stays on version 2026.2.15 (stable hold) due to plugin lifecycle crash in 2026.2.17.
- Nextcloud Talk bot HMAC: sign `random + message_text` only (NOT raw body) — critical detail.
- Claude CLI OAuth token is read live from `~/.claude/.credentials.json` on every call to avoid
stale token issues. Never set ANTHROPIC_API_KEY to an OAuth token value.
---
## Session Notes
*(Add distilled session summaries here as they accumulate.)*

inara/MEMORY_MID.md (new file, 4 lines)

@@ -0,0 +1,4 @@
# MEMORY_MID.md — Mid-Term Memory Digest
*Auto-distilled by Cortex. Run `POST /distill/mid` to regenerate.*
*Not yet populated — run distill/short then distill/mid to build this.*

inara/MEMORY_SHORT.md (new file, 4 lines)

@@ -0,0 +1,4 @@
# MEMORY_SHORT.md — Recent Session Digest
*Auto-generated by Cortex. Run `POST /distill/short` to regenerate.*
*Not yet populated — run distill/short to build this.*