Cortex-Inara/inara/HELP.md
Scott Idem 4253e69c0b Add auto memory distillation scheduler (APScheduler)
- scheduler.py: AsyncIOScheduler with three cron jobs
    short  daily     03:00 (no LLM, always fast)
    mid    weekly    Sun 03:30 (LLM)
    long   monthly   1st 04:00 (LLM — off by default)
- config.py: AUTO_DISTILL, AUTO_DISTILL_SHORT/MID/LONG .env flags
- main.py: start/stop scheduler in FastAPI lifespan
- routers/distill.py: GET /distill/status — next run times + config
- requirements.txt: apscheduler>=3.10
- HELP.md: updated planned items, added /distill/status to API table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 22:31:38 -04:00


Cortex UI — Help & Reference

This file is loaded into Inara's context at Tier 2+ so she can help Scott navigate the interface. It is also displayed in the web UI via the ? button.

Last updated: 2026-03-17


Header Controls

| Button | What it does |
| --- | --- |
| Sessions | Open the sessions panel: list, resume, or start sessions |
| Files | Open the identity file editor (SOUL, MEMORY, etc.) |
| ⚙ N | Open the Context & Memory panel (N = current tier) |
| claude / gemini | Active backend; click to toggle primary |
| Aa / A+ / A | Cycle font size: normal (16px) → large (18px) → small (14px) |
| ☾ / ☀ | Toggle dark / light theme |
| ? | Open this help panel |

All header settings (theme, font size, tier, memory layers) persist in localStorage across page refreshes.


Chat

  • Send: Ctrl+Enter by default. Click ⌃↵ in the input controls to toggle to plain Enter mode.
  • Stop: Click Stop to cancel an in-progress response at any time.
  • Edit a message: Hover over any message → click edit. Ctrl+Enter saves, Esc cancels.
  • Delete a message: Hover over any message → click del. Removes from session history.
  • Copy a response: Hover over any assistant message → click copy.
  • New line while typing: Shift+Enter works in both modes; in Ctrl+Enter mode, plain Enter also inserts a new line.

Sessions

Sessions are named conversation threads that persist across page refreshes.

  • Click Sessions → + New to start a fresh session.
  • Click any listed session to resume it — full history loads instantly.
  • Sessions from Nextcloud Talk appear as nct_* prefixed IDs.
  • A blue badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.

Notes

Notes are injected into a session without triggering an LLM response.

  • Click Note to toggle note mode. The input border changes colour.
  • Private note (amber border) — visible only in the UI, never sent to the LLM.
  • Context note (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
  • Click the private / public label to switch between note types.

Backends

  • Claude CLI and Gemini CLI are both available. One is primary, the other is fallback.
  • Click the backend button (claude or gemini) to switch which is primary.
  • If the primary fails or times out, the fallback is used automatically. A notice appears in the chat when this happens.
  • Timeouts: Claude 60s, Gemini 120s.
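The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Cortex's actual code: `ask_with_fallback`, the `Backend` type, and the way backends are passed in as async callables are all assumptions; only the backend names and the 60s / 120s timeouts come from this file.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Timeouts taken from the list above; the names here are illustrative.
TIMEOUTS = {"claude": 60.0, "gemini": 120.0}

# Hypothetical shape of a backend: an async callable from prompt to reply.
Backend = Callable[[str], Awaitable[str]]


async def ask_with_fallback(
    prompt: str,
    backends: dict,
    primary: str = "claude",
    timeouts: dict = TIMEOUTS,
) -> tuple:
    """Return (backend_name, reply), trying the primary first, then the other."""
    fallback = next(name for name in backends if name != primary)
    last_error: Optional[Exception] = None
    for name in (primary, fallback):
        try:
            reply = await asyncio.wait_for(
                backends[name](prompt), timeout=timeouts[name]
            )
            return name, reply
        except Exception as exc:  # a timeout from wait_for lands here too
            last_error = exc
    raise RuntimeError("both backends failed") from last_error
```

Returning the backend name alongside the reply is what would let a UI show a notice when the fallback was used.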

Nextcloud Talk Bot

Inara is registered as a bot in Nextcloud Talk.

  • Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
  • The webhook returns 200 OK immediately; the LLM call and reply happen asynchronously.
  • Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
  • To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
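The "acknowledge now, reply later" pattern from the list above might look something like this with plain asyncio. It is a sketch under assumptions: `handle_talk_webhook`, the `process` callable, and the payload shape are hypothetical, and the real handler may well use FastAPI's `BackgroundTasks` or another mechanism instead.

```python
import asyncio


async def handle_talk_webhook(payload: dict, process, background: set) -> int:
    """Acknowledge immediately; run the slow LLM call and reply in the background."""
    task = asyncio.create_task(process(payload))
    background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    return 200  # sent back to Talk before the LLM call has finished
```

The `background` set only exists to hold references to in-flight tasks; each task removes itself once it completes.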

Files (Identity Editor)

The Files button opens an editor for Inara's identity and memory files:

| File | Purpose |
| --- | --- |
| SOUL.md | Core personality, values, and voice |
| IDENTITY.md | Role, capabilities, and context |
| USER.md | Scott's profile, preferences, and history |
| PROTOCOLS.md | Behavioural rules and communication protocols |
| CONTEXT_TIERS.md | Defines what gets loaded at each context tier |
| MEMORY_LONG.md | Permanent curated long-term memory |
| MEMORY_MID.md | Rolling mid-term digest (LLM-distilled) |
| MEMORY_SHORT.md | Recent session rollup (auto-aggregated) |
| HELP.md | This file |

Toggle preview / edit to switch between rendered markdown and raw text. Ctrl+S saves, Esc closes.


Context & Memory (⚙ panel)

Context Tiers

Controls how much context is prepended to each LLM call:

| Tier | Loads | ~Tokens |
| --- | --- | --- |
| T1 | SOUL + IDENTITY + USER summary | ~1,500 |
| T2 | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| T3 | + last 2 raw session logs | ~15,000 |
| T4 | + last 7 raw session logs | ~50,000 |

Default is T2. Use T1 for small/local models. Use T3 or T4 for complex multi-session tasks.
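One way to read the tier table is as a cumulative selection, sketched below. The function name `context_parts` is hypothetical, and the sketch assumes that "USER full" replaces "USER summary" from T2 upward and that T4's seven logs replace T3's two rather than adding to them; the actual loader may differ.

```python
def context_parts(tier: int) -> list:
    """Return the labels of what is prepended to an LLM call at a given tier."""
    if tier not in (1, 2, 3, 4):
        raise ValueError(f"unknown tier: {tier}")
    # T1 is the base; from T2 up the full USER file is assumed to replace the summary.
    parts = ["SOUL", "IDENTITY", "USER summary" if tier == 1 else "USER full"]
    if tier >= 2:
        parts += ["PROTOCOLS", "HELP", "memory layers"]
    if tier >= 3:
        # Assumed: T4 widens the log window instead of stacking on T3's.
        parts.append("last 7 raw session logs" if tier == 4 else "last 2 raw session logs")
    return parts
```

For example, `context_parts(1)` gives only the three identity entries, which is why T1 suits small or local models.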

Memory Layers

Three independently toggleable memory files, loaded Long → Mid → Short (short sits closest to the conversation turn for better LLM recall):

| Layer | File | Contents |
| --- | --- | --- |
| Long | MEMORY_LONG.md | Permanent facts: origin, key decisions, Scott's profile highlights |
| Mid | MEMORY_MID.md | Rolling digest of recent weeks, LLM-distilled from Short |
| Short | MEMORY_SHORT.md | Recent session rollup, auto-aggregated from session log files |

Toggle any layer off to save tokens for a focused conversation where history isn't needed.
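The load order and toggles can be illustrated with a small sketch. The flag names mirror the `include_long` / `include_mid` / `include_short` fields of the chat request body; the function `memory_files` itself is hypothetical.

```python
# Fixed load order from the text: Long, then Mid, then Short (closest to the turn).
LAYER_FILES = [
    ("long", "MEMORY_LONG.md"),
    ("mid", "MEMORY_MID.md"),
    ("short", "MEMORY_SHORT.md"),
]


def memory_files(include_long=True, include_mid=True, include_short=True) -> list:
    """Return the enabled memory files in the order they would be loaded."""
    enabled = {"long": include_long, "mid": include_mid, "short": include_short}
    return [filename for layer, filename in LAYER_FILES if enabled[layer]]
```

Disabling a layer drops its file without disturbing the relative order of the others.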

Memory Distillation (manual)

Distillation builds up the memory layers from raw session logs. To run a step manually, use the buttons in the ⚙ panel:

| Button | What it does |
| --- | --- |
| short | Rolls recent session log files → MEMORY_SHORT.md (fast, no LLM) |
| mid | LLM summarizes MEMORY_SHORT.md → MEMORY_MID.md |
| long | LLM integrates MEMORY_MID.md → MEMORY_LONG.md |
| all | Runs short → mid → long in sequence |

Recommended workflow:

  • Run short after any productive session to capture it.
  • Run mid weekly to distil short → mid.
  • Run long monthly to absorb mid into permanent memory.

Token budgets for each layer are set in .env (MEMORY_BUDGET_LONG, MEMORY_BUDGET_MID, MEMORY_BUDGET_SHORT).
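The same steps are exposed as POST endpoints under /distill/ (see the API Reference), so they can also be triggered from a script. The sketch below uses only the standard library; the base URL is an assumption, as is sending an empty request body, and the helper names are illustrative.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: adjust to wherever Cortex is served
STEPS = {"short", "mid", "long", "all"}


def distill_request(step: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for one distillation step."""
    if step not in STEPS:
        raise ValueError(f"unknown distillation step: {step}")
    # An empty body is assumed; the endpoint may accept or require parameters.
    return urllib.request.Request(f"{base_url}/distill/{step}", data=b"", method="POST")


def run_distill(step: str, base_url: str = BASE_URL) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(distill_request(step, base_url)) as resp:
        return resp.status
```

Calling `run_distill("short")` after a productive session would correspond to the first step of the recommended workflow.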


Keyboard Shortcuts

| Keys | Action |
| --- | --- |
| Ctrl+Enter | Send message (default mode) |
| Enter | Send (when in Enter mode) |
| Shift+Enter | New line in message input |
| Ctrl+Enter | Save inline message edit |
| Esc | Cancel inline edit |
| Ctrl+S | Save file (Files modal) |
| Esc | Close any open modal |

API Reference

For direct access or scripting:

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /chat | Send a message; returns SSE stream |
| GET | /backend | Get current primary/fallback backends |
| POST | /backend | Set primary backend ({"primary": "claude"}) |
| GET | /sessions | List all sessions |
| GET | /history/{id} | Get session message history |
| PUT | /history/{id} | Replace full session history |
| GET | /events | SSE stream for real-time Talk activity |
| POST | /note | Inject a context note into a session |
| GET | /files | List identity files |
| GET | /files/{name} | Read a file |
| PUT | /files/{name} | Write a file |
| POST | /distill/short | Aggregate session logs → MEMORY_SHORT |
| POST | /distill/mid | Summarize short → MEMORY_MID (LLM) |
| POST | /distill/long | Integrate mid → MEMORY_LONG (LLM) |
| POST | /distill/all | Run all three distillation steps |
| GET | /distill/status | Show scheduler status and next run times |
| GET | /health | Health check; returns {"status": "ok"} |

Chat request body (POST /chat):

{
  "message": "string",
  "session_id": "string | null",
  "tier": 1,
  "model": "claude | gemini | null",
  "include_long": true,
  "include_mid": true,
  "include_short": true
}
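A scripted call to POST /chat could be sketched as below. The field names follow the request body above; the base URL, the helper names, and the assumption that each SSE event carries its text on `data:` lines are illustrative, since the event payload format is not documented here.

```python
import json
import urllib.request
from typing import Iterable, Iterator, Optional

BASE_URL = "http://localhost:8000"  # assumption: adjust to wherever Cortex is served


def build_chat_request(
    message: str,
    session_id: Optional[str] = None,
    tier: int = 2,
    model: Optional[str] = None,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build a POST /chat request using the documented body fields."""
    body = {
        "message": message,
        "session_id": session_id,
        "tier": tier,
        "model": model,
        "include_long": include_long,
        "include_mid": include_mid,
        "include_short": include_short,
    }
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )


def iter_sse_data(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the payload of each 'data:' line; comments and blank lines are skipped."""
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one optional space after the colon is dropped.
            yield value[1:] if value.startswith(" ") else value
```

To stream a reply, the response object from `urllib.request.urlopen(build_chat_request("Hello"))` can be passed straight to `iter_sse_data`, since it iterates line by line.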

In Progress / Planned

  • Auto memory distillation (long) — short and mid run automatically; long-term integration is off by default (set AUTO_DISTILL_LONG=true in .env to enable)
  • Ollama local model backend — direct Ollama API support (no CLI wrapper)
  • OAuth token auto-refresh notifications — alert when Claude CLI token is near expiry
  • Multi-user support — per-user identity/memory files; currently single-user (Scott)
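For the auto distillation item, the flag handling might be sketched as follows. The flag names come from the commit message for this file (AUTO_DISTILL, AUTO_DISTILL_SHORT/MID/LONG); everything else is an assumption, including treating AUTO_DISTILL as a master switch that defaults to on, the accepted truthy spellings, and the function name `enabled_jobs`.

```python
import os
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings; config.py may differ

# Short and mid run automatically; long is off by default (per the item above).
# The default for AUTO_DISTILL itself is an assumption.
DEFAULTS = {
    "AUTO_DISTILL": True,
    "AUTO_DISTILL_SHORT": True,
    "AUTO_DISTILL_MID": True,
    "AUTO_DISTILL_LONG": False,
}


def _flag(env: Mapping, name: str) -> bool:
    """Read a boolean flag, falling back to its default when unset."""
    raw = env.get(name)
    if raw is None:
        return DEFAULTS[name]
    return raw.strip().lower() in TRUTHY


def enabled_jobs(env: Mapping = os.environ) -> list:
    """Return which distillation jobs a scheduler would register."""
    if not _flag(env, "AUTO_DISTILL"):
        return []
    return [
        layer
        for layer in ("short", "mid", "long")
        if _flag(env, f"AUTO_DISTILL_{layer.upper()}")
    ]
```

With no flags set this yields the short and mid jobs only; adding AUTO_DISTILL_LONG=true brings in the monthly long job.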

Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent. Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.