# Cortex UI — Help & Reference

*This file is loaded into Inara's context at Tier 2+ so she can help Scott navigate the interface. It is also displayed in the web UI via the **?** button.*

*Last updated: 2026-03-17*

---

## Header Controls

| Button | What it does |
|---|---|
| **Sessions** | Open the sessions panel — list, resume, or start sessions |
| **Files** | Open the identity file editor (SOUL, MEMORY, etc.) |
| **⚙ N** | Open the Settings panel (N = current context tier) |
| **?** | Open this help panel |

The **⚙ Settings** panel contains all configuration options:

| Section | Controls |
|---|---|
| **Context Tier** | T1 – T4 context depth |
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
| **Distill Memory** | Manually trigger short / mid / long / all distillation |
| **Backend** | Active LLM backend — click to toggle claude ↔ gemini |
| **Display** | Aa/A+/A− font size cycle · ☾/☀ theme toggle |

All header settings (theme, font size, tier, memory layers) persist in `localStorage` across page refreshes.

---

## Chat

- **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
- **Stop:** Click **Stop** to cancel an in-progress response at any time.
- **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
- **Delete a message:** Hover over any message → click **del**. This removes it from the session history.
- **Copy a response:** Hover over any assistant message → click **copy**.
- **New line while typing:** `Enter` or `Shift+Enter` (in `Ctrl+Enter` mode), or `Shift+Enter` only (in `Enter` mode, where plain `Enter` sends).

---

## Sessions

Sessions are named conversation threads that persist across page refreshes.

- Click **Sessions** → **+ New** to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear with `nct_*`-prefixed IDs.
- A blue **●** badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.

---

## Notes

Notes are injected into a session without triggering an LLM response.

- Click **Note** to toggle note mode. The input border changes colour.
- **Private note** (amber border) — visible only in the UI, never sent to the LLM.
- **Context note** (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the `private / public` label to switch between note types.

---

## Backends

- **Claude CLI** and **Gemini CLI** are both available. One is primary, the other is fallback.
- Click **⚙** → **Backend** to toggle between `claude` and `gemini` as the primary.
- If the primary fails or times out, the fallback is used automatically. A **⚡** notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.

---

## Nextcloud Talk Bot

Inara is registered as a bot in Nextcloud Talk.

- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
- Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.

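
The acknowledge-first behaviour of the Talk webhook can be sketched with plain `asyncio`. This is a minimal illustration under stated assumptions, not Cortex's actual handler: `handle_webhook`, `process_message`, `generate_reply`, and the `replies` list are hypothetical names, and a short sleep stands in for the LLM call. The point is only that the status code is returned before the reply exists; running `demo()` shows a `200` status with zero replies at acknowledgement time and one reply once the background task finishes.

```python
import asyncio

# Hypothetical stand-ins: none of these names come from Cortex itself.
replies = []          # collects replies so the sketch is observable
_background = set()   # holds task references so they are not garbage-collected


async def generate_reply(message: str) -> str:
    """Placeholder for the slow LLM call."""
    await asyncio.sleep(0.05)
    return f"Inara: received '{message}'"


async def process_message(message: str) -> None:
    """Do the slow work and record the reply (stands in for posting to Talk)."""
    replies.append(await generate_reply(message))


async def handle_webhook(message: str) -> int:
    """Acknowledge at once; the reply is produced later by a background task."""
    task = asyncio.create_task(process_message(message))
    _background.add(task)
    task.add_done_callback(_background.discard)
    return 200  # sent back before the LLM call has even started


async def demo() -> tuple:
    replies.clear()
    status = await handle_webhook("hello")
    replies_at_ack = len(replies)        # still 0: the work has not run yet
    await asyncio.gather(*_background)   # wait for the background work
    return status, replies_at_ack, len(replies)


if __name__ == "__main__":
    print(asyncio.run(demo()))  # prints (200, 0, 1)
```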
---

## Files (Identity Editor)

The **Files** button opens an editor for Inara's identity and memory files:

| File | Purpose |
|---|---|
| `SOUL.md` | Core personality, values, and voice |
| `IDENTITY.md` | Role, capabilities, and context |
| `USER.md` | Scott's profile, preferences, and history |
| `PROTOCOLS.md` | Behavioural rules and communication protocols |
| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
| `MEMORY_LONG.md` | Permanent curated long-term memory |
| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
| `HELP.md` | This file |

Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.

---

## Context & Memory (⚙ panel)

### Context Tiers

Controls how much context is prepended to each LLM call:

| Tier | Loads | ~Tokens |
|---|---|---|
| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **T3** | + last 2 raw session logs | ~15,000 |
| **T4** | + last 7 raw session logs | ~50,000 |

Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.

### Memory Layers

Three independently toggleable memory files, loaded **Long → Mid → Short** (short sits closest to the conversation turn for better LLM recall):

| Layer | File | Contents |
|---|---|---|
| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, Scott's profile highlights |
| **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session log files |

Toggle any layer off to save tokens for a focused conversation where history isn't needed.

### Memory Distillation (manual)

Distillation builds up the memory layers from raw session logs.
Currently **manual** — trigger via the ⚙ panel:

| Button | What it does |
|---|---|
| **short** | Rolls recent session log files → `MEMORY_SHORT.md` (fast, no LLM) |
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
| **all** | Runs short → mid → long in sequence |

**Recommended workflow:**

- Run **short** after any productive session to capture it.
- Run **mid** weekly to distil short → mid.
- Run **long** monthly to absorb mid into permanent memory.

Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BUDGET_MID`, `MEMORY_BUDGET_SHORT`).

---

## Keyboard Shortcuts

| Keys | Action |
|---|---|
| `Ctrl+Enter` | Send message (default mode) |
| `Enter` | Send (when in Enter mode) |
| `Shift+Enter` | New line in message input |
| `Ctrl+Enter` | Save inline message edit |
| `Esc` | Cancel inline edit |
| `Ctrl+S` | Save file (Files modal) |
| `Esc` | Close any open modal |

---

## API Reference

For direct access or scripting:

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message — returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |
| `GET` | `/events` | SSE stream for real-time Talk activity |
| `POST` | `/note` | Inject a context note into a session |
| `GET` | `/files` | List identity files |
| `GET` | `/files/{name}` | Read a file |
| `PUT` | `/files/{name}` | Write a file |
| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
| `POST` | `/distill/all` | Run all three distillation steps |
| `GET` | `/distill/status` | Show scheduler status and next run times |
| `GET` | `/health` | Health check — returns `{"status": "ok"}` |

Chat request body (`POST /chat`):

```json
{
  "message": "string",
  "session_id": "string | null",
  "tier": 1,
  "model": "claude | gemini | null",
  "include_long": true,
  "include_mid": true,
  "include_short": true
}
```

---

## In Progress / Planned

- **Auto memory distillation (long)** — short and mid run automatically; long-term integration is off by default (set `AUTO_DISTILL_LONG=true` in `.env` to enable)
- **Ollama local model backend** — direct Ollama API support (no CLI wrapper)
- **OAuth token auto-refresh notifications** — ✓ implemented: amber banner shown when Claude CLI token is within 24h of expiry (check `GET /auth/status`)
- **Multi-user support** — per-user identity/memory files; currently single-user (Scott)

---

*Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent.*

*Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.*