Cortex UI — Help & Reference
This file is loaded into Inara's context at Tier 2+ so she can help Scott navigate the interface. It is also displayed in the web UI via the ? button.
Last updated: 2026-03-17
Header Controls
| Button | What it does |
|---|---|
| Sessions | Open the sessions panel — list, resume, or start sessions |
| Files | Open the identity file editor (SOUL, MEMORY, etc.) |
| ⚙ N | Open the Context & Memory panel (N = current tier) |
| claude / gemini | Active backend — click to toggle primary |
| Aa / A+ / A− | Cycle font size: normal (16px) → large (18px) → small (14px) |
| ☾ / ☀ | Toggle dark / light theme |
| ? | Open this help panel |
All header settings (theme, font size, tier, memory layers) persist in localStorage across page refreshes.
Chat
- Send: Ctrl+Enter by default. Click ⌃↵ in the input controls to toggle to plain Enter mode.
- Stop: Click Stop to cancel an in-progress response at any time.
- Edit a message: Hover over any message → click edit. Ctrl+Enter saves, Esc cancels.
- Delete a message: Hover over any message → click del. Removes from session history.
- Copy a response: Hover over any assistant message → click copy.
- New line while typing: Shift+Enter (in Ctrl+Enter mode) or Shift+Enter / Enter (in Enter mode).
Sessions
Sessions are named conversation threads that persist across page refreshes.
- Click Sessions → + New to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as nct_* prefixed IDs.
- A blue ● badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
Notes
Notes are injected into a session without triggering an LLM response.
- Click Note to toggle note mode. The input border changes colour.
- Private note (amber border) — visible only in the UI, never sent to the LLM.
- Context note (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the private / public label to switch between note types.
Backends
- Claude CLI and Gemini CLI are both available. One is primary, the other is fallback.
- Click the backend button (claude or gemini) to switch which is primary.
- If the primary fails or times out, the fallback is used automatically. A ⚡ notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.
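The primary/fallback behaviour described above can be pictured with a small sketch. This is not the Cortex implementation: `run_with_fallback` and the backend callables are hypothetical stand-ins for whatever wraps the Claude and Gemini CLIs, and only the timeout values come from this document.

```python
from typing import Callable, Dict, Tuple

# Timeouts in seconds, as listed above.
TIMEOUTS = {"claude": 60, "gemini": 120}

# Hypothetical backend signature: (prompt, timeout_seconds) -> reply text.
Backend = Callable[[str, float], str]


def run_with_fallback(
    prompt: str, primary: str, backends: Dict[str, Backend]
) -> Tuple[str, str, bool]:
    """Try the primary backend first, then the remaining one.

    Returns (backend_used, reply, fell_back). The fell_back flag is what a
    UI could use to decide whether to show a fallback notice.
    """
    if primary not in backends:
        raise ValueError(f"unknown primary backend: {primary!r}")
    order = [primary] + [name for name in backends if name != primary]
    last_error = None
    for index, name in enumerate(order):
        try:
            reply = backends[name](prompt, TIMEOUTS[name])
            return name, reply, index > 0
        except Exception as exc:  # any failure or timeout moves on to the next backend
            last_error = exc
    raise RuntimeError("all backends failed") from last_error
```

In this sketch a timeout is just another exception raised by the backend callable, so one `except` branch covers both "fails" and "times out".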
Nextcloud Talk Bot
Inara is registered as a bot in Nextcloud Talk.
- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns 200 OK immediately; the LLM call and reply happen asynchronously.
- Real-time updates stream to the web UI via SSE, so Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
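The "acknowledge first, reply later" pattern described above can be sketched with plain `asyncio`. The names `generate_reply`, `handle_webhook`, and `demo` are hypothetical, and the real handler presumably lives in a FastAPI route; this only illustrates the ordering.

```python
import asyncio


async def generate_reply(message: str) -> str:
    """Hypothetical stand-in for the slow LLM call."""
    await asyncio.sleep(0.01)
    return f"echo: {message}"


async def handle_webhook(message: str, outbox: list, pending: set) -> int:
    """Schedule the reply as a background task and acknowledge at once."""

    async def work() -> None:
        outbox.append(await generate_reply(message))

    task = asyncio.create_task(work())
    pending.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(pending.discard)
    return 200  # returned before the reply exists


async def demo() -> tuple:
    outbox, pending = [], set()
    status = await handle_webhook("hello", outbox, pending)
    replied_before_ack = bool(outbox)  # False: nothing has been sent yet
    await asyncio.gather(*pending)  # let the background work finish
    return status, replied_before_ack, outbox
```

Running `asyncio.run(demo())` shows the status code arriving while the outbox is still empty, with the reply landing afterwards.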
Files (Identity Editor)
The Files button opens an editor for Inara's identity and memory files:
| File | Purpose |
|---|---|
| SOUL.md | Core personality, values, and voice |
| IDENTITY.md | Role, capabilities, and context |
| USER.md | Scott's profile, preferences, and history |
| PROTOCOLS.md | Behavioural rules and communication protocols |
| CONTEXT_TIERS.md | Defines what gets loaded at each context tier |
| MEMORY_LONG.md | Permanent curated long-term memory |
| MEMORY_MID.md | Rolling mid-term digest (LLM-distilled) |
| MEMORY_SHORT.md | Recent session rollup (auto-aggregated) |
| HELP.md | This file |
Toggle preview / edit to switch between rendered markdown and raw text. Ctrl+S saves, Esc closes.
Context & Memory ( ⚙ panel )
Context Tiers
Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| T1 | SOUL + IDENTITY + USER summary | ~1,500 |
| T2 | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| T3 | + last 2 raw session logs | ~15,000 |
| T4 | + last 7 raw session logs | ~50,000 |
Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
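One way to apply the advice above is to pick the highest tier whose approximate cost fits a model's context budget. The sketch below is an assumption-laden illustration: the token figures are the rough estimates from the table, and `highest_tier_within` and its `reserve` parameter are hypothetical names, not part of Cortex.

```python
# Approximate prepended tokens per tier, copied from the table above.
TIER_TOKENS = {1: 1500, 2: 5000, 3: 15000, 4: 50000}


def highest_tier_within(budget_tokens: int, reserve: int = 0) -> int:
    """Return the largest tier whose estimated context fits the budget.

    `reserve` is the number of tokens to keep free for the conversation
    itself (an illustrative knob, not a Cortex setting).
    """
    available = budget_tokens - reserve
    fitting = [tier for tier, cost in TIER_TOKENS.items() if cost <= available]
    if not fitting:
        raise ValueError(f"no tier fits within {available} tokens")
    return max(fitting)
```

For example, a small local model with an 8,000-token window and 4,000 tokens reserved for the conversation would land on T1, while a 200,000-token window leaves room for T4.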
Memory Layers
Three independently toggleable memory files, loaded Long → Mid → Short (short sits closest to the conversation turn for better LLM recall):
| Layer | File | Contents |
|---|---|---|
| Long | MEMORY_LONG.md | Permanent facts: origin, key decisions, Scott's profile highlights |
| Mid | MEMORY_MID.md | Rolling digest of recent weeks, LLM-distilled from Short |
| Short | MEMORY_SHORT.md | Recent session rollup, auto-aggregated from session log files |
Toggle any layer off to save tokens for a focused conversation where history isn't needed.
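The load order and the per-layer toggles can be illustrated as follows. This is a minimal sketch, assuming the three files sit in one directory; `build_memory_block` is a hypothetical helper, and its keyword names simply mirror the `include_*` flags in the chat request body.

```python
from pathlib import Path

# Load order from the text above: Long, then Mid, then Short,
# so the short-term layer ends up closest to the conversation turn.
LAYER_ORDER = [
    ("long", "MEMORY_LONG.md"),
    ("mid", "MEMORY_MID.md"),
    ("short", "MEMORY_SHORT.md"),
]


def build_memory_block(
    directory,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> str:
    """Concatenate the enabled memory files in Long -> Mid -> Short order."""
    enabled = {"long": include_long, "mid": include_mid, "short": include_short}
    parts = []
    for layer, filename in LAYER_ORDER:
        path = Path(directory) / filename
        if enabled[layer] and path.is_file():  # skip disabled or missing layers
            parts.append(path.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts)
```

Turning a layer off drops its text from the result, which is where the token saving comes from.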
Memory Distillation (manual)
Distillation builds up the memory layers from raw session logs. Currently manual — trigger via the ⚙ panel:
| Button | What it does |
|---|---|
| short | Rolls recent session log files → MEMORY_SHORT.md (fast, no LLM) |
| mid | LLM summarizes MEMORY_SHORT.md → MEMORY_MID.md |
| long | LLM integrates MEMORY_MID.md → MEMORY_LONG.md |
| all | Runs short → mid → long in sequence |
Recommended workflow:
- Run short after any productive session to capture it.
- Run mid weekly to distil short → mid.
- Run long monthly to absorb mid into permanent memory.
Token budgets for each layer are set in .env (MEMORY_BUDGET_LONG, MEMORY_BUDGET_MID, MEMORY_BUDGET_SHORT).
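The distillation buttons have matching `/distill/*` endpoints in the API Reference, so the workflow can also be scripted. The sketch below uses only the Python standard library; the base URL is an assumption (use wherever Cortex is actually served), an empty POST body is assumed, and `build_distill_request` and `run_distill` are hypothetical helper names.

```python
import urllib.request

# Step names match the /distill/* endpoints in the API Reference.
STEPS = ("short", "mid", "long", "all")


def build_distill_request(base_url: str, step: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one distillation step."""
    if step not in STEPS:
        raise ValueError(f"unknown distillation step: {step!r}")
    url = f"{base_url.rstrip('/')}/distill/{step}"
    # Assumption: the endpoint needs no request body, so an empty one is sent.
    return urllib.request.Request(url, data=b"", method="POST")


def run_distill(base_url: str, step: str, timeout: float = 300.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_distill_request(base_url, step)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `run_distill("http://localhost:8000", "short")` (URL assumed) could then be run from a scheduler after each session; the generous default timeout allows for the LLM-backed mid and long steps.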
Keyboard Shortcuts
| Keys | Action |
|---|---|
| Ctrl+Enter | Send message (default mode) |
| Enter | Send (when in Enter mode) |
| Shift+Enter | New line in message input |
| Ctrl+Enter | Save inline message edit |
| Esc | Cancel inline edit |
| Ctrl+S | Save file (Files modal) |
| Esc | Close any open modal |
API Reference
For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| POST | /chat | Send a message; returns an SSE stream |
| GET | /backend | Get current primary/fallback backends |
| POST | /backend | Set primary backend ({"primary": "claude"}) |
| GET | /sessions | List all sessions |
| GET | /history/{id} | Get session message history |
| PUT | /history/{id} | Replace full session history |
| GET | /events | SSE stream for real-time Talk activity |
| POST | /note | Inject a context note into a session |
| GET | /files | List identity files |
| GET | /files/{name} | Read a file |
| PUT | /files/{name} | Write a file |
| POST | /distill/short | Aggregate session logs → MEMORY_SHORT |
| POST | /distill/mid | Summarize short → MEMORY_MID (LLM) |
| POST | /distill/long | Integrate mid → MEMORY_LONG (LLM) |
| POST | /distill/all | Run all three distillation steps |
| GET | /health | Health check; returns {"status": "ok"} |
Chat request body (POST /chat):
{
  "message": "string",
  "session_id": "string | null",
  "tier": 1,
  "model": "claude | gemini | null",
  "include_long": true,
  "include_mid": true,
  "include_short": true
}
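For scripting against `POST /chat`, the request body above can be built and the SSE response split into events with the standard library alone. This is a sketch under stated assumptions: `build_chat_body` and `iter_sse_data` are hypothetical helpers, the default tier of 2 mirrors the UI default rather than any documented server default, and the content of each event is not specified here, so the parser only handles generic SSE framing.

```python
import json
from typing import Iterable, Iterator, Optional


def build_chat_body(
    message: str,
    session_id: Optional[str] = None,
    tier: int = 2,
    model: Optional[str] = None,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> bytes:
    """Serialize a chat request using the fields shown above."""
    if tier not in (1, 2, 3, 4):
        raise ValueError("tier must be 1, 2, 3, or 4")
    if model not in (None, "claude", "gemini"):
        raise ValueError("model must be 'claude', 'gemini', or None")
    body = {
        "message": message,
        "session_id": session_id,
        "tier": tier,
        "model": model,
        "include_long": include_long,
        "include_mid": include_mid,
        "include_short": include_short,
    }
    return json.dumps(body).encode("utf-8")


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event.

    Consecutive `data:` lines are joined with newlines, and a blank line
    ends the event. Comment lines (starting with ':') and other fields
    are ignored.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            yield "\n".join(buffer)
            buffer = []
```

Any iterable of text lines works as input, for example the decoded lines of a streaming HTTP response.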
In Progress / Planned
- Auto memory distillation — currently manual trigger only; scheduled auto-run planned
- Ollama local model backend — direct Ollama API support (no CLI wrapper)
- pfSense port 2222 — Gitea SSH access from outside the LAN
- OAuth token auto-refresh notifications — alert when Claude CLI token is near expiry
- Multi-user support — per-user identity/memory files; currently single-user (Scott)
Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent. Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.