Refactor UI into separate CSS/JS, add help modal and HELP.md

- static/index.html: reduced to 127-line HTML shell
- static/style.css: all styles extracted (~900 lines) + help modal styles + shared markdown rendering for file-preview and help-modal-body including tables (previously missing)
- static/app.js: all JS extracted (~900 lines) + help modal fetch/render
- index.html: adds ? help button + help modal HTML
- inara/HELP.md: comprehensive reference doc covering all features, keyboard shortcuts, API endpoints, memory system, planned items
- routers/files.py: HELP.md added to ALLOWED set
- context_loader.py: HELP.md loaded at tier 2+ (after PROTOCOLS.md) so Inara can reference it when helping Scott with the interface

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 21:52:54 -04:00
parent fa96c50935
commit 0ebfbc6590
6 changed files with 2039 additions and 1662 deletions
--- a/inara/HELP.md
+++ b/inara/HELP.md
@@ -0,0 +1,208 @@
+# Cortex UI — Help & Reference
+
+*This file is loaded into Inara's context at Tier 2+ so she can help Scott navigate the interface. It is also displayed in the web UI via the **?** button.*
+
+*Last updated: 2026-03-17*
+
+---
+
+## Header Controls
+
+| Button | What it does |
+|---|---|
+| **Sessions** | Open the sessions panel — list, resume, or start sessions |
+| **Files** | Open the identity file editor (SOUL, MEMORY, etc.) |
+| **⚙ N** | Open the Context & Memory panel (N = current tier) |
+| **claude / gemini** | Active backend — click to toggle primary |
+| **Aa / A+ / A−** | Cycle font size: normal (16px) → large (18px) → small (14px) |
+| **☾ / ☀** | Toggle dark / light theme |
+| **?** | Open this help panel |
+
+All header settings (theme, font size, tier, memory layers) persist in `localStorage` across page refreshes.
+
+---
+
+## Chat
+
+- **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
+- **Stop:** Click **Stop** to cancel an in-progress response at any time.
+- **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
+- **Delete a message:** Hover over any message → click **del**. Removes from session history.
+- **Copy a response:** Hover over any assistant message → click **copy**.
+- **New line while typing:** `Enter` or `Shift+Enter` (in `Ctrl+Enter` mode); `Shift+Enter` only (in Enter mode, where plain `Enter` sends).
+
+---
+
+## Sessions
+
+Sessions are named conversation threads that persist across page refreshes.
+
+- Click **Sessions** → **+ New** to start a fresh session.
+- Click any listed session to resume it — full history loads instantly.
+- Sessions from Nextcloud Talk appear as `nct_*` prefixed IDs.
+- A blue **●** badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
+
+---
+
+## Notes
+
+Notes are injected into a session without triggering an LLM response.
+
+- Click **Note** to toggle note mode. The input border changes colour.
+- **Private note** (amber border) — visible only in the UI, never sent to the LLM.
+- **Context note** (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
+- Click the `private / public` label to switch between note types (`private` = private note, `public` = context note).
+
+---
+
+## Backends
+
+- **Claude CLI** and **Gemini CLI** are both available. One is primary, the other is fallback.
+- Click the backend button (`claude` or `gemini`) to switch which is primary.
+- If the primary fails or times out, the fallback is used automatically. A **⚡** notice appears in the chat when this happens.
+- Timeouts: Claude 60s, Gemini 120s.
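
The primary/fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `call_with_fallback` and the `backends` mapping of plain callables are hypothetical names, and only the two timeout values come from the text.

```python
import concurrent.futures
from typing import Callable, Dict, Tuple

# Timeouts (seconds) as listed in the Backends section; the rest is hypothetical.
TIMEOUTS: Dict[str, float] = {"claude": 60.0, "gemini": 120.0}


def call_with_fallback(
    prompt: str,
    primary: str,
    backends: Dict[str, Callable[[str], str]],
    timeouts: Dict[str, float] = TIMEOUTS,
) -> Tuple[str, str, bool]:
    """Try the primary backend first, then any others.

    Returns (backend_used, reply, fell_back). A caller could show the
    fallback notice in the chat when fell_back is True.
    """
    order = [primary] + [name for name in backends if name != primary]
    last_error = None
    for index, name in enumerate(order):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(backends[name], prompt)
        try:
            reply = future.result(timeout=timeouts[name])
            return name, reply, index > 0
        except Exception as exc:  # a failure or a timeout both trigger fallback
            last_error = exc
        finally:
            pool.shutdown(wait=False)  # do not block on a call that timed out
    raise RuntimeError("all backends failed") from last_error
```

A timed-out worker thread keeps running in the background in this sketch; a real CLI wrapper would more likely kill the subprocess.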
+
+---
+
+## Nextcloud Talk Bot
+
+Inara is registered as a bot in Nextcloud Talk.
+
+- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
+- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
+- Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
+- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
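
The "acknowledge first, reply later" pattern described above might look like the sketch below. It uses only `asyncio` from the standard library; `handle_webhook`, `slow_reply`, and the `background` set are hypothetical names, and the actual Cortex handler may be structured differently.

```python
import asyncio
from typing import Awaitable, Callable, List, Set


async def handle_webhook(
    message: str,
    process: Callable[[str], Awaitable[None]],
    background: Set["asyncio.Task"],
) -> int:
    """Schedule the slow work as a background task and return 200 right away."""
    task = asyncio.create_task(process(message))
    background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    return 200


async def demo() -> List[str]:
    events: List[str] = []
    background: Set["asyncio.Task"] = set()

    async def slow_reply(message: str) -> None:
        await asyncio.sleep(0.01)  # stands in for the LLM call and Talk reply
        events.append("replied: " + message)

    status = await handle_webhook("hello", slow_reply, background)
    events.append("acked %d" % status)  # the ack is recorded before the reply exists
    await asyncio.gather(*background)   # wait for the background work to finish
    return events
```

Running `asyncio.run(demo())` records the acknowledgement before the reply, which is the ordering the webhook description implies.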
+
+---
+
+## Files (Identity Editor)
+
+The **Files** button opens an editor for Inara's identity and memory files:
+
+| File | Purpose |
+|---|---|
+| `SOUL.md` | Core personality, values, and voice |
+| `IDENTITY.md` | Role, capabilities, and context |
+| `USER.md` | Scott's profile, preferences, and history |
+| `PROTOCOLS.md` | Behavioural rules and communication protocols |
+| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
+| `MEMORY_LONG.md` | Permanent curated long-term memory |
+| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
+| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
+| `HELP.md` | This file |
+
+Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.
+
+---
+
+## Context & Memory ( ⚙ panel )
+
+### Context Tiers
+
+Controls how much context is prepended to each LLM call:
+
+| Tier | Loads | ~Tokens |
+|---|---|---|
+| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
+| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
+| **T3** | + last 2 raw session logs | ~15,000 |
+| **T4** | + last 7 raw session logs | ~50,000 |
+
+Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
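
One way to read the tier table is as a cumulative list, where each tier adds to everything below it. The sketch below encodes that literal reading; `TIER_ADDITIONS` and `context_parts` are hypothetical names, and the table does not say whether later entries (for example "USER full" or "last 7 raw session logs") replace earlier ones, so no replacement is modelled here.

```python
from typing import Dict, List

# Additions per tier, copied from the Context Tiers table.
TIER_ADDITIONS: Dict[int, List[str]] = {
    1: ["SOUL", "IDENTITY", "USER summary"],
    2: ["USER full", "PROTOCOLS", "HELP", "memory layers"],
    3: ["last 2 raw session logs"],
    4: ["last 7 raw session logs"],
}


def context_parts(tier: int) -> List[str]:
    """Return every item loaded at `tier`, lowest tier first."""
    if tier not in TIER_ADDITIONS:
        raise ValueError("tier must be 1, 2, 3 or 4")
    parts: List[str] = []
    for level in range(1, tier + 1):
        parts.extend(TIER_ADDITIONS[level])
    return parts
```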
+
+### Memory Layers
+
+Three independently toggleable memory files, loaded **Long → Mid → Short** (short sits closest to the conversation turn for better LLM recall):
+
+| Layer | File | Contents |
+|---|---|---|
+| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, Scott's profile highlights |
+| **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
+| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session log files |
+
+Toggle any layer off to save tokens for a focused conversation where history isn't needed.
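
The load order and per-layer toggles can be illustrated with a small sketch. The function name `memory_files` is hypothetical; the file names, the Long → Mid → Short order, and the `include_*` flag names come from this document.

```python
from typing import Dict, List

# Load order from the Memory Layers section: short ends up closest to the turn.
LAYER_ORDER: List[str] = ["long", "mid", "short"]
LAYER_FILES: Dict[str, str] = {
    "long": "MEMORY_LONG.md",
    "mid": "MEMORY_MID.md",
    "short": "MEMORY_SHORT.md",
}


def memory_files(
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> List[str]:
    """Return the enabled memory files in load order."""
    enabled = {"long": include_long, "mid": include_mid, "short": include_short}
    return [LAYER_FILES[name] for name in LAYER_ORDER if enabled[name]]
```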
+
+### Memory Distillation (manual)
+
+Distillation builds up the memory layers from raw session logs. Currently **manual** — trigger via the ⚙ panel:
+
+| Button | What it does |
+|---|---|
+| **short** | Rolls recent session log files → `MEMORY_SHORT.md` (fast, no LLM) |
+| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
+| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
+| **all** | Runs short → mid → long in sequence |
+
+**Recommended workflow:**
+- Run **short** after any productive session to capture it.
+- Run **mid** weekly to distil short → mid.
+- Run **long** monthly to absorb mid into permanent memory.
+
+Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BUDGET_MID`, `MEMORY_BUDGET_SHORT`).
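
Reading the three budget variables could be sketched as below. Only the variable names come from the text; `memory_budgets` is a hypothetical helper. The sketch assumes the values are plain integers and that the `.env` file has already been loaded into the process environment. It returns `None` for an unset budget because the defaults are not stated here.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names as given in the document.
BUDGET_KEYS: Dict[str, str] = {
    "long": "MEMORY_BUDGET_LONG",
    "mid": "MEMORY_BUDGET_MID",
    "short": "MEMORY_BUDGET_SHORT",
}


def memory_budgets(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[int]]:
    """Map each layer to its token budget, or None when the variable is unset."""
    source = os.environ if env is None else env
    budgets: Dict[str, Optional[int]] = {}
    for layer, key in BUDGET_KEYS.items():
        raw = source.get(key)
        budgets[layer] = int(raw) if raw not in (None, "") else None
    return budgets
```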
+
+---
+
+## Keyboard Shortcuts
+
+| Keys | Action |
+|---|---|
+| `Ctrl+Enter` | Send message (default mode) |
+| `Enter` | Send (when in Enter mode) |
+| `Shift+Enter` | New line in message input |
+| `Ctrl+Enter` | Save inline message edit |
+| `Esc` | Cancel inline edit |
+| `Ctrl+S` | Save file (Files modal) |
+| `Esc` | Close any open modal |
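
The send/new-line rows of the table above reduce to a small decision function. This sketch is illustrative only: the real handling lives in `static/app.js`, `input_key_action` is a hypothetical name, and two cases the table does not cover are assumed (plain `Enter` inserts a new line in `Ctrl+Enter` mode, and `Ctrl+Enter` still sends in Enter mode).

```python
def input_key_action(
    key: str,
    ctrl: bool = False,
    shift: bool = False,
    enter_mode: bool = False,
) -> str:
    """Return "send", "newline", or "none" for a key press in the message input."""
    if key != "Enter":
        return "none"
    if shift:
        return "newline"  # Shift+Enter is a new line in both modes
    if enter_mode:
        return "send"     # plain Enter sends; Ctrl+Enter assumed to send too
    return "send" if ctrl else "newline"  # default mode: only Ctrl+Enter sends
```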
+
+---
+
+## API Reference
+
+For direct access or scripting:
+
+| Method | Endpoint | Description |
+|---|---|---|
+| `POST` | `/chat` | Send a message — returns SSE stream |
+| `GET` | `/backend` | Get current primary/fallback backends |
+| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
+| `GET` | `/sessions` | List all sessions |
+| `GET` | `/history/{id}` | Get session message history |
+| `PUT` | `/history/{id}` | Replace full session history |
+| `GET` | `/events` | SSE stream for real-time Talk activity |
+| `POST` | `/note` | Inject a context note into a session |
+| `GET` | `/files` | List identity files |
+| `GET` | `/files/{name}` | Read a file |
+| `PUT` | `/files/{name}` | Write a file |
+| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
+| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
+| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
+| `POST` | `/distill/all` | Run all three distillation steps |
+| `GET` | `/health` | Health check — returns `{"status": "ok"}` |
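
Both `/chat` and `/events` are described as SSE streams. A script consuming them needs to split the stream into events, which the sketch below does following the generic server-sent events framing (`data:` lines, blank line ends an event). The payload format Cortex puts inside each event is not specified in this document, so the function only returns the raw data strings; `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in an SSE line stream."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # one optional space after the colon is dropped
            buffer.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
```

Any HTTP client that exposes the response body line by line can feed this generator.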
+
+Chat request body (`POST /chat`):
+```json
+{
+  "message": "string",
+  "session_id": "string | null",
+  "tier": 1,
+  "model": "claude | gemini | null",
+  "include_long": true,
+  "include_mid": true,
+  "include_short": true
+}
+```
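
A script could assemble the request body shown above with a helper like the one below. The field names come from the JSON schema; `build_chat_body` is a hypothetical name, the default of tier 2 mirrors the UI default rather than anything the API is documented to require, and the validation rules are assumptions based on the values the document lists.

```python
import json
from typing import Optional


def build_chat_body(
    message: str,
    session_id: Optional[str] = None,
    tier: int = 2,
    model: Optional[str] = None,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> str:
    """Return a JSON string suitable as the body of POST /chat."""
    if tier not in (1, 2, 3, 4):
        raise ValueError("tier must be 1, 2, 3 or 4")
    if model not in (None, "claude", "gemini"):
        raise ValueError("model must be 'claude', 'gemini' or None")
    return json.dumps(
        {
            "message": message,
            "session_id": session_id,
            "tier": tier,
            "model": model,
            "include_long": include_long,
            "include_mid": include_mid,
            "include_short": include_short,
        }
    )
```

Passing `None` for `session_id` or `model` serializes to JSON `null`, matching the `string | null` notation in the schema.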
+
+---
+
+## In Progress / Planned
+
+- **Auto memory distillation** — currently manual trigger only; scheduled auto-run planned
+- **Ollama local model backend** — direct Ollama API support (no CLI wrapper)
+- **pfSense port 2222** — Gitea SSH access from outside the LAN
+- **OAuth token auto-refresh notifications** — alert when Claude CLI token is near expiry
+- **Multi-user support** — per-user identity/memory files; currently single-user (Scott)
+
+---
+
+*Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent.*
+*Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.*