Cortex-Inara/inara/HELP.md
Scott Idem 4253e69c0b Add auto memory distillation scheduler (APScheduler)
- scheduler.py: AsyncIOScheduler with three cron jobs
    short  daily     03:00 (no LLM, always fast)
    mid    weekly    Sun 03:30 (LLM)
    long   monthly   1st 04:00 (LLM — off by default)
- config.py: AUTO_DISTILL, AUTO_DISTILL_SHORT/MID/LONG .env flags
- main.py: start/stop scheduler in FastAPI lifespan
- routers/distill.py: GET /distill/status — next run times + config
- requirements.txt: apscheduler>=3.10
- HELP.md: updated planned items, added /distill/status to API table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 22:31:38 -04:00

# Cortex UI — Help & Reference
*This file is loaded into Inara's context at Tier 2+ so she can help Scott navigate the interface. It is also displayed in the web UI via the **?** button.*
*Last updated: 2026-03-17*
---
## Header Controls
| Button | What it does |
|---|---|
| **Sessions** | Open the sessions panel — list, resume, or start sessions |
| **Files** | Open the identity file editor (SOUL, MEMORY, etc.) |
| **⚙ N** | Open the Context & Memory panel (N = current tier) |
| **claude / gemini** | Active backend — click to toggle primary |
| **Aa / A+ / A** | Cycle font size: normal (16px) → large (18px) → small (14px) |
| **☾ / ☀** | Toggle dark / light theme |
| **?** | Open this help panel |

All header settings (theme, font size, tier, memory layers) persist in `localStorage` across page refreshes.

---
## Chat
- **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
- **Stop:** Click **Stop** to cancel an in-progress response at any time.
- **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
- **Delete a message:** Hover over any message → click **del**. Removes from session history.
- **Copy a response:** Hover over any assistant message → click **copy**.
- **New line while typing:** `Shift+Enter` in either mode. In `Ctrl+Enter` mode, plain `Enter` also inserts a new line; in Enter mode, plain `Enter` sends.
---
## Sessions
Sessions are named conversation threads that persist across page refreshes.
- Click **Sessions** → **+ New** to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as `nct_*` prefixed IDs.
- A blue **●** badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
---
## Notes
Notes are injected into a session without triggering an LLM response.
- Click **Note** to toggle note mode. The input border changes colour.
- **Private note** (amber border) — visible only in the UI, never sent to the LLM.
- **Context note** (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the `private / public` label to switch between note types (`public` is the context note).
---
## Backends
- **Claude CLI** and **Gemini CLI** are both available. One is primary, the other is fallback.
- Click the backend button (`claude` or `gemini`) to switch which is primary.
- If the primary fails or times out, the fallback is used automatically. A **⚡** notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.
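
The primary/fallback behaviour described above can be sketched roughly as follows. This is an illustration only, not Cortex's actual code: `ask_with_fallback`, `TIMEOUTS`, and the backend callables are hypothetical names, and only the timeout values (60s and 120s) come from this page.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional, Tuple

# Per-backend timeouts in seconds, taken from the list above.
# The dictionary itself is a hypothetical structure.
TIMEOUTS: Dict[str, float] = {"claude": 60.0, "gemini": 120.0}

# A backend is modelled as an async callable: prompt in, reply out (assumption).
Backend = Callable[[str], Awaitable[str]]


async def ask_with_fallback(
    prompt: str,
    backends: Dict[str, Backend],
    primary: str = "claude",
    timeouts: Optional[Dict[str, float]] = None,
) -> Tuple[str, str]:
    """Try the primary backend first, then the others. Returns (backend_name, reply)."""
    limits = timeouts if timeouts is not None else TIMEOUTS
    order = [primary] + [name for name in backends if name != primary]
    last_error: Optional[Exception] = None
    for name in order:
        try:
            reply = await asyncio.wait_for(backends[name](prompt), timeout=limits[name])
            return name, reply
        except Exception as exc:  # a timeout or any backend failure triggers the fallback
            last_error = exc
    raise RuntimeError("all backends failed") from last_error
```

A caller could compare the returned backend name with the requested primary to decide whether to show the **⚡** notice.
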
---
## Nextcloud Talk Bot
Inara is registered as a bot in Nextcloud Talk.
- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
- Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
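
The "acknowledge first, reply later" pattern described above can be sketched with plain `asyncio`. This is a simplified model under stated assumptions, not the actual webhook handler: `handle_webhook`, `generate_reply`, and the `outbox` list are hypothetical stand-ins for the route handler, the LLM call, and the Talk reply.

```python
import asyncio
from typing import List, Set


async def generate_reply(message: str) -> str:
    """Stand-in for the LLM call (hypothetical); sleeps briefly to simulate latency."""
    await asyncio.sleep(0.01)
    return f"echo: {message}"


async def handle_webhook(message: str, outbox: List[str], tasks: Set[asyncio.Task]) -> int:
    """Schedule processing in the background and return the status code at once."""

    async def process() -> None:
        # Runs after the webhook has already been acknowledged.
        outbox.append(await generate_reply(message))

    task = asyncio.create_task(process())
    tasks.add(task)                        # keep a reference so the task is not garbage-collected
    task.add_done_callback(tasks.discard)  # drop the reference once the task finishes
    return 200                             # returned before any reply has been produced
```

The design point is that the sender never waits on the LLM: the status code goes back immediately and the reply is delivered whenever the background task finishes.
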
---
## Files (Identity Editor)
The **Files** button opens an editor for Inara's identity and memory files:
| File | Purpose |
|---|---|
| `SOUL.md` | Core personality, values, and voice |
| `IDENTITY.md` | Role, capabilities, and context |
| `USER.md` | Scott's profile, preferences, and history |
| `PROTOCOLS.md` | Behavioural rules and communication protocols |
| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
| `MEMORY_LONG.md` | Permanent curated long-term memory |
| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
| `HELP.md` | This file |

Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.

---
## Context & Memory (⚙ panel)
### Context Tiers
Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **T3** | + last 2 raw session logs | ~15,000 |
| **T4** | + last 7 raw session logs | ~50,000 |

Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
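
One way to read the tier table is as a cumulative lookup. The sketch below transcribes the table into Python; `TierPlan` and `plan_for_tier` are hypothetical names, and treating T4's seven session logs as replacing T3's two (rather than adding to them) is an assumption, since the table does not say.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TierPlan:
    items: Tuple[str, ...]   # identity/memory items loaded at this tier
    session_logs: int        # number of raw session logs appended
    approx_tokens: int       # rough token estimate from the table


# Items added at each tier, transcribed from the table above.
_ADDED_ITEMS: Dict[int, Tuple[str, ...]] = {
    1: ("SOUL", "IDENTITY", "USER summary"),
    2: ("USER full", "PROTOCOLS", "HELP", "memory layers"),
}
_SESSION_LOGS: Dict[int, int] = {1: 0, 2: 0, 3: 2, 4: 7}
_APPROX_TOKENS: Dict[int, int] = {1: 1500, 2: 5000, 3: 15000, 4: 50000}


def plan_for_tier(tier: int = 2) -> TierPlan:
    """Return what a given tier loads; the default mirrors the documented default (T2)."""
    if tier not in _APPROX_TOKENS:
        raise ValueError(f"tier must be 1-4, got {tier!r}")
    items = tuple(
        item
        for level in sorted(_ADDED_ITEMS)
        if level <= tier
        for item in _ADDED_ITEMS[level]
    )
    return TierPlan(items, _SESSION_LOGS[tier], _APPROX_TOKENS[tier])
```

For example, `plan_for_tier(3)` carries every T1 and T2 item plus two session logs.
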
### Memory Layers
Three independently toggleable memory files, loaded **Long → Mid → Short** (short sits closest to the conversation turn for better LLM recall):
| Layer | File | Contents |
|---|---|---|
| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, Scott's profile highlights |
| **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session log files |

Toggle any layer off to save tokens for a focused conversation where history isn't needed.
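
The load order and the per-layer toggles can be illustrated with a small helper. This is a sketch, not the real loader: `memory_files` and `LAYER_ORDER` are hypothetical names, while the flag names mirror the `include_long` / `include_mid` / `include_short` fields of the chat request body.

```python
from typing import List

# Load order from the text above: Long, then Mid, then Short,
# so Short ends up closest to the conversation turn.
LAYER_ORDER = (
    ("long", "MEMORY_LONG.md"),
    ("mid", "MEMORY_MID.md"),
    ("short", "MEMORY_SHORT.md"),
)


def memory_files(
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> List[str]:
    """Return the enabled memory files in load order."""
    enabled = {"long": include_long, "mid": include_mid, "short": include_short}
    return [filename for layer, filename in LAYER_ORDER if enabled[layer]]
```

Disabling a layer removes its file without changing the relative order of the rest.
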
### Memory Distillation (manual)
Distillation builds up the memory layers from raw session logs. Short and mid also run on a schedule (see In Progress / Planned); any step can be triggered manually via the ⚙ panel:
| Button | What it does |
|---|---|
| **short** | Rolls recent session log files → `MEMORY_SHORT.md` (fast, no LLM) |
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
| **all** | Runs short → mid → long in sequence |

**Recommended workflow:**
- Run **short** after any productive session to capture it.
- Run **mid** weekly to distil short → mid.
- Run **long** monthly to absorb mid into permanent memory.
Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BUDGET_MID`, `MEMORY_BUDGET_SHORT`).
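
The manual steps above map onto the `POST /distill/*` endpoints listed in the API Reference, so the recommended workflow can also be scripted. Below is a minimal sketch using only the Python standard library. `BASE_URL`, `distill_request`, and `run_distill` are hypothetical names, the host and port are an assumption, and the response body format is not documented here, so only the HTTP status is returned.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: adjust to wherever Cortex is served
STEPS = ("short", "mid", "long", "all")


def distill_request(step: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for one distillation step."""
    if step not in STEPS:
        raise ValueError(f"unknown distillation step: {step!r}")
    return urllib.request.Request(f"{base_url}/distill/{step}", method="POST")


def run_distill(step: str, base_url: str = BASE_URL, timeout: float = 300.0) -> int:
    """Send the request and return the HTTP status code.

    The generous timeout is an assumption: mid and long involve an LLM call.
    """
    with urllib.request.urlopen(distill_request(step, base_url), timeout=timeout) as resp:
        return resp.status
```

Calling `run_distill("short")` after a productive session would correspond to the first step of the recommended workflow.
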
---
## Keyboard Shortcuts
| Keys | Action |
|---|---|
| `Ctrl+Enter` | Send message (default mode) |
| `Enter` | Send (when in Enter mode) |
| `Shift+Enter` | New line in message input |
| `Ctrl+Enter` | Save inline message edit |
| `Esc` | Cancel inline edit |
| `Ctrl+S` | Save file (Files modal) |
| `Esc` | Close any open modal |
---
## API Reference
For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message — returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |
| `GET` | `/events` | SSE stream for real-time Talk activity |
| `POST` | `/note` | Inject a context note into a session |
| `GET` | `/files` | List identity files |
| `GET` | `/files/{name}` | Read a file |
| `PUT` | `/files/{name}` | Write a file |
| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
| `POST` | `/distill/all` | Run all three distillation steps |
| `GET` | `/distill/status` | Show scheduler status and next run times |
| `GET` | `/health` | Health check — returns `{"status": "ok"}` |

Chat request body (`POST /chat`):
```json
{
  "message": "string",
  "session_id": "string | null",
  "tier": 1,
  "model": "claude | gemini | null",
  "include_long": true,
  "include_mid": true,
  "include_short": true
}
```
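
For scripting, the request body above can be built and the SSE response read with the standard library alone. The sketch below is illustrative: `chat_payload` and `sse_data` are hypothetical helpers, the field names come from the JSON above, and the parser assumes standard `data:` framing. What each event payload contains (plain text or JSON) is not specified on this page.

```python
import json
from typing import Iterable, Iterator, List, Optional


def chat_payload(
    message: str,
    session_id: Optional[str] = None,
    tier: int = 2,
    model: Optional[str] = None,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> bytes:
    """Serialise a POST /chat body using the fields documented above."""
    if tier not in (1, 2, 3, 4):
        raise ValueError(f"tier must be 1-4, got {tier!r}")
    body = {
        "message": message,
        "session_id": session_id,
        "tier": tier,
        "model": model,
        "include_long": include_long,
        "include_mid": include_mid,
        "include_short": include_short,
    }
    return json.dumps(body).encode("utf-8")


def sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event.

    Comment lines (starting with ':') and other fields are ignored. As in the
    SSE specification, an event not terminated by a blank line is dropped.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            yield "\n".join(buffer)
            buffer = []
```

The output of `chat_payload` can be passed as the `data` argument of a `urllib.request.Request` with a `Content-Type: application/json` header, and the decoded response lines fed to `sse_data`.
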
---
## In Progress / Planned
- **Auto memory distillation (long)** — short and mid run automatically; long-term integration is off by default (set `AUTO_DISTILL_LONG=true` in `.env` to enable)
- **Ollama local model backend** — direct Ollama API support (no CLI wrapper)
- **OAuth token auto-refresh notifications** — alert when Claude CLI token is near expiry
- **Multi-user support** — per-user identity/memory files; currently single-user (Scott)
---
*Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent.*
*Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.*