# Cortex UI — Help & Reference

<!-- SHARED BASE: cortex/static/HELP.md
     This file is served to all users regardless of persona.
     Persona-specific additions live in home/{username}/persona/{name}/HELP.md
     and are appended automatically by help.html when present.
-->

*Last updated: 2026-05-09*

---

## Getting Started

If this is your first time using Cortex, you need one thing before the chat will work: an AI model connected to your account.

**Fastest path — OpenRouter:**
OpenRouter gives you access to Claude, Gemini, and dozens of other models with a single API key.

1. Get a free API key at [openrouter.ai/keys](https://openrouter.ai/keys)
2. Go to **☰ → Account → [Set up OpenRouter →]** (shown automatically if no model is configured)
3. Paste your key, pick a starting model, click **Connect**

That's it — you're ready to chat.

**Already past setup but seeing errors?** Go to **☰ → Account → Model Registry → Manage models** and confirm a model is assigned to the **Chat** role (Primary slot). If all slots are empty, add a model first.

---

## Header Controls

| Button | What it does |
|---|---|
| **Sessions** | Open the sessions panel — list, resume, or start sessions |
| **N** (sliders icon) | Open the Context & Memory panel (N = current context tier) |
| **☰** | Settings menu — Files, push notification toggle, Account, Sign Out |
| **?** | Open this help panel |

The **Context & Memory** panel (sliders icon with tier number) contains all configuration options:

| Section | Controls |
|---|---|
| **Context Tier** | T1 – T4 context depth |
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
| **Distill Memory** | Manually trigger Short / Mid / Long / All distillation |
| **Role** | Active LLM role — click to cycle through configured role assignments |
| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L** cycles input area height · **⌃↵** toggles send shortcut |

All settings persist in `localStorage` across page refreshes.

---

## Chat

- **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
- **Stop:** Click **Stop** to cancel an in-progress response at any time.
- **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
- **Delete a message:** Hover over any message → click **del**. Removes from session history.
- **Copy a response:** Hover over any assistant message → click **copy**.
- **New line while typing:** `Shift+Enter` works in both modes. In `Ctrl+Enter` mode, plain `Enter` also inserts a new line; in Enter mode, plain `Enter` sends.

Each assistant response shows a small **model tag** in the bottom-right corner identifying which model and host responded.

---

## Tools (⚡)

Click the **⚡** button in the input row to turn Tools mode on or off. When the button is lit (amber), **Send** changes to **Run** and messages are routed through the **orchestrator** instead of directly to the chat model.

The orchestrator runs a multi-step tool loop:

1. The **orchestrator model** reasons about the request and calls tools as needed
2. It produces an enriched summary of what it found
3. The **responder model** (set by the active Role) receives that context and writes the final user-facing reply
4. A `⚡ N tool calls: …` note appears below the response listing what was used

The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**.

Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.

Orchestrated sessions persist to history exactly like regular chat.
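
For scripted use, the API Reference lists `POST /orchestrate` (returns a `job_id`) and `GET /orchestrate/{job_id}` (poll status and result). The sketch below shows one way a client might poll. The `fetch_job` callable and the `status` values `pending` / `running` are assumptions, because the response shape is not documented here.

```python
import time


def wait_for_job(fetch_job, job_id, interval=1.0, max_polls=60):
    """Poll until the job reports a finished state, then return its payload.

    fetch_job: hypothetical callable that performs GET /orchestrate/{job_id}
        and returns the decoded JSON body as a dict.
    """
    for _ in range(max_polls):
        job = fetch_job(job_id)
        # Assumption: a "status" field that stops being "pending"/"running"
        # once the tool loop has finished.
        if job.get("status") not in ("pending", "running"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the fetch function in keeps the polling logic independent of whichever HTTP client and authentication your deployment needs.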

### Available Tools

45 tools across 12 categories. Each tool schema is sent to the model on every orchestrated call — fewer active tools means fewer tokens per call.

| Category | Tools |
|---|---|
| **Web** | `web_search`, `http_fetch` |
| **Files** | `file_read`, `file_list`, `file_write`, `session_search` |
| **Shell** | `shell_exec`, `claude_allow_dir` |
| **System** | `cortex_restart`, `cortex_logs`, `cortex_status`, `cortex_update` |
| **Tasks** | `task_list`, `task_create`, `task_update`, `task_complete` |
| **Cron** | `cron_list`, `cron_add`, `cron_remove`, `cron_toggle` |
| **Reminders** | `reminders_add`, `reminders_list`, `reminders_remove`, `reminders_clear` |
| **Scratchpad** | `scratch_read`, `scratch_write`, `scratch_append`, `scratch_clear` |
| **Notifications** | `web_push`, `email_send`, `nc_talk_send` |
| **Aether Journals** | `ae_journal_list/search`, `ae_journal_entries_list`, `ae_journal_entry_read/create/update/disable/append/prepend` |
| **Agent Notes** | `agent_notes_read`, `agent_notes_write`, `agent_notes_append`, `agent_notes_clear` |
| **Agents** | `spawn_agent` |

Files, Shell, System, Agents, and some Notifications tools are **admin-only** and not visible to regular users.

### Per-Role Tool Sets

Each role can be configured with a specific subset of tool categories. When a role has a tool subset configured, only those tools are sent to the orchestrator — the rest are invisible to the model for that session.

**Example:** a Coder role might only need Web, Files, Shell, and Agent Notes. A Research role might only need Web. Configuring this avoids sending schemas for 30+ irrelevant tools on every call.

Configure per-role tool sets in **Account → Model Registry → Role Assignments** — expand a role card to see the category checkboxes. The default (no checkboxes selected) sends all tools the user has access to.
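
The selection rule can be sketched as follows. This is an illustration only, not the server's code: `TOOL_CATEGORIES` is a small excerpt of the table above, and `tools_for_role` is a hypothetical helper name.

```python
# Excerpt of the category -> tool table (illustrative, not the full list).
TOOL_CATEGORIES = {
    "Web": ["web_search", "http_fetch"],
    "Scratchpad": ["scratch_read", "scratch_write", "scratch_append", "scratch_clear"],
    "Agents": ["spawn_agent"],
}


def tools_for_role(enabled_categories, accessible_categories):
    """Return the tool names whose schemas would be sent for a role.

    enabled_categories: categories ticked on the role card (may be empty).
    accessible_categories: categories the user is permitted to use.
    """
    # An empty selection means "no restriction": fall back to everything
    # the user can access, matching the documented default.
    selected = enabled_categories or accessible_categories
    tools = []
    for category in accessible_categories:
        if category in selected:
            tools.extend(TOOL_CATEGORIES.get(category, []))
    return tools
```

For example, `tools_for_role(["Web"], ["Web", "Scratchpad"])` yields only the two Web tools, while an empty selection yields all six tools from both accessible categories.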

---

## Sessions

Sessions are named conversation threads that persist across page refreshes.

- Click **Sessions** → **+ New** to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as `nct_*` prefixed IDs.
- A blue **●** badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
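
If you script against the session list, the ID prefix is enough to tell where a session came from. A minimal sketch, assuming that any ID without a known prefix was created in the web UI (`session_source` is a hypothetical helper):

```python
def session_source(session_id: str) -> str:
    """Guess where a session originated from its ID prefix."""
    if session_id.startswith("nct_"):
        return "Nextcloud Talk"
    if session_id.startswith("gc_"):
        return "Google Chat"
    # Assumption: anything else was created in the web UI.
    return "Web UI"
```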

---

## Notes

Notes are injected into a session without triggering an LLM response.

- Click **Note** to toggle note mode. The input border changes colour.
- **Private note** (amber border) — visible only in the UI, never sent to the LLM.
- **Context note** (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the `private / public` label to switch between note types.

---

## Install as App (PWA)

Cortex supports installation as a Progressive Web App — it runs in its own window with no browser chrome.

- **Chrome / Edge (desktop):** Look for the install icon in the address bar, or open the browser menu → **Install Cortex…**
- **Android (Chrome):** Tap ⋮ → **Add to Home Screen**
- **iOS (Safari):** Tap the Share button → **Add to Home Screen**

Once installed, opening Cortex from the home screen or app launcher skips the browser UI entirely.

---

## Backends

Three backends are available:

| Backend | What it is |
|---|---|
| **Claude** | Anthropic Claude via the Claude CLI (OAuth — no API key needed) |
| **Gemini** | Google Gemini via the Gemini CLI |
| **Local** | Any OpenAI-compatible endpoint (Open WebUI, Ollama, OpenRouter, etc.) |

The **Role** toggle in the Context & Memory panel cycles through configured role assignments. Each role maps to a Primary / Backup 1 / Backup 2 model chain set in the Model Registry.

- The active model label appears below the toggle button
- `auto` (default) uses the model assigned to the `chat` role in your Model Registry
- Forcing a specific backend overrides the role assignment for that session

If the active backend fails, a fallback is tried automatically. A **⚡** badge appears on the response when this happens.

Each response shows a **model tag** (bottom-right of message) with the model label and host, so you always know what responded.
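
The Primary / Backup 1 / Backup 2 behaviour described above amounts to trying each model in order until one answers. The sketch below illustrates that idea only. `call_model` is a hypothetical callable, and the real server may handle errors and retries differently.

```python
def call_with_fallback(chain, call_model):
    """Try each model in order; return (reply, model, used_fallback).

    chain: model names ordered Primary, Backup 1, Backup 2.
    call_model: hypothetical callable that takes a model name and returns
        a reply string, raising an exception on failure.
    """
    errors = []
    for index, model in enumerate(chain):
        try:
            reply = call_model(model)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{model}: {exc}")
            continue
        # index > 0 means a backup answered, which is the situation in
        # which the UI shows the fallback badge.
        return reply, model, index > 0
    raise RuntimeError("all models failed: " + "; ".join(errors))
```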

---

## Account Settings

**Navigate to:** ☰ (top-right menu) → **Account**

| Section | What you can do |
|---|---|
| **Account** | View your username and role badge (Admin / User), change your username |
| **Connected Accounts** | See which Google account is linked for OAuth sign-in |
| **Email Allowlist** | Regex patterns controlling which addresses the `email_send` tool can reach |
| **Notifications** | Dedicated page — set channel (Browser Push, NC Talk, Google Chat, email) for proactive messages; test buttons for instant verification |
| **Tool Permissions** | Allow or block specific orchestrator tools for your account |
| **Usage** | Token consumption by model — see below |
| **Browser Cache** | Clear UI preferences stored locally (theme, font size, session ID, etc.) |
| **Model Registry** | Configure AI providers, local hosts, and role assignments |
| **Change Password** | Update your login password |
| **Personas** | List and rename your personas |
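
As an illustration of how **Email Allowlist** patterns could be evaluated, the sketch below checks a recipient against a list of regexes. It assumes whole-address, case-insensitive matching; the actual rules used by `email_send` may differ, and `is_recipient_allowed` is a hypothetical helper.

```python
import re


def is_recipient_allowed(address, patterns):
    """Return True if the address fully matches any allowlist regex."""
    # Assumption: a pattern must match the whole address, so a pattern for
    # one domain cannot be satisfied by a longer look-alike domain.
    return any(re.fullmatch(p, address, flags=re.IGNORECASE) for p in patterns)


patterns = [r".+@example\.com", r"alice@example\.org"]
```

Escaping the dots (`\.`) matters: an unescaped `.` matches any character, which makes a pattern broader than it looks.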

---

## Usage

Token consumption is tracked automatically for API-backed models. **Navigate to:** ☰ → **Account** → **Usage** section.

The table shows all-time totals per model key, with columns for:

| Column | Meaning |
|---|---|
| **Model** | `backend/model-name` key (e.g. `gemini_api/gemini-2.5-flash`, `local/deepseek-v4`) |
| **Calls** | Number of API calls made |
| **Prompt** | Input tokens sent |
| **Output** | Completion tokens received |
| **Total** | Prompt + Output |

Values of 1,000 or more are abbreviated with a `k` suffix (e.g. `24.3k`).
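
If you post-process the raw numbers yourself, the abbreviation is easy to reproduce. A minimal sketch, assuming one decimal place as in the `24.3k` example (the UI's exact rounding is not specified here):

```python
def format_tokens(count: int) -> str:
    """Abbreviate counts of 1,000 or more with a 'k' suffix."""
    if count < 1000:
        return str(count)
    # Assumption: one decimal place, as in the "24.3k" example.
    return f"{count / 1000:.1f}k"
```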

**What is and isn't tracked:**

- ✅ Gemini API calls (orchestrator, distillation)
- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
- ✗ Claude CLI — no structured token data is returned by the subprocess
- ✗ Gemini CLI — same reason

The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.

---

## Model Registry

Configure which AI models are available and which handles each task type.

**New user quick path:** ☰ → **Account** → **Set up OpenRouter →** (the guided wizard adds a host, model, and role assignment in one step).

**Full manual path:** ☰ → **Account** → scroll to **Model Registry** → **Manage models →**

---

### Step 1 — Set up providers and hosts

Do this before adding models — models need a provider account or local host to attach to.

**Anthropic (Claude):** Nothing to configure. Claude uses your existing CLI OAuth session. If Claude isn't working, run `claude auth login` in a terminal.

**Google (Gemini):** Add one entry per API key you want to use:
1. Scroll to **Cloud Providers → Google** → click **+ Add Google account**
2. Enter a label (e.g. "Work", "Personal") and your API key
3. Get a free key at [aistudio.google.com/apikey](https://aistudio.google.com/apikey)

**OpenRouter** (recommended for new users — one key for many models):
1. Get a key at [openrouter.ai/keys](https://openrouter.ai/keys)
2. Scroll to **Local Hosts** → **+ Add host**
3. Label: "OpenRouter", URL: `https://openrouter.ai/api/v1`, paste your key, Type: OpenAI-compatible
4. Click **Fetch models** to verify, then add models from the fetched list

**Other local hosts** (Open WebUI, Ollama, LM Studio, etc.):
1. Scroll to **Local Hosts** → click **+ Add host** to expand the form
2. Enter a label, the API URL (e.g. `http://192.168.1.100:3000`), and optional API key
3. Set **Type**: Open WebUI / Ollama, or OpenAI-compatible
4. Click **Fetch models** on the saved host card to verify connectivity
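
To check a host outside the UI, you can request its model list directly. OpenAI-compatible servers, including OpenRouter, expose it at `{base URL}/models`. The sketch below only builds the request; `models_request` is a hypothetical helper, and it is an assumption that **Fetch models** does something similar. Hosts of other types may use a different path.

```python
import urllib.request


def models_request(base_url: str, api_key: str = "") -> urllib.request.Request:
    """Build a GET request for an OpenAI-compatible model list."""
    url = base_url.rstrip("/") + "/models"
    headers = {"Accept": "application/json"}
    if api_key:
        # OpenAI-compatible APIs expect the key as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers, method="GET")


# To send it (requires network access and, for OpenRouter, a valid key):
# import json
# with urllib.request.urlopen(models_request(url, key), timeout=10) as resp:
#     print([m["id"] for m in json.load(resp)["data"]])
```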

---

### Step 2 — Add models

Scroll to **Add Model**. Select the provider tab, fill in the details, click **Add Model**:

| Tab | What you need |
|---|---|
| **Local** | Select a host (from Step 1) → enter model name, or use **Fetch from host** to pick from a live list |
| **Google** | Select a Gemini model from the catalog → select a Google account (from Step 1) |
| **Anthropic** | Select a Claude model from the catalog → uses your CLI session automatically |

The label and context window size auto-fill from the catalog — edit them if you want. Tags are optional.

---

### Step 3 — Assign models to roles

Scroll to **Role Assignments** at the bottom of the page. Each role has **Primary**, **Backup 1**, and **Backup 2** slots — Primary is tried first, then backups in order. Changes save automatically.

| Role | Used for |
|---|---|
| **Chat** | Regular conversation |
| **Orchestrator** | Tools mode (⚡) tool loop |
| **Distill** | Memory distillation (short / mid / long) |
| **Coder** | Code-focused tasks |
| **Research** | Long-context research tasks |

Leave all slots empty to use the server default.

**Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).

**Inject timestamp:** Each role card has an "Inject current date & time into system prompt" checkbox (default on). Disable it for pure processing roles (summarizer, classifier, translator) that don't need clock awareness.

---

## Nextcloud Talk Bot

Inara is registered as a bot in Nextcloud Talk.

- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to.
- The webhook returns `200 OK` immediately; the reply happens asynchronously.
- Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable the bot.
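
The "acknowledge now, reply later" pattern described above can be sketched with a queue and a worker thread. This is a simplified illustration, not Cortex's implementation: the payload shape, `handle_webhook`, and the `replies` list (standing in for posting back to Talk) are all assumptions.

```python
import queue
import threading

inbox = queue.Queue()
replies = []  # stands in for posting replies back to the Talk conversation


def handle_webhook(payload):
    """Acknowledge immediately and defer the slow work to the worker."""
    inbox.put(payload)
    return 200  # HTTP status returned to Talk right away


def worker():
    """Process queued messages one at a time until a None sentinel arrives."""
    while True:
        payload = inbox.get()
        if payload is None:
            break
        # Placeholder for the LLM call and the reply to the conversation.
        replies.append(f"reply to: {payload['message']}")


thread = threading.Thread(target=worker, daemon=True)
thread.start()
status = handle_webhook({"message": "hello"})
inbox.put(None)  # tell the worker to stop
thread.join()
```

Returning before the model has answered keeps the webhook from timing out on slow responses.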

---

## Google Chat Bot

Inara is available as a bot in Google Chat (One Sky IT Workspace).

- Send Inara a direct message in Google Chat to start a conversation.
- Each DM thread is its own session (`gc_spaces/*` prefix) — history persists across messages.
- Responses are synchronous — Google Chat displays the reply directly in the thread.
- To add Inara to a space: open the space, add a person/app, search for **Inara**.
- Sessions from Google Chat appear as `gc_*` prefixed IDs in the Sessions panel.

---

## Files (Identity Editor)

The **Files** button opens an editor for your persona's identity and memory files:

| File | Purpose |
|---|---|
| `SOUL.md` | Core personality, values, and voice |
| `IDENTITY.md` | Role, capabilities, and context |
| `USER.md` | Your profile, preferences, and history |
| `PROTOCOLS.md` | Behavioural rules and communication protocols |
| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
| `MEMORY_LONG.md` | Permanent curated long-term memory |
| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
| `HELP.md` | This file — persona-specific additions appended below |
| `email_allowlist.json` | Regex patterns for permitted `email_send` recipients (one per line) |

Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.

The **Audit Log** group at the bottom of the sidebar (collapsed by default) lists tool call logs by date (`YYYY-MM-DD.jsonl`). Click any date to view a read-only table of every orchestrator tool call: time, tool name, status, model, args, and result snippet. Status is colour-coded: green = ok, red = error, amber = denied.
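
Because each log is JSONL (one JSON object per line), it is straightforward to summarize outside the UI. A minimal sketch that counts entries per status follows. The key names `time`, `tool`, and `status` are assumptions based on the columns listed above, not a documented schema.

```python
import json
from collections import Counter


def status_counts(jsonl_text: str) -> Counter:
    """Count audit entries per status in one day's JSONL log."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        # "status" is an assumed key name.
        counts[entry.get("status", "unknown")] += 1
    return counts


sample = "\n".join([
    '{"time": "09:15:02", "tool": "web_search", "status": "ok"}',
    '{"time": "09:15:40", "tool": "shell_exec", "status": "denied"}',
    '{"time": "09:16:11", "tool": "http_fetch", "status": "ok"}',
])
```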

---

## Push Notifications

Cortex can send browser push notifications — even when the tab is closed.

- Open **☰ → Enable notifications** and accept the browser permission prompt.
- Once enabled, the button shows **Notifications on** (in accent colour).
- Click again to disable. Subscriptions are stored per-device.
- The orchestrator's `web_push` tool lets Inara send you a push proactively (e.g. when a long task completes).

**Notification channel settings:** ☰ → **Account** → **Notification settings →** — choose Browser Push, Email, Nextcloud Talk, or Google Chat as the channel Inara uses for scheduled reminders, cron job completions, and memory digests. Use the **Send Test Notification** button to verify your setup, or **Check Reminders Now** to trigger the reminder check immediately.

---

## Context & Memory ( ⚙ panel )

### Context Tiers

Controls how much context is prepended to each LLM call:

| Tier | Loads | ~Tokens |
|---|---|---|
| **Min** | SOUL + IDENTITY + USER summary | ~1,500 |
| **Std** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **Ext** | + last 2 raw session logs | ~15,000 |
| **Full** | + last 7 raw session logs | ~50,000 |

Default is **Std**. Use **Min** for small/local models. Use **Ext** or **Full** for complex multi-session tasks.
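
One rough way to choose a tier is to compare the approximate sizes above with the model's context window. The helper below is a hypothetical rule of thumb, not something Cortex does for you; the `reserve` parameter (tokens kept free for the conversation and the reply) is an assumption.

```python
# Approximate context sizes from the table above, in tokens.
TIER_TOKENS = {"Min": 1_500, "Std": 5_000, "Ext": 15_000, "Full": 50_000}


def largest_tier_that_fits(context_window: int, reserve: int = 4_000) -> str:
    """Pick the deepest tier whose context still leaves `reserve` tokens free."""
    budget = context_window - reserve
    best = "Min"  # fall back to the smallest tier even if nothing fits
    for tier, tokens in TIER_TOKENS.items():  # iterates smallest to largest
        if tokens <= budget:
            best = tier
    return best
```

With the default reserve, an 8,192-token model lands on **Min** and a 32,768-token model on **Ext**.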

### Memory Layers

Three independently toggleable memory files, loaded **Long → Mid → Short**:

| Layer | File | Contents |
|---|---|---|
| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, profile highlights |
| **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session logs |

Toggle any layer off to save tokens for a focused conversation.

### Memory Distillation

Distillation builds up the memory layers from raw session logs. It runs automatically on a schedule, and you can also trigger it manually from the ⚙ panel:

| Button | What it does |
|---|---|
| **short** | Rolls up recent session log files → `MEMORY_SHORT.md` (fast, no LLM) |
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
| **all** | Runs short → mid → long in sequence |

**Recommended workflow:** run **short** after any productive session; **mid** weekly; **long** monthly.

---

## Keyboard Shortcuts

| Keys | Action |
|---|---|
| `Ctrl+Enter` | Send message (default mode) |
| `Enter` | Send (when in Enter mode) |
| `Shift+Enter` | New line in message input |
| `Ctrl+Enter` | Save inline message edit |
| `Esc` | Cancel inline edit / close any open modal |
| `Ctrl+S` | Save file (Files modal) |

---

## API Reference

For direct access or scripting:

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message — returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |
| `GET` | `/events` | SSE stream for real-time Talk activity |
| `POST` | `/note` | Inject a context note into a session |
| `GET` | `/files` | List identity files |
| `GET` | `/files/{name}` | Read a file |
| `PUT` | `/files/{name}` | Write a file |
| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
| `POST` | `/distill/all` | Run all three distillation steps |
| `GET` | `/distill/status` | Scheduler status and next run times |
| `POST` | `/orchestrate` | Submit an agent task — returns `{"job_id": "..."}` |
| `GET` | `/orchestrate/{job_id}` | Poll job status and result |
| `GET` | `/settings/models` | Model registry UI |
| `POST` | `/api/models/role` | Set a role assignment (JSON body) |
| `POST` | `/api/models/role-config` | Set per-role tool list and system prompt append |
| `GET` | `/api/push/vapid-key` | VAPID public key (for push subscription) |
| `POST` | `/api/push/subscribe` | Register a push subscription |
| `DELETE` | `/api/push/subscribe` | Remove a push subscription |
| `POST` | `/api/push/test` | Send a test notification via configured channel |
| `POST` | `/api/push/reminders/check` | Run reminder check immediately; returns `{"reminders_found": n}` |
| `GET` | `/api/audit/files` | List available audit log dates (own data) |
| `GET` | `/api/audit/day?date=` | Tool call entries for a specific date (own data) |
| `GET` | `/api/audit/recent` | Recent tool calls across days (admin) |
| `GET` | `/api/audit/stats` | Tool call counts by tool/status/user (admin) |
| `GET` | `/api/usage` | Full daily token usage log (own data) |
| `GET` | `/api/usage/summary` | Per-model token totals, all time (own data) |
| `GET` | `/api/usage/all` | Per-model totals for all users (admin) |
| `GET` | `/setup/model` | Guided OpenRouter setup form (Step 3 / standalone) |
| `POST` | `/setup/model` | Save OpenRouter host + model + assign to chat role |
| `GET` | `/health` | Health check — returns `{"status": "ok"}` |

Chat request body (`POST /chat`):
```json
{
  "message": "string",
  "session_id": "string | null",
  "tier": 2,
  "model": "claude | gemini | local | null",
  "include_long": true,
  "include_mid": true,
  "include_short": true
}
```
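
A minimal client sketch for this endpoint follows. It builds the request body shown above and extracts `data:` lines from the returned Server-Sent Events stream. The base URL is a placeholder, authentication is not covered (your deployment will likely require a logged-in session), and the content of each event is not documented here, so the parser yields the raw payload strings.

```python
import json
import urllib.request


def build_chat_request(base_url, message, session_id=None, tier=2):
    """Build a POST /chat request using the body shape shown above."""
    body = {
        "message": message,
        "session_id": session_id,
        "tier": tier,
        "model": None,  # null: no specific backend forced
        "include_long": True,
        "include_mid": True,
        "include_short": True,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )


def sse_data(lines):
    """Yield the payload of each `data:` line in an SSE stream.

    Simplified: comment lines and other fields are skipped, and multi-line
    events are not joined.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if line.startswith("data:"):
            payload = line[5:]
            # The SSE format allows one optional space after the colon.
            yield payload[1:] if payload.startswith(" ") else payload


# Usage (placeholder URL; requires a running server):
# with urllib.request.urlopen(build_chat_request("http://localhost:8000", "Hello")) as resp:
#     for chunk in sse_data(resp):
#         print(chunk)
```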

---

*Cortex is a self-hosted personal AI platform. Named after the 'verse-wide communications network in Firefly.*