Cortex UI — Help & Reference
Last updated: 2026-05-09
Getting Started
If this is your first time using Cortex, you need one thing before the chat will work: an AI model connected to your account.
Fastest path — OpenRouter: OpenRouter gives you access to Claude, Gemini, and dozens of other models with a single API key.
- Get a free API key at openrouter.ai/keys
- Go to ☰ → Account → [Set up OpenRouter →] (shown automatically if no model is configured)
- Paste your key, pick a starting model, click Connect
That's it — you're ready to chat.
Already past setup but seeing errors? Go to ☰ → Account → Model Registry → Manage models and confirm a model is assigned to the Chat role (Primary slot). If all slots are empty, add a model first.
Header Controls
| Button | What it does |
|---|---|
| Sessions | Open the sessions panel — list, resume, or start sessions |
| N (sliders icon) | Open the Context & Memory panel (N = current context tier) |
| ☰ | Settings menu — Files, push notification toggle, Account, Sign Out |
| ? | Open this help panel |
The Context & Memory panel (sliders icon with tier number) contains all configuration options:
| Section | Controls |
|---|---|
| Context Tier | T1 – T4 context depth |
| Memory Layers | Toggle Long / Mid / Short memory on/off |
| Distill Memory | Manually trigger Short / Mid / Long / All distillation |
| Role | Active LLM role — click to cycle through configured role assignments |
| Display | Aa cycles font size · ☾ toggles theme · S/M/L cycles input area height · ⌃↵ toggles send shortcut |
All settings persist in localStorage across page refreshes.
Chat
- Send: Ctrl+Enter by default. Click ⌃↵ in the input controls to toggle to plain Enter mode.
- Stop: Click Stop to cancel an in-progress response at any time.
- Edit a message: Hover over any message → click edit. Ctrl+Enter saves, Esc cancels.
- Delete a message: Hover over any message → click del. Removes from session history.
- Copy a response: Hover over any assistant message → click copy.
- New line while typing: Shift+Enter (works in both Ctrl+Enter mode and Enter mode).
Each assistant response shows a small model tag in the bottom-right corner identifying which model and host responded.
Tools (⚡)
Click the ⚡ button in the input row to enable the Tools toggle. When lit (amber), Send changes to Run and messages are routed through the orchestrator instead of directly to the chat model.
The orchestrator runs a multi-step tool loop:
- The orchestrator model reasons about the request and calls tools as needed
- It produces an enriched summary of what it found
- The responder model (set by the active Role) receives that context and writes the final user-facing reply
- A ⚡ N tool calls: … note appears below the response listing what was used
The ⚡ toggle is independent of the Role selector — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in Account → Model Registry → Role Assignments → Orchestrator.
Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.
Orchestrated sessions persist to history exactly like regular chat.
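For scripting, the API Reference lists POST /orchestrate (returns {"job_id": "..."}) and GET /orchestrate/{job_id} for polling. The sketch below shows one way to submit a task and wait for it. Several details are assumptions, not documented behaviour: the base URL, the {"message": ...} request body, and the "status" field with values "done" and "error". Authentication is not handled.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "http://localhost:8000"  # hypothetical; use your own Cortex address


def http_json(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Send a JSON request and decode the JSON reply (no auth handling)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_orchestrated(
    message: str,
    fetch: Callable[..., dict] = http_json,
    interval: float = 1.0,
    max_polls: int = 60,
) -> dict:
    """Submit a task, then poll until the job reports a terminal status."""
    # Assumed request body shape; only the job_id reply is documented.
    job_id = fetch("POST", "/orchestrate", {"message": message})["job_id"]
    for _ in range(max_polls):
        state = fetch("GET", f"/orchestrate/{job_id}")
        # "status" and its values are hypothetical field names.
        if state.get("status") in ("done", "error"):
            return state
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing `fetch` as a parameter keeps the polling logic separate from the transport, so it can be exercised with a stub before pointing it at a real server.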
Available Tools
45 tools across 12 categories. Each tool schema is sent to the model on every orchestrated call — fewer active tools means fewer tokens per call.
| Category | Tools |
|---|---|
| Web | web_search, http_fetch |
| Files | file_read, file_list, file_write, session_search |
| Shell | shell_exec, claude_allow_dir |
| System | cortex_restart, cortex_logs, cortex_status, cortex_update |
| Tasks | task_list, task_create, task_update, task_complete |
| Cron | cron_list, cron_add, cron_remove, cron_toggle |
| Reminders | reminders_add, reminders_list, reminders_remove, reminders_clear |
| Scratchpad | scratch_read, scratch_write, scratch_append, scratch_clear |
| Notifications | web_push, email_send, nc_talk_send |
| Aether Journals | ae_journal_list/search, ae_journal_entries_list, ae_journal_entry_read/create/update/disable/append/prepend |
| Agent Notes | agent_notes_read, agent_notes_write, agent_notes_append, agent_notes_clear |
| Agents | spawn_agent |
File, Shell, System, Agents, and some Notification tools are admin-only and not visible to regular users.
Per-Role Tool Sets
Each role can be configured with a specific subset of tool categories. When a role has a tool subset configured, only those tools are sent to the orchestrator — the rest are invisible to the model for that session.
Example: a Coder role might only need Web, Files, Shell, and Agent Notes. A Research role might only need Web. Configuring this avoids sending schemas for 30+ irrelevant tools on every call.
Configure per-role tool sets in Account → Model Registry → Role Assignments — expand a role card to see the category checkboxes. The default (no checkboxes selected) sends all tools the user has access to.
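The selection rule can be illustrated with a small sketch. It is not Cortex's actual code: the function name and data layout are hypothetical, and only a few categories from the table above are included. It shows the documented behaviour that an empty selection falls back to every tool the user can access.

```python
from typing import Dict, Iterable, List

# A subset of the categories listed under Available Tools.
TOOL_CATEGORIES: Dict[str, List[str]] = {
    "Web": ["web_search", "http_fetch"],
    "Files": ["file_read", "file_list", "file_write", "session_search"],
    "Shell": ["shell_exec", "claude_allow_dir"],
    "Agent Notes": [
        "agent_notes_read",
        "agent_notes_write",
        "agent_notes_append",
        "agent_notes_clear",
    ],
}


def tools_for_role(
    selected: Iterable[str],
    accessible: Dict[str, List[str]] = TOOL_CATEGORIES,
) -> List[str]:
    """Return the tool names whose schemas would be sent for a role."""
    chosen = set(selected)
    if not chosen:
        # No checkboxes selected: default to all accessible tools.
        return [tool for tools in accessible.values() for tool in tools]
    return [
        tool
        for category, tools in accessible.items()
        if category in chosen
        for tool in tools
    ]
```

For example, `tools_for_role({"Web"})` yields only the two Web tools, while `tools_for_role(set())` yields all twelve tools in this sample registry.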
Sessions
Sessions are named conversation threads that persist across page refreshes.
- Click Sessions → + New to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as nct_* prefixed IDs.
- A blue ● badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
Notes
Notes are injected into a session without triggering an LLM response.
- Click Note to toggle note mode. The input border changes colour.
- Private note (amber border) — visible only in the UI, never sent to the LLM.
- Context note (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the private / public label to switch between note types.
Install as App (PWA)
Cortex supports installation as a Progressive Web App — it runs in its own window with no browser chrome.
- Chrome / Edge (desktop): Look for the install icon in the address bar, or open the browser menu → Install Cortex…
- Android (Chrome): Tap ⋮ → Add to Home Screen
- iOS (Safari): Tap the Share button → Add to Home Screen
Once installed, opening Cortex from the home screen or app launcher skips the browser UI entirely.
Backends
Three backends are available:
| Backend | What it is |
|---|---|
| Claude | Anthropic Claude via the Claude CLI (OAuth — no API key needed) |
| Gemini | Google Gemini via the Gemini CLI |
| Local | Any OpenAI-compatible endpoint (Open WebUI, Ollama, OpenRouter, etc.) |
The Role toggle in the Context & Memory panel cycles through configured role assignments. Each role maps to a Primary / Backup 1 / Backup 2 model chain set in the Model Registry.
- The active model label appears below the toggle button
- auto (default) uses the model assigned to the chat role in your Model Registry
- Forcing a specific backend overrides the role assignment for that session
If the active backend fails, a fallback is tried automatically. A ⚡ badge appears on the response when this happens.
Each response shows a model tag (bottom-right of message) with the model label and host, so you always know what responded.
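The Primary / Backup 1 / Backup 2 chain behaves like an ordered fallback. A minimal sketch of that idea follows, assuming a hypothetical `send` callable that raises on failure; the function name and return shape are illustrative, not the server's implementation.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def call_with_fallback(
    chain: Sequence[Optional[str]],
    send: Callable[[str], str],
) -> Tuple[str, str, bool]:
    """Try each configured model in order.

    chain holds model names ordered Primary, Backup 1, Backup 2;
    None marks an empty slot. Returns (reply, model, used_fallback).
    """
    errors: List[str] = []
    for model in chain:
        if model is None:
            continue  # empty slot, nothing to try
        try:
            reply = send(model)
        except Exception as exc:  # any failure moves on to the next slot
            errors.append(f"{model}: {exc}")
            continue
        # used_fallback is True only if an earlier model actually failed.
        return reply, model, bool(errors)
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

The `used_fallback` flag corresponds to the condition under which the UI would show the ⚡ badge on a response.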
Account Settings
Navigate to: ☰ (top-right menu) → Account
| Section | What you can do |
|---|---|
| Account | View your username, role badge (Admin / User), rename your username |
| Connected Accounts | See which Google account is linked for OAuth sign-in |
| Email Allowlist | Regex patterns controlling which addresses the email_send tool can reach |
| Notifications | Dedicated page — set channel (Browser Push, NC Talk, Google Chat, email) for proactive messages; test buttons for instant verification |
| Tool Permissions | Allow or block specific orchestrator tools for your account |
| Usage | Token consumption by model — see below |
| Browser Cache | Clear UI preferences stored locally (theme, font size, session ID, etc.) |
| Model Registry | Configure AI providers, local hosts, and role assignments |
| Change Password | Update your login password |
| Personas | List and rename your personas |
Usage
Token consumption is tracked automatically for API-backed models. Navigate to: ☰ → Account → Usage section.
The table shows all-time totals per model key, with columns for:
| Column | Meaning |
|---|---|
| Model | backend/model-name key (e.g. gemini_api/gemini-2.5-flash, local/deepseek-v4) |
| Calls | Number of API calls made |
| Prompt | Input tokens sent |
| Output | Completion tokens received |
| Total | Prompt + Output |
Values ≥ 1,000 are displayed as k (e.g. 24.3k).
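A rough sketch of that display rule, assuming one decimal place as in the 24.3k example (the function name is hypothetical, and how the UI treats values in the millions is not documented here):

```python
def format_tokens(count: int) -> str:
    """Render a token count the way the Usage table is described."""
    if count < 1000:
        return str(count)
    # One decimal place is inferred from the "24.3k" example.
    return f"{count / 1000:.1f}k"
```

With this rule, `format_tokens(24300)` gives `"24.3k"` and `format_tokens(999)` gives `"999"`.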
What is and isn't tracked:
- ✅ Gemini API calls (orchestrator, distillation)
- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
- ✗ Claude CLI — no structured token data is returned by the subprocess
- ✗ Gemini CLI — same reason
The raw data lives in home/{username}/usage.json and is also accessible via the Files panel or the API.
Model Registry
Configure which AI models are available and which handles each task type.
New user quick path: ☰ → Account → Set up OpenRouter → (the guided wizard adds a host, model, and role assignment in one step).
Full manual path: ☰ → Account → scroll to Model Registry → Manage models →
Step 1 — Set up providers and hosts
Do this before adding models — models need a provider account or local host to attach to.
Anthropic (Claude): Nothing to configure. Claude uses your existing CLI OAuth session. If Claude isn't working, run claude auth login in a terminal.
Google (Gemini): Add one entry per API key you want to use:
- Scroll to Cloud Providers → Google → click + Add Google account
- Enter a label (e.g. "Work", "Personal") and your API key
- Get a free key at aistudio.google.com/apikey
OpenRouter (recommended for new users — one key for many models):
- Get a key at openrouter.ai/keys
- Scroll to Local Hosts → + Add host
- Label: "OpenRouter", URL: https://openrouter.ai/api/v1, paste your key, Type: OpenAI-compatible
- Click Fetch models to verify, then add models from the fetched list
Other local hosts (Open WebUI, Ollama, LM Studio, etc.):
- Scroll to Local Hosts → click + Add host to expand the form
- Enter a label, the API URL (e.g. http://192.168.1.100:3000), and optional API key
- Set Type: Open WebUI / Ollama, or OpenAI-compatible
- Click Fetch models on the saved host card to verify connectivity
Step 2 — Add models
Scroll to Add Model. Select the provider tab, fill in the details, click Add Model:
| Tab | What you need |
|---|---|
| Local | Select a host (from Step 1) → enter model name, or use Fetch from host to pick from a live list |
| Google | Select a Gemini model from the catalog → select a Google account (from Step 1) |
| Anthropic | Select a Claude model from the catalog → uses your CLI session automatically |
The label and context window size auto-fill from the catalog — edit them if you want. Tags are optional.
Step 3 — Assign models to roles
Scroll to Role Assignments at the bottom of the page. Each role has Primary, Backup 1, and Backup 2 slots — Primary is tried first, then backups in order. Changes save automatically.
| Role | Used for |
|---|---|
| Chat | Regular conversation |
| Orchestrator | Agent mode tool loop |
| Distill | Memory distillation (short / mid / long) |
| Coder | Code-focused tasks |
| Research | Long-context research tasks |
Leave all slots empty to use the server default.
Per-role tool sets: Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).
Inject timestamp: Each role card has an "Inject current date & time into system prompt" checkbox (default on). Disable it for pure processing roles (summarizer, classifier, translator) that don't need clock awareness.
Nextcloud Talk Bot
Inara is registered as a bot in Nextcloud Talk.
- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to.
- The webhook returns 200 OK immediately; the reply happens asynchronously.
- Real-time updates stream to the web UI via SSE, so you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable the bot.
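If you want to consume the same live updates from a script, the GET /events endpoint is an SSE stream. The sketch below parses the standard SSE wire format (event and data fields, blank line as separator) from any iterable of text lines. The event names and payloads Cortex emits are not documented here, so the sample input in the usage note is purely hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from SSE-formatted text lines."""
    event = "message"  # default event type in the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

Usage, with made-up input: `list(parse_sse(["event: talk\n", "data: hello\n", "\n"]))` produces one pair, `("talk", "hello")`. In practice the lines would come from a streaming HTTP response decoded as UTF-8.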
Google Chat Bot
Inara is available as a bot in Google Chat (One Sky IT Workspace).
- Send Inara a direct message in Google Chat to start a conversation.
- Each DM thread is its own session (gc_spaces/* prefix); history persists across messages.
- Responses are synchronous: Google Chat displays the reply directly in the thread.
- To add Inara to a space: open the space, add a person/app, search for Inara.
- Sessions from Google Chat appear as gc_* prefixed IDs in the Sessions panel.
Files (Identity Editor)
The Files button opens an editor for your persona's identity and memory files:
| File | Purpose |
|---|---|
| SOUL.md | Core personality, values, and voice |
| IDENTITY.md | Role, capabilities, and context |
| USER.md | Your profile, preferences, and history |
| PROTOCOLS.md | Behavioural rules and communication protocols |
| CONTEXT_TIERS.md | Defines what gets loaded at each context tier |
| MEMORY_LONG.md | Permanent curated long-term memory |
| MEMORY_MID.md | Rolling mid-term digest (LLM-distilled) |
| MEMORY_SHORT.md | Recent session rollup (auto-aggregated) |
| HELP.md | This file; persona-specific additions are appended below |
| email_allowlist.json | Regex patterns for permitted email_send recipients (one per line) |
Toggle preview / edit to switch between rendered markdown and raw text. Ctrl+S saves, Esc closes.
The Audit Log group at the bottom of the sidebar (collapsed by default) lists tool call logs by date (YYYY-MM-DD.jsonl). Click any date to view a read-only table of every orchestrator tool call: time, tool name, status, model, args, and result snippet. Status is colour-coded: green = ok, red = error, amber = denied.
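Because the audit logs are JSONL (one JSON object per line), they are easy to summarise offline. The sketch below counts entries by status. The key name "status" and the sample records are assumptions based on the column list above; check a real log file for the actual field names.

```python
import json
from collections import Counter
from typing import Iterable


def status_counts(jsonl_lines: Iterable[str]) -> Counter:
    """Count audit entries per status value (ok / error / denied)."""
    counts: Counter = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        # "status" is an assumed key name, not confirmed by this document.
        counts[entry.get("status", "unknown")] += 1
    return counts
```

To use it on a downloaded log, open the file and pass the file object directly, since iterating a text file yields its lines.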
Push Notifications
Cortex can send browser push notifications — even when the tab is closed.
- Open ☰ → Enable notifications and accept the browser permission prompt.
- Once enabled, the button shows Notifications on (in accent colour).
- Click again to disable. Subscriptions are stored per-device.
- The orchestrator's web_push tool lets Inara send you a push proactively (e.g. when a long task completes).
Notification channel settings: go to ☰ → Account → Notification settings → and choose Browser Push, Email, Nextcloud Talk, or Google Chat as the channel Inara uses for scheduled reminders, cron job completions, and memory digests. Use the Send Test Notification button to verify your setup, or Check Reminders Now to trigger the reminder check immediately.
Context & Memory ( ⚙ panel )
Context Tiers
Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| Min | SOUL + IDENTITY + USER summary | ~1,500 |
| Std | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| Ext | + last 2 raw session logs | ~15,000 |
| Full | + last 7 raw session logs | ~50,000 |
Default is Std. Use Min for small/local models. Use Ext or Full for complex multi-session tasks.
Memory Layers
Three independently toggleable memory files, loaded Long → Mid → Short:
| Layer | File | Contents |
|---|---|---|
| Long | MEMORY_LONG.md | Permanent facts: origin, key decisions, profile highlights |
| Mid | MEMORY_MID.md | Rolling digest of recent weeks, LLM-distilled from Short |
| Short | MEMORY_SHORT.md | Recent session rollup, auto-aggregated from session logs |
Toggle any layer off to save tokens for a focused conversation.
Memory Distillation
Distillation builds up the memory layers from raw session logs. Runs automatically on a schedule; trigger manually via the ⚙ panel:
| Button | What it does |
|---|---|
| short | Rolls recent session log files → MEMORY_SHORT.md (fast, no LLM) |
| mid | LLM summarizes MEMORY_SHORT.md → MEMORY_MID.md |
| long | LLM integrates MEMORY_MID.md → MEMORY_LONG.md |
| all | Runs short → mid → long in sequence |
Recommended workflow: run short after any productive session; mid weekly; long monthly.
Keyboard Shortcuts
| Keys | Action |
|---|---|
| Ctrl+Enter | Send message (default mode) |
| Enter | Send (when in Enter mode) |
| Shift+Enter | New line in message input |
| Ctrl+Enter | Save inline message edit |
| Esc | Cancel inline edit / close any open modal |
| Ctrl+S | Save file (Files modal) |
API Reference
For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| POST | /chat | Send a message; returns SSE stream |
| GET | /backend | Get current primary/fallback backends |
| POST | /backend | Set primary backend ({"primary": "claude"}) |
| GET | /sessions | List all sessions |
| GET | /history/{id} | Get session message history |
| PUT | /history/{id} | Replace full session history |
| GET | /events | SSE stream for real-time Talk activity |
| POST | /note | Inject a context note into a session |
| GET | /files | List identity files |
| GET | /files/{name} | Read a file |
| PUT | /files/{name} | Write a file |
| POST | /distill/short | Aggregate session logs → MEMORY_SHORT |
| POST | /distill/mid | Summarize short → MEMORY_MID (LLM) |
| POST | /distill/long | Integrate mid → MEMORY_LONG (LLM) |
| POST | /distill/all | Run all three distillation steps |
| GET | /distill/status | Scheduler status and next run times |
| POST | /orchestrate | Submit an agent task; returns {"job_id": "..."} |
| GET | /orchestrate/{job_id} | Poll job status and result |
| GET | /settings/models | Model registry UI |
| POST | /api/models/role | Set a role assignment (JSON body) |
| POST | /api/models/role-config | Set per-role tool list and system prompt append |
| GET | /api/push/vapid-key | VAPID public key (for push subscription) |
| POST | /api/push/subscribe | Register a push subscription |
| DELETE | /api/push/subscribe | Remove a push subscription |
| POST | /api/push/test | Send a test notification via configured channel |
| POST | /api/push/reminders/check | Run reminder check immediately; returns {"reminders_found": n} |
| GET | /api/audit/files | List available audit log dates (own data) |
| GET | /api/audit/day?date= | Tool call entries for a specific date (own data) |
| GET | /api/audit/recent | Recent tool calls across days (admin) |
| GET | /api/audit/stats | Tool call counts by tool/status/user (admin) |
| GET | /api/usage | Full daily token usage log (own data) |
| GET | /api/usage/summary | Per-model token totals, all time (own data) |
| GET | /api/usage/all | Per-model totals for all users (admin) |
| GET | /setup/model | Guided OpenRouter setup form (Step 3 / standalone) |
| POST | /setup/model | Save OpenRouter host + model + assign to chat role |
| GET | /health | Health check; returns {"status": "ok"} |
Chat request body (POST /chat):
{
"message": "string",
"session_id": "string | null",
"tier": 2,
"model": "claude | gemini | local | null",
"include_long": true,
"include_mid": true,
"include_short": true
}
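As an illustration, the body above could be assembled and wrapped in a request like this. It is a sketch under assumptions: the base URL is a placeholder, the helper names are hypothetical, and authentication (cookies or tokens) is not covered because this document does not describe it. The request is only constructed, not sent.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # hypothetical; use your own Cortex address
ALLOWED_MODELS = (None, "claude", "gemini", "local")


def build_chat_payload(
    message: str,
    session_id: Optional[str] = None,
    tier: int = 2,
    model: Optional[str] = None,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> dict:
    """Build a POST /chat body with the fields listed above."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {ALLOWED_MODELS}")
    return {
        "message": message,
        "session_id": session_id,
        "tier": tier,
        "model": model,
        "include_long": include_long,
        "include_mid": include_mid,
        "include_short": include_short,
    }


def chat_request(payload: dict) -> urllib.request.Request:
    """Wrap the payload in a POST request; the reply is an SSE stream."""
    return urllib.request.Request(
        BASE_URL + "/chat",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
    )
```

Passing the resulting request to `urllib.request.urlopen` would open the stream, which can then be read line by line.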
Cortex is a self-hosted personal AI platform, named after the 'verse-wide communications network in Firefly.