Restructures persona storage from a flat personas/{name}/ layout to
home/{username}/persona/{name}/, mirroring Linux home directories.
Changes:
- persona.py: two ContextVars (user + persona), Linux-style name validation,
set_context(), get_user(), get_persona(), validate(), list_users(),
list_user_personas(); persona_path() takes (username, name)
- config.py: replaces personas_dir with home_dir + home_root()
- git mv personas/inara → home/scott/persona/inara (history preserved)
- home/holly/persona/tina/: Holly's persona stub added
- cron_runner.py: all storage functions take (username, persona) params
- tools/cron.py: stamps user + persona on jobs; APScheduler IDs are
{user}:{persona}:{job_id} to prevent collisions across users
- memory_distiller.py: distill_short/mid/long take (username, persona);
added missing Path + settings imports
- scheduler.py: _load_user_crons() iterates home/*/persona/* (two-level)
- routers/chat.py, orchestrator.py: user field added; set_context() called
- tests/conftest.py: home_root fixture with two-level structure;
patches home_dir instead of personas_dir
- tests/test_persona.py: fully rewritten for two-level API
- tests/test_api_files.py: updated fixture name and path
- .env.default: documents HOME_DIR setting; scrubs stale API key
- CLAUDE.md, README.md: directory maps updated for new layout
All 80 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cortex UI — Help & Reference
This file is loaded into Inara's context at Tier 2+ so she can help Scott navigate the interface. It is also displayed in the web UI via the ? button.
Last updated: 2026-03-20
Header Controls
| Button | What it does |
|---|---|
| Sessions | Open the sessions panel — list, resume, or start sessions |
| Files | Open the identity file editor (SOUL, MEMORY, etc.) |
| ⚙ N | Open the Settings panel (N = current context tier) |
| ? | Open this help panel |
The ⚙ Settings panel contains all configuration options:
| Section | Controls |
|---|---|
| Context Tier | T1 – T4 context depth |
| Memory Layers | Toggle Long / Mid / Short memory on/off |
| Distill Memory | Manually trigger short / mid / long / all distillation |
| Backend | Active LLM backend — click to toggle claude ↔ gemini |
| Display | Aa/A+/A− font size cycle · ☾/☀ theme toggle |
All header settings (theme, font size, tier, memory layers) persist in localStorage across page refreshes.
Chat
- Send: `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
- Stop: Click Stop to cancel an in-progress response at any time.
- Edit a message: Hover over any message → click edit. `Ctrl+Enter` saves, `Esc` cancels.
- Delete a message: Hover over any message → click del. Removes from session history.
- Copy a response: Hover over any assistant message → click copy.
- New line while typing: `Shift+Enter` (in `Ctrl+Enter` mode) or `Shift+Enter` / Enter (in Enter mode).
Agent Mode
Click the Agent button in the input row to enable Agent mode. The button highlights and Send changes to Run.
In Agent mode, messages are routed through the orchestrator instead of directly to Claude:
- Gemini runs a tool loop — searches the web, reads files, checks tasks, calls APIs as needed
- Claude receives the enriched context and writes the final response
- A `⚡ N tool calls: …` note appears below the response listing what was used
Agent mode is best for tasks that require research, multi-step reasoning, or tool use (e.g. "search for X", "add a task", "what's on my list?"). Regular chat is faster for conversational turns.
Agent mode sessions persist to history exactly like regular chat — they survive page refreshes and appear in the Sessions panel.
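For scripting, the same flow can be driven through the `/orchestrate` endpoints listed in the API Reference. The following is a minimal sketch, not Cortex's own client. It assumes Cortex listens on `http://localhost:8000`, that the submit body uses a `message` field, and that the poll response carries a `status` field with terminal values `done` and `error`. None of those are confirmed by this file; only the `job_id` in the submit response is documented. `http_json` and `run_agent_task` are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Optional

BASE_URL = "http://localhost:8000"  # assumption: host and port are not documented here


def http_json(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Send a JSON request to Cortex and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def run_agent_task(
    message: str,
    call: Callable[..., Any] = http_json,
    poll_seconds: float = 2.0,
    max_polls: int = 150,
) -> dict:
    """Submit an agent task, then poll until it reaches a terminal state."""
    # Documented: POST /orchestrate returns {"job_id": "..."}.
    # Assumed: the request body field is called "message".
    job_id = call("POST", "/orchestrate", {"message": message})["job_id"]
    for _ in range(max_polls):
        job = call("GET", f"/orchestrate/{job_id}")
        # Assumed: a "status" field with "done" / "error" as terminal values.
        if job.get("status") in {"done", "error"}:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing a different `call` function makes it possible to exercise the polling logic without a running server.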
Sessions
Sessions are named conversation threads that persist across page refreshes.
- Click Sessions → + New to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as `nct_*` prefixed IDs.
- A blue ● badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
Notes
Notes are injected into a session without triggering an LLM response.
- Click Note to toggle note mode. The input border changes colour.
- Private note (amber border) — visible only in the UI, never sent to the LLM.
- Context note (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the `private / public` label to switch between note types.
Backends
- Claude CLI and Gemini CLI are both available. One is primary, the other is fallback.
- Click ⚙ → Backend to toggle between `claude` and `gemini` as the primary.
- If the primary fails or times out, the fallback is used automatically. A ⚡ notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.
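
The primary/fallback behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not Cortex's actual code: `call_with_fallback` and the backend callables are hypothetical names, and only the timeout values come from the list above.

```python
import asyncio
from typing import Awaitable, Callable

# Timeouts from the list above, in seconds.
TIMEOUTS = {"claude": 60, "gemini": 120}

# Hypothetical backend signature: takes a prompt, returns the reply text.
Backend = Callable[[str], Awaitable[str]]


async def call_with_fallback(
    prompt: str,
    backends: dict,
    primary: str = "claude",
    timeouts: dict = TIMEOUTS,
) -> tuple:
    """Try the primary backend; on error or timeout, try the other one.

    Returns (reply, backend_used, fell_back). The third value is what a UI
    could use to decide whether to show a fallback notice.
    """
    fallback = "gemini" if primary == "claude" else "claude"
    try:
        reply = await asyncio.wait_for(backends[primary](prompt), timeouts[primary])
        return reply, primary, False
    except Exception:
        # Both asyncio.TimeoutError and backend failures land here.
        reply = await asyncio.wait_for(backends[fallback](prompt), timeouts[fallback])
        return reply, fallback, True
```

If the fallback also fails, its exception propagates to the caller; how Cortex reports that case is not described in this file.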
Nextcloud Talk Bot
Inara is registered as a bot in Nextcloud Talk.
- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
- Real-time updates stream to the web UI via SSE: you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
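
The acknowledge-first pattern described above (return `200 OK`, then do the slow work) can be sketched with plain `asyncio`. Everything here is hypothetical: `handle_webhook`, `process_message`, and the `message` payload field are illustrative names, and the real handler is a FastAPI route that is not shown in this file.

```python
import asyncio

background_tasks = set()  # keep references so running tasks are not garbage-collected
replies = []              # stand-in for posting the reply back to Talk


async def process_message(text: str) -> None:
    """Placeholder for the slow part: the LLM call plus the reply to Talk."""
    await asyncio.sleep(0.05)  # simulate LLM latency
    replies.append(f"reply to: {text}")


async def handle_webhook(payload: dict) -> int:
    """Schedule the work and return the HTTP status straight away."""
    task = asyncio.create_task(process_message(payload["message"]))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return 200  # returned before the LLM call has finished


async def demo() -> tuple:
    status = await handle_webhook({"message": "hello"})
    replies_at_ack = len(replies)  # still 0: nothing has been processed yet
    await asyncio.gather(*background_tasks)
    return status, replies_at_ack, list(replies)
```

Acknowledging early keeps the webhook caller from waiting on the LLM, at the cost of having to deliver the reply through a separate call.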
Google Chat Bot
Inara is available as a bot in Google Chat (One Sky IT Workspace).
- Send Inara a direct message in Google Chat to start a conversation.
- Each DM thread is its own session (`gc_spaces/*` prefix); history persists across messages.
- Responses are synchronous: Google Chat displays Inara's reply directly in the thread.
- To add Inara to a space: open the space, add a person/app, search for Inara.
- Sessions from Google Chat appear as `gc_*` prefixed IDs in the Sessions panel.
Technical note: Cortex uses Google's Workspace Add-on format (hostAppDataAction) — the modern API required for all Google Chat apps as of 2025.
Files (Identity Editor)
The Files button opens an editor for Inara's identity and memory files:
| File | Purpose |
|---|---|
| `SOUL.md` | Core personality, values, and voice |
| `IDENTITY.md` | Role, capabilities, and context |
| `USER.md` | Scott's profile, preferences, and history |
| `PROTOCOLS.md` | Behavioural rules and communication protocols |
| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
| `MEMORY_LONG.md` | Permanent curated long-term memory |
| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
| `TASKS.json` | Inara's personal task list (managed via Agent mode) |
| `HELP.md` | This file |
Toggle preview / edit to switch between rendered markdown and raw text. Ctrl+S saves, Esc closes.
Context & Memory (⚙ panel)
Context Tiers
Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| T1 | SOUL + IDENTITY + USER summary | ~1,500 |
| T2 | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| T3 | + last 2 raw session logs | ~15,000 |
| T4 | + last 7 raw session logs | ~50,000 |
Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
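
Because each tier adds to the one below it, the table can be read as a cumulative lookup. The sketch below only restates the table; `context_plan` is a hypothetical helper, and the authoritative definition lives in `CONTEXT_TIERS.md`, whose format is not shown here.

```python
# File sets added at each tier, taken from the table above.
TIER_FILES = {
    1: ["SOUL.md", "IDENTITY.md"],            # T1 also includes a summary of USER.md
    2: ["USER.md", "PROTOCOLS.md", "HELP.md"],  # T2 switches to the full USER.md
}

# Number of raw session logs appended at each tier.
TIER_SESSION_LOGS = {1: 0, 2: 0, 3: 2, 4: 7}


def context_plan(tier: int) -> dict:
    """Describe what a given tier loads, accumulating the lower tiers."""
    if tier not in TIER_SESSION_LOGS:
        raise ValueError(f"tier must be 1-4, got {tier!r}")
    files = []
    for level in sorted(TIER_FILES):
        if tier >= level:
            files.extend(TIER_FILES[level])
    return {
        "files": files,
        "user_summary_only": tier == 1,  # only T1 uses the USER summary
        "memory_layers": tier >= 2,      # Long / Mid / Short, subject to the toggles
        "session_logs": TIER_SESSION_LOGS[tier],
    }
```

For example, `context_plan(3)` lists the five identity files, enables the memory layers, and asks for the last 2 session logs.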
Memory Layers
Three independently toggleable memory files, loaded Long → Mid → Short (short sits closest to the conversation turn for better LLM recall):
| Layer | File | Contents |
|---|---|---|
| Long | `MEMORY_LONG.md` | Permanent facts: origin, key decisions, Scott's profile highlights |
| Mid | `MEMORY_MID.md` | Rolling digest of recent weeks, LLM-distilled from Short |
| Short | `MEMORY_SHORT.md` | Recent session rollup, auto-aggregated from session log files |
Toggle any layer off to save tokens for a focused conversation where history isn't needed.
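
The load order and the per-layer toggles can be sketched as below. This is a minimal illustration, assuming the three files sit together in one persona directory; `load_memory` is a hypothetical name, and the keyword arguments simply mirror the `include_long` / `include_mid` / `include_short` flags of the chat request.

```python
from pathlib import Path

# Load order from the text above: Long first, Short last (closest to the turn).
LAYER_FILES = [
    ("long", "MEMORY_LONG.md"),
    ("mid", "MEMORY_MID.md"),
    ("short", "MEMORY_SHORT.md"),
]


def load_memory(
    persona_dir: Path,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> str:
    """Concatenate the enabled memory layers in Long, Mid, Short order."""
    enabled = {"long": include_long, "mid": include_mid, "short": include_short}
    parts = []
    for layer, filename in LAYER_FILES:
        path = persona_dir / filename
        # Skip layers that are toggled off or whose file does not exist yet.
        if enabled[layer] and path.is_file():
            parts.append(path.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts)
```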
Memory Distillation (manual)
Distillation builds up the memory layers from raw session logs. Currently manual — trigger via the ⚙ panel:
| Button | What it does |
|---|---|
| short | Rolls recent session log files → MEMORY_SHORT.md (fast, no LLM) |
| mid | LLM summarizes MEMORY_SHORT.md → MEMORY_MID.md |
| long | LLM integrates MEMORY_MID.md → MEMORY_LONG.md |
| all | Runs short → mid → long in sequence |
Recommended workflow:
- Run short after any productive session to capture it.
- Run mid weekly to distil short → mid.
- Run long monthly to absorb mid into permanent memory.
Token budgets for each layer are set in .env (MEMORY_BUDGET_LONG, MEMORY_BUDGET_MID, MEMORY_BUDGET_SHORT).
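
The same steps can be triggered from a script through the `/distill/*` endpoints in the API Reference, for example from a cron job that follows the recommended workflow. A minimal sketch, assuming Cortex listens on `http://localhost:8000` and that these endpoints accept an empty POST body (neither is stated in this file); `distill_request` and `run_distill` are hypothetical helpers.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: host and port are not documented here
STEPS = ("short", "mid", "long", "all")


def distill_request(step: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for one distillation step."""
    if step not in STEPS:
        raise ValueError(f"step must be one of {STEPS}, got {step!r}")
    # Assumed: the endpoint needs no request body.
    return urllib.request.Request(f"{base_url}/distill/{step}", data=b"", method="POST")


def run_distill(step: str, base_url: str = BASE_URL, timeout: float = 300) -> str:
    """Trigger a step and return the raw response body."""
    # mid and long call an LLM, so a generous timeout is used here.
    with urllib.request.urlopen(distill_request(step, base_url), timeout=timeout) as resp:
        return resp.read().decode()
```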
Keyboard Shortcuts
| Keys | Action |
|---|---|
| `Ctrl+Enter` | Send message (default mode) |
| `Enter` | Send (when in Enter mode) |
| `Shift+Enter` | New line in message input |
| `Ctrl+Enter` | Save inline message edit |
| `Esc` | Cancel inline edit |
| `Ctrl+S` | Save file (Files modal) |
| `Esc` | Close any open modal |
API Reference
For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message; returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |
| `GET` | `/events` | SSE stream for real-time Talk activity |
| `POST` | `/note` | Inject a context note into a session |
| `GET` | `/files` | List identity files |
| `GET` | `/files/{name}` | Read a file |
| `PUT` | `/files/{name}` | Write a file |
| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
| `POST` | `/distill/all` | Run all three distillation steps |
| `GET` | `/distill/status` | Show scheduler status and next run times |
| `POST` | `/orchestrate` | Submit an agent task; returns `{"job_id": "..."}` |
| `GET` | `/orchestrate/{job_id}` | Poll job status and result |
| `GET` | `/orchestrate` | List all jobs from current session (in-memory) |
| `GET` | `/health` | Health check; returns `{"status": "ok"}` |
Chat request body (POST /chat):
{
"message": "string",
"session_id": "string | null",
"tier": 1,
"model": "claude | gemini | null",
"include_long": true,
"include_mid": true,
"include_short": true
}
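
A client-side sketch of building this body and reading the SSE reply is shown below. `chat_payload` and `sse_data` are hypothetical helpers. The parser follows standard SSE framing (`data:` lines, blank line ends an event); what Cortex puts inside each event is not documented here, so the payload is returned as a raw string. The default of `tier=2` follows the default stated earlier in this file.

```python
import json
from typing import Iterable, Iterator, Optional


def chat_payload(
    message: str,
    session_id: Optional[str] = None,
    tier: int = 2,
    model: Optional[str] = None,
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
) -> dict:
    """Build the POST /chat body using the field names listed above."""
    if tier not in (1, 2, 3, 4):
        raise ValueError(f"tier must be 1-4, got {tier!r}")
    if model not in (None, "claude", "gemini"):
        raise ValueError(f"model must be 'claude', 'gemini', or None, got {model!r}")
    return {
        "message": message,
        "session_id": session_id,
        "tier": tier,
        "model": model,
        "include_long": include_long,
        "include_mid": include_mid,
        "include_short": include_short,
    }


def sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event in a line stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is part of the framing.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line terminates the event.
            yield "\n".join(buffer)
            buffer = []
        # Comment lines (": ...") and other fields are ignored in this sketch.
```

`json.dumps(chat_payload("hello"))` gives a body that can be sent with any HTTP client; the decoded response lines can then be passed to `sse_data`.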
In Progress / Planned
- Ollama local model backend — direct Ollama API support (no CLI wrapper); target host: scott_gaming via WireGuard
- Nextcloud Talk stabilization — test end-to-end after restarts; complete bot registration docs
- Multi-user support — per-user identity/memory files; currently single-user (Scott); Holly instance planned
Recently Completed
- ✓ Google Chat bot: Workspace Add-on integration; DM and spaces; JWT verification; session persistence
- ✓ Agent mode: Gemini tool loop + Claude responder, accessible via UI toggle
- ✓ Personal task management: `task_list`, `task_create`, `task_update`, `task_complete` tools backed by `TASKS.json`
- ✓ Web search fixed: DDG package updated (`ddgs`); `WebSearch`/`WebFetch` allowed for Claude CLI fallback
- ✓ Session persistence for orchestrator: agent mode turns now survive page refresh
- ✓ Systemd user service: Cortex runs as a user service; no sudo required (`systemctl --user restart cortex`)
- ✓ OAuth token warning banner: amber banner when Claude CLI token is within 24h of expiry
Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent. Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.