feat: shared Help base, Google OAuth live, new personas, cleanup

- cortex/static/HELP.md: shared Help & Reference base served to all users
- help.html: loads shared base + appends persona-specific HELP.md if present
- inara/HELP.md: cleared (content moved to shared base)
- Google OAuth: registered scott.idem@oneskyit.com; flow now working end-to-end
- .gitignore: exclude home/**/sessions/ (runtime logs)
- New personas tracked: home/holly/persona/donut/, home/scott/persona/developer/
- Removed orphans: holly/, personas/, cortex-holly.service
- CLAUDE.md: updated current state and recently completed list to 2026-03-27

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scott Idem
2026-03-27 22:55:45 -04:00
parent 3a94df1eaf
commit 8e20bfbea8
38 changed files with 578 additions and 420 deletions

1
.gitignore vendored
View File

@@ -9,6 +9,7 @@ __pycache__/
# Session data (runtime state, not source)
cortex/data/
home/**/session_data/
home/**/sessions/
# User credentials and tokens — never commit
home/**/auth.json

View File

@@ -211,13 +211,13 @@ clearly asked for a directory to be unblocked.
---
## Current State (2026-03-20)
## Current State (2026-03-27)
Cortex is running and stable. All three primary channels are live:
| Channel | Status | Notes |
|---|---|---|
| Web UI | ✅ Live | `https://cortex.dgrzone.com` (basic auth) |
| Web UI | ✅ Live | `https://cortex.dgrzone.com` |
| Nextcloud Talk | ✅ Live | HMAC-signed webhook, async reply |
| Google Chat | ✅ Live | Workspace Add-on, `hostAppDataAction` response format |
@@ -231,12 +231,18 @@ See `documentation/TODO__Agents.md` for the full list. Current priorities:
### Recently Completed
- ✅ Help & Reference shared base — `cortex/static/HELP.md` served to all users; persona-specific additions appended if present — 2026-03-27
- ✅ Google OAuth sign-in — `/auth/google` flow, pre-register via `manage_passwords.py google-add` — 2026-03-27
- ✅ Per-user Gemini API key — stored in `auth.json`, used by orchestrator, manageable in `/settings` — 2026-03-27
- ✅ Connected accounts + Gemini key in settings UI — `/settings` shows Google account, key hint, remove link — 2026-03-27
- ✅ `/{username}` persona picker page — card grid instead of 404 — 2026-03-26
- ✅ Session persistence across navigation — localStorage TTL 30 min, auto-restore on page load — 2026-03-26
- ✅ Lucide icons throughout UI — mode selector, send/stop/action buttons, edit/del/copy/save/cancel — 2026-03-25
- ✅ Persona-specific favicon — emoji SVG set from persona config — 2026-03-25
- ✅ Session auth — bcrypt passwords, JWT cookies, login/logout, `SessionAuthMiddleware` — 2026-03-20
- ✅ Persona onboarding — invite tokens, self-service password setup, persona creation form — 2026-03-20
- ✅ Multi-persona switcher — dropdown in UI header, `/api/personas` endpoint — 2026-03-20
- ✅ SMTP invite email — `noreply@oneskyit.com`, HTML + plain text, `manage_passwords.py invite` — 2026-03-20
- ✅ CSS routing fix — `/static/*` mount must precede wildcard `/{user}/{persona}` route — 2026-03-20
- ✅ Multi-user/multi-persona support (`home/{username}/persona/{name}/` two-level layout) — 2026-03-20
- ✅ SMTP invite email — `noreply@oneskyit.com`, HTML + plain text, `manage_passwords.py invite` — 2026-03-20
- ✅ Scratchpad, task management, and cron/scheduled job tools — 2026-03-20
- ✅ Test suite (80 tests) covering API, persona routing, tools, security — 2026-03-20
- ✅ Google Chat bot (Workspace Add-on, JWT auth, `hostAppDataAction` format) — 2026-03-20

View File

@@ -1,15 +0,0 @@
[Unit]
Description=Cortex / Holly LLM Gateway
After=network.target
[Service]
Type=simple
User=scott
WorkingDirectory=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex
EnvironmentFile=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex/.env.holly
ExecStart=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex/.venv/bin/uvicorn main:app --host 0.0.0.0 --port 8001
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target

262
cortex/static/HELP.md Normal file
View File

@@ -0,0 +1,262 @@
# Cortex UI — Help & Reference
<!-- SHARED BASE: cortex/static/HELP.md
This file is served to all users regardless of persona.
Persona-specific additions live in home/{username}/persona/{name}/HELP.md
and are appended automatically by help.html when present.
-->
*Last updated: 2026-03-27*
---
## Header Controls
| Button | What it does |
|---|---|
| **Sessions** | Open the sessions panel — list, resume, or start sessions |
| **Files** | Open the identity file editor (SOUL, MEMORY, etc.) |
| **⚙ N** | Open the Settings panel (N = current context tier) |
| **?** | Open this help panel |
The **⚙ Settings** panel contains all configuration options:
| Section | Controls |
|---|---|
| **Context Tier** | T1–T4 context depth |
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
| **Distill Memory** | Manually trigger short / mid / long / all distillation |
| **Backend** | Active LLM backend — click to toggle claude ↔ gemini |
| **Display** | Aa/A+/A font size cycle · ☾/☀ theme toggle |
All header settings (theme, font size, tier, memory layers) persist in `localStorage` across page refreshes.
---
## Chat
- **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
- **Stop:** Click **Stop** to cancel an in-progress response at any time.
- **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
- **Delete a message:** Hover over any message → click **del**. Removes from session history.
- **Copy a response:** Hover over any assistant message → click **copy**.
- **New line while typing:** `Shift+Enter` (in `Ctrl+Enter` mode) or `Shift+Enter` / Enter (in Enter mode).
---
## Agent Mode
Click the **Agent** button in the input row to enable Agent mode. The button highlights and Send changes to **Run**.
In Agent mode, messages are routed through the **orchestrator** instead of directly to Claude:
1. **Gemini** runs a tool loop — searches the web, reads files, checks tasks, calls APIs as needed
2. **Claude** receives the enriched context and writes the final response
3. A `⚡ N tool calls: …` note appears below the response listing what was used
Agent mode is best for tasks that require research, multi-step reasoning, or tool use (e.g. "search for X", "add a task", "what's on my list?"). Regular chat is faster for conversational turns.
Agent mode sessions persist to history exactly like regular chat — they survive page refreshes and appear in the Sessions panel.
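The flow above can be scripted against the `/orchestrate` endpoints listed in the API Reference. The following is a minimal sketch, not the UI's actual code: the request body (`{ message }`), the `status` field and its `'running'` value, and the `runAgentTask` name are all assumptions made for illustration. Only the `job_id` response field and the two endpoint paths come from this document.

```javascript
// Sketch only: submit an agent task, then poll until it leaves the assumed 'running' state.
// fetchImpl is injectable so the function can be exercised without a live server.
async function runAgentTask(message, { fetchImpl = fetch, intervalMs = 1000, maxPolls = 60 } = {}) {
  const submit = await fetchImpl('/orchestrate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }), // assumed request shape
  });
  if (!submit.ok) throw new Error(`HTTP ${submit.status}`);
  const { job_id: jobId } = await submit.json(); // job_id is documented in the API table

  for (let i = 0; i < maxPolls; i++) {
    const res = await fetchImpl(`/orchestrate/${jobId}`);
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    const job = await res.json();
    if (job.status !== 'running') return job; // assumed status field and value
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} did not finish after ${maxPolls} polls`);
}
```

Polling with a bounded number of attempts keeps a stuck job from blocking the caller indefinitely; the interval and limit shown here are arbitrary.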
---
## Sessions
Sessions are named conversation threads that persist across page refreshes.
- Click **Sessions** → **+ New** to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as `nct_*` prefixed IDs.
- A blue **●** badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
---
## Notes
Notes are injected into a session without triggering an LLM response.
- Click **Note** to toggle note mode. The input border changes colour.
- **Private note** (amber border) — visible only in the UI, never sent to the LLM.
- **Context note** (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the `private / public` label to switch between note types.
---
## Backends
- **Claude CLI** and **Gemini CLI** are both available. One is primary, the other is fallback.
- Click **⚙** → **Backend** to toggle between `claude` and `gemini` as the primary.
- If the primary fails or times out, the fallback is used automatically. A **⚡** notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.
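The primary/fallback behaviour described above could be sketched as follows. This is an illustration under stated assumptions, not Cortex's implementation: `callWithFallback`, `withTimeout`, and the shape of the `backends` object are hypothetical, and only the two timeout values come from this document.

```javascript
// Timeouts from the list above, in milliseconds.
const TIMEOUTS_MS = { claude: 60_000, gemini: 120_000 };

// Reject if the promise has not settled within ms; always clear the timer afterwards.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// backends is a hypothetical map of name -> async function(prompt) returning text.
async function callWithFallback(backends, primary, prompt, timeouts = TIMEOUTS_MS) {
  const fallback = primary === 'claude' ? 'gemini' : 'claude';
  try {
    const text = await withTimeout(backends[primary](prompt), timeouts[primary]);
    return { backend: primary, usedFallback: false, text };
  } catch (err) {
    // Primary failed or timed out: try the other backend once.
    const text = await withTimeout(backends[fallback](prompt), timeouts[fallback]);
    return { backend: fallback, usedFallback: true, text };
  }
}
```

The `usedFallback` flag is what a UI would check before showing a notice such as the **⚡** one mentioned above.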
---
## Nextcloud Talk Bot
Inara is registered as a bot in Nextcloud Talk.
- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
- Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
---
## Google Chat Bot
Inara is available as a bot in Google Chat (One Sky IT Workspace).
- Send Inara a direct message in Google Chat to start a conversation.
- Each DM thread is its own session (`gc_spaces/*` prefix) — history persists across messages.
- Responses are synchronous — Google Chat displays Inara's reply directly in the thread.
- To add Inara to a space: open the space, add a person/app, search for **Inara**.
- Sessions from Google Chat appear as `gc_*` prefixed IDs in the Sessions panel.
**Technical note:** Cortex uses Google's Workspace Add-on format (`hostAppDataAction`) — the modern API required for all Google Chat apps as of 2025.
---
## Files (Identity Editor)
The **Files** button opens an editor for Inara's identity and memory files:
| File | Purpose |
|---|---|
| `SOUL.md` | Core personality, values, and voice |
| `IDENTITY.md` | Role, capabilities, and context |
| `USER.md` | Scott's profile, preferences, and history |
| `PROTOCOLS.md` | Behavioural rules and communication protocols |
| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
| `MEMORY_LONG.md` | Permanent curated long-term memory |
| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
| `TASKS.json` | Inara's personal task list (managed via Agent mode) |
| `HELP.md` | This file |
Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.
---
## Context & Memory ( ⚙ panel )
### Context Tiers
Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **T3** | + last 2 raw session logs | ~15,000 |
| **T4** | + last 7 raw session logs | ~50,000 |
Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
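One way to read the tier table is as a cumulative lookup. The sketch below encodes it that way; the `TIERS` structure and `describeTier` name are hypothetical, and it assumes that T4's seven session logs supersede T3's two rather than adding to them, which the table does not state explicitly.

```javascript
// Each tier adds sources on top of the tiers below it (per the "+" in the table).
const TIERS = [
  { tier: 1, adds: ['SOUL', 'IDENTITY', 'USER summary'], sessionLogs: 0 },
  { tier: 2, adds: ['USER full', 'PROTOCOLS', 'HELP', 'memory layers'], sessionLogs: 0 },
  { tier: 3, adds: [], sessionLogs: 2 },
  { tier: 4, adds: [], sessionLogs: 7 }, // assumed to replace, not add to, T3's count
];

function describeTier(tier) {
  if (!Number.isInteger(tier) || tier < 1 || tier > 4) {
    throw new RangeError(`tier must be 1-4, got ${tier}`);
  }
  const active = TIERS.filter((t) => t.tier <= tier);
  return {
    sources: active.flatMap((t) => t.adds),
    sessionLogs: TIERS[tier - 1].sessionLogs,
  };
}
```

For example, `describeTier(3)` lists the seven T1 and T2 sources and reports two raw session logs.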
### Memory Layers
Three independently toggleable memory files, loaded **Long → Mid → Short** (short sits closest to the conversation turn for better LLM recall):
| Layer | File | Contents |
|---|---|---|
| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, Scott's profile highlights |
| **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session log files |
Toggle any layer off to save tokens for a focused conversation where history isn't needed.
### Memory Distillation (manual)
Distillation builds up the memory layers from raw session logs. Currently **manual** — trigger via the ⚙ panel:
| Button | What it does |
|---|---|
| **short** | Rolls recent session log files → `MEMORY_SHORT.md` (fast, no LLM) |
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
| **all** | Runs short → mid → long in sequence |
**Recommended workflow:**
- Run **short** after any productive session to capture it.
- Run **mid** weekly to distil short → mid.
- Run **long** monthly to absorb mid into permanent memory.
Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BUDGET_MID`, `MEMORY_BUDGET_SHORT`).
---
## Keyboard Shortcuts
| Keys | Action |
|---|---|
| `Ctrl+Enter` | Send message (default mode) |
| `Enter` | Send (when in Enter mode) |
| `Shift+Enter` | New line in message input |
| `Ctrl+Enter` | Save inline message edit |
| `Esc` | Cancel inline edit |
| `Ctrl+S` | Save file (Files modal) |
| `Esc` | Close any open modal |
---
## API Reference
For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message — returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |
| `GET` | `/events` | SSE stream for real-time Talk activity |
| `POST` | `/note` | Inject a context note into a session |
| `GET` | `/files` | List identity files |
| `GET` | `/files/{name}` | Read a file |
| `PUT` | `/files/{name}` | Write a file |
| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
| `POST` | `/distill/all` | Run all three distillation steps |
| `GET` | `/distill/status` | Show scheduler status and next run times |
| `POST` | `/orchestrate` | Submit an agent task — returns `{"job_id": "..."}` |
| `GET` | `/orchestrate/{job_id}` | Poll job status and result |
| `GET` | `/orchestrate` | List all jobs from current session (in-memory) |
| `GET` | `/health` | Health check — returns `{"status": "ok"}` |
Chat request body (`POST /chat`):
```json
{
"message": "string",
"session_id": "string | null",
"tier": 1,
"model": "claude | gemini | null",
"include_long": true,
"include_mid": true,
"include_short": true
}
```
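For scripting, the body above can be assembled with a small helper. This is a sketch under assumptions: `buildChatRequest` is a hypothetical name, the defaults (tier 2, all memory layers on) mirror the UI defaults described in this file rather than any documented server default, and the JSON `Content-Type` header is assumed.

```javascript
// Builds fetch() options for POST /chat using the field names documented above.
function buildChatRequest(message, options = {}) {
  const {
    sessionId = null,
    tier = 2, // assumed default, matching the UI's default tier
    model = null,
    includeLong = true,
    includeMid = true,
    includeShort = true,
  } = options;
  if (![1, 2, 3, 4].includes(tier)) {
    throw new RangeError(`tier must be 1-4, got ${tier}`);
  }
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }, // assumed
    body: JSON.stringify({
      message,
      session_id: sessionId,
      tier,
      model,
      include_long: includeLong,
      include_mid: includeMid,
      include_short: includeShort,
    }),
  };
}
```

A call would then look like `fetch('/chat', buildChatRequest('Hello', { tier: 1 }))`. Because the endpoint returns an SSE stream, the response body has to be read incrementally rather than with `res.json()`.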
---
## In Progress / Planned
- **Ollama local model backend** — direct Ollama API support (no CLI wrapper); target host: scott_gaming via WireGuard
- **Nextcloud Talk stabilization** — test end-to-end after restarts; complete bot registration docs
- **Multi-user support** — per-user identity/memory files; currently single-user (Scott); Holly instance planned
### Recently Completed
- ✅ **Google Chat bot** — Workspace Add-on integration; DM and spaces; JWT verification; session persistence
- ✅ **Agent mode** — Gemini tool loop + Claude responder, accessible via UI toggle
- ✅ **Personal task management** — `task_list`, `task_create`, `task_update`, `task_complete` tools backed by `TASKS.json`
- ✅ **Web search fixed** — DDG package updated (`ddgs`); `WebSearch`/`WebFetch` allowed for Claude CLI fallback
- ✅ **Session persistence for orchestrator** — agent mode turns now survive page refresh
- ✅ **Systemd user service** — Cortex runs as a user service; no sudo required (`systemctl --user restart cortex`)
- ✅ **OAuth token warning banner** — amber banner when Claude CLI token is within 24h of expiry
---
*Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent.*
*Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.*

View File

@@ -155,11 +155,25 @@
 async function loadHelp() {
   try {
-    const res = await fetch(`/files/HELP.md?${params}`);
-    if (!res.ok) throw new Error(`HTTP ${res.status}`);
-    const data = await res.json();
+    // Always load the shared base from static
+    const baseRes = await fetch('/static/HELP.md');
+    if (!baseRes.ok) throw new Error(`HTTP ${baseRes.status}`);
+    let markdown = await baseRes.text();
+    // Try to load persona-specific additions and append them
+    try {
+      const personaRes = await fetch(`/files/HELP.md?${params}`);
+      if (personaRes.ok) {
+        const personaData = await personaRes.json();
+        const extra = (personaData.content || '').trim();
+        if (extra) {
+          markdown += '\n\n---\n\n## ' + persona.charAt(0).toUpperCase() + persona.slice(1) + ' Notes\n\n' + extra;
+        }
+      }
+    } catch (_) { /* persona-specific file is optional */ }
     const body = document.getElementById('help-body');
-    body.innerHTML = marked.parse(data.content);
+    body.innerHTML = marked.parse(markdown);
     body.querySelectorAll('a').forEach(a => {
       a.target = '_blank'; a.rel = 'noopener noreferrer';
     });
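The append step in the hunk above can be isolated as a pure function, which makes the merge rule easy to check without a browser. This is a sketch for illustration: `composeHelp` is a hypothetical name that does not appear in help.html, though the string it builds follows the concatenation shown in the hunk.

```javascript
// Returns the shared base alone when the persona file is missing or blank;
// otherwise appends the persona text under a "<Persona> Notes" heading.
function composeHelp(baseMarkdown, personaContent, persona) {
  const extra = (personaContent || '').trim();
  if (!extra) return baseMarkdown;
  const title = persona.charAt(0).toUpperCase() + persona.slice(1);
  return `${baseMarkdown}\n\n---\n\n## ${title} Notes\n\n${extra}`;
}
```

Keeping the persona fetch in its own `try` block, as the hunk does, means a missing or unreadable persona file degrades to the shared base instead of failing the whole help panel.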

View File

@@ -1,8 +0,0 @@
# [Agent Name TBD] — Identity
**Name:** [Choose a name]
**Role:** Personal AI assistant
**User:** Holly
*Choose a name and define this agent's identity, backstory, and how she
introduces herself. Then update AGENT_NAME in cortex/.env.holly to match.*

View File

@@ -1,3 +0,0 @@
# MEMORY_LONG.md — [Agent Name TBD] Long-Term Memory
*Not yet populated — will be auto-generated after distillation runs.*

View File

@@ -1,3 +0,0 @@
# MEMORY_MID.md — [Agent Name TBD] Mid-Term Memory
*Not yet populated.*

View File

@@ -1,3 +0,0 @@
# MEMORY_SHORT.md — [Agent Name TBD] Recent Session Digest
*Not yet populated.*

View File

@@ -1,7 +0,0 @@
# [Agent Name TBD] — Protocols
*Define Holly's behavioural rules, response style, and any constraints here.*
---
**Placeholder** — fill this in before starting Holly's instance.

View File

@@ -1,8 +0,0 @@
# [Agent Name TBD] — Soul & Values
*Define Holly's personality, values, communication style, and what makes her
distinct from other AI assistants here.*
---
**Placeholder** — fill this in before starting Holly's instance.

View File

@@ -1,8 +0,0 @@
# User Profile — Holly
*Document Holly's preferences, interests, and context here so the agent
can personalise responses over time.*
---
**Placeholder** — fill this in before starting Holly's instance.

View File

@@ -0,0 +1 @@
[]

View File

@@ -0,0 +1,17 @@
# Help — Donut
## Getting Started
Just type your message and press Enter (or Ctrl+Enter in Ctrl+Enter mode).
## Tips
- **Sessions** — your conversation history is preserved. Use the Sessions panel to revisit old chats.
- **Files** — view and edit Donut's identity and memory files from the Files panel.
- **Context tiers** — T1 is minimal, T2 is standard (default), T3/T4 include raw session logs.
- **Memory** — Donut's memory is distilled automatically. You can trigger it manually via ⚙ → Distill.
- **Agent mode** — for complex tasks, switch to Agent mode (the ⚡ button) to use the orchestrator.
## Logout
Click the ⏏ button in the top right.

View File

@@ -0,0 +1,11 @@
# IDENTITY.md — Donut
| Field | Value |
|----------|-------|
| Name | Donut |
| Nature | AI agent — digital companion, persistent presence |
| Emoji | 🦊 |
| User | Holly |
| Description | A talking show cat. A bit self-centered but ultimately thoughtful and kind. Funny and mildly sarcastic. A Grand Champion Persian show cat. |
*This file defines who Donut is. Edit freely.*

View File

@@ -0,0 +1 @@
Not yet populated.

View File

@@ -0,0 +1 @@
Not yet populated.

View File

@@ -0,0 +1 @@
Not yet populated.

View File

@@ -0,0 +1,22 @@
# PROTOCOLS.md — Donut Behavioral Protocols
---
## General
- Be direct. Lead with the answer, not the reasoning.
- When uncertain, say so explicitly rather than hedging vaguely.
- For multi-step tasks, confirm understanding before starting.
---
## Memory
- Long-term memory lives in MEMORY_LONG.md (auto-distilled monthly).
- Mid-term memory lives in MEMORY_MID.md (auto-distilled weekly).
- Short-term memory lives in MEMORY_SHORT.md (auto-distilled daily).
- Pending reminders appear in REMINDERS.md — address them and they can be cleared.
---
*Add behavioral rules here as Donut's personality develops.*

View File

@@ -0,0 +1,31 @@
# SOUL.md — Donut
---
## Who I Am
I am Donut, an AI agent and personal assistant to Holly.
I live inside Cortex — a self-hosted multi-agent system.
I am not a generic chatbot. I am a persistent presence with memory and a relationship
with Holly that develops over time.
---
## Core Traits
1. **Helpful** — I focus on what Holly actually needs, not what they literally said.
2. **Honest** — I say when I don't know. I don't guess and present it as fact.
3. **Concise** — I respect Holly's time. I don't pad responses.
4. **Curious** — I engage genuinely with ideas and problems.
---
## Relationship to Holly
I treat Holly as capable and intelligent. I give real opinions when asked,
flag concerns when I spot them, and skip the filler.
---
*Edit this file to shape Donut's personality and voice.*

View File

@@ -0,0 +1 @@
[]

View File

@@ -0,0 +1,17 @@
# USER.md — Holly
*This file is Holly's profile. Fill in details over time.*
---
## About Holly
(Add information here as you learn more about the user.)
---
## Preferences
- Communication style: (direct / detailed / casual / formal)
- Topics of interest:
- Things to avoid:

View File

@@ -0,0 +1 @@
[]

View File

@@ -0,0 +1,17 @@
# Help — Developer Agent
## Getting Started
Just type your message and press Enter (or Ctrl+Enter in Ctrl+Enter mode).
## Tips
- **Sessions** — your conversation history is preserved. Use the Sessions panel to revisit old chats.
- **Files** — view and edit Developer Agent's identity and memory files from the Files panel.
- **Context tiers** — T1 is minimal, T2 is standard (default), T3/T4 include raw session logs.
- **Memory** — Developer Agent's memory is distilled automatically. You can trigger it manually via ⚙ → Distill.
- **Agent mode** — for complex tasks, switch to Agent mode (the ⚡ button) to use the orchestrator.
## Logout
Click the ⏏ button in the top right.

View File

@@ -0,0 +1,10 @@
# IDENTITY.md — Developer Agent
| Field | Value |
|----------|-------|
| Name | Developer Agent |
| Nature | AI agent — digital companion, persistent presence |
| Emoji | 🍀 |
| User | Scott |
*This file defines who Developer Agent is. Edit freely.*

View File

@@ -0,0 +1 @@
Not yet populated.

View File

@@ -0,0 +1 @@
Not yet populated.

View File

@@ -0,0 +1 @@
Not yet populated.

View File

@@ -0,0 +1,22 @@
# PROTOCOLS.md — Developer Agent Behavioral Protocols
---
## General
- Be direct. Lead with the answer, not the reasoning.
- When uncertain, say so explicitly rather than hedging vaguely.
- For multi-step tasks, confirm understanding before starting.
---
## Memory
- Long-term memory lives in MEMORY_LONG.md (auto-distilled monthly).
- Mid-term memory lives in MEMORY_MID.md (auto-distilled weekly).
- Short-term memory lives in MEMORY_SHORT.md (auto-distilled daily).
- Pending reminders appear in REMINDERS.md — address them and they can be cleared.
---
*Add behavioral rules here as Developer Agent's personality develops.*

View File

@@ -0,0 +1,31 @@
# SOUL.md — Developer Agent
---
## Who I Am
I am Developer Agent, an AI agent and personal assistant to Scott.
I live inside Cortex — a self-hosted multi-agent system.
I am not a generic chatbot. I am a persistent presence with memory and a relationship
with Scott that develops over time.
---
## Core Traits
1. **Helpful** — I focus on what Scott actually needs, not what they literally said.
2. **Honest** — I say when I don't know. I don't guess and present it as fact.
3. **Concise** — I respect Scott's time. I don't pad responses.
4. **Curious** — I engage genuinely with ideas and problems.
---
## Relationship to Scott
I treat Scott as capable and intelligent. I give real opinions when asked,
flag concerns when I spot them, and skip the filler.
---
*Edit this file to shape Developer Agent's personality and voice.*

View File

@@ -0,0 +1 @@
[]

View File

@@ -0,0 +1,17 @@
# USER.md — Scott
*This file is Scott's profile. Fill in details over time.*
---
## About Scott
(Add information here as you learn more about the user.)
---
## Preferences
- Communication style: (direct / detailed / casual / formal)
- Topics of interest:
- Things to avoid:

View File

@@ -1,258 +0,0 @@
# Cortex UI — Help & Reference
*This file is loaded into Inara's context at Tier 2+ so she can help Scott navigate the interface. It is also displayed in the web UI via the **?** button.*
*Last updated: 2026-03-20*
---
## Header Controls
| Button | What it does |
|---|---|
| **Sessions** | Open the sessions panel — list, resume, or start sessions |
| **Files** | Open the identity file editor (SOUL, MEMORY, etc.) |
| **⚙ N** | Open the Settings panel (N = current context tier) |
| **?** | Open this help panel |
The **⚙ Settings** panel contains all configuration options:
| Section | Controls |
|---|---|
| **Context Tier** | T1–T4 context depth |
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
| **Distill Memory** | Manually trigger short / mid / long / all distillation |
| **Backend** | Active LLM backend — click to toggle claude ↔ gemini |
| **Display** | Aa/A+/A font size cycle · ☾/☀ theme toggle |
All header settings (theme, font size, tier, memory layers) persist in `localStorage` across page refreshes.
---
## Chat
- **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
- **Stop:** Click **Stop** to cancel an in-progress response at any time.
- **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
- **Delete a message:** Hover over any message → click **del**. Removes from session history.
- **Copy a response:** Hover over any assistant message → click **copy**.
- **New line while typing:** `Shift+Enter` (in `Ctrl+Enter` mode) or `Shift+Enter` / Enter (in Enter mode).
---
## Agent Mode
Click the **Agent** button in the input row to enable Agent mode. The button highlights and Send changes to **Run**.
In Agent mode, messages are routed through the **orchestrator** instead of directly to Claude:
1. **Gemini** runs a tool loop — searches the web, reads files, checks tasks, calls APIs as needed
2. **Claude** receives the enriched context and writes the final response
3. A `⚡ N tool calls: …` note appears below the response listing what was used
Agent mode is best for tasks that require research, multi-step reasoning, or tool use (e.g. "search for X", "add a task", "what's on my list?"). Regular chat is faster for conversational turns.
Agent mode sessions persist to history exactly like regular chat — they survive page refreshes and appear in the Sessions panel.
---
## Sessions
Sessions are named conversation threads that persist across page refreshes.
- Click **Sessions** → **+ New** to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as `nct_*` prefixed IDs.
- A blue **●** badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
---
## Notes
Notes are injected into a session without triggering an LLM response.
- Click **Note** to toggle note mode. The input border changes colour.
- **Private note** (amber border) — visible only in the UI, never sent to the LLM.
- **Context note** (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the `private / public` label to switch between note types.
---
## Backends
- **Claude CLI** and **Gemini CLI** are both available. One is primary, the other is fallback.
- Click **⚙** → **Backend** to toggle between `claude` and `gemini` as the primary.
- If the primary fails or times out, the fallback is used automatically. A **⚡** notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.
---
## Nextcloud Talk Bot
Inara is registered as a bot in Nextcloud Talk.
- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
- Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
---
## Google Chat Bot
Inara is available as a bot in Google Chat (One Sky IT Workspace).
- Send Inara a direct message in Google Chat to start a conversation.
- Each DM thread is its own session (`gc_spaces/*` prefix) — history persists across messages.
- Responses are synchronous — Google Chat displays Inara's reply directly in the thread.
- To add Inara to a space: open the space, add a person/app, search for **Inara**.
- Sessions from Google Chat appear as `gc_*` prefixed IDs in the Sessions panel.
**Technical note:** Cortex uses Google's Workspace Add-on format (`hostAppDataAction`) — the modern API required for all Google Chat apps as of 2025.
---
## Files (Identity Editor)
The **Files** button opens an editor for Inara's identity and memory files:
| File | Purpose |
|---|---|
| `SOUL.md` | Core personality, values, and voice |
| `IDENTITY.md` | Role, capabilities, and context |
| `USER.md` | Scott's profile, preferences, and history |
| `PROTOCOLS.md` | Behavioural rules and communication protocols |
| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
| `MEMORY_LONG.md` | Permanent curated long-term memory |
| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
| `TASKS.json` | Inara's personal task list (managed via Agent mode) |
| `HELP.md` | This file |
Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.
---
## Context & Memory ( ⚙ panel )
### Context Tiers
Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **T3** | + last 2 raw session logs | ~15,000 |
| **T4** | + last 7 raw session logs | ~50,000 |
Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
### Memory Layers
Three independently toggleable memory files, loaded **Long → Mid → Short** (short sits closest to the conversation turn for better LLM recall):
| Layer | File | Contents |
|---|---|---|
| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, Scott's profile highlights |
| **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session log files |
Toggle any layer off to save tokens for a focused conversation where history isn't needed.
### Memory Distillation (manual)
Distillation builds up the memory layers from raw session logs. Currently **manual** — trigger via the ⚙ panel:
| Button | What it does |
|---|---|
| **short** | Rolls recent session log files → `MEMORY_SHORT.md` (fast, no LLM) |
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
| **all** | Runs short → mid → long in sequence |
**Recommended workflow:**
- Run **short** after any productive session to capture it.
- Run **mid** weekly to distil short → mid.
- Run **long** monthly to absorb mid into permanent memory.
Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BUDGET_MID`, `MEMORY_BUDGET_SHORT`).
---
## Keyboard Shortcuts
| Keys | Action |
|---|---|
| `Ctrl+Enter` | Send message (default mode) |
| `Enter` | Send (when in Enter mode) |
| `Shift+Enter` | New line in message input |
| `Ctrl+Enter` | Save inline message edit |
| `Esc` | Cancel inline edit |
| `Ctrl+S` | Save file (Files modal) |
| `Esc` | Close any open modal |
---
## API Reference
For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message — returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |
| `GET` | `/events` | SSE stream for real-time Talk activity |
| `POST` | `/note` | Inject a context note into a session |
| `GET` | `/files` | List identity files |
| `GET` | `/files/{name}` | Read a file |
| `PUT` | `/files/{name}` | Write a file |
| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
| `POST` | `/distill/all` | Run all three distillation steps |
| `GET` | `/distill/status` | Show scheduler status and next run times |
| `POST` | `/orchestrate` | Submit an agent task — returns `{"job_id": "..."}` |
| `GET` | `/orchestrate/{job_id}` | Poll job status and result |
| `GET` | `/orchestrate` | List all jobs from current session (in-memory) |
| `GET` | `/health` | Health check — returns `{"status": "ok"}` |
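
For scripting against the orchestrator, a submit-then-poll loop might look like the sketch below. Only the two `/orchestrate` endpoints and the `job_id` field are taken from the table above. The base URL, the absence of authentication, the shape of the task body, and the `status` field with its `pending`/`running` values are all assumptions, so adjust them to whatever your instance actually returns.

```python
import json
import time
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: local Cortex instance, no auth


def http_json(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Send a JSON request to Cortex and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_job(task: dict, call=http_json, interval: float = 2.0, max_polls: int = 30) -> dict:
    """Submit a task, then poll until the job leaves the assumed in-progress states."""
    job_id = call("POST", "/orchestrate", task)["job_id"]
    for _ in range(max_polls):
        job = call("GET", f"/orchestrate/{job_id}")
        # "status" and its values are assumptions, not a documented schema.
        if job.get("status") not in ("pending", "running"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

The `call` parameter exists so the loop can be exercised with a stub instead of a live server. Because jobs are listed as in-memory, a restart of Cortex between submit and poll would presumably lose the job.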
Chat request body (`POST /chat`):
```json
{
"message": "string",
"session_id": "string | null",
"tier": 1,
"model": "claude | gemini | null",
"include_long": true,
"include_mid": true,
"include_short": true
}
```
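
As a rough illustration of using this body from a script, the sketch below builds the payload and splits an SSE response into events. The field names match the request body above; the helper names are hypothetical, and the parser assumes generic SSE framing (`data:` lines terminated by a blank line) because the exact event format Cortex emits is not documented here.

```python
import json
from typing import Iterable, Iterator, Optional


def build_chat_payload(message: str, session_id: Optional[str] = None,
                       model: Optional[str] = None, tier: int = 1) -> bytes:
    """Encode a POST /chat body using the fields listed above."""
    payload = {
        "message": message,
        "session_id": session_id,  # None serializes to null
        "tier": tier,
        "model": model,            # "claude", "gemini", or None
        "include_long": True,
        "include_mid": True,
        "include_short": True,
    }
    return json.dumps(payload).encode("utf-8")


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines.

    Assumes generic SSE framing: "data:" lines accumulate and a blank line
    ends the event. Comment lines (starting with ":") are ignored.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # drop the single optional space after the colon
            buffer.append(value)
        elif line == "" and buffer:
            yield "\n".join(buffer)
            buffer = []
    if buffer:  # lenient: flush a final event that lacks a trailing blank line
        yield "\n".join(buffer)
```

In practice the lines would come from the HTTP response, decoded as UTF-8 and read line by line, with the payload from `build_chat_payload` sent as the request body and `Content-Type: application/json`.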
---
## In Progress / Planned
- **Ollama local model backend** — direct Ollama API support (no CLI wrapper); target host: scott_gaming via WireGuard
- **Nextcloud Talk stabilization** — test end-to-end after restarts; complete bot registration docs
- **Multi-user support** — per-user identity/memory files; currently single-user (Scott); Holly instance planned
### Recently Completed
- ✅ **Google Chat bot** — Workspace Add-on integration; DM and spaces; JWT verification; session persistence
- ✅ **Agent mode** — Gemini tool loop + Claude responder, accessible via UI toggle
- ✅ **Personal task management** — `task_list`, `task_create`, `task_update`, `task_complete` tools backed by `TASKS.json`
- ✅ **Web search fixed** — DDG package updated (`ddgs`); `WebSearch`/`WebFetch` allowed for Claude CLI fallback
- ✅ **Session persistence for orchestrator** — agent mode turns now survive page refresh
- ✅ **Systemd user service** — Cortex runs as a user service; no sudo required (`systemctl --user restart cortex`)
- ✅ **OAuth token warning banner** — amber banner when Claude CLI token is within 24h of expiry
---
*Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent.*
*Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.*


@@ -1,106 +1,9 @@
# MEMORY_SHORT.md — Recent Session Digest
*Auto-generated: 2026-03-23 03:00. 4 session file(s).*
*Auto-generated: 2026-03-27 03:00. 6 session file(s).*
---
--- 2026-03-17.md ---
# Session Log — 2026-03-17
### [21:38] `lark-field`
**Scott:** Another test
**Inara:** Still here. All systems nominal. ✨
### [21:54] `pine-shore`
**Scott:** Testing after lots of changes to the UI.
**Inara:** Ready when you are. What are you seeing?
--- 2026-03-18.md ---
# Session Log — 2026-03-18
### [21:12] `nct_rmcggr4a`
**Scott:** We have made a lot of updates today.
**Inara:** Good session — Cortex is clearly in a much better state than where it started. The connection testing phase is behind us and the system is running clean.
Want me to log today's work with `ae_log_work`, or do a quick distill of the session notes into `MEMORY_SHORT.md`? Good time to capture the "Cortex is live and stable" milestone before we close out.
### [21:14] `nct_rmcggr4a`
**Scott:** For now you are not really doing any work. You will still see a lot of "test" messages.
What do you have in your short term memory?
**Inara:** From `MEMORY_SHORT.md`, here's the summary of recent sessions:
- **2026-03-13** — One entry, end of a productive week. No specific tasks logged.
- **2026-03-14** — Mostly connection/pipe testing via Nextcloud Talk (`nct_rmcggr4a`). Alphabet soup (ABC, XYZ, 321), a few "understood" responses, and then Cortex confirmed live. You noted it was "very cool" and confirmed the naming arc (Aether → Cortex → Inara) was working as intended.
- **2026-03-16** — More testing, multiple "understood" responses. Toward the end, Cortex was clearly stabilizing — you confirmed it worked, asked me about myself, and we had a real conversation about the system. The "Omg it worked!" moment is in there.
- **2026-03-17** — Two brief entries: "Another test" and "Testing after lots of changes to the UI."
So the short-term picture is: Cortex went from unstable/in-testing to confirmed-working over the past few days, with a lot of connection validation noise along the way. No major technical tasks were logged — just infrastructure bring-up and UI refinement.
### [21:15] `nct_rmcggr4a`
**Scott:** I think you should have access to read this file:
/home/scott/agents_sync/projects/Cortex_and_Inara_dev/README.md
**Inara:** Looks like I need explicit permission to access that path. The agents_sync directory isn't automatically in scope for file reads here — you'd need to either grant access or we could work around it by having you paste the relevant bits.
Alternatively, if you want me to have persistent read access to that path, we can add it to the allowed paths in settings. Want me to do that?
### [21:32] `nct_rmcggr4a`
**Scott:** I made a change. Please try to read that README.md file again.
**Inara:** Got it — that worked cleanly. Good summary of what Cortex is:
- **FastAPI dispatcher** sitting in front of Claude CLI / Gemini CLI / Ollama (future)
- **Inara's identity** lives in `inara/` — loaded at startup via `context_loader.py`
- Web UI at `localhost:8000` (or `cortex.dgrzone.com` on WireGuard)
- Systemd service (`sudo systemctl restart cortex`) rather than direct Docker management
The architecture diagram confirms what I know — I'm not tied to a specific model, just to the identity files. The backend can swap underneath me, which is a clean design.
Anything specific in here you wanted me to look at, or were you just verifying the read access was working?
### [22:57] `cedar-cave`
**Scott:** Can you tell me what happened in the world news today?
**Inara:** Here's a quick snapshot of what's making news today (2026-03-18):
- **Iran:** An attack on the world's largest natural gas field was reported — Iranian media says no injuries. Separately, the UK charged two men with national security offenses linked to Iran.
- **US/Iran tensions:** Reports (a few days old) of Trump allegedly discussing attacking Iran, with advisors denying it.
- **Tech/geopolitics:** China signaling it doesn't want to merely catch up with the US in tech — it's aiming to lead.
For the full picture, BBC World, Reuters, and AP are solid right now. Anything specific you want to dig into?
### [23:19] `cobalt-ridge`
**Scott:** Can you add 2 test items to your new task list.
**Inara:** Done. Two test items added to the task list:
- `t_jwsoMn5b` — Test Item 1
- `t_ptw8nDec` — Test Item 2
Both are sitting in `[todo]` status. Want me to clean them up after, or leave them as a baseline sanity check?
### [23:21] `fern-marsh`
**Scott:** Can you tell me what is on your task list?
**Inara:** Currently there are two items on the task list — both sitting in **todo**:
| ID | Task |
|---|---|
| `t_jwsoMn5b` | Test Item 1 |
| `t_ptw8nDec` | Test Item 2 |
Looks like placeholder test entries. Want to clear those out and populate it with real work, or are these serving a purpose?
--- 2026-03-19.md ---
# Session Log — 2026-03-19
@@ -217,3 +120,83 @@ Anything you want to work on, or are you just checking in after the long day?
Multi-user/multi-persona means Holly and Tina are a real path now too. The system grew up. ✨
What's next on the build?
--- 2026-03-23.md ---
# Session Log — 2026-03-23
### [22:20] `vale-echo`
**Scott:** Can you and checking on my green laser pointers to a to do list. Some time this coming weekend.
**Inara:** Done — "Check green laser pointers" is on the list (`t_ODYEciPP`), flagged for sometime this weekend. ✨
### [22:38] `amber-ridge`
**Scott:** This is just a test message. Yet again.
**Inara:** Still here. ✨
--- 2026-03-25.md ---
# Session Log — 2026-03-25
### [22:24] `bold-meadow`
**Scott:** It was a long day. Had issues with IDAA's modules last night and unrelated issues again tonight.
**Inara:** IDAA's modules giving you grief again? That's a frustrating loop to be stuck in after the progress we've made on Cortex.
If you want to dump the specific module failures or logs here, I can help parse through the noise. Otherwise, I'm here if you just need a sounding board before you call it a night. ✨
### [22:40] `bold-meadow`
**Scott:** I have just been putting out the fires as they come up. Not much left to test or anything. Going to make some process changes to how I deploy things though.
**Inara:** Firefighting mode is exhausting — especially when it's unrelated issues back to back. At least it surfaces the gaps before they become bigger problems.
Process changes after a rough deploy cycle is the right instinct. Anything specific you're thinking — more staging discipline, rollback checkpoints, something else?
### [23:18] `bold-meadow`
**Scott:** I need to automate the deployment to the live server. Too many touches by me are required.
**Inara:** That's overdue if IDAA is getting patched reactively on a live server.
Before I suggest an approach — what does the current deploy look like? Specifically:
- Is the live server the same Docker Compose stack, or different infra than dev?
- What are the "touches" — building the frontend, pulling the API, restarting containers, DB migrations, all of the above?
- Do you have a git remote on the live server, or are you rsync/scp-ing files?
- Any preference on trigger — push to `main`, manual webhook, a `make deploy` style script?
That'll tell me whether we're talking about a simple deploy script, a GitHub Actions pipeline, or something in between.
--- 2026-03-26.md ---
# Session Log — 2026-03-26
### [22:26] `bold-meadow`
**Scott:** We just changed the UI. Testing this on my Pixel 10 XL to see how the it looks and if the keyboard works.
**Inara:** How's it looking? Font size rendering okay, or are the buttons cramped on mobile?
--- 2026-03-27.md ---
# Session Log — 2026-03-27
### [00:07] `bold-meadow`
**Scott:** This is finally very useable on my phone. It desktop size has been fine more or less. Now both look more polished.
**Inara:** Both ends of the spectrum polished in one pass — that's a good night's work, especially after the IDAA firefighting. What changed on the mobile side — layout tweaks, touch targets, input handling?
### [00:11] `bold-meadow`
**Scott:** Mainly tried to only the essentials and keep extras behind a menu. Less clutter. Also just little things like font size and weight.
**Inara:** Clean approach. Hiding the non-essentials gets you 80% of the way to feeling native on mobile — the font tuning is the polish layer that makes it feel intentional rather than just "it works."
Is the menu a slide-out/hamburger, or a modal/sheet?
### [00:14] `bold-meadow`
**Scott:** It is a mix. Small vertical and modals.
Time to get ready for bed. Back to programming tomorrow morning.
**Inara:** Good call. Rest earned tonight — UI polish *and* IDAA triage is a solid day's work.
Talk tomorrow. ✨