# Cortex / Inara — Agent Task List

> Read this file before starting any work on this project.
> **Status:** Active development — ongoing.

---

## 🔴 High Priority

### [Local] Local orchestrator — reach full parity with Gemini orchestrator
`openai_orchestrator.py` is partially built and wired into `POST /orchestrate`.
When the `orchestrator` role resolves to a `local_openai` model it routes there
automatically. Remaining work is quality/reliability parity, not ground-up design.

- [ ] Audit tool schema conversion — Gemini `FunctionDeclaration` → OpenAI `tools` format
      (minor field rename, already partially done)
- [ ] Context budget enforcement per iteration (40–50k for E4B, 35–40k for 26B A4B)
- [ ] Context compaction — trim stale tool results mid-run when approaching limit
- [ ] Error handling parity with Gemini orchestrator (retry logic, malformed tool calls)
- [ ] Test end-to-end with Gemma 4 E4B and 26B A4B on scott_gaming
- [ ] Review `ARCH__FUTURE.md` agent architecture ideas before finalising design
- Reference: `docs/OPEN_WEBUI_API.md`, `documentation/ARCH__FUTURE.md` §1
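
For the schema-conversion audit above, a minimal sketch of the mapping may help as a checklist. It assumes the Gemini declarations are available as plain dicts with upper-case type names (`"OBJECT"`, `"STRING"`) and that the OpenAI side expects lower-case JSON Schema types. `gemini_to_openai_tool` and `_lower_types` are hypothetical names, not the functions in `openai_orchestrator.py`.

```python
from typing import Any


def _lower_types(schema: dict[str, Any]) -> dict[str, Any]:
    """Recursively lower-case `type` values in a Gemini-style schema dict."""
    out: dict[str, Any] = {}
    for key, value in schema.items():
        if key == "type" and isinstance(value, str):
            out[key] = value.lower()
        elif key == "properties" and isinstance(value, dict):
            # Property names are kept as-is; only their sub-schemas are converted.
            out[key] = {name: _lower_types(sub) for name, sub in value.items()}
        elif key == "items" and isinstance(value, dict):
            out[key] = _lower_types(value)
        else:
            out[key] = value
    return out


def gemini_to_openai_tool(decl: dict[str, Any]) -> dict[str, Any]:
    """Wrap one declaration dict in the OpenAI `tools` entry shape (assumed input layout)."""
    params = decl.get("parameters") or {"type": "OBJECT", "properties": {}}
    return {
        "type": "function",
        "function": {
            "name": decl["name"],
            "description": decl.get("description", ""),
            "parameters": _lower_types(params),
        },
    }
```

Fields this sketch does not know about are passed through unchanged, so anything Gemini-specific in the real declarations would still need a manual check.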

---

## 🟡 Medium Priority

### [UI] Progressive Web App (PWA) ✅ — 2026-04-29
- manifest.json, sw.js, icon-192/512.png, SW registration in app.js
- `/manifest.json` and `/sw.js` served at root; added to `_PUBLIC` in auth_middleware
- Tested: install prompt confirmed working in Chromium

### [Tools] Orchestrator tool expansions
New tools for `cortex/tools/` — higher-value additions that fill obvious gaps.
- [x] **`cortex_restart`** — detached subprocess, 5s delay, admin-only, confirm-required — 2026-04-29
- [x] **`cortex_logs`** — `journalctl --user -u cortex -n N`, admin-only — 2026-04-29
- [x] **`http_fetch`** — direct URL fetch via httpx, 8192 char cap — 2026-04-29
- [x] **`file_list`** — directory listing with size, dirs first, 200 entry cap, admin-only — 2026-04-29
- [x] **`file_write`** — overwrite/append to home_root paths, admin-only, confirm-required — 2026-04-29
- [x] **`nc_talk_send`** — outbound NC Talk message via notification.py, admin-only — 2026-04-29
- [x] **`email_send`** — SMTP via email_utils, per-user regex allowlist in `home/{user}/email_allowlist.json`, managed via Settings UI textarea + Files panel raw editor — 2026-04-29
- [ ] **`web_push`** — send a browser push notification (requires push subscription stored
      per-user; pairs well with the PWA service worker already in place)

### [Channel] Proactive notifications
Inara reaches out on her own initiative via NC Talk or Google Chat when a reminder
fires, a cron job completes, or something else warrants attention. The cron/reminder
infrastructure already exists — this closes the loop so she can interrupt the user.
- [ ] Add outbound message helper for NC Talk (`send_nextcloud_message(user, text)`)
- [ ] Add outbound message helper for Google Chat (`send_google_chat_message(user, text)`)
- [ ] Wire cron job completion and reminder triggers to call outbound helper
- [ ] Store user preference: which channel to use for proactive notifications
- [ ] `channels.json` is already per-user; add `notify_channel: "nextcloud" | "google_chat" | null`
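
One way the preference lookup and dispatch could fit together, as a rough sketch. It assumes `channels.json` is a JSON object and that the two outbound helpers named above are passed in as callables. `load_notify_channel` and `notify` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable, Optional

Sender = Callable[[str, str], None]  # (user, text), matching the helper signatures above
VALID_CHANNELS = ("nextcloud", "google_chat")


def load_notify_channel(channels_path: Path) -> Optional[str]:
    """Return the configured channel, or None if the file or field is missing or invalid."""
    try:
        config = json.loads(channels_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if not isinstance(config, dict):
        return None
    value = config.get("notify_channel")
    return value if value in VALID_CHANNELS else None


def notify(user: str, text: str, channel: Optional[str], senders: dict[str, Sender]) -> bool:
    """Send via the chosen channel; return False when no channel or sender is configured."""
    sender = senders.get(channel) if channel else None
    if sender is None:
        return False
    sender(user, text)
    return True
```

Treating a missing or unknown value as "no proactive notifications" keeps the default quiet, which seems consistent with the no-channel-access-by-default rule for `channels.json`.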

### [UI] File attachments in chat
Upload an image or document inline and have it flow into context. Natural workflow
("here's this PDF, summarize it"); local backend already supports multimodal via Open WebUI.
- [ ] Add attachment button to input area (paperclip icon, hidden file input)
- [ ] Client: encode file as base64 or multipart; send alongside message text
- [ ] Server: accept file in `POST /chat`; route to appropriate backend
  - Claude: `content` array with `image` blocks (base64 or URL)
  - Gemini: `parts` array with `inline_data`
  - Local (Open WebUI): `content` array with image_url items
- [ ] UI: show thumbnail/filename above the sent message
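
The three per-backend shapes listed above could be produced from the same uploaded bytes roughly as follows. This is a sketch based on the publicly documented request formats as I understand them; the function names are hypothetical, and the exact field names should be checked against each API before relying on them.

```python
import base64


def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def claude_image_block(data: bytes, media_type: str) -> dict:
    """Image block for a Claude `content` array (base64 source)."""
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": _b64(data)},
    }


def gemini_inline_part(data: bytes, mime_type: str) -> dict:
    """Entry for a Gemini `parts` array using `inline_data`."""
    return {"inline_data": {"mime_type": mime_type, "data": _b64(data)}}


def openai_image_item(data: bytes, mime_type: str) -> dict:
    """`image_url` item for an OpenAI-compatible `content` array, sent as a data URL."""
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{_b64(data)}"},
    }
```

Keeping one builder per backend means the `POST /chat` handler only has to pick a builder based on the resolved model type.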

### [Auth] Encrypted sessions
Allow users to opt in to per-session encryption so session logs on disk cannot be
read without the user's key.
- [ ] Design key derivation: password-based (PBKDF2/Argon2) or separate passphrase
- [ ] Encrypt `session_logger.py` output before writing to `sessions/*.md`
- [ ] Decrypt on read in `session_store.py` (history reload, file browser)
- [ ] UI toggle in Settings to enable/disable encrypted sessions per persona
- [ ] Decide: encrypt at rest only, or also in-memory session store?
- [ ] Consider: how distillation and session search interact with encrypted files
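
For the key-derivation decision, a standard-library sketch of the PBKDF2 option is below. The iteration count, salt size, and function names are assumptions for illustration. Argon2 would need a third-party package, and the encryption step itself is deliberately left out.

```python
import hashlib
import os

SALT_BYTES = 16               # assumed salt size
DEFAULT_ITERATIONS = 600_000  # assumed default; tune for the target hardware


def new_salt() -> bytes:
    """Random per-user (or per-session) salt, stored alongside the ciphertext."""
    return os.urandom(SALT_BYTES)


def derive_session_key(passphrase: str, salt: bytes, iterations: int = DEFAULT_ITERATIONS) -> bytes:
    """Derive a 32-byte key from a passphrase with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
```

The same passphrase and salt always yield the same key, so only the salt and iteration count need to be persisted. The key itself never has to touch disk.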

### [Models] Model Registry V2 — Unified Provider System
See `DESIGN__Model_Registry_V2.md` for full design.
- [x] **Phase 1** — V2 schema with providers (Anthropic/Google), multi-account Gemini, auto migration, orchestrator uses account API key — 2026-04-27
- [ ] **Phase 2** — Cloud provider UI: Anthropic + Google sections in `/settings/models`, account management, model entry creation for cloud models
- [ ] **Phase 3** — Unified roles + toggle redesign: standalone role assignments, chat toggle cycles role slots (Primary/Backup 1/Backup 2) showing model label
- [ ] **Phase 4** — Polish: Claude API key, OpenRouter as named provider, catalog sync from API

### [Intelligence] Knowledge consolidation — Phase 1
See `ARCH__Intelligence_Layer.md` for full design.
- [x] Tool: `ae_journal_list` — list all journals for the account — 2026-04-28
- [x] Tool: `ae_journal_search` — search before creating to avoid duplicates
- [x] Tool: `ae_journal_entry_create` — write a new entry with source metadata
- [x] Tool: `ae_journal_entry_update` — PATCH any fields on an existing entry — 2026-04-28
- [x] Tool: `ae_journal_entry_disable` — soft-delete via enable=false — 2026-04-28
- [x] Tool: `ae_journal_entry_append` — read→append timestamped section→write (running logs) — 2026-04-28
- [x] Tool: `ae_journal_entry_prepend` — read→prepend timestamped section→write (newest-first logs) — 2026-04-28
- [x] Import script: walk a markdown directory, chunk by H2 section, create entries — 2026-05-05
- [x] Target: markdown files from `~/DgrZone_Nextcloud/` and `~/OSIT_Nextcloud/` — 2026-05-05
- [x] Tag strategy: source path, topic tags from path components — 2026-05-05
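
The chunk-by-H2 and tag-from-path steps above can be sketched as follows. This is not the actual import script. `chunk_by_h2` and `tags_from_path` are hypothetical names, and the sketch does not special-case `## ` lines inside fenced code blocks.

```python
import re
from pathlib import Path

H2_RE = re.compile(r"^## +(.+?)\s*$")  # matches "## Title" but not "### Title"


def chunk_by_h2(text: str) -> list[tuple[str, str]]:
    """Split markdown into (title, body) pairs; text before the first H2 gets title ''."""
    chunks: list[tuple[str, str]] = []
    title, lines = "", []
    for line in text.splitlines():
        match = H2_RE.match(line)
        if match:
            body = "\n".join(lines).strip()
            if title or body:
                chunks.append((title, body))
            title, lines = match.group(1), []
        else:
            lines.append(line)
    body = "\n".join(lines).strip()
    if title or body:
        chunks.append((title, body))
    return chunks


def tags_from_path(path: Path, root: Path) -> list[str]:
    """Topic tags from the directory components between root and the file."""
    return [part.lower() for part in path.relative_to(root).parts[:-1]]
```

Each `(title, body)` pair would then map to one journal entry, with the relative source path stored as metadata next to the tags.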

---

## 🟢 Lower Priority / Future

### [Research] Agent architecture patterns — review before building dev agent pipeline
The Claude Code system prompt was leaked in April 2026. Two reimplementation repos have
useful design ideas directly applicable to the local orchestrator and dev agent work.
Read before finalising either design.
- [ ] Review https://github.com/HarnessLab/claw-code-agent (Python, targets local models)
- [ ] Review https://github.com/ultraworkers/claw-code (community port, interesting source)
- Key ideas to evaluate for Cortex:
  - Tiered permission model (read-only / write / shell / unsafe) — relevant once dev
    agent is writing and executing code
  - Agent lineage tracking — which agent spawned which sub-agent; essential for the
    orchestrator → specialist → supervisor chain
  - Hard token/cost budgets per operation — local models have fixed context ceilings
  - Context compaction mid-session — trim stale tool results before hitting limit
  - Nested agent delegation with dependency-aware batching
  - Plugin/manifest-based tool registration — worth considering before tool suite grows
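
To make the compaction idea concrete, here is a minimal sketch that stubs out the oldest tool results until the conversation fits a budget. It assumes OpenAI-style message dicts with `role` and `content`, and uses a crude 4-characters-per-token estimate. The names and the stub text are illustrative and not taken from either repo.

```python
from typing import Any

STUB = "[tool result trimmed to save context]"


def estimate_tokens(text: str) -> int:
    """Very rough estimate (about 4 characters per token); swap in a real tokenizer."""
    return max(1, len(text) // 4)


def total_tokens(messages: list[dict[str, Any]]) -> int:
    return sum(estimate_tokens(str(m.get("content") or "")) for m in messages)


def compact_messages(
    messages: list[dict[str, Any]], budget_tokens: int, keep_recent: int = 2
) -> list[dict[str, Any]]:
    """Return a copy where the oldest tool results are stubbed until under budget.

    The newest `keep_recent` tool results are never trimmed.
    """
    compacted = [dict(m) for m in messages]
    tool_idx = [i for i, m in enumerate(compacted) if m.get("role") == "tool"]
    trimmable = tool_idx[:-keep_recent] if keep_recent else tool_idx
    for i in trimmable:
        if total_tokens(compacted) <= budget_tokens:
            break
        compacted[i]["content"] = STUB
    return compacted
```

Replacing content with a stub rather than deleting the message keeps tool-call and tool-result pairs intact, which OpenAI-compatible backends generally expect.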

### [Backend] API usage / cost tracking
Multi-user setup with real Gemini/Claude API costs. Track per-user token consumption
so Scott can see who's spending what.
- [x] Count input + output tokens — local backend (OpenAI `usage` field) + Gemini API (`usage_metadata`) — 2026-05-05
- [x] Append to `home/{user}/usage.json` — daily buckets, per-model breakdown — 2026-05-05
- [ ] Expose via `/api/usage` endpoint; add a summary row to the Settings page
- [ ] Optional: soft spending limit with a warning toast when exceeded
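
For the `/api/usage` summary, the daily-bucket file could be read and rolled up along these lines. The layout assumed here (`{date: {model: {"input_tokens", "output_tokens"}}}`) is a guess at what "daily buckets, per-model breakdown" means, not the confirmed `usage.json` schema, and both function names are hypothetical.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def record_usage(path: Path, model: str, input_tokens: int, output_tokens: int,
                 day: Optional[str] = None) -> dict:
    """Add token counts to the bucket for `day` (default: today) and rewrite the file."""
    day = day or date.today().isoformat()
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        data = {}
    bucket = data.setdefault(day, {}).setdefault(
        model, {"input_tokens": 0, "output_tokens": 0}
    )
    bucket["input_tokens"] += input_tokens
    bucket["output_tokens"] += output_tokens
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return data


def summarize_usage(data: dict) -> dict:
    """Collapse the daily buckets into per-model totals for a summary row."""
    totals: dict = {}
    for models in data.values():
        for model, counts in models.items():
            total = totals.setdefault(model, {"input_tokens": 0, "output_tokens": 0})
            total["input_tokens"] += counts.get("input_tokens", 0)
            total["output_tokens"] += counts.get("output_tokens", 0)
    return totals
```

A soft spending limit could then compare these totals against a per-user threshold, with pricing per model kept in a separate table.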

### [Intelligence] Dev agent pipeline
See `ARCH__Intelligence_Layer.md`. Full design not yet started.
- [ ] Specialist agent: frontend (SvelteKit) code changes
- [ ] Specialist agent: backend (FastAPI) code changes
- [ ] Supervisor agent: diff review, syntax check, test runner
- [ ] Gitea webhook integration: trigger on push/PR, report back
- [ ] Human approval gate before commit

### [Intelligence] Supervisor agent
- Runs `py_compile`, `svelte-check`, unit tests after specialist agent work
- Reports pass/fail back to orchestrator
- Only commits on explicit approval

### [Channel] Gitea webhooks
- Receive push/PR/issue events → route to appropriate agent
- `cortex/routers/` already has the pattern; add `gitea.py`
- Gitea Actions (CI) for "run tests on push" — simpler than custom runner

### [Local] RAG via Open WebUI
Open WebUI has a full RAG pipeline (file upload → embed → knowledge collections →
reference in chat). Could feed Nextcloud docs or session logs into a local knowledge
base accessible to local models. Endpoints documented in `docs/OPEN_WEBUI_API.md`.
- `/api/v1/files/` upload + `/api/v1/retrieval/process/web` for URLs
- Reference in chat via `"files": [{"type": "collection", "id": "..."}]`

### [Backend] Intelligent model routing — automatic task-type dispatch
Model Registry V2 (2026-04-27) added role-based routing and manual role toggle — that's
the foundation. What remains is removing the need to toggle manually.
- [ ] Classify incoming messages by task type (heuristic or lightweight classifier)
- [ ] Map task type → role → model automatically:
  - User conversation → `chat` role → Claude (quality prose, persona fidelity)
  - Tool/research tasks → `orchestrator` role → Gemini API or local
  - Private/sensitive → `local` role → Ollama (no data leaves network)
  - Long context (>50k tokens) → Gemini 2.0 (1M ctx window)
  - Fast/cheap queries → local E4B (25 t/s, no API cost)
- [ ] Routing logic in `llm_client.py` or new `router.py`; expose override in UI
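
A heuristic first pass at the mapping above might look like the sketch below. The precedence order (privacy first, then context size, then tools) and the `long_context` label are assumptions for discussion. Only `chat`, `orchestrator`, and `local` are roles named in this file.

```python
from dataclasses import dataclass

LONG_CONTEXT_TOKENS = 50_000  # threshold taken from the list above


@dataclass(frozen=True)
class Route:
    role: str
    reason: str


def route_message(token_count: int, needs_tools: bool = False, private: bool = False) -> Route:
    """Pick a role for one incoming message using simple, ordered rules."""
    if private:
        # Privacy wins over everything else: nothing leaves the network.
        return Route("local", "private/sensitive content")
    if token_count > LONG_CONTEXT_TOKENS:
        return Route("long_context", "exceeds local context ceilings")
    if needs_tools:
        return Route("orchestrator", "tool or research task")
    return Route("chat", "plain conversation")
```

Returning a reason alongside the role makes it easy to show in the UI why a message was routed where it was, and to offer the manual override next to it.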

### [Ops] Permanent fleet hosting — home server deployment
Currently running on `scott-lt-i7-rtx` (gaming laptop). Long-term target is the
home server for always-on reliability. `docker-compose.yml` already exists.
- [ ] Copy project to home server
- [ ] Configure Nginx reverse proxy (already Docker-hosted on that machine)
- [ ] Point `cortex.dgrzone.com` → home server internal IP (pfSense alias update)
- [ ] WireGuard required for all access — not internet-exposed
- [ ] Update `FLEET_MANIFEST.md` to reflect new hosting location

### [Future] Cortex Mesh — multi-instance fleet coordination
Each fleet device runs its own Cortex instance. Instances delegate tasks to each
other based on resources and specialisation. No central coordinator required.
- Concept only — no design yet. Resolve these questions before building:
  - Auth between instances (shared JWT secret vs. per-instance API keys)
  - Capability advertisement (model registry over HTTP? shared Syncthing file?)
  - Whether `ae_send_message` / the inbox system is the right coordination layer
  - Session continuity — does a conversation stay on one node or migrate?
- Natural foundation already in place: Syncthing-synced `home/` and shared
  `model_registry.json` mean instances share persona memory without a central DB

---

## ✅ Completed

### [Tools] email_send tool + per-user email allowlist — 2026-04-29
- `email_send(to, subject, body)` in `cortex/tools/notify.py` — SMTP via `email_utils.py`
- Per-user regex allowlist at `home/{user}/email_allowlist.json` (JSON array of patterns)
- `re.fullmatch(..., re.IGNORECASE)` — supports wildcards like `.*@oneskyit\\.com`
- Blocked by default (no allowlist = no sends); non-matching addresses silently blocked
- Registered as admin-only tool in `TOOL_ROLES`
- **Settings UI**: `POST /settings/email-allowlist` — textarea in Account Settings, one pattern per line
- **Files panel**: `email_allowlist.json` added to `USER_FILES` in `files.py`; served from `home/{user}/`; appears in new "Settings" group in sidebar
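
The allowlist check described above (full-match, case-insensitive, blocked by default) can be sketched in a few lines. `is_recipient_allowed` is a hypothetical name, and skipping invalid patterns is an assumption about how bad entries should be handled.

```python
import re


def is_recipient_allowed(address: str, patterns: list[str]) -> bool:
    """True only if some allowlist pattern matches the whole address.

    An empty allowlist blocks everything.
    """
    for pattern in patterns:
        try:
            if re.fullmatch(pattern, address, re.IGNORECASE):
                return True
        except re.error:
            # Assumed behaviour: an invalid pattern never matches.
            continue
    return False
```

Because `re.fullmatch` anchors both ends, a pattern such as `.*@oneskyit\.com` does not match `user@oneskyit.com.example.org`.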

### [Models] Edit existing model entries — 2026-04-29
- Inline edit form per model row in `local_llm.html` (`.model-row-header` + hidden `.model-edit-form`)
- "Edit" toggle shows pre-populated form; "Cancel" collapses it
- "Fetch models" button in edit form — same live-fetch flow as Add Model
- `POST /settings/local/models/{model_id}/edit` route in `local_llm.py` dispatches to `save_model` / `save_cloud_model` (upsert via `model_id`)
- Works for both `local_openai` and cloud model types

### [Sessions] Cross-session search — 2026-04-29
- `GET /sessions/search?q=&user=&persona=&limit=` in `files.py` — full-text grep across `sessions/*.md`, newest first
- Returns up to `limit` matches with 120-char excerpt and date; `total_files_searched` count
- UI: search input + results panel below Files sidebar; `Ctrl+F` / search icon shortcut; `marked.parse` highlights matches
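
The search behaviour described above could be approximated as below. This sketch assumes session filenames sort chronologically (for example, a date prefix), which is how "newest first" is implemented here. The function name and result field names other than `total_files_searched` are illustrative.

```python
import re
from pathlib import Path


def search_sessions(sessions_dir: Path, query: str, limit: int = 20,
                    excerpt_chars: int = 120) -> dict:
    """Case-insensitive literal search over sessions/*.md, newest filename first."""
    if not query.strip():
        return {"matches": [], "total_files_searched": 0}
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    matches: list[dict] = []
    searched = 0
    for path in sorted(sessions_dir.glob("*.md"), key=lambda p: p.name, reverse=True):
        if len(matches) >= limit:
            break
        searched += 1
        text = path.read_text(encoding="utf-8", errors="replace")
        hit = pattern.search(text)
        if hit is None:
            continue
        # Start a little before the hit so the excerpt shows some context.
        start = max(0, hit.start() - excerpt_chars // 3)
        excerpt = " ".join(text[start:start + excerpt_chars].split())
        matches.append({"file": path.name, "excerpt": excerpt})
    return {"matches": matches, "total_files_searched": searched}
```

Stopping as soon as `limit` is reached keeps the scan cheap, at the cost of `total_files_searched` reflecting only the files actually opened.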

### [Tools] Role-based access control + confirmation gate — 2026-04-29
- `TOOL_ROLES` dict maps tool names to minimum required role (`admin`/`user`)
- `CONFIRM_REQUIRED` set blocks destructive tools; orchestrator injects confirmation prompt instead
- `get_tools_for_role(role)` filters both Gemini declarations and callables
- `get_user_role(username)` added to `auth_utils.py`; passed through both orchestrators
- `manage_passwords.py role <username> admin|user` — shell-only admin promotion
- Admin-only tools: `shell_exec`, `claude_allow_dir`, `cortex_restart`, `cortex_logs`,
  `file_read`, `file_list`, `file_write`, `ae_task_list`, `nc_talk_send`
- Confirm-required tools: `cortex_restart`, `file_write`, `shell_exec`, `cron_remove`, `reminders_clear`
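
The role filter and confirmation gate above reduce to two small checks, sketched here. The `TOOL_ROLES` entries are a sample, the extra `registry` argument differs from the real `get_tools_for_role(role)` signature, and treating unlisted tools as admin-only is an assumption.

```python
from typing import Callable

ROLE_RANK = {"user": 0, "admin": 1}

# Sample entries only; the real TOOL_ROLES covers the whole tool suite.
TOOL_ROLES = {
    "shell_exec": "admin",
    "file_write": "admin",
    "cortex_logs": "admin",
    "http_fetch": "user",
    "reminders_clear": "user",
}
CONFIRM_REQUIRED = {"cortex_restart", "file_write", "shell_exec", "cron_remove", "reminders_clear"}


def filter_tools_for_role(role: str, registry: dict[str, Callable]) -> dict[str, Callable]:
    """Keep only the tools whose minimum role is at or below the caller's role."""
    rank = ROLE_RANK.get(role, -1)  # unknown roles get no tools
    return {
        name: fn
        for name, fn in registry.items()
        if ROLE_RANK[TOOL_ROLES.get(name, "admin")] <= rank  # unlisted tools: admin-only
    }


def needs_confirmation(tool_name: str, confirmed: bool = False) -> bool:
    """True when the orchestrator should ask the user before running the tool."""
    return tool_name in CONFIRM_REQUIRED and not confirmed
```

Filtering before the declarations reach the model means a `user`-role session never sees admin tools at all, instead of seeing them and being refused.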

### [UI] Admin role badge in Account settings — 2026-04-29
- `GET /settings` now injects `user_role` from `auth.json` into settings page
- Role shown as a styled pill badge (purple ADMIN, muted USER) below username field

### [Local] Unsloth Gemma 4 variants — resolved 2026-04-29
- Ollama update resolved the `500: unable to load model` issue
- Unsloth Dynamic 2.0 Q4_K_M GGUFs loading correctly

### [Distill] Distill quality review — resolved 2026-04-29
- Short/mid/long output reviewed and quality confirmed acceptable
- No prompt tuning needed at this time

### [UI] Progressive Web App (PWA) — 2026-04-29
- `manifest.json`, `sw.js`, PNG icons (192/512) generated via rsvg-convert
- `/manifest.json` and `/sw.js` served at root via ui.py; exempted in auth_middleware
- Theme-color meta tag updated dynamically on light/dark toggle
- Install prompt confirmed working in Chromium desktop; apple-touch-icon for iOS

### [UI] CodeMirror markdown editor for identity/memory files — 2026-04-28
- Replaced textarea in Files panel with CodeMirror 5 (markdown mode, CDN)
- Syntax highlighting, line wrapping, Ctrl+S to save, per-file undo history

### [UI] Input area polish — 2026-04-28
- Single cycling S/M/L button replaces 3 separate height buttons (same UX as font size)
- S size collapses mode-select to a row (compact); M/L keep vertical column layout
- Input height minimum derived from setting so empty textarea reflects selected size
- Context & Memory panel and Settings dropdown are mutually exclusive (closeAllPanels fix)
- Both panels now use consistent shadow (var(--shadow)) and z-index (200)

### [Tools] Tools toggle — decoupled from Role/Backend — 2026-04-28
- Removed "Agent" mode from the mode selector; replaced with independent ⚡ toggle
- `toolsEnabled` persists in localStorage; routes to orchestrator regardless of active mode
- Layout: column (M/L) or row (S) driven by `data-size` attribute set by JS
- chat_role flows from UI → OrchestrateRequest → orchestrator_engine.run(response_role=...)

### [Tools] shell_exec tool — 2026-04-28
- `shell_exec(command, working_dir, timeout)` in `cortex/tools/system.py`
- Runs any shell command on the Cortex host; timeout clamped 1–120s
- Use for system diagnostics: `df -h`, `ps aux`, `journalctl`, `free -h`, etc.
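
The timeout clamp can be illustrated with a small wrapper around `subprocess.run`. This is a sketch of the behaviour described above, not the code in `cortex/tools/system.py`. `run_shell`, the default timeout, and the return shape are assumptions.

```python
import subprocess
from typing import Optional

MIN_TIMEOUT, MAX_TIMEOUT = 1, 120  # seconds, per the clamp described above


def clamp_timeout(timeout: float) -> float:
    """Force the requested timeout into the 1 to 120 second range."""
    return max(MIN_TIMEOUT, min(MAX_TIMEOUT, timeout))


def run_shell(command: str, working_dir: Optional[str] = None, timeout: float = 30) -> dict:
    """Run a shell command and capture its output; never raises on timeout."""
    try:
        proc = subprocess.run(
            command,
            shell=True,
            cwd=working_dir,
            capture_output=True,
            text=True,
            timeout=clamp_timeout(timeout),
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "returncode": None, "stdout": "", "stderr": "timeout"}
    return {
        "ok": proc.returncode == 0,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Returning a dict instead of raising keeps the tool result easy to serialize back into the orchestrator loop.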

### [Tools] Aether Journals full toolkit — 2026-04-28
- `ae_journal_list` — list all journals + ids for the account
- `ae_journal_entry_update` — PATCH any fields (title, content, summary, tags, enable)
- `ae_journal_entry_disable` — soft-delete via enable=false
- `ae_journal_entry_append` — read→append timestamped section→write (running/data logs)
- `ae_journal_entry_prepend` — read→prepend timestamped section→write (newest-first)
- Shared `_get_entry` / `_patch_entry` helpers; OpenAI JSON Schema auto-derived from Gemini declarations

### [Local] Per-user multi-model local LLM settings — 2026-04-01
- `home/{username}/local_llm.json` — `hosts[]` + `models[]` + `active_model_id` structure
- `cortex/user_settings.py` — CRUD functions: save_host, add_model, remove_model, set_active_model, get_active_local_model
- `cortex/routers/local_llm.py` + `cortex/static/local_llm.html` — dedicated `/settings/local` page
- "Fetch models from host" button — proxied via `/api/local-llm/fetch-models`, populates dropdown
- Active model shown in UI near backend toggle button (amber hint text)
- Migrates old flat `.env`-style config automatically on first use

### [UI] Copy button for user (sent) messages — 2026-04-01
- Added matching copy-on-hover button to user messages (same pattern as assistant messages)
- `div.dataset.raw` set on send; `makeCopyBtn(div)` appended inline

### [Backend] Local model backend (Open WebUI / Ollama) — 2026-04-01
- OpenAI-compatible API via `httpx` — no CLI wrapper needed
- Configured via `LOCAL_API_URL` / `LOCAL_API_KEY` / `LOCAL_MODEL` in `.env`
- Backend toggle cycles `claude → gemini → local` (amber color in UI)
- `/auth/status` includes local reachability check (`GET /api/models`)
- Tested end-to-end: `test-agent-simple` (Qwen3-8B) on `scott-lt-i7-rtx:3000`, full persona context flowing correctly

### [Testing] Gitea SSH port 2222 — 2026-03-29
- pfSense WAN → 192.168.32.7:2222 port forward confirmed working
- `ssh -p 2222 git@git.dgrzone.com` reaches Gitea (returns "Invalid repository path" — expected, confirms connectivity)
- Clone/push via SSH: `git clone ssh://git@git.dgrzone.com:2222/<user>/<repo>.git`

### [Multi-user] Brian onboarding — 2026-03-29
- Invite sent to `memedrift@gmail.com`
- Brian completed onboarding, created `wintermute` persona
- Google OAuth registered (`google-add brian memedrift@gmail.com`)

### [Tools] Reminders tools — 2026-03-29
- `reminders_add`, `reminders_list`, `reminders_clear` added to orchestrator tool suite
- Tools live in `cortex/tools/reminders.py`
- All persona PROTOCOLS.md updated with Tools & Modes reference (direct chat vs Agent mode)
- `persona_template.py` updated so new personas get the protocol automatically

### [Auth] Token expiry — no restart needed — 2026-03-27
- `llm_client._fresh_claude_token()` reads live from `~/.claude/.credentials.json` on every call
- systemd service is a user unit (no sudo) — `systemctl --user restart cortex` is sufficient
- No manual token sync required after `claude auth login`

### [Multi-user] Per-user channel config — 2026-03-27
- Google Chat and NC Talk secrets/config moved from `.env` to `home/{username}/channels.json`
- New endpoints: `POST /channels/google-chat/{username}` and `POST /webhook/nextcloud/{username}`
- No channel access by default — each user configures their own `channels.json`
- Setup guides: `docs/GOOGLE_CHAT_BOT.md` and `docs/NEXTCLOUD_TALK_BOT.md`

### [Auth] Google OAuth sign-in — 2026-03-27
- `GET /auth/google` → Google consent → `GET /auth/google/callback` flow
- Users pre-registered via `manage_passwords.py google-add <user> <email>`
- Google sign-in button on `/login`; auth.json stores `google_sub` + `google_email`
- Active users: scott (scott.idem@oneskyit.com), holly (holly.danner@gmail.com), brian (memedrift@gmail.com)

### [Settings] Per-user Gemini API key — 2026-03-27
- Stored in `home/{username}/auth.json` as `gemini_api_key`
- Orchestrator uses user key if set, falls back to server-level `GEMINI_API_KEY`
- Manageable via `/settings` UI (add, remove, masked hint)

### [UI] Session persistence across navigation — 2026-03-26
- localStorage keyed to `cx_sid_{user}_{persona}` with 30-min inactivity TTL
- Auto-restored silently on page load; cleared on "New session" or session delete

### [UI] Persona picker page — 2026-03-26
- `GET /{username}` shows a card grid of available personas instead of 404
- Each card links directly to `/{username}/{persona}`

### [UI] Lucide icons — 2026-03-25
- Icons throughout: mode selector, send/stop buttons, edit/del/copy, save/cancel
- Loaded via UMD CDN; `icon_html()` + `render_icons()` helpers in `app.js`

### [UI] Persona-specific favicon — 2026-03-25
- Emoji SVG favicon generated from persona config at load time

### [Multi-user] Holly onboarding — 2026-03-20
- Holly's invite sent; onboarding completed via `/setup/{token}`
- `home/holly/persona/tina/` created from template
- Google OAuth registered (`holly.danner@gmail.com`)

### [Channel] Nextcloud Talk integration ✅ — 2026-03-20, updated 2026-03-27
- HMAC verification: incoming uses `random + raw_body`; outgoing reply uses `random + message_text`
- Per-user routing added 2026-03-27 (endpoint: `/webhook/nextcloud/{username}`)
- Docs: `docs/NEXTCLOUD_TALK_BOT.md`
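
The two signing directions noted above differ only in what is appended to the random value. The sketch below assumes HMAC-SHA256 with a hex digest, which is how I understand the Talk bot protocol. The function names are hypothetical; see `docs/NEXTCLOUD_TALK_BOT.md` for the actual header names.

```python
import hashlib
import hmac


def verify_incoming(secret: str, random: str, raw_body: bytes, signature: str) -> bool:
    """Check an incoming webhook: HMAC over random + raw request body."""
    expected = hmac.new(
        secret.encode("utf-8"), random.encode("utf-8") + raw_body, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    return hmac.compare_digest(expected.encode("ascii"), signature.lower().encode("utf-8"))


def sign_outgoing(secret: str, random: str, message_text: str) -> str:
    """Sign an outgoing reply: HMAC over random + message text (not the JSON body)."""
    return hmac.new(
        secret.encode("utf-8"), (random + message_text).encode("utf-8"), hashlib.sha256
    ).hexdigest()
```

Verifying against the raw bytes, before any JSON parsing or re-serialization, avoids signature mismatches caused by whitespace or key-order changes.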

### [Channel] Google Chat integration ✅ — 2026-03-20, updated 2026-03-27
- JWT verification via `authorizationEventObject.systemIdToken`
- Workspace Add-on format: `hostAppDataAction.chatDataAction.createMessageAction`
- Per-user routing added 2026-03-27 (endpoint: `/channels/google-chat/{username}`)
- Docs: `docs/GOOGLE_CHAT_BOT.md`

### [Intelligence] Orchestrator service — Phase 1 — 2026-03-18
- Gemini API (google-genai SDK) tool loop → Claude final response
- `POST /orchestrate` (async job), `GET /orchestrate/{job_id}` (poll)
- Tools: web search, AE API, file read, task list, scratch, reminders, cron
- Default model: `gemini-2.5-flash`

### [Auth] Session auth + persona onboarding — 2026-03-20
- bcrypt passwords in `home/{username}/auth.json`
- JWT session cookies (HS256, 30-day expiry)
- Invite tokens (72h, one-time-use) — `manage_passwords.py invite <user> [email]`
- Self-service onboarding: `/setup/{token}` → `/setup/persona`
- SMTP invite email via `noreply@oneskyit.com`

### [UI] Mobile-friendly header — 2026-03
- Backend toggle, font size, theme buttons moved into ⚙ settings panel
- Header reduced to core buttons

### [UI] Help & Reference — 2026-03-27
- Shared base at `cortex/static/HELP.md` (served to all users)
- Persona-specific additions appended from `home/{username}/persona/{name}/HELP.md` if present
- Collapsible H2 sections via `<details>` elements

### [Backend] Gemini CLI backend — 2026-03
- `gemini -p` subprocess, streaming output; auth check at `/auth/status`

### [Backend] Memory distiller — 2026-03
- APScheduler: `distill_short` (daily 03:00), `distill_mid` (weekly Sun 03:30), `distill_long` (monthly 1st 04:00)
- Writes to `MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md` per persona

### [Backend] Session logging + file browser — 2026-03
- Sessions saved to `home/{user}/persona/{name}/sessions/`
- Files panel in UI browses persona directory

### [Backend] Dispatcher core — 2026-03-04
- FastAPI service with streaming SSE response
- Claude CLI and Gemini CLI subprocess backends
- Session context management (rolling window, `MAX_HISTORY_MESSAGES`)