# Cortex / Inara - Agent Task List
> Read this file before starting any work on this project.
> **Status:** Active development, ongoing.
---
## 🔴 High Priority
### [Local] Tool-capable local orchestrator
Design and implement `local_orchestrator_engine.py`: a ReAct tool loop driven by
a local model via Open WebUI's OpenAI-compatible API, as an alternative to the
Gemini API orchestrator for private/offline tasks.
- [ ] Convert existing Cortex tool definitions (`cortex/tools/`) from Gemini
`FunctionDeclaration` format to OpenAI `tools` format (minor schema diff)
- [ ] Implement tool loop: send tools → parse `tool_calls` response → execute →
append result → loop until `finish_reason: stop`
- [ ] Wire into `routers/orchestrator.py` - new `mode` param: `"local"` vs `"gemini"`
- [ ] UI: Agent mode button routes to local orchestrator when local backend active
- [ ] Recommended models (scott_gaming, 8 GB VRAM):
  - Gemma 4 E4B: 25 t/s, 72k practical ctx; interactive/fast tasks
  - Gemma 4 26B A4B: 9 t/s, 50k practical ctx; heavier reasoning, background tasks
- Reference: `docs/OPEN_WEBUI_API.md` for full tool call request/response format
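
The tool loop described in this section could look roughly like the sketch below. It is an illustration under stated assumptions, not the planned `local_orchestrator_engine.py`: `to_openai_tool`, `run_tool_loop`, the `send` callable, and the `registry` dict are hypothetical names, and the declaration is assumed to already be a plain dict whose `parameters` field is lowercase JSON Schema. The response shape follows the OpenAI chat-completions format.

```python
import json
from typing import Any, Callable

Message = dict[str, Any]


def to_openai_tool(decl: dict[str, Any]) -> dict[str, Any]:
    """Wrap a declaration dict (name/description/parameters) in OpenAI `tools` format."""
    return {
        "type": "function",
        "function": {
            "name": decl["name"],
            "description": decl.get("description", ""),
            "parameters": decl.get("parameters", {"type": "object", "properties": {}}),
        },
    }


def run_tool_loop(
    send: Callable[[list[Message], list[dict[str, Any]]], dict[str, Any]],
    messages: list[Message],
    tools: list[dict[str, Any]],
    registry: dict[str, Callable[..., Any]],
    max_turns: int = 8,
) -> str:
    """Call the model, execute any requested tools, and repeat until it stops asking."""
    for _ in range(max_turns):
        response = send(messages, tools)  # hypothetical: POSTs to the chat endpoint
        message = response["choices"][0]["message"]
        messages.append(message)
        tool_calls = message.get("tool_calls") or []
        if not tool_calls:
            # No tool requests left: this is the final answer.
            return message.get("content") or ""
        for call in tool_calls:
            fn = call["function"]
            args = json.loads(fn.get("arguments") or "{}")
            result = registry[fn["name"]](**args)
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
            )
    raise RuntimeError("tool loop did not finish within max_turns")
```

Passing `send` in as a callable keeps the loop testable without a running Open WebUI instance; the `max_turns` cap is an added safety limit, not something the task list specifies.
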
---
## 🟡 Medium Priority
### [UI] Progressive Web App (PWA)
Low effort, meaningful mobile UX improvement: install Cortex as a home screen app.
- [ ] Add `manifest.json` (name, icons, theme color, display: standalone, start_url)
- [ ] Serve `manifest.json` from `cortex/routers/ui.py` or as a static file
- [ ] Add a `<link rel="manifest">` tag to `index.html`
- [ ] Basic service worker for offline shell (cache static assets; network-first for API)
- [ ] Register service worker in `app.js`
- [ ] Test on iOS (Safari) and Android (Chrome): both support installing PWAs (Android Chrome shows an install prompt; iOS Safari uses Share → Add to Home Screen)
### [Channel] Proactive notifications
Inara reaches out on her own initiative via NC Talk or Google Chat when a reminder
fires, a cron job completes, or something else warrants attention. The cron/reminder
infrastructure already exists; this closes the loop so she can interrupt the user.
- [ ] Add outbound message helper for NC Talk (`send_nextcloud_message(user, text)`)
- [ ] Add outbound message helper for Google Chat (`send_google_chat_message(user, text)`)
- [ ] Wire cron job completion and reminder triggers to call outbound helper
- [ ] Store user preference: which channel to use for proactive notifications
- [ ] `channels.json` is already per-user; add `notify_channel: "nextcloud" | "google_chat" | null`
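
One way the `notify_channel` preference could drive delivery is sketched below. `send_proactive` and the `senders` mapping are hypothetical; the two real helpers named in this section (`send_nextcloud_message`, `send_google_chat_message`) would be plugged in as the mapping's values once they exist.

```python
from typing import Callable, Optional

Sender = Callable[[str, str], None]


def send_proactive(
    user: str,
    text: str,
    channels_config: dict,
    senders: dict[str, Sender],
) -> Optional[str]:
    """Deliver `text` on the user's preferred channel; return the channel used, or None."""
    channel = channels_config.get("notify_channel")
    if channel is None:
        # The user has not opted in to proactive notifications.
        return None
    sender = senders.get(channel)
    if sender is None:
        raise ValueError(f"unknown notify_channel: {channel!r}")
    sender(user, text)
    return channel
```

Treating a missing or `null` preference as "do not notify" keeps the feature opt-in, matching the per-user default of no channel access.
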
### [UI] File attachments in chat
Upload an image or document inline and have it flow into context. Natural workflow
("here's this PDF, summarize it"); local backend already supports multimodal via Open WebUI.
- [ ] Add attachment button to input area (paperclip icon, hidden file input)
- [ ] Client: encode file as base64 or multipart; send alongside message text
- [ ] Server: accept file in `POST /chat`; route to appropriate backend
  - Claude: `content` array with `image` blocks (base64 or URL)
  - Gemini: `parts` array with `inline_data`
  - Local (Open WebUI): `content` array with `image_url` items
- [ ] UI: show thumbnail/filename above the sent message
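
The three per-backend shapes listed for the server step could be produced by a single helper, sketched here. `image_block` is a hypothetical name, the backend keys `"claude"`, `"gemini"`, and `"local"` are assumed to match the existing toggle values, and the dict layouts reflect the public request formats as I understand them; verify them against each provider's current documentation.

```python
import base64


def image_block(backend: str, data: bytes, mime_type: str) -> dict:
    """Return one content/part item carrying an inline image for the given backend."""
    b64 = base64.b64encode(data).decode("ascii")
    if backend == "claude":
        # Anthropic Messages API: item inside the message's `content` array.
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": mime_type, "data": b64},
        }
    if backend == "gemini":
        # Gemini: item inside the `parts` array.
        return {"inline_data": {"mime_type": mime_type, "data": b64}}
    if backend == "local":
        # OpenAI-compatible (Open WebUI): data URL inside an `image_url` item.
        return {
            "type": "image_url",
            "image_url": {"url": f"data:{mime_type};base64,{b64}"},
        }
    raise ValueError(f"unsupported backend: {backend!r}")
```
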
### [Models] Edit existing model entries
Currently models can only be removed and re-added. Add an edit flow so fields
(display name, model ID, context size, tags, notes) can be updated in-place.
- [ ] Add "Edit" button next to each model row in `local_llm.html` (alongside Remove)
- [ ] Populate the Add Model form with the model's current values when edit is clicked
- [ ] On save, `PATCH` or delete+recreate via `user_settings.py`
- [ ] Applies to both local and (future) cloud model entries
### [Auth] Encrypted sessions
Allow users to opt-in to per-session encryption so session logs on disk cannot be
read without the user's key.
- [ ] Design key derivation: password-based (PBKDF2/Argon2) or separate passphrase
- [ ] Encrypt `session_logger.py` output before writing to `sessions/*.md`
- [ ] Decrypt on read in `session_store.py` (history reload, file browser)
- [ ] UI toggle in Settings to enable/disable encrypted sessions per persona
- [ ] Decide: encrypt at rest only, or also in-memory session store?
- [ ] Consider: how distillation and session search interact with encrypted files
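
For the key-derivation design item, a password-based option can be sketched with the standard library alone. `derive_session_key` and `new_salt` are hypothetical names, and the iteration count and 16-byte salt are assumptions to tune; Argon2 would need a third-party package and is not shown.

```python
import hashlib
import os


def new_salt(length: int = 16) -> bytes:
    """Generate a random per-user salt; store it next to the encrypted data."""
    return os.urandom(length)


def derive_session_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
```

The derived key would then feed an authenticated cipher for the session files. This sketch covers derivation only and leaves open the question of whether the key comes from the login password or a separate passphrase.
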
### [Models] Model Registry V2 - Unified Provider System
See `DESIGN__Model_Registry_V2.md` for full design.
- [x] **Phase 1** - V2 schema with providers (Anthropic/Google), multi-account Gemini, auto migration, orchestrator uses account API key - 2026-04-27
- [ ] **Phase 2** - Cloud provider UI: Anthropic + Google sections in `/settings/models`, account management, model entry creation for cloud models
- [ ] **Phase 3** - Unified roles + toggle redesign: standalone role assignments, chat toggle cycles role slots (Primary/Backup 1/Backup 2) showing model label
- [ ] **Phase 4** - Polish: Claude API key, OpenRouter as named provider, catalog sync from API
### [Intelligence] Knowledge consolidation - Phase 1
See `ARCH__Intelligence_Layer.md` for full design.
- [x] Tool: `ae_journal_list` - list all journals for the account - 2026-04-28
- [x] Tool: `ae_journal_search` - search before creating to avoid duplicates
- [x] Tool: `ae_journal_entry_create` - write a new entry with source metadata
- [x] Tool: `ae_journal_entry_update` - PATCH any fields on an existing entry - 2026-04-28
- [x] Tool: `ae_journal_entry_disable` - soft-delete via enable=false - 2026-04-28
- [x] Tool: `ae_journal_entry_append` - read→append timestamped section→write (running logs) - 2026-04-28
- [x] Tool: `ae_journal_entry_prepend` - read→prepend timestamped section→write (newest-first logs) - 2026-04-28
- [ ] Import script: walk a markdown directory, chunk by H2 section, create entries
- [ ] Target: markdown files from `~/DgrZone_Nextcloud/` and `~/OSIT_Nextcloud/`
- [ ] Tag strategy: source path, date, topic tags from frontmatter or filename
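
The "chunk by H2 section" step of the import script might be sketched as follows. `chunk_by_h2` is a hypothetical name; the sketch assumes ATX-style `## ` headings and skips heading-like lines inside fenced code blocks. Creating journal entries from the chunks is left out.

```python
def chunk_by_h2(markdown: str) -> list[tuple[str, str]]:
    """Split markdown into (heading, body) chunks at each `## ` heading.

    Text before the first H2 is returned with an empty heading.
    """
    chunks: list[tuple[str, str]] = []
    title = ""
    body: list[str] = []
    in_fence = False

    def flush() -> None:
        # Emit the current chunk unless it is an empty preamble.
        if title or any(line.strip() for line in body):
            chunks.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        if not in_fence and line.startswith("## "):
            flush()
            title = line[3:].strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks
```

Each `(heading, body)` pair would map to one journal entry, with the source path and tags attached as metadata per the tag strategy above.
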
### [Distill] Review first auto_distill_long output - 2026-04-01
- Ran April 1 at 04:00 as scheduled
- Manually review `inara/MEMORY_LONG.md`; confirm quality before fully trusting it
- Adjust distill prompts in `cortex/memory_distiller.py` if needed
### [Distill] Distill quality review
- Short/mid/long distill prompts live in `cortex/memory_distiller.py`
- After first few automatic runs, review quality and tune
### [Local] Unsloth Gemma 4 variants
- Unsloth Dynamic 2.0 Q4_K_M GGUFs fail with `500: unable to load model` on Ollama v0.20.0
- Root cause: Ollama's bundled llama.cpp doesn't recognize Gemma 4 GGUF architecture metadata from raw files
- Waiting on Ollama point release (v0.20.1+), then switch Open WebUI to Unsloth variants
- Expected speedup: ~10–20% smaller context footprint vs baseline, same quality
- `agent-support-gemma-small` → Unsloth E4B Q4_K_M; `agent-support-gemma-medium` → Unsloth 26B A4B Q4_K_M
---
## 🟢 Lower Priority / Future
### [Sessions] Cross-session search
The file browser has per-file session search, but no way to query across all sessions
for a persona. A unified search would make the session archive useful as a knowledge source.
- [ ] `POST /sessions/search?q=...` - walks `home/{user}/persona/{name}/sessions/*.md`, returns matching excerpts with date + line context
- [ ] UI: search input in file browser sidebar already present; wire to new endpoint
- [ ] Consider: index on startup vs. live grep (live grep is fine at typical session volume)
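
The live-grep option could be as small as the sketch below. `search_sessions` and the hit dict keys are hypothetical; the sketch does case-insensitive substring matching and returns one line of context on each side by default.

```python
from pathlib import Path


def search_sessions(sessions_dir: Path, query: str, context: int = 1) -> list[dict]:
    """Scan every `*.md` file in `sessions_dir` and return matching excerpts."""
    needle = query.casefold()
    hits: list[dict] = []
    for path in sorted(sessions_dir.glob("*.md")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for index, line in enumerate(lines):
            if needle in line.casefold():
                start = max(0, index - context)
                end = min(len(lines), index + context + 1)
                hits.append(
                    {
                        "file": path.name,
                        "line": index + 1,  # 1-based line number
                        "excerpt": "\n".join(lines[start:end]),
                    }
                )
    return hits
```

Reading every file on each query is linear in archive size, which is the trade-off the "index on startup vs. live grep" item refers to.
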
### [Backend] API usage / cost tracking
Multi-user setup with real Gemini/Claude API costs. Track per-user token consumption
so Scott can see who's spending what.
- [ ] Count input + output tokens per `/chat` and `/orchestrate` call (all backends return usage)
- [ ] Append to `home/{user}/usage.json`: daily buckets, per-model breakdown
- [ ] Expose via `/api/usage` endpoint; add a summary row to the Settings page
- [ ] Optional: soft spending limit with a warning toast when exceeded
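
A possible layout for `usage.json` with daily buckets and a per-model breakdown is sketched here. `record_usage` and the field names `input_tokens`, `output_tokens`, and `calls` are assumptions, not an existing schema.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def record_usage(
    usage_path: Path,
    model: str,
    input_tokens: int,
    output_tokens: int,
    day: Optional[date] = None,
) -> dict:
    """Add one call's token counts to the {day: {model: totals}} structure on disk."""
    day_key = (day or date.today()).isoformat()
    usage = json.loads(usage_path.read_text(encoding="utf-8")) if usage_path.exists() else {}
    bucket = usage.setdefault(day_key, {}).setdefault(
        model, {"input_tokens": 0, "output_tokens": 0, "calls": 0}
    )
    bucket["input_tokens"] += input_tokens
    bucket["output_tokens"] += output_tokens
    bucket["calls"] += 1
    usage_path.write_text(json.dumps(usage, indent=2), encoding="utf-8")
    return usage
```

The read-modify-write is not safe under concurrent requests; a real implementation would likely need a lock or an append-only log.
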
### [Intelligence] Dev agent pipeline
See `ARCH__Intelligence_Layer.md`. Full design not yet started.
- [ ] Specialist agent: frontend (SvelteKit) code changes
- [ ] Specialist agent: backend (FastAPI) code changes
- [ ] Supervisor agent: diff review, syntax check, test runner
- [ ] Gitea webhook integration: trigger on push/PR, report back
- [ ] Human approval gate before commit
### [Intelligence] Supervisor agent
- Runs `py_compile`, `svelte-check`, unit tests after specialist agent work
- Reports pass/fail back to orchestrator
- Only commits on explicit approval
### [Channel] Gitea webhooks
- Receive push/PR/issue events → route to appropriate agent
- `cortex/routers/` already has pattern; add `gitea.py`
- Gitea Actions (CI) for "run tests on push" - simpler than custom runner
### [Local] RAG via Open WebUI
Open WebUI has a full RAG pipeline (file upload → embed → knowledge collections →
reference in chat). Could feed Nextcloud docs or session logs into a local knowledge
base accessible to local models. Endpoints documented in `docs/OPEN_WEBUI_API.md`.
- `/api/v1/files/` upload + `/api/v1/retrieval/process/web` for URLs
- Reference in chat via `"files": [{"type": "collection", "id": "..."}]`
### [Backend] Intelligent model routing
- Currently hardcoded: Claude default, Gemini fallback, local third
- Design direction (now informed by real local model perf):
  - **Private/offline tasks** → local (Gemma 4 E4B for speed, 26B A4B for reasoning)
  - **Complex tool tasks / long context** → Gemini (1M token context, strong function calling)
  - **Final user-facing responses** → Claude (quality prose, persona fidelity)
- Future: auto-route by task type rather than requiring user to toggle backend manually
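
The design direction above could reduce to a small decision function, sketched below. `pick_backend`, its flags, and the `long_context_threshold` default are all hypothetical placeholders; how a request gets classified as private or tool-heavy is the open design question and is not addressed here.

```python
def pick_backend(
    private: bool,
    needs_tools: bool,
    context_tokens: int,
    long_context_threshold: int = 200_000,  # placeholder value, not from the task list
) -> str:
    """Pick a backend name following the three routing rules in this section."""
    if private:
        # Private/offline work never leaves the local model.
        return "local"
    if needs_tools or context_tokens > long_context_threshold:
        return "gemini"
    return "claude"
```
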
---
## ✅ Completed
### [UI] Input area polish - 2026-04-28
- Single cycling S/M/L button replaces 3 separate height buttons (same UX as font size)
- S size collapses mode-select to a row (compact); M/L keep vertical column layout
- Input height minimum derived from setting so empty textarea reflects selected size
- Context & Memory panel and Settings dropdown are mutually exclusive (closeAllPanels fix)
- Both panels now use consistent shadow (var(--shadow)) and z-index (200)
### [Tools] Tools toggle - decoupled from Role/Backend - 2026-04-28
- Removed "Agent" mode from the mode selector; replaced with independent ⚡ toggle
- `toolsEnabled` persists in localStorage; routes to orchestrator regardless of active mode
- Layout: column (M/L) or row (S) driven by `data-size` attribute set by JS
- `chat_role` flows from UI → `OrchestrateRequest` → `orchestrator_engine.run(response_role=...)`
### [Tools] shell_exec tool - 2026-04-28
- `shell_exec(command, working_dir, timeout)` in `cortex/tools/system.py`
- Runs any shell command on the Cortex host; timeout clamped to 1–120 s
- Use for system diagnostics: `df -h`, `ps aux`, `journalctl`, `free -h`, etc.
### [Tools] Aether Journals full toolkit - 2026-04-28
- `ae_journal_list` - list all journals + ids for the account
- `ae_journal_entry_update` - PATCH any fields (title, content, summary, tags, enable)
- `ae_journal_entry_disable` - soft-delete via enable=false
- `ae_journal_entry_append` - read→append timestamped section→write (running/data logs)
- `ae_journal_entry_prepend` - read→prepend timestamped section→write (newest-first)
- Shared `_get_entry` / `_patch_entry` helpers; OpenAI JSON Schema auto-derived from Gemini declarations
### [Local] Per-user multi-model local LLM settings - 2026-04-01
- `home/{username}/local_llm.json`: `hosts[]` + `models[]` + `active_model_id` structure
- `cortex/user_settings.py`: CRUD functions `save_host`, `add_model`, `remove_model`, `set_active_model`, `get_active_local_model`
- `cortex/routers/local_llm.py` + `cortex/static/local_llm.html`: dedicated `/settings/local` page
- "Fetch models from host" button: proxied via `/api/local-llm/fetch-models`, populates dropdown
- Active model shown in UI near backend toggle button (amber hint text)
- Migrates old flat `.env`-style config automatically on first use
### [UI] Copy button for user (sent) messages - 2026-04-01
- Added matching copy-on-hover button to user messages (same pattern as assistant messages)
- `div.dataset.raw` set on send; `makeCopyBtn(div)` appended inline
### [Backend] Local model backend (Open WebUI / Ollama) - 2026-04-01
- OpenAI-compatible API via `httpx`; no CLI wrapper needed
- Configured via `LOCAL_API_URL` / `LOCAL_API_KEY` / `LOCAL_MODEL` in `.env`
- Backend toggle cycles `claude → gemini → local` (amber color in UI)
- `/auth/status` includes local reachability check (`GET /api/models`)
- Tested end-to-end: `test-agent-simple` (Qwen3-8B) on `scott-lt-i7-rtx:3000`, full persona context flowing correctly
### [Testing] Gitea SSH port 2222 - 2026-03-29
- pfSense WAN → 192.168.32.7:2222 port forward confirmed working
- `ssh -p 2222 git@git.dgrzone.com` reaches Gitea (returns "Invalid repository path"; expected, confirms connectivity)
- Clone/push via SSH: `git clone ssh://git@git.dgrzone.com:2222//.git`
### [Multi-user] Brian onboarding - 2026-03-29
- Invite sent to `memedrift@gmail.com`
- Brian completed onboarding, created `wintermute` persona
- Google OAuth registered (`google-add brian memedrift@gmail.com`)
### [Tools] Reminders tools - 2026-03-29
- `reminders_add`, `reminders_list`, `reminders_clear` added to orchestrator tool suite
- Tools live in `cortex/tools/reminders.py`
- All persona PROTOCOLS.md updated with Tools & Modes reference (direct chat vs Agent mode)
- `persona_template.py` updated so new personas get the protocol automatically
### [Auth] Token expiry - no restart needed - 2026-03-27
- `llm_client._fresh_claude_token()` reads live from `~/.claude/.credentials.json` on every call
- systemd service is a user unit (no sudo); `systemctl --user restart cortex` is sufficient
- No manual token sync required after `claude auth login`
### [Multi-user] Per-user channel config - 2026-03-27
- Google Chat and NC Talk secrets/config moved from `.env` to `home/{username}/channels.json`
- New endpoints: `POST /channels/google-chat/{username}` and `POST /webhook/nextcloud/{username}`
- No channel access by default; each user configures their own `channels.json`
- Setup guides: `docs/GOOGLE_CHAT_BOT.md` and `docs/NEXTCLOUD_TALK_BOT.md`
### [Auth] Google OAuth sign-in - 2026-03-27
- `GET /auth/google` → Google consent → `GET /auth/google/callback` flow
- Users pre-registered via `manage_passwords.py google-add <username> <email>`
- Google sign-in button on `/login`; auth.json stores `google_sub` + `google_email`
- Active users: scott (scott.idem@oneskyit.com), holly (holly.danner@gmail.com), brian (memedrift@gmail.com)
### [Settings] Per-user Gemini API key - 2026-03-27
- Stored in `home/{username}/auth.json` as `gemini_api_key`
- Orchestrator uses user key if set, falls back to server-level `GEMINI_API_KEY`
- Manageable via `/settings` UI (add, remove, masked hint)
### [UI] Session persistence across navigation - 2026-03-26
- localStorage keyed to `cx_sid_{user}_{persona}` with 30-min inactivity TTL
- Auto-restored silently on page load; cleared on "New session" or session delete
### [UI] Persona picker page - 2026-03-26
- `GET /{username}` shows a card grid of available personas instead of 404
- Each card links directly to `/{username}/{persona}`
### [UI] Lucide icons - 2026-03-25
- Icons throughout: mode selector, send/stop buttons, edit/del/copy, save/cancel
- Loaded via UMD CDN; `icon_html()` + `render_icons()` helpers in `app.js`
### [UI] Persona-specific favicon - 2026-03-25
- Emoji SVG favicon generated from persona config at load time
### [Multi-user] Holly onboarding - 2026-03-20
- Holly's invite sent; onboarding completed via `/setup/{token}`
- `home/holly/persona/tina/` created from template
- Google OAuth registered (`holly.danner@gmail.com`)
### [Channel] Nextcloud Talk integration ✅ - 2026-03-20, updated 2026-03-27
- HMAC verification: incoming uses `random + raw_body`; outgoing reply uses `random + message_text`
- Per-user routing added 2026-03-27 (endpoint: `/webhook/nextcloud/{username}`)
- Docs: `docs/NEXTCLOUD_TALK_BOT.md`
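
The HMAC scheme noted above (incoming signs `random + raw_body`, outgoing signs `random + message_text`) can be sketched with the standard library. The function names are hypothetical, and HMAC-SHA256 with a hex-encoded digest is an assumption based on my understanding of the Nextcloud Talk bot protocol; `docs/NEXTCLOUD_TALK_BOT.md` is the authority.

```python
import hashlib
import hmac


def verify_incoming(secret: str, random: str, raw_body: bytes, signature: str) -> bool:
    """Check an incoming webhook: HMAC over the random header value plus the raw body."""
    expected = hmac.new(
        secret.encode("utf-8"), random.encode("utf-8") + raw_body, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature.lower())


def sign_outgoing(secret: str, random: str, message_text: str) -> str:
    """Sign an outgoing reply: HMAC over the random value plus the message text."""
    return hmac.new(
        secret.encode("utf-8"), (random + message_text).encode("utf-8"), hashlib.sha256
    ).hexdigest()
```

Verifying against the raw request bytes, rather than a re-serialized JSON body, matters because any whitespace or key-order change would alter the digest.
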
### [Channel] Google Chat integration ✅ - 2026-03-20, updated 2026-03-27
- JWT verification via `authorizationEventObject.systemIdToken`
- Workspace Add-on format: `hostAppDataAction.chatDataAction.createMessageAction`
- Per-user routing added 2026-03-27 (endpoint: `/channels/google-chat/{username}`)
- Docs: `docs/GOOGLE_CHAT_BOT.md`
### [Intelligence] Orchestrator service - Phase 1 - 2026-03-18
- Gemini API (google-genai SDK) tool loop → Claude final response
- `POST /orchestrate` (async job), `GET /orchestrate/{job_id}` (poll)
- Tools: web search, AE API, file read, task list, scratch, reminders, cron
- Default model: `gemini-2.5-flash`
### [Auth] Session auth + persona onboarding - 2026-03-20
- bcrypt passwords in `home/{username}/auth.json`
- JWT session cookies (HS256, 30-day expiry)
- Invite tokens (72h, one-time-use) - `manage_passwords.py invite [email]`
- Self-service onboarding: `/setup/{token}` → `/setup/persona`
- SMTP invite email via `noreply@oneskyit.com`
### [UI] Mobile-friendly header - 2026-03
- Backend toggle, font size, theme buttons moved into ⚙ settings panel
- Header reduced to core buttons
### [UI] Help & Reference - 2026-03-27
- Shared base at `cortex/static/HELP.md` (served to all users)
- Persona-specific additions appended from `home/{username}/persona/{name}/HELP.md` if present
- Collapsible H2 sections via `<details>` elements
### [Backend] Gemini CLI backend - 2026-03
- `gemini -p` subprocess, streaming output; auth check at `/auth/status`
### [Backend] Memory distiller - 2026-03
- APScheduler: `distill_short` (daily 03:00), `distill_mid` (weekly Sun 03:30), `distill_long` (monthly 1st 04:00)
- Writes to `MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md` per persona
### [Backend] Session logging + file browser - 2026-03
- Sessions saved to `home/{user}/persona/{name}/sessions/`
- Files panel in UI browses persona directory
### [Backend] Dispatcher core - 2026-03-04
- FastAPI service with streaming SSE response
- Claude CLI and Gemini CLI subprocess backends
- Session context management (rolling window, `MAX_HISTORY_MESSAGES`)
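
The rolling-window behavior mentioned in the last item can be illustrated in a few lines. This is a sketch, not the dispatcher's code: `trim_history` is a hypothetical name and the value given to `MAX_HISTORY_MESSAGES` here is a placeholder.

```python
MAX_HISTORY_MESSAGES = 20  # placeholder; the real value lives in the service config


def trim_history(messages: list[dict], max_messages: int = MAX_HISTORY_MESSAGES) -> list[dict]:
    """Keep only the most recent `max_messages` entries of the session history."""
    if max_messages <= 0:
        return []
    return messages[-max_messages:]
```
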