Cortex / Inara — Agent Task List
Read this file before starting any work on this project. Status: Active development — ongoing.
🔴 High Priority
[Local] Tool-capable local orchestrator
Design and implement local_orchestrator_engine.py — a ReAct tool loop driven by
a local model via Open WebUI's OpenAI-compatible API, as an alternative to the
Gemini API orchestrator for private/offline tasks.
- Convert existing Cortex tool definitions (cortex/tools/) from Gemini FunctionDeclaration format to OpenAI tools format (minor schema diff)
- Implement tool loop: send tools → parse tool_calls response → execute → append result → loop until finish_reason: stop
- Wire into routers/orchestrator.py — new mode param: "local" vs "gemini"
- UI: Agent mode button routes to local orchestrator when local backend active
- Recommended models (scott_gaming, 8 GB VRAM):
  - Gemma 4 E4B — 25 t/s, 72k practical ctx — interactive/fast tasks
  - Gemma 4 26B A4B — 9 t/s, 50k practical ctx — heavier reasoning, background tasks
- Reference: docs/OPEN_WEBUI_API.md for full tool call request/response format
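The conversion and loop steps above could look roughly like the sketch below. It is not the project's implementation: `gemini_to_openai_tool`, `run_tool_loop`, the `send` callable (standing in for the HTTP call to Open WebUI), and the `registry` dict are all hypothetical names, and the assumption that Gemini declarations use upper-case type names may not hold for every tool in cortex/tools/.

```python
import json
from typing import Any, Callable, Dict, List


def _lower_types(schema: Any) -> Any:
    """Recursively lower-case "type" values (assumed Gemini style: "OBJECT", "STRING")."""
    if isinstance(schema, dict):
        return {
            key: value.lower() if key == "type" and isinstance(value, str) else _lower_types(value)
            for key, value in schema.items()
        }
    if isinstance(schema, list):
        return [_lower_types(item) for item in schema]
    return schema


def gemini_to_openai_tool(decl: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a Gemini-style declaration dict in the OpenAI tools envelope."""
    params = decl.get("parameters", {"type": "object", "properties": {}})
    return {
        "type": "function",
        "function": {
            "name": decl["name"],
            "description": decl.get("description", ""),
            "parameters": _lower_types(params),
        },
    }


def run_tool_loop(
    send: Callable[[List[dict], List[dict]], dict],
    messages: List[dict],
    tools: List[dict],
    registry: Dict[str, Callable[..., Any]],
    max_steps: int = 8,
) -> str:
    """Send, execute any tool_calls, append results, repeat until the model stops."""
    for _ in range(max_steps):
        choice = send(messages, tools)["choices"][0]
        message = choice["message"]
        messages.append(message)
        calls = message.get("tool_calls") or []
        if not calls or choice.get("finish_reason") == "stop":
            return message.get("content") or ""
        for call in calls:
            fn = call["function"]
            args = json.loads(fn.get("arguments") or "{}")
            try:
                result = registry[fn["name"]](**args)
            except Exception as exc:  # report tool errors back to the model
                result = {"error": str(exc)}
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
            )
    raise RuntimeError("tool loop exceeded max_steps")
```

Passing `send` in as a parameter keeps the loop testable without a running model; the real engine would presumably wrap an httpx call to the chat completions endpoint.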
🟡 Medium Priority
[UI] Progressive Web App (PWA)
Low effort, meaningful mobile UX improvement — install Cortex as a home screen app.
- Add manifest.json (name, icons, theme color, display: standalone, start_url)
- Serve manifest.json from cortex/routers/ui.py or as a static file
- Add <link rel="manifest"> to index.html
- Basic service worker for offline shell (cache static assets; network-first for API)
- Register service worker in app.js
- Test on iOS (Safari) and Android (Chrome): Chrome offers an install prompt; on iOS, install is manual via Share → Add to Home Screen
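As a rough illustration of the manifest fields listed above, the sketch below builds the manifest as a Python dict that a router could serialize to JSON. The `build_manifest` helper, the icon paths, and the default colour are assumptions for illustration only.

```python
def build_manifest(name: str, start_url: str = "/", theme_color: str = "#111111") -> dict:
    """Return a minimal web app manifest; icon paths below are placeholders."""
    return {
        "name": name,
        "short_name": name[:12],
        "start_url": start_url,
        "display": "standalone",
        "theme_color": theme_color,
        "background_color": theme_color,
        "icons": [
            {
                "src": f"/static/icons/icon-{size}.png",  # hypothetical location
                "sizes": f"{size}x{size}",
                "type": "image/png",
            }
            for size in (192, 512)
        ],
    }
```

Generating the manifest in code rather than shipping a static file would allow per-persona names or colours later, though a static file is simpler if that is never needed.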
[Channel] Proactive notifications
Inara reaches out on her own initiative via NC Talk or Google Chat when a reminder fires, a cron job completes, or something else warrants attention. The cron/reminder infrastructure already exists — this closes the loop so she can interrupt the user.
- Add outbound message helper for NC Talk (send_nextcloud_message(user, text))
- Add outbound message helper for Google Chat (send_google_chat_message(user, text))
- Wire cron job completion and reminder triggers to call outbound helper
- Store user preference: which channel to use for proactive notifications
  - channels.json already per-user — add notify_channel: "nextcloud" | "google_chat" | null
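One way the notify_channel preference might drive dispatch is sketched below. `notify_user` and the `senders` mapping are hypothetical; in the real code the two values would be the send_nextcloud_message and send_google_chat_message helpers named above.

```python
from typing import Callable, Dict, Optional

Sender = Callable[[str, str], None]


def notify_user(
    user: str,
    text: str,
    channels_config: dict,
    senders: Dict[str, Sender],
) -> Optional[str]:
    """Send text on the user's preferred channel; return the channel used, or None."""
    channel = channels_config.get("notify_channel")
    if channel is None:
        return None  # user has not opted in to proactive notifications
    sender = senders.get(channel)
    if sender is None:
        raise ValueError(f"unknown notify_channel: {channel!r}")
    sender(user, text)
    return channel
```

Treating a missing or null notify_channel as "do nothing" keeps proactive messages opt-in, which matches the existing no-channel-access-by-default stance.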
[UI] File attachments in chat
Upload an image or document inline and have it flow into context. Natural workflow ("here's this PDF, summarize it"); local backend already supports multimodal via Open WebUI.
- Add attachment button to input area (paperclip icon, hidden file input)
- Client: encode file as base64 or multipart; send alongside message text
- Server: accept file in POST /chat; route to appropriate backend
  - Claude: content array with image blocks (base64 or URL)
  - Gemini: parts array with inline_data
  - Local (Open WebUI): content array with image_url items
- UI: show thumbnail/filename above the sent message
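The three per-backend shapes could be produced by a single helper along these lines. `image_part` is a hypothetical name, and the dict layouts follow each provider's public message format as I understand it; they should be checked against the current API docs before use.

```python
import base64


def image_part(backend: str, data: bytes, mime_type: str) -> dict:
    """Return one image item shaped for the given backend's message format."""
    b64 = base64.b64encode(data).decode("ascii")
    if backend == "claude":
        # Item for the Claude "content" array
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": mime_type, "data": b64},
        }
    if backend == "gemini":
        # Item for the Gemini "parts" array
        return {"inline_data": {"mime_type": mime_type, "data": b64}}
    if backend == "local":
        # OpenAI-compatible "content" item, image passed as a data URL
        return {
            "type": "image_url",
            "image_url": {"url": f"data:{mime_type};base64,{b64}"},
        }
    raise ValueError(f"unsupported backend: {backend!r}")
```

For example, `image_part("local", png_bytes, "image/png")` would sit next to a `{"type": "text", ...}` item in the user message's content array.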
[Models] Edit existing model entries
Currently models can only be removed and re-added. Add an edit flow so fields (display name, model ID, context size, tags, notes) can be updated in-place.
- Add "Edit" button next to each model row in local_llm.html (alongside Remove)
- Populate the Add Model form with the model's current values when edit is clicked
- On save, PATCH or delete+recreate via user_settings.py
- Applies to both local and (future) cloud model entries
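An in-place update might look like the sketch below, which operates on the already-loaded local_llm.json dict. `update_model`, the `id` key, and the editable field names are assumptions; the actual schema in user_settings.py may differ.

```python
EDITABLE_FIELDS = {"display_name", "model_id", "context_size", "tags", "notes"}  # assumed names


def update_model(config: dict, entry_id: str, changes: dict) -> dict:
    """Apply changes to the model entry whose "id" matches; return the updated entry."""
    unknown = set(changes) - EDITABLE_FIELDS
    if unknown:
        raise ValueError(f"fields not editable: {sorted(unknown)}")
    for model in config.get("models", []):
        if model.get("id") == entry_id:
            model.update(changes)
            return model
    raise KeyError(entry_id)
```

Compared with delete+recreate, patching in place keeps the entry's id stable, so active_model_id does not need to be rewritten after an edit.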
[Auth] Encrypted sessions
Allow users to opt in to per-session encryption so session logs on disk cannot be read without the user's key.
- Design key derivation: password-based (PBKDF2/Argon2) or separate passphrase
- Encrypt session_logger.py output before writing to sessions/*.md
- Decrypt on read in session_store.py (history reload, file browser)
- UI toggle in Settings to enable/disable encrypted sessions per persona
- Decide: encrypt at rest only, or also in-memory session store?
- Consider: how distillation and session search interact with encrypted files
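For the key-derivation item, a password-based option using only the standard library is sketched below. `derive_session_key`, `new_salt`, and the iteration count are illustrative assumptions. Argon2 would need a third-party package, and the derived key would still have to be fed to an authenticated cipher, which is not shown here.

```python
import hashlib
import os


def new_salt(length: int = 16) -> bytes:
    """Random per-user salt; stored alongside the encrypted sessions, not secret."""
    return os.urandom(length)


def derive_session_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a passphrase with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
```

The same passphrase and salt always produce the same key, so only the salt and iteration count need to be persisted; the key itself can live in memory for the duration of a login.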
[Models] Model Registry V2 — Unified Provider System
See DESIGN__Model_Registry_V2.md for full design.
- Phase 1 — V2 schema with providers (Anthropic/Google), multi-account Gemini, auto migration, orchestrator uses account API key — 2026-04-27
- Phase 2 — Cloud provider UI: Anthropic + Google sections in /settings/models, account management, model entry creation for cloud models
- Phase 3 — Unified roles + toggle redesign: standalone role assignments, chat toggle cycles role slots (Primary/Backup 1/Backup 2) showing model label
- Phase 4 — Polish: Claude API key, OpenRouter as named provider, catalog sync from API
[Intelligence] Knowledge consolidation — Phase 1
See ARCH__Intelligence_Layer.md for full design.
- Tool: ae_journal_list — list all journals for the account — 2026-04-28
- Tool: ae_journal_search — search before creating to avoid duplicates
- Tool: ae_journal_entry_create — write a new entry with source metadata
- Tool: ae_journal_entry_update — PATCH any fields on an existing entry — 2026-04-28
- Tool: ae_journal_entry_disable — soft-delete via enable=false — 2026-04-28
- Tool: ae_journal_entry_append — read→append timestamped section→write (running logs) — 2026-04-28
- Tool: ae_journal_entry_prepend — read→prepend timestamped section→write (newest-first logs) — 2026-04-28
- Import script: walk a markdown directory, chunk by H2 section, create entries
- Target: markdown files from ~/DgrZone_Nextcloud/ and ~/OSIT_Nextcloud/
- Tag strategy: source path, date, topic tags from frontmatter or filename
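The "chunk by H2 section" step of the import script could be as small as the sketch below. `chunk_by_h2` and the "(preamble)" label are hypothetical, and this version does not skip "## " lines that appear inside fenced code blocks.

```python
import re

H2_LINE = re.compile(r"^## +(.+?)\s*$")


def _emit(chunks: list, title, body_lines: list) -> None:
    """Append a chunk unless it is an empty, untitled preamble."""
    text = "\n".join(body_lines).strip()
    if title is not None or text:
        chunks.append({"title": title or "(preamble)", "body": text})


def chunk_by_h2(markdown: str) -> list:
    """Split markdown into {"title", "body"} dicts, one per H2 section."""
    chunks: list = []
    title = None
    body_lines: list = []
    for line in markdown.splitlines():
        match = H2_LINE.match(line)
        if match:
            _emit(chunks, title, body_lines)
            title, body_lines = match.group(1), []
        else:
            body_lines.append(line)
    _emit(chunks, title, body_lines)
    return chunks
```

Each returned chunk would then map onto one ae_journal_entry_create call, with the source path and date attached as tags.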
[Distill] Review first auto_distill_long output — 2026-04-01
- Ran April 1 at 04:00 as scheduled
- Manually review inara/MEMORY_LONG.md — confirm quality before fully trusting
- Adjust distill prompts in cortex/memory_distiller.py if needed
[Distill] Distill quality review
- Short/mid/long distill prompts live in cortex/memory_distiller.py
- After first few automatic runs, review quality and tune
[Local] Unsloth Gemma 4 variants
- Unsloth Dynamic 2.0 Q4_K_M GGUFs fail with 500: unable to load model on Ollama v0.20.0
- Root cause: Ollama's bundled llama.cpp doesn't recognize Gemma 4 GGUF architecture metadata from raw files
- Waiting on Ollama point release (v0.20.1+) — then switch Open WebUI to Unsloth variants
- Expected speedup: ~10–20% smaller context footprint vs baseline, same quality
- agent-support-gemma-small → Unsloth E4B Q4_K_M; agent-support-gemma-medium → Unsloth 26B A4B Q4_K_M
🟢 Lower Priority / Future
[Sessions] Cross-session search
The file browser has per-file session search, but no way to query across all sessions for a persona. A unified search would make the session archive useful as a knowledge source.
- POST /sessions/search?q=... — walks home/{user}/persona/{name}/sessions/*.md, returns matching excerpts with date + line context
- UI: search input in file browser sidebar already present — wire to new endpoint
- Consider: index on startup vs. live grep (live grep is fine at typical session volume)
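The live-grep option could be sketched as below. `search_sessions` and the shape of each hit are assumptions, and the session date is assumed to be recoverable from the file name rather than parsed from the content.

```python
from pathlib import Path


def search_sessions(sessions_dir: Path, query: str, context: int = 1) -> list:
    """Case-insensitive substring search over sessions/*.md with line context."""
    needle = query.lower()
    hits = []
    for path in sorted(sessions_dir.glob("*.md")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for index, line in enumerate(lines):
            if needle in line.lower():
                start = max(0, index - context)
                end = min(len(lines), index + context + 1)
                hits.append(
                    {
                        "file": path.name,
                        "line": index + 1,  # 1-based for display
                        "excerpt": "\n".join(lines[start:end]),
                    }
                )
    return hits
```

This rereads every file on each query, which fits the "live grep is fine at typical session volume" note; an index would only pay off if the archive grows much larger.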
[Backend] API usage / cost tracking
Multi-user setup with real Gemini/Claude API costs. Track per-user token consumption so Scott can see who's spending what.
- Count input + output tokens per /chat and /orchestrate call (all backends return usage)
- Append to home/{user}/usage.json — daily buckets, per-model breakdown
- Expose via /api/usage endpoint; add a summary row to the Settings page
- Optional: soft spending limit with a warning toast when exceeded
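A possible layout for the daily, per-model buckets is sketched below, operating on the dict loaded from usage.json. `record_usage`, `total_tokens`, and the key names are hypothetical.

```python
def record_usage(usage: dict, day: str, model: str, input_tokens: int, output_tokens: int) -> dict:
    """Add one call's token counts to usage[day][model]; day is e.g. "2026-04-28"."""
    bucket = usage.setdefault(day, {}).setdefault(
        model, {"input_tokens": 0, "output_tokens": 0, "calls": 0}
    )
    bucket["input_tokens"] += input_tokens
    bucket["output_tokens"] += output_tokens
    bucket["calls"] += 1
    return usage


def total_tokens(usage: dict, day: str) -> int:
    """Sum input and output tokens across all models for one day."""
    return sum(
        bucket["input_tokens"] + bucket["output_tokens"]
        for bucket in usage.get(day, {}).values()
    )
```

A soft spending limit could compare `total_tokens` (or a cost derived from it) against a per-user threshold after each call.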
[Intelligence] Dev agent pipeline
See ARCH__Intelligence_Layer.md. Full design not yet started.
- Specialist agent: frontend (SvelteKit) code changes
- Specialist agent: backend (FastAPI) code changes
- Supervisor agent: diff review, syntax check, test runner
- Gitea webhook integration: trigger on push/PR, report back
- Human approval gate before commit
[Intelligence] Supervisor agent
- Runs py_compile, svelte-check, unit tests after specialist agent work
- Reports pass/fail back to orchestrator
- Only commits on explicit approval
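The py_compile part of the supervisor check might be wrapped as in the sketch below. `check_python_syntax` and its return shape are assumptions; svelte-check and the unit test runner would be separate subprocess calls and are not shown.

```python
import py_compile
from typing import Dict, Iterable


def check_python_syntax(paths: Iterable) -> Dict[str, str]:
    """Byte-compile each file; return {path: error message} for files that fail."""
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            # doraise=True turns syntax errors into PyCompileError instead of printing them
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures[str(path)] = exc.msg
    return failures
```

An empty result means every file compiled. Note that py_compile writes .pyc files under __pycache__ as a side effect, which may or may not be wanted in the agent's working tree.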
[Channel] Gitea webhooks
- Receive push/PR/issue events → route to appropriate agent
- cortex/routers/ already has pattern; add gitea.py
- Gitea Actions (CI) for "run tests on push" — simpler than custom runner
[Local] RAG via Open WebUI
Open WebUI has a full RAG pipeline (file upload → embed → knowledge collections →
reference in chat). Could feed Nextcloud docs or session logs into a local knowledge
base accessible to local models. Endpoints documented in docs/OPEN_WEBUI_API.md.
- /api/v1/files/ upload + /api/v1/retrieval/process/web for URLs
- Reference in chat via "files": [{"type": "collection", "id": "..."}]
[Backend] Intelligent model routing
- Currently hardcoded: Claude default, Gemini fallback, local third
- Design direction (now informed by real local model perf):
  - Private/offline tasks → local (Gemma 4 E4B for speed, 26B A4B for reasoning)
  - Complex tool tasks / long context → Gemini (1M token context, strong function calling)
  - Final user-facing responses → Claude (quality prose, persona fidelity)
- Future: auto-route by task type rather than requiring user to toggle backend manually
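The design direction above could be expressed as a small routing function, sketched below. `choose_backend`, its parameters, and the 50k-token threshold (borrowed from the practical context figure quoted for the 26B model) are assumptions, not a settled design.

```python
def choose_backend(
    private: bool,
    needs_tools: bool,
    context_tokens: int,
    long_context_threshold: int = 50_000,  # hypothetical cut-off
) -> str:
    """Pick a backend name from coarse task traits."""
    if private:
        return "local"  # private/offline work never leaves the host
    if needs_tools or context_tokens > long_context_threshold:
        return "gemini"  # tool-heavy or long-context work
    return "claude"  # default for user-facing prose
```

Checking `private` first means a private task stays local even when it is long or tool-heavy, which seems to be the intent but is worth confirming.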
✅ Completed
[UI] Input area polish — 2026-04-28
- Single cycling S/M/L button replaces 3 separate height buttons (same UX as font size)
- S size collapses mode-select to a row (compact); M/L keep vertical column layout
- Input height minimum derived from setting so empty textarea reflects selected size
- Context & Memory panel and Settings dropdown are mutually exclusive (closeAllPanels fix)
- Both panels now use consistent shadow (var(--shadow)) and z-index (200)
[Tools] Tools toggle — decoupled from Role/Backend — 2026-04-28
- Removed "Agent" mode from the mode selector; replaced with independent ⚡ toggle
- toolsEnabled persists in localStorage; routes to orchestrator regardless of active mode
- Layout: column (M/L) or row (S) driven by data-size attribute set by JS
- chat_role flows from UI → OrchestrateRequest → orchestrator_engine.run(response_role=...)
[Tools] shell_exec tool — 2026-04-28
- shell_exec(command, working_dir, timeout) in cortex/tools/system.py
- Runs any shell command on the Cortex host; timeout clamped 1–120s
- Use for system diagnostics: df -h, ps aux, journalctl, free -h, etc.
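For readers unfamiliar with the tool, a minimal sketch of a clamped-timeout shell runner follows. It is not the code in cortex/tools/system.py; the return dict, the default timeout, and `clamp_timeout` are assumptions.

```python
import subprocess
from typing import Optional

MIN_TIMEOUT, MAX_TIMEOUT = 1, 120


def clamp_timeout(timeout: float) -> float:
    """Force the timeout into the 1-120 second range."""
    return max(MIN_TIMEOUT, min(MAX_TIMEOUT, timeout))


def shell_exec(command: str, working_dir: Optional[str] = None, timeout: float = 30) -> dict:
    """Run a shell command and capture its output; never runs longer than 120 s."""
    try:
        proc = subprocess.run(
            command,
            shell=True,  # the tool is described as running any shell command
            cwd=working_dir,
            capture_output=True,
            text=True,
            timeout=clamp_timeout(timeout),
        )
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "stdout": "", "stderr": "timed out", "timed_out": True}
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "timed_out": False,
    }
```

For example, `shell_exec("df -h", timeout=500)` would run with a 120-second limit rather than 500.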
[Tools] Aether Journals full toolkit — 2026-04-28
- ae_journal_list — list all journals + ids for the account
- ae_journal_entry_update — PATCH any fields (title, content, summary, tags, enable)
- ae_journal_entry_disable — soft-delete via enable=false
- ae_journal_entry_append — read→append timestamped section→write (running/data logs)
- ae_journal_entry_prepend — read→prepend timestamped section→write (newest-first)
- Shared _get_entry/_patch_entry helpers; OpenAI JSON Schema auto-derived from Gemini declarations
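The text-manipulation half of the read→append→write and read→prepend→write pattern can be illustrated as below. The function names and the "## timestamp" heading format are assumptions; the real tools would wrap these steps between _get_entry and _patch_entry calls.

```python
from datetime import datetime, timezone
from typing import Optional


def _section(text: str, now: Optional[datetime]) -> str:
    """Format text as a section under a UTC timestamp heading (assumed format)."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%d %H:%M UTC")
    return f"## {stamp}\n\n{text.strip()}\n"


def append_section(content: str, text: str, now: Optional[datetime] = None) -> str:
    """Return content with a new timestamped section at the end (running log)."""
    section = _section(text, now)
    return f"{content.rstrip()}\n\n{section}" if content.strip() else section


def prepend_section(content: str, text: str, now: Optional[datetime] = None) -> str:
    """Return content with a new timestamped section at the top (newest first)."""
    section = _section(text, now)
    return f"{section}\n{content.lstrip()}" if content.strip() else section
```

Because the read and write are separate calls, two concurrent appends to the same entry could overwrite each other; that is probably acceptable for a single-agent log but worth keeping in mind.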
[Local] Per-user multi-model local LLM settings — 2026-04-01
- home/{username}/local_llm.json — hosts[] + models[] + active_model_id structure
- cortex/user_settings.py — CRUD functions: save_host, add_model, remove_model, set_active_model, get_active_local_model
- cortex/routers/local_llm.py + cortex/static/local_llm.html — dedicated /settings/local page
- "Fetch models from host" button — proxied via /api/local-llm/fetch-models, populates dropdown
- Active model shown in UI near backend toggle button (amber hint text)
- Migrates old flat .env-style config automatically on first use
[UI] Copy button for user (sent) messages — 2026-04-01
- Added matching copy-on-hover button to user messages (same pattern as assistant messages)
- div.dataset.raw set on send; makeCopyBtn(div) appended inline
[Backend] Local model backend (Open WebUI / Ollama) — 2026-04-01
- OpenAI-compatible API via httpx — no CLI wrapper needed
- Configured via LOCAL_API_URL / LOCAL_API_KEY / LOCAL_MODEL in .env
- Backend toggle cycles claude → gemini → local (amber color in UI)
- /auth/status includes local reachability check (GET /api/models)
- Tested end-to-end: test-agent-simple (Qwen3-8B) on scott-lt-i7-rtx:3000, full persona context flowing correctly
[Testing] Gitea SSH port 2222 — 2026-03-29
- pfSense WAN → 192.168.32.7:2222 port forward confirmed working
- ssh -p 2222 git@git.dgrzone.com reaches Gitea (returns "Invalid repository path" — expected, confirms connectivity)
- Clone/push via SSH: git clone ssh://git@git.dgrzone.com:2222/<user>/<repo>.git
[Multi-user] Brian onboarding — 2026-03-29
- Invite sent to memedrift@gmail.com
- Brian completed onboarding, created wintermute persona
- Google OAuth registered (google-add brian memedrift@gmail.com)
[Tools] Reminders tools — 2026-03-29
- reminders_add, reminders_list, reminders_clear added to orchestrator tool suite
- Tools live in cortex/tools/reminders.py
- All persona PROTOCOLS.md updated with Tools & Modes reference (direct chat vs Agent mode)
- persona_template.py updated so new personas get the protocol automatically
[Auth] Token expiry — no restart needed — 2026-03-27
- llm_client._fresh_claude_token() reads live from ~/.claude/.credentials.json on every call
- systemd service is a user unit (no sudo) — systemctl --user restart cortex is sufficient
- No manual token sync required after claude auth login
[Multi-user] Per-user channel config — 2026-03-27
- Google Chat and NC Talk secrets/config moved from .env to home/{username}/channels.json
- New endpoints: POST /channels/google-chat/{username} and POST /webhook/nextcloud/{username}
- No channel access by default — each user configures their own channels.json
- Setup guides: docs/GOOGLE_CHAT_BOT.md and docs/NEXTCLOUD_TALK_BOT.md
[Auth] Google OAuth sign-in — 2026-03-27
- GET /auth/google → Google consent → GET /auth/google/callback flow
- Users pre-registered via manage_passwords.py google-add <user> <email>
- Google sign-in button on /login; auth.json stores google_sub + google_email
- Active users: scott (scott.idem@oneskyit.com), holly (holly.danner@gmail.com), brian (memedrift@gmail.com)
[Settings] Per-user Gemini API key — 2026-03-27
- Stored in home/{username}/auth.json as gemini_api_key
- Orchestrator uses user key if set, falls back to server-level GEMINI_API_KEY
- Manageable via /settings UI (add, remove, masked hint)
[UI] Session persistence across navigation — 2026-03-26
- localStorage keyed to cx_sid_{user}_{persona} with 30-min inactivity TTL
- Auto-restored silently on page load; cleared on "New session" or session delete
[UI] Persona picker page — 2026-03-26
- GET /{username} shows a card grid of available personas instead of 404
- Each card links directly to /{username}/{persona}
[UI] Lucide icons — 2026-03-25
- Icons throughout: mode selector, send/stop buttons, edit/del/copy, save/cancel
- Loaded via UMD CDN; icon_html() + render_icons() helpers in app.js
[UI] Persona-specific favicon — 2026-03-25
- Emoji SVG favicon generated from persona config at load time
[Multi-user] Holly onboarding — 2026-03-20
- Holly's invite sent; onboarding completed via /setup/{token}
- home/holly/persona/tina/ created from template
- Google OAuth registered (holly.danner@gmail.com)
[Channel] Nextcloud Talk integration ✅ — 2026-03-20, updated 2026-03-27
- HMAC verification: incoming uses random + raw_body; outgoing reply uses random + message_text
- Per-user routing added 2026-03-27 (endpoint: /webhook/nextcloud/{username})
- Docs: docs/NEXTCLOUD_TALK_BOT.md
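The HMAC rule above can be sketched as follows. The hash (SHA-256) and hex encoding reflect my understanding of the Nextcloud Talk bot protocol rather than anything stated in this file, and `sign` / `verify_incoming` are hypothetical names; docs/NEXTCLOUD_TALK_BOT.md remains the authority.

```python
import hashlib
import hmac


def sign(secret: str, random: str, payload: bytes) -> str:
    """Hex HMAC over random + payload, keyed with the bot's shared secret."""
    message = random.encode("utf-8") + payload
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_incoming(secret: str, random: str, raw_body: bytes, signature: str) -> bool:
    """Check a webhook signature against random + raw_body in constant time."""
    expected = sign(secret, random, raw_body)
    return hmac.compare_digest(expected.encode("ascii"), signature.lower().encode("utf-8"))
```

For an outgoing reply the same `sign` helper would be called with a fresh random value and the message text encoded as bytes, matching the "random + message_text" rule.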
[Channel] Google Chat integration ✅ — 2026-03-20, updated 2026-03-27
- JWT verification via authorizationEventObject.systemIdToken
- Workspace Add-on format: hostAppDataAction.chatDataAction.createMessageAction
- Per-user routing added 2026-03-27 (endpoint: /channels/google-chat/{username})
- Docs: docs/GOOGLE_CHAT_BOT.md
[Intelligence] Orchestrator service — Phase 1 — 2026-03-18
- Gemini API (google-genai SDK) tool loop → Claude final response
- POST /orchestrate (async job), GET /orchestrate/{job_id} (poll)
- Tools: web search, AE API, file read, task list, scratch, reminders, cron
- Default model: gemini-2.5-flash
[Auth] Session auth + persona onboarding — 2026-03-20
- bcrypt passwords in home/{username}/auth.json
- JWT session cookies (HS256, 30-day expiry)
- Invite tokens (72h, one-time-use) — manage_passwords.py invite <user> [email]
- Self-service onboarding: /setup/{token} → /setup/persona
- SMTP invite email via noreply@oneskyit.com
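The 72-hour, one-time-use invite behaviour could be modelled as in the sketch below. The in-memory `store` dict and both function names are assumptions; manage_passwords.py presumably persists invites differently.

```python
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional

INVITE_TTL = timedelta(hours=72)


def create_invite(store: dict, user: str, now: Optional[datetime] = None) -> str:
    """Create a random URL-safe token for user and record its expiry."""
    now = now or datetime.now(timezone.utc)
    token = secrets.token_urlsafe(32)
    store[token] = {"user": user, "expires": now + INVITE_TTL}
    return token


def redeem_invite(store: dict, token: str, now: Optional[datetime] = None) -> Optional[str]:
    """Return the invited user once; None if the token is unknown, used, or expired."""
    now = now or datetime.now(timezone.utc)
    invite = store.pop(token, None)  # pop enforces one-time use
    if invite is None or now >= invite["expires"]:
        return None
    return invite["user"]
```

Removing the token on the first redemption attempt, even an expired one, keeps the store from accumulating stale entries.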
[UI] Mobile-friendly header — 2026-03
- Backend toggle, font size, theme buttons moved into ⚙ settings panel
- Header reduced to core buttons
[UI] Help & Reference — 2026-03-27
- Shared base at cortex/static/HELP.md (served to all users)
- Persona-specific additions appended from home/{username}/persona/{name}/HELP.md if present
- Collapsible H2 sections via <details> elements
[Backend] Gemini CLI backend — 2026-03
- gemini -p subprocess, streaming output; auth check at /auth/status
[Backend] Memory distiller — 2026-03
- APScheduler: distill_short (daily 03:00), distill_mid (weekly Sun 03:30), distill_long (monthly 1st 04:00)
- Writes to MEMORY_SHORT.md, MEMORY_MID.md, MEMORY_LONG.md per persona
[Backend] Session logging + file browser — 2026-03
- Sessions saved to home/{user}/persona/{name}/sessions/
- Files panel in UI browses persona directory
[Backend] Dispatcher core — 2026-03-04
- FastAPI service with streaming SSE response
- Claude CLI and Gemini CLI subprocess backends
- Session context management (rolling window, MAX_HISTORY_MESSAGES)
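The rolling-window idea behind MAX_HISTORY_MESSAGES amounts to keeping only the most recent messages, roughly as sketched below. The value 20 and the `trim_history` helper are placeholders; the dispatcher's real limit and trimming rules are not given here.

```python
MAX_HISTORY_MESSAGES = 20  # placeholder value, not the project's setting


def trim_history(messages: list, max_messages: int = MAX_HISTORY_MESSAGES) -> list:
    """Return the newest max_messages items, oldest first."""
    if max_messages <= 0:
        return []  # guard: messages[-0:] would return the whole list
    return messages[-max_messages:]
```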