Scott Idem a5658eb3c4 feat: edit existing model entries in the Model Registry

- Inline edit form per model row (label, model name/ID, host/account, context, tags)
- Fetch models button in edit form for local models — same live-picker UX as Add Model
- POST /settings/local/models/{id}/edit route in local_llm.py
- Admin role badge (ADMIN/USER pill) in Account Settings page
- HELP.md updated: new tools table with admin/confirm markers, PWA install section
- TODO updated: tool expansions marked done, distill review and Unsloth resolved,
  role-based access and admin badge added to completed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-29 21:08:09 -04:00

Cortex / Inara — Agent Task List

Read this file before starting any work on this project. Status: Active development — ongoing.

🔴 High Priority

[Local] Local orchestrator — reach full parity with Gemini orchestrator

openai_orchestrator.py is partially built and wired into POST /orchestrate. When the orchestrator role resolves to a local_openai model it routes there automatically. Remaining work is quality/reliability parity, not ground-up design.

Audit tool schema conversion — Gemini FunctionDeclaration → OpenAI tools format (minor field rename, already partially done)
Context budget enforcement per iteration (40–50k for E4B, 35–40k for 26B A4B)
Context compaction — trim stale tool results mid-run when approaching limit
Error handling parity with Gemini orchestrator (retry logic, malformed tool calls)
Test end-to-end with Gemma 4 E4B and 26B A4B on scott_gaming
Review ARCH__FUTURE.md agent architecture ideas before finalising design
Reference: docs/OPEN_WEBUI_API.md, documentation/ARCH__FUTURE.md §1
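
The context compaction item above could be approached along these lines. This is only a sketch: it assumes OpenAI-style message dicts with a `role` and `content`, and it uses a crude 4-characters-per-token estimate rather than a real tokenizer. `estimate_tokens`, `compact_tool_results`, and the placeholder text are hypothetical names, not code from openai_orchestrator.py.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(messages: List[Message]) -> int:
    """Rough size estimate: about 4 characters per token (an assumption)."""
    return sum(len(m.get("content") or "") for m in messages) // 4


def compact_tool_results(
    messages: List[Message],
    budget_tokens: int,
    keep_recent: int = 2,
    placeholder: str = "[stale tool result trimmed]",
) -> List[Message]:
    """Replace the oldest tool results with a placeholder until under budget.

    The newest `keep_recent` tool results are never trimmed, and the input
    list is not mutated.
    """
    compacted = [dict(m) for m in messages]
    tool_indexes = [i for i, m in enumerate(compacted) if m.get("role") == "tool"]
    trimmable = tool_indexes[:-keep_recent] if keep_recent else tool_indexes
    for i in trimmable:  # oldest first
        if estimate_tokens(compacted) <= budget_tokens:
            break
        compacted[i]["content"] = placeholder
    return compacted
```

Calling this once per iteration with the per-model budget (for example 40,000 for E4B) would keep the message roles and ordering intact while shrinking only old tool output.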

🟡 Medium Priority

[UI] Progressive Web App (PWA) ✅ — 2026-04-29

manifest.json, sw.js, icon-192/512.png, SW registration in app.js
/manifest.json and /sw.js served at root; added to _PUBLIC in auth_middleware
Tested: install prompt confirmed working in Chromium

[Tools] Orchestrator tool expansions

New tools for cortex/tools/ — higher-value additions that fill obvious gaps.

cortex_restart — detached subprocess, 5s delay, admin-only, confirm-required — 2026-04-29
cortex_logs — journalctl --user -u cortex -n N, admin-only — 2026-04-29
http_fetch — direct URL fetch via httpx, 8192 char cap — 2026-04-29
file_list — directory listing with size, dirs first, 200 entry cap, admin-only — 2026-04-29
file_write — overwrite/append to home_root paths, admin-only, confirm-required — 2026-04-29
nc_talk_send — outbound NC Talk message via notification.py, admin-only — 2026-04-29
email_send — send email via existing email_utils.py SMTP helper
web_push — send a browser push notification (requires push subscription stored per-user; pairs well with the PWA service worker already in place)

[Channel] Proactive notifications

Inara reaches out on her own initiative via NC Talk or Google Chat when a reminder fires, a cron job completes, or something else warrants attention. The cron/reminder infrastructure already exists — this closes the loop so she can interrupt the user.

Add outbound message helper for NC Talk (send_nextcloud_message(user, text))
Add outbound message helper for Google Chat (send_google_chat_message(user, text))
Wire cron job completion and reminder triggers to call outbound helper
Store user preference: which channel to use for proactive notifications
channels.json already per-user — add notify_channel: "nextcloud" | "google_chat" | null
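
A minimal sketch of how the `notify_channel` preference might drive dispatch. It assumes channels.json is a flat JSON object and that the outbound helpers take `(user, text)` as listed above; `load_notify_channel`, `notify_user`, and the `senders` mapping are hypothetical names introduced for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Optional

Sender = Callable[[str, str], None]


def load_notify_channel(channels_path: Path) -> Optional[str]:
    """Return the user's notify_channel, or None if unset or the file is missing."""
    try:
        config = json.loads(channels_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None
    return config.get("notify_channel")


def notify_user(user: str, text: str, channels_path: Path,
                senders: Dict[str, Sender]) -> bool:
    """Send `text` via the user's preferred channel; return False if none applies."""
    channel = load_notify_channel(channels_path)
    sender = senders.get(channel) if channel else None
    if sender is None:
        return False  # null preference or unknown channel: stay silent
    sender(user, text)
    return True
```

Here `senders` would map `"nextcloud"` to `send_nextcloud_message` and `"google_chat"` to `send_google_chat_message`, so cron and reminder triggers only need to call `notify_user`.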

[UI] File attachments in chat

Upload an image or document inline and have it flow into context. Natural workflow ("here's this PDF, summarize it"); local backend already supports multimodal via Open WebUI.

Add attachment button to input area (paperclip icon, hidden file input)
Client: encode file as base64 or multipart; send alongside message text
Server: accept file in POST /chat; route to appropriate backend
- Claude: content array with image blocks (base64 or URL)
- Gemini: parts array with inline_data
- Local (Open WebUI): content array with image_url items
UI: show thumbnail/filename above the sent message
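
The three per-backend shapes listed above can be sketched as one helper. The block shapes follow the public Anthropic, Gemini, and OpenAI-compatible message formats as commonly documented, and should be verified against each API before use. `build_image_part` and the backend labels are hypothetical names.

```python
import base64


def build_image_part(backend: str, data: bytes, mime_type: str) -> dict:
    """Build one image content item for the given backend (sketch only)."""
    b64 = base64.b64encode(data).decode("ascii")
    if backend == "claude":
        # Item for a Claude message "content" array
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": mime_type, "data": b64},
        }
    if backend == "gemini":
        # Item for a Gemini "parts" array
        return {"inline_data": {"mime_type": mime_type, "data": b64}}
    if backend == "local":
        # OpenAI-compatible item (Open WebUI), sent as a data URL
        return {
            "type": "image_url",
            "image_url": {"url": f"data:{mime_type};base64,{b64}"},
        }
    raise ValueError(f"unknown backend: {backend}")
```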

[Models] Edit existing model entries

Currently models can only be removed and re-added. Add an edit flow so fields (display name, model ID, context size, tags, notes) can be updated in-place.

Add "Edit" button next to each model row in local_llm.html (alongside Remove)
Populate the Add Model form with the model's current values when edit is clicked
On save, PATCH or delete+recreate via user_settings.py
Applies to both local and (future) cloud model entries

[Auth] Encrypted sessions

Allow users to opt-in to per-session encryption so session logs on disk cannot be read without the user's key.

Design key derivation: password-based (PBKDF2/Argon2) or separate passphrase
Encrypt session_logger.py output before writing to sessions/*.md
Decrypt on read in session_store.py (history reload, file browser)
UI toggle in Settings to enable/disable encrypted sessions per persona
Decide: encrypt at rest only, or also in-memory session store?
Consider: how distillation and session search interact with encrypted files
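
For the key derivation question, a password-based option using only the standard library might look like the sketch below. It covers PBKDF2 only; Argon2 and the actual encryption step need third-party libraries and are left out. `derive_session_key`, the 16-byte salt, and the iteration count are assumptions, not decisions recorded in this project.

```python
import hashlib
import os
from typing import Optional, Tuple


def derive_session_key(
    passphrase: str,
    salt: Optional[bytes] = None,
    iterations: int = 600_000,
) -> Tuple[bytes, bytes]:
    """Derive a 32-byte key from a passphrase with PBKDF2-HMAC-SHA256.

    Returns (key, salt). The salt is not secret and has to be stored next
    to the encrypted session file so the key can be re-derived on read.
    """
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
    return key, salt
```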

[Models] Model Registry V2 — Unified Provider System

See DESIGN__Model_Registry_V2.md for full design.

Phase 1 — V2 schema with providers (Anthropic/Google), multi-account Gemini, auto migration, orchestrator uses account API key — 2026-04-27
Phase 2 — Cloud provider UI: Anthropic + Google sections in /settings/models, account management, model entry creation for cloud models
Phase 3 — Unified roles + toggle redesign: standalone role assignments, chat toggle cycles role slots (Primary/Backup 1/Backup 2) showing model label
Phase 4 — Polish: Claude API key, OpenRouter as named provider, catalog sync from API

[Intelligence] Knowledge consolidation — Phase 1

See ARCH__Intelligence_Layer.md for full design.

Tool: ae_journal_list — list all journals for the account — 2026-04-28
Tool: ae_journal_search — search before creating to avoid duplicates
Tool: ae_journal_entry_create — write a new entry with source metadata
Tool: ae_journal_entry_update — PATCH any fields on an existing entry — 2026-04-28
Tool: ae_journal_entry_disable — soft-delete via enable=false — 2026-04-28
Tool: ae_journal_entry_append — read→append timestamped section→write (running logs) — 2026-04-28
Tool: ae_journal_entry_prepend — read→prepend timestamped section→write (newest-first logs) — 2026-04-28
Import script: walk a markdown directory, chunk by H2 section, create entries
Target: markdown files from ~/DgrZone_Nextcloud/ and ~/OSIT_Nextcloud/
Tag strategy: source path, date, topic tags from frontmatter or filename
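
The "chunk by H2 section" step of the import script could be sketched as follows. This assumes ATX-style `## ` headings and skips headings inside fenced code blocks. `chunk_by_h2` and the `(preamble)` title are hypothetical, and sections with no body text are dropped.

```python
import re
from typing import List, Tuple

H2_HEADING = re.compile(r"^##\s+(.*\S)\s*$")
FENCE = "`" * 3  # fenced code block marker


def chunk_by_h2(text: str, preamble_title: str = "(preamble)") -> List[Tuple[str, str]]:
    """Split markdown into (title, body) chunks at each H2 heading."""
    chunks: List[Tuple[str, str]] = []
    title, body, in_fence = preamble_title, [], False

    def flush() -> None:
        content = "\n".join(body).strip()
        if content:  # skip empty sections
            chunks.append((title, content))

    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else H2_HEADING.match(line)
        if match:
            flush()
            title, body = match.group(1), []
        else:
            body.append(line)
    flush()
    return chunks
```

Each `(title, body)` pair would then become one journal entry, with the source path and tags attached by the caller.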

[Distill] Review first auto_distill_long output — 2026-04-01

Ran April 1 at 04:00 as scheduled
Manually review inara/MEMORY_LONG.md — confirm quality before fully trusting
Adjust distill prompts in cortex/memory_distiller.py if needed

🟢 Lower Priority / Future

[Research] Agent architecture patterns — review before building dev agent pipeline

The Claude Code system prompt was leaked April 2026. Two reimplementation repos have useful design ideas directly applicable to the local orchestrator and dev agent work. Read before finalising either design.


Review https://github.com/HarnessLab/claw-code-agent (Python, targets local models)
Review https://github.com/ultraworkers/claw-code (community port, interesting source)
Key ideas to evaluate for Cortex:
- Tiered permission model (read-only / write / shell / unsafe) — relevant once dev agent is writing and executing code
- Agent lineage tracking — which agent spawned which sub-agent; essential for the orchestrator → specialist → supervisor chain
- Hard token/cost budgets per operation — local models have fixed context ceilings
- Context compaction mid-session — trim stale tool results before hitting limit
- Nested agent delegation with dependency-aware batching
- Plugin/manifest-based tool registration — worth considering before tool suite grows

[Sessions] Cross-session search

The file browser has per-file session search, but no way to query across all sessions for a persona. A unified search would make the session archive useful as a knowledge source.

POST /sessions/search?q=... — walks home/{user}/persona/{name}/sessions/*.md, returns matching excerpts with date + line context
UI: search input in file browser sidebar already present — wire to new endpoint
Consider: index on startup vs. live grep (live grep is fine at typical session volume)
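
The live-grep option could be as small as the sketch below. It assumes session logs are UTF-8 markdown files directly under one `sessions/` directory. `search_sessions`, the result keys, and the default limits are hypothetical.

```python
from pathlib import Path
from typing import Dict, List, Union


def search_sessions(
    sessions_dir: Path, query: str, context: int = 1, limit: int = 50
) -> List[Dict[str, Union[str, int]]]:
    """Case-insensitive substring search across sessions/*.md.

    Each hit carries the file name, 1-based line number, and an excerpt
    with `context` lines on either side of the match.
    """
    needle = query.casefold()
    hits: List[Dict[str, Union[str, int]]] = []
    for path in sorted(sessions_dir.glob("*.md")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for i, line in enumerate(lines):
            if needle in line.casefold():
                start = max(0, i - context)
                end = min(len(lines), i + context + 1)
                hits.append({
                    "file": path.name,
                    "line": i + 1,
                    "excerpt": "\n".join(lines[start:end]),
                })
                if len(hits) >= limit:
                    return hits
    return hits
```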

[Backend] API usage / cost tracking

Multi-user setup with real Gemini/Claude API costs. Track per-user token consumption so Scott can see who's spending what.

Count input + output tokens per /chat and /orchestrate call (all backends return usage)
Append to home/{user}/usage.json — daily buckets, per-model breakdown
Expose via /api/usage endpoint; add a summary row to the Settings page
Optional: soft spending limit with a warning toast when exceeded
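
One possible shape for the daily-bucket, per-model layout of usage.json is sketched here. The field names (`input_tokens`, `output_tokens`, `calls`) and `record_usage` itself are assumptions. The sketch rewrites the whole file on each call and does no locking, which a concurrent server would need to add.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def record_usage(usage_path: Path, model: str, input_tokens: int,
                 output_tokens: int, day: Optional[str] = None) -> dict:
    """Add one call's token counts to usage.json: {day: {model: totals}}."""
    day = day or date.today().isoformat()
    try:
        usage = json.loads(usage_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        usage = {}
    bucket = usage.setdefault(day, {}).setdefault(
        model, {"input_tokens": 0, "output_tokens": 0, "calls": 0}
    )
    bucket["input_tokens"] += input_tokens
    bucket["output_tokens"] += output_tokens
    bucket["calls"] += 1
    usage_path.write_text(json.dumps(usage, indent=2), encoding="utf-8")
    return usage
```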

[Intelligence] Dev agent pipeline

See ARCH__Intelligence_Layer.md. Full design not yet started.

Specialist agent: frontend (SvelteKit) code changes
Specialist agent: backend (FastAPI) code changes
Supervisor agent: diff review, syntax check, test runner
Gitea webhook integration: trigger on push/PR, report back
Human approval gate before commit

[Intelligence] Supervisor agent

Runs py_compile, svelte-check, unit tests after specialist agent work
Reports pass/fail back to orchestrator
Only commits on explicit approval

[Channel] Gitea webhooks

Receive push/PR/issue events → route to appropriate agent
cortex/routers/ already has pattern; add gitea.py
Gitea Actions (CI) for "run tests on push" — simpler than custom runner

[Local] RAG via Open WebUI

Open WebUI has a full RAG pipeline (file upload → embed → knowledge collections → reference in chat). Could feed Nextcloud docs or session logs into a local knowledge base accessible to local models. Endpoints documented in docs/OPEN_WEBUI_API.md.

/api/v1/files/ upload + /api/v1/retrieval/process/web for URLs
Reference in chat via "files": [{"type": "collection", "id": "..."}]

[Backend] Intelligent model routing — automatic task-type dispatch

Model Registry V2 (2026-04-27) added role-based routing and manual role toggle — that's the foundation. What remains is removing the need to toggle manually.

Classify incoming messages by task type (heuristic or lightweight classifier)
Map task type → role → model automatically:
- User conversation → chat role → Claude (quality prose, persona fidelity)
- Tool/research tasks → orchestrator role → Gemini API or local
- Private/sensitive → local role → Ollama (no data leaves network)
- Long context (>50k tokens) → Gemini 2.0 (1M ctx window)
- Fast/cheap queries → local E4B (25 t/s, no API cost)
Routing logic in llm_client.py or new router.py; expose override in UI
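
A heuristic version of the classifier could start as simply as the sketch below. The precedence order (privacy first, then context size, then tools, then message length), the 80-character "fast" cutoff, and every name here are assumptions for illustration. Only the 50k-token threshold and the targets come from the list above.

```python
LONG_CONTEXT_TOKENS = 50_000   # threshold from the list above
FAST_QUERY_MAX_CHARS = 80      # assumed cutoff for "fast/cheap" queries

# Task type -> role or backend label, following the mapping above
ROUTES = {
    "private": "local",
    "long_context": "gemini",
    "tools": "orchestrator",
    "fast": "local",
    "conversation": "chat",
}


def classify_task(text: str, context_tokens: int,
                  tools_requested: bool = False, private: bool = False) -> str:
    """Return a task type using simple rules, checked in priority order."""
    if private:
        return "private"          # privacy always wins
    if context_tokens > LONG_CONTEXT_TOKENS:
        return "long_context"
    if tools_requested:
        return "tools"
    if len(text) <= FAST_QUERY_MAX_CHARS:
        return "fast"
    return "conversation"


def route(text: str, context_tokens: int, **flags: bool) -> str:
    """Map a message to its target label via the task type."""
    return ROUTES[classify_task(text, context_tokens, **flags)]
```

A UI override would simply bypass `classify_task` and pass the chosen role through unchanged.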

[Ops] Permanent fleet hosting — home server deployment

Currently running on scott-lt-i7-rtx (gaming laptop). Long-term target is the home server for always-on reliability. docker-compose.yml already exists.

Copy project to home server
Configure Nginx reverse proxy (already Docker-hosted on that machine)
Point cortex.dgrzone.com → home server internal IP (pfSense alias update)
WireGuard required for all access — not internet-exposed
Update FLEET_MANIFEST.md to reflect new hosting location

[Future] Cortex Mesh — multi-instance fleet coordination

Each fleet device runs its own Cortex instance. Instances delegate tasks to each other based on resources and specialisation. No central coordinator required.

Concept only — no design yet. Resolve these questions before building:
- Auth between instances (shared JWT secret vs. per-instance API keys)
- Capability advertisement (model registry over HTTP? shared Syncthing file?)
- Whether ae_send_message / the inbox system is the right coordination layer
- Session continuity — does a conversation stay on one node or migrate?
Natural foundation already in place: Syncthing-synced home/ and shared model_registry.json mean instances share persona memory without a central DB

✅ Completed

[Tools] Role-based access control + confirmation gate — 2026-04-29

TOOL_ROLES dict maps tool names to minimum required role (admin/user)
CONFIRM_REQUIRED set blocks destructive tools; orchestrator injects confirmation prompt instead
get_tools_for_role(role) filters both Gemini declarations and callables
get_user_role(username) added to auth_utils.py; passed through both orchestrators
manage_passwords.py role <username> admin|user — shell-only admin promotion
Admin-only tools: shell_exec, claude_allow_dir, cortex_restart, cortex_logs, file_read, file_list, file_write, ae_task_list, nc_talk_send
Confirm-required tools: cortex_restart, file_write, shell_exec, cron_remove, reminders_clear
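
The role filter and confirmation gate described above can be sketched like this. It is an illustration of the idea, not the project's code: the real `get_tools_for_role(role)` has a different signature, and `filter_tools_for_role`, `call_tool`, `ROLE_RANK`, and the returned dict shape are hypothetical. The tool names in the two tables are a subset copied from the lists above.

```python
from typing import Any, Callable, Dict

ROLE_RANK = {"user": 0, "admin": 1}

# Minimum role per tool; tools not listed default to "user"
TOOL_ROLES = {"shell_exec": "admin", "file_write": "admin", "cortex_logs": "admin"}
CONFIRM_REQUIRED = {"cortex_restart", "file_write", "shell_exec",
                    "cron_remove", "reminders_clear"}

Tool = Callable[..., Any]


def filter_tools_for_role(role: str, tools: Dict[str, Tool]) -> Dict[str, Tool]:
    """Keep only the tools whose minimum role the caller meets."""
    rank = ROLE_RANK[role]
    return {name: fn for name, fn in tools.items()
            if rank >= ROLE_RANK[TOOL_ROLES.get(name, "user")]}


def call_tool(name: str, args: dict, role: str, tools: Dict[str, Tool],
              confirmed: bool = False) -> dict:
    """Run a tool, or return a denial or confirmation request instead."""
    allowed = filter_tools_for_role(role, tools)
    if name not in allowed:
        return {"status": "denied", "tool": name}
    if name in CONFIRM_REQUIRED and not confirmed:
        return {"status": "needs_confirmation", "tool": name, "args": args}
    return {"status": "ok", "result": allowed[name](**args)}
```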

[UI] Admin role badge in Account settings — 2026-04-29

GET /settings now injects user_role from auth.json into settings page
Role shown as a styled pill badge (purple ADMIN, muted USER) below username field

[Local] Unsloth Gemma 4 variants — resolved 2026-04-29

Ollama update resolved the 500: unable to load model issue
Unsloth Dynamic 2.0 Q4_K_M GGUFs loading correctly

[Distill] Distill quality review — resolved 2026-04-29

Short/mid/long output reviewed and quality confirmed acceptable
No prompt tuning needed at this time

[UI] Progressive Web App (PWA) — 2026-04-29

manifest.json, sw.js, PNG icons (192/512) generated via rsvg-convert
/manifest.json and /sw.js served at root via ui.py; exempted in auth_middleware
Theme-color meta tag updated dynamically on light/dark toggle
Install prompt confirmed working in Chromium desktop; apple-touch-icon for iOS

[UI] CodeMirror markdown editor for identity/memory files — 2026-04-28

Replaced textarea in Files panel with CodeMirror 5 (markdown mode, CDN)
Syntax highlighting, line wrapping, Ctrl+S to save, per-file undo history

[UI] Input area polish — 2026-04-28

Single cycling S/M/L button replaces 3 separate height buttons (same UX as font size)
S size collapses mode-select to a row (compact); M/L keep vertical column layout
Input height minimum derived from setting so empty textarea reflects selected size
Context & Memory panel and Settings dropdown are mutually exclusive (closeAllPanels fix)
Both panels now use consistent shadow (var(--shadow)) and z-index (200)

[Tools] Tools toggle — decoupled from Role/Backend — 2026-04-28

Removed "Agent" mode from the mode selector; replaced with independent ⚡ toggle
toolsEnabled persists in localStorage; routes to orchestrator regardless of active mode
Layout: column (M/L) or row (S) driven by data-size attribute set by JS
chat_role flows from UI → OrchestrateRequest → orchestrator_engine.run(response_role=...)

[Tools] shell_exec tool — 2026-04-28

shell_exec(command, working_dir, timeout) in cortex/tools/system.py
Runs any shell command on the Cortex host; timeout clamped 1–120s
Use for system diagnostics: df -h, ps aux, journalctl, free -h, etc.
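
A sketch of the timeout clamp and subprocess call, assuming the tool wraps `subprocess.run`. The actual body of `shell_exec` in cortex/tools/system.py is not shown in this file, so `shell_exec_sketch`, `clamp_timeout`, the 30-second default, and the result dict are hypothetical.

```python
import subprocess
from typing import Optional

MIN_TIMEOUT, MAX_TIMEOUT = 1, 120  # seconds, per the clamp described above


def clamp_timeout(timeout: float) -> int:
    """Clamp a requested timeout into the 1-120s range."""
    return max(MIN_TIMEOUT, min(MAX_TIMEOUT, int(timeout)))


def shell_exec_sketch(command: str, working_dir: Optional[str] = None,
                      timeout: float = 30) -> dict:
    """Run a shell command and capture its output, bounded by the clamped timeout."""
    seconds = clamp_timeout(timeout)
    try:
        proc = subprocess.run(
            command, shell=True, cwd=working_dir,
            capture_output=True, text=True, timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "stdout": "", "stderr": "",
                "timed_out": True, "timeout": seconds}
    return {"exit_code": proc.returncode, "stdout": proc.stdout,
            "stderr": proc.stderr, "timed_out": False, "timeout": seconds}
```

Because `shell=True` hands the string to the system shell unmodified, the admin-only and confirm-required gates are what keep this safe, not anything in the function itself.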

[Tools] Aether Journals full toolkit — 2026-04-28

ae_journal_list — list all journals + ids for the account
ae_journal_entry_update — PATCH any fields (title, content, summary, tags, enable)
ae_journal_entry_disable — soft-delete via enable=false
ae_journal_entry_append — read→append timestamped section→write (running/data logs)
ae_journal_entry_prepend — read→prepend timestamped section→write (newest-first)
Shared _get_entry / _patch_entry helpers; OpenAI JSON Schema auto-derived from Gemini declarations
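
Deriving the OpenAI tool format from a Gemini declaration could look roughly like the sketch below. It assumes each declaration is available as a plain dict with `name`, `description`, and `parameters`, and that Gemini-style schemas use uppercase type names such as `OBJECT` and `STRING`. `gemini_to_openai_tool` and `_lower_types` are hypothetical names, not the project's helpers.

```python
from typing import Any


def _lower_types(schema: Any) -> Any:
    """Recursively lowercase string-valued "type" keys (OBJECT -> object)."""
    if isinstance(schema, dict):
        return {
            key: value.lower() if key == "type" and isinstance(value, str)
            else _lower_types(value)
            for key, value in schema.items()
        }
    if isinstance(schema, list):
        return [_lower_types(item) for item in schema]
    return schema


def gemini_to_openai_tool(declaration: dict) -> dict:
    """Wrap a Gemini-style function declaration in the OpenAI tools format."""
    parameters = declaration.get("parameters") or {"type": "object", "properties": {}}
    return {
        "type": "function",
        "function": {
            "name": declaration["name"],
            "description": declaration.get("description", ""),
            "parameters": _lower_types(parameters),
        },
    }
```

Only string values under a `type` key are changed, so a property that happens to be named `type` and `enum` values keep their original spelling.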

[Local] Per-user multi-model local LLM settings — 2026-04-01

home/{username}/local_llm.json — hosts[] + models[] + active_model_id structure
cortex/user_settings.py — CRUD functions: save_host, add_model, remove_model, set_active_model, get_active_local_model
cortex/routers/local_llm.py + cortex/static/local_llm.html — dedicated /settings/local page
"Fetch models from host" button — proxied via /api/local-llm/fetch-models, populates dropdown
Active model shown in UI near backend toggle button (amber hint text)
Migrates old flat .env-style config automatically on first use

[UI] Copy button for user (sent) messages — 2026-04-01

Added matching copy-on-hover button to user messages (same pattern as assistant messages)
div.dataset.raw set on send; makeCopyBtn(div) appended inline

[Backend] Local model backend (Open WebUI / Ollama) — 2026-04-01

OpenAI-compatible API via httpx — no CLI wrapper needed
Configured via LOCAL_API_URL / LOCAL_API_KEY / LOCAL_MODEL in .env
Backend toggle cycles claude → gemini → local (amber color in UI)
/auth/status includes local reachability check (GET /api/models)
Tested end-to-end: test-agent-simple (Qwen3-8B) on scott-lt-i7-rtx:3000, full persona context flowing correctly

[Testing] Gitea SSH port 2222 — 2026-03-29

pfSense WAN → 192.168.32.7:2222 port forward confirmed working
ssh -p 2222 git@git.dgrzone.com reaches Gitea (returns "Invalid repository path" — expected, confirms connectivity)
Clone/push via SSH: git clone ssh://git@git.dgrzone.com:2222/<user>/<repo>.git

[Multi-user] Brian onboarding — 2026-03-29

Invite sent to memedrift@gmail.com
Brian completed onboarding, created wintermute persona
Google OAuth registered (google-add brian memedrift@gmail.com)

[Tools] Reminders tools — 2026-03-29

reminders_add, reminders_list, reminders_clear added to orchestrator tool suite
Tools live in cortex/tools/reminders.py
All persona PROTOCOLS.md updated with Tools & Modes reference (direct chat vs Agent mode)
persona_template.py updated so new personas get the protocol automatically

[Auth] Token expiry — no restart needed — 2026-03-27

llm_client._fresh_claude_token() reads live from ~/.claude/.credentials.json on every call
systemd service is a user unit (no sudo) — systemctl --user restart cortex is sufficient
No manual token sync required after claude auth login

[Multi-user] Per-user channel config — 2026-03-27

Google Chat and NC Talk secrets/config moved from .env to home/{username}/channels.json
New endpoints: POST /channels/google-chat/{username} and POST /webhook/nextcloud/{username}
No channel access by default — each user configures their own channels.json
Setup guides: docs/GOOGLE_CHAT_BOT.md and docs/NEXTCLOUD_TALK_BOT.md

GET /auth/google → Google consent → GET /auth/google/callback flow
Users pre-registered via manage_passwords.py google-add <user> <email>
Google sign-in button on /login; auth.json stores google_sub + google_email
Active users: scott (scott.idem@oneskyit.com), holly (holly.danner@gmail.com), brian (memedrift@gmail.com)

[Settings] Per-user Gemini API key — 2026-03-27

Stored in home/{username}/auth.json as gemini_api_key
Orchestrator uses user key if set, falls back to server-level GEMINI_API_KEY
Manageable via /settings UI (add, remove, masked hint)

localStorage keyed to cx_sid_{user}_{persona} with 30-min inactivity TTL
Auto-restored silently on page load; cleared on "New session" or session delete

[UI] Persona picker page — 2026-03-26

GET /{username} shows a card grid of available personas instead of 404
Each card links directly to /{username}/{persona}

[UI] Lucide icons — 2026-03-25

Icons throughout: mode selector, send/stop buttons, edit/del/copy, save/cancel
Loaded via UMD CDN; icon_html() + render_icons() helpers in app.js

[UI] Persona-specific favicon — 2026-03-25

Emoji SVG favicon generated from persona config at load time

[Multi-user] Holly onboarding — 2026-03-20

Holly's invite sent; onboarding completed via /setup/{token}
home/holly/persona/tina/ created from template
Google OAuth registered (holly.danner@gmail.com)

[Channel] Nextcloud Talk integration ✅ — 2026-03-20, updated 2026-03-27

HMAC verification: incoming uses random + raw_body; outgoing reply uses random + message_text
Per-user routing added 2026-03-27 (endpoint: /webhook/nextcloud/{username})
Docs: docs/NEXTCLOUD_TALK_BOT.md
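
The two signing inputs noted above can be sketched with the standard library. This assumes HMAC-SHA256 with a hex digest and a shared bot secret, which matches the Nextcloud Talk bot documentation as far as I know but should be checked against docs/NEXTCLOUD_TALK_BOT.md. `sign_talk_payload` and `verify_incoming` are hypothetical names.

```python
import hashlib
import hmac


def sign_talk_payload(secret: str, random: str, payload: bytes) -> str:
    """HMAC-SHA256 over random + payload, hex encoded.

    Incoming webhooks: payload is the raw request body.
    Outgoing replies: payload is the message text, encoded as UTF-8.
    """
    return hmac.new(
        secret.encode("utf-8"), random.encode("utf-8") + payload, hashlib.sha256
    ).hexdigest()


def verify_incoming(secret: str, random: str, raw_body: bytes, signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_talk_payload(secret, random, raw_body)
    return hmac.compare_digest(expected, signature.lower())
```

Verifying against the raw bytes matters: re-serialising parsed JSON can change whitespace or key order and break the signature.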

[Channel] Google Chat integration ✅ — 2026-03-20, updated 2026-03-27

JWT verification via authorizationEventObject.systemIdToken
Workspace Add-on format: hostAppDataAction.chatDataAction.createMessageAction
Per-user routing added 2026-03-27 (endpoint: /channels/google-chat/{username})
Docs: docs/GOOGLE_CHAT_BOT.md

[Intelligence] Orchestrator service — Phase 1 — 2026-03-18

Gemini API (google-genai SDK) tool loop → Claude final response
POST /orchestrate (async job), GET /orchestrate/{job_id} (poll)
Tools: web search, AE API, file read, task list, scratch, reminders, cron
Default model: gemini-2.5-flash

[Auth] Session auth + persona onboarding — 2026-03-20

bcrypt passwords in home/{username}/auth.json
JWT session cookies (HS256, 30-day expiry)
Invite tokens (72h, one-time-use) — manage_passwords.py invite <user> [email]
Self-service onboarding: /setup/{token} → /setup/persona
SMTP invite email via noreply@oneskyit.com
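
The 72-hour, one-time-use invite token behaviour can be illustrated with an in-memory store. Where manage_passwords.py actually persists invites is not stated here, so `InviteStore` and its methods are hypothetical and only show the expiry and single-use logic.

```python
import secrets
import time
from typing import Dict, Optional

INVITE_TTL_SECONDS = 72 * 3600  # 72h, as listed above


class InviteStore:
    """In-memory invite tokens that expire and can be redeemed only once."""

    def __init__(self) -> None:
        self._invites: Dict[str, dict] = {}

    def create(self, username: str, now: Optional[float] = None) -> str:
        """Issue a random URL-safe token for `username`."""
        issued = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._invites[token] = {
            "username": username,
            "expires_at": issued + INVITE_TTL_SECONDS,
        }
        return token

    def redeem(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the username for a valid token, or None; the token is consumed."""
        invite = self._invites.pop(token, None)  # pop enforces one-time use
        if invite is None:
            return None
        current = time.time() if now is None else now
        if current > invite["expires_at"]:
            return None
        return invite["username"]
```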

[UI] Mobile-friendly header — 2026-03

Backend toggle, font size, theme buttons moved into ⚙ settings panel
Header reduced to core buttons

[UI] Help & Reference — 2026-03-27

Shared base at cortex/static/HELP.md (served to all users)
Persona-specific additions appended from home/{username}/persona/{name}/HELP.md if present
Collapsible H2 sections via <details> elements

[Backend] Gemini CLI backend — 2026-03

gemini -p subprocess, streaming output; auth check at /auth/status

[Backend] Memory distiller — 2026-03

APScheduler: distill_short (daily 03:00), distill_mid (weekly Sun 03:30), distill_long (monthly 1st 04:00)
Writes to MEMORY_SHORT.md, MEMORY_MID.md, MEMORY_LONG.md per persona

[Backend] Session logging + file browser — 2026-03

Sessions saved to home/{user}/persona/{name}/sessions/
Files panel in UI browses persona directory

[Backend] Dispatcher core — 2026-03-04

FastAPI service with streaming SSE response
Claude CLI and Gemini CLI subprocess backends
Session context management (rolling window, MAX_HISTORY_MESSAGES)
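
A rolling window over session history can be sketched as below. The value of `MAX_HISTORY_MESSAGES` and the choice to always keep system messages are assumptions; the dispatcher's real behaviour is not described in this file, and `trim_history` is a hypothetical name.

```python
from typing import Dict, List

MAX_HISTORY_MESSAGES = 20  # assumed value for illustration

Message = Dict[str, str]


def trim_history(history: List[Message],
                 max_messages: int = MAX_HISTORY_MESSAGES) -> List[Message]:
    """Keep system messages plus the newest `max_messages` other messages."""
    system = [m for m in history if m.get("role") == "system"]
    rest = [m for m in history if m.get("role") != "system"]
    # Guard against rest[-0:], which would return the whole list
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent
```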
