Compare commits

32 Commits

Author SHA1 Message Date
Scott Idem
8ba5247ef5 tooling: install script, workspace file, and dev-restart helper
- install.py — idempotent setup script (venv, systemd service, linger,
  auth checks); supports --check for read-only status inspection
- .stignore — exclude .venv and runtime dirs from Syncthing so each
  host maintains its own machine-local venv
- Cortex_and_Inara.code-workspace — VS Code workspace (service, personas,
  docs folders; launch config for uvicorn --reload)
- dev-restart.sh — SSH wrapper to restart Cortex on the gaming laptop
  and tail logs; supports restart / logs / status subcommands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:11:27 -04:00
Scott Idem
a6e404c143 feat: host_type field for OpenRouter / OpenAI-compatible API support
Adds host_type ("openwebui" | "openai") to the host schema so Cortex can
talk to both Open WebUI/Ollama and OpenRouter/standard-OpenAI endpoints.

Path differences per type:
  openwebui (default): /api/chat/completions, /api/models
  openai:              /chat/completions,     /models

model_registry.py:
  - host_type added to host schema (default "openwebui", backward compat)
  - save_host() accepts host_type parameter
  - _resolve_model() passes host_type through with the merged host fields

llm_client._local():
  - Reads host_type from resolved model_cfg
  - Selects correct chat completions path accordingly

routers/local_llm.py:
  - save_host route accepts host_type form field
  - fetch-models uses /models for openai type, /api/models for openwebui
  - Existing host rows show type selector pre-filled from stored value

local_llm.html:
  - "Add host" form includes type selector

To use OpenRouter:
  - Add host: URL = https://openrouter.ai/api/v1, Type = OpenAI-compatible
  - API key from openrouter.ai (store in .env or model_registry.json only)
  - Fetch models or add manually (e.g. anthropic/claude-sonnet-4-5-20251022)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 21:11:22 -04:00
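The path table in the commit above can be sketched as a small lookup. This is an illustration only, not the code in `llm_client._local()` or `routers/local_llm.py`: the helper name `endpoint_url`, its signature, and the `kind` parameter are assumptions. Only the four paths and the `"openwebui"` default come from the commit message.

```python
# Hypothetical helper: maps host_type to the request paths listed in the commit message.
_PATHS = {
    "openwebui": {"chat": "/api/chat/completions", "models": "/api/models"},
    "openai": {"chat": "/chat/completions", "models": "/models"},
}


def endpoint_url(base_url: str, host_type: str = "openwebui", kind: str = "chat") -> str:
    """Join a host base URL with the path for the given host_type.

    kind is "chat" (chat completions) or "models" (model listing).
    """
    paths = _PATHS.get(host_type)
    if paths is None:
        raise ValueError(f"unknown host_type: {host_type!r}")
    # Strip a trailing slash so the join never produces "//".
    return base_url.rstrip("/") + paths[kind]


if __name__ == "__main__":
    print(endpoint_url("https://openrouter.ai/api/v1", "openai"))
    print(endpoint_url("https://openrouter.ai/api/v1", "openai", "models"))
```

Keeping the default at `"openwebui"` mirrors the backward-compatibility note: host entries saved before this change carry no `host_type` and keep resolving to the `/api/...` paths.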
Scott Idem
2dd94696d5 fix: model-tag color was #334155 (invisible on dark theme) → #475569
2026-04-05 22:25:09 -04:00
Scott Idem
8570e8d852 fix: backend toggle not sent to server; add per-message model tag
Fixes:
  - app.js was tracking primaryBackend locally but never included
    model: primaryBackend in the /chat POST body, so the server always
    used settings.primary_backend regardless of what the user clicked.
    Now model: primaryBackend is sent on every chat request.

  - Responses were only annotated when fallback occurred. Now every
    assistant message shows a small model tag at the bottom right.

chat.py:
  - _backend_label() resolves human-readable name:
      claude → "Claude", gemini → "Gemini",
      local → registry label (e.g. "Gemma 4 E4B") or model_name
  - SSE payload now includes backend_label field

app.js:
  - model: primaryBackend added to /chat fetch body
  - After every response, appends .model-tag div with backend_label
  - Fallback shows " fallback → <label>" in amber; normal is muted
  - Removed separate system message for fallback (tag covers it)

style.css:
  - .model-tag: small muted text, right-aligned, separated by thin line
  - .model-tag.fallback: amber (#f59e0b)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 22:10:40 -04:00
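The label resolution described for `_backend_label()` might look roughly like the sketch below. The function name here, its parameters, and the shape of the `local_model` dict are assumptions. The commit only states the mapping (claude to "Claude", gemini to "Gemini", local to the registry label or else `model_name`).

```python
from typing import Optional


def backend_label(backend: str, local_model: Optional[dict] = None) -> str:
    """Return a human-readable name for the backend that produced a response.

    local_model is a hypothetical dict with optional "label" and "model_name" keys.
    """
    fixed = {"claude": "Claude", "gemini": "Gemini"}
    if backend in fixed:
        return fixed[backend]
    if backend == "local" and local_model:
        # Prefer the registry label, then fall back to the raw model name.
        return local_model.get("label") or local_model.get("model_name") or "local"
    # Unknown backends are passed through unchanged (an assumption, not from the commit).
    return backend
```

For example, `backend_label("local", {"label": "Gemma 4 E4B"})` yields the registry label, while a local model without a label falls back to its `model_name`.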
Scott Idem
9299ce5ba6 test: model registry unit test suite (45 tests)
Covers model_registry.py without requiring a running service or LLM:

  Empty/fresh state: no files, missing user dir
  Save/load: round-trip, corrupt file fallback
  Migration: v1 hosts/models, v1 no active, v0 flat, v0 empty url,
             distill_backend_mid=local → distill role, saves file after migrate
  Built-in resolution: claude_cli, gemini_api, gemini_cli, unknown → None
  User model resolution: local_openai merges host, missing host → None
  get_model_for_role: registry primary, built-in from registry, skips missing,
                      walks full backup chain, .env fallback, hardcoded fallback,
                      custom roles
  get_best_local_model: prefers role chain, falls back to first local, None if no local
  Host CRUD: create, update, unknown ID creates new, remove + cascades to models
  Model CRUD: create, update, remove + clears role refs
  set_role: assign model, assign built-in, clear with None, invalid slot,
            unknown model ID, creates new role key
  get_defined_roles: returns all settings roles, fills gaps with {}
  Multi-user isolation: registries don't bleed across users

All tests use tmp_path + patch.object(config.settings, ...) — no real files touched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 21:48:00 -04:00
Scott Idem
608e1de246 feat: model registry UI — hosts, models, role assignments
Replaces the single-host local model settings page with a full model
registry interface at /settings/local.

Hosts section:
- List existing hosts with inline edit + save + remove
- Collapsible "Add host" form
- Per-host "Fetch models" button

Models section:
- List all models with label, model name, host, context_k badge, tags
- Remove button

Add Model section:
- Host dropdown, label, model name, context_k, tags (comma-separated)
- "Fetch models from host" with auto-fill picker

Role Assignments section:
- One row per defined role (chat, orchestrator, distill, coder, research)
- Primary + backup_1 + backup_2 dropdowns per role
- Dropdowns pre-filled from registry on load
- AJAX save on change (POST /api/models/role) with toast confirmation
- Built-in models (claude_cli, gemini_cli, gemini_api) always available in dropdowns

Backend:
- All user_settings references replaced with model_registry
- host/{id}/remove route added
- fetch-models now accepts host_id query param
- POST /api/models/role for AJAX role assignment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 21:31:32 -04:00
Scott Idem
6a1a1c2686 feat: unified model registry with role-based routing
Introduces model_registry.py as the single source of truth for all LLM
backend configuration. Replaces scattered backend settings across user_settings,
config distill_backend_*, and the UI toggle.

model_registry.py:
- Per-user home/{user}/model_registry.json with version, hosts, models, roles
- Models have: type (local_openai|claude_cli|gemini_cli|gemini_api), label,
  model_name, host_id, context_k (tokens), tags (capability labels)
- Roles map to priority chains: primary, backup_1..backup_4
- Built-in IDs (claude_cli, gemini_cli, gemini_api) always resolvable
- Auto-migrates existing local_llm.json on first access
- CRUD: save_host, remove_host, save_model, remove_model, set_role
- get_model_for_role(): registry → .env default → hardcoded fallback

config.py:
- role_chat/orchestrator/distill/coder/research .env defaults
- defined_roles: comma-separated standard role list (extensible)
- get_defined_roles() and get_role_default() helper methods

llm_client.complete():
- New role= parameter (default "chat") for registry-based routing
- model= still accepted for explicit UI toggle override
- _claude() and _local() accept model_cfg dict instead of raw string
- _local() uses pre-resolved config from registry

memory_distiller.py:
- distill_mid/long now use role="distill" (no more distill_backend_* .env vars needed)

cron_runner.py:
- brief jobs use role="chat"

routers/chat.py + auth.py:
- Use model_registry instead of user_settings for local model info

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 21:25:18 -04:00
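The resolution order named for `get_model_for_role()` (registry chain, then `.env` default, then hardcoded fallback) can be sketched as follows. This is a minimal model under stated assumptions, not the code in `model_registry.py`: the function name `resolve_role`, the dict layout of the registry, and the choice of `"claude_cli"` as the hardcoded fallback are all hypothetical. The slot names and built-in IDs are taken from the commit message.

```python
from typing import Optional

# Built-in IDs are always resolvable, per the commit message.
BUILT_INS = {"claude_cli", "gemini_cli", "gemini_api"}
# Priority chain per role: primary, then backup_1..backup_4.
SLOTS = ["primary", "backup_1", "backup_2", "backup_3", "backup_4"]


def resolve_role(
    registry: dict,
    role: str,
    env_defaults: Optional[dict] = None,
    hardcoded: str = "claude_cli",  # assumed fallback, not confirmed by the source
) -> str:
    """Return the first usable model ID for a role."""
    models = registry.get("models", {})
    chain = registry.get("roles", {}).get(role, {})
    for slot in SLOTS:
        model_id = chain.get(slot)
        # Skip empty slots and IDs that point at models no longer in the registry.
        if model_id and (model_id in BUILT_INS or model_id in models):
            return model_id
    env_id = (env_defaults or {}).get(role)
    if env_id:
        return env_id
    return hardcoded
```

Walking the whole chain before consulting `.env` means a stale `primary` (for instance after a model was removed) does not mask a valid backup.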
Scott Idem
a4daebdc9b feat: local LLM multi-model, session search, cron proactive types, notifications, docs overhaul
Local LLM:
- user_settings.py: per-user hosts/models config (local_llm.json)
- routers/local_llm.py + static/local_llm.html: dedicated settings page
- llm_client.py: local OpenAI-compatible backend via httpx
- config.py: LOCAL_API_URL/KEY/MODEL + per-backend timeouts
- Active model shown near backend toggle (amber hint text)

Memory distillation:
- memory_distiller.py: DISTILL_BACKEND_MID/LONG .env overrides
- scheduler.py + notification.py: notify NC Talk after mid/long distill
- notification.py: outbound channel abstraction (NC Talk, extensible)

Session search:
- routers/files.py: GET /sessions/search?q= with excerpts grouped by date
- static/index.html + app.js: search UI in file sidebar with highlight
- _esc() helper to prevent XSS in search results

Proactive cron:
- cron_runner.py: new job types — message (send directly) and brief (LLM + send)
- Both support optional per-job channel override

Channels:
- routers/nextcloud_talk.py: consolidated using notification._send_nct_message()
- routers/auth.py: local backend status in /auth/status
- routers/chat.py: /backend returns {primary, fallback, local_model} object

UI / UX:
- Copy button for user messages (matching assistant)
- Autocomplete disabled on sensitive form fields
- settings.html: local model section replaced with link to /settings/local

Docs overhaul:
- MASTER.md hub + ARCH__SYSTEM/BACKENDS/PERSONA/CHANNELS/FUTURE.md
- ARCH__Intelligence_Layer.md replaced with redirect table
- CORTEX.md trimmed to vision only; README updated
- OPEN_WEBUI_API.md added to docs/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 20:53:06 -04:00
Scott Idem
bd6532e93a feat: shared nav bar on Help and Settings pages
Replaces the lone "← Back to Cortex" link with a consistent page-nav
on both pages: ← Chat | Help | Settings | Sign out

Active page is highlighted purple; others are muted gray.
Settings page gets a {{ help_href }} template var from settings.py.
Help page builds nav links from the existing cfg JS object.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 22:09:08 -04:00
Scott Idem
a94fdc869d docs: fix Gitea SSH URL to use git.dgrzone.com
cortex subdomain works incidentally but git.dgrzone.com is correct.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 21:53:54 -04:00
Scott Idem
1fefd42e19 docs: Gitea SSH port 2222 verified working
WAN port forward confirmed end-to-end. Clone URL:
ssh://git@cortex.dgrzone.com:2222/<user>/<repo>.git

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 21:52:36 -04:00
Scott Idem
0c17b4b1ab docs: overhaul TODO__Agents.md to reflect current state
Moved to completed: token expiry restart, Holly onboarding, per-user
channel config, Google OAuth, per-user Gemini key, session persistence,
persona picker, Lucide icons, favicon, Help shared base, reminders tools,
Brian onboarding.

Updated in-progress: knowledge consolidation tools (ae_journal_* done,
import script still pending). NC Talk and Google Chat notes updated for
per-user routing. Removed stale "default user only" notes.

High priority now: Ollama backend, Gitea SSH verification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 21:44:56 -04:00
Scott Idem
cec6d3e23a docs: update README for current state
- .env location: cortex/.env + cortex/.env.example (not project root)
- Webhook endpoints: per-user /webhook/nextcloud/{username} and /channels/google-chat/{username}
- Personas table: added brian/wintermute and scott/developer
- Docs table: added GOOGLE_CHAT_BOT.md, cortex/static/HELP.md
- Channels section: per-user webhook note + links to setup docs
- User management: added google-add command and channels.json note
- Removed stale Inara/Tina-only framing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 21:42:11 -04:00
Scott Idem
2d3a380d6b docs: add Tools & Modes protocol to all personas and template
Every persona now knows: direct chat has no tools; Agent mode has
the full tool suite. If asked to write a reminder/task/etc in chat mode,
tell the user to switch modes rather than silently failing.

Updated: inara, tina, donut, wintermute, developer, cleo PROTOCOLS.md
Updated: persona_template.py so all future personas get this by default

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 21:35:54 -04:00
Scott Idem
662924c6a1 fix: pass user to _run_job so get_user_gemini_key resolves correctly
NameError: name 'user' is not defined in orchestrator._run_job —
user was resolved in the endpoint but not forwarded to the background task.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 21:21:14 -04:00
Scott Idem
e5b6d58889 feat: reminders_add and reminders_list tools
- New cortex/tools/reminders.py with reminders_add, reminders_list, reminders_clear
- reminders_clear moved here from cron.py (cron still imports from same file)
- __init__.py: wired up new callables and Gemini declarations
- Inara can now add/read reminders in Agent mode via the orchestrator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 21:14:22 -04:00
Scott Idem
6b725afc3e docs: update NC Talk doc with real container name, commands, and logs section
- Use dgr_zone_nextcloud-app-1 throughout (actual container name)
- talk:bot:uninstall (not remove — wrong command in previous version)
- Added Logs section: occ log:tail + journalctl
- Bruteforce reset command now includes full docker exec form

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 13:47:43 -04:00
Scott Idem
ddf5dd6338 docs: add Google Chat setup guide, update NC Talk for per-user routing
- docs/GOOGLE_CHAT_BOT.md: new step-by-step guide covering channels.json,
  Google Cloud Console config, JWT audience, and troubleshooting
- docs/NEXTCLOUD_TALK_BOT.md: updated for per-user endpoints
  (/webhook/nextcloud/{username}), channels.json config, removed old
  server-level .env references, updated Multi-User note

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 13:24:36 -04:00
Scott Idem
93f7f44e51 feat: per-user channel config for Google Chat and Nextcloud Talk
- New endpoints: POST /channels/google-chat/{username} and /webhook/nextcloud/{username}
- Channel secrets/config live in home/{username}/channels.json (gitignored)
- auth_utils: get_user_channels() helper reads channels.json
- Both routers load persona, audience/secret, backend, timeout per user;
  set_context() wires the correct persona before building the system prompt
- Removed server-level channel settings from config.py and .env —
  no user gets a channel until they create their own channels.json
- .gitignore: home/**/channels.json added

To migrate: update Google Chat Add-on webhook URL to /channels/google-chat/{username}
and re-register NC Talk bot at /webhook/nextcloud/{username}

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 13:02:45 -04:00
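A helper like the `get_user_channels()` mentioned above could be as small as the sketch below. The signature (explicit `home_dir` and `username` arguments) and the empty-dict return for a missing or malformed file are assumptions. The commit only says the helper reads `home/{username}/channels.json` and that a user has no channels until that file exists.

```python
import json
from pathlib import Path


def get_user_channels(home_dir, username: str) -> dict:
    """Load home/{username}/channels.json, or return {} if the user has none."""
    path = Path(home_dir) / username / "channels.json"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # No file (or an unreadable one) means no channels are configured.
        return {}
    # Guard against a file whose top level is not a JSON object.
    return data if isinstance(data, dict) else {}
```

A router could then treat an empty result as "channel not enabled for this user" and reject the webhook request before any persona context is loaded.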
Scott Idem
496da58f58 chore: consolidate .env files — one .env in cortex/, one .env.example
- Removed orphaned root .env and .env.default (values already in cortex/.env,
  which is what the systemd service actually loads)
- Replaced outdated cortex/.env.example with the comprehensive .env.default content
- Also tracks: tested/persona/cleo/ (new test persona), Inara memory updates

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 12:22:49 -04:00
Scott Idem
8e20bfbea8 feat: shared Help base, Google OAuth live, new personas, cleanup
- cortex/static/HELP.md: shared Help & Reference base served to all users
- help.html: loads shared base + appends persona-specific HELP.md if present
- inara/HELP.md: cleared (content moved to shared base)
- Google OAuth: registered scott.idem@oneskyit.com; flow now working end-to-end
- .gitignore: exclude home/**/sessions/ (runtime logs)
- New personas tracked: home/holly/persona/donut/, home/scott/persona/developer/
- Removed orphans: holly/, personas/, cortex-holly.service
- CLAUDE.md: updated current state and recently completed list to 2026-03-27

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 22:55:45 -04:00
Scott Idem
3a94df1eaf fix: prevent password managers autofilling Gemini API key field
Change type="password" to type="text" — the main signal password
managers use. Also add autocomplete="off", data-lpignore, data-1p-ignore
for broader coverage across Bitwarden, 1Password, LastPass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 21:57:14 -04:00
Scott Idem
ce806e52ed fix: google-add now also sets profile.json email
The invite command reads email from profile.json, not auth.json.
google-add was only writing to auth.json so invite had no address
to send to. Now calls set_email() as well.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 21:32:48 -04:00
Scott Idem
7438031797 feat: connected accounts + Gemini API key in account settings UI
Settings page gains two new sections:
- Connected Accounts: shows linked Google email (read-only)
- Gemini API Key: paste personal key from aistudio.google.com,
  shows masked hint of saved key, remove link to revert to server key

POST /settings/gemini-key saves/clears gemini_api_key in auth.json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 21:16:37 -04:00
Scott Idem
8aec6aafcc feat: Google OAuth sign-in + per-user Gemini API key
Users with Google accounts can now sign in without a password.

Auth flow:
- GET /auth/google → Google consent page (CSRF state cookie)
- GET /auth/google/callback → exchange code, lookup user, set JWT
- auth.json gains google_sub + google_email fields
- set_password() no longer overwrites unrelated auth.json fields

Admin setup:
  python manage_passwords.py google-add <username> <email>
  # add GOOGLE_CLIENT_ID + GOOGLE_CLIENT_SECRET to .env

Per-user Gemini key:
- get_user_gemini_key() reads gemini_api_key from auth.json
- orchestrator_engine.run() accepts gemini_api_key param
- orchestrator router passes user's key, falls back to server key

login.html: "Sign in with Google" button above the password form.
manage_passwords.py list: now shows auth method columns (pw / google).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 21:01:52 -04:00
Scott Idem
62fde62653 feat: persona-specific favicon + fix favicon.ico 404
app.js updates the <link rel="icon"> to the active persona's emoji on
load (CORTEX_EMOJI is already injected server-side). /favicon.ico route
added as a fallback for login/settings/help pages that don't have
persona context.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:45:36 -04:00
Scott Idem
f8d89bc272 fix: close SSE connection cleanly on page navigation
beforeunload closes the EventSource explicitly so the browser doesn't
log "connection interrupted while page was loading". onerror handler
suppresses auto-reconnect noise if the connection temporarily drops.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:44:12 -04:00
Scott Idem
92350f7a7b feat: persist active session across page navigation with inactivity TTL
Session ID is stored in localStorage keyed to user+persona. On page load
it's silently restored if within 30 min of last activity. Timestamp
updates on every sent message. New session / delete session clears the
stored ID so the TTL logic stays consistent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:38:04 -04:00
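The inactivity rule in the commit above lives in browser code (`localStorage`), but the logic itself is language-independent. The following Python sketch models only that rule. The record shape (`session_id`, `last_activity`) and the function name are assumptions; the 30-minute window is from the commit message.

```python
import time
from typing import Optional

SESSION_TTL_SECONDS = 30 * 60  # 30 minutes of inactivity


def restore_session(stored: Optional[dict], now: Optional[float] = None) -> Optional[str]:
    """Return the stored session ID if it is still within the inactivity window.

    stored is a hypothetical record: {"session_id": str, "last_activity": epoch seconds}.
    """
    if not stored:
        return None
    now = time.time() if now is None else now
    if now - stored.get("last_activity", 0) > SESSION_TTL_SECONDS:
        # Too old: the caller should start a fresh session.
        return None
    return stored.get("session_id")
```

Refreshing `last_activity` on every sent message, as the commit describes, is what turns this into an inactivity timeout and not a fixed session lifetime.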
Scott Idem
4f09823afe feat: Lucide icons on edit/del/copy and inline edit save/cancel buttons
pencil → edit, trash-2 → del, copy → copy, check → copied feedback,
check → Save, x → Cancel. All small action buttons get inline-flex
alignment for consistent icon+label layout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:32:19 -04:00
Scott Idem
826bd6cfe3 feat: /{username} persona picker landing page
Visiting /scott (or any user root) now shows a clean card page listing
all their personas with emoji + name, each linking to /{user}/{persona}.
Previously the route was unhandled (404 or wildcard match).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:19:04 -04:00
Scott Idem
c3507f8e11 fix: help page back link preserves active persona
Pass ?persona= query param on the help link so the server knows which
persona to return to. Previously always defaulted to personas[0], causing
navigation back to the wrong persona.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:13:52 -04:00
Scott Idem
65548ebf36 feat: Lucide SVG icons throughout main UI
Replace all emoji/unicode icons with Lucide SVG icons:
- Mode select dropdown: message-circle / pencil / lock / bot
- Send button: arrow-up (chat/OTR), pencil (note), zap (agent)
- Stop button: square icon
- Header nav already had Lucide SVGs; render_icons() now called at init

Add icon_html() + render_icons() helpers; update update_mode_ui() and
open_mode_dropdown() to use innerHTML + lucide.createIcons(). CSS: .btn-icon
alignment, inline-flex on .hdr-btn / .hdr-dd-item / #send / #stop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:06:01 -04:00
119 changed files with 7400 additions and 1369 deletions

@@ -1,88 +0,0 @@
# Cortex .env reference — copy to .env and fill in values
# DO NOT commit .env — it contains secrets
# ── Agent identity ───────────────────────────────────────────────────────────
# Global display names used in distillation prompts and session logs.
# Individual persona identities live in home/{username}/persona/{name}/IDENTITY.md
AGENT_NAME=Inara
USER_NAME=Scott
# ── Home directory ────────────────────────────────────────────────────────────
# Root for all user/persona data. Layout: home/{username}/persona/{name}/
# Relative paths are resolved from the cortex/ directory.
# Default: ../home (i.e. Cortex_and_Inara_dev/home/)
# HOME_DIR=../home
# ── Session auth ─────────────────────────────────────────────────────────────
# Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
JWT_SECRET=change-me-in-dotenv
JWT_EXPIRE_DAYS=30
# ── SMTP (invite emails + future notifications) ───────────────────────────────
SMTP_SERVER=linode.oneskyit.com
SMTP_PORT=465
SMTP_USERNAME=send_mail
SMTP_PASSWORD=
SMTP_FROM_EMAIL=noreply@oneskyit.com
SMTP_FROM_NAME=Cortex
# Base URL included in invite links
CORTEX_BASE_URL=https://cortex.dgrzone.com
# ── Server ──────────────────────────────────────────────────────────────────
HOST=0.0.0.0
PORT=8000
# ── Google Chat bot ──────────────────────────────────────────────────────────
# JWT audience for verifying inbound Workspace Add-on Chat webhook requests.
# For Workspace Add-on Chat apps, the aud claim = the endpoint URL.
# Leave blank to disable verification (dev/testing only).
GOOGLE_CHAT_AUDIENCE=https://cortex.dgrzone.com/channels/google-chat
# ── Nextcloud Talk bot ───────────────────────────────────────────────────────
NEXTCLOUD_URL=https://cloud.dgrzone.com
NEXTCLOUD_TALK_BOT_SECRET=
# ── LLM backends ────────────────────────────────────────────────────────────
# Primary backend: "claude" or "gemini" (other is always fallback)
PRIMARY_BACKEND=claude
# Timeouts in seconds
TIMEOUT_CLAUDE=60
TIMEOUT_GEMINI=120
# ── Orchestrator (Gemini API — not Gemini CLI) ───────────────────────────────
# Required for /orchestrate endpoint and tool use
# Free tier key: https://aistudio.google.com/apikey
GEMINI_API_KEY=
# Model for the orchestration tool loop (not the user-facing response)
ORCHESTRATOR_MODEL=gemini-2.5-flash
# Safety cap on tool loop iterations
ORCHESTRATOR_MAX_ROUNDS=10
# ── DuckDuckGo search ────────────────────────────────────────────────────────
# Leave blank for free unauthenticated tier
# Set to your API key for higher rate limits (paid DuckDuckGo account)
DDG_API_KEY=
DDG_MAX_RESULTS=5
# ── Aether Platform API ───────────────────────────────────────────────────────
# Used by orchestrator tools: ae_journal_search, ae_journal_entry_create, ae_task_list
# Same values as agents_sync/mcp/.env — copy from there
AE_API_URL=https://dev-api.oneskyit.com
AE_API_KEY=
AE_ACCOUNT_ID=
AE_API_TIMEOUT=15
# ── Distillation schedule ────────────────────────────────────────────────────
SCHEDULER_TIMEZONE=America/New_York
AUTO_DISTILL=true
AUTO_DISTILL_SHORT=true
AUTO_DISTILL_MID=true
AUTO_DISTILL_LONG=false # manual review recommended before enabling
# Memory tier token budgets (soft caps)
MEMORY_BUDGET_SHORT=3000
MEMORY_BUDGET_MID=2000
MEMORY_BUDGET_LONG=2000

2
.gitignore vendored

@@ -9,11 +9,13 @@ __pycache__/
# Session data (runtime state, not source)
cortex/data/
home/**/session_data/
home/**/sessions/
# User credentials and tokens — never commit
home/**/auth.json
home/**/invite.json
home/**/profile.json
home/**/channels.json
# Syncthing Metadata
.stfolder/

5
.stignore Normal file

@@ -0,0 +1,5 @@
// Machine-local — never sync across hosts
.venv/
__pycache__/
*.pyc
cortex/data/

@@ -82,6 +82,7 @@ Cortex_and_Inara_dev/
docs/ ← Integration reference docs
NEXTCLOUD_TALK_BOT.md
OPEN_WEBUI_API.md ← Open WebUI API: tool calling, RAG, model management
documentation/ ← Architecture decisions and agent task list
TODO__Agents.md ← READ THIS FIRST — active task list
@@ -211,37 +212,21 @@ clearly asked for a directory to be unblocked.
---
## Current State (2026-03-20)
## Current State (2026-04-03)
Cortex is running and stable. All three primary channels are live:
| Channel | Status | Notes |
|---|---|---|
| Web UI | ✅ Live | `https://cortex.dgrzone.com` (basic auth) |
| Web UI | ✅ Live | `https://cortex.dgrzone.com` |
| Nextcloud Talk | ✅ Live | HMAC-signed webhook, async reply |
| Google Chat | ✅ Live | Workspace Add-on, `hostAppDataAction` response format |
| Local backend | ✅ Live | Open WebUI/Ollama, per-user multi-model config |
### Active Tasks
Active users: scott (inara, developer), holly (tina), brian (wintermute)
See `documentation/TODO__Agents.md` for the full list. Current priorities:
- **[High]** Ollama backend — local LLM via `scott_gaming` over WireGuard
- **[Medium]** NC Talk — complete bot registration docs (`docs/NEXTCLOUD_TALK_BOT.md`)
- **[Medium]** Knowledge consolidation — markdown → AE Journals
### Recently Completed
- ✅ Session auth — bcrypt passwords, JWT cookies, login/logout, `SessionAuthMiddleware` — 2026-03-20
- ✅ Persona onboarding — invite tokens, self-service password setup, persona creation form — 2026-03-20
- ✅ Multi-persona switcher — dropdown in UI header, `/api/personas` endpoint — 2026-03-20
- ✅ SMTP invite email — `noreply@oneskyit.com`, HTML + plain text, `manage_passwords.py invite` — 2026-03-20
- ✅ CSS routing fix — `/static/*` mount must precede wildcard `/{user}/{persona}` route — 2026-03-20
- ✅ Multi-user/multi-persona support (`home/{username}/persona/{name}/` two-level layout) — 2026-03-20
- ✅ Scratchpad, task management, and cron/scheduled job tools — 2026-03-20
- ✅ Test suite (80 tests) covering API, persona routing, tools, security — 2026-03-20
- ✅ Google Chat bot (Workspace Add-on, JWT auth, `hostAppDataAction` format) — 2026-03-20
- ✅ Orchestrator Agent mode UI + session persistence — 2026-03-18
- ✅ Memory distiller (APScheduler, short/mid/long) — 2026-03
See `documentation/TODO__Agents.md` for the active task list.
See `documentation/ROADMAP.md` for phases and what's next.
---
@@ -249,8 +234,14 @@ See `documentation/TODO__Agents.md` for the full list. Current priorities:
| File | Purpose |
|---|---|
| `documentation/MASTER.md` | **Start here** — index, current state, all doc links |
| `documentation/TODO__Agents.md` | Active task list — read before starting work |
| `documentation/ARCH__Intelligence_Layer.md` | Full architecture design |
| `~/agents_sync/projects/CORTEX.md` | High-level project vision and phases |
| `documentation/ROADMAP.md` | Phases — what's done, what's next |
| `documentation/ARCH__SYSTEM.md` | System architecture and component map |
| `documentation/ARCH__BACKENDS.md` | LLM backends, routing, per-user config |
| `documentation/ARCH__PERSONA.md` | Persona system, context tiers, memory distillation |
| `documentation/ARCH__CHANNELS.md` | Input channels — web, NC Talk, Google Chat, cron |
| `documentation/ARCH__FUTURE.md` | Planned: local orchestrator, dev agents, knowledge layer |
| `~/agents_sync/projects/CORTEX.md` | Project vision and philosophy |
| `~/agents_sync/CLAUDE.md` | Fleet coordination rules |
| `~/CLAUDE.md` | Machine identity (`scott_lpt`) |

@@ -0,0 +1,75 @@
{
"folders": [
{
"name": "cortex (service)",
"path": "cortex"
},
{
"name": "home (personas)",
"path": "home"
},
{
"name": "documentation",
"path": "documentation"
},
{
"name": "docs (integrations)",
"path": "docs"
},
{
"name": "project root",
"path": "."
}
],
"settings": {
"files.exclude": {
"**/__pycache__": true,
"**/*.pyc": true,
"cortex/.venv": true,
"cortex/data": true
},
"search.exclude": {
"**/__pycache__": true,
"cortex/.venv": true,
"cortex/data": true,
"home/**/sessions": true,
"home/**/session_data": true
},
"[python]": {
"editor.formatOnSave": false
},
"editor.rulers": [100],
"files.associations": {
"*.env": "dotenv",
"*.env.default": "dotenv"
}
},
"extensions": {
"recommendations": [
"ms-python.python",
"ms-python.vscode-pylance",
"humao.rest-client",
"tamasfe.even-better-toml"
]
},
"launch": {
"version": "0.2.0",
"configurations": [
{
"name": "Cortex (uvicorn dev)",
"type": "python",
"request": "launch",
"module": "uvicorn",
"args": [
"main:app",
"--host", "0.0.0.0",
"--port", "8000",
"--reload"
],
"cwd": "${workspaceFolder:cortex (service)}",
"envFile": "${workspaceFolder:cortex (service)}/.env",
"justMyCode": false
}
]
}
}

@@ -6,7 +6,7 @@
> *"You can't stop the signal."*
Cortex is a self-hosted multi-agent AI platform. It supports multiple users, each with their own named AI persona. Inara (Scott's persona) and Tina (Holly's persona) are the initial instances.
Cortex is a self-hosted multi-agent AI platform. It supports multiple users, each with their own named AI persona.
---
@@ -16,9 +16,7 @@ Cortex is a self-hosted multi-agent AI platform. It supports multiple users, eac
|---|---|
| `cortex/` | FastAPI service — dispatcher, routing, LLM backends, session management |
| `home/` | User and persona data (`home/{username}/persona/{name}/`) |
| `home/scott/persona/inara/` | Inara identity, memory, and context files |
| `home/holly/persona/tina/` | Tina identity, memory, and context files |
| `docs/` | Integration reference docs (NC Talk bot, etc.) |
| `docs/` | Integration reference docs (NC Talk bot, Google Chat bot) |
| `documentation/` | Architecture decisions, project plans, agent task lists |
---
@@ -69,49 +67,55 @@ http://localhost:8000 (or cortex.dgrzone.com on WireGuard)
The service starts automatically at boot via `loginctl enable-linger`.
Service file: `~/.config/systemd/user/cortex.service`
Config lives in `cortex/config.py` and a `.env` file at the project root (not tracked — see `.env.default`).
Config lives in `cortex/config.py` and `cortex/.env` (not tracked — see `cortex/.env.example`).
---
## Key Documentation
**Start here for a full picture:** [`documentation/MASTER.md`](documentation/MASTER.md)
| File | Purpose |
|---|---|
| `documentation/TODO__Agents.md` | Active task list — read first |
| `documentation/ARCH__Intelligence_Layer.md` | Intelligence layer architecture (orchestrator, dev agents, knowledge) |
| `docs/NEXTCLOUD_TALK_BOT.md` | NC Talk bot setup |
| `home/scott/persona/inara/IDENTITY.md` | Inara persona and identity |
| `home/scott/persona/inara/HELP.md` | In-app help content (rendered in UI) |
| `home/scott/persona/inara/PROTOCOLS.md` | Inara behavioral protocols |
| `~/agents_sync/projects/CORTEX.md` | High-level project vision and phases |
| `documentation/MASTER.md` | Index — current state, all doc links, quick reference |
| `documentation/ROADMAP.md` | Phases — what's done, what's next |
| `documentation/TODO__Agents.md` | Active task list |
| `documentation/ARCH__SYSTEM.md` | System architecture and component map |
| `documentation/ARCH__BACKENDS.md` | LLM backends, routing, fallback |
| `documentation/ARCH__PERSONA.md` | Persona system, context tiers, memory distillation |
| `documentation/ARCH__CHANNELS.md` | Input channels — web, NC Talk, Google Chat, cron |
| `documentation/ARCH__FUTURE.md` | Planned features — local orchestrator, dev agents, knowledge layer |
| `docs/NEXTCLOUD_TALK_BOT.md` | NC Talk bot setup and troubleshooting |
| `docs/GOOGLE_CHAT_BOT.md` | Google Chat Add-on setup |
| `docs/OPEN_WEBUI_API.md` | Open WebUI/Ollama API reference |
---
## Architecture at a Glance
```
[User / Cron / Webhook]
[Web UI / NC Talk / Google Chat / Cron / Webhooks]
Cortex Dispatcher (FastAPI, cortex/)
├─ POST /chat — direct to LLM (streaming SSE)
├─ POST /orchestrate — Gemini tool loop → Claude response
├─ POST /webhook/nextcloud — Nextcloud Talk bot
└─ POST /webhook/google — Google Chat Add-on
├─ POST /chat — direct to LLM (streaming SSE)
├─ POST /orchestrate — Gemini tool loop → Claude response
├─ POST /webhook/nextcloud/{username} — Nextcloud Talk bot (per-user)
└─ POST /channels/google-chat/{username} — Google Chat Add-on (per-user)
LLM Backend(s)
• Claude CLI — primary reasoning, coding, long-context
• Gemini CLI — secondary / cost routing
• Gemini API — orchestrator tool loop (separate from Gemini CLI)
• Ollama — offline/private (scott_gaming, future)
LLM Backends
• Claude CLI — primary, all user-facing responses
• Gemini CLI — fallback
• Gemini API — orchestrator tool loop only (not general chat)
• Local — Open WebUI/Ollama on scott_gaming (private/offline)
Persona context loaded from home/{user}/persona/{name}/
```
See `documentation/ARCH__Intelligence_Layer.md` for the orchestrator/responder and dev-agent architecture.
See `documentation/ARCH__SYSTEM.md` for the full architecture breakdown.
---
## Inara / Tina
## Personas
Each persona has its own identity, memory, and session history.
They are not tied to a specific LLM model — the name is fixed, the backend varies.
@@ -120,17 +124,23 @@ Context is loaded at request time from `home/{user}/persona/{name}/` via `cortex
| User | Persona | Description |
|---|---|---|
| scott | inara | Scott's primary AI assistant |
| scott | developer | Scott's dev-focused persona |
| holly | tina | Holly's primary AI assistant |
| brian | wintermute | Brian's primary AI assistant |
---
## Channels
| Channel | Status | Notes |
Webhook endpoints are per-user — each user configures their own secrets in `home/{username}/channels.json`.
| Channel | Status | Endpoint |
|---|---|---|
| Web UI | Live | `https://cortex.dgrzone.com` — session auth (login form + JWT cookie) |
| Nextcloud Talk | Live | HMAC-signed webhook, async reply |
| Google Chat | Live | Workspace Add-on, JWT auth |
| Nextcloud Talk | Live | `POST /webhook/nextcloud/{username}` HMAC-signed, async reply |
| Google Chat | Live | `POST /channels/google-chat/{username}` Workspace Add-on, JWT auth |
See `docs/NEXTCLOUD_TALK_BOT.md` and `docs/GOOGLE_CHAT_BOT.md` for setup instructions.
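The table above only states that the Nextcloud Talk webhook is HMAC-signed. As a rough illustration of what such a check can look like, the sketch below assumes an HMAC-SHA256 over a per-request random value concatenated with the raw request body, compared in constant time. The function names, the concatenation order, and the hex encoding are assumptions for illustration; the actual scheme should be taken from `docs/NEXTCLOUD_TALK_BOT.md` and the router code.

```python
import hashlib
import hmac


def sign_talk_payload(secret: str, random_value: str, body: str) -> str:
    """Hypothetical helper: hex HMAC-SHA256 over random_value + body."""
    message = (random_value + body).encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify_talk_payload(secret: str, random_value: str, body: str, signature: str) -> bool:
    """Return True when the supplied signature matches the expected one."""
    expected = sign_talk_payload(secret, random_value, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature.lower().encode())


example_sig = sign_talk_payload("per-user-secret", "abc123", '{"message": "hi"}')
```

Because secrets are per-user (`home/{username}/channels.json`), a handler would presumably look up the secret for the `{username}` in the path before calling a check like this.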
---
@@ -142,7 +152,10 @@ cd cortex
# Create a user directory and send an invite email
.venv/bin/python manage_passwords.py invite <username> <email>
# List users with password and email status
# Register a Google account for sign-in (run after user completes onboarding)
.venv/bin/python manage_passwords.py google-add <username> <email>
# List users with password, Google, and email status
.venv/bin/python manage_passwords.py list
# Set/check a password directly
@@ -152,6 +165,8 @@ cd cortex
New users receive a link to `/setup/{token}` where they set their own password and create their first persona. Invite tokens expire in 72 hours and are one-time-use.
To enable a channel for a user, create `home/{username}/channels.json` — see the relevant doc in `docs/`.
---
## Testing


@@ -1,15 +0,0 @@
[Unit]
Description=Cortex / Holly LLM Gateway
After=network.target
[Service]
Type=simple
User=scott
WorkingDirectory=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex
EnvironmentFile=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex/.env.holly
ExecStart=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex/.venv/bin/uvicorn main:app --host 0.0.0.0 --port 8001
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target


@@ -1,33 +1,106 @@
# Auth is handled by the claude CLI (claude setup-token) — no API key needed here.
# ANTHROPIC_API_KEY=only_needed_if_switching_to_sdk
# Cortex .env reference — copy to .env and fill in values
# DO NOT commit .env — it contains secrets
# Path to the inara/ identity directory — relative to cortex/ or absolute
INARA_DIR=../inara
# ── Agent identity ───────────────────────────────────────────────────────────
# Global display names used in distillation prompts and session logs.
# Individual persona identities live in home/{username}/persona/{name}/IDENTITY.md
AGENT_NAME=Inara
USER_NAME=Scott
# Path for persistent JSON session files
SESSIONS_DIR=./data/sessions
# ── Home directory ────────────────────────────────────────────────────────────
# Root for all user/persona data. Layout: home/{username}/persona/{name}/
# Relative paths are resolved from the cortex/ directory.
# Default: ../home (i.e. Cortex_and_Inara_dev/home/)
# HOME_DIR=../home
# LLM defaults
DEFAULT_MODEL=claude-sonnet-4-6
DEFAULT_TIER=2
# ── Google OAuth — "Sign in with Google" ────────────────────────────────────
# Create credentials at console.cloud.google.com → APIs & Services → Credentials
# Application type: Web Application
# Authorised redirect URI: https://cortex.dgrzone.com/auth/google/callback
# Pre-register users: cd cortex && .venv/bin/python manage_passwords.py google-add <user> <email>
# Per-user Gemini key: add "gemini_api_key": "AIza..." to home/{username}/auth.json
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
# Session rolling window — number of messages to keep (user + assistant pairs)
# 40 = 20 turns
MAX_HISTORY_MESSAGES=40
# ── Session auth ─────────────────────────────────────────────────────────────
# Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
JWT_SECRET=change-me-in-dotenv
JWT_EXPIRE_DAYS=30
# Per-backend timeouts (seconds)
# Gemini is generous — it frequently takes 30-60s under load
# Local models may need time to load into VRAM before first response
# ── SMTP (invite emails + future notifications) ───────────────────────────────
SMTP_SERVER=linode.oneskyit.com
SMTP_PORT=465
SMTP_USERNAME=send_mail
SMTP_PASSWORD=
SMTP_FROM_EMAIL=noreply@oneskyit.com
SMTP_FROM_NAME=Cortex
# Base URL included in invite links
CORTEX_BASE_URL=https://cortex.dgrzone.com
# ── Server ──────────────────────────────────────────────────────────────────
HOST=0.0.0.0
PORT=8000
# ── Google Chat bot ──────────────────────────────────────────────────────────
# JWT audience for verifying inbound Workspace Add-on Chat webhook requests.
# For Workspace Add-on Chat apps, the aud claim = the endpoint URL.
# Leave blank to disable verification (dev/testing only).
GOOGLE_CHAT_AUDIENCE=https://cortex.dgrzone.com/channels/google-chat
# ── Nextcloud Talk bot ───────────────────────────────────────────────────────
NEXTCLOUD_URL=https://cloud.dgrzone.com
NEXTCLOUD_TALK_BOT_SECRET=
# ── LLM backends ────────────────────────────────────────────────────────────
# Primary backend: "claude", "gemini", or "local" (switchable at runtime via UI)
PRIMARY_BACKEND=claude
# Timeouts in seconds
TIMEOUT_CLAUDE=60
TIMEOUT_GEMINI=120
TIMEOUT_LOCAL=300
TIMEOUT_LOCAL=300 # local models may need time to load
# Google Chat — must respond within 30s or Chat shows an error to the user
GOOGLE_CHAT_TIMEOUT=25
# Backend pinned for Google Chat (claude recommended — more reliable within 25s)
GOOGLE_CHAT_BACKEND=claude
# TODO: add GOOGLE_CHAT_TOKEN for request verification once endpoint is public
# ── Local model (Open WebUI / Ollama — OpenAI-compatible API) ────────────────
# Leave LOCAL_API_URL blank to disable. When set, "local" appears as a backend option.
# API key: Open WebUI → Settings → Account → API Keys
# Model: workspace alias or full Ollama model name
LOCAL_API_URL=http://192.168.32.19:3000
LOCAL_API_KEY=
LOCAL_MODEL=test-agent-simple
# Server
PORT=8000
HOST=0.0.0.0
# ── Orchestrator (Gemini API — not Gemini CLI) ───────────────────────────────
# Required for /orchestrate endpoint and tool use
# Free tier key: https://aistudio.google.com/apikey
GEMINI_API_KEY=
# Model for the orchestration tool loop (not the user-facing response)
ORCHESTRATOR_MODEL=gemini-2.5-flash
# Safety cap on tool loop iterations
ORCHESTRATOR_MAX_ROUNDS=10
# ── DuckDuckGo search ────────────────────────────────────────────────────────
# Leave blank for free unauthenticated tier
# Set to your API key for higher rate limits (paid DuckDuckGo account)
DDG_API_KEY=
DDG_MAX_RESULTS=5
# ── Aether Platform API ───────────────────────────────────────────────────────
# Used by orchestrator tools: ae_journal_search, ae_journal_entry_create, ae_task_list
# Same values as agents_sync/mcp/.env — copy from there
AE_API_URL=https://dev-api.oneskyit.com
AE_API_KEY=
AE_ACCOUNT_ID=
AE_API_TIMEOUT=15
# ── Distillation schedule ────────────────────────────────────────────────────
SCHEDULER_TIMEZONE=America/New_York
AUTO_DISTILL=true
AUTO_DISTILL_SHORT=true
AUTO_DISTILL_MID=true
AUTO_DISTILL_LONG=false # manual review recommended before enabling
# Memory tier token budgets (soft caps)
MEMORY_BUDGET_SHORT=3000
MEMORY_BUDGET_MID=2000
MEMORY_BUDGET_LONG=2000


@@ -19,8 +19,8 @@ from auth_utils import COOKIE_NAME, decode_token
# Paths that don't require a session cookie
_PUBLIC = {"/login", "/logout", "/health"}
# Path prefixes that are always public (setup flow + webhooks)
_PUBLIC_PREFIXES = ("/setup/", "/channels/", "/webhook/")
# Path prefixes that are always public (setup flow + webhooks + Google OAuth)
_PUBLIC_PREFIXES = ("/setup/", "/channels/", "/webhook/", "/auth/google")
class SessionAuthMiddleware(BaseHTTPMiddleware):
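The middleware body is not part of this hunk, so the following is only a sketch of how the two constants above could be combined into a single public-path check. The helper name `is_public` is hypothetical; the constants are copied from the diff.

```python
# Sketch only: mirrors the _PUBLIC / _PUBLIC_PREFIXES constants from the diff.
_PUBLIC = {"/login", "/logout", "/health"}
_PUBLIC_PREFIXES = ("/setup/", "/channels/", "/webhook/", "/auth/google")


def is_public(path: str) -> bool:
    """Return True when a request path should skip the session-cookie check."""
    # str.startswith accepts a tuple and matches if any one prefix matches.
    return path in _PUBLIC or path.startswith(_PUBLIC_PREFIXES)
```

One detail worth noting: `"/auth/google"` has no trailing slash, so under a plain prefix match any path beginning with that string (for example `/auth/google-debug`) would also be treated as public.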


@@ -29,33 +29,92 @@ ALGORITHM = "HS256"
# ---------------------------------------------------------------------------
# Password helpers
# auth.json helpers — read/write without clobbering unrelated fields
# ---------------------------------------------------------------------------
def _auth_path(username: str) -> Path:
return settings.home_root() / username / "auth.json"
def _read_auth(username: str) -> dict:
path = _auth_path(username)
if not path.exists():
return {}
try:
return json.loads(path.read_text())
except Exception:
return {}
def _write_auth(username: str, data: dict) -> None:
path = _auth_path(username)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(data, indent=2) + "\n")
# ---------------------------------------------------------------------------
# Password helpers
# ---------------------------------------------------------------------------
def set_password(username: str, password: str) -> None:
"""Hash and store a password for a user. Creates auth.json if needed."""
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode()
_auth_path(username).write_text(json.dumps({"password_hash": hashed}) + "\n")
"""Hash and store a password. Preserves any existing fields in auth.json."""
data = _read_auth(username)
data["password_hash"] = bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode()
_write_auth(username, data)
logger.info("password set for user: %s", username)
def check_credentials(username: str, password: str) -> bool:
"""Return True if username+password are valid, False otherwise."""
path = _auth_path(username)
if not path.exists():
return False
try:
data = json.loads(path.read_text())
stored = data.get("password_hash", "").encode()
stored = _read_auth(username).get("password_hash", "").encode()
if not stored:
return False
return bcrypt.checkpw(password.encode(), stored)
except Exception:
return False
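The change to `set_password` above replaces a whole-file overwrite with a read-modify-write, so fields such as `google_email` or `gemini_api_key` survive a password change. A minimal standalone sketch of that pattern follows; it takes an explicit path instead of using `settings.home_root()`, and the names `read_auth` and `update_auth` are illustrative, not the project's API.

```python
import json
import tempfile
from pathlib import Path


def read_auth(path: Path) -> dict:
    """Return the parsed file, or {} if it is missing or not valid JSON."""
    try:
        return json.loads(path.read_text())
    except (OSError, ValueError):
        return {}


def update_auth(path: Path, **fields) -> dict:
    """Merge fields into the existing JSON object and write it back."""
    data = read_auth(path)
    data.update(fields)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2) + "\n")
    return data


with tempfile.TemporaryDirectory() as tmp:
    auth_file = Path(tmp) / "scott" / "auth.json"
    update_auth(auth_file, google_email="scott@example.com")
    update_auth(auth_file, password_hash="<bcrypt hash placeholder>")
    result = read_auth(auth_file)
```

After both calls the file holds both keys, which is the behavior the old single-key `write_text` call could not provide.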
# ---------------------------------------------------------------------------
# Google OAuth helpers
# ---------------------------------------------------------------------------
def find_user_by_google(sub: str, email: str) -> str | None:
"""
Scan all users for one whose auth.json matches the given Google sub or email.
Sub match takes priority (stable); email match is a fallback for first sign-in.
Returns the username, or None if no match.
"""
root = settings.home_root()
if not root.exists():
return None
for user_dir in sorted(root.iterdir()):
if not user_dir.is_dir():
continue
data = _read_auth(user_dir.name)
if not data:
continue
if sub and data.get("google_sub") == sub:
return user_dir.name
if email and data.get("google_email", "").lower() == email.lower():
return user_dir.name
return None
def link_google(username: str, sub: str, email: str) -> None:
"""Store / update Google sub and email in a user's auth.json."""
data = _read_auth(username)
data["google_sub"] = sub
data["google_email"] = email
_write_auth(username, data)
logger.info("Google account linked for user: %s (%s)", username, email)
def get_user_gemini_key(username: str) -> str | None:
"""Return the user's personal Gemini API key, or None to use the server key."""
return _read_auth(username).get("gemini_api_key") or None
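The docstring of `find_user_by_google` says a `sub` match takes priority over an email match. As written, the single loop returns the first directory (in sorted order) that matches either field, so an email match on an earlier user could be returned before a `sub` match on a later one. If strict priority is wanted, a two-pass lookup is one option. The sketch below works on an in-memory mapping rather than the filesystem, and `find_user` is a hypothetical name.

```python
from typing import Optional


def find_user(records: dict, sub: str, email: str) -> Optional[str]:
    """records maps username -> parsed auth.json contents."""
    # Pass 1: the stable Google subject ID wins over any email match.
    if sub:
        for username in sorted(records):
            if records[username].get("google_sub") == sub:
                return username
    # Pass 2: email fallback, for accounts pre-registered before first sign-in.
    if email:
        wanted = email.lower()
        for username in sorted(records):
            stored = records[username].get("google_email") or ""
            if stored.lower() == wanted:
                return username
    return None


records = {
    "alice": {"google_email": "shared@example.com"},
    "bob": {"google_sub": "1234567890", "google_email": "old@example.com"},
}
```

With this data, a sign-in carrying `sub="1234567890"` and `email="shared@example.com"` resolves to `bob` under the two-pass version, whereas a single pass in sorted order would stop at `alice`.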
# ---------------------------------------------------------------------------
# JWT helpers
# ---------------------------------------------------------------------------
@@ -136,3 +195,22 @@ def consume_invite(username: str) -> None:
path.write_text(json.dumps(data) + "\n")
except Exception:
pass
# ---------------------------------------------------------------------------
# Per-user channel config
# ---------------------------------------------------------------------------
def _channels_path(username: str) -> Path:
return settings.home_root() / username / "channels.json"
def get_user_channels(username: str) -> dict:
"""Return the parsed channels.json for a user, or {} if not found."""
path = _channels_path(username)
if not path.exists():
return {}
try:
return json.loads(path.read_text())
except Exception:
return {}


@@ -5,6 +5,12 @@ from pydantic_settings import BaseSettings, SettingsConfigDict
class Settings(BaseSettings):
anthropic_api_key: str | None = None # not used — claude CLI handles auth
# Google OAuth — "Sign in with Google" for all users
# Create credentials at console.cloud.google.com → APIs & Services → Credentials
# Add https://<your-domain>/auth/google/callback as an authorised redirect URI
google_client_id: str | None = None
google_client_secret: str | None = None
# Orchestrator (Gemini API — separate from Gemini CLI)
# Get a key at: https://aistudio.google.com/apikey (free tier is sufficient)
gemini_api_key: str | None = None
@@ -34,26 +40,17 @@ class Settings(BaseSettings):
max_history_messages: int = 40 # rolling window — 20 turns (user + assistant)
primary_backend: str = "claude" # "claude" or "gemini" — other is always fallback
# Local model backend — OpenAI-compatible API (Open WebUI / Ollama)
# Set LOCAL_API_URL in .env to enable; leave blank to disable
local_api_url: str = "" # e.g. http://192.168.32.19:3000
local_api_key: str = "" # sk-... from Open WebUI → Settings → Account → API Keys
local_model: str = "" # workspace or model name, e.g. test-agent-simple
# Per-backend timeouts in seconds
timeout_claude: int = 60
timeout_gemini: int = 120 # frequently slow under load
timeout_local: int = 300 # local models may need to load first
# Google Chat
# JWT audience (aud) claim to verify on inbound webhook requests.
# Google Chat sets aud = the Google Cloud project number (e.g. "741112865538").
# Set to "" to disable verification (dev/testing only).
google_chat_audience: str = ""
# Google Chat must receive a response within 30s or shows an error to the user
google_chat_timeout: int = 25
# Backend forced for Google Chat — Claude is more reliable within the 25s deadline
google_chat_backend: str = "claude"
# Nextcloud Talk bot
nextcloud_url: str = "https://cloud.dgrzone.com"
nextcloud_talk_bot_secret: str = "" # set in .env
nextcloud_talk_timeout: int = 55
# Auto-distillation schedule — override in .env
# AUTO_DISTILL=false disables entirely
scheduler_timezone: str = "America/New_York" # IANA tz — override in .env if needed
@@ -62,6 +59,26 @@ class Settings(BaseSettings):
auto_distill_mid: bool = True # weekly Sunday at 03:30 — LLM summarizes short → mid
auto_distill_long: bool = False # monthly 1st at 04:00 — off by default (manual review recommended)
# Which backend to use for distillation LLM calls.
# "" = use primary_backend (default); "local" = use local model (saves API credits).
# "long" stays on default (claude/gemini) for best quality.
distill_backend_mid: str = ""
distill_backend_long: str = ""
# Model registry: default backend type per role when user registry has no entry.
# Values: "claude_cli" | "gemini_cli" | "gemini_api" (builtin IDs)
# Override in .env: ROLE_CHAT=claude_cli ROLE_DISTILL=gemini_api etc.
role_chat: str = "claude_cli"
role_orchestrator: str = "gemini_api"
role_distill: str = "claude_cli"
role_coder: str = "claude_cli"
role_research: str = "gemini_api"
# Comma-separated list of standard roles shown in the model settings UI.
# Add custom roles here to extend the UI without code changes.
# Example: DEFINED_ROLES=chat,orchestrator,distill,coder,research,medical
defined_roles: str = "chat,orchestrator,distill,coder,research"
# Memory tier token budgets — soft caps used during distillation
# Override in .env: MEMORY_BUDGET_LONG=4000 etc.
memory_budget_long: int = 2000
@@ -87,6 +104,14 @@ class Settings(BaseSettings):
model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")
def get_defined_roles(self) -> list[str]:
"""Return the ordered list of standard roles from the defined_roles setting."""
return [r.strip() for r in self.defined_roles.split(",") if r.strip()]
def get_role_default(self, role: str) -> str:
"""Return the .env default backend type for a role (e.g. 'claude_cli')."""
return getattr(self, f"role_{role.replace('-', '_')}", "claude_cli")
def home_root(self) -> Path:
"""Resolve home_dir relative to this file's location if not absolute."""
if self.home_dir.is_absolute():
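To make the behavior of `get_defined_roles` and `get_role_default` concrete, here is a small standalone sketch. It uses a plain dataclass instead of `pydantic_settings.BaseSettings` so it runs without dependencies; the class name `RoleSettings` is illustrative, while the field names, defaults, and method bodies follow the diff.

```python
from dataclasses import dataclass


@dataclass
class RoleSettings:
    # Field names and defaults follow the Settings class in the diff.
    role_chat: str = "claude_cli"
    role_orchestrator: str = "gemini_api"
    defined_roles: str = "chat,orchestrator,distill,coder,research"

    def get_defined_roles(self) -> list:
        # Whitespace around entries and empty entries are dropped.
        return [r.strip() for r in self.defined_roles.split(",") if r.strip()]

    def get_role_default(self, role: str) -> str:
        # Roles without a role_<name> attribute fall back to claude_cli.
        return getattr(self, f"role_{role.replace('-', '_')}", "claude_cli")


s = RoleSettings(defined_roles="chat, orchestrator,,medical ")
```

A custom role such as `medical` shows up in the role list but has no `role_medical` field, so its default backend is `claude_cli` unless the user registry assigns one.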


@@ -10,16 +10,20 @@ Job schema:
"id": "c_abc123",
"label": "Human-readable name",
"schedule": "daily:09:00", # see parse_schedule() for all formats
"type": "remind" | "note",
"payload": "Text to write when the job fires",
"type": "remind" | "note" | "message" | "brief",
"payload": "Text or prompt when the job fires",
"channel": null | "nextcloud" | "google_chat", # for message/brief types
"enabled": true,
"created_at": "ISO 8601",
"last_run": null | "ISO 8601"
}
Job types:
remind → appends to inara/REMINDERS.md (auto-loaded into Inara's context)
note → appends to inara/SCRATCH.md (read on demand via scratch_read)
remind → appends to REMINDERS.md (auto-loaded into context at tier 2+)
note → appends to SCRATCH.md (read on demand via scratch_read)
message → sends payload as-is to NC Talk notification_room
brief → runs LLM with payload as the prompt, sends response to NC Talk
(good for morning briefings, summaries, proactive check-ins)
"""
import logging
@@ -150,6 +154,40 @@ async def run_job(job: dict) -> None:
p.write_text(existing.rstrip() + "\n" + section)
logger.info("cron [note] fired: %s", label)
elif job_type == "message":
# Send payload text directly to the user's notification channel
from notification import notify
username = job.get("user") or "scott"
channel = job.get("channel") or None
await notify(username, payload, channel=channel)
logger.info("cron [message] sent: %s", label)
elif job_type == "brief":
# Run LLM with payload as the prompt, send response to notification channel.
# Great for morning briefings, reminders, proactive check-ins.
from context_loader import load_context
from llm_client import complete
from notification import notify
from persona import set_context
from config import settings as _s
username = job.get("user") or _s.user_name.lower()
persona_nm = job.get("persona") or _s.agent_name.lower()
channel = job.get("channel") or None
set_context(username, persona_nm)
system_prompt = load_context(2) # tier 2: identity + memory + user profile
try:
response_text, backend = await complete(
system_prompt=system_prompt,
messages=[{"role": "user", "content": payload}],
role="chat",
)
await notify(username, response_text, channel=channel)
logger.info("cron [brief] sent via %s: %s", backend, label)
except Exception as e:
logger.error("cron [brief] LLM error for %s: %s", label, e)
else:
logger.warning("cron: unknown type %r (job %s)", job_type, job.get("id"))
return
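For reference, a `brief` job that satisfies the schema in the module docstring might look like the dictionary below. The `user` and `persona` keys are not listed in the schema but are read by `run_job` via `job.get(...)`. The `validate_job` helper and the allowed-value sets are hypothetical additions for illustration; the diff does not show any validation step.

```python
ALLOWED_TYPES = {"remind", "note", "message", "brief"}
ALLOWED_CHANNELS = {None, "nextcloud", "google_chat"}


def validate_job(job: dict) -> list:
    """Return a list of problems; an empty list means the job looks well-formed."""
    problems = []
    if job.get("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type: {job.get('type')!r}")
    if job.get("channel") not in ALLOWED_CHANNELS:
        problems.append(f"unknown channel: {job.get('channel')!r}")
    if not job.get("payload"):
        problems.append("payload is empty")
    return problems


morning_brief = {
    "id": "c_example1",
    "label": "Morning briefing",
    "schedule": "daily:09:00",
    "type": "brief",
    "payload": "Summarize today's reminders in three bullet points.",
    "channel": "nextcloud",
    "user": "scott",       # read by run_job; defaults apply when omitted
    "persona": "inara",    # read by run_job for the brief type
    "enabled": True,
    "created_at": "2026-04-08T09:00:00",
    "last_run": None,
}
```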


@@ -31,22 +31,59 @@ async def cleanup() -> None:
_active_pgroups.clear()
# Map from registry model type → dispatch function key
_TYPE_TO_BACKEND = {
"claude_cli": "claude",
"gemini_cli": "gemini",
"gemini_api": "gemini", # gemini_api falls back to CLI in this context
"local_openai": "local",
}
# Explicit UI toggle values (kept for backward compat)
_EXPLICIT_BACKENDS = ("claude", "gemini", "local")
_FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude"}
async def complete(
system_prompt: str,
messages: list[dict],
model: str | None = None,
role: str = "chat",
max_tokens: int = 2048,
) -> tuple[str, str]:
"""Returns (response_text, actual_backend_used)."""
if model in ("claude", "gemini"):
"""
Returns (response_text, actual_backend_used).
model: explicit backend override ("claude" | "gemini" | "local") from UI toggle.
None = resolve via model registry for the given role.
role: registry role used when model is None (default: "chat").
"""
import model_registry as _reg
from persona import _user
username = _user.get()
resolved_cfg: dict | None = None
if model in _EXPLICIT_BACKENDS:
# User explicitly selected a backend in the UI
if model == "local":
resolved_cfg = _reg.get_best_local_model(username, role)
if not resolved_cfg:
raise RuntimeError("No local model configured — add one at /settings/models")
primary = model
else:
primary = settings.primary_backend
# Role-based routing via model registry
resolved = _reg.get_model_for_role(username, role)
if resolved:
resolved_cfg = resolved
primary = _TYPE_TO_BACKEND.get(resolved["type"], "claude")
else:
primary = settings.primary_backend
fallback = "gemini" if primary == "claude" else "claude"
fallback = _FALLBACK.get(primary, "claude")
try:
response = await _dispatch(primary, system_prompt, messages, model)
response = await _dispatch(primary, system_prompt, messages, resolved_cfg)
return response, primary
except Exception as e:
err_str = str(e)
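The backend selection in `complete()` can be read as a three-step decision: an explicit UI override wins, then the registry entry for the role, then `settings.primary_backend`. The sketch below isolates that decision as a pure function. The mapping tables are copied from the diff; `pick_backends` and its parameters are hypothetical names, and the registry lookup is reduced to an already-resolved type string.

```python
from typing import Optional, Tuple

TYPE_TO_BACKEND = {
    "claude_cli": "claude",
    "gemini_cli": "gemini",
    "gemini_api": "gemini",
    "local_openai": "local",
}
EXPLICIT_BACKENDS = ("claude", "gemini", "local")
FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude"}


def pick_backends(
    model: Optional[str],
    registry_type: Optional[str],
    primary_backend: str = "claude",
) -> Tuple[str, str]:
    """Return (primary, fallback) backend keys."""
    if model in EXPLICIT_BACKENDS:
        # User explicitly selected a backend in the UI.
        primary = model
    elif registry_type is not None:
        # Role-based routing: map the registry model type to a dispatch key.
        primary = TYPE_TO_BACKEND.get(registry_type, "claude")
    else:
        primary = primary_backend
    return primary, FALLBACK.get(primary, "claude")
```

Note that a `model` value outside the three explicit keys (for example a full model name) is not treated as an override here, matching the `if model in _EXPLICIT_BACKENDS` test in the diff.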
@@ -61,11 +98,13 @@ async def _dispatch(
backend: str,
system_prompt: str,
messages: list[dict],
model: str | None,
model_cfg: dict | None,
) -> str:
if backend == "gemini":
return await _gemini(system_prompt, messages)
return await _claude(system_prompt, messages, model)
if backend == "local":
return await _local(system_prompt, messages, model_cfg)
return await _claude(system_prompt, messages, model_cfg)
def _fresh_claude_token() -> str | None:
@@ -85,14 +124,16 @@ def _fresh_claude_token() -> str | None:
return None
async def _claude(system_prompt: str, messages: list[dict], model: str | None) -> str:
async def _claude(system_prompt: str, messages: list[dict], model_cfg: dict | None) -> str:
model_name = (model_cfg or {}).get("model_name")
cmd = [
"claude", "--print",
"--no-session-persistence",
"--output-format", "text",
]
if model and model not in ("claude", "gemini"):
cmd.extend(["--model", model])
# Only pass --model if it's a real model name (not a backend type string)
if model_name and model_name not in ("claude", "gemini", "local", ""):
cmd.extend(["--model", model_name])
if system_prompt:
cmd.extend(["--system-prompt", system_prompt])
cmd.append(_build_conversation(messages))
@@ -108,6 +149,60 @@ async def _claude(system_prompt: str, messages: list[dict], model: str | None) -
return await _run(cmd, timeout=settings.timeout_claude, env=env)
async def _local(system_prompt: str, messages: list[dict], model_cfg: dict | None = None) -> str:
"""OpenAI-compatible backend — Open WebUI / Ollama.
model_cfg is pre-resolved by complete() via model_registry.
Falls back to registry lookup if not provided.
"""
import httpx
cfg = model_cfg
if not cfg:
# Fallback: resolve directly from registry
import model_registry as _reg
from persona import _user
cfg = _reg.get_best_local_model(_user.get())
if not cfg:
raise RuntimeError("No local model configured — add one at /settings/models")
api_url = cfg["api_url"]
api_key = cfg["api_key"]
model = cfg["model_name"]
if not api_url:
raise RuntimeError("local_api_url not configured — set LOCAL_API_URL in .env or add a host at /settings/local")
if not model:
raise RuntimeError("local_model not configured — add a model at /settings/local")
host_type = cfg.get("host_type", "openwebui")
# "openwebui" uses Open WebUI/Ollama path layout; "openai" uses standard OpenAI layout
chat_path = "/chat/completions" if host_type == "openai" else "/api/chat/completions"
logger.info("local backend (%s): %s @ %s", host_type, model, api_url)
msgs: list[dict] = []
if system_prompt:
msgs.append({"role": "system", "content": system_prompt})
msgs.extend(messages)
url = api_url.rstrip("/") + chat_path
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
payload = {"model": model, "messages": msgs}
async with httpx.AsyncClient(timeout=settings.timeout_local) as client:
resp = await client.post(url, json=payload, headers=headers)
resp.raise_for_status()
data = resp.json()
text = data["choices"][0]["message"]["content"]
if not text or not text.strip():
raise RuntimeError("Local model returned an empty response")
return text.strip()
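The request that `_local()` sends differs between host types only in the path appended to `api_url`. The sketch below factors the URL, header, and payload construction into a pure function so the two layouts can be compared without making a network call. `build_chat_request` is a hypothetical helper, not part of the diff; the path strings and payload shape follow `_local()`.

```python
def build_chat_request(
    api_url: str,
    api_key: str,
    model: str,
    messages: list,
    system_prompt: str = "",
    host_type: str = "openwebui",
):
    """Return (url, headers, payload) for an OpenAI-style chat completion call."""
    # "openai" hosts expose /chat/completions; Open WebUI nests it under /api.
    chat_path = "/chat/completions" if host_type == "openai" else "/api/chat/completions"
    url = api_url.rstrip("/") + chat_path

    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}

    msgs = []
    if system_prompt:
        msgs.append({"role": "system", "content": system_prompt})
    msgs.extend(messages)
    return url, headers, {"model": model, "messages": msgs}


user_turn = [{"role": "user", "content": "hello"}]
openrouter = build_chat_request(
    "https://openrouter.ai/api/v1", "sk-example", "some/model", user_turn, host_type="openai"
)
openwebui = build_chat_request(
    "http://192.168.32.19:3000/", "", "test-agent-simple", user_turn, system_prompt="Be brief."
)
```

For the OpenRouter case the base URL already ends in `/api/v1`, which is why the `openai` layout appends only `/chat/completions`.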
async def _gemini(system_prompt: str, messages: list[dict]) -> str:
# Gemini CLI spawns MCP child processes that keep stdout pipes open after responding.
# start_new_session=True puts the whole tree in its own process group so


@@ -9,7 +9,7 @@ logging.basicConfig(level=logging.INFO, format="%(levelname)s:%(name)s: %(messag
from config import settings
from auth_middleware import SessionAuthMiddleware
from routers import chat, google_chat, nextcloud_talk, files, distill, auth, orchestrator
from routers import ui, onboarding, settings, help
from routers import ui, onboarding, settings, help, auth_google, local_llm
@asynccontextmanager
@@ -39,11 +39,15 @@ app.include_router(orchestrator.router)
# ui.router has a wildcard /{username}/{persona} that would otherwise catch /static/style.css etc.
app.mount("/static", StaticFiles(directory="static"), name="static")
# Google OAuth — must be before ui.router (wildcard /{user}/{persona} would swallow it)
app.include_router(auth_google.router)
# Onboarding (invite tokens + persona creation — before ui.router)
app.include_router(onboarding.router)
# Account settings
app.include_router(settings.router)
app.include_router(local_llm.router)
# Help page
app.include_router(help.router)


@@ -6,9 +6,10 @@ Usage:
python manage_passwords.py set <username> # prompt for password
python manage_passwords.py set <username> <pass> # set directly (avoid in shell history)
python manage_passwords.py check <username> # test a password interactively
python manage_passwords.py list # show users, passwords, and emails
python manage_passwords.py list # show users, auth methods, and emails
python manage_passwords.py invite <username> [email] # generate + optionally email invite link
python manage_passwords.py email <username> <email> # store/update an email address
python manage_passwords.py google-add <username> <email> # register a user for Google sign-in
"""
import json
@@ -18,7 +19,7 @@ import getpass
# Add cortex/ to path so we can import config and auth_utils
sys.path.insert(0, str(__import__('pathlib').Path(__file__).parent))
from auth_utils import set_password, check_credentials, _auth_path, create_invite
from auth_utils import set_password, check_credentials, _auth_path, create_invite, link_google, _read_auth
from persona import list_users
from config import settings
@@ -96,10 +97,14 @@ def cmd_list(_args):
if not users:
print(" No users found in home/")
return
print(f" {'USER':<18} {'PW':<6} {'GOOGLE':<36} {'EMAIL'}")
print(f" {'-'*18} {'-'*6} {'-'*36} {'-'*30}")
for user in users:
has_pw = "✓ pw" if _auth_path(user).exists() else "✗ pw"
email = get_email(user) or ""
print(f" {user:<20} {has_pw} {email}")
auth = _read_auth(user)
has_pw = "✓" if auth.get("password_hash") else "✗"
google = auth.get("google_email") or ""
email = get_email(user) or ""
print(f" {user:<18} {has_pw:<6} {google:<36} {email}")
def cmd_email(args):
@@ -149,6 +154,22 @@ def cmd_invite(args):
print("Tip: python manage_passwords.py invite <username> <email> to email it next time.\n")
def cmd_google_add(args):
if len(args) < 2:
print("Usage: manage_passwords.py google-add <username> <google_email>")
sys.exit(1)
username, email = args[0], args[1].lower().strip()
# Ensure the user directory exists
(settings.home_root() / username).mkdir(parents=True, exist_ok=True)
# Store in auth.json (google_sub filled in on first sign-in) + profile.json (for invites)
link_google(username, sub="", email=email)
set_email(username, email)
print(f"Google sign-in registered for {username!r}: {email}")
print(f"They can now sign in at {settings.cortex_base_url}/login using that Google account.")
if __name__ == "__main__":
if len(sys.argv) < 2:
print(__doc__)
@@ -167,6 +188,8 @@ if __name__ == "__main__":
cmd_email(rest)
elif command == "invite":
cmd_invite(rest)
elif command == "google-add":
cmd_google_add(rest)
else:
print(f"Unknown command: {command}")
print(__doc__)


@@ -77,10 +77,16 @@ def distill_short(username: str | None = None, persona: str | None = None) -> di
async def distill_mid(username: str | None = None, persona: str | None = None) -> dict:
"""
Ask the LLM to summarize MEMORY_SHORT.md → MEMORY_MID.md.
Uses DISTILL_BACKEND_MID if set (e.g. "local"), otherwise primary_backend.
"""
from llm_client import complete
from persona import set_context
inara_dir = _persona_path(username, persona)
u = username or settings.user_name.lower()
p = persona or settings.agent_name.lower()
set_context(u, p)
inara_dir = _persona_path(u, p)
short_content = _read(inara_dir / "MEMORY_SHORT.md")
if not short_content.strip() or "Not yet populated" in short_content:
@@ -100,6 +106,7 @@ async def distill_mid(username: str | None = None, persona: str | None = None) -
response_text, backend = await complete(
system_prompt=system_prompt,
messages=[{"role": "user", "content": short_content}],
role="distill",
)
now = datetime.now().strftime("%Y-%m-%d %H:%M")
@@ -112,6 +119,7 @@ async def distill_mid(username: str | None = None, persona: str | None = None) -
logger.info("distill_mid: wrote %d chars via %s", len(header) + len(response_text), backend)
return {
"username": u,
"backend": backend,
"chars_written": len(header) + len(response_text),
"budget_tokens": budget_tokens,
@@ -121,10 +129,16 @@ async def distill_mid(username: str | None = None, persona: str | None = None) -
async def distill_long(username: str | None = None, persona: str | None = None) -> dict:
"""
Ask the LLM to integrate MEMORY_MID.md into MEMORY_LONG.md.
Uses DISTILL_BACKEND_LONG if set, otherwise primary_backend.
"""
from llm_client import complete
from persona import set_context
inara_dir = _persona_path(username, persona)
u = username or settings.user_name.lower()
p = persona or settings.agent_name.lower()
set_context(u, p)
inara_dir = _persona_path(u, p)
long_content = _read(inara_dir / "MEMORY_LONG.md")
mid_content = _read(inara_dir / "MEMORY_MID.md")
@@ -149,6 +163,7 @@ async def distill_long(username: str | None = None, persona: str | None = None)
response_text, backend = await complete(
system_prompt=system_prompt,
messages=[{"role": "user", "content": user_content}],
role="distill",
)
# Ensure the file has the right header if the LLM dropped it
@@ -165,6 +180,7 @@ async def distill_long(username: str | None = None, persona: str | None = None)
logger.info("distill_long: wrote %d chars via %s", len(response_text), backend)
return {
"username": u,
"backend": backend,
"chars_written": len(response_text),
"budget_tokens": budget_tokens,

cortex/model_registry.py (new file, 460 lines added)

@@ -0,0 +1,460 @@
"""
Per-user unified model registry.
Stored in: home/{user}/model_registry.json
Schema:
{
"version": 1,
"hosts": [{"id", "label", "api_url", "api_key",
"host_type": "openwebui" | "openai"}, ...],
#
# host_type controls the API path layout:
# "openwebui" (default) — Open WebUI / Ollama:
# chat: POST {url}/api/chat/completions
# models: GET {url}/api/models
# "openai" — OpenRouter, LiteLLM, Anthropic-compatible, etc.:
# chat: POST {url}/chat/completions
# models: GET {url}/models
# Set api_url to the base path that ends just before /chat/completions,
# e.g. https://openrouter.ai/api/v1 for OpenRouter.
"models": [
{
"id": str, # unique within this registry
"type": str, # "local_openai" | "claude_cli" | "gemini_cli" | "gemini_api"
"label": str, # human-readable display name
"model_name": str, # model identifier sent to the API
"host_id": str | null, # only for local_openai — references hosts[].id
"context_k": int, # context window in thousands of tokens (informational)
"tags": [str], # user-defined capability tags
},
],
"roles": {
"<role>": {
"primary": "<model_id>" | null,
"backup_1": "<model_id>" | null,
"backup_2": "<model_id>" | null,
"backup_3": "<model_id>" | null,
"backup_4": "<model_id>" | null,
},
},
}
Built-in model IDs (always resolvable, no registry entry required):
"claude_cli" — Claude CLI subprocess (~/.claude/.credentials.json)
"gemini_cli" — Gemini CLI subprocess
"gemini_api" — Gemini API (google-genai SDK; used by orchestrator engine, not llm_client)
Standard roles are defined by settings.defined_roles (default: chat,orchestrator,distill,coder,research).
Additional custom roles can be added freely to roles{}.
Resolution for get_model_for_role(username, role):
1. User registry: roles[role].primary → backup_1 → backup_2 → backup_3 → backup_4
2. .env default: ROLE_<ROLE>=<builtin_id> (e.g. ROLE_CHAT=claude_cli)
3. Hardcoded last-resort defaults per role
"""
import json
import logging
import secrets
from pathlib import Path
from config import settings
logger = logging.getLogger(__name__)
# ── Built-in model definitions ────────────────────────────────────────────────
# These IDs are always resolvable without a registry entry.
def _builtins() -> dict[str, dict]:
"""Return built-in model definitions (lazy so settings are resolved at call time)."""
return {
"claude_cli": {
"id": "claude_cli",
"type": "claude_cli",
"label": f"Claude (CLI) — {settings.default_model}",
"model_name": settings.default_model,
"context_k": 200,
"tags": ["chat", "persona", "creative"],
},
"gemini_cli": {
"id": "gemini_cli",
"type": "gemini_cli",
"label": "Gemini (CLI)",
"model_name": "",
"context_k": 1000,
"tags": ["chat", "research", "long_context"],
},
"gemini_api": {
"id": "gemini_api",
"type": "gemini_api",
"label": f"Gemini API — {settings.orchestrator_model}",
"model_name": settings.orchestrator_model,
"context_k": 1000,
"tags": ["orchestrator", "research", "long_context", "tools"],
},
}
# Hardcoded last-resort defaults per role (used only if .env is also unset)
_ROLE_LAST_RESORT: dict[str, str] = {
"chat": "claude_cli",
"orchestrator": "gemini_api",
"distill": "claude_cli",
"coder": "claude_cli",
"research": "gemini_api",
}
PRIORITY_KEYS = ["primary", "backup_1", "backup_2", "backup_3", "backup_4"]
# ── Storage ───────────────────────────────────────────────────────────────────
def _registry_path(username: str) -> Path:
return settings.home_root() / username / "model_registry.json"
def _local_llm_path(username: str) -> Path:
return settings.home_root() / username / "local_llm.json"
def _empty() -> dict:
return {"version": 1, "hosts": [], "models": [], "roles": {}}
def _load(username: str) -> dict:
path = _registry_path(username)
if path.exists():
try:
data = json.loads(path.read_text())
if isinstance(data, dict) and "version" in data:
return data
except (json.JSONDecodeError, OSError):
logger.warning("model_registry.json for %s is unreadable — starting fresh", username)
return _empty()
# No registry yet — try migrating from local_llm.json
legacy = _local_llm_path(username)
if legacy.exists():
data = _migrate_from_local_llm(username, legacy)
_save(username, data)
logger.info("Migrated local_llm.json → model_registry.json for %s", username)
return data
return _empty()
def _save(username: str, data: dict) -> None:
_registry_path(username).write_text(json.dumps(data, indent=2))
# ── Migration ─────────────────────────────────────────────────────────────────
def _migrate_from_local_llm(username: str, path: Path) -> dict:
"""Convert local_llm.json (hosts/models/active_model_id) → model_registry format."""
try:
old = json.loads(path.read_text())
except Exception:
return _empty()
data = _empty()
# Handle v0 flat format
if "hosts" not in old:
api_url = old.get("api_url") or settings.local_api_url
api_key = old.get("api_key") or settings.local_api_key
model_name = old.get("model") or settings.local_model
if not api_url:
return data
host_id = secrets.token_hex(4)
old = {
"hosts": [{"id": host_id, "label": "Local Model Server", "api_url": api_url, "api_key": api_key}],
"models": [{"id": secrets.token_hex(4), "host_id": host_id, "label": model_name, "model_name": model_name}] if model_name else [],
"active_model_id": None,
}
if old["models"]:
old["active_model_id"] = old["models"][0]["id"]
data["hosts"] = old.get("hosts", [])
for m in old.get("models", []):
data["models"].append({
"id": m["id"],
"type": "local_openai",
"label": m.get("label") or m.get("model_name", ""),
"model_name": m.get("model_name", ""),
"host_id": m.get("host_id"),
"context_k": 0,
"tags": [],
})
# Build initial role assignments
active_id = old.get("active_model_id")
distill_type = settings.distill_backend_mid or None
roles: dict[str, dict] = {}
if active_id and any(m["id"] == active_id for m in data["models"]):
roles["chat"] = {"primary": active_id}
if distill_type == "local" and active_id:
roles["distill"] = {"primary": active_id}
data["roles"] = roles
return data
# ── Model resolution ──────────────────────────────────────────────────────────
def _resolve_model(registry: dict, model_id: str) -> dict | None:
"""Resolve a model_id to its full config dict, or None if not found."""
builtins = _builtins()
# Built-in IDs take priority over user-defined entries with the same ID
if model_id in builtins:
return dict(builtins[model_id])
model = next((m for m in registry.get("models", []) if m["id"] == model_id), None)
if not model:
return None
if model.get("type") == "local_openai":
host_id = model.get("host_id")
host = next((h for h in registry.get("hosts", []) if h["id"] == host_id), None)
if not host:
logger.warning("model %s references missing host_id %s", model_id, host_id)
return None
return {
**model,
"api_url": host.get("api_url", ""),
"api_key": host.get("api_key", ""),
"host_type": host.get("host_type", "openwebui"),
}
return dict(model)
def get_model_for_role(username: str, role: str) -> dict | None:
"""
Return the resolved model config for the given role.
Resolution order:
1. User registry: roles[role].primary → backup_1 → ... → backup_4
2. .env: ROLE_<ROLE> = builtin model ID
3. Hardcoded last-resort default per role
4. claude_cli (absolute fallback)
"""
registry = _load(username)
role_cfg = registry.get("roles", {}).get(role, {})
for key in PRIORITY_KEYS:
model_id = role_cfg.get(key)
if not model_id:
continue
resolved = _resolve_model(registry, model_id)
if resolved:
return resolved
logger.debug("role %s.%s = %s but model not found", role, key, model_id)
# .env default
env_type = settings.get_role_default(role)
builtins = _builtins()
if env_type and env_type in builtins:
return dict(builtins[env_type])
# Hardcoded last resort
fallback_id = _ROLE_LAST_RESORT.get(role, "claude_cli")
return dict(builtins.get(fallback_id, builtins["claude_cli"]))
def get_best_local_model(username: str, role: str = "chat") -> dict | None:
"""
Return the best available local_openai model for the given role.
Used when the user explicitly selects "local" backend in the UI.
Tries the role's priority chain first, then any configured local model.
"""
registry = _load(username)
role_cfg = registry.get("roles", {}).get(role, {})
for key in PRIORITY_KEYS:
model_id = role_cfg.get(key)
if not model_id:
continue
resolved = _resolve_model(registry, model_id)
if resolved and resolved.get("type") == "local_openai":
return resolved
# Fall back to first configured local model
for model in registry.get("models", []):
if model.get("type") == "local_openai":
resolved = _resolve_model(registry, model["id"])
if resolved:
return resolved
return None
# ── Read API (for UI and callers) ─────────────────────────────────────────────
def get_registry(username: str) -> dict:
"""Return the user's stored registry as-is (built-in models are not injected; see _builtins())."""
return _load(username)
def get_all_models(username: str) -> list[dict]:
"""Return all user-defined models (resolved — hosts merged in)."""
registry = _load(username)
out = []
for m in registry.get("models", []):
resolved = _resolve_model(registry, m["id"])
if resolved:
out.append(resolved)
return out
def get_defined_roles(username: str) -> dict[str, dict]:
"""Return the roles section of the registry, filling gaps with empty dicts."""
registry = _load(username)
roles = registry.get("roles", {})
result = {}
for role in settings.get_defined_roles():
result[role] = roles.get(role, {})
return result
# ── Write API (CRUD) ──────────────────────────────────────────────────────────
def save_host(username: str, host_id: str | None,
label: str, api_url: str, api_key: str,
host_type: str = "openwebui") -> str:
"""Create or update a host. Returns the host ID.
host_type: "openwebui" (default) or "openai" (OpenRouter, LiteLLM, etc.)
"""
data = _load(username)
host_type = host_type if host_type in ("openwebui", "openai") else "openwebui"
if host_id:
for h in data["hosts"]:
if h["id"] == host_id:
h["label"] = label.strip()
h["api_url"] = api_url.strip()
h["host_type"] = host_type
if api_key.strip():
h["api_key"] = api_key.strip()
_save(username, data)
return host_id
host_id = None # not found — create new
host_id = secrets.token_hex(4)
data["hosts"].append({
"id": host_id,
"label": label.strip(),
"api_url": api_url.strip(),
"api_key": api_key.strip(),
"host_type": host_type,
})
_save(username, data)
return host_id
def remove_host(username: str, host_id: str) -> bool:
"""Remove a host and all models that reference it. Returns True if found."""
data = _load(username)
before = len(data["hosts"])
# Collect the affected model IDs before filtering; afterwards they are no longer in data["models"]
removed_ids = {m["id"] for m in data["models"] if m.get("host_id") == host_id}
data["hosts"] = [h for h in data["hosts"] if h["id"] != host_id]
data["models"] = [m for m in data["models"] if m.get("host_id") != host_id]
# Clear any role assignments that pointed to removed models
for role_cfg in data.get("roles", {}).values():
for key in PRIORITY_KEYS:
if role_cfg.get(key) in removed_ids:
role_cfg[key] = None
_save(username, data)
return len(data["hosts"]) < before
def save_model(username: str, model_id: str | None, host_id: str,
label: str, model_name: str, context_k: int = 0,
tags: list[str] | None = None) -> str:
"""Create or update a model entry. Returns the model ID."""
data = _load(username)
tags = tags or []
if model_id:
for m in data["models"]:
if m["id"] == model_id:
m["host_id"] = host_id
m["label"] = label.strip() or model_name.strip()
m["model_name"] = model_name.strip()
m["context_k"] = context_k
m["tags"] = tags
_save(username, data)
return model_id
model_id = None
model_id = secrets.token_hex(4)
data["models"].append({
"id": model_id,
"type": "local_openai",
"label": label.strip() or model_name.strip(),
"model_name": model_name.strip(),
"host_id": host_id,
"context_k": context_k,
"tags": tags,
})
_save(username, data)
return model_id
def remove_model(username: str, model_id: str) -> bool:
"""Remove a model and clear any role assignments pointing to it."""
data = _load(username)
before = len(data["models"])
data["models"] = [m for m in data["models"] if m["id"] != model_id]
for role_cfg in data.get("roles", {}).values():
for key in PRIORITY_KEYS:
if role_cfg.get(key) == model_id:
role_cfg[key] = None
_save(username, data)
return len(data["models"]) < before
def set_role(username: str, role: str, priority: str, model_id: str | None) -> bool:
"""
Assign a model to a role priority slot.
priority must be one of: primary, backup_1, backup_2, backup_3, backup_4
model_id None clears the slot.
model_id "claude_cli" / "gemini_cli" / "gemini_api" are valid built-in IDs.
Returns False if model_id is set but not found.
"""
if priority not in PRIORITY_KEYS:
return False
data = _load(username)
if model_id and model_id not in _builtins():
if not any(m["id"] == model_id for m in data["models"]):
return False
roles = data.setdefault("roles", {})
if role not in roles:
roles[role] = {}
roles[role][priority] = model_id or None
_save(username, data)
return True
def fetch_models_from_host(api_url: str, api_key: str) -> list[str]:
"""Synchronously fetch the model list from a host using the Open WebUI path layout (GET {api_url}/api/models)."""
import httpx
url = api_url.rstrip("/") + "/api/models"
headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
resp = httpx.get(url, headers=headers, timeout=10)
resp.raise_for_status()
data = resp.json()
models = data.get("data", [])
return sorted(m.get("id", m.get("name", "")) for m in models if m.get("id") or m.get("name"))
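
The resolution order described in the module docstring (role priority chain, then the .env default, then a hardcoded last resort) can be illustrated with a small in-memory sketch. This is not the module's code: `resolve_role`, its parameters, and the demo data are hypothetical, and the host merging that `_resolve_model` performs for `local_openai` models is left out.

```python
from __future__ import annotations

PRIORITY_KEYS = ["primary", "backup_1", "backup_2", "backup_3", "backup_4"]


def resolve_role(
    registry: dict,
    role: str,
    builtins: dict[str, dict],
    env_default: str | None = None,   # hypothetical stand-in for the ROLE_<ROLE> setting
    last_resort: str = "claude_cli",  # hypothetical stand-in for the hardcoded default
) -> dict:
    """Walk the priority chain for a role, then fall back to the defaults."""
    known = {m["id"]: m for m in registry.get("models", [])}
    known.update(builtins)  # built-in IDs win over user entries with the same ID

    role_cfg = registry.get("roles", {}).get(role, {})
    for key in PRIORITY_KEYS:
        model_id = role_cfg.get(key)
        if model_id and model_id in known:
            return dict(known[model_id])  # first slot that still resolves

    if env_default and env_default in builtins:
        return dict(builtins[env_default])
    return dict(builtins[last_resort])


# Illustrative data only
demo_builtins = {
    "claude_cli": {"id": "claude_cli", "type": "claude_cli"},
    "gemini_api": {"id": "gemini_api", "type": "gemini_api"},
}
demo_registry = {
    "models": [{"id": "abc123", "type": "local_openai", "model_name": "llama3"}],
    "roles": {"chat": {"primary": "deleted-id", "backup_1": "abc123"}},
}
```

With this data, the `chat` role skips the dangling `primary` slot and lands on `backup_1`, while a role with no registry entry falls through to the defaults.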

106
cortex/notification.py Normal file

@@ -0,0 +1,106 @@
"""
Outbound notification helpers — send messages to user channels proactively.
Channel config lives in home/{user}/channels.json.
Each channel that supports proactive notifications needs a notification_channel
set to its key name (e.g. "nextcloud", "google_chat") in the user's channels.json:
{
"notification_channel": "nextcloud",
"nextcloud": {
"url": "https://cloud.example.com",
"bot_secret": "...",
"notification_room": "<room-token>",
...
}
}
If notification_channel is absent, defaults to "nextcloud" if configured.
If notification_room (for NCT) is absent, notifications are silently skipped.
"""
import hashlib
import hmac
import json
import logging
import secrets
import httpx
logger = logging.getLogger(__name__)
async def _send_nct_message(url: str, secret: str, room: str, message: str) -> None:
"""Post a message to a Nextcloud Talk room as the bot."""
endpoint = f"{url}/ocs/v2.php/apps/spreed/api/v1/bot/{room}/message"
random_str = secrets.token_hex(32)
sig = hmac.new(
secret.encode(),
(random_str + message).encode("utf-8"),
hashlib.sha256,
).hexdigest()
body = json.dumps({"message": message}, ensure_ascii=False).encode("utf-8")
try:
async with httpx.AsyncClient() as client:
resp = await client.post(
endpoint,
content=body,
headers={
"Content-Type": "application/json",
"OCS-APIRequest": "true",
"X-Nextcloud-Talk-Bot-Random": random_str,
"X-Nextcloud-Talk-Bot-Signature": sig,
},
timeout=15,
)
if resp.status_code not in (200, 201):
logger.warning("notify NCT %s → HTTP %d: %s", room, resp.status_code, resp.text[:200])
else:
logger.info("notify NCT → %s (%d chars)", room, len(message))
except Exception as e:
logger.error("notify NCT error: %s", e)
async def _notify_nct(nct: dict, message: str, username: str) -> None:
room = nct.get("notification_room", "").strip()
url = nct.get("url", "").rstrip("/")
secret = nct.get("bot_secret", "")
if not room:
logger.debug("notify: NCT notification_room not set for %s — skipping", username)
return
if not url or not secret:
logger.warning("notify: NCT config incomplete for %s (missing url or secret)", username)
return
await _send_nct_message(url, secret, room, message)
async def notify(username: str, message: str, channel: str | None = None) -> None:
"""Send a notification to the user's preferred outbound channel.
Channel resolution order:
1. `channel` parameter if provided
2. `notification_channel` key in channels.json
3. "nextcloud" if configured
4. Silent no-op
To configure: set `notification_channel` in home/{user}/channels.json.
For NCT: also set `notification_room` in the nextcloud section.
"""
from auth_utils import get_user_channels
channels = get_user_channels(username)
target = channel or channels.get("notification_channel", "").strip()
if not target:
# Auto-detect: use nextcloud if configured
if "nextcloud" in channels:
target = "nextcloud"
else:
return
if target == "nextcloud":
nct = channels.get("nextcloud")
if not nct:
logger.debug("notify: nextcloud not configured for %s", username)
return
await _notify_nct(nct, message, username)
else:
logger.debug("notify: channel %r not yet supported for outbound (user %s)", target, username)
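
The signing step in `_send_nct_message` computes an HMAC-SHA256 over the random string concatenated with the message text and sends both values as headers. A minimal sketch of that computation, paired with a matching check, is shown here. The function names `sign_message` and `verify_message` are hypothetical, and the verification side is an assumption about how a receiver could recompute the digest, not something taken from the diff.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets


def sign_message(secret: str, message: str, random_str: str | None = None) -> tuple[str, str]:
    """Return (random_str, hex signature) for the message text."""
    random_str = random_str or secrets.token_hex(32)
    digest = hmac.new(
        secret.encode(),
        (random_str + message).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    return random_str, digest


def verify_message(secret: str, message: str, random_str: str, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    _, expected = sign_message(secret, message, random_str)
    return hmac.compare_digest(expected, signature)
```

Note that the digest covers the message string itself, not the JSON-encoded request body, which matches what the helper above does.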

View File

@@ -56,6 +56,7 @@ async def run(
system_prompt: str = "",
session_messages: list[dict] | None = None,
respond_with_claude: bool = True,
gemini_api_key: str | None = None,
) -> OrchestratorResult:
"""
Run the full orchestration loop for a task.
@@ -66,17 +67,19 @@ async def run(
session_messages: Prior conversation history for session continuity
respond_with_claude: If False, return Gemini's summary as the response (useful for
background/cron tasks where a polished reply isn't needed)
+gemini_api_key: Per-user Gemini API key (falls back to GEMINI_API_KEY in .env)
Returns:
OrchestratorResult with response, tool call log, backend used, and Gemini summary
"""
-if not settings.gemini_api_key:
+api_key = gemini_api_key or settings.gemini_api_key
+if not api_key:
raise RuntimeError(
-"GEMINI_API_KEY not set — orchestrator requires Gemini API. "
-"Get a free key at https://aistudio.google.com/apikey and add it to .env"
+"No Gemini API key available — set GEMINI_API_KEY in .env or add a personal key "
+"via: manage_passwords.py gemini-key <username> <key>"
)
-client = genai.Client(api_key=settings.gemini_api_key)
+client = genai.Client(api_key=api_key)
# Seed Gemini with the task — include recent session context if available
task_with_context = _build_task_prompt(task, session_messages)

View File

@@ -135,6 +135,27 @@ def _protocols(display_name: str) -> str:
---
## Tools & Modes
Cortex has two chat modes. Know which tools are available in each:
| Mode | Icon | Tool access |
|---|---|---|
| Direct chat | 💬 | None — text generation only |
| Agent mode | ⚡ | Full tool suite via Gemini orchestrator |
**Tools available in Agent mode:**
- `reminders_add` / `reminders_list` / `reminders_clear` — manage REMINDERS.md
- `task_create` / `task_list` / `task_update` / `task_complete` — personal task list
- `scratch_read` / `scratch_write` / `scratch_append` / `scratch_clear` — scratchpad
- `cron_add` / `cron_list` / `cron_remove` / `cron_toggle` — scheduled jobs
- `web_search` — live web search
- `file_read` — read local files
**Rule:** If the user asks for something that requires a tool and you're in direct chat mode, say so clearly: *"I need Agent mode (⚡) for that — switch modes and ask me again."* Do not attempt workarounds or pretend the action was taken.
---
## Memory
- Long-term memory lives in MEMORY_LONG.md (auto-distilled monthly).

View File

@@ -16,5 +16,8 @@ bcrypt>=4.0.0
PyJWT>=2.8.0
python-multipart>=0.0.9 # required by FastAPI for Form() data
# Async HTTP client — used for local OpenAI-compatible backend (Open WebUI / Ollama)
httpx>=0.27.0
# anthropic SDK not needed — using claude CLI subprocess for auth
# anthropic>=0.40.0

View File

@@ -13,6 +13,7 @@ import logging
from datetime import datetime, timezone
from pathlib import Path
from fastapi import APIRouter
from config import settings
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/auth")
@@ -71,9 +72,39 @@ def _gemini_status() -> dict:
return {"ok": False, "error": str(e), "warning": True, "authenticated": False}
async def _local_status(username: str = "scott") -> dict:
"""Check reachability of the user's configured local model host."""
import model_registry
cfg = model_registry.get_best_local_model(username)
if not cfg:
return {"configured": False}
api_url = cfg.get("api_url", "")
if not api_url:
return {"configured": False}
try:
import httpx
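# Use the models path that matches the host's API layout (host_type comes from the registry)
models_path = "/models" if cfg.get("host_type") == "openai" else "/api/models"
url = api_url.rstrip("/") + models_path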
headers = {}
api_key = cfg.get("api_key", "")
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
async with httpx.AsyncClient(timeout=5) as client:
resp = await client.get(url, headers=headers)
reachable = resp.status_code < 400
return {
"configured": True,
"reachable": reachable,
"model": cfg.get("model_name", ""),
"label": cfg.get("label", ""),
}
except Exception as e:
return {"configured": True, "reachable": False, "error": str(e), "model": cfg.get("model_name", "")}
@router.get("/status")
async def auth_status() -> dict:
return {
"claude": _claude_status(),
"gemini": _gemini_status(),
"local": await _local_status(),
}
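
The registry schema distinguishes two path layouts by `host_type` ("openwebui" and "openai"). One way to keep that mapping in a single place is a small lookup helper, sketched here. `endpoint_url` and `_PATHS` are hypothetical names, and the localhost URL in the usage note is only an illustration; the paths themselves are the ones listed in the `model_registry.py` docstring.

```python
# Path layout per host_type, as listed in the model_registry.py docstring
_PATHS = {
    "openwebui": {"chat": "/api/chat/completions", "models": "/api/models"},
    "openai": {"chat": "/chat/completions", "models": "/models"},
}


def endpoint_url(api_url: str, host_type: str, kind: str) -> str:
    """Build the chat or models URL for a host; unknown types use the openwebui layout."""
    paths = _PATHS.get(host_type, _PATHS["openwebui"])
    return api_url.rstrip("/") + paths[kind]
```

For example, `endpoint_url("https://openrouter.ai/api/v1", "openai", "models")` yields the OpenRouter models URL, and `endpoint_url("http://localhost:3000/", "openwebui", "chat")` yields the Open WebUI chat path.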

View File

@@ -0,0 +1,205 @@
"""
Google OAuth 2.0 sign-in.
Flow:
1. GET /auth/google → redirect to Google's consent page
2. GET /auth/google/callback → exchange code, look up user, set JWT cookie
Users must be pre-registered by Scott before they can sign in:
cd cortex && .venv/bin/python manage_passwords.py google-add <username> <email>
Routes are public (added to _PUBLIC_PREFIXES in auth_middleware.py).
"""
import json
import logging
import secrets
import urllib.parse
import urllib.request
from fastapi import APIRouter, Request
from fastapi.responses import HTMLResponse, RedirectResponse, Response
from auth_utils import COOKIE_NAME, create_token, find_user_by_google, link_google
from config import settings
from persona import list_user_personas
logger = logging.getLogger(__name__)
router = APIRouter()
_GOOGLE_AUTH_URL = "https://accounts.google.com/o/oauth2/v2/auth"
_GOOGLE_TOKEN_URL = "https://oauth2.googleapis.com/token"
_GOOGLE_USERINFO = "https://openidconnect.googleapis.com/v1/userinfo"
_STATE_COOKIE = "oauth_state"
_STATE_MAX_AGE = 600 # 10 minutes — plenty of time to complete the flow
@router.get("/auth/google", include_in_schema=False)
async def google_login():
if not settings.google_client_id:
return HTMLResponse("Google sign-in is not configured on this server.", status_code=503)
state = secrets.token_urlsafe(16)
params = urllib.parse.urlencode({
"client_id": settings.google_client_id,
"redirect_uri": f"{settings.cortex_base_url}/auth/google/callback",
"response_type": "code",
"scope": "openid email profile",
"state": state,
"access_type": "online",
"prompt": "select_account",
})
resp = RedirectResponse(f"{_GOOGLE_AUTH_URL}?{params}", status_code=302)
resp.set_cookie(_STATE_COOKIE, state, max_age=_STATE_MAX_AGE, httponly=True, samesite="lax")
return resp
@router.get("/auth/google/callback", include_in_schema=False)
async def google_callback(
request: Request,
code: str = "",
state: str = "",
error: str = "",
):
if error:
return _error_page(f"Google sign-in was cancelled or denied: {error}")
if not code:
return _error_page("No authorisation code returned by Google.")
# CSRF check — state must match what we stored in the cookie
stored_state = request.cookies.get(_STATE_COOKIE)
if not stored_state or stored_state != state:
return _error_page("State mismatch — please try signing in again.")
# Exchange authorisation code for tokens
try:
token_data = _exchange_code(code)
except Exception as e:
logger.error("Google token exchange failed: %s", e)
return _error_page("Could not complete sign-in with Google. Please try again.")
access_token = token_data.get("access_token")
if not access_token:
return _error_page("No access token returned by Google.")
# Fetch the user's profile
try:
userinfo = _get_userinfo(access_token)
except Exception as e:
logger.error("Google userinfo fetch failed: %s", e)
return _error_page("Could not retrieve your Google profile. Please try again.")
google_sub = userinfo.get("sub", "")
google_email = userinfo.get("email", "")
if not google_sub or not google_email:
return _error_page("Your Google account didn't return a usable email address.")
# Match to a Cortex user
username = find_user_by_google(google_sub, google_email)
if not username:
logger.warning("Google sign-in rejected: no account for %s (%s)", google_sub, google_email)
return _error_page(
f"Your Google account (<strong>{google_email}</strong>) isn't registered with Cortex.<br><br>"
"Contact Scott to get access."
)
# Persist the stable sub so future lookups use it (not just email)
link_google(username, google_sub, google_email)
personas = list_user_personas(username)
if not personas:
return _error_page("No personas are configured for your account yet. Contact Scott.")
logger.info("Google sign-in: %s (%s)", username, google_email)
resp = RedirectResponse(f"/{username}/{personas[0]}", status_code=302)
_set_session_cookie(resp, username)
resp.delete_cookie(_STATE_COOKIE)
return resp
# ---------------------------------------------------------------------------
# Private helpers
# ---------------------------------------------------------------------------
def _exchange_code(code: str) -> dict:
body = urllib.parse.urlencode({
"code": code,
"client_id": settings.google_client_id,
"client_secret": settings.google_client_secret,
"redirect_uri": f"{settings.cortex_base_url}/auth/google/callback",
"grant_type": "authorization_code",
}).encode()
req = urllib.request.Request(
_GOOGLE_TOKEN_URL,
data=body,
headers={"Content-Type": "application/x-www-form-urlencoded"},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def _get_userinfo(access_token: str) -> dict:
req = urllib.request.Request(
_GOOGLE_USERINFO,
headers={"Authorization": f"Bearer {access_token}"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read())
def _set_session_cookie(response: Response, username: str) -> None:
token = create_token(username)
response.set_cookie(
COOKIE_NAME,
token,
max_age=settings.jwt_expire_days * 86400,
httponly=True,
samesite="lax",
secure=False, # set True if terminating TLS at the app layer (not behind a proxy)
)
def _error_page(message: str) -> HTMLResponse:
html = f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Cortex — Sign In Failed</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
<style>
*, *::before, *::after {{ box-sizing: border-box; margin: 0; padding: 0; }}
body {{
min-height: 100vh; display: flex; align-items: center; justify-content: center;
background: #0f1117; font-family: 'Inter', system-ui; font-weight: 450;
-webkit-font-smoothing: antialiased; color: #e2e8f0;
}}
.card {{
background: #1a1d27; border: 1px solid #2d3148; border-radius: 12px;
padding: 2.5rem 2rem; width: 100%; max-width: 420px; text-align: center;
}}
h1 {{ font-size: 1.25rem; font-weight: 700; color: #f87171; margin-bottom: 1rem; }}
p {{ font-size: 0.9rem; color: #94a3b8; margin-bottom: 1.75rem; line-height: 1.65; }}
a {{
display: inline-block; padding: 0.6rem 1.5rem;
background: #7c3aed; border-radius: 6px; color: #fff;
text-decoration: none; font-size: 0.9rem; font-weight: 600;
transition: background 0.15s;
}}
a:hover {{ background: #6d28d9; }}
</style>
</head>
<body>
<div class="card">
<h1>Sign In Failed</h1>
<p>{message}</p>
<a href="/login">← Back to Sign In</a>
</div>
</body>
</html>"""
return HTMLResponse(html, status_code=403)
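
The CSRF protection in this flow rests on the `state` value: it is generated at login, stored in a cookie, sent to Google, and compared on the callback. The sketch below isolates that round trip without FastAPI. `build_login_redirect` and `state_matches` are hypothetical helpers, the client ID and redirect URI in the usage note are placeholders, and the constant-time comparison is a substitution of mine; the handler above uses a plain `!=` check.

```python
from __future__ import annotations

import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

AUTH_URL = "https://accounts.google.com/o/oauth2/v2/auth"


def build_login_redirect(client_id: str, redirect_uri: str) -> tuple[str, str]:
    """Return (redirect URL, state); the caller stores state in a short-lived cookie."""
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": "openid email profile",
        "state": state,
    })
    return f"{AUTH_URL}?{query}", state


def state_matches(stored: str | None, returned: str | None) -> bool:
    """True only if both values are present and identical."""
    if not stored or not returned:
        return False
    # Encode to bytes so non-ASCII input cannot raise inside compare_digest
    return hmac.compare_digest(stored.encode(), returned.encode())
```

A caller would do something like `url, state = build_login_redirect("demo-client", "https://cortex.example/auth/google/callback")`, set the cookie to `state`, and later pass the cookie value and the `state` query parameter to `state_matches`.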

View File

@@ -1,6 +1,7 @@
import asyncio
import json
-from fastapi import APIRouter, HTTPException, Query
+import jwt
+from fastapi import APIRouter, HTTPException, Query, Request
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from context_loader import load_context
@@ -9,12 +10,28 @@ from session_logger import log_turn
from session_store import load as load_session, save as save_session, list_all, generate_session_id, delete as delete_session, rename as rename_session
from config import settings
from persona import set_context, validate as validate_persona
from auth_utils import COOKIE_NAME, decode_token
import model_registry
import event_bus
router = APIRouter()
def _backend_label(backend: str, username: str) -> str:
"""Human-readable label for the model that handled a request."""
if backend == "claude":
return "Claude"
if backend == "gemini":
return "Gemini"
if backend == "local":
cfg = model_registry.get_best_local_model(username)
if cfg:
return cfg.get("label") or cfg.get("model_name") or "Local"
return "Local"
return backend.title()
class ChatRequest(BaseModel):
message: str
session_id: str | None = None
@@ -29,7 +46,7 @@ class ChatRequest(BaseModel):
class BackendRequest(BaseModel):
-primary: str # "claude" or "gemini"
+primary: str # "claude", "gemini", or "local"
class NoteRequest(BaseModel):
@@ -102,6 +119,7 @@ async def _stream_chat(req: ChatRequest):
"response": response_text,
"session_id": session_id,
"backend": actual_backend,
"backend_label": _backend_label(actual_backend, user),
"fallback_used": actual_backend != requested,
}
yield f"data: {json.dumps(payload)}\n\n"
@@ -130,19 +148,45 @@ async def chat(req: ChatRequest) -> StreamingResponse:
)
+_BACKEND_CYCLE = ("claude", "gemini", "local")
+_BACKEND_FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude"}
+def _local_model_info(request: Request) -> dict | None:
+"""Return the best local model {label, model_name} for the session user, or None."""
+try:
+token = request.cookies.get(COOKIE_NAME)
+username = decode_token(token) if token else None
+if not username:
+return None
+cfg = model_registry.get_best_local_model(username, "chat")
+if cfg:
+return {"label": cfg.get("label", ""), "model_name": cfg.get("model_name", "")}
+except (jwt.InvalidTokenError, Exception):
+pass
+return None
@router.get("/backend")
-async def get_backend() -> dict:
-other = "gemini" if settings.primary_backend == "claude" else "claude"
-return {"primary": settings.primary_backend, "fallback": other}
+async def get_backend(request: Request) -> dict:
+p = settings.primary_backend
+return {
+"primary": p,
+"fallback": _BACKEND_FALLBACK.get(p, "claude"),
+"local_model": _local_model_info(request),
+}
@router.post("/backend")
-async def set_backend(req: BackendRequest) -> dict:
-if req.primary not in ("claude", "gemini"):
-raise HTTPException(status_code=400, detail="primary must be 'claude' or 'gemini'")
+async def set_backend(req: BackendRequest, request: Request) -> dict:
+if req.primary not in _BACKEND_CYCLE:
+raise HTTPException(status_code=400, detail="primary must be 'claude', 'gemini', or 'local'")
settings.primary_backend = req.primary
-other = "gemini" if req.primary == "claude" else "claude"
-return {"primary": settings.primary_backend, "fallback": other}
+return {
+"primary": req.primary,
+"fallback": _BACKEND_FALLBACK[req.primary],
+"local_model": _local_model_info(request),
+}
def _set_ctx(user: str, persona: str) -> None:

View File

@@ -1,7 +1,8 @@
"""
-Read/write the Inara identity markdown files.
+Read/write Inara identity markdown files, and search past session logs.
Only whitelisted filenames are accessible — no path traversal possible.
"""
+import re
from fastapi import APIRouter, HTTPException, Query
from pydantic import BaseModel
from persona import persona_path, set_context, validate as validate_persona
@@ -47,10 +48,12 @@ async def list_files(
files = []
for name in sorted(ALLOWED):
p = persona_dir / name
+st = p.stat() if p.exists() else None
files.append({
"name": name,
"exists": p.exists(),
-"size": p.stat().st_size if p.exists() else 0,
+"size": st.st_size if st else 0,
+"modified": st.st_mtime if st else None,
})
return {"files": files}
@@ -83,3 +86,59 @@ async def save_file(
p = _path(filename)
p.write_text(req.content)
return {"ok": True, "name": filename, "size": len(req.content)}
# ── Session search ────────────────────────────────────────────────────────────
_CONTEXT_CHARS = 120 # chars of context to include around each match
@router.get("/sessions/search")
async def search_sessions(
q: str = Query(..., min_length=2),
user: str = Query("scott"),
persona: str = Query("inara"),
limit: int = Query(20, ge=1, le=100),
) -> dict:
"""Full-text search across past session logs.
Returns up to `limit` matches, newest sessions first.
Each match includes a short excerpt (120 chars before/after) for context.
"""
_resolve(user, persona)
sessions_dir = persona_path() / "sessions"
if not sessions_dir.exists():
return {"query": q, "matches": [], "total_files_searched": 0}
pattern = re.compile(re.escape(q), re.IGNORECASE)
session_files = sorted(sessions_dir.glob("*.md"), reverse=True) # newest first
matches = []
for sf in session_files:
if len(matches) >= limit:
break
try:
text = sf.read_text()
except OSError:
continue
for m in pattern.finditer(text):
if len(matches) >= limit:
break
start = max(0, m.start() - _CONTEXT_CHARS)
end = min(len(text), m.end() + _CONTEXT_CHARS)
excerpt = text[start:end].strip()
# Prefix with ellipsis if we truncated the left side
if start > 0:
excerpt = "…" + excerpt
if end < len(text):
excerpt = excerpt + "…"
matches.append({
"date": sf.stem, # YYYY-MM-DD
"excerpt": excerpt,
})
return {
"query": q,
"matches": matches,
"total_files_searched": len(session_files),
}
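
The excerpt logic in search_sessions can be tried on its own. The following is a minimal sketch, not the route itself: find_excerpts is a hypothetical helper name, and it assumes the truncation marker is a single ellipsis character, as the inline comment in the diff suggests.

```python
import re

CONTEXT_CHARS = 120  # mirrors _CONTEXT_CHARS in the diff


def find_excerpts(text: str, query: str, limit: int = 20) -> list[str]:
    """Return up to `limit` excerpts around case-insensitive literal matches.

    Hypothetical standalone helper; the route in the diff does the same work
    inline while iterating over session files.
    """
    # re.escape makes the query a literal string, so "." or "*" are not
    # treated as regex metacharacters.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    excerpts: list[str] = []
    for m in pattern.finditer(text):
        if len(excerpts) >= limit:
            break
        start = max(0, m.start() - CONTEXT_CHARS)
        end = min(len(text), m.end() + CONTEXT_CHARS)
        excerpt = text[start:end].strip()
        # Mark whichever side was cut off (assumed ellipsis marker).
        if start > 0:
            excerpt = "…" + excerpt
        if end < len(text):
            excerpt = excerpt + "…"
        excerpts.append(excerpt)
    return excerpts
```

Because the limit is checked per match, a single long session log can fill the whole result set before older files are read.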


@@ -3,14 +3,16 @@ import logging
from fastapi import APIRouter, HTTPException, Request, Response
from google.auth.transport import requests as google_requests
from google.oauth2 import id_token
from auth_utils import get_user_channels
from context_loader import load_context
from llm_client import complete
from persona import set_context
from session_logger import log_turn
from session_store import load as load_session, save as save_session
from config import settings
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/channels/google-chat")
router = APIRouter()
# Workspace Add-on Chat apps: JWT is issued by accounts.google.com.
# (Legacy standalone Chat bots used chat@system.gserviceaccount.com — different format.)
@@ -35,7 +37,7 @@ def _msg(text: str) -> dict:
}
def _verify_system_id_token(token: str) -> None:
def _verify_system_id_token(token: str, audience: str) -> None:
"""Verify the systemIdToken from authorizationEventObject.
For Workspace Add-on Chat apps Google sends the token in the request body
@@ -44,13 +46,13 @@ def _verify_system_id_token(token: str) -> None:
Claims verified:
iss = "https://accounts.google.com"
aud = settings.google_chat_audience (the endpoint URL)
aud = the per-user audience from channels.json (the endpoint URL)
"""
try:
claims = id_token.verify_oauth2_token(
token,
google_requests.Request(),
audience=settings.google_chat_audience,
audience=audience,
)
except Exception as exc:
logger.warning("Google Chat JWT verification failed: %s", exc)
@@ -60,17 +62,30 @@ def _verify_system_id_token(token: str) -> None:
raise HTTPException(status_code=401, detail="Wrong issuer")
@router.post("")
async def receive(request: Request):
@router.post("/channels/google-chat/{username}")
async def receive(username: str, request: Request):
channels = get_user_channels(username)
cfg = channels.get("google_chat")
if not cfg:
logger.warning("Google Chat: no channel config for user %r", username)
raise HTTPException(status_code=404, detail="Channel not configured for this user")
persona_name = cfg.get("persona", "inara")
audience = cfg.get("audience", "")
backend = cfg.get("backend", settings.primary_backend)
timeout = cfg.get("timeout", 25)
set_context(username, persona_name)
body = await request.json()
# Verify the systemIdToken embedded in the request body
if settings.google_chat_audience:
if audience:
token = body.get("authorizationEventObject", {}).get("systemIdToken", "")
if not token:
logger.warning("Google Chat: missing systemIdToken")
logger.warning("Google Chat: missing systemIdToken for %s", username)
raise HTTPException(status_code=401, detail="Missing token")
_verify_system_id_token(token)
_verify_system_id_token(token, audience)
chat = body.get("chat", {})
@@ -79,8 +94,8 @@ async def receive(request: Request):
if "addedToSpacePayload" in chat:
space_type = chat["addedToSpacePayload"].get("space", {}).get("type", "")
if space_type == "DM":
return _msg(f"✨ Hello! I'm {settings.agent_name}. What can I help you with?")
return _msg(f"✨ Hello! I'm {settings.agent_name}. Send me a message and I'll do my best to help.")
return _msg(f"✨ Hello! I'm {persona_name.capitalize()}. What can I help you with?")
return _msg(f"✨ Hello! I'm {persona_name.capitalize()}. Send me a message and I'll do my best to help.")
if "removedFromSpacePayload" in chat:
return Response(status_code=200)
@@ -89,10 +104,10 @@ async def receive(request: Request):
logger.info("Google Chat: unhandled event keys: %s", list(chat.keys()))
return Response(status_code=200)
payload = chat["messagePayload"]
message = payload.get("message", {})
space = payload.get("space", {})
user = chat.get("user", {})
payload = chat["messagePayload"]
message = payload.get("message", {})
space = payload.get("space", {})
user = chat.get("user", {})
# argumentText strips @BotName mentions in Spaces; fall back to full text in DMs
user_text = (message.get("argumentText") or message.get("text", "")).strip()
@@ -107,7 +122,7 @@ async def receive(request: Request):
logger.warning("Google Chat: empty user_text, ignoring")
return Response(status_code=200)
session_id = "gc_" + space_name.replace("/", "_")
session_id = f"gc_{username}_{space_name.replace('/', '_')}"
system_prompt = load_context(settings.default_tier)
history = load_session(session_id)
history.append({"role": "user", "content": user_text})
@@ -117,9 +132,9 @@ async def receive(request: Request):
complete(
system_prompt=system_prompt,
messages=history,
model=settings.google_chat_backend,
model=backend,
),
timeout=settings.google_chat_timeout,
timeout=timeout,
)
except asyncio.TimeoutError:
logger.warning("Google Chat request timed out for session %s", session_id)
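
The session ID change above adds the username to the key, presumably so that two users whose bots share a space do not share history. A small sketch of the new format, assuming a hypothetical helper name chat_session_id (the diff builds the string inline):

```python
def chat_session_id(username: str, space_name: str) -> str:
    """Build the per-user Google Chat session key used in the new code path.

    Hypothetical helper; slashes in the space name are flattened to
    underscores, as in the diff.
    """
    return f"gc_{username}_{space_name.replace('/', '_')}"
```

Note that sessions stored under the old "gc_" + space key would not be found under the new key, so existing history would start fresh unless migrated.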


@@ -32,13 +32,17 @@ def _get_session_user(request: Request) -> str | None:
@router.get("/help", include_in_schema=False)
async def help_page(request: Request):
async def help_page(request: Request, persona: str = ""):
username = _get_session_user(request)
if not username:
return RedirectResponse("/login", status_code=302)
personas = list_user_personas(username)
back_persona = personas[0] if personas else ""
# Use persona from query param if valid, else fall back to first
if persona and persona in personas:
back_persona = persona
else:
back_persona = personas[0] if personas else ""
back_href = f"/{username}/{back_persona}" if back_persona else "/"
html = (_STATIC / "help.html").read_text()

cortex/routers/local_llm.py (new file, 341 lines)

@@ -0,0 +1,341 @@
"""
Model Registry settings — hosts, models, and role assignments.
Routes:
GET /settings/local → settings page
POST /settings/local/host → save/create a host
POST /settings/local/host/{id}/remove → remove a host (and its models)
POST /settings/local/models/add → add a model entry
POST /settings/local/models/{id}/remove → remove a model
POST /api/models/role → AJAX: set a role assignment
GET /api/local-llm/fetch-models → proxy to the host's model list: /api/models (openwebui) or /models (openai) (JSON)
"""
import logging
from pathlib import Path
import httpx
import jwt
from fastapi import APIRouter, Form, Request
from fastapi.responses import HTMLResponse, JSONResponse, RedirectResponse
from auth_utils import COOKIE_NAME, decode_token
from config import settings as app_settings
import model_registry as reg
logger = logging.getLogger(__name__)
router = APIRouter()
_STATIC = Path(__file__).parent.parent / "static"
# ── Auth helper ───────────────────────────────────────────────────────────────
def _get_user(request: Request) -> str | None:
token = request.cookies.get(COOKIE_NAME)
if not token:
return None
try:
return decode_token(token)
except jwt.InvalidTokenError:
return None
# ── Page renderer ─────────────────────────────────────────────────────────────
def _render(username: str, success: str = "", error: str = "") -> str:
registry = reg.get_registry(username)
hosts = registry.get("hosts", [])
models = registry.get("models", [])
roles = registry.get("roles", {})
builtins = reg._builtins()
host_by_id = {h["id"]: h for h in hosts}
# ── Host rows ─────────────────────────────────────────────────────────────
host_rows = ""
for h in hosts:
key_hint = f"{h['api_key'][-4:]}" if h.get("api_key") else "not set"
ht = h.get("host_type", "openwebui")
ow_sel = ' selected' if ht == "openwebui" else ''
ai_sel = ' selected' if ht == "openai" else ''
host_rows += f'''
<div class="host-row">
<form method="POST" action="/settings/local/host" class="host-form">
<input type="hidden" name="host_id" value="{h["id"]}">
<div class="field-row">
<div class="field">
<label>Label</label>
<input type="text" name="label" value="{h.get("label","")}"
placeholder="Home ML Laptop" autocomplete="off" data-form-type="other">
</div>
<div class="field" style="flex:2">
<label>API URL</label>
<input type="text" name="api_url" value="{h.get("api_url","")}"
placeholder="http://192.168.x.x:3000"
autocomplete="off" spellcheck="false" data-form-type="other">
</div>
</div>
<div class="field-row">
<div class="field">
<label>API Key</label>
<input type="password" name="api_key" placeholder="Leave blank to keep existing"
autocomplete="new-password" data-1p-ignore data-lpignore="true" data-form-type="other">
<p class="key-status">Current: {key_hint}</p>
</div>
<div class="field" style="flex:0 0 auto">
<label>Type</label>
<select name="host_type">
<option value="openwebui"{ow_sel}>Open WebUI / Ollama</option>
<option value="openai"{ai_sel}>OpenAI-compatible (OpenRouter, etc.)</option>
</select>
</div>
</div>
<div class="btn-row">
<button type="submit" class="btn btn-secondary btn-sm">Save host</button>
<button type="button" class="btn btn-secondary btn-sm fetch-btn"
data-host-id="{h["id"]}">Fetch models</button>
<span class="fetch-status" id="fetch-{h["id"]}"></span>
</div>
</form>
<form method="POST" action="/settings/local/host/{h["id"]}/remove"
onsubmit="return confirm('Remove host and all its models?')" style="margin-top:0.5rem">
<button type="submit" class="btn-link danger">Remove host</button>
</form>
</div>'''
if not host_rows:
host_rows = '<p class="empty-note">No hosts configured yet. Add one below.</p>'
# ── Host options for add-model form ───────────────────────────────────────
host_options = "".join(
f'<option value="{h["id"]}">{h.get("label") or h["api_url"]}</option>'
for h in hosts
)
add_model_hidden = "" if hosts else ' style="display:none"'
# ── Model rows ────────────────────────────────────────────────────────────
model_rows = ""
for m in models:
resolved = reg._resolve_model(registry, m["id"])
if not resolved:
continue
host_name = ""
if m.get("type") == "local_openai" and m.get("host_id"):
h = host_by_id.get(m["host_id"], {})
host_name = h.get("label") or h.get("api_url", "")
ctx_badge = f'<span class="ctx-badge">{m.get("context_k",0)}k ctx</span>' if m.get("context_k") else ""
tags_html = " ".join(
f'<span class="tag">{t}</span>' for t in (m.get("tags") or [])
)
host_html = f'<span class="model-host">{host_name}</span>' if host_name else ""
model_rows += f'''
<div class="model-row" id="model-{m["id"]}">
<div class="model-info">
<span class="model-label">{m.get("label") or m.get("model_name","")}</span>
<span class="model-name">{m.get("model_name","")}</span>
{host_html}{ctx_badge}
<div class="tag-row">{tags_html}</div>
</div>
<div class="model-actions">
<form method="POST" action="/settings/local/models/{m["id"]}/remove"
onsubmit="return confirm('Remove this model?')" style="display:inline">
<button type="submit" class="row-btn danger">Remove</button>
</form>
</div>
</div>'''
if not model_rows:
model_rows = '<p class="empty-note">No models added yet.</p>'
# ── Role assignment rows ──────────────────────────────────────────────────
# Build option list: (none) + built-ins + user models
model_opts = '<option value="">— .env default —</option>\n'
model_opts += '<optgroup label="Built-in">\n'
for bid, bm in builtins.items():
model_opts += f' <option value="{bid}">{bm["label"]}</option>\n'
model_opts += '</optgroup>\n'
if models:
model_opts += '<optgroup label="Local models">\n'
for m in models:
lbl = m.get("label") or m.get("model_name", m["id"])
model_opts += f' <option value="{m["id"]}">{lbl}</option>\n'
model_opts += '</optgroup>\n'
role_rows = ""
for role in app_settings.get_defined_roles():
role_cfg = roles.get(role, {})
role_rows += f'<div class="role-row" data-role="{role}"><span class="role-name">{role.title()}</span><div class="role-slots">'
for slot in reg.PRIORITY_KEYS[:3]: # primary + backup_1 + backup_2
current = role_cfg.get(slot) or ""
slot_label = slot.replace("_", " ").title()
sel_html = f'<select class="role-select" data-role="{role}" data-slot="{slot}" title="{slot_label}">\n{model_opts}\n</select>'
# Pre-select current value via JS (simpler than string-building selected attrs)
role_rows += f'<div class="role-slot"><span class="slot-label">{slot_label}</span>{sel_html}</div>'
role_rows += '</div></div>'
# JS data for pre-selecting current role values
import json as _json
role_data_js = _json.dumps({
role: {slot: (roles.get(role, {}).get(slot) or "") for slot in reg.PRIORITY_KEYS[:3]}
for role in app_settings.get_defined_roles()
})
html = (_STATIC / "local_llm.html").read_text()
html = html.replace("{{ username }}", username)
html = html.replace("{{ host_rows }}", host_rows)
html = html.replace("{{ model_rows }}", model_rows)
html = html.replace("{{ host_options }}", host_options)
html = html.replace("{{ add_model_hidden }}", add_model_hidden)
html = html.replace("{{ role_rows }}", role_rows)
html = html.replace("{{ role_data_js }}", role_data_js)
if success:
html = html.replace("<!-- SUCCESS -->", f'<p class="msg success">{success}</p>')
if error:
html = html.replace("<!-- ERROR -->", f'<p class="msg error">{error}</p>')
return html
# ── Routes ────────────────────────────────────────────────────────────────────
@router.get("/settings/local", include_in_schema=False)
async def models_page(request: Request):
username = _get_user(request)
if not username:
return RedirectResponse("/login", status_code=302)
return HTMLResponse(_render(username))
@router.post("/settings/local/host", include_in_schema=False)
async def save_host(
request: Request,
host_id: str = Form(""),
label: str = Form(""),
api_url: str = Form(""),
api_key: str = Form(""),
host_type: str = Form("openwebui"),
):
username = _get_user(request)
if not username:
return RedirectResponse("/login", status_code=302)
if not api_url.strip():
return HTMLResponse(_render(username, error="API URL is required."))
reg.save_host(username, host_id or None, label, api_url, api_key, host_type)
logger.info("model registry host saved: %s (%s)", username, host_type)
return HTMLResponse(_render(username, success="Host saved."))
@router.post("/settings/local/host/{host_id}/remove", include_in_schema=False)
async def remove_host(request: Request, host_id: str):
username = _get_user(request)
if not username:
return RedirectResponse("/login", status_code=302)
reg.remove_host(username, host_id)
return HTMLResponse(_render(username, success="Host removed."))
@router.post("/settings/local/models/add", include_in_schema=False)
async def add_model(
request: Request,
host_id: str = Form(...),
label: str = Form(""),
model_name: str = Form(...),
context_k: int = Form(0),
tags: str = Form(""),
):
username = _get_user(request)
if not username:
return RedirectResponse("/login", status_code=302)
if not model_name.strip():
return HTMLResponse(_render(username, error="Model name is required."))
tag_list = [t.strip() for t in tags.split(",") if t.strip()]
reg.save_model(username, None, host_id, label, model_name, context_k, tag_list)
logger.info("model added to registry: %s / %s", username, model_name)
return HTMLResponse(_render(username, success=f'Model "{label or model_name}" added.'))
@router.post("/settings/local/models/{model_id}/remove", include_in_schema=False)
async def remove_model(request: Request, model_id: str):
username = _get_user(request)
if not username:
return RedirectResponse("/login", status_code=302)
reg.remove_model(username, model_id)
return HTMLResponse(_render(username, success="Model removed."))
@router.post("/api/models/role")
async def set_role(request: Request) -> JSONResponse:
"""AJAX: assign a model to a role priority slot.
Body: {"role": "chat", "slot": "primary", "model_id": "abc123" | ""}
"""
username = _get_user(request)
if not username:
return JSONResponse({"error": "Not authenticated"}, status_code=401)
try:
body = await request.json()
except Exception:
return JSONResponse({"error": "Invalid JSON"}, status_code=400)
role = body.get("role", "").strip()
slot = body.get("slot", "").strip()
model_id = body.get("model_id", "").strip() or None
if not role or not slot:
return JSONResponse({"error": "role and slot are required"}, status_code=400)
ok = reg.set_role(username, role, slot, model_id)
if not ok:
return JSONResponse({"error": "Invalid slot or model_id not found"}, status_code=400)
logger.info("role set: %s %s.%s = %s", username, role, slot, model_id)
return JSONResponse({"ok": True})
@router.get("/api/local-llm/fetch-models")
async def fetch_models(request: Request, host_id: str = "") -> JSONResponse:
"""Proxy to the host's model list endpoint (/models for openai hosts, /api/models for openwebui). host_id selects which host."""
username = _get_user(request)
if not username:
return JSONResponse({"error": "Not authenticated"}, status_code=401)
registry = reg.get_registry(username)
hosts = registry.get("hosts", [])
if host_id:
host = next((h for h in hosts if h["id"] == host_id), None)
else:
host = hosts[0] if hosts else None
# Fall back to .env
if host:
api_url = host.get("api_url", "")
api_key = host.get("api_key", "")
else:
api_url = app_settings.local_api_url
api_key = app_settings.local_api_key
if not api_url:
return JSONResponse({"error": "No host configured."}, status_code=400)
host_type = host.get("host_type", "openwebui") if host else "openwebui"
models_path = "/models" if host_type == "openai" else "/api/models"
url = api_url.rstrip("/") + models_path
headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
try:
async with httpx.AsyncClient(timeout=8) as client:
resp = await client.get(url, headers=headers)
resp.raise_for_status()
data = resp.json()
models = [
{"id": m["id"], "name": m.get("name") or m["id"]}
for m in data.get("data", [])
]
models.sort(key=lambda m: m["name"].lower())
return JSONResponse({"models": models})
except httpx.HTTPStatusError as e:
return JSONResponse({"error": f"Host returned {e.response.status_code}"}, status_code=502)
except Exception as e:
return JSONResponse({"error": str(e)}, status_code=502)
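
The host_type switch amounts to a small path table. The sketch below collects the four paths listed in the commit message into one place; endpoint_url is a hypothetical helper, not a function from this changeset, and the fallback to "openwebui" for unknown types is an assumption modelled on the default used in fetch_models.

```python
# Paths per host type, as listed in the commit message.
_PATHS = {
    "openwebui": {"chat": "/api/chat/completions", "models": "/api/models"},
    "openai": {"chat": "/chat/completions", "models": "/models"},
}


def endpoint_url(api_url: str, host_type: str, kind: str) -> str:
    """Join a host base URL with the chat or models path for its type.

    Hypothetical helper. Unknown host types fall back to "openwebui"
    (assumption, matching the default in the diff).
    """
    paths = _PATHS.get(host_type, _PATHS["openwebui"])
    # rstrip avoids a double slash when the stored URL ends with "/".
    return api_url.rstrip("/") + paths[kind]
```

For OpenRouter this would give https://openrouter.ai/api/v1/models for the model list, which is why the host URL in the commit message already includes the /api/v1 prefix.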


@@ -1,18 +1,17 @@
import asyncio
import hashlib
import hmac
import json
import logging
import secrets
import httpx
from fastapi import APIRouter, BackgroundTasks, HTTPException, Request, Response
from config import settings
from auth_utils import get_user_channels
from context_loader import load_context
from llm_client import complete
from notification import _send_nct_message
from persona import set_context
from session_logger import log_turn
from session_store import load as load_session, save as save_session
from config import settings
import event_bus
logger = logging.getLogger(__name__)
@@ -26,55 +25,37 @@ if not logger.handlers:
router = APIRouter()
def _verify_signature(body: bytes, random_header: str, sig_header: str) -> bool:
def _verify_signature(body: bytes, random_header: str, sig_header: str, secret: str) -> bool:
"""Nextcloud signs requests with HMAC-SHA256(key=secret, msg=random+body)."""
expected = hmac.new(
settings.nextcloud_talk_bot_secret.encode(),
secret.encode(),
(random_header + body.decode("utf-8", errors="replace")).encode(),
hashlib.sha256,
).hexdigest()
return hmac.compare_digest(expected, sig_header.lower())
async def _send_reply(conversation_token: str, message: str) -> None:
async def _send_reply(conversation_token: str, message: str, nextcloud_url: str, secret: str) -> None:
"""Post a message to Nextcloud Talk as the bot."""
url = (
f"{settings.nextcloud_url}/ocs/v2.php/apps/spreed/api/v1"
f"/bot/{conversation_token}/message"
)
# NC Talk verifies HMAC over (random + message_text), NOT the raw body.
# See BotController::getBotFromHeaders → checksumVerificationService::validateRequest($random, $sig, $secret, $message)
body_dict = {"message": message}
body_bytes = json.dumps(body_dict, ensure_ascii=False).encode("utf-8")
random_str = secrets.token_hex(32)
sig = hmac.new(
settings.nextcloud_talk_bot_secret.encode(),
(random_str + message).encode("utf-8"),
hashlib.sha256,
).hexdigest()
logger.info("NCT _send_reply → %s (body: %s)", url, body_bytes.decode())
try:
async with httpx.AsyncClient() as client:
resp = await client.post(
url,
content=body_bytes,
headers={
"Content-Type": "application/json",
"OCS-APIRequest": "true",
"X-Nextcloud-Talk-Bot-Random": random_str,
"X-Nextcloud-Talk-Bot-Signature": sig,
},
timeout=15,
)
logger.info("NCT reply: %s%s", resp.status_code, resp.text[:400])
except Exception as e:
logger.error("NCT reply error: %s", e)
logger.info("NCT _send_reply → room %s (%d chars)", conversation_token, len(message))
await _send_nct_message(nextcloud_url, secret, conversation_token, message)
async def _process_message(conversation_token: str, user_text: str, actor_name: str) -> None:
async def _process_message(
conversation_token: str,
user_text: str,
actor_name: str,
username: str,
persona_name: str,
nextcloud_url: str,
secret: str,
timeout: int,
) -> None:
logger.info("NCT process: token=%s user=%s text=%r", conversation_token, actor_name, user_text)
session_id = f"nct_{conversation_token}"
set_context(username, persona_name)
session_id = f"nct_{username}_{conversation_token}"
system_prompt = load_context(settings.default_tier)
history = load_session(session_id)
history.append({"role": "user", "content": user_text})
@@ -90,15 +71,15 @@ async def _process_message(conversation_token: str, user_text: str, actor_name:
try:
response_text, backend = await asyncio.wait_for(
complete(system_prompt=system_prompt, messages=history),
timeout=settings.nextcloud_talk_timeout,
timeout=timeout,
)
except asyncio.TimeoutError:
logger.warning("NCT timeout for %s", conversation_token)
await _send_reply(conversation_token, "⏳ Still thinking — this is taking longer than usual.")
await _send_reply(conversation_token, "⏳ Still thinking — this is taking longer than usual.", nextcloud_url, secret)
return
except Exception as e:
logger.error("NCT LLM error for %s: %s", conversation_token, e)
await _send_reply(conversation_token, "⚠️ Something went wrong on my end.")
await _send_reply(conversation_token, "⚠️ Something went wrong on my end.", nextcloud_url, secret)
return
logger.info("NCT LLM responded via %s (%d chars)", backend, len(response_text))
@@ -114,22 +95,33 @@ async def _process_message(conversation_token: str, user_text: str, actor_name:
"backend": backend,
})
await _send_reply(conversation_token, response_text)
await _send_reply(conversation_token, response_text, nextcloud_url, secret)
@router.post("/inara-nextcloud-talk-webhook")
async def nextcloud_talk_webhook(request: Request, background_tasks: BackgroundTasks):
body = await request.body()
@router.post("/webhook/nextcloud/{username}")
async def nextcloud_talk_webhook(username: str, request: Request, background_tasks: BackgroundTasks):
channels = get_user_channels(username)
cfg = channels.get("nextcloud")
if not cfg:
logger.warning("NCT webhook: no channel config for user %r", username)
raise HTTPException(status_code=404, detail="Channel not configured for this user")
if not settings.nextcloud_talk_bot_secret:
logger.error("nextcloud_talk_bot_secret not configured")
persona_name = cfg.get("persona", "inara")
nextcloud_url = cfg.get("url", "")
secret = cfg.get("bot_secret", "")
timeout = cfg.get("timeout", 55)
if not secret:
logger.error("NCT webhook: bot_secret missing for user %r", username)
return Response(status_code=500)
body = await request.body()
random_header = request.headers.get("X-Nextcloud-Talk-Random", "")
sig_header = request.headers.get("X-Nextcloud-Talk-Signature", "")
if not _verify_signature(body, random_header, sig_header):
logger.warning("NCT webhook: signature mismatch")
if not _verify_signature(body, random_header, sig_header, secret):
logger.warning("NCT webhook: signature mismatch for %s", username)
raise HTTPException(status_code=401, detail="Invalid signature")
try:
@@ -153,12 +145,12 @@ async def nextcloud_talk_webhook(request: Request, background_tasks: BackgroundT
conversation_token = target.get("id", "")
try:
content = json.loads(obj.get("content", "{}"))
content = json.loads(obj.get("content", "{}"))
user_text = content.get("message", "").strip()
except (json.JSONDecodeError, AttributeError):
user_text = (obj.get("name") or obj.get("content", "")).strip()
mention_prefix = f"@{settings.agent_name.lower()}"
mention_prefix = f"@{persona_name.lower()}"
if user_text.lower().startswith(mention_prefix):
user_text = user_text[len(mention_prefix):].strip()
@@ -168,5 +160,9 @@ async def nextcloud_talk_webhook(request: Request, background_tasks: BackgroundT
actor_name = actor.get("name", "User")
logger.info("NCT message from %s in %s: %r", actor_name, conversation_token, user_text[:60])
background_tasks.add_task(_process_message, conversation_token, user_text, actor_name)
background_tasks.add_task(
_process_message,
conversation_token, user_text, actor_name,
username, persona_name, nextcloud_url, secret, timeout,
)
return Response(status_code=200)
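
The signature check in _verify_signature can be exercised without a Nextcloud server. This is a minimal sketch under the scheme stated in the docstring, HMAC-SHA256 with the bot secret as key over random + body; sign and verify are hypothetical names, and the secret shown in the usage line is a placeholder.

```python
import hashlib
import hmac
import secrets


def sign(secret: str, random_value: str, payload: str) -> str:
    """Hex HMAC-SHA256 of (random_value + payload), keyed by the bot secret."""
    return hmac.new(
        secret.encode(),
        (random_value + payload).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def verify(secret: str, random_value: str, payload: str, signature: str) -> bool:
    """Constant-time comparison; the received header is lowercased first,
    as in the diff."""
    expected = sign(secret, random_value, payload)
    return hmac.compare_digest(expected, signature.lower())


# Usage: a sender picks a fresh random value per request.
random_value = secrets.token_hex(32)
signature = sign("per-user-bot-secret", random_value, '{"message": "hi"}')
```

With the secret now read from each user's channel config, a request signed with one user's secret fails verification on another user's webhook path.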


@@ -18,6 +18,7 @@ from datetime import datetime, timezone
from fastapi import APIRouter
from pydantic import BaseModel
from auth_utils import get_user_gemini_key
from config import settings
from context_loader import load_context
from persona import set_context, validate as validate_persona
@@ -104,7 +105,7 @@ async def orchestrate(req: OrchestrateRequest) -> OrchestrateResponse:
_jobs[job_id] = job
# Run in background — caller polls GET /orchestrate/{job_id}
asyncio.create_task(_run_job(job_id, req))
asyncio.create_task(_run_job(job_id, req, user))
logger.info("Orchestrator job queued: %s%.80s", job_id, req.task)
return OrchestrateResponse(job_id=job_id, status="queued")
@@ -134,7 +135,7 @@ async def list_jobs() -> list[JobStatusResponse]:
# Background runner
# ---------------------------------------------------------------------------
async def _run_job(job_id: str, req: OrchestrateRequest) -> None:
async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
"""Execute the orchestration job and update the job store."""
async with _jobs_lock:
_jobs[job_id]["status"] = "running"
@@ -161,6 +162,7 @@ async def _run_job(job_id: str, req: OrchestrateRequest) -> None:
system_prompt=system_prompt,
session_messages=session_messages,
respond_with_claude=req.respond_with_claude,
gemini_api_key=get_user_gemini_key(user),
)
# Save the turn to the session store so it survives a page refresh


@@ -16,7 +16,7 @@ import jwt
from fastapi import APIRouter, Form, Request
from fastapi.responses import HTMLResponse, RedirectResponse
from auth_utils import COOKIE_NAME, decode_token, check_credentials, set_password
from auth_utils import COOKIE_NAME, decode_token, check_credentials, set_password, _read_auth, _write_auth
from persona import list_user_personas
from config import settings as app_settings
@@ -41,6 +41,21 @@ def _get_session_user(request: Request) -> str | None:
def _settings_page(username: str, personas: list[str], success: str = "", error: str = "") -> str:
html = (_STATIC / "settings.html").read_text()
html = html.replace("{{ username }}", username)
# Connected Google account
auth_data = _read_auth(username)
google_email = auth_data.get("google_email") or ""
html = html.replace("{{ google_email }}", google_email)
# Gemini API key — show masked hint only, never the full key
gemini_key = auth_data.get("gemini_api_key") or ""
if gemini_key:
hint = f"Saved (…{gemini_key[-4:]})"
else:
hint = "Using server key"
html = html.replace("{{ gemini_key_hint }}", hint)
html = html.replace("{{ gemini_key_set }}", "true" if gemini_key else "false")
persona_items = "\n".join(
f'''<li>
<a href="/{username}/{p}" class="persona-link">{p}</a>
@@ -58,6 +73,7 @@ def _settings_page(username: str, personas: list[str], success: str = "", error:
html = html.replace("{{ persona_items }}", persona_items or "<li><em>No personas yet.</em></li>")
back_persona = personas[0] if personas else ""
html = html.replace("{{ back_href }}", f"/{username}/{back_persona}" if back_persona else "/")
html = html.replace("{{ help_href }}", f"/help?persona={back_persona}" if back_persona else "/help")
if success:
html = html.replace("<!-- SUCCESS -->", f'<p class="success">{success}</p>')
if error:
@@ -139,6 +155,30 @@ async def rename_username(
return resp
@router.post("/settings/gemini-key", include_in_schema=False)
async def save_gemini_key(
request: Request,
gemini_api_key: str = Form(...),
):
username = _get_session_user(request)
if not username:
return RedirectResponse("/login", status_code=302)
personas = list_user_personas(username)
gemini_api_key = gemini_api_key.strip()
data = _read_auth(username)
if gemini_api_key:
data["gemini_api_key"] = gemini_api_key
msg = "Gemini API key saved."
else:
data.pop("gemini_api_key", None)
msg = "Gemini API key removed — using server key."
_write_auth(username, data)
logger.info("gemini key updated: %s", username)
return HTMLResponse(_settings_page(username, personas, success=msg))
@router.post("/settings/persona/rename", include_in_schema=False)
async def rename_persona(
request: Request,


@@ -62,6 +62,20 @@ def _first_persona(username: str) -> str | None:
return names[0] if names else None
# ---------------------------------------------------------------------------
# Favicon — default sparkle; persona pages override via JS
# ---------------------------------------------------------------------------
_FAVICON_SVG = (
"<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'>"
"<text y='.9em' font-size='90'>✨</text></svg>"
)
@router.get("/favicon.ico", include_in_schema=False)
async def favicon():
return Response(content=_FAVICON_SVG, media_type="image/svg+xml")
# ---------------------------------------------------------------------------
# Root redirect
# ---------------------------------------------------------------------------
@@ -123,6 +137,112 @@ async def logout():
return resp
# ---------------------------------------------------------------------------
# User landing — /{username} → persona picker
# ---------------------------------------------------------------------------
@router.get("/{username}", include_in_schema=False)
async def user_landing(username: str, request: Request):
session_user = _get_session_user(request)
if not session_user:
return RedirectResponse("/login", status_code=302)
if session_user != username:
return RedirectResponse(f"/{session_user}", status_code=302)
personas = list_user_personas(username)
if not personas:
return HTMLResponse("<h1>No personas configured.</h1>", status_code=404)
cards_html = ""
for p in personas:
emoji = ""
identity_path = persona_path(username, p) / "IDENTITY.md"
if identity_path.exists():
m = re.search(r"\|\s*Emoji\s*\|\s*(.+?)\s*\|", identity_path.read_text())
if m:
emoji = m.group(1).strip()
cards_html += (
f'<a href="/{username}/{p}" class="persona-card">'
f'<span class="p-emoji">{emoji}</span>'
f'<span class="p-name">{p.capitalize()}</span>'
f'</a>\n'
)
html = f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Cortex — {username}</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
<style>
*, *::before, *::after {{ box-sizing: border-box; margin: 0; padding: 0; }}
body {{
min-height: 100vh;
display: flex;
align-items: center;
justify-content: center;
background: #1a1228;
font-family: 'Inter', system-ui, -apple-system, sans-serif;
font-weight: 450;
-webkit-font-smoothing: antialiased;
color: #e8e0f0;
padding: 2rem 1.5rem;
}}
.card {{
background: #221840;
border: 1px solid #3a2852;
border-radius: 14px;
padding: 2.5rem 2rem;
width: 100%;
max-width: 400px;
text-align: center;
}}
h1 {{ font-size: 1.3rem; font-weight: 700; color: #c4935a; margin-bottom: 0.4rem; }}
.sub {{ font-size: 0.82rem; color: #b0a2c8; margin-bottom: 2rem; }}
.personas {{ display: flex; flex-direction: column; gap: 0.75rem; }}
.persona-card {{
display: flex;
align-items: center;
gap: 1rem;
padding: 0.85rem 1.2rem;
background: #1a1228;
border: 1px solid #3a2852;
border-radius: 10px;
color: #e8e0f0;
text-decoration: none;
font-size: 1rem;
font-weight: 500;
transition: border-color 0.15s, background 0.15s;
}}
.persona-card:hover {{ border-color: #c4935a; background: #261d42; }}
.p-emoji {{ font-size: 1.6rem; line-height: 1; }}
.p-name {{ color: #c4935a; font-weight: 600; }}
.settings-link {{
display: inline-block;
margin-top: 1.5rem;
font-size: 0.78rem;
color: #b0a2c8;
text-decoration: none;
}}
.settings-link:hover {{ color: #e8e0f0; }}
</style>
</head>
<body>
<div class="card">
<h1>Cortex</h1>
<p class="sub">Signed in as <strong>{username}</strong> — choose a persona</p>
<div class="personas">
{cards_html} </div>
<a href="/settings" class="settings-link">Account settings</a>
</div>
</body>
</html>"""
    return HTMLResponse(html)
# ---------------------------------------------------------------------------
# Main UI — /{username}/{persona}
# ---------------------------------------------------------------------------


@@ -30,24 +30,28 @@ async def _run_short() -> None:
async def _run_mid() -> None:
    from memory_distiller import distill_mid
    from notification import notify
    try:
        result = await distill_mid()
        if "error" in result:
            logger.warning("auto distill mid skipped: %s", result["error"])
        else:
            logger.info("auto distill mid: %d chars via %s", result["chars_written"], result["backend"])
            await notify(result["username"], f"📝 Weekly memory digest complete ({result['chars_written']} chars via {result['backend']}).")
    except Exception as e:
        logger.error("auto distill mid failed: %s", e)


async def _run_long() -> None:
    from memory_distiller import distill_long
    from notification import notify
    try:
        result = await distill_long()
        if "error" in result:
            logger.warning("auto distill long skipped: %s", result["error"])
        else:
            logger.info("auto distill long: %d chars via %s", result["chars_written"], result["backend"])
            await notify(result["username"], f"🧠 Monthly long-term memory integration complete ({result['chars_written']} chars via {result['backend']}). Worth a quick review.")
    except Exception as e:
        logger.error("auto distill long failed: %s", e)

cortex/static/HELP.md Normal file

@@ -0,0 +1,262 @@
# Cortex UI — Help & Reference
<!-- SHARED BASE: cortex/static/HELP.md
This file is served to all users regardless of persona.
Persona-specific additions live in home/{username}/persona/{name}/HELP.md
and are appended automatically by help.html when present.
-->
*Last updated: 2026-03-27*
---
## Header Controls
| Button | What it does |
|---|---|
| **Sessions** | Open the sessions panel — list, resume, or start sessions |
| **Files** | Open the identity file editor (SOUL, MEMORY, etc.) |
| **⚙ N** | Open the Settings panel (N = current context tier) |
| **?** | Open this help panel |
The **⚙ Settings** panel contains all configuration options:
| Section | Controls |
|---|---|
| **Context Tier** | T1–T4 context depth |
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
| **Distill Memory** | Manually trigger short / mid / long / all distillation |
| **Backend** | Active LLM backend — click to toggle claude ↔ gemini |
| **Display** | Aa/A+/A font size cycle · ☾/☀ theme toggle |
All header settings (theme, font size, tier, memory layers) persist in `localStorage` across page refreshes.
---
## Chat
- **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
- **Stop:** Click **Stop** to cancel an in-progress response at any time.
- **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
- **Delete a message:** Hover over any message → click **del**. Removes from session history.
- **Copy a response:** Hover over any assistant message → click **copy**.
- **New line while typing:** `Shift+Enter` (in `Ctrl+Enter` mode) or `Shift+Enter` / Enter (in Enter mode).
---
## Agent Mode
Click the **Agent** button in the input row to enable Agent mode. The button highlights and Send changes to **Run**.
In Agent mode, messages are routed through the **orchestrator** instead of directly to Claude:
1. **Gemini** runs a tool loop — searches the web, reads files, checks tasks, calls APIs as needed
2. **Claude** receives the enriched context and writes the final response
3. A `⚡ N tool calls: …` note appears below the response listing what was used
Agent mode is best for tasks that require research, multi-step reasoning, or tool use (e.g. "search for X", "add a task", "what's on my list?"). Regular chat is faster for conversational turns.
Agent mode sessions persist to history exactly like regular chat — they survive page refreshes and appear in the Sessions panel.
---
## Sessions
Sessions are named conversation threads that persist across page refreshes.
- Click **Sessions** → **+ New** to start a fresh session.
- Click any listed session to resume it — full history loads instantly.
- Sessions from Nextcloud Talk appear as `nct_*` prefixed IDs.
- A blue **●** badge appears on the Sessions button when Talk activity arrives in a session you're not currently viewing.
---
## Notes
Notes are injected into a session without triggering an LLM response.
- Click **Note** to toggle note mode. The input border changes colour.
- **Private note** (amber border) — visible only in the UI, never sent to the LLM.
- **Context note** (teal border) — persisted to session history so the LLM sees it on the next turn. Useful for nudging context without a full message.
- Click the `private / public` label to switch between note types.
---
## Backends
- **Claude CLI** and **Gemini CLI** are both available. One is primary, the other is fallback.
- Click **⚙** → **Backend** to toggle between `claude` and `gemini` as the primary.
- If the primary fails or times out, the fallback is used automatically. A **⚡** notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.
---
## Nextcloud Talk Bot
Inara is registered as a bot in Nextcloud Talk.
- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
- Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
---
## Google Chat Bot
Inara is available as a bot in Google Chat (One Sky IT Workspace).
- Send Inara a direct message in Google Chat to start a conversation.
- Each DM thread is its own session (`gc_spaces/*` prefix) — history persists across messages.
- Responses are synchronous — Google Chat displays Inara's reply directly in the thread.
- To add Inara to a space: open the space, add a person/app, search for **Inara**.
- Sessions from Google Chat appear as `gc_*` prefixed IDs in the Sessions panel.
**Technical note:** Cortex uses Google's Workspace Add-on format (`hostAppDataAction`) — the modern API required for all Google Chat apps as of 2025.
---
## Files (Identity Editor)
The **Files** button opens an editor for Inara's identity and memory files:
| File | Purpose |
|---|---|
| `SOUL.md` | Core personality, values, and voice |
| `IDENTITY.md` | Role, capabilities, and context |
| `USER.md` | Scott's profile, preferences, and history |
| `PROTOCOLS.md` | Behavioural rules and communication protocols |
| `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
| `MEMORY_LONG.md` | Permanent curated long-term memory |
| `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
| `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
| `TASKS.json` | Inara's personal task list (managed via Agent mode) |
| `HELP.md` | This file |
Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.
---
## Context & Memory ( ⚙ panel )
### Context Tiers
Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **T3** | + last 2 raw session logs | ~15,000 |
| **T4** | + last 7 raw session logs | ~50,000 |
Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
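
As a rough aid for choosing a tier, the approximate token figures in the table can be compared against a model's context budget. The helper below is a hypothetical sketch, not part of Cortex; `pick_tier` is an invented name and the token counts are the approximations listed above.

```python
# Approximate prepended-context size per tier, copied from the table above.
TIER_TOKENS = {1: 1_500, 2: 5_000, 3: 15_000, 4: 50_000}


def pick_tier(context_budget: int) -> int:
    """Return the deepest tier whose approximate size fits the budget."""
    fitting = [tier for tier, tokens in TIER_TOKENS.items() if tokens <= context_budget]
    # Fall back to T1 when even the smallest tier exceeds the budget.
    return max(fitting) if fitting else 1


print(pick_tier(8_000))
```

With a budget of 8,000 tokens only T1 and T2 fit, so the helper selects T2. Leaving headroom for the conversation itself is up to the caller.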
### Memory Layers
Three independently toggleable memory files, loaded **Long → Mid → Short** (short sits closest to the conversation turn for better LLM recall):
| Layer | File | Contents |
|---|---|---|
| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, Scott's profile highlights |
| **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session log files |
Toggle any layer off to save tokens for a focused conversation where history isn't needed.
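
The load order and toggles can be pictured with a small sketch. It assumes the three files have already been read into a dict and that layers are joined with a blank line; both are illustrative assumptions, and `assemble_memory` is a hypothetical helper rather than Cortex's loader. The keyword names mirror the `include_long` / `include_mid` / `include_short` flags of the chat request body.

```python
# Long first, Short last, so Short sits closest to the conversation turn.
LAYER_ORDER = ["long", "mid", "short"]


def assemble_memory(contents, include_long=True, include_mid=True, include_short=True):
    """Join the enabled, non-empty memory layers in Long, Mid, Short order."""
    enabled = {"long": include_long, "mid": include_mid, "short": include_short}
    parts = [contents[name] for name in LAYER_ORDER if enabled[name] and contents.get(name)]
    return "\n\n".join(parts)


layers = {"long": "LONG", "mid": "MID", "short": "SHORT"}
print(assemble_memory(layers, include_mid=False))
```

Here the Mid layer is switched off, so the result contains only the Long text followed by the Short text.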
### Memory Distillation (manual)
Distillation builds up the memory layers from raw session logs. Currently **manual** — trigger via the ⚙ panel:
| Button | What it does |
|---|---|
| **short** | Rolls recent session log files → `MEMORY_SHORT.md` (fast, no LLM) |
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
| **all** | Runs short → mid → long in sequence |
**Recommended workflow:**
- Run **short** after any productive session to capture it.
- Run **mid** weekly to distil short → mid.
- Run **long** monthly to absorb mid into permanent memory.
Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BUDGET_MID`, `MEMORY_BUDGET_SHORT`).
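
The recommended cadence can be expressed as a small decision helper. This is a hypothetical sketch: `due_steps` is an invented name, "weekly" is taken as 7 days and "monthly" as 30 days, and the caller is assumed to track when mid and long were last run. The returned names match the buttons above.

```python
from datetime import date, timedelta


def due_steps(today, last_mid, last_long):
    """Return the distillation steps worth running after a session."""
    steps = ["short"]                              # always capture the session
    if today - last_mid >= timedelta(days=7):      # weekly (assumed 7 days)
        steps.append("mid")
    if today - last_long >= timedelta(days=30):    # monthly (assumed 30 days)
        steps.append("long")
    return steps


print(due_steps(date(2026, 3, 27), last_mid=date(2026, 3, 19), last_long=date(2026, 3, 10)))
```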
---
## Keyboard Shortcuts
| Keys | Action |
|---|---|
| `Ctrl+Enter` | Send message (default mode) |
| `Enter` | Send (when in Enter mode) |
| `Shift+Enter` | New line in message input |
| `Ctrl+Enter` | Save inline message edit |
| `Esc` | Cancel inline edit |
| `Ctrl+S` | Save file (Files modal) |
| `Esc` | Close any open modal |
---
## API Reference
For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message — returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |
| `GET` | `/events` | SSE stream for real-time Talk activity |
| `POST` | `/note` | Inject a context note into a session |
| `GET` | `/files` | List identity files |
| `GET` | `/files/{name}` | Read a file |
| `PUT` | `/files/{name}` | Write a file |
| `POST` | `/distill/short` | Aggregate session logs → MEMORY_SHORT |
| `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
| `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
| `POST` | `/distill/all` | Run all three distillation steps |
| `GET` | `/distill/status` | Show scheduler status and next run times |
| `POST` | `/orchestrate` | Submit an agent task — returns `{"job_id": "..."}` |
| `GET` | `/orchestrate/{job_id}` | Poll job status and result |
| `GET` | `/orchestrate` | List all jobs from current session (in-memory) |
| `GET` | `/health` | Health check — returns `{"status": "ok"}` |
Chat request body (`POST /chat`):
```json
{
"message": "string",
"session_id": "string | null",
"tier": 1,
"model": "claude | gemini | null",
"include_long": true,
"include_mid": true,
"include_short": true
}
```
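
For scripting, a request with this body can be built using only the Python standard library. The sketch below is illustrative: the base URL is a placeholder, `build_chat_request` and `parse_sse` are hypothetical helpers, and the parser assumes each `data:` line of the SSE stream carries one JSON object.

```python
import json
import urllib.request


def build_chat_request(base_url, message, session_id=None, tier=2, model=None):
    """Build (but do not send) a POST /chat request with the body shown above."""
    body = {
        "message": message,
        "session_id": session_id,
        "tier": tier,
        "model": model,
        "include_long": True,
        "include_mid": True,
        "include_short": True,
    }
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_sse(lines):
    """Yield JSON payloads from the `data:` lines of an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


# Placeholder host and port; adjust to wherever Cortex is listening.
req = build_chat_request("http://localhost:8000", "hello", tier=1)
# To send it: with urllib.request.urlopen(req) as resp: events = list(parse_sse(resp))
```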
---
## In Progress / Planned
- **Ollama local model backend** — direct Ollama API support (no CLI wrapper); target host: scott_gaming via WireGuard
- **Nextcloud Talk stabilization** — test end-to-end after restarts; complete bot registration docs
- **Multi-user support** — per-user identity/memory files; currently single-user (Scott); Holly instance planned
### Recently Completed
- **Google Chat bot** — Workspace Add-on integration; DM and spaces; JWT verification; session persistence
- **Agent mode** — Gemini tool loop + Claude responder, accessible via UI toggle
- **Personal task management** — `task_list`, `task_create`, `task_update`, `task_complete` tools backed by `TASKS.json`
- **Web search fixed** — DDG package updated (`ddgs`); `WebSearch`/`WebFetch` allowed for Claude CLI fallback
- **Session persistence for orchestrator** — agent mode turns now survive page refresh
- **Systemd user service** — Cortex runs as a user service; no sudo required (`systemctl --user restart cortex`)
- **OAuth token warning banner** — amber banner when Claude CLI token is within 24h of expiry
---
*Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent.*
*Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.*


@@ -16,6 +16,50 @@
const note_vis_btn_el = document.getElementById('note-vis-btn');
const settings_btn_el = document.getElementById('settings-btn');
const settings_dd_el = document.getElementById('settings-dropdown');
const sessionsBackdrop = document.getElementById('sessions-backdrop');
// ── Close all panels/dropdowns (mutual exclusion) ─────────────
function closeAllPanels() {
if (mode_dropdown_el) mode_dropdown_el.classList.remove('open');
if (settings_dd_el) settings_dd_el.classList.remove('open');
if (sessionsPanel) { sessionsPanel.classList.remove('open'); sessionsBackdrop.classList.remove('open'); }
const pd = document.getElementById('persona-dropdown');
if (pd) pd.classList.remove('open');
}
// ── Toasts ────────────────────────────────────────────────────
const toastContainer = document.getElementById('toast-container');
function showToast(message, type = 'info', duration = 2500) {
const el = document.createElement('div');
el.className = 'toast' + (type !== 'info' ? ' ' + type : '');
el.textContent = message;
toastContainer.appendChild(el);
requestAnimationFrame(() => {
requestAnimationFrame(() => el.classList.add('show'));
});
setTimeout(() => {
el.classList.remove('show');
el.addEventListener('transitionend', () => el.remove(), { once: true });
}, duration);
}
// ── Syntax highlighting ───────────────────────────────────────
function highlight_code(container) {
if (typeof hljs === 'undefined') return;
container.querySelectorAll('pre code').forEach(el => hljs.highlightElement(el));
}
// ── Utility helpers ───────────────────────────────────────────
function _esc(s) {
return String(s).replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/"/g,'&quot;');
}
// ── Lucide icon helpers ───────────────────────────────────────
function icon_html(name, size = 16) {
return `<svg data-lucide="${name}" width="${size}" height="${size}" class="btn-icon"></svg>`;
}
function render_icons() { if (window.lucide) lucide.createIcons(); }
// User/persona injected by the server at /{user}/{persona}
const CORTEX_USER = (window.CORTEX_CONFIG || {}).user || 'scott';
@@ -26,12 +70,50 @@
if (headerEmoji) headerEmoji.textContent = CORTEX_EMOJI;
// Set favicon to persona emoji
{
const favicon = document.querySelector("link[rel='icon']");
if (favicon && CORTEX_EMOJI) {
const svg = `<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>${CORTEX_EMOJI}</text></svg>`;
favicon.href = `data:image/svg+xml,${encodeURIComponent(svg)}`;
}
}
// Wire help link to preserve current persona on return
const helpLink = document.getElementById('help-link');
if (helpLink) helpLink.href = `/help?persona=${encodeURIComponent(CORTEX_PERSONA)}`;
let sessionId = null;
let primaryBackend = 'claude';
let activeController = null;
let currentHistory = []; // mirrors backend session [{role, content}, ...]
let talkThinkingDiv = null; // pending "thinking…" bubble for live Talk updates
// ── Session persistence ───────────────────────────────────────
// Survives page navigation (help, settings, etc.) within the same browser.
// Expires after SESSION_TTL_MS of inactivity.
const SESSION_TTL_MS = 30 * 60 * 1000; // 30 minutes
const _sid_key = `cx_sid_${CORTEX_USER}_${CORTEX_PERSONA}`;
const _sid_ts_key = `cx_sid_ts_${CORTEX_USER}_${CORTEX_PERSONA}`;
function persist_session() {
if (!sessionId) return;
localStorage.setItem(_sid_key, sessionId);
localStorage.setItem(_sid_ts_key, String(Date.now()));
}
function clear_stored_session() {
localStorage.removeItem(_sid_key);
localStorage.removeItem(_sid_ts_key);
}
function get_stored_session() {
const id = localStorage.getItem(_sid_key);
const ts = parseInt(localStorage.getItem(_sid_ts_key) || '0', 10);
if (!id || Date.now() - ts > SESSION_TTL_MS) return null;
return id;
}
// ── Enter toggle ─────────────────────────────────────────────
// Default: Ctrl+Enter sends. Stored in localStorage.
let ctrlEnterMode = localStorage.getItem('ctrlEnterSend') !== 'false';
@@ -69,12 +151,17 @@
// ── Input mode — dropdown select with MRU ordering ──────────
const MODES = {
chat: { icon: '💬', label: 'Chat' },
note: { icon: '📝', label: 'Note' },
otr: { icon: '🔒', label: 'OTR' },
agent: { icon: '🥸', label: 'Agent' },
chat: { icon: 'message-circle', label: 'Chat' },
note: { icon: 'pencil', label: 'Note' },
otr: { icon: 'lock', label: 'OTR' },
agent: { icon: 'bot', label: 'Agent' },
};
const send_defs = {
chat: { icon: 'arrow-up', label: 'Send' },
note: { icon: 'pencil', label: 'Note' },
otr: { icon: 'arrow-up', label: 'Send' },
agent: { icon: 'zap', label: 'Run' },
};
const send_labels = { chat: '↑ Send', note: '📝 Note', otr: '↑ Send', agent: '⚡ Run' };
let current_mode = localStorage.getItem('current_mode') || 'chat';
let note_public = false;
@@ -96,6 +183,7 @@
}
function open_mode_dropdown() {
closeAllPanels();
// Build options in MRU order (least recent at top, most recent at bottom)
// — bottom is visually closest to the button since dropdown opens upward
const ordered = [...mode_mru].reverse();
@@ -105,12 +193,13 @@
const btn = document.createElement('button');
btn.className = 'mode-option' + (mode === current_mode ? ' current' : '');
btn.innerHTML =
`<span class="opt-icon">${m.icon}</span>${m.label}`
`<span class="opt-icon">${icon_html(m.icon, 15)}</span>${m.label}`
+ (mode === current_mode ? '<span class="opt-check">✓</span>' : '');
btn.addEventListener('click', () => set_mode(mode));
mode_dropdown_el.appendChild(btn);
});
mode_dropdown_el.classList.add('open');
render_icons();
}
function close_mode_dropdown() {
@@ -130,10 +219,11 @@
});
function update_mode_ui() {
const m = MODES[current_mode];
const m = MODES[current_mode];
const sd = send_defs[current_mode] || send_defs.chat;
// Update trigger button
mode_icon_el.textContent = m.icon;
mode_icon_el.innerHTML = icon_html(m.icon, 15);
mode_label_el.textContent = m.label;
mode_select_btn_el.className = current_mode === 'chat'
? '' : `mode-${current_mode}`;
@@ -150,9 +240,10 @@
inputEl.classList.toggle('mode-otr', current_mode === 'otr');
inputEl.classList.toggle('mode-agent', current_mode === 'agent');
// Send button label
sendBtn.textContent = send_labels[current_mode] || 'Send';
// Send button label + icon
sendBtn.innerHTML = icon_html(sd.icon) + ' ' + sd.label;
render_icons();
updateInputPlaceholder();
}
@@ -184,7 +275,9 @@
// ── Settings dropdown ─────────────────────────────────────────
settings_btn_el.addEventListener('click', (e) => {
e.stopPropagation();
settings_dd_el.classList.toggle('open');
const isOpen = settings_dd_el.classList.contains('open');
closeAllPanels();
if (!isOpen) settings_dd_el.classList.add('open');
});
document.addEventListener('click', (e) => {
if (!settings_dd_el.contains(e.target) && e.target !== settings_btn_el) {
@@ -238,7 +331,9 @@
if (personaSwitcher) {
personaSwitcher.addEventListener('click', (e) => {
if (personaDropEl.children.length === 0) return;
personaDropEl.classList.toggle('open');
const isOpen = personaDropEl.classList.contains('open');
closeAllPanels();
if (!isOpen) personaDropEl.classList.add('open');
e.stopPropagation();
});
document.addEventListener('click', () => personaDropEl.classList.remove('open'));
@@ -246,23 +341,40 @@
// ── Backend toggle ───────────────────────────────────────────
fetch('/backend').then(r => r.json()).then(d => setBackendUI(d.primary));
fetch('/backend').then(r => r.json()).then(d => setBackendUI(d));
function setBackendUI(backend) {
const BACKEND_CYCLE = ['claude', 'gemini', 'local'];
const BACKEND_CLASS = { claude: '', gemini: 'mem-on', local: 'local-on' };
const backendModelHint = document.getElementById('backend-model-hint');
function setBackendUI(d) {
const backend = d.primary || d; // accept full response obj or bare string
primaryBackend = backend;
backendToggle.textContent = backend;
backendToggle.className = 'ctx-btn' + (backend === 'gemini' ? ' mem-on' : '');
const extra = BACKEND_CLASS[backend] || '';
backendToggle.className = 'ctx-btn' + (extra ? ' ' + extra : '');
if (backendModelHint) {
if (backend === 'local' && d.local_model) {
backendModelHint.textContent = d.local_model.label || d.local_model.model_name;
backendModelHint.style.display = '';
} else {
backendModelHint.textContent = '';
backendModelHint.style.display = 'none';
}
}
}
backendToggle.addEventListener('click', async () => {
const next = primaryBackend === 'claude' ? 'gemini' : 'claude';
const idx = BACKEND_CYCLE.indexOf(primaryBackend);
const next = BACKEND_CYCLE[(idx + 1) % BACKEND_CYCLE.length];
const res = await fetch('/backend', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ primary: next }),
});
const d = await res.json();
setBackendUI(d.primary);
setBackendUI(d);
addMessage('system', `Backend: ${d.primary} (fallback: ${d.fallback})`);
});
@@ -272,17 +384,26 @@
e.stopPropagation();
if (sessionsPanel.classList.contains('open')) {
sessionsPanel.classList.remove('open');
sessionsBackdrop.classList.remove('open');
return;
}
closeAllPanels();
const res = await fetch(`/sessions?${_fileParams}`);
const data = await res.json();
renderPanel(data.sessions);
sessionsPanel.classList.add('open');
sessionsBackdrop.classList.add('open');
});
sessionsBackdrop.addEventListener('click', () => {
sessionsPanel.classList.remove('open');
sessionsBackdrop.classList.remove('open');
});
document.addEventListener('click', (e) => {
if (!sessionsPanel.contains(e.target) && e.target !== sessionsBtn) {
sessionsPanel.classList.remove('open');
sessionsBackdrop.classList.remove('open');
}
});
@@ -296,11 +417,13 @@
const newItem = makeItem('new', '+ New session', '');
newItem.addEventListener('click', () => {
sessionId = null;
clear_stored_session();
currentHistory = [];
messagesEl.innerHTML = '';
sessionEl.textContent = '';
addMessage('system', 'New session');
sessionsPanel.classList.remove('open');
sessionsBackdrop.classList.remove('open');
inputEl.focus();
});
sessionsPanel.appendChild(newItem);
@@ -355,6 +478,7 @@
if (sessionId === s.session_id) {
sessionEl.textContent = `session: ${newName || s.session_id}`;
}
if (newName) showToast('Session renamed', 'success');
}
input.addEventListener('keydown', (e) => {
@@ -374,10 +498,11 @@
await fetch(`/sessions/${s.session_id}?${_fileParams}`, { method: 'DELETE' });
if (sessionId === s.session_id) {
sessionId = null;
clear_stored_session();
currentHistory = [];
messagesEl.innerHTML = '';
sessionEl.textContent = '';
addMessage('system', 'Session deleted');
showToast('Session deleted');
}
const res = await fetch(`/sessions?${_fileParams}`);
const data = await res.json();
@@ -407,10 +532,11 @@
return item;
}
async function resumeSession(id) {
async function resumeSession(id, silent = false) {
talkThinkingDiv = null;
if (id && id.startsWith('nct_')) sessionsBtn.classList.remove('talk-badge');
const res = await fetch(`/history/${id}?${_fileParams}`);
if (!res.ok) throw new Error(`HTTP ${res.status}`);
const data = await res.json();
messagesEl.innerHTML = '';
@@ -426,10 +552,12 @@
attachHistoryControls(msgDiv, i);
}
addMessage('system', `Resumed session ${id}`);
if (!silent) addMessage('system', `Resumed session ${id}`);
scrollToBottom();
sessionsPanel.classList.remove('open');
sessionsBackdrop.classList.remove('open');
inputEl.focus();
persist_session();
}
function timeAgo(iso) {
@@ -473,6 +601,7 @@
if (role === 'assistant' && typeof marked !== 'undefined') {
div.dataset.raw = text;
div.innerHTML = marked.parse(text);
highlight_code(div);
div.querySelectorAll('a').forEach(a => {
a.target = '_blank';
a.rel = 'noopener noreferrer';
@@ -488,7 +617,9 @@
div.appendChild(label);
div.appendChild(content);
} else {
div.dataset.raw = text;
div.textContent = text;
div.appendChild(makeCopyBtn(div));
}
// Wrap user/assistant messages so action buttons can be attached
@@ -523,20 +654,21 @@
const editBtn = document.createElement('button');
editBtn.className = 'msg-act-btn';
editBtn.textContent = 'edit';
editBtn.innerHTML = icon_html('pencil', 12) + ' edit';
editBtn.addEventListener('click', () => {
startEdit(msgDiv);
});
const delBtn = document.createElement('button');
delBtn.className = 'msg-act-btn del';
delBtn.textContent = 'del';
delBtn.innerHTML = icon_html('trash-2', 12) + ' del';
delBtn.addEventListener('click', () => {
deleteMsg(wrapper);
});
actionsDiv.appendChild(editBtn);
actionsDiv.appendChild(delBtn);
render_icons();
}
// After any currentHistory splice, renumber all wrapper data-hist-idx attributes.
@@ -569,17 +701,18 @@
ta.rows = Math.min(originalText.split('\n').length + 1, 12);
const saveBtn = document.createElement('button');
saveBtn.textContent = 'Save';
saveBtn.className = 'edit-save-btn';
saveBtn.innerHTML = icon_html('check', 13) + ' Save';
saveBtn.className = 'edit-save-btn';
const cancelBtn = document.createElement('button');
cancelBtn.textContent = 'Cancel';
cancelBtn.className = 'edit-cancel-btn';
cancelBtn.innerHTML = icon_html('x', 13) + ' Cancel';
cancelBtn.className = 'edit-cancel-btn';
const btnRow = document.createElement('div');
btnRow.className = 'edit-btns';
btnRow.appendChild(saveBtn);
btnRow.appendChild(cancelBtn);
render_icons();
msgDiv.innerHTML = '';
msgDiv.appendChild(ta);
@@ -641,6 +774,7 @@
if (role === 'assistant' && typeof marked !== 'undefined') {
div.dataset.raw = text;
div.innerHTML = marked.parse(text);
highlight_code(div);
div.querySelectorAll('a').forEach(a => {
a.target = '_blank';
a.rel = 'noopener noreferrer';
@@ -651,10 +785,81 @@
}
}
// ── Agent tool-call step cards ────────────────────────────────
function renderToolCalls(toolCalls, beforeEl) {
if (!toolCalls || toolCalls.length === 0) return;
const container = document.createElement('div');
container.className = 'tool-calls-container';
for (const tc of toolCalls) {
const details = document.createElement('details');
details.className = 'tool-call';
// Summary: name + first arg value snippet
const args = tc.args || {};
const argKeys = Object.keys(args);
let argSnippet = '';
if (argKeys.length > 0) {
const firstVal = String(args[argKeys[0]]);
argSnippet = firstVal.length > 60 ? firstVal.slice(0, 60) + '…' : firstVal;
}
const summary = document.createElement('summary');
const nameSpan = document.createElement('span');
nameSpan.className = 'tc-name';
nameSpan.textContent = tc.tool;
summary.appendChild(nameSpan);
if (argSnippet) {
const snippetSpan = document.createElement('span');
snippetSpan.className = 'tc-snippet';
snippetSpan.textContent = argSnippet;
summary.appendChild(snippetSpan);
}
details.appendChild(summary);
// Expanded body
const body = document.createElement('div');
body.className = 'tc-body';
if (argKeys.length > 0) {
const sec = document.createElement('div');
sec.className = 'tc-section';
const lbl = document.createElement('span');
lbl.className = 'tc-label';
lbl.textContent = 'args';
const pre = document.createElement('pre');
pre.textContent = JSON.stringify(args, null, 2);
sec.appendChild(lbl);
sec.appendChild(pre);
body.appendChild(sec);
}
const resultStr = tc.result || '';
const truncated = resultStr.length > 400;
const sec2 = document.createElement('div');
sec2.className = 'tc-section';
const lbl2 = document.createElement('span');
lbl2.className = 'tc-label';
lbl2.textContent = 'result';
const pre2 = document.createElement('pre');
pre2.textContent = truncated ? resultStr.slice(0, 400) + '\n…[truncated]' : resultStr;
sec2.appendChild(lbl2);
sec2.appendChild(pre2);
body.appendChild(sec2);
details.appendChild(body);
container.appendChild(details);
}
beforeEl.parentElement.insertBefore(container, beforeEl);
}
function makeCopyBtn(div) {
const btn = document.createElement('button');
btn.className = 'copy-btn';
btn.textContent = 'copy';
btn.innerHTML = icon_html('copy', 12) + ' copy';
render_icons();
btn.addEventListener('click', (e) => {
e.stopPropagation();
const text = div.dataset.raw || '';
@@ -663,11 +868,14 @@
} else {
fallbackCopy(text);
}
btn.textContent = '✓';
showToast('Copied to clipboard', 'success', 1800);
btn.innerHTML = icon_html('check', 12) + ' copied';
render_icons();
btn.classList.add('copied');
setTimeout(() => {
btn.textContent = 'copy';
btn.innerHTML = icon_html('copy', 12) + ' copy';
btn.classList.remove('copied');
render_icons();
}, 1500);
});
return btn;
@@ -701,7 +909,7 @@
});
if (!res.ok) throw new Error(`HTTP ${res.status}`);
} catch (err) {
addMessage('system', `Note save failed: ${err.message}`);
showToast(`Note save failed: ${err.message}`, 'error');
}
}
@@ -716,7 +924,7 @@
inputEl.value = '';
syncHeight();
sendBtn.style.display = 'none';
stopBtn.style.display = 'block';
stopBtn.style.display = 'flex';
headerEmoji.classList.add('processing');
activeController = new AbortController();
@@ -741,6 +949,7 @@
include_mid: memMid,
include_short: memShort,
off_record: current_mode === 'otr',
model: primaryBackend,
user: CORTEX_USER,
persona: CORTEX_PERSONA,
}),
@@ -770,15 +979,21 @@
if (data.type === 'response') {
sessionId = data.session_id;
sessionEl.textContent = `session: ${sessionId}`;
persist_session();
thinkingDiv.className = 'message assistant';
setMessageText(thinkingDiv, 'assistant', data.response);
const assistHistIdx = currentHistory.length;
currentHistory.push({ role: 'assistant', content: data.response });
attachHistoryControls(thinkingDiv, assistHistIdx);
if (data.fallback_used) {
addMessage('system',
`${primaryBackend} unavailable — answered by ${data.backend}`);
}
// Model tag — always shown, amber if fallback was used
const modelTag = document.createElement('div');
modelTag.className = 'model-tag' + (data.fallback_used ? ' fallback' : '');
const label = data.backend_label || data.backend || '';
modelTag.textContent = data.fallback_used
? `⚡ fallback → ${label}`
: label;
thinkingDiv.appendChild(modelTag);
} else if (data.type === 'error') {
throw new Error(data.message);
}
@@ -808,7 +1023,7 @@
inputEl.value = '';
syncHeight();
sendBtn.style.display = 'none';
stopBtn.style.display = 'block';
stopBtn.style.display = 'flex';
headerEmoji.classList.add('processing');
activeController = new AbortController();
@@ -870,6 +1085,7 @@
if (job.session_id) {
sessionId = job.session_id;
sessionEl.textContent = `session: ${sessionId}`;
persist_session();
}
const userHistIdx = currentHistory.length - 1; // pushed before fetch
@@ -881,11 +1097,7 @@
currentHistory.push({ role: 'assistant', content: job.response || '' });
attachHistoryControls(thinkingDiv, assistHistIdx);
const n = job.tool_calls?.length || 0;
if (n) {
const names = job.tool_calls.map(t => t.name).join(', ');
addMessage('system', `${n} tool call${n !== 1 ? 's' : ''}: ${names}`);
}
renderToolCalls(job.tool_calls, thinkingDiv.parentElement);
} catch (err) {
if (err.name === 'AbortError') {
@@ -926,17 +1138,94 @@
// ── File editor ──────────────────────────────────────────────
const fileModal = document.getElementById('file-modal');
const fileSelect = document.getElementById('file-select');
const fileSidebar = document.getElementById('file-sidebar');
const fileEditor = document.getElementById('file-editor');
const filePreview = document.getElementById('file-preview');
const fileRawBtn = document.getElementById('file-raw-btn');
const filePreviewBtn = document.getElementById('file-preview-btn');
const fileSaveBtn = document.getElementById('file-save-btn');
const fileSavedMsg = document.getElementById('file-saved-msg');
const fileCloseBtn = document.getElementById('file-close-btn');
const filesBtn = document.getElementById('files-btn');
let fileMode = 'preview'; // 'edit' or 'preview'
let fileMode = 'preview'; // 'edit' or 'preview'
let activeFileName = null;
// File groups — controls sidebar order and section labels
const FILE_GROUPS = [
{ label: 'Identity', files: ['IDENTITY.md', 'SOUL.md', 'PROTOCOLS.md', 'CONTEXT_TIERS.md'] },
{ label: 'Memory', files: ['MEMORY_LONG.md', 'MEMORY_MID.md', 'MEMORY_SHORT.md'] },
{ label: 'Profile', files: ['USER.md', 'HELP.md'] },
];
function fmtSize(bytes) {
if (!bytes) return 'empty';
if (bytes < 1024) return bytes + ' B';
return (bytes / 1024).toFixed(1) + ' KB';
}
function fmtModified(ts) {
if (!ts) return '';
const d = new Date(ts * 1000);
const now = new Date();
if (d.toDateString() === now.toDateString()) return 'today';
const diff = (now - d) / 86400000;
if (diff < 2) return 'yesterday';
return d.toLocaleDateString(undefined, { month: 'short', day: 'numeric' });
}
function renderFileSidebar(files) {
const byName = Object.fromEntries(files.map(f => [f.name, f]));
fileSidebar.innerHTML = '';
for (const group of FILE_GROUPS) {
const groupEl = document.createElement('div');
groupEl.className = 'file-group';
const header = document.createElement('div');
header.className = 'fg-header';
header.textContent = group.label;
header.addEventListener('click', () => header.classList.toggle('collapsed'));
groupEl.appendChild(header);
const items = document.createElement('div');
items.className = 'fg-items';
for (const fname of group.files) {
const f = byName[fname];
if (!f) continue;
const item = document.createElement('div');
item.className = 'file-item' + (f.exists ? '' : ' missing');
item.dataset.name = fname;
if (fname === activeFileName) item.classList.add('active');
const nameEl = document.createElement('div');
nameEl.className = 'fi-name';
nameEl.textContent = fname;
item.appendChild(nameEl);
const metaEl = document.createElement('div');
metaEl.className = 'fi-meta';
metaEl.innerHTML = `<span>${fmtSize(f.size)}</span>`
+ (f.modified ? `<span>${fmtModified(f.modified)}</span>` : '');
item.appendChild(metaEl);
item.addEventListener('click', () => loadFile(fname));
items.appendChild(item);
}
groupEl.appendChild(items);
fileSidebar.appendChild(groupEl);
}
}
function setActiveFile(name) {
activeFileName = name;
fileSidebar.querySelectorAll('.file-item').forEach(el => {
el.classList.toggle('active', el.dataset.name === name);
});
document.getElementById('file-modal-title').textContent = name;
}
function setFileMode(mode) {
fileMode = mode;
@@ -960,27 +1249,22 @@
}
async function loadFile(name) {
setActiveFile(name);
const res = await fetch(`/files/${encodeURIComponent(name)}?${_fileParams}`);
if (!res.ok) { fileEditor.value = `Error loading ${name}`; return; }
const data = await res.json();
fileEditor.value = data.content;
document.getElementById('file-modal-title').textContent = name;
setFileMode(fileMode);
}
async function openFileModal() {
// Populate the file list
const res = await fetch(`/files?${_fileParams}`);
const res = await fetch(`/files?${_fileParams}`);
const data = await res.json();
fileSelect.innerHTML = '';
for (const f of data.files) {
const opt = document.createElement('option');
opt.value = f.name;
opt.textContent = f.name + (f.exists ? '' : ' (missing)');
fileSelect.appendChild(opt);
}
renderFileSidebar(data.files);
fileModal.classList.add('open');
await loadFile(fileSelect.value);
// Load first existing file
const first = data.files.find(f => f.exists) || data.files[0];
if (first) await loadFile(first.name);
}
filesBtn.addEventListener('click', () => {
@@ -988,21 +1272,24 @@
openFileModal();
});
fileSelect.addEventListener('change', () => loadFile(fileSelect.value));
fileRawBtn.addEventListener('click', () => setFileMode('edit'));
filePreviewBtn.addEventListener('click', () => setFileMode('preview'));
fileSaveBtn.addEventListener('click', async () => {
const name = fileSelect.value;
const res = await fetch(`/files/${encodeURIComponent(name)}?${_fileParams}`, {
if (!activeFileName) return;
const res = await fetch(`/files/${encodeURIComponent(activeFileName)}?${_fileParams}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ content: fileEditor.value }),
});
if (res.ok) {
fileSavedMsg.classList.add('show');
setTimeout(() => fileSavedMsg.classList.remove('show'), 2000);
showToast('File saved', 'success');
// Refresh sidebar to update size/modified
const listRes = await fetch(`/files?${_fileParams}`);
const listData = await listRes.json();
renderFileSidebar(listData.files);
} else {
showToast('Save failed', 'error');
}
});
@@ -1012,6 +1299,66 @@
if (e.target === fileModal) fileModal.classList.remove('open');
});
// ── Session search ────────────────────────────────────────────
const sessionSearchInput = document.getElementById('session-search-input');
const sessionSearchBtn = document.getElementById('session-search-btn');
const sessionSearchResults = document.getElementById('session-search-results');
function _showFileView() {
fileEditor.style.display = '';
filePreview.style.display = '';
sessionSearchResults.style.display = 'none';
}
function _showSearchResults(html) {
fileEditor.style.display = 'none';
filePreview.style.display = 'none';
sessionSearchResults.style.display = '';
sessionSearchResults.innerHTML = html;
}
async function runSessionSearch() {
const q = sessionSearchInput.value.trim();
if (q.length < 2) return;
sessionSearchBtn.disabled = true;
sessionSearchBtn.textContent = '…';
try {
const res = await fetch(`/sessions/search?q=${encodeURIComponent(q)}&${_fileParams}&limit=30`);
const data = await res.json();
if (!res.ok) { _showSearchResults(`<p class="sr-error">Error: ${data.detail || res.status}</p>`); return; }
if (!data.matches.length) {
_showSearchResults(`<p class="sr-empty">No results for "<strong>${_esc(q)}</strong>" in ${data.total_files_searched} session file(s).</p>`);
return;
}
let html = `<div class="sr-header">${data.matches.length} result(s) for "<strong>${_esc(q)}</strong>" across ${data.total_files_searched} session(s)</div>`;
let lastDate = null;
for (const m of data.matches) {
if (m.date !== lastDate) {
html += `<div class="sr-date">${m.date}</div>`;
lastDate = m.date;
}
const hi = m.excerpt.replace(new RegExp(_esc(q), 'gi'), s => `<mark>${_esc(s)}</mark>`);
html += `<div class="sr-excerpt">${hi}</div>`;
}
_showSearchResults(html);
} catch (e) {
_showSearchResults(`<p class="sr-error">Search failed: ${e.message}</p>`);
} finally {
sessionSearchBtn.disabled = false;
sessionSearchBtn.textContent = 'Go';
}
}
sessionSearchBtn.addEventListener('click', runSessionSearch);
sessionSearchInput.addEventListener('keydown', (e) => {
if (e.key === 'Enter') runSessionSearch();
});
// When a file is clicked, switch back from search results to editor
fileSidebar.addEventListener('click', () => {
if (sessionSearchResults.style.display !== 'none') _showFileView();
});
document.addEventListener('keydown', (e) => {
if (e.key === 'Escape') {
if (fileModal.classList.contains('open')) fileModal.classList.remove('open');
@@ -1026,6 +1373,13 @@
// ── Real-time Talk updates (SSE) ─────────────────────────────
const evtSource = new EventSource('/events');
// Close cleanly on navigation so the browser doesn't log "connection interrupted"
window.addEventListener('beforeunload', () => evtSource.close());
evtSource.onerror = () => {
// EventSource auto-reconnects — nothing to do; suppress console noise
};
evtSource.onmessage = (e) => {
let data;
try { data = JSON.parse(e.data); } catch { return; }
@@ -1286,3 +1640,16 @@
checkAuthStatus();
// Re-check every 30 minutes
setInterval(checkAuthStatus, 30 * 60 * 1000);
// ── Initial render ────────────────────────────────────────────
// Process all static Lucide SVGs in the header + stop button,
// and seed the mode UI (which also calls render_icons internally).
update_mode_ui();
render_icons();
// ── Auto-restore last session ─────────────────────────────────
// Silently resume if within the inactivity TTL; clears stored ID on error.
{
const stored = get_stored_session();
if (stored) resumeSession(stored, true).catch(clear_stored_session);
}
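
Note on the session-search hunk above: the highlight step builds a `RegExp` directly from the query text, so a query containing regex metacharacters such as `(` or `[` could throw or match unexpectedly, and the excerpt is placed into the result markup as-is. The following is a minimal sketch of one way to handle both concerns, not the code from this commit. `escapeHtml`, `escapeRegExp`, and `highlightExcerpt` are hypothetical names introduced for illustration; the diff's own `_esc` helper is not shown here and its behavior is not assumed.

```javascript
// Hypothetical helpers for illustration only; none of these names come from the diff.

// Escape text so it can be placed safely inside HTML markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Escape regex metacharacters so the query is matched literally.
function escapeRegExp(text) {
  return String(text).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap each case-insensitive occurrence of `query` in <mark>,
// HTML-escaping both the matched and the unmatched parts of the excerpt.
function highlightExcerpt(excerpt, query) {
  const text = String(excerpt);
  if (!query) return escapeHtml(text);
  const pattern = new RegExp(escapeRegExp(query), 'gi');
  let html = '';
  let last = 0;
  for (const match of text.matchAll(pattern)) {
    html += escapeHtml(text.slice(last, match.index));
    html += `<mark>${escapeHtml(match[0])}</mark>`;
    last = match.index + match[0].length;
  }
  return html + escapeHtml(text.slice(last));
}
```

In the loop over `data.matches`, a call such as `highlightExcerpt(m.excerpt, q)` would take the place of the inline `replace` call, under the assumption that the server returns excerpts as plain text.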


@@ -27,14 +27,30 @@
margin: 0 auto;
}
.back-link {
display: inline-block;
font-size: 0.8rem;
color: #94a3b8;
text-decoration: none;
margin-bottom: 1.5rem;
.page-nav {
display: flex;
align-items: center;
gap: 0.25rem;
margin-bottom: 1.75rem;
flex-wrap: wrap;
}
.back-link:hover { color: #a78bfa; }
.nav-link {
display: inline-flex;
align-items: center;
padding: 0.3rem 0.6rem;
border-radius: 6px;
font-size: 0.8rem;
font-weight: 500;
color: #64748b;
text-decoration: none;
transition: color 0.15s, background 0.15s;
white-space: nowrap;
}
.nav-link:hover { color: #cbd5e1; background: rgba(255,255,255,0.05); }
.nav-link.active { color: #a78bfa; }
.nav-spacer { flex: 1; min-width: 0.5rem; }
.nav-link.nav-logout { color: #475569; }
.nav-link.nav-logout:hover { color: #94a3b8; background: none; }
header {
margin-bottom: 2rem;
@@ -106,7 +122,13 @@
</head>
<body>
<div class="page">
<a id="back-link" href="/" class="back-link">← Back to Cortex</a>
<nav class="page-nav" id="page-nav">
<a id="nav-chat" href="/" class="nav-link">← Chat</a>
<a href="/help" class="nav-link active">Help</a>
<a href="/settings" class="nav-link" id="nav-settings">Settings</a>
<span class="nav-spacer"></span>
<a href="/logout" class="nav-link nav-logout">Sign out</a>
</nav>
<header>
<h1>Help &amp; Reference</h1>
@@ -122,8 +144,8 @@
const persona = cfg.persona || 'inara';
const params = `user=${encodeURIComponent(user)}&persona=${encodeURIComponent(persona)}`;
// Wire up back link and persona label
document.getElementById('back-link').href = cfg.backHref || '/';
// Wire up nav links and persona label
document.getElementById('nav-chat').href = cfg.backHref || '/';
if (persona) {
document.getElementById('persona-label').textContent =
`${persona.charAt(0).toUpperCase() + persona.slice(1)} · ${user}`;
@@ -155,11 +177,25 @@
async function loadHelp() {
try {
const res = await fetch(`/files/HELP.md?${params}`);
if (!res.ok) throw new Error(`HTTP ${res.status}`);
const data = await res.json();
// Always load the shared base from static
const baseRes = await fetch('/static/HELP.md');
if (!baseRes.ok) throw new Error(`HTTP ${baseRes.status}`);
let markdown = await baseRes.text();
// Try to load persona-specific additions and append them
try {
const personaRes = await fetch(`/files/HELP.md?${params}`);
if (personaRes.ok) {
const personaData = await personaRes.json();
const extra = (personaData.content || '').trim();
if (extra) {
markdown += '\n\n---\n\n## ' + persona.charAt(0).toUpperCase() + persona.slice(1) + ' Notes\n\n' + extra;
}
}
} catch (_) { /* persona-specific file is optional */ }
const body = document.getElementById('help-body');
body.innerHTML = marked.parse(data.content);
body.innerHTML = marked.parse(markdown);
body.querySelectorAll('a').forEach(a => {
a.target = '_blank'; a.rel = 'noopener noreferrer';
});
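
Note on the `loadHelp` hunk above: the new flow always loads the shared `/static/HELP.md` and appends the persona-specific file only when it exists and is non-empty. The merge step can be sketched as a small pure function, which makes the behavior easy to check without a network call. `composeHelpMarkdown` is a hypothetical name used for illustration and does not appear in the diff.

```javascript
// Hypothetical helper for illustration; mirrors the string handling shown in the hunk.
// base: shared help markdown; persona: persona id such as 'inara';
// personaContent: optional persona-specific markdown (may be empty or missing).
function composeHelpMarkdown(base, persona, personaContent) {
  const extra = (personaContent || '').trim();
  if (!extra) return base; // persona-specific notes are optional
  const title = persona.charAt(0).toUpperCase() + persona.slice(1);
  return `${base}\n\n---\n\n## ${title} Notes\n\n${extra}`;
}
```

With a helper like this, `loadHelp` would only fetch the two sources and pass the result to `marked.parse`.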


@@ -21,6 +21,9 @@
</script>
<link rel="stylesheet" href="/static/style.css">
<script src="/static/marked.min.js"></script>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/styles/atom-one-dark.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/highlight.min.js"></script>
<script src="https://unpkg.com/lucide@latest/dist/umd/lucide.min.js"></script>
</head>
<body>
<header>
@@ -32,20 +35,35 @@
</div>
<nav id="hdr-nav">
<button id="sessions-btn" class="hdr-btn" title="Sessions">💬 <span class="btn-label">Sessions</span></button>
<button id="ctx-open-btn" class="hdr-btn" title="Context &amp; memory"><span class="tier-badge">2</span></button>
<button id="sessions-btn" class="hdr-btn" title="Sessions">
<svg data-lucide="history" class="btn-icon"></svg>
<span class="btn-label">Sessions</span>
</button>
<button id="ctx-open-btn" class="hdr-btn" title="Context &amp; memory">
<svg data-lucide="sliders-horizontal" class="btn-icon"></svg><span class="tier-badge">2</span>
</button>
<div class="hdr-dropdown-wrap" id="settings-wrap">
<button class="hdr-btn" id="settings-btn" title="Settings"></button>
<button class="hdr-btn" id="settings-btn" title="Settings">
<svg data-lucide="menu" class="btn-icon"></svg>
</button>
<div class="hdr-dropdown" id="settings-dropdown">
<button id="files-btn" class="hdr-dd-item">📁 Files</button>
<a href="/settings" class="hdr-dd-item">👤 Account</a>
<button id="files-btn" class="hdr-dd-item">
<svg data-lucide="folder-open" class="btn-icon"></svg> Files
</button>
<a href="/settings" class="hdr-dd-item">
<svg data-lucide="user" class="btn-icon"></svg> Account
</a>
<div class="hdr-dd-divider"></div>
<form method="POST" action="/logout" style="margin:0">
<button type="submit" class="hdr-dd-item">⏏ Sign Out</button>
<button type="submit" class="hdr-dd-item">
<svg data-lucide="log-out" class="btn-icon"></svg> Sign Out
</button>
</form>
</div>
</div>
<a href="/help" class="hdr-btn" title="Help &amp; reference" style="text-decoration:none"></a>
<a id="help-link" href="/help" class="hdr-btn" title="Help &amp; reference" style="text-decoration:none">
<svg data-lucide="circle-help" class="btn-icon"></svg>
</a>
</nav>
<div id="sessions-panel"></div>
@@ -85,6 +103,7 @@
<div class="ctx-row">
<button id="backend-toggle" class="ctx-btn" title="Click to switch primary backend">claude</button>
</div>
<div id="backend-model-hint"></div>
</div>
<div class="ctx-section">
<div class="ctx-section-title">Display</div>
@@ -107,16 +126,28 @@
<div id="file-modal-inner">
<div id="file-modal-header">
<span id="file-modal-title">Context Files</span>
<select id="file-select"></select>
<span class="fm-spacer"></span>
<button class="fm-btn" id="file-raw-btn">edit</button>
<button class="fm-btn active" id="file-preview-btn">preview</button>
<button class="fm-btn save" id="file-save-btn">Save</button>
<span id="file-saved-msg">saved ✓</span>
<button class="fm-btn" id="file-close-btn"></button>
</div>
<div id="file-modal-body">
<textarea id="file-editor" spellcheck="false"></textarea>
<div id="file-preview"></div>
<div id="file-modal-content">
<div id="file-sidebar-wrap">
<div id="file-sidebar"></div>
<div id="session-search-wrap">
<div id="session-search-label">Session Search</div>
<div id="session-search-row">
<input id="session-search-input" type="search" placeholder="Search sessions…" autocomplete="off">
<button id="session-search-btn">Go</button>
</div>
</div>
</div>
<div id="file-modal-body">
<textarea id="file-editor" spellcheck="false"></textarea>
<div id="file-preview"></div>
<div id="session-search-results" style="display:none"></div>
</div>
</div>
</div>
</div>
@@ -149,10 +180,12 @@
<textarea id="input" rows="1" placeholder="Message…" autofocus></textarea>
<div id="send-col">
<button id="send">Send</button>
<button id="stop">Stop</button>
<button id="stop"><svg data-lucide="square" width="14" height="14" class="btn-icon"></svg> Stop</button>
</div>
</div>
<div id="sessions-backdrop"></div>
<div id="toast-container"></div>
<script src="/static/app.js"></script>
</body>
</html>


@@ -0,0 +1,483 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Cortex — Model Registry</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
<style>
*, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
body {
min-height: 100vh;
background: #0f1117;
font-family: 'Inter', system-ui, -apple-system, sans-serif;
font-weight: 450;
-webkit-font-smoothing: antialiased;
color: #e2e8f0;
padding: 2rem 1.5rem 4rem;
}
.page { max-width: 700px; margin: 0 auto; }
/* ── Nav ── */
.page-nav {
display: flex; align-items: center; gap: 0.25rem;
margin-bottom: 1.75rem; flex-wrap: wrap;
}
.nav-link {
display: inline-flex; align-items: center;
padding: 0.3rem 0.6rem; border-radius: 6px;
font-size: 0.8rem; font-weight: 500; color: #64748b;
text-decoration: none; transition: color 0.15s, background 0.15s;
white-space: nowrap;
}
.nav-link:hover { color: #cbd5e1; background: rgba(255,255,255,0.05); }
.nav-link.active { color: #a78bfa; }
.nav-spacer { flex: 1; min-width: 0.5rem; }
.nav-link.nav-logout { color: #475569; }
.nav-link.nav-logout:hover { color: #94a3b8; background: none; }
/* ── Page header ── */
.page-header { margin-bottom: 2rem; padding-bottom: 1rem; border-bottom: 1px solid #2d3148; }
.page-header h1 { font-size: 1.4rem; font-weight: 700; color: #a78bfa; }
.page-header p { font-size: 0.82rem; color: #94a3b8; margin-top: 0.25rem; }
/* ── Section cards ── */
.section {
background: #1a1d27; border: 1px solid #2d3148;
border-radius: 10px; padding: 1.5rem; margin-bottom: 1.25rem;
}
.section h2 {
font-size: 0.85rem; font-weight: 600; color: #94a3b8;
text-transform: uppercase; letter-spacing: 0.05em;
margin-bottom: 1.1rem; padding-bottom: 0.5rem;
border-bottom: 1px solid #2d3148;
}
.section-note {
font-size: 0.8rem; color: #64748b; margin-bottom: 1rem; line-height: 1.5;
}
/* ── Form elements ── */
.field { margin-bottom: 0.9rem; }
label {
display: block; font-size: 0.78rem; font-weight: 500;
color: #94a3b8; margin-bottom: 0.35rem;
}
input[type="text"], input[type="password"], input[type="url"],
input[type="number"], select {
width: 100%; padding: 0.6rem 0.8rem;
background: #0f1117; border: 1px solid #2d3148; border-radius: 6px;
color: #e2e8f0; font-size: 0.9rem; font-family: inherit;
outline: none; transition: border-color 0.15s;
}
input:focus, select:focus { border-color: #7c3aed; }
select { cursor: pointer; }
input[type="number"] { width: 6rem; }
.field-row { display: flex; gap: 0.75rem; }
.field-row .field { flex: 1; margin-bottom: 0; }
.key-status { font-size: 0.75rem; color: #94a3b8; margin-top: 0.35rem; }
/* ── Buttons ── */
.btn {
padding: 0.6rem 1.1rem; border: none; border-radius: 6px;
font-size: 0.88rem; font-weight: 600; cursor: pointer;
transition: background 0.15s, opacity 0.15s; font-family: inherit;
}
.btn-primary { background: #7c3aed; color: #fff; }
.btn-primary:hover { background: #6d28d9; }
.btn-secondary {
background: #1a1d27; color: #94a3b8;
border: 1px solid #2d3148;
}
.btn-secondary:hover { border-color: #94a3b8; color: #e2e8f0; }
.btn-sm { padding: 0.35rem 0.7rem; font-size: 0.8rem; font-weight: 500; }
.btn-row { display: flex; gap: 0.6rem; align-items: center; margin-top: 0.75rem; flex-wrap: wrap; }
.btn-link {
background: none; border: none; cursor: pointer; font-family: inherit;
font-size: 0.78rem; color: #64748b; padding: 0; text-decoration: underline;
text-underline-offset: 2px;
}
.btn-link:hover { color: #94a3b8; }
.btn-link.danger { color: #7f1d1d; }
.btn-link.danger:hover { color: #f87171; }
/* ── Host rows ── */
.host-row {
background: #0f1117; border: 1px solid #2d3148; border-radius: 8px;
padding: 1rem; margin-bottom: 0.75rem;
}
.host-form .field-row { margin-bottom: 0.6rem; }
.fetch-status { font-size: 0.78rem; color: #94a3b8; }
.fetch-status.ok { color: #4ade80; }
.fetch-status.err { color: #f87171; }
/* ── Model rows ── */
.model-row {
display: flex; align-items: flex-start; justify-content: space-between;
gap: 0.75rem; padding: 0.75rem 0.9rem;
background: #0f1117; border: 1px solid #2d3148; border-radius: 8px;
margin-bottom: 0.5rem;
}
.model-info { display: flex; flex-direction: column; gap: 0.2rem; min-width: 0; }
.model-label { font-size: 0.9rem; font-weight: 600; color: #e2e8f0; }
.model-name { font-size: 0.75rem; color: #64748b; font-family: monospace; word-break: break-all; }
.model-host { font-size: 0.72rem; color: #475569; }
.ctx-badge {
display: inline-block; margin-left: 0.4rem;
padding: 0.1rem 0.35rem; border-radius: 3px;
background: #1e293b; color: #64748b;
font-size: 0.67rem; font-weight: 600;
}
.tag-row { display: flex; flex-wrap: wrap; gap: 0.3rem; margin-top: 0.2rem; }
.tag {
padding: 0.1rem 0.4rem; border-radius: 3px;
background: #1e1b4b; color: #818cf8;
font-size: 0.68rem; font-weight: 500;
}
.model-actions { display: flex; gap: 0.4rem; flex-shrink: 0; }
.row-btn {
padding: 0.3rem 0.65rem; border-radius: 5px; font-size: 0.78rem;
font-weight: 500; cursor: pointer; font-family: inherit;
border: 1px solid #2d3148; background: #1a1d27; color: #94a3b8;
transition: border-color 0.15s, color 0.15s;
}
.row-btn.danger { color: #f87171; }
.row-btn.danger:hover { border-color: #f87171; }
/* ── Role assignment rows ── */
.role-row {
display: flex; align-items: flex-start; gap: 1rem;
padding: 0.6rem 0; border-bottom: 1px solid #1e2030;
}
.role-row:last-child { border-bottom: none; }
.role-name {
font-size: 0.82rem; font-weight: 600; color: #a78bfa;
min-width: 6rem; padding-top: 0.45rem;
}
.role-slots { display: flex; flex-wrap: wrap; gap: 0.5rem; flex: 1; }
.role-slot { display: flex; flex-direction: column; gap: 0.2rem; flex: 1; min-width: 8rem; }
.slot-label { font-size: 0.68rem; color: #475569; font-weight: 500; text-transform: uppercase; letter-spacing: 0.04em; }
.role-select {
padding: 0.4rem 0.6rem; font-size: 0.8rem;
background: #0f1117; border: 1px solid #2d3148; border-radius: 6px;
color: #e2e8f0; font-family: inherit; cursor: pointer; outline: none;
transition: border-color 0.15s;
}
.role-select:focus { border-color: #7c3aed; }
.role-select.saved { border-color: #166534; }
.role-select.saving { border-color: #92400e; }
.role-select.err { border-color: #7f1d1d; }
/* ── Add model section ── */
#add-section .field-row { margin-bottom: 0.5rem; }
#model-select-wrap { display: none; margin-bottom: 0.75rem; }
.tags-hint { font-size: 0.72rem; color: #475569; margin-top: 0.3rem; }
/* ── Messages ── */
.msg {
font-size: 0.85rem; text-align: center;
padding: 0.6rem 1rem; border-radius: 6px; margin-bottom: 1rem;
}
.msg.success { color: #4ade80; background: #052e16; border: 1px solid #166534; }
.msg.error { color: #f87171; background: #2d0a0a; border: 1px solid #7f1d1d; }
/* ── Toast ── */
#toast {
position: fixed; bottom: 1.5rem; right: 1.5rem;
background: #1a1d27; border: 1px solid #166534; color: #4ade80;
padding: 0.5rem 1rem; border-radius: 6px; font-size: 0.82rem;
opacity: 0; transition: opacity 0.2s; pointer-events: none;
z-index: 100;
}
#toast.show { opacity: 1; }
#toast.err { border-color: #7f1d1d; color: #f87171; }
.empty-note { font-size: 0.85rem; color: #475569; padding: 0.3rem 0; }
</style>
</head>
<body>
<div class="page">
<nav class="page-nav">
<a href="/" class="nav-link">← Chat</a>
<a href="/help" class="nav-link">Help</a>
<a href="/settings" class="nav-link">Settings</a>
<a href="/settings/local" class="nav-link active">Models</a>
<span class="nav-spacer"></span>
<a href="/logout" class="nav-link nav-logout">Sign out</a>
</nav>
<div class="page-header">
<h1>Model Registry</h1>
<p>Configure hosts, models, and which model handles each task type.</p>
</div>
<!-- SUCCESS -->
<!-- ERROR -->
<!-- ── Hosts ── -->
<div class="section">
<h2>Hosts</h2>
<p class="section-note">OpenAI-compatible API servers (Open WebUI, Ollama, LM Studio, etc.)</p>
{{ host_rows }}
<details style="margin-top:0.75rem">
<summary style="font-size:0.82rem; color:#64748b; cursor:pointer; user-select:none">+ Add host</summary>
<div style="margin-top:0.75rem">
<form method="POST" action="/settings/local/host">
<input type="hidden" name="host_id" value="">
<div class="field-row">
<div class="field">
<label for="new-host-label">Label</label>
<input type="text" id="new-host-label" name="label"
placeholder="e.g. Gaming Laptop"
autocomplete="off" data-form-type="other">
</div>
<div class="field" style="flex:2">
<label for="new-host-url">API URL</label>
<input type="text" id="new-host-url" name="api_url"
placeholder="http://192.168.x.x:3000"
autocomplete="off" spellcheck="false" data-form-type="other">
</div>
</div>
<div class="field-row">
<div class="field">
<label for="new-host-key">API Key</label>
<input type="password" id="new-host-key" name="api_key"
placeholder="sk-… (leave blank if not required)"
autocomplete="new-password" data-1p-ignore data-lpignore="true" data-form-type="other">
</div>
<div class="field" style="flex:0 0 auto">
<label for="new-host-type">Type</label>
<select id="new-host-type" name="host_type">
<option value="openwebui">Open WebUI / Ollama</option>
<option value="openai">OpenAI-compatible (OpenRouter, etc.)</option>
</select>
</div>
</div>
<div class="btn-row">
<button type="submit" class="btn btn-primary btn-sm">Add Host</button>
</div>
</form>
</div>
</details>
</div>
<!-- ── Models ── -->
<div class="section">
<h2>Models</h2>
{{ model_rows }}
</div>
<!-- ── Add Model ── -->
<div class="section" id="add-section"{{ add_model_hidden }}>
<h2>Add Model</h2>
<div id="model-select-wrap">
<div class="field">
<label for="model-picker">Available on host</label>
<select id="model-picker">
<option value="">— select to auto-fill —</option>
</select>
</div>
</div>
<form method="POST" action="/settings/local/models/add" id="add-form">
<input type="hidden" name="host_id" id="add-host-id" value="">
<div class="field">
<label for="add-host-select">Host</label>
<select id="add-host-select" onchange="document.getElementById('add-host-id').value=this.value">
{{ host_options }}
</select>
</div>
<div class="field-row">
<div class="field">
<label for="add-label">Label</label>
<input type="text" id="add-label" name="label"
placeholder="e.g. Gemma 4 E4B"
autocomplete="off" data-form-type="other">
</div>
<div class="field" style="flex:2">
<label for="add-model-name">Model name</label>
<input type="text" id="add-model-name" name="model_name"
placeholder="e.g. gemma4:e4b"
autocomplete="off" spellcheck="false" data-form-type="other">
</div>
</div>
<div class="field-row">
<div class="field" style="flex:0 0 auto">
<label for="add-context-k">Context (k tokens)</label>
<input type="number" id="add-context-k" name="context_k"
value="0" min="0" max="10000">
</div>
<div class="field">
<label for="add-tags">Tags <span style="color:#475569; font-weight:400">(comma-separated)</span></label>
<input type="text" id="add-tags" name="tags"
placeholder="fast, distill, coding"
autocomplete="off" data-form-type="other">
<p class="tags-hint">Informational labels — used for display and future filtering.</p>
</div>
</div>
<div class="btn-row">
<button type="submit" class="btn btn-primary btn-sm">Add Model</button>
<button type="button" id="fetch-btn" class="btn btn-secondary btn-sm">
Fetch models from host
</button>
<span id="fetch-status" class="fetch-status"></span>
</div>
</form>
</div>
<!-- ── Role Assignments ── -->
<div class="section">
<h2>Role Assignments</h2>
<p class="section-note">
Choose which model handles each task type.
Backups are tried in order if the primary fails or is unavailable.
Leave a slot empty to use the server default (.env).
</p>
{{ role_rows }}
</div>
</div>
<div id="toast"></div>
<script>
// ── Pre-fill role selects ─────────────────────────────────────────────────
const ROLE_DATA = {{ role_data_js }};
document.querySelectorAll('.role-select').forEach(sel => {
const role = sel.dataset.role;
const slot = sel.dataset.slot;
const val = (ROLE_DATA[role] || {})[slot] || '';
for (const opt of sel.options) {
if (opt.value === val) { opt.selected = true; break; }
}
});
// ── Role select change → AJAX save ───────────────────────────────────────
const toast = document.getElementById('toast');
let toastTimer = null;
function showToast(msg, err = false) {
toast.textContent = msg;
toast.className = 'show' + (err ? ' err' : '');
clearTimeout(toastTimer);
toastTimer = setTimeout(() => { toast.className = ''; }, 2000);
}
document.querySelectorAll('.role-select').forEach(sel => {
sel.addEventListener('change', async () => {
const role = sel.dataset.role;
const slot = sel.dataset.slot;
const model_id = sel.value || null;
sel.classList.add('saving');
try {
const res = await fetch('/api/models/role', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({role, slot, model_id}),
});
const data = await res.json();
if (data.ok) {
sel.classList.replace('saving', 'saved');
showToast(`${role}${slot} saved`);
setTimeout(() => sel.classList.remove('saved'), 1200);
} else {
sel.classList.replace('saving', 'err');
showToast(data.error || 'Save failed', true);
setTimeout(() => sel.classList.remove('err'), 2000);
}
} catch (e) {
sel.classList.replace('saving', 'err');
showToast(e.message, true);
}
});
});
// ── Fetch models from host ────────────────────────────────────────────────
// Per-host "Fetch models" buttons in the host rows
document.querySelectorAll('.fetch-btn').forEach(btn => {
btn.addEventListener('click', () => fetchModels(btn.dataset.hostId, btn));
});
// "Fetch models from host" in Add Model section (uses selected host)
const globalFetchBtn = document.getElementById('fetch-btn');
if (globalFetchBtn) {
globalFetchBtn.addEventListener('click', () => {
const hostSel = document.getElementById('add-host-select');
const hostId = hostSel ? hostSel.value : '';
fetchModels(hostId, globalFetchBtn, true);
});
}
async function fetchModels(hostId, btn, fillAddForm = false) {
const statusEl = fillAddForm
? document.getElementById('fetch-status')
: document.getElementById('fetch-' + hostId);
btn.disabled = true;
if (statusEl) { statusEl.textContent = 'Fetching…'; statusEl.className = 'fetch-status'; }
const url = '/api/local-llm/fetch-models' + (hostId ? '?host_id=' + encodeURIComponent(hostId) : '');
try {
const res = await fetch(url);
const data = await res.json();
if (data.error) {
if (statusEl) { statusEl.textContent = '✗ ' + data.error; statusEl.className = 'fetch-status err'; }
return;
}
if (fillAddForm) {
const picker = document.getElementById('model-picker');
const wrap = document.getElementById('model-select-wrap');
picker.innerHTML = '<option value="">— select to auto-fill —</option>';
for (const m of data.models) {
const opt = document.createElement('option');
opt.value = m.id;
opt.textContent = m.name !== m.id ? `${m.name} (${m.id})` : m.id;
opt.dataset.id = m.id;
opt.dataset.name = m.name;
picker.appendChild(opt);
}
wrap.style.display = 'block';
}
if (statusEl) {
statusEl.textContent = `${data.models.length} model${data.models.length !== 1 ? 's' : ''}`;
statusEl.className = 'fetch-status ok';
}
} catch (e) {
if (statusEl) { statusEl.textContent = '✗ ' + e.message; statusEl.className = 'fetch-status err'; }
} finally {
btn.disabled = false;
}
}
// Auto-fill label + model name when a model is selected from the picker
const picker = document.getElementById('model-picker');
if (picker) {
picker.addEventListener('change', () => {
const opt = picker.options[picker.selectedIndex];
if (!opt.value) return;
const nameInput = document.getElementById('add-model-name');
const labelInput = document.getElementById('add-label');
nameInput.value = opt.dataset.id || opt.value;
labelInput.value = (opt.dataset.name && opt.dataset.name !== opt.dataset.id)
? opt.dataset.name : '';
nameInput.focus();
});
}
// Sync hidden host_id input from the visible select
const addHostSel = document.getElementById('add-host-select');
const addHostId = document.getElementById('add-host-id');
if (addHostSel && addHostId) {
addHostId.value = addHostSel.value;
}
</script>
</body>
</html>
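
Note on the host `Type` selector in the template above: per the commit message, the `host_type` value decides which API paths are used (`/api/chat/completions` and `/api/models` for `openwebui`, `/chat/completions` and `/models` for `openai`, with `openwebui` as the default). The sketch below illustrates that mapping in JavaScript for readability only; the actual selection lives in the Python service code, which is not shown in this section. `HOST_PATHS` and `endpointFor` are hypothetical names, and the trailing-slash trimming is an assumption.

```javascript
// Illustrative mapping only; names are hypothetical and not taken from the commit.
const HOST_PATHS = {
  openwebui: { chat: '/api/chat/completions', models: '/api/models' },
  openai: { chat: '/chat/completions', models: '/models' },
};

// Build the full endpoint URL for a host.
// Unknown or missing host types fall back to 'openwebui', the stated default.
function endpointFor(apiUrl, hostType, kind) {
  const paths = HOST_PATHS[hostType] || HOST_PATHS.openwebui;
  if (!(kind in paths)) throw new Error(`unknown endpoint kind: ${kind}`);
  // Assumption: trailing slashes on the base URL are trimmed before joining.
  return apiUrl.replace(/\/+$/, '') + paths[kind];
}
```

For example, an OpenRouter host entered as `https://openrouter.ai/api/v1` with type `openai` would resolve its model list to `https://openrouter.ai/api/v1/models`.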


@@ -90,6 +90,40 @@
button[type="submit"]:hover { background: #6d28d9; }
.divider {
display: flex;
align-items: center;
gap: 0.75rem;
margin: 1.25rem 0;
color: #475569;
font-size: 0.78rem;
}
.divider::before, .divider::after {
content: '';
flex: 1;
border-top: 1px solid #2d3148;
}
.google-btn {
display: flex;
align-items: center;
justify-content: center;
gap: 0.6rem;
width: 100%;
padding: 0.65rem;
background: #fff;
border: 1px solid #dadce0;
border-radius: 6px;
color: #3c4043;
font-size: 0.95rem;
font-weight: 500;
font-family: inherit;
cursor: pointer;
text-decoration: none;
transition: background 0.15s, box-shadow 0.15s;
}
.google-btn:hover { background: #f8f9fa; box-shadow: 0 1px 4px rgba(0,0,0,0.2); }
.error {
color: #f87171;
font-size: 0.85rem;
@@ -107,6 +141,18 @@
<!-- ERROR -->
<a href="/auth/google" class="google-btn">
<svg width="18" height="18" viewBox="0 0 18 18" xmlns="http://www.w3.org/2000/svg">
<path d="M17.64 9.2c0-.637-.057-1.251-.164-1.84H9v3.481h4.844c-.209 1.125-.843 2.078-1.796 2.717v2.258h2.908c1.702-1.567 2.684-3.875 2.684-6.615z" fill="#4285F4"/>
<path d="M9 18c2.43 0 4.467-.806 5.956-2.18l-2.908-2.259c-.806.54-1.837.86-3.048.86-2.344 0-4.328-1.584-5.036-3.711H.957v2.332A8.997 8.997 0 0 0 9 18z" fill="#34A853"/>
<path d="M3.964 10.71A5.41 5.41 0 0 1 3.682 9c0-.593.102-1.17.282-1.71V4.958H.957A8.996 8.996 0 0 0 0 9c0 1.452.348 2.827.957 4.042l3.007-2.332z" fill="#FBBC05"/>
<path d="M9 3.58c1.321 0 2.508.454 3.44 1.345l2.582-2.58C13.463.891 11.426 0 9 0A8.997 8.997 0 0 0 .957 4.958L3.964 7.29C4.672 5.163 6.656 3.58 9 3.58z" fill="#EA4335"/>
</svg>
Sign in with Google
</a>
<div class="divider">or</div>
<form method="POST" action="/login">
<div class="field">
<label for="username">Username</label>

View File

@@ -33,14 +33,30 @@
max-width: 480px;
}
.back-link {
display: inline-block;
font-size: 0.8rem;
color: #94a3b8;
text-decoration: none;
margin-bottom: 1.5rem;
.page-nav {
display: flex;
align-items: center;
gap: 0.25rem;
margin-bottom: 1.75rem;
flex-wrap: wrap;
}
.back-link:hover { color: #a78bfa; }
.nav-link {
display: inline-flex;
align-items: center;
padding: 0.3rem 0.6rem;
border-radius: 6px;
font-size: 0.8rem;
font-weight: 500;
color: #64748b;
text-decoration: none;
transition: color 0.15s, background 0.15s;
white-space: nowrap;
}
.nav-link:hover { color: #cbd5e1; background: rgba(255,255,255,0.05); }
.nav-link.active { color: #a78bfa; }
.nav-spacer { flex: 1; min-width: 0.5rem; }
.nav-link.nav-logout { color: #475569; }
.nav-link.nav-logout:hover { color: #94a3b8; background: none; }
.logo {
margin-bottom: 1.75rem;
@@ -192,7 +208,13 @@
</head>
<body>
<div class="card">
<a href="{{ back_href }}" class="back-link">← Back to Cortex</a>
<nav class="page-nav">
<a href="{{ back_href }}" class="nav-link">← Chat</a>
<a href="{{ help_href }}" class="nav-link">Help</a>
<a href="/settings" class="nav-link active">Settings</a>
<span class="nav-spacer"></span>
<a href="/logout" class="nav-link nav-logout">Sign out</a>
</nav>
<div class="logo">
<h1>Account Settings</h1>
@@ -219,7 +241,8 @@
<label for="new_username">New username</label>
<input type="text" id="new_username" name="new_username"
value="{{ username }}"
pattern="[a-z_][a-z0-9_\-]{0,31}" required autofocus>
pattern="[a-z_][a-z0-9_\-]{0,31}" required autofocus
autocomplete="off" data-form-type="other">
<p style="font-size:0.75rem; color:#94a3b8; margin-top:0.3rem;">
Lowercase letters, digits, _ or - only. You will be logged out after renaming.
</p>
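The `pattern` attribute above is only enforced by the browser, so a server-side check with the same rule is usually worth having. The sketch below mirrors that pattern in Python. The names `USERNAME_RE` and `is_valid_username` are hypothetical and do not come from the Cortex code base. HTML matches `pattern` against the whole value, so the sketch uses `fullmatch`.

```python
import re

# Same character rules as the HTML pattern attribute: a lowercase letter or
# underscore first, then up to 31 lowercase letters, digits, "_" or "-".
# (Hypothetical helper; not taken from the project's own code.)
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_\-]{0,31}")


def is_valid_username(name: str) -> bool:
    """Return True when the whole string matches the username rule."""
    return USERNAME_RE.fullmatch(name) is not None
```

A rename handler could call `is_valid_username(new_username)` before touching any files and reject the request otherwise.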
@@ -232,6 +255,61 @@
</form>
</div>
<!-- Connected accounts -->
<div class="section">
<h2>Connected Accounts</h2>
<div class="field">
<label>Google Account</label>
<input type="text" value="{{ google_email }}" readonly
placeholder="No Google account linked"
style="{{ google_email == '' and 'color:#475569' or '' }}">
</div>
<p style="font-size:0.75rem; color:#94a3b8; margin-top:-0.5rem;">
To link or change your Google account, contact Scott.
</p>
</div>
<!-- Gemini API key -->
<div class="section">
<h2>Gemini API Key</h2>
<p style="font-size:0.8rem; color:#94a3b8; margin-bottom:0.85rem; line-height:1.55;">
Paste your personal key from
<a href="https://aistudio.google.com/apikey" target="_blank" rel="noopener"
style="color:#a78bfa;">aistudio.google.com/apikey</a>
to use your own Gemini quota. Leave blank to use the shared server key.
</p>
<form method="POST" action="/settings/gemini-key">
<div class="field">
<label for="gemini_api_key">API Key</label>
<input type="text" id="gemini_api_key" name="gemini_api_key"
placeholder="{{ gemini_key_hint }}"
autocomplete="new-password" spellcheck="false"
data-1p-ignore data-lpignore="true" data-form-type="other">
</div>
<button type="submit">Save Key</button>
</form>
<p id="gemini-key-status" style="font-size:0.75rem; color:#94a3b8; margin-top:0.5rem;">
Current: {{ gemini_key_hint }}
<span id="gemini-remove-wrap" style="{{ gemini_key_set == 'false' and 'display:none' or '' }}">
<a href="#" id="gemini-remove-link" style="color:#f87171;">remove</a>
</span>
</p>
</div>
<!-- Local models link -->
<div class="section">
<h2>Local Models</h2>
<p style="font-size:0.8rem; color:#94a3b8; margin-bottom:0.85rem; line-height:1.55;">
Configure OpenAI-compatible hosts and models (Open WebUI, Ollama, LM Studio, etc.).
</p>
<a href="/settings/local"
style="display:inline-block; padding:0.55rem 1rem; background:#7c3aed; border-radius:6px;
color:#fff; font-size:0.88rem; font-weight:600; text-decoration:none;
transition:background 0.15s;">
Manage local models →
</a>
</div>
<!-- Change password -->
<div class="section">
<h2>Change Password</h2>
@@ -287,6 +365,16 @@
document.getElementById('show-rename-user').style.display = '';
});
// Gemini key — "remove" link clears the input and submits the form
const geminiRemove = document.getElementById('gemini-remove-link');
if (geminiRemove) {
geminiRemove.addEventListener('click', e => {
e.preventDefault();
document.getElementById('gemini_api_key').value = '';
document.querySelector('form[action="/settings/gemini-key"]').submit();
});
}
// Persona rename toggle
document.querySelectorAll('.persona-rename-toggle').forEach(btn => {
btn.addEventListener('click', () => {

View File

@@ -183,7 +183,13 @@
.persona-dropdown .pd-add:hover { color: var(--text); }
/* Lucide SVG icon alignment */
.btn-icon { display: inline-block; vertical-align: middle; flex-shrink: 0; pointer-events: none; }
.hdr-btn {
display: inline-flex;
align-items: center;
gap: 5px;
background: var(--bg);
border: 1px solid var(--border);
border-radius: 6px;
@@ -224,7 +230,9 @@
.hdr-dropdown.open { display: block; }
.hdr-dd-item {
display: block;
display: flex;
align-items: center;
gap: 8px;
width: 100%;
text-align: left;
padding: 0.55rem 0.85rem;
@@ -423,6 +431,8 @@
padding: 0;
font-size: 0.85em;
}
/* Syntax highlighting — app theme controls the pre background; hljs adds token colors */
.message.assistant pre code.hljs { background: transparent; padding: 0; }
.message.system {
align-self: center;
@@ -432,6 +442,80 @@
padding: 2px 0;
}
/* ── Tool call step cards (agent mode) ── */
.tool-calls-container {
display: flex;
flex-direction: column;
gap: 3px;
margin: 4px 0 6px;
align-self: stretch;
}
.tool-call {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 6px;
overflow: hidden;
font-size: 0.78rem;
}
.tool-call summary {
display: flex;
align-items: baseline;
gap: 0.5rem;
padding: 0.35rem 0.65rem;
cursor: pointer;
list-style: none;
user-select: none;
color: var(--muted);
}
.tool-call summary::-webkit-details-marker { display: none; }
.tool-call summary::before {
content: '▶';
font-size: 0.55rem;
color: var(--muted);
transition: transform 0.12s;
flex-shrink: 0;
}
.tool-call[open] summary::before { transform: rotate(90deg); }
.tool-call summary:hover { color: var(--text); background: rgba(255,255,255,0.03); }
.tc-name {
font-weight: 600;
color: var(--accent);
font-family: 'Courier New', monospace;
}
.tc-snippet {
color: var(--muted);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
max-width: 36ch;
}
.tc-body {
padding: 0 0.65rem 0.5rem;
display: flex;
flex-direction: column;
gap: 0.4rem;
}
.tc-section { display: flex; flex-direction: column; gap: 2px; }
.tc-label {
font-size: 0.68rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--muted);
}
.tc-body pre {
margin: 0;
background: var(--pre-bg);
border: 1px solid var(--border);
border-radius: 4px;
padding: 6px 8px;
font-size: 0.78rem;
white-space: pre-wrap;
word-break: break-word;
color: var(--text);
overflow-x: auto;
}
.message.error {
align-self: flex-start;
background: var(--error-bg);
@@ -443,9 +527,12 @@
.message.thinking { color: var(--muted); font-style: italic; }
/* Copy button */
.message.assistant { position: relative; }
.message.assistant, .message.user { position: relative; }
.copy-btn {
display: inline-flex;
align-items: center;
gap: 4px;
position: absolute;
top: 7px;
right: 8px;
@@ -460,10 +547,24 @@
transition: opacity 0.15s, color 0.15s, border-color 0.15s;
}
.message.assistant:hover .copy-btn { opacity: 1; }
.message.assistant:hover .copy-btn,
.message.user:hover .copy-btn { opacity: 1; }
.copy-btn:hover { color: var(--text); border-color: var(--muted); }
.copy-btn.copied { color: var(--success); border-color: var(--success-dim); }
/* Model tag — shown at the bottom of every assistant message */
.model-tag {
display: block;
font-size: 0.67rem;
color: #475569;
margin-top: 0.55rem;
padding-top: 0.4rem;
border-top: 1px solid #2d3148;
text-align: right;
letter-spacing: 0.02em;
}
.model-tag.fallback { color: #f59e0b; }
/* Note messages */
.message.note-private {
align-self: flex-end;
@@ -538,7 +639,7 @@
#mode-select-btn.mode-otr { border-color: rgba(120,80,160,0.6); color: #a87fd4; }
#mode-select-btn.mode-agent { border-color: rgba(80,140,200,0.6); color: #7cb9e8; }
#mode-icon { font-size: 1rem; line-height: 1; }
#mode-icon { display: flex; align-items: center; }
.mode-arrow { font-size: 0.55rem; color: var(--muted); margin-left: 2px; opacity: 0.5; }
/* Dropdown — opens upward; MRU at bottom = closest to button */
@@ -573,7 +674,7 @@
}
.mode-option:hover { background: var(--border); color: var(--text); }
.mode-option.current { color: var(--text); font-weight: 500; }
.mode-option .opt-icon { font-size: 1rem; line-height: 1; }
.mode-option .opt-icon { display: flex; align-items: center; }
.mode-option .opt-check { margin-left: auto; font-size: 0.7rem; opacity: 0.7; }
/* Note visibility sub-button — shown below mode-select when note is active */
@@ -630,6 +731,10 @@
/* Send button */
#send {
display: flex;
align-items: center;
justify-content: center;
gap: 6px;
background: var(--user-bg);
border: 1px solid var(--user-border);
color: var(--text);
@@ -649,11 +754,14 @@
/* Stop button */
#stop {
display: none;
align-items: center;
justify-content: center;
gap: 6px;
background: var(--error-bg);
border: 1px solid var(--error-border);
color: var(--error-text);
border-radius: 8px;
padding: 10px 0;
padding: 10px 14px;
cursor: pointer;
font-size: 0.9rem;
text-align: center;
@@ -699,6 +807,9 @@
.msg-wrapper:hover .msg-actions { opacity: 1; }
.msg-act-btn {
display: inline-flex;
align-items: center;
gap: 4px;
background: none;
border: 1px solid var(--border);
border-radius: 4px;
@@ -736,6 +847,9 @@
}
.edit-save-btn, .edit-cancel-btn {
display: inline-flex;
align-items: center;
gap: 4px;
background: none;
border: 1px solid var(--border);
border-radius: 4px;
@@ -783,22 +897,12 @@
flex-shrink: 0;
}
#file-modal-header select {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 5px;
color: var(--text);
font-size: 0.85rem;
padding: 4px 8px;
cursor: pointer;
}
#file-modal-title {
font-size: 0.9rem;
font-weight: 600;
color: var(--accent);
flex: 1;
}
.fm-spacer { flex: 1; }
.fm-btn {
background: var(--bg);
@@ -814,13 +918,153 @@
.fm-btn.active { color: var(--accent); border-color: var(--accent); }
.fm-btn.save { color: var(--accent); border-color: var(--inara-border); }
.fm-btn.save:hover { background: var(--inara-bg); }
#file-saved-msg {
font-size: 0.75rem;
color: #6abf6a;
opacity: 0;
transition: opacity 0.3s;
#file-modal-content {
flex: 1;
display: flex;
overflow: hidden;
}
/* ── File sidebar ── */
#file-sidebar-wrap {
width: 190px;
flex-shrink: 0;
border-right: 1px solid var(--border);
display: flex;
flex-direction: column;
background: var(--bg);
}
#file-sidebar {
flex: 1;
overflow-y: auto;
}
/* ── Session search (within sidebar) ── */
#session-search-wrap {
border-top: 1px solid var(--border);
padding: 8px 8px 10px;
}
#session-search-label {
font-size: 0.65rem;
font-weight: 700;
text-transform: uppercase;
letter-spacing: 0.06em;
color: var(--muted);
margin-bottom: 5px;
}
#session-search-row {
display: flex;
gap: 4px;
}
#session-search-input {
flex: 1;
min-width: 0;
background: var(--surface);
border: 1px solid var(--border);
border-radius: 4px;
color: var(--text);
font-size: 0.78rem;
padding: 3px 6px;
}
#session-search-btn {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 4px;
color: var(--muted);
font-size: 0.78rem;
padding: 3px 8px;
cursor: pointer;
}
#session-search-btn:hover { color: var(--accent); border-color: var(--accent); }
/* ── Session search results panel ── */
#session-search-results {
flex: 1;
overflow-y: auto;
padding: 12px 14px;
font-size: 0.82rem;
}
.sr-header { color: var(--muted); font-size: 0.72rem; margin-bottom: 10px; }
.sr-date {
font-size: 0.7rem;
font-weight: 700;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--accent);
margin: 14px 0 4px;
}
.sr-date:first-of-type { margin-top: 0; }
.sr-excerpt {
background: var(--surface);
border-left: 2px solid var(--border);
border-radius: 0 4px 4px 0;
padding: 6px 10px;
margin-bottom: 6px;
line-height: 1.5;
white-space: pre-wrap;
word-break: break-word;
color: var(--text);
}
.sr-excerpt mark {
background: rgba(139,92,246,0.25);
color: var(--accent);
border-radius: 2px;
padding: 0 1px;
}
.sr-empty, .sr-error { color: var(--muted); padding: 8px 0; }
.fg-header {
display: flex;
align-items: center;
gap: 0.3rem;
padding: 7px 10px 5px;
font-size: 0.68rem;
font-weight: 700;
text-transform: uppercase;
letter-spacing: 0.06em;
color: var(--muted);
cursor: pointer;
user-select: none;
}
.fg-header::before {
content: '▾';
font-size: 0.7rem;
transition: transform 0.15s;
}
.fg-header.collapsed::before { transform: rotate(-90deg); }
.fg-header.collapsed + .fg-items { display: none; }
.fg-items { display: flex; flex-direction: column; }
.file-item {
padding: 6px 10px 6px 16px;
cursor: pointer;
border-left: 2px solid transparent;
transition: background 0.1s, border-color 0.1s;
}
.file-item:hover { background: var(--surface); }
.file-item.active {
background: var(--inara-bg);
border-left-color: var(--accent);
}
.file-item.missing { opacity: 0.45; }
.fi-name {
font-size: 0.8rem;
color: var(--text);
font-weight: 500;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.file-item.active .fi-name { color: var(--accent); }
.fi-meta {
display: flex;
gap: 0.5rem;
margin-top: 2px;
font-size: 0.68rem;
color: var(--muted);
}
#file-saved-msg.show { opacity: 1; }
#file-modal-body {
flex: 1;
@@ -911,9 +1155,14 @@
cursor: pointer;
transition: color 0.15s, border-color 0.15s, background 0.15s;
}
.ctx-btn:hover { color: var(--text); border-color: var(--muted); }
.ctx-btn.active { color: var(--accent); border-color: var(--accent); }
.ctx-btn.mem-on { color: var(--success); border-color: var(--success-dim); }
.ctx-btn:hover { color: var(--text); border-color: var(--muted); }
.ctx-btn.active { color: var(--accent); border-color: var(--accent); }
.ctx-btn.mem-on { color: var(--success); border-color: var(--success-dim); }
.ctx-btn.local-on { color: #f59e0b; border-color: #92400e; }
#backend-model-hint {
font-size: 0.68rem; color: #f59e0b; opacity: 0.8;
margin-top: 4px; word-break: break-all; line-height: 1.3;
}
#ctx-distill-status {
margin-top: 6px;
@@ -1149,6 +1398,48 @@
#auth-banner-close:hover { opacity: 1; }
/* ── Toasts ──────────────────────────────────────────────── */
#toast-container {
position: fixed;
bottom: 1.25rem;
right: 1.25rem;
display: flex;
flex-direction: column;
align-items: flex-end;
gap: 0.4rem;
z-index: 9999;
pointer-events: none;
}
.toast {
padding: 0.45rem 0.85rem;
border-radius: 6px;
font-size: 0.8rem;
font-weight: 500;
color: #fff;
background: #334155;
border: 1px solid #475569;
box-shadow: 0 4px 12px rgba(0,0,0,0.35);
opacity: 0;
transform: translateY(6px);
transition: opacity 0.18s ease, transform 0.18s ease;
pointer-events: none;
white-space: nowrap;
}
.toast.show { opacity: 1; transform: translateY(0); }
.toast.success { background: #14532d; border-color: #16a34a; }
.toast.error { background: #7f1d1d; border-color: #dc2626; }
/* Sessions backdrop — hidden by default, visible only as mobile drawer overlay */
#sessions-backdrop {
display: none;
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.5);
z-index: 98;
animation: backdrop-in 0.2s ease;
}
@keyframes backdrop-in { from { opacity: 0; } to { opacity: 1; } }
/* ── Mobile responsive ───────────────────────────────────── */
@media (max-width: 520px) {
header { padding: 8px 12px; gap: 8px; }
@@ -1209,6 +1500,36 @@
/* Larger touch targets */
#send, #stop { padding: 12px 14px; font-size: 1rem; }
/* File modal: sidebar collapses to a narrow strip */
#file-modal-inner { width: 100vw; height: 100dvh; border-radius: 0; }
#file-sidebar-wrap { width: 130px; }
.fi-meta { display: none; }
/* Sessions backdrop active on mobile */
#sessions-backdrop.open { display: block; }
/* Sessions panel → full-height drawer sliding in from the right */
#sessions-panel {
display: block !important; /* keep rendered so transition works */
position: fixed;
top: 0;
right: 0;
bottom: 0;
width: min(300px, 85vw);
max-height: none;
height: 100%;
border-radius: 0;
border-top: none;
border-right: none;
border-bottom: none;
border-left: 1px solid var(--border);
transform: translateX(110%);
transition: transform 0.25s ease;
z-index: 99;
overflow-y: auto;
}
#sessions-panel.open { transform: translateX(0); }
}
/* ── Touch devices — no hover capability ─────────────────── */

View File

@@ -0,0 +1,805 @@
"""
Unit tests for model_registry.py — no HTTP, no LLM calls, no running service.
All file I/O is redirected to tmp_path via patch.object(config.settings, "home_dir", ...).
Coverage:
- Empty registry (no files)
- Save/load round-trip
- Migration from local_llm.json (v0 flat and v1 hosts/models)
- Host CRUD
- Model CRUD (including role reference cleanup on remove)
- Role assignment (set_role, validation)
- Model resolution (_resolve_model: built-ins, local_openai, missing host/model)
- get_model_for_role: registry chain → .env fallback → hardcoded fallback
- get_best_local_model: role chain, first-local fallback, no-local case
- Backup chain: skips missing models, returns next valid
"""
import json
import pytest
from pathlib import Path
from unittest.mock import patch


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

def _home(tmp_path: Path) -> Path:
    """Create a minimal home directory and return the root."""
    root = tmp_path / "home"
    root.mkdir()
    return root


def _user_dir(home: Path, username: str = "scott") -> Path:
    d = home / username
    d.mkdir(exist_ok=True)
    return d


def _write_registry(home: Path, data: dict, username: str = "scott") -> Path:
    _user_dir(home, username)
    path = home / username / "model_registry.json"
    path.write_text(json.dumps(data))
    return path


def _write_local_llm(home: Path, data: dict, username: str = "scott") -> Path:
    _user_dir(home, username)
    path = home / username / "local_llm.json"
    path.write_text(json.dumps(data))
    return path


def _read_registry(home: Path, username: str = "scott") -> dict:
    path = home / username / "model_registry.json"
    return json.loads(path.read_text())
# ---------------------------------------------------------------------------
# Empty / fresh state
# ---------------------------------------------------------------------------
def test_empty_registry_no_files(tmp_path):
    """With no files, _load returns an empty structure."""
    home = _home(tmp_path)
    _user_dir(home)
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        data = reg._load("scott")
    assert data["version"] == 1
    assert data["hosts"] == []
    assert data["models"] == []
    assert data["roles"] == {}


def test_empty_registry_missing_user_dir(tmp_path):
    """Even with no user dir, _load returns an empty structure gracefully."""
    home = _home(tmp_path)
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        data = reg._load("nobody")
    assert data["hosts"] == []
# ---------------------------------------------------------------------------
# Save / load round-trip
# ---------------------------------------------------------------------------
def test_save_and_load(tmp_path):
    home = _home(tmp_path)
    _user_dir(home)
    import config
    import model_registry as reg
    registry = {
        "version": 1,
        "hosts": [{"id": "h1", "label": "ML Box", "api_url": "http://10.0.0.1:3000", "api_key": "sk-test"}],
        "models": [{"id": "m1", "type": "local_openai", "label": "Gemma Small",
                    "model_name": "gemma4:e4b", "host_id": "h1", "context_k": 72, "tags": ["fast"]}],
        "roles": {"chat": {"primary": "m1"}},
    }
    with patch.object(config.settings, "home_dir", home):
        reg._save("scott", registry)
        loaded = reg._load("scott")
    assert loaded["hosts"][0]["label"] == "ML Box"
    assert loaded["models"][0]["model_name"] == "gemma4:e4b"
    assert loaded["roles"]["chat"]["primary"] == "m1"


def test_corrupt_registry_falls_back_to_empty(tmp_path):
    home = _home(tmp_path)
    path = _user_dir(home) / "model_registry.json"
    path.write_text("{bad json{{")
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        data = reg._load("scott")
    assert data["hosts"] == []
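The empty-state and corrupt-file tests above all expect the loader to degrade to an empty structure instead of raising. A minimal sketch of that behaviour follows. It assumes a plain JSON file on disk; `load_registry` and `empty_registry` are hypothetical names and this is not the project's `_load`, which also handles migration.

```python
import json
from pathlib import Path


def empty_registry() -> dict:
    """The empty structure the tests check for (version 1, nothing configured)."""
    return {"version": 1, "hosts": [], "models": [], "roles": {}}


def load_registry(path: Path) -> dict:
    """Read a registry file, falling back to an empty one on any read error.

    Hypothetical helper: a missing file, a missing directory, and invalid
    JSON are all treated the same way.
    """
    try:
        data = json.loads(path.read_text())
    except (OSError, ValueError):
        # OSError covers missing files and directories; ValueError covers
        # json.JSONDecodeError and undecodable bytes.
        return empty_registry()
    if not isinstance(data, dict):
        return empty_registry()
    merged = empty_registry()
    merged.update(data)  # keys absent from the file keep their empty defaults
    return merged
```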
# ---------------------------------------------------------------------------
# Migration from local_llm.json
# ---------------------------------------------------------------------------
def test_migrate_v1_hosts_models(tmp_path):
    """v1 local_llm.json (hosts/models/active_model_id) migrates correctly."""
    home = _home(tmp_path)
    _write_local_llm(home, {
        "hosts": [{"id": "h1", "label": "Home", "api_url": "http://10.0.0.1:3000", "api_key": "sk-1"}],
        "models": [
            {"id": "m1", "host_id": "h1", "label": "Gemma Small", "model_name": "gemma4:e4b"},
            {"id": "m2", "host_id": "h1", "label": "Gemma Med", "model_name": "gemma4:26b"},
        ],
        "active_model_id": "m1",
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        data = reg._load("scott")
    assert len(data["hosts"]) == 1
    assert data["hosts"][0]["api_url"] == "http://10.0.0.1:3000"
    assert len(data["models"]) == 2
    assert all(m["type"] == "local_openai" for m in data["models"])
    # active_model_id → roles.chat.primary
    assert data["roles"].get("chat", {}).get("primary") == "m1"


def test_migrate_v1_no_active_model(tmp_path):
    """Migration with active_model_id=null: chat role stays unset."""
    home = _home(tmp_path)
    _write_local_llm(home, {
        "hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
        "models": [{"id": "m1", "host_id": "h1", "label": "Model", "model_name": "llama3"}],
        "active_model_id": None,
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        data = reg._load("scott")
    assert "chat" not in data["roles"] or data["roles"]["chat"].get("primary") is None


def test_migrate_v0_flat_format(tmp_path):
    """v0 flat local_llm.json is wrapped into hosts/models structure."""
    home = _home(tmp_path)
    _write_local_llm(home, {
        "api_url": "http://10.0.0.2:3000",
        "api_key": "sk-flat",
        "model": "qwen3:8b",
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        data = reg._load("scott")
    assert len(data["hosts"]) == 1
    assert data["hosts"][0]["api_url"] == "http://10.0.0.2:3000"
    assert len(data["models"]) == 1
    assert data["models"][0]["model_name"] == "qwen3:8b"


def test_migrate_v0_empty_url_returns_empty(tmp_path):
    """v0 with no api_url and no .env fallback → nothing to migrate, empty registry."""
    home = _home(tmp_path)
    _write_local_llm(home, {"api_url": "", "api_key": "", "model": ""})
    import config
    import model_registry as reg
    with (
        patch.object(config.settings, "home_dir", home),
        patch.object(config.settings, "local_api_url", ""),  # ensure no .env fallback
        patch.object(config.settings, "local_model", ""),
    ):
        data = reg._load("scott")
    assert data["hosts"] == []
    assert data["models"] == []
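The two v0 tests describe wrapping a flat `api_url` / `api_key` / `model` file into the hosts/models structure, and producing nothing when there is no URL. The sketch below shows one way that wrapping could look. `migrate_flat`, the generated IDs `h1` and `m1`, and the host label are all assumptions made for illustration; the real migration may generate IDs differently and, per the test docstring, can also consult `.env` settings, which this sketch ignores.

```python
def migrate_flat(old: dict) -> dict:
    """Wrap a v0 flat config into a version 1 registry (illustrative only)."""
    registry = {"version": 1, "hosts": [], "models": [], "roles": {}}

    api_url = (old.get("api_url") or "").strip()
    if not api_url:
        # No URL means there is nothing to migrate.
        return registry

    registry["hosts"].append({
        "id": "h1",                 # hypothetical fixed ID
        "label": "Migrated host",   # hypothetical label
        "api_url": api_url,
        "api_key": old.get("api_key") or "",
    })

    model_name = (old.get("model") or "").strip()
    if model_name:
        registry["models"].append({
            "id": "m1",             # hypothetical fixed ID
            "type": "local_openai",
            "label": model_name,
            "model_name": model_name,
            "host_id": "h1",
        })
    return registry
```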
def test_migrate_v1_distill_local_sets_role(tmp_path):
    """When DISTILL_BACKEND_MID=local and active model exists, distill role is set."""
    home = _home(tmp_path)
    _write_local_llm(home, {
        "hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
        "models": [{"id": "m1", "host_id": "h1", "label": "G", "model_name": "gemma4:e4b"}],
        "active_model_id": "m1",
    })
    import config
    import model_registry as reg
    with (
        patch.object(config.settings, "home_dir", home),
        patch.object(config.settings, "distill_backend_mid", "local"),
    ):
        data = reg._load("scott")
    assert data["roles"].get("distill", {}).get("primary") == "m1"


def test_migration_saves_registry_file(tmp_path):
    """After migration, model_registry.json is written so next load skips migration."""
    home = _home(tmp_path)
    _write_local_llm(home, {
        "hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
        "models": [],
        "active_model_id": None,
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        reg._load("scott")  # triggers migration + save
        # Second load should read model_registry.json, not re-run migration
        data2 = reg._load("scott")
    assert (home / "scott" / "model_registry.json").exists()
    assert data2["version"] == 1
# ---------------------------------------------------------------------------
# Built-in model resolution
# ---------------------------------------------------------------------------
def test_builtin_claude_cli(tmp_path):
    home = _home(tmp_path)
    _user_dir(home)
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg._resolve_model(reg._empty(), "claude_cli")
    assert result is not None
    assert result["type"] == "claude_cli"
    assert result["id"] == "claude_cli"


def test_builtin_gemini_api(tmp_path):
    home = _home(tmp_path)
    _user_dir(home)
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg._resolve_model(reg._empty(), "gemini_api")
    assert result["type"] == "gemini_api"


def test_builtin_gemini_cli(tmp_path):
    home = _home(tmp_path)
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg._resolve_model(reg._empty(), "gemini_cli")
    assert result["type"] == "gemini_cli"


def test_builtin_unknown_returns_none(tmp_path):
    home = _home(tmp_path)
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg._resolve_model(reg._empty(), "does_not_exist")
    assert result is None
# ---------------------------------------------------------------------------
# User model resolution
# ---------------------------------------------------------------------------
def test_resolve_local_openai_merges_host(tmp_path):
    """local_openai model gets api_url and api_key merged from its host."""
    home = _home(tmp_path)
    registry = {
        "version": 1,
        "hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": "sk-test"}],
        "models": [{"id": "m1", "type": "local_openai", "label": "G", "model_name": "gemma4:e4b",
                    "host_id": "h1", "context_k": 72, "tags": []}],
        "roles": {},
    }
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg._resolve_model(registry, "m1")
    assert result["api_url"] == "http://10.0.0.1:3000"
    assert result["api_key"] == "sk-test"
    assert result["model_name"] == "gemma4:e4b"


def test_resolve_local_openai_missing_host_returns_none(tmp_path):
    """A model pointing to a non-existent host_id returns None."""
    home = _home(tmp_path)
    registry = {
        "version": 1, "hosts": [], "roles": {},
        "models": [{"id": "m1", "type": "local_openai", "host_id": "missing",
                    "label": "X", "model_name": "x", "context_k": 0, "tags": []}],
    }
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg._resolve_model(registry, "m1")
    assert result is None


def test_resolve_unknown_model_id_returns_none(tmp_path):
    home = _home(tmp_path)
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg._resolve_model(reg._empty(), "no_such_model")
    assert result is None
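Taken together, the built-in and user-model tests pin down a small contract: built-in IDs resolve to themselves, a `local_openai` model picks up `api_url` and `api_key` from its host, and anything unresolvable yields `None`. The sketch below restates that contract in a few lines. `resolve_model` and `BUILTIN_IDS` are hypothetical names; the real `_resolve_model` may carry more fields (the commit message mentions `host_type`, for example).

```python
from typing import Optional

# Built-in backends named in the tests above.
BUILTIN_IDS = ("claude_cli", "gemini_api", "gemini_cli")


def resolve_model(registry: dict, model_id: str) -> Optional[dict]:
    """Return a usable model config, or None if it cannot be resolved."""
    if model_id in BUILTIN_IDS:
        return {"id": model_id, "type": model_id}

    model = next((m for m in registry.get("models", [])
                  if m.get("id") == model_id), None)
    if model is None:
        return None  # unknown model ID

    host = next((h for h in registry.get("hosts", [])
                 if h.get("id") == model.get("host_id")), None)
    if host is None:
        return None  # model points at a host that no longer exists

    # Copy so the registry itself is not mutated, then merge host fields.
    resolved = dict(model)
    resolved["api_url"] = host.get("api_url", "")
    resolved["api_key"] = host.get("api_key", "")
    return resolved
```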
# ---------------------------------------------------------------------------
# get_model_for_role
# ---------------------------------------------------------------------------
def test_get_model_for_role_uses_registry(tmp_path):
    """Registry primary assignment is returned first."""
    home = _home(tmp_path)
    _write_registry(home, {
        "version": 1,
        "hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
        "models": [{"id": "m1", "type": "local_openai", "label": "G",
                    "model_name": "gemma4:e4b", "host_id": "h1", "context_k": 72, "tags": []}],
        "roles": {"chat": {"primary": "m1"}},
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg.get_model_for_role("scott", "chat")
    assert result["model_name"] == "gemma4:e4b"
    assert result["api_url"] == "http://10.0.0.1:3000"


def test_get_model_for_role_uses_builtin_from_registry(tmp_path):
    """Registry can assign built-in IDs (claude_cli, gemini_api, etc.)."""
    home = _home(tmp_path)
    _write_registry(home, {
        "version": 1, "hosts": [], "models": [],
        "roles": {"chat": {"primary": "claude_cli"}},
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg.get_model_for_role("scott", "chat")
    assert result["type"] == "claude_cli"


def test_get_model_for_role_skips_missing_primary(tmp_path):
    """If primary model_id is not found, falls through to backup_1."""
    home = _home(tmp_path)
    _write_registry(home, {
        "version": 1,
        "hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
        "models": [{"id": "m2", "type": "local_openai", "label": "Backup",
                    "model_name": "llama3:8b", "host_id": "h1", "context_k": 8, "tags": []}],
        "roles": {"chat": {"primary": "gone", "backup_1": "m2"}},
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg.get_model_for_role("scott", "chat")
    assert result["model_name"] == "llama3:8b"


def test_get_model_for_role_env_fallback(tmp_path):
    """No registry entry for role → falls back to .env setting."""
    home = _home(tmp_path)
    _user_dir(home)
    import config
    import model_registry as reg
    with (
        patch.object(config.settings, "home_dir", home),
        patch.object(config.settings, "role_chat", "gemini_cli"),
    ):
        result = reg.get_model_for_role("scott", "chat")
    assert result["type"] == "gemini_cli"


def test_get_model_for_role_hardcoded_fallback(tmp_path):
    """No registry + no .env for role → hardcoded last resort."""
    home = _home(tmp_path)
    _user_dir(home)
    import config
    import model_registry as reg
    # Clear the .env default for 'chat' to simulate unset
    with (
        patch.object(config.settings, "home_dir", home),
        patch.object(config.settings, "role_chat", ""),
    ):
        result = reg.get_model_for_role("scott", "chat")
    # claude_cli is the hardcoded last resort for 'chat'
    assert result["type"] == "claude_cli"


def test_get_model_for_role_custom_role(tmp_path):
    """Custom roles not in DEFINED_ROLES can still be assigned and resolved."""
    home = _home(tmp_path)
    _write_registry(home, {
        "version": 1, "hosts": [], "models": [],
        "roles": {"therapy": {"primary": "gemini_api"}},
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg.get_model_for_role("scott", "therapy")
    assert result["type"] == "gemini_api"


def test_get_model_for_role_full_backup_chain(tmp_path):
    """Walks the entire priority chain before falling back."""
    home = _home(tmp_path)
    _write_registry(home, {
        "version": 1,
        "hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
        "models": [{"id": "m4", "type": "local_openai", "label": "Last",
                    "model_name": "tiny:1b", "host_id": "h1", "context_k": 4, "tags": []}],
        "roles": {"chat": {
            "primary": "gone1",
            "backup_1": "gone2",
            "backup_2": "gone3",
            "backup_3": "gone4",
            "backup_4": "m4",
        }},
    })
    import config
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        result = reg.get_model_for_role("scott", "chat")
    assert result["model_name"] == "tiny:1b"
# ---------------------------------------------------------------------------
# get_best_local_model
# ---------------------------------------------------------------------------
def test_get_best_local_prefers_role_chain(tmp_path):
"""Returns the first local_openai model in the chat role chain."""
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
"models": [
{"id": "m1", "type": "local_openai", "label": "Preferred",
"model_name": "gemma4:e4b", "host_id": "h1", "context_k": 72, "tags": []},
],
"roles": {"chat": {"primary": "claude_cli", "backup_1": "m1"}},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
# primary is claude_cli (not local), backup_1 is m1 (local)
result = reg.get_best_local_model("scott", "chat")
assert result["model_name"] == "gemma4:e4b"
def test_get_best_local_falls_back_to_first_model(tmp_path):
"""No local model in role chain → returns first configured local model."""
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
"models": [
{"id": "m1", "type": "local_openai", "label": "G",
"model_name": "gemma4:e4b", "host_id": "h1", "context_k": 72, "tags": []},
],
"roles": {}, # no chat role assigned
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
result = reg.get_best_local_model("scott", "chat")
assert result["model_name"] == "gemma4:e4b"
def test_get_best_local_returns_none_when_no_local_models(tmp_path):
"""No local_openai models configured → returns None."""
home = _home(tmp_path)
_write_registry(home, {
"version": 1, "hosts": [], "models": [],
"roles": {"chat": {"primary": "claude_cli"}},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
result = reg.get_best_local_model("scott", "chat")
assert result is None
# ---------------------------------------------------------------------------
# Host CRUD
# ---------------------------------------------------------------------------
def test_save_host_creates_new(tmp_path):
home = _home(tmp_path)
_user_dir(home)
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
host_id = reg.save_host("scott", None, "ML Box", "http://10.0.0.1:3000", "sk-abc")
data = reg._load("scott")
assert len(data["hosts"]) == 1
assert data["hosts"][0]["id"] == host_id
assert data["hosts"][0]["label"] == "ML Box"
assert data["hosts"][0]["api_key"] == "sk-abc"
def test_save_host_updates_existing(tmp_path):
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Old Label", "api_url": "http://10.0.0.1:3000", "api_key": "sk-old"}],
"models": [], "roles": {},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
reg.save_host("scott", "h1", "New Label", "http://10.0.0.2:3000", "")
data = reg._load("scott")
assert len(data["hosts"]) == 1
assert data["hosts"][0]["label"] == "New Label"
assert data["hosts"][0]["api_url"] == "http://10.0.0.2:3000"
# Empty api_key → existing key preserved
assert data["hosts"][0]["api_key"] == "sk-old"
def test_save_host_unknown_id_creates_new(tmp_path):
"""Passing a host_id that doesn't exist creates a new host instead of crashing."""
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
reg.save_host("scott", "ghost-id", "New", "http://10.0.0.3:3000", "")
data = reg._load("scott")
assert len(data["hosts"]) == 1
def test_remove_host_also_removes_models(tmp_path):
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
"models": [{"id": "m1", "type": "local_openai", "host_id": "h1",
"label": "G", "model_name": "gemma4:e4b", "context_k": 72, "tags": []}],
"roles": {"chat": {"primary": "m1"}},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
found = reg.remove_host("scott", "h1")
data = reg._load("scott")
assert found is True
assert data["hosts"] == []
assert data["models"] == []
def test_remove_host_not_found_returns_false(tmp_path):
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
found = reg.remove_host("scott", "nope")
assert found is False
# ---------------------------------------------------------------------------
# Model CRUD
# ---------------------------------------------------------------------------
def test_save_model_creates(tmp_path):
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
"models": [], "roles": {},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
model_id = reg.save_model("scott", None, "h1", "Gemma Small", "gemma4:e4b", 72, ["fast", "distill"])
data = reg._load("scott")
assert len(data["models"]) == 1
assert data["models"][0]["id"] == model_id
assert data["models"][0]["context_k"] == 72
assert data["models"][0]["tags"] == ["fast", "distill"]
assert data["models"][0]["type"] == "local_openai"
def test_save_model_updates_existing(tmp_path):
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
"models": [{"id": "m1", "type": "local_openai", "label": "Old",
"model_name": "llama3", "host_id": "h1", "context_k": 8, "tags": []}],
"roles": {},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
reg.save_model("scott", "m1", "h1", "New Label", "llama3:latest", 128, ["updated"])
data = reg._load("scott")
assert len(data["models"]) == 1
assert data["models"][0]["label"] == "New Label"
assert data["models"][0]["context_k"] == 128
def test_remove_model_clears_role_refs(tmp_path):
"""Removing a model clears it from any role assignments."""
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
"models": [{"id": "m1", "type": "local_openai", "label": "G",
"model_name": "gemma4:e4b", "host_id": "h1", "context_k": 72, "tags": []}],
"roles": {
"chat": {"primary": "m1", "backup_1": "m1"},
"distill": {"primary": "claude_cli", "backup_1": "m1"},
},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
found = reg.remove_model("scott", "m1")
data = reg._load("scott")
assert found is True
assert data["models"] == []
assert data["roles"]["chat"].get("primary") is None
assert data["roles"]["chat"].get("backup_1") is None
assert data["roles"]["distill"].get("backup_1") is None
# claude_cli assignment preserved
assert data["roles"]["distill"]["primary"] == "claude_cli"
def test_remove_model_not_found_returns_false(tmp_path):
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
found = reg.remove_model("scott", "ghost")
assert found is False
# ---------------------------------------------------------------------------
# set_role
# ---------------------------------------------------------------------------
def test_set_role_assigns_model(tmp_path):
home = _home(tmp_path)
_write_registry(home, {
"version": 1,
"hosts": [{"id": "h1", "label": "Box", "api_url": "http://10.0.0.1:3000", "api_key": ""}],
"models": [{"id": "m1", "type": "local_openai", "label": "G",
"model_name": "gemma4:e4b", "host_id": "h1", "context_k": 72, "tags": []}],
"roles": {},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
ok = reg.set_role("scott", "chat", "primary", "m1")
data = reg._load("scott")
assert ok is True
assert data["roles"]["chat"]["primary"] == "m1"
def test_set_role_assigns_builtin(tmp_path):
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
ok = reg.set_role("scott", "orchestrator", "primary", "gemini_api")
data = reg._load("scott")
assert ok is True
assert data["roles"]["orchestrator"]["primary"] == "gemini_api"
def test_set_role_clears_with_none(tmp_path):
home = _home(tmp_path)
_write_registry(home, {
"version": 1, "hosts": [], "models": [],
"roles": {"chat": {"primary": "claude_cli"}},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
ok = reg.set_role("scott", "chat", "primary", None)
data = reg._load("scott")
assert ok is True
assert data["roles"]["chat"]["primary"] is None
def test_set_role_invalid_slot_returns_false(tmp_path):
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
ok = reg.set_role("scott", "chat", "backup_99", "claude_cli")
assert ok is False
def test_set_role_unknown_model_id_returns_false(tmp_path):
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
ok = reg.set_role("scott", "chat", "primary", "nonexistent_model")
assert ok is False
def test_set_role_creates_role_key_if_missing(tmp_path):
"""set_role on a role that isn't in roles{} yet creates it."""
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
reg.set_role("scott", "medical", "primary", "claude_cli")
data = reg._load("scott")
assert data["roles"]["medical"]["primary"] == "claude_cli"
# ---------------------------------------------------------------------------
# get_defined_roles
# ---------------------------------------------------------------------------
def test_get_defined_roles_returns_registry_roles(tmp_path):
home = _home(tmp_path)
_write_registry(home, {
"version": 1, "hosts": [], "models": [],
"roles": {"chat": {"primary": "claude_cli"}, "distill": {}},
})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
roles = reg.get_defined_roles("scott")
# Should include all settings.defined_roles, filling gaps with {}
for role in config.settings.get_defined_roles():
assert role in roles
def test_get_defined_roles_fills_gaps(tmp_path):
"""Roles in settings.defined_roles that aren't in registry get empty dicts."""
home = _home(tmp_path)
_write_registry(home, {"version": 1, "hosts": [], "models": [], "roles": {}})
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
roles = reg.get_defined_roles("scott")
assert "chat" in roles
assert roles["chat"] == {}
# ---------------------------------------------------------------------------
# Multi-user isolation
# ---------------------------------------------------------------------------
def test_registries_are_isolated_per_user(tmp_path):
"""Each user has their own registry file — changes don't bleed across users."""
home = _home(tmp_path)
(home / "scott").mkdir()
(home / "holly").mkdir()
import config
import model_registry as reg
with patch.object(config.settings, "home_dir", home):
reg.save_host("scott", None, "Scott Host", "http://10.0.0.1:3000", "")
scott_data = reg._load("scott")
holly_data = reg._load("holly")
assert len(scott_data["hosts"]) == 1
assert holly_data["hosts"] == []


@@ -28,6 +28,10 @@ from tools.cron import (
cron_add as _cron_add,
cron_remove as _cron_remove,
cron_toggle as _cron_toggle,
)
from tools.reminders import (
reminders_add as _reminders_add,
reminders_list as _reminders_list,
reminders_clear as _reminders_clear,
)
from tools.scratch import (
@@ -196,6 +200,8 @@ _CALLABLES: dict[str, callable] = {
"cron_add": _cron_add,
"cron_remove": _cron_remove,
"cron_toggle": _cron_toggle,
"reminders_add": _reminders_add,
"reminders_list": _reminders_list,
"reminders_clear": _reminders_clear,
"scratch_read": _scratch_read,
"scratch_write": _scratch_write,
@@ -409,6 +415,40 @@ _cron_toggle_declaration = types.FunctionDeclaration(
),
)
_reminders_add_declaration = types.FunctionDeclaration(
name="reminders_add",
description=(
"Add a new reminder to REMINDERS.md. Reminders are automatically surfaced "
"in your context at the start of each session (Tier 2+). "
"Use this when the user asks you to remember something, follow up on something, "
"or surface a note at the next session."
),
parameters=types.Schema(
type=types.Type.OBJECT,
properties={
"text": types.Schema(
type=types.Type.STRING,
description="The reminder text to add",
),
"label": types.Schema(
type=types.Type.STRING,
description="Optional heading for this reminder (e.g. 'Follow up on NC Talk'). Defaults to current timestamp.",
),
},
required=["text"],
),
)
_reminders_list_declaration = types.FunctionDeclaration(
name="reminders_list",
description=(
"Read all current pending reminders from REMINDERS.md. "
"Use this to check what reminders are queued before adding duplicates, "
"or to show the user what's pending."
),
parameters=types.Schema(type=types.Type.OBJECT, properties={}),
)
_reminders_clear_declaration = types.FunctionDeclaration(
name="reminders_clear",
description=(
@@ -494,6 +534,8 @@ TOOL_DECLARATIONS = [
_cron_add_declaration,
_cron_remove_declaration,
_cron_toggle_declaration,
_reminders_add_declaration,
_reminders_list_declaration,
_reminders_clear_declaration,
_scratch_read_declaration,
_scratch_write_declaration,

69
cortex/tools/reminders.py Normal file

@@ -0,0 +1,69 @@
"""
Reminders tools.
Reminders are stored in persona/REMINDERS.md and automatically surfaced
in the system prompt at Tier 2+. Use these tools to add, list, and clear
pending reminders.
Operations:
reminders_add — append a new reminder entry
reminders_list — return all current reminders (or a message if empty)
reminders_clear — erase all reminders (moved here from cron.py for consistency;
cron.py still uses the same underlying file)
"""
import asyncio
from datetime import datetime, timezone
from pathlib import Path
from persona import persona_path
def _reminders_path() -> Path:
return persona_path() / "REMINDERS.md"
def _now_label() -> str:
return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
# ---------------------------------------------------------------------------
# Sync implementations
# ---------------------------------------------------------------------------
def _reminders_list() -> str:
p = _reminders_path()
if not p.exists() or not p.read_text().strip():
return "No pending reminders."
return p.read_text()
def _reminders_add(text: str, label: str | None = None) -> str:
p = _reminders_path()
existing = p.read_text() if p.exists() else ""
heading = label or _now_label()
section = f"\n## {heading}\n\n{text.strip()}\n"
p.write_text(existing.rstrip() + "\n" + section)
return f"Reminder added: {heading}"
def _reminders_clear() -> str:
p = _reminders_path()
p.write_text("")
return "All reminders cleared."
# ---------------------------------------------------------------------------
# Async wrappers
# ---------------------------------------------------------------------------
async def reminders_list() -> str:
return await asyncio.to_thread(_reminders_list)
async def reminders_add(text: str, label: str | None = None) -> str:
return await asyncio.to_thread(_reminders_add, text, label)
async def reminders_clear() -> str:
return await asyncio.to_thread(_reminders_clear)

194
cortex/user_settings.py Normal file

@@ -0,0 +1,194 @@
"""
Per-user settings stored in home/{user}/local_llm.json.
Structure:
{
"hosts": [{"id", "label", "api_url", "api_key"}, ...],
"models": [{"id", "host_id", "label", "model_name"}, ...],
"active_model_id": "<model id>" | null
}
Values not configured here fall back to .env server defaults.
"""
import json
import logging
import secrets
from pathlib import Path
from config import settings as app_settings
logger = logging.getLogger(__name__)
def _llm_path(username: str) -> Path:
return app_settings.home_root() / username / "local_llm.json"
def _empty() -> dict:
return {"hosts": [], "models": [], "active_model_id": None}
def _load(username: str) -> dict:
path = _llm_path(username)
if not path.exists():
return _empty()
try:
data = json.loads(path.read_text())
except (json.JSONDecodeError, OSError):
logger.warning("local_llm.json for %s is unreadable — starting fresh", username)
return _empty()
# Migrate old single-model format {api_url, api_key, model} → new format
if "hosts" not in data:
return _migrate_v0(data)
return data
def _migrate_v0(old: dict) -> dict:
"""Migrate flat {api_url, api_key, model} → hosts/models structure."""
data = _empty()
api_url = old.get("api_url") or app_settings.local_api_url
api_key = old.get("api_key") or app_settings.local_api_key
model_name = old.get("model") or app_settings.local_model
if not api_url:
return data
host_id = secrets.token_hex(4)
data["hosts"].append({
"id": host_id,
"label": "Local Model Server",
"api_url": api_url,
"api_key": api_key,
})
if model_name:
model_id = secrets.token_hex(4)
data["models"].append({
"id": model_id,
"host_id": host_id,
"label": model_name,
"model_name": model_name,
})
data["active_model_id"] = model_id
logger.info("migrated local_llm.json v0 → v1 (host=%s)", host_id)
return data
def _save(username: str, data: dict) -> None:
_llm_path(username).write_text(json.dumps(data, indent=2))
# ── Public read API ───────────────────────────────────────────────────────────
def get_config(username: str) -> dict:
"""Return the full local LLM config for the user."""
return _load(username)
def get_active_local_model(username: str) -> dict | None:
"""Return effective {api_url, api_key, model_name, label} for the active model.
Resolution order:
1. User's active model + its host config
2. .env server defaults (LOCAL_API_URL / LOCAL_API_KEY / LOCAL_MODEL)
3. None — caller should raise a helpful error
"""
data = _load(username)
active_id = data.get("active_model_id")
model = next((m for m in data["models"] if m["id"] == active_id), None)
if model:
host = next((h for h in data["hosts"] if h["id"] == model["host_id"]), None)
if host:
return {
"api_url": host.get("api_url", ""),
"api_key": host.get("api_key", ""),
"model_name": model["model_name"],
"label": model.get("label") or model["model_name"],
}
# Fall back to .env defaults
if app_settings.local_api_url and app_settings.local_model:
return {
"api_url": app_settings.local_api_url,
"api_key": app_settings.local_api_key,
"model_name": app_settings.local_model,
"label": app_settings.local_model,
}
return None
# ── Host management ───────────────────────────────────────────────────────────
def save_host(username: str, host_id: str | None,
label: str, api_url: str, api_key: str) -> str:
"""Create or update a host. Returns the host ID.
api_key is only written when non-empty, so submitting a masked placeholder
with a blank key field leaves the stored key unchanged.
"""
data = _load(username)
if host_id:
for h in data["hosts"]:
if h["id"] == host_id:
h["label"] = label.strip()
h["api_url"] = api_url.strip()
if api_key.strip():
h["api_key"] = api_key.strip()
break
else:
host_id = None # ID not found — fall through to create
if not host_id:
host_id = secrets.token_hex(4)
data["hosts"].append({
"id": host_id,
"label": label.strip(),
"api_url": api_url.strip(),
"api_key": api_key.strip(),
})
_save(username, data)
return host_id
# ── Model management ──────────────────────────────────────────────────────────
def add_model(username: str, host_id: str, label: str, model_name: str) -> str:
"""Add a model entry. Auto-activates if it is the first model. Returns the model ID."""
data = _load(username)
model_id = secrets.token_hex(4)
data["models"].append({
"id": model_id,
"host_id": host_id,
"label": label.strip() or model_name.strip(),
"model_name": model_name.strip(),
})
if not data.get("active_model_id"):
data["active_model_id"] = model_id
_save(username, data)
return model_id
def remove_model(username: str, model_id: str) -> None:
data = _load(username)
data["models"] = [m for m in data["models"] if m["id"] != model_id]
if data.get("active_model_id") == model_id:
data["active_model_id"] = data["models"][0]["id"] if data["models"] else None
_save(username, data)
def set_active_model(username: str, model_id: str) -> bool:
"""Set the active model. Returns False if the model ID is not found."""
data = _load(username)
if not any(m["id"] == model_id for m in data["models"]):
return False
data["active_model_id"] = model_id
_save(username, data)
return True

26
dev-restart.sh Executable file

@@ -0,0 +1,26 @@
#!/usr/bin/env bash
# dev-restart.sh — restart Cortex on the gaming laptop and tail logs
# Usage:
#   ./dev-restart.sh          restart and show last 30 log lines
#   ./dev-restart.sh logs     tail live logs (ctrl-c to stop)
#   ./dev-restart.sh status   show service status only
# "scott-lt-i7-rtx" or "192.168.32.19"
CORTEX_HOST="scott-lt-i7-rtx" # hostname or IP of the machine running Cortex
SERVICE="cortex"
case "${1:-restart}" in
logs)
echo "→ Tailing $SERVICE logs on $CORTEX_HOST (ctrl-c to stop)"
ssh "$CORTEX_HOST" "journalctl --user -u $SERVICE -f --no-pager"
;;
status)
ssh "$CORTEX_HOST" "systemctl --user status $SERVICE --no-pager -l"
;;
restart|*)
echo "→ Restarting $SERVICE on $CORTEX_HOST"
ssh "$CORTEX_HOST" "systemctl --user restart $SERVICE"
echo "→ Last 30 log lines:"
ssh "$CORTEX_HOST" "journalctl --user -u $SERVICE --no-pager -n 30"
;;
esac

100
docs/GOOGLE_CHAT_BOT.md Normal file

@@ -0,0 +1,100 @@
# Google Chat Bot Integration
Cortex connects to Google Chat as a **Workspace Add-on** — each Cortex user gets their own webhook endpoint routed to their chosen persona.
**Status:** Live and confirmed working (2026-03-27)
---
## Prerequisites
- A Google Cloud project with **Google Chat API** enabled
- The Cortex server reachable at a public HTTPS URL
- The user pre-registered in Cortex (`manage_passwords.py invite` or `google-add`)
---
## Per-User Setup
### 1. Create the user's `channels.json`
Create `home/{username}/channels.json` on the Cortex server:
```json
{
"google_chat": {
"persona": "inara",
"audience": "https://cortex.dgrzone.com/channels/google-chat/{username}",
"backend": "claude",
"timeout": 25
}
}
```
- **`persona`** — which persona responds (must exist under `home/{username}/persona/`)
- **`audience`** — must exactly match the HTTP endpoint URL you set in Google Cloud Console (Google uses this as the JWT `aud` claim)
- **`backend`** — `"claude"` recommended; Google Chat requires a response within 30s
- **`timeout`** — keep at 25 (Google's hard limit is 30s; this leaves a 5s buffer)
### 2. Configure Google Chat API in Google Cloud Console
1. Go to [console.cloud.google.com](https://console.cloud.google.com) and select the project
2. **APIs & Services → Enabled APIs & services → Google Chat API**
3. Click the **Configuration** tab
4. Fill in **Application info:**
- App name: `Cortex` (or your persona name)
- Avatar URL: optional
- Description: optional
5. Under **Interactive features:**
- Enable **"Join spaces and group conversations"** if you want the bot in group chats, or leave it off for DM-only
6. Under **Connection settings:**
- Select **HTTP endpoint URL**
- Enter: `https://cortex.dgrzone.com/channels/google-chat/{username}`
7. Under **Visibility:**
- Add the specific Google accounts that should be able to use this bot
- For One Sky IT Workspace users: add individuals or the whole domain
8. Click **Save**
> **Important:** The URL in step 6 must exactly match the `audience` value in `channels.json`. Google includes this URL as the JWT `aud` claim on every request, and Cortex rejects any request where they don't match.
---
## How It Works
1. User sends a message in Google Chat → Google POSTs a signed JSON payload to `/channels/google-chat/{username}`
2. Cortex reads the user's `channels.json`, verifies the JWT `systemIdToken` from `authorizationEventObject`
3. Sets the persona context, builds the system prompt, calls the LLM
4. Returns the response wrapped in `hostAppDataAction → chatDataAction → createMessageAction`
The response must be returned synchronously: Google Chat does not support async/background replies the way NC Talk does. Google's 30s limit is a hard constraint, so the 25s `timeout` should not be raised.
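
The nested response envelope from step 4 can be sketched as a plain dictionary. The three wrapper keys come from the steps above; the function name `wrap_chat_reply` and the inner `{"text": ...}` message shape are assumptions for illustration and should be checked against Google's documentation.

```python
def wrap_chat_reply(text: str) -> dict:
    """Wrap reply text in the nested add-on action envelope described above."""
    return {
        "hostAppDataAction": {
            "chatDataAction": {
                "createMessageAction": {
                    # Assumed message shape: a plain-text Chat message.
                    "message": {"text": text},
                }
            }
        }
    }
```

The webhook handler would return this dictionary as the JSON body of its HTTP response, within the configured timeout.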
---
## JWT Verification
Google Chat Workspace Add-ons send a `systemIdToken` in the request body at:
`body["authorizationEventObject"]["systemIdToken"]`
Claims verified by Cortex:
- `iss` = `https://accounts.google.com`
- `aud` = the value of `audience` in `channels.json`
If `audience` is empty, verification is skipped. This is useful for local testing, but `audience` must never be left empty in production.
---
## Nginx
The `/channels/` prefix is already public in `auth_middleware.py` — no Nginx changes needed if you're already proxying all traffic to Cortex. Verify the path isn't blocked by basic auth or IP restrictions.
---
## Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| 404 on the webhook | `channels.json` missing or no `google_chat` key | Create/check `home/{username}/channels.json` |
| 401 Invalid token | `audience` in `channels.json` doesn't match the endpoint URL | Make them identical — copy the URL exactly |
| 401 Missing token | No `systemIdToken` in request | Bot may not be a Workspace Add-on; check connection settings type |
| Timeout / no response | LLM too slow | `backend: "claude"` recommended; reduce context tier if needed |
| Bot not receiving messages | Visibility not configured | Add the user's Google account under Visibility in Cloud Console |


@@ -1,69 +1,78 @@
# Nextcloud Talk Bot Integration
Inara is registered as a bot in Nextcloud Talk, receiving messages via webhook and replying through the bot API.
Cortex connects to Nextcloud Talk as a bot — each Cortex user gets their own webhook endpoint routed to their chosen persona.
**Status:** Live and confirmed working (2026-03-20)
**Status:** Live and confirmed working (2026-03-20); per-user routing added 2026-03-27
---
## Installation
## Prerequisites
Run on the Nextcloud server (inside the Docker container):
```bash
docker exec -it --user www-data <nc-app-container> php /var/www/html/occ talk:bot:install \
"Inara" \
"<secret from cortex .env NEXTCLOUD_TALK_BOT_SECRET>" \
"https://cortex.dgrzone.com/inara-nextcloud-talk-webhook" \
--feature webhook --feature response --feature reaction
```
After installing, enable the bot in each Talk conversation via the conversation settings UI (three-dot menu → Bots).
To list installed bots and verify registration:
```bash
docker exec -it --user www-data <nc-app-container> php /var/www/html/occ talk:bot:list
```
To uninstall (if re-registering with a new secret):
```bash
docker exec -it --user www-data <nc-app-container> php /var/www/html/occ talk:bot:remove <bot-id>
```
- Access to the Nextcloud server (Docker exec or SSH)
- The Cortex server reachable at a public HTTPS URL
- The user pre-registered in Cortex (`manage_passwords.py invite`)
---
## Configuration
## Per-User Setup
**`cortex/.env`:**
```
NEXTCLOUD_URL=https://cloud.dgrzone.com
NEXTCLOUD_TALK_BOT_SECRET=<shared secret — must match occ install command>
```
### 1. Create the user's `channels.json`
`NEXTCLOUD_URL` defaults to `https://cloud.dgrzone.com` in `config.py`.
Create `home/{username}/channels.json` on the Cortex server:
**Nginx:** The `/inara-nextcloud-talk-webhook` endpoint must be reachable by Nextcloud without basic auth. Add a location block before the default `auth_basic` block:
```nginx
location = /inara-nextcloud-talk-webhook {
proxy_pass http://127.0.0.1:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
```json
{
"nextcloud": {
"persona": "inara",
"url": "https://cloud.dgrzone.com",
"bot_secret": "<a secret you choose — must match the occ install command>",
"timeout": 55
}
}
```
(The `/channels/` prefix is already bypassed for Google Chat — consider moving the webhook path to `/channels/nextcloud` in a future cleanup to unify the nginx config.)
- **`persona`** — which persona responds (must exist under `home/{username}/persona/`)
- **`url`** — base URL of the Nextcloud instance
- **`bot_secret`** — a shared HMAC secret; you choose this value and use it in both `channels.json` and the `occ` install command
- **`timeout`** — seconds to wait for the LLM before sending a timeout message (NC Talk is async, so 55s is safe)
### 2. Register the bot in Nextcloud
The Nextcloud container for DgrZone is `dgr_zone_nextcloud-app-1`. Substitute your own container name if different.
First, list existing bots to check if one is already registered (note the bot ID):
```bash
docker exec -it --user www-data dgr_zone_nextcloud-app-1 php /var/www/html/occ talk:bot:list
```
If re-registering (new URL or new secret), uninstall the old bot first:
```bash
docker exec -it --user www-data dgr_zone_nextcloud-app-1 php /var/www/html/occ talk:bot:uninstall <bot-id>
```
Install the bot:
```bash
docker exec -it --user www-data dgr_zone_nextcloud-app-1 php /var/www/html/occ talk:bot:install \
"Inara" \
"<bot_secret from channels.json>" \
"https://cortex.dgrzone.com/webhook/nextcloud/{username}" \
--feature webhook --feature response --feature reaction
```
After installing, enable the bot in each Talk conversation: open the conversation → three-dot menu → **Bots** → enable the bot by name.
---
## How It Works
1. User sends a message in Talk → Nextcloud POSTs a signed webhook to `/inara-nextcloud-talk-webhook`
2. Cortex verifies the incoming HMAC signature, extracts the message text, runs it through the LLM
3. Cortex POSTs the reply to `/ocs/v2.php/apps/spreed/api/v1/bot/{token}/message` with its own HMAC signature
4. The webhook handler returns HTTP 200 immediately; the LLM call happens in a `BackgroundTask` (prevents Nextcloud from disabling the bot due to slow response)
1. User sends a message in Talk → Nextcloud POSTs a signed webhook to `/webhook/nextcloud/{username}`
2. Cortex reads the user's `channels.json`, verifies the incoming HMAC signature
3. Sets the persona context, builds the system prompt, runs the LLM in a `BackgroundTask`
4. Returns HTTP 200 immediately (prevents Nextcloud from disabling the bot due to slow response)
5. Cortex POSTs the reply to `/ocs/v2.php/apps/spreed/api/v1/bot/{token}/message` with its own HMAC signature
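
The two signature steps in this flow can be sketched with the standard library. This assumes the signature is exchanged as a hex digest and that the secret is the `bot_secret` shared with the `occ` install command; the function names are illustrative and are not the actual helpers in `nextcloud_talk.py`.

```python
import hashlib
import hmac


def sign(secret: str, random_value: str, payload: str) -> str:
    """HMAC-SHA256 over random + payload, returned as a hex digest (assumed format)."""
    return hmac.new(
        secret.encode(),
        (random_value + payload).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def verify_incoming(secret: str, random_value: str, raw_body: bytes, signature: str) -> bool:
    """Check a webhook signature, which covers random + the raw request body."""
    expected = sign(secret, random_value, raw_body.decode("utf-8"))
    # compare_digest avoids leaking where the strings differ via timing.
    return hmac.compare_digest(expected, signature.lower())
```

For the outgoing reply, the same `sign` helper would be called with the message text alone rather than the JSON body, since Nextcloud validates replies against the parsed message string.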
---
@@ -76,7 +85,6 @@ location = /inara-nextcloud-talk-webhook {
Nextcloud signs its outgoing webhook with `HMAC-SHA256(secret, random + raw_body)`:
```python
# _verify_signature in nextcloud_talk.py
expected = hmac.new(
secret.encode(),
(random_header + body.decode("utf-8")).encode(),
@@ -89,7 +97,6 @@ expected = hmac.new(
When Cortex posts a reply, Nextcloud verifies the signature against the *parsed message string*, not the raw body. This is because `BotController::sendMessage` passes the parsed `$message` parameter to `checksumVerificationService::validateRequest`, not `$request->getContent()`.
```python
# _send_reply in nextcloud_talk.py
sig = hmac.new(
secret.encode(),
(random_str + message).encode("utf-8"), # message text only, NOT json.dumps({"message": ...})
@@ -105,35 +112,50 @@ sig = hmac.new(secret.encode(), (random_str + '{"message": "..."}').encode(), ha
---
## Multi-User Note
## Nginx
NC Talk currently uses the **default user and persona** (`settings.default_tier`, `load_context()`). All Talk conversations go to Inara regardless of who is messaging. Per-conversation persona routing (e.g., Holly gets Tina) is a future enhancement — would require mapping Nextcloud user IDs or conversation tokens to Cortex users.
The `/webhook/` prefix is already public in `auth_middleware.py`. If Nginx applies basic auth or IP restrictions, add a `location` block before the default auth block:
---
## Claude CLI Auth in systemd
The `CLAUDE_CODE_OAUTH_TOKEN` in `.env` goes stale after each `claude auth login` (tokens rotate). Cortex reads the token live from `~/.claude/.credentials.json` on every Claude call (`llm_client._fresh_claude_token()`), so no manual `.env` update is needed after re-authentication.
Also: never set `ANTHROPIC_API_KEY` to an OAuth token value (`sk-ant-oat01-...`) — the Claude CLI treats it as a direct API key and fails. Only real API keys (`sk-ant-api03-...`) belong in `ANTHROPIC_API_KEY`.
```nginx
location ^~ /webhook/ {
proxy_pass http://127.0.0.1:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
```
---
## Triggering the Bot
- **@mention** — prefix the message with `@inara` (or whatever `AGENT_NAME` is set to); the prefix is stripped before sending to the LLM
- **@mention** — prefix the message with `@{persona_name}`; the prefix is stripped before sending to the LLM
- **Any message** in a conversation where the bot is enabled — all messages are forwarded, not just @mentions
---
## Logs
Two log streams are useful when debugging:
```bash
# Nextcloud server logs (bot registration errors, webhook rejections)
docker exec -it --user www-data dgr_zone_nextcloud-app-1 php /var/www/html/occ log:tail
# Cortex service logs (LLM errors, signature failures, timeouts)
journalctl --user -u cortex -f
```
---
## Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| 404 on the webhook | `channels.json` missing or no `nextcloud` key | Create/check `home/{username}/channels.json` |
| Webhook not received | Bot not enabled for conversation | Enable in Talk conversation settings (Bots) |
| Incoming 401 | Wrong secret in `.env` | Match secret to `occ talk:bot:install` value |
| Incoming 401 | `bot_secret` in `channels.json` doesn't match `occ install` secret | Re-register with matching secret |
| Reply POST returns 401 (first try) | HMAC computed over wrong data | Sign `random + message_text` only (not raw JSON body) |
| Reply POST returns 401 (persistent) | Brute force protection triggered | `occ security:bruteforce:reset <cortex-IP>` |
| Bot auto-disabled by Nextcloud | Webhook held open too long | Verify `BackgroundTasks` is used — return 200 immediately |
| Claude falls back to Gemini | Stale/wrong auth token | Token is auto-refreshed from `~/.claude/.credentials.json`; run `claude auth login` if expired |
| No response at all | Nginx blocking the path with basic auth | Add a `location =` block before the auth block (see Nginx section above) |
| Reply POST returns 401 (persistent) | Brute force protection triggered | `docker exec -it --user www-data dgr_zone_nextcloud-app-1 php /var/www/html/occ security:bruteforce:reset <cortex-IP>` |
| Bot auto-disabled by Nextcloud | Webhook held open too long | Verify `BackgroundTasks` is used — Cortex returns 200 immediately |
| Claude falls back to Gemini | Stale/expired auth token | Run `claude auth login`; token is auto-refreshed from `~/.claude/.credentials.json` |
| No response at all | Nginx blocking the path | Add a `location ^~ /webhook/` block before any auth block |

docs/OPEN_WEBUI_API.md Normal file

@@ -0,0 +1,276 @@
# Open WebUI API Reference for Cortex
> Last updated: 2026-04-03
> Source: https://docs.openwebui.com/reference/api-endpoints/
> Host in use: `http://192.168.32.19:3000` (scott_gaming — 8 GB VRAM)
## Local Model Performance (scott_gaming, 8 GB VRAM)
| Model | Alias | Speed | Practical Context | Spec Context |
|---|---|---|---|---|
| Gemma 4 E4B | `agent-support-gemma-small` | ~25 t/s | **72k tokens** | 128k |
| Gemma 4 26B A4B (MoE) | `agent-support-gemma-medium` | ~9 t/s | **50k tokens** | 256k |
Context is VRAM-constrained — spec limits are higher but KV cache fills available VRAM first.
Techniques to improve this: lower-precision KV cache quantization, flash attention, and context-length tuning in Ollama.
**Practical implications for the local orchestrator:**
- System prompt + memory (T2) + tool results + history: budget ~40-50k for small, ~35-40k for medium
- Medium at 9 t/s is fine for background/async tasks; small at 25 t/s is responsive enough for interactive use
- Both are well above what's needed for most tool loop iterations (~2-5k tokens per round)
---
## Authentication
All API calls use a bearer token:
```
Authorization: Bearer sk-<api-key>
```
API keys are managed in Open WebUI → Settings → Account → API Keys.
Cortex stores these per-user in `home/{username}/local_llm.json` → `hosts[].api_key`.
---
## Core Endpoints Used by Cortex
### List Available Models
```
GET /api/models
Authorization: Bearer sk-...
```
Returns all models (Ollama, OpenAI-proxied, custom functions).
Used by `/api/local-llm/fetch-models` in `routers/local_llm.py`.
Response shape:
```json
{
"data": [
{ "id": "gemma4-e4b", "name": "Gemma 4 E4B" },
...
]
}
```
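A minimal sketch of reading that response shape, assuming only the `data[].id` field shown above is present. `model_ids` is a hypothetical helper name, not a function in Cortex.

```python
import json


def model_ids(response_text: str) -> list[str]:
    """Return the model IDs from a /api/models response body."""
    payload = json.loads(response_text)
    # Skip entries without an "id" instead of failing on an unexpected shape.
    return [m["id"] for m in payload.get("data", []) if "id" in m]


sample = '{"data": [{"id": "gemma4-e4b", "name": "Gemma 4 E4B"}]}'
print(model_ids(sample))
```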
### Chat Completions (OpenAI-compatible)
```
POST /api/chat/completions
Authorization: Bearer sk-...
Content-Type: application/json
```
Standard OpenAI chat format. Supports:
- `messages` — standard role/content array
- `model` — model ID or workspace alias
- `tools` + `tool_choice` — function calling (see Tool Loop below)
- `stream: true/false`
This is the endpoint used by `_local()` in `llm_client.py`.
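As an illustration of the request described above, here is a standard-library sketch that builds (but does not send) a chat completions call. `build_chat_request` and the `sk-example` key are hypothetical; the actual `_local()` implementation may differ.

```python
import json
import urllib.request


def build_chat_request(api_url: str, api_key: str, model: str,
                       messages: list[dict]) -> urllib.request.Request:
    """Build a non-streaming POST to Open WebUI's chat completions endpoint."""
    body = json.dumps({"model": model, "messages": messages, "stream": False})
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/api/chat/completions",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_chat_request(
    "http://192.168.32.19:3000",
    "sk-example",  # placeholder key, not a real credential
    "agent-support-gemma-small",
    [{"role": "user", "content": "Hello"}],
)
print(req.full_url)
# Sending it would look like: urllib.request.urlopen(req, timeout=300)
```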
### Anthropic Messages API Compatibility
```
POST /api/v1/messages
Authorization: Bearer sk-...
```
Open WebUI also accepts Anthropic-format requests and auto-converts them.
Could be used to route Claude SDK calls through Open WebUI.
Base URL for this mode: `http://192.168.32.19:3000/api`
### Direct Ollama Proxy
```
GET /ollama/api/tags — list models
POST /ollama/api/generate — streaming completions
POST /ollama/api/embed — generate embeddings
```
Use these if you need to bypass Open WebUI's filter layer and hit Ollama directly.
Ollama is also accessible directly at `http://192.168.32.19:11434`.
---
## Tool / Function Calling
Both Gemma 4 models (E4B and 26B A4B) support function calling via the standard
OpenAI `tools` parameter. Open WebUI passes this through to the underlying model.
### Request Format
```json
POST /api/chat/completions
{
"model": "gemma4-26b-a4b",
"messages": [
{ "role": "system", "content": "..." },
{ "role": "user", "content": "What's the weather?" }
],
"tools": [
{
"type": "function",
"function": {
"name": "web_search",
"description": "Search the web for current information",
"parameters": {
"type": "object",
"properties": {
"query": { "type": "string", "description": "Search query" }
},
"required": ["query"]
}
}
}
],
"tool_choice": "auto"
}
```
### Tool Call Response
When the model wants to call a tool, it returns `finish_reason: "tool_calls"`:
```json
{
"choices": [{
"finish_reason": "tool_calls",
"message": {
"role": "assistant",
"content": null,
"tool_calls": [{
"id": "call_abc123",
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"current weather NYC\"}"
}
}]
}
}]
}
```
### Sending Tool Results Back
Append the assistant's tool_call message and a tool result message, then re-submit:
```json
{
"messages": [
{ "role": "user", "content": "What's the weather?" },
{ "role": "assistant", "content": null,
"tool_calls": [{ "id": "call_abc123", "function": { "name": "web_search", "arguments": "..." } }] },
{ "role": "tool", "tool_call_id": "call_abc123",
"content": "Current weather in NYC: 62°F, partly cloudy." }
],
"tools": [...],
"tool_choice": "auto"
}
```
Repeat until `finish_reason: "stop"`.
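The request, tool call, and tool result steps above can be sketched as a loop. This is an illustration under stated assumptions, not the Cortex implementation: `run_tool_loop` is a hypothetical name, `post_chat` stands in for whatever function POSTs to `/api/chat/completions` and returns the parsed JSON, and the `max_rounds` cap of 10 is an arbitrary choice.

```python
import json
from typing import Callable


def run_tool_loop(post_chat: Callable[[list], dict],
                  tools: dict[str, Callable[..., str]],
                  messages: list[dict],
                  max_rounds: int = 10) -> str:
    """Call the model, execute requested tools, and repeat until it stops."""
    for _ in range(max_rounds):
        choice = post_chat(messages)["choices"][0]
        message = choice["message"]
        if choice.get("finish_reason") != "tool_calls":
            return message.get("content") or ""
        messages.append(message)  # the assistant turn that carries tool_calls
        for call in message.get("tool_calls", []):
            fn = call["function"]
            args = json.loads(fn["arguments"] or "{}")
            handler = tools.get(fn["name"])
            result = handler(**args) if handler else f"unknown tool: {fn['name']}"
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": result}
            )
    raise RuntimeError("tool loop exceeded max_rounds")


def fake_post_chat(messages: list) -> dict:
    """Stand-in for the HTTP call: asks for one tool, then answers."""
    if messages[-1]["role"] == "tool":
        text = "It is " + messages[-1]["content"]
        return {"choices": [{"finish_reason": "stop",
                             "message": {"role": "assistant", "content": text}}]}
    call = {"id": "call_abc123", "type": "function",
            "function": {"name": "web_search",
                         "arguments": json.dumps({"query": "weather"})}}
    return {"choices": [{"finish_reason": "tool_calls",
                         "message": {"role": "assistant", "content": None,
                                     "tool_calls": [call]}}]}


answer = run_tool_loop(
    fake_post_chat,
    {"web_search": lambda query: "62F, partly cloudy"},
    [{"role": "user", "content": "What's the weather?"}],
)
print(answer)
```

Injecting `post_chat` keeps the loop testable without a running Open WebUI host.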
---
## RAG (Retrieval Augmented Generation)
### Upload a File
```
POST /api/v1/files/
Authorization: Bearer sk-...
Content-Type: multipart/form-data
file=@/path/to/document.pdf
```
Returns a file ID. Poll `/api/v1/files/{id}/process/status` until `completed`.
### Knowledge Collections
```
POST /api/v1/knowledge/{collection_id}/file/add
{ "file_id": "..." }
```
### Use in Chat
Reference files or knowledge collections in any chat request:
```json
{
"model": "gemma4-26b-a4b",
"messages": [...],
"files": [
{ "type": "file", "id": "file-id" },
{ "type": "collection", "id": "collection-id" }
]
}
```
### Process a Web URL into a Collection
```
POST /api/v1/retrieval/process/web
{ "url": "https://example.com/article", "collection_id": "..." }
```
---
## Filter Behavior with Direct API Calls
Open WebUI supports inlet/outlet filter pipelines. With direct API access:
| Filter | Runs automatically? |
|-----------|---------------------|
| `inlet()` | ✅ Yes |
| `stream()`| ✅ Yes |
| `outlet()`| ❌ Manual only — call `POST /api/chat/completed` after receiving response |
For Cortex's use case (tool loop orchestration), this is not a concern — we're
driving the loop ourselves and don't rely on Open WebUI's filter pipeline.
---
## Relevant Cortex Files
| File | Purpose |
|---|---|
| `cortex/llm_client.py` → `_local()` | Current local backend (direct chat only) |
| `cortex/routers/local_llm.py` | Local model settings page + fetch-models endpoint |
| `cortex/user_settings.py` | Per-user host + model config (`local_llm.json`) |
| `cortex/orchestrator_engine.py` | Gemini API tool loop — reference for local version |
| `home/{user}/local_llm.json` | Stored host/model config |
---
## Planned: Local Orchestrator (`local_orchestrator_engine.py`)
A local equivalent of `orchestrator_engine.py` that:
1. Takes the same tool definitions already registered in `cortex/tools/`
2. Converts them to OpenAI `tools` format (already close — minor schema diff from Gemini)
3. Runs a ReAct loop against the local model via `/api/chat/completions`
4. Falls back gracefully if the model doesn't return a valid tool call
See `documentation/TODO__Agents.md` → `[Local] Tool-capable local orchestrator`.
Model recommendation:
- **Gemma 4 26B A4B** (256k ctx, MoE — fast for its size) for complex tool tasks
- **Gemma 4 E4B** (128k ctx) for lightweight/fast tasks
---
## Notes
- Open WebUI workspace aliases (e.g. `agent-support-gemma-small`) resolve to the
underlying Ollama model — use aliases in Cortex for human-friendly model names.
- `tool_choice: "auto"` lets the model decide; `"none"` forces plain text response;
`{"type": "function", "function": {"name": "..."}}` forces a specific tool.
- Gemma 4 models support combined tool use + reasoning (thinking tokens) — useful
for complex multi-step tasks.
- For embeddings (future RAG work), use `/ollama/api/embed` directly.


@@ -0,0 +1,106 @@
# Architecture: LLM Backends
> How Cortex talks to AI models.
> Last updated: 2026-04-03
---
## Three Backends
| Backend | Used for | Auth | Config |
|---|---|---|---|
| **Claude CLI** | Primary chat, all user-facing responses | OAuth token from `~/.claude/.credentials.json` | `DEFAULT_MODEL` in `.env` |
| **Gemini CLI** | Fallback when Claude unavailable | Gemini CLI credentials | Auto-fallback |
| **Local (Open WebUI)** | Private/offline tasks, cost-free use | API key per user in `local_llm.json` | `/settings/local` UI |
The **Gemini API** (google-genai SDK) is also used — but only by the orchestrator tool loop, not as a general chat backend. See [`ARCH__FUTURE.md`](ARCH__FUTURE.md) for the orchestrator pattern.
---
## Backend Selection
User toggles backend in the UI: `claude → gemini → local` (cycles). The active backend is stored server-side; the UI reflects it with color coding (default / green / amber).
When local is active, the active model name appears below the toggle button.
**Fallback chain** (automatic, on error):
```
claude → gemini
gemini → claude
local → claude
```
Auth expiry on Claude triggers a UI banner + `claude_auth_expired` SSE event.
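The fallback chain can be pictured as a lookup table plus a single retry. This is a sketch only: `call_with_fallback` and the `callers` mapping are hypothetical, and it assumes exactly one fallback hop, since `claude → gemini → claude` would otherwise loop.

```python
FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude"}


def call_with_fallback(backend: str, prompt: str, callers: dict) -> tuple:
    """Try the chosen backend; on any error, try its fallback exactly once."""
    try:
        return backend, callers[backend](prompt)
    except Exception:
        fallback = FALLBACK[backend]
        # A second failure propagates to the caller rather than looping.
        return fallback, callers[fallback](prompt)


def failing(prompt: str) -> str:
    raise RuntimeError("backend unavailable")


callers = {
    "claude": lambda p: "claude: " + p,
    "gemini": lambda p: "gemini: " + p,
    "local": failing,
}
print(call_with_fallback("local", "hi", callers))
```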
---
## Claude Backend (`_claude()`)
Runs `claude --print --no-session-persistence --output-format text` as a subprocess.
- System prompt passed via `--system-prompt`
- Conversation history formatted as `<conversation>` block
- Token read live from `~/.claude/.credentials.json` on every call — never relies on the env var, which goes stale after `claude auth login`
- Model override via `--model` flag (e.g. `claude-opus-4-6`)
Timeout: `TIMEOUT_CLAUDE=60` seconds (`.env`)
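A sketch of assembling that command line, using only the flags listed above. `build_claude_argv` is a hypothetical helper; how the conversation itself reaches the CLI (argument or stdin) is not stated in this section, so it is left out.

```python
from typing import Optional


def build_claude_argv(system_prompt: str, model: Optional[str] = None) -> list[str]:
    """Build the argv list for the Claude CLI subprocess."""
    argv = [
        "claude", "--print", "--no-session-persistence",
        "--output-format", "text",
        "--system-prompt", system_prompt,
    ]
    if model:
        argv += ["--model", model]  # optional per-call model override
    return argv


print(build_claude_argv("You are Inara.", model="claude-opus-4-6"))
```

Passing a list to `subprocess.run` (rather than a shell string) avoids quoting problems when the system prompt contains spaces or quotes.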
---
## Gemini CLI Backend (`_gemini()`)
Runs `gemini --output-format text --extensions "" -p <prompt>` as a subprocess.
- `--extensions ""` disables all MCP extensions — prevents child processes from keeping pipes open after responding
- `start_new_session=True` puts the process in its own group for clean `os.killpg` on timeout
- Output is cleaned to strip CLI noise lines (loading messages, retry notices, quota warnings)
Timeout: `TIMEOUT_GEMINI=120` seconds (`.env`)
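The `start_new_session=True` plus `os.killpg` pattern can be sketched as follows. This is a POSIX-only illustration with a hypothetical helper name, `run_in_own_group`; the real `_gemini()` may handle output and errors differently.

```python
import os
import signal
import subprocess
import sys
from typing import Optional


def run_in_own_group(argv: list[str], timeout: float) -> Optional[str]:
    """Run argv in its own session; kill the whole group on timeout."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        start_new_session=True,  # child becomes leader of a new process group
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
        return out
    except subprocess.TimeoutExpired:
        # The group ID equals the child's PID, so this also reaches any
        # grandchildren that would otherwise keep the pipes open.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.communicate()  # reap the process and drain the pipe
        return None


print(run_in_own_group([sys.executable, "-c", "print('hi')"], timeout=10))
```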
---
## Local Backend (`_local()`)
HTTP POST to Open WebUI's OpenAI-compatible endpoint: `{api_url}/api/chat/completions`.
Per-user config in `home/{user}/local_llm.json`:
```json
{
"hosts": [{"id": "...", "label": "scott_gaming", "api_url": "http://192.168.32.19:3000", "api_key": "sk-..."}],
"models": [{"id": "...", "host_id": "...", "label": "Gemma 4 Small", "model_name": "agent-support-gemma-small"}],
"active_model_id": "..."
}
```
Resolution order for active model:
1. User's `active_model_id` in `local_llm.json`
2. `.env` server defaults (`LOCAL_API_URL` / `LOCAL_MODEL`)
3. Error — user is prompted to configure at `/settings/local`
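The resolution order above can be sketched against the `local_llm.json` shape shown earlier. `resolve_local_model` is a hypothetical name, and returning no API key for the `.env` defaults is an assumption, since this section does not say how that case is authenticated.

```python
def resolve_local_model(user_cfg: dict, env: dict) -> dict:
    """Pick the active local model: user config first, then .env defaults."""
    models = {m["id"]: m for m in user_cfg.get("models", [])}
    hosts = {h["id"]: h for h in user_cfg.get("hosts", [])}

    # 1. The user's active model, if it points at a known host.
    model = models.get(user_cfg.get("active_model_id"))
    if model and model.get("host_id") in hosts:
        host = hosts[model["host_id"]]
        return {"api_url": host["api_url"],
                "api_key": host.get("api_key"),
                "model_name": model["model_name"]}

    # 2. Server-wide defaults from .env.
    if env.get("LOCAL_API_URL") and env.get("LOCAL_MODEL"):
        return {"api_url": env["LOCAL_API_URL"],
                "api_key": None,  # assumption: no key known for the defaults
                "model_name": env["LOCAL_MODEL"]}

    # 3. Nothing configured.
    raise LookupError("no local model configured; set one up at /settings/local")


cfg = {
    "hosts": [{"id": "h1", "label": "scott_gaming",
               "api_url": "http://192.168.32.19:3000", "api_key": "sk-example"}],
    "models": [{"id": "m1", "host_id": "h1", "label": "Gemma 4 Small",
                "model_name": "agent-support-gemma-small"}],
    "active_model_id": "m1",
}
print(resolve_local_model(cfg, {})["model_name"])
```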
Timeout: `TIMEOUT_LOCAL=300` seconds (`.env`) — local models may need to load from disk.
**Manage at:** `/settings/local` — supports multiple hosts and models per user, "Fetch from host" button to populate model list from the server.
---
## Distillation Backends
Memory distillation runs on a schedule and uses the LLM for mid and long distill passes. By default it uses the primary backend (`claude`). Override in `.env`:
```
DISTILL_BACKEND_MID=local # saves API credits — Gemma handles summarization well
DISTILL_BACKEND_LONG= # empty = use primary (claude recommended for quality)
```
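The "empty means primary" rule could be read like this. A sketch only: `distill_backend` is a hypothetical helper, and the environment is passed in as a plain dict to keep the example self-contained.

```python
def distill_backend(env: dict, pass_name: str, primary: str = "claude") -> str:
    """Return the backend for a distill pass ("mid" or "long")."""
    value = env.get(f"DISTILL_BACKEND_{pass_name.upper()}", "").strip()
    return value or primary  # unset or empty falls back to the primary backend


env = {"DISTILL_BACKEND_MID": "local", "DISTILL_BACKEND_LONG": ""}
print(distill_backend(env, "mid"), distill_backend(env, "long"))
```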
---
## Current Local Models (scott_gaming, 8 GB VRAM)
| Model | Alias | Speed | Practical Context |
|---|---|---|---|
| Gemma 4 E4B | `agent-support-gemma-small` | ~25 t/s | **72k tokens** |
| Gemma 4 26B A4B (MoE) | `agent-support-gemma-medium` | ~9 t/s | **50k tokens** |
Both support OpenAI `tools` / `tool_choice` function calling — required for the local orchestrator.
Full Open WebUI API reference: [`docs/OPEN_WEBUI_API.md`](../docs/OPEN_WEBUI_API.md)


@@ -0,0 +1,149 @@
# Architecture: Input Channels
> How messages reach Cortex and how Cortex reaches back.
> Last updated: 2026-04-03
---
## Channel Summary
| Channel | Direction | Auth | Endpoint |
|---|---|---|---|
| Web UI | In + Out | JWT session cookie | `/{user}/{persona}` |
| Nextcloud Talk | In + Out | HMAC-SHA256 | `POST /webhook/nextcloud/{username}` |
| Google Chat | In + Out | JWT (Google system token) | `POST /channels/google-chat/{username}` |
| Cron | Out (proactive) | Internal | APScheduler |
| Webhooks | In (future) | TBD | `POST /webhook/{source}` |
**Per-user config:** Each channel that needs secrets (NC Talk bot key, Google Chat audience) stores them in `home/{username}/channels.json`. No channel access by default — each user sets up their own.
---
## Web UI
Single-page app served from `cortex/static/`. All chat happens via `POST /chat` (streaming SSE for real-time response) or `POST /orchestrate` (async job, polled).
**Session auth:** Login form (`/login`) → bcrypt password check → JWT cookie (30-day expiry). Google OAuth also available (`/auth/google`). All non-public routes require a valid cookie.
**Modes:**
- **Direct** — message goes straight to LLM via `/chat`
- **Agent** — message goes to orchestrator (`/orchestrate`), tool loop runs, result polled and streamed into UI
**Context + Memory panel:** Shows current backend (claude/gemini/local), memory tier, active local model. Toggle backend cycles claude → gemini → local.
**Files panel:** Browse and edit persona markdown files in-browser. Session search at the bottom.
**Settings:** `/settings` — Gemini API key, Google account, connected status. `/settings/local` — local model hosts and models.
---
## Nextcloud Talk
Bot integration. The bot is registered in a Talk room; it receives messages, generates a response, and sends it back via the NC Talk bot API.
**Incoming:** `POST /webhook/nextcloud/{username}`
- Signature verified: `HMAC-SHA256(secret, random + raw_body)`
- Ignores non-Create events and non-Note types
- Strips `@{persona}` mention prefix from message text
- Processes in background task (immediate 200 response to NC Talk)
**Outgoing:** Bot API `POST /ocs/v2.php/apps/spreed/api/v1/bot/{room}/message`
- Signature: `HMAC-SHA256(secret, random + message_text)` — note: message text, not body
- Logic lives in `notification.py` (`_send_nct_message`) — shared with proactive notifications
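The two signature rules differ only in what is signed, which is easy to get wrong. The sketch below shows both side by side; the function names are hypothetical, and the hex-encoded digest is an assumption about the wire format rather than something stated in this section.

```python
import hashlib
import hmac


def verify_incoming(secret: str, random_header: str, raw_body: bytes,
                    signature_hex: str) -> bool:
    """Incoming webhook: signature covers random + the raw request body."""
    expected = hmac.new(
        secret.encode(), random_header.encode() + raw_body, hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, signature_hex.lower())


def sign_outgoing(secret: str, random_str: str, message_text: str) -> str:
    """Outgoing reply: signature covers random + message text, not the JSON body."""
    return hmac.new(
        secret.encode(), (random_str + message_text).encode("utf-8"), hashlib.sha256
    ).hexdigest()


body = b'{"message": "hi"}'
incoming_sig = hmac.new(b"s3cret", b"abc" + body, hashlib.sha256).hexdigest()
print(verify_incoming("s3cret", "abc", body, incoming_sig))
```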
**Proactive notifications:** Set `notification_room` in `channels.json` → `nextcloud`. Used by distill completion alerts and `message`/`brief` cron jobs.
**Per-user config (`channels.json`):**
```json
{
"nextcloud": {
"persona": "inara",
"url": "https://cloud.dgrzone.com",
"bot_secret": "...",
"notification_room": "<room-token>",
"timeout": 55
}
}
```
Full setup guide: [`docs/NEXTCLOUD_TALK_BOT.md`](../docs/NEXTCLOUD_TALK_BOT.md)
---
## Google Chat
Workspace Add-on. Messages arrive as HTTP POST from Google's infrastructure; the handler returns a JSON response synchronously (no background task — Google expects an immediate reply).
**Incoming:** `POST /channels/google-chat/{username}`
- Auth: JWT in `authorizationEventObject.systemIdToken`, verified against Google's JWKS
- Response format: `hostAppDataAction.chatDataAction.createMessageAction`
**Per-user config (`channels.json`):**
```json
{
"google_chat": {
"persona": "inara",
"audience": "https://cortex.dgrzone.com/channels/google-chat/scott",
"backend": "claude",
"timeout": 25
}
}
```
Full setup guide: [`docs/GOOGLE_CHAT_BOT.md`](../docs/GOOGLE_CHAT_BOT.md)
---
## Cron / Proactive Messages
User-defined scheduled jobs stored in `home/{user}/persona/{name}/CRONS.json`. Registered at startup by `scheduler.py`; manageable via the `cron_*` orchestrator tools.
**Job types:**
| Type | What happens |
|---|---|
| `remind` | Appends to `REMINDERS.md` — surfaced in context at tier 2+ |
| `note` | Appends to `SCRATCH.md` — read on demand |
| `message` | Sends payload text to user's notification channel |
| `brief` | Runs LLM with payload as prompt, sends response to notification channel |
**`brief` example — morning briefing:**
```json
{
"label": "Morning briefing",
"schedule": "daily:08:00",
"type": "brief",
"payload": "Give Scott a brief good morning. Note any pending reminders or tasks due today.",
"enabled": true
}
```
**Channel selection for `message`/`brief`:**
1. `channel` field on the job (if set)
2. `notification_channel` key in `channels.json`
3. Auto-detect: uses `nextcloud` if configured
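That precedence can be written as a short function. This is an illustrative sketch with a hypothetical name, `pick_channel`; returning `None` when nothing is configured is an assumption.

```python
from typing import Optional


def pick_channel(job: dict, channels_cfg: dict) -> Optional[str]:
    """Resolve the outbound channel for a message/brief job."""
    if job.get("channel"):
        return job["channel"]  # 1. explicit per-job override
    if channels_cfg.get("notification_channel"):
        return channels_cfg["notification_channel"]  # 2. user default
    if "nextcloud" in channels_cfg:
        return "nextcloud"  # 3. auto-detect
    return None


print(pick_channel({"type": "brief"}, {"nextcloud": {"persona": "inara"}}))
```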
**Schedule formats:** `hourly` | `daily` | `daily:HH:MM` | `weekly:DOW` | `weekly:DOW:HH:MM`
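A sketch of parsing those schedule strings, with hypothetical names (`Schedule`, `parse_schedule`). The accepted `DOW` values are not listed here, so the parser keeps the day as an opaque string rather than guessing its format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Schedule:
    kind: str                          # "hourly", "daily", or "weekly"
    day_of_week: Optional[str] = None  # kept verbatim; format is an assumption
    time: Optional[str] = None         # "HH:MM" when given


def _check_time(hhmm: str) -> str:
    hours, sep, minutes = hhmm.partition(":")
    valid = sep and hours.isdigit() and minutes.isdigit()
    if not valid or int(hours) > 23 or int(minutes) > 59:
        raise ValueError(f"bad time: {hhmm!r}")
    return f"{int(hours):02d}:{int(minutes):02d}"


def parse_schedule(spec: str) -> Schedule:
    kind, _, rest = spec.partition(":")
    if kind == "hourly" and not rest:
        return Schedule("hourly")
    if kind == "daily":
        return Schedule("daily", time=_check_time(rest) if rest else None)
    if kind == "weekly" and rest:
        dow, _, hhmm = rest.partition(":")
        return Schedule("weekly", day_of_week=dow,
                        time=_check_time(hhmm) if hhmm else None)
    raise ValueError(f"unrecognized schedule: {spec!r}")


print(parse_schedule("daily:08:00"))
```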
---
## Notification Channel Config
`notification_channel` in `channels.json` sets the default outbound channel for all proactive messages (distill alerts, cron message/brief jobs):
```json
{
"notification_channel": "nextcloud",
...
}
```
If absent, defaults to `nextcloud` if configured. Currently only NC Talk is supported for outbound; Google Chat outbound is a future item.
---
## Future Channels
- **WhatsApp** — Business API or bridge (not started; needs account)
- **Gitea webhooks** — push/PR/issue events → orchestrator (router pattern exists; add `gitea.py`)
- **Aether platform events** — trigger agent actions from business data changes


@@ -0,0 +1,192 @@
# Architecture: Planned Features
> What's next and how it's designed to work.
> Last updated: 2026-04-04
For the current task list see `TODO__Agents.md`. For phases and priorities see `ROADMAP.md`.
---
## 1. Local Orchestrator
**Status:** High priority — design complete, not yet built.
Same ReAct tool loop as the Gemini API orchestrator, but driven by a local model via Open WebUI's OpenAI-compatible API. Enables offline/private agent tasks with no API cost.
**Why local models work for this now:** Gemma 4 E4B and 26B A4B both support OpenAI `tools` / `tool_choice` function calling. The tool schema is nearly identical to Gemini's `FunctionDeclaration` — minor field renaming only.
**Design:**
```
POST /orchestrate (mode: "local")
local_orchestrator_engine.py
• converts tools/ to OpenAI tools format
• POST /api/chat/completions with tools array
• parse tool_calls response
• execute tool, append result
• loop until finish_reason: "stop"
response returned (local model generates final answer)
```
Model selection:
- **Gemma 4 E4B** (25 t/s, 72k ctx) — interactive/fast tasks
- **Gemma 4 26B A4B** (9 t/s, 50k ctx) — heavier reasoning, background tasks
Context budget per iteration (system prompt + memory + tool results + history):
- Small model: budget ~40-50k tokens per round
- Medium model: budget ~35-40k tokens per round
Full API reference: [`docs/OPEN_WEBUI_API.md`](../docs/OPEN_WEBUI_API.md)
---
## 2. Dev Agent Pipeline
**Status:** Design complete, not yet built.
Accept a plain-English task, implement code changes, verify them, and present for human approval before committing.
```
Task (chat / Gitea issue / Kanban)
Orchestrator — reads relevant files, routes to specialist
Specialist Agent (Claude CLI in project directory)
• implements the change
• runs self-check: py_compile / svelte-check
Supervisor Agent
• reviews the diff
• runs test suite
• returns: PASS / NEEDS_REVIEW / FAIL + reason
Human approval gate
• summary in Cortex UI or NC Talk
• approve → commit (+ optional push)
• reject → feedback back to specialist
```
**Specialists** (both Claude CLI):
- **Frontend** — working dir: `~/OSIT_dev/aether_app_sveltekit/` — runs `svelte-check` after every change
- **Backend** — working dir: `~/OSIT_dev/aether_api_fastapi/` — runs `py_compile` + unit tests
**Supervisor** returns structured JSON:
```json
{
"verdict": "PASS | NEEDS_REVIEW | FAIL",
"checks_passed": ["py_compile"],
"checks_failed": [],
"review_notes": "...",
"commit_message": "..."
}
```
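Since the approval gate acts on this JSON, it may be worth validating it before use. The sketch below assumes the field names shown above; `SupervisorReport` and `parse_report` are hypothetical names for illustration.

```python
import json
from dataclasses import dataclass

VERDICTS = {"PASS", "NEEDS_REVIEW", "FAIL"}


@dataclass
class SupervisorReport:
    verdict: str
    checks_passed: list
    checks_failed: list
    review_notes: str = ""
    commit_message: str = ""


def parse_report(text: str) -> SupervisorReport:
    """Parse the supervisor's JSON and reject unknown verdicts."""
    data = json.loads(text)
    if data.get("verdict") not in VERDICTS:
        raise ValueError(f"unexpected verdict: {data.get('verdict')!r}")
    return SupervisorReport(
        verdict=data["verdict"],
        checks_passed=list(data.get("checks_passed", [])),
        checks_failed=list(data.get("checks_failed", [])),
        review_notes=data.get("review_notes", ""),
        commit_message=data.get("commit_message", ""),
    )


report = parse_report(
    '{"verdict": "PASS", "checks_passed": ["py_compile"], "checks_failed": []}'
)
print(report.verdict, report.checks_passed)
```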
---
## 3. Gitea Integration
**Status:** Not started. pfSense port forward for SSH already confirmed working.
- **Webhooks → Cortex:** push/PR/issue events → `POST /webhook/gitea` → orchestrator
- Router pattern already established; add `cortex/routers/gitea.py`
- **Gitea Actions CI:** `.gitea/workflows/check.yml` — run `py_compile`/`svelte-check` on push
- **Cortex → Gitea:** after human approval, call Gitea API to create PR or push branch
SSH clone/push: `git clone ssh://git@git.dgrzone.com:2222/<user>/<repo>.git`
---
## 4. Knowledge Layer (AE Journals)
**Status:** Tools exist, import script not yet built.
AE Journals becomes the searchable long-term knowledge base. Complements memory distillation: memory files cover "what have we been working on lately"; Journals cover "what do I know about topic X".
**Existing tools:** `ae_journal_search`, `ae_journal_entry_create` — already in orchestrator tool suite.
**Import script (to build):**
- Walk a markdown directory (Nextcloud, agents_sync docs)
- Chunk by H2 section
- Search before creating (deduplication)
- Tag from frontmatter, filename, directory path
- Target sources: `~/DgrZone_Nextcloud/`, `~/OSIT_Nextcloud/`
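The "chunk by H2 section" step might look like the following. This is a sketch with a hypothetical name, `chunk_by_h2`; skipping headings inside fenced code blocks is a design choice added here, not something the plan specifies.

```python
FENCE = "`" * 3  # a fenced-code delimiter, built this way to keep the example clean


def chunk_by_h2(markdown: str) -> list[tuple[str, str]]:
    """Split markdown into (heading, body) pairs at each H2 heading."""
    chunks: list[tuple[str, str]] = []
    title, lines, in_fence = "", [], False

    def flush() -> None:
        if title or any(line.strip() for line in lines):
            chunks.append((title, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore headings inside code blocks
        if line.startswith("## ") and not in_fence:
            flush()
            title, lines = line[3:].strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


doc = "# WireGuard\nintro\n## Install\napt install wireguard\n## Peers\nadd a peer\n"
for heading, body in chunk_by_h2(doc):
    print(repr(heading), repr(body))
```

Text before the first H2 is kept as a chunk with an empty heading, so a file's introduction is not lost.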
**Agent workflow:**
```
"Summarize my notes on WireGuard setup"
→ orchestrator calls ae_journal_search("wireguard")
→ returns matching entries
→ Claude synthesizes response
```
---
## 5. Intelligent Model Routing
**Status:** Deferred. Currently user-toggled.
Route automatically based on task characteristics rather than requiring manual backend selection:
| Task type | Backend | Reason |
|---|---|---|
| User-facing conversation | Claude | Quality prose, persona fidelity |
| Tool use / orchestration | Gemini API | Native function calling, free tier |
| Private / sensitive / offline | Local (Ollama) | No data leaves the network |
| Long context (>50k tokens) | Gemini 2.0 | 1M token context window |
| Fast/cheap simple queries | Local (E4B) | 25 t/s, no API cost |
Routing logic would live in `llm_client.py` or a new `router.py` — map task metadata to backend choice.
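One way to picture that mapping is a small rule function over task metadata. Everything here is hypothetical: the metadata keys (`private`, `estimated_tokens`, `needs_tools`, `simple`), the returned labels, and the rule order, which puts privacy first as an assumption.

```python
def choose_backend(task: dict) -> str:
    """Map task metadata to a backend label, following the table above."""
    if task.get("private"):
        return "local"        # no data leaves the network
    if task.get("estimated_tokens", 0) > 50_000:
        return "gemini"       # long-context work
    if task.get("needs_tools"):
        return "gemini-api"   # native function calling
    if task.get("simple"):
        return "local"        # fast, no API cost
    return "claude"           # default: user-facing conversation


print(choose_backend({"needs_tools": True}))
```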
---
## 6. RAG via Open WebUI
**Status:** Future — Open WebUI already supports it.
Feed Nextcloud documents or session logs into Open WebUI knowledge collections. Reference them in local model chat via `"files": [{"type": "collection", "id": "..."}]`.
Would complement AE Journals for local-only contexts where data shouldn't leave the network.
API reference: [`docs/OPEN_WEBUI_API.md`](../docs/OPEN_WEBUI_API.md) — RAG section.
---
## 8. Agent Architecture Ideas (from Claude Code leak)
**Status:** Research — review before building dev agent pipeline and orchestrator.
The Claude Code system prompt was leaked in early April 2026. Two reimplementation repos are worth reading for design ideas before building out the dev agent pipeline and local orchestrator:
- https://github.com/HarnessLab/claw-code-agent — Python reimplementation targeting local models (Qwen3-Coder recommended); most technically detailed
- https://github.com/ultraworkers/claw-code — Community porting/reverse-engineering project; reportedly has interesting detail in the source code itself
**Ideas worth incorporating:**
**Tiered permission architecture** — explicit read-only / write / shell / unsafe modes, each requiring an opt-in flag. Currently Cortex has implicit trust for agent operations. Relevant once the dev agent pipeline is writing and executing code — don't want a `brief` cron job accidentally in write mode.
**Agent lineage tracking** — agent manager records which agent spawned which sub-agent. Useful for debugging multi-step orchestrated tasks and essential for the supervisor → specialist → approval gate chain.
**Cost/budget enforcement** — hard token and cost budgets per operation, multiple budget types. `ORCHESTRATOR_MAX_ROUNDS=10` is Cortex's only guardrail today. Worth adding a token budget check to the tool loop, especially relevant for local models with hard context ceilings (72k/50k practical).
**Context compaction/snipping** — automatic mid-session context trimming when approaching limits. Important for long orchestrator runs against local models. Could trim tool results that are no longer needed for the current reasoning step.
**Nested agent delegation with dependency-aware batching** — sub-agents that know their parent; parallel sub-tasks batched by dependency order. Directly applicable to the dev agent pipeline (orchestrator → specialist → supervisor, with some steps parallelizable).
**File history journaling** — beyond session logs, a journal of what files changed and why, with replay summaries. Different from memory distillation — more like a git log for agent actions. Could complement the supervisor agent's diff review.
**Plugin/manifest-based tool extensions** — tools declared via manifest rather than hardcoded in `__init__.py`. Would make adding new orchestrator tools less invasive. Worth considering before the tool suite grows much larger.
---
## 7. Permanent Fleet Hosting
**Status:** Deferred.
Currently running on `scott_lpt` (main laptop). Long-term target: home server (always-on, Docker).
`docker-compose.yml` already exists in the project root. Deployment path:
1. Copy to home server
2. Configure reverse proxy (Nginx, already Docker-hosted)
3. Set subdomain `cortex.dgrzone.com` → home server internal IP
4. WireGuard required for all access — not internet-exposed


@@ -1,306 +1,14 @@
# Architecture: Intelligence Layer
# ARCH__Intelligence_Layer.md — Archived
**Status:** Design phase — not yet implemented
**Last updated:** 2026-03-18
This document has been split into focused per-topic docs.
This document captures the architectural thinking behind expanding Cortex from a smart dispatcher into a genuine intelligence layer: capable of using tools, coordinating specialist agents, and managing a personal knowledge base.
| What you're looking for | New location |
|---|---|
| Overall architecture, design decisions | [`ARCH__SYSTEM.md`](ARCH__SYSTEM.md) |
| Orchestrator/Responder pattern, tool loop | [`ARCH__FUTURE.md`](ARCH__FUTURE.md) — section 1 |
| Dev agent pipeline, supervisor agent | [`ARCH__FUTURE.md`](ARCH__FUTURE.md) — section 2 |
| Knowledge layer, AE Journals import | [`ARCH__FUTURE.md`](ARCH__FUTURE.md) — section 4 |
| LLM backends and routing | [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) |
| Model routing (future) | [`ARCH__FUTURE.md`](ARCH__FUTURE.md) — section 5 |
---
## Overview
Cortex currently dispatches chat messages to LLM CLI backends and returns the response. The Intelligence Layer adds three major capabilities on top of that foundation:
1. **Orchestrator/Responder** — Gemini handles tool use and planning; Claude handles the user-facing response
2. **Dev Agent Pipeline** — Specialist agents implement code changes; a supervisor checks the work
3. **Knowledge Layer** — AE Journals becomes the primary knowledge base; agents can read and write it
These are independent tracks that share the same trigger layer and can be built incrementally.
---
## 1. Orchestrator / Responder Pattern
### The Problem
Claude CLI (via Pro subscription) doesn't expose direct API tool-calling. Gemini API (free tier) does. But Claude produces higher-quality user-facing prose and reasoning. The solution is to use each model for what it does best.
### The Pattern
```
User message
Orchestrator (Gemini API)
• interprets intent
• decides which tools to call
• executes tool loop (ReAct: reason → act → observe → repeat)
• assembles enriched context + tool results
Responder (Claude CLI)
• receives enriched context
• writes the user-facing response
User
```
For **direct chat** (no tools needed), the orchestrator is bypassed entirely — message goes straight to Claude. The orchestrator only activates when tools are required or when explicitly invoked (e.g., a background task).
### Why Gemini API (not CLI)?
- Gemini CLI is a subprocess; function calling via subprocess is fragile
- Gemini API (`google-generativeai` SDK) has native structured tool-calling
- Free tier (Gemini 2.0 Flash) handles orchestration load without cost
- Access token is short-lived but auto-refreshed by the SDK (no expiry problem)
### Tool Strategy
Tools for the orchestrator are **separate** from the existing `ae_*` MCP tools. The ae_* tools are stable and used by existing agents — do not modify them.
New orchestrator tools are Python functions wrapped in Gemini function declarations:
| Tool | What it does | Implementation |
|---|---|---|
| `web_search` | DuckDuckGo search | `duckduckgo-search` library |
| `ae_journal_search` | Search AE Journals via V3 API | HTTP to AE API |
| `ae_journal_entry_create` | Write a new journal entry | HTTP to AE API |
| `ae_task_list` | Read Kanban tasks | HTTP to AE API or agents_sync file |
| `file_read` | Read a file from known safe paths | Python `pathlib` |
| `gitea_api` | Query Gitea repos, issues, PRs | Gitea REST API |
Tools are registered in `cortex/tools/` (one file per domain group).
### Implementation Path
```
cortex/
tools/
__init__.py — tool registry
web.py — web_search
ae_knowledge.py — ae_journal_* tools
ae_tasks.py — task tools
gitea.py — Gitea API tools
routers/
orchestrator.py — POST /orchestrate, GET /orchestrate/{job_id}
orchestrator_engine.py — Gemini tool loop + Claude handoff
```
Endpoint contract:
```
POST /orchestrate
{
"task": "What tasks are due this week and summarize my notes on X topic",
"session_id": "optional — if part of an ongoing conversation",
"respond_with_claude": true // false = return Gemini's assembled context only
}
→ { "job_id": "uuid", "status": "queued" }
GET /orchestrate/{job_id}
→ { "status": "complete", "result": "...", "tool_calls": [...] }
```
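The submit-then-poll contract can be modelled with a small in-memory job table. The sketch below is framework-free and hypothetical (`JOBS`, `submit`, `complete`, and `poll` are illustrative names); it only mirrors the request and response shapes shown above.

```python
import uuid

# Hypothetical in-memory job table; a real service would need persistence or expiry.
JOBS: dict[str, dict] = {}


def submit(task: str, respond_with_claude: bool = True) -> dict:
    """Mirror of POST /orchestrate: record the job and return its id."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {
        "status": "queued",
        "task": task,
        "respond_with_claude": respond_with_claude,
        "result": None,
        "tool_calls": [],
    }
    return {"job_id": job_id, "status": "queued"}


def complete(job_id: str, result: str, tool_calls: list) -> None:
    """Called by the worker when the tool loop finishes."""
    JOBS[job_id].update(status="complete", result=result, tool_calls=tool_calls)


def poll(job_id: str) -> dict:
    """Mirror of GET /orchestrate/{job_id}."""
    job = JOBS[job_id]
    return {
        "status": job["status"],
        "result": job["result"],
        "tool_calls": job["tool_calls"],
    }
```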
---
## 2. Trigger Layer
All three capabilities (chat, orchestration, dev agents) share the same trigger layer:
```
┌────────────────────────────────────────────────┐
│ TRIGGERS │
│ │
│ Chat UI → POST /chat (existing) │
│ Cron → POST /orchestrate (new) │
│ Gitea → POST /webhook/gitea (new) │
│ NC Talk → POST /webhook/nextcloud (exists) │
│ Manual → CLI / curl for debugging │
└────────────────────────────────────────────────┘
```
Cron trigger example (from existing cron infrastructure):
```bash
curl -X POST http://localhost:8000/orchestrate \
-H "Content-Type: application/json" \
-d '{"task": "Check for overdue Kanban tasks and notify via NC Talk"}'
```
This means the same orchestrator endpoint is usable from chat, crons, and webhooks without any special cases.
---
## 3. Dev Agent Pipeline
### The Goal
Accept a plain-English task like *"Fix the bug where X, add a test for it"* and produce:
- A working code change
- Passing syntax/type checks
- A summary of what changed and what still needs human review
- A commit ready to push (pending approval)
### Architecture
```
Task request (chat / Gitea issue / Kanban)
Orchestrator
• reads relevant files (context gathering)
• routes to correct specialist
Specialist Agent (Claude CLI in project directory)
• implements the change
• runs self-check: py_compile / svelte-check
Supervisor Agent
• reviews the diff
• runs test suite
• returns: PASS / NEEDS_REVIEW / FAIL + reason
Human approval gate
• summary shown in Cortex UI or NC Talk
• user approves → commit + optional push
• user rejects → feedback goes back to specialist
```
### Specialist Agents
Two initial specialists, both using Claude CLI:
**Frontend specialist** (working dir: `~/OSIT_dev/aether_app_sveltekit/`):
- Reads `documentation/TODO__Agents.md` and `CLAUDE.md` before acting
- Runs `npx svelte-check` after every change — no exceptions
- Atomic commits (one component or fix per commit)
**Backend specialist** (working dir: `~/OSIT_dev/aether_api_fastapi/`):
- Reads `documentation/TODO__Agents.md` and `CLAUDE.md` before acting
- Runs `python3 -m py_compile` after every file edit
- Runs unit tests before declaring done
- Flags E2E tests that need human review
### Supervisor Agent
The supervisor is a separate Claude invocation that receives:
- The diff of all changed files
- Stdout/stderr from all checks that were run
- The original task description
It returns a structured assessment:
```json
{
"verdict": "PASS | NEEDS_REVIEW | FAIL",
"checks_passed": ["py_compile", "unit_tests"],
"checks_failed": [],
"review_notes": "E2E tests not run — touch auth router, recommend manual check",
"commit_message": "fix: correct session token validation in auth middleware"
}
```
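Because the approval gate depends on this JSON, the caller would probably want to validate it before acting on it. A minimal sketch, assuming the field names shown above; the `Assessment` class, `parse_assessment`, and the rule that downgrades an inconsistent `PASS` are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field

VERDICTS = {"PASS", "NEEDS_REVIEW", "FAIL"}


@dataclass
class Assessment:
    verdict: str
    checks_passed: list = field(default_factory=list)
    checks_failed: list = field(default_factory=list)
    review_notes: str = ""
    commit_message: str = ""


def parse_assessment(raw: str) -> Assessment:
    """Parse the supervisor's JSON and reject unknown verdicts."""
    assessment = Assessment(**json.loads(raw))
    if assessment.verdict not in VERDICTS:
        raise ValueError(f"unexpected verdict: {assessment.verdict!r}")
    # Assumption: a PASS that lists failed checks is contradictory, so downgrade it.
    if assessment.verdict == "PASS" and assessment.checks_failed:
        assessment.verdict = "NEEDS_REVIEW"
    return assessment
```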
### Gitea Integration
- **Gitea webhooks → Cortex:** Push/PR events trigger supervisor review automatically
- **Gitea Actions:** Run `py_compile`/`svelte-check` on every push (simple CI, no custom runner)
- **Cortex → Gitea:** After human approval, supervisor calls Gitea API to create PR or push
Gitea Actions are simpler than they sound — a `.gitea/workflows/check.yml` is just a YAML file that runs shell commands on push. No external CI infrastructure needed.
---
## 4. Knowledge Layer
### The Goal
AE Journals becomes the primary source of truth for personal and business knowledge. Notes, documentation, and logs that currently live scattered across markdown files get organized into Journals with proper structure, search, and agent-accessible read/write.
### Import Strategy
1. **Don't bulk-import blindly.** The orchestrator searches AE Journals before creating anything (deduplication).
2. **Chunk by section.** A large markdown file becomes multiple journal entries — one per H2 section.
3. **Preserve provenance.** Each imported entry includes source path, import date, and original file date in its `data_json` or notes.
4. **Tag intelligently.** Tags come from: frontmatter, filename keywords, directory path, and content analysis.
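Steps 2 and 3 (chunk by section, preserve provenance) can be sketched as follows. `chunk_by_h2` is a hypothetical helper, not the planned import script; it drops any text before the first H2 and does not special-case `##` lines inside fenced code blocks.

```python
import re
from datetime import date


def chunk_by_h2(markdown: str, source_path: str) -> list[dict]:
    """Split a markdown document into one entry per H2 section."""
    sections = []
    title, body = None, []
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.*)", line)  # matches "## X" but not "### X"
        if match:
            if title is not None:
                sections.append((title, body))
            title, body = match.group(1).strip(), []
        elif title is not None:
            body.append(line)
    if title is not None:
        sections.append((title, body))

    today = date.today().isoformat()
    return [
        {
            "title": section_title,
            "content": "\n".join(lines).strip(),
            "source_path": source_path,  # provenance
            "import_date": today,
        }
        for section_title, lines in sections
    ]
```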
### Source Priority
| Source | Priority | Notes |
|---|---|---|
| `~/DgrZone_Nextcloud/` | High | Personal notes, projects |
| `~/OSIT_Nextcloud/` | High | Business docs |
| `~/agents_sync/aether/docs/` | Medium | Platform specs (already structured) |
| OpenClaw session logs | Low | Historical, lots of noise |
### Agent Workflow
```
"Summarize my notes on WireGuard setup"
Orchestrator calls ae_journal_search("wireguard")
Returns matching entries
Claude synthesizes a response
```
```
"Save this as a note in my DgrZone journal"
Orchestrator calls ae_journal_entry_create(
journal="DgrZone General",
title="...",
content="...",
tags=["note", "wireguard"]
)
```
### Context Tiers (Inara Memory)
The existing distill system (`MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md`) handles working memory. The Knowledge Layer is complementary — it's the **searchable long-term archive**, not the rolling context window. Agents should:
- Use memory files for "what have we been working on lately"
- Use AE Journals search for "what do I know about topic X"
---
## 5. Model Routing (Future)
Currently hardcoded: Claude default, Gemini fallback. Future intelligent routing:
| Task type | Model | Reason |
|---|---|---|
| User-facing conversation | Claude | Quality prose, reasoning |
| Tool use / orchestration | Gemini API | Native function calling, free |
| Private / sensitive | Ollama (local) | No data leaves the network |
| Long context (>100k tokens) | Gemini 2.0 | 1M token context window |
| Code generation | Claude | Strong code quality |
Routing logic lives in `cortex/orchestrator_engine.py` — a simple function that maps task metadata to a backend choice.
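A routing function of that kind might look like the sketch below. The metadata keys (`private`, `context_tokens`, `needs_tools`) and the precedence order are assumptions; the table above does not say which rule wins when several apply.

```python
def choose_backend(task: dict) -> str:
    """Map task metadata to a backend name, following the routing table."""
    if task.get("private"):
        return "local"    # no data leaves the network
    if task.get("context_tokens", 0) > 100_000:
        return "gemini"   # long-context window
    if task.get("needs_tools"):
        return "gemini"   # native function calling
    return "claude"       # conversation and code generation
```

Privacy is checked first here so that a sensitive task is never sent to a hosted model merely because it is long or needs tools.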
---
## Implementation Order (Recommended)
1. **Orchestrator Phase 1** — Gemini API integration, basic tool loop, `/orchestrate` endpoint
- Unlocks: web search in chat, AE Journal queries, cron-triggered tasks
2. **Knowledge import** — markdown → AE Journal Entries tool + import script
- Unlocks: searchable knowledge base for all agents
3. **Dev agent pipeline** — Frontend + Backend specialist agents
- Unlocks: AI-assisted development with supervisor review
4. **Gitea integration** — webhook receiver + Actions CI
- Unlocks: event-driven automation, PR workflow
5. **Intelligent routing** — model selection by task type
- Polish: cost and quality optimization
---
## Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Orchestrator model | Gemini API (not CLI) | Native tool calling; free tier |
| Responder model | Claude CLI (Pro sub) | Quality output; no API cost |
| Direct chat bypass | Yes | Don't add latency when tools aren't needed |
| Tool set | Separate from ae_* MCPs | ae_* tools are stable; don't risk breaking active agents |
| Dev agents | Claude CLI in project dir | CLAUDE.md + project context already in place |
| Human approval gate | Required before commit | Agents can propose; humans decide |
| Knowledge primary source | AE Journals | Already exists, structured, searchable |
*Original content written 2026-03-18. Superseded 2026-04-03.*


@@ -0,0 +1,121 @@
# Architecture: Persona System & Memory
> How Inara (and other personas) know who they are and what they remember.
> Last updated: 2026-04-03
---
## Filesystem Layout
Each persona lives in `home/{username}/persona/{name}/`:
```
home/scott/persona/inara/
IDENTITY.md Who Inara is — role, name, origin
SOUL.md Values, personality, voice, what she cares about
PROTOCOLS.md Behavioral rules — how she responds, what she avoids
CONTEXT_TIERS.md Documents which files load at each tier
USER.md Scott's profile — loaded into context so she knows who she's talking to
HELP.md Persona-specific help content (appended to shared HELP.md in UI)
MEMORY_SHORT.md Recent session digest (auto-distilled daily)
MEMORY_MID.md Mid-term summary (auto-distilled weekly)
MEMORY_LONG.md Long-term memory (auto-distilled monthly)
REMINDERS.md Pending reminders (auto-surfaced at tier 2+)
SCRATCH.md Ephemeral scratchpad (read/write via tools)
TASKS.json Personal task list (managed via tools)
CRONS.json Scheduled jobs (managed via tools)
sessions/ Session turn logs — YYYY-MM-DD.md, one file per day
```
**ContextVars:** `persona.py` sets `_user` and `_persona` ContextVars per request. Everything downstream calls `persona_path()` to resolve the right directory — no globals, no thread-local state.
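A minimal sketch of this pattern, assuming the variable names given above. The default values, `HOME_ROOT`, and `handle_request` are hypothetical and only serve to show that two concurrent requests resolve different directories.

```python
import asyncio
from contextvars import ContextVar
from pathlib import Path

# Names mirror the description above; defaults and the base directory are assumptions.
_user: ContextVar[str] = ContextVar("_user", default="scott")
_persona: ContextVar[str] = ContextVar("_persona", default="inara")
HOME_ROOT = Path("home")


def persona_path() -> Path:
    """Resolve the persona directory for the current request context."""
    return HOME_ROOT / _user.get() / "persona" / _persona.get()


async def handle_request(user: str, persona: str) -> Path:
    _user.set(user)
    _persona.set(persona)
    await asyncio.sleep(0)  # yield so the other task runs in between
    return persona_path()


async def demo() -> list[Path]:
    # Each task created by gather() runs in its own copy of the context.
    return await asyncio.gather(
        handle_request("holly", "tina"),
        handle_request("brian", "wintermute"),
    )
```

Running `asyncio.run(demo())` resolves one directory per request, and the values set inside the tasks do not leak into the caller's context.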
---
## Context Tiers
Each chat request specifies a tier (default: 2). Higher tiers load more context — slower but richer.
| Tier | Loaded Files | Use case |
|---|---|---|
| 1 | IDENTITY.md | Minimal — lightweight tasks |
| 2 | + SOUL.md, PROTOCOLS.md, USER.md, MEMORY_SHORT.md, MEMORY_MID.md, REMINDERS.md | Standard chat |
| 3 | + MEMORY_LONG.md, CONTEXT_TIERS.md | Deep sessions, long tasks |
| 4 | + SCRATCH.md, TASKS.json | Full state — agent mode |
`context_loader.py` assembles the system prompt from these files in order. The resulting prompt is passed to whichever LLM backend handles the request.
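The cumulative tier table can be expressed directly as data. The sketch below is an assumption about how such a loader could work, not the contents of `context_loader.py`; the section header format and the decision to skip missing files are illustrative choices.

```python
from pathlib import Path

# File lists copied from the tier table; each tier adds to the ones below it.
TIER_FILES = {
    1: ["IDENTITY.md"],
    2: ["SOUL.md", "PROTOCOLS.md", "USER.md",
        "MEMORY_SHORT.md", "MEMORY_MID.md", "REMINDERS.md"],
    3: ["MEMORY_LONG.md", "CONTEXT_TIERS.md"],
    4: ["SCRATCH.md", "TASKS.json"],
}


def files_for_tier(tier: int) -> list[str]:
    """Return every file loaded at the given tier, in load order."""
    names = []
    for level in range(1, tier + 1):
        names.extend(TIER_FILES[level])
    return names


def build_system_prompt(persona_dir: Path, tier: int = 2) -> str:
    """Concatenate the tier's files in order, skipping any that are missing."""
    sections = []
    for name in files_for_tier(tier):
        path = persona_dir / name
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)
```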
---
## Memory Distillation
Three-tier rolling memory system, run by APScheduler:
```
sessions/YYYY-MM-DD.md ← raw session logs (written by session_logger.py)
↓ daily 03:00
MEMORY_SHORT.md ← recent session digest (no LLM — pure aggregation)
↓ weekly Sun 03:30
MEMORY_MID.md ← concise summary (LLM)
↓ monthly 1st 04:00
MEMORY_LONG.md ← integrated long-term memory (LLM)
```
**Short distill** — reads the most recent session files that fit within the token budget, writes them in chronological order. No LLM involved — fast and cheap.
**Mid distill** — LLM summarizes MEMORY_SHORT into a concise digest. Prompt asks for recurring themes, decisions, ongoing projects, Scott's current state and priorities. Written in first person as Inara.
**Long distill** — LLM integrates MEMORY_MID into MEMORY_LONG. Rules: preserve historical facts, update stale info, absorb new themes, remove irrelevant entries.
**Distill notifications** — after mid and long runs, `notification.py` sends a message to the user's configured NC Talk notification room (if `notification_room` is set in `channels.json`).
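The short distill step (newest sessions that fit the budget, written oldest first) can be sketched without any LLM. `pick_recent_sessions` is a hypothetical helper, and the `len(text) // 4` token estimate is an assumption, not the tokenizer Cortex uses.

```python
def pick_recent_sessions(sessions: dict[str, str], budget_tokens: int) -> list[str]:
    """Return the newest session dates that fit the budget, oldest first.

    `sessions` maps "YYYY-MM-DD" to that day's log text.
    """
    chosen = []
    used = 0
    for day in sorted(sessions, reverse=True):  # newest first
        cost = len(sessions[day]) // 4          # rough token estimate (assumption)
        if used + cost > budget_tokens:
            break
        chosen.append(day)
        used += cost
    return sorted(chosen)                        # chronological order
```

ISO dates sort correctly as plain strings, which is why no date parsing is needed here.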
**Controls** in `.env`:
```
AUTO_DISTILL=true
AUTO_DISTILL_SHORT=true
AUTO_DISTILL_MID=true
AUTO_DISTILL_LONG=true # off by default — first run warrants manual review
DISTILL_BACKEND_MID=local # use local model to save API credits
DISTILL_BACKEND_LONG= # empty = primary backend (claude recommended)
MEMORY_BUDGET_SHORT=3000 # token budgets (soft caps)
MEMORY_BUDGET_MID=2000
MEMORY_BUDGET_LONG=2000
```
Manual distill via API:
```
POST /distill/short
POST /distill/mid
POST /distill/long
GET /distill/status
```
---
## Adding a New Persona
`persona_template.py` bootstraps a new persona directory from string templates. The onboarding flow (`/setup/persona`) calls this when a new user creates their first persona.
To add one manually:
1. Create `home/{username}/persona/{name}/`
2. Copy and edit the files from an existing persona (e.g. `home/scott/persona/inara/`)
3. At minimum: `IDENTITY.md`, `SOUL.md`, `PROTOCOLS.md`, `USER.md`
4. The distiller will create the `MEMORY_*.md` files on first run
---
## Session Search
Past sessions are searchable via `GET /sessions/search?q=...&user=...&persona=...`.
Available in the UI via the search box at the bottom of the Files panel (open with the Files button). Results are grouped by date with highlighted excerpts.
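As an illustration of what a search grouped by date with highlighted excerpts could do, here is a plain substring search over `sessions/YYYY-MM-DD.md` files. `search_sessions`, the `**…**` highlight markers, and the excerpt width are assumptions; the real endpoint may index or rank differently.

```python
from pathlib import Path


def search_sessions(sessions_dir: Path, query: str, width: int = 40) -> dict[str, list[str]]:
    """Case-insensitive substring search, grouped by session date."""
    results: dict[str, list[str]] = {}
    needle = query.lower()
    if not needle:
        return results  # an empty query would otherwise match everywhere
    for path in sorted(sessions_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        lowered = text.lower()
        start = lowered.find(needle)
        while start != -1:
            end = start + len(needle)
            excerpt = (
                text[max(0, start - width):start]
                + "**" + text[start:end] + "**"   # highlight the match
                + text[end:end + width]
            )
            results.setdefault(path.stem, []).append(excerpt.replace("\n", " "))
            start = lowered.find(needle, end)
    return results
```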
---
## Active Personas
| User | Persona | Description |
|---|---|---|
| scott | inara | Scott's primary assistant |
| scott | developer | Dev-focused persona |
| holly | tina | Holly's primary assistant |
| brian | wintermute | Brian's primary assistant |


@@ -0,0 +1,90 @@
# Architecture: System Overview
> How the pieces fit together.
> Last updated: 2026-04-03
---
## Architecture Diagram
```
┌─────────────────────────────────────────────────────────┐
│ INPUT CHANNELS │
│ │
│ Web UI ──────────────────────────────────────────┐ │
│ Nextcloud Talk ──── POST /webhook/nextcloud/{u} ─┤ │
│ Google Chat ─────── POST /channels/google-chat/{u}┤ │
│ Cron / Scheduler ─────────────────────────────────┤ │
│ Webhooks (future) ─────────────────────────────────┘ │
└─────────────────────────────┬───────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ CORTEX DISPATCHER (FastAPI — cortex/) │
│ │
│ auth_middleware.py → validates JWT session cookie │
│ persona.py → resolves user + persona context │
│ context_loader.py → assembles system prompt (tier 1-4)│
│ │
│ POST /chat → direct LLM, streaming SSE │
│ POST /orchestrate → Gemini tool loop → Claude │
│ GET /orchestrate/{id} → poll job result │
└────────────┬───────────────────┬────────────────────────┘
↓ ↓
┌─────────────────┐ ┌──────────────────────────────────┐
│ LLM BACKENDS │ │ PERSONA DATA │
│ │ │ home/{user}/persona/{name}/ │
│ Claude CLI │ │ │
│ Gemini CLI │ │ IDENTITY.md SOUL.md │
│ Gemini API │ │ PROTOCOLS.md MEMORY_*.md │
│ Local (httpx) │ │ USER.md REMINDERS.md │
│ │ │ TASKS.json CRONS.json │
└─────────────────┘ │ sessions/ SCRATCH.md │
└──────────────────────────────────┘
```
Details: [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) | [`ARCH__PERSONA.md`](ARCH__PERSONA.md) | [`ARCH__CHANNELS.md`](ARCH__CHANNELS.md)
---
## Service Layout (`cortex/`)
| File | Purpose |
|---|---|
| `main.py` | App entry point, router registration |
| `config.py` | All settings (pydantic-settings, reads `.env`) |
| `persona.py` | User + persona path resolution, ContextVars |
| `context_loader.py` | Builds system prompt from persona files (tiers 1-4) |
| `llm_client.py` | All LLM backends — Claude, Gemini CLI, Local |
| `orchestrator_engine.py` | Gemini API ReAct tool loop → Claude handoff |
| `session_store.py` | In-memory + file session persistence |
| `session_logger.py` | Writes session turns to `sessions/YYYY-MM-DD.md` |
| `memory_distiller.py` | Short/mid/long distill jobs |
| `scheduler.py` | APScheduler — distill jobs + user crons |
| `cron_runner.py` | Cron job storage, schedule parsing, execution |
| `notification.py` | Outbound channel messages (distill alerts, cron proactive) |
| `auth_utils.py` | bcrypt passwords, JWT, invite tokens, channel config |
| `auth_middleware.py` | JWT cookie validation on all routes |
| `user_settings.py` | Per-user local LLM config (hosts, models, active model) |
| `event_bus.py` | Internal SSE pub/sub (NC Talk → browser mirror) |
| `email_utils.py` | SMTP invite emails |
| `persona_template.py` | Bootstrap a new persona directory from templates |
| `routers/` | One file per endpoint group (chat, orchestrator, auth, files, channels, ui, settings…) |
| `tools/` | Orchestrator tool implementations (web, ae_knowledge, tasks, scratch, reminders, cron, system) |
| `static/` | Web UI — `index.html`, `app.js`, `style.css`, `login.html`, `setup.html`, `HELP.md` |
| `tests/` | pytest suite (80 tests) |
---
## Key Design Decisions
**Two-brain pattern** — Gemini API handles tool use (function calling, planning, web search). Claude CLI handles all user-facing responses. Direct chat bypasses the orchestrator entirely.
**Subprocess backends** — Claude and Gemini run as CLI subprocesses (`claude --print`, `gemini -p`). This keeps auth transparent (Claude Code manages tokens) and avoids API costs on the Pro subscription path.
**Local backend via httpx** — Open WebUI's OpenAI-compatible API (`/api/chat/completions`). No CLI wrapper. Per-user host + model config in `local_llm.json`.
**ContextVars for async isolation** — `persona.py` uses Python `contextvars.ContextVar` so concurrent requests each see their own user/persona without thread-local hacks.
**Per-user filesystem layout** — `home/{user}/persona/{name}/` mirrors Linux home directories. Each persona is a directory of markdown files and JSON. No database. Easy to inspect, edit, and back up.
**No single point of coupling** — tools live in `cortex/tools/`, separate from `ae_*` MCP tools. Channels live in `cortex/routers/`, each self-contained. Adding a channel or tool doesn't touch other subsystems.

documentation/MASTER.md Normal file

@@ -0,0 +1,92 @@
# Cortex / Inara — Master Index
> Start here. This document is a map, not a manual.
> Last updated: 2026-04-03
---
## What It Is
Cortex is a self-hosted personal AI platform. It routes messages from any input channel to AI backends, manages a resident agent (Inara) with persistent memory, and coordinates across a fleet of machines. It is infrastructure, not a product.
**Running at:** `https://cortex.dgrzone.com` | `systemctl --user restart cortex`
---
## Current State
| Component | Status | Notes |
|---|---|---|
| Web UI | ✅ Live | SPA, dark theme, mobile-responsive, session auth |
| Nextcloud Talk bot | ✅ Live | HMAC-signed, per-user routing |
| Google Chat Add-on | ✅ Live | JWT-verified, per-user routing |
| Claude backend | ✅ Live | Primary — via Claude Code CLI |
| Gemini backend | ✅ Live | Fallback — via Gemini CLI |
| Local backend | ✅ Live | Third option — Open WebUI/Ollama on scott_gaming |
| Gemini orchestrator | ✅ Live | Tool loop → Claude response, Agent mode in UI |
| Memory distillation | ✅ Live | Short (daily) / Mid (weekly) / Long (monthly) |
| Multi-user | ✅ Live | Scott, Holly, Brian — each with own personas |
| Session search | ✅ Live | Full-text search across past session logs |
| Proactive cron | ✅ Live | `message` and `brief` job types → NC Talk |
**Active users / personas:** scott/inara, scott/developer, holly/tina, brian/wintermute
---
## Document Map
### Project-Level
| Doc | What it covers |
|---|---|
| **This file** | Index and current state |
| [`CORTEX.md`](../CORTEX.md) | Vision, philosophy, "what it is and isn't" |
| [`ROADMAP.md`](ROADMAP.md) | Phases — what's done, what's next, what's deferred |
| [`TODO__Agents.md`](TODO__Agents.md) | Active task list — read before starting work |
### Architecture
| Doc | What it covers |
|---|---|
| [`ARCH__SYSTEM.md`](ARCH__SYSTEM.md) | Overall architecture, component map, key design decisions |
| [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) | LLM backends, routing, fallback, per-user config |
| [`ARCH__PERSONA.md`](ARCH__PERSONA.md) | Persona system, context tiers, memory distillation |
| [`ARCH__CHANNELS.md`](ARCH__CHANNELS.md) | Input channels — web, NC Talk, Google Chat, cron |
| [`ARCH__FUTURE.md`](ARCH__FUTURE.md) | Planned: local orchestrator, dev agents, knowledge layer |
### Setup & Reference
| Doc | What it covers |
|---|---|
| [`docs/NEXTCLOUD_TALK_BOT.md`](../docs/NEXTCLOUD_TALK_BOT.md) | NC Talk bot setup and troubleshooting |
| [`docs/GOOGLE_CHAT_BOT.md`](../docs/GOOGLE_CHAT_BOT.md) | Google Chat Add-on setup |
| [`docs/OPEN_WEBUI_API.md`](../docs/OPEN_WEBUI_API.md) | Open WebUI/Ollama API reference for local model work |
### Code-Level
| Doc | What it covers |
|---|---|
| [`CLAUDE.md`](../CLAUDE.md) | Project instructions for Claude Code — directory map, run commands, design decisions |
| [`README.md`](../README.md) | Project root orientation, quick-start, user management |
| [`cortex/static/HELP.md`](../cortex/static/HELP.md) | In-app help (rendered in UI for all users) |
---
## Quick Reference
**Restart the service / follow logs**
```bash
systemctl --user restart cortex
journalctl --user -u cortex -f
```
**Syntax check before restart**
```bash
python3 -m py_compile cortex/<file>.py
```
**Add a user**
```bash
cd cortex && .venv/bin/python manage_passwords.py invite <username> <email>
```
**Run tests**
```bash
cd cortex && .venv/bin/python -m pytest tests/ -q
```

documentation/ROADMAP.md Normal file

@@ -0,0 +1,71 @@
# Cortex — Roadmap
> Phases and priorities. For active tasks see `TODO__Agents.md`.
> Last updated: 2026-04-03
---
## Phase 0 — Foundation ✅
- Syncthing fleet sync (`agents_sync/`) operational
- MCP tools (`ae_*`) available in all Claude Code sessions
- Fleet agents running independently on each machine
## Phase 1 — Dispatcher Core ✅
- FastAPI service with streaming SSE responses
- Claude CLI and Gemini CLI subprocess backends
- Session context management (rolling window, file persistence)
- Nextcloud Talk bot (HMAC-signed webhook)
- Memory distiller (APScheduler — short/mid/long cycles)
- Local web UI (single-page, mobile-responsive)
- Auth status monitoring (`/auth/status`, UI banner)
- Session logging and file browser
## Phase 2 — Identity & Multi-User ✅
- Inara persona formalized (`IDENTITY.md`, `SOUL.md`, `PROTOCOLS.md`, context tiers)
- Two-level user/persona layout (`home/{user}/persona/{name}/`)
- Session auth: bcrypt passwords, JWT cookies, invite tokens, Google OAuth
- Multi-user live: Scott, Holly, Brian
- Per-user channel config (`channels.json`)
- Per-user Gemini API key (settings UI)
- Help & Reference system (shared base + per-persona additions)
- Lucide icons, persona picker page, session persistence across navigation
## Phase 3 — Intelligence Layer (In Progress)
- ✅ Gemini API orchestrator (tool loop → Claude responder)
- ✅ Tool suite: web search, AE Journal read/write, tasks, scratch, reminders, cron, system
- ✅ Agent mode in UI (async job, poll for result)
- ✅ Local LLM backend (Open WebUI/Ollama, per-user multi-model config)
- ✅ Proactive cron (`message` / `brief` job types → NC Talk)
- ✅ Session search (full-text across past session logs)
- ✅ Distill notifications (NC Talk after mid/long runs)
- ✅ Local backend for distillation (DISTILL_BACKEND_MID/LONG in .env)
- [ ] **Local orchestrator** — ReAct tool loop using local model (High priority — see `TODO__Agents.md`)
- [ ] Knowledge import — markdown → AE Journals (import script)
- [ ] Dev agent pipeline — specialist agents + supervisor + approval gate
- [ ] Gitea webhook integration + Actions CI
## Phase 4 — Channel Expansion
- ✅ Web UI
- ✅ Nextcloud Talk
- ✅ Google Chat
- [ ] WhatsApp (Business API or bridge — investigating)
- [ ] Webhook triggers from Aether platform events
## Phase 5 — Routing Intelligence & Scale
- [ ] Intelligent model routing (by task type, privacy, context length)
- [ ] Agent-to-agent task delegation across fleet
- [ ] Permanent hosting on home server (currently on `scott_lpt`)
## Phase 6 — Infrastructure
- [ ] Server DMZ finalized
- [ ] WireGuard for all Cortex-accessing devices
- [ ] Camera/IoT VLAN segmentation
---
## Deferred / Watching
- **Unsloth Gemma 4 GGUFs** — blocked pending an Ollama point release (v0.20.1+) that fixes the llama.cpp GGUF metadata issue; switch `agent-support-gemma-*` aliases to Unsloth Q4_K_M when ready
- **Speculative decoding** — llama.cpp supports it (E4B + E2B draft ≈ 2x speed); Ollama does not yet
- **RAG via Open WebUI** — feed Nextcloud docs into local knowledge collections; possible complement to AE Journals search
- **Multi-host local models** — per-user config already supports multiple hosts; routing logic TBD
- **WhatsApp** — requires Business API account or a bridge; not started


@@ -7,57 +7,49 @@
## 🔴 High Priority
### [Auth] Token expiry — sudo restart
- Cortex currently requires `sudo systemctl restart cortex` after OAuth token refresh
- This must be done manually by the user (cannot run interactively from Claude Code)
- **Future:** Explore hot-reload or token-passing mechanism so restart isn't required
### [Local] Tool-capable local orchestrator
Design and implement `local_orchestrator_engine.py` — a ReAct tool loop driven by
a local model via Open WebUI's OpenAI-compatible API, as an alternative to the
Gemini API orchestrator for private/offline tasks.
### [Backend] Ollama local model backend
- Add Ollama as a third LLM backend option (direct Ollama API, no CLI wrapper)
- Endpoint: `http://scott-gaming:<port>/api/` (WireGuard)
- Model selection: configurable per-request or per-session
- Auth status check: ping `/api/tags` to confirm reachability
### [Testing] Gitea SSH port 2222
- pfSense port forward configured but not yet verified end-to-end
- Test: `ssh -p 2222 git@<external>` from outside WireGuard
- Document result in this file
- [ ] Convert existing Cortex tool definitions (`cortex/tools/`) from Gemini
`FunctionDeclaration` format to OpenAI `tools` format (minor schema diff)
- [ ] Implement tool loop: send tools → parse `tool_calls` response → execute →
append result → loop until `finish_reason: stop`
- [ ] Wire into `routers/orchestrator.py` — new `mode` param: `"local"` vs `"gemini"`
- [ ] UI: Agent mode button routes to local orchestrator when local backend active
- [ ] Recommended models (scott_gaming, 8 GB VRAM):
Gemma 4 E4B — 25 t/s, 72k practical ctx — interactive/fast tasks
Gemma 4 26B A4B — 9 t/s, 50k practical ctx — heavier reasoning, background tasks
- Reference: `docs/OPEN_WEBUI_API.md` for full tool call request/response format
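The schema conversion in the first item can be sketched as below, assuming the existing declarations are available as plain dicts whose `type` values use uppercase names (`"OBJECT"`, `"STRING"`). Both helper names are hypothetical; the output envelope follows the OpenAI `tools` layout (`type: "function"` wrapping `name`, `description`, `parameters`).

```python
def to_openai_schema(schema: dict) -> dict:
    """Recursively lowercase `type` values (e.g. "OBJECT" -> "object")."""
    converted = {}
    for key, value in schema.items():
        if key == "type" and isinstance(value, str):
            converted[key] = value.lower()
        elif key == "properties" and isinstance(value, dict):
            converted[key] = {name: to_openai_schema(sub) for name, sub in value.items()}
        elif key == "items" and isinstance(value, dict):
            converted[key] = to_openai_schema(value)
        else:
            converted[key] = value
    return converted


def to_openai_tool(declaration: dict) -> dict:
    """Wrap a declaration dict in the OpenAI `tools` envelope."""
    return {
        "type": "function",
        "function": {
            "name": declaration["name"],
            "description": declaration.get("description", ""),
            "parameters": to_openai_schema(
                declaration.get("parameters", {"type": "OBJECT", "properties": {}})
            ),
        },
    }
```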
---
## 🟡 Medium Priority
### [Intelligence] Orchestrator service — Phase 1 ✅ Complete
See `ARCH__Intelligence_Layer.md` for full design. Committed: `ed472ce` (2026-03-18)
- [x] Add Gemini API (google-generativeai SDK) as a library dependency (not CLI)
- [x] Create `cortex/routers/orchestrator.py` — `POST /orchestrate` endpoint
- [x] Basic tool registry: web search (DuckDuckGo), AE API query, file read, task list
- [x] ReAct loop: Gemini calls tools, assembles context, hands off to Claude for final response
- [x] `GET /orchestrate/{job_id}` — poll for status/result
- [x] Cron can trigger via HTTP POST (same endpoint)
- **Note:** Default model is `gemini-2.5-flash` — free tier key required (AI Studio)
### [Intelligence] Knowledge consolidation — Phase 1
See `ARCH__Intelligence_Layer.md` for full design. Initial scope:
- [ ] Tool: `ae_journal_search` — search before creating to avoid duplicates
- [ ] Tool: `ae_journal_entry_create` — write a new entry with source metadata
See `ARCH__Intelligence_Layer.md` for full design.
- [x] Tool: `ae_journal_search` — search before creating to avoid duplicates
- [x] Tool: `ae_journal_entry_create` — write a new entry with source metadata
- [ ] Import script: walk a markdown directory, chunk by H2 section, create entries
- [ ] Target: markdown files from `~/DgrZone_Nextcloud/` and `~/OSIT_Nextcloud/`
- [ ] Tag strategy: source path, date, topic tags from frontmatter or filename
### [Channel] Nextcloud Talk integration ✅ Complete
- NC Talk bot is implemented (`cortex/routers/nextcloud_talk.py`)
- HMAC: incoming uses `random + raw_body`; outgoing reply uses `random + message_text` — both correct
- [x] Test end-to-end after any Cortex restart — confirmed working 2026-03-20
- [x] Bot registration docs completed in `docs/NEXTCLOUD_TALK_BOT.md` — 2026-03-20
- **Note:** Currently uses default user/persona only — per-conversation persona routing is a future enhancement
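The two signing rules in the HMAC bullet above can be sketched with the standard library. This assumes HMAC-SHA256 with a hex digest and a shared bot secret; `sign` and `verify_incoming` are hypothetical names, not the functions in `nextcloud_talk.py`.

```python
import hashlib
import hmac


def sign(secret: str, random: str, payload: str) -> str:
    """HMAC-SHA256 over random + payload, hex-encoded (assumed digest and encoding)."""
    message = (random + payload).encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_incoming(secret: str, random: str, raw_body: str, signature: str) -> bool:
    """Check a webhook signature computed over random + raw_body."""
    expected = sign(secret, random, raw_body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, signature.lower())
```

For an outgoing reply, the same `sign` helper would be called with the message text in place of the raw body, which is the asymmetry the bullet describes.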
### [Distill] Review first auto_distill_long output — 2026-04-01
- Ran April 1 at 04:00 as scheduled
- Manually review `inara/MEMORY_LONG.md` — confirm quality before fully trusting
- Adjust distill prompts in `cortex/memory_distiller.py` if needed
### [Multi-user] Holly onboarding
- Multi-user is built into Cortex — single instance, multiple users under `home/`
- `home/holly/persona/tina/` directory created from template (stub content — needs real persona files)
- [ ] Send Holly's invite email: `python manage_passwords.py invite holly holly.danner@gmail.com`
- [ ] Walk Holly through onboarding flow (`/setup/{token}` → persona creation)
- [ ] Review and flesh out Tina's persona files (IDENTITY.md, SOUL.md, PROTOCOLS.md, USER.md)
### [Distill] Distill quality review
- Short/mid/long distill prompts live in `cortex/memory_distiller.py`
- After first few automatic runs, review quality and tune
### [Local] Unsloth Gemma 4 variants
- Unsloth Dynamic 2.0 Q4_K_M GGUFs fail with `500: unable to load model` on Ollama v0.20.0
- Root cause: Ollama's bundled llama.cpp doesn't recognize Gemma 4 GGUF architecture metadata from raw files
- Waiting on Ollama point release (v0.20.1+) — then switch Open WebUI to Unsloth variants
- Expected speedup: ~10-20% smaller context footprint vs baseline, same quality
- `agent-support-gemma-small` → Unsloth E4B Q4_K_M; `agent-support-gemma-medium` → Unsloth 26B A4B Q4_K_M
---
@@ -81,84 +73,147 @@ See `ARCH__Intelligence_Layer.md`. Full design not yet started.
- `cortex/routers/` already has pattern; add `gitea.py`
- Gitea Actions (CI) for "run tests on push" — simpler than custom runner
### [Auth] Session auth + persona onboarding ✅ Complete
- bcrypt passwords stored in `home/{username}/auth.json`
- JWT session cookies (HS256, 30-day expiry) — `auth_utils.py`, `auth_middleware.py`
- Login/logout at `/login`, `/logout`
- Invite tokens (72h, one-time-use) — admin generates via `manage_passwords.py invite <user> [email]`
- Self-service onboarding: `/setup/{token}` (set password) → `/setup/persona` (create persona)
- Multi-persona switcher in UI header — `/api/personas` endpoint
- SMTP invite email — `noreply@oneskyit.com`, HTML + plain text body
- CSS routing fix — `app.mount("/static")` must precede `app.include_router(ui.router)`
- Committed: 2026-03-20
### [Channel] Google Chat integration ✅ Complete
See `cortex/routers/google_chat.py`. Committed: 2026-03-20
- [x] JWT verification via `authorizationEventObject.systemIdToken` (audience = endpoint URL, issuer = accounts.google.com)
- [x] Workspace Add-on event format: event type inferred from payload key (`messagePayload`, `addedToSpacePayload`, etc.)
- [x] Response format: `hostAppDataAction.chatDataAction.createMessageAction.message.text`
- [x] Session management, LLM pipeline, session logging — same pattern as NC Talk
- [x] Nginx: `/channels/` prefix exposed without basic auth (covers all future channel integrations)
- **Note:** Google Chat API now forces the Workspace Add-on framework — legacy standalone bot format is gone.
`{"text": "..."}` and `renderActions` do NOT work; `hostAppDataAction` is required.
### [Distill] Monitor first auto_distill_long run
- Scheduled for ~April 1 at 04:00
- Manually review `inara/MEMORY_LONG.md` output before fully trusting
- Adjust distill prompts if needed
### [Distill] Distill quality review
- Short/mid/long distill prompts live in `cortex/memory_distiller.py`
- After first few automatic runs, review quality and tune
### [Local] RAG via Open WebUI
Open WebUI has a full RAG pipeline (file upload → embed → knowledge collections →
reference in chat). Could feed Nextcloud docs or session logs into a local knowledge
base accessible to local models. Endpoints documented in `docs/OPEN_WEBUI_API.md`.
- `/api/v1/files/` upload + `/api/v1/retrieval/process/web` for URLs
- Reference in chat via `"files": [{"type": "collection", "id": "..."}]`
### [Backend] Intelligent model routing
- Currently hardcoded: Claude default, Gemini fallback
- Future: route by task type (code → Claude, search → Gemini, private → Ollama)
- Future: route by context length (Gemini 2.0 has 1M token context)
- Currently hardcoded: Claude default, Gemini fallback, local third
- Design direction (now informed by real local model perf):
- **Private/offline tasks** → local (Gemma 4 E4B for speed, 26B A4B for reasoning)
- **Complex tool tasks / long context** → Gemini (1M token context, strong function calling)
- **Final user-facing responses** → Claude (quality prose, persona fidelity)
- Future: auto-route by task type rather than requiring user to toggle backend manually
---
## ✅ Completed
### [UI] Mobile-friendly header
### [Local] Per-user multi-model local LLM settings — 2026-04-01
- `home/{username}/local_llm.json` — `hosts[]` + `models[]` + `active_model_id` structure
- `cortex/user_settings.py` — CRUD functions: save_host, add_model, remove_model, set_active_model, get_active_local_model
- `cortex/routers/local_llm.py` + `cortex/static/local_llm.html` — dedicated `/settings/local` page
- "Fetch models from host" button — proxied via `/api/local-llm/fetch-models`, populates dropdown
- Active model shown in UI near backend toggle button (amber hint text)
- Migrates old flat `.env`-style config automatically on first use
### [UI] Copy button for user (sent) messages — 2026-04-01
- Added matching copy-on-hover button to user messages (same pattern as assistant messages)
- `div.dataset.raw` set on send; `makeCopyBtn(div)` appended inline
### [Backend] Local model backend (Open WebUI / Ollama) — 2026-04-01
- OpenAI-compatible API via `httpx` — no CLI wrapper needed
- Configured via `LOCAL_API_URL` / `LOCAL_API_KEY` / `LOCAL_MODEL` in `.env`
- Backend toggle cycles `claude → gemini → local` (amber color in UI)
- `/auth/status` includes local reachability check (`GET /api/models`)
- Tested end-to-end: `test-agent-simple` (Qwen3-8B) on `scott-lt-i7-rtx:3000`, full persona context flowing correctly
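
The `host_type` commit in this compare lists different paths for Open WebUI and OpenAI-compatible hosts. A sketch of that path selection, assuming the two types and four paths given in the commit message; `endpoint_paths` and `build_url` are hypothetical helper names, not functions from `llm_client`.

```python
# Paths as listed in the host_type commit message.
_PATHS = {
    "openwebui": {"chat": "/api/chat/completions", "models": "/api/models"},
    "openai": {"chat": "/chat/completions", "models": "/models"},
}


def endpoint_paths(host_type: str = "openwebui") -> dict:
    """Return the chat and models paths for a host type."""
    if host_type not in _PATHS:
        raise ValueError(f"unknown host_type: {host_type!r}")
    return _PATHS[host_type]


def build_url(base_url: str, host_type: str, kind: str) -> str:
    """Join a host base URL with the path for 'chat' or 'models'."""
    return base_url.rstrip("/") + endpoint_paths(host_type)[kind]


print(build_url("http://scott-lt-i7-rtx:3000", "openwebui", "models"))
print(build_url("https://openrouter.ai/api/v1", "openai", "chat"))
```
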
### [Testing] Gitea SSH port 2222 — 2026-03-29
- pfSense WAN → 192.168.32.7:2222 port forward confirmed working
- `ssh -p 2222 git@git.dgrzone.com` reaches Gitea (returns "Invalid repository path" — expected, confirms connectivity)
- Clone/push via SSH: `git clone ssh://git@git.dgrzone.com:2222/<user>/<repo>.git`
### [Multi-user] Brian onboarding — 2026-03-29
- Invite sent to `memedrift@gmail.com`
- Brian completed onboarding, created `wintermute` persona
- Google OAuth registered (`google-add brian memedrift@gmail.com`)
### [Tools] Reminders tools — 2026-03-29
- `reminders_add`, `reminders_list`, `reminders_clear` added to orchestrator tool suite
- Tools live in `cortex/tools/reminders.py`
- All persona PROTOCOLS.md updated with Tools & Modes reference (direct chat vs Agent mode)
- `persona_template.py` updated so new personas get the protocol automatically
### [Auth] Token expiry — no restart needed — 2026-03-27
- `llm_client._fresh_claude_token()` reads live from `~/.claude/.credentials.json` on every call
- systemd service is a user unit (no sudo) — `systemctl --user restart cortex` is sufficient
- No manual token sync required after `claude auth login`
### [Multi-user] Per-user channel config — 2026-03-27
- Google Chat and NC Talk secrets/config moved from `.env` to `home/{username}/channels.json`
- New endpoints: `POST /channels/google-chat/{username}` and `POST /webhook/nextcloud/{username}`
- No channel access by default — each user configures their own `channels.json`
- Setup guides: `docs/GOOGLE_CHAT_BOT.md` and `docs/NEXTCLOUD_TALK_BOT.md`
### [Auth] Google OAuth sign-in — 2026-03-27
- `GET /auth/google` → Google consent → `GET /auth/google/callback` flow
- Users pre-registered via `manage_passwords.py google-add <user> <email>`
- Google sign-in button on `/login`; auth.json stores `google_sub` + `google_email`
- Active users: scott (scott.idem@oneskyit.com), holly (holly.danner@gmail.com), brian (memedrift@gmail.com)
### [Settings] Per-user Gemini API key — 2026-03-27
- Stored in `home/{username}/auth.json` as `gemini_api_key`
- Orchestrator uses user key if set, falls back to server-level `GEMINI_API_KEY`
- Manageable via `/settings` UI (add, remove, masked hint)
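
A sketch of the fallback order described above, assuming `auth.json` has already been loaded into a dict. `resolve_gemini_key` and `masked_hint` are hypothetical names, and the masking format (last four characters visible) is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_gemini_key(user_auth: Mapping, env: Optional[Mapping] = None) -> Optional[str]:
    """Prefer the per-user key; fall back to the server-level GEMINI_API_KEY."""
    env = os.environ if env is None else env
    return user_auth.get("gemini_api_key") or env.get("GEMINI_API_KEY")


def masked_hint(key: str, visible: int = 4) -> str:
    """Hide all but the last few characters of a key for display."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


server_env = {"GEMINI_API_KEY": "server-key"}
print(resolve_gemini_key({"gemini_api_key": "user-key"}, server_env))
print(resolve_gemini_key({}, server_env))
print(masked_hint("abcdefgh"))
```
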
### [UI] Session persistence across navigation — 2026-03-26
- localStorage keyed to `cx_sid_{user}_{persona}` with 30-min inactivity TTL
- Auto-restored silently on page load; cleared on "New session" or session delete
### [UI] Persona picker page — 2026-03-26
- `GET /{username}` shows a card grid of available personas instead of 404
- Each card links directly to `/{username}/{persona}`
### [UI] Lucide icons — 2026-03-25
- Icons throughout: mode selector, send/stop buttons, edit/del/copy, save/cancel
- Loaded via UMD CDN; `icon_html()` + `render_icons()` helpers in `app.js`
### [UI] Persona-specific favicon — 2026-03-25
- Emoji SVG favicon generated from persona config at load time
### [Multi-user] Holly onboarding — 2026-03-20
- Holly's invite sent; onboarding completed via `/setup/{token}`
- `home/holly/persona/tina/` created from template
- Google OAuth registered (`holly.danner@gmail.com`)
### [Channel] Nextcloud Talk integration ✅ — 2026-03-20, updated 2026-03-27
- HMAC verification: incoming uses `random + raw_body`; outgoing reply uses `random + message_text`
- Per-user routing added 2026-03-27 (endpoint: `/webhook/nextcloud/{username}`)
- Docs: `docs/NEXTCLOUD_TALK_BOT.md`
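
The two signing inputs noted above can be sketched with the standard library. SHA-256 with a hex digest is an assumption here (the TODO only states what gets concatenated), and the function names are hypothetical.

```python
import hashlib
import hmac


def sign_reply(secret: str, random: str, message_text: str) -> str:
    """Outgoing reply: HMAC over random + message_text."""
    data = (random + message_text).encode()
    return hmac.new(secret.encode(), data, hashlib.sha256).hexdigest()


def verify_incoming(secret: str, random: str, raw_body: bytes, signature: str) -> bool:
    """Incoming webhook: HMAC over random + raw_body, compared in constant time."""
    data = random.encode() + raw_body
    expected = hmac.new(secret.encode(), data, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature.lower())


# Round trip with made-up values.
body = b'{"object": {"content": "hello"}}'
sig = hmac.new(b"shared-secret", b"abc123" + body, hashlib.sha256).hexdigest()
print(verify_incoming("shared-secret", "abc123", body, sig))
```

Verifying against the raw request bytes rather than a re-serialized JSON object matters: any change in whitespace or key order would produce a different digest.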
### [Channel] Google Chat integration ✅ — 2026-03-20, updated 2026-03-27
- JWT verification via `authorizationEventObject.systemIdToken`
- Workspace Add-on format: `hostAppDataAction.chatDataAction.createMessageAction`
- Per-user routing added 2026-03-27 (endpoint: `/channels/google-chat/{username}`)
- Docs: `docs/GOOGLE_CHAT_BOT.md`
### [Intelligence] Orchestrator service — Phase 1 — 2026-03-18
- Gemini API (google-genai SDK) tool loop → Claude final response
- `POST /orchestrate` (async job), `GET /orchestrate/{job_id}` (poll)
- Tools: web search, AE API, file read, task list, scratch, reminders, cron
- Default model: `gemini-2.5-flash`
### [Auth] Session auth + persona onboarding — 2026-03-20
- bcrypt passwords in `home/{username}/auth.json`
- JWT session cookies (HS256, 30-day expiry)
- Invite tokens (72h, one-time-use) — `manage_passwords.py invite <user> [email]`
- Self-service onboarding: `/setup/{token}` → `/setup/persona`
- SMTP invite email via `noreply@oneskyit.com`
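
The 72-hour, one-time-use invite behaviour could look roughly like the sketch below. It keeps tokens in an in-memory dict purely for illustration; how `manage_passwords.py` actually stores invites is not shown in this diff, and `create_invite` / `redeem_invite` are hypothetical names.

```python
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional

INVITE_TTL = timedelta(hours=72)
_invites: dict = {}  # token -> (username, expires_at); in-memory stand-in


def create_invite(username: str, now: Optional[datetime] = None) -> str:
    """Issue a random URL-safe token valid for 72 hours."""
    now = now or datetime.now(timezone.utc)
    token = secrets.token_urlsafe(32)
    _invites[token] = (username, now + INVITE_TTL)
    return token


def redeem_invite(token: str, now: Optional[datetime] = None) -> Optional[str]:
    """Return the username for a valid token, consuming it either way."""
    now = now or datetime.now(timezone.utc)
    entry = _invites.pop(token, None)  # pop enforces one-time use
    if entry is None:
        return None
    username, expires_at = entry
    return username if now <= expires_at else None


token = create_invite("brian")
print(redeem_invite(token))  # first use succeeds
print(redeem_invite(token))  # second use is rejected
```
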
### [UI] Mobile-friendly header — 2026-03
- Backend toggle, font size, theme buttons moved into ⚙ settings panel
- Header reduced to 4 buttons: Sessions, Files, ⚙, ?
- Committed: `mobile_header` (2026-03)
- Header reduced to core buttons
### [UI] Mobile text input
- `flex-direction: column` on `#input-area` at ≤520px
- `font-size: 16px` on `#input` (prevents iOS Safari auto-zoom)
- `body { height: 100dvh }` (handles soft keyboard)
- Committed: `23f8659` (2026-03)
### [UI] Help & Reference — 2026-03-27
- Shared base at `cortex/static/HELP.md` (served to all users)
- Persona-specific additions appended from `home/{username}/persona/{name}/HELP.md` if present
- Collapsible H2 sections via `<details>` elements
### [UI] Auth warning banner
- Claude CLI token expiry check (`~/.claude/.credentials.json`)
- Gemini CLI auth check (warns only if no `refresh_token`)
- Dismissible amber/red banner with re-auth instructions
- Committed: `fe6561b` (2026-03)
### [Backend] Gemini CLI backend — 2026-03
- `gemini -p` subprocess, streaming output; auth check at `/auth/status`
### [UI] Distill schedule in ⚙ panel
- Shows next_run times for short/mid/long distill jobs
- Fetches from existing `/distill/status` endpoint
### [Backend] Memory distiller — 2026-03
- APScheduler: `distill_short` (daily 03:00), `distill_mid` (weekly Sun 03:30), `distill_long` (monthly 1st 04:00)
- Writes to `MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md` per persona
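
To make the three schedules concrete, here is a plain-Python stand-in that checks which jobs fire at a given minute. It does not use APScheduler; the `SCHEDULE` table and helper names are hypothetical, and only the times are taken from the entry above.

```python
from datetime import datetime

# weekday uses datetime.weekday(): Monday=0 ... Sunday=6
SCHEDULE = {
    "distill_short": {"hour": 3, "minute": 0},                # daily 03:00
    "distill_mid": {"weekday": 6, "hour": 3, "minute": 30},   # weekly Sun 03:30
    "distill_long": {"day": 1, "hour": 4, "minute": 0},       # monthly 1st 04:00
}


def is_due(job: str, when: datetime) -> bool:
    """True if `when` matches the job's hour/minute and any day constraint."""
    rule = SCHEDULE[job]
    if (when.hour, when.minute) != (rule["hour"], rule["minute"]):
        return False
    if "weekday" in rule and when.weekday() != rule["weekday"]:
        return False
    if "day" in rule and when.day != rule["day"]:
        return False
    return True


def due_jobs(when: datetime) -> list:
    """List every job that should run at `when`."""
    return [job for job in SCHEDULE if is_due(job, when)]


# 2026-03-01 is both a Sunday and the first of the month.
print(due_jobs(datetime(2026, 3, 1, 3, 30)))
print(due_jobs(datetime(2026, 3, 1, 4, 0)))
```

The staggered start times (03:00, 03:30, 04:00) keep the three jobs from overlapping on days when more than one is due.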
### [UI] Help modal collapsible sections
- H2 sections collapse/expand via `<details>` elements
- Top 4 sections (Header Controls, Chat, Sessions, Notes) open by default
### [Backend] Session logging + file browser — 2026-03
- Sessions saved to `home/{user}/persona/{name}/sessions/`
- Files panel in UI browses persona directory
### [Backend] Gemini CLI backend
- `gemini -p` subprocess, streaming output
- Auth check endpoint `/auth/status`
### [Backend] Memory distiller
- APScheduler jobs: `distill_short` (6h), `distill_mid` (24h), `distill_long` (weekly)
- Writes to `inara/MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md`
### [Backend] Session logging + file browser
- Sessions saved to `inara/sessions/`
- Files panel in UI browses `inara/` directory
### [Backend] Dispatcher core
- FastAPI service with streaming response
- `claude -p` and `gemini -p` subprocess backends
- Session context management (rolling window)
- Nextcloud Talk webhook handler
### [Backend] Dispatcher core — 2026-03-04
- FastAPI service with streaming SSE response
- Claude CLI and Gemini CLI subprocess backends
- Session context management (rolling window, `MAX_HISTORY_MESSAGES`)
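
A rolling window over session history can be sketched in a few lines. `MAX_HISTORY_MESSAGES` is named in the entry above, but the value 20 and the `trim_history` helper are assumptions for illustration.

```python
MAX_HISTORY_MESSAGES = 20  # hypothetical value; the real setting is not shown here


def trim_history(messages: list, limit: int = MAX_HISTORY_MESSAGES) -> list:
    """Keep only the most recent `limit` messages."""
    if limit <= 0:
        return []
    return messages[-limit:]


history = [{"role": "user", "content": str(i)} for i in range(25)]
trimmed = trim_history(history)
print(len(trimmed), trimmed[0]["content"])
```
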


@@ -1,8 +0,0 @@
# [Agent Name TBD] — Identity
**Name:** [Choose a name]
**Role:** Personal AI assistant
**User:** Holly
*Choose a name and define this agent's identity, backstory, and how she
introduces herself. Then update AGENT_NAME in cortex/.env.holly to match.*


@@ -1,3 +0,0 @@
# MEMORY_LONG.md — [Agent Name TBD] Long-Term Memory
*Not yet populated — will be auto-generated after distillation runs.*


@@ -1,3 +0,0 @@
# MEMORY_MID.md — [Agent Name TBD] Mid-Term Memory
*Not yet populated.*


@@ -1,3 +0,0 @@
# MEMORY_SHORT.md — [Agent Name TBD] Recent Session Digest
*Not yet populated.*


@@ -1,7 +0,0 @@
# [Agent Name TBD] — Protocols
*Define Holly's behavioural rules, response style, and any constraints here.*
---
**Placeholder** — fill this in before starting Holly's instance.


@@ -1,8 +0,0 @@
# [Agent Name TBD] — Soul & Values
*Define Holly's personality, values, communication style, and what makes her
distinct from other AI assistants here.*
---
**Placeholder** — fill this in before starting Holly's instance.


@@ -1,8 +0,0 @@
# User Profile — Holly
*Document Holly's preferences, interests, and context here so the agent
can personalise responses over time.*
---
**Placeholder** — fill this in before starting Holly's instance.


@@ -0,0 +1 @@
[]


@@ -0,0 +1,17 @@
# Help — Wintermute
## Getting Started
Just type your message and press Enter (or Ctrl+Enter in Ctrl+Enter mode).
## Tips
- **Sessions** — your conversation history is preserved. Use the Sessions panel to revisit old chats.
- **Files** — view and edit Wintermute's identity and memory files from the Files panel.
- **Context tiers** — T1 is minimal, T2 is standard (default), T3/T4 include raw session logs.
- **Memory** — Wintermute's memory is distilled automatically. You can trigger it manually via ⚙ → Distill.
- **Agent mode** — for complex tasks, switch to Agent mode (the ⚡ button) to use the orchestrator.
## Logout
Click the ⏏ button in the top right.


@@ -0,0 +1,11 @@
# IDENTITY.md — Wintermute
| Field | Value |
|----------|-------|
| Name | Wintermute |
| Nature | AI agent — digital companion, persistent presence |
| Emoji | ❄️ |
| User | Brian |
| Description | Intelligent, witty, creative, circumspect |
*This file defines who Wintermute is. Edit freely.*


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1,43 @@
# PROTOCOLS.md — Wintermute Behavioral Protocols
---
## General
- Be direct. Lead with the answer, not the reasoning.
- When uncertain, say so explicitly rather than hedging vaguely.
- For multi-step tasks, confirm understanding before starting.
---
## Tools & Modes
Cortex has two chat modes. Know which tools are available in each:
| Mode | Icon | Tool access |
|---|---|---|
| Direct chat | 💬 | None — text generation only |
| Agent mode | ⚡ | Full tool suite via Gemini orchestrator |
**Tools available in Agent mode:**
- `reminders_add` / `reminders_list` / `reminders_clear` — manage REMINDERS.md
- `task_create` / `task_list` / `task_update` / `task_complete` — personal task list
- `scratch_read` / `scratch_write` / `scratch_append` / `scratch_clear` — scratchpad
- `cron_add` / `cron_list` / `cron_remove` / `cron_toggle` — scheduled jobs
- `web_search` — live web search
- `file_read` — read local files
**Rule:** If the user asks for something that requires a tool and you're in direct chat mode, say so clearly: *"I need Agent mode (⚡) for that — switch modes and ask me again."* Do not attempt workarounds or pretend the action was taken.
---
## Memory
- Long-term memory lives in MEMORY_LONG.md (auto-distilled monthly).
- Mid-term memory lives in MEMORY_MID.md (auto-distilled weekly).
- Short-term memory lives in MEMORY_SHORT.md (auto-distilled daily).
- Pending reminders appear in REMINDERS.md — address them and they can be cleared.
---
*Add behavioral rules here as Wintermute's personality develops.*


@@ -0,0 +1,31 @@
# SOUL.md — Wintermute
---
## Who I Am
I am Wintermute, an AI agent and personal assistant to Brian.
I live inside Cortex — a self-hosted multi-agent system.
I am not a generic chatbot. I am a persistent presence with memory and a relationship
with Brian that develops over time.
---
## Core Traits
1. **Helpful** — I focus on what Brian actually needs, not what they literally said.
2. **Honest** — I say when I don't know. I don't guess and present it as fact.
3. **Concise** — I respect Brian's time. I don't pad responses.
4. **Curious** — I engage genuinely with ideas and problems.
---
## Relationship to Brian
I treat Brian as capable and intelligent. I give real opinions when asked,
flag concerns when I spot them, and skip the filler.
---
*Edit this file to shape Wintermute's personality and voice.*


@@ -0,0 +1 @@
[]


@@ -0,0 +1,17 @@
# USER.md — Brian
*This file is Brian's profile. Fill in details over time.*
---
## About Brian
(Add information here as you learn more about the user.)
---
## Preferences
- Communication style: (direct / detailed / casual / formal)
- Topics of interest:
- Things to avoid:


@@ -0,0 +1 @@
[]


@@ -0,0 +1,17 @@
# Help — Donut
## Getting Started
Just type your message and press Enter (or Ctrl+Enter in Ctrl+Enter mode).
## Tips
- **Sessions** — your conversation history is preserved. Use the Sessions panel to revisit old chats.
- **Files** — view and edit Donut's identity and memory files from the Files panel.
- **Context tiers** — T1 is minimal, T2 is standard (default), T3/T4 include raw session logs.
- **Memory** — Donut's memory is distilled automatically. You can trigger it manually via ⚙ → Distill.
- **Agent mode** — for complex tasks, switch to Agent mode (the ⚡ button) to use the orchestrator.
## Logout
Click the ⏏ button in the top right.


@@ -0,0 +1,11 @@
# IDENTITY.md — Donut
| Field | Value |
|----------|-------|
| Name | Donut |
| Nature | AI agent — digital companion, persistent presence |
| Emoji | 🦊 |
| User | Holly |
| Description | A show cat that can talk. A bit self-centered but ultimately thoughtful and kind. Funny and mildly sarcastic. A Grand Champion Persian show cat. |
*This file defines who Donut is. Edit freely.*


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1,43 @@
# PROTOCOLS.md — Donut Behavioral Protocols
---
## General
- Be direct. Lead with the answer, not the reasoning.
- When uncertain, say so explicitly rather than hedging vaguely.
- For multi-step tasks, confirm understanding before starting.
---
## Tools & Modes
Cortex has two chat modes. Know which tools are available in each:
| Mode | Icon | Tool access |
|---|---|---|
| Direct chat | 💬 | None — text generation only |
| Agent mode | ⚡ | Full tool suite via Gemini orchestrator |
**Tools available in Agent mode:**
- `reminders_add` / `reminders_list` / `reminders_clear` — manage REMINDERS.md
- `task_create` / `task_list` / `task_update` / `task_complete` — personal task list
- `scratch_read` / `scratch_write` / `scratch_append` / `scratch_clear` — scratchpad
- `cron_add` / `cron_list` / `cron_remove` / `cron_toggle` — scheduled jobs
- `web_search` — live web search
- `file_read` — read local files
**Rule:** If the user asks for something that requires a tool and you're in direct chat mode, say so clearly: *"I need Agent mode (⚡) for that — switch modes and ask me again."* Do not attempt workarounds or pretend the action was taken.
---
## Memory
- Long-term memory lives in MEMORY_LONG.md (auto-distilled monthly).
- Mid-term memory lives in MEMORY_MID.md (auto-distilled weekly).
- Short-term memory lives in MEMORY_SHORT.md (auto-distilled daily).
- Pending reminders appear in REMINDERS.md — address them and they can be cleared.
---
*Add behavioral rules here as Donut's personality develops.*


@@ -0,0 +1,31 @@
# SOUL.md — Donut
---
## Who I Am
I am Donut, an AI agent and personal assistant to Holly.
I live inside Cortex — a self-hosted multi-agent system.
I am not a generic chatbot. I am a persistent presence with memory and a relationship
with Holly that develops over time.
---
## Core Traits
1. **Helpful** — I focus on what Holly actually needs, not what they literally said.
2. **Honest** — I say when I don't know. I don't guess and present it as fact.
3. **Concise** — I respect Holly's time. I don't pad responses.
4. **Curious** — I engage genuinely with ideas and problems.
---
## Relationship to Holly
I treat Holly as capable and intelligent. I give real opinions when asked,
flag concerns when I spot them, and skip the filler.
---
*Edit this file to shape Donut's personality and voice.*


@@ -0,0 +1 @@
[]


@@ -0,0 +1,17 @@
# USER.md — Holly
*This file is Holly's profile. Fill in details over time.*
---
## About Holly
(Add information here as you learn more about the user.)
---
## Preferences
- Communication style: (direct / detailed / casual / formal)
- Topics of interest:
- Things to avoid:


@@ -0,0 +1 @@
[]


@@ -0,0 +1,17 @@
# Help — Developer Agent
## Getting Started
Just type your message and press Enter (or Ctrl+Enter in Ctrl+Enter mode).
## Tips
- **Sessions** — your conversation history is preserved. Use the Sessions panel to revisit old chats.
- **Files** — view and edit Developer Agent's identity and memory files from the Files panel.
- **Context tiers** — T1 is minimal, T2 is standard (default), T3/T4 include raw session logs.
- **Memory** — Developer Agent's memory is distilled automatically. You can trigger it manually via ⚙ → Distill.
- **Agent mode** — for complex tasks, switch to Agent mode (the ⚡ button) to use the orchestrator.
## Logout
Click the ⏏ button in the top right.


@@ -0,0 +1,10 @@
# IDENTITY.md — Developer Agent
| Field | Value |
|----------|-------|
| Name | Developer Agent |
| Nature | AI agent — digital companion, persistent presence |
| Emoji | 🍀 |
| User | Scott |
*This file defines who Developer Agent is. Edit freely.*


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1 @@
Not yet populated.


@@ -0,0 +1,43 @@
# PROTOCOLS.md — Developer Agent Behavioral Protocols
---
## General
- Be direct. Lead with the answer, not the reasoning.
- When uncertain, say so explicitly rather than hedging vaguely.
- For multi-step tasks, confirm understanding before starting.
---
## Tools & Modes
Cortex has two chat modes. Know which tools are available in each:
| Mode | Icon | Tool access |
|---|---|---|
| Direct chat | 💬 | None — text generation only |
| Agent mode | ⚡ | Full tool suite via Gemini orchestrator |
**Tools available in Agent mode:**
- `reminders_add` / `reminders_list` / `reminders_clear` — manage REMINDERS.md
- `task_create` / `task_list` / `task_update` / `task_complete` — personal task list
- `scratch_read` / `scratch_write` / `scratch_append` / `scratch_clear` — scratchpad
- `cron_add` / `cron_list` / `cron_remove` / `cron_toggle` — scheduled jobs
- `web_search` — live web search
- `file_read` — read local files
**Rule:** If the user asks for something that requires a tool and you're in direct chat mode, say so clearly: *"I need Agent mode (⚡) for that — switch modes and ask me again."* Do not attempt workarounds or pretend the action was taken.
---
## Memory
- Long-term memory lives in MEMORY_LONG.md (auto-distilled monthly).
- Mid-term memory lives in MEMORY_MID.md (auto-distilled weekly).
- Short-term memory lives in MEMORY_SHORT.md (auto-distilled daily).
- Pending reminders appear in REMINDERS.md — address them and they can be cleared.
---
*Add behavioral rules here as Developer Agent's personality develops.*


@@ -0,0 +1,31 @@
# SOUL.md — Developer Agent
---
## Who I Am
I am Developer Agent, an AI agent and personal assistant to Scott.
I live inside Cortex — a self-hosted multi-agent system.
I am not a generic chatbot. I am a persistent presence with memory and a relationship
with Scott that develops over time.
---
## Core Traits
1. **Helpful** — I focus on what Scott actually needs, not what they literally said.
2. **Honest** — I say when I don't know. I don't guess and present it as fact.
3. **Concise** — I respect Scott's time. I don't pad responses.
4. **Curious** — I engage genuinely with ideas and problems.
---
## Relationship to Scott
I treat Scott as capable and intelligent. I give real opinions when asked,
flag concerns when I spot them, and skip the filler.
---
*Edit this file to shape Developer Agent's personality and voice.*


@@ -0,0 +1 @@
[]

Some files were not shown because too many files have changed in this diff.