Scott Idem 8baab874f1 feat: replace backend/slot toggle with role selector
The backend toggle now cycles through configured roles (chat, coder,
research, distill, etc.) instead of backup model slots within the chat
role. Each role uses its own primary→backup chain from the registry.

- ChatRequest.slot replaced by chat_role (default "chat")
- GET /backend returns available_roles instead of chat_models
- _available_roles_for_toggle() builds list from defined_roles, excluding
  orchestrator (which has its own Agent mode)
- Model label on responses now reflects the actual role's assigned model
- Toggle is inert when only one role is configured (avoids useless cycling)
- Add "Clear browser cache" button to Account Settings (Connected Accounts)
- Add _role_model_label() helper for cleaner response tag labeling

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-28 19:23:18 -04:00

Cortex / Inara — Project Root

Owner: Scott Idem (One Sky IT / Danger Zone)
Started: 2026-03-04
Status: Active development

"You can't stop the signal."

Cortex is a self-hosted multi-agent AI platform. It supports multiple users, each with their own named AI persona.


Quick Orientation

Directory        What it is
cortex/          FastAPI service: dispatcher, routing, LLM backends, session management
home/            User and persona data (home/{username}/persona/{name}/)
docs/            Integration reference docs (NC Talk bot, Google Chat bot)
documentation/   Architecture decisions, project plans, agent task lists

Multi-User Layout

Persona data lives in a two-level tree modelled on Linux home directories:

home/
  scott/
    persona/
      inara/       ← IDENTITY.md, SOUL.md, MEMORY_*.md, sessions/, TASKS.json, …
  holly/
    persona/
      tina/
  [username]/
    persona/
      [name]/

Each HTTP request includes user and persona fields. The service validates both against the home/ tree before routing. ContextVars ensure per-request isolation in async code.
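The isolation that ContextVars provide can be illustrated with a minimal sketch. The variable names and the handle_request function are hypothetical and are not taken from the Cortex codebase; the point is only that concurrent asyncio tasks each see their own values.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical variable names; the real service may name these differently.
current_user: ContextVar[str] = ContextVar("current_user")
current_persona: ContextVar[str] = ContextVar("current_persona")


async def handle_request(user: str, persona: str) -> str:
    """Bind the request's identity, yield to the event loop, then read it back."""
    current_user.set(user)
    current_persona.set(persona)
    await asyncio.sleep(0)  # other requests run here and set their own values
    return f"{current_user.get()}/{current_persona.get()}"


async def main() -> list[str]:
    # gather() wraps each coroutine in its own Task, and each Task runs in a
    # copy of the current context, so the set() calls above do not leak.
    return await asyncio.gather(
        handle_request("scott", "inara"),
        handle_request("holly", "tina"),
    )


results = asyncio.run(main())
print(results)
```

With a plain module-level global instead of a ContextVar, the second request would overwrite the first request's values while the first is suspended.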

Naming rules (same as Linux usernames): lowercase letters, digits, _, -; must start with a letter or underscore; max 32 characters. Examples: scott, holly, my_ai-v2.
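The naming rules translate directly into a regular expression. This is a sketch of one way to check them, not necessarily how the service validates names; NAME_RE and is_valid_name are illustrative names.

```python
import re

# Letter or underscore first, then up to 31 more of [a-z0-9_-]: 32 characters max.
NAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def is_valid_name(name: str) -> bool:
    """Return True if name satisfies the naming rules stated above."""
    return NAME_RE.fullmatch(name) is not None
```

Using fullmatch rather than match means trailing characters such as a newline or a path separator cause the check to fail.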


Setup / Install

Run install.py on any machine to set up or update Cortex. It is idempotent — safe to re-run.

python3 install.py           # install / update everything
python3 install.py --check   # status check only, no changes

What it does: creates the Python venv, installs dependencies, writes the systemd user service, enables linger, starts/restarts the service, checks LLM CLI auth, and sets up the daily backup timer.

Config: copy cortex/.env.default to cortex/.env and fill in secrets before first run.

Running Cortex

Cortex runs as a systemd user service (no sudo required).

# Start / stop / restart
systemctl --user start cortex
systemctl --user stop cortex
systemctl --user restart cortex

# Status and logs
systemctl --user status cortex
journalctl --user -u cortex -f

# Web UI
http://localhost:8000   (or cortex.dgrzone.com on WireGuard)

The service starts automatically at boot via loginctl enable-linger. Service file: ~/.config/systemd/user/cortex.service

Config lives in cortex/config.py and cortex/.env (not tracked — see cortex/.env.default).

Development Workflow

The codebase lives in agents_sync/ and syncs to all fleet machines via Syncthing. Edit code on any machine; use dev-restart.sh to apply changes on the host running the service.

./dev-restart.sh          # restart service, show last 30 log lines
./dev-restart.sh logs     # tail live logs (ctrl-c to stop)
./dev-restart.sh status   # show service status only

Backup

Persona data (home/) is excluded from git and backed up with restic. install.py sets up a systemd timer that runs backup.sh daily at 03:00.

./backup.sh    # run a backup manually

# Inspect snapshots (set env vars or export them)
RESTIC_REPOSITORY=~/backups/cortex-home-restic \
RESTIC_PASSWORD_FILE=~/.config/cortex/restic-password \
restic snapshots

The restic password is generated at ~/.config/cortex/restic-password on first install. Back it up separately — it is required to restore from any snapshot.


Key Documentation

Start here for a full picture: documentation/MASTER.md

File                             Purpose
documentation/MASTER.md          Index: current state, all doc links, quick reference
documentation/ROADMAP.md         Phases: what's done, what's next
documentation/TODO__Agents.md    Active task list
documentation/ARCH__SYSTEM.md    System architecture and component map
documentation/ARCH__BACKENDS.md  LLM backends, routing, fallback
documentation/ARCH__PERSONA.md   Persona system, context tiers, memory distillation
documentation/ARCH__CHANNELS.md  Input channels: web, NC Talk, Google Chat, cron
documentation/ARCH__FUTURE.md    Planned features: local orchestrator, dev agents, knowledge layer
docs/NEXTCLOUD_TALK_BOT.md       NC Talk bot setup and troubleshooting
docs/GOOGLE_CHAT_BOT.md          Google Chat Add-on setup
docs/OPEN_WEBUI_API.md           Open WebUI/Ollama API reference

Architecture at a Glance

[Web UI / NC Talk / Google Chat / Cron / Webhooks]
        ↓
  Cortex Dispatcher  (FastAPI, cortex/)
    ├─ POST /chat                            — direct to LLM (streaming SSE)
    ├─ POST /orchestrate                     — Gemini tool loop → Claude response
    ├─ POST /webhook/nextcloud/{username}    — Nextcloud Talk bot (per-user)
    └─ POST /channels/google-chat/{username} — Google Chat Add-on (per-user)
        ↓
  LLM Backends
  • Claude CLI   — primary, all user-facing responses
  • Gemini CLI   — fallback
  • Gemini API   — orchestrator tool loop only (not general chat)
  • Local        — Open WebUI/Ollama on scott_gaming (private/offline)
        ↓
  Persona context loaded from home/{user}/persona/{name}/

See documentation/ARCH__SYSTEM.md for the full architecture breakdown.
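The primary-then-fallback ordering in the diagram can be sketched as a simple chain. Everything here is an assumption made for illustration: the Backend signature, the labels, and the stand-in backend functions are hypothetical, and the actual routing logic is described in documentation/ARCH__BACKENDS.md.

```python
from typing import Callable, Sequence

Backend = Callable[[str], str]  # hypothetical signature: prompt in, reply out


class AllBackendsFailed(RuntimeError):
    """Raised when every backend in the chain raised an error."""


def complete_with_fallback(
    prompt: str, chain: Sequence[tuple[str, Backend]]
) -> tuple[str, str]:
    """Try each (label, backend) in order; return (label, reply) from the first success."""
    errors: list[str] = []
    for label, backend in chain:
        try:
            return label, backend(prompt)
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(f"{label}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-in backends for demonstration only.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def working_fallback(prompt: str) -> str:
    return f"echo: {prompt}"


label, reply = complete_with_fallback(
    "hello",
    [("claude-cli", flaky_primary), ("gemini-cli", working_fallback)],
)
print(label, reply)
```

Returning the label alongside the reply lets the caller tag each response with the backend that actually produced it.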


Personas

Each persona has its own identity, memory, and session history. They are not tied to a specific LLM model — the name is fixed, the backend varies. Context is loaded at request time from home/{user}/persona/{name}/ via cortex/context_loader.py.

User    Persona      Description
scott   inara        Scott's primary AI assistant
scott   developer    Scott's dev-focused persona
holly   tina         Holly's primary AI assistant
brian   wintermute   Brian's primary AI assistant
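Resolving a persona directory from the user and persona fields might look like the sketch below. It is not the code in cortex/context_loader.py; persona_dir is a hypothetical helper that shows the home/{user}/persona/{name}/ layout together with a guard against paths that escape the home tree.

```python
from pathlib import Path


def persona_dir(home_root: Path, user: str, persona: str) -> Path:
    """Return home_root/user/persona/name, or raise if it is missing or escapes home_root."""
    root = home_root.resolve()
    candidate = (root / user / "persona" / persona).resolve()
    # Reject inputs such as ".." that would resolve outside the home tree.
    if not candidate.is_relative_to(root):
        raise ValueError("path escapes the home tree")
    if not candidate.is_dir():
        raise FileNotFoundError(f"unknown user or persona: {user}/{persona}")
    return candidate
```

For example, persona_dir(Path("home"), "scott", "inara") returns the resolved path to home/scott/persona/inara when that directory exists. Path.is_relative_to requires Python 3.9 or later.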

Channels

Webhook endpoints are per-user — each user configures their own secrets in home/{username}/channels.json.

Channel          Status   Endpoint
Web UI           Live     https://cortex.dgrzone.com (session auth: login form + JWT cookie)
Nextcloud Talk   Live     POST /webhook/nextcloud/{username} (HMAC-signed, async reply)
Google Chat      Live     POST /channels/google-chat/{username} (Workspace Add-on, JWT auth)

See docs/NEXTCLOUD_TALK_BOT.md and docs/GOOGLE_CHAT_BOT.md for setup instructions.
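Verifying an HMAC-signed webhook generally follows the pattern below. The sketch assumes the signature is a hex-encoded HMAC-SHA256 over a per-request random value concatenated with the raw request body, which is how Nextcloud Talk bot webhooks are understood to work; verify_signature is a hypothetical helper, so confirm the exact message layout and header names against docs/NEXTCLOUD_TALK_BOT.md.

```python
import hashlib
import hmac


def verify_signature(secret: str, random_value: str, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature computed over random_value + body."""
    expected = hmac.new(
        secret.encode("utf-8"),
        random_value.encode("utf-8") + body,
        hashlib.sha256,
    ).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(
        expected.encode("ascii"),
        signature_hex.lower().encode("utf-8"),
    )
```

The check must run against the raw request bytes, before any JSON parsing, because re-serialising the body can change whitespace or key order and invalidate the signature.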


User Management

cd cortex

# Create a user directory and send an invite email
.venv/bin/python manage_passwords.py invite <username> <email>

# Register a Google account for sign-in (run after user completes onboarding)
.venv/bin/python manage_passwords.py google-add <username> <email>

# List users with password, Google, and email status
.venv/bin/python manage_passwords.py list

# Set/check a password directly
.venv/bin/python manage_passwords.py set <username>
.venv/bin/python manage_passwords.py check <username>

New users receive a link to /setup/{token} where they set their own password and create their first persona. Invite tokens expire in 72 hours and are one-time-use.
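The two token properties named above, a 72-hour lifetime and one-time use, can be sketched as follows. InviteStore and its in-memory dictionary are illustrative assumptions; this is not the storage or API used by manage_passwords.py.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

INVITE_TTL = timedelta(hours=72)


@dataclass
class Invite:
    username: str
    expires_at: datetime
    used: bool = False


class InviteStore:
    """In-memory store for illustration; a real service would persist invites."""

    def __init__(self) -> None:
        self._invites: dict[str, Invite] = {}

    def issue(self, username: str, now: datetime) -> str:
        """Create an unguessable token that expires INVITE_TTL after now."""
        token = secrets.token_urlsafe(32)
        self._invites[token] = Invite(username, now + INVITE_TTL)
        return token

    def redeem(self, token: str, now: datetime) -> Optional[str]:
        """Return the username once for a valid token; None if unknown, expired, or used."""
        invite = self._invites.get(token)
        if invite is None or invite.used or now >= invite.expires_at:
            return None
        invite.used = True
        return invite.username


store = InviteStore()
issued_at = datetime(2026, 1, 1, tzinfo=timezone.utc)
token = store.issue("holly", issued_at)
```

Passing the current time in as an argument, rather than calling datetime.now() inside the methods, keeps the expiry logic easy to test.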

To enable a channel for a user, create home/{username}/channels.json — see the relevant doc in docs/.


Testing

cd cortex
.venv/bin/python -m pytest tests/ -q

The suite contains 80 tests covering API endpoints, persona routing, tool functions, and security.


Project               Path
Aether Platform API   ~/OSIT_dev/aether_api_fastapi/
Aether Frontend       ~/OSIT_dev/aether_app_sveltekit/
Fleet coordination    ~/agents_sync/