Scott Idem a4daebdc9b feat: local LLM multi-model, session search, cron proactive types, notifications, docs overhaul
Local LLM:
- user_settings.py: per-user hosts/models config (local_llm.json)
- routers/local_llm.py + static/local_llm.html: dedicated settings page
- llm_client.py: local OpenAI-compatible backend via httpx
- config.py: LOCAL_API_URL/KEY/MODEL + per-backend timeouts
- Active model shown near backend toggle (amber hint text)

Memory distillation:
- memory_distiller.py: DISTILL_BACKEND_MID/LONG .env overrides
- scheduler.py + notification.py: notify NC Talk after mid/long distill
- notification.py: outbound channel abstraction (NC Talk, extensible)

Session search:
- routers/files.py: GET /sessions/search?q= with excerpts grouped by date
- static/index.html + app.js: search UI in file sidebar with highlight
- _esc() helper to prevent XSS in search results

Proactive cron:
- cron_runner.py: new job types — message (send directly) and brief (LLM + send)
- Both support optional per-job channel override

Channels:
- routers/nextcloud_talk.py: consolidated using notification._send_nct_message()
- routers/auth.py: local backend status in /auth/status
- routers/chat.py: /backend returns {primary, fallback, local_model} object

UI / UX:
- Copy button for user messages (matching assistant)
- Autocomplete disabled on sensitive form fields
- settings.html: local model section replaced with link to /settings/local

Docs overhaul:
- MASTER.md hub + ARCH__SYSTEM/BACKENDS/PERSONA/CHANNELS/FUTURE.md
- ARCH__Intelligence_Layer.md replaced with redirect table
- CORTEX.md trimmed to vision only; README updated
- OPEN_WEBUI_API.md added to docs/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 20:53:06 -04:00

Cortex / Inara — Project Root

Owner: Scott Idem (One Sky IT / Danger Zone)
Started: 2026-03-04
Status: Active development

"You can't stop the signal."

Cortex is a self-hosted multi-agent AI platform. It supports multiple users, each with their own named AI persona.


Quick Orientation

Directory        What it is
cortex/          FastAPI service: dispatcher, routing, LLM backends, session management
home/            User and persona data (home/{username}/persona/{name}/)
docs/            Integration reference docs (NC Talk bot, Google Chat bot)
documentation/   Architecture decisions, project plans, agent task lists

Multi-User Layout

Persona data lives in a two-level tree modelled on Linux home directories:

home/
  scott/
    persona/
      inara/       ← IDENTITY.md, SOUL.md, MEMORY_*.md, sessions/, TASKS.json, …
  holly/
    persona/
      tina/
  [username]/
    persona/
      [name]/

Each HTTP request includes user and persona fields. The service validates both against the home/ tree before routing. The active user and persona are held in Python ContextVars, so concurrent async requests cannot see each other's values.
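The isolation described above can be illustrated with a minimal sketch. The variable and function names here (current_user, current_persona, handle_request) are hypothetical and are not taken from the Cortex source; the sketch only shows the standard-library behaviour that each asyncio task works on its own copy of the context.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical names for illustration; the real service may differ.
current_user: ContextVar[str] = ContextVar("current_user")
current_persona: ContextVar[str] = ContextVar("current_persona")


def persona_dir() -> str:
    """Resolve the persona directory for whichever request is running."""
    return f"home/{current_user.get()}/persona/{current_persona.get()}"


async def handle_request(user: str, persona: str) -> str:
    # Each asyncio task runs in its own copy of the context, so these
    # assignments are invisible to other concurrent requests.
    current_user.set(user)
    current_persona.set(persona)
    await asyncio.sleep(0)  # yield so the other request can interleave
    return persona_dir()


async def main() -> list:
    # gather() wraps each coroutine in its own task.
    return await asyncio.gather(
        handle_request("scott", "inara"),
        handle_request("holly", "tina"),
    )


results = asyncio.run(main())
```

Even though both requests interleave at the sleep, each one resolves its own directory, because the values set inside one task never leak into the other.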

Naming rules (same as Linux usernames): lowercase letters, digits, _, -; must start with a letter or underscore; max 32 characters. Examples: scott, holly, my_ai-v2.
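As a rough sketch, the naming rules can be expressed as a single regular expression. The helper names (is_valid_name, resolve_persona_dir) are hypothetical; the actual validation code in cortex/ may be structured differently.

```python
import re
from pathlib import Path

# First character: lowercase letter or underscore.
# Remaining 0-31 characters: lowercase letters, digits, underscore, hyphen.
NAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def is_valid_name(name: str) -> bool:
    """Return True if name satisfies the rules stated above."""
    return NAME_RE.fullmatch(name) is not None


def resolve_persona_dir(home: Path, user: str, persona: str) -> Path:
    """Validate both names, then check the directory exists under home/."""
    if not (is_valid_name(user) and is_valid_name(persona)):
        raise ValueError("invalid user or persona name")
    path = home / user / "persona" / persona
    if not path.is_dir():
        raise LookupError(f"unknown persona: {user}/{persona}")
    return path
```

Validating the names before building the path also rejects traversal attempts such as "../etc", since neither "." nor "/" is an allowed character.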


Running Cortex

Cortex runs as a systemd user service (no sudo required).

# Start / stop / restart
systemctl --user start cortex
systemctl --user stop cortex
systemctl --user restart cortex

# Status and logs
systemctl --user status cortex
journalctl --user -u cortex -f

# Web UI
http://localhost:8000   (or cortex.dgrzone.com on WireGuard)

The service starts automatically at boot via loginctl enable-linger. Service file: ~/.config/systemd/user/cortex.service

Config lives in cortex/config.py and cortex/.env (not tracked — see cortex/.env.example).


Key Documentation

Start here for a full picture: documentation/MASTER.md

File                             Purpose
documentation/MASTER.md          Index: current state, all doc links, quick reference
documentation/ROADMAP.md         Phases: what's done, what's next
documentation/TODO__Agents.md    Active task list
documentation/ARCH__SYSTEM.md    System architecture and component map
documentation/ARCH__BACKENDS.md  LLM backends, routing, fallback
documentation/ARCH__PERSONA.md   Persona system, context tiers, memory distillation
documentation/ARCH__CHANNELS.md  Input channels: web, NC Talk, Google Chat, cron
documentation/ARCH__FUTURE.md    Planned features: local orchestrator, dev agents, knowledge layer
docs/NEXTCLOUD_TALK_BOT.md       NC Talk bot setup and troubleshooting
docs/GOOGLE_CHAT_BOT.md          Google Chat Add-on setup
docs/OPEN_WEBUI_API.md           Open WebUI/Ollama API reference

Architecture at a Glance

[Web UI / NC Talk / Google Chat / Cron / Webhooks]
        ↓
  Cortex Dispatcher  (FastAPI, cortex/)
    ├─ POST /chat                            — direct to LLM (streaming SSE)
    ├─ POST /orchestrate                     — Gemini tool loop → Claude response
    ├─ POST /webhook/nextcloud/{username}    — Nextcloud Talk bot (per-user)
    └─ POST /channels/google-chat/{username} — Google Chat Add-on (per-user)
        ↓
  LLM Backends
  • Claude CLI   — primary, all user-facing responses
  • Gemini CLI   — fallback
  • Gemini API   — orchestrator tool loop only (not general chat)
  • Local        — Open WebUI/Ollama on scott_gaming (private/offline)
        ↓
  Persona context loaded from home/{user}/persona/{name}/

See documentation/ARCH__SYSTEM.md for the full architecture breakdown.
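The primary/fallback relationship between backends can be sketched as an ordered list that is tried in turn. This is only an illustration under assumed names (complete_with_fallback, AllBackendsFailed, and the stub backends); the real routing logic is described in documentation/ARCH__BACKENDS.md and may differ.

```python
from typing import Callable, List, Sequence, Tuple

# A backend is modelled as a function from prompt text to response text.
Backend = Callable[[str], str]


class AllBackendsFailed(RuntimeError):
    """Raised when every configured backend has raised an error."""


def complete_with_fallback(
    prompt: str, backends: Sequence[Tuple[str, Backend]]
) -> Tuple[str, str]:
    """Try each backend in order; return (backend_name, response)."""
    errors: List[str] = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # any backend failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stub backends standing in for the real CLI wrappers (hypothetical).
def claude_stub(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def gemini_stub(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "hello", [("claude", claude_stub), ("gemini", gemini_stub)]
)
```

Returning the backend name alongside the response lets the caller show which backend actually answered.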


Personas

Each persona has its own identity, memory, and session history. They are not tied to a specific LLM model — the name is fixed, the backend varies. Context is loaded at request time from home/{user}/persona/{name}/ via cortex/context_loader.py.

User    Persona     Description
scott   inara       Scott's primary AI assistant
scott   developer   Scott's dev-focused persona
holly   tina        Holly's primary AI assistant
brian   wintermute  Brian's primary AI assistant
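To make the request-time loading concrete, here is a minimal sketch that assembles a context string from a persona directory. It assumes only the file names listed in the layout above (IDENTITY.md, SOUL.md, MEMORY_*.md); the function name load_context, the section headers, and the ordering are assumptions and do not describe what cortex/context_loader.py actually does.

```python
import tempfile
from pathlib import Path


def load_context(persona_dir: Path) -> str:
    """Concatenate identity, soul, and memory files into one context string."""
    fixed = [persona_dir / "IDENTITY.md", persona_dir / "SOUL.md"]
    memory = sorted(persona_dir.glob("MEMORY_*.md"))  # stable ordering
    sections = []
    for path in [*fixed, *memory]:
        if path.is_file():  # missing files are skipped, not an error
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {path.name}\n{text}")
    return "\n\n".join(sections)


# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "IDENTITY.md").write_text("I am Inara.\n", encoding="utf-8")
    (demo_dir / "MEMORY_LONG.md").write_text("Likes tea.\n", encoding="utf-8")
    context = load_context(demo_dir)
```

Because the files are read on every request, edits to a persona's files take effect without restarting the service, which is consistent with the name being fixed while the backend varies.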

Channels

Webhook endpoints are per-user — each user configures their own secrets in home/{username}/channels.json.

Channel         Status  Endpoint
Web UI          Live    https://cortex.dgrzone.com (session auth: login form + JWT cookie)
Nextcloud Talk  Live    POST /webhook/nextcloud/{username} (HMAC-signed, async reply)
Google Chat     Live    POST /channels/google-chat/{username} (Workspace Add-on, JWT auth)

See docs/NEXTCLOUD_TALK_BOT.md and docs/GOOGLE_CHAT_BOT.md for setup instructions.
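For the HMAC-signed webhook, a verification step might look like the sketch below. It assumes the signature is a hex HMAC-SHA256 over a per-request random value concatenated with the raw request body, keyed with the shared secret from channels.json; that message layout and the function names are assumptions here, so check docs/NEXTCLOUD_TALK_BOT.md before relying on it.

```python
import hashlib
import hmac


def sign(secret: str, random_value: str, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 over random_value + body (assumed layout)."""
    message = random_value.encode("utf-8") + body
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify(secret: str, random_value: str, body: bytes, signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign(secret, random_value, body)
    # Compare as bytes so unexpected non-ASCII input cannot raise TypeError.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature.lower().encode("utf-8")
    )


# Demo with made-up values.
demo_secret = "not-a-real-secret"
demo_body = b'{"message": "hi"}'
demo_sig = sign(demo_secret, "abc123", demo_body)
```

Using hmac.compare_digest rather than == avoids leaking how many leading characters matched through timing differences.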


User Management

cd cortex

# Create a user directory and send an invite email
.venv/bin/python manage_passwords.py invite <username> <email>

# Register a Google account for sign-in (run after user completes onboarding)
.venv/bin/python manage_passwords.py google-add <username> <email>

# List users with password, Google, and email status
.venv/bin/python manage_passwords.py list

# Set/check a password directly
.venv/bin/python manage_passwords.py set <username>
.venv/bin/python manage_passwords.py check <username>

New users receive a link to /setup/{token} where they set their own password and create their first persona. Invite tokens expire in 72 hours and are one-time-use.
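The two invite-token properties stated above (72-hour expiry, one-time use) can be sketched with an in-memory store. The class and method names are hypothetical, and the real manage_passwords.py presumably persists tokens rather than keeping them in a dict.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

INVITE_TTL = timedelta(hours=72)


@dataclass
class Invite:
    username: str
    expires_at: datetime
    used: bool = False


class InviteStore:
    """In-memory stand-in for wherever invites are actually persisted."""

    def __init__(self) -> None:
        self._invites: Dict[str, Invite] = {}

    def issue(self, username: str, now: Optional[datetime] = None) -> str:
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)  # unguessable URL-safe token
        self._invites[token] = Invite(username, now + INVITE_TTL)
        return token

    def redeem(self, token: str, now: Optional[datetime] = None) -> Optional[str]:
        """Return the username once; None if unknown, expired, or already used."""
        now = now or datetime.now(timezone.utc)
        invite = self._invites.get(token)
        if invite is None or invite.used or now >= invite.expires_at:
            return None
        invite.used = True
        return invite.username


# Demo with a fixed clock so the outcome does not depend on wall time.
t0 = datetime(2026, 4, 5, tzinfo=timezone.utc)
store = InviteStore()
fresh = store.issue("holly", now=t0)
stale = store.issue("brian", now=t0)
first = store.redeem(fresh, now=t0 + timedelta(hours=71))
second = store.redeem(fresh, now=t0 + timedelta(hours=71))
late = store.redeem(stale, now=t0 + timedelta(hours=73))
```

Passing the clock in as a parameter keeps the expiry logic easy to test without waiting or patching datetime.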

To enable a channel for a user, create home/{username}/channels.json — see the relevant doc in docs/.


Testing

cd cortex
.venv/bin/python -m pytest tests/ -q

80 tests covering API endpoints, persona routing, tool functions, and security.


Project              Path
Aether Platform API  ~/OSIT_dev/aether_api_fastapi/
Aether Frontend      ~/OSIT_dev/aether_app_sveltekit/
Fleet coordination   ~/agents_sync/