Local LLM:
- user_settings.py: per-user hosts/models config (local_llm.json)
- routers/local_llm.py + static/local_llm.html: dedicated settings page
- llm_client.py: local OpenAI-compatible backend via httpx
- config.py: LOCAL_API_URL/KEY/MODEL + per-backend timeouts
- Active model shown near backend toggle (amber hint text)
Memory distillation:
- memory_distiller.py: DISTILL_BACKEND_MID/LONG .env overrides
- scheduler.py + notification.py: notify NC Talk after mid/long distill
- notification.py: outbound channel abstraction (NC Talk, extensible)
Session search:
- routers/files.py: GET /sessions/search?q= with excerpts grouped by date
- static/index.html + app.js: search UI in file sidebar with highlight
- _esc() helper to prevent XSS in search results
Proactive cron:
- cron_runner.py: new job types — message (send directly) and brief (LLM + send)
- Both support optional per-job channel override
Channels:
- routers/nextcloud_talk.py: consolidated using notification._send_nct_message()
- routers/auth.py: local backend status in /auth/status
- routers/chat.py: /backend returns {primary, fallback, local_model} object
UI / UX:
- Copy button for user messages (matching assistant)
- Autocomplete disabled on sensitive form fields
- settings.html: local model section replaced with link to /settings/local
Docs overhaul:
- MASTER.md hub + ARCH__SYSTEM/BACKENDS/PERSONA/CHANNELS/FUTURE.md
- ARCH__Intelligence_Layer.md replaced with redirect table
- CORTEX.md trimmed to vision only; README updated
- OPEN_WEBUI_API.md added to docs/
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Cortex / Inara — Project Root

**Owner:** Scott Idem (One Sky IT / Danger Zone)
**Started:** 2026-03-04
**Status:** Active development

> *"You can't stop the signal."*

Cortex is a self-hosted multi-agent AI platform. It supports multiple users, each with their own named AI persona.

---

## Quick Orientation

| Directory | What it is |
|---|---|
| `cortex/` | FastAPI service: dispatcher, routing, LLM backends, session management |
| `home/` | User and persona data (`home/{username}/persona/{name}/`) |
| `docs/` | Integration reference docs (NC Talk bot, Google Chat bot, Open WebUI API) |
| `documentation/` | Architecture decisions, project plans, agent task lists |

---

## Multi-User Layout

Persona data lives in a two-level tree modelled on Linux home directories:

```
home/
  scott/
    persona/
      inara/        ← IDENTITY.md, SOUL.md, MEMORY_*.md, sessions/, TASKS.json, …
  holly/
    persona/
      tina/
  [username]/
    persona/
      [name]/
```

Each HTTP request includes `user` and `persona` fields. The service validates both against
the `home/` tree before routing. ContextVars ensure per-request isolation in async code.

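As a rough illustration of the validation and isolation described above, here is a minimal sketch. The names `current_user`, `current_persona`, `persona_dir`, and `handle_request` are hypothetical and not taken from the Cortex source; the sketch only assumes the `home/{user}/persona/{name}/` layout. Each asyncio task gets its own copy of the context, so values set for one request are not visible to another.

```python
import asyncio
import tempfile
from contextvars import ContextVar
from pathlib import Path

# Hypothetical names; the real service may organise this differently.
current_user: ContextVar[str] = ContextVar("current_user")
current_persona: ContextVar[str] = ContextVar("current_persona")


def persona_dir(home: Path, user: str, persona: str) -> Path:
    """Return home/{user}/persona/{persona}/ or raise if it does not exist."""
    path = home / user / "persona" / persona
    if not path.is_dir():
        raise LookupError(f"unknown user/persona: {user}/{persona}")
    return path


async def handle_request(home: Path, user: str, persona: str) -> str:
    """Validate the pair, bind it to this task's context, then read it back."""
    persona_dir(home, user, persona)
    current_user.set(user)
    current_persona.set(persona)
    await asyncio.sleep(0)  # yield so that concurrent requests interleave
    return f"{current_user.get()}/{current_persona.get()}"


async def demo(home: Path) -> list:
    # gather() wraps each coroutine in its own Task with its own context copy.
    return await asyncio.gather(
        handle_request(home, "scott", "inara"),
        handle_request(home, "holly", "tina"),
    )


def run_demo() -> list:
    """Build a throwaway home/ tree and run two interleaved requests."""
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp)
        for user, persona in [("scott", "inara"), ("holly", "tina")]:
            (home / user / "persona" / persona).mkdir(parents=True)
        return asyncio.run(demo(home))
```
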
**Naming rules** (same as Linux usernames): lowercase letters, digits, `_`, `-`; must start
with a letter or underscore; max 32 characters. Examples: `scott`, `holly`, `my_ai-v2`.

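The naming rules can be expressed as a single regular expression. This is a sketch of one way to check them, not necessarily how Cortex does it; `NAME_RE` and `is_valid_name` are hypothetical names. Using `fullmatch` means anything containing `/` or `.` is rejected, so a valid name cannot escape the `home/` tree.

```python
import re

# First character: lowercase letter or underscore.
# Then up to 31 more characters from [a-z0-9_-], for a maximum length of 32.
NAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def is_valid_name(name: str) -> bool:
    """Return True if name satisfies the user/persona naming rules."""
    return NAME_RE.fullmatch(name) is not None
```
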
---

## Running Cortex

Cortex runs as a **systemd user service** (no sudo required).

```bash
# Start / stop / restart
systemctl --user start cortex
systemctl --user stop cortex
systemctl --user restart cortex

# Status and logs
systemctl --user status cortex
journalctl --user -u cortex -f

# Web UI: http://localhost:8000 (or cortex.dgrzone.com on WireGuard)
```

The service starts automatically at boot via `loginctl enable-linger`.
Service file: `~/.config/systemd/user/cortex.service`

Config lives in `cortex/config.py` and `cortex/.env` (not tracked; see `cortex/.env.example`).

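To illustrate how a `.env` file can layer over defaults, here is a small sketch. It is not the actual `cortex/config.py`: `parse_env` and `setting` are hypothetical helpers, the precedence (process environment, then `.env`, then default) is an assumption, and the example values are made up. Only the variable names `LOCAL_API_URL` and `DISTILL_BACKEND_MID`/`DISTILL_BACKEND_LONG` come from the project's commit notes.

```python
import os


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def setting(name: str, default: str, file_values: dict, environ=None) -> str:
    """Assumed precedence: process environment, then .env file, then default."""
    environ = os.environ if environ is None else environ
    return environ.get(name, file_values.get(name, default))
```
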
---

## Key Documentation

**Start here for a full picture:** [`documentation/MASTER.md`](documentation/MASTER.md)

| File | Purpose |
|---|---|
| `documentation/MASTER.md` | Index: current state, all doc links, quick reference |
| `documentation/ROADMAP.md` | Phases: what's done, what's next |
| `documentation/TODO__Agents.md` | Active task list |
| `documentation/ARCH__SYSTEM.md` | System architecture and component map |
| `documentation/ARCH__BACKENDS.md` | LLM backends, routing, fallback |
| `documentation/ARCH__PERSONA.md` | Persona system, context tiers, memory distillation |
| `documentation/ARCH__CHANNELS.md` | Input channels: web, NC Talk, Google Chat, cron |
| `documentation/ARCH__FUTURE.md` | Planned features: local orchestrator, dev agents, knowledge layer |
| `docs/NEXTCLOUD_TALK_BOT.md` | NC Talk bot setup and troubleshooting |
| `docs/GOOGLE_CHAT_BOT.md` | Google Chat Add-on setup |
| `docs/OPEN_WEBUI_API.md` | Open WebUI/Ollama API reference |

---

## Architecture at a Glance

```
[Web UI / NC Talk / Google Chat / Cron / Webhooks]
        ↓
Cortex Dispatcher (FastAPI, cortex/)
  ├─ POST /chat                             - direct to LLM (streaming SSE)
  ├─ POST /orchestrate                      - Gemini tool loop → Claude response
  ├─ POST /webhook/nextcloud/{username}     - Nextcloud Talk bot (per-user)
  └─ POST /channels/google-chat/{username}  - Google Chat Add-on (per-user)
        ↓
LLM Backends
  • Claude CLI  - primary, all user-facing responses
  • Gemini CLI  - fallback
  • Gemini API  - orchestrator tool loop only (not general chat)
  • Local       - Open WebUI/Ollama on scott_gaming (private/offline)
        ↓
Persona context loaded from home/{user}/persona/{name}/
```

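The primary/fallback relationship between backends in the diagram above can be sketched as an ordered list that is tried until one backend answers. This is an illustration only: `complete_with_fallback`, `AllBackendsFailed`, and the two stand-in backend functions are hypothetical, and the real routing logic is described in `documentation/ARCH__BACKENDS.md`.

```python
from typing import Callable, List, Sequence, Tuple

Backend = Callable[[str], str]


class AllBackendsFailed(RuntimeError):
    """Raised when every backend in the chain has failed."""


def complete_with_fallback(
    prompt: str, backends: Sequence[Tuple[str, Backend]]
) -> Tuple[str, str]:
    """Try each (name, backend) in order; return (name, reply) from the first success."""
    errors: List[str] = []
    for name, backend in backends:
        try:
            return name, backend(prompt)
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-in backends for illustration only.
def failing_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def working_fallback(prompt: str) -> str:
    return f"fallback reply to: {prompt}"
```
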
See `documentation/ARCH__SYSTEM.md` for the full architecture breakdown.

---

## Personas

Each persona has its own identity, memory, and session history.
They are not tied to a specific LLM model: the name is fixed, the backend varies.
Context is loaded at request time from `home/{user}/persona/{name}/` via `cortex/context_loader.py`.

| User | Persona | Description |
|---|---|---|
| scott | inara | Scott's primary AI assistant |
| scott | developer | Scott's dev-focused persona |
| holly | tina | Holly's primary AI assistant |
| brian | wintermute | Brian's primary AI assistant |

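As a sketch of what loading persona context at request time might look like, the function below concatenates the Markdown files found in a persona directory. It is not the actual `cortex/context_loader.py`: `load_persona_context` is a hypothetical name, and the ordering (identity, soul, then memory files sorted by name) is an assumption. The file names `IDENTITY.md`, `SOUL.md`, and `MEMORY_*.md` come from the layout shown earlier in this README.

```python
from pathlib import Path


def load_persona_context(persona_dir: Path) -> str:
    """Concatenate identity, soul, and memory files into one prompt preamble."""
    fixed = [persona_dir / "IDENTITY.md", persona_dir / "SOUL.md"]
    memories = sorted(persona_dir.glob("MEMORY_*.md"))
    sections = []
    for path in [*fixed, *memories]:
        # Missing files are skipped so a new persona with only IDENTITY.md still loads.
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {path.name}\n{text}")
    return "\n\n".join(sections)
```
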
---

## Channels

Webhook endpoints are per-user: each user configures their own secrets in `home/{username}/channels.json`.

| Channel | Status | Endpoint |
|---|---|---|
| Web UI | Live | `https://cortex.dgrzone.com`: session auth (login form + JWT cookie) |
| Nextcloud Talk | Live | `POST /webhook/nextcloud/{username}`: HMAC-signed, async reply |
| Google Chat | Live | `POST /channels/google-chat/{username}`: Workspace Add-on, JWT auth |

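For the HMAC-signed Nextcloud Talk webhook, verification could look roughly like the sketch below. It assumes the signature is a hex HMAC-SHA256 over a per-request random value concatenated with the raw request body, keyed with the user's shared secret; confirm the exact scheme and header names against `docs/NEXTCLOUD_TALK_BOT.md`. The function names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac


def sign(secret: str, random: str, body: bytes) -> str:
    """Hex HMAC-SHA256 over random + body, keyed with the shared secret."""
    message = random.encode("utf-8") + body
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify(secret: str, random: str, body: bytes, signature: str) -> bool:
    """Return True only if the supplied signature matches the expected one."""
    expected = sign(secret, random, body)
    # compare_digest runs in constant time, so it does not leak
    # how many leading characters of the signature were correct.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature.lower().encode("utf-8")
    )
```
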
See `docs/NEXTCLOUD_TALK_BOT.md` and `docs/GOOGLE_CHAT_BOT.md` for setup instructions.

---

## User Management

```bash
cd cortex

# Create a user directory and send an invite email
.venv/bin/python manage_passwords.py invite <username> <email>

# Register a Google account for sign-in (run after user completes onboarding)
.venv/bin/python manage_passwords.py google-add <username> <email>

# List users with password, Google, and email status
.venv/bin/python manage_passwords.py list

# Set/check a password directly
.venv/bin/python manage_passwords.py set <username>
.venv/bin/python manage_passwords.py check <username>
```

New users receive a link to `/setup/{token}` where they set their own password and create their first persona. Invite tokens expire after 72 hours and can be used only once.

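The two invite-token properties (72-hour expiry, single use) can be sketched with an in-memory store. This is illustrative only and not the logic in `manage_passwords.py`: `InviteStore`, `issue`, and `redeem` are hypothetical names, and a real implementation would persist tokens rather than keep them in a dict.

```python
import secrets
from datetime import datetime, timedelta, timezone

INVITE_TTL = timedelta(hours=72)


class InviteStore:
    """In-memory stand-in for wherever invite tokens are actually kept."""

    def __init__(self):
        self._pending = {}  # token -> (username, expires_at)

    def issue(self, username, now=None):
        """Create a random URL-safe token for /setup/{token}."""
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)
        self._pending[token] = (username, now + INVITE_TTL)
        return token

    def redeem(self, token, now=None):
        """Return the username once; None if unknown, already used, or expired."""
        now = now or datetime.now(timezone.utc)
        # pop() removes the token on first use, which makes it single-use.
        entry = self._pending.pop(token, None)
        if entry is None:
            return None
        username, expires_at = entry
        return username if now < expires_at else None
```
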
To enable a channel for a user, create `home/{username}/channels.json`; see the relevant doc in `docs/`.

---

## Testing

```bash
cd cortex
.venv/bin/python -m pytest tests/ -q
```

The suite has 80 tests covering API endpoints, persona routing, tool functions, and security.

---

## Related Projects

| Project | Path |
|---|---|
| Aether Platform API | `~/OSIT_dev/aether_api_fastapi/` |
| Aether Frontend | `~/OSIT_dev/aether_app_sveltekit/` |
| Fleet coordination | `~/agents_sync/` |