Cortex / Inara — Project Root
Owner: Scott Idem (One Sky IT / Danger Zone)
Started: 2026-03-04
Status: Active development
"You can't stop the signal."
Cortex is a self-hosted multi-agent AI platform. It supports multiple users, each with their own named AI persona.
Where Cortex Fits
AI tools aren't one-size-fits-all. Cortex exists in a specific niche — it's not trying to be everything.
Cortex is a self-hosted persona platform. It gives you a persistent AI companion with its own identity, memory, and voice — reachable through your chat apps, not just a browser tab. It remembers who you are across days and weeks. It can proactively message you on a schedule. It runs on your own hardware, behind your own auth.
What Cortex is good at
- Being a consistent AI presence — same persona, same memory, day after day
- Multi-channel access — web, Nextcloud Talk, Google Chat, all routed to the same brain
- Proactive work — scheduled messages, reminders, cron jobs that reach out to you
- Multi-user households — each person gets their own persona (Scott → Inara, Holly → Tina)
- Private, offline-capable — local models via Ollama when you don't want anything leaving the LAN
What Cortex is not
- Not a coding assistant. Cortex lives in chat apps, not in your terminal or IDE. Use Claude Code, DeepSeek TUI, Gemini CLI, or Copilot for code-level work — they specialize in reading and editing project files. Cortex can't open a codebase.
- Not a generic LLM chat UI. Open WebUI and LibreChat are excellent model-switching frontends. Cortex isn't a frontend — it's a platform with its own identity system, orchestrator, and memory pipeline. Two different jobs.
- Not a SaaS product. Nobody else hosts your Cortex instance. Nobody else sees your conversations. The trade-off is that you manage the service yourself: systemctl --user restart cortex.
- Not an agent framework. LangChain, CrewAI, and similar are libraries for building AI pipelines. Cortex is a running service with concrete personas, not an abstraction layer to build on top of.
The stack in practice
- Use Cortex to talk to Inara — daily assistant, memory keeper, scheduled check-ins
- Use Claude Code / DeepSeek TUI to work on Cortex — code edits, architecture, debugging
- Use Open WebUI when you want to test a new model or run a quick prompt without persona context
Same AI, different interfaces for different jobs.
Quick Orientation
| Directory | What it is |
|---|---|
| cortex/ | FastAPI service: dispatcher, routing, LLM backends, session management |
| home/ | User and persona data (home/{username}/persona/{name}/) |
| docs/ | Integration reference docs (NC Talk bot, Google Chat bot) |
| documentation/ | Architecture decisions, project plans, agent task lists |
Multi-User Layout
Persona data lives in a two-level tree modelled on Linux home directories:
home/
  scott/
    persona/
      inara/    ← IDENTITY.md, SOUL.md, MEMORY_*.md, sessions/, TASKS.json, …
  holly/
    persona/
      tina/
  [username]/
    persona/
      [name]/
Each HTTP request includes user and persona fields. The service validates both against
the home/ tree before routing. ContextVars ensure per-request isolation in async code.
Naming rules (same as Linux usernames): lowercase letters, digits, _, -; must start
with a letter or underscore; max 32 characters. Example: scott, holly, my_ai-v2.
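The naming rules translate fairly directly into a regular expression. The sketch below is one possible reading of them; `NAME_RE` and `is_valid_name` are illustrative names, not necessarily what Cortex uses.

```python
import re

# First character: lowercase letter or underscore.
# Then up to 31 more characters from a-z, 0-9, "_" or "-" (32 total).
NAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def is_valid_name(name: str) -> bool:
    """Return True if name satisfies the naming rules described above."""
    return NAME_RE.fullmatch(name) is not None
```

Using `fullmatch` rather than `match` means a trailing newline or path separator cannot slip through, which matters when the name is later used to build a filesystem path.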
Setup / Install
Run install.py on any machine to set up or update Cortex. It is idempotent — safe to re-run.
python3 install.py # install / update everything
python3 install.py --check # status check only, no changes
What it does: creates the Python venv, installs dependencies, writes the systemd user service, enables linger, starts/restarts the service, checks LLM CLI auth, and sets up the daily backup timer.
Config: copy cortex/.env.default to cortex/.env and fill in secrets before first run.
Running Cortex
Cortex runs as a systemd user service (no sudo required).
# Start / stop / restart
systemctl --user start cortex
systemctl --user stop cortex
systemctl --user restart cortex
# Status and logs
systemctl --user status cortex
journalctl --user -u cortex -f
# Web UI
http://localhost:8000 (or cortex.dgrzone.com on WireGuard)
The service starts automatically at boot via loginctl enable-linger.
Service file: ~/.config/systemd/user/cortex.service
Config lives in cortex/config.py and cortex/.env (not tracked — see cortex/.env.default).
Development Workflow
The codebase lives in agents_sync/ and syncs to all fleet machines via Syncthing.
Edit code on any machine; use dev-restart.sh to apply changes on the host running the service.
./dev-restart.sh # restart service, show last 30 log lines
./dev-restart.sh logs # tail live logs (ctrl-c to stop)
./dev-restart.sh status # show service status only
Backup
Persona data (home/) is excluded from git and backed up with restic.
install.py sets up a systemd timer that runs backup.sh daily at 03:00.
./backup.sh # run a backup manually
# Inspect snapshots (set env vars or export them)
RESTIC_REPOSITORY=~/backups/cortex-home-restic \
RESTIC_PASSWORD_FILE=~/.config/cortex/restic-password \
restic snapshots
The restic password is generated at ~/.config/cortex/restic-password on first install.
Back it up separately — it is required to restore from any snapshot.
Key Documentation
Start here for a full picture: documentation/MASTER.md
| File | Purpose |
|---|---|
| documentation/MASTER.md | Index: current state, all doc links, quick reference |
| documentation/ROADMAP.md | Phases: what's done, what's next |
| documentation/TODO__Agents.md | Active task list |
| documentation/ARCH__SYSTEM.md | System architecture and component map |
| documentation/ARCH__BACKENDS.md | LLM backends, routing, fallback |
| documentation/ARCH__PERSONA.md | Persona system, context tiers, memory distillation |
| documentation/ARCH__CHANNELS.md | Input channels: web, NC Talk, Google Chat, cron |
| documentation/ARCH__FUTURE.md | Planned features: local orchestrator, dev agents, knowledge layer |
| docs/NEXTCLOUD_TALK_BOT.md | NC Talk bot setup and troubleshooting |
| docs/GOOGLE_CHAT_BOT.md | Google Chat Add-on setup |
| docs/OPEN_WEBUI_API.md | Open WebUI/Ollama API reference |
Architecture at a Glance
[Web UI / NC Talk / Google Chat / Cron / Webhooks]
↓
Cortex Dispatcher (FastAPI, cortex/)
├─ POST /chat — direct to LLM (streaming SSE)
├─ POST /orchestrate — Gemini tool loop → Claude response
├─ POST /webhook/nextcloud/{username} — Nextcloud Talk bot (per-user)
└─ POST /channels/google-chat/{username} — Google Chat Add-on (per-user)
↓
LLM Backends
• Claude CLI — primary, all user-facing responses
• Gemini CLI — fallback
• Gemini API — orchestrator tool loop (two-brain: Gemini plans, Claude responds)
• Local OpenAI — Open WebUI/Ollama on scott_gaming; also runs local orchestrator loop
↓
Persona context loaded from home/{user}/persona/{name}/
See documentation/ARCH__SYSTEM.md for the full architecture breakdown.
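The primary/fallback ordering of backends in the diagram can be pictured as trying each backend in turn. This is a sketch under assumptions, not the routing logic documented in ARCH__BACKENDS.md: `BackendError`, `first_successful`, and the two stub functions are hypothetical stand-ins for the real CLI calls.

```python
from typing import Callable, Sequence, Tuple

Backend = Callable[[str], str]  # hypothetical: takes a prompt, returns a reply


class BackendError(RuntimeError):
    """Raised by a backend that could not produce a reply."""


def first_successful(prompt: str, backends: Sequence[Tuple[str, Backend]]) -> Tuple[str, str]:
    """Try each (name, backend) in order; return (name, reply) from the first that works."""
    errors = []
    for name, backend in backends:
        try:
            return name, backend(prompt)
        except BackendError as exc:
            errors.append(f"{name}: {exc}")
    raise BackendError("all backends failed: " + "; ".join(errors))


# Stubs standing in for the real Claude CLI and Gemini CLI invocations.
def claude_stub(prompt: str) -> str:
    raise BackendError("CLI not authenticated")


def gemini_stub(prompt: str) -> str:
    return f"echo: {prompt}"
```

Calling `first_successful("hi", [("claude", claude_stub), ("gemini", gemini_stub)])` falls through to the second backend because the first stub raises.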
Personas
Each persona has its own identity, memory, and session history.
They are not tied to a specific LLM model — the name is fixed, the backend varies.
Context is loaded at request time from home/{user}/persona/{name}/ via cortex/context_loader.py.
| User | Persona | Description |
|---|---|---|
| scott | inara | Scott's primary AI assistant |
| scott | developer | Scott's dev-focused persona |
| holly | tina | Holly's primary AI assistant |
| brian | wintermute | Brian's primary AI assistant |
Channels
Webhook endpoints are per-user — each user configures their own secrets in home/{username}/channels.json.
| Channel | Status | Endpoint / Notes |
|---|---|---|
| Web UI | Live | https://cortex.dgrzone.com — session auth (login form + JWT cookie) |
| Nextcloud Talk | Live | POST /webhook/nextcloud/{username} — HMAC-signed, async reply |
| Google Chat | Live | POST /channels/google-chat/{username} — Workspace Add-on, JWT auth |
| Browser Push | Live | VAPID push notifications — subscribe via ☰ menu; proactive reminders + distill alerts |
See docs/NEXTCLOUD_TALK_BOT.md and docs/GOOGLE_CHAT_BOT.md for setup instructions.
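For readers unfamiliar with HMAC-signed webhooks, the sketch below shows the general verification pattern. It assumes the scheme used by Nextcloud Talk bots, where the signature is an HMAC-SHA256 over a per-request random value followed by the raw request body (sent in the X-Nextcloud-Talk-Random and X-Nextcloud-Talk-Signature headers). The function names are illustrative; check docs/NEXTCLOUD_TALK_BOT.md for what Cortex actually does.

```python
import hashlib
import hmac


def sign(secret: str, random_value: str, body: bytes) -> str:
    """HMAC-SHA256 over random_value + body, hex-encoded."""
    mac = hmac.new(secret.encode(), random_value.encode() + body, hashlib.sha256)
    return mac.hexdigest()


def verify(secret: str, random_value: str, body: bytes, signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign(secret, random_value, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature.lower().encode())
```

The signature must be computed over the raw request bytes, before any JSON parsing, since re-serializing the body can change whitespace or key order and invalidate the check.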
User Management
cd cortex
# Create a user directory and send an invite email
.venv/bin/python manage_passwords.py invite <username> <email>
# Register a Google account for sign-in (run after user completes onboarding)
.venv/bin/python manage_passwords.py google-add <username> <email>
# List users with password, Google, and email status
.venv/bin/python manage_passwords.py list
# Set/check a password directly
.venv/bin/python manage_passwords.py set <username>
.venv/bin/python manage_passwords.py check <username>
New users receive a link to /setup/{token} where they set their own password and create their first persona. Invite tokens expire in 72 hours and are one-time-use.
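The expiry and one-time-use behaviour of invite tokens can be sketched as follows. This is an in-memory illustration only: `InviteStore` and its methods are hypothetical, and the real manage_passwords.py presumably persists tokens somewhere. The 72-hour lifetime is the only value taken from the text above.

```python
import secrets
import time

INVITE_TTL_SECONDS = 72 * 3600  # 72 hours, per the text above


class InviteStore:
    """In-memory stand-in; a real service would persist tokens."""

    def __init__(self) -> None:
        self._invites = {}  # token -> (username, expires_at)

    def create(self, username: str, now=None) -> str:
        """Issue a random URL-safe token for username."""
        token = secrets.token_urlsafe(32)
        issued = time.time() if now is None else now
        self._invites[token] = (username, issued + INVITE_TTL_SECONDS)
        return token

    def redeem(self, token: str, now=None):
        """Return the username once; None if unknown, already used, or expired."""
        entry = self._invites.pop(token, None)  # pop makes the token one-time-use
        if entry is None:
            return None
        username, expires_at = entry
        current = time.time() if now is None else now
        return username if current < expires_at else None
```

The optional `now` argument exists only so the expiry logic can be exercised without waiting; in normal use both methods fall back to the system clock.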
To enable a channel for a user, create home/{username}/channels.json — see the relevant doc in docs/.
Testing
cd cortex
.venv/bin/python -m pytest tests/ -q
80 tests covering API endpoints, persona routing, tool functions, and security.
Related Projects
| Project | Path |
|---|---|
| Aether Platform API | ~/OSIT_dev/aether_api_fastapi/ |
| Aether Frontend | ~/OSIT_dev/aether_app_sveltekit/ |
| Fleet coordination | ~/agents_sync/ |