Files
Cortex-Inara/documentation/ARCH__SYSTEM.md
Scott Idem 3672fa1506 docs: comprehensive doc audit — sync all docs to current state
- MASTER.md: tool count 40→47, add proactive notifications + spawn_agent rows, date bump
- ROADMAP.md: mark local orchestrator/web push/proactive notifs/spawn_agent/web_read/session_read as done, date bump
- ARCH__CHANNELS.md: rewrite notification channel config section — all 4 channels, all triggers, on-demand endpoints
- ARCH__SYSTEM.md: update tools/ module list to include files, agents
- README.md: update LLM backends in architecture diagram, add browser push to channels table
- CLAUDE.md: add doc update checklist to Documentation Philosophy section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 13:13:45 -04:00

126 lines
8.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Architecture: System Overview
> How the pieces fit together.
> Last updated: 2026-05-06
---
## Architecture Diagram
```
┌─────────────────────────────────────────────────────────┐
│ INPUT CHANNELS │
│ │
│ Web UI ──────────────────────────────────────────┐ │
│ Nextcloud Talk ──── POST /webhook/nextcloud/{u} ─┤ │
│ Google Chat ─────── POST /channels/google-chat/{u}┤ │
│ Cron / Scheduler ─────────────────────────────────┤ │
│ Webhooks (future) ─────────────────────────────────┘ │
└─────────────────────────────┬───────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ CORTEX DISPATCHER (FastAPI — cortex/) │
│ │
│ auth_middleware.py → validates JWT session cookie │
│ persona.py → resolves user + persona context │
│ context_loader.py → assembles system prompt (tier 1-4)│
│ │
│ POST /chat → direct LLM, streaming SSE │
│ POST /orchestrate → Gemini tool loop → Claude │
│ GET /orchestrate/{id} → poll job result │
└────────────┬───────────────────┬────────────────────────┘
↓ ↓
┌─────────────────┐ ┌──────────────────────────────────┐
│ LLM BACKENDS │ │ PERSONA DATA │
│ │ │ home/{user}/persona/{name}/ │
│ Claude CLI │ │ │
│ Gemini CLI │ │ IDENTITY.md SOUL.md │
│ Gemini API │ │ PROTOCOLS.md MEMORY_*.md │
│ Local (httpx) │ │ USER.md REMINDERS.md │
│ │ │ TASKS.json CRONS.json │
└─────────────────┘ │ sessions/ SCRATCH.md │
└──────────────────────────────────┘
```
Details: [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) | [`ARCH__PERSONA.md`](ARCH__PERSONA.md) | [`ARCH__CHANNELS.md`](ARCH__CHANNELS.md)
---
## Service Layout (`cortex/`)
| File | Purpose |
|---|---|
| `main.py` | App entry point, router registration |
| `config.py` | All settings (pydantic-settings, reads `.env`) |
| `persona.py` | User + persona path resolution, ContextVars |
| `context_loader.py` | Builds system prompt from persona files (tiers 14) |
| `llm_client.py` | All LLM backends — Claude, Gemini CLI, Local |
| `orchestrator_engine.py` | Gemini API ReAct tool loop → Claude handoff |
| `openai_orchestrator.py` | OpenAI-compatible ReAct tool loop (local models via Open WebUI/OpenRouter) |
| `model_registry.py` | Per-user model registry V2 — providers, hosts, models, role assignments |
| `session_store.py` | In-memory + file session persistence (`session_data/{id}.json`) |
| `session_logger.py` | Writes session turns to `sessions/YYYY-MM-DD.md` |
| `memory_distiller.py` | Short/mid/long distill jobs |
| `scheduler.py` | APScheduler — distill jobs + user crons |
| `cron_runner.py` | Cron job storage, schedule parsing, execution |
| `notification.py` | Outbound channel messages (distill alerts, cron proactive) |
| `auth_utils.py` | bcrypt passwords, JWT, invite tokens, channel config |
| `auth_middleware.py` | JWT cookie validation on all routes |
| `tool_audit.py` | JSONL audit log for every orchestrator tool invocation |
| `usage_tracker.py` | Per-user token usage tracking (daily buckets → `usage.json`) |
| `event_bus.py` | Internal SSE pub/sub (NC Talk → browser mirror) |
| `email_utils.py` | SMTP invite emails |
| `persona_template.py` | Bootstrap a new persona directory from templates |
| `routers/` | One file per endpoint group — `chat`, `orchestrator`, `auth`, `files`, `ui`, `settings`, `local_llm`, `distill`, `audit`, `usage`, `push`, `help`, `onboarding`, `auth_google`, `nextcloud_talk`, `google_chat` |
| `tools/` | Orchestrator tool implementations — `web` (search/fetch/web_read), `files` (file_read/write/session_read/search), `tasks`, `scratch`, `reminders`, `cron`, `system`, `notify`, `ae_journals`, `ae_tasks`, `agent_notes`, `agents` (spawn_agent) |
| `static/` | Web UI — `index.html`, `app.js`, `style.css`, `login.html`, `setup.html`, `HELP.md`, `local_llm.html`, `settings.html` |
| `tests/` | pytest suite |
---
## Key Design Decisions
**Two-brain pattern (Gemini orchestrator)** — Gemini API handles tool use (function calling, planning, web search). Claude CLI handles all user-facing responses. Direct chat bypasses the orchestrator entirely.
**Single-model pattern (local orchestrator)** — When the `orchestrator` role resolves to a `local_openai` model, `openai_orchestrator.py` runs the full ReAct loop and produces the final response itself. No Claude handoff — the local model does both reasoning and response.
**Subprocess backends** — Claude and Gemini run as CLI subprocesses (`claude --print`, `gemini -p`). This keeps auth transparent (Claude Code manages tokens) and avoids API costs on the Pro subscription path.
**Local backend via httpx** — Open WebUI's OpenAI-compatible API (`/api/chat/completions`). No CLI wrapper. Per-user host + model config in `local_llm.json`.
**ContextVars for async isolation**`persona.py` uses Python `contextvars.ContextVar` so concurrent requests each see their own user/persona without thread-local hacks.
**Per-user filesystem layout**`home/{user}/persona/{name}/` mirrors Linux home directories. Each persona is a directory of markdown files and JSON. No database. Easy to inspect, edit, and back up.
**No single point of coupling** — tools live in `cortex/tools/`, separate from `ae_*` MCP tools. Channels live in `cortex/routers/`, each self-contained. Adding a channel or tool doesn't touch other subsystems.
**Agent private notes**`AGENT_NOTES.md` per persona, writable only by the orchestrator via `agent_notes_*` tools. Never loaded into user-facing context. Three rolling backups (`bak1``bak3`) are visible read-only in the Files panel. Declared in `tools/agent_notes.py`; usage guidance in `PROTOCOLS.md`.
**No black boxes** — Every component, flow, and design decision is documented. Documentation is updated before implementation of significant changes and verified after. HELP.md is the user-facing contract; ARCH__*.md files are the developer contract; PROTOCOLS.md is the agent contract. If any of these drift from reality, that is a bug.
---
## Onboarding Flow
New users are invited via a one-time token and complete a three-step setup before reaching the chat:
```
1. /setup/{token} → Set password (POST creates session cookie, consumes token)
2. /setup/persona → Create persona (slug, display name, emoji, description)
3. /setup/model → Connect a model — OpenRouter recommended
(skip link goes straight to /{user}/{persona})
```
Step 3 is the planned addition (see `TODO__Agents.md § Guided onboarding`). Before it exists,
users land in the chat with no model configured and must navigate Settings → Model Registry
manually — which is confusing for non-technical users.
**After Step 3:**
- `save_host()` adds OpenRouter (`https://openrouter.ai/api/v1`, type `openai`)
- `save_model()` creates a model entry for the chosen model
- `set_role(chat, primary, model_id)` assigns it as the chat role primary
- Redirect to `/{user}/{persona}`
**Existing users with no model configured** — a dismissable banner is shown in the chat on
load, linking to `/setup/model` (the Step 3 form works standalone, without step labels).