# Cortex — Roadmap

> Phases and priorities. For active tasks see `TODO__Agents.md`.
> Last updated: 2026-04-03

---

## Phase 0 — Foundation ✅

- Syncthing fleet sync (`agents_sync/`) operational
- MCP tools (`ae_*`) available in all Claude Code sessions
- Fleet agents running independently on each machine

## Phase 1 — Dispatcher Core ✅

- FastAPI service with streaming SSE responses
- Claude CLI and Gemini CLI subprocess backends
- Session context management (rolling window, file persistence)
- Nextcloud Talk bot (HMAC-signed webhook)
- Memory distiller (APScheduler — short/mid/long cycles)
- Local web UI (single-page, mobile-responsive)
- Auth status monitoring (`/auth/status`, UI banner)
- Session logging and file browser

## Phase 2 — Identity & Multi-User ✅

- Inara persona formalized (`IDENTITY.md`, `SOUL.md`, `PROTOCOLS.md`, context tiers)
- Two-level user/persona layout (`home/{user}/persona/{name}/`)
- Session auth: bcrypt passwords, JWT cookies, invite tokens, Google OAuth
- Multi-user live: Scott, Holly, Brian
- Per-user channel config (`channels.json`)
- Per-user Gemini API key (settings UI)
- Help & Reference system (shared base + per-persona additions)
- Lucide icons, persona picker page, session persistence across navigation

## Phase 3 — Intelligence Layer (In Progress)

- ✅ Gemini API orchestrator (tool loop → Claude responder)
- ✅ Tool suite: web search, AE Journal read/write, tasks, scratch, reminders, cron, system
- ✅ Agent mode in UI (async job, poll for result)
- ✅ Local LLM backend (Open WebUI/Ollama, per-user multi-model config)
- ✅ Proactive cron (`message` / `brief` job types → NC Talk)
- ✅ Session search (full-text across past session logs)
- ✅ Distill notifications (NC Talk after mid/long runs)
- ✅ Local backend for distillation (`DISTILL_BACKEND_MID/LONG` in `.env`)
- [ ] **Local orchestrator** — ReAct tool loop using local model (High priority — see `TODO__Agents.md`)
- [ ] Knowledge import — markdown → AE Journals (import script)
- [ ] Dev agent pipeline — specialist agents + supervisor + approval gate
- [ ] Gitea webhook integration + Actions CI

## Phase 4 — Channel Expansion

- ✅ Web UI
- ✅ Nextcloud Talk
- ✅ Google Chat
- [ ] WhatsApp (Business API or bridge — investigating)
- [ ] Webhook triggers from Aether platform events

## Phase 5 — Routing Intelligence & Scale

- [ ] Intelligent model routing (by task type, privacy, context length)
- [ ] Agent-to-agent task delegation across fleet
- [ ] Permanent hosting on home server (currently on `scott_lpt`)

## Phase 6 — Infrastructure

- [ ] Server DMZ finalized
- [ ] WireGuard for all Cortex-accessing devices
- [ ] Camera/IoT VLAN segmentation

---

## Deferred / Watching

- **Unsloth Gemma 4 GGUFs** — blocked on Ollama v0.20.1 (llama.cpp GGUF metadata issue); switch `agent-support-gemma-*` aliases to Unsloth Q4_K_M when ready
- **Speculative decoding** — llama.cpp supports it (E4B + E2B draft ≈ 2x speed); Ollama does not yet
- **RAG via Open WebUI** — feed Nextcloud docs into local knowledge collections; possible complement to AE Journals search
- **Multi-host local models** — per-user config already supports multiple hosts; routing logic TBD
- **WhatsApp** — requires Business API account or a bridge; not started