docs: update TODO — mark completed items from 2026-06-03 session

- aider_run async/notify: done - L2→L3 boundary enforcement: done (default _agent_level=2) - aider_run multi-provider credentials: done - Added remaining item: pass _agent_level=1 from main orchestrators Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: aider multi-provider credentials + test suite green (182/182)
2026-06-03 23:06:47 -04:00 · 2026-06-03 23:00:45 -04:00 · 2026-06-03 22:40:20 -04:00 · 2026-06-03 22:39:44 -04:00 · 2026-05-15 21:36:24 -04:00 · 2026-05-15 21:19:08 -04:00
97 changed files with 18319 additions and 2779 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -25,5 +25,11 @@ tmp/
 *.tmp
 *.log

+# Aider — history files are personal/ephemeral; .aider.conf.yml is project config and IS tracked
+.aider.chat.history.md
+.aider.input.history
+.aider.llm.history
+
 # System files
 .DS_Store
+.aider*
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -22,7 +22,7 @@ Cortex_and_Inara_dev/
    main.py              ← App entry point, router registration
    config.py            ← All settings (pydantic-settings, reads .env)
    persona.py           ← Two-level identity: user + persona, path resolution, ContextVars
-    llm_client.py        ← Claude CLI + Gemini CLI subprocess backends
+    llm_client.py        ← Claude CLI + Gemini CLI subprocess backends + Anthropic SDK direct
    orchestrator_engine.py ← Gemini API ReAct tool loop → Claude handoff
    context_loader.py    ← Builds system prompt from persona files (tier 1–4)
    session_store.py     ← In-memory + file session persistence
@@ -45,12 +45,15 @@ Cortex_and_Inara_dev/
      google_chat.py     ← POST /webhook/google (Google Chat Add-on)
      ui.py              ← Login/logout, /{user}/{persona} UI route, /api/personas
      onboarding.py      ← /setup/{token} password step + /setup/persona creation
+      settings.py        ← /settings, /settings/notifications, /settings/integrations (admin)
+      tools_settings.py  ← /settings/tools
+      crons.py           ← /settings/crons — Schedules web UI (list/add/edit/toggle/remove)
    tools/
      __init__.py        ← Tool registry (Gemini FunctionDeclarations + dispatcher)
      web.py             ← DuckDuckGo web_search tool
      scratch.py         ← Scratchpad tools (scratch_read/write/append/clear)
      tasks.py           ← Personal task management (task_create/list/update/complete)
-      cron.py            ← Scheduled job tools (cron_list/add/remove/toggle)
+      cron.py            ← Scheduled job tools (cron_list/add/remove/toggle); 5 types; hourly/daily/weekly/monthly/yearly schedules
      system.py          ← Local machine tools (claude_allow_dir)
    tests/               ← pytest test suite (80 tests)
    static/              ← Single-page web UI (index.html, style.css, app.js)
@@ -136,9 +139,10 @@ http://localhost:8000/docs
 - **Orchestrated tasks** go to `POST /orchestrate` — returns a job_id, result is polled

 ### LLM Backends
- `llm_client.py` manages Claude CLI (`claude --print`) and Gemini CLI (`gemini -p`) subprocesses
+- `llm_client.py` manages Claude CLI (`claude --print`), Gemini CLI (`gemini -p`), and Anthropic SDK (`anthropic_api` type) subprocesses/calls
 - `orchestrator_engine.py` uses the Gemini **API** (google-genai SDK) — completely separate from the Gemini CLI
 - Claude OAuth token is read live from `~/.claude/.credentials.json` (never rely on stale env var)
+- `anthropic_api` backend: user-configured API key from `providers.anthropic.credentials` in `model_registry.json` — uses `anthropic.AsyncAnthropic`

 ### Tool Strategy
 - Orchestrator tools live in `cortex/tools/` — separate from the `ae_*` MCP tools
@@ -146,8 +150,8 @@ http://localhost:8000/docs
 - Tools are registered in `cortex/tools/__init__.py` as both Gemini FunctionDeclarations and Python callables

 ### Context / Memory
- `context_loader.py` assembles Inara's system prompt from `inara/` files based on tier (1–3)
- Tier 1 = minimal (identity only); Tier 2 = standard (+ memory + user profile); Tier 3 = full
+- `context_loader.py` assembles Inara's system prompt from `inara/` files based on tier (1–4)
+- Tier 1 = minimal (identity only); Tier 2 = standard (+ memory + user profile); Tier 3 = + last 2 sessions; Tier 4 = + last 7 sessions
 - Memory files are written by the distiller or manually — do not delete them

 ### Security / Safety
@@ -160,6 +164,44 @@ http://localhost:8000/docs
 - Passwords are bcrypt-hashed and stored in `home/{username}/auth.json` — never in `.env` or the DB
 - Invite tokens are one-time-use, 72-hour expiry, stored in `home/{username}/invite.json`

+### Onboarding Flow
+New users follow a three-step setup before reaching the chat:
+1. `GET /setup/{token}` → password form → `POST /setup/{token}` sets password + session cookie
+2. `GET /setup/persona` → persona creation form → `POST /setup/persona` bootstraps persona directory
+3. `GET /setup/model` → OpenRouter quick-connect → `POST /setup/model` saves host + model + role assignment
+
+Step 3 is optional (skip link goes straight to `/{user}/{persona}`). `/setup/model` also works
+standalone (accessible from Settings) for existing users who haven't configured a model.
+
+All in `cortex/routers/onboarding.py`. Model writes use `model_registry.py`: `save_host()`,
+`save_model()`, `set_role(username, "chat", "primary", model_id)`.
+
+### Documentation Philosophy
+Cortex is a no-black-box system. Docs must match reality — at all times.
+
+- **Docs first:** When planning significant changes, update `TODO__Agents.md` and the relevant
+  `ARCH__*.md` to describe the intended design *before* implementing. This creates a spec to
+  implement against.
+- **Verify after:** Once implementation is complete, re-read the pre-written docs and confirm
+  they match what was actually built. Update anything that drifted.
+- **HELP.md is a user contract:** It describes what users can do. Never let it describe
+  features that don't exist or omit features that do.
+- **CLAUDE.md + ARCH__*.md are the developer contract:** Update them as the architecture evolves.
+- **Stale docs are bugs.** If you notice drift, fix it before moving on.
+
+### Doc update checklist (run after any significant change)
+
+| Doc | Update when |
+|---|---|
+| `CLAUDE.md` | New tool, channel, router, major design change, tool count |
+| `cortex/static/HELP.md` | Any user-visible feature — tools, settings, UI, API endpoints |
+| `documentation/TODO__Agents.md` | Mark completed items; add new planned work |
+| `documentation/MASTER.md` | New capability goes live; tool count changes |
+| `documentation/ROADMAP.md` | Phase items completed or added |
+| `documentation/ARCH__CHANNELS.md` | New channel, notification trigger, or scheduler job |
+| `documentation/ARCH__SYSTEM.md` | New module, router, or tools/ file |
+| `README.md` | Architecture diagram, channels table, or setup steps change |
+
 ---

 ## Adding a New Tool
@@ -167,7 +209,13 @@ http://localhost:8000/docs
 1. Implement the tool function in `cortex/tools/<domain>.py`
   - Must be `async def`; use `asyncio.to_thread` for blocking calls
   - Return a plain string result
-2. Add a `FunctionDeclaration` and register it in `cortex/tools/__init__.py`
+2. Add a `FunctionDeclaration` and register it in `cortex/tools/__init__.py`:
+   - Import the callable
+   - Add to `TOOL_CATEGORIES` (pick an existing category or create one)
+   - Add to `_CALLABLES`
+   - Add a `TOOL_RISK` rating (low/medium/high)
+   - Add to `TOOL_ROLES` if admin-only; add to `CONFIRM_REQUIRED` if destructive
+   - Add module to `_ALL_DECLARATIONS`
 3. Syntax check: `python3 -m py_compile cortex/tools/<domain>.py`
 4. Restart Cortex

@@ -212,18 +260,45 @@ clearly asked for a directory to be unblocked.

 ---

-## Current State (2026-04-03)
+## Current State (2026-05-12)

-Cortex is running and stable. All three primary channels are live:
+Cortex is running and stable. All channels are live:

 | Channel | Status | Notes |
 |---|---|---|
-| Web UI | ✅ Live | `https://cortex.dgrzone.com` |
+| Web UI | ✅ Live | `https://cortex.dgrzone.com` — PWA-installable |
 | Nextcloud Talk | ✅ Live | HMAC-signed webhook, async reply |
 | Google Chat | ✅ Live | Workspace Add-on, `hostAppDataAction` response format |
-| Local backend | ✅ Live | Open WebUI/Ollama, per-user multi-model config |
+| Local backend | ✅ Live | Open WebUI/Ollama on scott_gaming, per-user multi-model config |
+| Gemini orchestrator | ✅ Live | Gemini API tool loop → Claude response; ⚡ toggle in UI |
+| Local orchestrator | ✅ Live | OpenAI-compatible ReAct loop; fires when orchestrator role → local model |
+| Tool audit log | ✅ Live | Every tool call logged to `home/{user}/tool_audit/YYYY-MM-DD.jsonl` |
+| Token usage tracking | ✅ Live | Per-user `home/{user}/usage.json`; summary in Settings |
+| Web push | ✅ Live | VAPID push notifications; `web_push` tool; subscribe via ☰ menu |
+| Proactive notifications | ✅ Live | Daily reminder check (09:00); distill/cron completions; `GET /settings/notifications` dedicated page |
+| Proactive cron | ✅ Live | 5 job types: `remind`, `note`, `message`, `brief`, `task` (full orchestrator loop); monthly/yearly schedule formats; HA inbound webhook tools toggle |
+| Schedules web UI | ✅ Live | `/settings/crons` — list, add, edit, pause/resume, delete scheduled jobs |

-Active users: scott (inara, developer), holly (tina), brian (wintermute)
+Active users: scott (inara), holly (tina), brian (wintermute)
+
+**69 orchestrator tools** across 17 domain modules:
+web_search/http_fetch/web_read/http_post,
+project_file_read/list + file_stat/grep/diff/syntax_check (project-scoped),
+file_read/list/write/session_read/session_search (system-scoped, admin),
+git_status/git_log/git_diff (read-only git inspection, project-scoped),
+shell_exec/claude_allow_dir,
+cortex_restart/logs/status/update,
+task_list/create/update/complete, cron_list/add/remove/toggle,
+reminders_add/list/remove/clear, scratch_read/write/append/clear,
+web_push/email_send/nc_talk_send/nc_talk_history,
+ae_journal_list/search/entries_list/entry_read/create/update/disable/append/prepend,
+ae_task_list, ae_db_query/describe/show_view (SELECT-only MariaDB access, admin; disable requires confirm),
+agent_notes_read/write/append/clear, spawn_agent/aider_run (admin; aider_run requires confirm),
+agent_status/agent_list (user-level)/agent_cancel (admin, confirm-required),
+ha_get_state/ha_get_states/ha_call_service.
+
+Each tool has a `TOOL_RISK` rating (low/medium/high). Configure access at `/settings/tools`
+(max_risk threshold + per-tool whitelist/blacklist). Risk policy stored in `home/{user}/tool_policy.json`.

 See `documentation/TODO__Agents.md` for the active task list.
 See `documentation/ROADMAP.md` for phases and what's next.
--- a/README.md
+++ b/README.md
@@ -10,6 +10,43 @@ Cortex is a self-hosted multi-agent AI platform. It supports multiple users, eac

 ---

+## Where Cortex Fits
+
+AI tools aren't one-size-fits-all. Cortex exists in a specific niche — it's not trying to be everything.
+
+**Cortex is a self-hosted persona platform.** It gives you a persistent AI companion with its own
+identity, memory, and voice — reachable through your chat apps, not just a browser tab. It remembers
+who you are across days and weeks. It can proactively message you on a schedule. It runs on your
+own hardware, behind your own auth.
+
+### What Cortex is good at
+- **Being a consistent AI presence** — same persona, same memory, day after day
+- **Multi-channel access** — web, Nextcloud Talk, Google Chat, all routed to the same brain
+- **Proactive work** — scheduled messages, reminders, cron jobs that reach out to you
+- **Multi-user households** — each person gets their own persona (Scott → Inara, Holly → Tina)
+- **Private, offline-capable** — local models via Ollama when you don't want anything leaving the LAN
+
+### What Cortex is not
+- **Not a coding assistant.** Cortex lives in chat apps, not in your terminal or IDE.
+  Use Claude Code, DeepSeek TUI, Gemini CLI, or Copilot for code-level work — they specialize in reading and
+  editing project files. Cortex can't open a codebase.
+- **Not a generic LLM chat UI.** Open WebUI and LibreChat are excellent model-switching frontends.
+  Cortex isn't a frontend — it's a platform with its own identity system, orchestrator, and memory
+  pipeline. Two different jobs.
+- **Not a SaaS product.** Nobody else hosts your Cortex instance. Nobody else sees your conversations.
+  The trade-off is you manage the service yourself — `systemctl --user restart cortex`.
+- **Not an agent framework.** LangChain, CrewAI, and similar are libraries for building AI pipelines.
+  Cortex is a running service with concrete personas, not an abstraction layer to build on top of.
+
+### The stack in practice
+- Use **Cortex** to talk to Inara — daily assistant, memory keeper, scheduled check-ins
+- Use **Claude Code / DeepSeek TUI** to work *on* Cortex — code edits, architecture, debugging
+- Use **Open WebUI** when you want to test a new model or run a quick prompt without persona context
+
+Same AI, different interfaces for different jobs.
+
+---
+
 ## Quick Orientation

 | Directory | What it is |
@@ -145,10 +182,10 @@ Back it up separately — it is required to restore from any snapshot.
    └─ POST /channels/google-chat/{username} — Google Chat Add-on (per-user)
        ↓
  LLM Backends
-  • Claude CLI   — primary, all user-facing responses
-  • Gemini CLI   — fallback
-  • Gemini API   — orchestrator tool loop only (not general chat)
-  • Local        — Open WebUI/Ollama on scott_gaming (private/offline)
+  • Claude CLI      — primary, all user-facing responses
+  • Gemini CLI      — fallback
+  • Gemini API      — orchestrator tool loop (two-brain: Gemini plans, Claude responds)
+  • Local OpenAI    — Open WebUI/Ollama on scott_gaming; also runs local orchestrator loop
        ↓
  Persona context loaded from home/{user}/persona/{name}/
 ```
@@ -176,11 +213,12 @@ Context is loaded at request time from `home/{user}/persona/{name}/` via `cortex

 Webhook endpoints are per-user — each user configures their own secrets in `home/{username}/channels.json`.

-| Channel | Status | Endpoint |
+| Channel | Status | Endpoint / Notes |
 |---|---|---|
 | Web UI | Live | `https://cortex.dgrzone.com` — session auth (login form + JWT cookie) |
 | Nextcloud Talk | Live | `POST /webhook/nextcloud/{username}` — HMAC-signed, async reply |
 | Google Chat | Live | `POST /channels/google-chat/{username}` — Workspace Add-on, JWT auth |
+| Browser Push | Live | VAPID push notifications — subscribe via ☰ menu; proactive reminders + distill alerts |

 See `docs/NEXTCLOUD_TALK_BOT.md` and `docs/GOOGLE_CHAT_BOT.md` for setup instructions.

--- a/cortex/.env.example
+++ b/cortex/.env.example
@@ -93,6 +93,18 @@ AE_API_KEY=
 AE_ACCOUNT_ID=
 AE_API_TIMEOUT=15

+# ── Aether MariaDB (direct — SELECT-only via ae_db_query/describe/show_view tools) ─
+# Configured per-user in home/{username}/channels.json — NOT in .env.
+# Add this block to the user's channels.json to enable the tools:
+#
+#   "aether_db": {
+#       "host":     "192.168.64.5",
+#       "port":     3306,
+#       "name":     "aether_dev",
+#       "user":     "aether_dev",
+#       "password": "..."
+#   }
+
 # ── Distillation schedule ────────────────────────────────────────────────────
 SCHEDULER_TIMEZONE=America/New_York
 AUTO_DISTILL=true
--- a/cortex/agent_manager.py
+++ b/cortex/agent_manager.py
@@ -0,0 +1,158 @@
+"""
+Agent lifecycle manager — registry for background spawn_agent and aider_run tasks.
+
+Tracks running and recently completed agents in-process. On completion, fires
+notification.notify() if notify=True (same channel used by reminders and cron jobs).
+
+Records are kept for 24 hours after completion, then pruned on next registration.
+"""
+
+import asyncio
+import logging
+import uuid
+from dataclasses import dataclass, field
+from datetime import datetime, timedelta
+
+logger = logging.getLogger(__name__)
+
+_PRUNE_AFTER = timedelta(hours=24)
+_RESULT_PREVIEW_CHARS = 500
+_TASK_PREVIEW_CHARS = 200
+
+
+@dataclass
+class AgentRecord:
+    agent_id: str
+    level: int              # 1 = persona, 2 = specialized sub-agent, 3 = support agent
+    role: str               # e.g. "coder", "research", "chat"
+    task: str               # first _TASK_PREVIEW_CHARS of the task
+    status: str             # running / done / failed / cancelled / timeout
+    started: datetime
+    user: str
+    parent_id: str | None = None          # agent_id of the spawner (lineage tracking)
+    finished: datetime | None = None
+    result: str | None = None             # first _RESULT_PREVIEW_CHARS on completion
+    notify: bool = False                  # push notification on completion
+    _task_ref: "asyncio.Task | None" = field(default=None, repr=False)
+
+
+# Module-level registry — in-process only, not persisted across restarts.
+_agents: dict[str, AgentRecord] = {}
+_lock = asyncio.Lock()
+
+
+async def register(
+    user: str,
+    role: str,
+    task: str,
+    level: int = 2,
+    parent_id: str | None = None,
+    notify: bool = False,
+) -> AgentRecord:
+    """Create and register a new running agent. Returns the record (agent_id is set)."""
+    agent_id = str(uuid.uuid4())
+    rec = AgentRecord(
+        agent_id=agent_id,
+        level=level,
+        role=role,
+        task=task[:_TASK_PREVIEW_CHARS],
+        status="running",
+        started=datetime.now(),
+        user=user,
+        parent_id=parent_id,
+        notify=notify,
+    )
+    async with _lock:
+        _prune_locked()
+        _agents[agent_id] = rec
+    logger.info(
+        "agent_manager: registered %s role=%s level=%d user=%s task=%.60s",
+        agent_id[:8], role, level, user, task,
+    )
+    return rec
+
+
+def set_task_ref(agent_id: str, task_ref: "asyncio.Task") -> None:
+    """Store the asyncio.Task reference so it can be cancelled later.
+
+    Call immediately after asyncio.create_task() — before the event loop yields.
+    """
+    rec = _agents.get(agent_id)
+    if rec:
+        rec._task_ref = task_ref
+
+
+async def finish(agent_id: str, result: str, status: str = "done") -> None:
+    """Mark an agent complete, store the result, and notify the user if requested."""
+    async with _lock:
+        rec = _agents.get(agent_id)
+        if not rec:
+            return
+        rec.status = status
+        rec.finished = datetime.now()
+        rec.result = (result or "")[:_RESULT_PREVIEW_CHARS]
+
+    logger.info("agent_manager: finished %s status=%s", agent_id[:8], status)
+
+    if rec.notify and status != "cancelled":
+        try:
+            from notification import notify as _notify
+            elapsed = int((rec.finished - rec.started).total_seconds())
+            emoji = "✅" if status == "done" else "⚠️"
+            preview = (rec.result or "(no output)")[:200]
+            msg = f"{emoji} Agent done [{rec.role}, {elapsed}s]: {preview}"
+            await _notify(rec.user, msg)
+        except Exception as e:
+            logger.warning("agent_manager: notification failed for %s: %s", agent_id[:8], e)
+
+
+async def cancel_agent(agent_id: str, user: str) -> str:
+    """Cancel a running background agent. Returns a human-readable status message."""
+    async with _lock:
+        rec = _agents.get(agent_id)
+        if not rec:
+            return f"No agent found: {agent_id}"
+        if rec.user != user:
+            return "Access denied."
+        if rec.status != "running":
+            return f"Agent {agent_id[:8]}… is already {rec.status}."
+        task_ref = rec._task_ref
+        rec.status = "cancelled"
+        rec.finished = datetime.now()
+
+    if task_ref and not task_ref.done():
+        task_ref.cancel()
+
+    logger.info("agent_manager: cancelled %s by user=%s", agent_id[:8], user)
+    return f"Agent {agent_id[:8]}… cancelled."
+
+
+def get(agent_id: str) -> AgentRecord | None:
+    """Look up an agent record by ID."""
+    return _agents.get(agent_id)
+
+
+def list_agents(user: str, status: str | None = None, limit: int = 10) -> list[AgentRecord]:
+    """Return recent agents for a user, newest first.
+
+    Does not acquire the lock — safe for read-only listing (Python dict iteration is
+    thread-safe for reads; we don't care about racing with a concurrent registration).
+    """
+    records = [r for r in _agents.values() if r.user == user]
+    if status:
+        records = [r for r in records if r.status == status]
+    records.sort(key=lambda r: r.started, reverse=True)
+    return records[:limit]
+
+
+def _prune_locked() -> None:
+    """Remove completed agents older than _PRUNE_AFTER. Must be called inside _lock."""
+    cutoff = datetime.now() - _PRUNE_AFTER
+    stale = [
+        aid for aid, r in _agents.items()
+        if r.status != "running" and r.finished and r.finished < cutoff
+    ]
+    for aid in stale:
+        del _agents[aid]
+    if stale:
+        logger.debug("agent_manager: pruned %d stale records", len(stale))
--- a/cortex/auth_middleware.py
+++ b/cortex/auth_middleware.py
@@ -17,7 +17,8 @@ from starlette.responses import RedirectResponse, JSONResponse
 from auth_utils import COOKIE_NAME, decode_token

 # Paths that don't require a session cookie
-_PUBLIC = {"/login", "/logout", "/health"}
+_PUBLIC = {"/login", "/logout", "/health", "/manifest.json", "/sw.js", "/favicon.ico",
+           "/api/push/vapid-key"}

 # Path prefixes that are always public (setup flow + webhooks + Google OAuth)
 _PUBLIC_PREFIXES = ("/setup/", "/channels/", "/webhook/", "/auth/google")
--- a/cortex/auth_utils.py
+++ b/cortex/auth_utils.py
@@ -115,6 +115,16 @@ def get_user_gemini_key(username: str) -> str | None:
    return _read_auth(username).get("gemini_api_key") or None


+def get_user_role(username: str) -> str:
+    """Return the user's role: 'admin' or 'user' (default).
+
+    Role is stored as auth.json["role"]. Any unrecognised value falls back to 'user'.
+    Set via: manage_passwords.py role <username> admin|user
+    """
+    role = _read_auth(username).get("role", "user")
+    return role if role in ("admin", "user") else "user"
+
+
 # ---------------------------------------------------------------------------
 # JWT helpers
 # ---------------------------------------------------------------------------
@@ -214,3 +224,37 @@ def get_user_channels(username: str) -> dict:
        return json.loads(path.read_text())
    except Exception:
        return {}
+
+
+def get_tool_policy(username: str) -> dict:
+    """Return the parsed tool_policy.json for a user.
+
+    Confirmation-gate keys (existing):
+      allow     — tools in CONFIRM_REQUIRED that this user has pre-approved (skip gate)
+      deny      — tools always blocked for this user regardless of global CONFIRM_REQUIRED
+
+    Risk-policy keys (new):
+      max_risk  — auto-include all tools at/below this level ("low"|"medium"|"high")
+      whitelist — force-include specific tools above max_risk
+      blacklist — force-exclude specific tools regardless of max_risk
+    """
+    path = settings.home_root() / username / "tool_policy.json"
+    try:
+        return json.loads(path.read_text())
+    except Exception:
+        return {}
+
+
+def get_risk_policy(username: str) -> tuple[str | None, list[str], list[str]]:
+    """Return (max_risk, whitelist, blacklist) from the user's tool policy."""
+    policy = get_tool_policy(username)
+    return (
+        policy.get("max_risk") or None,
+        policy.get("whitelist") or [],
+        policy.get("blacklist") or [],
+    )
+
+
+def save_tool_policy(username: str, data: dict) -> None:
+    path = settings.home_root() / username / "tool_policy.json"
+    path.write_text(json.dumps(data, indent=2) + "\n")
--- a/cortex/config.py
+++ b/cortex/config.py
@@ -89,6 +89,12 @@ class Settings(BaseSettings):
    jwt_secret: str = "change-me-in-dotenv"   # override in .env: JWT_SECRET=<random>
    jwt_expire_days: int = 30

+    # Web Push (VAPID) — for browser push notifications
+    # Generate once with py_vapid; see push_utils.py for key format details
+    vapid_public_key: str = ""     # base64url-encoded uncompressed EC point (for browser)
+    vapid_private_key_b64: str = "" # base64-encoded PEM private key (single-line .env storage)
+    vapid_contact: str = "mailto:admin@example.com"
+
    # SMTP — for sending invite emails
    smtp_server: str = ""
    smtp_port: int = 465
--- a/cortex/context_loader.py
+++ b/cortex/context_loader.py
@@ -1,4 +1,10 @@
+from datetime import datetime
+from pathlib import Path
+
 from persona import persona_path
+from tools.reminders import load_due_reminders
+
+_STATIC_DIR = Path(__file__).parent / "static"


 # Core identity files — always loaded regardless of tier
@@ -13,6 +19,10 @@ def load_context(
    include_long: bool = True,
    include_mid: bool = True,
    include_short: bool = True,
+    role_append: str = "",
+    inject_datetime: bool = True,
+    inject_mode: bool = True,
+    mode: str = "chat",
 ) -> str:
    """
    Build the system-prompt context block for a given tier and memory toggles.
@@ -24,10 +34,26 @@ def load_context(
    Tier 2  — + USER full + PROTOCOLS + memory      (~5,000 tokens)
    Tier 3  — + last 2 raw session logs             (~15,000 tokens)
    Tier 4  — + last 7 raw session logs             (~50,000 tokens)
+
+    role_append — optional text injected last (closest to the turn),
+                  sourced from the active role's system_append config.
    """
    inara_dir = persona_path()
    parts = []

+    # ── 0. System block — date/time and session mode (injected first so it's prominent) ──
+    system_lines = []
+    if inject_datetime:
+        now = datetime.now().astimezone()
+        system_lines.append(f"Current date and time: {now.strftime('%A, %Y-%m-%d at %I:%M %p %Z')}")
+    if mode == "otr" and inject_mode:
+        system_lines.append(
+            "Current mode: Off The Record — "
+            "this conversation is private and will not be logged or included in memory distillation"
+        )
+    if system_lines:
+        parts.append("--- System ---\n" + "\n".join(system_lines))
+
    # ── 1. Core identity (always) ──────────────────────────────────
    for filename in _CORE:
        path = inara_dir / filename
@@ -52,17 +78,26 @@ def load_context(
    if proto_path.exists():
        parts.append(f"--- PROTOCOLS.md ---\n{proto_path.read_text()}")

+    ops_path = inara_dir / "OPERATIONS.md"
+    if ops_path.exists():
+        parts.append(f"--- OPERATIONS.md ---\n{ops_path.read_text()}")
+
+    # Global tool reference (same for all personas)
+    tools_path = _STATIC_DIR / "TOOLS.md"
+    if tools_path.exists():
+        parts.append(f"--- TOOLS.md ---\n{tools_path.read_text()}")
+
+    # Persona-specific help additions (optional)
    help_path = inara_dir / "HELP.md"
-    if help_path.exists():
+    if help_path.exists() and help_path.stat().st_size > 10:
        parts.append(f"--- HELP.md ---\n{help_path.read_text()}")

    # ── 4. Pending reminders (tier 2+) ────────────────────────────
-    #    Written by cron jobs; cleared by Inara after acting on them.
-    reminders_path = inara_dir / "REMINDERS.md"
-    if reminders_path.exists() and reminders_path.stat().st_size > 10:
-        content = reminders_path.read_text().strip()
-        if content:
-            parts.append(f"--- REMINDERS.md ---\n{content}")
+    #    Only due and undated reminders are surfaced — future-dated ones
+    #    are stored in REMINDERS.md but suppressed until their date arrives.
+    content = load_due_reminders()
+    if content:
+        parts.append(f"--- REMINDERS.md ---\n{content}")

    # ── 5. Tiered memory — long → mid → short ─────────────────────
    #    Short is last so it sits closest to the conversation turn.
@@ -97,4 +132,8 @@ def load_context(
            for sf in session_files:
                parts.append(f"--- Session: {sf.name} ---\n{sf.read_text()}")

+    # ── 7. Role-specific instructions (always last — closest to the turn) ──
+    if role_append and role_append.strip():
+        parts.append(f"--- Role Context ---\n{role_append.strip()}")
+
    return "\n\n".join(parts)
--- a/cortex/cron_runner.py
+++ b/cortex/cron_runner.py
@@ -10,9 +10,9 @@ Job schema:
    "id":         "c_abc123",
    "label":      "Human-readable name",
    "schedule":   "daily:09:00",   # see parse_schedule() for all formats
-    "type":       "remind" | "note" | "message" | "brief",
+    "type":       "remind" | "note" | "message" | "brief" | "task",
    "payload":    "Text or prompt when the job fires",
-    "channel":    null | "nextcloud" | "google_chat",  # for message/brief types
+    "channel":    null | "nextcloud" | "google_chat",  # for message/brief/task types
    "enabled":    true,
    "created_at": "ISO 8601",
    "last_run":   null | "ISO 8601"
@@ -21,9 +21,14 @@ Job schema:
 Job types:
  remind   → appends to REMINDERS.md  (auto-loaded into context at tier 2+)
  note     → appends to SCRATCH.md    (read on demand via scratch_read)
-  message  → sends payload as-is to NC Talk notification_room
-  brief    → runs LLM with payload as the prompt, sends response to NC Talk
+  message  → sends payload as-is to notification channel
+  brief    → calls LLM (no tools) with payload as prompt, sends response
             (good for morning briefings, summaries, proactive check-ins)
+  task     → runs full orchestrator tool loop with payload as the user request,
+             sends Claude's response to notification channel
+             (good for agentic scheduled work: research, file updates, checks)
+             Tools that require confirmation are skipped — pre-approve them
+             in Settings → Tools to allow them in scheduled tasks.
 """

 import logging
@@ -80,11 +85,16 @@ def parse_schedule(schedule: str) -> dict:
    Convert a human schedule string to APScheduler cron kwargs.

    Formats:
-      "hourly"           → every hour at :00
-      "daily"            → every day at 09:00
-      "daily:HH:MM"      → every day at HH:MM
-      "weekly:DOW"       → every DOW at 09:00
-      "weekly:DOW:HH:MM" → every DOW at HH:MM
+      "hourly"                → every hour at :00
+      "daily"                 → every day at 09:00
+      "daily:HH:MM"           → every day at HH:MM
+      "weekly:DOW"            → every DOW at 09:00
+      "weekly:DOW:HH:MM"      → every DOW at HH:MM
+      "monthly"               → 1st of every month at 09:00
+      "monthly:DD"            → day DD of every month at 09:00
+      "monthly:DD:HH:MM"      → day DD of every month at HH:MM
+      "yearly:MM:DD"          → every year on MM/DD at 09:00  (birthdays, anniversaries)
+      "yearly:MM:DD:HH:MM"    → every year on MM/DD at HH:MM
    """
    s = schedule.strip().lower()

@@ -112,9 +122,37 @@ def parse_schedule(schedule: str) -> dict:
            h, m = _DEFAULT_HOUR, _DEFAULT_MINUTE
        return {"day_of_week": dow, "hour": h, "minute": m}

+    if s.startswith("monthly"):
+        rest = s[7:].lstrip(":")
+        if not rest:
+            return {"day": 1, "hour": _DEFAULT_HOUR, "minute": _DEFAULT_MINUTE}
+        parts = rest.split(":")
+        day = _parse_day(parts[0], schedule)
+        if len(parts) == 3:
+            h, m = _parse_hhmm(f"{parts[1]}:{parts[2]}", schedule)
+        else:
+            h, m = _DEFAULT_HOUR, _DEFAULT_MINUTE
+        return {"day": day, "hour": h, "minute": m}
+
+    if s.startswith("yearly:"):
+        rest = s[7:].split(":")
+        if len(rest) < 2:
+            raise ValueError(
+                f"yearly requires at least MM:DD in {schedule!r}. "
+                f"Example: yearly:03:15  or  yearly:03:15:09:00"
+            )
+        month = _parse_month(rest[0], schedule)
+        day   = _parse_day(rest[1], schedule)
+        if len(rest) == 4:
+            h, m = _parse_hhmm(f"{rest[2]}:{rest[3]}", schedule)
+        else:
+            h, m = _DEFAULT_HOUR, _DEFAULT_MINUTE
+        return {"month": month, "day": day, "hour": h, "minute": m}
+
    raise ValueError(
        f"Unrecognised schedule {schedule!r}. "
-        f"Valid formats: hourly | daily | daily:HH:MM | weekly:DOW | weekly:DOW:HH:MM"
+        f"Valid formats: hourly | daily | daily:HH:MM | weekly:DOW | weekly:DOW:HH:MM | "
+        f"monthly | monthly:DD | monthly:DD:HH:MM | yearly:MM:DD | yearly:MM:DD:HH:MM"
    )


@@ -125,6 +163,26 @@ def _parse_hhmm(s: str, original: str) -> tuple[int, int]:
    return int(parts[0]), int(parts[1])


+def _parse_day(s: str, original: str) -> int:
+    try:
+        d = int(s)
+    except ValueError:
+        raise ValueError(f"Expected day number (1–31) in {original!r}, got {s!r}")
+    if not 1 <= d <= 31:
+        raise ValueError(f"Day must be 1–31 in {original!r}, got {d}")
+    return d
+
+
+def _parse_month(s: str, original: str) -> int:
+    try:
+        m = int(s)
+    except ValueError:
+        raise ValueError(f"Expected month number (1–12) in {original!r}, got {s!r}")
+    if not 1 <= m <= 12:
+        raise ValueError(f"Month must be 1–12 in {original!r}, got {m}")
+    return m
+
+
 # ---------------------------------------------------------------------------
 # Execution
 # ---------------------------------------------------------------------------
@@ -188,6 +246,55 @@ async def run_job(job: dict) -> None:
        except Exception as e:
            logger.error("cron [brief] LLM error for %s: %s", label, e)

+    elif job_type == "task":
+        # Run the full orchestrator tool loop, send Claude's response to the
+        # notification channel. Tools that require confirmation are skipped in
+        # cron context — the user is notified to pre-approve them.
+        from orchestrator_engine import run as _orch_run
+        from context_loader import load_context
+        from notification import notify
+        from persona import set_context
+        from auth_utils import get_user_gemini_key, get_tool_policy, get_risk_policy
+        from config import settings as _s
+
+        username   = job.get("user") or _s.user_name.lower()
+        persona_nm = job.get("persona") or _s.agent_name.lower()
+        channel    = job.get("channel") or None
+        set_context(username, persona_nm)
+
+        system_prompt = load_context(2)
+        policy        = get_tool_policy(username)
+        max_risk, whitelist, blacklist = get_risk_policy(username)
+        gemini_key    = get_user_gemini_key(username)
+
+        try:
+            result = await _orch_run(
+                task=payload,
+                system_prompt=system_prompt,
+                gemini_api_key=gemini_key,
+                respond_with_claude=True,
+                confirm_allow=set(policy.get("allow") or []),
+                confirm_deny=set(policy.get("deny") or []),
+                max_risk=max_risk,
+                risk_whitelist=whitelist,
+                risk_blacklist=blacklist,
+            )
+            if result.checkpoint:
+                tool_name = (result.checkpoint.pending_calls[0].name
+                             if result.checkpoint.pending_calls else "unknown tool")
+                msg = (
+                    f"Scheduled task '{label}' paused — "
+                    f"'{tool_name}' requires confirmation. "
+                    "Pre-approve it in Settings → Tools to allow it in scheduled tasks."
+                )
+                await notify(username, msg, channel=channel)
+                logger.warning("cron [task] %s: confirmation required for %s", label, tool_name)
+            else:
+                await notify(username, result.response, channel=channel)
+                logger.info("cron [task] completed via %s: %s", result.backend, label)
+        except Exception as e:
+            logger.error("cron [task] error for %s: %s", label, e)
+
    else:
        logger.warning("cron: unknown type %r (job %s)", job_type, job.get("id"))
        return
--- a/cortex/llm_client.py
+++ b/cortex/llm_client.py
@@ -33,15 +33,16 @@ async def cleanup() -> None:

 # Map from registry model type → dispatch function key
 _TYPE_TO_BACKEND = {
-    "claude_cli":   "claude",
-    "gemini_cli":   "gemini",
-    "gemini_api":   "gemini",   # gemini_api falls back to CLI in this context
-    "local_openai": "local",
+    "claude_cli":    "claude",
+    "gemini_cli":    "gemini",
+    "gemini_api":    "gemini",        # gemini_api falls back to CLI in this context
+    "local_openai":  "local",
+    "anthropic_api": "anthropic_api",
 }

 # Explicit UI toggle values (kept for backward compat)
 _EXPLICIT_BACKENDS = ("claude", "gemini", "local")
-_FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude"}
+_FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude", "anthropic_api": "claude"}


 async def complete(
@@ -49,14 +50,18 @@ async def complete(
    messages: list[dict],
    model: str | None = None,
    role: str = "chat",
+    slot: str | None = None,
    max_tokens: int = 2048,
+    attachment: dict | None = None,
 ) -> tuple[str, str]:
    """
    Returns (response_text, actual_backend_used).

-    model: explicit backend override ("claude" | "gemini" | "local") from UI toggle.
+    slot:  Phase 3 — specific role slot ("primary" | "backup_1" | "backup_2").
+           Resolves that exact slot, no fallback chain. Takes priority over model.
+    model: legacy backend override ("claude" | "gemini" | "local") from old toggle.
           None = resolve via model registry for the given role.
-    role:  registry role used when model is None (default: "chat").
+    role:  registry role used for slot/auto routing (default: "chat").
    """
    import model_registry as _reg
    from persona import _user
@@ -64,34 +69,42 @@ async def complete(
    username = _user.get()
    resolved_cfg: dict | None = None

-    if model in _EXPLICIT_BACKENDS:
-        # User explicitly selected a backend in the UI
-        if model == "local":
-            resolved_cfg = _reg.get_best_local_model(username, role)
-            if not resolved_cfg:
-                raise RuntimeError("No local model configured — add one at /settings/models")
-        primary = model
-    else:
-        # Role-based routing via model registry
-        resolved = _reg.get_model_for_role(username, role)
-        if resolved:
-            resolved_cfg = resolved
-            primary = _TYPE_TO_BACKEND.get(resolved["type"], "claude")
+    if slot is not None:
+        # Phase 3: explicit slot selection — no fallback within the role
+        resolved_cfg = _reg.get_model_for_slot(username, role, slot)
+        if resolved_cfg:
+            primary = _TYPE_TO_BACKEND.get(resolved_cfg["type"], "claude")
        else:
-            primary = settings.primary_backend
+            # Slot not configured — fall through to auto routing
+            slot = None
+
+    if slot is None:
+        if model in _EXPLICIT_BACKENDS:
+            # Legacy: explicit backend override from old UI toggle
+            if model == "local":
+                resolved_cfg = _reg.get_best_local_model(username, role)
+                if not resolved_cfg:
+                    raise RuntimeError("No local model configured — add one at /settings/models")
+            primary = model
+        else:
+            # Auto: role-based routing via model registry
+            resolved = _reg.get_model_for_role(username, role)
+            if resolved:
+                resolved_cfg = resolved
+                primary = _TYPE_TO_BACKEND.get(resolved["type"], "claude")
+            else:
+                primary = settings.primary_backend

    fallback = _FALLBACK.get(primary, "claude")

    try:
-        response = await _dispatch(primary, system_prompt, messages, resolved_cfg)
+        response = await _dispatch(primary, system_prompt, messages, resolved_cfg, attachment=attachment)
        return response, primary
    except Exception as e:
        err_str = str(e)
        if primary == "claude" and any(k in err_str for k in ("401", "authenticate", "expired", "OAuth")):
            await event_bus.publish({"type": "claude_auth_expired"})
-        # Only fall back when using a default/auto backend.
-        # If the user has explicitly configured a model via the registry,
-        # surface the error so they know something is wrong.
+        # Surface errors when a model is explicitly configured or a specific slot was pinned.
        if resolved_cfg is not None:
            logger.error("%s failed (no fallback — model explicitly configured): %s", primary, e)
            raise
@@ -105,11 +118,14 @@ async def _dispatch(
    system_prompt: str,
    messages: list[dict],
    model_cfg: dict | None,
+    attachment: dict | None = None,
 ) -> str:
    if backend == "gemini":
        return await _gemini(system_prompt, messages)
    if backend == "local":
-        return await _local(system_prompt, messages, model_cfg)
+        return await _local(system_prompt, messages, model_cfg, attachment=attachment)
+    if backend == "anthropic_api":
+        return await _anthropic_api(system_prompt, messages, model_cfg)
    return await _claude(system_prompt, messages, model_cfg)


@@ -155,11 +171,17 @@ async def _claude(system_prompt: str, messages: list[dict], model_cfg: dict | No
    return await _run(cmd, timeout=settings.timeout_claude, env=env)


-async def _local(system_prompt: str, messages: list[dict], model_cfg: dict | None = None) -> str:
+async def _local(
+    system_prompt: str,
+    messages: list[dict],
+    model_cfg: dict | None = None,
+    attachment: dict | None = None,
+) -> str:
    """OpenAI-compatible backend — Open WebUI / Ollama.

    model_cfg is pre-resolved by complete() via model_registry.
    Falls back to registry lookup if not provided.
+    attachment: optional image dict {filename, mime_type, data} for vision calls.
    """
    import httpx

@@ -177,9 +199,9 @@ async def _local(system_prompt: str, messages: list[dict], model_cfg: dict | Non
    model   = cfg["model_name"]

    if not api_url:
-        raise RuntimeError("local_api_url not configured — set LOCAL_API_URL in .env or add a host at /settings/local")
+        raise RuntimeError("local_api_url not configured — set LOCAL_API_URL in .env or add a host at /settings/models")
    if not model:
-        raise RuntimeError("local_model not configured — add a model at /settings/local")
+        raise RuntimeError("local_model not configured — add a model at /settings/models")

    host_type = cfg.get("host_type", "openwebui")
    # "openwebui" uses Open WebUI/Ollama path layout; "openai" uses standard OpenAI layout
@@ -189,8 +211,20 @@ async def _local(system_prompt: str, messages: list[dict], model_cfg: dict | Non
    msgs: list[dict] = []
    if system_prompt:
        msgs.append({"role": "system", "content": system_prompt})
-    # Strip any non-standard metadata fields before sending to the API
-    msgs.extend({"role": m["role"], "content": m["content"]} for m in messages)
+
+    # Build message list; inject image into the last user message when present.
+    for i, m in enumerate(messages):
+        is_last = (i == len(messages) - 1)
+        if is_last and m["role"] == "user" and attachment:
+            content: list[dict] = [{"type": "text", "text": m["content"]}]
+            content.append({
+                "type": "image_url",
+                "image_url": {"url": attachment["data"]},
+            })
+            msgs.append({"role": "user", "content": content})
+        else:
+            # Strip non-standard metadata fields before sending to the API
+            msgs.append({"role": m["role"], "content": m["content"]})

    url = api_url.rstrip("/") + chat_path
    headers: dict[str, str] = {}
@@ -207,6 +241,64 @@ async def _local(system_prompt: str, messages: list[dict], model_cfg: dict | Non
    text = data["choices"][0]["message"]["content"]
    if not text or not text.strip():
        raise RuntimeError("Local model returned an empty response")
+
+    usage = data.get("usage") or {}
+    if usage.get("prompt_tokens") is not None:
+        import usage_tracker
+        from persona import _user
+        asyncio.create_task(usage_tracker.record(
+            username=_user.get(),
+            backend="local",
+            model_name=model,
+            prompt_tokens=usage.get("prompt_tokens", 0),
+            completion_tokens=usage.get("completion_tokens", 0),
+        ))
+
+    return text.strip()
+
+
+async def _anthropic_api(system_prompt: str, messages: list[dict], model_cfg: dict | None) -> str:
+    """Direct Anthropic API backend using the anthropic SDK."""
+    try:
+        import anthropic
+    except ImportError:
+        raise RuntimeError("anthropic SDK not installed — run: pip install 'anthropic>=0.40.0'")
+
+    cfg        = model_cfg or {}
+    api_key    = cfg.get("api_key", "")
+    model_name = cfg.get("model_name") or settings.default_model
+
+    if not api_key:
+        raise RuntimeError("No Anthropic API key — add one at /settings/models")
+
+    client = anthropic.AsyncAnthropic(api_key=api_key)
+
+    msgs = [{"role": m["role"], "content": m["content"]} for m in messages]
+    kwargs: dict = {
+        "model":      model_name,
+        "max_tokens": 4096,
+        "messages":   msgs,
+    }
+    if system_prompt:
+        kwargs["system"] = system_prompt
+
+    resp = await client.messages.create(**kwargs)
+
+    text = resp.content[0].text if resp.content else ""
+    if not text.strip():
+        raise RuntimeError("Anthropic API returned an empty response")
+
+    if resp.usage:
+        import usage_tracker
+        from persona import _user
+        asyncio.create_task(usage_tracker.record(
+            username=_user.get(),
+            backend="anthropic_api",
+            model_name=model_name,
+            prompt_tokens=resp.usage.input_tokens,
+            completion_tokens=resp.usage.output_tokens,
+        ))
+
    return text.strip()


--- a/cortex/main.py
+++ b/cortex/main.py
@@ -8,8 +8,8 @@ logging.basicConfig(level=logging.INFO, format="%(levelname)s:%(name)s: %(messag

 from config import settings
 from auth_middleware import SessionAuthMiddleware
-from routers import chat, google_chat, nextcloud_talk, files, distill, auth, orchestrator
-from routers import ui, onboarding, settings, help, auth_google, local_llm
+from routers import chat, google_chat, nextcloud_talk, homeassistant, files, distill, auth, orchestrator
+from routers import ui, onboarding, settings, tools_settings, help, auth_google, local_llm, push, audit, usage, crons


@asynccontextmanager
@@ -30,10 +30,14 @@ app.add_middleware(SessionAuthMiddleware)
 app.include_router(chat.router)
 app.include_router(google_chat.router)
 app.include_router(nextcloud_talk.router)
+app.include_router(homeassistant.router)
 app.include_router(files.router)
 app.include_router(distill.router)
 app.include_router(auth.router)
 app.include_router(orchestrator.router)
+app.include_router(push.router)
+app.include_router(audit.router)
+app.include_router(usage.router)

 # Static files — must be mounted BEFORE ui.router so /static/* is matched first.
 # ui.router has a wildcard /{username}/{persona} that would otherwise catch /static/style.css etc.
@@ -47,20 +51,23 @@ app.include_router(onboarding.router)

 # Account settings
 app.include_router(settings.router)
+app.include_router(tools_settings.router)
 app.include_router(local_llm.router)
+app.include_router(crons.router)

 # Help page
 app.include_router(help.router)

-# UI router (login + /{user}/{persona} — must be last to avoid swallowing API paths)
-app.include_router(ui.router)
-
-
+# Health check — must be before ui.router so /{username} catch-all doesn't swallow it.
@app.get("/health")
 async def health() -> dict:
    return {"status": "ok"}


+# UI router (login + /{user}/{persona} — must be last to avoid swallowing API paths)
+app.include_router(ui.router)
+
+
 if __name__ == "__main__":
    uvicorn.run(
        "main:app",
--- a/cortex/manage_passwords.py
+++ b/cortex/manage_passwords.py
@@ -170,6 +170,25 @@ def cmd_google_add(args):
    print(f"They can now sign in at {settings.cortex_base_url}/login using that Google account.")


+def cmd_role(args):
+    if len(args) < 2:
+        print("Usage: manage_passwords.py role <username> admin|user")
+        sys.exit(1)
+    username, role = args[0], args[1].lower().strip()
+    if role not in ("admin", "user"):
+        print("Role must be 'admin' or 'user'.")
+        sys.exit(1)
+    from auth_utils import _read_auth, _write_auth
+    data = _read_auth(username)
+    if not data:
+        print(f"User '{username}' not found — no auth.json.")
+        sys.exit(1)
+    old_role = data.get("role", "user")
+    data["role"] = role
+    _write_auth(username, data)
+    print(f"Role for '{username}': {old_role} → {role}")
+
+
 if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(__doc__)
@@ -190,6 +209,8 @@ if __name__ == "__main__":
        cmd_invite(rest)
    elif command == "google-add":
        cmd_google_add(rest)
+    elif command == "role":
+        cmd_role(rest)
    else:
        print(f"Unknown command: {command}")
        print(__doc__)
--- a/cortex/memory_distiller.py
+++ b/cortex/memory_distiller.py
@@ -1,9 +1,17 @@
 """
-Inara tiered memory distillation.
+Tiered memory distillation.

  distill_short()  — roll recent session logs → MEMORY_SHORT.md  (no LLM)
  distill_mid()    — summarize MEMORY_SHORT   → MEMORY_MID.md    (LLM)
  distill_long()   — integrate MEMORY_MID     → MEMORY_LONG.md   (LLM)
+
+Before any file is overwritten, two rolling backups are kept:
+  MEMORY_*.bak1.md — most recent backup  (created just before last write)
+  MEMORY_*.bak2.md — backup before that
+
+LLM responses are sanity-checked before writing. If the response looks like
+a refusal, is too short, or is obviously not memory content, the distill is
+aborted and the original file is left untouched.
 """
 import logging
 from datetime import datetime
@@ -16,6 +24,25 @@ logger = logging.getLogger(__name__)
 # Rough chars-per-token estimate for budget enforcement
 _CHARS_PER_TOKEN = 4

+# Phrases that indicate the LLM refused or misunderstood the task
+_REFUSAL_PREFIXES = (
+    "i'm sorry",
+    "i am sorry",
+    "i can't",
+    "i cannot",
+    "i'm unable",
+    "i am unable",
+    "as an ai",
+    "as a language model",
+    "i don't have access",
+    "i do not have access",
+    "i'm not able",
+    "i am not able",
+)
+
+# Minimum characters for a valid mid/long distill response
+_MIN_RESPONSE_CHARS = 80
+

 def _budget_chars(tokens: int) -> int:
    return tokens * _CHARS_PER_TOKEN
@@ -25,7 +52,62 @@ def _read(path: Path) -> str:
    return path.read_text() if path.exists() else ""


-def distill_short(username: str | None = None, persona: str | None = None) -> dict:
+def _rotate_backup(path: Path, n: int = 2) -> None:
+    """Rotate up to n rolling backups of path before a write.
+
+    MEMORY_LONG.md → MEMORY_LONG.bak1.md (most recent), MEMORY_LONG.bak2.md (older)
+    """
+    if not path.exists():
+        return
+    # Shift older backups down: bak(n-1) → bak(n), …, bak1 stays as bak1 source
+    for i in range(n, 1, -1):
+        older = path.parent / f"{path.stem}.bak{i}.md"
+        newer = path.parent / f"{path.stem}.bak{i - 1}.md"
+        if newer.exists():
+            older.write_text(newer.read_text())
+    # Current file → bak1
+    bak1 = path.parent / f"{path.stem}.bak1.md"
+    bak1.write_text(path.read_text())
+
+
+def _sanity_check(response_text: str, context: str, existing_content: str = "") -> str | None:
+    """Return an error string if the LLM response looks invalid, else None.
+
+    Checks:
+    - Minimum absolute length
+    - Refusal / AI preamble phrases
+    - Size shrinkage: new content must be at least 40% of the old (catches truncation)
+    - Size explosion: new content must not exceed 250% of the old (catches runaway output)
+      (Both bounds only apply when an existing file is present and reasonably sized.)
+    """
+    stripped = response_text.strip()
+    if len(stripped) < _MIN_RESPONSE_CHARS:
+        return f"{context}: response too short ({len(stripped)} chars) — not writing"
+
+    first_line = stripped.lower().splitlines()[0]
+    if any(first_line.startswith(p) for p in _REFUSAL_PREFIXES):
+        return f"{context}: response looks like a refusal — not writing"
+
+    if existing_content:
+        old_len = len(existing_content.strip())
+        new_len = len(stripped)
+        if old_len >= _MIN_RESPONSE_CHARS * 4:   # only compare when old file has real content
+            ratio = new_len / old_len
+            if ratio < 0.40:
+                return (
+                    f"{context}: new content is only {ratio:.0%} of the old "
+                    f"({new_len} vs {old_len} chars) — looks truncated, not writing"
+                )
+            if ratio > 2.50:
+                return (
+                    f"{context}: new content is {ratio:.0%} of the old "
+                    f"({new_len} vs {old_len} chars) — looks like runaway output, not writing"
+                )
+
+    return None
+
+
+def distill_short(username: str, persona: str) -> dict:
    """
    Roll the most recent session log files into MEMORY_SHORT.md.
    No LLM involved — pure aggregation with budget truncation.
@@ -64,8 +146,9 @@ def distill_short(username: str | None = None, persona: str | None = None) -> di
    )

    out_path = inara_dir / "MEMORY_SHORT.md"
+    _rotate_backup(out_path)
    out_path.write_text(header + body)
-    logger.info("distill_short: wrote %d chars from %d files", len(header) + len(body), len(parts))
+    logger.info("distill_short [%s/%s]: wrote %d chars from %d files", username, persona, len(header) + len(body), len(parts))

    return {
        "files_included": len(parts),
@@ -74,32 +157,34 @@ def distill_short(username: str | None = None, persona: str | None = None) -> di
    }


-async def distill_mid(username: str | None = None, persona: str | None = None) -> dict:
+async def distill_mid(username: str, persona: str) -> dict:
    """
    Ask the LLM to summarize MEMORY_SHORT.md → MEMORY_MID.md.
-    Uses DISTILL_BACKEND_MID if set (e.g. "local"), otherwise primary_backend.
+    Backs up the current MEMORY_MID.md before overwriting.
    """
    from llm_client import complete
    from persona import set_context

-    u = username or settings.user_name.lower()
-    p = persona or settings.agent_name.lower()
+    u, p = username, persona
    set_context(u, p)

    inara_dir = _persona_path(u, p)
    short_content = _read(inara_dir / "MEMORY_SHORT.md")
+    existing_mid = _read(inara_dir / "MEMORY_MID.md")

    if not short_content.strip() or "Not yet populated" in short_content:
        return {"error": "MEMORY_SHORT.md is empty — run distill/short first"}

    budget_tokens = settings.memory_budget_mid
+    persona_name = p.title()
+    user_name = u.title()
    system_prompt = (
-        f"You are {settings.agent_name}'s memory distillation system. "
+        f"You are {persona_name}'s memory distillation system. "
        "Summarize the following recent session logs into a concise mid-term memory digest. "
        f"Target length: under {budget_tokens} tokens. "
        "Focus on: recurring themes, important decisions made, ongoing projects, "
-        f"{settings.user_name}'s current state and priorities, and anything that should persist into future sessions. "
-        f"Write in first person as {settings.agent_name} (e.g. '{settings.user_name} and I worked on...'). "
+        f"{user_name}'s current state and priorities, and anything that should persist into future sessions. "
+        f"Write in first person as {persona_name} (e.g. '{user_name} and I worked on...'). "
        "Use markdown headings. Be specific and concrete — no filler."
    )

@@ -109,14 +194,20 @@ async def distill_mid(username: str | None = None, persona: str | None = None) -
        role="distill",
    )

+    err = _sanity_check(response_text, "distill_mid", existing_mid)
+    if err:
+        logger.warning(err)
+        return {"error": err}
+
    now = datetime.now().strftime("%Y-%m-%d %H:%M")
    header = (
        f"# MEMORY_MID.md — Mid-Term Memory Digest\n\n"
        f"*Auto-distilled: {now} via {backend}.*\n\n---\n\n"
    )
    out_path = inara_dir / "MEMORY_MID.md"
+    _rotate_backup(out_path)
    out_path.write_text(header + response_text)
-    logger.info("distill_mid: wrote %d chars via %s", len(header) + len(response_text), backend)
+    logger.info("distill_mid [%s/%s]: wrote %d chars via %s", u, p, len(header) + len(response_text), backend)

    return {
        "username": u,
@@ -126,16 +217,15 @@ async def distill_mid(username: str | None = None, persona: str | None = None) -
    }


-async def distill_long(username: str | None = None, persona: str | None = None) -> dict:
+async def distill_long(username: str, persona: str) -> dict:
    """
    Ask the LLM to integrate MEMORY_MID.md into MEMORY_LONG.md.
-    Uses DISTILL_BACKEND_LONG if set, otherwise primary_backend.
+    Backs up the current MEMORY_LONG.md before overwriting.
    """
    from llm_client import complete
    from persona import set_context

-    u = username or settings.user_name.lower()
-    p = persona or settings.agent_name.lower()
+    u, p = username, persona
    set_context(u, p)

    inara_dir = _persona_path(u, p)
@@ -146,8 +236,9 @@ async def distill_long(username: str | None = None, persona: str | None = None)
        return {"error": "MEMORY_MID.md is empty — run distill/mid first"}

    budget_tokens = settings.memory_budget_long
+    persona_name = p.title()
    system_prompt = (
-        f"You are {settings.agent_name}'s long-term memory curator. "
+        f"You are {persona_name}'s long-term memory curator. "
        "You will receive the current long-term memory and a recent mid-term digest. "
        f"Integrate the new information into the long-term memory. Target: under {budget_tokens} tokens. "
        "Rules: preserve important historical facts; update or replace stale information; "
@@ -166,18 +257,24 @@ async def distill_long(username: str | None = None, persona: str | None = None)
        role="distill",
    )

+    err = _sanity_check(response_text, "distill_long", long_content)
+    if err:
+        logger.warning(err)
+        return {"error": err}
+
    # Ensure the file has the right header if the LLM dropped it
    now = datetime.now().strftime("%Y-%m-%d %H:%M")
    if not response_text.lstrip().startswith("# MEMORY_LONG"):
        response_text = (
-            f"# MEMORY_LONG.md — {settings.agent_name} Long-Term Memory\n\n"
+            f"# MEMORY_LONG.md — {persona_name} Long-Term Memory\n\n"
            f"*Last distilled: {now} via {backend}.*\n\n---\n\n"
            + response_text
        )

    out_path = inara_dir / "MEMORY_LONG.md"
+    _rotate_backup(out_path)
    out_path.write_text(response_text)
-    logger.info("distill_long: wrote %d chars via %s", len(response_text), backend)
+    logger.info("distill_long [%s/%s]: wrote %d chars via %s", u, p, len(response_text), backend)

    return {
        "username": u,
--- a/cortex/model_registry.py
+++ b/cortex/model_registry.py
@@ -1,57 +1,74 @@
 """
-Per-user unified model registry.
+Per-user unified model registry — V2.

 Stored in: home/{user}/model_registry.json

-Schema:
+V2 Schema:
  {
-    "version": 1,
-    "hosts": [{"id", "label", "api_url", "api_key",
-               "host_type": "openwebui" | "openai"}, ...],
-    #
-    # host_type controls the API path layout:
-    #   "openwebui"  (default) — Open WebUI / Ollama:
-    #                   chat:   POST {url}/api/chat/completions
-    #                   models: GET  {url}/api/models
-    #   "openai"     — OpenRouter, LiteLLM, Anthropic-compatible, etc.:
-    #                   chat:   POST {url}/chat/completions
-    #                   models: GET  {url}/models
-    #   Set api_url to the base path that ends just before /chat/completions,
-    #   e.g. https://openrouter.ai/api/v1  for OpenRouter.
+    "version": 2,
+
+    # Per-provider accounts / credentials (user-configured)
+    "providers": {
+      "anthropic": {
+        "credentials": [
+          {"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}
+        ]
+      },
+      "google": {
+        "accounts": [
+          {"id": "<hex>", "label": "My Google account", "api_key": "AIza..."}
+        ]
+      }
+    },
+
+    # Local OpenAI-compatible hosts (unchanged from V1)
+    "hosts": [{"id", "label", "api_url", "api_key", "host_type"}, ...],
+
+    # User-registered model entries (all providers)
    "models": [
      {
-        "id":         str,             # unique within this registry
-        "type":       str,             # "local_openai" | "claude_cli" | "gemini_cli" | "gemini_api"
-        "label":      str,             # human-readable display name
-        "model_name": str,             # model identifier sent to the API
-        "host_id":    str | null,      # only for local_openai — references hosts[].id
-        "context_k":  int,             # context window in thousands of tokens (informational)
-        "tags":       [str],           # user-defined capability tags
+        "id":           str,            # unique within this registry
+        "type":         str,            # see TYPES below
+        "label":        str,            # human-readable
+        "model_name":   str,            # identifier sent to the API / CLI
+        "provider":     str | null,     # "anthropic" | "google" | "local" | null
+        "host_id":      str | null,     # local_openai only — references hosts[].id
+        "credential_id":str | null,     # claude_cli only — references providers.anthropic.credentials
+        "account_id":   str | null,     # gemini_api only — references providers.google.accounts
+        "context_k":    int,            # context window in k tokens (informational)
+        "max_rounds":   int | null,     # per-model tool-loop cap; null = use orchestrator_max_rounds global
+        "tags":         [str],          # user-defined capability tags
      },
    ],
+
+    # Role assignments — any model (any provider) can fill any role
    "roles": {
      "<role>": {
        "primary":  "<model_id>" | null,
        "backup_1": "<model_id>" | null,
-        "backup_2": "<model_id>" | null,
-        "backup_3": "<model_id>" | null,
+        ...
        "backup_4": "<model_id>" | null,
      },
    },
  }

-Built-in model IDs (always resolvable, no registry entry required):
-  "claude_cli"  — Claude CLI subprocess (~/.claude/.credentials.json)
-  "gemini_cli"  — Gemini CLI subprocess
-  "gemini_api"  — Gemini API (google-genai SDK; used by orchestrator engine, not llm_client)
+Types:
+  "claude_cli"    — Claude CLI subprocess (~/.claude/.credentials.json)
+  "gemini_cli"    — Gemini CLI subprocess
+  "gemini_api"    — Gemini API (google-genai SDK); account_id → api_key from providers.google
+  "local_openai"  — OpenAI-compatible endpoint; host_id → api_url/api_key from hosts[]
+  "anthropic_api" — Anthropic SDK direct; credential_id → api_key from providers.anthropic.credentials

-Standard roles are defined by settings.defined_roles (default: chat,orchestrator,distill,coder,research).
-Additional custom roles can be added freely to roles{}.
+Built-in model IDs (always resolvable without a registry entry):
+  "claude_cli"  — resolves to the default Claude CLI model
+  "gemini_cli"  — resolves to Gemini CLI
+  "gemini_api"  — resolves to Gemini API using GEMINI_API_KEY from .env

-Resolution for get_model_for_role(username, role):
-  1. User registry: roles[role].primary → backup_1 → backup_2 → backup_3 → backup_4
-  2. .env default: ROLE_<ROLE>=<builtin_id>  (e.g. ROLE_CHAT=claude_cli)
+Role resolution for get_model_for_role(username, role):
+  1. User registry: roles[role].primary → backup_1 → ... → backup_4
+  2. .env default: ROLE_<ROLE>=<builtin_id>
  3. Hardcoded last-resort defaults per role
+  4. claude_cli (absolute fallback)
 """

 import json
@@ -63,11 +80,66 @@ from config import settings

 logger = logging.getLogger(__name__)

+
+# ── Role-level tool defaults ───────────────────────────────────────────────────
+# Applied when a user hasn't configured a custom tool list for a role.
+# None = no restriction (all accessible tools); [] = no tools (pure text processing).
+# "chat" is intentionally absent: the /chat endpoint never sends tool schemas anyway,
+# and the orchestrator uses chat_role="chat" as its default — restricting it here
+# would block all tools from every default orchestration request.
+# "orchestrator" is intentionally absent — Phase 2 keyword routing narrows it per message.
+ROLE_DEFAULT_TOOLS: dict[str, list[str] | None] = {
+    "distill":  [],    # pure text processing — no tools needed
+    "research": ["web_search", "web_read", "http_fetch"],
+    "coder": [
+        "project_file_read", "project_file_list", "file_stat", "file_grep",
+        "file_diff", "file_syntax_check", "file_read", "file_list", "file_write",
+        "git_status", "git_log", "git_diff", "shell_exec",
+    ],
+}
+
+
+# ── Provider model catalogs ───────────────────────────────────────────────────
+# Server-side defaults. Update here when providers release new models.
+# Users can add entries via the settings UI (Phase 2).
+
+ANTHROPIC_CATALOG: list[dict] = [
+    # Latest
+    {"id": "claude-opus-4-7",           "label": "Claude Opus 4.7",    "context_k": 1000},
+    {"id": "claude-sonnet-4-6",         "label": "Claude Sonnet 4.6",  "context_k": 1000},
+    {"id": "claude-haiku-4-5-20251001", "label": "Claude Haiku 4.5",   "context_k": 200},
+    # Previous versions (still available, not deprecated)
+    {"id": "claude-opus-4-6",           "label": "Claude Opus 4.6",    "context_k": 1000},
+    {"id": "claude-sonnet-4-5",         "label": "Claude Sonnet 4.5",  "context_k": 200},
+]
+
+GOOGLE_CATALOG: list[dict] = [
+    # Stable / generally available
+    {"id": "gemini-2.5-pro",                  "label": "Gemini 2.5 Pro",                  "context_k": 1000},
+    {"id": "gemini-2.5-flash",                "label": "Gemini 2.5 Flash",                "context_k": 1000},
+    {"id": "gemini-2.5-flash-lite",           "label": "Gemini 2.5 Flash-Lite",           "context_k": 1000},
+    # Preview
+    {"id": "gemini-3.1-pro-preview",          "label": "Gemini 3.1 Pro (preview)",        "context_k": 1000},
+    {"id": "gemini-3-flash-preview",          "label": "Gemini 3 Flash (preview)",        "context_k": 1000},
+    {"id": "gemini-3.1-flash-lite-preview",   "label": "Gemini 3.1 Flash-Lite (preview)", "context_k": 1000},
+]
+
+# Known OpenAI-compatible cloud inference services.
+# All use host_type "openai" (/chat/completions + /models paths).
+CLOUD_API_CATALOG: list[dict] = [
+    {"id": "openrouter",  "label": "OpenRouter",    "api_url": "https://openrouter.ai/api/v1"},
+    {"id": "openai",      "label": "OpenAI",         "api_url": "https://api.openai.com/v1"},
+    {"id": "groq",        "label": "Groq",            "api_url": "https://api.groq.com/openai/v1"},
+    {"id": "xai",         "label": "X.ai / Grok",    "api_url": "https://api.x.ai/v1"},
+    {"id": "together",    "label": "Together.ai",    "api_url": "https://api.together.xyz/v1"},
+    {"id": "fireworks",   "label": "Fireworks.ai",   "api_url": "https://api.fireworks.ai/inference/v1"},
+    {"id": "custom",      "label": "Custom",          "api_url": ""},
+]
+
+
 # ── Built-in model definitions ────────────────────────────────────────────────
-# These IDs are always resolvable without a registry entry.

 def _builtins() -> dict[str, dict]:
-    """Return built-in model definitions (lazy so settings are resolved at call time)."""
    return {
        "claude_cli": {
            "id":         "claude_cli",
@@ -96,7 +168,6 @@ def _builtins() -> dict[str, dict]:
    }


-# Hardcoded last-resort defaults per role (used only if .env is also unset)
 _ROLE_LAST_RESORT: dict[str, str] = {
    "chat":         "claude_cli",
    "orchestrator": "gemini_api",
@@ -107,6 +178,8 @@ _ROLE_LAST_RESORT: dict[str, str] = {

 PRIORITY_KEYS = ["primary", "backup_1", "backup_2", "backup_3", "backup_4"]

+REQUIRED_ROLES: list[str] = ["chat", "orchestrator", "distill"]
+

 # ── Storage ───────────────────────────────────────────────────────────────────

@@ -118,14 +191,41 @@ def _local_llm_path(username: str) -> Path:
    return settings.home_root() / username / "local_llm.json"


+def _auth_path(username: str) -> Path:
+    return settings.home_root() / username / "auth.json"
+
+
 def _empty() -> dict:
-    return {"version": 1, "hosts": [], "models": [], "roles": {}}
+    return {
+        "version":   2,
+        "providers": _default_providers(),
+        "hosts":     [],
+        "models":    [],
+        "roles":     {},
+    }
+
+
+def _default_providers() -> dict:
+    return {
+        "anthropic": {
+            "credentials": [
+                {"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}
+            ]
+        },
+        "google": {
+            "accounts": []
+        },
+    }


 def _normalize(data: dict) -> dict:
-    """Back-fill any missing fields introduced by schema additions."""
+    """Back-fill missing fields introduced by schema additions."""
    for h in data.get("hosts", []):
        h.setdefault("host_type", "openwebui")
+        h.setdefault("max_concurrent", 3)
+    data.setdefault("providers", _default_providers())
+    data["providers"].setdefault("anthropic", {"credentials": [{"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}]})
+    data["providers"].setdefault("google", {"accounts": []})
    return data


@@ -135,12 +235,15 @@ def _load(username: str) -> dict:
        try:
            data = json.loads(path.read_text())
            if isinstance(data, dict) and "version" in data:
+                if data["version"] == 1:
+                    data = _migrate_v1_to_v2(username, data)
+                    _save(username, data)
                return _normalize(data)
        except (json.JSONDecodeError, OSError):
            logger.warning("model_registry.json for %s is unreadable — starting fresh", username)
        return _empty()

-    # No registry yet — try migrating from local_llm.json
+    # No registry — try migrating from local_llm.json
    legacy = _local_llm_path(username)
    if legacy.exists():
        data = _migrate_from_local_llm(username, legacy)
@@ -157,8 +260,45 @@ def _save(username: str, data: dict) -> None:

 # ── Migration ─────────────────────────────────────────────────────────────────

+def _migrate_v1_to_v2(username: str, data: dict) -> dict:
+    """
+    Upgrade a V1 registry to V2.
+
+    Changes:
+    - Adds providers section with default structure
+    - Migrates gemini_api_key from auth.json → first Google account entry
+    - Sets version to 2
+    """
+    logger.info("Migrating model_registry.json V1 → V2 for %s", username)
+
+    data["version"] = 2
+    if "providers" not in data:
+        data["providers"] = _default_providers()
+    else:
+        data["providers"].setdefault("anthropic", {"credentials": [{"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}]})
+        data["providers"].setdefault("google", {"accounts": []})
+
+    # Pull existing Gemini key from auth.json (stored there in V1) → first account entry
+    accounts = data["providers"]["google"]["accounts"]
+    if not accounts:
+        try:
+            auth = json.loads(_auth_path(username).read_text())
+            existing_key = auth.get("gemini_api_key")
+            if existing_key:
+                accounts.append({
+                    "id":      secrets.token_hex(4),
+                    "label":   "Gemini API Key",
+                    "api_key": existing_key,
+                })
+                logger.info("Migrated gemini_api_key from auth.json → providers.google.accounts for %s", username)
+        except (OSError, json.JSONDecodeError):
+            pass
+
+    return data
+
+
 def _migrate_from_local_llm(username: str, path: Path) -> dict:
-    """Convert local_llm.json (hosts/models/active_model_id) → model_registry format."""
+    """Convert local_llm.json → V2 model_registry format."""
    try:
        old = json.loads(path.read_text())
    except Exception:
@@ -190,30 +330,27 @@ def _migrate_from_local_llm(username: str, path: Path) -> dict:
            "type":       "local_openai",
            "label":      m.get("label") or m.get("model_name", ""),
            "model_name": m.get("model_name", ""),
+            "provider":   "local",
            "host_id":    m.get("host_id"),
            "context_k":  0,
            "tags":       [],
        })

-    # Build initial role assignments
    active_id = old.get("active_model_id")
-    distill_type = settings.distill_backend_mid or None
-
-    roles: dict[str, dict] = {}
    if active_id and any(m["id"] == active_id for m in data["models"]):
-        roles["chat"] = {"primary": active_id}
+        data["roles"]["chat"] = {"primary": active_id}
+        if settings.distill_backend_mid == "local":
+            data["roles"]["distill"] = {"primary": active_id}

-    if distill_type == "local" and active_id:
-        roles["distill"] = {"primary": active_id}
-
-    data["roles"] = roles
+    # Migrate Gemini key from auth.json
+    data = _migrate_v1_to_v2(username, {"version": 1, **data})
    return data


 # ── Model resolution ──────────────────────────────────────────────────────────

 def _resolve_model(registry: dict, model_id: str) -> dict | None:
-    """Resolve a model_id to its full config dict, or None if not found."""
+    """Resolve a model_id to its full config dict (credentials merged in), or None."""
    builtins = _builtins()

    # Built-in IDs take priority over user-defined entries with the same ID
@@ -224,7 +361,9 @@ def _resolve_model(registry: dict, model_id: str) -> dict | None:
    if not model:
        return None

-    if model.get("type") == "local_openai":
+    model_type = model.get("type")
+
+    if model_type == "local_openai":
        host_id = model.get("host_id")
        host = next((h for h in registry.get("hosts", []) if h["id"] == host_id), None)
        if not host:
@@ -237,6 +376,29 @@ def _resolve_model(registry: dict, model_id: str) -> dict | None:
            "host_type": host.get("host_type", "openwebui"),
        }

+    if model_type == "gemini_api":
+        account_id = model.get("account_id")
+        if account_id:
+            accounts = registry.get("providers", {}).get("google", {}).get("accounts", [])
+            account = next((a for a in accounts if a["id"] == account_id), None)
+            if account:
+                return {**model, "api_key": account.get("api_key", "")}
+            logger.warning("model %s references missing account_id %s", model_id, account_id)
+        return dict(model)
+
+    if model_type == "anthropic_api":
+        credential_id = model.get("credential_id")
+        if credential_id:
+            creds = registry.get("providers", {}).get("anthropic", {}).get("credentials", [])
+            cred = next((c for c in creds if c["id"] == credential_id), None)
+            if cred and cred.get("api_key"):
+                return {**model, "api_key": cred["api_key"]}
+            logger.warning("model %s references missing/keyless credential_id %s", model_id, credential_id)
+        return dict(model)
+
+    if model_type == "claude_cli":
+        return dict(model)
+
    return dict(model)


@@ -277,7 +439,6 @@ def get_best_local_model(username: str, role: str = "chat") -> dict | None:
    """
    Return the best available local_openai model for the given role.
    Used when the user explicitly selects "local" backend in the UI.
-    Tries the role's priority chain first, then any configured local model.
    """
    registry = _load(username)
    role_cfg = registry.get("roles", {}).get(role, {})
@@ -290,7 +451,6 @@ def get_best_local_model(username: str, role: str = "chat") -> dict | None:
        if resolved and resolved.get("type") == "local_openai":
            return resolved

-    # Fall back to first configured local model
    for model in registry.get("models", []):
        if model.get("type") == "local_openai":
            resolved = _resolve_model(registry, model["id"])
@@ -300,15 +460,110 @@ def get_best_local_model(username: str, role: str = "chat") -> dict | None:
    return None


-# ── Read API (for UI and callers) ─────────────────────────────────────────────
+def set_role_config(
+    username: str,
+    role: str,
+    system_append: str,
+    tools: list[str] | None,
+    inject_datetime: bool = True,
+    inject_mode: bool = True,
+) -> None:
+    """Save system_append, tools allow-list, and per-injection flags for a role.
+
+    tools=None clears the allow-list (role uses all accessible tools).
+    inject_datetime=False suppresses the date/time header for pure processing roles.
+    inject_mode=False suppresses the session mode (OTR) line for pure processing roles.
+    """
+    data = _load(username)
+    roles = data.setdefault("roles", {})
+    if role not in roles:
+        roles[role] = {}
+    roles[role]["system_append"] = system_append.strip()
+    roles[role]["inject_datetime"] = inject_datetime
+    roles[role]["inject_mode"] = inject_mode
+    if tools is None:
+        roles[role].pop("tools", None)
+    else:
+        roles[role]["tools"] = [t for t in tools if t]
+    _save(username, data)
+
+
+def get_role_config(username: str, role: str) -> dict:
+    """
+    Return supplemental config for a role: system_append, tools, and injection flags.
+
+    All keys are optional in the registry — missing means "use defaults":
+      system_append: str   — appended to the system prompt for this role
+      tools: list[str] | None — explicit tool allow-list (None = no restriction)
+      inject_datetime: bool — whether to inject current date/time (default True)
+      inject_mode: bool — whether to inject session mode (OTR) line (default True)
+    """
+    registry = _load(username)
+    role_cfg = registry.get("roles", {}).get(role, {})
+    user_tools = role_cfg.get("tools")
+    if user_tools is None:
+        # No user-configured list — fall back to system defaults for this role
+        effective_tools: list[str] | None = ROLE_DEFAULT_TOOLS.get(role)
+    else:
+        # User has configured tools; preserve their setting (empty list → no restriction)
+        effective_tools = user_tools or None
+    return {
+        "system_append":  role_cfg.get("system_append", ""),
+        "tools":          effective_tools,
+        "inject_datetime": role_cfg.get("inject_datetime", True),
+        "inject_mode":    role_cfg.get("inject_mode", True),
+    }
+
+
+def get_model_for_slot(username: str, role: str, slot: str) -> dict | None:
+    """
+    Resolve a single named priority slot from a role without walking the fallback chain.
+
+    Used by Phase 3 explicit slot selection — the user has pinned a specific model;
+    don't silently redirect to another slot if this one is empty or broken.
+    Returns None if the slot is unset or the model can't be resolved.
+    """
+    if slot not in PRIORITY_KEYS:
+        return None
+    registry = _load(username)
+    model_id = registry.get("roles", {}).get(role, {}).get(slot)
+    if not model_id:
+        return None
+    return _resolve_model(registry, model_id)
+
+
+def get_google_api_key(username: str, account_id: str | None = None) -> str | None:
+    """
+    Return the best available Gemini API key for the user.
+
+    If account_id is specified, returns that account's key (or None if not found).
+    Otherwise returns the first configured account key, falling back to the
+    server-level GEMINI_API_KEY from .env.
+    """
+    registry = _load(username)
+    accounts = registry.get("providers", {}).get("google", {}).get("accounts", [])
+
+    if account_id:
+        account = next((a for a in accounts if a["id"] == account_id), None)
+        return account.get("api_key") if account else None
+
+    # First configured account
+    if accounts:
+        return accounts[0].get("api_key") or None
+
+    # Fall back to .env server key
+    return settings.gemini_api_key or None
+
+
+# ── Read API ──────────────────────────────────────────────────────────────────

 def get_registry(username: str) -> dict:
-    """Return the full registry (with built-in models injected for display)."""
+    """Return the full registry (providers + hosts + models + roles)."""
    return _load(username)


 def get_all_models(username: str) -> list[dict]:
-    """Return all user-defined models (resolved — hosts merged in)."""
+    """Return all user-defined models (resolved — credentials/hosts merged in)."""
    registry = _load(username)
    out = []
    for m in registry.get("models", []):
@@ -319,96 +574,307 @@ def get_all_models(username: str) -> list[dict]:


 def get_defined_roles(username: str) -> dict[str, dict]:
-    """Return the roles section of the registry, filling gaps with empty dicts."""
+    """Return the roles section, filling gaps with empty dicts."""
    registry = _load(username)
    roles = registry.get("roles", {})
-    result = {}
-    for role in settings.get_defined_roles():
-        result[role] = roles.get(role, {})
-    return result
+    return {role: roles.get(role, {}) for role in settings.get_defined_roles()}


-# ── Write API (CRUD) ──────────────────────────────────────────────────────────
+def get_google_accounts(username: str) -> list[dict]:
+    """Return Google account entries (api_key masked for display)."""
+    registry = _load(username)
+    accounts = registry.get("providers", {}).get("google", {}).get("accounts", [])
+    return [
+        {
+            "id":    a["id"],
+            "label": a.get("label", ""),
+            "hint":  (a.get("api_key") or "")[:8] + "…" if a.get("api_key") else "",
+        }
+        for a in accounts
+    ]
+
+
+def get_catalog(provider: str, username: str | None = None) -> list[dict]:
+    """
+    Return the model catalog for a provider.
+
+    For now returns server defaults. Phase 2 will merge in per-user additions.
+    """
+    if provider == "anthropic":
+        return list(ANTHROPIC_CATALOG)
+    if provider == "google":
+        return list(GOOGLE_CATALOG)
+    if provider == "cloud":
+        return list(CLOUD_API_CATALOG)
+    return []
+
+
+# ── Write API — Google accounts ───────────────────────────────────────────────
+
+def save_google_account(username: str, account_id: str | None,
+                        label: str, api_key: str) -> str:
+    """Create or update a Google account entry. Returns the account ID."""
+    data = _load(username)
+    accounts = data["providers"]["google"]["accounts"]
+
+    if account_id:
+        for a in accounts:
+            if a["id"] == account_id:
+                a["label"] = label.strip()
+                if api_key.strip():
+                    a["api_key"] = api_key.strip()
+                _save(username, data)
+                return account_id
+
+    account_id = secrets.token_hex(4)
+    accounts.append({
+        "id":      account_id,
+        "label":   label.strip(),
+        "api_key": api_key.strip(),
+    })
+    _save(username, data)
+    return account_id
+
+
+def remove_google_account(username: str, account_id: str) -> bool:
+    """Remove a Google account. Clears any model entries that reference it."""
+    data = _load(username)
+    accounts = data["providers"]["google"]["accounts"]
+    before = len(accounts)
+    data["providers"]["google"]["accounts"] = [a for a in accounts if a["id"] != account_id]
+
+    # Clear role assignments for models that referenced this account
+    removed_model_ids = {
+        m["id"] for m in data.get("models", [])
+        if m.get("account_id") == account_id
+    }
+    data["models"] = [m for m in data.get("models", []) if m["id"] not in removed_model_ids]
+    for role_cfg in data.get("roles", {}).values():
+        for key in PRIORITY_KEYS:
+            if role_cfg.get(key) in removed_model_ids:
+                role_cfg[key] = None
+
+    _save(username, data)
+    return len(data["providers"]["google"]["accounts"]) < before
+
+
+# ── Write API — Anthropic API keys ───────────────────────────────────────────
+
+def get_anthropic_api_keys(username: str) -> list[dict]:
+    """Return Anthropic API key credentials (type='api_key') with key masked for display."""
+    registry = _load(username)
+    creds = registry.get("providers", {}).get("anthropic", {}).get("credentials", [])
+    return [
+        {
+            "id":    c["id"],
+            "label": c.get("label", ""),
+            "hint":  (c.get("api_key") or "")[:8] + "…" if c.get("api_key") else "no key",
+        }
+        for c in creds
+        if c.get("type") == "api_key"
+    ]
+
+
+def save_anthropic_api_key(username: str, key_id: str | None,
+                           label: str, api_key: str) -> str:
+    """Create or update an Anthropic API key credential. Returns the credential ID."""
+    data = _load(username)
+    creds = data["providers"]["anthropic"]["credentials"]
+
+    if key_id:
+        for c in creds:
+            if c["id"] == key_id and c.get("type") == "api_key":
+                c["label"] = label.strip() or c.get("label", "API Key")
+                if api_key.strip():
+                    c["api_key"] = api_key.strip()
+                _save(username, data)
+                return key_id
+
+    key_id = secrets.token_hex(4)
+    creds.append({
+        "id":      key_id,
+        "label":   label.strip() or "API Key",
+        "type":    "api_key",
+        "api_key": api_key.strip(),
+    })
+    _save(username, data)
+    return key_id
+
+
+def remove_anthropic_api_key(username: str, key_id: str) -> bool:
+    """Remove an Anthropic API key credential. Clears model entries that reference it."""
+    data = _load(username)
+    creds = data["providers"]["anthropic"]["credentials"]
+    before = len(creds)
+    data["providers"]["anthropic"]["credentials"] = [
+        c for c in creds if c["id"] != key_id
+    ]
+
+    removed_model_ids = {
+        m["id"] for m in data.get("models", [])
+        if m.get("credential_id") == key_id
+    }
+    data["models"] = [m for m in data.get("models", []) if m["id"] not in removed_model_ids]
+    for role_cfg in data.get("roles", {}).values():
+        for key in PRIORITY_KEYS:
+            if role_cfg.get(key) in removed_model_ids:
+                role_cfg[key] = None
+
+    _save(username, data)
+    return len(data["providers"]["anthropic"]["credentials"]) < before
+
+
+# ── Write API — Hosts ─────────────────────────────────────────────────────────

 def save_host(username: str, host_id: str | None,
              label: str, api_url: str, api_key: str,
-              host_type: str = "openwebui") -> str:
-    """Create or update a host. Returns the host ID.
-
-    host_type: "openwebui" (default) or "openai" (OpenRouter, LiteLLM, etc.)
-    """
+              host_type: str = "openwebui",
+              max_concurrent: int = 3) -> str:
+    """Create or update a host. Returns the host ID."""
    data = _load(username)
    host_type = host_type if host_type in ("openwebui", "openai") else "openwebui"
+    max_concurrent = max(1, min(int(max_concurrent), 20))

    if host_id:
        for h in data["hosts"]:
            if h["id"] == host_id:
-                h["label"]     = label.strip()
-                h["api_url"]   = api_url.strip()
-                h["host_type"] = host_type
+                h["label"]          = label.strip()
+                h["api_url"]        = api_url.strip()
+                h["host_type"]      = host_type
+                h["max_concurrent"] = max_concurrent
                if api_key.strip():
                    h["api_key"] = api_key.strip()
                _save(username, data)
                return host_id
-        host_id = None  # not found — create new
+        host_id = None

    host_id = secrets.token_hex(4)
    data["hosts"].append({
-        "id":        host_id,
-        "label":     label.strip(),
-        "api_url":   api_url.strip(),
-        "api_key":   api_key.strip(),
-        "host_type": host_type,
+        "id":            host_id,
+        "label":         label.strip(),
+        "api_url":       api_url.strip(),
+        "api_key":       api_key.strip(),
+        "host_type":     host_type,
+        "max_concurrent": max_concurrent,
    })
    _save(username, data)
    return host_id


 def remove_host(username: str, host_id: str) -> bool:
-    """Remove a host and all models that reference it. Returns True if found."""
+    """Remove a host and all models that reference it."""
    data = _load(username)
    before = len(data["hosts"])
-    data["hosts"] = [h for h in data["hosts"] if h["id"] != host_id]
-    data["models"] = [m for m in data["models"] if m.get("host_id") != host_id]
-    # Clear any role assignments that pointed to removed models
-    removed_ids = {m["id"] for m in data["models"] if m.get("host_id") == host_id}
+    removed_model_ids = {m["id"] for m in data["models"] if m.get("host_id") == host_id}
+    data["hosts"]  = [h for h in data["hosts"]  if h["id"] != host_id]
+    data["models"] = [m for m in data["models"]  if m.get("host_id") != host_id]
    for role_cfg in data.get("roles", {}).values():
        for key in PRIORITY_KEYS:
-            if role_cfg.get(key) in removed_ids:
+            if role_cfg.get(key) in removed_model_ids:
                role_cfg[key] = None
    _save(username, data)
    return len(data["hosts"]) < before


+# ── Write API — Models ────────────────────────────────────────────────────────
+
 def save_model(username: str, model_id: str | None, host_id: str,
               label: str, model_name: str, context_k: int = 0,
-               tags: list[str] | None = None) -> str:
-    """Create or update a model entry. Returns the model ID."""
+               tags: list[str] | None = None,
+               max_rounds: int | None = None,
+               tools: bool = True,
+               reasoning_budget_tokens: int | None = None) -> str:
+    """Create or update a local_openai model entry. Returns the model ID."""
    data = _load(username)
    tags = tags or []

    if model_id:
        for m in data["models"]:
            if m["id"] == model_id:
-                m["host_id"]    = host_id
-                m["label"]      = label.strip() or model_name.strip()
-                m["model_name"] = model_name.strip()
-                m["context_k"]  = context_k
-                m["tags"]       = tags
+                m["host_id"]                = host_id
+                m["label"]                  = label.strip() or model_name.strip()
+                m["model_name"]             = model_name.strip()
+                m["context_k"]              = context_k
+                m["max_rounds"]             = max_rounds
+                m["tools"]                  = tools
+                m["tags"]                   = tags
+                m["reasoning_budget_tokens"] = reasoning_budget_tokens
                _save(username, data)
                return model_id
        model_id = None

    model_id = secrets.token_hex(4)
    data["models"].append({
-        "id":         model_id,
-        "type":       "local_openai",
+        "id":                     model_id,
+        "type":                   "local_openai",
+        "label":                  label.strip() or model_name.strip(),
+        "model_name":             model_name.strip(),
+        "provider":               "local",
+        "host_id":                host_id,
+        "context_k":              context_k,
+        "max_rounds":             max_rounds,
+        "tools":                  tools,
+        "tags":                   tags,
+        "reasoning_budget_tokens": reasoning_budget_tokens,
+    })
+    _save(username, data)
+    return model_id
+
+
+def save_cloud_model(username: str, model_id: str | None,
+                     provider: str, model_name: str, label: str,
+                     account_id: str | None = None,
+                     credential_id: str | None = None,
+                     context_k: int = 0,
+                     tags: list[str] | None = None,
+                     max_rounds: int | None = None,
+                     tools: bool = True) -> str:
+    """
+    Create or update an Anthropic or Google model entry. Returns the model ID.
+
+    provider: "anthropic" | "google"
+    account_id:    Google only — references providers.google.accounts[].id
+    credential_id: Anthropic only — "cli" for OAuth CLI, or a hex ID for an API key credential
+    """
+    data = _load(username)
+
+    # Determine model type from credential (anthropic only)
+    if provider == "anthropic":
+        creds = data.get("providers", {}).get("anthropic", {}).get("credentials", [])
+        cred = next((c for c in creds if c["id"] == credential_id), None) if credential_id else None
+        entry_type = "anthropic_api" if (cred and cred.get("type") == "api_key") else "claude_cli"
+    elif provider == "google":
+        entry_type = "gemini_api"
+    else:
+        entry_type = "claude_cli"
+    tags = tags or []
+
+    entry: dict = {
+        "type":       entry_type,
        "label":      label.strip() or model_name.strip(),
        "model_name": model_name.strip(),
-        "host_id":    host_id,
+        "provider":   provider,
        "context_k":  context_k,
+        "max_rounds": max_rounds,
+        "tools":      tools,
        "tags":       tags,
-    })
+    }
+    if account_id:
+        entry["account_id"] = account_id
+    if credential_id:
+        entry["credential_id"] = credential_id
+
+    if model_id:
+        for m in data["models"]:
+            if m["id"] == model_id:
+                m.update(entry)
+                _save(username, data)
+                return model_id
+        model_id = None
+
+    model_id = secrets.token_hex(4)
+    entry["id"] = model_id
+    data["models"].append(entry)
    _save(username, data)
    return model_id

@@ -418,24 +884,67 @@ def remove_model(username: str, model_id: str) -> bool:
    data = _load(username)
    before = len(data["models"])
    data["models"] = [m for m in data["models"] if m["id"] != model_id]
-
    for role_cfg in data.get("roles", {}).values():
        for key in PRIORITY_KEYS:
            if role_cfg.get(key) == model_id:
                role_cfg[key] = None
-
    _save(username, data)
    return len(data["models"]) < before


+def get_custom_roles(username: str) -> list[str]:
+    """
+    Return the user's custom (non-required) roles.
+    Falls back to config-defined roles minus required ones for migration.
+    """
+    registry = _load(username)
+    if "custom_roles" in registry:
+        return [r for r in registry["custom_roles"] if r and r not in REQUIRED_ROLES]
+    from config import settings as _cfg
+    return [r for r in _cfg.get_defined_roles() if r not in REQUIRED_ROLES]
+
+
+def get_all_roles(username: str) -> list[str]:
+    """Return required roles followed by the user's custom roles."""
+    return list(REQUIRED_ROLES) + get_custom_roles(username)
+
+
+def add_custom_role(username: str, role_name: str) -> bool:
+    """Add a custom role. Returns False if the name is invalid or already a required role."""
+    role_name = role_name.strip().lower()
+    if not role_name or role_name in REQUIRED_ROLES:
+        return False
+    data = _load(username)
+    if "custom_roles" not in data:
+        from config import settings as _cfg
+        data["custom_roles"] = [r for r in _cfg.get_defined_roles() if r not in REQUIRED_ROLES]
+    if role_name not in data["custom_roles"]:
+        data["custom_roles"].append(role_name)
+        _save(username, data)
+    return True
+
+
+def remove_custom_role(username: str, role_name: str) -> bool:
+    """Remove a custom role. Required roles cannot be removed."""
+    if role_name in REQUIRED_ROLES:
+        return False
+    data = _load(username)
+    if "custom_roles" not in data:
+        from config import settings as _cfg
+        data["custom_roles"] = [r for r in _cfg.get_defined_roles() if r not in REQUIRED_ROLES]
+    if role_name in data["custom_roles"]:
+        data["custom_roles"].remove(role_name)
+        _save(username, data)
+    return True
+
+
 def set_role(username: str, role: str, priority: str, model_id: str | None) -> bool:
    """
    Assign a model to a role priority slot.

    priority must be one of: primary, backup_1, backup_2, backup_3, backup_4
    model_id None clears the slot.
-    model_id "claude_cli" / "gemini_cli" / "gemini_api" are valid built-in IDs.
-    Returns False if model_id is set but not found.
+    Built-in IDs (claude_cli, gemini_cli, gemini_api) are always valid.
    """
    if priority not in PRIORITY_KEYS:
        return False
@@ -455,10 +964,14 @@ def set_role(username: str, role: str, priority: str, model_id: str | None) -> b
    return True


-def fetch_models_from_host(api_url: str, api_key: str) -> list[str]:
+# ── Utility ───────────────────────────────────────────────────────────────────
+
+def fetch_models_from_host(api_url: str, api_key: str,
+                           host_type: str = "openwebui") -> list[str]:
    """Synchronously fetch the model list from an OpenAI-compatible host."""
    import httpx
-    url = api_url.rstrip("/") + "/api/models"
+    path = "/api/models" if host_type == "openwebui" else "/models"
+    url = api_url.rstrip("/") + path
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    resp = httpx.get(url, headers=headers, timeout=10)
    resp.raise_for_status()
--- a/cortex/notification.py
+++ b/cortex/notification.py
@@ -1,22 +1,21 @@
 """
 Outbound notification helpers — send messages to user channels proactively.

-Channel config lives in home/{user}/channels.json.
-Each channel that supports proactive notifications needs a notification_channel
-set to its key name (e.g. "nextcloud", "google_chat") in the user's channels.json:
+Channel config lives in home/{user}/channels.json:
  {
-    "notification_channel": "nextcloud",
+    "notification_channel": "email" | "nextcloud" | "google_chat",
+    "notification_email":   "<override address — defaults to login email>",
    "nextcloud": {
-      "url": "https://cloud.example.com",
-      "bot_secret": "...",
-      "notification_room": "<room-token>",
-      ...
+      "url": "...", "bot_secret": "...", "notification_room": "<token>", ...
+    },
+    "google_chat": {
+      "outbound_webhook": "https://chat.googleapis.com/v1/spaces/...", ...
    }
  }

 If notification_channel is absent, defaults to "nextcloud" if configured.
-If notification_room (for NCT) is absent, notifications are silently skipped.
 """
+import asyncio
 import hashlib
 import hmac
 import json
@@ -73,34 +72,105 @@ async def _notify_nct(nct: dict, message: str, username: str) -> None:
    await _send_nct_message(url, secret, room, message)


+async def _notify_email(username: str, message: str, email_override: str | None = None) -> None:
+    """Send notification via email. Address = override → google_email from auth.json."""
+    from auth_utils import _read_auth
+    from email_utils import send_email
+
+    to_addr = email_override or _read_auth(username).get("google_email", "").strip()
+    if not to_addr:
+        logger.warning("notify: no email address for %s — set notification_email in channels.json", username)
+        return
+
+    ok = await asyncio.to_thread(
+        send_email,
+        to_email=to_addr,
+        subject="Cortex",
+        body_text=message,
+        body_html=message.replace("\n", "<br>"),
+    )
+    if ok:
+        logger.info("notify email → %s (%d chars)", to_addr, len(message))
+    else:
+        logger.warning("notify: email send failed for %s", username)
+
+
+async def _notify_google_chat(webhook_url: str, message: str, username: str) -> None:
+    """POST a message to a Google Chat space via incoming webhook."""
+    body = json.dumps({"text": message}, ensure_ascii=False).encode("utf-8")
+    try:
+        async with httpx.AsyncClient() as client:
+            resp = await client.post(
+                webhook_url,
+                content=body,
+                headers={"Content-Type": "application/json"},
+                timeout=15,
+            )
+        if resp.status_code not in (200, 201):
+            logger.warning("notify Google Chat %s → HTTP %d: %s", username, resp.status_code, resp.text[:200])
+        else:
+            logger.info("notify Google Chat → %s (%d chars)", username, len(message))
+    except Exception as e:
+        logger.error("notify Google Chat error for %s: %s", username, e)
+
+
+async def _notify_web_push(username: str, message: str) -> None:
+    """Send a browser push notification."""
+    import push_utils
+    result = await push_utils.send_push(username, "Cortex", message, "")
+    if "error" in result:
+        logger.warning("notify web_push error for %s: %s", username, result["error"])
+    elif result.get("sent", 0) == 0:
+        logger.debug("notify web_push: no subscriptions for %s", username)
+    else:
+        logger.info("notify web_push → %s (%d device(s))", username, result["sent"])
+
+
 async def notify(username: str, message: str, channel: str | None = None) -> None:
    """Send a notification to the user's preferred outbound channel.

    Channel resolution order:
      1. `channel` parameter if provided
      2. `notification_channel` key in channels.json
-      3. "nextcloud" if configured
+      3. "nextcloud" if notification_room is configured
      4. Silent no-op

-    To configure: set `notification_channel` in home/{user}/channels.json.
-    For NCT: also set `notification_room` in the nextcloud section.
+    Supported channels: "web_push", "email", "nextcloud", "google_chat"
+    Configure via home/{user}/channels.json — see module docstring.
    """
    from auth_utils import get_user_channels
    channels = get_user_channels(username)

    target = channel or channels.get("notification_channel", "").strip()
    if not target:
-        # Auto-detect: use nextcloud if configured
-        if "nextcloud" in channels:
+        # Auto-detect: nextcloud if a notification_room is set
+        nct = channels.get("nextcloud", {})
+        if nct.get("notification_room", "").strip():
            target = "nextcloud"
        else:
            return

-    if target == "nextcloud":
+    if target == "web_push":
+        await _notify_web_push(username, message)
+
+    elif target == "email":
+        email_override = channels.get("notification_email", "").strip() or None
+        await _notify_email(username, message, email_override=email_override)
+
+    elif target == "nextcloud":
        nct = channels.get("nextcloud")
        if not nct:
            logger.debug("notify: nextcloud not configured for %s", username)
            return
        await _notify_nct(nct, message, username)
+
+    elif target == "google_chat":
+        gc = channels.get("google_chat", {})
+        webhook = gc.get("outbound_webhook", "").strip()
+        if not webhook:
+            logger.debug("notify: google_chat outbound_webhook not set for %s", username)
+            return
+        await _notify_google_chat(webhook, message, username)
+
    else:
-        logger.debug("notify: channel %r not yet supported for outbound (user %s)", target, username)
+        logger.debug("notify: channel %r not supported for outbound (user %s)", target, username)
--- a/cortex/openai_orchestrator.py
+++ b/cortex/openai_orchestrator.py
@@ -21,11 +21,12 @@ import asyncio
 import json
 import logging

-from openai import AsyncOpenAI
+from openai import AsyncOpenAI, APIConnectionError, APIStatusError

 from config import settings
-from orchestrator_engine import OrchestratorResult
-from tools import OPENAI_TOOL_SCHEMAS, call_tool
+from orchestrator_engine import OrchestrateCheckpoint, OrchestratorResult
+from tools import OPENAI_TOOL_SCHEMAS, call_tool, get_openai_tools_for_role, get_tools_for_role, CONFIRM_REQUIRED, narrow_tools_by_keywords
+import tool_audit

 logger = logging.getLogger(__name__)

@@ -44,6 +45,13 @@ async def run(
    session_messages: list[dict] | None = None,
    model_cfg: dict | None = None,
    respond_with_final: bool = True,
+    user_role: str = "user",
+    tool_list: list[str] | None = None,
+    confirm_allow: set[str] | None = None,
+    confirm_deny: set[str] | None = None,
+    max_risk: str | None = None,
+    risk_whitelist: list[str] | None = None,
+    risk_blacklist: list[str] | None = None,
 ) -> OrchestratorResult:
    """
    Run a tool-enabled task using an OpenAI-compatible API.
@@ -55,36 +63,36 @@ async def run(
        model_cfg:          Resolved model config from model_registry (local_openai type)
        respond_with_final: If False, return just the tool-loop summary without a
                            full persona-voiced response (faster; for cron/background)
+        confirm_allow:      Tools to bypass the confirmation gate for this user
+        confirm_deny:       Tools to always block for this user

    Returns:
-        OrchestratorResult — same shape as the Gemini engine for drop-in compatibility
+        OrchestratorResult — if checkpoint is set, the job is awaiting confirmation
    """
    if not model_cfg:
        raise RuntimeError("model_cfg is required for the OpenAI orchestrator")

-    api_url    = model_cfg.get("api_url", "")
-    api_key    = model_cfg.get("api_key", "") or "none"
-    model_name = model_cfg.get("model_name", "")
-    host_type  = model_cfg.get("host_type", "openwebui")
+    _confirm_allow = frozenset(confirm_allow or ())
+    _confirm_deny = frozenset(confirm_deny or ())
+    effective_confirm = (CONFIRM_REQUIRED - set(_confirm_allow)) | set(_confirm_deny)

-    if not api_url or not model_name:
-        raise RuntimeError(
-            f"model_cfg missing api_url or model_name: {model_cfg.get('label', model_cfg)}"
-        )
+    # Keyword routing: narrow schemas to only what this message needs.
+    # Also scans the last assistant turn so follow-ups like "yes, do that" inherit tool context.
+    # Returns [] when no keywords match (zero tool overhead — model responds as plain chat).
+    effective_tool_list = narrow_tools_by_keywords(task, tool_list, context_messages=session_messages)
+    logger.info(
+        "Keyword routing: %d tools active (role_tools=%s)",
+        len(effective_tool_list),
+        len(tool_list) if tool_list is not None else "all",
+    )

-    # Open WebUI's OpenAI-compatible endpoint lives at /api/chat/completions,
-    # so the SDK base_url needs the /api prefix; standard OpenAI-layout hosts don't.
-    base_url = api_url.rstrip("/")
-    if host_type == "openwebui":
-        base_url = base_url + "/api"
+    client, model_name, active_tools = _build_client(
+        model_cfg, user_role, effective_tool_list,
+        max_risk=max_risk, risk_whitelist=risk_whitelist, risk_blacklist=risk_blacklist,
+    )
+    tool_audit.set_context("openai", model_cfg.get("label") or model_name)

-    client = AsyncOpenAI(base_url=base_url, api_key=api_key)
-
-    # System prompt: persona context + brief tool instruction
    sys_content = (system_prompt or "") + _TOOL_INSTRUCTION
-
-    # Build messages: [system, ...recent_session, current_task]
-    # Strip non-standard metadata fields (backend, host, etc.) before sending.
    messages: list[dict] = [{"role": "system", "content": sys_content}]
    if session_messages:
        messages.extend(
@@ -94,77 +102,320 @@ async def run(
    messages.append({"role": "user", "content": task})

    tool_call_log: list[dict] = []
-    final_response = ""

-    for round_num in range(settings.orchestrator_max_rounds):
-        logger.info("OpenAI orchestrator round %d / %d  model=%s",
-                    round_num + 1, settings.orchestrator_max_rounds, model_name)
+    final_response, checkpoint = await _run_from_messages(
+        client=client,
+        messages=messages,
+        active_tools=active_tools,
+        tool_call_log=tool_call_log,
+        effective_confirm=effective_confirm,
+        model_name=model_name,
+        task=task,
+        model_cfg=model_cfg,
+        respond_with_final=respond_with_final,
+        user_role=user_role,
+        tool_list=effective_tool_list,
+        confirm_allow=_confirm_allow,
+        confirm_deny=_confirm_deny,
+        starting_round=0,
+    )

-        response = await client.chat.completions.create(
-            model=model_name,
-            messages=messages,
-            tools=OPENAI_TOOL_SCHEMAS,
-            tool_choice="auto",
+    if checkpoint:
+        return OrchestratorResult(
+            response=final_response,
+            tool_calls=list(tool_call_log),
+            backend="local",
+            gemini_summary=final_response,
+            checkpoint=checkpoint,
        )

-        choice = response.choices[0]
-        msg    = choice.message
+    model_label = model_cfg.get("label") or model_name
+    logger.info("OpenAI orchestrator complete — model=%s tools=%d", model_label, len(tool_call_log))
+    return OrchestratorResult(
+        response=final_response,
+        tool_calls=tool_call_log,
+        backend="local",
+        backend_label=model_label,
+        gemini_summary=final_response,
+    )
+
+
+async def resume(checkpoint: OrchestrateCheckpoint, confirmed: bool) -> OrchestratorResult:
+    """Continue an OpenAI orchestrator job that was paused at a confirmation gate."""
+    client, model_name, active_tools = _build_client(checkpoint.model_cfg, checkpoint.user_role, checkpoint.tool_list)
+
+    effective_confirm = (CONFIRM_REQUIRED - set(checkpoint.confirm_allow)) | set(checkpoint.confirm_deny)
+
+    messages = list(checkpoint.pre_fn_state)
+    tool_call_log = [t for t in checkpoint.tool_call_log if t["result"] != "[awaiting confirmation]"]
+
+    # Build tool responses for this round
+    for er in checkpoint.executed_results:
+        messages.append({
+            "role": "tool",
+            "tool_call_id": er.get("tool_call_id", er["name"]),
+            "content": er["result"],
+        })
+
+    for pt in checkpoint.pending_tools:
+        if confirmed:
+            result_str = await _execute_tool_dict(pt["name"], pt["args"], checkpoint.user_role, checkpoint.tool_list)
+            logger.info("Confirmed tool %s → %d chars", pt["name"], len(result_str))
+        else:
+            result_str = "Action denied by user."
+            logger.info("Tool %s denied by user", pt["name"])
+        tool_call_log.append({"tool": pt["name"], "args": pt["args"], "result": result_str})
+        messages.append({
+            "role": "tool",
+            "tool_call_id": pt.get("tool_call_id", pt["name"]),
+            "content": result_str,
+        })
+
+    final_response, new_checkpoint = await _run_from_messages(
+        client=client,
+        messages=messages,
+        active_tools=active_tools,
+        tool_call_log=tool_call_log,
+        effective_confirm=effective_confirm,
+        model_name=model_name,
+        task=checkpoint.task,
+        model_cfg=checkpoint.model_cfg,
+        respond_with_final=checkpoint.respond_with_final,
+        user_role=checkpoint.user_role,
+        tool_list=checkpoint.tool_list,
+        confirm_allow=checkpoint.confirm_allow,
+        confirm_deny=checkpoint.confirm_deny,
+        starting_round=checkpoint.rounds_used,
+    )
+
+    if new_checkpoint:
+        return OrchestratorResult(
+            response=final_response,
+            tool_calls=list(tool_call_log),
+            backend="local",
+            gemini_summary=final_response,
+            checkpoint=new_checkpoint,
+        )
+
+    model_label = (checkpoint.model_cfg or {}).get("label") or model_name
+    logger.info("OpenAI orchestrator resumed — model=%s tools=%d", model_label, len(tool_call_log))
+    return OrchestratorResult(
+        response=final_response,
+        tool_calls=tool_call_log,
+        backend="local",
+        gemini_summary=final_response,
+    )
+
+
+_CHARS_PER_TOKEN = 4
+# Fixed token overhead budget per call (tool schemas excluded — cached separately)
+_TOOL_SCHEMA_OVERHEAD = 500
+# Chars to keep per truncated old tool result
+_TRUNC_RESULT_CHARS = 400
+# Always keep the last N tool-result messages uncompacted
+_KEEP_RECENT_TOOL_MSGS = 6  # ~2 rounds of 3 tools each
+
+# Module-level schema cache: key = (user_role, sorted_tools, risk_params)
+# Bounded in practice — keyword routing produces at most ~30 distinct tool sets.
+_tool_schema_cache: dict[str, list[dict]] = {}
+
+
+def _get_cached_tools(
+    user_role: str,
+    tool_list: list[str] | None,
+    max_risk: str | None = None,
+    whitelist: list[str] | None = None,
+    blacklist: list[str] | None = None,
+) -> list[dict]:
+    key = "|".join([
+        user_role,
+        str(sorted(tool_list) if tool_list is not None else "all"),
+        str(max_risk),
+        str(sorted(whitelist) if whitelist else ""),
+        str(sorted(blacklist) if blacklist else ""),
+    ])
+    if key not in _tool_schema_cache:
+        _tool_schema_cache[key] = get_openai_tools_for_role(
+            user_role, tool_list,
+            max_risk=max_risk, whitelist=whitelist, blacklist=blacklist,
+        )
+    return _tool_schema_cache[key]
+
+
+def _estimate_tokens(messages: list[dict]) -> int:
+    total = sum(len(json.dumps(m)) for m in messages)
+    return total // _CHARS_PER_TOKEN + _TOOL_SCHEMA_OVERHEAD
+
+
+def _compact_messages(messages: list[dict], budget_tokens: int) -> list[dict]:
+    """
+    Truncate old tool result content when approaching the context budget.
+
+    Strategy: keep system message, recent assistant/tool rounds, and the
+    original user task intact. Truncate content of old tool results in the
+    middle of the conversation — the model only needs recent results to reason.
+    """
+    if _estimate_tokens(messages) <= budget_tokens:
+        return messages
+
+    tool_indices = [i for i, m in enumerate(messages) if m.get("role") == "tool"]
+    n_to_compact = max(0, len(tool_indices) - _KEEP_RECENT_TOOL_MSGS)
+    if n_to_compact == 0:
+        return messages  # nothing old enough to trim
+
+    compact_set = set(tool_indices[:n_to_compact])
+    result = []
+    chars_saved = 0
+    for i, msg in enumerate(messages):
+        if i in compact_set:
+            content = msg.get("content", "")
+            if len(content) > _TRUNC_RESULT_CHARS:
+                msg = dict(msg)
+                saved = len(content) - _TRUNC_RESULT_CHARS
+                chars_saved += saved
+                msg["content"] = (
+                    content[:_TRUNC_RESULT_CHARS]
+                    + f" …[{len(content) - _TRUNC_RESULT_CHARS} chars omitted]"
+                )
+        result.append(msg)
+
+    new_est = _estimate_tokens(result)
+    logger.info(
+        "context compaction: saved ~%d tokens (%d chars), now ~%d / %d tokens",
+        chars_saved // _CHARS_PER_TOKEN, chars_saved, new_est, budget_tokens,
+    )
+    return result
+
+
+def _context_budget(model_cfg: dict | None) -> int:
+    """Return the usable token budget (75% of context window, min 16k, default 32k)."""
+    context_k = (model_cfg or {}).get("context_k") or 32
+    return max(16_000, int(context_k * 1000 * 0.75))
+
+
+async def _run_from_messages(
+    client,
+    messages: list[dict],
+    active_tools: list,
+    tool_call_log: list[dict],
+    effective_confirm: set[str],
+    model_name: str,
+    task: str,
+    model_cfg: dict | None,
+    respond_with_final: bool,
+    user_role: str,
+    confirm_allow: frozenset,
+    confirm_deny: frozenset,
+    starting_round: int = 0,
+    tool_list: list[str] | None = None,
+) -> tuple[str, OrchestrateCheckpoint | None]:
+    """
+    Run the OpenAI ReAct loop from the current messages state.
+    Returns (final_response, checkpoint) — checkpoint is set if confirmation is needed.
+    """
+    final_response = ""
+    budget = _context_budget(model_cfg)
+
+    per_model_limit = (model_cfg or {}).get("max_rounds") or settings.orchestrator_max_rounds
+    effective_limit = min(per_model_limit, settings.orchestrator_max_rounds)
+
+    for round_num in range(starting_round, effective_limit):
+        messages = _compact_messages(messages, budget)
+        est = _estimate_tokens(messages)
+        logger.info("OpenAI orchestrator round %d / %d  model=%s  ~%d tokens",
+                    round_num + 1, effective_limit, model_name, est)
+
+        call_kwargs: dict = {"model": model_name, "messages": messages}
+        if active_tools:
+            call_kwargs["tools"] = active_tools
+            call_kwargs["tool_choice"] = "auto"
+        reasoning_budget = (model_cfg or {}).get("reasoning_budget_tokens")
+        if reasoning_budget:
+            call_kwargs["extra_body"] = {"reasoning": {"budget_tokens": reasoning_budget}}
+        response = await _chat_with_retry(client, **call_kwargs)
+
+        choice = response.choices[0]
+        msg = choice.message

-        # Append the assistant turn (MUST include tool_calls if present so the
-        # next request is valid — OpenAI requires the full history to be consistent)
        assistant_msg: dict = {"role": "assistant"}
        if msg.content:
            assistant_msg["content"] = msg.content
        if msg.tool_calls:
            assistant_msg["tool_calls"] = [
                {
-                    "id":   tc.id,
+                    "id": tc.id,
                    "type": "function",
-                    "function": {
-                        "name":      tc.function.name,
-                        "arguments": tc.function.arguments,
-                    },
+                    "function": {"name": tc.function.name, "arguments": tc.function.arguments},
                }
                for tc in msg.tool_calls
            ]
        messages.append(assistant_msg)

-        if choice.finish_reason == "tool_calls" and msg.tool_calls:
-            # Execute all tool calls in parallel, then feed results back
-            tool_tasks = [
-                _execute_tool(tc.function.name, tc.function.arguments)
-                for tc in msg.tool_calls
-            ]
-            results = await asyncio.gather(*tool_tasks, return_exceptions=True)
+        # Some models set finish_reason="stop" even when tool_calls are present
+        if msg.tool_calls and (choice.finish_reason in ("tool_calls", "stop", None)):
+            # Snapshot state before tool responses for potential checkpoint
+            pre_fn_state = list(messages)

-            for tc, result in zip(msg.tool_calls, results):
-                result_str = (
-                    str(result)
-                    if not isinstance(result, Exception)
-                    else f"Tool error: {result}"
-                )
-                logger.info("Tool %s → %d chars", tc.function.name, len(result_str))
+            pending_tools: list[dict] = []
+            executed_results: list[dict] = []

+            for tc in msg.tool_calls:
+                name = tc.function.name
+                raw_args = tc.function.arguments or "{}"
                try:
-                    args_parsed = json.loads(tc.function.arguments)
-                except json.JSONDecodeError:
-                    args_parsed = {"raw": tc.function.arguments}
+                    args_parsed = json.loads(raw_args)
+                    if not isinstance(args_parsed, dict):
+                        raise ValueError("args must be a JSON object")
+                except (json.JSONDecodeError, ValueError) as e:
+                    logger.warning("Malformed tool args for %s: %s — args: %.200s", name, e, raw_args)
+                    args_parsed = {}

-                tool_call_log.append({
-                    "tool":   tc.function.name,
-                    "args":   args_parsed,
-                    "result": result_str,
-                })
+                if name in effective_confirm:
+                    pending_tools.append({"name": name, "args": args_parsed, "tool_call_id": tc.id})
+                    logger.info("Tool %s blocked — confirmation required", name)
+                else:
+                    result_str = await _execute_tool(name, tc.function.arguments, user_role, tool_list)
+                    logger.info("Tool %s → %d chars", name, len(result_str))
+                    executed_results.append({"name": name, "args": args_parsed, "result": result_str, "tool_call_id": tc.id})
+                    tool_call_log.append({"tool": name, "args": args_parsed, "result": result_str})
+                    messages.append({"role": "tool", "tool_call_id": tc.id, "content": result_str})

-                # Tool result message — tools array must be re-sent on every request
-                messages.append({
-                    "role":         "tool",
-                    "tool_call_id": tc.id,
-                    "content":      result_str,
-                })
+            if pending_tools:
+                # Add placeholder responses
+                for pt in pending_tools:
+                    placeholder = f"[AWAITING USER CONFIRMATION for {pt['name']}]"
+                    tool_call_log.append({"tool": pt["name"], "args": pt["args"], "result": "[awaiting confirmation]"})
+                    messages.append({"role": "tool", "tool_call_id": pt["tool_call_id"], "content": placeholder})
+
+                messages = _compact_messages(messages, budget)
+                conf_call: dict = {"model": model_name, "messages": messages, "tool_choice": "none"}
+                if active_tools:
+                    conf_call["tools"] = active_tools
+                if reasoning_budget:
+                    conf_call["extra_body"] = {"reasoning": {"budget_tokens": reasoning_budget}}
+                conf_resp = await _chat_with_retry(client, **conf_call)
+                final_response = conf_resp.choices[0].message.content or (
+                    "This action requires your explicit confirmation before it can proceed."
+                )
+
+                checkpoint = OrchestrateCheckpoint(
+                    engine="openai",
+                    pre_fn_state=pre_fn_state,
+                    executed_results=executed_results,
+                    pending_tools=pending_tools,
+                    tool_call_log=list(tool_call_log),
+                    task=task,
+                    model_cfg=model_cfg,
+                    respond_with_final=respond_with_final,
+                    user_role=user_role,
+                    tool_list=tool_list,
+                    confirm_allow=confirm_allow,
+                    confirm_deny=confirm_deny,
+                    rounds_used=round_num + 2,
+                )
+                return final_response, checkpoint

        else:
-            # finish_reason == "stop" (or no tool_calls) — model is done
            final_response = msg.content or ""
            logger.info(
                "OpenAI orchestrator done after %d round(s). Tools used: %d",
@@ -173,35 +424,108 @@ async def run(
            break

    else:
-        # Hit the round limit
-        logger.warning("OpenAI orchestrator hit max rounds (%d)", settings.orchestrator_max_rounds)
+        logger.warning("OpenAI orchestrator hit max rounds (%d)", effective_limit)
        final_response = (
-            f"Reached the tool iteration limit ({settings.orchestrator_max_rounds} rounds). "
+            f"Reached the tool iteration limit ({effective_limit} rounds). "
            "Here is what was gathered:\n\n"
-            + "\n\n".join(
-                f"**{t['tool']}**: {t['result'][:500]}" for t in tool_call_log
-            )
+            + "\n\n".join(f"**{t['tool']}**: {t['result'][:500]}" for t in tool_call_log)
        )

-    model_label = model_cfg.get("label") or model_name
-    logger.info("OpenAI orchestrator complete — model=%s tools=%d", model_label, len(tool_call_log))
+    return final_response, None

-    return OrchestratorResult(
-        response=final_response,
-        tool_calls=tool_call_log,
-        backend="local",
-        gemini_summary=final_response,  # reused for UI display; same content in single-model mode
+
+_RETRY_STATUSES = {429, 500, 502, 503, 504}
+_MAX_API_RETRIES = 3
+
+
+async def _chat_with_retry(client, **kwargs):
+    """Wrap chat.completions.create with exponential backoff on transient errors."""
+    last_exc: Exception = RuntimeError("No attempts made")
+    for attempt in range(_MAX_API_RETRIES):
+        try:
+            return await client.chat.completions.create(**kwargs)
+        except APIConnectionError as e:
+            last_exc = e
+            logger.warning("OpenAI connection error (attempt %d/%d): %s", attempt + 1, _MAX_API_RETRIES, e)
+        except APIStatusError as e:
+            if e.status_code in _RETRY_STATUSES:
+                last_exc = e
+                logger.warning("OpenAI status %d (attempt %d/%d): %s", e.status_code, attempt + 1, _MAX_API_RETRIES, e)
+            else:
+                raise
+        if attempt < _MAX_API_RETRIES - 1:
+            await asyncio.sleep(2 ** attempt)  # 1s, 2s
+    raise last_exc
+
+
+def _build_client(
+    model_cfg: dict | None,
+    user_role: str = "user",
+    tool_list: list[str] | None = None,
+    max_risk: str | None = None,
+    risk_whitelist: list[str] | None = None,
+    risk_blacklist: list[str] | None = None,
+) -> tuple:
+    """Build AsyncOpenAI client and return (client, model_name, active_tools)."""
+    if not model_cfg:
+        raise RuntimeError("model_cfg is required for the OpenAI orchestrator")
+    api_url = model_cfg.get("api_url", "")
+    api_key = model_cfg.get("api_key", "") or "none"
+    model_name = model_cfg.get("model_name", "")
+    host_type = model_cfg.get("host_type", "openwebui")
+    if not api_url or not model_name:
+        raise RuntimeError(
+            f"model_cfg missing api_url or model_name: {model_cfg.get('label', model_cfg)}"
+        )
+    base_url = api_url.rstrip("/")
+    if host_type == "openwebui":
+        base_url = base_url + "/api"
+    client = AsyncOpenAI(base_url=base_url, api_key=api_key, timeout=settings.timeout_local)
+    if model_cfg.get("tools") is False:
+        active_tools = []
+    else:
+        active_tools = _get_cached_tools(
+            user_role, tool_list,
+            max_risk=max_risk, whitelist=risk_whitelist, blacklist=risk_blacklist,
+        )
+    return client, model_name, active_tools
+
+
+async def _execute_tool(
+    name: str,
+    arguments_json: str,
+    user_role: str = "user",
+    tool_list: list[str] | None = None,
+    max_risk: str | None = None,
+    risk_whitelist: list[str] | None = None,
+    risk_blacklist: list[str] | None = None,
+) -> str:
+    """Parse tool arguments and execute with role-filtered callables."""
+    _, callables = get_tools_for_role(
+        user_role, tool_list,
+        max_risk=max_risk, whitelist=risk_whitelist, blacklist=risk_blacklist,
    )
-
-
-async def _execute_tool(name: str, arguments_json: str) -> str:
-    """Parse tool arguments and execute, returning a string result."""
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        args = {}
    try:
-        return await call_tool(name, args)
+        return await call_tool(name, args, callables)
+    except Exception as e:
+        logger.warning("Tool %s failed: %s", name, e)
+        return f"Tool error: {e}"
+
+
+async def _execute_tool_dict(
+    name: str,
+    args: dict,
+    user_role: str = "user",
+    tool_list: list[str] | None = None,
+) -> str:
+    """Execute a tool from a pre-parsed args dict."""
+    _, callables = get_tools_for_role(user_role, tool_list)
+    try:
+        return await call_tool(name, args, callables)
    except Exception as e:
        logger.warning("Tool %s failed: %s", name, e)
        return f"Tool error: {e}"
--- a/cortex/orchestrator_engine.py
+++ b/cortex/orchestrator_engine.py
@@ -16,6 +16,7 @@ calls llm_client.complete() directly, which is faster and has no orchestration o
 """

 import asyncio
+import json
 import logging
 from dataclasses import dataclass, field

@@ -24,7 +25,10 @@ from google.genai import types

 from config import settings
 from llm_client import complete
-from tools import TOOL_DECLARATIONS, call_tool
+from tools import TOOL_DECLARATIONS, call_tool, get_tools_for_role, CONFIRM_REQUIRED
+import usage_tracker
+import tool_audit
+from persona import _user

 logger = logging.getLogger(__name__)

@@ -43,12 +47,61 @@ Keep your summary factual and complete. Include relevant URLs, data, and specifi
 If no tools are needed, return an empty summary."""


+def _track_gemini_usage(response, model_name: str | None) -> None:
+    meta = getattr(response, "usage_metadata", None)
+    if not meta:
+        return
+    prompt_tokens = getattr(meta, "prompt_token_count", 0) or 0
+    completion_tokens = getattr(meta, "candidates_token_count", 0) or 0
+    if prompt_tokens or completion_tokens:
+        try:
+            asyncio.create_task(usage_tracker.record(
+                username=_user.get(),
+                backend="gemini_api",
+                model_name=model_name or settings.orchestrator_model,
+                prompt_tokens=prompt_tokens,
+                completion_tokens=completion_tokens,
+            ))
+        except Exception:
+            pass
+
+
+@dataclass
+class OrchestrateCheckpoint:
+    """Saved execution state for a job paused at a confirmation gate."""
+    engine: str                          # "gemini" | "openai"
+    pre_fn_state: list                   # conversation state before function responses
+    executed_results: list[dict]         # tools that already ran this round
+    pending_tools: list[dict]            # [{name, args}] awaiting confirmation
+    tool_call_log: list[dict]            # all tool calls so far
+    task: str
+    # Gemini-specific config (unused by openai engine)
+    system_prompt: str = ""
+    session_messages: list | None = None
+    model_name: str | None = None
+    gemini_api_key: str | None = None
+    respond_with_claude: bool = True
+    response_role: str = "chat"
+    # OpenAI-specific config (unused by gemini engine)
+    model_cfg: dict | None = None
+    respond_with_final: bool = True
+    # Common
+    user_role: str = "user"
+    tool_list: list[str] | None = None
+    confirm_allow: frozenset = field(default_factory=frozenset)
+    confirm_deny: frozenset = field(default_factory=frozenset)
+    rounds_used: int = 0
+    max_rounds: int | None = None
+
+
@dataclass
 class OrchestratorResult:
    response: str                       # final user-facing response (from Claude)
    tool_calls: list[dict] = field(default_factory=list)  # [{tool, args, result}]
    backend: str = "claude"             # model that produced the final response
+    backend_label: str = ""             # human-readable model label for display
    gemini_summary: str = ""            # what Gemini handed to Claude (debug/display)
+    checkpoint: OrchestrateCheckpoint | None = None  # set when awaiting confirmation


 async def run(
@@ -57,6 +110,16 @@ async def run(
    session_messages: list[dict] | None = None,
    respond_with_claude: bool = True,
    gemini_api_key: str | None = None,
+    model_name: str | None = None,
+    response_role: str = "chat",
+    user_role: str = "user",
+    tool_list: list[str] | None = None,
+    confirm_allow: set[str] | None = None,
+    confirm_deny: set[str] | None = None,
+    max_rounds: int | None = None,
+    max_risk: str | None = None,
+    risk_whitelist: list[str] | None = None,
+    risk_blacklist: list[str] | None = None,
 ) -> OrchestratorResult:
    """
    Run the full orchestration loop for a task.
@@ -68,9 +131,13 @@ async def run(
        respond_with_claude: If False, return Gemini's summary as the response (useful for
                             background/cron tasks where a polished reply isn't needed)
        gemini_api_key:     Per-user Gemini API key (falls back to GEMINI_API_KEY in .env)
+        tool_list:          Optional explicit tool allow-list from role config; intersected
+                            with user_role access-level filter (cannot elevate privileges)
+        confirm_allow:      Tools to bypass the confirmation gate for this user
+        confirm_deny:       Tools to always block for this user

    Returns:
-        OrchestratorResult with response, tool call log, backend used, and Gemini summary
+        OrchestratorResult — if checkpoint is set, the job is awaiting confirmation
    """
    api_key = gemini_api_key or settings.gemini_api_key
    if not api_key:
@@ -80,105 +147,314 @@ async def run(
        )

    client = genai.Client(api_key=api_key)
+    tool_audit.set_context("gemini", model_name or settings.orchestrator_model)
+
+    _confirm_allow = frozenset(confirm_allow or ())
+    _confirm_deny = frozenset(confirm_deny or ())
+    effective_confirm = (CONFIRM_REQUIRED - set(_confirm_allow)) | set(_confirm_deny)

-    # Seed Gemini with the task — include recent session context if available
    task_with_context = _build_task_prompt(task, session_messages)
    contents: list[types.Content] = [
        types.Content(role="user", parts=[types.Part(text=task_with_context)])
    ]
-
+    tool_declarations, tool_callables = get_tools_for_role(
+        user_role, tool_list, max_risk=max_risk,
+        whitelist=risk_whitelist, blacklist=risk_blacklist,
+    )
    tool_call_log: list[dict] = []
-    gemini_summary = ""

-    # --- ReAct tool loop ---
-    for round_num in range(settings.orchestrator_max_rounds):
+    gemini_summary, checkpoint = await _run_from_contents(
+        client=client,
+        contents=contents,
+        tool_declarations=tool_declarations,
+        tool_callables=tool_callables,
+        tool_call_log=tool_call_log,
+        effective_confirm=effective_confirm,
+        model_name=model_name,
+        task=task,
+        system_prompt=system_prompt,
+        session_messages=session_messages,
+        respond_with_claude=respond_with_claude,
+        response_role=response_role,
+        user_role=user_role,
+        tool_list=tool_list,
+        confirm_allow=_confirm_allow,
+        confirm_deny=_confirm_deny,
+        starting_round=0,
+        gemini_api_key=api_key,
+        max_rounds=max_rounds,
+    )
+
+    if checkpoint:
+        return OrchestratorResult(
+            response=gemini_summary,
+            tool_calls=list(tool_call_log),
+            backend="gemini",
+            gemini_summary=gemini_summary,
+            checkpoint=checkpoint,
+        )
+
+    return await _claude_handoff(
+        task=task,
+        tool_call_log=tool_call_log,
+        gemini_summary=gemini_summary,
+        system_prompt=system_prompt,
+        session_messages=session_messages,
+        respond_with_claude=respond_with_claude,
+        response_role=response_role,
+    )
+
+
+async def resume(checkpoint: OrchestrateCheckpoint, confirmed: bool) -> OrchestratorResult:
+    """Continue a job that was paused at a confirmation gate."""
+    api_key = checkpoint.gemini_api_key or settings.gemini_api_key
+    client = genai.Client(api_key=api_key)
+    tool_declarations, tool_callables = get_tools_for_role(
+        checkpoint.user_role, checkpoint.tool_list,
+        max_risk=getattr(checkpoint, "max_risk", None),
+        whitelist=getattr(checkpoint, "risk_whitelist", None),
+        blacklist=getattr(checkpoint, "risk_blacklist", None),
+    )
+
+    effective_confirm = (CONFIRM_REQUIRED - set(checkpoint.confirm_allow)) | set(checkpoint.confirm_deny)
+
+    # Rebuild from saved state — strip "[awaiting confirmation]" placeholders
+    contents = list(checkpoint.pre_fn_state)
+    tool_call_log = [t for t in checkpoint.tool_call_log if t["result"] != "[awaiting confirmation]"]
+
+    # Build function responses for this round
+    response_parts: list[types.Part] = []
+
+    for er in checkpoint.executed_results:
+        response_parts.append(types.Part(function_response=types.FunctionResponse(
+            name=er["name"], response={"result": er["result"]}
+        )))
+
+    for pt in checkpoint.pending_tools:
+        if confirmed:
+            result_str = await _execute_tool(pt["name"], pt["args"], tool_callables)
+            logger.info("Confirmed tool %s → %d chars", pt["name"], len(result_str))
+        else:
+            result_str = "Action denied by user."
+            logger.info("Tool %s denied by user", pt["name"])
+        tool_call_log.append({"tool": pt["name"], "args": pt["args"], "result": result_str})
+        response_parts.append(types.Part(function_response=types.FunctionResponse(
+            name=pt["name"], response={"result": result_str}
+        )))
+
+    contents.append(types.Content(role="user", parts=response_parts))
+
+    gemini_summary, new_checkpoint = await _run_from_contents(
+        client=client,
+        contents=contents,
+        tool_declarations=tool_declarations,
+        tool_callables=tool_callables,
+        tool_call_log=tool_call_log,
+        effective_confirm=effective_confirm,
+        model_name=checkpoint.model_name,
+        task=checkpoint.task,
+        system_prompt=checkpoint.system_prompt,
+        session_messages=checkpoint.session_messages,
+        respond_with_claude=checkpoint.respond_with_claude,
+        response_role=checkpoint.response_role,
+        user_role=checkpoint.user_role,
+        tool_list=checkpoint.tool_list,
+        confirm_allow=checkpoint.confirm_allow,
+        confirm_deny=checkpoint.confirm_deny,
+        starting_round=checkpoint.rounds_used,
+        gemini_api_key=api_key,
+        max_rounds=checkpoint.max_rounds,
+    )
+
+    if new_checkpoint:
+        return OrchestratorResult(
+            response=gemini_summary,
+            tool_calls=list(tool_call_log),
+            backend="gemini",
+            gemini_summary=gemini_summary,
+            checkpoint=new_checkpoint,
+        )
+
+    return await _claude_handoff(
+        task=checkpoint.task,
+        tool_call_log=tool_call_log,
+        gemini_summary=gemini_summary,
+        system_prompt=checkpoint.system_prompt,
+        session_messages=checkpoint.session_messages,
+        respond_with_claude=checkpoint.respond_with_claude,
+        response_role=checkpoint.response_role,
+    )
+
+
+async def _run_from_contents(
+    client,
+    contents: list,
+    tool_declarations: list,
+    tool_callables: dict,
+    tool_call_log: list[dict],
+    effective_confirm: set[str],
+    model_name: str | None,
+    task: str,
+    system_prompt: str,
+    session_messages: list[dict] | None,
+    respond_with_claude: bool,
+    response_role: str,
+    user_role: str,
+    confirm_allow: frozenset,
+    confirm_deny: frozenset,
+    starting_round: int = 0,
+    gemini_api_key: str | None = None,
+    tool_list: list[str] | None = None,
+    max_rounds: int | None = None,
+) -> tuple[str, OrchestrateCheckpoint | None]:
+    """
+    Run the ReAct loop from the current contents state.
+    Returns (gemini_summary, checkpoint) — checkpoint is set if confirmation is needed.
+    """
+    gemini_summary = ""
+    per_model_limit = max_rounds or settings.orchestrator_max_rounds
+    effective_limit = min(per_model_limit, settings.orchestrator_max_rounds)
+
+    for round_num in range(starting_round, effective_limit):
        logger.info("Orchestrator round %d for task: %.80s", round_num + 1, task)

        response = await asyncio.to_thread(
            client.models.generate_content,
-            model=settings.orchestrator_model,
+            model=model_name or settings.orchestrator_model,
            contents=contents,
            config=types.GenerateContentConfig(
-                tools=TOOL_DECLARATIONS,
+                tools=tool_declarations,
                system_instruction=_ORCHESTRATOR_SYSTEM,
            ),
        )
+        _track_gemini_usage(response, model_name)

        candidate = response.candidates[0]
        parts = candidate.content.parts if candidate.content else []

-        # Check if Gemini wants to call any tools
        tool_call_parts = [
            p for p in parts
            if hasattr(p, "function_call") and p.function_call and p.function_call.name
        ]

        if not tool_call_parts:
-            # No more tool calls — extract Gemini's text summary
            gemini_summary = "".join(
                p.text for p in parts if hasattr(p, "text") and p.text
            ).strip()
            logger.info("Orchestrator done after %d round(s). Tools used: %d",
                        round_num + 1, len(tool_call_log))
-            break
+            return gemini_summary, None

-        # Add Gemini's response (with function calls) to the conversation
        contents.append(candidate.content)

-        # Execute all tool calls in parallel
-        tool_tasks = [
-            _execute_tool(fc.function_call.name, dict(fc.function_call.args))
-            for fc in tool_call_parts
-        ]
-        tool_results = await asyncio.gather(*tool_tasks, return_exceptions=True)
+        # Snapshot state before function responses — used if a checkpoint is needed
+        pre_fn_state = list(contents)

-        # Build function response parts and update log
        response_parts: list[types.Part] = []
-        for fc_part, result in zip(tool_call_parts, tool_results):
-            fc = fc_part.function_call
-            result_str = str(result) if not isinstance(result, Exception) else f"Error: {result}"
-            logger.info("Tool %s → %d chars", fc.name, len(result_str))
+        pending_tools: list[dict] = []
+        executed_results: list[dict] = []

-            tool_call_log.append({
-                "tool": fc.name,
-                "args": dict(fc.args),
-                "result": result_str,
-            })
-            response_parts.append(
-                types.Part(
-                    function_response=types.FunctionResponse(
-                        name=fc.name,
-                        response={"result": result_str},
-                    )
-                )
+        for fc_part in tool_call_parts:
+            fc = fc_part.function_call
+            name = fc.name
+            args = dict(fc.args)
+
+            if name in effective_confirm:
+                pending_tools.append({"name": name, "args": args})
+                logger.info("Tool %s blocked — confirmation required", name)
+            else:
+                result_str = await _execute_tool(name, args, tool_callables)
+                logger.info("Tool %s → %d chars", name, len(result_str))
+                executed_results.append({"name": name, "args": args, "result": result_str})
+                tool_call_log.append({"tool": name, "args": args, "result": result_str})
+                response_parts.append(types.Part(function_response=types.FunctionResponse(
+                    name=name, response={"result": result_str}
+                )))
+
+        if pending_tools:
+            # Add placeholder responses and get Gemini to produce the confirmation message
+            for pt in pending_tools:
+                placeholder = f"[AWAITING USER CONFIRMATION for {pt['name']}]"
+                response_parts.append(types.Part(function_response=types.FunctionResponse(
+                    name=pt["name"], response={"result": placeholder}
+                )))
+                tool_call_log.append({"tool": pt["name"], "args": pt["args"], "result": "[awaiting confirmation]"})
+
+            contents.append(types.Content(role="user", parts=response_parts))
+
+            conf_response = await asyncio.to_thread(
+                client.models.generate_content,
+                model=model_name or settings.orchestrator_model,
+                contents=contents,
+                config=types.GenerateContentConfig(
+                    tools=tool_declarations,
+                    system_instruction=_ORCHESTRATOR_SYSTEM,
+                ),
            )
+            _track_gemini_usage(conf_response, model_name)
+            conf_parts = (
+                conf_response.candidates[0].content.parts
+                if conf_response.candidates and conf_response.candidates[0].content
+                else []
+            )
+            gemini_summary = "".join(
+                p.text for p in conf_parts if hasattr(p, "text") and p.text
+            ).strip() or "This action requires your explicit confirmation before it can proceed."
+
+            checkpoint = OrchestrateCheckpoint(
+                engine="gemini",
+                pre_fn_state=pre_fn_state,
+                executed_results=executed_results,
+                pending_tools=pending_tools,
+                tool_call_log=list(tool_call_log),
+                task=task,
+                system_prompt=system_prompt,
+                session_messages=session_messages,
+                model_name=model_name,
+                gemini_api_key=gemini_api_key,
+                respond_with_claude=respond_with_claude,
+                response_role=response_role,
+                user_role=user_role,
+                tool_list=tool_list,
+                confirm_allow=confirm_allow,
+                confirm_deny=confirm_deny,
+                rounds_used=round_num + 2,
+                max_rounds=max_rounds,
+            )
+            return gemini_summary, checkpoint

        contents.append(types.Content(role="user", parts=response_parts))

    else:
-        # Hit the round limit — use whatever Gemini produced last
-        logger.warning("Orchestrator hit max rounds (%d)", settings.orchestrator_max_rounds)
+        logger.warning("Orchestrator hit max rounds (%d)", effective_limit)
        gemini_summary = (
-            f"Reached the tool iteration limit ({settings.orchestrator_max_rounds} rounds). "
+            f"Reached the tool iteration limit ({effective_limit} rounds). "
            "Here is what was gathered so far:\n\n"
            + "\n\n".join(f"**{t['tool']}**: {t['result'][:500]}" for t in tool_call_log)
        )

-    # --- Claude handoff ---
+    return gemini_summary, None
+
+
+async def _claude_handoff(
+    task: str,
+    tool_call_log: list[dict],
+    gemini_summary: str,
+    system_prompt: str,
+    session_messages: list[dict] | None,
+    respond_with_claude: bool,
+    response_role: str,
+) -> OrchestratorResult:
    if respond_with_claude:
        claude_prompt = _build_claude_prompt(task, tool_call_log, gemini_summary)
-
-        # Merge with session history so Claude has conversation context
        messages = list(session_messages or [])
        messages.append({"role": "user", "content": claude_prompt})
-
        response_text, backend = await complete(
            system_prompt=system_prompt,
            messages=messages,
-            model="claude",
+            role=response_role,
        )
    else:
-        # Cron/background tasks: return Gemini's summary directly, no Claude call
        response_text = gemini_summary or "No information gathered."
        backend = "gemini"

@@ -190,10 +466,10 @@ async def run(
    )


-async def _execute_tool(name: str, args: dict) -> str:
+async def _execute_tool(name: str, args: dict, callables: dict | None = None) -> str:
    """Execute a single tool call, catching all exceptions."""
    try:
-        return await call_tool(name, args)
+        return await call_tool(name, args, callables)
    except Exception as e:
        logger.warning("Tool %s failed: %s", name, e)
        return f"Tool error: {e}"
@@ -204,12 +480,11 @@ def _build_task_prompt(task: str, session_messages: list[dict] | None) -> str:
    if not session_messages:
        return task

-    # Include last few turns for context (don't send the full history to keep tokens low)
-    recent = session_messages[-6:]  # last 3 turns
+    recent = session_messages[-6:]
    history_lines = []
    for msg in recent:
        label = "User" if msg["role"] == "user" else "Assistant"
-        history_lines.append(f"{label}: {msg['content'][:300]}")  # truncate long messages
+        history_lines.append(f"{label}: {msg['content'][:300]}")

    context = "\n".join(history_lines)
    return f"<recent_conversation>\n{context}\n</recent_conversation>\n\nCurrent request: {task}"
@@ -227,7 +502,6 @@ def _build_claude_prompt(
        parts.append("## Research gathered\n")
        for tc in tool_calls:
            parts.append(f"### {tc['tool']}({_format_args(tc['args'])})")
-            # Truncate very long results — Claude gets the gist
            result = tc["result"]
            if len(result) > 2000:
                result = result[:2000] + "\n… [truncated]"
--- a/cortex/push_utils.py
+++ b/cortex/push_utils.py
@@ -0,0 +1,117 @@
+"""
+Web Push (VAPID) helpers.
+
+Subscriptions are stored per-user at:
+  home/{user}/push_subscriptions.json  →  list of {endpoint, keys:{p256dh, auth}}
+
+send_push(username, title, body, url) iterates all stored subscriptions for that
+user and fires a push. Stale endpoints (410 Gone) are pruned automatically.
+"""
+import asyncio
+import base64
+import json
+import logging
+from pathlib import Path
+
+from config import settings
+
+logger = logging.getLogger(__name__)
+
+
+def _subs_path(username: str) -> Path:
+    return settings.home_root() / username / "push_subscriptions.json"
+
+
+def load_subscriptions(username: str) -> list[dict]:
+    path = _subs_path(username)
+    if not path.exists():
+        return []
+    try:
+        return json.loads(path.read_text())
+    except Exception:
+        return []
+
+
+def _save_subscriptions(username: str, subs: list[dict]) -> None:
+    path = _subs_path(username)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text(json.dumps(subs, indent=2))
+
+
+def add_subscription(username: str, sub: dict) -> None:
+    """Upsert a subscription by endpoint URL."""
+    subs = load_subscriptions(username)
+    endpoint = sub.get("endpoint", "")
+    subs = [s for s in subs if s.get("endpoint") != endpoint]
+    subs.append(sub)
+    _save_subscriptions(username, subs)
+
+
+def remove_subscription(username: str, endpoint: str) -> bool:
+    subs = load_subscriptions(username)
+    new_subs = [s for s in subs if s.get("endpoint") != endpoint]
+    if len(new_subs) == len(subs):
+        return False
+    _save_subscriptions(username, new_subs)
+    return True
+
+
+def _get_private_key_pem() -> str:
+    """Decode the base64-encoded PEM private key from settings."""
+    raw = settings.vapid_private_key_b64.strip()
+    if not raw:
+        raise RuntimeError("VAPID_PRIVATE_KEY_B64 is not set in .env")
+    return base64.b64decode(raw).decode()
+
+
+def _send_one(sub: dict, payload: dict) -> bool:
+    """Send a push to a single subscription. Returns False if the endpoint is stale (410)."""
+    from pywebpush import webpush, WebPushException
+    from py_vapid import Vapid
+
+    try:
+        vapid = Vapid.from_pem(_get_private_key_pem().encode())
+        webpush(
+            subscription_info=sub,
+            data=json.dumps(payload),
+            vapid_private_key=vapid,
+            vapid_claims={"sub": settings.vapid_contact},
+        )
+        return True
+    except WebPushException as e:
+        if e.response is not None and e.response.status_code == 410:
+            logger.info("push endpoint gone (410), pruning: %s", sub.get("endpoint", "")[:60])
+            return False
+        logger.warning("push failed: %s", e)
+        return True  # keep the sub; might be transient
+
+
+async def send_push(username: str, title: str, body: str, url: str = "") -> dict:
+    """
+    Send a push notification to all subscriptions for username.
+    Returns {"sent": n, "pruned": m}.
+    """
+    if not settings.vapid_public_key or not settings.vapid_private_key_b64:
+        return {"error": "VAPID keys not configured"}
+
+    subs = load_subscriptions(username)
+    if not subs:
+        return {"error": f"No push subscriptions for {username}"}
+
+    payload = {"title": title, "body": body, "url": url}
+    keep = []
+    sent = 0
+    pruned = 0
+
+    for sub in subs:
+        alive = await asyncio.to_thread(_send_one, sub, payload)
+        if alive:
+            keep.append(sub)
+            sent += 1
+        else:
+            pruned += 1
+
+    if pruned:
+        _save_subscriptions(username, keep)
+
+    return {"sent": sent, "pruned": pruned}
--- a/cortex/requirements.txt
+++ b/cortex/requirements.txt
@@ -19,8 +19,17 @@ python-multipart>=0.0.9   # required by FastAPI for Form() data
 # Async HTTP client — used for local OpenAI-compatible backend (Open WebUI / Ollama)
 httpx>=0.27.0

+# Web content extraction — strips ads/nav/boilerplate, returns clean article text
+trafilatura>=1.6.0
+
 # OpenAI-compatible client — tool calling for OpenRouter / LiteLLM / any OAI-compat host
 openai>=1.0.0

-# anthropic SDK not needed — using claude CLI subprocess for auth
-# anthropic>=0.40.0
+# Web Push / VAPID — browser push notifications
+pywebpush>=2.0.0
+
+# MariaDB / MySQL connector — used by ae_db_query orchestrator tool
+pymysql>=1.1.0
+
+# Anthropic SDK — direct API key backend (alternative to CLI OAuth)
+anthropic>=0.40.0
--- a/cortex/routers/audit.py
+++ b/cortex/routers/audit.py
@@ -0,0 +1,122 @@
+"""
+Tool audit log endpoints.
+
+Self-service (any authenticated user, own data):
+  GET /api/audit/files                 → list of available date strings (newest first)
+  GET /api/audit/day?date=YYYY-MM-DD   → entries for one day
+
+Admin-only (cross-user aggregation):
+  GET /api/audit/recent?user=scott&days=7&limit=200
+  GET /api/audit/stats?user=scott&days=7
+"""
+import jwt
+from collections import Counter
+from datetime import date, timedelta
+from fastapi import APIRouter, HTTPException, Query, Request
+
+from auth_utils import COOKIE_NAME, decode_token, get_user_role
+from config import settings
+import tool_audit
+from persona import list_users
+
+router = APIRouter(prefix="/api/audit")
+
+
+def _session_user(request: Request) -> str:
+    """Return the authenticated username or raise 401."""
+    token = request.cookies.get(COOKIE_NAME)
+    if not token:
+        raise HTTPException(status_code=401, detail="Not authenticated")
+    try:
+        return decode_token(token)
+    except jwt.InvalidTokenError:
+        raise HTTPException(status_code=401, detail="Invalid session")
+
+
+def _require_admin(request: Request) -> str:
+    username = _session_user(request)
+    if get_user_role(username) != "admin":
+        raise HTTPException(status_code=403, detail="Admin access required")
+    return username
+
+
+@router.get("/files")
+async def audit_files(request: Request) -> dict:
+    """List available audit log dates for the current user (newest first)."""
+    username = _session_user(request)
+    audit_dir = settings.home_root() / username / "tool_audit"
+    if not audit_dir.exists():
+        return {"dates": []}
+    dates = sorted(
+        [p.stem for p in audit_dir.glob("*.jsonl") if p.stem],
+        reverse=True,
+    )
+    return {"dates": dates}
+
+
+@router.get("/day")
+async def audit_day(
+    request: Request,
+    date: str = Query(..., description="YYYY-MM-DD"),
+) -> dict:
+    """Return all entries for a specific day (current user only)."""
+    username = _session_user(request)
+    try:
+        from datetime import date as _date
+        d = _date.fromisoformat(date)
+    except ValueError:
+        raise HTTPException(status_code=400, detail="date must be YYYY-MM-DD")
+    entries = tool_audit.read_day(username, date)
+    return {"date": date, "entries": entries}
+
+
+@router.get("/recent")
+async def audit_recent(
+    request: Request,
+    user: str = Query(None, description="Username to filter (omit for all users)"),
+    days: int = Query(7,   ge=1, le=90),
+    limit: int = Query(200, ge=1, le=1000),
+) -> dict:
+    _require_admin(request)
+
+    if user:
+        if user not in list_users():
+            raise HTTPException(status_code=404, detail=f"User not found: {user}")
+        entries = tool_audit.read_recent(user, days=days, limit=limit)
+    else:
+        entries = tool_audit.read_recent_all_users(days=days, limit=limit)
+
+    return {"entries": entries, "count": len(entries), "days": days}
+
+
+@router.get("/stats")
+async def audit_stats(
+    request: Request,
+    user: str = Query(None),
+    days: int = Query(7, ge=1, le=90),
+) -> dict:
+    _require_admin(request)
+
+    if user:
+        if user not in list_users():
+            raise HTTPException(status_code=404, detail=f"User not found: {user}")
+        entries = tool_audit.read_recent(user, days=days, limit=10000)
+    else:
+        entries = tool_audit.read_recent_all_users(days=days, limit=10000)
+
+    tool_counts: Counter = Counter()
+    status_counts: Counter = Counter()
+    user_counts: Counter = Counter()
+
+    for e in entries:
+        tool_counts[e.get("tool", "?")] += 1
+        status_counts[e.get("status", "?")] += 1
+        user_counts[e.get("user", "?")] += 1
+
+    return {
+        "total": len(entries),
+        "days": days,
+        "by_tool":   dict(tool_counts.most_common()),
+        "by_status": dict(status_counts),
+        "by_user":   dict(user_counts.most_common()),
+    }
--- a/cortex/routers/chat.py
+++ b/cortex/routers/chat.py
@@ -8,19 +8,20 @@ from pydantic import BaseModel
 from context_loader import load_context
 from llm_client import complete
 from session_logger import log_turn
-from session_store import load as load_session, save as save_session, list_all, generate_session_id, delete as delete_session, rename as rename_session
+from session_store import load as load_session, save as save_session, list_all, generate_session_id, delete as delete_session, rename as rename_session, get_name as get_session_name
 from config import settings
 from persona import set_context, validate as validate_persona
 from auth_utils import COOKIE_NAME, decode_token
 import model_registry
 import event_bus
+from model_registry import get_role_config


 router = APIRouter()


 def _backend_label(backend: str, username: str, role: str = "chat") -> str:
-    """Human-readable label for the model that handled a request."""
+    """Human-readable label for the model that handled a request (legacy path)."""
    if backend == "claude":
        return "Claude"
    if backend == "gemini":
@@ -33,17 +34,34 @@ def _backend_label(backend: str, username: str, role: str = "chat") -> str:
    return backend.title()


+def _role_model_label(username: str, role: str, actual_backend: str) -> str:
+    """Return the model label for a role, falling back to the generic backend label."""
+    cfg = model_registry.get_model_for_role(username, role)
+    if cfg:
+        return cfg.get("label") or cfg.get("model_name") or _backend_label(actual_backend, username, role)
+    return _backend_label(actual_backend, username, role)
+
+
+class Attachment(BaseModel):
+    filename: str
+    mime_type: str
+    data: str  # base64 data URL for images (e.g. "data:image/png;base64,...")
+
+
 class ChatRequest(BaseModel):
    message: str
    session_id: str | None = None
    tier: int | None = None
-    model: str | None = None  # "claude" or "gemini" to override; None = use primary_backend
+    model: str | None = None        # legacy backend override ("claude"|"gemini"|"local")
+    slot: str | None = None         # Phase 3: explicit slot ("primary"|"backup_1"|"backup_2")
+    chat_role: str = "chat"         # active role: "chat"|"coder"|"research"|"distill" etc.
    include_long: bool = True
    include_mid: bool = True
    include_short: bool = True
-    off_record: bool = False  # skip session log (in-memory context preserved)
+    off_record: bool = False        # skip session log (in-memory context preserved)
    user: str = "scott"
    persona: str = "inara"
+    attachment: Attachment | None = None  # image attachment (text files injected client-side)


 class BackendRequest(BaseModel):
@@ -81,19 +99,39 @@ async def _stream_chat(req: ChatRequest):
    session_id = req.session_id or generate_session_id()
    tier = req.tier or settings.default_tier

+    role_cfg = get_role_config(user, req.chat_role)
    system_prompt = load_context(
        tier,
        include_long=req.include_long,
        include_mid=req.include_mid,
        include_short=req.include_short,
+        inject_datetime=role_cfg.get("inject_datetime", True),
+        inject_mode=role_cfg.get("inject_mode", True),
+        mode="otr" if req.off_record else "chat",
    )
    history = load_session(session_id)
-    history.append({"role": "user", "content": req.message})
+
+    # req.message already contains the full user text:
+    # - text files: client embedded content as a fenced code block
+    # - images: client added "📎 filename.png" note; image data is in req.attachment
+    # History always stores text only — base64 image data is never written to disk.
+    llm_attachment: dict | None = None
+    if req.attachment and req.attachment.mime_type.startswith("image/"):
+        llm_attachment = {
+            "filename": req.attachment.filename,
+            "mime_type": req.attachment.mime_type,
+            "data": req.attachment.data,
+        }
+
+    history.append({"role": "user", "content": req.message, "off_record": req.off_record})

    task = asyncio.create_task(complete(
        system_prompt=system_prompt,
        messages=history,
        model=req.model,
+        role=req.chat_role,
+        slot=req.slot,
+        attachment=llm_attachment,
    ))

    try:
@@ -109,7 +147,11 @@ async def _stream_chat(req: ChatRequest):

        try:
            response_text, actual_backend = task.result()
-            backend_label = _backend_label(actual_backend, user, role="chat")
+            if req.slot:
+                slot_cfg = model_registry.get_model_for_slot(user, req.chat_role, req.slot)
+                backend_label = (slot_cfg or {}).get("label") or _role_model_label(user, req.chat_role, actual_backend)
+            else:
+                backend_label = _role_model_label(user, req.chat_role, actual_backend)
            host = platform.node()
            history.append({
                "role": "assistant",
@@ -117,6 +159,7 @@ async def _stream_chat(req: ChatRequest):
                "backend": actual_backend,
                "backend_label": backend_label,
                "host": host,
+                "off_record": req.off_record,
            })
            save_session(session_id, history)
            if not req.off_record:
@@ -164,28 +207,94 @@ _BACKEND_CYCLE = ("claude", "gemini", "local")
 _BACKEND_FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude"}


+def _request_user(request: Request) -> str | None:
+    """Extract username from JWT cookie, or None."""
+    try:
+        token = request.cookies.get(COOKIE_NAME)
+        return decode_token(token) if token else None
+    except (jwt.InvalidTokenError, Exception):
+        return None
+
+
 def _local_model_info(request: Request) -> dict | None:
    """Return the best local model {label, model_name} for the session user, or None."""
+    username = _request_user(request)
+    if not username:
+        return None
    try:
-        token    = request.cookies.get(COOKIE_NAME)
-        username = decode_token(token) if token else None
-        if not username:
-            return None
        cfg = model_registry.get_best_local_model(username, "chat")
        if cfg:
            return {"label": cfg.get("label", ""), "model_name": cfg.get("model_name", "")}
-    except (jwt.InvalidTokenError, Exception):
+    except Exception:
        pass
    return None


+def _chat_slot_models(username: str) -> list[dict]:
+    """Return [{slot, label, type}] for each configured slot in the chat role, primary first."""
+    registry = model_registry.get_registry(username)
+    role_slots = registry.get("roles", {}).get("chat", {})
+    result = []
+    for slot_key in model_registry.PRIORITY_KEYS:
+        model_id = role_slots.get(slot_key)
+        if not model_id:
+            continue
+        resolved = model_registry._resolve_model(registry, model_id)
+        if resolved:
+            result.append({
+                "slot":  slot_key,
+                "label": resolved.get("label") or resolved.get("model_name") or "",
+                "type":  resolved.get("type", ""),
+            })
+    return result
+
+
+def _available_roles_for_toggle(username: str) -> list[dict]:
+    """Return roles with a primary model assigned (excluding orchestrator) for the UI toggle.
+
+    Returns [{role, label, model_label, type}] ordered by settings.defined_roles.
+    """
+    registry = model_registry.get_registry(username)
+    roles_cfg = registry.get("roles", {})
+    result = []
+    for role_name in settings.get_defined_roles():
+        if role_name == "orchestrator":
+            continue
+        primary_id = roles_cfg.get(role_name, {}).get("primary")
+        if not primary_id:
+            continue
+        resolved = model_registry._resolve_model(registry, primary_id)
+        if resolved:
+            result.append({
+                "role":        role_name,
+                "label":       role_name.title(),
+                "model_label": resolved.get("label") or resolved.get("model_name") or "",
+                "type":        resolved.get("type", ""),
+            })
+    return result
+
+
@router.get("/backend")
 async def get_backend(request: Request) -> dict:
+    username        = _request_user(request)
+    chat_models     = _chat_slot_models(username) if username else []
+    available_roles = _available_roles_for_toggle(username) if username else []
    p = settings.primary_backend
+
+    orch_label = None
+    if username:
+        orch_cfg = model_registry.get_model_for_role(username, "orchestrator")
+        if orch_cfg:
+            orch_label = orch_cfg.get("label") or orch_cfg.get("model_name") or None
+
    return {
-        "primary":      p,
-        "fallback":     _BACKEND_FALLBACK.get(p, "claude"),
-        "local_model":  _local_model_info(request),
+        "chat_models":        chat_models,       # Phase 3: [{slot, label, type}] for chat-role slots
+        "available_roles":    available_roles,    # kept for banner + backward compat
+        "orchestrator_model": orch_label,
+        # Legacy fields kept for backward compat
+        "primary":     p,
+        "fallback":    _BACKEND_FALLBACK.get(p, "claude"),
+        "local_model": _local_model_info(request),
    }


@@ -217,7 +326,8 @@ async def get_history(
    persona: str = Query("inara"),
 ) -> dict:
    _set_ctx(user, persona)
-    return {"session_id": session_id, "messages": load_session(session_id)}
+    name = get_session_name(session_id)
+    return {"session_id": session_id, "name": name, "messages": load_session(session_id)}


@router.get("/sessions")
@@ -247,6 +357,53 @@ async def rename_session_endpoint(
    return {"ok": True, "session_id": session_id, "name": req.name.strip()}


+@router.post("/api/sessions/backfill-names")
+async def backfill_session_names(
+    request: Request,
+    user: str = Query(""),
+    persona: str = Query(""),
+) -> dict:
+    """Name every unnamed session using its first user message (truncated to 60 chars).
+    Idempotent — only touches sessions that have no name set.
+    user/persona default to the JWT session user + last-used persona cookie."""
+    # Resolve user from JWT if not provided
+    if not user:
+        token = request.cookies.get(COOKIE_NAME)
+        if not token:
+            raise HTTPException(status_code=401, detail="Not authenticated")
+        try:
+            user = decode_token(token)
+        except jwt.InvalidTokenError:
+            raise HTTPException(status_code=401, detail="Invalid session")
+
+    # Resolve persona from cookie if not provided
+    if not persona:
+        from persona import list_user_personas
+        persona_cookie = request.cookies.get("cx_last_persona", "")
+        available = list_user_personas(user)
+        persona = persona_cookie if persona_cookie in available else (available[0] if available else "")
+    if not persona:
+        raise HTTPException(status_code=400, detail="No persona found for user")
+
+    _set_ctx(user, persona)
+    sessions = list_all()
+    named = 0
+    for s in sessions:
+        if s.get("name"):
+            continue
+        messages = load_session(s["session_id"])
+        first_user = next((m for m in messages if m.get("role") == "user"), None)
+        if not first_user:
+            continue
+        text = (first_user.get("content") or "").strip()
+        if not text:
+            continue
+        auto_name = text[:60].rstrip() + ("…" if len(text) > 60 else "")
+        rename_session(s["session_id"], auto_name)
+        named += 1
+    return {"ok": True, "named": named, "total": len(sessions)}
+
+
@router.delete("/sessions/{session_id}")
 async def delete_session_endpoint(
    session_id: str,
--- a/cortex/routers/crons.py
+++ b/cortex/routers/crons.py
@@ -0,0 +1,479 @@
+"""
+Schedules web UI — GET/POST /settings/crons/*
+
+Lets users view, add, toggle, and remove cron jobs without going through the AI.
+Cron data lives in home/{user}/persona/{persona}/CRONS.json.
+Scheduler registration mirrors what tools/cron.py does so changes take effect immediately.
+"""
+
+import html as _html
+import logging
+import secrets
+from datetime import datetime, timezone
+from pathlib import Path
+
+import jwt
+from fastapi import APIRouter, Form, Request
+from fastapi.responses import HTMLResponse, RedirectResponse
+
+from auth_utils import COOKIE_NAME, decode_token
+from cron_runner import load_crons, save_crons, parse_schedule
+from persona import list_user_personas
+
+logger = logging.getLogger(__name__)
+router = APIRouter()
+
+_STATIC = Path(__file__).parent.parent / "static"
+_LAST_PERSONA_COOKIE = "cx_last_persona"
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _get_session_user(request: Request) -> str | None:
+    token = request.cookies.get(COOKIE_NAME)
+    if not token:
+        return None
+    try:
+        return decode_token(token)
+    except jwt.InvalidTokenError:
+        return None
+
+
+def _preferred_persona(request: Request, username: str) -> str:
+    names = list_user_personas(username)
+    if not names:
+        return ""
+    cookie_val = request.cookies.get(_LAST_PERSONA_COOKIE, "")
+    if cookie_val in names:
+        return cookie_val
+    return names[0]
+
+
+def _integrations_nav(username: str) -> str:
+    from auth_utils import _read_auth
+    role = _read_auth(username).get("role", "user")
+    if role == "admin":
+        return '<a href="/settings/integrations" class="nav-link">Integrations</a>'
+    return ""
+
+
+def _now() -> str:
+    return datetime.now(timezone.utc).isoformat()
+
+
+def _short_id() -> str:
+    return "c_" + secrets.token_urlsafe(6)
+
+
+def _scheduler_add(job: dict, sched_kwargs: dict) -> None:
+    import asyncio
+    try:
+        import scheduler as sched_module
+        from cron_runner import run_job
+        s = sched_module.get_scheduler()
+        if s and s.running:
+            sched_id = f"{job['user']}:{job['persona']}:{job['id']}"
+            s.add_job(
+                lambda j=job: asyncio.ensure_future(run_job(j)),
+                "cron",
+                id=sched_id,
+                replace_existing=True,
+                **sched_kwargs,
+            )
+    except Exception as e:
+        logger.warning("scheduler_add failed: %s", e)
+
+
+def _scheduler_remove(job_id: str) -> None:
+    try:
+        import scheduler as sched_module
+        s = sched_module.get_scheduler()
+        if s and s.running:
+            s.remove_job(job_id)
+    except Exception:
+        pass
+
+
+def _scheduler_pause(job_id: str) -> None:
+    try:
+        import scheduler as sched_module
+        s = sched_module.get_scheduler()
+        if s and s.running:
+            s.pause_job(job_id)
+    except Exception:
+        pass
+
+
+def _scheduler_resume(job_id: str) -> None:
+    try:
+        import scheduler as sched_module
+        s = sched_module.get_scheduler()
+        if s and s.running:
+            s.resume_job(job_id)
+    except Exception:
+        pass
+
+
+_TYPE_CLASS = {
+    "remind": "badge-remind", "note": "badge-note", "message": "badge-message",
+    "brief": "badge-brief", "task": "badge-task",
+}
+
+
+def _render_cron_list(username: str) -> str:
+    personas = list_user_personas(username)
+    if not personas:
+        return '<div class="empty-state">No personas found.</div>'
+
+    all_empty = True
+    html_parts: list[str] = []
+
+    for persona in personas:
+        crons = load_crons(username, persona)
+        if not crons:
+            continue
+        all_empty = False
+
+        rows = []
+        for c in crons:
+            cid        = _html.escape(c["id"])
+            label      = _html.escape(c.get("label", ""))
+            schedule   = _html.escape(c.get("schedule", ""))
+            job_type   = _html.escape(c.get("type", ""))
+            payload    = _html.escape(c.get("payload", ""))
+            enabled    = c.get("enabled", True)
+            last_run   = (c.get("last_run") or "")[:10] or "never"
+            pers_esc   = _html.escape(persona)
+
+            type_class = _TYPE_CLASS.get(c.get("type", ""), "badge-note")
+            status_cls = "badge-enabled" if enabled else "badge-paused"
+            status_txt = "enabled" if enabled else "paused"
+            toggle_txt = "Pause" if enabled else "Resume"
+
+            rows.append(f"""
+        <tr>
+          <td>{label}</td>
+          <td><code>{schedule}</code></td>
+          <td><span class="badge {type_class}">{job_type}</span></td>
+          <td class="payload-cell" title="{payload}">{payload}</td>
+          <td>{last_run}</td>
+          <td><span class="badge {status_cls}">{status_txt}</span></td>
+          <td>
+            <div class="cron-actions">
+              <a href="/settings/crons/edit?cron_id={cid}&persona={pers_esc}"
+                 class="btn-cron">Edit</a>
+              <form method="POST" action="/settings/crons/toggle" style="display:inline">
+                <input type="hidden" name="cron_id" value="{cid}">
+                <input type="hidden" name="persona" value="{pers_esc}">
+                <button type="submit" class="btn-cron">{toggle_txt}</button>
+              </form>
+              <form method="POST" action="/settings/crons/remove" style="display:inline"
+                    onsubmit="return confirm('Delete this schedule?')">
+                <input type="hidden" name="cron_id" value="{cid}">
+                <input type="hidden" name="persona" value="{pers_esc}">
+                <button type="submit" class="btn-cron btn-cron-del">Delete</button>
+              </form>
+            </div>
+          </td>
+        </tr>""")
+
+        html_parts.append(f"""
+    <div class="persona-group">
+      <p class="persona-group-label">{_html.escape(persona)}</p>
+      <table class="cron-table">
+        <thead>
+          <tr>
+            <th>Label</th><th>Schedule</th><th>Type</th>
+            <th>Payload</th><th>Last run</th><th>Status</th><th></th>
+          </tr>
+        </thead>
+        <tbody>{"".join(rows)}
+        </tbody>
+      </table>
+    </div>""")
+
+    if all_empty:
+        return '<div class="empty-state">No schedules yet. Add one below.</div>'
+
+    return "\n".join(html_parts)
+
+
+def _persona_options(username: str, selected: str = "") -> str:
+    personas = list_user_personas(username)
+    return "\n".join(
+        f'<option value="{_html.escape(p)}"{"selected" if p == selected else ""}>{_html.escape(p)}</option>'
+        for p in personas
+    )
+
+
+_TYPE_OPTIONS = ("remind", "note", "message", "brief", "task")
+_TYPE_LABELS  = {
+    "remind":  "remind — append to REMINDERS.md",
+    "note":    "note — append to SCRATCH.md",
+    "message": "message — send payload as-is",
+    "brief":   "brief — LLM response, no tools",
+    "task":    "task — full orchestrator tool loop",
+}
+
+
+def _render_edit_form(job: dict, persona: str) -> str:
+    cid      = _html.escape(job["id"])
+    pers_esc = _html.escape(persona)
+    label    = _html.escape(job.get("label", ""))
+    schedule = _html.escape(job.get("schedule", ""))
+    payload  = _html.escape(job.get("payload", ""))
+    cur_type = job.get("type", "remind")
+
+    type_opts = "\n".join(
+        f'<option value="{t}" {"selected" if t == cur_type else ""}>{_html.escape(_TYPE_LABELS.get(t, t))}</option>'
+        for t in _TYPE_OPTIONS
+    )
+
+    return f"""
+    <div class="section" style="border: 2px solid var(--pg-accent); border-radius: 8px; padding: 1rem;">
+      <h2 style="margin-top:0">Edit schedule</h2>
+      <form method="POST" action="/settings/crons/save">
+        <input type="hidden" name="cron_id" value="{cid}">
+        <input type="hidden" name="persona" value="{pers_esc}">
+        <div class="add-form-grid">
+          <div class="field">
+            <label>Persona</label>
+            <input type="text" value="{pers_esc}" disabled style="opacity:0.5">
+          </div>
+          <div class="field">
+            <label for="edit_job_type">Type</label>
+            <select id="edit_job_type" name="job_type">{type_opts}</select>
+          </div>
+          <div class="field">
+            <label for="edit_label">Label</label>
+            <input type="text" id="edit_label" name="label" value="{label}" required autocomplete="off">
+          </div>
+          <div class="field">
+            <label for="edit_schedule">Schedule</label>
+            <input type="text" id="edit_schedule" name="schedule" value="{schedule}"
+                   required autocomplete="off" spellcheck="false">
+            <p class="hint">
+              hourly · daily · daily:HH:MM · weekly:DOW · weekly:DOW:HH:MM ·
+              monthly · monthly:DD · monthly:DD:HH:MM · yearly:MM:DD · yearly:MM:DD:HH:MM
+            </p>
+          </div>
+          <div class="field field-full">
+            <label for="edit_payload">Payload / prompt</label>
+            <textarea id="edit_payload" name="payload" rows="3" required>{payload}</textarea>
+          </div>
+        </div>
+        <div style="display:flex; gap:0.5rem; align-items:center; margin-top:0.5rem">
+          <button type="submit" class="btn-submit" style="margin-top:0">Save changes</button>
+          <a href="/settings/crons" style="font-size:0.85rem; color:var(--pg-muted)">Cancel</a>
+        </div>
+      </form>
+    </div>"""
+
+
+def _render_page(username: str, back_persona: str = "", success: str = "", error: str = "",
+                 edit_html: str = "") -> str:
+    html = (_STATIC / "crons.html").read_text()
+    html = html.replace("{{ edit_html }}", edit_html)
+    html = html.replace("{{ cron_list_html }}", _render_cron_list(username))
+    html = html.replace("{{ persona_options }}", _persona_options(username, back_persona))
+    html = html.replace("{{ back_href }}", f"/{username}/{back_persona}" if back_persona else "/")
+    html = html.replace("{{ help_href }}", f"/help?persona={back_persona}" if back_persona else "/help")
+    html = html.replace("{{ integrations_nav }}", _integrations_nav(username))
+    if success:
+        html = html.replace("<!-- SUCCESS -->", f'<p class="success">{_html.escape(success)}</p>')
+    if error:
+        html = html.replace("<!-- ERROR -->", f'<p class="error">{_html.escape(error)}</p>')
+    return html
+
+
+# ---------------------------------------------------------------------------
+# Routes
+# ---------------------------------------------------------------------------
+
+@router.get("/settings/crons", include_in_schema=False)
+async def crons_page(request: Request):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+    return HTMLResponse(_render_page(username, back_persona))
+
+
+@router.post("/settings/crons/add", include_in_schema=False)
+async def cron_add(
+    request: Request,
+    persona:   str = Form(""),
+    label:     str = Form(""),
+    schedule:  str = Form(""),
+    job_type:  str = Form(""),
+    payload:   str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+
+    label    = label.strip()
+    schedule = schedule.strip()
+    payload  = payload.strip()
+    persona  = persona.strip()
+
+    _VALID_TYPES = ("remind", "note", "message", "brief", "task")
+    if job_type not in _VALID_TYPES:
+        return HTMLResponse(_render_page(username, back_persona, error=f"Invalid type: {job_type}"))
+
+    try:
+        sched_kwargs = parse_schedule(schedule)
+    except ValueError as e:
+        return HTMLResponse(_render_page(username, back_persona, error=f"Bad schedule: {e}"))
+
+    if not label:
+        return HTMLResponse(_render_page(username, back_persona, error="Label is required."))
+    if not payload:
+        return HTMLResponse(_render_page(username, back_persona, error="Payload is required."))
+
+    crons = load_crons(username, persona)
+    job = {
+        "id":         _short_id(),
+        "user":       username,
+        "persona":    persona,
+        "label":      label,
+        "schedule":   schedule,
+        "type":       job_type,
+        "payload":    payload,
+        "enabled":    True,
+        "created_at": _now(),
+        "last_run":   None,
+    }
+    crons.append(job)
+    save_crons(crons, username, persona)
+    _scheduler_add(job, sched_kwargs)
+
+    logger.info("cron added via UI: %s %s [%s]", job["id"], schedule, job_type)
+    return HTMLResponse(_render_page(username, back_persona, success=f"Schedule '{label}' added."))
+
+
+@router.post("/settings/crons/toggle", include_in_schema=False)
+async def cron_toggle(
+    request: Request,
+    cron_id: str = Form(""),
+    persona: str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+
+    crons = load_crons(username, persona)
+    for c in crons:
+        if c["id"] == cron_id:
+            c["enabled"] = not c.get("enabled", True)
+            save_crons(crons, username, persona)
+            sched_id = f"{username}:{persona}:{cron_id}"
+            if c["enabled"]:
+                _scheduler_resume(sched_id)
+                action = "resumed"
+            else:
+                _scheduler_pause(sched_id)
+                action = "paused"
+            logger.info("cron %s %s via UI", cron_id, action)
+            return HTMLResponse(_render_page(username, back_persona, success=f"Schedule {action}."))
+
+    return HTMLResponse(_render_page(username, back_persona, error=f"Schedule not found: {cron_id}"))
+
+
+@router.post("/settings/crons/remove", include_in_schema=False)
+async def cron_remove(
+    request: Request,
+    cron_id: str = Form(""),
+    persona: str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+
+    crons = load_crons(username, persona)
+    before = len(crons)
+    crons = [c for c in crons if c["id"] != cron_id]
+    if len(crons) == before:
+        return HTMLResponse(_render_page(username, back_persona, error=f"Schedule not found: {cron_id}"))
+
+    save_crons(crons, username, persona)
+    _scheduler_remove(f"{username}:{persona}:{cron_id}")
+    logger.info("cron %s removed via UI", cron_id)
+    return HTMLResponse(_render_page(username, back_persona, success="Schedule deleted."))
+
+
+@router.get("/settings/crons/edit", include_in_schema=False)
+async def cron_edit_page(request: Request, cron_id: str = "", persona: str = ""):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+
+    crons = load_crons(username, persona)
+    job = next((c for c in crons if c["id"] == cron_id), None)
+    if not job:
+        return HTMLResponse(_render_page(username, back_persona, error=f"Schedule not found: {cron_id}"))
+
+    edit_html = _render_edit_form(job, persona)
+    return HTMLResponse(_render_page(username, back_persona, edit_html=edit_html))
+
+
+@router.post("/settings/crons/save", include_in_schema=False)
+async def cron_save(
+    request: Request,
+    cron_id:  str = Form(""),
+    persona:  str = Form(""),
+    label:    str = Form(""),
+    schedule: str = Form(""),
+    job_type: str = Form(""),
+    payload:  str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+
+    label    = label.strip()
+    schedule = schedule.strip()
+    payload  = payload.strip()
+
+    if job_type not in _TYPE_OPTIONS:
+        return HTMLResponse(_render_page(username, back_persona, error=f"Invalid type: {job_type}"))
+    if not label:
+        return HTMLResponse(_render_page(username, back_persona, error="Label is required."))
+    if not payload:
+        return HTMLResponse(_render_page(username, back_persona, error="Payload is required."))
+
+    try:
+        sched_kwargs = parse_schedule(schedule)
+    except ValueError as e:
+        # Re-render with the edit form still open so the user can fix the schedule
+        crons = load_crons(username, persona)
+        job = next((c for c in crons if c["id"] == cron_id), None)
+        edit_html = _render_edit_form(job, persona) if job else ""
+        return HTMLResponse(_render_page(username, back_persona, error=f"Bad schedule: {e}",
+                                         edit_html=edit_html))
+
+    crons = load_crons(username, persona)
+    for c in crons:
+        if c["id"] == cron_id:
+            c["label"]   = label
+            c["schedule"] = schedule
+            c["type"]    = job_type
+            c["payload"] = payload
+            save_crons(crons, username, persona)
+            # Replace the live scheduler job with the updated schedule
+            sched_id = f"{username}:{persona}:{cron_id}"
+            _scheduler_remove(sched_id)
+            if c.get("enabled", True):
+                _scheduler_add(c, sched_kwargs)
+            logger.info("cron %s updated via UI [%s]", cron_id, schedule)
+            return HTMLResponse(_render_page(username, back_persona,
+                                             success=f"Schedule '{label}' updated."))
+
+    return HTMLResponse(_render_page(username, back_persona, error=f"Schedule not found: {cron_id}"))
--- a/cortex/routers/distill.py
+++ b/cortex/routers/distill.py
@@ -1,25 +1,103 @@
 """
 Manual memory distillation endpoints.

-  POST /distill/short  — roll session logs → MEMORY_SHORT.md (no LLM)
-  POST /distill/mid    — summarize short   → MEMORY_MID.md   (LLM)
-  POST /distill/long   — integrate mid     → MEMORY_LONG.md  (LLM)
-  POST /distill/all    — run all three in sequence
+  POST /distill/short    — roll session logs → MEMORY_SHORT.md (no LLM)
+  POST /distill/mid      — summarize short   → MEMORY_MID.md   (LLM)
+  POST /distill/long     — integrate mid     → MEMORY_LONG.md  (LLM)
+  POST /distill/all      — run all three in sequence
+  POST /distill/rebuild  — wipe mid + long, then run all three from scratch
+
+All endpoints require ?user=<username>&persona=<name> query params.
+
+Concurrency: one distillation at a time per persona. A second request while one
+is running returns 409 immediately — no silent queuing.
 """
-from fastapi import APIRouter
+import asyncio
+from datetime import datetime, timedelta
+from fastapi import APIRouter, HTTPException, Query
 from memory_distiller import distill_short, distill_mid, distill_long
+from persona import validate as validate_persona, set_context, persona_path as _persona_path
 import scheduler

 router = APIRouter(prefix="/distill")

+# Per-persona asyncio lock. Key: (user, persona)
+_LOCKS: dict[tuple, asyncio.Lock] = {}
+_LOCKS_META: dict[tuple, str] = {}  # key → which step is currently running
+
+# Minimum time between successive runs of each endpoint, per persona.
+# Prevents accidental rapid-fire runs and token waste.
+_COOLDOWNS: dict[tuple, timedelta] = {
+    "short":   timedelta(minutes=1),
+    "mid":     timedelta(minutes=30),
+    "long":    timedelta(hours=6),
+    "all":     timedelta(hours=1),
+    "rebuild": timedelta(hours=6),
+}
+_LAST_RUN: dict[tuple, datetime] = {}  # key: (user, persona, endpoint)
+
+
+def _get_lock(user: str, persona: str) -> asyncio.Lock:
+    key = (user, persona)
+    if key not in _LOCKS:
+        _LOCKS[key] = asyncio.Lock()
+    return _LOCKS[key]
+
+
+def _resolve(user: str, persona: str) -> tuple[str, str]:
+    try:
+        u, p = validate_persona(user, persona)
+    except Exception:
+        raise HTTPException(status_code=404, detail=f"Persona not found: {user}/{persona}")
+    set_context(u, p)
+    return u, p
+
+
+def _check_lock(user: str, persona: str) -> asyncio.Lock:
+    """Return the lock if free, raise 409 if already held."""
+    lock = _get_lock(user, persona)
+    if lock.locked():
+        step = _LOCKS_META.get((user, persona), "distillation")
+        raise HTTPException(
+            status_code=409,
+            detail=f"A {step} is already running for {persona} — please wait for it to finish.",
+        )
+    return lock
+
+
+def _check_cooldown(user: str, persona: str, endpoint: str) -> None:
+    """Raise 429 if the endpoint was run too recently for this persona."""
+    cooldown = _COOLDOWNS.get(endpoint)
+    if not cooldown:
+        return
+    key = (user, persona, endpoint)
+    last = _LAST_RUN.get(key)
+    if last:
+        elapsed = datetime.now() - last
+        if elapsed < cooldown:
+            remaining = cooldown - elapsed
+            mins = int(remaining.total_seconds() // 60)
+            secs = int(remaining.total_seconds() % 60)
+            wait = f"{mins}m {secs}s" if mins else f"{secs}s"
+            raise HTTPException(
+                status_code=429,
+                detail=f"{endpoint} was just run — please wait {wait} before running again.",
+            )
+
+
+def _record_run(user: str, persona: str, endpoint: str) -> None:
+    _LAST_RUN[(user, persona, endpoint)] = datetime.now()
+

@router.get("/status")
 async def distill_status() -> dict:
-    """Show auto-distillation schedule and next run times."""
    from config import settings
+    # Include which personas are currently distilling
+    active = [f"{u}/{p}" for (u, p), lock in _LOCKS.items() if lock.locked()]
    return {
        "enabled": settings.auto_distill,
        "jobs": scheduler.status(),
+        "active": active,
        "config": {
            "short": settings.auto_distill_short,
            "mid": settings.auto_distill_mid,
@@ -29,32 +107,132 @@ async def distill_status() -> dict:


@router.post("/short")
-async def do_distill_short() -> dict:
-    return {"ok": True, **distill_short()}
+async def do_distill_short(
+    user: str = Query(...),
+    persona: str = Query(...),
+) -> dict:
+    u, p = _resolve(user, persona)
+    _check_cooldown(u, p, "short")
+    lock = _check_lock(u, p)
+    async with lock:
+        _LOCKS_META[(u, p)] = "short distill"
+        try:
+            result = distill_short(u, p)
+            _record_run(u, p, "short")
+            return {"ok": True, **result}
+        finally:
+            _LOCKS_META.pop((u, p), None)


@router.post("/mid")
-async def do_distill_mid() -> dict:
-    result = await distill_mid()
-    return {"ok": "error" not in result, **result}
+async def do_distill_mid(
+    user: str = Query(...),
+    persona: str = Query(...),
+) -> dict:
+    u, p = _resolve(user, persona)
+    _check_cooldown(u, p, "mid")
+    lock = _check_lock(u, p)
+    async with lock:
+        _LOCKS_META[(u, p)] = "mid distill"
+        try:
+            result = await distill_mid(u, p)
+            if "error" not in result:
+                _record_run(u, p, "mid")
+            return {"ok": "error" not in result, **result}
+        finally:
+            _LOCKS_META.pop((u, p), None)


@router.post("/long")
-async def do_distill_long() -> dict:
-    result = await distill_long()
-    return {"ok": "error" not in result, **result}
+async def do_distill_long(
+    user: str = Query(...),
+    persona: str = Query(...),
+) -> dict:
+    u, p = _resolve(user, persona)
+    _check_cooldown(u, p, "long")
+    lock = _check_lock(u, p)
+    async with lock:
+        _LOCKS_META[(u, p)] = "long distill"
+        try:
+            result = await distill_long(u, p)
+            if "error" not in result:
+                _record_run(u, p, "long")
+            return {"ok": "error" not in result, **result}
+        finally:
+            _LOCKS_META.pop((u, p), None)


@router.post("/all")
-async def do_distill_all() -> dict:
-    short_result = distill_short()
-    mid_result = await distill_mid()
-    if "error" in mid_result:
-        return {"ok": False, "short": short_result, "mid": mid_result}
-    long_result = await distill_long()
-    return {
-        "ok": "error" not in long_result,
-        "short": short_result,
-        "mid": mid_result,
-        "long": long_result,
-    }
+async def do_distill_all(
+    user: str = Query(...),
+    persona: str = Query(...),
+) -> dict:
+    u, p = _resolve(user, persona)
+    _check_cooldown(u, p, "all")
+    lock = _check_lock(u, p)
+    async with lock:
+        _LOCKS_META[(u, p)] = "full distill"
+        try:
+            short_result = distill_short(u, p)
+            mid_result = await distill_mid(u, p)
+            if "error" in mid_result:
+                return {"ok": False, "short": short_result, "mid": mid_result}
+            long_result = await distill_long(u, p)
+            ok = "error" not in long_result
+            if ok:
+                _record_run(u, p, "all")
+            return {
+                "ok": ok,
+                "short": short_result,
+                "mid": mid_result,
+                "long": long_result,
+            }
+        finally:
+            _LOCKS_META.pop((u, p), None)
+
+
+@router.post("/rebuild")
+async def do_distill_rebuild(
+    user: str = Query(...),
+    persona: str = Query(...),
+) -> dict:  # noqa: E501
+    """Wipe MEMORY_MID and MEMORY_LONG (with backups), then run short → mid → long.
+
+    Use when memories have drifted, been corrupted, or you want a clean slate
+    rebuilt purely from session logs. Hand-edited content will be replaced.
+    """
+    u, p = _resolve(user, persona)
+    _check_cooldown(u, p, "rebuild")
+    lock = _check_lock(u, p)
+    async with lock:
+        _LOCKS_META[(u, p)] = "memory rebuild"
+        try:
+            from memory_distiller import _rotate_backup, _read
+            inara_dir = _persona_path(u, p)
+
+            # Back up then wipe mid and long before rebuilding
+            for name in ("MEMORY_MID.md", "MEMORY_LONG.md"):
+                path = inara_dir / name
+                if path.exists():
+                    _rotate_backup(path)
+                    path.write_text(
+                        f"# {name}\n\n*Cleared for rebuild — {__import__('datetime').datetime.now().strftime('%Y-%m-%d %H:%M')}.*\n"
+                    )
+
+            short_result = distill_short(u, p)
+            mid_result = await distill_mid(u, p)
+            if "error" in mid_result:
+                return {"ok": False, "short": short_result, "mid": mid_result, "rebuilt": True}
+            long_result = await distill_long(u, p)
+            ok = "error" not in long_result
+            if ok:
+                _record_run(u, p, "rebuild")
+            return {
+                "ok": ok,
+                "short": short_result,
+                "mid": mid_result,
+                "long": long_result,
+                "rebuilt": True,
+            }
+        finally:
+            _LOCKS_META.pop((u, p), None)
--- a/cortex/routers/files.py
+++ b/cortex/routers/files.py
@@ -6,6 +6,7 @@ import re
 from fastapi import APIRouter, HTTPException, Query
 from pydantic import BaseModel
 from persona import persona_path, set_context, validate as validate_persona
+from config import settings as _settings

 router = APIRouter()

@@ -15,13 +16,33 @@ ALLOWED = {
    "USER.md",
    "PROTOCOLS.md",
    "CONTEXT_TIERS.md",
-    "MEMORY.md",        # legacy — kept for reference
+    "MEMORY.md",          # legacy — kept for reference
    "MEMORY_LONG.md",
    "MEMORY_MID.md",
    "MEMORY_SHORT.md",
+    "MEMORY_LONG.bak1.md",
+    "MEMORY_LONG.bak2.md",
+    "MEMORY_MID.bak1.md",
+    "MEMORY_MID.bak2.md",
+    "MEMORY_SHORT.bak1.md",
+    "MEMORY_SHORT.bak2.md",
    "HELP.md",
+    # Agent private notes — backups only; AGENT_NOTES.md itself is agent-only
+    "AGENT_NOTES.bak1.md",
+    "AGENT_NOTES.bak2.md",
+    "AGENT_NOTES.bak3.md",
 }

+# Files that can be read via the panel but not written by users
+READ_ONLY = {
+    "AGENT_NOTES.bak1.md",
+    "AGENT_NOTES.bak2.md",
+    "AGENT_NOTES.bak3.md",
+}
+
+# Files served from home/{user}/ instead of persona path
+USER_FILES = {"email_allowlist.json", "usage.json"}
+

 def _resolve(user: str, persona: str) -> None:
    """Validate and set context from query params. Raises HTTPException on bad input."""
@@ -32,7 +53,11 @@ def _resolve(user: str, persona: str) -> None:
        raise HTTPException(status_code=404, detail=str(e))


-def _path(filename: str):
+def _path(filename: str, user: str = ""):
+    if filename in USER_FILES:
+        if not user:
+            raise HTTPException(status_code=400, detail="user param required for this file")
+        return _settings.home_root() / user / filename
    if filename not in ALLOWED:
        raise HTTPException(status_code=404, detail=f"File not found: {filename}")
    return persona_path() / filename
@@ -55,6 +80,16 @@ async def list_files(
            "size": st.st_size if st else 0,
            "modified": st.st_mtime if st else None,
        })
+    for name in sorted(USER_FILES):
+        p = _settings.home_root() / user / name
+        st = p.stat() if p.exists() else None
+        files.append({
+            "name": name,
+            "exists": p.exists(),
+            "size": st.st_size if st else 0,
+            "modified": st.st_mtime if st else None,
+            "scope": "user",
+        })
    return {"files": files}


@@ -65,10 +100,14 @@ async def get_file(
    persona: str = Query("inara"),
 ) -> dict:
    _resolve(user, persona)
-    p = _path(filename)
+    p = _path(filename, user=user)
    if not p.exists():
        raise HTTPException(status_code=404, detail=f"{filename} does not exist")
-    return {"name": filename, "content": p.read_text()}
+    return {
+        "name": filename,
+        "content": p.read_text(),
+        "readonly": filename in READ_ONLY,
+    }


 class FileWrite(BaseModel):
@@ -82,8 +121,10 @@ async def save_file(
    user: str = Query("scott"),
    persona: str = Query("inara"),
 ) -> dict:
+    if filename in READ_ONLY:
+        raise HTTPException(status_code=403, detail=f"{filename} is read-only.")
    _resolve(user, persona)
-    p = _path(filename)
+    p = _path(filename, user=user)
    p.write_text(req.content)
    return {"ok": True, "name": filename, "size": len(req.content)}

--- a/cortex/routers/help.py
+++ b/cortex/routers/help.py
@@ -12,7 +12,7 @@ import jwt
 from fastapi import APIRouter, Request
 from fastapi.responses import HTMLResponse, RedirectResponse

-from auth_utils import COOKIE_NAME, decode_token
+from auth_utils import COOKIE_NAME, decode_token, _read_auth
 from persona import list_user_personas

 logger = logging.getLogger(__name__)
@@ -21,6 +21,9 @@ router = APIRouter()
 _STATIC = Path(__file__).parent.parent / "static"


+_LAST_PERSONA_COOKIE = "cx_last_persona"
+
+
 def _get_session_user(request: Request) -> str | None:
    token = request.cookies.get(COOKIE_NAME)
    if not token:
@@ -31,6 +34,16 @@ def _get_session_user(request: Request) -> str | None:
        return None


+def _preferred_persona(request: Request, username: str) -> str:
+    names = list_user_personas(username)
+    if not names:
+        return ""
+    cookie_val = request.cookies.get(_LAST_PERSONA_COOKIE, "")
+    if cookie_val in names:
+        return cookie_val
+    return names[0]
+
+
@router.get("/help", include_in_schema=False)
 async def help_page(request: Request, persona: str = ""):
    username = _get_session_user(request)
@@ -38,11 +51,11 @@ async def help_page(request: Request, persona: str = ""):
        return RedirectResponse("/login", status_code=302)

    personas = list_user_personas(username)
-    # Use persona from query param if valid, else fall back to first
+    # Use persona from query param if valid, else prefer last-visited from cookie
    if persona and persona in personas:
        back_persona = persona
    else:
-        back_persona = personas[0] if personas else ""
+        back_persona = _preferred_persona(request, username)
    back_href = f"/{username}/{back_persona}" if back_persona else "/"

    html = (_STATIC / "help.html").read_text()
@@ -51,4 +64,7 @@ async def help_page(request: Request, persona: str = ""):
        f'{{user: "{username}", persona: "{back_persona}", backHref: "{back_href}"}};</script>'
    )
    html = html.replace("</head>", f"{config_tag}\n</head>", 1)
+    nav = '<a href="/settings/integrations" class="nav-link">Integrations</a>' \
+        if _read_auth(username).get("role", "user") == "admin" else ""
+    html = html.replace("{{ integrations_nav }}", nav)
    return HTMLResponse(html)
--- a/cortex/routers/homeassistant.py
+++ b/cortex/routers/homeassistant.py
@@ -0,0 +1,199 @@
+"""
+Home Assistant webhook router — POST /webhook/ha/{username}/{webhook_id}
+
+Receives event payloads from HA automations and routes them to Inara.
+Auth: the webhook_id in the URL acts as the shared secret (same model HA uses).
+Response is delivered async via notify() — NC Talk, web push, etc.
+
+channels.json schema:
+  "homeassistant": {
+      "webhook_id": "your-secret-id",
+      "persona":    "inara",
+      "tier":       2,
+      "role":       "chat",
+      "tools":      false
+  }
+
+HA automation example (rest_command):
+  rest_command:
+    cortex_notify:
+      url: "https://cortex.dgrzone.com/webhook/ha/scott/your-secret-id"
+      method: POST
+      content_type: "application/json"
+      payload: '{"message": "{{message}}", "entity_id": "{{entity_id}}", "state": "{{state}}"}'
+
+  automation:
+    trigger:
+      - trigger: state
+        entity_id: binary_sensor.front_door
+        to: "on"
+    action:
+      - action: rest_command.cortex_notify
+        data:
+          message: "Front door opened"
+          entity_id: "binary_sensor.front_door"
+          state: "on"
+"""
+
+import json
+import logging
+
+from fastapi import APIRouter, BackgroundTasks, HTTPException, Request, Response
+
+from auth_utils import get_user_channels, get_user_gemini_key, get_user_role, get_tool_policy, get_risk_policy
+from context_loader import load_context
+from llm_client import complete
+from notification import notify
+from persona import set_context
+from session_logger import log_turn
+from session_store import load as load_session, save as save_session
+from config import settings
+import event_bus
+import model_registry
+import orchestrator_engine
+import openai_orchestrator
+
+logger = logging.getLogger(__name__)
+router = APIRouter()
+
+
+def _build_task(body: dict) -> str:
+    """Turn an HA event payload into a natural-language prompt for Inara."""
+    if "message" in body:
+        msg    = str(body["message"])
+        extras = {k: body[k] for k in ("entity_id", "state", "trigger", "event", "area") if k in body}
+        if extras:
+            msg += "\n\nContext: " + json.dumps(extras)
+        return msg
+    return "Home Assistant event:\n" + json.dumps(body, indent=2)
+
+
+async def _process_event(username: str, body: dict, cfg: dict) -> None:
+    persona_name = cfg.get("persona", "inara")
+    tier         = cfg.get("tier") or settings.default_tier
+    role         = cfg.get("role", "chat")
+    use_tools    = cfg.get("tools", False)
+
+    set_context(username, persona_name)
+
+    task         = _build_task(body)
+    session_id   = f"ha_{username}"
+    history      = load_session(session_id)
+    session_msgs = list(history)
+
+    logger.info("HA event for %s: %r", username, task[:80])
+
+    backend = "unknown"
+    try:
+        if use_tools:
+            role_cfg      = model_registry.get_role_config(username, role)
+            system_prompt = load_context(
+                tier,
+                role_append=role_cfg.get("system_append", ""),
+                inject_datetime=role_cfg.get("inject_datetime", True),
+                inject_mode=role_cfg.get("inject_mode", True),
+            )
+            orch_model    = model_registry.get_model_for_role(username, "orchestrator")
+            user_role_val = get_user_role(username)
+            tool_list     = role_cfg.get("tools")
+            policy        = get_tool_policy(username)
+            c_allow       = set(policy.get("allow", []))
+            c_deny        = set(policy.get("deny", []))
+            max_risk, risk_wl, risk_bl = get_risk_policy(username)
+
+            if orch_model and orch_model.get("type") == "local_openai":
+                result = await openai_orchestrator.run(
+                    task=task,
+                    system_prompt=system_prompt,
+                    session_messages=session_msgs or None,
+                    model_cfg=orch_model,
+                    user_role=user_role_val,
+                    tool_list=tool_list,
+                    confirm_allow=c_allow,
+                    confirm_deny=c_deny,
+                    max_risk=max_risk,
+                    risk_whitelist=risk_wl,
+                    risk_blacklist=risk_bl,
+                )
+            else:
+                gemini_key = (
+                    (orch_model.get("api_key") if orch_model else None)
+                    or get_user_gemini_key(username)
+                )
+                result = await orchestrator_engine.run(
+                    task=task,
+                    system_prompt=system_prompt,
+                    session_messages=session_msgs or None,
+                    respond_with_claude=True,
+                    gemini_api_key=gemini_key,
+                    model_name=orch_model.get("model_name") if orch_model else None,
+                    response_role=role,
+                    user_role=user_role_val,
+                    tool_list=tool_list,
+                    confirm_allow=c_allow,
+                    confirm_deny=c_deny,
+                    max_risk=max_risk,
+                    risk_whitelist=risk_wl,
+                    risk_blacklist=risk_bl,
+                )
+            response_text = result.response
+            backend       = result.backend
+
+        else:
+            system_prompt = load_context(tier)
+            msgs          = list(session_msgs) + [{"role": "user", "content": task}]
+            response_text, backend = await complete(system_prompt=system_prompt, messages=msgs)
+
+    except Exception as e:
+        logger.error("HA event error for %s: %s", username, e)
+        return
+
+    logger.info("HA response via %s (%d chars)", backend, len(response_text))
+
+    history.append({"role": "user",      "content": task})
+    history.append({"role": "assistant", "content": response_text})
+    save_session(session_id, history)
+    log_turn(session_id, task, response_text)
+
+    await event_bus.publish({
+        "type":       "ha_event",
+        "session_id": session_id,
+        "response":   response_text,
+        "backend":    backend,
+    })
+
+    await notify(username, response_text)
+
+
+@router.post("/webhook/ha/{username}/{webhook_id}")
+async def ha_webhook(
+    username:   str,
+    webhook_id: str,
+    request:    Request,
+    background_tasks: BackgroundTasks,
+) -> Response:
+    """Receive an event from a Home Assistant automation and route it to Inara."""
+    channels = get_user_channels(username)
+    cfg      = channels.get("homeassistant")
+    if not cfg:
+        raise HTTPException(status_code=404, detail="Channel not configured")
+
+    if webhook_id != cfg.get("webhook_id", ""):
+        logger.warning("HA webhook: bad webhook_id for user %r", username)
+        raise HTTPException(status_code=401, detail="Invalid webhook ID")
+
+    content_type = request.headers.get("content-type", "")
+    if "application/json" in content_type:
+        try:
+            body = await request.json()
+        except Exception:
+            raise HTTPException(status_code=400, detail="Invalid JSON")
+    else:
+        form = await request.form()
+        body = dict(form)
+
+    if not body:
+        return Response(status_code=200)
+
+    background_tasks.add_task(_process_event, username, body, cfg)
+    return Response(status_code=200)
--- a/cortex/routers/local_llm.py
+++ b/cortex/routers/local_llm.py
@@ -1,15 +1,24 @@
 """
-Model Registry settings — hosts, models, and role assignments.
+Model Registry settings — providers, hosts, models, and role assignments.

 Routes:
-  GET  /settings/local                        → settings page
-  POST /settings/local/host                   → save/create a host
-  POST /settings/local/host/{id}/remove       → remove a host (and its models)
-  POST /settings/local/models/add             → add a model entry
-  POST /settings/local/models/{id}/remove     → remove a model
-  POST /api/models/role                       → AJAX: set a role assignment
-  GET  /api/local-llm/fetch-models            → proxy to host /api/models (JSON)
+  GET  /settings/models                           → settings page (canonical)
+  GET  /settings/local                            → redirect to /settings/models
+  POST /settings/local/host                       → save/create a local host
+  POST /settings/local/host/{id}/remove           → remove a host (and its models)
+  POST /settings/local/google-account             → save/create a Google account
+  POST /settings/local/google-account/{id}/remove → remove a Google account
+  POST /settings/local/anthropic-key              → save/create an Anthropic API key
+  POST /settings/local/anthropic-key/{id}/remove  → remove an Anthropic API key
+  POST /settings/local/models/add                 → add a model (any provider)
+  POST /settings/local/models/{id}/edit           → edit an existing model entry
+  POST /settings/local/models/{id}/remove         → remove a model
+  POST /settings/local/roles/add                  → add a custom role (redirects to #roles)
+  POST /settings/local/roles/remove               → remove a custom role (redirects to #roles)
+  POST /api/models/role                           → AJAX: set a role assignment
+  GET  /api/local-llm/fetch-models                → proxy to host /api/models (JSON)
 """
+import json as _json
 import logging
 from pathlib import Path

@@ -18,9 +27,30 @@ import jwt
 from fastapi import APIRouter, Form, Request
 from fastapi.responses import HTMLResponse, JSONResponse, RedirectResponse

-from auth_utils import COOKIE_NAME, decode_token
+from auth_utils import COOKIE_NAME, decode_token, _read_auth
 from config import settings as app_settings
+from persona import list_user_personas
 import model_registry as reg
+from tools import TOOL_CATEGORIES
+
+_LAST_PERSONA_COOKIE = "cx_last_persona"
+
+
+def _preferred_persona(request: Request, username: str) -> str:
+    names = list_user_personas(username)
+    if not names:
+        return ""
+    cookie_val = request.cookies.get(_LAST_PERSONA_COOKIE, "")
+    if cookie_val in names:
+        return cookie_val
+    return names[0]
+
+
+def _integrations_nav(username: str) -> str:
+    role = _read_auth(username).get("role", "user")
+    if role == "admin":
+        return '<a href="/settings/integrations" class="nav-link">Integrations</a>'
+    return ""

 logger = logging.getLogger(__name__)
 router = APIRouter()
@@ -28,6 +58,70 @@ router = APIRouter()
 _STATIC = Path(__file__).parent.parent / "static"


+def _host_row_html(h: dict) -> str:
+    """Return the HTML for one host config row (edit form + remove link)."""
+    api_key  = h.get("api_key", "")
+    key_hint = f"…{api_key[-4:]}" if api_key else "not set"
+    ht   = h.get("host_type", "openwebui")
+    ow   = ' selected' if ht == "openwebui" else ''
+    ai   = ' selected' if ht == "openai"    else ''
+    hid  = h["id"]
+    hlbl = h.get("label", "")
+    hurl = h.get("api_url", "")
+    maxc = h.get("max_concurrent", 3)
+    return f'''
+        <div class="host-row">
+          <form method="POST" action="/settings/local/host" class="host-form">
+            <input type="hidden" name="host_id" value="{hid}">
+            <div class="field-row">
+              <div class="field">
+                <label>Label</label>
+                <input type="text" name="label" value="{hlbl}"
+                       placeholder="Gaming Laptop" autocomplete="off" data-form-type="other">
+              </div>
+              <div class="field" style="flex:2">
+                <label>API URL</label>
+                <input type="text" name="api_url" value="{hurl}"
+                       placeholder="http://192.168.x.x:3000"
+                       autocomplete="off" spellcheck="false" data-form-type="other">
+              </div>
+            </div>
+            <div class="field-row">
+              <div class="field">
+                <label>API Key</label>
+                <input type="password" name="api_key" placeholder="Leave blank to keep existing"
+                       autocomplete="new-password" data-1p-ignore data-lpignore="true"
+                       data-form-type="other">
+                <p class="key-status">Current: {key_hint}</p>
+              </div>
+              <div class="field" style="flex:0 0 auto">
+                <label>Type</label>
+                <select name="host_type">
+                  <option value="openwebui"{ow}>Open WebUI / Ollama</option>
+                  <option value="openai"{ai}>OpenAI-compatible API</option>
+                </select>
+              </div>
+              <div class="field" style="flex:0 0 auto; width:6rem">
+                <label>Max parallel</label>
+                <input type="number" name="max_concurrent" min="1" max="20"
+                       value="{maxc}" style="width:100%">
+              </div>
+            </div>
+            <div class="btn-row">
+              <button type="submit" class="btn btn-secondary btn-sm">Save</button>
+              <button type="button" class="btn btn-secondary btn-sm fetch-btn"
+                      data-host-id="{hid}">Fetch models</button>
+              <span class="fetch-status" id="fetch-{hid}"></span>
+            </div>
+          </form>
+          <form method="POST" action="/settings/local/host/{hid}/remove"
+                onsubmit="return confirm('Remove host and all its models?')"
+                style="margin-top:0.5rem">
+            <button type="submit" class="btn-link danger">Remove host</button>
+          </form>
+        </div>'''
+
+
 # ── Auth helper ───────────────────────────────────────────────────────────────

 def _get_user(request: Request) -> str | None:
@@ -42,154 +136,368 @@ def _get_user(request: Request) -> str | None:

 # ── Page renderer ─────────────────────────────────────────────────────────────

-def _render(username: str, success: str = "", error: str = "") -> str:
-    registry = reg.get_registry(username)
-    hosts    = registry.get("hosts", [])
-    models   = registry.get("models", [])
-    roles    = registry.get("roles", {})
-    builtins = reg._builtins()
+def _render(username: str, request: Request | None = None, success: str = "", error: str = "") -> str:
+    registry    = reg.get_registry(username)
+    hosts       = registry.get("hosts", [])
+    models      = registry.get("models", [])
+    roles       = registry.get("roles", {})
+    builtins    = reg._builtins()
+    host_by_id  = {h["id"]: h for h in hosts}
+    goog_accts  = registry.get("providers", {}).get("google", {}).get("accounts", [])

-    host_by_id = {h["id"]: h for h in hosts}
-
-    # ── Host rows ─────────────────────────────────────────────────────────────
-    host_rows = ""
-    for h in hosts:
-        key_hint  = f"…{h['api_key'][-4:]}" if h.get("api_key") else "not set"
-        ht        = h.get("host_type", "openwebui")
-        ow_sel    = ' selected' if ht == "openwebui" else ''
-        ai_sel    = ' selected' if ht == "openai"    else ''
-        host_rows += f'''
-        <div class="host-row">
-          <form method="POST" action="/settings/local/host" class="host-form">
-            <input type="hidden" name="host_id" value="{h["id"]}">
-            <div class="field-row">
-              <div class="field">
-                <label>Label</label>
-                <input type="text" name="label" value="{h.get("label","")}"
-                       placeholder="Home ML Laptop" autocomplete="off" data-form-type="other">
-              </div>
-              <div class="field" style="flex:2">
-                <label>API URL</label>
-                <input type="text" name="api_url" value="{h.get("api_url","")}"
-                       placeholder="http://192.168.x.x:3000"
-                       autocomplete="off" spellcheck="false" data-form-type="other">
-              </div>
-            </div>
-            <div class="field-row">
-              <div class="field">
-                <label>API Key</label>
-                <input type="password" name="api_key" placeholder="Leave blank to keep existing"
-                       autocomplete="new-password" data-1p-ignore data-lpignore="true" data-form-type="other">
-                <p class="key-status">Current: {key_hint}</p>
-              </div>
-              <div class="field" style="flex:0 0 auto">
-                <label>Type</label>
-                <select name="host_type">
-                  <option value="openwebui"{ow_sel}>Open WebUI / Ollama</option>
-                  <option value="openai"{ai_sel}>OpenAI-compatible (OpenRouter, etc.)</option>
-                </select>
-              </div>
-            </div>
-            <div class="btn-row">
-              <button type="submit" class="btn btn-secondary btn-sm">Save host</button>
-              <button type="button" class="btn btn-secondary btn-sm fetch-btn"
-                      data-host-id="{h["id"]}">Fetch models</button>
-              <span class="fetch-status" id="fetch-{h["id"]}"></span>
-            </div>
-          </form>
-          <form method="POST" action="/settings/local/host/{h["id"]}/remove"
-                onsubmit="return confirm('Remove host and all its models?')" style="margin-top:0.5rem">
-            <button type="submit" class="btn-link danger">Remove host</button>
+    # ── Google account rows ───────────────────────────────────────────────────
+    google_account_rows = ""
+    for a in goog_accts:
+        hint = (a.get("api_key") or "")[:10] + "…" if a.get("api_key") else "no key"
+        google_account_rows += f'''
+        <div class="account-row">
+          <div>
+            <span class="account-label">{a.get("label") or "Unnamed"}</span>
+            <span class="account-hint">{hint}</span>
+          </div>
+          <form method="POST" action="/settings/local/google-account/{a["id"]}/remove"
+                onsubmit="return confirm('Remove this Google account?')">
+            <button type="submit" class="btn-link danger">Remove</button>
          </form>
        </div>'''
+    if not google_account_rows:
+        google_account_rows = '<p class="empty-note">No accounts configured yet.</p>'

-    if not host_rows:
-        host_rows = '<p class="empty-note">No hosts configured yet. Add one below.</p>'
+    # ── Host rows — split cloud (openai) vs local (openwebui) ─────────────────
+    cloud_hosts = [h for h in hosts if h.get("host_type") == "openai"]
+    local_hosts  = [h for h in hosts if h.get("host_type", "openwebui") != "openai"]
+
+    cloud_host_rows = "".join(_host_row_html(h) for h in cloud_hosts)
+    local_host_rows  = "".join(_host_row_html(h) for h in local_hosts)
+    if not cloud_host_rows:
+        cloud_host_rows = '<p class="empty-note">No cloud API services configured yet. Add one below.</p>'
+    if not local_host_rows:
+        local_host_rows = '<p class="empty-note">No local hosts configured yet. Add one below.</p>'

-    # ── Host options for add-model form ───────────────────────────────────────
    host_options = "".join(
        f'<option value="{h["id"]}">{h.get("label") or h["api_url"]}</option>'
        for h in hosts
    )
-    add_model_hidden = "" if hosts else ' style="display:none"'

-    # ── Model rows ────────────────────────────────────────────────────────────
+    # ── Anthropic API key rows ────────────────────────────────────────────────
+    anthropic_api_keys = reg.get_anthropic_api_keys(username)
+    anthropic_keys_js  = _json.dumps(anthropic_api_keys)
+    anthropic_key_rows = ""
+    for c in anthropic_api_keys:
+        hint = c.get("hint", "no key")
+        anthropic_key_rows += f'''
+        <div class="account-row">
+          <div>
+            <span class="account-label">{c.get("label") or "API Key"}</span>
+            <span class="account-hint">{hint}</span>
+          </div>
+          <form method="POST" action="/settings/local/anthropic-key/{c["id"]}/remove"
+                onsubmit="return confirm('Remove this Anthropic API key?')">
+            <button type="submit" class="btn-link danger">Remove</button>
+          </form>
+        </div>'''
+    if not anthropic_key_rows:
+        anthropic_key_rows = '<p class="empty-note">No API keys configured. Add one below or use Claude CLI (OAuth).</p>'
+
+    # ── Model rows (all providers) ────────────────────────────────────────────
+    _PROVIDER_BADGE = {
+        "claude_cli":    ('<span class="pbadge pb-anthropic">Anthropic</span>', "Claude CLI"),
+        "anthropic_api": ('<span class="pbadge pb-anthropic">Anthropic</span>', "API Key"),
+        "gemini_api":    ('<span class="pbadge pb-google">Google</span>', ""),
+        "local_openai":  ('<span class="pbadge pb-local">Local</span>', ""),
+    }
    model_rows = ""
    for m in models:
        resolved = reg._resolve_model(registry, m["id"])
        if not resolved:
            continue
-        host_name = ""
-        if m.get("type") == "local_openai" and m.get("host_id"):
-            h = host_by_id.get(m["host_id"], {})
-            host_name = h.get("label") or h.get("api_url", "")
+        mtype = m.get("type", "local_openai")
+        badge, default_secondary = _PROVIDER_BADGE.get(mtype, ("", ""))

-        ctx_badge = f'<span class="ctx-badge">{m.get("context_k",0)}k ctx</span>' if m.get("context_k") else ""
-        tags_html = " ".join(
-            f'<span class="tag">{t}</span>' for t in (m.get("tags") or [])
+        if mtype == "local_openai":
+            h = host_by_id.get(m.get("host_id", ""), {})
+            secondary = h.get("label") or h.get("api_url", "")
+        elif mtype == "gemini_api":
+            acct = next((a for a in goog_accts if a["id"] == m.get("account_id")), None)
+            secondary = acct["label"] if acct else ""
+        else:
+            secondary = default_secondary
+
+        ctx       = f'<span class="ctx-badge">{m.get("context_k",0)}k</span>' if m.get("context_k") else ""
+        no_tools  = '' if m.get("tools", True) else '<span class="pbadge pb-notools">no tools</span>'
+        tags_html = " ".join(f'<span class="tag">{t}</span>' for t in (m.get("tags") or []))
+        sec      = f'<span class="model-host">{secondary}</span>' if secondary else ""
+
+        # ── Inline edit form fields (type-specific) ───────────────────────────
+        if mtype == "local_openai":
+            host_opts = "".join(
+                f'<option value="{h["id"]}"'
+                f'{" selected" if h["id"] == m.get("host_id") else ""}>'
+                f'{h.get("label") or h.get("api_url","")}</option>'
+                for h in hosts
+            )
+            mid = m["id"]
+            extra_fields = (
+                f'<div class="field"><label>Host</label>'
+                f'<select name="host_id" id="edit-host-{mid}">{host_opts}</select></div>'
+                f'<div class="btn-row" style="margin-bottom:0.75rem">'
+                f'<button type="button" class="btn btn-secondary btn-sm edit-fetch-btn" data-id="{mid}">Fetch models</button>'
+                f'<span class="fetch-status" id="edit-fetch-status-{mid}"></span>'
+                f'</div>'
+                f'<div id="edit-model-select-wrap-{mid}" style="display:none; margin-bottom:0.75rem">'
+                f'<label>Pick from host</label>'
+                f'<select id="edit-model-picker-{mid}"><option value="">— fetch first —</option></select>'
+                f'</div>'
+            )
+        elif mtype == "gemini_api":
+            acct_opts = "".join(
+                f'<option value="{a["id"]}"'
+                f'{" selected" if a["id"] == m.get("account_id") else ""}>'
+                f'{a.get("label","Unnamed")}</option>'
+                for a in goog_accts
+            )
+            extra_fields = (
+                f'<div class="field"><label>Google Account</label>'
+                f'<select name="account_id">{acct_opts}</select></div>'
+            )
+        elif mtype == "anthropic_api":
+            key_opts = "".join(
+                f'<option value="{c["id"]}"'
+                f'{" selected" if c["id"] == m.get("credential_id") else ""}>'
+                f'{c.get("label","API Key")} ({c.get("hint","")})</option>'
+                for c in anthropic_api_keys
+            )
+            extra_fields = (
+                f'<div class="field"><label>API Key</label>'
+                f'<select name="credential_id">{key_opts or "<option value=\"\">No API keys configured</option>"}</select></div>'
+            )
+        else:
+            extra_fields = '<input type="hidden" name="credential_id" value="cli">'
+
+        cur_label           = m.get("label", "")
+        cur_model_name      = m.get("model_name", "")
+        cur_ctx             = m.get("context_k", 0) or 0
+        cur_max_rounds      = m.get("max_rounds") or 0
+        cur_tools           = m.get("tools", True)
+        cur_tags            = ", ".join(m.get("tags") or [])
+        cur_reasoning_budget = m.get("reasoning_budget_tokens") or 0
+        _rb_levels = [(0, "Off — Non-think"), (1024, "Light"), (4096, "Moderate"), (8192, "High"), (32768, "Max")]
+        reasoning_opts = "".join(
+            f'<option value="{v}" {"selected" if cur_reasoning_budget == v else ""}>{lbl}</option>'
+            for v, lbl in _rb_levels
        )
-        host_html = f'<span class="model-host">{host_name}</span>' if host_name else ""

        model_rows += f'''
        <div class="model-row" id="model-{m["id"]}">
-          <div class="model-info">
-            <span class="model-label">{m.get("label") or m.get("model_name","")}</span>
-            <span class="model-name">{m.get("model_name","")}</span>
-            {host_html}{ctx_badge}
-            <div class="tag-row">{tags_html}</div>
-          </div>
-          <div class="model-actions">
-            <form method="POST" action="/settings/local/models/{m["id"]}/remove"
-                  onsubmit="return confirm('Remove this model?')" style="display:inline">
-              <button type="submit" class="row-btn danger">Remove</button>
-            </form>
+          <div class="model-row-header">
+            <div class="model-info">
+              <div>{badge}<span class="model-label">{m.get("label") or m.get("model_name","")}</span>{ctx}{no_tools}</div>
+              <span class="model-name">{m.get("model_name","")}</span>
+              {sec}
+              <div class="tag-row">{tags_html}</div>
+            </div>
+            <div class="model-btns">
+              <button type="button" class="row-btn model-edit-btn" data-id="{m["id"]}">Edit</button>
+              <form method="POST" action="/settings/local/models/{m["id"]}/remove"
+                    onsubmit="return confirm('Remove this model?')" style="margin:0">
+                <button type="submit" class="row-btn danger">Remove</button>
+              </form>
+            </div>
          </div>
+          <form class="model-edit-form" id="edit-form-{m["id"]}" style="display:none"
+                method="POST" action="/settings/local/models/{m["id"]}/edit">
+            <input type="hidden" name="mtype" value="{mtype}">
+            <div class="field-row">
+              <div class="field">
+                <label>Display label</label>
+                <input type="text" name="label" value="{cur_label}"
+                       placeholder="My Model" autocomplete="off" data-form-type="other">
+              </div>
+              <div class="field">
+                <label>Model name / ID</label>
+                <input type="text" name="model_name" value="{cur_model_name}"
+                       placeholder="provider/model-name" autocomplete="off"
+                       spellcheck="false" data-form-type="other" required>
+              </div>
+            </div>
+            {extra_fields}
+            <div class="field-row">
+              <div class="field" style="flex:0 0 auto">
+                <label title="Context window size in thousands of tokens. 0 = assume 32k.">Context (k)</label>
+                <input type="number" name="context_k" value="{cur_ctx}" min="0"
+                       title="Context window size in thousands of tokens. 0 = assume 32k (compaction budget ~24k tokens).">
+              </div>
+              <div class="field" style="flex:0 0 auto">
+                <label title="Per-model tool loop cap. 0 = use the global default (orchestrator_max_rounds).">Max rounds</label>
+                <input type="number" name="max_rounds" value="{cur_max_rounds}" min="0"
+                       title="Per-model tool loop cap. 0 = use the global default (orchestrator_max_rounds).">
+              </div>
+              <div class="field" style="flex:0 0 auto">
+                <label title="Reasoning depth via OpenRouter's reasoning.budget_tokens. Off = Non-think. Light ~1k, Moderate ~4k, High ~8k, Max ~32k tokens.">Reasoning</label>
+                <select name="reasoning_budget_tokens"
+                        title="Reasoning depth via OpenRouter's reasoning.budget_tokens. Off = Non-think. Light ~1k, Moderate ~4k, High ~8k, Max ~32k tokens.">
+                  {reasoning_opts}
+                </select>
+              </div>
+              <div class="field" style="flex:0 0 auto">
+                <label title="Whether this model supports tool calling. If not supported, requests skip the tool loop entirely.">Tool calling</label>
+                <select name="tools"
+                        title="Whether this model supports tool calling. If not supported, requests skip the tool loop entirely.">
+                  <option value="1" {'selected' if cur_tools else ''}>Supported</option>
+                  <option value="0" {'' if cur_tools else 'selected'}>Not supported</option>
+                </select>
+              </div>
+              <div class="field">
+                <label>Tags</label>
+                <input type="text" name="tags" value="{cur_tags}"
+                       placeholder="fast, code, vision" autocomplete="off" data-form-type="other">
+              </div>
+            </div>
+            <div class="btn-row" style="margin-top:0.5rem">
+              <button type="submit" class="btn btn-primary btn-sm">Save</button>
+              <button type="button" class="model-edit-cancel btn btn-secondary btn-sm"
+                      data-id="{m["id"]}">Cancel</button>
+            </div>
+          </form>
        </div>'''
-
    if not model_rows:
        model_rows = '<p class="empty-note">No models added yet.</p>'

    # ── Role assignment rows ──────────────────────────────────────────────────
-    # Build option list: (none) + built-ins + user models
    model_opts = '<option value="">— .env default —</option>\n'
    model_opts += '<optgroup label="Built-in">\n'
    for bid, bm in builtins.items():
        model_opts += f'  <option value="{bid}">{bm["label"]}</option>\n'
    model_opts += '</optgroup>\n'
    if models:
-        model_opts += '<optgroup label="Local models">\n'
+        model_opts += '<optgroup label="Configured models">\n'
        for m in models:
            lbl = m.get("label") or m.get("model_name", m["id"])
            model_opts += f'  <option value="{m["id"]}">{lbl}</option>\n'
        model_opts += '</optgroup>\n'

-    role_rows = ""
-    for role in app_settings.get_defined_roles():
-        role_cfg = roles.get(role, {})
-        role_rows += f'<div class="role-row" data-role="{role}"><span class="role-name">{role.title()}</span><div class="role-slots">'
-        for slot in reg.PRIORITY_KEYS[:3]:  # primary + backup_1 + backup_2
-            current = role_cfg.get(slot) or ""
-            slot_label = slot.replace("_", " ").title()
-            sel_html = f'<select class="role-select" data-role="{role}" data-slot="{slot}" title="{slot_label}">\n{model_opts}\n</select>'
-            # Pre-select current value via JS (simpler than string-building selected attrs)
-            role_rows += f'<div class="role-slot"><span class="slot-label">{slot_label}</span>{sel_html}</div>'
-        role_rows += '</div></div>'
+    all_roles = reg.get_all_roles(username)
+
+    role_rows = ""
+    for role in all_roles:
+        is_required = role in reg.REQUIRED_ROLES
+        role_cfg = roles.get(role, {})
+        role_title = role.replace("_", " ").title()
+        required_badge = (
+            '<span class="required-badge">required</span>'
+            if is_required else ''
+        )
+        rcp_danger = (
+            '' if is_required else
+            f'<div class="rcp-danger">'
+            f'<form method="POST" action="/settings/local/roles/remove" class="remove-role-form">'
+            f'<input type="hidden" name="role_name" value="{role}">'
+            f'<button type="submit" class="btn-link danger" data-role="{role}">Remove this role…</button>'
+            f'</form>'
+            f'</div>'
+        )
+        role_rows += (
+            f'<div class="role-row" data-role="{role}">'
+            f'<div class="role-name-col">'
+            f'<span class="role-name">{role_title}</span>'
+            f'{required_badge}'
+            f'</div>'
+            f'<div class="role-slots">'
+        )
+        for slot in reg.PRIORITY_KEYS[:2]:
+            slot_label = slot.replace("_", " ").title()
+            sel = (
+                f'<select class="role-select" data-role="{role}" '
+                f'data-slot="{slot}" title="{slot_label}">\n{model_opts}\n</select>'
+            )
+            role_rows += f'<div class="role-slot"><span class="slot-label">{slot_label}</span>{sel}</div>'
+        role_rows += (
+            f'</div>'
+            f'<button class="role-cfg-btn" data-role="{role}" title="Configure">⚙</button>'
+            f'</div>'
+            f'<div class="role-config-panel" id="rcp-{role}">'
+            f'<div class="rcp-field">'
+            f'<label class="rcp-label">System prompt addition</label>'
+            f'<textarea class="rcp-textarea" data-role="{role}" rows="3" '
+            f'placeholder="Extra instructions injected into the system prompt when this role is active…"></textarea>'
+            f'</div>'
+            f'<div class="rcp-field">'
+            f'<div style="display:flex;flex-direction:column;gap:0.3rem">'
+            f'<label class="rcp-check">'
+            f'<input type="checkbox" class="rcp-datetime-cb" data-role="{role}" checked>'
+            f'<span>Inject current date &amp; time into system prompt</span>'
+            f'</label>'
+            f'<label class="rcp-check">'
+            f'<input type="checkbox" class="rcp-mode-cb" data-role="{role}" checked>'
+            f'<span>Inject session mode (Chat / Off The Record) into system prompt</span>'
+            f'</label>'
+            f'</div>'
+            f'<p class="rcp-hint" style="margin-top:0.4rem">Disable both for pure processing roles (summarizer, classifier, translator).</p>'
+            f'</div>'
+            f'<div class="rcp-field">'
+            f'<label class="rcp-label">Tool allow-list '
+            f'<span class="rcp-hint">— all checked means no restriction (use all accessible tools)</span></label>'
+            f'<div class="rcp-tools" id="rcp-tools-{role}"></div>'
+            f'</div>'
+            f'<div class="rcp-actions">'
+            f'<button class="btn btn-primary btn-sm rcp-save" data-role="{role}">Save</button>'
+            f'<button class="btn btn-secondary btn-sm rcp-cancel" data-role="{role}">Cancel</button>'
+            f'</div>'
+            f'{rcp_danger}'
+            f'</div>'
+        )

-    # JS data for pre-selecting current role values
-    import json as _json
    role_data_js = _json.dumps({
-        role: {slot: (roles.get(role, {}).get(slot) or "") for slot in reg.PRIORITY_KEYS[:3]}
-        for role in app_settings.get_defined_roles()
+        role: {slot: (roles.get(role, {}).get(slot) or "") for slot in reg.PRIORITY_KEYS[:2]}
+        for role in all_roles
    })

+    role_config_data_js = _json.dumps({
+        role: {
+            "system_append":  roles.get(role, {}).get("system_append", ""),
+            "tools":          roles.get(role, {}).get("tools") or None,
+            "inject_datetime": roles.get(role, {}).get("inject_datetime", True),
+            "inject_mode":    roles.get(role, {}).get("inject_mode", True),
+        }
+        for role in all_roles
+    })
+    tool_categories_js = _json.dumps(TOOL_CATEGORIES)
+
+    # ── Catalog data + Google accounts for JS ─────────────────────────────────
+    google_accounts_js   = _json.dumps(reg.get_google_accounts(username))
+    google_catalog_js    = _json.dumps(reg.get_catalog("google"))
+    anthropic_catalog_js = _json.dumps(reg.get_catalog("anthropic"))
+    cloud_catalog_js     = _json.dumps(reg.get_catalog("cloud"))
+    has_hosts = "true" if hosts else "false"
+
    html = (_STATIC / "local_llm.html").read_text()
-    html = html.replace("{{ username }}",         username)
-    html = html.replace("{{ host_rows }}",         host_rows)
-    html = html.replace("{{ model_rows }}",        model_rows)
-    html = html.replace("{{ host_options }}",      host_options)
-    html = html.replace("{{ add_model_hidden }}",  add_model_hidden)
-    html = html.replace("{{ role_rows }}",         role_rows)
-    html = html.replace("{{ role_data_js }}",      role_data_js)
+    replacements = {
+        "{{ username }}":             username,
+        "{{ google_account_rows }}":  google_account_rows,
+        "{{ anthropic_key_rows }}":   anthropic_key_rows,
+        "{{ cloud_host_rows }}":      cloud_host_rows,
+        "{{ local_host_rows }}":      local_host_rows,
+        "{{ model_rows }}":           model_rows,
+        "{{ host_options }}":         host_options,
+        "{{ role_rows }}":            role_rows,
+        "{{ role_data_js }}":         role_data_js,
+        "{{ role_config_data_js }}":  role_config_data_js,
+        "{{ tool_categories_js }}":   tool_categories_js,
+        "{{ google_accounts_js }}":   google_accounts_js,
+        "{{ anthropic_keys_js }}":    anthropic_keys_js,
+        "{{ google_catalog_js }}":    google_catalog_js,
+        "{{ anthropic_catalog_js }}": anthropic_catalog_js,
+        "{{ cloud_catalog_js }}":     cloud_catalog_js,
+        "{{ has_hosts }}":            has_hosts,
+    }
+    for key, val in replacements.items():
+        html = html.replace(key, val)
+
+    back_persona = _preferred_persona(request, username) if request else ""
+    html = html.replace("{{ back_href }}", f"/{username}/{back_persona}" if back_persona else "/")
+    html = html.replace("{{ help_href }}", f"/help?persona={back_persona}" if back_persona else "/help")
+    html = html.replace("{{ integrations_nav }}", _integrations_nav(username))
+
    if success:
        html = html.replace("<!-- SUCCESS -->", f'<p class="msg success">{success}</p>')
    if error:
@@ -199,31 +507,86 @@ def _render(username: str, success: str = "", error: str = "") -> str:

 # ── Routes ────────────────────────────────────────────────────────────────────

-@router.get("/settings/local", include_in_schema=False)
-async def models_page(request: Request):
+@router.get("/settings/models", include_in_schema=False)
+async def models_page_canonical(request: Request):
    username = _get_user(request)
    if not username:
        return RedirectResponse("/login", status_code=302)
-    return HTMLResponse(_render(username))
+    return HTMLResponse(_render(username, request))
+
+
+@router.get("/settings/local", include_in_schema=False)
+async def models_page_legacy(request: Request):
+    return RedirectResponse("/settings/models", status_code=301)
+
+
+@router.post("/settings/local/google-account", include_in_schema=False)
+async def save_google_account(
+    request:    Request,
+    account_id: str = Form(""),
+    label:      str = Form(""),
+    api_key:    str = Form(""),
+):
+    username = _get_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    if not api_key.strip() and not account_id.strip():
+        return HTMLResponse(_render(username, request, error="API key is required."))
+    reg.save_google_account(username, account_id or None, label, api_key)
+    return HTMLResponse(_render(username, request, success="Google account saved."))
+
+
+@router.post("/settings/local/google-account/{account_id}/remove", include_in_schema=False)
+async def remove_google_account(request: Request, account_id: str):
+    username = _get_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    reg.remove_google_account(username, account_id)
+    return HTMLResponse(_render(username, request, success="Google account removed."))
+
+
+@router.post("/settings/local/anthropic-key", include_in_schema=False)
+async def save_anthropic_api_key(
+    request: Request,
+    key_id:  str = Form(""),
+    label:   str = Form(""),
+    api_key: str = Form(""),
+):
+    username = _get_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    if not api_key.strip() and not key_id.strip():
+        return HTMLResponse(_render(username, request, error="API key is required."))
+    reg.save_anthropic_api_key(username, key_id or None, label, api_key)
+    return HTMLResponse(_render(username, request, success="Anthropic API key saved."))
+
+
+@router.post("/settings/local/anthropic-key/{key_id}/remove", include_in_schema=False)
+async def remove_anthropic_api_key(request: Request, key_id: str):
+    username = _get_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    reg.remove_anthropic_api_key(username, key_id)
+    return HTMLResponse(_render(username, request, success="Anthropic API key removed."))


@router.post("/settings/local/host", include_in_schema=False)
 async def save_host(
-    request:   Request,
-    host_id:   str = Form(""),
-    label:     str = Form(""),
-    api_url:   str = Form(""),
-    api_key:   str = Form(""),
-    host_type: str = Form("openwebui"),
+    request:        Request,
+    host_id:        str = Form(""),
+    label:          str = Form(""),
+    api_url:        str = Form(""),
+    api_key:        str = Form(""),
+    host_type:      str = Form("openwebui"),
+    max_concurrent: int = Form(3),
 ):
    username = _get_user(request)
    if not username:
        return RedirectResponse("/login", status_code=302)
    if not api_url.strip():
-        return HTMLResponse(_render(username, error="API URL is required."))
-    reg.save_host(username, host_id or None, label, api_url, api_key, host_type)
-    logger.info("model registry host saved: %s (%s)", username, host_type)
-    return HTMLResponse(_render(username, success="Host saved."))
+        return HTMLResponse(_render(username, request, error="API URL is required."))
+    reg.save_host(username, host_id or None, label, api_url, api_key, host_type, max_concurrent)
+    return HTMLResponse(_render(username, request, success="Host saved."))


@router.post("/settings/local/host/{host_id}/remove", include_in_schema=False)
@@ -232,27 +595,110 @@ async def remove_host(request: Request, host_id: str):
    if not username:
        return RedirectResponse("/login", status_code=302)
    reg.remove_host(username, host_id)
-    return HTMLResponse(_render(username, success="Host removed."))
+    return HTMLResponse(_render(username, request, success="Host removed."))


@router.post("/settings/local/models/add", include_in_schema=False)
 async def add_model(
-    request:    Request,
-    host_id:    str = Form(...),
-    label:      str = Form(""),
-    model_name: str = Form(...),
-    context_k:  int = Form(0),
-    tags:       str = Form(""),
+    request:                  Request,
+    provider:                 str = Form("local"),
+    label:                    str = Form(""),
+    context_k:                int = Form(0),
+    max_rounds:               int = Form(0),
+    tools:                    int = Form(1),
+    tags:                     str = Form(""),
+    reasoning_budget_tokens:  int = Form(0),
+    # local-only fields
+    host_id:                  str = Form(""),
+    model_name:               str = Form(""),
+    # cloud-only fields
+    cloud_model_name:         str = Form(""),
+    account_id:               str = Form(""),
+    credential_id:            str = Form("cli"),
+):
+    username = _get_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+
+    tag_list   = [t.strip() for t in tags.split(",") if t.strip()]
+    max_rounds_ = max_rounds or None
+    tools_bool  = tools != 0
+    reasoning_budget_ = reasoning_budget_tokens or None
+
+    if provider == "local":
+        if not model_name.strip():
+            return HTMLResponse(_render(username, request, error="Model name is required."))
+        if not host_id.strip():
+            return HTMLResponse(_render(username, request, error="Select a host."))
+        reg.save_model(username, None, host_id, label, model_name, context_k, tag_list,
+                       max_rounds=max_rounds_, tools=tools_bool,
+                       reasoning_budget_tokens=reasoning_budget_)
+        display = label or model_name
+
+    elif provider in ("google", "anthropic"):
+        if not cloud_model_name.strip():
+            return HTMLResponse(_render(username, request, error="Select a model from the catalog."))
+        if provider == "google" and not account_id.strip():
+            return HTMLResponse(_render(username, request, error="Select a Google account."))
+        reg.save_cloud_model(
+            username, None, provider, cloud_model_name, label,
+            account_id=account_id or None,
+            credential_id=credential_id or None,
+            context_k=context_k, tags=tag_list,
+            max_rounds=max_rounds_, tools=tools_bool,
+        )
+        display = label or cloud_model_name
+    else:
+        return HTMLResponse(_render(username, request, error=f"Unknown provider: {provider}"))
+
+    logger.info("model added: %s / %s (%s)", username, display, provider)
+    return HTMLResponse(_render(username, request, success=f'Model "{display}" added.'))
+
+
+@router.post("/settings/local/models/{model_id}/edit", include_in_schema=False)
+async def edit_model(
+    request:                 Request,
+    model_id:                str,
+    mtype:                   str = Form(""),
+    label:                   str = Form(""),
+    model_name:              str = Form(""),
+    context_k:               int = Form(0),
+    max_rounds:              int = Form(0),
+    tools:                   int = Form(1),
+    tags:                    str = Form(""),
+    reasoning_budget_tokens: int = Form(0),
+    host_id:                 str = Form(""),
+    account_id:              str = Form(""),
+    credential_id:           str = Form("cli"),
 ):
    username = _get_user(request)
    if not username:
        return RedirectResponse("/login", status_code=302)
    if not model_name.strip():
-        return HTMLResponse(_render(username, error="Model name is required."))
-    tag_list = [t.strip() for t in tags.split(",") if t.strip()]
-    reg.save_model(username, None, host_id, label, model_name, context_k, tag_list)
-    logger.info("model added to registry: %s / %s", username, model_name)
-    return HTMLResponse(_render(username, success=f'Model "{label or model_name}" added.'))
+        return HTMLResponse(_render(username, request, error="Model name is required."))
+    tag_list          = [t.strip() for t in tags.split(",") if t.strip()]
+    max_rounds_       = max_rounds or None
+    tools_bool        = tools != 0
+    reasoning_budget_ = reasoning_budget_tokens or None
+    if mtype == "local_openai":
+        if not host_id.strip():
+            return HTMLResponse(_render(username, request, error="Select a host for this model."))
+        reg.save_model(username, model_id, host_id, label, model_name, context_k, tag_list,
+                       max_rounds=max_rounds_, tools=tools_bool,
+                       reasoning_budget_tokens=reasoning_budget_)
+    elif mtype == "gemini_api":
+        reg.save_cloud_model(username, model_id, "google", model_name, label,
+                             account_id=account_id or None, context_k=context_k, tags=tag_list,
+                             max_rounds=max_rounds_, tools=tools_bool)
+    elif mtype in ("claude_cli", "anthropic_api"):
+        reg.save_cloud_model(username, model_id, "anthropic", model_name, label,
+                             credential_id=credential_id or "cli", context_k=context_k, tags=tag_list,
+                             max_rounds=max_rounds_, tools=tools_bool)
+    else:
+        return HTMLResponse(_render(username, request, error=f"Unknown model type: {mtype}"))
+    display = label.strip() or model_name.strip()
+    logger.info("model edited: %s / %s (%s)", username, display, mtype)
+    return HTMLResponse(_render(username, request, success=f'Model "{display}" updated.'))


@router.post("/settings/local/models/{model_id}/remove", include_in_schema=False)
@@ -261,7 +707,41 @@ async def remove_model(request: Request, model_id: str):
    if not username:
        return RedirectResponse("/login", status_code=302)
    reg.remove_model(username, model_id)
-    return HTMLResponse(_render(username, success="Model removed."))
+    return HTMLResponse(_render(username, request, success="Model removed."))
+
+
+@router.post("/settings/local/roles/add", include_in_schema=False)
+async def add_custom_role_route(
+    request:   Request,
+    role_name: str = Form(""),
+):
+    username = _get_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    name = role_name.strip().lower()
+    if not name or not name[0].isalpha():
+        return HTMLResponse(_render(username, request, error="Role name must start with a letter."))
+    ok = reg.add_custom_role(username, name)
+    if not ok:
+        return HTMLResponse(_render(username, request, error=f'"{name}" is a required role and cannot be re-added.'))
+    logger.info("custom role added: %s / %s", username, name)
+    return RedirectResponse("/settings/models#roles", status_code=303)
+
+
+@router.post("/settings/local/roles/remove", include_in_schema=False)
+async def remove_custom_role_route(
+    request:   Request,
+    role_name: str = Form(""),
+):
+    username = _get_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    name = role_name.strip()
+    ok = reg.remove_custom_role(username, name)
+    if not ok:
+        return HTMLResponse(_render(username, request, error=f'"{name}" is a required role and cannot be removed.'))
+    logger.info("custom role removed: %s / %s", username, name)
+    return RedirectResponse("/settings/models#roles", status_code=303)


@router.post("/api/models/role")
@@ -287,39 +767,68 @@ async def set_role(request: Request) -> JSONResponse:

    ok = reg.set_role(username, role, slot, model_id)
    if not ok:
-        return JSONResponse({"error": f"Invalid slot or model_id not found"}, status_code=400)
+        return JSONResponse({"error": "Invalid slot or model_id not found"}, status_code=400)

    logger.info("role set: %s %s.%s = %s", username, role, slot, model_id)
    return JSONResponse({"ok": True})


+@router.post("/api/models/role-config")
+async def set_role_config(request: Request) -> JSONResponse:
+    """AJAX: save system_append, tool allow-list, and inject_datetime flag for a role.
+
+    Body: {"role": "coder", "system_append": "...", "tools": [...] | null, "inject_datetime": true}
+    tools=null clears the allow-list (role uses all accessible tools).
+    inject_datetime=false suppresses the date/time header for pure processing roles.
+    """
+    username = _get_user(request)
+    if not username:
+        return JSONResponse({"error": "Not authenticated"}, status_code=401)
+    try:
+        body = await request.json()
+    except Exception:
+        return JSONResponse({"error": "Invalid JSON"}, status_code=400)
+
+    role             = body.get("role", "").strip()
+    system_append    = body.get("system_append", "")
+    tools            = body.get("tools")          # list[str] or None
+    inject_datetime  = body.get("inject_datetime", True)
+    inject_mode      = body.get("inject_mode", True)
+
+    if not role:
+        return JSONResponse({"error": "role is required"}, status_code=400)
+    if tools is not None and not isinstance(tools, list):
+        return JSONResponse({"error": "tools must be a list or null"}, status_code=400)
+
+    reg.set_role_config(username, role, system_append, tools,
+                        inject_datetime=bool(inject_datetime),
+                        inject_mode=bool(inject_mode))
+    logger.info("role config saved: %s %s (tools=%s inject_datetime=%s inject_mode=%s)",
+                username, role, len(tools) if tools is not None else "all",
+                inject_datetime, inject_mode)
+    return JSONResponse({"ok": True})
+
+
@router.get("/api/local-llm/fetch-models")
 async def fetch_models(request: Request, host_id: str = "") -> JSONResponse:
-    """Proxy to the host's /api/models endpoint. host_id selects which host."""
+    """Proxy to the host's models endpoint. host_id selects which host."""
    username = _get_user(request)
    if not username:
        return JSONResponse({"error": "Not authenticated"}, status_code=401)

    registry = reg.get_registry(username)
-    hosts = registry.get("hosts", [])
+    hosts    = registry.get("hosts", [])

-    if host_id:
-        host = next((h for h in hosts if h["id"] == host_id), None)
-    else:
-        host = hosts[0] if hosts else None
+    host = next((h for h in hosts if h["id"] == host_id), None) if host_id else (hosts[0] if hosts else None)

-    # Fall back to .env
    if host:
-        api_url = host.get("api_url", "")
-        api_key = host.get("api_key", "")
+        api_url, api_key, host_type = host.get("api_url",""), host.get("api_key",""), host.get("host_type","openwebui")
    else:
-        api_url = app_settings.local_api_url
-        api_key = app_settings.local_api_key
+        api_url, api_key, host_type = app_settings.local_api_url, app_settings.local_api_key, "openwebui"

    if not api_url:
        return JSONResponse({"error": "No host configured."}, status_code=400)

-    host_type   = host.get("host_type", "openwebui") if host else "openwebui"
    models_path = "/models" if host_type == "openai" else "/api/models"
    url         = api_url.rstrip("/") + models_path
    headers     = {"Authorization": f"Bearer {api_key}"} if api_key else {}
@@ -329,11 +838,10 @@ async def fetch_models(request: Request, host_id: str = "") -> JSONResponse:
            resp = await client.get(url, headers=headers)
        resp.raise_for_status()
        data   = resp.json()
-        models = [
-            {"id": m["id"], "name": m.get("name") or m["id"]}
-            for m in data.get("data", [])
-        ]
-        models.sort(key=lambda m: m["name"].lower())
+        models = sorted(
+            [{"id": m["id"], "name": m.get("name") or m["id"]} for m in data.get("data", [])],
+            key=lambda m: m["name"].lower(),
+        )
        return JSONResponse({"models": models})
    except httpx.HTTPStatusError as e:
        return JSONResponse({"error": f"Host returned {e.response.status_code}"}, status_code=502)
--- a/cortex/routers/nextcloud_talk.py
+++ b/cortex/routers/nextcloud_talk.py
@@ -1,10 +1,12 @@
 import asyncio
+import hashlib
+import hmac
 import json
 import logging

 from fastapi import APIRouter, BackgroundTasks, HTTPException, Request, Response

-from auth_utils import get_user_channels
+from auth_utils import get_user_channels, get_user_gemini_key, get_user_role, get_tool_policy, get_risk_policy
 from context_loader import load_context
 from llm_client import complete
 from notification import _send_nct_message
@@ -13,6 +15,9 @@ from session_logger import log_turn
 from session_store import load as load_session, save as save_session
 from config import settings
 import event_bus
+import model_registry
+import orchestrator_engine
+import openai_orchestrator

 logger = logging.getLogger(__name__)
 logger.setLevel(logging.DEBUG)
@@ -50,15 +55,19 @@ async def _process_message(
    nextcloud_url: str,
    secret: str,
    timeout: int,
+    cfg: dict,
 ) -> None:
    logger.info("NCT process: token=%s user=%s text=%r", conversation_token, actor_name, user_text)

    set_context(username, persona_name)

-    session_id    = f"nct_{username}_{conversation_token}"
-    system_prompt = load_context(settings.default_tier)
-    history       = load_session(session_id)
-    history.append({"role": "user", "content": user_text})
+    tier      = cfg.get("tier") or settings.default_tier
+    role      = cfg.get("role", "chat")
+    use_tools = cfg.get("tools", False)
+
+    session_id   = f"nct_{username}_{conversation_token}"
+    history      = load_session(session_id)
+    session_msgs = list(history)  # snapshot before we append

    await event_bus.publish({
        "type": "nct_message",
@@ -68,11 +77,76 @@ async def _process_message(
        "actor": actor_name,
    })

+    backend = "unknown"
    try:
-        response_text, backend = await asyncio.wait_for(
-            complete(system_prompt=system_prompt, messages=history),
-            timeout=timeout,
-        )
+        if use_tools:
+            await _send_reply(conversation_token, "⏳ Working on it…", nextcloud_url, secret)
+
+            role_cfg      = model_registry.get_role_config(username, role)
+            system_prompt = load_context(
+                tier,
+                role_append=role_cfg.get("system_append", ""),
+                inject_datetime=role_cfg.get("inject_datetime", True),
+                inject_mode=role_cfg.get("inject_mode", True),
+            )
+            orch_model    = model_registry.get_model_for_role(username, "orchestrator")
+            user_role_val = get_user_role(username)
+            tool_list     = role_cfg.get("tools")
+            policy        = get_tool_policy(username)
+            c_allow       = set(policy.get("allow", []))
+            c_deny        = set(policy.get("deny", []))
+            max_risk, risk_wl, risk_bl = get_risk_policy(username)
+
+            if orch_model and orch_model.get("type") == "local_openai":
+                result = await openai_orchestrator.run(
+                    task=user_text,
+                    system_prompt=system_prompt,
+                    session_messages=session_msgs or None,
+                    model_cfg=orch_model,
+                    user_role=user_role_val,
+                    tool_list=tool_list,
+                    confirm_allow=c_allow,
+                    confirm_deny=c_deny,
+                    max_risk=max_risk,
+                    risk_whitelist=risk_wl,
+                    risk_blacklist=risk_bl,
+                )
+            else:
+                gemini_key = (
+                    (orch_model.get("api_key") if orch_model else None)
+                    or get_user_gemini_key(username)
+                )
+                result = await orchestrator_engine.run(
+                    task=user_text,
+                    system_prompt=system_prompt,
+                    session_messages=session_msgs or None,
+                    respond_with_claude=True,
+                    gemini_api_key=gemini_key,
+                    model_name=orch_model.get("model_name") if orch_model else None,
+                    response_role=role,
+                    user_role=user_role_val,
+                    tool_list=tool_list,
+                    confirm_allow=c_allow,
+                    confirm_deny=c_deny,
+                    max_risk=max_risk,
+                    risk_whitelist=risk_wl,
+                    risk_blacklist=risk_bl,
+                )
+
+            response_text = result.response
+            backend       = result.backend
+
+            if result.checkpoint:
+                response_text += "\n\n_(This action requires confirmation — use the web UI to approve or deny.)_"
+
+        else:
+            system_prompt   = load_context(tier)
+            history_for_llm = list(session_msgs) + [{"role": "user", "content": user_text}]
+            response_text, backend = await asyncio.wait_for(
+                complete(system_prompt=system_prompt, messages=history_for_llm),
+                timeout=timeout,
+            )
+
    except asyncio.TimeoutError:
        logger.warning("NCT timeout for %s", conversation_token)
        await _send_reply(conversation_token, "⏳ Still thinking — this is taking longer than usual.", nextcloud_url, secret)
@@ -83,6 +157,8 @@ async def _process_message(
        return

    logger.info("NCT LLM responded via %s (%d chars)", backend, len(response_text))
+
+    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": response_text})
    save_session(session_id, history)
    log_turn(session_id, user_text, response_text)
@@ -163,6 +239,6 @@ async def nextcloud_talk_webhook(username: str, request: Request, background_tas
    background_tasks.add_task(
        _process_message,
        conversation_token, user_text, actor_name,
-        username, persona_name, nextcloud_url, secret, timeout,
+        username, persona_name, nextcloud_url, secret, timeout, cfg,
    )
    return Response(status_code=200)
--- a/cortex/routers/onboarding.py
+++ b/cortex/routers/onboarding.py
@@ -1,11 +1,13 @@
 """
-Onboarding router — invite-based setup + persona creation.
+Onboarding router — invite-based setup + persona creation + model connect.

 Routes:
  GET  /setup/{token}      → show password setup form (step 1)
  POST /setup/{token}      → set password, redirect to persona step
  GET  /setup/persona      → show persona creation form (step 2, requires auth)
-  POST /setup/persona      → create persona, redirect to /{user}/{persona}
+  POST /setup/persona      → create persona, redirect to /setup/model
+  GET  /setup/model        → OpenRouter quick-connect (step 3, also standalone)
+  POST /setup/model        → save host + model + assign to chat role, redirect to chat
 """

 import logging
@@ -21,6 +23,7 @@ from auth_utils import (
 )
 from persona_template import create_persona
 from persona import list_user_personas, validate as validate_persona
+import model_registry

 logger = logging.getLogger(__name__)
 router = APIRouter(prefix="/setup")
@@ -114,7 +117,11 @@ async def persona_submit(
        description=description.strip(),
    )
    logger.info("persona created: %s/%s", username, persona_name)
-    return RedirectResponse(f"/{username}/{persona_name}", status_code=302)
+    # Step 3: guided model setup before entering the chat
+    resp = RedirectResponse("/setup/model", status_code=302)
+    # Remember which persona to land on after model setup
+    resp.set_cookie("cx_setup_persona", f"{username}/{persona_name}", max_age=3600, httponly=True, samesite="lax")
+    return resp


 # ---------------------------------------------------------------------------
@@ -178,3 +185,126 @@ async def setup_submit(
        return resp

    return HTMLResponse(_setup_page("Unknown step."), status_code=400)
+
+
+# ---------------------------------------------------------------------------
+# Step 3 — model connect (OpenRouter quick-connect, also standalone)
+# ---------------------------------------------------------------------------
+
+# Curated model list shown in the Step 3 dropdown.
+_OPENROUTER_MODELS = [
+    ("anthropic/claude-3-5-haiku-20241022",  "Claude 3.5 Haiku — Fast & affordable"),
+    ("anthropic/claude-3-7-sonnet-20250219", "Claude 3.7 Sonnet — Smarter Claude"),
+    ("google/gemini-2.0-flash-001",          "Gemini 2.0 Flash — Fast Google model"),
+    ("meta-llama/llama-3.3-70b-instruct",    "Llama 3.3 70B — Open source"),
+]
+
+
+def _model_page(error: str = "", from_setup: bool = False) -> str:
+    html = (_STATIC / "setup.html").read_text()
+    # Hide steps 1 and 2 inline; show step 3
+    html = html.replace('<div id="step-password">', '<div id="step-password" style="display:none">')
+    html = html.replace('<div id="step-persona" style="display:none">', '<div id="step-persona" style="display:none">')
+    html = html.replace('<div id="step-model" style="display:none">', '<div id="step-model">')
+    if from_setup:
+        html = html.replace("<!-- SETUP_STEP3_LABEL -->", "Step 3 of 3")
+    if error:
+        html = html.replace("<!-- ERROR_MODEL -->", f'<p class="error">{error}</p>')
+    return html
+
+
+@router.post("/model/skip", include_in_schema=False)
+async def model_skip(request: Request):
+    """Skip model setup — redirect to the remembered persona or user root."""
+    from auth_utils import decode_token
+    import jwt
+    token = request.cookies.get(COOKIE_NAME)
+    username = None
+    if token:
+        try:
+            username = decode_token(token)
+        except jwt.InvalidTokenError:
+            pass
+
+    dest_cookie = request.cookies.get("cx_setup_persona", "")
+    dest = f"/{dest_cookie}" if dest_cookie else (f"/{username}" if username else "/")
+    resp = RedirectResponse(dest, status_code=302)
+    resp.delete_cookie("cx_setup_persona")
+    return resp
+
+
+@router.get("/model", include_in_schema=False)
+async def model_page(request: Request):
+    from auth_utils import decode_token
+    import jwt
+    token = request.cookies.get(COOKIE_NAME)
+    if not token:
+        return RedirectResponse("/login", status_code=302)
+    try:
+        decode_token(token)
+    except jwt.InvalidTokenError:
+        return RedirectResponse("/login", status_code=302)
+
+    from_setup = bool(request.cookies.get("cx_setup_persona"))
+    return HTMLResponse(_model_page(from_setup=from_setup))
+
+
+@router.post("/model", include_in_schema=False)
+async def model_submit(
+    request: Request,
+    api_key: str = Form(...),
+    model_name: str = Form(...),
+):
+    from auth_utils import decode_token
+    import jwt
+    token = request.cookies.get(COOKIE_NAME)
+    if not token:
+        return RedirectResponse("/login", status_code=302)
+    try:
+        username = decode_token(token)
+    except jwt.InvalidTokenError:
+        return RedirectResponse("/login", status_code=302)
+
+    api_key = api_key.strip()
+    model_name = model_name.strip()
+
+    if not api_key:
+        from_setup = bool(request.cookies.get("cx_setup_persona"))
+        return HTMLResponse(_model_page("API key is required.", from_setup=from_setup), status_code=422)
+
+    # Save OpenRouter as a host
+    host_id = model_registry.save_host(
+        username=username,
+        host_id=None,
+        label="OpenRouter",
+        api_url="https://openrouter.ai/api/v1",
+        api_key=api_key,
+        host_type="openai",
+    )
+
+    # Find label for selected model
+    label = next((lbl for mn, lbl in _OPENROUTER_MODELS if mn == model_name), model_name)
+    label = label.split(" — ")[0]  # keep just the model name part
+
+    # Save model entry
+    mid = model_registry.save_model(
+        username=username,
+        model_id=None,
+        host_id=host_id,
+        label=label,
+        model_name=model_name,
+        context_k=128,
+        tools=True,
+    )
+
+    # Assign as chat role primary
+    model_registry.set_role(username, "chat", "primary", mid)
+    logger.info("openrouter setup complete: %s → %s", username, model_name)
+
+    # Redirect to chat (use remembered persona, or user root)
+    dest_cookie = request.cookies.get("cx_setup_persona", "")
+    dest = f"/{dest_cookie}" if dest_cookie else f"/{username}"
+
+    resp = RedirectResponse(dest, status_code=302)
+    resp.delete_cookie("cx_setup_persona")
+    return resp
--- a/cortex/routers/orchestrator.py
+++ b/cortex/routers/orchestrator.py
@@ -12,13 +12,14 @@ Designed to be triggered from:

 import asyncio
 import logging
+import platform
 import uuid
 from datetime import datetime, timezone

-from fastapi import APIRouter
+from fastapi import APIRouter, HTTPException
 from pydantic import BaseModel

-from auth_utils import get_user_gemini_key
+from auth_utils import get_user_gemini_key, get_user_role, get_tool_policy, get_risk_policy
 from config import settings
 from context_loader import load_context
 from persona import set_context, validate as validate_persona
@@ -31,12 +32,16 @@ router = APIRouter(prefix="/orchestrate", tags=["orchestrator"])

 # ---------------------------------------------------------------------------
 # In-memory job store
-# Jobs are keyed by UUID. For this phase, memory is fine — jobs are short-lived.
 # ---------------------------------------------------------------------------

 _jobs: dict[str, dict] = {}
 _jobs_lock = asyncio.Lock()

+# Checkpoints are stored separately — they hold Python objects (types.Content, etc.)
+# that can't be included in the JSON-serializable job dict.
+_checkpoints: dict[str, orchestrator_engine.OrchestrateCheckpoint] = {}
+_checkpoints_lock = asyncio.Lock()
+

 # ---------------------------------------------------------------------------
 # Request / response models
@@ -52,11 +57,13 @@ class OrchestrateRequest(BaseModel):
    include_short: bool = True
    user: str = "scott"
    persona: str = "inara"
+    chat_role: str = "chat"             # role used for the final response (decoupled from tool-loop model)
+    off_record: bool = False            # skip session log; inject OTR mode line into system prompt


 class OrchestrateResponse(BaseModel):
    job_id: str
-    status: str     # "queued" | "running" | "complete" | "error"
+    status: str     # "queued" | "running" | "complete" | "error" | "awaiting_confirmation"


 class JobStatusResponse(BaseModel):
@@ -69,8 +76,11 @@ class JobStatusResponse(BaseModel):
    response: str | None = None
    tool_calls: list[dict] | None = None
    backend: str | None = None
+    backend_label: str | None = None
+    host: str | None = None
    gemini_summary: str | None = None
    error: str | None = None
+    pending_confirmation: dict | None = None  # {tools: [{name, args}], message: str}


 # ---------------------------------------------------------------------------
@@ -84,7 +94,6 @@ async def orchestrate(req: OrchestrateRequest) -> OrchestrateResponse:
        user, persona = validate_persona(req.user, req.persona)
        set_context(user, persona)
    except ValueError as e:
-        from fastapi import HTTPException
        raise HTTPException(status_code=400, detail=str(e))

    job_id = str(uuid.uuid4())
@@ -96,17 +105,20 @@ async def orchestrate(req: OrchestrateRequest) -> OrchestrateResponse:
        "task": req.task,
        "created_at": now,
        "completed_at": None,
+        "session_id": None,
        "response": None,
        "tool_calls": None,
        "backend": None,
        "gemini_summary": None,
        "error": None,
+        "pending_confirmation": None,
+        "_user": user,
+        "_off_record": req.off_record,
    }

    async with _jobs_lock:
        _jobs[job_id] = job

-    # Run in background — caller polls GET /orchestrate/{job_id}
    asyncio.create_task(_run_job(job_id, req, user))
    logger.info("Orchestrator job queued: %s — %.80s", job_id, req.task)
    return OrchestrateResponse(job_id=job_id, status="queued")
@@ -119,10 +131,9 @@ async def job_status(job_id: str) -> JobStatusResponse:
        job = _jobs.get(job_id)

    if job is None:
-        from fastapi import HTTPException
        raise HTTPException(status_code=404, detail=f"Job {job_id} not found")

-    return JobStatusResponse(**job)
+    return JobStatusResponse(**{k: v for k, v in job.items() if not k.startswith("_")})


@router.get("", response_model=list[JobStatusResponse])
@@ -130,11 +141,55 @@ async def list_jobs() -> list[JobStatusResponse]:
    """List all jobs (most recent first). Useful for debugging."""
    async with _jobs_lock:
        jobs = sorted(_jobs.values(), key=lambda j: j["created_at"], reverse=True)
-    return [JobStatusResponse(**j) for j in jobs]
+    return [JobStatusResponse(**{k: v for k, v in j.items() if not k.startswith("_")}) for j in jobs]
+
+
+@router.post("/{job_id}/confirm", response_model=OrchestrateResponse)
+async def confirm_job(job_id: str) -> OrchestrateResponse:
+    """Confirm a pending tool call — the blocked tool will execute and the job continues."""
+    async with _checkpoints_lock:
+        checkpoint = _checkpoints.pop(job_id, None)
+
+    if checkpoint is None:
+        raise HTTPException(status_code=404, detail="No pending confirmation for this job")
+
+    async with _jobs_lock:
+        job = _jobs.get(job_id)
+        if not job or job["status"] != "awaiting_confirmation":
+            raise HTTPException(status_code=409, detail="Job is not awaiting confirmation")
+        _jobs[job_id]["status"] = "running"
+        _jobs[job_id]["pending_confirmation"] = None
+        user = job.get("_user", "scott")
+
+    asyncio.create_task(_resume_job(job_id, checkpoint, confirmed=True, user=user))
+    logger.info("Orchestrator job %s confirmed — resuming", job_id)
+    return OrchestrateResponse(job_id=job_id, status="running")
+
+
+@router.post("/{job_id}/deny", response_model=OrchestrateResponse)
+async def deny_job(job_id: str) -> OrchestrateResponse:
+    """Deny a pending tool call — the tool is skipped and the job produces a final response."""
+    async with _checkpoints_lock:
+        checkpoint = _checkpoints.pop(job_id, None)
+
+    if checkpoint is None:
+        raise HTTPException(status_code=404, detail="No pending confirmation for this job")
+
+    async with _jobs_lock:
+        job = _jobs.get(job_id)
+        if not job or job["status"] != "awaiting_confirmation":
+            raise HTTPException(status_code=409, detail="Job is not awaiting confirmation")
+        _jobs[job_id]["status"] = "running"
+        _jobs[job_id]["pending_confirmation"] = None
+        user = job.get("_user", "scott")
+
+    asyncio.create_task(_resume_job(job_id, checkpoint, confirmed=False, user=user))
+    logger.info("Orchestrator job %s denied — resuming with skip", job_id)
+    return OrchestrateResponse(job_id=job_id, status="running")


 # ---------------------------------------------------------------------------
-# Background runner
+# Background runners
 # ---------------------------------------------------------------------------

 async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
@@ -145,22 +200,31 @@ async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
    try:
        from session_store import load as load_session, save as save_session, generate_session_id

-        # Load Inara's system prompt (same as the chat router does)
        tier = req.tier or settings.default_tier
+        role_cfg = model_registry.get_role_config(user, req.chat_role)
        system_prompt = load_context(
            tier,
            include_long=req.include_long,
            include_mid=req.include_mid,
            include_short=req.include_short,
+            role_append=role_cfg.get("system_append", ""),
+            inject_datetime=role_cfg.get("inject_datetime", True),
+            inject_mode=role_cfg.get("inject_mode", True),
+            mode="otr" if req.off_record else "chat",
        )

-        # Load session history if a session_id was provided
        session_id = req.session_id or generate_session_id()
        history = load_session(session_id)
        session_messages = history or None

-        # Choose engine based on the orchestrator role in the model registry
        orch_model = model_registry.get_model_for_role(user, "orchestrator")
+        user_role = get_user_role(user)
+        tool_list = role_cfg.get("tools")
+
+        policy = get_tool_policy(user)
+        confirm_allow = set(policy.get("allow", []))
+        confirm_deny = set(policy.get("deny", []))
+        max_risk, risk_wl, risk_bl = get_risk_policy(user)

        if orch_model and orch_model.get("type") == "local_openai":
            result = await openai_orchestrator.run(
@@ -169,36 +233,58 @@ async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
                session_messages=session_messages,
                model_cfg=orch_model,
                respond_with_final=req.respond_with_claude,
+                user_role=user_role,
+                tool_list=tool_list,
+                confirm_allow=confirm_allow,
+                confirm_deny=confirm_deny,
+                max_risk=max_risk,
+                risk_whitelist=risk_wl,
+                risk_blacklist=risk_bl,
            )
        else:
+            gemini_key = (
+                (orch_model.get("api_key") if orch_model else None)
+                or get_user_gemini_key(user)
+            )
            result = await orchestrator_engine.run(
                task=req.task,
                system_prompt=system_prompt,
                session_messages=session_messages,
                respond_with_claude=req.respond_with_claude,
-                gemini_api_key=get_user_gemini_key(user),
+                gemini_api_key=gemini_key,
+                model_name=orch_model.get("model_name") if orch_model else None,
+                response_role=req.chat_role,
+                user_role=user_role,
+                tool_list=tool_list,
+                confirm_allow=confirm_allow,
+                confirm_deny=confirm_deny,
+                max_rounds=orch_model.get("max_rounds") if orch_model else None,
+                max_risk=max_risk,
+                risk_whitelist=risk_wl,
+                risk_blacklist=risk_bl,
            )

-        # Save the turn to the session store so it survives a page refresh
-        history.append({"role": "user", "content": req.task})
-        history.append({"role": "assistant", "content": result.response})
-        save_session(session_id, history)
+        if result.checkpoint:
+            async with _checkpoints_lock:
+                _checkpoints[job_id] = result.checkpoint
+            async with _jobs_lock:
+                _jobs[job_id].update({
+                    "status": "awaiting_confirmation",
+                    "response": result.response,
+                    "tool_calls": result.tool_calls,
+                    "backend": result.backend,
+                    "gemini_summary": result.gemini_summary,
+                    "session_id": session_id,
+                    "pending_confirmation": {
+                        "tools": result.checkpoint.pending_tools,
+                        "message": result.response,
+                    },
+                })
+            logger.info("Orchestrator job %s awaiting confirmation — %d tool(s) blocked",
+                        job_id, len(result.checkpoint.pending_tools))
+            return

-        from session_logger import log_turn
-        log_turn(session_id, req.task, result.response)
-
-        now = datetime.now(timezone.utc).isoformat()
-        async with _jobs_lock:
-            _jobs[job_id].update({
-                "status": "complete",
-                "completed_at": now,
-                "session_id": session_id,
-                "response": result.response,
-                "tool_calls": result.tool_calls,
-                "backend": result.backend,
-                "gemini_summary": result.gemini_summary,
-            })
-        logger.info("Orchestrator job complete: %s (%d tool calls)", job_id, len(result.tool_calls))
+        await _finalize_job(job_id, result, session_id, req.task, history, off_record=req.off_record)

    except Exception as e:
        logger.exception("Orchestrator job failed: %s", job_id)
@@ -209,3 +295,100 @@ async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
                "completed_at": now,
                "error": str(e),
            })
+
+
+async def _resume_job(
+    job_id: str,
+    checkpoint: orchestrator_engine.OrchestrateCheckpoint,
+    confirmed: bool,
+    user: str,
+) -> None:
+    """Resume a job after the user confirms or denies a pending tool call."""
+    try:
+        if checkpoint.engine == "gemini":
+            result = await orchestrator_engine.resume(checkpoint, confirmed)
+        else:
+            result = await openai_orchestrator.resume(checkpoint, confirmed)
+
+        if result.checkpoint:
+            # Another confirmation needed (chained gates)
+            async with _checkpoints_lock:
+                _checkpoints[job_id] = result.checkpoint
+            async with _jobs_lock:
+                _jobs[job_id].update({
+                    "status": "awaiting_confirmation",
+                    "response": result.response,
+                    "tool_calls": result.tool_calls,
+                    "backend": result.backend,
+                    "gemini_summary": result.gemini_summary,
+                    "pending_confirmation": {
+                        "tools": result.checkpoint.pending_tools,
+                        "message": result.response,
+                    },
+                })
+            logger.info("Orchestrator job %s awaiting another confirmation", job_id)
+            return
+
+        async with _jobs_lock:
+            session_id  = _jobs[job_id].get("session_id") or ""
+            task        = _jobs[job_id].get("task", "")
+            off_record  = _jobs[job_id].get("_off_record", False)
+
+        from session_store import load as load_session
+        history = load_session(session_id) if session_id else []
+        await _finalize_job(job_id, result, session_id, task, history, off_record=off_record)
+
+    except Exception as e:
+        logger.exception("Orchestrator resume failed: %s", job_id)
+        now = datetime.now(timezone.utc).isoformat()
+        async with _jobs_lock:
+            _jobs[job_id].update({
+                "status": "error",
+                "completed_at": now,
+                "error": str(e),
+            })
+
+
+async def _finalize_job(
+    job_id: str,
+    result: orchestrator_engine.OrchestratorResult,
+    session_id: str,
+    task: str,
+    history: list,
+    off_record: bool = False,
+) -> None:
+    """Save session, log the turn, and mark the job complete."""
+    from session_store import save as save_session, generate_session_id
+    from session_logger import log_turn
+
+    if not session_id:
+        session_id = generate_session_id()
+
+    host = platform.node()
+    history.append({"role": "user", "content": task, "off_record": off_record})
+    history.append({
+        "role": "assistant",
+        "content": result.response,
+        "backend": result.backend,
+        "backend_label": result.backend_label,
+        "host": host,
+        "off_record": off_record,
+    })
+    save_session(session_id, history)
+    if not off_record:
+        log_turn(session_id, task, result.response)
+
+    now = datetime.now(timezone.utc).isoformat()
+    async with _jobs_lock:
+        _jobs[job_id].update({
+            "status": "complete",
+            "completed_at": now,
+            "session_id": session_id,
+            "response": result.response,
+            "tool_calls": result.tool_calls,
+            "backend": result.backend,
+            "backend_label": result.backend_label,
+            "host": host,
+            "gemini_summary": result.gemini_summary,
+        })
+    logger.info("Orchestrator job complete: %s (%d tool calls)", job_id, len(result.tool_calls))
--- a/cortex/routers/push.py
+++ b/cortex/routers/push.py
@@ -0,0 +1,120 @@
+"""
+Web Push endpoints.
+
+  GET  /api/push/vapid-key      → public VAPID key for browser PushManager.subscribe()
+  POST /api/push/subscribe      → save a push subscription for the logged-in user
+  DELETE /api/push/subscribe    → remove a subscription by endpoint
+"""
+import jwt
+from fastapi import APIRouter, HTTPException, Request
+from pydantic import BaseModel
+
+from auth_utils import COOKIE_NAME, decode_token
+from config import settings
+import push_utils
+
+router = APIRouter(prefix="/api/push")
+
+
+def _require_user(request: Request) -> str:
+    token = request.cookies.get(COOKIE_NAME)
+    if not token:
+        raise HTTPException(status_code=401, detail="Not authenticated")
+    try:
+        return decode_token(token)
+    except jwt.InvalidTokenError:
+        raise HTTPException(status_code=401, detail="Invalid session")
+
+
+@router.get("/vapid-key")
+async def get_vapid_key() -> dict:
+    """Return the VAPID public key. Public endpoint — needed before login to subscribe."""
+    key = settings.vapid_public_key
+    if not key:
+        raise HTTPException(status_code=503, detail="Push notifications not configured")
+    return {"public_key": key}
+
+
+class SubscribeRequest(BaseModel):
+    subscription: dict   # full PushSubscription JSON from browser
+
+
+class UnsubscribeRequest(BaseModel):
+    endpoint: str
+
+
+@router.post("/subscribe")
+async def subscribe(req: SubscribeRequest, request: Request) -> dict:
+    username = _require_user(request)
+    sub = req.subscription
+    if not sub.get("endpoint"):
+        raise HTTPException(status_code=400, detail="subscription.endpoint is required")
+    push_utils.add_subscription(username, sub)
+    return {"ok": True}
+
+
+@router.delete("/subscribe")
+async def unsubscribe(req: UnsubscribeRequest, request: Request) -> dict:
+    username = _require_user(request)
+    found = push_utils.remove_subscription(username, req.endpoint)
+    return {"ok": True, "found": found}
+
+
+@router.post("/test")
+async def notify_test(request: Request) -> dict:
+    """Send a test notification via the user's configured notification channel.
+
+    Useful for verifying channel setup (web push, NCT, email, etc.) without
+    waiting for a cron job or reminder to fire naturally.
+    """
+    username = _require_user(request)
+    from notification import notify
+    await notify(username, "Test notification from Cortex — your notification channel is working.")
+    return {"ok": True, "user": username}
+
+
+@router.post("/reminders/check")
+async def reminder_check_now(request: Request) -> dict:
+    """Run the reminder check for the current user immediately.
+
+    Same logic as the daily 09:00 scheduler job, but scoped to one user
+    and fired on demand. Returns how many reminders were found and whether
+    a notification was sent.
+    """
+    import re
+    username = _require_user(request)
+
+    from persona import list_user_personas, set_context
+    from notification import notify
+
+    total_sent = 0
+    for persona_name in list_user_personas(username):
+        set_context(username, persona_name)
+        from tools.reminders import load_due_reminders
+        content = load_due_reminders()
+        if not content:
+            continue
+
+        entries = []
+        for line in content.splitlines():
+            m = re.match(r"^\d+\.\s+(.+)", line.strip())
+            if m:
+                text = re.sub(r"\[(OVERDUE|due TODAY|due: \S+)\]", "", m.group(1)).strip()
+                if text:
+                    entries.append(text)
+
+        if not entries:
+            continue
+
+        count = len(entries)
+        if count == 1:
+            msg = f"Reminder: {entries[0]}"
+        else:
+            bullet_list = "\n".join(f"• {e}" for e in entries[:3])
+            tail = f"\n…and {count - 3} more" if count > 3 else ""
+            msg = f"{count} reminders due:\n{bullet_list}{tail}"
+
+        await notify(username, msg)
+        total_sent += count
+
+    return {"ok": True, "user": username, "reminders_found": total_sent}
--- a/cortex/routers/settings.py
+++ b/cortex/routers/settings.py
@@ -8,6 +8,8 @@ Routes:
  POST /settings/persona/rename → rename a persona directory
 """

+import html as _html
+import json
 import logging
 import re
 from pathlib import Path
@@ -16,7 +18,7 @@ import jwt
 from fastapi import APIRouter, Form, Request
 from fastapi.responses import HTMLResponse, RedirectResponse

-from auth_utils import COOKIE_NAME, decode_token, check_credentials, set_password, _read_auth, _write_auth
+from auth_utils import COOKIE_NAME, decode_token, check_credentials, set_password, _read_auth, _write_auth, get_user_channels
 from persona import list_user_personas
 from config import settings as app_settings

@@ -28,6 +30,9 @@ router = APIRouter()
 _STATIC = Path(__file__).parent.parent / "static"


+_LAST_PERSONA_COOKIE = "cx_last_persona"
+
+
 def _get_session_user(request: Request) -> str | None:
    token = request.cookies.get(COOKIE_NAME)
    if not token:
@@ -38,23 +43,91 @@ def _get_session_user(request: Request) -> str | None:
        return None


-def _settings_page(username: str, personas: list[str], success: str = "", error: str = "") -> str:
+def _preferred_persona(request: Request, username: str) -> str:
+    names = list_user_personas(username)
+    if not names:
+        return ""
+    cookie_val = request.cookies.get(_LAST_PERSONA_COOKIE, "")
+    if cookie_val in names:
+        return cookie_val
+    return names[0]
+
+
+def _integrations_nav(username: str) -> str:
+    """Return the Integrations nav link for admin users, empty string otherwise."""
+    role = _read_auth(username).get("role", "user")
+    if role == "admin":
+        return '<a href="/settings/integrations" class="nav-link">Integrations</a>'
+    return ""
+
+
+def _notifications_page(username: str, back_persona: str = "", success: str = "", error: str = "") -> str:
+    html = (_STATIC / "notifications.html").read_text()
+    channels = get_user_channels(username)
+    nct = channels.get("nextcloud") or {}
+
+    notify_ch       = _html.escape(channels.get("notification_channel", "") or "")
+    notify_email    = _html.escape(channels.get("notification_email", "") or "")
+    nc_url          = _html.escape(nct.get("url", "") or "")
+    nc_bot_secret   = _html.escape(nct.get("bot_secret", "") or "")
+    nc_room         = _html.escape(nct.get("notification_room", "") or "")
+    nc_username     = _html.escape(nct.get("nc_username", "") or "")
+    nc_app_password = _html.escape(nct.get("nc_app_password", "") or "")
+    gc_webhook      = _html.escape((channels.get("google_chat") or {}).get("outbound_webhook", "") or "")
+    ha              = channels.get("homeassistant") or {}
+    ha_url          = _html.escape(ha.get("url", "") or "")
+    ha_webhook_id   = _html.escape(ha.get("webhook_id", "") or "")
+    ha_tools_checked = "checked" if ha.get("tools", False) else ""
+
+    html = html.replace("{{ notify_channel }}", notify_ch)
+    html = html.replace("{{ notify_email_override }}", notify_email)
+    html = html.replace("{{ nc_url }}", nc_url)
+    html = html.replace("{{ nc_bot_secret }}", nc_bot_secret)
+    html = html.replace("{{ nc_notify_room }}", nc_room)
+    html = html.replace("{{ nc_username }}", nc_username)
+    html = html.replace("{{ nc_app_password }}", nc_app_password)
+    html = html.replace("{{ gc_webhook }}", gc_webhook)
+    html = html.replace("{{ ha_url }}", ha_url)
+    html = html.replace("{{ ha_webhook_id }}", ha_webhook_id)
+    html = html.replace("{{ ha_tools_checked }}", ha_tools_checked)
+    html = html.replace("{{ ha_username }}", username)
+    html = html.replace("{{ back_href }}", f"/{username}/{back_persona}" if back_persona else "/")
+    html = html.replace("{{ help_href }}", f"/help?persona={back_persona}" if back_persona else "/help")
+    html = html.replace("{{ integrations_nav }}", _integrations_nav(username))
+    if success:
+        html = html.replace("<!-- SUCCESS -->", f'<p class="success">{success}</p>')
+    if error:
+        html = html.replace("<!-- ERROR -->", f'<p class="error">{error}</p>')
+    return html
+
+
+def _settings_page(username: str, personas: list[str], back_persona: str = "", success: str = "", error: str = "") -> str:
    html = (_STATIC / "settings.html").read_text()
    html = html.replace("{{ username }}", username)

-    # Connected Google account
+    # Connected Google account (OAuth sign-in)
    auth_data    = _read_auth(username)
    google_email = auth_data.get("google_email") or ""
    html = html.replace("{{ google_email }}", google_email)

-    # Gemini API key — show masked hint only, never the full key
-    gemini_key = auth_data.get("gemini_api_key") or ""
-    if gemini_key:
-        hint = f"Saved (…{gemini_key[-4:]})"
-    else:
-        hint = "Using server key"
-    html = html.replace("{{ gemini_key_hint }}", hint)
-    html = html.replace("{{ gemini_key_set }}", "true" if gemini_key else "false")
+    role = auth_data.get("role", "user")
+    html = html.replace("{{ user_role }}", role)
+
+    al_path = app_settings.home_root() / username / "email_allowlist.json"
+    try:
+        patterns = json.loads(al_path.read_text())
+        allowlist_text = _html.escape("\n".join(str(p) for p in patterns if str(p).strip()))
+    except Exception:
+        allowlist_text = ""
+    html = html.replace("{{ email_allowlist }}", allowlist_text)
+
+    http_al_path = app_settings.home_root() / username / "http_allowlist.json"
+    try:
+        http_prefixes = json.loads(http_al_path.read_text())
+        http_allowlist_text = _html.escape("\n".join(str(p) for p in http_prefixes if str(p).strip()))
+    except Exception:
+        http_allowlist_text = ""
+    html = html.replace("{{ http_allowlist }}", http_allowlist_text)

    persona_items = "\n".join(
        f'''<li>
@@ -65,13 +138,33 @@ def _settings_page(username: str, personas: list[str], success: str = "", error:
            <input type="hidden" name="old_name" value="{p}">
            <input type="text" name="new_name" value="{p}"
                   pattern="[a-z_][a-z0-9_\\-]{{0,31}}" required>
-            <button type="submit">Save</button>
-            <button type="button" class="persona-rename-cancel">Cancel</button>
+            <button type="submit" class="btn-save">Save</button>
+            <button type="button" class="btn-cancel persona-rename-cancel">Cancel</button>
          </form>
        </li>''' for p in personas
    )
    html = html.replace("{{ persona_items }}", persona_items or "<li><em>No personas yet.</em></li>")
-    back_persona = personas[0] if personas else ""
+    if not back_persona:
+        back_persona = personas[0] if personas else ""
+    html = html.replace("{{ back_href }}", f"/{username}/{back_persona}" if back_persona else "/")
+    html = html.replace("{{ help_href }}", f"/help?persona={back_persona}" if back_persona else "/help")
+    html = html.replace("{{ integrations_nav }}", _integrations_nav(username))
+    if success:
+        html = html.replace("<!-- SUCCESS -->", f'<p class="success">{success}</p>')
+    if error:
+        html = html.replace("<!-- ERROR -->", f'<p class="error">{error}</p>')
+    return html
+
+
+def _integrations_page(username: str, back_persona: str = "", success: str = "", error: str = "") -> str:
+    html = (_STATIC / "integrations.html").read_text()
+    channels = get_user_channels(username)
+    ae_db = channels.get("aether_db") or {}
+
+    html = html.replace("{{ ae_db_host }}", _html.escape(ae_db.get("host", "") or ""))
+    html = html.replace("{{ ae_db_port }}", _html.escape(str(ae_db.get("port", 3306))))
+    html = html.replace("{{ ae_db_name }}", _html.escape(ae_db.get("name", "") or ""))
+    html = html.replace("{{ ae_db_user }}", _html.escape(ae_db.get("user", "") or ""))
    html = html.replace("{{ back_href }}", f"/{username}/{back_persona}" if back_persona else "/")
    html = html.replace("{{ help_href }}", f"/help?persona={back_persona}" if back_persona else "/help")
    if success:
@@ -87,7 +180,8 @@ async def settings_page(request: Request):
    if not username:
        return RedirectResponse("/login", status_code=302)
    personas = list_user_personas(username)
-    return HTMLResponse(_settings_page(username, personas))
+    back_persona = _preferred_persona(request, username)
+    return HTMLResponse(_settings_page(username, personas, back_persona=back_persona))


@router.post("/settings/password", include_in_schema=False)
@@ -102,19 +196,20 @@ async def change_password(
        return RedirectResponse("/login", status_code=302)

    personas = list_user_personas(username)
+    back_persona = _preferred_persona(request, username)

    if not check_credentials(username, current_password):
-        return HTMLResponse(_settings_page(username, personas, error="Current password is incorrect."))
+        return HTMLResponse(_settings_page(username, personas, back_persona, error="Current password is incorrect."))

    if len(new_password) < 8:
-        return HTMLResponse(_settings_page(username, personas, error="New password must be at least 8 characters."))
+        return HTMLResponse(_settings_page(username, personas, back_persona, error="New password must be at least 8 characters."))

    if new_password != confirm_password:
-        return HTMLResponse(_settings_page(username, personas, error="New passwords do not match."))
+        return HTMLResponse(_settings_page(username, personas, back_persona, error="New passwords do not match."))

    set_password(username, new_password)
    logger.info("password changed: %s", username)
-    return HTMLResponse(_settings_page(username, personas, success="Password updated successfully."))
+    return HTMLResponse(_settings_page(username, personas, back_persona, success="Password updated successfully."))


@router.post("/settings/username", include_in_schema=False)
@@ -127,11 +222,12 @@ async def rename_username(
        return RedirectResponse("/login", status_code=302)

    personas = list_user_personas(username)
+    back_persona = _preferred_persona(request, username)
    new_username = new_username.strip().lower()

    if not _SLUG_RE.match(new_username):
        return HTMLResponse(_settings_page(
-            username, personas,
+            username, personas, back_persona,
            error="Invalid username. Use lowercase letters, digits, _ or - only."))

    if new_username == username:
@@ -143,7 +239,7 @@ async def rename_username(

    if new_dir.exists():
        return HTMLResponse(_settings_page(
-            username, personas,
+            username, personas, back_persona,
            error=f"Username '{new_username}' is already taken."))

    old_dir.rename(new_dir)
@@ -165,6 +261,7 @@ async def save_gemini_key(
        return RedirectResponse("/login", status_code=302)

    personas = list_user_personas(username)
+    back_persona = _preferred_persona(request, username)
    gemini_api_key = gemini_api_key.strip()

    data = _read_auth(username)
@@ -176,7 +273,7 @@ async def save_gemini_key(
        msg = "Gemini API key removed — using server key."
    _write_auth(username, data)
    logger.info("gemini key updated: %s", username)
-    return HTMLResponse(_settings_page(username, personas, success=msg))
+    return HTMLResponse(_settings_page(username, personas, back_persona, success=msg))


@router.post("/settings/persona/rename", include_in_schema=False)
@@ -190,11 +287,12 @@ async def rename_persona(
        return RedirectResponse("/login", status_code=302)

    personas = list_user_personas(username)
+    back_persona = _preferred_persona(request, username)
    new_name = new_name.strip().lower()

    if not _SLUG_RE.match(new_name):
        return HTMLResponse(_settings_page(
-            username, personas,
+            username, personas, back_persona,
            error="Invalid name. Use lowercase letters, digits, _ or - only."))

    if new_name == old_name:
@@ -205,13 +303,197 @@ async def rename_persona(
    new_dir = persona_root / new_name

    if not old_dir.exists():
-        return HTMLResponse(_settings_page(username, personas, error=f"Persona '{old_name}' not found."))
+        return HTMLResponse(_settings_page(username, personas, back_persona, error=f"Persona '{old_name}' not found."))

    if new_dir.exists():
        return HTMLResponse(_settings_page(
-            username, personas,
+            username, personas, back_persona,
            error=f"A persona named '{new_name}' already exists."))

    old_dir.rename(new_dir)
    logger.info("persona renamed: %s/%s → %s", username, old_name, new_name)
    return RedirectResponse("/settings", status_code=302)
+
+
+@router.get("/settings/notifications", include_in_schema=False)
+async def notifications_page(request: Request):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+    return HTMLResponse(_notifications_page(username, back_persona))
+
+
+@router.post("/settings/notifications", include_in_schema=False)
+async def save_notifications(
+    request: Request,
+    notification_channel: str = Form(""),
+    notification_email: str = Form(""),
+    nc_url: str = Form(""),
+    nc_bot_secret: str = Form(""),
+    nc_notification_room: str = Form(""),
+    nc_username: str = Form(""),
+    nc_app_password: str = Form(""),
+    gc_outbound_webhook: str = Form(""),
+    ha_url: str = Form(""),
+    ha_token: str = Form(""),
+    ha_webhook_id: str = Form(""),
+    ha_tools: str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+
+    back_persona = _preferred_persona(request, username)
+
+    channels_path = app_settings.home_root() / username / "channels.json"
+    try:
+        channels = json.loads(channels_path.read_text())
+    except Exception:
+        channels = {}
+
+    # Top-level notification preference
+    notification_channel = notification_channel.strip()
+    if notification_channel in ("web_push", "email", "nextcloud", "google_chat"):
+        channels["notification_channel"] = notification_channel
+    else:
+        channels.pop("notification_channel", None)
+
+    # Optional email address override (blank = use login email)
+    notification_email = notification_email.strip()
+    if notification_email:
+        channels["notification_email"] = notification_email
+    else:
+        channels.pop("notification_email", None)
+
+    # Nextcloud Talk — full config nested under "nextcloud"
+    if "nextcloud" not in channels:
+        channels["nextcloud"] = {}
+    nct = channels["nextcloud"]
+    if nc_url.strip():
+        nct["url"] = nc_url.strip().rstrip("/")
+    # Only overwrite secrets if a new value was provided (blank = keep existing)
+    if nc_bot_secret.strip():
+        nct["bot_secret"] = nc_bot_secret.strip()
+    nct["notification_room"] = nc_notification_room.strip()
+    if nc_username.strip():
+        nct["nc_username"] = nc_username.strip()
+    if nc_app_password.strip():
+        nct["nc_app_password"] = nc_app_password.strip()
+
+    # Google Chat outbound webhook — nested under "google_chat"
+    if "google_chat" not in channels:
+        channels["google_chat"] = {}
+    channels["google_chat"]["outbound_webhook"] = gc_outbound_webhook.strip()
+
+    # Home Assistant — nested under "homeassistant"
+    if "homeassistant" not in channels:
+        channels["homeassistant"] = {}
+    ha = channels["homeassistant"]
+    if ha_url.strip():
+        ha["url"] = ha_url.strip().rstrip("/")
+    if ha_token.strip():
+        ha["token"] = ha_token.strip()
+    if ha_webhook_id.strip():
+        ha["webhook_id"] = ha_webhook_id.strip()
+    ha["tools"] = ha_tools == "1"
+
+    channels_path.write_text(json.dumps(channels, indent=2) + "\n")
+    logger.info("notifications updated for %s (channel=%s)", username, notification_channel or "none")
+    return HTMLResponse(_notifications_page(username, back_persona, success="Notification settings saved."))
+
+
+@router.post("/settings/email-allowlist", include_in_schema=False)
+async def save_email_allowlist(
+    request: Request,
+    patterns: str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+
+    personas = list_user_personas(username)
+    back_persona = _preferred_persona(request, username)
+    lines = [ln.strip() for ln in patterns.splitlines() if ln.strip()]
+    path = app_settings.home_root() / username / "email_allowlist.json"
+    path.write_text(json.dumps(lines, indent=2))
+    logger.info("email allowlist updated for %s (%d patterns)", username, len(lines))
+    return HTMLResponse(_settings_page(username, personas, back_persona, success=f"Email allowlist saved ({len(lines)} pattern{'s' if len(lines) != 1 else ''})."))
+
+
+@router.post("/settings/http-allowlist", include_in_schema=False)
+async def save_http_allowlist(
+    request: Request,
+    prefixes: str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+
+    personas = list_user_personas(username)
+    back_persona = _preferred_persona(request, username)
+    lines = [ln.strip() for ln in prefixes.splitlines() if ln.strip()]
+    path = app_settings.home_root() / username / "http_allowlist.json"
+    path.write_text(json.dumps(lines, indent=2))
+    logger.info("http allowlist updated for %s (%d prefixes)", username, len(lines))
+    return HTMLResponse(_settings_page(username, personas, back_persona, success=f"HTTP allowlist saved ({len(lines)} prefix{'es' if len(lines) != 1 else ''})."))
+
+
+def _require_admin(username: str) -> bool:
+    return _read_auth(username).get("role", "user") == "admin"
+
+
+@router.get("/settings/integrations", include_in_schema=False)
+async def integrations_page(request: Request):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    if not _require_admin(username):
+        return RedirectResponse("/settings", status_code=302)
+    back_persona = _preferred_persona(request, username)
+    return HTMLResponse(_integrations_page(username, back_persona))
+
+
+@router.post("/settings/integrations", include_in_schema=False)
+async def save_integrations(
+    request: Request,
+    ae_db_host: str = Form(""),
+    ae_db_port: str = Form("3306"),
+    ae_db_name: str = Form(""),
+    ae_db_user: str = Form(""),
+    ae_db_password: str = Form(""),
+):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    if not _require_admin(username):
+        return RedirectResponse("/settings", status_code=302)
+
+    back_persona = _preferred_persona(request, username)
+
+    channels_path = app_settings.home_root() / username / "channels.json"
+    try:
+        channels = json.loads(channels_path.read_text())
+    except Exception:
+        channels = {}
+
+    if "aether_db" not in channels:
+        channels["aether_db"] = {}
+    db = channels["aether_db"]
+
+    if ae_db_host.strip():
+        db["host"] = ae_db_host.strip()
+    try:
+        db["port"] = int(ae_db_port.strip()) if ae_db_port.strip() else 3306
+    except ValueError:
+        db["port"] = 3306
+    if ae_db_name.strip():
+        db["name"] = ae_db_name.strip()
+    if ae_db_user.strip():
+        db["user"] = ae_db_user.strip()
+    if ae_db_password.strip():
+        db["password"] = ae_db_password.strip()
+
+    channels_path.write_text(json.dumps(channels, indent=2) + "\n")
+    logger.info("integrations updated for %s", username)
+    return HTMLResponse(_integrations_page(username, back_persona, success="Integration settings saved."))
--- a/cortex/routers/tools_settings.py
+++ b/cortex/routers/tools_settings.py
@@ -0,0 +1,193 @@
+"""
+Tool settings router.
+
+Routes:
+  GET  /settings/tools  → tool risk policy page
+  POST /settings/tools  → save max_risk + per-tool overrides
+"""
+
+import html as _html
+import json
+import logging
+from pathlib import Path
+
+import jwt
+from fastapi import APIRouter, Form, Request
+from fastapi.responses import HTMLResponse, RedirectResponse
+
+from auth_utils import COOKIE_NAME, decode_token, get_tool_policy, save_tool_policy, _read_auth
+from persona import list_user_personas
+from tools import TOOL_CATEGORIES, TOOL_RISK, CONFIRM_REQUIRED
+
+logger = logging.getLogger(__name__)
+router = APIRouter()
+
+_STATIC = Path(__file__).parent.parent / "static"
+_LAST_PERSONA_COOKIE = "cx_last_persona"
+
+
+def _get_session_user(request: Request) -> str | None:
+    token = request.cookies.get(COOKIE_NAME)
+    if not token:
+        return None
+    try:
+        return decode_token(token)
+    except jwt.InvalidTokenError:
+        return None
+
+
+def _preferred_persona(request: Request, username: str) -> str:
+    names = list_user_personas(username)
+    if not names:
+        return ""
+    cookie_val = request.cookies.get(_LAST_PERSONA_COOKIE, "")
+    return cookie_val if cookie_val in names else (names[0] if names else "")
+
+
+def _build_tool_table(policy: dict) -> str:
+    """Generate the per-tool override table rows grouped by category."""
+    whitelist = set(policy.get("whitelist") or [])
+    blacklist = set(policy.get("blacklist") or [])
+
+    rows: list[str] = []
+    for category, tools in TOOL_CATEGORIES.items():
+        # Category header spanning all columns
+        escaped_cat = _html.escape(category)
+        rows.append(f'<tr class="tool-cat-row"><td colspan="4">{escaped_cat}</td></tr>')
+        for tool in tools:
+            risk      = TOOL_RISK.get(tool, "medium")
+            risk_cls  = f"risk-{risk}"
+            risk_html = f'<span class="risk {risk_cls}">{_html.escape(risk)}</span>'
+
+            # Override select value
+            if tool in whitelist:
+                override_val = "whitelist"
+            elif tool in blacklist:
+                override_val = "blacklist"
+            else:
+                override_val = "default"
+
+            def _opt(val: str, label: str) -> str:
+                sel = 'selected' if override_val == val else ''
+                return f'<option value="{val}" {sel}>{label}</option>'
+
+            override_sel = (
+                f'<select name="override_{_html.escape(tool)}" '
+                f'class="override-sel" data-tool="{_html.escape(tool)}">'
+                + _opt("default",   "Default (auto)")
+                + _opt("whitelist", "Force include")
+                + _opt("blacklist", "Force exclude")
+                + '</select>'
+            )
+
+            rows.append(
+                f'<tr data-tool-risk="{_html.escape(risk)}">'
+                f'<td class="tool-name">{_html.escape(tool)}</td>'
+                f'<td>{risk_html}</td>'
+                f'<td><span class="auto-pill"></span></td>'
+                f'<td>{override_sel}</td>'
+                f'</tr>'
+            )
+
+    table_body = "\n".join(rows)
+    return (
+        '<table class="tool-table">'
+        '<thead><tr>'
+        '<th>Tool</th><th>Risk</th><th>Auto status</th><th>Override</th>'
+        '</tr></thead>'
+        f'<tbody>{table_body}</tbody>'
+        '</table>'
+    )
+
+
+def _tools_page(
+    username: str,
+    back_persona: str = "",
+    success: str = "",
+    error: str = "",
+) -> str:
+    html = (_STATIC / "tools_settings.html").read_text()
+    policy  = get_tool_policy(username)
+    max_risk = policy.get("max_risk") or ""
+
+    # Max risk select options
+    html = html.replace("{{ sel_none }}",   "selected" if max_risk == ""       else "")
+    html = html.replace("{{ sel_low }}",    "selected" if max_risk == "low"    else "")
+    html = html.replace("{{ sel_medium }}", "selected" if max_risk == "medium" else "")
+    html = html.replace("{{ sel_high }}",   "selected" if max_risk == "high"   else "")
+
+    html = html.replace("{{ tool_table_html }}", _build_tool_table(policy))
+    html = html.replace("{{ tool_risk_json }}", json.dumps(TOOL_RISK))
+    html = html.replace("{{ confirm_required_tools }}", _html.escape(", ".join(sorted(CONFIRM_REQUIRED))))
+    html = html.replace("{{ tool_allow }}", _html.escape("\n".join(policy.get("allow") or [])))
+    html = html.replace("{{ tool_deny }}",  _html.escape("\n".join(policy.get("deny")  or [])))
+    html = html.replace("{{ back_href }}", f"/{username}/{back_persona}" if back_persona else "/")
+    html = html.replace("{{ help_href }}", f"/help?persona={back_persona}" if back_persona else "/help")
+    nav = '<a href="/settings/integrations" class="nav-link">Integrations</a>' \
+        if _read_auth(username).get("role", "user") == "admin" else ""
+    html = html.replace("{{ integrations_nav }}", nav)
+
+    if success:
+        html = html.replace("<!-- SUCCESS -->", f'<p class="success">{success}</p>')
+    if error:
+        html = html.replace("<!-- ERROR -->", f'<p class="error">{error}</p>')
+    return html
+
+
+@router.get("/settings/tools", include_in_schema=False)
+async def tools_page(request: Request):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+    back_persona = _preferred_persona(request, username)
+    return HTMLResponse(_tools_page(username, back_persona))
+
+
+@router.post("/settings/tools", include_in_schema=False)
+async def save_tools(request: Request):
+    username = _get_session_user(request)
+    if not username:
+        return RedirectResponse("/login", status_code=302)
+
+    back_persona = _preferred_persona(request, username)
+    form = await request.form()
+
+    max_risk = (form.get("max_risk") or "").strip()
+    if max_risk not in ("", "low", "medium", "high"):
+        max_risk = ""
+
+    whitelist: list[str] = []
+    blacklist: list[str] = []
+
+    all_tools = [t for tools in TOOL_CATEGORIES.values() for t in tools]
+    for tool in all_tools:
+        val = (form.get(f"override_{tool}") or "").strip()
+        if val == "whitelist":
+            whitelist.append(tool)
+        elif val == "blacklist":
+            blacklist.append(tool)
+
+    allow_tools = [ln.strip() for ln in (form.get("allow_list") or "").splitlines() if ln.strip()]
+    deny_tools  = [ln.strip() for ln in (form.get("deny_list")  or "").splitlines() if ln.strip()]
+
+    policy = get_tool_policy(username)
+    if max_risk:
+        policy["max_risk"] = max_risk
+    else:
+        policy.pop("max_risk", None)
+
+    policy["whitelist"] = whitelist
+    policy["blacklist"] = blacklist
+    policy["allow"]     = allow_tools
+    policy["deny"]      = deny_tools
+
+    save_tool_policy(username, policy)
+    logger.info(
+        "tool policy saved for %s: max_risk=%s whitelist=%d blacklist=%d allow=%d deny=%d",
+        username, max_risk or "none", len(whitelist), len(blacklist), len(allow_tools), len(deny_tools),
+    )
+    return HTMLResponse(_tools_page(
+        username, back_persona,
+        success=f"Tool policy saved — max risk: {max_risk or 'none'}, "
+                f"{len(whitelist)} whitelisted, {len(blacklist)} blacklisted.",
+    ))
--- a/cortex/routers/ui.py
+++ b/cortex/routers/ui.py
@@ -56,12 +56,26 @@ def _set_cookie(response: Response, username: str) -> None:
    )


+_LAST_PERSONA_COOKIE = "cx_last_persona"
+
+
 def _first_persona(username: str) -> str | None:
    """Return the first available persona for a user, or None."""
    names = list_user_personas(username)
    return names[0] if names else None


+def _preferred_persona(request: Request, username: str) -> str | None:
+    """Return the last-visited persona from cookie if valid, else the first available."""
+    names = list_user_personas(username)
+    if not names:
+        return None
+    cookie_val = request.cookies.get(_LAST_PERSONA_COOKIE, "")
+    if cookie_val in names:
+        return cookie_val
+    return names[0]
+
+
 # ---------------------------------------------------------------------------
 # Favicon — default sparkle; persona pages override via JS
 # ---------------------------------------------------------------------------
@@ -76,6 +90,18 @@ async def favicon():
    return Response(content=_FAVICON_SVG, media_type="image/svg+xml")


+@router.get("/sw.js", include_in_schema=False)
+async def service_worker():
+    from fastapi.responses import FileResponse
+    return FileResponse(str(_STATIC / "sw.js"), media_type="application/javascript")
+
+
+@router.get("/manifest.json", include_in_schema=False)
+async def web_manifest():
+    from fastapi.responses import FileResponse
+    return FileResponse(str(_STATIC / "manifest.json"), media_type="application/manifest+json")
+
+
 # ---------------------------------------------------------------------------
 # Root redirect
 # ---------------------------------------------------------------------------
@@ -85,7 +111,7 @@ async def root(request: Request):
    user = _get_session_user(request)
    if not user:
        return RedirectResponse("/login", status_code=302)
-    persona = _first_persona(user)
+    persona = _preferred_persona(request, user)
    if not persona:
        return HTMLResponse("<h1>No personas configured for your account.</h1>", status_code=500)
    return RedirectResponse(f"/{user}/{persona}", status_code=302)
@@ -100,7 +126,7 @@ async def login_page(request: Request):
    user = _get_session_user(request)
    if user:
        # Already logged in — redirect home
-        persona = _first_persona(user)
+        persona = _preferred_persona(request, user)
        if persona:
            return RedirectResponse(f"/{user}/{persona}", status_code=302)
    return HTMLResponse((_STATIC / "login.html").read_text())
@@ -254,7 +280,16 @@ async def api_personas(request: Request) -> dict:
    if not user:
        from fastapi import HTTPException
        raise HTTPException(status_code=401, detail="Not authenticated")
-    return {"user": user, "personas": list_user_personas(user)}
+    personas_with_emoji = []
+    for p in list_user_personas(user):
+        emoji = "✨"
+        identity_path = persona_path(user, p) / "IDENTITY.md"
+        if identity_path.exists():
+            m = re.search(r"\|\s*Emoji\s*\|\s*(.+?)\s*\|", identity_path.read_text())
+            if m:
+                emoji = m.group(1).strip()
+        personas_with_emoji.append({"name": p, "emoji": emoji})
+    return {"user": user, "personas": personas_with_emoji}


@router.get("/{username}/{persona}", include_in_schema=False)
@@ -288,4 +323,6 @@ async def serve_ui(username: str, persona: str, request: Request):
        f'{{user: "{username}", persona: "{persona}", emoji: "{emoji}"}};</script>'
    )
    html = html.replace("</head>", f"{config_tag}\n</head>", 1)
-    return HTMLResponse(html)
+    resp = HTMLResponse(html)
+    resp.set_cookie(_LAST_PERSONA_COOKIE, persona, max_age=365 * 86400, httponly=False, samesite="lax")
+    return resp
--- a/cortex/routers/usage.py
+++ b/cortex/routers/usage.py
@@ -0,0 +1,104 @@
+"""
+Usage / token-tracking endpoints.
+
+Self-service (any authenticated user, own data):
+  GET /api/usage              → full usage dict  {date: {model_key: {calls, prompt_tokens, completion_tokens}}}
+  GET /api/usage/summary      → aggregate totals per model key, with friendly labels resolved from registry
+
+Admin-only (cross-user aggregation):
+  GET /api/usage/all          → summary for every user  {username: summary_dict}
+"""
+import jwt
+from fastapi import APIRouter, HTTPException, Request
+
+from auth_utils import COOKIE_NAME, decode_token, get_user_role
+from persona import list_users
+import model_registry
+import usage_tracker
+
+router = APIRouter(prefix="/api/usage")
+
+
+def _session_user(request: Request) -> str:
+    token = request.cookies.get(COOKIE_NAME)
+    if not token:
+        raise HTTPException(status_code=401, detail="Not authenticated")
+    try:
+        return decode_token(token)
+    except jwt.InvalidTokenError:
+        raise HTTPException(status_code=401, detail="Invalid session")
+
+
+def _build_label_map(username: str) -> dict[str, str]:
+    """Build a map from usage key (backend/model_name) → registered label."""
+    label_map: dict[str, str] = {}
+    try:
+        for m in model_registry.get_all_models(username):
+            model_name = m.get("model_name", "")
+            label = m.get("label", "")
+            host_type = m.get("host_type", "")
+            if not model_name or not label:
+                continue
+            # local models: key is "local/{model_name}"
+            if host_type in ("openwebui", "ollama", "openai_compatible"):
+                label_map[f"local/{model_name}"] = label
+            # cloud Gemini: key is "gemini_api/{model_name}"
+            elif host_type == "google":
+                label_map[f"gemini_api/{model_name}"] = label
+    except Exception:
+        pass
+    return label_map
+
+
+def _summarize(data: dict, label_map: dict[str, str] | None = None) -> list[dict]:
+    """Collapse date-keyed usage dict into per-model totals, sorted by total tokens desc."""
+    totals: dict[str, dict] = {}
+    for _date, models in data.items():
+        for key, counts in models.items():
+            t = totals.setdefault(key, {"calls": 0, "prompt_tokens": 0, "completion_tokens": 0})
+            t["calls"]             += counts.get("calls", 0)
+            t["prompt_tokens"]     += counts.get("prompt_tokens", 0)
+            t["completion_tokens"] += counts.get("completion_tokens", 0)
+
+    result = []
+    for key, counts in totals.items():
+        entry = {
+            "key":              key,
+            "label":            (label_map or {}).get(key) or key,
+            "calls":            counts["calls"],
+            "prompt_tokens":    counts["prompt_tokens"],
+            "completion_tokens": counts["completion_tokens"],
+            "total_tokens":     counts["prompt_tokens"] + counts["completion_tokens"],
+        }
+        result.append(entry)
+
+    result.sort(key=lambda x: x["total_tokens"], reverse=True)
+    return result
+
+
+@router.get("")
+async def get_usage(request: Request) -> dict:
+    """Return the raw daily usage log for the authenticated user."""
+    username = _session_user(request)
+    return usage_tracker.read_usage(username)
+
+
+@router.get("/summary")
+async def get_usage_summary(request: Request) -> list:
+    """Return per-model totals (all time) for the authenticated user, with friendly labels."""
+    username = _session_user(request)
+    label_map = _build_label_map(username)
+    return _summarize(usage_tracker.read_usage(username), label_map)
+
+
+@router.get("/all")
+async def get_all_usage(request: Request) -> dict:
+    """Admin: return per-model summary for every user."""
+    username = _session_user(request)
+    if get_user_role(username) != "admin":
+        raise HTTPException(status_code=403, detail="Admin access required")
+    result = {}
+    for user in list_users():
+        label_map = _build_label_map(user)
+        result[user] = _summarize(usage_tracker.read_usage(user), label_map)
+    return result
--- a/cortex/scheduler.py
+++ b/cortex/scheduler.py
@@ -19,41 +19,95 @@ logger = logging.getLogger(__name__)
 _scheduler: AsyncIOScheduler | None = None


+def _all_personas() -> list[tuple[str, str]]:
+    """Return [(username, persona_name)] for every persona on this instance."""
+    from persona import list_users, list_user_personas
+    pairs = []
+    for u in list_users():
+        for p in list_user_personas(u):
+            pairs.append((u, p))
+    return pairs
+
+
 async def _run_short() -> None:
    from memory_distiller import distill_short
-    try:
-        result = distill_short()
-        logger.info("auto distill short: %d files, %d chars", result["files_included"], result["chars_written"])
-    except Exception as e:
-        logger.error("auto distill short failed: %s", e)
+    for u, p in _all_personas():
+        try:
+            result = distill_short(u, p)
+            logger.info("auto distill short [%s/%s]: %d files, %d chars", u, p, result["files_included"], result["chars_written"])
+        except Exception as e:
+            logger.error("auto distill short [%s/%s] failed: %s", u, p, e)


 async def _run_mid() -> None:
    from memory_distiller import distill_mid
    from notification import notify
-    try:
-        result = await distill_mid()
-        if "error" in result:
-            logger.warning("auto distill mid skipped: %s", result["error"])
-        else:
-            logger.info("auto distill mid: %d chars via %s", result["chars_written"], result["backend"])
-            await notify(result["username"], f"📝 Weekly memory digest complete ({result['chars_written']} chars via {result['backend']}).")
-    except Exception as e:
-        logger.error("auto distill mid failed: %s", e)
+    for u, p in _all_personas():
+        try:
+            result = await distill_mid(u, p)
+            if "error" in result:
+                logger.warning("auto distill mid [%s/%s] skipped: %s", u, p, result["error"])
+            else:
+                logger.info("auto distill mid [%s/%s]: %d chars via %s", u, p, result["chars_written"], result["backend"])
+                await notify(u, f"📝 Weekly memory digest complete ({result['chars_written']} chars via {result['backend']}).")
+        except Exception as e:
+            logger.error("auto distill mid [%s/%s] failed: %s", u, p, e)


 async def _run_long() -> None:
    from memory_distiller import distill_long
    from notification import notify
-    try:
-        result = await distill_long()
-        if "error" in result:
-            logger.warning("auto distill long skipped: %s", result["error"])
-        else:
-            logger.info("auto distill long: %d chars via %s", result["chars_written"], result["backend"])
-            await notify(result["username"], f"🧠 Monthly long-term memory integration complete ({result['chars_written']} chars via {result['backend']}). Worth a quick review.")
-    except Exception as e:
-        logger.error("auto distill long failed: %s", e)
+    for u, p in _all_personas():
+        try:
+            result = await distill_long(u, p)
+            if "error" in result:
+                logger.warning("auto distill long [%s/%s] skipped: %s", u, p, result["error"])
+            else:
+                logger.info("auto distill long [%s/%s]: %d chars via %s", u, p, result["chars_written"], result["backend"])
+                await notify(u, f"🧠 Monthly long-term memory integration complete ({result['chars_written']} chars via {result['backend']}). Worth a quick review.")
+        except Exception as e:
+            logger.error("auto distill long [%s/%s] failed: %s", u, p, e)
+
+
+async def _run_reminder_check() -> None:
+    """Notify users of any due or overdue reminders (fires once daily at 09:00)."""
+    import re
+    from notification import notify
+    from persona import set_context
+
+    for u, p in _all_personas():
+        try:
+            set_context(u, p)
+            from tools.reminders import load_due_reminders
+            content = load_due_reminders()
+            if not content:
+                continue
+
+            # Extract numbered entries (lines like "1. [label] text" or "1. text")
+            entries = []
+            for line in content.splitlines():
+                m = re.match(r"^\d+\.\s+(.+)", line.strip())
+                if m:
+                    # Strip status tags ([OVERDUE], [due TODAY], etc.) for display
+                    text = re.sub(r"\[(OVERDUE|due TODAY|due: \S+)\]", "", m.group(1)).strip()
+                    if text:
+                        entries.append(text)
+
+            if not entries:
+                continue
+
+            count = len(entries)
+            if count == 1:
+                msg = f"Reminder: {entries[0]}"
+            else:
+                bullet_list = "\n".join(f"• {e}" for e in entries[:3])
+                tail = f"\n…and {count - 3} more" if count > 3 else ""
+                msg = f"{count} reminders due:\n{bullet_list}{tail}"
+
+            await notify(u, msg)
+            logger.info("reminder check [%s/%s]: notified %d reminder(s)", u, p, count)
+        except Exception as e:
+            logger.error("reminder check [%s/%s] failed: %s", u, p, e)


 def get_scheduler() -> AsyncIOScheduler | None:
@@ -80,6 +134,10 @@ def start() -> None:
        _scheduler.add_job(_run_long, "cron", day=1, hour=4, minute=0, id="distill_long")
        logger.info("scheduled: distill_long monthly on 1st at 04:00")

+    # Daily reminder notification check — 09:00
+    _scheduler.add_job(_run_reminder_check, "cron", hour=9, minute=0, id="reminder_check")
+    logger.info("scheduled: reminder_check daily at 09:00")
+
    # Load user-defined cron jobs from CRONS.json
    _load_user_crons()

--- a/cortex/session_logger.py
+++ b/cortex/session_logger.py
@@ -1,6 +1,5 @@
 from datetime import datetime
-from config import settings
-from persona import persona_path
+from persona import persona_path, get_user, get_persona


 def log_turn(
@@ -21,11 +20,15 @@ def log_turn(
    meta_parts = [p for p in [backend_label, host] if p]
    meta = f" · {' / '.join(meta_parts)}" if meta_parts else ""

+    # Use the actual user/persona names from the current request context
+    user_label = get_user().title()
+    persona_label = get_persona().title()
+
    with open(log_file, "a") as f:
        if is_new:
            f.write(f"# Session Log — {today}\n")
        f.write(
            f"\n### [{timestamp}] `{session_id}`{meta}\n"
-            f"**{settings.user_name}:** {user_msg}\n\n"
-            f"**{settings.agent_name}:** {assistant_msg}\n"
+            f"**{user_label}:** {user_msg}\n\n"
+            f"**{persona_label}:** {assistant_msg}\n"
        )
--- a/cortex/session_store.py
+++ b/cortex/session_store.py
@@ -73,6 +73,17 @@ def save(session_id: str, messages: list[dict]) -> None:
    path.write_text(json.dumps(data, indent=2))


+def get_name(session_id: str) -> str:
+    """Return the friendly name for a session, or '' if none set."""
+    path = _path(session_id)
+    if not path.exists():
+        return ""
+    try:
+        return json.loads(path.read_text()).get("name", "")
+    except Exception:
+        return ""
+
+
 def rename(session_id: str, name: str) -> bool:
    """Set (or clear) the friendly name on a session. Returns False if not found."""
    path = _path(session_id)
@@ -101,16 +112,17 @@ def list_all() -> list[dict]:
    if not d.exists():
        return []
    results = []
-    for f in sorted(d.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True):
+    for f in d.glob("*.json"):
        try:
            data = json.loads(f.read_text())
-            entry = {
+            results.append({
                "session_id": data["session_id"],
                "name": data.get("name", ""),
                "updated": data.get("updated"),
                "message_count": len(data.get("messages", [])),
-            }
-            results.append(entry)
+                "_sort_key": data.get("updated") or f.stat().st_mtime,
+            })
        except Exception:
            pass
+    results.sort(key=lambda s: s.pop("_sort_key"), reverse=True)
    return results
--- a/cortex/static/HELP.md
+++ b/cortex/static/HELP.md
@@ -6,7 +6,24 @@
     and are appended automatically by help.html when present.
 -->

-*Last updated: 2026-03-27*
+*Last updated: 2026-05-13*
+
+---
+
+## Getting Started
+
+If this is your first time using Cortex, you need one thing before the chat will work: an AI model connected to your account.
+
+**Fastest path — OpenRouter:**
+OpenRouter gives you access to Claude, Gemini, and dozens of other models with a single API key.
+
+1. Get a free API key at [openrouter.ai/keys](https://openrouter.ai/keys)
+2. Go to **☰ → Account → [Set up OpenRouter →]** (shown automatically if no model is configured)
+3. Paste your key, pick a starting model, click **Connect**
+
+That's it — you're ready to chat.
+
+**Already past setup but seeing errors?** Go to **☰ → Account → Model Registry → Manage models** and confirm a model is assigned to the **Chat** role (Primary slot). If all slots are empty, add a model first.

 ---

@@ -15,21 +32,21 @@
 | Button | What it does |
 |---|---|
 | **Sessions** | Open the sessions panel — list, resume, or start sessions |
-| **Files** | Open the identity file editor (SOUL, MEMORY, etc.) |
-| **⚙ N** | Open the Settings panel (N = current context tier) |
+| **N** (sliders icon) | Open the Context & Memory panel (N = current context tier) |
+| **☰** | Settings menu — Files, push notification toggle, Account, Sign Out |
 | **?** | Open this help panel |

-The **⚙ Settings** panel contains all configuration options:
+The **Context & Memory** panel (sliders icon with tier number) contains all configuration options:

 | Section | Controls |
 |---|---|
 | **Context Tier** | T1 – T4 context depth |
 | **Memory Layers** | Toggle Long / Mid / Short memory on/off |
-| **Distill Memory** | Manually trigger short / mid / long / all distillation |
-| **Backend** | Active LLM backend — click to toggle claude ↔ gemini |
-| **Display** | Aa/A+/A− font size cycle · ☾/☀ theme toggle |
+| **Distill Memory** | Manually trigger Short / Mid / Long / All distillation |
+| **Model** | Active chat model — click to cycle through your configured slot models (Primary → Backup 1 → …) |
+| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L** cycles input area height · **⌃↵** toggles send shortcut |

-All header settings (theme, font size, tier, memory layers) persist in `localStorage` across page refreshes.
+All settings persist in `localStorage` across page refreshes.

 ---

@@ -38,25 +55,68 @@ All header settings (theme, font size, tier, memory layers) persist in `localSto
 - **Send:** `Ctrl+Enter` by default. Click `⌃↵` in the input controls to toggle to plain `Enter` mode.
 - **Stop:** Click **Stop** to cancel an in-progress response at any time.
 - **Edit a message:** Hover over any message → click **edit**. `Ctrl+Enter` saves, `Esc` cancels.
- **Delete a message:** Hover over any message → click **del**. Removes from session history.
- **Copy a response:** Hover over any assistant message → click **copy**.
+- **Delete a message:** Hover over any message → click **del**, then **confirm delete**.
+- **Copy:** Hover over any message → click **copy**.
 - **New line while typing:** `Shift+Enter` (in `Ctrl+Enter` mode) or `Shift+Enter` / Enter (in Enter mode).

+Each assistant response shows a small **model tag** below the message identifying which model and host responded.
+
 ---

-## Agent Mode
+## Tools (⚡)

-Click the **Agent** button in the input row to enable Agent mode. The button highlights and Send changes to **Run**.
+Click the **⚡** button in the input row to enable the Tools toggle. When lit (amber), **Send** changes to **Run** and messages are routed through the **orchestrator** instead of directly to the chat model.

-In Agent mode, messages are routed through the **orchestrator** instead of directly to Claude:
+The orchestrator runs a multi-step tool loop:

-1. **Gemini** runs a tool loop — searches the web, reads files, checks tasks, calls APIs as needed
-2. **Claude** receives the enriched context and writes the final response
-3. A `⚡ N tool calls: …` note appears below the response listing what was used
+1. The **orchestrator model** reasons about the request and calls tools as needed
+2. Tool results are fed back into the conversation; the loop continues until the model has what it needs
+3. The model produces the final user-facing reply — when the orchestrator role uses Gemini, Claude writes the final response; when it uses a local model, that same model writes it
+4. Expandable tool-call cards appear above the response — click any card to see the arguments sent and the result returned

-Agent mode is best for tasks that require research, multi-step reasoning, or tool use (e.g. "search for X", "add a task", "what's on my list?"). Regular chat is faster for conversational turns.
+The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**.

-Agent mode sessions persist to history exactly like regular chat — they survive page refreshes and appear in the Sessions panel.
+Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.
+
+Orchestrated sessions persist to history exactly like regular chat.
+
+### Available Tools
+
+69 tools across 17 categories. Tool schemas are narrowed per-message using keyword routing — only categories relevant to your request are sent, keeping token overhead low. Per-role tool sets provide additional filtering.
+
+| Category | Tools |
+|---|---|
+| **Web** | `web_search`, `http_fetch`, `web_read`, `http_post` |
+| **Project Files** | `project_file_read`, `project_file_list`, `file_stat`, `file_grep`, `file_diff`, `file_syntax_check` |
+| **Files** (admin) | `file_read`, `file_list`, `file_write`, `session_read`, `session_search` |
+| **Git** | `git_status`, `git_log`, `git_diff` |
+| **Shell** | `shell_exec`, `claude_allow_dir` |
+| **System** | `cortex_restart`, `cortex_logs`, `cortex_status`, `cortex_update` |
+| **Tasks** | `task_list`, `task_create`, `task_update`, `task_complete` |
+| **Cron** | `cron_list`, `cron_add`, `cron_remove`, `cron_toggle` |
+| **Reminders** | `reminders_add`, `reminders_list`, `reminders_remove`, `reminders_clear` |
+| **Scratchpad** | `scratch_read`, `scratch_write`, `scratch_append`, `scratch_clear` |
+| **Notifications** | `web_push`, `email_send`, `nc_talk_send`, `nc_talk_history` |
+| **Aether Journals** | `ae_journal_list/search`, `ae_journal_entries_list`, `ae_journal_entry_read/create/update/disable/append/prepend` |
+| **Aether Tasks** | `ae_task_list` |
+| **Aether Database** (admin) | `ae_db_query`, `ae_db_describe`, `ae_db_show_view` |
+| **Agent Notes** | `agent_notes_read`, `agent_notes_write`, `agent_notes_append`, `agent_notes_clear` |
+| **Agents** | `spawn_agent`, `aider_run` |
+| **Home Assistant** | `ha_get_state`, `ha_get_states`, `ha_call_service` |
+
+Files, Shell, System, Aether Database, Agents, and some Notification/Web tools are **admin-only** and not visible to regular users.
+`http_post` requires a URL prefix allowlist in `home/{user}/http_allowlist.json`.
+`nc_talk_history` requires `nc_username` and `nc_app_password` in `channels.json` under `nextcloud`.
+`ae_db_*` tools require Aether DB credentials configured in **Integrations** settings. All queries are SELECT-only — no writes possible.
+`aider_run` requires Aider installed (`pip install aider-chat`) and a model configured via `AIDER_MODEL` env var or the project's `.aider.conf.yml`. Supports any OpenAI-compatible backend — DeepSeek, OpenRouter, Ollama, etc.
+
+### Per-Role Tool Sets
+
+Each role can be configured with a specific subset of tool categories. When a role has a tool subset configured, only those tools are sent to the orchestrator — the rest are invisible to the model for that session.
+
+**Example:** a Coder role might only need Web, Files, Shell, and Agent Notes. A Research role might only need Web. Configuring this avoids sending schemas for 30+ irrelevant tools on every call.
+
+Configure per-role tool sets in **Account → Model Registry → Role Assignments** — expand a role card to see the category checkboxes. The default (no checkboxes selected) sends all tools the user has access to.

 ---

@@ -82,59 +142,215 @@ Notes are injected into a session without triggering an LLM response.

 ---

-## Backends
+## Install as App (PWA)

- **Claude CLI** and **Gemini CLI** are both available. One is primary, the other is fallback.
- Click **⚙** → **Backend** to toggle between `claude` and `gemini` as the primary.
- If the primary fails or times out, the fallback is used automatically. A **⚡** notice appears in the chat when this happens.
- Timeouts: Claude 60s, Gemini 120s.
+Cortex supports installation as a Progressive Web App — it runs in its own window with no browser chrome.
+
+- **Chrome / Edge (desktop):** Look for the install icon in the address bar, or open the browser menu → **Install Cortex…**
+- **Android (Chrome):** Tap ⋮ → **Add to Home Screen**
+- **iOS (Safari):** Tap the Share button → **Add to Home Screen**
+
+Once installed, opening Cortex from the home screen or app launcher skips the browser UI entirely.
+
+---
+
+## Switching Models
+
+The **Model** button in the Context & Memory panel cycles through the slot models configured for your active role (Primary → Backup 1). Click it to switch between models mid-session.
+
+- The button label shows the active model (e.g. "GPT-4o", "Gemini 2.5 Flash")
+- The selected slot is sent with each chat request so the correct model is used
+- If only one model is configured, the toggle does nothing
+- A system message appears in the chat when you switch models
+
+If the active model fails, the next configured backup slot is tried automatically.
+
+Each response shows a **model tag** (bottom-right of message) with the model label and host, so you always know what responded.
+
+---
+
+## Account Settings
+
+**Navigate to:** ☰ (top-right menu) → **Account**
+
+| Section | What you can do |
+|---|---|
+| **Account** | View your username, role badge (Admin / User), rename your username |
+| **Connected Accounts** | See which Google account is linked for OAuth sign-in |
+| **Email Allowlist** | Regex patterns controlling which addresses the `email_send` tool can reach |
+| **Notifications** | Dedicated page — set channel (Browser Push, NC Talk, Google Chat, email) for proactive messages; configure Home Assistant inbound webhook; test buttons for instant verification |
+| **Schedules** | View, add, edit, pause, and delete scheduled jobs directly — without going through the AI |
+| **Tool Permissions** | Allow or block specific orchestrator tools for your account |
+| **Usage** | Token consumption by model — see below |
+| **Browser Cache** | Clear UI preferences stored locally (theme, font size, session ID, etc.) |
+| **Model Registry** | Configure AI providers, local hosts, and role assignments |
+| **Change Password** | Update your login password |
+| **Personas** | List and rename your personas |
+
+---
+
+## Usage
+
+Token consumption is tracked automatically for API-backed models. **Navigate to:** ☰ → **Account** → **Usage** section.
+
+The table shows all-time totals per model key, with columns for:
+
+| Column | Meaning |
+|---|---|
+| **Model** | `backend/model-name` key (e.g. `gemini_api/gemini-2.5-flash`, `local/deepseek-v4`) |
+| **Calls** | Number of API calls made |
+| **Prompt** | Input tokens sent |
+| **Output** | Completion tokens received |
+| **Total** | Prompt + Output |
+
+Values ≥ 1,000 are displayed as `k` (e.g. `24.3k`).
+
+**What is and isn't tracked:**
+
+- ✅ Gemini API calls (orchestrator, distillation)
+- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
+- ✗ Claude CLI — no structured token data is returned by the subprocess
+- ✗ Gemini CLI — same reason
+
+The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.
+
+---
+
+## Model Registry
+
+Configure which AI models are available and which handles each task type.
+
+**New user quick path:** ☰ → **Account** → **Set up OpenRouter →** (the guided wizard adds a host, model, and role assignment in one step).
+
+**Full manual path:** ☰ → **Account** → scroll to **Model Registry** → **Manage models →**
+
+---
+
+### Step 1 — Set up providers and hosts
+
+Do this before adding models — models need a provider account or local host to attach to.
+
+**Anthropic (Claude):** Two options:
+- **CLI (OAuth):** Nothing to configure — uses your existing `claude auth login` session. If Claude isn't working, run `claude auth login` in a terminal.
+- **Direct API key:** Scroll to **Cloud Providers → Anthropic** → click **+ Add API key**. Enter a label and your `sk-ant-…` key from [console.anthropic.com/keys](https://console.anthropic.com/keys). When you add a model using an API key credential, it routes through the Anthropic SDK instead of the CLI.
+
+**Google (Gemini):** Add one entry per API key you want to use:
+1. Scroll to **Cloud Providers → Google** → click **+ Add Google account**
+2. Enter a label (e.g. "Work", "Personal") and your API key
+3. Get a free key at [aistudio.google.com/apikey](https://aistudio.google.com/apikey)
+
+**OpenRouter** (recommended for new users — one key for many models):
+1. Get a key at [openrouter.ai/keys](https://openrouter.ai/keys)
+2. Scroll to **Local Hosts** → **+ Add host**
+3. Label: "OpenRouter", URL: `https://openrouter.ai/api/v1`, paste your key, Type: OpenAI-compatible
+4. Click **Fetch models** to verify, then add models from the fetched list
+
+**Other local hosts** (Open WebUI, Ollama, LM Studio, etc.):
+1. Scroll to **Local Hosts** → click **+ Add host** to expand the form
+2. Enter a label, the API URL (e.g. `http://192.168.1.100:3000`), and optional API key
+3. Set **Type**: Open WebUI / Ollama, or OpenAI-compatible
+4. Click **Fetch models** on the saved host card to verify connectivity
+
+---
+
+### Step 2 — Add models
+
+Scroll to **Add Model**. Select the provider tab, fill in the details, click **Add Model**:
+
+| Tab | What you need |
+|---|---|
+| **Local** | Select a host (from Step 1) → enter model name, or use **Fetch from host** to pick from a live list |
+| **Google** | Select a Gemini model from the catalog → select a Google account (from Step 1) |
+| **Anthropic** | Select a credential (CLI OAuth or an API key added in Step 1) → select a Claude model from the catalog |
+
+The label and context window size auto-fill from the catalog — edit them if you want. Tags are optional.
+
+---
+
+### Step 3 — Assign models to roles
+
+Scroll to **Role Assignments** at the bottom of the page. Each role has **Primary** and **Backup 1** slots — Primary is tried first, then Backup 1. Changes save automatically.
+
+**Required roles** (always present, cannot be removed):
+
+| Role | Used for |
+|---|---|
+| **Chat** | Regular conversation |
+| **Orchestrator** | Agent mode tool loop |
+| **Distill** | Memory distillation (short / mid / long) |
+
+**Custom roles** — Click **+ Add custom role** to create your own. Each custom role gets its own model selection, tool set, and system prompt addition. Good examples:
+
+| Example | Purpose |
+|---|---|
+| **Coder** | Code-focused tasks — larger context window, code-aware model |
+| **Research** | Long-context research — high-token model, web tools prioritized |
+
+Switch roles via the **Role** selector in the Context & Memory panel (⚙). Leave all slots empty to use the server default.
+
+**Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).
+
+**Inject timestamp:** Each role card has an "Inject current date & time into system prompt" checkbox (default on). Disable it for pure processing roles (summarizer, classifier, translator) that don't need clock awareness.

 ---

 ## Nextcloud Talk Bot

-Inara is registered as a bot in Nextcloud Talk.
+The Cortex bot is registered in Nextcloud Talk.

- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to by Inara.
- The webhook returns `200 OK` immediately; the LLM call and reply happen asynchronously.
+- Messages sent in enabled Talk conversations are received by Cortex, processed, and replied to.
+- The webhook returns `200 OK` immediately; the reply happens asynchronously.
 - Real-time updates stream to the web UI via SSE — you see Talk messages and responses appear live.
- To enable the bot in a conversation: open Talk conversation settings → Bots → enable Inara.
+- To enable the bot in a conversation: open Talk conversation settings → Bots → enable the bot.

 ---

 ## Google Chat Bot

-Inara is available as a bot in Google Chat (One Sky IT Workspace).
+The Cortex bot is available in Google Chat (One Sky IT Workspace).

- Send Inara a direct message in Google Chat to start a conversation.
+- Send the bot a direct message in Google Chat to start a conversation.
 - Each DM thread is its own session (`gc_spaces/*` prefix) — history persists across messages.
- Responses are synchronous — Google Chat displays Inara's reply directly in the thread.
- To add Inara to a space: open the space, add a person/app, search for **Inara**.
+- Responses are synchronous — Google Chat displays the reply directly in the thread.
+- To add the bot to a space: open the space, click **Add people & apps**, and search for the Cortex bot.
 - Sessions from Google Chat appear as `gc_*` prefixed IDs in the Sessions panel.

-**Technical note:** Cortex uses Google's Workspace Add-on format (`hostAppDataAction`) — the modern API required for all Google Chat apps as of 2025.
-
 ---

 ## Files (Identity Editor)

-The **Files** button opens an editor for Inara's identity and memory files:
+The **Files** button opens an editor for your persona's identity and memory files:

 | File | Purpose |
 |---|---|
 | `SOUL.md` | Core personality, values, and voice |
 | `IDENTITY.md` | Role, capabilities, and context |
-| `USER.md` | Scott's profile, preferences, and history |
+| `USER.md` | Your profile, preferences, and history |
 | `PROTOCOLS.md` | Behavioural rules and communication protocols |
 | `CONTEXT_TIERS.md` | Defines what gets loaded at each context tier |
 | `MEMORY_LONG.md` | Permanent curated long-term memory |
 | `MEMORY_MID.md` | Rolling mid-term digest (LLM-distilled) |
 | `MEMORY_SHORT.md` | Recent session rollup (auto-aggregated) |
-| `TASKS.json` | Inara's personal task list (managed via Agent mode) |
-| `HELP.md` | This file |
+| `HELP.md` | This file — persona-specific additions appended below |
+| `email_allowlist.json` | Regex patterns for permitted `email_send` recipients (one per line) |

 Toggle **preview** / **edit** to switch between rendered markdown and raw text. **Ctrl+S** saves, **Esc** closes.

+The **Audit Log** group at the bottom of the sidebar (collapsed by default) lists tool call logs by date (`YYYY-MM-DD.jsonl`). Click any date to view a read-only table of every orchestrator tool call: time, tool name, status, model, args, and result snippet. Status is colour-coded: green = ok, red = error, amber = denied.
+
+---
+
+## Push Notifications
+
+Cortex can send browser push notifications — even when the tab is closed.
+
+- Open **☰ → Enable notifications** and accept the browser permission prompt.
+- Once enabled, the button shows **Notifications on** (in accent colour).
+- Click again to disable. Subscriptions are stored per-device.
+- The orchestrator's `web_push` tool lets your persona send you a push proactively (e.g. when a long task completes).
+
+**Notification channel settings:** ☰ → **Account** → **Notification settings →** — choose Browser Push, Email, Nextcloud Talk, or Google Chat as the channel your persona uses for scheduled reminders, cron job completions, and memory digests. Use the **Send Test Notification** button to verify your setup, or **Check Reminders Now** to trigger the reminder check immediately.
+
 ---

 ## Context & Memory ( ⚙ panel )
@@ -145,28 +361,28 @@ Controls how much context is prepended to each LLM call:

 | Tier | Loads | ~Tokens |
 |---|---|---|
-| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
-| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
-| **T3** | + last 2 raw session logs | ~15,000 |
-| **T4** | + last 7 raw session logs | ~50,000 |
+| **Min** | SOUL + IDENTITY + USER summary | ~1,500 |
+| **Std** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
+| **Ext** | + last 2 raw session logs | ~15,000 |
+| **Full** | + last 7 raw session logs | ~50,000 |

-Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
+Default is **Std**. Use **Min** for small/local models. Use **Ext** or **Full** for complex multi-session tasks.

 ### Memory Layers

-Three independently toggleable memory files, loaded **Long → Mid → Short** (short sits closest to the conversation turn for better LLM recall):
+Three independently toggleable memory files, loaded **Long → Mid → Short**:

 | Layer | File | Contents |
 |---|---|---|
-| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, Scott's profile highlights |
+| **Long** | `MEMORY_LONG.md` | Permanent facts — origin, key decisions, profile highlights |
 | **Mid** | `MEMORY_MID.md` | Rolling digest of recent weeks — LLM-distilled from Short |
-| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session log files |
+| **Short** | `MEMORY_SHORT.md` | Recent session rollup — auto-aggregated from session logs |

-Toggle any layer off to save tokens for a focused conversation where history isn't needed.
+Toggle any layer off to save tokens for a focused conversation.

-### Memory Distillation (manual)
+### Memory Distillation

-Distillation builds up the memory layers from raw session logs. Currently **manual** — trigger via the ⚙ panel:
+Distillation builds up the memory layers from raw session logs. Runs automatically on a schedule; trigger manually via the ⚙ panel:

 | Button | What it does |
 |---|---|
@@ -175,12 +391,54 @@ Distillation builds up the memory layers from raw session logs. Currently **manu
 | **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
 | **all** | Runs short → mid → long in sequence |

-**Recommended workflow:**
- Run **short** after any productive session to capture it.
- Run **mid** weekly to distil short → mid.
- Run **long** monthly to absorb mid into permanent memory.
+**Recommended workflow:** run **short** after any productive session; **mid** weekly; **long** monthly.

-Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BUDGET_MID`, `MEMORY_BUDGET_SHORT`).
+---
+
+## Scheduled Jobs
+
+Cortex can run recurring jobs on a schedule — reminders, daily briefings, automated research, and more. Manage them by asking your persona to set them up, or go directly to **☰ → Account → Schedules**.
+
+### Job Types
+
+| Type | What it does |
+|---|---|
+| `remind` | Appends to `REMINDERS.md` — automatically surfaced in chat context |
+| `note` | Appends to `SCRATCH.md` — read on demand via the scratchpad |
+| `message` | Sends the payload text directly to your notification channel |
+| `brief` | Calls the AI with your payload as the prompt, sends the response to your notification channel. Good for morning briefings, check-ins. |
+| `task` | Runs the full orchestrator tool loop with your payload as the request, sends Claude's response to your notification channel. Use this for agentic scheduled work: research, file updates, summaries that need tool access. |
+
+For `task` jobs: tools that require confirmation are skipped in scheduled context. Pre-approve them in **Settings → Tools** to allow them in scheduled tasks.
+
+### Schedule Formats
+
+| Format | When it runs |
+|---|---|
+| `hourly` | Every hour at :00 |
+| `daily` | Every day at 09:00 |
+| `daily:HH:MM` | Every day at the specified time |
+| `weekly:DOW` | Every specified day at 09:00 (e.g. `weekly:mon`) |
+| `weekly:DOW:HH:MM` | Every specified day at the specified time (e.g. `weekly:fri:17:00`) |
+| `monthly` | 1st of every month at 09:00 |
+| `monthly:DD` | Specific day of month at 09:00 (e.g. `monthly:15`) |
+| `monthly:DD:HH:MM` | Specific day of month at the specified time |
+| `yearly:MM:DD` | Every year on that date at 09:00 — for birthdays, anniversaries (e.g. `yearly:03:15`) |
+| `yearly:MM:DD:HH:MM` | Every year on that date at the specified time |
+
+DOW values: `mon tue wed thu fri sat sun`. All times are server-local.
+
+Schedules take effect immediately when added or edited — no restart needed. Paused jobs stay in the list and can be resumed at any time.
+
+### Home Assistant Integration
+
+HA automations can trigger your persona via webhook. Configure in **Notifications → Home Assistant → Inbound webhook**:
+
+- Set a **Webhook ID** (long random string — this is your secret URL component)
+- Your endpoint: `https://cortex.dgrzone.com/webhook/ha/{username}/{webhook_id}`
+- **Enable orchestrator tools** — when checked, HA events trigger the full tool loop; when unchecked, events get a direct LLM response (faster, no tools)
+
+HA payload fields recognized: `message`, `entity_id`, `state`, `trigger`, `event`, `area`.

 ---

@@ -192,9 +450,8 @@ Token budgets for each layer are set in `.env` (`MEMORY_BUDGET_LONG`, `MEMORY_BU
 | `Enter` | Send (when in Enter mode) |
 | `Shift+Enter` | New line in message input |
 | `Ctrl+Enter` | Save inline message edit |
-| `Esc` | Cancel inline edit |
+| `Esc` | Cancel inline edit / close any open modal |
 | `Ctrl+S` | Save file (Files modal) |
-| `Esc` | Close any open modal |

 ---

@@ -219,10 +476,26 @@ For direct access or scripting:
 | `POST` | `/distill/mid` | Summarize short → MEMORY_MID (LLM) |
 | `POST` | `/distill/long` | Integrate mid → MEMORY_LONG (LLM) |
 | `POST` | `/distill/all` | Run all three distillation steps |
-| `GET` | `/distill/status` | Show scheduler status and next run times |
+| `GET` | `/distill/status` | Scheduler status and next run times |
 | `POST` | `/orchestrate` | Submit an agent task — returns `{"job_id": "..."}` |
 | `GET` | `/orchestrate/{job_id}` | Poll job status and result |
-| `GET` | `/orchestrate` | List all jobs from current session (in-memory) |
+| `GET` | `/settings/models` | Model registry UI |
+| `POST` | `/api/models/role` | Set a role assignment (JSON body) |
+| `POST` | `/api/models/role-config` | Set per-role tool list and system prompt append |
+| `GET` | `/api/push/vapid-key` | VAPID public key (for push subscription) |
+| `POST` | `/api/push/subscribe` | Register a push subscription |
+| `DELETE` | `/api/push/subscribe` | Remove a push subscription |
+| `POST` | `/api/push/test` | Send a test notification via configured channel |
+| `POST` | `/api/push/reminders/check` | Run reminder check immediately; returns `{"reminders_found": n}` |
+| `GET` | `/api/audit/files` | List available audit log dates (own data) |
+| `GET` | `/api/audit/day?date=` | Tool call entries for a specific date (own data) |
+| `GET` | `/api/audit/recent` | Recent tool calls across days (admin) |
+| `GET` | `/api/audit/stats` | Tool call counts by tool/status/user (admin) |
+| `GET` | `/api/usage` | Full daily token usage log (own data) |
+| `GET` | `/api/usage/summary` | Per-model token totals, all time (own data) |
+| `GET` | `/api/usage/all` | Per-model totals for all users (admin) |
+| `GET` | `/setup/model` | Guided OpenRouter setup form (Step 3 / standalone) |
+| `POST` | `/setup/model` | Save OpenRouter host + model + assign to chat role |
 | `GET` | `/health` | Health check — returns `{"status": "ok"}` |

 Chat request body (`POST /chat`):
@@ -230,33 +503,16 @@ Chat request body (`POST /chat`):
 {
  "message": "string",
  "session_id": "string | null",
-  "tier": 1,
-  "model": "claude | gemini | null",
+  "tier": 2,
+  "chat_role": "chat",
+  "slot": "primary | backup_1 | backup_2 | null",
  "include_long": true,
  "include_mid": true,
-  "include_short": true
+  "include_short": true,
+  "off_record": false
 }
 ```

 ---

-## In Progress / Planned
-
- **Ollama local model backend** — direct Ollama API support (no CLI wrapper); target host: scott_gaming via WireGuard
- **Nextcloud Talk stabilization** — test end-to-end after restarts; complete bot registration docs
- **Multi-user support** — per-user identity/memory files; currently single-user (Scott); Holly instance planned
-
-### Recently Completed
-
- ✓ **Google Chat bot** — Workspace Add-on integration; DM and spaces; JWT verification; session persistence
- ✓ **Agent mode** — Gemini tool loop + Claude responder, accessible via UI toggle
- ✓ **Personal task management** — `task_list`, `task_create`, `task_update`, `task_complete` tools backed by `TASKS.json`
- ✓ **Web search fixed** — DDG package updated (`ddgs`); `WebSearch`/`WebFetch` allowed for Claude CLI fallback
- ✓ **Session persistence for orchestrator** — agent mode turns now survive page refresh
- ✓ **Systemd user service** — Cortex runs as a user service; no sudo required (`systemctl --user restart cortex`)
- ✓ **OAuth token warning banner** — amber banner when Claude CLI token is within 24h of expiry
-
---
-
-*Cortex is Scott's personal AI orchestration system. Inara is its primary resident agent.*
-*Built on FastAPI + Claude CLI + Gemini CLI. Named after Firefly.*
+*Cortex is a self-hosted personal AI platform. Named after the 'verse-wide communications network in Firefly.*
--- a/cortex/static/TOOLS.md
+++ b/cortex/static/TOOLS.md
@@ -0,0 +1,123 @@
+# Tool Reference
+
+> This reference covers all 45 orchestrator tools available when the ⚡ toggle is on.
+> Tools are invoked automatically by the orchestrator — you don't call them directly.
+
+¹ **Admin only** — requires the `admin` role. Invisible to regular users.  
+² **Confirmation required** — the orchestrator pauses and shows **Confirm / Deny** buttons before executing.
+
+---
+
+## Web
+
+| Tool | What it does |
+|---|---|
+| `web_search` | DuckDuckGo search — returns titles, URLs, and snippets for the top results |
+| `http_fetch` | Fetch a specific URL and return the response body (8 192 char cap) |
+
+## Files ¹
+
+| Tool | What it does |
+|---|---|
+| `file_read` ¹ | Read any file under the persona home directory |
+| `file_list` ¹ | List files and directories with sizes (200 entry cap) |
+| `file_write` ¹ ² | Write or append to a file under the persona home directory |
+
+## Shell ¹
+
+| Tool | What it does |
+|---|---|
+| `shell_exec` ¹ ² | Run any shell command on the Cortex host; timeout 1–120 s |
+| `claude_allow_dir` ¹ | Add a directory to Claude Code's auto-allowed paths |
+
+## System ¹
+
+| Tool | What it does |
+|---|---|
+| `cortex_restart` ¹ ² | Restart the Cortex service (5 s delay); connection drops — refresh the page |
+| `cortex_logs` ¹ | Recent lines from the systemd journal (default 50, max 200) |
+| `cortex_status` ¹ | Current git branch, commit, ahead/behind remote, and service state |
+| `cortex_update` ¹ ² | `git pull` + syntax check all `.py` files; reports what changed. Does **not** restart automatically — call `cortex_restart` after reviewing |
+
+## Tasks
+
+| Tool | What it does |
+|---|---|
+| `task_list` | List personal tasks; pass `include_done=true` to include completed |
+| `task_create` | Create a task with title, optional notes and due date |
+| `task_update` | Update any fields on an existing task |
+| `task_complete` | Mark a task as complete |
+
+## Cron
+
+| Tool | What it does |
+|---|---|
+| `cron_list` | List all scheduled jobs for this persona |
+| `cron_add` | Add a scheduled job — accepts cron syntax or plain-English interval |
+| `cron_remove` ² | Remove a scheduled job by ID |
+| `cron_toggle` | Enable or disable a job without removing it |
+
+## Reminders
+
+| Tool | What it does |
+|---|---|
+| `reminders_add` | Add a reminder with optional label; surfaced in context at Tier 2+ |
+| `reminders_list` | List all pending reminders, numbered for easy removal |
+| `reminders_remove` | Remove a single reminder by number (call `reminders_list` first) |
+| `reminders_clear` ² | Clear all reminders at once |
+
+## Scratchpad
+
+| Tool | What it does |
+|---|---|
+| `scratch_read` | Read the current scratchpad |
+| `scratch_write` | Overwrite the scratchpad with new content |
+| `scratch_append` | Append a timestamped section to the scratchpad |
+| `scratch_clear` | Erase the scratchpad |
+
+## Notifications ¹
+
+| Tool | What it does |
+|---|---|
+| `web_push` | Send a browser push notification to the active user's registered devices |
+| `email_send` ¹ | Send an email via SMTP; recipient must match your `email_allowlist.json` |
+| `nc_talk_send` ¹ | Send a message to a Nextcloud Talk conversation |
+
+## Aether Journals
+
+| Tool | What it does |
+|---|---|
+| `ae_journal_list` | List all journals for the configured AE account (returns names + IDs) |
+| `ae_journal_search` | Search entries by keyword, tag, date range, type, status, or priority |
+| `ae_journal_entries_list` | Browse all entries in a specific journal, newest first; paginated |
+| `ae_journal_entry_read` | Read the full content of a single entry by ID |
+| `ae_journal_entry_create` | Create a new entry with title, content, tags, and summary |
+| `ae_journal_entry_update` | Patch any fields on an existing entry (title, content, tags, summary, enable) |
+| `ae_journal_entry_disable` | Soft-delete an entry (`enable=false`) without permanently removing it |
+| `ae_journal_entry_append` | Append a timestamped section to the bottom of an entry's content |
+| `ae_journal_entry_prepend` | Prepend a timestamped section to the top of an entry's content |
+
+## Aether Tasks ¹
+
+| Tool | What it does |
+|---|---|
+| `ae_task_list` ¹ | List tasks from the agents_sync Kanban board |
+
+## Agent Notes
+
+Private, durable notes visible only to the orchestrator — not surfaced to users. Persist across sessions. Only available in orchestrated (tool-enabled) sessions.
+
+| Tool | What it does |
+|---|---|
+| `agent_notes_read` | Read the current private notes file |
+| `agent_notes_write` | Overwrite the notes file completely |
+| `agent_notes_append` | Append a timestamped entry (keeps last 3 backups automatically) |
+| `agent_notes_clear` | Erase all notes (backs up first) |
+
+## Agents ¹
+
+Spawn sub-agents that run their own tool loop using a specific role's model and tools.
+
+| Tool | What it does |
+|---|---|
+| `spawn_agent` ¹ | Spawn a sub-agent synchronously — blocks until the task completes or times out. Params: `task`, `role` (default `chat`), `tier` (1–4, default 1), `timeout` seconds, `max_rounds` override. Only works with `local_openai` and `gemini_api` models. |
--- a/cortex/static/app.js
+++ b/cortex/static/app.js
--- a/cortex/static/crons.html
+++ b/cortex/static/crons.html
@@ -0,0 +1,172 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+  <title>Cortex — Schedules</title>
+  <link rel="preconnect" href="https://fonts.googleapis.com">
+  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
+  <script src="https://cdn.tailwindcss.com"></script>
+  <script>
+  tailwind.config = {
+    corePlugins: { preflight: false },
+    darkMode: ['selector', '[data-theme="dark"]'],
+    theme: {
+      extend: {
+        colors: {
+          pg: {
+            bg:      'var(--pg-bg)',
+            surface: 'var(--pg-surface)',
+            border:  'var(--pg-border)',
+            text:    'var(--pg-text)',
+            muted:   'var(--pg-muted)',
+            dim:     'var(--pg-dim)',
+            dimmer:  'var(--pg-dimmer)',
+            bright:  'var(--pg-bright)',
+            accent:  'var(--pg-accent)',
+            action:  'var(--pg-action)',
+          }
+        },
+        fontFamily: { sans: ['Inter', 'system-ui', 'sans-serif'] }
+      }
+    }
+  }
+  </script>
+  <link rel="stylesheet" href="/static/pg.css">
+  <script>(function(){var t=localStorage.getItem('theme')||(window.matchMedia('(prefers-color-scheme: dark)').matches?'dark':'light');document.documentElement.setAttribute('data-theme',t);})();</script>
+  <style>
+    /* ── Server-generated table + badges ── */
+    .cron-table {
+      width: 100%; border-collapse: collapse; font-size: 0.82rem;
+      margin-bottom: 1.5rem;
+    }
+    .cron-table th {
+      text-align: left; padding: 0.4rem 0.6rem;
+      border-bottom: 2px solid var(--pg-border);
+      color: var(--pg-muted); font-weight: 600; font-size: 0.75rem;
+      text-transform: uppercase; letter-spacing: 0.04em;
+    }
+    .cron-table td {
+      padding: 0.5rem 0.6rem; border-bottom: 1px solid var(--pg-border);
+      vertical-align: middle;
+    }
+    .cron-table tr:last-child td { border-bottom: none; }
+    .cron-table tr:hover td { background: var(--pg-hover); }
+
+    .badge {
+      display: inline-block; padding: 0.15rem 0.45rem;
+      border-radius: 4px; font-size: 0.72rem; font-weight: 600;
+      text-transform: uppercase; letter-spacing: 0.03em;
+    }
+    .badge-enabled  { background: color-mix(in srgb, var(--pg-accent) 18%, transparent); color: var(--pg-accent); }
+    .badge-paused   { background: color-mix(in srgb, var(--pg-muted) 18%, transparent);  color: var(--pg-muted); }
+    .badge-remind   { background: color-mix(in srgb, #a78bfa 15%, transparent); color: #a78bfa; }
+    .badge-note     { background: color-mix(in srgb, #60a5fa 15%, transparent); color: #60a5fa; }
+    .badge-message  { background: color-mix(in srgb, #34d399 15%, transparent); color: #34d399; }
+    .badge-brief    { background: color-mix(in srgb, #fb923c 15%, transparent); color: #fb923c; }
+    .badge-task     { background: color-mix(in srgb, #f472b6 15%, transparent); color: #f472b6; }
+
+    .cron-actions { display: flex; gap: 0.35rem; }
+    .btn-cron {
+      padding: 0.2rem 0.55rem; border-radius: 4px; border: 1px solid var(--pg-border);
+      background: transparent; color: var(--pg-muted); font-size: 0.75rem; cursor: pointer;
+      font-family: inherit;
+    }
+    .btn-cron:hover { border-color: var(--pg-accent); color: var(--pg-accent); }
+    .btn-cron-del { color: var(--pg-dimmer); }
+    .btn-cron-del:hover { border-color: #ef4444; color: #ef4444; }
+
+    .payload-cell {
+      max-width: 240px; overflow: hidden; text-overflow: ellipsis;
+      white-space: nowrap; color: var(--pg-dimmer);
+    }
+
+    .persona-group-label {
+      font-size: 0.72rem; font-weight: 700; text-transform: uppercase;
+      letter-spacing: 0.06em; color: var(--pg-dimmer); margin: 1.25rem 0 0.5rem;
+    }
+
+    .empty-state {
+      text-align: center; padding: 2rem 1rem;
+      color: var(--pg-dimmer); font-size: 0.85rem;
+      border: 1px dashed var(--pg-border); border-radius: 8px;
+      margin-bottom: 1.5rem;
+    }
+  </style>
+</head>
+<body>
+  <nav class="page-nav">
+    <a href="{{ back_href }}" class="nav-link">← Chat</a>
+    <a href="{{ help_href }}" class="nav-link">Help</a>
+    <a href="/settings" class="nav-link">Settings</a>
+    <a href="/settings/models" class="nav-link">Models</a>
+    <a href="/settings/notifications" class="nav-link">Notifications</a>
+    <a href="/settings/tools" class="nav-link">Tools</a>
+    <a href="/settings/crons" class="nav-link active">Schedules</a>
+    {{ integrations_nav }}
+    <span class="nav-spacer"></span>
+    <a href="/logout" class="nav-link nav-logout">Sign out</a>
+  </nav>
+  <div class="page-wrap">
+    <h1 class="page-title">Schedules</h1>
+    <p class="page-subtitle">Recurring jobs — reminders, notes, briefings, and agentic tasks.</p>
+
+    <!-- SUCCESS -->
+    <!-- ERROR -->
+
+    <!-- Edit form (shown only when editing) -->
+    {{ edit_html }}
+
+    <!-- Cron list -->
+    {{ cron_list_html }}
+
+    <!-- Add new schedule -->
+    <div class="section">
+      <h2>Add schedule</h2>
+      <form method="POST" action="/settings/crons/add">
+        <div class="grid grid-cols-2 gap-x-3">
+          <div class="field">
+            <label for="add_persona">Persona</label>
+            <select id="add_persona" name="persona">
+              {{ persona_options }}
+            </select>
+          </div>
+          <div class="field">
+            <label for="add_job_type">Type</label>
+            <select id="add_job_type" name="job_type">
+              <option value="remind">remind — append to REMINDERS.md</option>
+              <option value="note">note — append to SCRATCH.md</option>
+              <option value="message">message — send payload as-is</option>
+              <option value="brief">brief — LLM response, no tools</option>
+              <option value="task">task — full orchestrator tool loop</option>
+            </select>
+          </div>
+          <div class="field">
+            <label for="add_label">Label</label>
+            <input type="text" id="add_label" name="label"
+                   placeholder="Monday morning summary"
+                   required autocomplete="off">
+          </div>
+          <div class="field">
+            <label for="add_schedule">Schedule</label>
+            <input type="text" id="add_schedule" name="schedule"
+                   placeholder="weekly:mon:08:00"
+                   required autocomplete="off" spellcheck="false">
+            <p class="hint">
+              hourly · daily · daily:HH:MM · weekly:DOW · weekly:DOW:HH:MM ·
+              monthly · monthly:DD · monthly:DD:HH:MM · yearly:MM:DD · yearly:MM:DD:HH:MM
+            </p>
+          </div>
+          <div class="field col-span-2">
+            <label for="add_payload">Payload / prompt</label>
+            <textarea id="add_payload" name="payload" rows="3"
+                      placeholder="Check my open tasks and send a summary." required></textarea>
+          </div>
+        </div>
+        <button type="submit" class="btn-submit w-full md:w-96">Add schedule</button>
+      </form>
+    </div>
+  </div>
+</body>
+</html>
--- a/cortex/static/help.html
+++ b/cortex/static/help.html
@@ -8,157 +8,174 @@
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
  <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
+  <script src="https://cdn.tailwindcss.com"></script>
+  <script>
+  tailwind.config = {
+    corePlugins: { preflight: false },
+    darkMode: ['selector', '[data-theme="dark"]'],
+    theme: {
+      extend: {
+        colors: {
+          pg: {
+            bg:      'var(--pg-bg)',
+            surface: 'var(--pg-surface)',
+            border:  'var(--pg-border)',
+            text:    'var(--pg-text)',
+            muted:   'var(--pg-muted)',
+            dim:     'var(--pg-dim)',
+            dimmer:  'var(--pg-dimmer)',
+            bright:  'var(--pg-bright)',
+            accent:  'var(--pg-accent)',
+            action:  'var(--pg-action)',
+          }
+        },
+        fontFamily: { sans: ['Inter', 'system-ui', 'sans-serif'] }
+      }
+    }
+  }
+  </script>
+  <link rel="stylesheet" href="/static/pg.css">
+  <script>(function(){var t=localStorage.getItem('theme')||(window.matchMedia('(prefers-color-scheme: dark)').matches?'dark':'light');document.documentElement.setAttribute('data-theme',t);})();</script>
  <style>
-    *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
+    /* ── Tab panels (JS-toggled display) ── */
+    .tab-panel { display: none; }
+    .tab-panel.active { display: block; }

-    body {
-      min-height: 100vh;
-      background: #0f1117;
-      font-family: 'Inter', system-ui, -apple-system, sans-serif;
-      font-weight: 450;
-      -webkit-font-smoothing: antialiased;
-      -moz-osx-font-smoothing: grayscale;
-      color: #e2e8f0;
-      padding: 2rem 1.5rem;
-    }
+    /* ── Dynamically-rendered markdown content ── */
+    .help-body { line-height: 1.7; }

-    .page {
-      max-width: 720px;
-      margin: 0 auto;
-    }
-
-    .page-nav {
-      display: flex;
-      align-items: center;
-      gap: 0.25rem;
-      margin-bottom: 1.75rem;
-      flex-wrap: wrap;
-    }
-    .nav-link {
-      display: inline-flex;
-      align-items: center;
-      padding: 0.3rem 0.6rem;
-      border-radius: 6px;
-      font-size: 0.8rem;
-      font-weight: 500;
-      color: #64748b;
-      text-decoration: none;
-      transition: color 0.15s, background 0.15s;
-      white-space: nowrap;
-    }
-    .nav-link:hover { color: #cbd5e1; background: rgba(255,255,255,0.05); }
-    .nav-link.active { color: #a78bfa; }
-    .nav-spacer { flex: 1; min-width: 0.5rem; }
-    .nav-link.nav-logout { color: #475569; }
-    .nav-link.nav-logout:hover { color: #94a3b8; background: none; }
-
-    header {
-      margin-bottom: 2rem;
-      padding-bottom: 1rem;
-      border-bottom: 1px solid #2d3148;
-    }
-    header h1 { font-size: 1.5rem; font-weight: 700; color: #a78bfa; }
-    header p  { font-size: 0.85rem; color: #94a3b8; margin-top: 0.25rem; }
-
-    #help-body { line-height: 1.7; }
-
-    /* Collapsible sections */
    details {
      margin-bottom: 0.75rem;
-      background: #1a1d27;
-      border: 1px solid #2d3148;
-      border-radius: 8px;
-      overflow: hidden;
+      background: var(--pg-surface);
+      border: 1px solid var(--pg-border);
+      border-radius: 8px; overflow: hidden;
    }
    summary {
-      padding: 0.85rem 1rem;
-      font-weight: 600;
-      font-size: 0.95rem;
-      color: #cbd5e1;
-      cursor: pointer;
-      list-style: none;
-      display: flex;
-      align-items: center;
-      gap: 0.5rem;
+      padding: 0.85rem 1rem; font-weight: 600; font-size: 0.95rem;
+      color: var(--pg-bright); cursor: pointer; list-style: none;
+      display: flex; align-items: center; gap: 0.5rem;
    }
    summary::before {
-      content: '▶';
-      font-size: 0.65rem;
-      color: #94a3b8;
+      content: '▶'; font-size: 0.65rem; color: var(--pg-muted);
      transition: transform 0.15s;
    }
    details[open] summary::before { transform: rotate(90deg); }
    summary::-webkit-details-marker { display: none; }
+    details > *:not(summary) { padding: 0 1rem 1rem; }

-    details > *:not(summary) {
-      padding: 0 1rem 1rem;
+    .help-body p  { margin: 0.5rem 0; font-size: 0.9rem; color: var(--pg-bright); }
+    .help-body ul { margin: 0.5rem 0 0.5rem 1.25rem; }
+    .help-body li { font-size: 0.9rem; color: var(--pg-bright); margin-bottom: 0.25rem; }
+    .help-body strong { color: var(--pg-text); }
+    .help-body code {
+      background: var(--pg-bg); border: 1px solid var(--pg-border);
+      border-radius: 4px; padding: 0.1em 0.4em;
+      font-size: 0.85em; color: var(--pg-accent);
    }
-
-    #help-body p  { margin: 0.5rem 0; font-size: 0.9rem; color: #cbd5e1; }
-    #help-body ul { margin: 0.5rem 0 0.5rem 1.25rem; }
-    #help-body li { font-size: 0.9rem; color: #cbd5e1; margin-bottom: 0.25rem; }
-    #help-body strong { color: #e2e8f0; }
-    #help-body code {
-      background: #0f1117;
-      border: 1px solid #2d3148;
-      border-radius: 4px;
-      padding: 0.1em 0.4em;
-      font-size: 0.85em;
-      color: #a78bfa;
+    .help-body a { color: var(--pg-accent); }
+    .help-body h1 { font-size: 1.1rem; font-weight: 700; color: var(--pg-text); margin: 0.75rem 0 0.5rem; }
+    .help-body h3 {
+      font-size: 0.8rem; font-weight: 600; color: var(--pg-muted);
+      text-transform: uppercase; letter-spacing: 0.05em; margin: 0.75rem 0 0.25rem;
    }
-    #help-body a { color: #a78bfa; }
-
-    #help-body h3 {
-      font-size: 0.8rem;
-      font-weight: 600;
-      color: #94a3b8;
-      text-transform: uppercase;
-      letter-spacing: 0.05em;
-      margin: 0.75rem 0 0.25rem;
-    }
-
-    #loading { color: #94a3b8; font-size: 0.9rem; padding: 1rem 0; }
+    .help-body table { width: 100%; border-collapse: collapse; font-size: 0.88rem; margin: 0.5rem 0 0.75rem; }
+    .help-body th, .help-body td { padding: 0.45rem 0.7rem; text-align: left; border-bottom: 1px solid var(--pg-border); }
+    .help-body th { color: var(--pg-muted); font-weight: 600; font-size: 0.8rem; text-transform: uppercase; letter-spacing: 0.04em; }
+    .help-body td { color: var(--pg-bright); }
+    .help-body pre { background: var(--pg-bg); border: 1px solid var(--pg-border); border-radius: 6px; padding: 0.75rem 1rem; overflow-x: auto; margin: 0.5rem 0; }
+    .help-body pre code { background: none; border: none; padding: 0; font-size: 0.85em; color: var(--pg-muted); }
+    .help-body hr { border: none; border-top: 1px solid var(--pg-border); margin: 0.5rem 0; }
  </style>
 </head>
 <body>
-  <div class="page">
-    <nav class="page-nav" id="page-nav">
-      <a id="nav-chat" href="/" class="nav-link">← Chat</a>
-      <a href="/help" class="nav-link active">Help</a>
-      <a href="/settings" class="nav-link" id="nav-settings">Settings</a>
-      <span class="nav-spacer"></span>
-      <a href="/logout" class="nav-link nav-logout">Sign out</a>
-    </nav>
+  <nav class="page-nav" id="page-nav">
+    <a id="nav-chat" href="/" class="nav-link">← Chat</a>
+    <a href="/help" class="nav-link active">Help</a>
+    <a href="/settings" class="nav-link" id="nav-settings">Settings</a>
+    <a href="/settings/models" class="nav-link">Models</a>
+    <a href="/settings/notifications" class="nav-link">Notifications</a>
+    <a href="/settings/tools" class="nav-link">Tools</a>
+    <a href="/settings/crons" class="nav-link">Schedules</a>
+    {{ integrations_nav }}
+    <span class="nav-spacer"></span>
+    <a href="/logout" class="nav-link nav-logout">Sign out</a>
+  </nav>

-    <header>
-      <h1>Help &amp; Reference</h1>
-      <p id="persona-label"></p>
-    </header>
+  <div class="max-w-3xl mx-auto px-6 py-8 pb-16">
+    <div class="mb-6 pb-4 border-b border-pg-border">
+      <h1 class="text-xl font-bold text-pg-accent">Help &amp; Reference</h1>
+      <p id="persona-label" class="text-xs text-pg-muted mt-1"></p>
+    </div>

-    <div id="help-body"><p id="loading">Loading…</p></div>
+    <!-- Tab bar -->
+    <div class="tab-bar">
+      <button class="tab-btn active" data-tab="ui">UI Guide</button>
+      <button class="tab-btn" data-tab="tools">Tools</button>
+      <button class="tab-btn" data-tab="persona" id="tab-btn-persona">Persona</button>
+    </div>
+
+    <div id="tab-ui"      class="tab-panel active"><div class="help-body"><p class="text-pg-dimmer text-sm text-center py-8">Loading…</p></div></div>
+    <div id="tab-tools"   class="tab-panel">       <div class="help-body"><p class="text-pg-dimmer text-sm text-center py-8">Loading…</p></div></div>
+    <div id="tab-persona" class="tab-panel">       <div class="help-body"><p class="text-pg-dimmer text-sm text-center py-8">Loading…</p></div></div>
  </div>

+  <style>
+    .tab-bar {
+      display: flex; gap: 0.25rem; margin-bottom: 1.5rem;
+      border-bottom: 1px solid var(--pg-border);
+    }
+    .tab-btn {
+      padding: 0.45rem 1rem; font-size: 0.85rem; font-weight: 500;
+      color: var(--pg-dim); background: none; border: none;
+      border-bottom: 2px solid transparent; margin-bottom: -1px;
+      cursor: pointer; transition: color 0.15s, border-color 0.15s;
+      font-family: inherit;
+    }
+    .tab-btn:hover { color: var(--pg-bright); }
+    .tab-btn.active { color: var(--pg-accent); border-bottom-color: var(--pg-accent); }
+  </style>
+
  <script>
    const cfg     = window.HELP_CONFIG || {};
    const user    = cfg.user    || 'scott';
    const persona = cfg.persona || 'inara';
    const params  = `user=${encodeURIComponent(user)}&persona=${encodeURIComponent(persona)}`;

-    // Wire up nav links and persona label
    document.getElementById('nav-chat').href = cfg.backHref || '/';
    if (persona) {
      document.getElementById('persona-label').textContent =
        `${persona.charAt(0).toUpperCase() + persona.slice(1)} · ${user}`;
    }

-    const OPEN_SECTIONS = new Set(['Header Controls', 'Chat', 'Sessions', 'Notes']);
+    // Rename Persona tab to the actual persona name
+    const personaTabBtn = document.getElementById('tab-btn-persona');
+    personaTabBtn.textContent = persona.charAt(0).toUpperCase() + persona.slice(1);

-    function makeCollapsible(container) {
-      const h2s = Array.from(container.querySelectorAll('h2'));
-      for (const h2 of h2s) {
+    // ── Tab switching ────────────────────────────────────────────────
+    const tabBtns   = document.querySelectorAll('.tab-btn');
+    const tabPanels = document.querySelectorAll('.tab-panel');
+    const TAB_KEY   = `cx_help_tab_${user}_${persona}`;
+
+    function activateTab(name) {
+      tabBtns.forEach(b   => b.classList.toggle('active',   b.dataset.tab === name));
+      tabPanels.forEach(p => p.classList.toggle('active', p.id === `tab-${name}`));
+      try { localStorage.setItem(TAB_KEY, name); } catch (_) {}
+    }
+
+    tabBtns.forEach(btn => btn.addEventListener('click', () => activateTab(btn.dataset.tab)));
+
+    // Restore last active tab
+    try {
+      const saved = localStorage.getItem(TAB_KEY);
+      if (saved) activateTab(saved);
+    } catch (_) {}
+
+    // ── Collapsible h2 sections ──────────────────────────────────────
+    function makeCollapsible(container, openAll = false, openSet = null) {
+      container.querySelectorAll('h2').forEach(h2 => {
        const title   = h2.textContent.trim();
        const details = document.createElement('details');
-        if (OPEN_SECTIONS.has(title)) details.open = true;
+        if (openAll || (openSet && openSet.has(title))) details.open = true;

        const summary = document.createElement('summary');
        summary.textContent = title;
@@ -166,46 +183,57 @@

        const siblings = [];
        let node = h2.nextSibling;
-        while (node && node.nodeName !== 'H2') {
-          siblings.push(node);
-          node = node.nextSibling;
-        }
-        for (const sib of siblings) details.appendChild(sib);
+        while (node && node.nodeName !== 'H2') { siblings.push(node); node = node.nextSibling; }
+        siblings.forEach(s => details.appendChild(s));
        h2.parentNode.replaceChild(details, h2);
-      }
+      });
    }

-    async function loadHelp() {
+    // ── Render markdown into a panel ────────────────────────────────
+    function render(panelId, markdown, openAll, openSet) {
+      const panel = document.querySelector(`#${panelId} .help-body`);
+      panel.innerHTML = marked.parse(markdown);
+      panel.querySelectorAll('a').forEach(a => { a.target = '_blank'; a.rel = 'noopener noreferrer'; });
+      makeCollapsible(panel, openAll, openSet);
+    }
+
+    // ── Load all three tabs in parallel ─────────────────────────────
+    const UI_OPEN = new Set(['Getting Started', 'Chat', 'Sessions', 'Model Registry']);
+
+    async function loadAll() {
+      // UI Guide
+      fetch('/static/HELP.md')
+        .then(r => r.ok ? r.text() : Promise.reject(r.status))
+        .then(md => render('tab-ui', md, false, UI_OPEN))
+        .catch(e => { document.querySelector('#tab-ui .help-body').innerHTML = `<p class="text-pg-dimmer text-sm text-center py-8">Failed to load: ${e}</p>`; });
+
+      // Tools
+      fetch('/static/TOOLS.md')
+        .then(r => r.ok ? r.text() : Promise.reject(r.status))
+        .then(md => render('tab-tools', md, true, null))
+        .catch(e => { document.querySelector('#tab-tools .help-body').innerHTML = `<p class="text-pg-dimmer text-sm text-center py-8">Failed to load: ${e}</p>`; });
+
+      // Persona-specific HELP.md
+      const personaPanel = document.querySelector('#tab-persona .help-body');
      try {
-        // Always load the shared base from static
-        const baseRes = await fetch('/static/HELP.md');
-        if (!baseRes.ok) throw new Error(`HTTP ${baseRes.status}`);
-        let markdown = await baseRes.text();
-
-        // Try to load persona-specific additions and append them
-        try {
-          const personaRes = await fetch(`/files/HELP.md?${params}`);
-          if (personaRes.ok) {
-            const personaData = await personaRes.json();
-            const extra = (personaData.content || '').trim();
-            if (extra) {
-              markdown += '\n\n---\n\n## ' + persona.charAt(0).toUpperCase() + persona.slice(1) + ' Notes\n\n' + extra;
-            }
+        const res = await fetch(`/files/HELP.md?${params}`);
+        if (res.ok) {
+          const data = await res.json();
+          const content = (data.content || '').trim();
+          if (content) {
+            render('tab-persona', content, true, null);
+          } else {
+            personaPanel.innerHTML = `<p class="text-pg-dimmer text-sm text-center py-8">No ${persona}-specific notes yet. Edit <code>HELP.md</code> in the Files panel to add them.</p>`;
          }
-        } catch (_) { /* persona-specific file is optional */ }
-
-        const body = document.getElementById('help-body');
-        body.innerHTML = marked.parse(markdown);
-        body.querySelectorAll('a').forEach(a => {
-          a.target = '_blank'; a.rel = 'noopener noreferrer';
-        });
-        makeCollapsible(body);
-      } catch (err) {
-        document.getElementById('help-body').textContent = `Failed to load help: ${err.message}`;
+        } else {
+          personaPanel.innerHTML = `<p class="text-pg-dimmer text-sm text-center py-8">No ${persona}-specific notes yet.</p>`;
+        }
+      } catch (_) {
+        personaPanel.innerHTML = `<p class="text-pg-dimmer text-sm text-center py-8">No ${persona}-specific notes yet.</p>`;
      }
    }

-    loadHelp();
+    loadAll();
  </script>
 </body>
 </html>
--- a/cortex/static/icon-192.png
+++ b/cortex/static/icon-192.png
--- a/cortex/static/icon-512.png
+++ b/cortex/static/icon-512.png
--- a/cortex/static/icon.svg
+++ b/cortex/static/icon.svg
@@ -0,0 +1,4 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512">
+  <rect width="512" height="512" rx="96" fill="#1a1228"/>
+  <text x="256" y="390" font-size="340" text-anchor="middle" font-family="system-ui, -apple-system, sans-serif">✨</text>
+</svg>
--- a/cortex/static/index.html
+++ b/cortex/static/index.html
@@ -5,6 +5,13 @@
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Cortex — Inara</title>
    <link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>✨</text></svg>">
+    <link rel="manifest" href="/manifest.json">
+    <meta name="theme-color" content="#1a1228" id="meta-theme-color">
+    <meta name="mobile-web-app-capable" content="yes">
+    <meta name="apple-mobile-web-app-capable" content="yes">
+    <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
+    <meta name="apple-mobile-web-app-title" content="Cortex">
+    <link rel="apple-touch-icon" href="/static/icon-192.png">
    <link rel="preconnect" href="https://fonts.googleapis.com">
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
@@ -20,6 +27,9 @@
        })();
    </script>
    <link rel="stylesheet" href="/static/style.css">
+    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/codemirror@5.65.17/lib/codemirror.min.css">
+    <script src="https://cdn.jsdelivr.net/npm/codemirror@5.65.17/lib/codemirror.min.js"></script>
+    <script src="https://cdn.jsdelivr.net/npm/codemirror@5.65.17/mode/markdown/markdown.min.js"></script>
    <script src="/static/marked.min.js"></script>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/styles/atom-one-dark.min.css">
    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/highlight.min.js"></script>
@@ -31,6 +41,7 @@
        <div class="persona-switcher" id="persona-switcher">
            <div class="name" id="persona-name">Inara</div>
            <div class="subtitle">Cortex · Local</div>
+            <div id="session-id"></div>
            <div class="persona-dropdown" id="persona-dropdown"></div>
        </div>

@@ -53,6 +64,10 @@
                    <a href="/settings" class="hdr-dd-item">
                        <svg data-lucide="user" class="btn-icon"></svg> Account
                    </a>
+                    <button id="push-btn" class="hdr-dd-item" style="display:none">
+                        <svg data-lucide="bell" class="btn-icon"></svg>
+                        <span id="push-btn-label">Enable notifications</span>
+                    </button>
                    <div class="hdr-dd-divider"></div>
                    <form method="POST" action="/logout" style="margin:0">
                        <button type="submit" class="hdr-dd-item">
@@ -73,10 +88,10 @@
            <div class="ctx-section">
                <div class="ctx-section-title">Context Tier</div>
                <div class="ctx-row">
-                    <button class="ctx-btn" data-tier="1" id="tier-1" title="Minimal (~1.5k tokens)">T1</button>
-                    <button class="ctx-btn active" data-tier="2" id="tier-2" title="Standard (~5k tokens)">T2</button>
-                    <button class="ctx-btn" data-tier="3" id="tier-3" title="Extended (~15k tokens)">T3</button>
-                    <button class="ctx-btn" data-tier="4" id="tier-4" title="Full (~50k tokens)">T4</button>
+                    <button class="ctx-btn" data-tier="1" id="tier-1" title="Minimal — identity only (~1.5k tokens)">Min</button>
+                    <button class="ctx-btn active" data-tier="2" id="tier-2" title="Standard — memory + user profile (~5k tokens)">Std</button>
+                    <button class="ctx-btn" data-tier="3" id="tier-3" title="Extended — + last 2 sessions (~15k tokens)">Ext</button>
+                    <button class="ctx-btn" data-tier="4" id="tier-4" title="Full — + last 7 sessions (~50k tokens)">Full</button>
                </div>
            </div>
            <div class="ctx-section">
@@ -90,32 +105,29 @@
            <div class="ctx-section">
                <div class="ctx-section-title">Distill Memory</div>
                <div class="ctx-row">
-                    <button class="ctx-btn" id="distill-short-btn" title="Roll session logs → MEMORY_SHORT (no LLM)">short</button>
-                    <button class="ctx-btn" id="distill-mid-btn"   title="Summarize short → MEMORY_MID (LLM)">mid</button>
-                    <button class="ctx-btn" id="distill-long-btn"  title="Integrate mid → MEMORY_LONG (LLM)">long</button>
-                    <button class="ctx-btn" id="distill-all-btn"   title="Run all three steps in sequence">all</button>
+                    <button class="ctx-btn" id="distill-short-btn" title="Roll today's sessions → MEMORY_SHORT.md (fast, no LLM)">Short</button>
+                    <button class="ctx-btn" id="distill-mid-btn"   title="Summarize SHORT → MID memory (uses LLM)">Mid</button>
+                    <button class="ctx-btn" id="distill-long-btn"  title="Integrate MID → LONG memory (uses LLM)">Long</button>
+                    <button class="ctx-btn" id="distill-all-btn"   title="Run Short → Mid → Long in sequence">All</button>
+                    <button class="ctx-btn ctx-btn-danger" id="distill-rebuild-btn" title="⚠ Wipe Mid + Long memories and rebuild from session logs. Hand-edited content will be replaced.">Rebuild</button>
                </div>
                <div id="ctx-distill-status"></div>
                <div id="ctx-schedule"></div>
            </div>
            <div class="ctx-section">
-                <div class="ctx-section-title">Backend</div>
+                <div class="ctx-section-title">Role</div>
                <div class="ctx-row">
-                    <button id="backend-toggle" class="ctx-btn" title="Click to switch primary backend">claude</button>
+                    <button id="backend-toggle" class="ctx-btn" title="Active role — click to cycle">chat</button>
                </div>
                <div id="backend-model-hint"></div>
            </div>
            <div class="ctx-section">
                <div class="ctx-section-title">Display</div>
                <div class="ctx-row">
-                    <button id="font-size-btn" class="ctx-btn" title="Cycle font size: normal → large → small">Aa</button>
-                    <button id="theme-btn" class="ctx-btn" title="Toggle light/dark mode">☾</button>
-                    <select id="height-sel" class="ctx-btn" title="Max input height" style="cursor:pointer">
-                        <option value="120">5 lines</option>
-                        <option value="240">10 lines</option>
-                        <option value="480">20 lines</option>
-                    </select>
-                    <button id="enter-toggle" class="ctx-btn" title="Toggle send shortcut">⌃↵</button>
+                    <button id="font-size-btn" class="ctx-btn" title="Cycle font size: Normal → Large → Small">Aa</button>
+                    <button id="theme-btn" class="ctx-btn" title="Toggle light / dark theme">☾</button>
+                    <button id="height-cycle-btn" class="ctx-btn" title="Input size: Compact — click to cycle">S</button>
+                    <button id="enter-toggle" class="ctx-btn" title="Toggle send shortcut: Ctrl+Enter ↔ Enter">⌃↵</button>
                </div>
            </div>
        </div>
@@ -144,7 +156,7 @@
                    </div>
                </div>
                <div id="file-modal-body">
-                    <textarea id="file-editor" spellcheck="false"></textarea>
+                    <div id="file-editor-wrap"></div>
                    <div id="file-preview"></div>
                    <div id="session-search-results" style="display:none"></div>
                </div>
@@ -152,17 +164,7 @@
        </div>
    </div>

-    <!-- Auth warning banner — shown when Claude CLI token is near expiry -->
-    <div id="auth-banner">
-        <div id="auth-banner-text">
-            <span id="auth-banner-msg"></span>
-            <span id="auth-banner-hint"></span>
-        </div>
-        <button id="auth-banner-close" title="Dismiss">✕</button>
-    </div>
-
    <div id="messages"></div>
-    <div id="session-id"></div>

    <div id="input-area">
        <!-- Mode select — compact dropdown, opens upward, MRU sorted -->
@@ -176,6 +178,21 @@
            <div id="mode-dropdown"></div>
            <!-- Note visibility sub-toggle — only shown when note mode is active -->
            <button id="note-vis-btn" title="Toggle note visibility (private / public)">prv</button>
+            <!-- Tools toggle — routes through the orchestrator tool loop when active -->
+            <button id="tools-toggle" title="Tools disabled — click to enable">⚡</button>
+            <!-- Attach file — images (vision) or text/code files -->
+            <button id="attach-btn" title="Attach image or text file">📎</button>
+            <input type="file" id="file-input" style="display:none"
+                   accept="image/png,image/jpeg,image/webp,image/gif,text/plain,text/markdown,.md,.txt,.py,.js,.ts,.jsx,.tsx,.json,.yaml,.yml,.toml,.html,.css,.sh,.csv,.xml,.rs,.go,.java,.c,.cpp,.h,.rb,.php,.swift,.kt,.sql">
+        </div>
+        <!-- Attachment preview — shown when a file is pending -->
+        <div id="attachment-row" style="display:none">
+            <div id="attachment-preview">
+                <img id="attachment-thumb" alt="" style="display:none">
+                <span id="attachment-icon">📎</span>
+                <span id="attachment-name"></span>
+                <button id="attachment-clear" title="Remove attachment">✕</button>
+            </div>
        </div>
        <textarea id="input" rows="1" placeholder="Message…" autofocus></textarea>
        <div id="send-col">
--- a/cortex/static/integrations.html
+++ b/cortex/static/integrations.html
@@ -0,0 +1,134 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+  <title>Cortex — Integrations</title>
+  <link rel="preconnect" href="https://fonts.googleapis.com">
+  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
+  <script src="https://cdn.tailwindcss.com"></script>
+  <script>
+  tailwind.config = {
+    corePlugins: { preflight: false },
+    darkMode: ['selector', '[data-theme="dark"]'],
+    theme: {
+      extend: {
+        colors: {
+          pg: {
+            bg:      'var(--pg-bg)',
+            surface: 'var(--pg-surface)',
+            border:  'var(--pg-border)',
+            text:    'var(--pg-text)',
+            muted:   'var(--pg-muted)',
+            dim:     'var(--pg-dim)',
+            dimmer:  'var(--pg-dimmer)',
+            bright:  'var(--pg-bright)',
+            accent:  'var(--pg-accent)',
+            action:  'var(--pg-action)',
+          }
+        },
+        fontFamily: { sans: ['Inter', 'system-ui', 'sans-serif'] }
+      }
+    }
+  }
+  </script>
+  <link rel="stylesheet" href="/static/pg.css">
+  <script>(function(){var t=localStorage.getItem('theme')||(window.matchMedia('(prefers-color-scheme: dark)').matches?'dark':'light');document.documentElement.setAttribute('data-theme',t);})();</script>
+  <style>
+    details.channel-block summary::-webkit-details-marker { display: none; }
+    details.channel-block summary::before {
+      content: '▶'; font-size: 0.65rem; color: var(--pg-dimmer);
+      transition: transform 0.15s; flex-shrink: 0;
+    }
+    details.channel-block[open] summary::before { transform: rotate(90deg); }
+    details.channel-block[open] summary { border-bottom: 1px solid var(--pg-border); }
+  </style>
+</head>
+<body>
+  <nav class="page-nav">
+    <a href="{{ back_href }}" class="nav-link">← Chat</a>
+    <a href="{{ help_href }}" class="nav-link">Help</a>
+    <a href="/settings" class="nav-link">Settings</a>
+    <a href="/settings/models" class="nav-link">Models</a>
+    <a href="/settings/notifications" class="nav-link">Notifications</a>
+    <a href="/settings/tools" class="nav-link">Tools</a>
+    <a href="/settings/crons" class="nav-link">Schedules</a>
+    <a href="/settings/integrations" class="nav-link active">Integrations</a>
+    <span class="nav-spacer"></span>
+    <a href="/logout" class="nav-link nav-logout">Sign out</a>
+  </nav>
+  <div class="page-wrap">
+    <h1 class="page-title">Integrations</h1>
+    <p class="page-subtitle">External service connections — admin only.</p>
+
+    <!-- SUCCESS -->
+    <!-- ERROR -->
+
+    <form method="POST" action="/settings/integrations">
+
+      <div class="section">
+        <h2>Aether Platform Database</h2>
+        <p class="section-note">
+          Gives the orchestrator direct read-only access to the Aether MariaDB via the
+          <code>ae_db_query</code>, <code>ae_db_describe</code>, and <code>ae_db_show_view</code> tools.
+          Only SELECT, SHOW, DESCRIBE, and EXPLAIN are permitted — no writes possible.
+        </p>
+
+        <details class="channel-block border border-pg-border rounded-lg overflow-hidden mb-3"
+                 {{ ae_db_host and 'open' or '' }}>
+          <summary class="flex items-center gap-2 px-4 py-3 text-sm font-semibold text-pg-muted cursor-pointer select-none bg-pg-bg">
+            Connection
+          </summary>
+          <div class="px-4 pt-4 pb-2">
+            <p class="text-xs text-pg-dimmer mb-4 -mt-1 leading-relaxed">
+              Use the same credentials as
+              <code class="font-mono text-pg-accent bg-pg-bg border border-pg-border rounded px-1 text-xs">agents_sync/mcp/scripts/sql_inspector.py</code>.
+              Leave the password blank to keep the stored value.
+            </p>
+            <div class="grid grid-cols-[1fr_7rem] gap-3 items-start">
+              <div class="field">
+                <label for="ae_db_host">Host</label>
+                <input type="text" id="ae_db_host" name="ae_db_host"
+                       value="{{ ae_db_host }}"
+                       placeholder="192.168.64.5"
+                       autocomplete="off" spellcheck="false">
+              </div>
+              <div class="field">
+                <label for="ae_db_port">Port</label>
+                <input type="number" id="ae_db_port" name="ae_db_port"
+                       value="{{ ae_db_port }}"
+                       placeholder="3306" min="1" max="65535"
+                       autocomplete="off">
+              </div>
+            </div>
+            <div class="field">
+              <label for="ae_db_name">Database name</label>
+              <input type="text" id="ae_db_name" name="ae_db_name"
+                     value="{{ ae_db_name }}"
+                     placeholder="aether_dev"
+                     autocomplete="off" spellcheck="false">
+            </div>
+            <div class="field">
+              <label for="ae_db_user">Username</label>
+              <input type="text" id="ae_db_user" name="ae_db_user"
+                     value="{{ ae_db_user }}"
+                     placeholder="aether_dev"
+                     autocomplete="off" spellcheck="false">
+            </div>
+            <div class="field">
+              <label for="ae_db_password">Password</label>
+              <input type="password" id="ae_db_password" name="ae_db_password"
+                     value=""
+                     placeholder="Leave blank to keep existing value"
+                     autocomplete="new-password" spellcheck="false">
+            </div>
+          </div>
+        </details>
+      </div>
+
+      <button type="submit" class="btn-submit w-full md:w-96">Save integrations</button>
+    </form>
+  </div>
+</body>
+</html>
--- a/cortex/static/local_llm.html
+++ b/cortex/static/local_llm.html
--- a/cortex/static/manifest.json
+++ b/cortex/static/manifest.json
@@ -0,0 +1,30 @@
+{
+  "name": "Cortex · Inara",
+  "short_name": "Cortex",
+  "description": "Personal AI assistant",
+  "start_url": "/",
+  "scope": "/",
+  "display": "standalone",
+  "background_color": "#1a1228",
+  "theme_color": "#1a1228",
+  "icons": [
+    {
+      "src": "/static/icon-192.png",
+      "sizes": "192x192",
+      "type": "image/png",
+      "purpose": "any"
+    },
+    {
+      "src": "/static/icon-512.png",
+      "sizes": "512x512",
+      "type": "image/png",
+      "purpose": "any maskable"
+    },
+    {
+      "src": "/static/icon.svg",
+      "sizes": "any",
+      "type": "image/svg+xml",
+      "purpose": "any"
+    }
+  ]
+}
--- a/cortex/static/notifications.html
+++ b/cortex/static/notifications.html
@@ -0,0 +1,348 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+  <title>Cortex — Notifications</title>
+  <link rel="preconnect" href="https://fonts.googleapis.com">
+  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
+  <script src="https://cdn.tailwindcss.com"></script>
+  <script>
+  tailwind.config = {
+    corePlugins: { preflight: false },
+    darkMode: ['selector', '[data-theme="dark"]'],
+    theme: {
+      extend: {
+        colors: {
+          pg: {
+            bg:      'var(--pg-bg)',
+            surface: 'var(--pg-surface)',
+            border:  'var(--pg-border)',
+            text:    'var(--pg-text)',
+            muted:   'var(--pg-muted)',
+            dim:     'var(--pg-dim)',
+            dimmer:  'var(--pg-dimmer)',
+            bright:  'var(--pg-bright)',
+            accent:  'var(--pg-accent)',
+            action:  'var(--pg-action)',
+          }
+        },
+        fontFamily: { sans: ['Inter', 'system-ui', 'sans-serif'] }
+      }
+    }
+  }
+  </script>
+  <link rel="stylesheet" href="/static/pg.css">
+  <script>(function(){var t=localStorage.getItem('theme')||(window.matchMedia('(prefers-color-scheme: dark)').matches?'dark':'light');document.documentElement.setAttribute('data-theme',t);})();</script>
+  <style>
+    /* ── Channel collapsible arrow ── */
+    details.channel-block summary::-webkit-details-marker { display: none; }
+    details.channel-block summary::before {
+      content: '▶'; font-size: 0.65rem; color: var(--pg-dimmer);
+      transition: transform 0.15s; flex-shrink: 0;
+    }
+    details.channel-block[open] summary::before { transform: rotate(90deg); }
+    details.channel-block[open] summary { border-bottom: 1px solid var(--pg-border); }
+
+    /* ── Test result feedback (JS-toggled display) ── */
+    #test-result { display: none; }
+  </style>
+</head>
+<body>
+  <nav class="page-nav">
+    <a href="{{ back_href }}" class="nav-link">← Chat</a>
+    <a href="{{ help_href }}" class="nav-link">Help</a>
+    <a href="/settings" class="nav-link">Settings</a>
+    <a href="/settings/models" class="nav-link">Models</a>
+    <a href="/settings/notifications" class="nav-link active">Notifications</a>
+    <a href="/settings/tools" class="nav-link">Tools</a>
+    <a href="/settings/crons" class="nav-link">Schedules</a>
+    {{ integrations_nav }}
+    <span class="nav-spacer"></span>
+    <a href="/logout" class="nav-link nav-logout">Sign out</a>
+  </nav>
+  <div class="page-wrap">
+    <h1 class="page-title">Notifications</h1>
+    <p class="page-subtitle">How your persona reaches out proactively — reminders, cron jobs, and memory digests.</p>
+
+    <!-- SUCCESS -->
+    <!-- ERROR -->
+
+    <form method="POST" action="/settings/notifications">
+
+      <!-- Channel selector -->
+      <div class="section">
+        <h2>Channel</h2>
+        <div class="field">
+          <label for="notification_channel">Default outbound channel</label>
+          <select id="notification_channel" name="notification_channel"
+                  data-value="{{ notify_channel }}">
+            <option value="">None (disabled)</option>
+            <option value="web_push">Browser Push Notification</option>
+            <option value="email">Email</option>
+            <option value="nextcloud">Nextcloud Talk</option>
+            <option value="google_chat">Google Chat</option>
+          </select>
+          <p class="hint">Used for reminder alerts, distillation summaries, and cron job notifications.</p>
+        </div>
+        <div class="field">
+          <label for="notification_email">
+            Email address override
+            <span class="font-normal text-pg-dim">(optional)</span>
+          </label>
+          <input type="email" id="notification_email" name="notification_email"
+                 value="{{ notify_email_override }}"
+                 placeholder="Leave blank to use your login email"
+                 autocomplete="off">
+        </div>
+      </div>
+
+      <!-- Nextcloud Talk -->
+      <div class="section">
+        <h2>Nextcloud Talk</h2>
+        <p class="section-note">
+          Configure to send and receive messages via your Nextcloud Talk bot.
+          <strong>Sending</strong> requires the bot URL, secret, and notification room.
+          <strong>Reading history</strong> (<code>nc_talk_history</code> tool) additionally
+          requires a Nextcloud username and app password.
+        </p>
+
+        <details class="channel-block border border-pg-border rounded-lg overflow-hidden mb-3"
+                 {{ nc_url and 'open' or '' }}>
+          <summary class="flex items-center gap-2 px-4 py-3 text-sm font-semibold text-pg-muted cursor-pointer select-none bg-pg-bg">
+            Bot credentials (sending)
+          </summary>
+          <div class="px-4 pt-4 pb-2">
+            <p class="text-xs text-pg-dimmer -mt-1 mb-4 leading-relaxed">
+              Set these up in your Nextcloud Talk room → Bot settings.
+              See the <a href="/help" class="text-pg-accent">setup guide</a> for step-by-step instructions.
+            </p>
+            <div class="field">
+              <label for="nc_url">Nextcloud URL</label>
+              <input type="url" id="nc_url" name="nc_url"
+                     value="{{ nc_url }}"
+                     placeholder="https://cloud.example.com"
+                     autocomplete="off" spellcheck="false">
+            </div>
+            <div class="field">
+              <label for="nc_bot_secret">Bot secret</label>
+              <input type="password" id="nc_bot_secret" name="nc_bot_secret"
+                     value="{{ nc_bot_secret }}"
+                     placeholder="Leave blank to keep existing value"
+                     autocomplete="new-password" spellcheck="false">
+              <p class="hint">Generated when you registered the bot in Nextcloud Talk.</p>
+            </div>
+            <div class="field">
+              <label for="nc_notification_room">Notification room token</label>
+              <input type="text" id="nc_notification_room" name="nc_notification_room"
+                     value="{{ nc_notify_room }}"
+                     placeholder="Token from the Talk room URL"
+                     autocomplete="off" spellcheck="false">
+              <p class="hint">The token at the end of the Talk room URL — e.g. <code>abc123def</code>.</p>
+            </div>
+          </div>
+        </details>
+
+        <details class="channel-block border border-pg-border rounded-lg overflow-hidden mb-3"
+                 {{ nc_username and 'open' or '' }}>
+          <summary class="flex items-center gap-2 px-4 py-3 text-sm font-semibold text-pg-muted cursor-pointer select-none bg-pg-bg">
+            API credentials (reading history)
+          </summary>
+          <div class="px-4 pt-4 pb-2">
+            <p class="text-xs text-pg-dimmer -mt-1 mb-4 leading-relaxed">
+              Required for the <code>nc_talk_history</code> orchestrator tool.
+              Generate an app password in Nextcloud → Settings → Security → App passwords.
+            </p>
+            <div class="field">
+              <label for="nc_username">Nextcloud username</label>
+              <input type="text" id="nc_username" name="nc_username"
+                     value="{{ nc_username }}"
+                     placeholder="Your Nextcloud login username"
+                     autocomplete="off" spellcheck="false">
+            </div>
+            <div class="field">
+              <label for="nc_app_password">App password</label>
+              <input type="password" id="nc_app_password" name="nc_app_password"
+                     value="{{ nc_app_password }}"
+                     placeholder="Leave blank to keep existing value"
+                     autocomplete="new-password" spellcheck="false">
+            </div>
+          </div>
+        </details>
+      </div>
+
+      <!-- Home Assistant -->
+      <div class="section">
+        <h2>Home Assistant</h2>
+        <p class="section-note">
+          Receive events from HA automations and let your persona call the HA REST API
+          (read states, control devices). Webhook ID is the shared secret used in your
+          HA <code>rest_command</code> URL.
+        </p>
+
+        <details class="channel-block border border-pg-border rounded-lg overflow-hidden mb-3"
+                 {{ ha_url and 'open' or '' }}>
+          <summary class="flex items-center gap-2 px-4 py-3 text-sm font-semibold text-pg-muted cursor-pointer select-none bg-pg-bg">
+            Connection
+          </summary>
+          <div class="px-4 pt-4 pb-2">
+            <p class="text-xs text-pg-dimmer -mt-1 mb-4 leading-relaxed">
+              HA URL and a Long-Lived Access Token (Profile → scroll to bottom →
+              Long-Lived Access Tokens → Create Token).
+            </p>
+            <div class="field">
+              <label for="ha_url">Home Assistant URL</label>
+              <input type="url" id="ha_url" name="ha_url"
+                     value="{{ ha_url }}"
+                     placeholder="https://ha.yourdomain.com"
+                     autocomplete="off" spellcheck="false">
+            </div>
+            <div class="field">
+              <label for="ha_token">Long-Lived Access Token</label>
+              <input type="password" id="ha_token" name="ha_token"
+                     value=""
+                     placeholder="Leave blank to keep existing token"
+                     autocomplete="new-password" spellcheck="false">
+            </div>
+          </div>
+        </details>
+
+        <details class="channel-block border border-pg-border rounded-lg overflow-hidden mb-3"
+                 {{ ha_webhook_id and 'open' or '' }}>
+          <summary class="flex items-center gap-2 px-4 py-3 text-sm font-semibold text-pg-muted cursor-pointer select-none bg-pg-bg">
+            Inbound webhook (HA → Cortex)
+          </summary>
+          <div class="px-4 pt-4 pb-2">
+            <p class="text-xs text-pg-dimmer -mt-1 mb-4 leading-relaxed">
+              The webhook ID is the shared secret in your HA <code>rest_command</code> URL.
+              Your endpoint: <code>https://cortex.dgrzone.com/webhook/ha/{{ ha_username }}/&lt;webhook_id&gt;</code>
+            </p>
+            <div class="field">
+              <label for="ha_webhook_id">Webhook ID</label>
+              <input type="text" id="ha_webhook_id" name="ha_webhook_id"
+                     value="{{ ha_webhook_id }}"
+                     placeholder="Paste or generate a random secret"
+                     autocomplete="off" spellcheck="false">
+              <p class="hint">Treat this like a password — use a long, random string.</p>
+            </div>
+            <div class="field">
+              <label class="checkbox-label">
+                <input type="checkbox" name="ha_tools" value="1" {{ ha_tools_checked }}>
+                Enable orchestrator tools
+              </label>
+              <p class="hint">When checked, HA events trigger the full tool loop (research, home control, tasks). When unchecked, events get a direct LLM response — faster but no tools.</p>
+            </div>
+          </div>
+        </details>
+      </div>
+
+      <!-- Google Chat -->
+      <div class="section">
+        <h2>Google Chat</h2>
+        <p class="section-note">
+          Outbound webhook for proactive messages to a Google Chat space.
+          Incoming messages are handled separately via the Google Chat Add-on.
+        </p>
+
+        <details class="channel-block border border-pg-border rounded-lg overflow-hidden mb-3"
+                 {{ gc_webhook and 'open' or '' }}>
+          <summary class="flex items-center gap-2 px-4 py-3 text-sm font-semibold text-pg-muted cursor-pointer select-none bg-pg-bg">
+            Outbound webhook
+          </summary>
+          <div class="px-4 pt-4 pb-2">
+            <p class="text-xs text-pg-dimmer -mt-1 mb-4 leading-relaxed">
+              Create a webhook in your Google Chat space → Manage webhooks. Paste the full URL here.
+            </p>
+            <div class="field">
+              <label for="gc_outbound_webhook">Webhook URL</label>
+              <input type="url" id="gc_outbound_webhook" name="gc_outbound_webhook"
+                     value="{{ gc_webhook }}"
+                     placeholder="https://chat.googleapis.com/v1/spaces/…"
+                     autocomplete="off" spellcheck="false">
+            </div>
+          </div>
+        </details>
+      </div>
+
+      <button type="submit" class="btn-submit w-full md:w-96">Save notification settings</button>
+    </form>
+
+    <!-- Test -->
+    <div class="section" style="margin-top:2rem;">
+      <h2>Test</h2>
+      <p class="section-note">
+        Fire a notification via your configured channel or run the reminder check
+        immediately — no need to wait for the daily 09:00 scheduler job.
+      </p>
+      <div class="flex gap-3 mt-2">
+        <button class="flex-1 px-3 py-2.5 text-sm font-medium border border-pg-border rounded-md bg-pg-bg text-pg-text hover:border-pg-action hover:text-pg-accent transition-colors cursor-pointer disabled:opacity-50"
+                id="btn-test-notify">Send Test Notification</button>
+        <button class="flex-1 px-3 py-2.5 text-sm font-medium border border-pg-border rounded-md bg-pg-bg text-pg-text hover:border-pg-action hover:text-pg-accent transition-colors cursor-pointer disabled:opacity-50"
+                id="btn-check-reminders">Check Reminders Now</button>
+      </div>
+      <div id="test-result"
+           class="mt-3 px-3 py-2.5 rounded-md text-sm leading-relaxed"></div>
+    </div>
+  </div>
+
+  <script>
+    // Set channel select to saved value
+    const sel = document.getElementById('notification_channel');
+    if (sel) {
+      const saved = sel.dataset.value;
+      if (saved) {
+        for (const opt of sel.options) {
+          if (opt.value === saved) { opt.selected = true; break; }
+        }
+      }
+    }
+
+    // Test buttons
+    const resultEl = document.getElementById('test-result');
+
+    function showResult(ok, msg) {
+      resultEl.textContent = msg;
+      resultEl.className = ok
+        ? 'mt-3 px-3 py-2.5 rounded-md text-sm leading-relaxed bg-green-950 text-green-400 border border-green-800'
+        : 'mt-3 px-3 py-2.5 rounded-md text-sm leading-relaxed bg-red-950 text-red-400 border border-red-800';
+      resultEl.style.display = 'block';
+    }
+
+    async function apiPost(url, btnEl, label) {
+      btnEl.disabled = true;
+      btnEl.textContent = label + '…';
+      resultEl.style.display = 'none';
+      try {
+        const r = await fetch(url, { method: 'POST' });
+        const data = await r.json();
+        if (r.ok && data.ok) {
+          if (url.includes('reminders')) {
+            const n = data.reminders_found ?? 0;
+            showResult(true, n > 0
+              ? `Found ${n} due reminder${n !== 1 ? 's' : ''} — notification sent.`
+              : 'No due reminders found — nothing sent.');
+          } else {
+            showResult(true, 'Notification sent. Check your configured channel.');
+          }
+        } else {
+          showResult(false, data.detail || 'Request failed.');
+        }
+      } catch (e) {
+        showResult(false, 'Network error: ' + e.message);
+      } finally {
+        btnEl.disabled = false;
+        btnEl.textContent = label;
+      }
+    }
+
+    document.getElementById('btn-test-notify').addEventListener('click', function() {
+      apiPost('/api/push/test', this, 'Send Test Notification');
+    });
+
+    document.getElementById('btn-check-reminders').addEventListener('click', function() {
+      apiPost('/api/push/reminders/check', this, 'Check Reminders Now');
+    });
+  </script>
+</body>
+</html>
--- a/cortex/static/pg.css
+++ b/cortex/static/pg.css
@@ -0,0 +1,189 @@
+/* ─── Cortex settings pages — shared stylesheet ───────────────────────────── */
+
+/* ── Variables ── */
+:root {
+  --pg-bg:     #0f1117; --pg-surface: #1a1d27; --pg-border: #2d3148;
+  --pg-text:   #e2e8f0; --pg-muted:   #94a3b8;
+  --pg-dim:    #64748b; --pg-dimmer:  #475569;
+  --pg-bright: #cbd5e1; --pg-nav-hover: rgba(255,255,255,0.05);
+  --pg-accent: #a78bfa; /* heading/highlight purple */
+  --pg-action: #7c3aed; /* button/focus purple */
+}
+[data-theme="light"] {
+  --pg-bg:     #f4f2fa; --pg-surface: #ffffff; --pg-border: #d0c8e8;
+  --pg-text:   #1a1228; --pg-muted:   #5a5478;
+  --pg-dim:    #7a7290; --pg-dimmer:  #9e98b0;
+  --pg-bright: #1a1228; --pg-nav-hover: rgba(0,0,0,0.05);
+  --pg-accent: #7c3aed;
+  --pg-action: #6d28d9;
+}
+
+/* ── Reset ── */
+*, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
+
+/* ── Base ── */
+body {
+  min-height: 100vh;
+  background: var(--pg-bg);
+  font-family: 'Inter', system-ui, -apple-system, sans-serif;
+  font-weight: 450;
+  -webkit-font-smoothing: antialiased;
+  -moz-osx-font-smoothing: grayscale;
+  color: var(--pg-text);
+}
+
+/* ── Top nav ── */
+.page-nav {
+  display: flex; align-items: center; gap: 0.25rem;
+  padding: 0.5rem 1rem; background: var(--pg-surface);
+  border-bottom: 1px solid var(--pg-border); flex-wrap: wrap;
+}
+.nav-link {
+  display: inline-flex; align-items: center;
+  padding: 0.3rem 0.6rem; border-radius: 6px;
+  font-size: 0.8rem; font-weight: 500; color: var(--pg-dim);
+  text-decoration: none; transition: color 0.15s, background 0.15s;
+  white-space: nowrap;
+}
+.nav-link:hover { color: var(--pg-bright); background: var(--pg-nav-hover); }
+.nav-link.active { color: var(--pg-accent); }
+.nav-spacer { flex: 1; min-width: 0.5rem; }
+.nav-link.nav-logout { color: var(--pg-dimmer); }
+.nav-link.nav-logout:hover { color: var(--pg-muted); background: none; }
+
+/* ── Page container ── */
+.page-wrap {
+  max-width: 860px; margin: 0 auto;
+  padding: 2rem 1.5rem 4rem; width: 100%;
+}
+
+/* ── Page heading ── */
+.page-title {
+  font-size: 1.4rem; font-weight: 700; color: var(--pg-accent);
+}
+.page-subtitle {
+  font-size: 0.8rem; color: var(--pg-muted);
+  margin-top: 0.2rem; margin-bottom: 1.75rem; line-height: 1.5;
+}
+
+/* ── Sections (settings-style, bottom-bordered h2) ── */
+.section { margin-bottom: 2rem; }
+.section > h2 {
+  font-size: 0.9rem; font-weight: 600; color: var(--pg-muted);
+  margin-bottom: 1rem; padding-bottom: 0.4rem;
+  border-bottom: 1px solid var(--pg-border);
+}
+
+/* ── Form elements ── */
+.field { margin-bottom: 1rem; }
+
+label {
+  display: block; font-size: 0.8rem; font-weight: 500;
+  color: var(--pg-muted); margin-bottom: 0.4rem;
+}
+
+input, select, textarea {
+  width: 100%; padding: 0.65rem 0.85rem;
+  background: var(--pg-bg); border: 1px solid var(--pg-border);
+  border-radius: 6px; color: var(--pg-text); font-size: 0.95rem;
+  font-family: inherit; outline: none; transition: border-color 0.15s;
+}
+input:focus, select:focus, textarea:focus { border-color: var(--pg-action); }
+input[readonly] { color: var(--pg-muted); cursor: default; }
+input[type="password"] { font-family: monospace; letter-spacing: 0.05em; }
+input[type="checkbox"], input[type="radio"] { width: auto; padding: 0; }
+
+textarea {
+  font-family: 'SF Mono', 'Fira Mono', 'Menlo', monospace;
+  font-size: 0.88rem; line-height: 1.55; resize: vertical;
+}
+
+/* ── Buttons ── */
+
+/* Primary form submit */
+.btn-submit {
+  padding: 0.6rem 1.5rem; margin-top: 0.25rem;
+  background: var(--pg-action); border: none; border-radius: 6px;
+  color: #fff; font-size: 0.9rem; font-weight: 600;
+  cursor: pointer; transition: opacity 0.15s;
+}
+.btn-submit:hover { opacity: 0.88; }
+
+/* Compact inline primary (e.g. rename save, inline forms) */
+.btn-save {
+  padding: 0.4rem 0.9rem; background: var(--pg-action); border: none;
+  border-radius: 6px; color: #fff; font-size: 0.9rem;
+  font-weight: 600; cursor: pointer; transition: opacity 0.15s;
+}
+.btn-save:hover { opacity: 0.88; }
+
+/* Outline secondary (e.g. clear cache, cancel, test actions) */
+.btn-secondary {
+  padding: 0.5rem 1rem; background: none;
+  border: 1px solid var(--pg-border); border-radius: 6px;
+  color: var(--pg-muted); font-size: 0.88rem; font-weight: 500;
+  cursor: pointer; transition: border-color 0.15s, color 0.15s;
+}
+.btn-secondary:hover { border-color: var(--pg-muted); color: var(--pg-text); }
+.btn-secondary:disabled { opacity: 0.5; cursor: default; }
+
+/* Inline cancel */
+.btn-cancel {
+  padding: 0.4rem 0.75rem; background: none;
+  border: 1px solid var(--pg-border); border-radius: 6px;
+  color: var(--pg-muted); font-size: 0.9rem;
+  cursor: pointer; transition: border-color 0.15s, color 0.15s;
+}
+.btn-cancel:hover { border-color: var(--pg-muted); color: var(--pg-text); }
+
+/* Button-styled link (purple, used for "Settings →" style CTAs) */
+.action-link {
+  display: inline-block; padding: 0.5rem 1rem;
+  background: var(--pg-action); border-radius: 6px;
+  color: #fff; font-size: 0.88rem; font-weight: 600;
+  text-decoration: none; transition: opacity 0.15s;
+}
+.action-link:hover { opacity: 0.88; }
+
+/* Inline button row */
+.btn-row { display: flex; gap: 0.5rem; margin-top: 0.5rem; }
+
+/* ── Text utilities ── */
+
+/* Small muted helper text below inputs */
+.hint { font-size: 0.78rem; color: var(--pg-dim); margin-top: 0.35rem; line-height: 1.5; }
+
+/* Section-level description paragraph */
+.section-note { font-size: 0.8rem; color: var(--pg-muted); margin-bottom: 0.85rem; line-height: 1.55; }
+
+/* Inline code */
+code {
+  font-size: 0.82rem; font-family: 'SF Mono', 'Fira Mono', 'Menlo', monospace;
+  background: var(--pg-bg); border: 1px solid var(--pg-border);
+  padding: 0.1rem 0.35rem; border-radius: 4px; color: var(--pg-accent);
+}
+
+/* ── Feedback messages ── */
+.success { color: #4ade80; font-size: 0.85rem; text-align: center; margin-bottom: 1rem; }
+.error   { color: #f87171; font-size: 0.85rem; text-align: center; margin-bottom: 1rem; }
+
+/* ── Usage table (JS-rendered in settings) ── */
+.usage-table { border-collapse: collapse; width: 100%; min-width: 360px; }
+.usage-table th {
+  padding: 0.35rem 0.5rem; font-size: 0.75rem; color: var(--pg-muted);
+  font-weight: 600; text-align: right; border-bottom: 1px solid var(--pg-border);
+}
+.usage-table th:first-child { padding-left: 0; text-align: left; }
+.usage-table td {
+  padding: 0.4rem 0.5rem; font-size: 0.82rem; color: var(--pg-muted); text-align: right;
+}
+.usage-table td:first-child { padding-left: 0; color: var(--pg-text); text-align: left; white-space: nowrap; }
+.usage-table td:last-child  { color: var(--pg-text); font-weight: 600; }
+
+/* ── Tool category header row (tools_settings.py generated) ── */
+.tool-cat-row td {
+  padding: 0.75rem 0.9rem 0.3rem;
+  font-size: 0.72rem; font-weight: 700; letter-spacing: 0.07em;
+  text-transform: uppercase; color: var(--pg-dimmer);
+  border-bottom: 1px solid var(--pg-border);
+}
--- a/cortex/static/settings.html
+++ b/cortex/static/settings.html
@@ -7,223 +7,118 @@
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
+  <script src="https://cdn.tailwindcss.com"></script>
+  <script>
+  tailwind.config = {
+    corePlugins: { preflight: false },
+    darkMode: ['selector', '[data-theme="dark"]'],
+    theme: {
+      extend: {
+        colors: {
+          pg: {
+            bg:      'var(--pg-bg)',
+            surface: 'var(--pg-surface)',
+            border:  'var(--pg-border)',
+            text:    'var(--pg-text)',
+            muted:   'var(--pg-muted)',
+            dim:     'var(--pg-dim)',
+            dimmer:  'var(--pg-dimmer)',
+            bright:  'var(--pg-bright)',
+            accent:  'var(--pg-accent)',
+            action:  'var(--pg-action)',
+          }
+        },
+        fontFamily: { sans: ['Inter', 'system-ui', 'sans-serif'] }
+      }
+    }
+  }
+  </script>
+  <link rel="stylesheet" href="/static/pg.css">
+  <script>(function(){var t=localStorage.getItem('theme')||(window.matchMedia('(prefers-color-scheme: dark)').matches?'dark':'light');document.documentElement.setAttribute('data-theme',t);})();</script>
  <style>
-    *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
-
-    body {
-      min-height: 100vh;
-      display: flex;
-      align-items: center;
-      justify-content: center;
-      background: #0f1117;
-      font-family: 'Inter', system-ui, -apple-system, sans-serif;
-      font-weight: 450;
-      -webkit-font-smoothing: antialiased;
-      -moz-osx-font-smoothing: grayscale;
-      color: #e2e8f0;
-      padding: 1.5rem;
-    }
-
-    .card {
-      background: #1a1d27;
-      border: 1px solid #2d3148;
-      border-radius: 12px;
-      padding: 2.5rem 2rem;
-      width: 100%;
-      max-width: 480px;
-    }
-
-    .page-nav {
-      display: flex;
-      align-items: center;
-      gap: 0.25rem;
-      margin-bottom: 1.75rem;
-      flex-wrap: wrap;
-    }
-    .nav-link {
-      display: inline-flex;
-      align-items: center;
-      padding: 0.3rem 0.6rem;
-      border-radius: 6px;
-      font-size: 0.8rem;
-      font-weight: 500;
-      color: #64748b;
-      text-decoration: none;
-      transition: color 0.15s, background 0.15s;
-      white-space: nowrap;
-    }
-    .nav-link:hover { color: #cbd5e1; background: rgba(255,255,255,0.05); }
-    .nav-link.active { color: #a78bfa; }
-    .nav-spacer { flex: 1; min-width: 0.5rem; }
-    .nav-link.nav-logout { color: #475569; }
-    .nav-link.nav-logout:hover { color: #94a3b8; background: none; }
-
-    .logo {
-      margin-bottom: 1.75rem;
-    }
-    .logo h1 { font-size: 1.4rem; font-weight: 700; color: #a78bfa; }
-    .logo p  { font-size: 0.8rem; color: #94a3b8; margin-top: 0.2rem; }
-
-    h2 {
-      font-size: 0.9rem;
-      font-weight: 600;
-      color: #94a3b8;
-      margin-bottom: 1rem;
-      padding-bottom: 0.4rem;
-      border-bottom: 1px solid #2d3148;
-    }
-
-    .section { margin-bottom: 2rem; }
-
-    label {
-      display: block;
-      font-size: 0.8rem;
-      font-weight: 500;
-      color: #94a3b8;
-      margin-bottom: 0.4rem;
-    }
-
-    input {
-      width: 100%;
-      padding: 0.65rem 0.85rem;
-      background: #0f1117;
-      border: 1px solid #2d3148;
-      border-radius: 6px;
-      color: #e2e8f0;
-      font-size: 0.95rem;
-      outline: none;
-      transition: border-color 0.15s;
-    }
-    input:focus { border-color: #7c3aed; }
-    input[readonly] { color: #94a3b8; cursor: default; }
-
-    .field { margin-bottom: 1rem; }
-
-    button[type="submit"] {
-      width: 100%;
-      padding: 0.7rem;
-      margin-top: 0.25rem;
-      background: #7c3aed;
-      border: none;
-      border-radius: 6px;
-      color: #fff;
-      font-size: 1rem;
-      font-weight: 600;
-      cursor: pointer;
-      transition: background 0.15s;
-    }
-    button[type="submit"]:hover { background: #6d28d9; }
-
-    .error {
-      color: #f87171;
-      font-size: 0.85rem;
-      text-align: center;
-      margin-bottom: 1rem;
-    }
-
-    .success {
-      color: #4ade80;
-      font-size: 0.85rem;
-      text-align: center;
-      margin-bottom: 1rem;
-    }
-
+    /* ── Server-generated persona list ── */
    .persona-list {
-      list-style: none;
-      display: flex;
-      flex-direction: column;
-      gap: 0.5rem;
-      margin-top: 0.5rem;
-    }
-    .persona-list li {
-      display: flex;
-      align-items: center;
-      gap: 0.5rem;
-      flex-wrap: wrap;
+      list-style: none; display: flex; flex-direction: column;
+      gap: 0.5rem; margin-top: 0.5rem;
    }
+    .persona-list li { display: flex; align-items: center; gap: 0.5rem; flex-wrap: wrap; }
    .persona-link {
-      display: inline-block;
-      padding: 0.3rem 0.75rem;
-      background: #0f1117;
-      border: 1px solid #2d3148;
-      border-radius: 20px;
-      color: #a78bfa;
-      font-size: 0.85rem;
-      text-decoration: none;
-      transition: border-color 0.15s;
+      display: inline-block; padding: 0.3rem 0.75rem;
+      background: var(--pg-bg); border: 1px solid var(--pg-border);
+      border-radius: 20px; color: var(--pg-accent); font-size: 0.85rem;
+      text-decoration: none; transition: border-color 0.15s;
    }
-    .persona-link:hover { border-color: #7c3aed; }
-    .persona-list li em { color: #94a3b8; font-size: 0.85rem; }
-
+    .persona-link:hover { border-color: var(--pg-action); }
+    .persona-list li em { color: var(--pg-muted); font-size: 0.85rem; }
    .persona-rename-toggle {
-      background: none;
-      border: none;
-      color: #94a3b8;
-      font-size: 0.85rem;
-      cursor: pointer;
-      padding: 0.2rem 0.4rem;
-      border-radius: 4px;
-      opacity: 0.7;
-      transition: opacity 0.15s, color 0.15s;
+      background: none; border: 1px solid var(--pg-border);
+      border-radius: 6px; color: var(--pg-muted); font-size: 0.8rem;
+      padding: 0.3rem 0.6rem; margin-top: 0.25rem;
+      cursor: pointer; opacity: 0.7; transition: opacity 0.15s, color 0.15s;
    }
-    .persona-rename-toggle:hover { opacity: 1; color: #a78bfa; }
-
+    .persona-rename-toggle:hover { opacity: 1; color: var(--pg-accent); }
    .persona-rename-form { display: flex; align-items: center; gap: 0.4rem; }
    .persona-rename-form input[type="text"] {
-      width: 12rem;
-      padding: 0.3rem 0.6rem;
-      background: #0f1117;
-      border: 1px solid #7c3aed;
-      border-radius: 6px;
-      color: #e2e8f0;
-      font-size: 0.9rem;
-      outline: none;
+      width: 12rem; padding: 0.3rem 0.6rem;
+      border-color: var(--pg-action); font-size: 0.9rem;
    }
-    .persona-rename-form button[type="submit"] {
-      width: auto;
-      padding: 0.3rem 0.75rem;
-      font-size: 0.85rem;
-      margin-top: 0;
-    }
-    .persona-rename-cancel {
-      background: none;
-      border: 1px solid #2d3148;
-      border-radius: 6px;
-      color: #94a3b8;
-      font-size: 0.85rem;
-      padding: 0.3rem 0.6rem;
-      cursor: pointer;
-    }
-    .persona-rename-cancel:hover { border-color: #94a3b8; color: #e2e8f0; }
+    .persona-rename-form .btn-save { padding: 0.3rem 0.75rem; font-size: 0.85rem; }

-    .add-persona {
-      display: inline-block;
-      margin-top: 0.75rem;
-      font-size: 0.8rem;
-      color: #94a3b8;
-      text-decoration: none;
+    /* ── Server-generated role badge ── */
+    .role-badge {
+      display: inline-block; padding: 0.25rem 0.75rem;
+      border-radius: 20px; font-size: 0.78rem; font-weight: 600;
+      text-transform: uppercase; letter-spacing: 0.06em;
    }
-    .add-persona:hover { color: #a78bfa; }
+    .role-badge.role-admin {
+      background: rgba(124,58,237,0.15); color: var(--pg-accent);
+      border: 1px solid rgba(124,58,237,0.4);
+    }
+    .role-badge.role-user {
+      background: rgba(100,116,139,0.12); color: var(--pg-muted);
+      border: 1px solid var(--pg-border);
+    }
+
+    /* ── JS-toggled states ── */
+    #clear-ls-ok { display: none; margin-left: 0.75rem; font-size: 0.8rem; color: #4ade80; }
+    .usage-wrap { overflow-x: auto; }
  </style>
 </head>
 <body>
-  <div class="card">
-    <nav class="page-nav">
-      <a href="{{ back_href }}" class="nav-link">← Chat</a>
-      <a href="{{ help_href }}" class="nav-link">Help</a>
-      <a href="/settings" class="nav-link active">Settings</a>
-      <span class="nav-spacer"></span>
-      <a href="/logout" class="nav-link nav-logout">Sign out</a>
-    </nav>
-
-    <div class="logo">
-      <h1>Account Settings</h1>
-      <p>Manage your account and personas.</p>
-    </div>
+  <nav class="page-nav">
+    <a href="{{ back_href }}" class="nav-link">← Chat</a>
+    <a href="{{ help_href }}" class="nav-link">Help</a>
+    <a href="/settings" class="nav-link active">Settings</a>
+    <a href="/settings/models" class="nav-link">Models</a>
+    <a href="/settings/notifications" class="nav-link">Notifications</a>
+    <a href="/settings/tools" class="nav-link">Tools</a>
+    <a href="/settings/crons" class="nav-link">Schedules</a>
+    {{ integrations_nav }}
+    <span class="nav-spacer"></span>
+    <a href="/logout" class="nav-link nav-logout">Sign out</a>
+  </nav>
+  <div class="page-wrap">
+    <h1 class="page-title">Account Settings</h1>
+    <p class="page-subtitle">Manage your account and personas.</p>

    <!-- SUCCESS -->
    <!-- ERROR -->

+    <!-- OpenRouter quickstart (shown by JS when no model is configured) -->
+    <div id="openrouter-quickstart"
+         class="hidden rounded-xl border border-amber-800 bg-amber-950 p-4 mb-5">
+      <p class="text-xs font-semibold text-amber-400 mb-1">⚡ You're on the server default model</p>
+      <p class="text-xs text-amber-600 mb-3 leading-relaxed">
+        You can chat now, but adding your own model gives you more choices, lets you pick
+        role-specific models, and tracks your usage separately.
+        OpenRouter is the easiest way to get started — one key, many models.
+      </p>
+      <a href="/setup/model"
+         class="inline-block px-3 py-2 rounded-md bg-amber-900 text-amber-100 text-sm font-medium hover:bg-amber-800 transition-colors">
+        Set up OpenRouter →
+      </a>
+    </div>
+
    <!-- Account info -->
    <div class="section">
      <h2>Account</h2>
@@ -231,26 +126,25 @@
        <label>Username</label>
        <input type="text" value="{{ username }}" readonly>
      </div>
-      <button type="button" id="show-rename-user" class="persona-rename-toggle"
-              style="opacity:0.7; font-size:0.8rem; padding:0.3rem 0.6rem; border:1px solid #2d3148; border-radius:6px; margin-top:0.25rem;">
+      <div class="field">
+        <label>Role</label>
+        <span class="role-badge role-{{ user_role }}">{{ user_role }}</span>
+      </div>
+      <button type="button" id="show-rename-user" class="persona-rename-toggle">
        ✏ Change username
      </button>
-      <form id="rename-user-form" method="POST" action="/settings/username"
-            style="display:none; margin-top:0.75rem;">
+      <form id="rename-user-form" method="POST" action="/settings/username" style="display:none; margin-top:0.75rem;">
        <div class="field">
          <label for="new_username">New username</label>
          <input type="text" id="new_username" name="new_username"
                 value="{{ username }}"
                 pattern="[a-z_][a-z0-9_\-]{0,31}" required autofocus
                 autocomplete="off" data-form-type="other">
-          <p style="font-size:0.75rem; color:#94a3b8; margin-top:0.3rem;">
-            Lowercase letters, digits, _ or - only. You will be logged out after renaming.
-          </p>
+          <p class="hint">Lowercase letters, digits, _ or - only. You will be logged out after renaming.</p>
        </div>
-        <div style="display:flex; gap:0.5rem;">
-          <button type="submit" style="flex:1; padding:0.5rem; background:#7c3aed; border:none; border-radius:6px; color:#fff; font-size:0.9rem; font-weight:600; cursor:pointer;">Save</button>
-          <button type="button" id="cancel-rename-user"
-                  style="padding:0.5rem 0.9rem; background:none; border:1px solid #2d3148; border-radius:6px; color:#94a3b8; font-size:0.9rem; cursor:pointer;">Cancel</button>
+        <div class="btn-row">
+          <button type="submit" class="btn-save">Save</button>
+          <button type="button" id="cancel-rename-user" class="btn-cancel">Cancel</button>
        </div>
      </form>
    </div>
@@ -262,55 +156,71 @@
        <label>Google Account</label>
        <input type="text" value="{{ google_email }}" readonly
               placeholder="No Google account linked"
-               style="{{ google_email == '' and 'color:#475569' or '' }}">
+               style="{{ google_email == '' and 'color:var(--pg-dimmer)' or '' }}">
      </div>
-      <p style="font-size:0.75rem; color:#94a3b8; margin-top:-0.5rem;">
-        To link or change your Google account, contact Scott.
-      </p>
+      <p class="hint" style="margin-top:-0.5rem;">To link or change your Google account, contact Scott.</p>
    </div>

-    <!-- Gemini API key -->
+    <!-- Email Allowlist -->
    <div class="section">
-      <h2>Gemini API Key</h2>
-      <p style="font-size:0.8rem; color:#94a3b8; margin-bottom:0.85rem; line-height:1.55;">
-        Paste your personal key from
-        <a href="https://aistudio.google.com/apikey" target="_blank" rel="noopener"
-           style="color:#a78bfa;">aistudio.google.com/apikey</a>
-        to use your own Gemini quota. Leave blank to use the shared server key.
+      <h2>Email Allowlist</h2>
+      <p class="section-note">
+        One regex pattern per line. The <code>email_send</code> tool will only send to addresses
+        that match at least one pattern. Leave blank to block all outbound email.
      </p>
-      <form method="POST" action="/settings/gemini-key">
+      <form method="POST" action="/settings/email-allowlist">
        <div class="field">
-          <label for="gemini_api_key">API Key</label>
-          <input type="text" id="gemini_api_key" name="gemini_api_key"
-                 placeholder="{{ gemini_key_hint }}"
-                 autocomplete="new-password" spellcheck="false"
-                 data-1p-ignore data-lpignore="true" data-form-type="other">
+          <label for="email_allowlist_ta">Allowed patterns</label>
+          <textarea id="email_allowlist_ta" name="patterns" rows="6"
+                    placeholder=".*@example\.com&#10;alice@example\.com"
+                    spellcheck="false">{{ email_allowlist }}</textarea>
        </div>
-        <button type="submit">Save Key</button>
+        <button type="submit" class="btn-submit w-full md:w-96">Save allowlist</button>
      </form>
-      <p id="gemini-key-status" style="font-size:0.75rem; color:#94a3b8; margin-top:0.5rem;">
-        Current: {{ gemini_key_hint }}
-        <span id="gemini-remove-wrap" style="{{ gemini_key_set == 'false' and 'display:none' or '' }}">
-          — <a href="#" id="gemini-remove-link" style="color:#f87171;">remove</a>
-        </span>
-      </p>
    </div>

-    <!-- Local models link -->
+    <!-- HTTP POST Allowlist -->
    <div class="section">
-      <h2>Local Models</h2>
-      <p style="font-size:0.8rem; color:#94a3b8; margin-bottom:0.85rem; line-height:1.55;">
-        Configure OpenAI-compatible hosts and models (Open WebUI, Ollama, LM Studio, etc.).
+      <h2>HTTP POST Allowlist</h2>
+      <p class="section-note">
+        One URL prefix per line. The <code>http_post</code> tool will only POST to URLs that
+        start with a listed prefix. Leave blank to block all outbound POST requests.
      </p>
-      <a href="/settings/local"
-         style="display:inline-block; padding:0.55rem 1rem; background:#7c3aed; border-radius:6px;
-                color:#fff; font-size:0.88rem; font-weight:600; text-decoration:none;
-                transition:background 0.15s;">
-        Manage local models →
-      </a>
+      <form method="POST" action="/settings/http-allowlist">
+        <div class="field">
+          <label for="http_allowlist_ta">Allowed URL prefixes</label>
+          <textarea id="http_allowlist_ta" name="prefixes" rows="5"
+                    placeholder="https://ha.dgrzone.com/api/webhook/&#10;https://n8n.dgrzone.com/webhook/"
+                    spellcheck="false">{{ http_allowlist }}</textarea>
+        </div>
+        <button type="submit" class="btn-submit w-full md:w-96">Save allowlist</button>
+      </form>
    </div>

-    <!-- Change password -->
+    <!-- Usage summary -->
+    <div class="section" id="usage-section">
+      <h2>Usage</h2>
+      <p class="section-note">
+        Token consumption tracked for API-backed models (Gemini API, local OpenAI-compatible).
+        Claude CLI calls are not metered.
+      </p>
+      <div id="usage-table-wrap" class="usage-wrap">
+        <p class="section-note">Loading…</p>
+      </div>
+    </div>
+
+    <!-- Browser Cache -->
+    <div class="section">
+      <h2>Browser Cache</h2>
+      <p class="section-note">
+        Clears UI preferences stored in this browser: active mode, session ID, memory toggles,
+        theme, font size, and context tier. Does not sign you out.
+      </p>
+      <button type="button" id="clear-ls-btn" class="btn-secondary">Clear browser cache</button>
+      <span id="clear-ls-ok">Cleared.</span>
+    </div>
+
+    <!-- Change Password -->
    <div class="section">
      <h2>Change Password</h2>
      <form method="POST" action="/settings/password" id="password-form">
@@ -329,17 +239,33 @@
          <input type="password" id="confirm_password" name="confirm_password"
                 autocomplete="new-password" required>
        </div>
-        <button type="submit">Update password</button>
+        <button type="submit" class="btn-submit w-full md:w-96">Update password</button>
      </form>
    </div>

+    <!-- Sessions -->
+    <div class="section">
+      <h2>Sessions</h2>
+      <p class="section-note">
+        Auto-name any sessions that still show a random ID, using their first message as the name.
+        Only unnamed sessions are affected — existing names are left alone.
+      </p>
+      <button type="button" id="backfill-names-btn" class="btn-secondary">Auto-name old sessions</button>
+      <span id="backfill-names-ok"
+            class="ml-3 text-xs hidden"
+            style="color:#4ade80"></span>
+    </div>
+
    <!-- Personas -->
    <div class="section">
      <h2>Personas</h2>
      <ul class="persona-list">
        {{ persona_items }}
      </ul>
-      <a href="/setup/persona" class="add-persona">+ Add new persona</a>
+      <a href="/setup/persona"
+         class="inline-block mt-3 text-xs text-pg-muted hover:text-pg-accent transition-colors">
+        + Add new persona
+      </a>
    </div>
  </div>

@@ -365,15 +291,87 @@
      document.getElementById('show-rename-user').style.display = '';
    });

-    // Gemini key — "remove" link clears the input and submits the form
-    const geminiRemove = document.getElementById('gemini-remove-link');
-    if (geminiRemove) {
-      geminiRemove.addEventListener('click', e => {
-        e.preventDefault();
-        document.getElementById('gemini_api_key').value = '';
-        document.querySelector('form[action="/settings/gemini-key"]').submit();
-      });
-    }
+    // Clear localStorage (keeps JWT cookie — no sign-out)
+    document.getElementById('clear-ls-btn').addEventListener('click', () => {
+      localStorage.clear();
+      document.getElementById('clear-ls-ok').style.display = 'inline';
+    });
+
+    // Show OpenRouter quick-start card if no model is configured
+    (async () => {
+      try {
+        const d = await fetch('/backend').then(r => r.json());
+        if ((d.available_roles || []).length === 0) {
+          const el = document.getElementById('openrouter-quickstart');
+          el.classList.remove('hidden');
+          el.style.display = 'block';
+        }
+      } catch (_) {}
+    })();
+
+    // Usage summary table
+    (async () => {
+      const wrap = document.getElementById('usage-table-wrap');
+      try {
+        const resp = await fetch('/api/usage/summary');
+        if (!resp.ok) throw new Error(resp.statusText);
+        const rows_data = await resp.json();
+        if (!rows_data.length) {
+          wrap.innerHTML = '<p class="section-note">No usage recorded yet.</p>';
+          return;
+        }
+        const fmt = n => n >= 1000 ? (n / 1000).toFixed(1) + 'k' : String(n);
+        const rows = rows_data.map(d => {
+          const labelCell = d.label !== d.key
+            ? `<span title="${d.key}">${d.label}</span>`
+            : `<span>${d.key}</span>`;
+          return `<tr>
+            <td>${labelCell}</td>
+            <td>${d.calls}</td>
+            <td>${fmt(d.prompt_tokens)}</td>
+            <td>${fmt(d.completion_tokens)}</td>
+            <td>${fmt(d.total_tokens)}</td>
+          </tr>`;
+        }).join('');
+        wrap.innerHTML = `<table class="usage-table">
+          <thead><tr>
+            <th style="text-align:left">Model</th>
+            <th>Calls</th><th>Prompt</th><th>Output</th><th>Total</th>
+          </tr></thead>
+          <tbody>${rows}</tbody>
+        </table>`;
+      } catch (e) {
+        wrap.innerHTML = '<p class="section-note">Could not load usage data.</p>';
+      }
+    })();
+
+    // Auto-name old sessions backfill
+    document.getElementById('backfill-names-btn').addEventListener('click', async () => {
+      const btn = document.getElementById('backfill-names-btn');
+      const ok  = document.getElementById('backfill-names-ok');
+      btn.disabled = true;
+      btn.textContent = 'Working…';
+      try {
+        const params = new URLSearchParams(window.location.search);
+        const user    = params.get('user')    || document.querySelector('input[value]')?.value || '';
+        const persona = params.get('persona') || '';
+        const qs = user ? `?user=${encodeURIComponent(user)}&persona=${encodeURIComponent(persona)}` : '';
+        const res = await fetch(`/api/sessions/backfill-names${qs}`, { method: 'POST' });
+        const data = await res.json();
+        if (!res.ok) throw new Error(data.detail || res.statusText);
+        const n = data.named ?? 0;
+        ok.textContent = `Named ${n} session${n !== 1 ? 's' : ''}.`;
+        ok.style.display = 'inline';
+        ok.classList.remove('hidden');
+      } catch (e) {
+        ok.textContent = 'Error — check console.';
+        ok.style.color = '#f87171';
+        ok.style.display = 'inline';
+        ok.classList.remove('hidden');
+      }
+      btn.textContent = 'Auto-name old sessions';
+      btn.disabled = false;
+    });

    // Persona rename toggle
    document.querySelectorAll('.persona-rename-toggle').forEach(btn => {
--- a/cortex/static/setup.html
+++ b/cortex/static/setup.html
@@ -127,6 +127,36 @@

    .emoji-opt.selected { border-color: #7c3aed; background: #2d1f52; }
    #emoji-hidden { display: none; }
+
+    .provider-badge {
+      display: inline-flex;
+      align-items: center;
+      gap: 0.4rem;
+      background: #2d1f52;
+      border: 1px solid #7c3aed;
+      border-radius: 6px;
+      padding: 0.3rem 0.6rem;
+      font-size: 0.78rem;
+      color: #a78bfa;
+      margin-bottom: 1rem;
+    }
+
+    .skip-link {
+      display: block;
+      text-align: center;
+      margin-top: 1rem;
+      font-size: 0.8rem;
+      color: #64748b;
+      text-decoration: none;
+    }
+    .skip-link:hover { color: #94a3b8; }
+
+    .model-hint {
+      font-size: 0.72rem;
+      color: #64748b;
+      margin-top: 0.75rem;
+      text-align: center;
+    }
  </style>
 </head>
 <body>
@@ -137,10 +167,11 @@
    </div>

    <!-- ERROR -->
+    <!-- ERROR_MODEL -->

    <!-- ── Step 1: password ───────────────────────────────────────── -->
    <div id="step-password">
-      <div class="step-label">Step 1 of 2</div>
+      <div class="step-label">Step 1 of 3</div>
      <h2>Set your password</h2>
      <form method="POST" action="" id="password-form">
        <input type="hidden" name="step" value="password">
@@ -161,7 +192,7 @@

    <!-- ── Step 2: persona ────────────────────────────────────────── -->
    <div id="step-persona" style="display:none">
-      <div class="step-label">Step 2 of 2</div>
+      <div class="step-label">Step 2 of 3</div>
      <h2>Create your persona</h2>
      <form method="POST" action="" id="persona-form">
        <input type="hidden" name="step" value="persona">
@@ -203,6 +234,39 @@
        <button type="submit">Create my persona →</button>
      </form>
    </div>
+
+    <!-- ── Step 3: model connect ─────────────────────────────────── -->
+    <div id="step-model" style="display:none">
+      <div class="step-label"><!-- SETUP_STEP3_LABEL --></div>
+      <h2>Connect an AI model</h2>
+      <div class="provider-badge">⚡ Recommended: OpenRouter</div>
+      <p style="font-size:0.82rem;color:#94a3b8;margin-bottom:1rem;">
+        One API key gives you access to Claude, Gemini, Llama, and dozens of other models.
+        Get a free key at <a href="https://openrouter.ai/keys" target="_blank" style="color:#a78bfa;">openrouter.ai/keys</a>.
+      </p>
+      <form method="POST" action="/setup/model" id="model-form">
+        <div class="field">
+          <label for="api_key">OpenRouter API key</label>
+          <input type="password" id="api_key" name="api_key"
+                 autocomplete="off" placeholder="sk-or-v1-..." required>
+        </div>
+        <div class="field">
+          <label for="model_name">Starting model</label>
+          <select id="model_name" name="model_name">
+            <option value="anthropic/claude-3-5-haiku-20241022">Claude 3.5 Haiku — Fast &amp; affordable</option>
+            <option value="anthropic/claude-3-7-sonnet-20250219">Claude 3.7 Sonnet — Smarter Claude</option>
+            <option value="google/gemini-2.0-flash-001">Gemini 2.0 Flash — Fast Google model</option>
+            <option value="meta-llama/llama-3.3-70b-instruct">Llama 3.3 70B — Open source</option>
+          </select>
+          <p class="hint">You can add more models or switch anytime in Account → Model Registry.</p>
+        </div>
+        <button type="submit">Connect &amp; start chatting →</button>
+      </form>
+      <p class="model-hint">
+        Using Ollama, a local model, or something else?
+        <a href="#" id="skip-model-link" style="color:#64748b;">Skip this step →</a>
+      </p>
+    </div>
  </div>

  <script>
@@ -232,6 +296,11 @@
      document.getElementById('step-password').style.display = 'none';
      document.getElementById('step-persona').style.display  = 'block';
    }
+    if (params.get('step') === '3') {
+      document.getElementById('step-password').style.display = 'none';
+      document.getElementById('step-persona').style.display  = 'none';
+      document.getElementById('step-model').style.display    = 'block';
+    }

    // ── Client-side confirm password check ───────────────────────────
    document.getElementById('password-form').addEventListener('submit', e => {
@@ -243,6 +312,15 @@
      }
    });

+    // ── Skip model setup — navigate to user home ─────────────────────
+    document.getElementById('skip-model-link')?.addEventListener('click', e => {
+      e.preventDefault();
+      // Ask server for skip target (the cx_setup_persona cookie has the path)
+      fetch('/setup/model/skip', { method: 'POST', credentials: 'same-origin' })
+        .then(r => { if (r.redirected) location.href = r.url; else location.href = '/'; })
+        .catch(() => { location.href = '/'; });
+    });
+
    // ── Auto-generate persona slug from display name ─────────────────
    document.getElementById('display_name').addEventListener('input', function() {
      const slugField = document.getElementById('persona_name');
--- a/cortex/static/style.css
+++ b/cortex/static/style.css
@@ -21,6 +21,9 @@
            --pre-bg:       rgba(0,0,0,0.35);
            --success:      #6abf6a;
            --success-dim:  #2a4a2a;
+            --amber:        #f59e0b;
+            --amber-border: #92400e;
+            --amber-glow:   rgba(245,158,11,0.35);
        }

        /* ── Light theme ─────────────────────────────────────────── */
@@ -45,6 +48,9 @@
                --pre-bg:       rgba(0,0,0,0.07);
                --success:      #1e6e1e;
                --success-dim:  #5aaa5a;
+                --amber:        #b45309;
+                --amber-border: #92400e;
+                --amber-glow:   rgba(180,83,9,0.25);
            }
        }

@@ -69,6 +75,9 @@
            --pre-bg:       rgba(0,0,0,0.35);
            --success:      #6abf6a;
            --success-dim:  #2a4a2a;
+            --amber:        #f59e0b;
+            --amber-border: #92400e;
+            --amber-glow:   rgba(245,158,11,0.35);
        }

        [data-theme="light"] {
@@ -91,6 +100,9 @@
            --pre-bg:       rgba(0,0,0,0.07);
            --success:      #1e6e1e;
            --success-dim:  #5aaa5a;
+            --amber:        #b45309;
+            --amber-border: #92400e;
+            --amber-glow:   rgba(180,83,9,0.25);
        }

        body {
@@ -130,6 +142,15 @@

        .header-emoji.processing { animation: shimmer 0.75s ease-in-out infinite; }

+        @keyframes border-pulse {
+            0%, 100% { box-shadow: inset 0 0 15px var(--amber-glow); }
+            50%      { box-shadow: inset 0 0 30px var(--amber-glow); }
+        }
+
+        body.processing {
+            animation: border-pulse 1.5s ease-in-out infinite;
+        }
+
        header .name     { font-size: 1.1rem; font-weight: 600; color: var(--accent); }
        header .subtitle { font-size: 0.78rem; color: var(--muted); }

@@ -151,7 +172,7 @@
            background: var(--surface);
            border: 1px solid var(--border);
            border-radius: 8px;
-            box-shadow: 0 4px 16px rgba(0,0,0,0.4);
+            box-shadow: 0 8px 24px var(--shadow);
            z-index: 200;
            overflow: hidden;
        }
@@ -159,7 +180,9 @@
        .persona-dropdown.open { display: block; }

        .persona-dropdown a {
-            display: block;
+            display: flex;
+            align-items: center;
+            gap: 8px;
            padding: 0.55rem 0.85rem;
            color: var(--text);
            text-decoration: none;
@@ -171,6 +194,12 @@

        .persona-dropdown a.active { color: var(--accent); font-weight: 600; }

+        .persona-dropdown .pd-emoji {
+            font-size: 1.1rem;
+            line-height: 1;
+            flex-shrink: 0;
+        }
+
        .persona-dropdown .pd-divider {
            border-top: 1px solid var(--border);
            margin: 0.25rem 0;
@@ -223,7 +252,7 @@
            background: var(--surface);
            border: 1px solid var(--border);
            border-radius: 8px;
-            box-shadow: 0 4px 16px rgba(0,0,0,0.4);
+            box-shadow: 0 8px 24px var(--shadow);
            z-index: 200;
            overflow: hidden;
        }
@@ -246,6 +275,7 @@
            box-sizing: border-box;
        }
        .hdr-dd-item:hover { background: var(--border); }
+        .hdr-dd-item.push-active { color: var(--accent); }

        .hdr-dd-divider {
            border-top: 1px solid var(--border);
@@ -258,8 +288,8 @@
            position: absolute;
            top: calc(100% + 4px);
            right: 12px;
-            width: min(300px, calc(100vw - 24px));
-            max-height: 340px;
+            width: min(420px, calc(100vw - 24px));
+            max-height: 400px;
            overflow-y: auto;
            background: var(--surface);
            border: 1px solid var(--border);
@@ -271,19 +301,26 @@
        #sessions-panel.open { display: block; }

        .session-item {
-            padding: 10px 14px;
+            padding: 8px 12px;
            cursor: pointer;
            border-bottom: 1px solid var(--border);
            display: flex;
-            justify-content: space-between;
            align-items: center;
-            gap: 8px;
+            gap: 6px;
        }

        .session-item:last-child { border-bottom: none; }
        .session-item:hover { background: var(--bg); }
        .session-item.new { color: var(--accent); justify-content: center; }

+        .session-body {
+            flex: 1;
+            min-width: 0;
+            display: flex;
+            flex-direction: column;
+            gap: 2px;
+        }
+
        .session-delete-btn {
            background: none;
            border: none;
@@ -300,7 +337,7 @@
        }
        .session-delete-btn:hover { color: #e06c75; }

-        .session-rename-btn {
+        .session-edit-btn {
            background: none;
            border: none;
            color: var(--muted);
@@ -310,13 +347,59 @@
            cursor: pointer;
            border-radius: 3px;
            flex-shrink: 0;
-            opacity: 0.4;
+            opacity: 0.3;
            transition: opacity 0.15s, color 0.15s;
            min-width: 24px;
            text-align: center;
        }
-        .session-item:hover .session-rename-btn { opacity: 1; }
-        .session-rename-btn:hover { color: var(--accent); }
+        .session-item:hover .session-edit-btn { opacity: 0.75; }
+        .session-edit-btn:hover { color: var(--accent); opacity: 1; }
+
+        .session-save-btn {
+            background: none;
+            border: none;
+            color: var(--accent);
+            font-size: 1rem;
+            font-weight: bold;
+            line-height: 1;
+            padding: 2px 6px;
+            cursor: pointer;
+            border-radius: 3px;
+            flex-shrink: 0;
+            min-width: 24px;
+            text-align: center;
+            transition: opacity 0.15s;
+        }
+        .session-save-btn:hover { opacity: 0.75; }
+
+        .session-confirm-row {
+            display: flex;
+            align-items: center;
+            gap: 0.4rem;
+            flex: 1;
+            min-width: 0;
+        }
+        .session-confirm-label {
+            flex: 1;
+            font-size: 0.78rem;
+            color: #e06c75;
+            white-space: nowrap;
+            overflow: hidden;
+            text-overflow: ellipsis;
+        }
+        .session-confirm-yes, .session-confirm-no {
+            background: none;
+            border: 1px solid;
+            border-radius: 4px;
+            font-size: 0.72rem;
+            padding: 2px 8px;
+            cursor: pointer;
+            flex-shrink: 0;
+            transition: opacity 0.15s;
+        }
+        .session-confirm-yes { border-color: #e06c75; color: #e06c75; }
+        .session-confirm-no  { border-color: var(--muted); color: var(--muted); }
+        .session-confirm-yes:hover, .session-confirm-no:hover { opacity: 0.75; }

        .session-rename-input {
            flex: 1;
@@ -343,11 +426,9 @@
        }

        .session-meta {
-            font-size: 0.78rem;
+            font-size: 0.75rem;
            color: var(--muted);
            white-space: nowrap;
-            text-align: right;
-            flex-shrink: 0;
        }

        /* Messages */
@@ -526,6 +607,25 @@

        .message.thinking { color: var(--muted); font-style: italic; }

+        /* Confirmation gate */
+        .confirm-gate { display: flex; flex-direction: column; gap: 0.6rem; }
+        .confirm-gate p { margin: 0; }
+        .confirm-tools { font-size: 0.82rem; color: var(--muted); }
+        .confirm-actions { display: flex; gap: 0.5rem; margin-top: 0.25rem; }
+        .confirm-btn, .deny-btn {
+            padding: 0.35rem 0.9rem;
+            border-radius: 6px;
+            border: none;
+            font-size: 0.85rem;
+            font-weight: 600;
+            cursor: pointer;
+            transition: opacity 0.15s;
+        }
+        .confirm-btn { background: #16a34a; color: #fff; }
+        .confirm-btn:hover { opacity: 0.85; }
+        .deny-btn { background: var(--surface); border: 1px solid var(--border); color: var(--text); }
+        .deny-btn:hover { border-color: var(--muted); }
+
        /* Copy button */
        .message.assistant, .message.user { position: relative; }

@@ -552,18 +652,34 @@
        .copy-btn:hover  { color: var(--text); border-color: var(--muted); }
        .copy-btn.copied { color: var(--success); border-color: var(--success-dim); }

-        /* Model tag — shown at the bottom of every assistant message */
-        .model-tag {
-            display: block;
-            font-size: 0.67rem;
-            color: #475569;
-            margin-top: 0.55rem;
-            padding-top: 0.4rem;
-            border-top: 1px solid #2d3148;
-            text-align: right;
+        /* Message metadata — shown in the hover bar below the bubble */
+        .msg-meta {
+            display: flex;
+            align-items: center;
+            gap: 5px;
+            flex: 1;
+            min-width: 0;
+            font-size: 0.62rem;
+            color: var(--dim);
            letter-spacing: 0.02em;
+            overflow: hidden;
        }
-        .model-tag.fallback { color: #f59e0b; }
+        .msg-meta-model {
+            overflow: hidden;
+            text-overflow: ellipsis;
+            white-space: nowrap;
+        }
+        .msg-meta-model.fallback { color: #f59e0b; }
+        .msg-meta-badge {
+            flex-shrink: 0;
+            padding: 1px 5px;
+            border-radius: 3px;
+            font-size: 0.6rem;
+            font-weight: 600;
+            letter-spacing: 0.04em;
+        }
+        .msg-meta-badge.otr { background: #1e1b4b; color: #818cf8; }
+        [data-theme="light"] .msg-meta-badge.otr { background: #ede9fe; color: #5b21b6; }

        /* Retry button — shown in error message bubbles */
        .retry-btn {
@@ -640,6 +756,16 @@
            gap: 4px;
        }

+        /* S: collapse to a single row — mode button + compact tools toggle */
+        #mode-select[data-size="s"] {
+            flex-direction: row;
+            align-items: center;
+        }
+        #mode-select[data-size="s"] #tools-toggle {
+            padding: 3px 7px;
+            font-size: 0.75rem;
+        }
+
        #mode-select-btn {
            display: flex;
            align-items: center;
@@ -654,10 +780,9 @@
            white-space: nowrap;
            transition: border-color 0.15s, color 0.15s;
        }
-        #mode-select-btn:hover                  { border-color: var(--muted); color: var(--text); }
-        #mode-select-btn.mode-note              { border-color: rgba(180,130,40,0.6); color: #c9a84c; }
-        #mode-select-btn.mode-otr               { border-color: rgba(120,80,160,0.6); color: #a87fd4; }
-        #mode-select-btn.mode-agent             { border-color: rgba(80,140,200,0.6); color: #7cb9e8; }
+        #mode-select-btn:hover     { border-color: var(--muted); color: var(--text); }
+        #mode-select-btn.mode-note { border-color: rgba(180,130,40,0.6); color: #c9a84c; }
+        #mode-select-btn.mode-otr  { border-color: rgba(120,80,160,0.6); color: #a87fd4; }

        #mode-icon  { display: flex; align-items: center; }
        .mode-arrow { font-size: 0.55rem; color: var(--muted); margin-left: 2px; opacity: 0.5; }
@@ -716,6 +841,78 @@
            color: rgba(40,170,150,0.75);
        }

+        /* Tools toggle — OFF: dim/muted; ON: amber with glow */
+        #tools-toggle {
+            background: var(--bg);
+            border: 1px solid rgba(255,255,255,0.1);
+            border-radius: 6px;
+            color: rgba(255,255,255,0.2);
+            font-size: 0.85rem;
+            padding: 4px 8px;
+            cursor: pointer;
+            text-align: center;
+            transition: color 0.15s, border-color 0.15s, box-shadow 0.15s;
+        }
+        #tools-toggle:hover { color: rgba(255,255,255,0.4); border-color: rgba(255,255,255,0.2); }
+        #tools-toggle.local-on {
+            color: var(--amber);
+            border-color: var(--amber-border);
+            box-shadow: 0 0 6px var(--amber-glow);
+        }
+        #tools-toggle.local-on:hover { box-shadow: 0 0 10px var(--amber-glow); }
+
+        #attach-btn {
+            background: var(--bg);
+            border: 1px solid rgba(255,255,255,0.1);
+            border-radius: 6px;
+            color: rgba(255,255,255,0.3);
+            font-size: 0.95rem;
+            padding: 3px 7px;
+            cursor: pointer;
+            transition: color 0.15s, border-color 0.15s;
+        }
+        #attach-btn:hover { color: rgba(255,255,255,0.6); border-color: rgba(255,255,255,0.25); }
+
+        #attachment-row {
+            padding: 0.3rem 0.5rem;
+            border-bottom: 1px solid var(--border);
+        }
+        #attachment-preview {
+            display: inline-flex;
+            align-items: center;
+            gap: 0.4rem;
+            background: var(--bg-alt);
+            border: 1px solid var(--border);
+            border-radius: 6px;
+            padding: 0.2rem 0.5rem;
+            font-size: 0.82rem;
+            max-width: 100%;
+        }
+        #attachment-thumb {
+            max-height: 2.4rem;
+            max-width: 3.5rem;
+            border-radius: 3px;
+            object-fit: contain;
+        }
+        #attachment-name {
+            color: var(--text-mid);
+            max-width: 220px;
+            overflow: hidden;
+            text-overflow: ellipsis;
+            white-space: nowrap;
+        }
+        #attachment-clear {
+            background: none;
+            border: none;
+            color: var(--muted);
+            cursor: pointer;
+            padding: 0 0.15rem;
+            font-size: 0.78rem;
+            line-height: 1;
+            flex-shrink: 0;
+        }
+        #attachment-clear:hover { color: var(--text); }
+
        #input {
            flex: 1;
            background: var(--bg);
@@ -737,8 +934,7 @@
        #input.mode-note:focus        { border-color: rgba(180,130,40,0.85); }
        #input.mode-note.public       { border-color: rgba(40,170,150,0.55); }
        #input.mode-note.public:focus { border-color: rgba(40,170,150,0.85); }
-        #input.mode-otr               { border-color: rgba(120,80,160,0.4); background: rgba(120,80,160,0.04); }
-        #input.mode-agent             { border-color: rgba(80,140,200,0.4); }
+        #input.mode-otr { border-color: rgba(120,80,160,0.4); background: rgba(120,80,160,0.04); }

        /* Send column — right side, stacked */
        #send-col {
@@ -791,11 +987,14 @@
        #stop:hover { background: #5c1a1a; }

        #session-id {
-            font-size: 0.7rem;
+            font-size: 0.68rem;
            color: var(--border);
-            padding: 0 20px 6px;
-            background: var(--surface);
+            white-space: nowrap;
+            overflow: hidden;
+            text-overflow: ellipsis;
+            max-width: 220px;
        }
+        #session-id:empty { display: none; }

        /* ── Message wrappers (edit/delete controls) ──────────────── */
        .msg-wrapper {
@@ -1093,20 +1292,43 @@
            flex-direction: column;
        }

-        #file-editor {
+        /* CodeMirror markdown editor */
+        #file-editor-wrap {
            flex: 1;
-            width: 100%;
+            min-height: 0;
+            overflow: hidden;
+            display: flex;
+            flex-direction: column;
+        }
+        #file-editor-wrap.hidden { display: none; }
+
+        #file-editor-wrap .CodeMirror {
+            flex: 1;
+            height: 100%;
            background: var(--bg);
            color: var(--text);
-            border: none;
-            outline: none;
-            padding: 16px;
            font-family: 'Courier New', monospace;
            font-size: 0.85rem;
            line-height: 1.55;
-            resize: none;
-            display: block;
+            border: none;
        }
+        #file-editor-wrap .CodeMirror-scroll { padding: 12px 16px; }
+        #file-editor-wrap .CodeMirror-lines  { padding: 0; }
+        #file-editor-wrap .CodeMirror-cursor { border-left-color: var(--accent); }
+        #file-editor-wrap .CodeMirror-selectedtext { background: var(--border) !important; }
+        #file-editor-wrap .CodeMirror-selected { background: var(--border) !important; }
+        #file-editor-wrap .CodeMirror-focused .CodeMirror-selected { background: var(--border) !important; }
+
+        /* Markdown token colours */
+        #file-editor-wrap .cm-header   { color: var(--accent); font-weight: 600; }
+        #file-editor-wrap .cm-strong   { color: var(--text); font-weight: 700; }
+        #file-editor-wrap .cm-em       { color: var(--muted); font-style: italic; }
+        #file-editor-wrap .cm-link     { color: #a78bfa; }
+        #file-editor-wrap .cm-url      { color: var(--muted); }
+        #file-editor-wrap .cm-comment  { color: var(--muted); }
+        #file-editor-wrap .cm-quote    { color: var(--muted); font-style: italic; }
+        #file-editor-wrap .cm-code     { color: var(--muted); background: var(--surface); }
+        #file-editor-wrap .cm-hr       { color: var(--border); }

        #file-preview {
            flex: 1;
@@ -1116,8 +1338,45 @@
            line-height: 1.6;
        }

-        #file-preview.active { display: block; }
-        #file-editor.hidden  { display: none; }
+        #file-preview.active      { display: block; }
+        #file-editor-wrap.hidden  { display: none; }
+
+        /* ── Audit log table ────────────────────────────────────────── */
+        .audit-table {
+            width: 100%;
+            border-collapse: collapse;
+            font-size: 0.78rem;
+            table-layout: fixed;
+        }
+        .audit-table th {
+            text-align: left;
+            padding: 5px 8px;
+            border-bottom: 1px solid var(--border);
+            color: var(--muted);
+            font-weight: 600;
+            white-space: nowrap;
+        }
+        .audit-table td {
+            padding: 5px 8px;
+            border-bottom: 1px solid color-mix(in srgb, var(--border) 50%, transparent);
+            vertical-align: top;
+            overflow: hidden;
+            text-overflow: ellipsis;
+            white-space: nowrap;
+        }
+        .audit-table tr:last-child td { border-bottom: none; }
+        .audit-table tr:hover td { background: var(--surface); }
+        /* Column widths */
+        .at-time   { width: 7em;  color: var(--muted); white-space: nowrap; }
+        .at-tool   { width: 11em; color: var(--accent); font-weight: 500; }
+        .at-status { width: 4.5em; font-weight: 600; }
+        .at-model  { width: 10em; color: var(--muted); }
+        .at-args   { width: 25%; color: var(--muted); }
+        .at-result { color: var(--muted); }
+        .at-status.ok     { color: #4ade80; }
+        .at-status.error  { color: #f87171; }
+        .at-status.denied { color: #fbbf24; }
+        .audit-empty { padding: 24px; color: var(--muted); text-align: center; font-size: 0.9rem; }

        /* Talk activity badge on Sessions button */
        #sessions-btn.talk-badge::after {
@@ -1138,7 +1397,7 @@
            background: var(--surface);
            border: 1px solid var(--border);
            border-radius: 8px;
-            z-index: 100;
+            z-index: 200;
            box-shadow: 0 8px 24px var(--shadow);
            overflow: hidden;
        }
@@ -1178,9 +1437,12 @@
        .ctx-btn:hover    { color: var(--text); border-color: var(--muted); }
        .ctx-btn.active   { color: var(--accent); border-color: var(--accent); }
        .ctx-btn.mem-on   { color: var(--success); border-color: var(--success-dim); }
-        .ctx-btn.local-on { color: #f59e0b; border-color: #92400e; }
+        .ctx-btn.local-on   { color: var(--amber); border-color: var(--amber-border); }
+        .ctx-btn-danger     { color: #f87171 !important; border-color: #7f1d1d !important; }
+        .ctx-btn-danger:hover { border-color: #f87171 !important; }
+        .ctx-btn:disabled   { opacity: 0.4; cursor: not-allowed; pointer-events: none; }
        #backend-model-hint {
-            font-size: 0.68rem; color: #f59e0b; opacity: 0.8;
+            font-size: 0.68rem; color: var(--amber); opacity: 0.9;
            margin-top: 4px; word-break: break-all; line-height: 1.3;
        }

@@ -1372,52 +1634,6 @@
            padding-left: 12px;
        }

-        /* ── Auth warning banner ─────────────────────────────────── */
-        #auth-banner {
-            display: none;
-            align-items: center;
-            gap: 10px;
-            padding: 8px 20px;
-            background: rgba(160, 100, 0, 0.18);
-            border-bottom: 1px solid rgba(200, 140, 20, 0.45);
-            font-size: 0.82rem;
-            color: #c9a84c;
-        }
-
-        #auth-banner.show { display: flex; }
-        #auth-banner.expired {
-            background: rgba(120, 20, 20, 0.25);
-            border-color: rgba(200, 60, 60, 0.45);
-            color: var(--error-text);
-        }
-
-        #auth-banner-text { flex: 1; display: flex; flex-direction: column; gap: 2px; }
-        #auth-banner-msg  { font-weight: 500; }
-        #auth-banner-hint {
-            font-size: 0.76rem;
-            opacity: 0.8;
-        }
-        #auth-banner-hint code {
-            font-family: 'Courier New', monospace;
-            background: rgba(0,0,0,0.2);
-            border-radius: 3px;
-            padding: 0 4px;
-        }
-
-        #auth-banner-close {
-            background: none;
-            border: 1px solid currentColor;
-            border-radius: 4px;
-            color: inherit;
-            font-size: 0.7rem;
-            padding: 2px 7px;
-            cursor: pointer;
-            opacity: 0.7;
-            flex-shrink: 0;
-        }
-
-        #auth-banner-close:hover { opacity: 1; }
-
        /* ── Toasts ──────────────────────────────────────────────── */
        #toast-container {
            position: fixed;
@@ -1497,17 +1713,10 @@
                font-size: 16px; /* prevent iOS Safari auto-zoom */
            }

-            /* Mode select: row layout (btn left, note-vis right) */
-            #mode-select {
-                flex-direction: row;
-                flex: 1;
-                align-items: center;
-            }
+            /* Mode select: grows to fill left side of bottom row; back to row on mobile */
+            #mode-select { flex: 1; flex-direction: row; align-items: center; }
            #mode-select-btn { flex: 1; justify-content: center; }

-            /* Note vis button sits to the right of the mode btn on mobile */
-            #note-vis-btn { margin-top: 0; }
-
            /* Dropdown still opens upward on mobile */
            #mode-dropdown { min-width: 140px; }

@@ -1536,7 +1745,7 @@
                top: 0;
                right: 0;
                bottom: 0;
-                width: min(300px, 85vw);
+                width: min(380px, 90vw);
                max-height: none;
                height: 100%;
                border-radius: 0;
@@ -1572,6 +1781,9 @@
                min-width: 36px;
                min-height: 36px;
            }
+
+            /* On touch: edit button always fully visible (no hover to reveal it) */
+            .session-edit-btn { opacity: 0.6; }
        }

        @media (max-width: 380px) {
@@ -1579,3 +1791,4 @@
            .header-emoji { font-size: 1.3rem; }
            .hdr-btn { padding: 5px 8px; }
        }
+
--- a/cortex/static/sw.js
+++ b/cortex/static/sw.js
@@ -0,0 +1,106 @@
+const CACHE = 'cortex-v2';
+
+const PRECACHE = [
+    '/static/style.css',
+    '/static/app.js',
+    '/static/marked.min.js',
+    '/static/icon-192.png',
+    '/static/icon-512.png',
+    '/static/icon.svg',
+    '/static/manifest.json',
+];
+
+self.addEventListener('install', evt => {
+    evt.waitUntil(
+        caches.open(CACHE)
+            .then(c => c.addAll(PRECACHE))
+            .then(() => self.skipWaiting())
+    );
+});
+
+self.addEventListener('activate', evt => {
+    evt.waitUntil(
+        caches.keys()
+            .then(keys => Promise.all(
+                keys.filter(k => k !== CACHE).map(k => caches.delete(k))
+            ))
+            .then(() => self.clients.claim())
+    );
+});
+
+self.addEventListener('push', evt => {
+    let data = { title: 'Cortex', body: '', url: '/' };
+    if (evt.data) {
+        try { data = { ...data, ...evt.data.json() }; } catch (_) {}
+    }
+    evt.waitUntil(
+        self.registration.showNotification(data.title, {
+            body: data.body,
+            icon: '/static/icon-192.png',
+            badge: '/static/icon-192.png',
+            data: { url: data.url },
+        })
+    );
+});
+
+self.addEventListener('notificationclick', evt => {
+    evt.notification.close();
+    const url = evt.notification.data?.url || '/';
+    evt.waitUntil(
+        clients.matchAll({ type: 'window', includeUncontrolled: true }).then(list => {
+            for (const c of list) {
+                if (c.url.includes(self.location.origin) && 'focus' in c) {
+                    c.navigate(url);
+                    return c.focus();
+                }
+            }
+            if (clients.openWindow) return clients.openWindow(url);
+        })
+    );
+});
+
+self.addEventListener('fetch', evt => {
+    const url = new URL(evt.request.url);
+
+    // Only handle same-origin GETs
+    if (evt.request.method !== 'GET' || url.origin !== self.location.origin) return;
+
+    // Never intercept streaming or API calls
+    if (
+        url.pathname.startsWith('/chat') ||
+        url.pathname.startsWith('/orchestrate') ||
+        url.pathname.startsWith('/api/') ||
+        url.pathname.startsWith('/distill') ||
+        url.pathname.startsWith('/webhook') ||
+        url.pathname.startsWith('/auth/')
+    ) return;
+
+    // Static assets — cache first, refresh in background (stale-while-revalidate)
+    if (url.pathname.startsWith('/static/')) {
+        evt.respondWith(
+            caches.open(CACHE).then(cache =>
+                cache.match(evt.request).then(cached => {
+                    const network = fetch(evt.request).then(resp => {
+                        if (resp.ok) cache.put(evt.request, resp.clone());
+                        return resp;
+                    });
+                    return cached || network;
+                })
+            )
+        );
+        return;
+    }
+
+    // HTML pages — network first, cached shell fallback
+    evt.respondWith(
+        fetch(evt.request)
+            .then(resp => {
+                if (resp.ok) {
+                    const clone = resp.clone();
+                    caches.open(CACHE).then(c => c.put(evt.request, clone));
+                }
+                return resp;
+            })
+            .catch(() => caches.match(evt.request))
+    );
+});
--- a/cortex/static/tools_settings.html
+++ b/cortex/static/tools_settings.html
@@ -0,0 +1,213 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+  <title>Tool Settings — Cortex</title>
+  <link rel="preconnect" href="https://fonts.googleapis.com">
+  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@100..900&display=swap" rel="stylesheet">
+  <script src="https://cdn.tailwindcss.com"></script>
+  <script>
+  tailwind.config = {
+    corePlugins: { preflight: false },
+    darkMode: ['selector', '[data-theme="dark"]'],
+    theme: {
+      extend: {
+        colors: {
+          pg: {
+            bg:      'var(--pg-bg)',
+            surface: 'var(--pg-surface)',
+            border:  'var(--pg-border)',
+            text:    'var(--pg-text)',
+            muted:   'var(--pg-muted)',
+            dim:     'var(--pg-dim)',
+            dimmer:  'var(--pg-dimmer)',
+            bright:  'var(--pg-bright)',
+            accent:  'var(--pg-accent)',
+            action:  'var(--pg-action)',
+          }
+        },
+        fontFamily: { sans: ['Inter', 'system-ui', 'sans-serif'] }
+      }
+    }
+  }
+  </script>
+  <link rel="stylesheet" href="/static/pg.css">
+  <script>(function(){var t=localStorage.getItem('theme')||(window.matchMedia('(prefers-color-scheme: dark)').matches?'dark':'light');document.documentElement.setAttribute('data-theme',t);})();</script>
+  <style>
+    /* ── Server-generated tool table ── */
+    .table-section-label {
+      font-size: 0.7rem; font-weight: 700; letter-spacing: 0.08em;
+      text-transform: uppercase; color: var(--pg-dimmer);
+      margin: 1.75rem 0 0.6rem;
+    }
+    .tool-table {
+      width: 100%; border-collapse: collapse;
+      background: var(--pg-surface); border: 1px solid var(--pg-border);
+      border-radius: 0.75rem; overflow: hidden; margin-bottom: 0.5rem;
+      font-size: 0.85rem;
+    }
+    .tool-table th {
+      text-align: left; padding: 0.55rem 0.9rem;
+      border-bottom: 1px solid var(--pg-border);
+      color: var(--pg-muted); font-weight: 600; font-size: 0.78rem;
+      text-transform: uppercase; letter-spacing: 0.04em;
+    }
+    .tool-table td { padding: 0.5rem 0.9rem; border-bottom: 1px solid var(--pg-border); vertical-align: middle; }
+    .tool-table tr:last-child td { border-bottom: none; }
+    .tool-table tr:hover td { background: rgba(124,58,237,0.04); }
+    .tool-name { font-family: monospace; font-size: 0.82rem; }
+
+    /* Risk badges (server-generated) */
+    .risk { display: inline-block; font-size: 0.7rem; font-weight: 700;
+            padding: 0.15rem 0.45rem; border-radius: 9999px; letter-spacing: 0.04em; }
+    .risk-low    { background: rgba(34,197,94,0.12);  color: #4ade80; }
+    .risk-medium { background: rgba(234,179,8,0.12);  color: #fbbf24; }
+    .risk-high   { background: rgba(239,68,68,0.12);  color: #f87171; }
+    [data-theme="light"] .risk-low    { background: rgba(34,197,94,0.15);  color: #16a34a; }
+    [data-theme="light"] .risk-medium { background: rgba(234,179,8,0.15);  color: #ca8a04; }
+    [data-theme="light"] .risk-high   { background: rgba(239,68,68,0.15);  color: #dc2626; }
+
+    /* Auto-status pill (server-generated, updated by JS) */
+    .auto-pill {
+      display: inline-block; font-size: 0.68rem; font-weight: 600;
+      padding: 0.12rem 0.4rem; border-radius: 9999px;
+    }
+    .auto-on  { background: rgba(124,58,237,0.12); color: #a78bfa; }
+    .auto-off { background: rgba(148,163,184,0.12); color: var(--pg-dimmer); }
+    [data-theme="light"] .auto-on { color: #7c3aed; }
+
+    /* Override select (server-generated) */
+    .override-sel {
+      font-size: 0.78rem; padding: 0.25rem 0.5rem;
+      border-radius: 0.3rem; min-width: 7rem; width: auto;
+    }
+    .override-sel.forced-on  { border-color: #7c3aed; color: #7c3aed; }
+    .override-sel.forced-off { border-color: #dc2626; color: #dc2626; }
+  </style>
+</head>
+<body>
+
+<nav class="page-nav">
+  <a href="{{ back_href }}" class="nav-link">← Chat</a>
+  <a href="{{ help_href }}" class="nav-link">Help</a>
+  <a href="/settings" class="nav-link">Settings</a>
+  <a href="/settings/models" class="nav-link">Models</a>
+  <a href="/settings/notifications" class="nav-link">Notifications</a>
+  <a href="/settings/tools" class="nav-link active">Tools</a>
+  <a href="/settings/crons" class="nav-link">Schedules</a>
+  {{ integrations_nav }}
+  <span class="nav-spacer"></span>
+  <a href="/logout" class="nav-link nav-logout">Sign out</a>
+</nav>
+
+<div class="page-wrap">
+  <h1 class="page-title">Tool Settings</h1>
+  <p class="page-subtitle">
+    Control which orchestrator tools are available. The risk level sets an automatic threshold;
+    whitelist and blacklist let you fine-tune individual tools beyond that.
+  </p>
+
+  <!-- SUCCESS -->
+  <!-- ERROR -->
+
+  <form method="POST" action="/settings/tools" id="tools-form">
+
+    <!-- Risk policy card -->
+    <div class="rounded-xl border border-pg-border bg-pg-surface p-5 mb-5">
+      <h2 class="text-sm font-semibold text-pg-bright mb-4">Risk Policy</h2>
+      <div class="flex items-center gap-4 flex-wrap mb-3">
+        <span class="text-sm font-medium text-pg-text min-w-[6rem]">Max risk level</span>
+        <select name="max_risk" id="max-risk-sel" class="w-auto">
+          <option value=""       {{ sel_none   }}>No filter — use all role-permitted tools</option>
+          <option value="low"    {{ sel_low    }}>Low — read-only and sandboxed tools only</option>
+          <option value="medium" {{ sel_medium }}>Medium — low + medium risk (recommended)</option>
+          <option value="high"   {{ sel_high   }}>High — all tools including destructive ones</option>
+        </select>
+      </div>
+      <p class="text-xs text-pg-muted leading-relaxed mb-2">
+        <strong class="text-pg-text">Low</strong> tools are read-only and sandboxed (web search, project file reads, HA status checks).<br>
+        <strong class="text-pg-text">Medium</strong> tools write to local data or send notifications to you (cron jobs, scratch, task management).<br>
+        <strong class="text-pg-text">High</strong> tools affect external systems or the host (shell exec, email, device control, service restart).
+      </p>
+      <p class="text-xs text-pg-muted leading-relaxed">
+        The <em>Auto</em> column below shows each tool's status at your current max risk level.
+        Use the override column to force-include or force-exclude individual tools.
+      </p>
+    </div>
+
+    <!-- Legend -->
+    <div class="flex gap-5 flex-wrap mb-4 text-xs text-pg-muted">
+      <span><span class="inline-block w-2 h-2 rounded-full bg-[#a78bfa] mr-1.5"></span>Auto-included by risk level</span>
+      <span><span class="inline-block w-2 h-2 rounded-full bg-pg-dimmer mr-1.5"></span>Auto-excluded by risk level</span>
+    </div>
+
+    <!-- Tool table (server-generated) -->
+{{ tool_table_html }}
+
+    <!-- Confirmation gate card -->
+    <div class="rounded-xl border border-pg-border bg-pg-surface p-5 mt-5 mb-5">
+      <h2 class="text-sm font-semibold text-pg-bright mb-2">Confirmation Gate</h2>
+      <p class="text-xs text-pg-muted leading-relaxed mb-4">
+        Some tools require explicit confirmation before executing. Override the defaults here.<br>
+        Tools requiring confirmation by default: <code class="font-mono text-pg-accent bg-pg-bg border border-pg-border rounded px-1">{{ confirm_required_tools }}</code>
+      </p>
+      <div class="flex gap-6 flex-wrap items-start">
+        <div class="flex-1 min-w-[200px]">
+          <label class="block text-xs font-semibold text-pg-muted mb-1">Allow list — bypass confirmation</label>
+          <textarea name="allow_list" rows="4"
+                    placeholder="reminders_clear&#10;cron_remove"
+                    autocomplete="off" spellcheck="false">{{ tool_allow }}</textarea>
+          <p class="hint">One tool name per line. These tools skip the confirmation prompt.</p>
+        </div>
+        <div class="flex-1 min-w-[200px]">
+          <label class="block text-xs font-semibold text-pg-muted mb-1">Deny list — always block</label>
+          <textarea name="deny_list" rows="4"
+                    placeholder="shell_exec&#10;file_write"
+                    autocomplete="off" spellcheck="false">{{ tool_deny }}</textarea>
+          <p class="hint">These tools are always blocked regardless of risk policy.</p>
+        </div>
+      </div>
+    </div>
+
+    <div class="mt-4">
+      <button type="submit" class="btn-submit w-full md:w-96">Save tool settings</button>
+    </div>
+  </form>
+</div>
+
+<script>
+  const riskRank = { "": 99, "low": 0, "medium": 1, "high": 2 };
+  const toolRisk = {{ tool_risk_json }};
+
+  const sel = document.getElementById('max-risk-sel');
+
+  function updateAutoPills() {
+    const maxRank = riskRank[sel.value] ?? 99;
+    document.querySelectorAll('[data-tool-risk]').forEach(row => {
+      const risk = row.dataset.toolRisk;
+      const pill = row.querySelector('.auto-pill');
+      const isAuto = riskRank[risk] <= maxRank;
+      pill.textContent = isAuto ? 'auto ✓' : 'excluded';
+      pill.className   = 'auto-pill ' + (isAuto ? 'auto-on' : 'auto-off');
+    });
+  }
+
+  sel.addEventListener('change', updateAutoPills);
+  updateAutoPills();
+
+  // Color the override selects
+  document.querySelectorAll('.override-sel').forEach(s => {
+    function refresh() {
+      s.className = 'override-sel';
+      if (s.value === 'whitelist') s.classList.add('forced-on');
+      if (s.value === 'blacklist') s.classList.add('forced-off');
+    }
+    s.addEventListener('change', refresh);
+    refresh();
+  });
+</script>
+
+</body>
+</html>
--- a/cortex/tests/test_agent_manager.py
+++ b/cortex/tests/test_agent_manager.py
@@ -0,0 +1,876 @@
+"""
+Tests for agent_manager.py and the spawn_agent / aider_run background paths.
+
+Run with:
+    cd cortex && .venv/bin/python -m pytest tests/test_agent_manager.py -v
+
+No browser, no LLM calls, no Cortex service needed. All LLM interactions are mocked.
+The agent_manager tests need no mocks at all — the module is pure asyncio.
+"""
+
+import asyncio
+import pytest
+import pytest_asyncio
+from datetime import datetime, timedelta
+from unittest.mock import AsyncMock, MagicMock, patch
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _make_mock_result(response: str = "Agent done."):
+    """Build a mock OrchestratorResult returned by openai_orchestrator.run."""
+    r = MagicMock()
+    r.checkpoint = None
+    r.response = response
+    return r
+
+
+def _mock_spawn_deps(
+    model_type: str = "local_openai",
+    user_role: str = "admin",
+    tool_policy: dict | None = None,
+    role_tools: list | None = None,
+):
+    """Return a context-manager stack that patches all spawn_agent external deps."""
+    if tool_policy is None:
+        tool_policy = {"allow": [], "deny": []}
+    model_cfg = {
+        "type": model_type,
+        "api_url": "http://localhost:3000",
+        "model_name": "test-model",
+        "api_key": "x",
+    }
+    role_cfg = {
+        "tools": role_tools,
+        "system_append": "",
+        "inject_datetime": True,
+        "inject_mode": True,
+    }
+
+    class _Stack:
+        def __enter__(self_):
+            self_._patches = [
+                patch("model_registry.get_role_config", return_value=role_cfg),
+                patch("model_registry.get_model_for_role", return_value=model_cfg),
+                patch("model_registry.get_registry", return_value={"hosts": []}),
+                patch("context_loader.load_context", return_value="Test system prompt"),
+                patch("auth_utils.get_user_role", return_value=user_role),
+                patch("auth_utils.get_tool_policy", return_value=tool_policy),
+                patch("persona.get_user", return_value="scott"),
+            ]
+            for p in self_._patches:
+                p.start()
+            return self_
+
+        def __exit__(self_, *args):
+            for p in self_._patches:
+                p.stop()
+
+    return _Stack()
+
+
+# ---------------------------------------------------------------------------
+# Fixture — reset agent_manager state between tests
+# ---------------------------------------------------------------------------
+
+@pytest.fixture(autouse=True)
+def clear_agent_registry():
+    """Wipe the in-process agent registry before each test."""
+    import agent_manager
+    agent_manager._agents.clear()
+    yield
+    agent_manager._agents.clear()
+
+
+# ---------------------------------------------------------------------------
+# agent_manager — core CRUD
+# ---------------------------------------------------------------------------
+
+class TestAgentManagerCore:
+
+    @pytest.mark.asyncio
+    async def test_register_creates_record(self):
+        import agent_manager
+        rec = await agent_manager.register(
+            user="scott", role="research", task="Investigate topic X", level=2
+        )
+        assert rec.agent_id in agent_manager._agents
+        assert rec.status == "running"
+        assert rec.level == 2
+        assert rec.role == "research"
+        assert rec.task == "Investigate topic X"
+        assert rec.user == "scott"
+        assert rec.finished is None
+
+    @pytest.mark.asyncio
+    async def test_register_truncates_long_task(self):
+        import agent_manager
+        long_task = "x" * 500
+        rec = await agent_manager.register(user="scott", role="chat", task=long_task, level=2)
+        assert len(rec.task) == 200
+
+    @pytest.mark.asyncio
+    async def test_finish_updates_record(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        await agent_manager.finish(rec.agent_id, "All done!", "done")
+
+        updated = agent_manager.get(rec.agent_id)
+        assert updated.status == "done"
+        assert updated.result == "All done!"
+        assert updated.finished is not None
+
+    @pytest.mark.asyncio
+    async def test_finish_truncates_result(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        await agent_manager.finish(rec.agent_id, "y" * 2000)
+
+        updated = agent_manager.get(rec.agent_id)
+        assert len(updated.result) <= agent_manager._RESULT_PREVIEW_CHARS
+
+    @pytest.mark.asyncio
+    async def test_finish_failed_status(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        await agent_manager.finish(rec.agent_id, "Boom", "failed")
+        assert agent_manager.get(rec.agent_id).status == "failed"
+
+    @pytest.mark.asyncio
+    async def test_cancel_own_agent(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        msg = await agent_manager.cancel_agent(rec.agent_id, "scott")
+        assert "cancelled" in msg
+        assert agent_manager.get(rec.agent_id).status == "cancelled"
+
+    @pytest.mark.asyncio
+    async def test_cancel_wrong_user_denied(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        msg = await agent_manager.cancel_agent(rec.agent_id, "holly")
+        assert "denied" in msg.lower()
+        assert agent_manager.get(rec.agent_id).status == "running"
+
+    @pytest.mark.asyncio
+    async def test_cancel_nonexistent_agent(self):
+        import agent_manager
+        msg = await agent_manager.cancel_agent("does-not-exist", "scott")
+        assert "No agent found" in msg
+
+    @pytest.mark.asyncio
+    async def test_cancel_already_done(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        await agent_manager.finish(rec.agent_id, "done", "done")
+        msg = await agent_manager.cancel_agent(rec.agent_id, "scott")
+        assert "already" in msg or "done" in msg
+
+    @pytest.mark.asyncio
+    async def test_cancel_kills_real_task(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+
+        sleep_task = asyncio.create_task(asyncio.sleep(60))
+        agent_manager.set_task_ref(rec.agent_id, sleep_task)
+
+        await agent_manager.cancel_agent(rec.agent_id, "scott")
+        await asyncio.sleep(0)  # let the event loop process the cancellation
+
+        assert sleep_task.cancelled() or sleep_task.done()
+
+    def test_list_agents_returns_users_agents(self):
+        import agent_manager
+        # Manually populate the registry
+        agent_manager._agents["a1"] = _make_record("a1", "scott", "running")
+        agent_manager._agents["a2"] = _make_record("a2", "scott", "done")
+        agent_manager._agents["a3"] = _make_record("a3", "holly", "running")
+
+        records = agent_manager.list_agents("scott")
+        ids = {r.agent_id for r in records}
+        assert "a1" in ids
+        assert "a2" in ids
+        assert "a3" not in ids
+
+    def test_list_agents_filters_by_status(self):
+        import agent_manager
+        agent_manager._agents["a1"] = _make_record("a1", "scott", "running")
+        agent_manager._agents["a2"] = _make_record("a2", "scott", "done")
+
+        running = agent_manager.list_agents("scott", status="running")
+        assert len(running) == 1
+        assert running[0].agent_id == "a1"
+
+    def test_list_agents_respects_limit(self):
+        import agent_manager
+        for i in range(20):
+            agent_manager._agents[f"a{i}"] = _make_record(f"a{i}", "scott", "done")
+
+        records = agent_manager.list_agents("scott", limit=5)
+        assert len(records) == 5
+
+    @pytest.mark.asyncio
+    async def test_prune_removes_old_completed(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        await agent_manager.finish(rec.agent_id, "done")
+
+        # Manually backdate the finished time past the prune threshold
+        agent_manager._agents[rec.agent_id].finished = (
+            datetime.now() - agent_manager._PRUNE_AFTER - timedelta(seconds=1)
+        )
+
+        # Trigger pruning via a new registration
+        await agent_manager.register(user="scott", role="chat", task="t2", level=2)
+
+        assert agent_manager.get(rec.agent_id) is None
+
+    @pytest.mark.asyncio
+    async def test_prune_keeps_running_agents(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        # Running agent — finished is None so it should never be pruned
+        assert rec.agent_id in agent_manager._agents
+
+        await agent_manager.register(user="scott", role="chat", task="t2", level=2)
+        assert agent_manager.get(rec.agent_id) is not None
+
+    @pytest.mark.asyncio
+    async def test_finish_unknown_agent_is_noop(self):
+        import agent_manager
+        # Should not raise
+        await agent_manager.finish("ghost-id", "result", "done")
+
+
+# ---------------------------------------------------------------------------
+# agent_manager — notification hook
+# ---------------------------------------------------------------------------
+
+class TestAgentManagerNotify:
+
+    @pytest.mark.asyncio
+    async def test_notify_called_on_done(self):
+        import agent_manager
+        rec = await agent_manager.register(
+            user="scott", role="chat", task="t", level=2, notify=True
+        )
+        with patch("notification.notify", new_callable=AsyncMock) as mock_notify:
+            await agent_manager.finish(rec.agent_id, "All good", "done")
+            mock_notify.assert_called_once()
+            call_args = mock_notify.call_args
+            assert call_args[0][0] == "scott"   # user
+            assert "✅" in call_args[0][1]       # success emoji
+
+    @pytest.mark.asyncio
+    async def test_notify_called_on_failed(self):
+        import agent_manager
+        rec = await agent_manager.register(
+            user="scott", role="chat", task="t", level=2, notify=True
+        )
+        with patch("notification.notify", new_callable=AsyncMock) as mock_notify:
+            await agent_manager.finish(rec.agent_id, "Oops", "failed")
+            mock_notify.assert_called_once()
+            assert "⚠️" in mock_notify.call_args[0][1]
+
+    @pytest.mark.asyncio
+    async def test_no_notify_when_cancelled(self):
+        import agent_manager
+        rec = await agent_manager.register(
+            user="scott", role="chat", task="t", level=2, notify=True
+        )
+        with patch("notification.notify", new_callable=AsyncMock) as mock_notify:
+            await agent_manager.finish(rec.agent_id, "Cancelled.", "cancelled")
+            mock_notify.assert_not_called()
+
+    @pytest.mark.asyncio
+    async def test_no_notify_when_flag_false(self):
+        import agent_manager
+        rec = await agent_manager.register(
+            user="scott", role="chat", task="t", level=2, notify=False
+        )
+        with patch("notification.notify", new_callable=AsyncMock) as mock_notify:
+            await agent_manager.finish(rec.agent_id, "Done", "done")
+            mock_notify.assert_not_called()
+
+
+# ---------------------------------------------------------------------------
+# spawn_agent — background mode
+# ---------------------------------------------------------------------------
+
+class TestSpawnAgentBackground:
+
+    @pytest.mark.asyncio
+    async def test_background_returns_agent_id_immediately(self):
+        import agent_manager
+        from tools.agents import spawn_agent
+
+        mock_result = _make_mock_result("Research complete.")
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", new_callable=AsyncMock, return_value=mock_result):
+                result = await spawn_agent(
+                    task="Test background research",
+                    role="research",
+                    background=True,
+                )
+
+        assert "Agent started in background" in result
+        assert "ID:" in result
+
+    @pytest.mark.asyncio
+    async def test_background_registers_agent(self):
+        import agent_manager
+        from tools.agents import spawn_agent
+
+        mock_result = _make_mock_result()
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", new_callable=AsyncMock, return_value=mock_result):
+                await spawn_agent(task="Background task", background=True)
+
+        agents = agent_manager.list_agents("scott")
+        assert len(agents) >= 1
+
+    @pytest.mark.asyncio
+    async def test_background_agent_eventually_completes(self):
+        import agent_manager
+        from tools.agents import spawn_agent
+
+        mock_result = _make_mock_result("Task done!")
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", new_callable=AsyncMock, return_value=mock_result):
+                result = await spawn_agent(task="Quick task", background=True)
+                agent_id = result.split("ID: ")[1].split("\n")[0].strip()
+
+                # Poll while patches are still active
+                for _ in range(40):
+                    rec = agent_manager.get(agent_id)
+                    if rec and rec.status != "running":
+                        break
+                    await asyncio.sleep(0.05)
+
+        rec = agent_manager.get(agent_id)
+        assert rec is not None
+        assert rec.status == "done"
+        assert "Task done!" in (rec.result or "")
+
+    @pytest.mark.asyncio
+    async def test_background_sync_path_unchanged(self):
+        """Verify that background=False still blocks and returns the result string."""
+        from tools.agents import spawn_agent
+
+        mock_result = _make_mock_result("Sync result here.")
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", new_callable=AsyncMock, return_value=mock_result):
+                result = await spawn_agent(task="Sync task", background=False)
+
+        assert result == "Sync result here."
+
+    @pytest.mark.asyncio
+    async def test_background_agent_timeout(self):
+        import agent_manager
+        from tools.agents import spawn_agent
+
+        async def _slow(*args, **kwargs):
+            await asyncio.sleep(60)
+            return _make_mock_result()
+
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", side_effect=_slow):
+                result = await spawn_agent(task="Slow task", background=True, timeout=1)
+                agent_id = result.split("ID: ")[1].split("\n")[0].strip()
+
+                # Poll while patches are still active (timeout=1s so this completes quickly)
+                for _ in range(60):
+                    rec = agent_manager.get(agent_id)
+                    if rec and rec.status != "running":
+                        break
+                    await asyncio.sleep(0.05)
+
+        rec = agent_manager.get(agent_id)
+        assert rec.status == "timeout"
+
+    @pytest.mark.asyncio
+    async def test_background_agent_failure(self):
+        import agent_manager
+        from tools.agents import spawn_agent
+
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", new_callable=AsyncMock, side_effect=RuntimeError("Boom")):
+                result = await spawn_agent(task="Failing task", background=True)
+
+        agent_id = result.split("ID: ")[1].split("\n")[0].strip()
+
+        for _ in range(20):
+            rec = agent_manager.get(agent_id)
+            if rec and rec.status != "running":
+                break
+            await asyncio.sleep(0.05)
+
+        assert agent_manager.get(agent_id).status == "failed"
+
+
+# ---------------------------------------------------------------------------
+# spawn_agent — level enforcement
+# ---------------------------------------------------------------------------
+
+class TestLevelEnforcement:
+
+    @pytest.mark.asyncio
+    async def test_l2_parent_denies_spawn_in_l3_child(self):
+        """Level 2 agent spawning a child: spawn_agent and aider_run must be denied."""
+        from tools.agents import spawn_agent
+
+        captured_kwargs = {}
+
+        async def _capture_run(**kwargs):
+            captured_kwargs.update(kwargs)
+            return _make_mock_result()
+
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", side_effect=_capture_run):
+                await spawn_agent(
+                    task="Test L3 enforcement",
+                    background=False,
+                    _agent_level=2,   # this agent is Level 2; its child would be Level 3
+                )
+
+        # The orchestrator should have received spawn_agent and aider_run in confirm_deny
+        confirm_deny = captured_kwargs.get("confirm_deny", set())
+        assert "spawn_agent" in confirm_deny, "spawn_agent must be blocked for L3 children"
+        assert "aider_run" in confirm_deny, "aider_run must be blocked for L3 children"
+
+    @pytest.mark.asyncio
+    async def test_l1_parent_does_not_deny_spawn(self):
+        """Level 1 agent (persona) spawning a Level 2 child: no extra denies."""
+        from tools.agents import spawn_agent
+
+        captured_kwargs = {}
+
+        async def _capture_run(**kwargs):
+            captured_kwargs.update(kwargs)
+            return _make_mock_result()
+
+        with _mock_spawn_deps():
+            with patch("openai_orchestrator.run", side_effect=_capture_run):
+                await spawn_agent(
+                    task="Test L2 spawn",
+                    background=False,
+                    _agent_level=1,   # persona is Level 1; child would be Level 2
+                )
+
+        confirm_deny = captured_kwargs.get("confirm_deny", set())
+        assert "spawn_agent" not in confirm_deny, "L2 agents must be allowed to spawn"
+
+    @pytest.mark.asyncio
+    async def test_l2_deny_intersected_with_tool_list(self):
+        """When the role has an explicit tool_list, L3 deny removes from list directly."""
+        from tools.agents import spawn_agent
+
+        captured_kwargs = {}
+
+        async def _capture_run(**kwargs):
+            captured_kwargs.update(kwargs)
+            return _make_mock_result()
+
+        # Role has an explicit tool_list that includes spawn_agent
+        with _mock_spawn_deps(role_tools=["web_search", "spawn_agent", "aider_run"]):
+            with patch("openai_orchestrator.run", side_effect=_capture_run):
+                await spawn_agent(
+                    task="Test",
+                    background=False,
+                    _agent_level=2,
+                )
+
+        # spawn_agent and aider_run must be absent from the tool_list passed to orchestrator
+        tool_list = captured_kwargs.get("tool_list", [])
+        assert "spawn_agent" not in tool_list
+        assert "aider_run" not in tool_list
+        assert "web_search" in tool_list   # unrelated tools must survive
+
+
+# ---------------------------------------------------------------------------
+# Agent lifecycle tools — output formatting
+# ---------------------------------------------------------------------------
+
+class TestAgentLifecycleTools:
+
+    @pytest.mark.asyncio
+    async def test_agent_status_running(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="research", task="Do research", level=2)
+
+        with patch("persona.get_user", return_value="scott"):
+            from tools.agents import agent_status
+            output = await agent_status(rec.agent_id)
+
+        assert "running" in output
+        assert "research" in output
+        assert rec.agent_id[:8] in output
+
+    @pytest.mark.asyncio
+    async def test_agent_status_done(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="Task", level=2)
+        await agent_manager.finish(rec.agent_id, "The result text", "done")
+
+        with patch("persona.get_user", return_value="scott"):
+            from tools.agents import agent_status
+            output = await agent_status(rec.agent_id)
+
+        assert "done" in output
+        assert "The result text" in output
+
+    @pytest.mark.asyncio
+    async def test_agent_status_wrong_user(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+
+        with patch("persona.get_user", return_value="holly"):
+            from tools.agents import agent_status
+            output = await agent_status(rec.agent_id)
+
+        assert "denied" in output.lower()
+
+    @pytest.mark.asyncio
+    async def test_agent_status_not_found(self):
+        with patch("persona.get_user", return_value="scott"):
+            from tools.agents import agent_status
+            output = await agent_status("nonexistent-id")
+
+        assert "No agent found" in output
+
+    @pytest.mark.asyncio
+    async def test_agent_list_shows_running(self):
+        import agent_manager
+        await agent_manager.register(user="scott", role="research", task="Research X", level=2)
+        await agent_manager.register(user="scott", role="coder", task="Fix bug", level=2)
+
+        with patch("persona.get_user", return_value="scott"):
+            from tools.agents import agent_list
+            output = await agent_list()
+
+        assert "2 agent(s)" in output
+        assert "research" in output
+        assert "coder" in output
+
+    @pytest.mark.asyncio
+    async def test_agent_list_status_filter(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+        await agent_manager.finish(rec.agent_id, "done", "done")
+        await agent_manager.register(user="scott", role="chat", task="t2", level=2)
+
+        with patch("persona.get_user", return_value="scott"):
+            from tools.agents import agent_list
+            output = await agent_list(status="running")
+
+        assert "1 agent(s)" in output
+
+    @pytest.mark.asyncio
+    async def test_agent_list_empty(self):
+        with patch("persona.get_user", return_value="scott"):
+            from tools.agents import agent_list
+            output = await agent_list()
+
+        assert "No agents found" in output
+
+    @pytest.mark.asyncio
+    async def test_agent_cancel_tool(self):
+        import agent_manager
+        rec = await agent_manager.register(user="scott", role="chat", task="t", level=2)
+
+        with patch("persona.get_user", return_value="scott"):
+            from tools.agents import agent_cancel
+            output = await agent_cancel(rec.agent_id)
+
+        assert "cancelled" in output
+        assert agent_manager.get(rec.agent_id).status == "cancelled"
+
+
+# ---------------------------------------------------------------------------
+# aider_run — background mode
+# ---------------------------------------------------------------------------
+
+class TestAiderRunBackground:
+
+    @pytest.mark.asyncio
+    async def test_background_returns_agent_id(self):
+        import agent_manager
+
+        async def _fake_proc(*args, **kwargs):
+            mock_proc = MagicMock()
+            mock_proc.communicate = AsyncMock(return_value=(b"All changes applied.", b""))
+            mock_proc.returncode = 0
+            return mock_proc
+
+        with (
+            patch("persona.get_user", return_value="scott"),
+            patch("model_registry.get_registry", return_value={"hosts": []}),
+            patch("asyncio.create_subprocess_exec", side_effect=_fake_proc),
+        ):
+            from tools.aider import aider_run
+            result = await aider_run(
+                project=str(_CORTEX_DIR.parent),  # use actual project root (exists)
+                task="Test background task",
+                background=True,
+            )
+
+        assert "Aider task started in background" in result
+        assert "ID:" in result
+
+    @pytest.mark.asyncio
+    async def test_background_agent_completes(self):
+        import agent_manager
+
+        async def _fake_proc(*args, **kwargs):
+            mock_proc = MagicMock()
+            mock_proc.communicate = AsyncMock(return_value=(b"Edits applied.", b""))
+            mock_proc.returncode = 0
+            return mock_proc
+
+        from tools.aider import aider_run
+        with (
+            patch("persona.get_user", return_value="scott"),
+            patch("model_registry.get_registry", return_value={"hosts": []}),
+            patch("asyncio.create_subprocess_exec", side_effect=_fake_proc),
+        ):
+            result = await aider_run(
+                project=str(_CORTEX_DIR.parent),
+                task="Test",
+                background=True,
+            )
+            agent_id = result.split("ID: ")[1].split("\n")[0].strip()
+
+            # Poll while patches are still active
+            for _ in range(40):
+                rec = agent_manager.get(agent_id)
+                if rec and rec.status != "running":
+                    break
+                await asyncio.sleep(0.05)
+
+        rec = agent_manager.get(agent_id)
+        assert rec.status == "done"
+        assert "Edits applied" in (rec.result or "")
+
+    @pytest.mark.asyncio
+    async def test_invalid_project_directory(self):
+        from tools.aider import aider_run
+        result = await aider_run(project="/this/does/not/exist", task="Test")
+        assert "does not exist" in result
+
+    @pytest.mark.asyncio
+    async def test_sync_path_still_works(self):
+        async def _fake_proc(*args, **kwargs):
+            mock_proc = MagicMock()
+            mock_proc.communicate = AsyncMock(return_value=(b"Done.", b""))
+            mock_proc.returncode = 0
+            return mock_proc
+
+        with (
+            patch("persona.get_user", return_value="scott"),
+            patch("model_registry.get_registry", return_value={"hosts": []}),
+            patch("asyncio.create_subprocess_exec", side_effect=_fake_proc),
+        ):
+            from tools.aider import aider_run
+            result = await aider_run(
+                project=str(_CORTEX_DIR.parent),
+                task="Sync test",
+                background=False,
+            )
+
+        assert "Done." in result
+
+
+# ---------------------------------------------------------------------------
+# aider_run — credential resolver (_resolve_credentials)
+# ---------------------------------------------------------------------------
+
+class TestAiderCredentialResolver:
+    """Pure unit tests for _resolve_credentials — no subprocess, no registry I/O."""
+
+    def _registry(self, hosts=None, anthropic_key=None):
+        reg = {"hosts": hosts or [], "providers": {}}
+        if anthropic_key:
+            reg["providers"]["anthropic"] = {
+                "credentials": [{"api_key": anthropic_key}]
+            }
+        return reg
+
+    def _host(self, label, api_url, api_key="sk-test", host_type="openai"):
+        return {"id": "x", "label": label, "api_url": api_url,
+                "api_key": api_key, "host_type": host_type}
+
+    # --- Provider detection ---
+
+    def test_openrouter_host_gets_api_key_flag(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("OpenRouter", "https://openrouter.ai/api/v1", "or-key"),
+        ])
+        flags, model = _resolve_credentials(reg, None, None)
+        assert "--api-key" in flags
+        assert "openrouter=or-key" in flags
+
+    def test_anthropic_model_hint_uses_provider_key(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(
+            hosts=[self._host("OpenRouter", "https://openrouter.ai/api/v1")],
+            anthropic_key="ant-key",
+        )
+        flags, model = _resolve_credentials(reg, "claude-3-5-sonnet-20241022", None)
+        assert "anthropic=ant-key" in flags
+        assert model == "claude-3-5-sonnet-20241022"
+
+    def test_anthropic_slash_prefix_hint(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(anthropic_key="ant-key")
+        flags, _ = _resolve_credentials(reg, "anthropic/claude-opus-4", None)
+        assert "anthropic=ant-key" in flags
+
+    def test_local_openwebui_host_gets_base_url(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("Local", "http://192.168.32.19:3000", "localkey", host_type="openwebui"),
+        ])
+        flags, model = _resolve_credentials(reg, None, None)
+        assert "--openai-api-base" in flags
+        base = flags[flags.index("--openai-api-base") + 1]
+        assert base == "http://192.168.32.19:3000/api"
+        assert "--openai-api-key" in flags
+
+    def test_local_host_appends_api_suffix_for_openwebui(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("OpenWebUI", "http://localhost:3000", host_type="openwebui"),
+        ])
+        flags, _ = _resolve_credentials(reg, None, None)
+        base = flags[flags.index("--openai-api-base") + 1]
+        assert base.endswith("/api")
+
+    def test_generic_openai_host_no_api_suffix(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("Custom", "http://localhost:8080/v1", host_type="openai"),
+        ])
+        flags, _ = _resolve_credentials(reg, None, None)
+        base = flags[flags.index("--openai-api-base") + 1]
+        assert not base.endswith("/api")
+        assert base == "http://localhost:8080/v1"
+
+    # --- Model name adjustment ---
+
+    def test_local_host_prefixes_model_without_slash(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("Local", "http://localhost:3000", host_type="openwebui"),
+        ])
+        _, model = _resolve_credentials(reg, "gemma-4-27b-it", None)
+        assert model == "openai/gemma-4-27b-it"
+
+    def test_local_host_leaves_model_with_slash(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("Local", "http://localhost:3000", host_type="openwebui"),
+        ])
+        _, model = _resolve_credentials(reg, "ollama/gemma4", None)
+        assert model == "ollama/gemma4"  # already prefixed, don't touch
+
+    def test_cloud_provider_does_not_prefix_model(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("OpenRouter", "https://openrouter.ai/api/v1"),
+        ])
+        _, model = _resolve_credentials(reg, "google/gemma-3-27b-it", None)
+        assert model == "google/gemma-3-27b-it"
+
+    # --- Host label override ---
+
+    def test_host_label_selects_local_over_openrouter(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("OpenRouter", "https://openrouter.ai/api/v1", "or-key"),
+            self._host("Local RTX", "http://192.168.32.19:3000", "local-key", host_type="openwebui"),
+        ])
+        flags, _ = _resolve_credentials(reg, None, "Local")
+        assert "--openai-api-base" in flags
+        assert "--api-key" not in flags
+
+    def test_host_label_case_insensitive(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("OpenRouter", "https://openrouter.ai/api/v1", "or-key"),
+        ])
+        flags, _ = _resolve_credentials(reg, None, "openrouter")
+        assert "openrouter=or-key" in flags
+
+    # --- Model prefix routing ---
+
+    def test_model_openrouter_prefix_routes_to_openrouter(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("Local", "http://localhost:3000", host_type="openwebui"),
+            self._host("OpenRouter", "https://openrouter.ai/api/v1", "or-key"),
+        ])
+        flags, model = _resolve_credentials(reg, "openrouter/google/gemma-3-27b-it", None)
+        assert "openrouter=or-key" in flags
+        assert model == "openrouter/google/gemma-3-27b-it"
+
+    def test_model_groq_prefix_routes_to_groq_host(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("Groq", "https://api.groq.com/openai/v1", "groq-key"),
+        ])
+        flags, _ = _resolve_credentials(reg, "groq/llama-3.3-70b", None)
+        assert "groq=groq-key" in flags
+
+    # --- Default fallback priority ---
+
+    def test_prefers_openrouter_over_local_when_no_hint(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(hosts=[
+            self._host("Local", "http://localhost:3000", host_type="openwebui"),
+            self._host("OpenRouter", "https://openrouter.ai/api/v1", "or-key"),
+        ])
+        flags, _ = _resolve_credentials(reg, None, None)
+        assert "openrouter=or-key" in flags
+
+    def test_prefers_anthropic_over_local_when_no_openrouter(self):
+        from tools.aider import _resolve_credentials
+        reg = self._registry(
+            hosts=[self._host("Local", "http://localhost:3000", host_type="openwebui")],
+            anthropic_key="ant-key",
+        )
+        flags, _ = _resolve_credentials(reg, None, None)
+        assert "anthropic=ant-key" in flags
+
+    def test_empty_registry_returns_no_flags(self):
+        from tools.aider import _resolve_credentials
+        flags, model = _resolve_credentials({}, "gemma-4", None)
+        assert flags == []
+        assert model == "gemma-4"
+
+
+# ---------------------------------------------------------------------------
+# Helpers for manual test record creation (used in list tests above)
+# ---------------------------------------------------------------------------
+
+import agent_manager as _am
+
+_CORTEX_DIR = _am.__file__ and _am and __import__("pathlib").Path(_am.__file__).parent
+
+
+def _make_record(agent_id: str, user: str, status: str) -> "_am.AgentRecord":
+    from datetime import datetime
+    import agent_manager
+    rec = agent_manager.AgentRecord(
+        agent_id=agent_id,
+        level=2,
+        role="chat",
+        task="test task",
+        status=status,
+        started=datetime.now(),
+        user=user,
+        finished=datetime.now() if status != "running" else None,
+    )
+    return rec
--- a/cortex/tests/test_api_files.py
+++ b/cortex/tests/test_api_files.py
@@ -25,7 +25,10 @@ async def test_files_get_allowed(client):
@pytest.mark.anyio
 async def test_files_get_not_in_allowed(client):
    """Files outside the ALLOWED set should return 404, not the file content."""
-    for name in ("TASKS.json", "CRONS.json", "SCRATCH.md", "../config.py", ".env"):
+    # Note: paths with '..' are normalized at the ASGI layer (e.g. /files/../config.py
+    # becomes /config.py which hits the /{username} UI catch-all, not the files router).
+    # Only test paths that stay within the files router's scope.
+    for name in ("TASKS.json", "CRONS.json", "SCRATCH.md", ".env"):
        r = await client.get(f"/files/{name}")
        assert r.status_code == 404, f"Expected 404 for {name}, got {r.status_code}"

--- a/cortex/tests/test_health.py
+++ b/cortex/tests/test_health.py
@@ -30,5 +30,7 @@ async def test_distill_status(client):

@pytest.mark.anyio
 async def test_unknown_route_404(client):
-    r = await client.get("/does-not-exist")
+    # Single-segment paths hit the /{username} persona-picker catch-all (302 redirect).
+    # Three-segment paths don't match any route pattern → genuine 404.
+    r = await client.get("/totally/unknown/deep-path")
    assert r.status_code == 404
--- a/cortex/tests/test_model_registry.py
+++ b/cortex/tests/test_model_registry.py
@@ -70,7 +70,7 @@ def test_empty_registry_no_files(tmp_path):
    import model_registry as reg
    with patch.object(config.settings, "home_dir", home):
        data = reg._load("scott")
-    assert data["version"] == 1
+    assert data["version"] == 2
    assert data["hosts"] == []
    assert data["models"] == []
    assert data["roles"] == {}
@@ -244,7 +244,7 @@ def test_migration_saves_registry_file(tmp_path):
        data2 = reg._load("scott")

    assert (home / "scott" / "model_registry.json").exists()
-    assert data2["version"] == 1
+    assert data2["version"] == 2


 # ---------------------------------------------------------------------------
--- a/cortex/tests/test_security.py
+++ b/cortex/tests/test_security.py
@@ -69,10 +69,11 @@ async def test_nct_replayed_request_rejected(client):
    payload = json.dumps({"type": "Create", "actor": {}, "object": {}, "target": {}}).encode()
    # Use wrong secret to generate sig
    wrong_sig = hmac_lib.new(b"wrong-secret", b"abc123" + payload, hashlib.sha256).hexdigest()
+    _channels = {"nextcloud": {"bot_secret": "correct-secret", "url": "https://nc.example.com"}}
    from unittest.mock import patch
-    with patch("config.settings.nextcloud_talk_bot_secret", "correct-secret"):
+    with patch("routers.nextcloud_talk.get_user_channels", return_value=_channels):
        r = await client.post(
-            "/inara-nextcloud-talk-webhook",
+            "/webhook/nextcloud/scott",
            content=payload,
            headers={
                "Content-Type": "application/json",
@@ -118,9 +119,11 @@ async def test_known_gap__gchat_no_audience_bypass(client, mock_llm):
    LLM responses without a valid token.
    Fix: make audience required; fail loudly if not set.
    """
+    # Channel config with no audience — JWT check is skipped (the known gap).
+    _channels = {"google_chat": {"persona": "inara"}}
    from unittest.mock import patch
-    with patch("config.settings.google_chat_audience", ""):
-        r = await client.post("/channels/google-chat", json={
+    with patch("routers.google_chat.get_user_channels", return_value=_channels):
+        r = await client.post("/channels/google-chat/scott", json={
            "chat": {
                "messagePayload": {
                    "message": {"text": "Exploit"},
--- a/cortex/tests/test_tools.py
+++ b/cortex/tests/test_tools.py
@@ -101,19 +101,19 @@ class TestTasks:

    def test_list_empty(self):
        from tools.tasks import _task_list
-        assert "No tasks" in _task_list(status=None)
+        assert "No tasks" in _task_list(status=None, priority=None)

    def test_create_and_list(self):
        from tools.tasks import _task_list
        self._mk("Buy coffee", description="Dark roast", priority="high")
-        result = _task_list(status=None)
+        result = _task_list(status=None, priority=None)
        assert "Buy coffee" in result
        assert "[high]" in result

    def test_create_bad_priority_defaults_to_normal(self):
        from tools.tasks import _task_list
        self._mk("Test task", priority="urgent")  # invalid — becomes "normal"
-        result = _task_list(status=None)
+        result = _task_list(status=None, priority=None)
        assert "Test task" in result
        assert "[normal]" not in result  # normal priority not shown in brackets

@@ -121,20 +121,20 @@ class TestTasks:
        from tools.tasks import _task_update, _task_list
        tid = self._id(self._mk("Work item"))
        _task_update(tid, status="in_progress", title=None, description=None, priority=None)
-        assert "Work item" in _task_list(status="in_progress")
+        assert "Work item" in _task_list(status="in_progress", priority=None)

    def test_complete(self):
        from tools.tasks import _task_complete, _task_list
        tid = self._id(self._mk("Finish this"))
        _task_complete(tid)
-        assert "Finish this" in _task_list(status="done")
-        assert "Finish this" not in _task_list(status="todo")
+        assert "Finish this" in _task_list(status="done", priority=None)
+        assert "Finish this" not in _task_list(status="todo", priority=None)

    def test_filter_by_status(self):
        from tools.tasks import _task_list
        self._mk("A task")
-        assert "A task" in _task_list(status="todo")
-        assert "A task" not in _task_list(status="done")
+        assert "A task" in _task_list(status="todo", priority=None)
+        assert "A task" not in _task_list(status="done", priority=None)

    def test_update_unknown_id(self):
        from tools.tasks import _task_update
@@ -231,7 +231,8 @@ class TestCronTools:

    def _extract_id(self, result: str) -> str:
        import re
-        m = re.search(r'c_\w+', result)
+        # token_urlsafe can include '-'; use [\w-]+ to capture the full ID
+        m = re.search(r'c_[\w-]+', result)
        assert m, f"No cron ID in: {result}"
        return m.group()

--- a/cortex/tests/test_webhooks.py
+++ b/cortex/tests/test_webhooks.py
@@ -2,6 +2,10 @@
 Webhook auth tests — NC Talk HMAC, Google Chat JWT.

 These tests verify that auth is enforced, not that full LLM responses work.
+
+Architecture note: channel config (secrets, audience) lives in per-user channels.json,
+not in settings. Tests mock get_user_channels() rather than patching settings fields.
+Endpoints are per-user: /webhook/nextcloud/{username} and /channels/google-chat/{username}.
 """
 import hashlib
 import hmac
@@ -26,6 +30,14 @@ _VALID_NC_PAYLOAD = {
    "target": {"id": "abc123token"},
 }

+_NCT_CHANNELS = {
+    "nextcloud": {
+        "bot_secret":        _NC_SECRET,
+        "notification_room": "abc123token",
+        "url":               "https://nc.example.com",
+    }
+}
+

 def _nc_headers(body: bytes, secret: str) -> dict:
    random_str = "abc123"
@@ -43,11 +55,11 @@ def _nc_headers(body: bytes, secret: str) -> dict:
@pytest.mark.anyio
 async def test_nct_valid_signature(client, mock_llm):
    body = json.dumps(_VALID_NC_PAYLOAD).encode()
-    with patch("config.settings.nextcloud_talk_bot_secret", _NC_SECRET):
+    with patch("routers.nextcloud_talk.get_user_channels", return_value=_NCT_CHANNELS):
        with patch("routers.nextcloud_talk._send_reply", new_callable=AsyncMock):
            headers = _nc_headers(body, _NC_SECRET)
            r = await client.post(
-                "/inara-nextcloud-talk-webhook",
+                "/webhook/nextcloud/scott",
                content=body,
                headers={**headers, "Content-Type": "application/json"},
            )
@@ -57,9 +69,9 @@ async def test_nct_valid_signature(client, mock_llm):
@pytest.mark.anyio
 async def test_nct_wrong_signature(client):
    body = json.dumps(_VALID_NC_PAYLOAD).encode()
-    with patch("config.settings.nextcloud_talk_bot_secret", _NC_SECRET):
+    with patch("routers.nextcloud_talk.get_user_channels", return_value=_NCT_CHANNELS):
        r = await client.post(
-            "/inara-nextcloud-talk-webhook",
+            "/webhook/nextcloud/scott",
            content=body,
            headers={
                "Content-Type": "application/json",
@@ -73,9 +85,9 @@ async def test_nct_wrong_signature(client):
@pytest.mark.anyio
 async def test_nct_missing_signature(client):
    body = json.dumps(_VALID_NC_PAYLOAD).encode()
-    with patch("config.settings.nextcloud_talk_bot_secret", _NC_SECRET):
+    with patch("routers.nextcloud_talk.get_user_channels", return_value=_NCT_CHANNELS):
        r = await client.post(
-            "/inara-nextcloud-talk-webhook",
+            "/webhook/nextcloud/scott",
            content=body,
            headers={"Content-Type": "application/json"},
        )
@@ -84,11 +96,13 @@ async def test_nct_missing_signature(client):

@pytest.mark.anyio
 async def test_nct_no_secret_configured(client):
-    """Service should return 500 if secret is not set, not process the message."""
+    """Service should return 500 if bot_secret is missing, not process the message."""
    body = json.dumps(_VALID_NC_PAYLOAD).encode()
-    with patch("config.settings.nextcloud_talk_bot_secret", ""):
+    # cfg must be non-empty (truthy) to get past the 404 guard; missing bot_secret → 500
+    empty_cfg = {"nextcloud": {"url": "https://nc.example.com"}}
+    with patch("routers.nextcloud_talk.get_user_channels", return_value=empty_cfg):
        r = await client.post(
-            "/inara-nextcloud-talk-webhook",
+            "/webhook/nextcloud/scott",
            content=body,
            headers={"Content-Type": "application/json"},
        )
@@ -100,10 +114,10 @@ async def test_nct_bot_message_ignored(client):
    """Messages from other bots should be silently ignored (not processed)."""
    payload = {**_VALID_NC_PAYLOAD, "actor": {"type": "bots", "id": "otherbot", "name": "Bot"}}
    body = json.dumps(payload).encode()
-    with patch("config.settings.nextcloud_talk_bot_secret", _NC_SECRET):
+    with patch("routers.nextcloud_talk.get_user_channels", return_value=_NCT_CHANNELS):
        headers = _nc_headers(body, _NC_SECRET)
        r = await client.post(
-            "/inara-nextcloud-talk-webhook",
+            "/webhook/nextcloud/scott",
            content=body,
            headers={**headers, "Content-Type": "application/json"},
        )
@@ -124,21 +138,29 @@ _GCHAT_PAYLOAD = {
    }
 }

+_GCHAT_CHANNELS_NO_AUDIENCE = {
+    # cfg must be non-empty (truthy) to pass the 404 guard; no audience → JWT skipped
+    "google_chat": {"persona": "inara"}
+}
+
+_GCHAT_CHANNELS_WITH_AUDIENCE = {
+    "google_chat": {"audience": "123456789"}
+}
+

@pytest.mark.anyio
 async def test_gchat_no_audience_configured(client, mock_llm):
    """When audience is not set, JWT check is skipped (current behaviour — documented bypass)."""
-    with patch("config.settings.google_chat_audience", ""):
-        r = await client.post("/channels/google-chat", json=_GCHAT_PAYLOAD)
-    # Should process the message (no auth enforcement when audience is empty)
+    with patch("routers.google_chat.get_user_channels", return_value=_GCHAT_CHANNELS_NO_AUDIENCE):
+        r = await client.post("/channels/google-chat/scott", json=_GCHAT_PAYLOAD)
    assert r.status_code == 200


@pytest.mark.anyio
 async def test_gchat_missing_token_with_audience(client):
    """When audience IS configured, requests without a token must be rejected."""
-    with patch("config.settings.google_chat_audience", "123456789"):
-        r = await client.post("/channels/google-chat", json=_GCHAT_PAYLOAD)
+    with patch("routers.google_chat.get_user_channels", return_value=_GCHAT_CHANNELS_WITH_AUDIENCE):
+        r = await client.post("/channels/google-chat/scott", json=_GCHAT_PAYLOAD)
    assert r.status_code == 401


@@ -149,8 +171,8 @@ async def test_gchat_invalid_token_with_audience(client):
        **_GCHAT_PAYLOAD,
        "authorizationEventObject": {"systemIdToken": "not.a.valid.jwt"},
    }
-    with patch("config.settings.google_chat_audience", "123456789"):
-        r = await client.post("/channels/google-chat", json=payload_with_token)
+    with patch("routers.google_chat.get_user_channels", return_value=_GCHAT_CHANNELS_WITH_AUDIENCE):
+        r = await client.post("/channels/google-chat/scott", json=payload_with_token)
    assert r.status_code == 401


@@ -158,7 +180,7 @@ async def test_gchat_invalid_token_with_audience(client):
 async def test_gchat_added_to_space(client, mock_llm):
    """Bot added to a space — should return a greeting, no auth when audience empty."""
    payload = {"chat": {"addedToSpacePayload": {"space": {"type": "ROOM"}}}}
-    with patch("config.settings.google_chat_audience", ""):
-        r = await client.post("/channels/google-chat", json=payload)
+    with patch("routers.google_chat.get_user_channels", return_value=_GCHAT_CHANNELS_NO_AUDIENCE):
+        r = await client.post("/channels/google-chat/scott", json=payload)
    assert r.status_code == 200
    assert "hostAppDataAction" in r.json()
--- a/cortex/tool_audit.py
+++ b/cortex/tool_audit.py
@@ -0,0 +1,156 @@
+"""
+Tool call audit log.
+
+One JSONL file per user per day:
+  home/{user}/tool_audit/YYYY-MM-DD.jsonl
+
+Each line is a JSON object:
+  ts            ISO timestamp (seconds)
+  user          username
+  tool          tool name
+  args          call arguments (string values truncated at ARG_MAX chars)
+  status        "ok" | "error" | "denied"
+  result_chars  length of full result string
+  result_snippet first SNIPPET_MAX chars of result
+"""
+import asyncio
+import json
+import logging
+from contextvars import ContextVar
+from datetime import datetime, date
+from pathlib import Path
+
+from config import settings
+
+logger = logging.getLogger(__name__)
+
+_ARG_MAX     = 500   # truncate individual arg string values longer than this
+_SNIPPET_MAX = 300   # chars of result to keep as snippet
+
+# Per-file write locks — prevents interleaved lines under concurrent tool calls
+_locks: dict[str, asyncio.Lock] = {}
+
+# ContextVars set by orchestrators before their tool loop runs
+_audit_engine: ContextVar[str] = ContextVar("_audit_engine", default="")
+_audit_model:  ContextVar[str] = ContextVar("_audit_model",  default="")
+
+
+def set_context(engine: str, model: str) -> None:
+    """Call at the start of each orchestrator run to tag subsequent tool calls."""
+    _audit_engine.set(engine)
+    _audit_model.set(model)
+
+
+def _truncate_args(args: dict) -> dict:
+    out = {}
+    for k, v in args.items():
+        if isinstance(v, str) and len(v) > _ARG_MAX:
+            out[k] = v[:_ARG_MAX] + f" …[{len(v)} chars total]"
+        else:
+            out[k] = v
+    return out
+
+
+def _audit_path(user: str, day: date | None = None) -> Path:
+    d = day or date.today()
+    audit_dir = settings.home_root() / user / "tool_audit"
+    audit_dir.mkdir(parents=True, exist_ok=True)
+    return audit_dir / f"{d.isoformat()}.jsonl"
+
+
+async def record(
+    user: str,
+    tool: str,
+    args: dict,
+    status: str,    # "ok" | "error" | "denied"
+    result: str = "",
+) -> None:
+    """Append one audit entry. Fire with asyncio.create_task — never awaited directly."""
+    path = _audit_path(user)
+    key = str(path)
+    if key not in _locks:
+        _locks[key] = asyncio.Lock()
+
+    entry = {
+        "ts":             datetime.now().isoformat(timespec="seconds"),
+        "user":           user,
+        "engine":         _audit_engine.get(),
+        "model":          _audit_model.get(),
+        "tool":           tool,
+        "args":           _truncate_args(args),
+        "status":         status,
+        "result_chars":   len(result),
+        "result_snippet": result[:_SNIPPET_MAX],
+    }
+
+    async with _locks[key]:
+        try:
+            with path.open("a", encoding="utf-8") as f:
+                f.write(json.dumps(entry) + "\n")
+        except Exception as e:
+            logger.warning("audit log write failed for %s: %s", user, e)
+
+
+def read_recent(user: str, days: int = 7, limit: int = 200) -> list[dict]:
+    """Read the most recent `limit` entries across the last `days` days.
+
+    Returns entries sorted newest-first (by ts field, file order within a day).
+    """
+    from datetime import timedelta
+    today = date.today()
+    entries: list[dict] = []
+
+    for offset in range(days):
+        day = today - timedelta(days=offset)
+        path = settings.home_root() / user / "tool_audit" / f"{day.isoformat()}.jsonl"
+        if not path.exists():
+            continue
+        try:
+            lines = path.read_text(encoding="utf-8").splitlines()
+        except Exception:
+            continue
+        day_entries = []
+        for line in lines:
+            line = line.strip()
+            if not line:
+                continue
+            try:
+                day_entries.append(json.loads(line))
+            except json.JSONDecodeError:
+                pass
+        # Newest within the day first
+        entries.extend(reversed(day_entries))
+        if len(entries) >= limit:
+            break
+
+    return entries[:limit]
+
+
+def read_day(user: str, day_str: str) -> list[dict]:
+    """Read all entries for a specific date string (YYYY-MM-DD), chronological order."""
+    path = settings.home_root() / user / "tool_audit" / f"{day_str}.jsonl"
+    if not path.exists():
+        return []
+    entries = []
+    try:
+        for line in path.read_text(encoding="utf-8").splitlines():
+            line = line.strip()
+            if not line:
+                continue
+            try:
+                entries.append(json.loads(line))
+            except json.JSONDecodeError:
+                pass
+    except Exception:
+        pass
+    return entries
+
+
+def read_recent_all_users(days: int = 7, limit: int = 500) -> list[dict]:
+    """Read recent entries across all users, sorted newest-first."""
+    from persona import list_users
+    all_entries: list[dict] = []
+    for user in list_users():
+        all_entries.extend(read_recent(user, days=days, limit=limit))
+    all_entries.sort(key=lambda e: e.get("ts", ""), reverse=True)
+    return all_entries[:limit]
--- a/cortex/tools/init.py
+++ b/cortex/tools/init.py
--- a/cortex/tools/ae_database.py
+++ b/cortex/tools/ae_database.py
@@ -0,0 +1,253 @@
+"""
+Aether MariaDB tools — SELECT-only access to the Aether Platform database.
+
+Credentials are read from the current user's channels.json:
+  "aether_db": {
+      "host":     "192.168.64.5",
+      "port":     3306,
+      "name":     "aether_dev",
+      "user":     "aether_dev",
+      "password": "..."
+  }
+
+Configure per-user in Settings → Notifications (or edit channels.json directly).
+Only SELECT, SHOW, DESCRIBE, and EXPLAIN statements are permitted — no writes possible.
+"""
+
+import asyncio
+import logging
+import re
+
+from google.genai import types
+
+from auth_utils import get_user_channels
+from persona import get_user
+
+logger = logging.getLogger(__name__)
+
+_MAX_ROWS = 200
+_MAX_CELL = 120
+_ALLOWED  = {"select", "show", "describe", "desc", "explain"}
+_SAFE_ID  = re.compile(r'^[a-zA-Z0-9_]+$')
+
+
+def _get_db_cfg() -> tuple[dict, str | None]:
+    """Return (cfg_dict, error_string). cfg is empty dict on error."""
+    channels = get_user_channels(get_user())
+    cfg = channels.get("aether_db") or {}
+    if not cfg.get("host") or not cfg.get("user"):
+        return {}, (
+            "Aether DB not configured for this user. "
+            "Add an 'aether_db' block to channels.json: "
+            '{"host": "...", "port": 3306, "name": "aether_dev", "user": "...", "password": "..."}'
+        )
+    return cfg, None
+
+
+def _is_read_only(sql: str) -> bool:
+    stripped = sql.strip()
+    if not stripped:
+        return False
+    first = stripped.split()[0].lower().rstrip(";")
+    return first in _ALLOWED
+
+
+def _fmt(columns: list[str], rows: list[tuple]) -> str:
+    if not rows:
+        return f"({len(columns)} column{'s' if len(columns) != 1 else ''}, 0 rows)"
+
+    str_rows = [
+        [("NULL" if v is None else str(v))[:_MAX_CELL] for v in row]
+        for row in rows
+    ]
+
+    widths = [
+        max([len(col)] + [len(r[i]) for r in str_rows])
+        for i, col in enumerate(columns)
+    ]
+
+    sep    = "+" + "+".join("-" * (w + 2) for w in widths) + "+"
+    header = "|" + "|".join(f" {c:<{w}} " for c, w in zip(columns, widths)) + "|"
+    lines  = [sep, header, sep]
+    for row in str_rows:
+        lines.append("|" + "|".join(f" {v:<{w}} " for v, w in zip(row, widths)) + "|")
+    lines.append(sep)
+
+    note = " — results truncated at limit" if len(rows) == _MAX_ROWS else ""
+    lines.append(f"({len(rows)} row{'s' if len(rows) != 1 else ''}{note})")
+    return "\n".join(lines)
+
+
+def _connect(cfg: dict):
+    import pymysql
+    import pymysql.cursors
+    return pymysql.connect(
+        host=cfg["host"],
+        port=int(cfg.get("port", 3306)),
+        user=cfg["user"],
+        password=cfg.get("password", ""),
+        database=cfg.get("name", "aether_dev"),
+        cursorclass=pymysql.cursors.Cursor,
+        connect_timeout=10,
+    )
+
+
+async def ae_db_query(sql: str) -> str:
+    """Run a read-only SQL query against the Aether MariaDB and return formatted results."""
+    cfg, err = _get_db_cfg()
+    if err:
+        return err
+
+    if not _is_read_only(sql):
+        first = sql.strip().split()[0] if sql.strip() else "(empty)"
+        return f"Only SELECT, SHOW, DESCRIBE, and EXPLAIN are permitted. Got: {first!r}"
+
+    def _run() -> tuple[list[str], list[tuple]]:
+        conn = _connect(cfg)
+        try:
+            with conn.cursor() as cur:
+                cur.execute(sql)
+                columns = [d[0] for d in cur.description] if cur.description else []
+                rows    = list(cur.fetchmany(_MAX_ROWS))
+                return columns, rows
+        finally:
+            conn.close()
+
+    try:
+        columns, rows = await asyncio.to_thread(_run)
+        return _fmt(columns, rows)
+    except Exception as e:
+        logger.warning("ae_db_query error: %s", e)
+        return f"Query error: {e}"
+
+
+async def ae_db_describe(table: str, detailed: bool = False) -> str:
+    """Describe the columns of an Aether DB table or view."""
+    cfg, err = _get_db_cfg()
+    if err:
+        return err
+
+    if not _SAFE_ID.match(table):
+        return f"Invalid table name: {table!r}. Only letters, digits, and underscores allowed."
+
+    def _run():
+        conn = _connect(cfg)
+        try:
+            with conn.cursor() as cur:
+                cur.execute(f"DESCRIBE `{table}`")
+                columns = [d[0] for d in cur.description] if cur.description else []
+                rows    = list(cur.fetchall())
+                return columns, rows
+        finally:
+            conn.close()
+
+    try:
+        columns, rows = await asyncio.to_thread(_run)
+        if not detailed:
+            fields = [row[0] for row in rows]
+            return f"{table}: " + ", ".join(fields)
+        return _fmt(columns, rows)
+    except Exception as e:
+        logger.warning("ae_db_describe error: %s", e)
+        return f"Describe error: {e}"
+
+
+async def ae_db_show_view(view_name: str) -> str:
+    """Return the CREATE VIEW SQL for an Aether DB view."""
+    cfg, err = _get_db_cfg()
+    if err:
+        return err
+
+    if not _SAFE_ID.match(view_name):
+        return f"Invalid view name: {view_name!r}. Only letters, digits, and underscores allowed."
+
+    def _run():
+        conn = _connect(cfg)
+        try:
+            with conn.cursor() as cur:
+                cur.execute(f"SHOW CREATE VIEW `{view_name}`")
+                return cur.fetchone()
+        finally:
+            conn.close()
+
+    try:
+        row = await asyncio.to_thread(_run)
+        if not row:
+            return f"View not found: {view_name}"
+        return str(row[1]) if len(row) > 1 else str(row[0])
+    except Exception as e:
+        logger.warning("ae_db_show_view error: %s", e)
+        return f"Show view error: {e}"
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="ae_db_describe",
+        description=(
+            "Describe the columns of an Aether Platform table or view. "
+            "Returns a compact field list by default; pass detailed=true for full schema "
+            "(type, nullability, default, key). Use to understand data structure before "
+            "writing a SELECT query, or to answer 'what fields does X have?'. "
+            "Examples: table='ae_journals'; table='clients'; table='time_entries'."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "table": types.Schema(
+                    type=types.Type.STRING,
+                    description="Table or view name (letters, digits, underscores only)",
+                ),
+                "detailed": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description="Return full schema (type, nullability, key, default) instead of just field names",
+                ),
+            },
+            required=["table"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_db_show_view",
+        description=(
+            "Return the CREATE VIEW SQL for an Aether Platform database view. "
+            "Use to understand how a view is constructed before querying it, "
+            "or to debug unexpected results from a view. "
+            "Example: view_name='v_active_journals'."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "view_name": types.Schema(
+                    type=types.Type.STRING,
+                    description="View name (letters, digits, underscores only)",
+                ),
+            },
+            required=["view_name"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_db_query",
+        description=(
+            "Run a read-only SQL query against the Aether Platform MariaDB. "
+            "Permitted statements: SELECT, SHOW, DESCRIBE, EXPLAIN. No writes are possible. "
+            "Use for debugging: bad data, missing records, broken foreign keys, schema questions. "
+            "Results capped at 200 rows; cells truncated at 120 chars. "
+            "Examples: SELECT * FROM clients WHERE email = 'x@y.com'; "
+            "SELECT COUNT(*) FROM time_entries WHERE billed = 0 AND deleted_at IS NULL; "
+            "SHOW TABLES; DESCRIBE ae_journals; "
+            "SELECT id_random, enable, deleted_at FROM ae_journals WHERE id_random = 'abc123'."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "sql": types.Schema(
+                    type=types.Type.STRING,
+                    description=(
+                        "SQL query to run — SELECT, SHOW, DESCRIBE, or EXPLAIN only. "
+                        "No semicolons required but harmless if present."
+                    ),
+                ),
+            },
+            required=["sql"],
+        ),
+    ),
+]
--- a/cortex/tools/ae_knowledge.py
+++ b/cortex/tools/ae_knowledge.py
@@ -1,15 +1,17 @@
 """
-Aether Platform knowledge tools — journal search and entry creation.
+Aether Platform knowledge tools — journal search, listing, and entry management.

 These tools give the orchestrator read/write access to the AE Journals module,
 which serves as the primary long-term knowledge base.

 Auth: x-aether-api-key + x-account-id headers (same pattern as agents_sync scripts).
 API:  V3 CRUD — POST /v3/crud/journal_entry/search, POST /v3/crud/journal/{id}/journal_entry/
+              PATCH /v3/crud/journal_entry/{entry_id}, GET /v3/crud/journal_entry/{entry_id}
 """

 import asyncio
 import logging
+from google.genai import types
 from config import settings

 logger = logging.getLogger(__name__)
@@ -40,36 +42,98 @@ def _check_config() -> str | None:
 # Tool: ae_journal_search
 # ---------------------------------------------------------------------------

-async def journal_search(query: str, journal_id: str | None = None, max_results: int = 10) -> str:
-    """Search AE Journal entries by keyword.
+async def journal_search(
+    query: str = "",
+    journal_id: str = "",
+    tags: str = "",
+    type_code: str = "",
+    topic_code: str = "",
+    date_from: str = "",
+    date_to: str = "",
+    sort_by: str = "updated",
+    sort_order: str = "desc",
+    status: int | None = None,
+    priority: int | None = None,
+    max_results: int = 10,
+    page: int = 1,
+) -> str:
+    """Search AE Journal entries.

-    Searches across the default_qry_str field (title + content excerpt).
-    Optionally scoped to a specific journal by journal_id (id_random).
-    Returns a markdown-formatted list of matching entries.
+    At least one of query, tags, type_code, topic_code, date_from, or journal_id
+    should be provided. All filters combine with AND.
    """
    err = _check_config()
    if err:
        return err
-
-    return await asyncio.to_thread(_sync_journal_search, query, journal_id, max_results)
+    return await asyncio.to_thread(
+        _sync_journal_search,
+        query, journal_id, tags, type_code, topic_code,
+        date_from, date_to, sort_by, sort_order,
+        status, priority, max_results, page,
+    )


-def _sync_journal_search(query: str, journal_id: str | None, max_results: int) -> str:
+def _sync_journal_search(
+    query: str,
+    journal_id: str,
+    tags: str,
+    type_code: str,
+    topic_code: str,
+    date_from: str,
+    date_to: str,
+    sort_by: str,
+    sort_order: str,
+    status: int | None,
+    priority: int | None,
+    max_results: int,
+    page: int,
+) -> str:
    import requests

-    url = f"{settings.ae_api_url}/v3/crud/journal_entry/search"
-    search_body = {
-        "and_filters": [
-            {"field": "default_qry_str", "op": "icontains", "value": query}
-        ],
-        "page_size": max_results,
+    # Build sort field
+    sort_field_map = {
+        "updated": "updated_on",
+        "created": "created_on",
+        "name":    "name",
+        "priority": "priority",
    }
+    sort_field = sort_field_map.get(sort_by, "updated_on")
+    order_by = f"{'-' if sort_order == 'desc' else ''}{sort_field}"

-    params = {}
+    search_body: dict = {"page_size": max_results, "page": page, "order_by": order_by}
+
+    # Fulltext keyword — uses MATCH/AGAINST index
+    if query:
+        search_body["query_string"] = query
+
+    # Additional AND filters
+    and_filters: list[dict] = []
+    if tags:
+        and_filters.append({"field": "tags", "op": "icontains", "value": tags})
+    if type_code:
+        and_filters.append({"field": "type_code", "op": "eq", "value": type_code})
+    if topic_code:
+        and_filters.append({"field": "topic_code", "op": "eq", "value": topic_code})
+    if date_from:
+        and_filters.append({"field": "created_on", "op": "gte", "value": date_from})
+    if date_to:
+        and_filters.append({"field": "created_on", "op": "lte", "value": date_to})
+    if status is not None:
+        and_filters.append({"field": "status", "op": "eq", "value": status})
+    if priority is not None:
+        and_filters.append({"field": "priority", "op": "eq", "value": priority})
+    if and_filters:
+        search_body["and"] = and_filters
+        # query_string must be present for `and` filters to apply
+        if "query_string" not in search_body:
+            search_body["query_string"] = "%"
+
+    params: dict = {}
    if journal_id:
        params["for_obj_type"] = "journal"
        params["for_obj_id"] = journal_id

+    url = f"{settings.ae_api_url}/v3/crud/journal_entry/search"
    try:
        resp = requests.post(
            url,
@@ -85,33 +149,92 @@ def _sync_journal_search(query: str, journal_id: str | None, max_results: int) -
        return f"Journal search error: {e}"

    entries = data.get("data", [])
-    if not entries:
-        return f"No journal entries found matching: {query}"
+    total   = (data.get("meta") or {}).get("data_list_count") or len(entries)
+
+    if not entries:
+        desc = query or tags or type_code or topic_code or f"journal {journal_id}"
+        return f"No journal entries found for: {desc}"
+
+    label = query or tags or f"{len(entries)} entries"
+    lines = [f"Journal entries — **{label}** ({total} total, page {page}):\n"]

-    lines = [f"Journal entries matching **{query}** ({len(entries)} result(s)):\n"]
    for entry in entries:
-        title = entry.get("name") or "(untitled)"
-        entry_id = entry.get("id_random", "")
+        title    = entry.get("name") or "(untitled)"
+        entry_id = entry.get("journal_entry_id") or entry.get("id") or ""
        journal_name = entry.get("journal_name") or entry.get("parent_name") or ""
-        summary = entry.get("summary") or ""
-        content_preview = (entry.get("content") or "")[:200].replace("\n", " ")
+        summary  = entry.get("summary") or ""
+        entry_tags = entry.get("tags") or []
+        updated  = (entry.get("updated_on") or entry.get("created_on") or "")[:10]
+        content_preview = (entry.get("content") or "")[:400].replace("\n", " ")

        header = f"**{title}**"
        if journal_name:
            header += f" ({journal_name})"
-        if entry_id:
-            header += f" — id: `{entry_id}`"
-
+        header += f" — id: `{entry_id}`"
+        if updated:
+            header += f"  [{updated}]"
        lines.append(header)
+        if entry_tags:
+            tag_list = entry_tags if isinstance(entry_tags, list) else [t.strip() for t in str(entry_tags).split(",")]
+            lines.append(f"  Tags: {', '.join(tag_list)}")
        if summary:
-            lines.append(f"  Summary: {summary}")
-        if content_preview:
-            lines.append(f"  {content_preview}…")
+            lines.append(f"  {summary}")
+        elif content_preview:
+            lines.append(f"  {content_preview}{'…' if len(entry.get('content', '')) > 400 else ''}")
        lines.append("")

+    if total > page * max_results:
+        lines.append(f"(More results — call again with page={page + 1})")
+
    return "\n".join(lines).strip()


+# ---------------------------------------------------------------------------
+# Tool: ae_journal_list
+# ---------------------------------------------------------------------------
+
+async def journal_list() -> str:
+    """List all journals accessible to the configured AE account."""
+    err = _check_config()
+    if err:
+        return err
+    return await asyncio.to_thread(_sync_journal_list)
+
+
+def _sync_journal_list() -> str:
+    import requests
+
+    url = f"{settings.ae_api_url}/v3/crud/journal/search"
+    try:
+        resp = requests.post(
+            url,
+            headers=_headers(),
+            json={"page_size": 100},
+            timeout=settings.ae_api_timeout,
+        )
+        resp.raise_for_status()
+        data = resp.json()
+    except Exception as e:
+        logger.warning("ae_journal_list failed: %s", e)
+        return f"Journal list error: {e}"
+
+    journals = data.get("data", [])
+    if not journals:
+        return "No journals found for this account."
+
+    lines = [f"Journals ({len(journals)}):\n"]
+    for j in journals:
+        jid  = j.get("journal_id") or j.get("id_random") or j.get("id") or "?"
+        name = j.get("name") or "(untitled)"
+        desc = j.get("description") or ""
+        line = f"- **{name}** — id: `{jid}`"
+        if desc:
+            line += f"\n  {desc}"
+        lines.append(line)
+
+    return "\n".join(lines)
+
+
 # ---------------------------------------------------------------------------
 # Tool: ae_journal_entry_create
 # ---------------------------------------------------------------------------
@@ -170,8 +293,455 @@ def _sync_journal_entry_create(
        return f"Journal entry creation error: {e}"

    entry_id = (
-        result.get("data", {}).get("id_random")
+        result.get("data", {}).get("journal_entry_id")
+        or result.get("data", {}).get("id_random")
        or result.get("id_random")
        or "unknown"
    )
    return f"Journal entry created. id: `{entry_id}`, title: \"{title}\", journal: `{journal_id}`"
+
+
+# ---------------------------------------------------------------------------
+# Shared helper: fetch a single journal entry by id
+# ---------------------------------------------------------------------------
+
+def _get_entry(entry_id: str) -> dict | str:
+    """Return the entry dict, or an error string on failure."""
+    import requests
+    url = f"{settings.ae_api_url}/v3/crud/journal_entry/{entry_id}"
+    try:
+        resp = requests.get(url, headers=_headers(), timeout=settings.ae_api_timeout)
+        resp.raise_for_status()
+        data = resp.json()
+        entry = data.get("data") or data
+        if not isinstance(entry, dict):
+            return f"Unexpected response shape for entry {entry_id}"
+        return entry
+    except Exception as e:
+        logger.warning("_get_entry %s failed: %s", entry_id, e)
+        return f"Error fetching entry {entry_id}: {e}"
+
+
+def _patch_entry(entry_id: str, payload: dict) -> str:
+    """PATCH a journal entry. Returns a success/error string."""
+    import requests
+    url = f"{settings.ae_api_url}/v3/crud/journal_entry/{entry_id}"
+    try:
+        resp = requests.patch(
+            url,
+            headers=_headers(),
+            json=payload,
+            timeout=settings.ae_api_timeout,
+        )
+        resp.raise_for_status()
+        return "ok"
+    except Exception as e:
+        logger.warning("_patch_entry %s failed: %s", entry_id, e)
+        return f"Error updating entry {entry_id}: {e}"
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_entry_read
+# ---------------------------------------------------------------------------
+
+async def journal_entry_read(entry_id: str, max_content_chars: int = 4000) -> str:
+    """Return the full content of a single journal entry by its id_random."""
+    err = _check_config()
+    if err:
+        return err
+    return await asyncio.to_thread(_sync_journal_entry_read, entry_id, max_content_chars)
+
+
+def _sync_journal_entry_read(entry_id: str, max_content_chars: int) -> str:
+    entry = _get_entry(entry_id)
+    if isinstance(entry, str):
+        return entry
+
+    title   = entry.get("name") or "(untitled)"
+    journal = entry.get("journal_name") or entry.get("parent_name") or ""
+    summary = entry.get("summary") or ""
+    raw_tags = entry.get("tags") or []
+    tags = raw_tags if isinstance(raw_tags, list) else [t.strip() for t in str(raw_tags).split(",") if t.strip()]
+    content = entry.get("content") or ""
+    updated = (entry.get("updated_on") or entry.get("created_on") or "")[:19].replace("T", " ")
+    enabled = entry.get("enable", True)
+
+    lines = [f"# {title}"]
+    meta: list[str] = [f"id: `{entry_id}`"]
+    if journal:
+        meta.append(f"journal: {journal}")
+    if updated:
+        meta.append(f"updated: {updated}")
+    if not enabled:
+        meta.append("**DISABLED**")
+    lines.append("  ".join(meta))
+    if tags:
+        lines.append(f"Tags: {', '.join(tags)}")
+    if summary:
+        lines.append(f"\nSummary: {summary}")
+    lines.append("\n---\n")
+
+    truncated = len(content) > max_content_chars
+    lines.append(content[:max_content_chars])
+    if truncated:
+        lines.append(
+            f"\n\n[Content truncated at {max_content_chars} chars — "
+            f"{len(content)} total. Call again with a higher max_content_chars to read more.]"
+        )
+
+    return "\n".join(lines)
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_entries_list
+# ---------------------------------------------------------------------------
+
+async def journal_entries_list(journal_id: str, max_results: int = 20, page: int = 1) -> str:
+    """List entries in a specific journal, newest first."""
+    err = _check_config()
+    if err:
+        return err
+    return await asyncio.to_thread(_sync_journal_entries_list, journal_id, max_results, page)
+
+
+def _sync_journal_entries_list(journal_id: str, max_results: int, page: int) -> str:
+    import requests
+
+    url = f"{settings.ae_api_url}/v3/crud/journal_entry/search"
+    search_body: dict = {
+        "page_size": max_results,
+        "page": page,
+        "order_by": "-updated_on",
+    }
+    params = {"for_obj_type": "journal", "for_obj_id": journal_id}
+
+    try:
+        resp = requests.post(
+            url,
+            headers=_headers(),
+            params=params,
+            json=search_body,
+            timeout=settings.ae_api_timeout,
+        )
+        resp.raise_for_status()
+        data = resp.json()
+    except Exception as e:
+        logger.warning("ae_journal_entries_list failed: %s", e)
+        return f"Journal entries list error: {e}"
+
+    entries = data.get("data", [])
+    total   = (data.get("meta") or {}).get("data_list_count") or len(entries)
+
+    if not entries:
+        return f"No entries found in journal `{journal_id}`."
+
+    offset = (page - 1) * max_results + 1
+    lines = [f"Entries in journal `{journal_id}` — showing {offset}–{offset + len(entries) - 1} of {total}:\n"]
+    for i, entry in enumerate(entries, offset):
+        title    = entry.get("name") or "(untitled)"
+        entry_id = entry.get("journal_entry_id") or entry.get("id") or ""
+        raw_tags = entry.get("tags") or []
+        tags = raw_tags if isinstance(raw_tags, list) else [t.strip() for t in str(raw_tags).split(",") if t.strip()]
+        summary  = entry.get("summary") or ""
+        updated  = (entry.get("updated_on") or entry.get("created_on") or "")[:10]
+        enabled  = entry.get("enable", True)
+
+        status = "" if enabled else " [disabled]"
+        date_str = f"  [{updated}]" if updated else ""
+        lines.append(f"{i}. **{title}**{status} — id: `{entry_id}`{date_str}")
+        if tags:
+            lines.append(f"   Tags: {', '.join(tags)}")
+        if summary:
+            lines.append(f"   {summary[:150]}{'…' if len(summary) > 150 else ''}")
+        lines.append("")
+
+    if total > offset + len(entries) - 1:
+        lines.append(f"(More entries available — call again with page={page + 1})")
+
+    return "\n".join(lines).rstrip()
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_entry_update
+# ---------------------------------------------------------------------------
+
+async def journal_entry_update(
+    entry_id: str,
+    title: str = "",
+    content: str = "",
+    summary: str = "",
+    tags: str = "",
+    enable: bool | None = None,
+) -> str:
+    """Update fields on an existing journal entry. Only provided fields are changed."""
+    err = _check_config()
+    if err:
+        return err
+    return await asyncio.to_thread(_sync_journal_entry_update, entry_id, title, content, summary, tags, enable)
+
+
+def _sync_journal_entry_update(
+    entry_id: str,
+    title: str,
+    content: str,
+    summary: str,
+    tags: str,
+    enable: bool | None,
+) -> str:
+    payload: dict = {}
+    if title:
+        payload["name"] = title
+    if content:
+        payload["content"] = content
+    if summary:
+        payload["summary"] = summary
+    if tags:
+        payload["tags"] = [t.strip() for t in tags.split(",") if t.strip()]
+    if enable is not None:
+        payload["enable"] = enable
+
+    if not payload:
+        return "Nothing to update — no fields provided."
+
+    result = _patch_entry(entry_id, payload)
+    if result != "ok":
+        return result
+
+    updated = ", ".join(payload.keys())
+    return f"Journal entry `{entry_id}` updated. Fields changed: {updated}"
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_entry_disable
+# ---------------------------------------------------------------------------
+
+async def journal_entry_disable(entry_id: str) -> str:
+    """Soft-delete a journal entry by setting enable=false."""
+    err = _check_config()
+    if err:
+        return err
+    return await asyncio.to_thread(_patch_entry, entry_id, {"enable": False})
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_entry_append
+# ---------------------------------------------------------------------------
+
+async def journal_entry_append(entry_id: str, content: str, heading: str = "") -> str:
+    """Append a timestamped section to the bottom of a journal entry's content."""
+    err = _check_config()
+    if err:
+        return err
+    return await asyncio.to_thread(_sync_journal_entry_append, entry_id, content, heading)
+
+
+def _sync_journal_entry_append(entry_id: str, content: str, heading: str) -> str:
+    from datetime import datetime, timezone
+
+    entry = _get_entry(entry_id)
+    if isinstance(entry, str):
+        return entry
+
+    existing = (entry.get("content") or "").rstrip()
+    ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
+    section_heading = heading or ts
+    new_content = f"{existing}\n\n### {section_heading}\n{content.strip()}"
+
+    result = _patch_entry(entry_id, {"content": new_content})
+    if result != "ok":
+        return result
+    return f"Appended to journal entry `{entry_id}` under heading \"{section_heading}\"."
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_entry_prepend
+# ---------------------------------------------------------------------------
+
+async def journal_entry_prepend(entry_id: str, content: str, heading: str = "") -> str:
+    """Prepend a timestamped section to the top of a journal entry's content."""
+    err = _check_config()
+    if err:
+        return err
+    return await asyncio.to_thread(_sync_journal_entry_prepend, entry_id, content, heading)
+
+
+def _sync_journal_entry_prepend(entry_id: str, content: str, heading: str) -> str:
+    from datetime import datetime, timezone
+
+    entry = _get_entry(entry_id)
+    if isinstance(entry, str):
+        return entry
+
+    existing = (entry.get("content") or "").lstrip()
+    ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
+    section_heading = heading or ts
+    new_content = f"### {section_heading}\n{content.strip()}\n\n{existing}"
+
+    result = _patch_entry(entry_id, {"content": new_content})
+    if result != "ok":
+        return result
+    return f"Prepended to journal entry `{entry_id}` under heading \"{section_heading}\"."
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="ae_journal_list",
+        description=(
+            "List all Aether Journals available for this account. "
+            "Returns each journal's name and id_random. "
+            "Call this first when you need to write a new entry or scope a search to a specific journal "
+            "and don't already know the journal's id."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_search",
+        description=(
+            "Search Aether Journal entries. All parameters are optional — combine freely. "
+            "Use 'query' for fulltext keyword search (supports boolean: +required -excluded \"phrase\"). "
+            "Use 'tags' to filter by tag substring. Use 'date_from'/'date_to' for date ranges (YYYY-MM-DD). "
+            "Always search before creating a new entry to avoid duplicates."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "query": types.Schema(type=types.Type.STRING, description="Fulltext keyword search. Supports boolean mode: +required -excluded \"exact phrase\"."),
+                "journal_id": types.Schema(type=types.Type.STRING, description="Scope results to a specific journal by its id_random. Omit to search all journals."),
+                "tags": types.Schema(type=types.Type.STRING, description="Filter by tag substring (e.g. 'networking' matches entries tagged 'networking' or 'home-networking')."),
+                "type_code": types.Schema(type=types.Type.STRING, description="Filter by exact type_code (e.g. 'note', 'meeting', 'log')."),
+                "topic_code": types.Schema(type=types.Type.STRING, description="Filter by exact topic_code."),
+                "date_from": types.Schema(type=types.Type.STRING, description="Return entries created on or after this date (YYYY-MM-DD)."),
+                "date_to": types.Schema(type=types.Type.STRING, description="Return entries created on or before this date (YYYY-MM-DD)."),
+                "sort_by": types.Schema(type=types.Type.STRING, description="Sort field: 'updated' (default), 'created', 'name', or 'priority'."),
+                "sort_order": types.Schema(type=types.Type.STRING, description="Sort direction: 'desc' (default, newest first) or 'asc'."),
+                "status": types.Schema(type=types.Type.INTEGER, description="Filter by exact status code."),
+                "priority": types.Schema(type=types.Type.INTEGER, description="Filter by exact priority (1=low, 5=high)."),
+                "max_results": types.Schema(type=types.Type.INTEGER, description="Number of results per page (default 10)."),
+                "page": types.Schema(type=types.Type.INTEGER, description="Page number for pagination (default 1)."),
+            },
+            required=[],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_entry_read",
+        description=(
+            "Fetch the full content of a single journal entry by its id_random. "
+            "Use this when you need to read an entry before editing it, or when search results "
+            "don't show enough content. Returns title, journal, tags, summary, and full content."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "entry_id": types.Schema(type=types.Type.STRING, description="The id_random of the journal entry to read."),
+                "max_content_chars": types.Schema(type=types.Type.INTEGER, description="Maximum characters of content to return (default 4000). Increase for long entries."),
+            },
+            required=["entry_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_entries_list",
+        description=(
+            "List entries in a specific journal, newest first. "
+            "Use this to browse what's in a journal when you don't have a search keyword, "
+            "or to find entries by browsing rather than searching. "
+            "Returns numbered entries with id, title, tags, summary, and date."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "journal_id": types.Schema(type=types.Type.STRING, description="The id_random of the journal to list entries from."),
+                "max_results": types.Schema(type=types.Type.INTEGER, description="Number of entries to return (default 20, max 50)."),
+                "page": types.Schema(type=types.Type.INTEGER, description="Page number for pagination (default 1)."),
+            },
+            required=["journal_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_entry_create",
+        description=(
+            "Create a new entry in an Aether Journal. "
+            "Use this to save notes, summaries, or any content the user wants to store. "
+            "Always call ae_journal_search first to check for existing entries on the same topic."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "journal_id": types.Schema(type=types.Type.STRING, description="The id_random of the target journal. Ask the user which journal to write to if not specified."),
+                "title": types.Schema(type=types.Type.STRING, description="Entry title"),
+                "content": types.Schema(type=types.Type.STRING, description="Full entry content (markdown supported)"),
+                "summary": types.Schema(type=types.Type.STRING, description="Optional short summary (1-2 sentences)"),
+                "tags": types.Schema(type=types.Type.STRING, description="Optional comma-separated tags (e.g. 'wireguard, networking, homelab')"),
+            },
+            required=["journal_id", "title", "content"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_entry_update",
+        description=(
+            "Update fields on an existing journal entry. Only the fields you provide are changed — "
+            "omitted fields are left as-is. Use ae_journal_search to find the entry_id first. "
+            "To soft-delete, use ae_journal_entry_disable instead."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "entry_id": types.Schema(type=types.Type.STRING, description="Journal entry id_random"),
+                "title":    types.Schema(type=types.Type.STRING, description="New title"),
+                "content":  types.Schema(type=types.Type.STRING, description="Replacement content (full, markdown supported)"),
+                "summary":  types.Schema(type=types.Type.STRING, description="New summary"),
+                "tags":     types.Schema(type=types.Type.STRING, description="Replacement comma-separated tags"),
+                "enable":   types.Schema(type=types.Type.BOOLEAN, description="Set false to hide/disable the entry"),
+            },
+            required=["entry_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_entry_disable",
+        description=(
+            "Soft-delete a journal entry by setting enable=false. "
+            "The entry is hidden but not permanently removed. "
+            "Use ae_journal_search to find the entry_id first."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "entry_id": types.Schema(type=types.Type.STRING, description="Journal entry id_random"),
+            },
+            required=["entry_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_entry_append",
+        description=(
+            "Append a new section to the bottom of a journal entry's content. "
+            "Each section gets a UTC timestamp heading unless you provide one. "
+            "Ideal for timestamped logs, running notes, or data logs."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "entry_id": types.Schema(type=types.Type.STRING, description="Journal entry id_random"),
+                "content":  types.Schema(type=types.Type.STRING, description="The text to append (markdown supported)"),
+                "heading":  types.Schema(type=types.Type.STRING, description="Optional section heading (defaults to current UTC timestamp)"),
+            },
+            required=["entry_id", "content"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ae_journal_entry_prepend",
+        description=(
+            "Prepend a new section to the top of a journal entry's content. "
+            "Each section gets a UTC timestamp heading unless you provide one. "
+            "Useful for most-recent-first logs."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "entry_id": types.Schema(type=types.Type.STRING, description="Journal entry id_random"),
+                "content":  types.Schema(type=types.Type.STRING, description="The text to prepend (markdown supported)"),
+                "heading":  types.Schema(type=types.Type.STRING, description="Optional section heading (defaults to current UTC timestamp)"),
+            },
+            required=["entry_id", "content"],
+        ),
+    ),
+]
--- a/cortex/tools/ae_tasks.py
+++ b/cortex/tools/ae_tasks.py
@@ -16,6 +16,8 @@ import json
 import logging
 from pathlib import Path

+from google.genai import types
+
 logger = logging.getLogger(__name__)

 # Resolved at import time — agents_sync is always at ~/agents_sync on this machine.
@@ -98,3 +100,20 @@ def _read_bucket(bucket_dir: Path) -> list[dict]:
        except Exception as e:
            logger.warning("Failed to read task file %s: %s", path, e)
    return tasks
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="ae_task_list",
+        description=(
+            "List tasks from the agents_sync Kanban board (todo and in-progress). "
+            "Use this when asked about current work, pending tasks, or project status."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "include_done": types.Schema(type=types.Type.BOOLEAN, description="If true, also include completed tasks (default false)"),
+            },
+        ),
+    ),
+]
--- a/cortex/tools/agent_notes.py
+++ b/cortex/tools/agent_notes.py
@@ -0,0 +1,155 @@
+"""
+Agent private notes — AGENT_NOTES.md.
+
+A persistent notepad only the orchestrator can write to. The file itself is
+never exposed in the Files panel or loaded into user-facing context tiers.
+Up to 3 rolling backups are kept automatically before each write so past
+versions can be reviewed.
+
+Use for: observations about the user's patterns, working hypotheses,
+long-running goals, things to remember across sessions that shouldn't
+be part of the distilled memory visible to the user.
+"""
+
+import asyncio
+from datetime import datetime, timezone
+from pathlib import Path
+
+from google.genai import types
+from persona import persona_path
+
+
+_FILENAME = "AGENT_NOTES.md"
+_N_BACKUPS = 3
+
+
+def _notes_path() -> Path:
+    return persona_path() / _FILENAME
+
+
+def _now_label() -> str:
+    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
+
+
+def _rotate(path: Path) -> None:
+    """Rotate up to _N_BACKUPS rolling backups before a write."""
+    if not path.exists():
+        return
+    for i in range(_N_BACKUPS, 1, -1):
+        older = path.parent / f"{path.stem}.bak{i}.md"
+        newer = path.parent / f"{path.stem}.bak{i - 1}.md"
+        if newer.exists():
+            older.write_text(newer.read_text())
+    bak1 = path.parent / f"{path.stem}.bak1.md"
+    bak1.write_text(path.read_text())
+
+
+# ── Sync implementations ────────────────────────────────────────────────────
+
+def _agent_notes_read() -> str:
+    p = _notes_path()
+    if not p.exists() or not p.read_text().strip():
+        return "Agent notes are empty."
+    return p.read_text()
+
+
+def _agent_notes_write(content: str) -> str:
+    p = _notes_path()
+    _rotate(p)
+    p.write_text(content.rstrip() + "\n")
+    return "Agent notes updated."
+
+
+def _agent_notes_append(content: str, heading: str | None = None) -> str:
+    p = _notes_path()
+    _rotate(p)
+    existing = p.read_text() if p.exists() else ""
+    label = heading or _now_label()
+    section = f"\n## {label}\n\n{content.strip()}\n"
+    p.write_text(existing.rstrip() + "\n" + section)
+    return f"Appended to agent notes: {label}"
+
+
+def _agent_notes_clear() -> str:
+    p = _notes_path()
+    _rotate(p)
+    p.write_text("")
+    return "Agent notes cleared."
+
+
+# ── Async wrappers ───────────────────────────────────────────────────────────
+
+async def agent_notes_read() -> str:
+    return await asyncio.to_thread(_agent_notes_read)
+
+async def agent_notes_write(content: str) -> str:
+    return await asyncio.to_thread(_agent_notes_write, content)
+
+async def agent_notes_append(content: str, heading: str | None = None) -> str:
+    return await asyncio.to_thread(_agent_notes_append, content, heading)
+
+async def agent_notes_clear() -> str:
+    return await asyncio.to_thread(_agent_notes_clear)
+
+
+# ── Gemini FunctionDeclarations ──────────────────────────────────────────────
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="agent_notes_read",
+        description=(
+            "Read your private agent notes — a persistent notepad only you can write to. "
+            "Use this to recall observations, working hypotheses, long-running goals, or "
+            "anything you want to remember across sessions without surfacing it to the user. "
+            "This file is never shown in the user's Files panel."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+    types.FunctionDeclaration(
+        name="agent_notes_write",
+        description=(
+            "Replace your private agent notes with new content. "
+            "A backup is saved automatically before writing. "
+            "Use agent_notes_append to add without replacing."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "content": types.Schema(
+                    type=types.Type.STRING,
+                    description="The new notes content (markdown supported).",
+                ),
+            },
+            required=["content"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="agent_notes_append",
+        description=(
+            "Add a new section to your private agent notes without replacing existing content. "
+            "A backup is saved automatically before writing. "
+            "Each section gets a UTC timestamp heading unless you supply one."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "content": types.Schema(
+                    type=types.Type.STRING,
+                    description="The content to append (markdown supported).",
+                ),
+                "heading": types.Schema(
+                    type=types.Type.STRING,
+                    description="Optional section heading. Defaults to current UTC timestamp.",
+                ),
+            },
+            required=["content"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="agent_notes_clear",
+        description=(
+            "Erase all private agent notes. A backup is saved automatically before clearing."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+]
--- a/cortex/tools/agents.py
+++ b/cortex/tools/agents.py
@@ -0,0 +1,446 @@
+"""
+Agent spawning and lifecycle tools.
+
+spawn_agent — synchronous or background sub-agent via any configured role model.
+agent_status / agent_list / agent_cancel — lifecycle management for background agents.
+
+Sub-agents run using the model and tools assigned to the given role. The three-level
+hierarchy (Persona → Specialized → Support) is enforced by denying spawn_agent and
+aider_run at the L2→L3 boundary — Level 3 agents cannot delegate further.
+
+Supported model types for sub-agents: local_openai, gemini_api.
+claude_cli / gemini_cli are chat-only and do not support tool-enabled sub-agents.
+"""
+
+import asyncio
+import logging
+from datetime import datetime
+
+from google.genai import types
+
+import agent_manager
+
+logger = logging.getLogger(__name__)
+
+# Per-host semaphores — keyed by "host:<host_id>" or "type:<model_type>"
+# Created lazily on first use; never deleted (module-level singletons)
+_semaphores: dict[str, asyncio.Semaphore] = {}
+_sem_lock = asyncio.Lock()
+
+# Tools denied at the L2→L3 boundary so Level 3 agents cannot delegate further.
+_L3_DENY_TOOLS = ["spawn_agent", "aider_run"]
+
+
+async def _get_semaphore(key: str, max_concurrent: int) -> asyncio.Semaphore:
+    """Return (or create) the semaphore for a given host/type key."""
+    async with _sem_lock:
+        if key not in _semaphores:
+            _semaphores[key] = asyncio.Semaphore(max_concurrent)
+        return _semaphores[key]
+
+
+async def spawn_agent(
+    task: str,
+    role: str = "chat",
+    tier: int = 1,
+    timeout: int = 120,
+    max_rounds: int | None = None,
+    allow_tools: list[str] | None = None,
+    deny_tools: list[str] | None = None,
+    background: bool = False,
+    notify: bool = False,
+    _agent_level: int = 2,
+) -> str:
+    """
+    Spawn a sub-agent to complete a task.
+
+    In synchronous mode (background=False, the default): blocks until done and returns
+    the result string.
+
+    In background mode (background=True): registers the agent, fires it as an asyncio
+    background task, and returns an agent_id string immediately. Use agent_status() to
+    poll, or set notify=True to receive a push notification on completion.
+
+    Level enforcement: this agent (level _agent_level) spawns children at level+1.
+    Children at level 3 automatically have spawn_agent and aider_run denied so they
+    cannot delegate further.
+    """
+    import model_registry
+    from context_loader import load_context
+    from auth_utils import get_user_role, get_tool_policy
+    from persona import get_user
+
+    user = get_user() or "scott"
+
+    role_cfg = model_registry.get_role_config(user, role)
+    model_cfg = model_registry.get_model_for_role(user, role)
+
+    if not model_cfg:
+        return f"spawn_agent: no model configured for role '{role}'"
+
+    model_type = model_cfg.get("type", "unknown")
+
+    if model_type not in ("local_openai", "gemini_api"):
+        return (
+            f"spawn_agent: model type '{model_type}' does not support tool-enabled sub-agents. "
+            f"Assign a local_openai or gemini_api model to role '{role}'."
+        )
+
+    # Determine concurrency key and semaphore limit
+    host_id = model_cfg.get("host_id")
+    if host_id:
+        registry = model_registry.get_registry(user)
+        host = next((h for h in registry.get("hosts", []) if h["id"] == host_id), None)
+        max_concurrent = (host or {}).get("max_concurrent", 3)
+        sem_key = f"host:{host_id}"
+    else:
+        max_concurrent = 5 if model_type == "gemini_api" else 3
+        sem_key = f"type:{model_type}"
+
+    sem = await _get_semaphore(sem_key, max_concurrent)
+
+    system_prompt = load_context(
+        tier=tier,
+        include_long=(tier >= 2),
+        include_mid=(tier >= 2),
+        include_short=(tier >= 2),
+        role_append=role_cfg.get("system_append", ""),
+        inject_datetime=role_cfg.get("inject_datetime", True),
+    )
+
+    user_role = get_user_role(user)
+    tool_list = role_cfg.get("tools")
+    policy = get_tool_policy(user)
+    confirm_allow = set(policy.get("allow", []))
+    confirm_deny = set(policy.get("deny", []))
+
+    # Per-call tool restrictions — role config remains the authoritative ceiling
+    if allow_tools is not None:
+        if tool_list is not None:
+            tool_list = [t for t in tool_list if t in allow_tools]
+        else:
+            tool_list = list(allow_tools)
+
+    if deny_tools is not None:
+        deny_set = set(deny_tools)
+        if tool_list is not None:
+            tool_list = [t for t in tool_list if t not in deny_set]
+        else:
+            confirm_deny = confirm_deny | deny_set
+
+    # Level enforcement: children of this agent are at level _agent_level + 1.
+    # Level 3 children cannot delegate — auto-deny the spawning tools.
+    child_level = _agent_level + 1
+    if child_level >= 3:
+        l3_deny = set(_L3_DENY_TOOLS)
+        if tool_list is not None:
+            tool_list = [t for t in tool_list if t not in l3_deny]
+        else:
+            confirm_deny = confirm_deny | l3_deny
+
+    if max_rounds is not None:
+        model_cfg = dict(model_cfg)
+        model_cfg["max_rounds"] = max_rounds
+
+    async def _run() -> str:
+        if model_type == "local_openai":
+            import openai_orchestrator
+            result = await openai_orchestrator.run(
+                task=task,
+                system_prompt=system_prompt,
+                model_cfg=model_cfg,
+                respond_with_final=True,
+                user_role=user_role,
+                tool_list=tool_list,
+                confirm_allow=confirm_allow,
+                confirm_deny=confirm_deny,
+            )
+            if result.checkpoint:
+                return (
+                    "Sub-agent requires user confirmation — "
+                    "confirmation gates are not supported inside spawn_agent. "
+                    "Pre-allow the tool in the user's tool policy or use a different role."
+                )
+            return result.response or "(sub-agent returned no output)"
+
+        # gemini_api
+        import orchestrator_engine
+        from auth_utils import get_user_gemini_key
+        gemini_key = model_cfg.get("api_key") or get_user_gemini_key(user)
+        result = await orchestrator_engine.run(
+            task=task,
+            system_prompt=system_prompt,
+            session_messages=None,
+            respond_with_claude=True,
+            gemini_api_key=gemini_key,
+            model_name=model_cfg.get("model_name"),
+            response_role=role,
+            user_role=user_role,
+            tool_list=tool_list,
+            confirm_allow=confirm_allow,
+            confirm_deny=confirm_deny,
+            max_rounds=model_cfg.get("max_rounds"),
+        )
+        if result.checkpoint:
+            return (
+                "Sub-agent requires user confirmation — "
+                "confirmation gates are not supported inside spawn_agent."
+            )
+        return result.response or "(sub-agent returned no output)"
+
+    if background:
+        rec = await agent_manager.register(
+            user=user,
+            role=role,
+            task=task,
+            level=_agent_level,
+            notify=notify,
+        )
+
+        async def _bg_task() -> None:
+            async with sem:
+                try:
+                    logger.info(
+                        "spawn_agent [bg]: %s role=%s level=%d timeout=%ds",
+                        rec.agent_id[:8], role, _agent_level, timeout,
+                    )
+                    result = await asyncio.wait_for(_run(), timeout=float(timeout))
+                    await agent_manager.finish(rec.agent_id, result, "done")
+                    logger.info("spawn_agent [bg]: done %s", rec.agent_id[:8])
+                except asyncio.CancelledError:
+                    await agent_manager.finish(rec.agent_id, "Cancelled.", "cancelled")
+                    raise
+                except asyncio.TimeoutError:
+                    msg = f"Sub-agent timed out after {timeout}s (role={role})"
+                    logger.warning("spawn_agent [bg]: timeout %s", rec.agent_id[:8])
+                    await agent_manager.finish(rec.agent_id, msg, "timeout")
+                except Exception as e:
+                    logger.exception("spawn_agent [bg]: failed %s", rec.agent_id[:8])
+                    await agent_manager.finish(rec.agent_id, str(e), "failed")
+
+        bg = asyncio.create_task(_bg_task())
+        agent_manager.set_task_ref(rec.agent_id, bg)
+        return f"Agent started in background. ID: {rec.agent_id}\nUse agent_status('{rec.agent_id}') to check progress."
+
+    # Synchronous path — unchanged behaviour
+    async with sem:
+        try:
+            logger.info(
+                "spawn_agent: role=%s tier=%d timeout=%ds task=%.80s",
+                role, tier, timeout, task,
+            )
+            response = await asyncio.wait_for(_run(), timeout=float(timeout))
+            logger.info("spawn_agent: done role=%s response=%d chars", role, len(response))
+            return response
+        except asyncio.TimeoutError:
+            logger.warning("spawn_agent: timed out after %ds role=%s", timeout, role)
+            return f"Sub-agent timed out after {timeout}s (role={role})"
+        except Exception as e:
+            logger.exception("spawn_agent: failed role=%s", role)
+            return f"Sub-agent error ({role}): {e}"
+
+
+# ── Agent lifecycle tools ─────────────────────────────────────────────────────
+
+async def agent_status(agent_id: str) -> str:
+    """Return the status and result preview of a background agent."""
+    from persona import get_user
+    user = get_user() or "unknown"
+    rec = agent_manager.get(agent_id)
+    if not rec:
+        return f"No agent found with ID: {agent_id}"
+    if rec.user != user:
+        return "Access denied."
+
+    now = datetime.now()
+    end = rec.finished or now
+    elapsed = int((end - rec.started).total_seconds())
+
+    lines = [
+        f"Agent {rec.agent_id[:8]}…",
+        f"  Status:  {rec.status}",
+        f"  Role:    {rec.role}  (Level {rec.level})",
+        f"  Elapsed: {elapsed}s",
+        f"  Started: {rec.started.strftime('%Y-%m-%d %H:%M:%S')}",
+        f"  Task:    {rec.task}",
+    ]
+    if rec.parent_id:
+        lines.append(f"  Parent:  {rec.parent_id[:8]}…")
+    if rec.result is not None:
+        lines.append(f"  Result:  {rec.result[:300]}")
+    return "\n".join(lines)
+
+
+async def agent_list(status: str | None = None, limit: int = 10) -> str:
+    """List background agents for the current user."""
+    from persona import get_user
+    user = get_user() or "unknown"
+    limit = min(max(int(limit), 1), 50)
+    records = agent_manager.list_agents(user, status=status, limit=limit)
+
+    if not records:
+        suffix = f" (filter: status={status})" if status else ""
+        return f"No agents found.{suffix}"
+
+    now = datetime.now()
+    lines = []
+    for rec in records:
+        end = rec.finished or now
+        elapsed = int((end - rec.started).total_seconds())
+        preview = rec.task[:60].replace("\n", " ")
+        result_hint = f" → {rec.result[:50]}" if rec.result else ""
+        lines.append(
+            f"[{rec.agent_id[:8]}] {rec.status:<10s} L{rec.level} "
+            f"{rec.role:<12s} {elapsed:>5}s  {preview}{result_hint}"
+        )
+
+    header = f"{len(records)} agent(s)" + (f" (status={status})" if status else "") + ":"
+    return header + "\n" + "\n".join(lines)
+
+
+async def agent_cancel(agent_id: str) -> str:
+    """Cancel a running background agent."""
+    from persona import get_user
+    user = get_user() or "unknown"
+    return await agent_manager.cancel_agent(agent_id, user)
+
+
+# ── Declarations ──────────────────────────────────────────────────────────────
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="spawn_agent",
+        description=(
+            "Spawn a sub-agent to complete a task. "
+            "In synchronous mode (default): blocks until the sub-agent finishes and returns its response. "
+            "In background mode (background=True): fires the agent asynchronously and returns an agent_id "
+            "immediately — use agent_status() to check progress or set notify=True for a completion alert. "
+            "The sub-agent uses the model and tool set assigned to the given role. "
+            "Use for processing pipelines, parallel analysis, or delegating specialized work "
+            "(research, coding, data migration, etc.)."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "task": types.Schema(
+                    type=types.Type.STRING,
+                    description="The complete task description for the sub-agent.",
+                ),
+                "role": types.Schema(
+                    type=types.Type.STRING,
+                    description=(
+                        "Role determining the model and tools. "
+                        "E.g. 'research' for web lookups, 'coder' for code tasks, "
+                        "'distill' for summarization. Defaults to 'chat'."
+                    ),
+                ),
+                "tier": types.Schema(
+                    type=types.Type.INTEGER,
+                    description=(
+                        "Context tier: 1 = minimal (fast, identity only), "
+                        "2 = standard (+ memory), 3 = + last 2 session logs. "
+                        "Use 1 for pure processing tasks."
+                    ),
+                ),
+                "timeout": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Max seconds to wait (default 120). Applies in both sync and background mode.",
+                ),
+                "max_rounds": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Override max tool-loop iterations for this call.",
+                ),
+                "allow_tools": types.Schema(
+                    type=types.Type.ARRAY,
+                    items=types.Schema(type=types.Type.STRING),
+                    description=(
+                        "Restrict the sub-agent to only these tools. "
+                        "Intersected with the role's tool set — cannot grant more than the role allows. "
+                        "Example: ['web_search', 'web_read'] for a pure research agent."
+                    ),
+                ),
+                "deny_tools": types.Schema(
+                    type=types.Type.ARRAY,
+                    items=types.Schema(type=types.Type.STRING),
+                    description=(
+                        "Block these tools from the sub-agent regardless of role config. "
+                        "Example: ['shell_exec', 'file_write', 'cortex_restart']."
+                    ),
+                ),
+                "background": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description=(
+                        "Run asynchronously in the background (default: false). "
+                        "When true, returns an agent_id immediately instead of blocking for the result. "
+                        "Use agent_status(agent_id) to check progress. "
+                        "Best for tasks that take more than ~30 seconds."
+                    ),
+                ),
+                "notify": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description=(
+                        "Send a push/Talk notification when the background agent completes (default: false). "
+                        "Only meaningful when background=true."
+                    ),
+                ),
+            },
+            required=["task"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="agent_status",
+        description=(
+            "Get the current status of a background agent by ID. "
+            "Returns status (running/done/failed/cancelled/timeout), role, elapsed time, "
+            "task description, and result preview."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "agent_id": types.Schema(
+                    type=types.Type.STRING,
+                    description="The agent ID returned by spawn_agent(background=True) or aider_run(background=True).",
+                ),
+            },
+            required=["agent_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="agent_list",
+        description=(
+            "List background agents for the current user. "
+            "Returns recent agents with ID, status, role, level, elapsed time, and task preview. "
+            "Use to survey what's running or recently completed."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "status": types.Schema(
+                    type=types.Type.STRING,
+                    description="Filter by status: 'running', 'done', 'failed', 'cancelled', 'timeout'. Omit for all.",
+                ),
+                "limit": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Max agents to return (default 10, max 50).",
+                ),
+            },
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="agent_cancel",
+        description=(
+            "Cancel a running background agent. ADMIN ONLY. Requires confirmation. "
+            "Use agent_list() to find the agent ID first."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "agent_id": types.Schema(
+                    type=types.Type.STRING,
+                    description="The agent ID to cancel.",
+                ),
+            },
+            required=["agent_id"],
+        ),
+    ),
+]
--- a/cortex/tools/aider.py
+++ b/cortex/tools/aider.py
@@ -0,0 +1,406 @@
+"""
+Aider coding agent tool — invokes Aider AI pair programming as a subprocess.
+
+Aider handles repo-map generation, file editing, git commits, and linting automatically.
+It works with any OpenAI-compatible model — point it at DeepSeek, Ollama, OpenRouter, etc.
+via AIDER_MODEL / AIDER_OPENAI_API_BASE env vars or the project's .aider.conf.yml.
+
+Credentials are pulled automatically from the Cortex model registry:
+  - Named cloud providers (OpenRouter, OpenAI, Groq, Anthropic, …) → --api-key slug=key
+  - Generic OpenAI-compatible hosts (Open WebUI, Ollama, local) → --openai-api-base + key
+  - Anthropic from providers.anthropic.credentials → --api-key anthropic=key
+
+background=True runs the subprocess asynchronously and returns an agent_id immediately.
+"""
+
+import asyncio
+import logging
+import os
+from pathlib import Path
+
+from google.genai import types
+
+import agent_manager
+
+logger = logging.getLogger(__name__)
+
+_CORTEX_DIR = Path(__file__).parent      # .../Cortex_and_Inara_dev/cortex/
+_PROJECT_ROOT = _CORTEX_DIR.parent      # .../Cortex_and_Inara_dev/
+
+# Known project aliases — expand before passing to subprocess
+_PROJECT_ALIASES: dict[str, str] = {
+    "cortex":           str(_PROJECT_ROOT),
+    "aether_api":       "~/OSIT_dev/aether_api_fastapi",
+    "aether_frontend":  "~/OSIT_dev/aether_app_sveltekit",
+    "aether_container": "~/OSIT_dev/aether_container_env",
+}
+
+_MAX_OUTPUT_CHARS = 12_000
+
+# Maps URL fragments → Aider --api-key provider slug.
+# Order matters: more specific patterns first.
+_CLOUD_PROVIDER_URL_MAP: list[tuple[str, str]] = [
+    ("openrouter.ai",    "openrouter"),
+    ("api.openai.com",   "openai"),
+    ("groq.com",         "groq"),
+    ("api.together.xyz", "togetherai"),
+    ("fireworks.ai",     "fireworks"),
+    ("api.x.ai",         "xai"),
+    ("api.deepseek.com", "deepseek"),
+    ("api.mistral.ai",   "mistral"),
+]
+
+
+def _provider_slug(api_url: str) -> str | None:
+    """Return the Aider --api-key provider slug for a known cloud URL, None for generic."""
+    url_lower = api_url.lower()
+    for fragment, slug in _CLOUD_PROVIDER_URL_MAP:
+        if fragment in url_lower:
+            return slug
+    return None
+
+
+def _host_flags(host: dict, model: str | None) -> tuple[list[str], str | None]:
+    """Build Aider credential flags for a specific host entry.
+
+    Returns (extra_args, adjusted_model). For generic (local) endpoints the model
+    name may be prefixed with 'openai/' so Aider routes through the OpenAI client.
+    """
+    api_url  = (host.get("api_url") or "").rstrip("/")
+    api_key  = host.get("api_key") or "none"
+    host_type = host.get("host_type", "openai")
+    slug     = _provider_slug(api_url)
+
+    if slug:
+        # Named cloud provider — Aider maps --api-key slug=key → SLUG_API_KEY env var
+        flags = ["--api-key", f"{slug}={api_key}"] if api_key and api_key != "none" else []
+        return flags, model
+
+    # Generic OpenAI-compatible (local Open WebUI, Ollama, custom)
+    base_url = api_url
+    if host_type == "openwebui":
+        # Open WebUI serves the chat endpoint at /api/chat/completions
+        base_url = base_url + "/api"
+
+    flags = ["--openai-api-base", base_url, "--openai-api-key", api_key]
+
+    # Prefix model with 'openai/' for generic endpoints when no provider prefix is set
+    adj_model = model
+    if model and "/" not in model:
+        adj_model = f"openai/{model}"
+
+    return flags, adj_model
+
+
+def _resolve_credentials(
+    registry: dict,
+    model: str | None,
+    host_label: str | None,
+) -> tuple[list[str], str | None]:
+    """Determine Aider credential flags and (possibly adjusted) model name.
+
+    Resolution order:
+    1. Anthropic model hint (claude-* / anthropic/*) → Anthropic API key
+    2. Explicit host_label → that host's credentials
+    3. Model prefix hint (openrouter/*, groq/*, …) → matching host
+    4. Default priority: OpenRouter → Anthropic → any keyed cloud host → local host
+
+    Returns (extra_args, adjusted_model).
+    """
+    hosts = registry.get("hosts", [])
+
+    # Extract Anthropic key from providers.anthropic.credentials (not a host entry)
+    anthropic_key = None
+    for cred in registry.get("providers", {}).get("anthropic", {}).get("credentials", []):
+        if cred.get("api_key"):
+            anthropic_key = cred["api_key"]
+            break
+
+    # ── 1. Anthropic model hint ────────────────────────────────────────────────
+    if model and any(h in model.lower() for h in ("claude-", "anthropic/")):
+        if anthropic_key:
+            logger.debug("aider: Anthropic model detected — using Anthropic API key")
+            return ["--api-key", f"anthropic={anthropic_key}"], model
+
+    # ── 2. Explicit host_label override ───────────────────────────────────────
+    if host_label:
+        ll = host_label.lower()
+        host = next((h for h in hosts if ll in h.get("label", "").lower()), None)
+        if host:
+            logger.debug("aider: using explicitly requested host '%s'", host.get("label"))
+            return _host_flags(host, model)
+
+    # ── 3. Model prefix hints ─────────────────────────────────────────────────
+    if model:
+        ml = model.lower()
+        for fragment, slug in _CLOUD_PROVIDER_URL_MAP:
+            if ml.startswith(slug + "/") or ml.startswith(fragment):
+                host = next(
+                    (h for h in hosts if fragment in h.get("api_url", "").lower()), None
+                )
+                if host:
+                    logger.debug("aider: model prefix '%s' → host '%s'", slug, host.get("label"))
+                    return _host_flags(host, model)
+
+    # ── 4. Default priority ───────────────────────────────────────────────────
+    # OpenRouter first (most model coverage)
+    or_host = next((h for h in hosts if "openrouter.ai" in h.get("api_url", "")), None)
+    if or_host and or_host.get("api_key"):
+        logger.debug("aider: defaulting to OpenRouter")
+        return _host_flags(or_host, model)
+
+    # Anthropic API key (no model hint but it's configured)
+    if anthropic_key:
+        logger.debug("aider: defaulting to Anthropic API key")
+        return ["--api-key", f"anthropic={anthropic_key}"], model
+
+    # Any other keyed cloud host
+    for host in hosts:
+        slug = _provider_slug(host.get("api_url", ""))
+        if slug and host.get("api_key"):
+            logger.debug("aider: using keyed cloud host '%s'", host.get("label"))
+            return _host_flags(host, model)
+
+    # Generic / local host (no key or unknown provider)
+    for host in hosts:
+        flags, adj_model = _host_flags(host, model)
+        if flags:
+            logger.debug("aider: using local host '%s'", host.get("label"))
+            return flags, adj_model
+
+    logger.debug("aider: no credentials found in registry — relying on env vars / .aider.conf.yml")
+    return [], model
+
+
+async def aider_run(
+    project: str,
+    task: str,
+    files: list[str] | None = None,
+    model: str | None = None,
+    host_label: str | None = None,
+    auto_commit: bool = True,
+    timeout: int = 300,
+    background: bool = False,
+    notify: bool = False,
+) -> str:
+    """Run Aider with a single task in a project directory, then exit.
+
+    Credentials are resolved automatically from the Cortex model registry. Use
+    host_label to pick a specific configured host (e.g. 'OpenRouter', 'Local').
+
+    When background=True, fires the subprocess asynchronously and returns an agent_id
+    immediately. Use agent_status(agent_id) to check progress; set notify=True to
+    receive a push/Talk notification on completion.
+    """
+    resolved = _PROJECT_ALIASES.get(project, project)
+    cwd = Path(os.path.expanduser(resolved))
+
+    if not cwd.is_dir():
+        return f"Error: project directory '{resolved}' does not exist."
+
+    timeout = min(max(int(timeout), 10), 600)
+
+    # Resolve credentials before building the command (model name may be adjusted)
+    user = "scott"
+    extra_cred_flags: list[str] = []
+    try:
+        import model_registry
+        from persona import get_user
+        user = get_user() or "scott"
+        registry = model_registry.get_registry(user)
+        extra_cred_flags, model = _resolve_credentials(registry, model, host_label)
+    except Exception as e:
+        logger.debug("aider: credential resolution failed (%s) — relying on env", e)
+
+    cmd: list[str] = [
+        "aider",
+        "--message", task,
+        "--yes-always",
+        "--no-pretty",
+        "--no-stream",
+        "--no-check-update",
+        "--no-detect-urls",
+        "--auto-commits" if auto_commit else "--no-auto-commits",
+    ]
+
+    cmd += extra_cred_flags
+
+    if model:
+        cmd += ["--model", model]
+
+    for f in (files or []):
+        cmd += ["--file", f]
+
+    logger.info(
+        "aider_run: project=%s model=%s host_label=%s auto_commit=%s background=%s task=%.120s",
+        project, model, host_label, auto_commit, background, task,
+    )
+
+    async def _run() -> str:
+        proc = await asyncio.create_subprocess_exec(
+            *cmd,
+            cwd=str(cwd),
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE,
+        )
+        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=float(timeout))
+
+        out = stdout.decode(errors="replace").strip()
+        err = stderr.decode(errors="replace").strip()
+
+        parts = []
+        if out:
+            parts.append(out)
+        if err:
+            parts.append(f"[stderr]\n{err}")
+        combined = "\n".join(parts) if parts else "(no output)"
+
+        if len(combined) > _MAX_OUTPUT_CHARS:
+            half = _MAX_OUTPUT_CHARS // 2
+            combined = (
+                combined[:half]
+                + f"\n\n[... {len(combined) - _MAX_OUTPUT_CHARS} chars trimmed ...]\n\n"
+                + combined[-half:]
+            )
+
+        if proc.returncode not in (0, 1):
+            return f"[exit {proc.returncode}]\n{combined}"
+        return combined
+
+    if background:
+        rec = await agent_manager.register(
+            user=user,
+            role="aider",
+            task=task,
+            level=2,
+            notify=notify,
+        )
+
+        async def _bg_task() -> None:
+            try:
+                result = await _run()
+                await agent_manager.finish(rec.agent_id, result, "done")
+                logger.info("aider_run [bg]: done %s", rec.agent_id[:8])
+            except asyncio.CancelledError:
+                await agent_manager.finish(rec.agent_id, "Cancelled.", "cancelled")
+                raise
+            except asyncio.TimeoutError:
+                msg = f"Aider timed out after {timeout}s"
+                logger.warning("aider_run [bg]: timeout %s", rec.agent_id[:8])
+                await agent_manager.finish(rec.agent_id, msg, "timeout")
+            except FileNotFoundError:
+                msg = "Error: 'aider' not found in PATH — run: pip install aider-chat"
+                await agent_manager.finish(rec.agent_id, msg, "failed")
+            except Exception as e:
+                logger.error("aider_run [bg]: failed %s: %s", rec.agent_id[:8], e)
+                await agent_manager.finish(rec.agent_id, str(e), "failed")
+
+        bg = asyncio.create_task(_bg_task())
+        agent_manager.set_task_ref(rec.agent_id, bg)
+        return (
+            f"Aider task started in background. ID: {rec.agent_id}\n"
+            f"Use agent_status('{rec.agent_id}') to monitor progress."
+        )
+
+    # Synchronous path
+    try:
+        return await _run()
+    except asyncio.TimeoutError:
+        return f"Error: aider timed out after {timeout}s"
+    except FileNotFoundError:
+        return "Error: 'aider' not found in PATH — run: pip install aider-chat"
+    except Exception as e:
+        logger.error("aider_run error: %s", e)
+        return f"Error: {e}"
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="aider_run",
+        description=(
+            "Run the Aider AI coding agent on a project with a single task, then exit. "
+            "Aider maps the repo, edits files, runs lint checks, and optionally commits. "
+            "Credentials are resolved automatically from the Cortex model registry — "
+            "OpenRouter, local Open WebUI/Ollama, Anthropic API, and other configured hosts "
+            "are all supported. Use host_label to pick a specific host. "
+            "Set background=True for long tasks — returns an agent_id immediately and sends "
+            "a notification when done. ADMIN ONLY. Requires confirmation."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "project": types.Schema(
+                    type=types.Type.STRING,
+                    description=(
+                        "Project alias or absolute path. Known aliases: "
+                        "'cortex' (this project), 'aether_api', 'aether_frontend', "
+                        "'aether_container'. Or provide an absolute path."
+                    ),
+                ),
+                "task": types.Schema(
+                    type=types.Type.STRING,
+                    description=(
+                        "Full task description sent to Aider as --message. "
+                        "Be specific — include file names, what to change, and why."
+                    ),
+                ),
+                "files": types.Schema(
+                    type=types.Type.ARRAY,
+                    items=types.Schema(type=types.Type.STRING),
+                    description=(
+                        "Optional files to add explicitly to the editing context "
+                        "(paths relative to project root). Aider builds a repo map "
+                        "automatically — these get priority."
+                    ),
+                ),
+                "model": types.Schema(
+                    type=types.Type.STRING,
+                    description=(
+                        "Optional model override. Format depends on the provider: "
+                        "'openrouter/anthropic/claude-3-5-haiku-20241022' (OpenRouter), "
+                        "'claude-3-5-sonnet-20241022' (Anthropic direct), "
+                        "'gemma-4-27b-it' or 'openai/gemma-4-27b-it' (local Open WebUI), "
+                        "'deepseek/deepseek-chat' (DeepSeek via OpenRouter). "
+                        "Defaults to the project's .aider.conf.yml model or AIDER_MODEL env var."
+                    ),
+                ),
+                "host_label": types.Schema(
+                    type=types.Type.STRING,
+                    description=(
+                        "Pick a specific configured host by label (partial match, case-insensitive). "
+                        "Examples: 'OpenRouter', 'Local', 'scott-lt-i7-rtx'. "
+                        "Overrides automatic credential resolution. "
+                        "Omit to let credentials be chosen automatically."
+                    ),
+                ),
+                "auto_commit": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description=(
+                        "Auto-commit changes after edits (default: true). "
+                        "Set to false to review diffs before committing manually."
+                    ),
+                ),
+                "timeout": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Max seconds to wait for Aider to finish (default 300, max 600).",
+                ),
+                "background": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description=(
+                        "Run asynchronously in the background (default: false). "
+                        "Returns an agent_id immediately; use agent_status(agent_id) to monitor. "
+                        "Recommended for tasks expected to take more than ~60 seconds."
+                    ),
+                ),
+                "notify": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description=(
+                        "Send a push/Talk notification when the background task completes "
+                        "(default: false). Only applies when background=true."
+                    ),
+                ),
+            },
+            required=["project", "task"],
+        ),
+    )
+]
--- a/cortex/tools/cron.py
+++ b/cortex/tools/cron.py
@@ -17,6 +17,7 @@ import secrets
 from datetime import datetime, timezone
 from pathlib import Path

+from google.genai import types
 from persona import persona_path, get_user, get_persona
 from cron_runner import load_crons, save_crons, parse_schedule

@@ -57,8 +58,9 @@ def _cron_add(label: str, schedule: str, job_type: str, payload: str) -> str:
    except ValueError as e:
        return f"Bad schedule: {e}"

-    if job_type not in ("remind", "note"):
-        return "Bad type: must be 'remind' or 'note'."
+    _VALID_TYPES = ("remind", "note", "message", "brief", "task")
+    if job_type not in _VALID_TYPES:
+        return f"Bad type: must be one of {', '.join(_VALID_TYPES)}."

    current_user = get_user()
    current_persona = get_persona()
@@ -194,3 +196,73 @@ async def cron_toggle(cron_id: str) -> str:

 async def reminders_clear() -> str:
    return await asyncio.to_thread(_reminders_clear)
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="cron_list",
+        description=(
+            "List all scheduled cron jobs — their ID, label, schedule, type, and last run time. "
+            "Use this to see what's scheduled before adding or removing jobs."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+    types.FunctionDeclaration(
+        name="cron_add",
+        description=(
+            "Create a new scheduled cron job and register it immediately (no restart needed). "
+            "Job types: "
+            "'remind' — appends to REMINDERS.md, auto-surfaced in chat context at tier 2+; "
+            "'note' — appends to SCRATCH.md, read on demand; "
+            "'message' — sends payload text directly to the user's notification channel; "
+            "'brief' — calls the LLM (no tools) with payload as the prompt, sends the response; "
+            "'task' — runs the full orchestrator tool loop with payload as the request, sends "
+            "Claude's response to the notification channel (use for agentic scheduled work: "
+            "research, checks, file updates, summaries that need tool access). "
+            "Schedule formats: 'hourly' | 'daily' | 'daily:HH:MM' | 'weekly:DOW' | 'weekly:DOW:HH:MM' | "
+            "'monthly' | 'monthly:DD' | 'monthly:DD:HH:MM' | 'yearly:MM:DD' | 'yearly:MM:DD:HH:MM'. "
+            "Examples: schedule='weekly:mon:08:00' for Monday briefings; "
+            "schedule='monthly:1:09:00' for a first-of-month review; "
+            "schedule='yearly:03:15' for a March 15 birthday reminder."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "label": types.Schema(type=types.Type.STRING, description="Short human-readable name for this job (e.g. 'Monday task summary')"),
+                "schedule": types.Schema(type=types.Type.STRING, description="When to run: hourly | daily | daily:HH:MM | weekly:DOW | weekly:DOW:HH:MM | monthly | monthly:DD | monthly:DD:HH:MM | yearly:MM:DD | yearly:MM:DD:HH:MM"),
+                "job_type": types.Schema(type=types.Type.STRING, description="remind | note | message | brief | task"),
+                "payload": types.Schema(type=types.Type.STRING, description="The text/prompt to use when the job fires"),
+            },
+            required=["label", "schedule", "job_type", "payload"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="cron_remove",
+        description=(
+            "Permanently delete a scheduled cron job. Use cron_list first to get the ID. "
+            "To temporarily disable without deleting, use cron_toggle instead."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "cron_id": types.Schema(type=types.Type.STRING, description="Job ID (e.g. c_abc123) — get from cron_list"),
+            },
+            required=["cron_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="cron_toggle",
+        description=(
+            "Pause a running cron job, or resume a paused one. "
+            "The job stays in the list and can be re-enabled later. "
+            "Use cron_list to see current enabled/paused state."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "cron_id": types.Schema(type=types.Type.STRING, description="Job ID (e.g. c_abc123) — get from cron_list"),
+            },
+            required=["cron_id"],
+        ),
+    ),
+]
--- a/cortex/tools/files.py
+++ b/cortex/tools/files.py
@@ -1,108 +1,78 @@
 """
-File read tool — restricted to known-safe directory roots.
+File read/write/search tools — two access scopes.

-Lets the orchestrator read local files (documentation, notes, config references)
-without exposing arbitrary filesystem access. All paths are resolved and checked
-against an allowlist of roots before any read is performed.
+  Project scope (no admin required):
+    project_file_read   — read a file with optional line-range (offset)
+    project_file_list   — list a directory with sizes + timestamps
+    file_stat           — size, modified time, line count for a path
+    file_grep           — regex search with context lines; up to 50 matches
+    file_syntax_check   — py_compile (.py) or json.loads (.json) check
+
+  System scope (admin-only):
+    file_read           — read a file from ~/agents_sync/, ~/OSIT_dev/, etc.
+    file_list           — list a directory (same roots)
+    file_write          — write/append (~/agents_sync/ + Cortex home/)
+
+  Session tools (user-level, persona-isolated):
+    session_read        — read a session log by date
+    session_search      — keyword search across session logs
+
+All project-scope tools are restricted to the Cortex project root:
+  ~/agents_sync/projects/Cortex_and_Inara_dev/
 """

 import asyncio
+import json
 import logging
+import re
+import subprocess
+from datetime import datetime
 from pathlib import Path

+from google.genai import types
+
 logger = logging.getLogger(__name__)

-# Directories the orchestrator is allowed to read from.
-# Paths are resolved (symlinks followed, ~ expanded) at import time.
-_ALLOWED_ROOTS: list[Path] = [
-    Path.home() / "agents_sync",
-    Path.home() / "OSIT_dev",
-    Path.home() / "DgrZone_Nextcloud",
-    Path.home() / "OSIT_Nextcloud",
-]
+# ── Access roots ──────────────────────────────────────────────────────────────

-# Hard cap on file size to prevent accidental context blowout
-_MAX_BYTES = 50_000   # ~50 KB
-_MAX_LINES = 500
+# Project root: two levels up from cortex/tools/files.py → Cortex_and_Inara_dev/
+_PROJECT_ROOT: Path = Path(__file__).parent.parent.parent.resolve()

-
-async def file_read(path: str, max_lines: int | None = None) -> str:
-    """Read a local file and return its contents as a string.
-
-    Only files within allowed directories can be read:
-      ~/agents_sync/, ~/OSIT_dev/, ~/DgrZone_Nextcloud/, ~/OSIT_Nextcloud/
-
-    Args:
-        path:      Absolute or home-relative path to the file (e.g. ~/agents_sync/CLAUDE.md).
-        max_lines: Optional line limit (default 500, hard cap). Use for large files.
-
-    Returns the file contents (truncated if over the size limit), or an error message.
-    """
-    return await asyncio.to_thread(_sync_file_read, path, max_lines)
-
-
-def _sync_file_read(path: str, max_lines: int | None) -> str:
-    # Expand ~ and resolve to absolute path
+# System-wide read roots
+def _build_allowed_roots() -> list[Path]:
+    roots = [
+        Path.home() / "agents_sync",
+        Path.home() / "OSIT_dev",
+        Path.home() / "DgrZone_Nextcloud",
+        Path.home() / "OSIT_Nextcloud",
+    ]
    try:
-        resolved = Path(path).expanduser().resolve()
-    except Exception as e:
-        return f"Invalid path: {e}"
+        from config import settings
+        roots.append(settings.home_root())
+    except Exception:
+        pass
+    return roots

-    # Security check — must be under an allowed root
-    if not _is_allowed(resolved):
-        allowed_str = ", ".join(str(r) for r in _ALLOWED_ROOTS)
-        return (
-            f"Access denied: {resolved}\n"
-            f"Allowed directories: {allowed_str}"
-        )
+_ALLOWED_ROOTS: list[Path] = _build_allowed_roots()

-    if not resolved.exists():
-        return f"File not found: {resolved}"
+# Write is tighter
+_WRITE_ROOTS: list[Path] = [Path.home() / "agents_sync"]

-    if not resolved.is_file():
-        # If it's a directory, list its contents instead
-        try:
-            entries = sorted(resolved.iterdir())
-            names = [e.name + ("/" if e.is_dir() else "") for e in entries[:100]]
-            return f"Directory listing for {resolved}:\n" + "\n".join(names)
-        except Exception as e:
-            return f"Cannot list directory: {e}"
+# Size limits
+_MAX_BYTES  = 50_000
+_MAX_LINES  = 500
+_MAX_GREP_MATCHES = 50

-    # Read the file
+
+def _is_project_allowed(resolved: Path) -> bool:
    try:
-        raw = resolved.read_bytes()
-    except Exception as e:
-        return f"Read error: {e}"
-
-    # Binary files
-    try:
-        text = raw.decode("utf-8")
-    except UnicodeDecodeError:
-        return f"Binary file (not readable as text): {resolved}  [{len(raw)} bytes]"
-
-    # Apply line limit
-    limit = min(max_lines or _MAX_LINES, _MAX_LINES)
-    lines = text.splitlines()
-    truncated = False
-
-    if len(lines) > limit:
-        lines = lines[:limit]
-        truncated = True
-
-    # Apply byte cap as a final safety net
-    result = "\n".join(lines)
-    if len(result) > _MAX_BYTES:
-        result = result[:_MAX_BYTES]
-        truncated = True
-
-    if truncated:
-        result += f"\n\n… [truncated — file has {len(text.splitlines())} lines total]"
-
-    return result
+        resolved.relative_to(_PROJECT_ROOT)
+        return True
+    except ValueError:
+        return False


 def _is_allowed(resolved: Path) -> bool:
-    """Check that resolved path is under one of the allowed roots."""
    for root in _ALLOWED_ROOTS:
        try:
            resolved.relative_to(root)
@@ -110,3 +80,725 @@ def _is_allowed(resolved: Path) -> bool:
        except ValueError:
            continue
    return False
+
+
+def _is_write_allowed(resolved: Path) -> bool:
+    for root in _WRITE_ROOTS:
+        try:
+            resolved.relative_to(root)
+            return True
+        except ValueError:
+            continue
+    try:
+        from config import settings
+        resolved.relative_to(settings.home_root())
+        return True
+    except (ValueError, Exception):
+        pass
+    return False
+
+
+# ── Shared implementations ────────────────────────────────────────────────────
+
+def _read_impl(path_str: str, offset: int | None, max_lines: int | None, is_allowed_fn) -> str:
+    try:
+        resolved = Path(path_str).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    if not is_allowed_fn(resolved):
+        return f"Access denied: {resolved}"
+
+    if not resolved.exists():
+        return f"File not found: {resolved}"
+
+    if not resolved.is_file():
+        try:
+            entries = sorted(resolved.iterdir())
+            names = [e.name + ("/" if e.is_dir() else "") for e in entries[:100]]
+            return f"Directory listing for {resolved}:\n" + "\n".join(names)
+        except Exception as e:
+            return f"Cannot list directory: {e}"
+
+    try:
+        raw = resolved.read_bytes()
+    except Exception as e:
+        return f"Read error: {e}"
+
+    try:
+        text = raw.decode("utf-8")
+    except UnicodeDecodeError:
+        return f"Binary file (not readable as text): {resolved}  [{len(raw)} bytes]"
+
+    all_lines = text.splitlines()
+    total = len(all_lines)
+
+    # offset is 1-based; default = start of file
+    start = max(0, (offset or 1) - 1)
+    working = all_lines[start:]
+
+    limit = min(max_lines or _MAX_LINES, _MAX_LINES)
+    truncated = False
+    if len(working) > limit:
+        working = working[:limit]
+        truncated = True
+
+    result = "\n".join(working)
+    if len(result) > _MAX_BYTES:
+        result = result[:_MAX_BYTES]
+        truncated = True
+
+    end_line = start + len(working)
+    header = f"[Lines {start + 1}–{end_line} of {total}]\n" if (start > 0 or truncated) else ""
+    trailer = f"\n\n… [truncated — file has {total} lines; use offset={end_line + 1} to read more]" if truncated else ""
+
+    return header + result + trailer
+
+
+def _list_impl(path_str: str, is_allowed_fn) -> str:
+    try:
+        resolved = Path(path_str).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    if not is_allowed_fn(resolved):
+        return f"Access denied: {resolved}"
+
+    if not resolved.exists():
+        return f"Path not found: {resolved}"
+
+    if resolved.is_file():
+        return f"{resolved} is a file. Use file_read / project_file_read to read it."
+
+    try:
+        entries = sorted(resolved.iterdir(), key=lambda e: (e.is_file(), e.name.lower()))
+        lines = []
+        for e in entries[:200]:
+            if e.is_dir():
+                suffix = "/"
+            else:
+                try:
+                    st = e.stat()
+                    mtime = datetime.fromtimestamp(st.st_mtime).strftime("%Y-%m-%d %H:%M")
+                    suffix = f"  ({st.st_size:,} B, {mtime})"
+                except Exception:
+                    suffix = ""
+            lines.append(f"{e.name}{suffix}")
+        result = "\n".join(lines)
+        if len(entries) > 200:
+            result += f"\n… ({len(entries) - 200} more not shown)"
+        return f"Contents of {resolved}:\n\n{result}"
+    except Exception as e:
+        return f"Cannot list directory: {e}"
+
+
+# ── Project-scoped tools ──────────────────────────────────────────────────────
+
+async def project_file_read(path: str, offset: int | None = None, max_lines: int | None = None) -> str:
+    """Read a file within the Cortex project directory, with optional line range."""
+    return await asyncio.to_thread(_read_impl, path, offset, max_lines, _is_project_allowed)
+
+
+async def project_file_list(path: str) -> str:
+    """List directory contents within the Cortex project directory, with sizes and timestamps."""
+    return await asyncio.to_thread(_list_impl, path, _is_project_allowed)
+
+
+async def file_stat(path: str) -> str:
+    """Return metadata for a file or directory: type, size, modified time, line count."""
+    return await asyncio.to_thread(_sync_file_stat, path)
+
+
+def _sync_file_stat(path_str: str) -> str:
+    try:
+        resolved = Path(path_str).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    if not _is_project_allowed(resolved):
+        return f"Access denied: {resolved}\nProject root: {_PROJECT_ROOT}"
+
+    if not resolved.exists():
+        return f"Path not found: {resolved}"
+
+    try:
+        st = resolved.stat()
+    except Exception as e:
+        return f"Cannot stat: {e}"
+
+    modified = datetime.fromtimestamp(st.st_mtime).strftime("%Y-%m-%d %H:%M:%S")
+    lines = [
+        f"Path:     {resolved}",
+        f"Type:     {'directory' if resolved.is_dir() else 'file'}",
+        f"Size:     {st.st_size:,} bytes",
+        f"Modified: {modified}",
+    ]
+
+    if resolved.is_file():
+        try:
+            raw = resolved.read_bytes()
+            if b'\x00' not in raw[:1024]:
+                lines.append(f"Lines:    {len(raw.decode('utf-8', errors='replace').splitlines())}")
+        except Exception:
+            pass
+    elif resolved.is_dir():
+        try:
+            entries = list(resolved.iterdir())
+            n_files = sum(1 for e in entries if e.is_file())
+            n_dirs  = sum(1 for e in entries if e.is_dir())
+            lines.append(f"Contents: {n_files} file(s), {n_dirs} subdirector{'y' if n_dirs == 1 else 'ies'}")
+        except Exception:
+            pass
+
+    return "\n".join(lines)
+
+
+async def file_grep(path: str, pattern: str, context_lines: int = 2, recursive: bool = True) -> str:
+    """Search for a regex pattern in a file or directory, returning matching lines with context."""
+    return await asyncio.to_thread(_sync_file_grep, path, pattern, context_lines, recursive)
+
+
+def _sync_file_grep(path_str: str, pattern: str, context_lines: int, recursive: bool) -> str:
+    try:
+        resolved = Path(path_str).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    if not _is_project_allowed(resolved):
+        return f"Access denied: {resolved}\nProject root: {_PROJECT_ROOT}"
+
+    if not resolved.exists():
+        return f"Path not found: {resolved}"
+
+    try:
+        regex = re.compile(pattern, re.IGNORECASE)
+    except re.error as e:
+        return f"Invalid regex pattern: {e}"
+
+    ctx = max(0, min(context_lines, 5))
+
+    if resolved.is_file():
+        files_to_search = [resolved]
+    elif recursive:
+        files_to_search = sorted(f for f in resolved.rglob("*") if f.is_file())
+    else:
+        files_to_search = sorted(f for f in resolved.iterdir() if f.is_file())
+
+    total_matches = 0
+    sections: list[str] = []
+    capped = False
+
+    for fp in files_to_search:
+        if total_matches >= _MAX_GREP_MATCHES:
+            capped = True
+            break
+        try:
+            raw = fp.read_bytes()
+        except OSError:
+            continue
+        if b'\x00' in raw[:1024]:
+            continue  # skip binary
+        try:
+            text = raw.decode("utf-8", errors="replace")
+        except Exception:
+            continue
+
+        file_lines = text.splitlines()
+        match_indices = [i for i, line in enumerate(file_lines) if regex.search(line)]
+        if not match_indices:
+            continue
+
+        total_matches += len(match_indices)
+
+        try:
+            label = str(fp.relative_to(_PROJECT_ROOT))
+        except ValueError:
+            label = str(fp)
+
+        file_output = [f"── {label} ──"]
+        printed: set[int] = set()
+
+        for mi in match_indices:
+            start = max(0, mi - ctx)
+            end   = min(len(file_lines), mi + ctx + 1)
+            if printed and start > max(printed) + 1:
+                file_output.append("  ···")
+            for j in range(start, end):
+                if j not in printed:
+                    marker = "►" if j == mi else " "
+                    file_output.append(f"  {j + 1:4d}{marker} {file_lines[j]}")
+                    printed.add(j)
+
+        sections.append("\n".join(file_output))
+
+    if not sections:
+        return f"No matches for '{pattern}' in {resolved}"
+
+    cap_note = f" (capped at {_MAX_GREP_MATCHES})" if capped else ""
+    header   = f"grep '{pattern}' — {total_matches} match(es){cap_note}:"
+    return header + "\n\n" + "\n\n".join(sections)
+
+
+async def file_diff(path_a: str, path_b: str) -> str:
+    """Compare two files and return a unified diff."""
+    return await asyncio.to_thread(_sync_file_diff, path_a, path_b)
+
+
+def _sync_file_diff(path_a: str, path_b: str) -> str:
+    try:
+        resolved_a = Path(path_a).expanduser().resolve()
+        resolved_b = Path(path_b).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    for resolved in (resolved_a, resolved_b):
+        if not _is_project_allowed(resolved):
+            return f"Access denied: {resolved}"
+        if not resolved.exists():
+            return f"File not found: {resolved}"
+        if not resolved.is_file():
+            return f"Not a file: {resolved}"
+
+    try:
+        result = subprocess.run(
+            ["diff", "-u", str(resolved_a), str(resolved_b)],
+            capture_output=True, text=True, timeout=15,
+        )
+        if result.returncode == 0:
+            return f"Files are identical: {resolved_a.name} vs {resolved_b.name}"
+        output = result.stdout
+        if not output:
+            return f"diff returned no output (exit {result.returncode}): {result.stderr}"
+        if len(output) > _MAX_BYTES:
+            output = output[:_MAX_BYTES] + "\n… [truncated]"
+        return output
+    except subprocess.TimeoutExpired:
+        return "Timeout running diff"
+    except Exception as e:
+        return f"Error: {e}"
+
+
+async def file_syntax_check(path: str) -> str:
+    """Check syntax of a Python (.py) or JSON (.json) file."""
+    return await asyncio.to_thread(_sync_file_syntax_check, path)
+
+
+def _sync_file_syntax_check(path_str: str) -> str:
+    try:
+        resolved = Path(path_str).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    if not _is_project_allowed(resolved):
+        return f"Access denied: {resolved}\nProject root: {_PROJECT_ROOT}"
+
+    if not resolved.exists():
+        return f"File not found: {resolved}"
+
+    if not resolved.is_file():
+        return f"Not a file: {resolved}"
+
+    suffix = resolved.suffix.lower()
+
+    if suffix == ".py":
+        try:
+            result = subprocess.run(
+                ["python3", "-m", "py_compile", str(resolved)],
+                capture_output=True, text=True, timeout=15,
+            )
+            if result.returncode == 0:
+                return f"OK — {resolved.name}: syntax valid"
+            err = (result.stderr or result.stdout).strip()
+            return f"Syntax error in {resolved.name}:\n{err}"
+        except subprocess.TimeoutExpired:
+            return f"Timeout running py_compile on {resolved.name}"
+        except Exception as e:
+            return f"Error: {e}"
+
+    elif suffix == ".json":
+        try:
+            text = resolved.read_text(encoding="utf-8")
+            json.loads(text)
+            return f"OK — {resolved.name}: valid JSON"
+        except json.JSONDecodeError as e:
+            return f"JSON error in {resolved.name}: {e}"
+        except Exception as e:
+            return f"Error reading {resolved.name}: {e}"
+
+    else:
+        return f"Syntax check not supported for '{suffix}' files. Supported: .py, .json"
+
+
+# ── System-scoped tools ───────────────────────────────────────────────────────
+
+async def file_read(path: str, offset: int | None = None, max_lines: int | None = None) -> str:
+    """Read a local file from the broader system. Allowed: ~/agents_sync/, ~/OSIT_dev/, etc. ADMIN ONLY."""
+    return await asyncio.to_thread(_read_impl, path, offset, max_lines, _is_allowed)
+
+
+async def file_list(path: str) -> str:
+    """List directory contents from the broader system. ADMIN ONLY."""
+    return await asyncio.to_thread(_list_impl, path, _is_allowed)
+
+
+async def file_write(path: str, content: str, mode: str = "overwrite") -> str:
+    """Write or append content to a file. Write roots: ~/agents_sync/ and Cortex home/. ADMIN ONLY."""
+    return await asyncio.to_thread(_sync_file_write, path, content, mode)
+
+
+def _sync_file_write(path: str, content: str, mode: str) -> str:
+    try:
+        resolved = Path(path).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    if not _is_write_allowed(resolved):
+        return (
+            f"Write access denied: {resolved}\n"
+            f"Allowed write roots: ~/agents_sync/ and the Cortex home/ directory."
+        )
+
+    if mode not in ("overwrite", "append"):
+        return f"Invalid mode '{mode}' — use 'overwrite' or 'append'."
+
+    try:
+        resolved.parent.mkdir(parents=True, exist_ok=True)
+        if mode == "append":
+            with resolved.open("a", encoding="utf-8") as f:
+                f.write(content)
+            return f"Appended {len(content)} chars to {resolved}"
+        else:
+            resolved.write_text(content, encoding="utf-8")
+            return f"Wrote {len(content)} chars to {resolved}"
+    except Exception as e:
+        logger.error("file_write error for %s: %s", resolved, e)
+        return f"Write error: {e}"
+
+
+# ── Session tools ─────────────────────────────────────────────────────────────
+
+_SEARCH_EXCERPT_CHARS = 150
+
+
+async def session_read(date: str) -> str:
+    """Read a full session log by date (YYYY-MM-DD)."""
+    return await asyncio.to_thread(_sync_session_read, date.strip())
+
+
+def _sync_session_read(date: str) -> str:
+    from persona import persona_path
+    sessions_dir = persona_path() / "sessions"
+    if not sessions_dir.exists():
+        return "No session logs found."
+
+    target = sessions_dir / f"{date}.md"
+    if target.exists():
+        content = target.read_text()
+        return f"Session log for {date} ({len(content)} chars):\n\n{content}"
+
+    available = sorted([f.stem for f in sessions_dir.glob("*.md")], reverse=True)
+    if not available:
+        return "No session logs found."
+    recent = "\n".join(f"  {d}" for d in available[:15])
+    return f"No session log found for '{date}'. Available dates (most recent first):\n{recent}"
+
+
+async def session_search(query: str, limit: int = 5) -> str:
+    """Search past session logs for a keyword or phrase."""
+    return await asyncio.to_thread(_sync_session_search, query, limit)
+
+
+def _sync_session_search(query: str, limit: int) -> str:
+    from persona import persona_path
+    sessions_dir = persona_path() / "sessions"
+    if not sessions_dir.exists():
+        return "No session logs found."
+
+    limit   = max(1, min(limit, 20))
+    pattern = re.compile(re.escape(query), re.IGNORECASE)
+    session_files = sorted(sessions_dir.glob("*.md"), reverse=True)
+
+    matches = []
+    for sf in session_files:
+        if len(matches) >= limit:
+            break
+        try:
+            text = sf.read_text()
+        except OSError:
+            continue
+        for m in pattern.finditer(text):
+            if len(matches) >= limit:
+                break
+            start   = max(0, m.start() - _SEARCH_EXCERPT_CHARS)
+            end     = min(len(text), m.end() + _SEARCH_EXCERPT_CHARS)
+            excerpt = text[start:end].strip()
+            if start > 0:
+                excerpt = "…" + excerpt
+            if end < len(text):
+                excerpt = excerpt + "…"
+            matches.append(f"[{sf.stem}] {excerpt}")
+
+    if not matches:
+        return f"No matches for '{query}' across {len(session_files)} session logs."
+    header = f"Session search: '{query}' — {len(matches)} match(es) across {len(session_files)} logs\n"
+    return header + "\n\n".join(matches)
+
+
+# ── Declarations ──────────────────────────────────────────────────────────────
+
+DECLARATIONS = [
+    # Project-scoped
+    types.FunctionDeclaration(
+        name="project_file_read",
+        description=(
+            "Read a file within the Cortex project directory (source code, docs, config, persona files). "
+            "Supports reading a specific line range via offset — use to page through large files "
+            "without re-reading from the top. If given a directory path, returns a listing instead. "
+            "Project root: ~/agents_sync/projects/Cortex_and_Inara_dev/"
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Absolute or ~/... path to the file",
+                ),
+                "offset": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Start reading from this line number (1-based). Omit to read from the top.",
+                ),
+                "max_lines": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Maximum lines to return (default 500)",
+                ),
+            },
+            required=["path"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="project_file_list",
+        description=(
+            "List files and subdirectories within the Cortex project directory. "
+            "Shows file sizes and modified timestamps. "
+            "Project root: ~/agents_sync/projects/Cortex_and_Inara_dev/"
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Absolute or ~/... path to the directory",
+                ),
+            },
+            required=["path"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="file_stat",
+        description=(
+            "Get metadata for a file or directory: type, size, modified timestamp, line count (for text files) "
+            "or entry counts (for directories). Use before reading to check recency or size. "
+            "Restricted to the Cortex project directory."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Absolute or ~/... path to the file or directory",
+                ),
+            },
+            required=["path"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="file_grep",
+        description=(
+            "Search for a regex pattern in a file or directory, returning matching lines with surrounding "
+            "context. Much more efficient than reading an entire source file — use this to find function "
+            "definitions, variable names, TODO comments, imports, error strings, etc. "
+            "Searches recursively by default. Capped at 50 matches. Skips binary files. "
+            "Case-insensitive. Restricted to the Cortex project directory."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="File or directory to search (e.g. ~/agents_sync/projects/Cortex_and_Inara_dev/cortex/)",
+                ),
+                "pattern": types.Schema(
+                    type=types.Type.STRING,
+                    description="Regex pattern to search for (case-insensitive). Examples: 'def ha_', 'import httpx', 'TODO'",
+                ),
+                "context_lines": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Lines of context before/after each match (default 2, max 5)",
+                ),
+                "recursive": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description="Search subdirectories recursively (default true)",
+                ),
+            },
+            required=["path", "pattern"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="file_diff",
+        description=(
+            "Compare two files and return a unified diff (diff -u). "
+            "Use for code review, verifying what changed between two versions of a file, "
+            "or comparing config files side-by-side. "
+            "Returns 'Files are identical' if there are no differences. "
+            "Restricted to the Cortex project directory."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path_a": types.Schema(
+                    type=types.Type.STRING,
+                    description="Path to the first file (the 'before' or reference file)",
+                ),
+                "path_b": types.Schema(
+                    type=types.Type.STRING,
+                    description="Path to the second file (the 'after' or comparison file)",
+                ),
+            },
+            required=["path_a", "path_b"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="file_syntax_check",
+        description=(
+            "Check the syntax of a Python (.py) or JSON (.json) file without executing it. "
+            "Returns OK or the error with line number. "
+            "Use after editing a file before restarting Cortex. "
+            "Restricted to the Cortex project directory."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Path to the .py or .json file to check",
+                ),
+            },
+            required=["path"],
+        ),
+    ),
+    # System-scoped
+    types.FunctionDeclaration(
+        name="file_read",
+        description=(
+            "Read a local file from the broader system (~/agents_sync/, ~/OSIT_dev/, ~/DgrZone_Nextcloud/, "
+            "~/OSIT_Nextcloud/, Cortex home/). Supports offset for reading specific line ranges. "
+            "For files within the Cortex project, prefer project_file_read instead. "
+            "ADMIN ONLY."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Absolute or ~/... path to the file",
+                ),
+                "offset": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Start reading from this line number (1-based)",
+                ),
+                "max_lines": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Maximum lines to return (default 500)",
+                ),
+            },
+            required=["path"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="file_list",
+        description=(
+            "List files and subdirectories from the broader system. "
+            "Shows sizes and modified timestamps. "
+            "Allowed: ~/agents_sync/, ~/OSIT_dev/, ~/DgrZone_Nextcloud/, ~/OSIT_Nextcloud/. "
+            "ADMIN ONLY."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Absolute or ~/... path to the directory",
+                ),
+            },
+            required=["path"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="file_write",
+        description=(
+            "Write or append content to a file. "
+            "Write-allowed paths: ~/agents_sync/ and the Cortex home/ directory. "
+            "Creates parent directories if needed. "
+            "ADMIN ONLY. Requires user confirmation before executing."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Absolute or ~/... path to write to",
+                ),
+                "content": types.Schema(
+                    type=types.Type.STRING,
+                    description="Content to write",
+                ),
+                "mode": types.Schema(
+                    type=types.Type.STRING,
+                    description="'overwrite' (default, replaces file) or 'append' (adds to end)",
+                ),
+            },
+            required=["path", "content"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="session_read",
+        description=(
+            "Read a full conversation session log by date (YYYY-MM-DD). "
+            "Useful for continuity and recalling past decisions. "
+            "If the date is not found, lists available dates. "
+            "Only reads this user's own sessions."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "date": types.Schema(
+                    type=types.Type.STRING,
+                    description="Date in YYYY-MM-DD format (e.g. '2026-05-08')",
+                ),
+            },
+            required=["date"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="session_search",
+        description=(
+            "Search past conversation session logs for a keyword or phrase. "
+            "Returns matching excerpts with session dates, newest first. "
+            "Only searches this user's own sessions."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "query": types.Schema(
+                    type=types.Type.STRING,
+                    description="Keyword or phrase to search for",
+                ),
+                "limit": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Max results to return (default 5, max 20)",
+                ),
+            },
+            required=["query"],
+        ),
+    ),
+]
--- a/cortex/tools/git.py
+++ b/cortex/tools/git.py
@@ -0,0 +1,158 @@
+"""
+Git inspection tools — project-scoped, read-only.
+
+  git_status  — working tree status (staged, unstaged, untracked changes)
+  git_log     — recent commit history with optional path filter
+  git_diff    — diff between commits, branches, or working tree vs HEAD
+"""
+
+import asyncio
+import logging
+from pathlib import Path
+
+from google.genai import types
+
+logger = logging.getLogger(__name__)
+
+_PROJECT_ROOT: Path = Path(__file__).parent.parent.parent.resolve()
+_MAX_OUTPUT = 50_000
+
+
+async def _git(*args: str, timeout: int = 15) -> tuple[int, str]:
+    """Run a git command in the project root. Returns (returncode, output)."""
+    proc = await asyncio.create_subprocess_exec(
+        "git", "-C", str(_PROJECT_ROOT), *args,
+        stdout=asyncio.subprocess.PIPE,
+        stderr=asyncio.subprocess.PIPE,
+    )
+    try:
+        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
+    except asyncio.TimeoutError:
+        proc.kill()
+        return 1, "git command timed out"
+    out = (stdout or b"").decode(errors="replace").strip()
+    err = (stderr or b"").decode(errors="replace").strip()
+    combined = out if out else err
+    return proc.returncode, combined
+
+
+def _cap(text: str) -> str:
+    if len(text) > _MAX_OUTPUT:
+        return text[:_MAX_OUTPUT] + "\n… [truncated]"
+    return text
+
+
+async def git_status() -> str:
+    """Return the current git working tree status."""
+    rc, out = await _git("status")
+    if rc != 0:
+        return f"git status failed: {out}"
+    return out or "Working tree clean — nothing to report."
+
+
+async def git_log(n: int = 20, path: str = "", oneline: bool = True) -> str:
+    """Return recent git commit history."""
+    args = ["log"]
+    if oneline:
+        args += ["--oneline"]
+    else:
+        args += ["--format=%H %as %an%n  %s", "--date=short"]
+    args += [f"-{max(1, min(n, 200))}"]
+    if path:
+        args += ["--", path]
+    rc, out = await _git(*args)
+    if rc != 0:
+        return f"git log failed: {out}"
+    return _cap(out) or "No commits found."
+
+
+async def git_diff(ref_a: str = "", ref_b: str = "", path: str = "", stat_only: bool = False) -> str:
+    """Show a git diff. Defaults to working tree vs HEAD (unstaged changes)."""
+    args = ["diff"]
+    if stat_only:
+        args += ["--stat"]
+    if ref_a and ref_b:
+        args += [f"{ref_a}..{ref_b}"]
+    elif ref_a:
+        args += [ref_a]
+    if path:
+        args += ["--", path]
+    rc, out = await _git(*args)
+    # diff exits 1 when there are differences — that's normal
+    if rc not in (0, 1):
+        return f"git diff failed: {out}"
+    return _cap(out) or "No differences found."
+
+
+# ── Declarations ──────────────────────────────────────────────────────────────
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="git_status",
+        description=(
+            "Show the current git working tree status for the Cortex project: "
+            "staged changes, unstaged modifications, and untracked files. "
+            "Use to check whether there are uncommitted changes before restarting or deploying."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={},
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="git_log",
+        description=(
+            "Show recent git commit history for the Cortex project. "
+            "Returns commit hashes, dates, and messages. "
+            "Optionally filter to a specific file or directory path."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "n": types.Schema(
+                    type=types.Type.INTEGER,
+                    description="Number of commits to return (default 20, max 200)",
+                ),
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Optional file or directory path to filter commits by",
+                ),
+                "oneline": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description="Use compact one-line format (default true). Set false for more detail.",
+                ),
+            },
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="git_diff",
+        description=(
+            "Show a git diff for the Cortex project. "
+            "With no arguments: shows unstaged working tree changes vs HEAD. "
+            "With ref_a only: shows changes between that ref and HEAD. "
+            "With ref_a and ref_b: shows changes between the two refs (commits, branches, or tags). "
+            "Use stat_only to get a summary of changed files instead of full patch output."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "ref_a": types.Schema(
+                    type=types.Type.STRING,
+                    description="First ref (commit hash, branch name, or tag). Omit for working tree diff.",
+                ),
+                "ref_b": types.Schema(
+                    type=types.Type.STRING,
+                    description="Second ref. When provided with ref_a, shows diff between the two.",
+                ),
+                "path": types.Schema(
+                    type=types.Type.STRING,
+                    description="Optional file or directory path to restrict the diff to",
+                ),
+                "stat_only": types.Schema(
+                    type=types.Type.BOOLEAN,
+                    description="Return only a file-change summary (--stat) instead of the full diff",
+                ),
+            },
+        ),
+    ),
+]
--- a/cortex/tools/homeassistant.py
+++ b/cortex/tools/homeassistant.py
@@ -0,0 +1,277 @@
+"""
+Home Assistant tools — read device states and call services.
+
+Credentials are read automatically from the current user's channels.json:
+  "homeassistant": {"url": "https://ha.example.com", "token": "<long-lived-token>"}
+
+Configure in Settings → Notifications → Home Assistant.
+"""
+
+import json
+import logging
+
+import httpx
+from google.genai import types
+
+from auth_utils import get_user_channels
+from persona import get_user
+
+logger = logging.getLogger(__name__)
+
+_TIMEOUT = 10
+
+# Attributes that are internal/noisy and not useful to show
+_SKIP_ATTRS = {
+    "friendly_name", "icon", "entity_picture", "supported_features",
+    "supported_color_modes", "color_mode", "min_color_temp_kelvin",
+    "max_color_temp_kelvin", "min_mireds", "max_mireds",
+    "assumed_state", "attribution",
+}
+
+
+def _get_ha_cfg() -> tuple[str, str]:
+    """Return (base_url, token) from the current user's channels.json."""
+    channels = get_user_channels(get_user())
+    ha    = channels.get("homeassistant") or {}
+    url   = (ha.get("url") or "").rstrip("/")
+    token = ha.get("token") or ""
+    if not url or not token:
+        raise ValueError(
+            "Home Assistant not configured — add URL and token in Settings → Notifications."
+        )
+    return url, token
+
+
+def _auth(token: str) -> dict:
+    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
+
+
+def _fmt_state(s: dict) -> str:
+    """Format a single HA state dict as a compact readable line."""
+    entity_id = s.get("entity_id", "")
+    state     = s.get("state", "unknown")
+    attrs     = s.get("attributes", {})
+    name      = attrs.get("friendly_name", entity_id)
+
+    label = f"{name} ({entity_id})" if name != entity_id else entity_id
+    useful = {k: v for k, v in attrs.items() if k not in _SKIP_ATTRS}
+
+    extra = ""
+    if useful:
+        parts = []
+        for k, v in list(useful.items())[:6]:   # cap at 6 attrs per entity
+            parts.append(f"{k}: {v}")
+        extra = "  [" + ", ".join(parts) + "]"
+
+    return f"{label}: {state}{extra}"
+
+
+async def ha_get_state(entity_id: str) -> str:
+    """Return the current state and attributes of a single Home Assistant entity."""
+    try:
+        url, token = _get_ha_cfg()
+    except ValueError as e:
+        return str(e)
+
+    try:
+        async with httpx.AsyncClient(timeout=_TIMEOUT) as client:
+            resp = await client.get(f"{url}/api/states/{entity_id}", headers=_auth(token))
+
+        if resp.status_code == 404:
+            return f"Entity not found: {entity_id}"
+        if resp.status_code != 200:
+            return f"HA API error {resp.status_code}: {resp.text[:400]}"
+
+        s     = resp.json()
+        attrs = s.get("attributes", {})
+        lines = [
+            f"**{attrs.get('friendly_name', entity_id)}** (`{entity_id}`)",
+            f"State: **{s.get('state', 'unknown')}**",
+        ]
+        changed = (s.get("last_changed") or "")[:19].replace("T", " ")
+        if changed:
+            lines.append(f"Last changed: {changed} UTC")
+
+        useful = {k: v for k, v in attrs.items() if k not in _SKIP_ATTRS}
+        if useful:
+            lines.append("Attributes:")
+            for k, v in useful.items():
+                lines.append(f"  {k}: {v}")
+
+        return "\n".join(lines)
+
+    except httpx.HTTPError as e:
+        return f"Connection error: {e}"
+    except Exception as e:
+        logger.warning("ha_get_state error: %s", e)
+        return f"Error: {e}"
+
+
+async def ha_get_states(domain: str = "", area: str = "") -> str:
+    """List HA entity states, optionally filtered by domain (e.g. 'light') or area name."""
+    try:
+        url, token = _get_ha_cfg()
+    except ValueError as e:
+        return str(e)
+
+    try:
+        async with httpx.AsyncClient(timeout=_TIMEOUT) as client:
+            resp = await client.get(f"{url}/api/states", headers=_auth(token))
+
+        if resp.status_code != 200:
+            return f"HA API error {resp.status_code}: {resp.text[:400]}"
+
+        states = resp.json()
+
+        if domain:
+            states = [s for s in states if s.get("entity_id", "").startswith(f"{domain}.")]
+        if area:
+            al = area.lower()
+            states = [s for s in states
+                      if al in (s.get("attributes", {}).get("friendly_name") or "").lower()]
+
+        if not states:
+            filters = [f"domain={domain}"] * bool(domain) + [f"area={area}"] * bool(area)
+            return "No entities found" + (f" ({', '.join(filters)})" if filters else "")
+
+        lines = [f"{len(states)} entit{'y' if len(states) == 1 else 'ies'}:"]
+        for s in sorted(states, key=lambda x: x.get("entity_id", "")):
+            lines.append(_fmt_state(s))
+        return "\n".join(lines)
+
+    except httpx.HTTPError as e:
+        return f"Connection error: {e}"
+    except Exception as e:
+        logger.warning("ha_get_states error: %s", e)
+        return f"Error: {e}"
+
+
+async def ha_call_service(
+    domain:    str,
+    service:   str,
+    entity_id: str = "",
+    data:      str = "",
+) -> str:
+    """Call a Home Assistant service (turn on/off lights, set thermostat, lock doors, etc.)."""
+    try:
+        url, token = _get_ha_cfg()
+    except ValueError as e:
+        return str(e)
+
+    payload: dict = {}
+    if entity_id:
+        payload["entity_id"] = entity_id
+    if data:
+        try:
+            extra = json.loads(data)
+            if isinstance(extra, dict):
+                payload.update(extra)
+        except json.JSONDecodeError:
+            return f"Invalid JSON in data: {data}"
+
+    try:
+        async with httpx.AsyncClient(timeout=_TIMEOUT) as client:
+            resp = await client.post(
+                f"{url}/api/services/{domain}/{service}",
+                headers=_auth(token),
+                json=payload,
+            )
+
+        if resp.status_code not in (200, 201):
+            return f"HA API error {resp.status_code}: {resp.text[:400]}"
+
+        changed = resp.json()
+        if not changed:
+            return f"✓ {domain}.{service} called (no state changes reported)."
+
+        lines = [f"✓ {domain}.{service} — {len(changed)} entity state(s) updated:"]
+        for s in changed:
+            lines.append(f"  {s.get('entity_id', '')}: {s.get('state', '')}")
+        return "\n".join(lines)
+
+    except httpx.HTTPError as e:
+        return f"Connection error: {e}"
+    except Exception as e:
+        logger.warning("ha_call_service error: %s", e)
+        return f"Error: {e}"
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="ha_get_state",
+        description=(
+            "Get the current state and attributes of a single Home Assistant entity. "
+            "Use to check if a light is on, read a thermostat temperature, check a "
+            "door/window sensor, battery level, HVAC mode, etc. "
+            "entity_id format: domain.name — e.g. light.living_room, switch.garage, "
+            "climate.ecobee, binary_sensor.front_door, sensor.outdoor_temp."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "entity_id": types.Schema(
+                    type=types.Type.STRING,
+                    description="Full entity ID, e.g. light.living_room or climate.ecobee_main",
+                ),
+            },
+            required=["entity_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ha_get_states",
+        description=(
+            "List Home Assistant entity states, optionally filtered by domain or area. "
+            "Use to survey what devices exist or check multiple entities at once. "
+            "Domain examples: light, switch, sensor, climate, binary_sensor, lock, cover, "
+            "media_player, input_boolean. Leave both blank to list everything (can be large)."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "domain": types.Schema(
+                    type=types.Type.STRING,
+                    description="Filter to this domain, e.g. 'light' or 'switch' (optional)",
+                ),
+                "area": types.Schema(
+                    type=types.Type.STRING,
+                    description="Filter by area name substring match on friendly name (optional)",
+                ),
+            },
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="ha_call_service",
+        description=(
+            "Call a Home Assistant service to control a device or trigger an automation. "
+            "Requires user confirmation before executing. Common examples: "
+            "domain=light service=turn_on entity_id=light.living_room; "
+            "domain=light service=turn_off entity_id=light.all; "
+            "domain=switch service=toggle entity_id=switch.garage; "
+            "domain=climate service=set_temperature data={\"temperature\":72}; "
+            "domain=lock service=lock entity_id=lock.front_door; "
+            "domain=script service=turn_on entity_id=script.goodnight."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "domain": types.Schema(
+                    type=types.Type.STRING,
+                    description="Service domain: light, switch, climate, lock, cover, script, automation, etc.",
+                ),
+                "service": types.Schema(
+                    type=types.Type.STRING,
+                    description="Service name: turn_on, turn_off, toggle, set_temperature, lock, unlock, open, close, etc.",
+                ),
+                "entity_id": types.Schema(
+                    type=types.Type.STRING,
+                    description="Target entity ID — omit for services that don't target a specific entity",
+                ),
+                "data": types.Schema(
+                    type=types.Type.STRING,
+                    description='Extra service data as JSON string, e.g. {"temperature": 72, "hvac_mode": "heat"}',
+                ),
+            },
+            required=["domain", "service"],
+        ),
+    ),
+]
--- a/cortex/tools/notify.py
+++ b/cortex/tools/notify.py
@@ -0,0 +1,234 @@
+"""
+Notification tools — proactively send messages to user channels.
+
+nc_talk_send routes through notification.py → channels.json.
+email_send uses the server SMTP config from .env (smtp_server, smtp_from_*).
+"""
+
+import asyncio
+import json
+import logging
+import re
+
+import httpx
+from google.genai import types
+from config import settings
+from persona import get_user
+
+logger = logging.getLogger(__name__)
+
+
+def _load_allowlist(username: str) -> list[str]:
+    """Load the per-user email allowlist. Returns empty list if not configured."""
+    path = settings.home_root() / username / "email_allowlist.json"
+    try:
+        return [str(p).strip() for p in json.loads(path.read_text()) if str(p).strip()]
+    except FileNotFoundError:
+        return []
+    except Exception as e:
+        logger.warning("failed to read email_allowlist.json for %s: %s", username, e)
+        return []
+
+
+def _email_allowed(address: str, patterns: list[str]) -> bool:
+    """Return True if address matches any pattern (regex, case-insensitive full match)."""
+    addr = address.strip()
+    for pattern in patterns:
+        try:
+            if re.fullmatch(pattern, addr, re.IGNORECASE):
+                return True
+        except re.error:
+            logger.warning("invalid regex in email allowlist: %r", pattern)
+    return False
+
+
+async def email_send(to: str, subject: str, body: str) -> str:
+    """Send an email via the server's configured SMTP account."""
+    username = get_user()
+    allowlist = _load_allowlist(username)
+
+    if not allowlist:
+        return (
+            "Email blocked — no allowlist configured. "
+            f"Add allowed patterns to home/{username}/email_allowlist.json as a JSON array."
+        )
+    if not _email_allowed(to, allowlist):
+        return f"Email blocked — {to} does not match any allowed pattern for {username}."
+
+    from email_utils import send_email
+    ok = await asyncio.to_thread(
+        send_email,
+        to_email=to,
+        subject=subject,
+        body_text=body,
+        body_html=body.replace("\n", "<br>"),
+    )
+    if ok:
+        return f"Email sent to {to}."
+    return "Failed to send email — check SMTP configuration in .env."
+
+
+async def web_push(title: str, body: str, url: str = "") -> str:
+    """Send a browser push notification to the current user's registered devices."""
+    import push_utils
+    username = get_user()
+    result = await push_utils.send_push(username, title, body, url)
+    if "error" in result:
+        return f"Push failed: {result['error']}"
+    return f"Push sent to {result['sent']} device(s) for {username} (pruned {result['pruned']} stale)."
+
+
+async def nc_talk_history(conversation_token: str = "", limit: int = 20) -> str:
+    """Read recent messages from a Nextcloud Talk conversation.
+
+    Requires nc_username and nc_app_password in channels.json under 'nextcloud'.
+    conversation_token defaults to notification_room if not specified.
+    """
+    from auth_utils import get_user_channels
+    username = get_user()
+    channels = get_user_channels(username)
+    nct = channels.get("nextcloud", {})
+
+    url = nct.get("url", "").rstrip("/")
+    nc_username = nct.get("nc_username", "").strip()
+    nc_app_password = nct.get("nc_app_password", "").strip()
+    token = conversation_token.strip() or nct.get("notification_room", "").strip()
+
+    if not url or not nc_username or not nc_app_password:
+        return (
+            "nc_talk_history requires nc_username and nc_app_password in channels.json "
+            f"(under 'nextcloud'). Add these to home/{username}/channels.json to enable message reading."
+        )
+    if not token:
+        return "No conversation token provided and no notification_room set in channels.json."
+
+    limit = min(max(int(limit), 1), 200)
+    return await asyncio.to_thread(_sync_nc_talk_history, url, nc_username, nc_app_password, token, limit)
+
+
+def _sync_nc_talk_history(url: str, nc_user: str, nc_pass: str, token: str, limit: int) -> str:
+    from datetime import datetime, timezone
+    endpoint = f"{url}/ocs/v2.php/apps/spreed/api/v4/chat/{token}"
+    try:
+        resp = httpx.get(
+            endpoint,
+            params={"limit": limit, "lookIntoFuture": 0, "setReadMarker": 0, "noStatusUpdate": 1},
+            auth=(nc_user, nc_pass),
+            headers={"OCS-APIRequest": "true", "Accept": "application/json"},
+            timeout=15,
+        )
+    except Exception as e:
+        return f"NC Talk API error: {e}"
+
+    if resp.status_code != 200:
+        return f"NC Talk API returned HTTP {resp.status_code}: {resp.text[:200]}"
+
+    try:
+        messages = resp.json().get("ocs", {}).get("data", [])
+    except Exception as e:
+        return f"Failed to parse NC Talk response: {e}"
+
+    if not messages:
+        return "No messages found in this conversation."
+
+    # NC Talk returns newest-first; reverse to chronological order
+    lines = [f"Last {len(messages)} messages from {token}:\n"]
+    for msg in reversed(messages):
+        sender = msg.get("actorDisplayName") or msg.get("actorId") or "Unknown"
+        ts = msg.get("timestamp", 0)
+        time_str = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
+        text = msg.get("message", "")
+        if msg.get("messageType") == "system":
+            lines.append(f"[system {time_str}] {text}")
+        else:
+            lines.append(f"{sender} ({time_str}): {text}")
+
+    return "\n".join(lines)
+
+
+async def nc_talk_send(message: str) -> str:
+    """Send a message to the user via their configured notification channel.
+
+    Channel is resolved from the user's channels.json (notification_channel key).
+    Falls back to Nextcloud Talk if configured. No-op if no channel is set.
+    """
+    from notification import notify
+    username = get_user()
+    try:
+        await notify(username, message)
+        return f"Message sent to {username}'s notification channel."
+    except Exception as e:
+        logger.warning("nc_talk_send error for %s: %s", username, e)
+        return f"Failed to send notification: {e}"
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="web_push",
+        description=(
+            "Send a browser push notification to the current user. Works even when the "
+            "Cortex tab is not open. Use for completing long tasks, reminders that fire "
+            "in the background, or anything the user should see immediately. "
+            "url is optional — if set, clicking the notification opens that URL."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "title": types.Schema(type=types.Type.STRING, description="Notification title (short)"),
+                "body":  types.Schema(type=types.Type.STRING, description="Notification body text"),
+                "url":   types.Schema(type=types.Type.STRING, description="Optional URL to open on click"),
+            },
+            required=["title", "body"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="email_send",
+        description=(
+            "Send an email from the server's configured SMTP account. Use for delivering "
+            "summaries, reports, reminders, or any content the user wants emailed. "
+            "body is plain text; newlines are preserved."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "to":      types.Schema(type=types.Type.STRING, description="Recipient email address"),
+                "subject": types.Schema(type=types.Type.STRING, description="Email subject line"),
+                "body":    types.Schema(type=types.Type.STRING, description="Plain-text email body"),
+            },
+            required=["to", "subject", "body"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="nc_talk_send",
+        description=(
+            "Send a proactive message to the user via their configured notification channel "
+            "(Nextcloud Talk by default). Use this to notify the user of completed background "
+            "tasks, important events, or anything they should know between sessions. "
+            "Requires notification_channel and notification_room set in channels.json."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "message": types.Schema(type=types.Type.STRING, description="The message to send to the user"),
+            },
+            required=["message"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="nc_talk_history",
+        description=(
+            "Read recent messages from a Nextcloud Talk conversation. Useful for checking "
+            "what was said in a room before composing a reply, or reviewing recent context. "
+            "Requires nc_username and nc_app_password in channels.json under 'nextcloud'. "
+            "conversation_token defaults to notification_room if not provided."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "conversation_token": types.Schema(type=types.Type.STRING, description="NC Talk room token (defaults to notification_room from channels.json)"),
+                "limit":              types.Schema(type=types.Type.INTEGER, description="Number of messages to return (default 20, max 200)"),
+            },
+            required=[],
+        ),
+    ),
+]
--- a/cortex/tools/reminders.py
+++ b/cortex/tools/reminders.py
@@ -2,20 +2,23 @@
 Reminders tools.

 Reminders are stored in persona/REMINDERS.md and automatically surfaced
-in the system prompt at Tier 2+. Use these tools to add, list, and clear
-pending reminders.
+in the system prompt at Tier 2+. Each reminder can have an optional due date —
+only due or undated reminders surface in context; future-dated ones are stored
+but invisible until their date arrives.

 Operations:
-  reminders_add   — append a new reminder entry
-  reminders_list  — return all current reminders (or a message if empty)
-  reminders_clear — erase all reminders (moved here from cron.py for consistency;
-                    cron.py still calls the same underlying file)
+  reminders_add    — append a new reminder, optional due date (YYYY-MM-DD)
+  reminders_list   — return all reminders with due status (including future)
+  reminders_remove — remove a single reminder by number
+  reminders_clear  — erase all reminders
 """

 import asyncio
-from datetime import datetime, timezone
+import re
+from datetime import datetime, timezone, date as _date
 from pathlib import Path

+from google.genai import types
 from persona import persona_path


@@ -27,6 +30,68 @@ def _now_label() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")


+def _parse_sections(text: str) -> list[tuple[str, str]]:
+    """Split REMINDERS.md into (heading, body) tuples, one per ## section."""
+    sections: list[tuple[str, str]] = []
+    heading: str | None = None
+    body_lines: list[str] = []
+    for line in text.splitlines():
+        if line.startswith("## "):
+            if heading is not None:
+                sections.append((heading, "\n".join(body_lines).strip()))
+            heading = line[3:].strip()
+            body_lines = []
+        elif heading is not None:
+            body_lines.append(line)
+    if heading is not None:
+        sections.append((heading, "\n".join(body_lines).strip()))
+    return sections
+
+
+def _sections_to_text(sections: list[tuple[str, str]]) -> str:
+    return "".join(f"\n## {h}\n\n{b}\n" for h, b in sections)
+
+
+def _parse_due(body: str) -> _date | None:
+    """Extract due date from a 'due: YYYY-MM-DD' line in the body, if present."""
+    m = re.search(r'^due:\s*(\d{4}-\d{2}-\d{2})', body, re.MULTILINE | re.IGNORECASE)
+    if not m:
+        return None
+    try:
+        return _date.fromisoformat(m.group(1))
+    except ValueError:
+        return None
+
+
+def _today() -> _date:
+    return datetime.now().astimezone().date()
+
+
+def _is_due_or_undated(body: str) -> bool:
+    """Return True if this reminder has no due date or its due date is today or past."""
+    due = _parse_due(body)
+    return due is None or due <= _today()
+
+
+def _due_label(body: str) -> str:
+    """Return a human-readable due status string for reminders_list output."""
+    due = _parse_due(body)
+    if due is None:
+        return ""
+    today = _today()
+    if due < today:
+        days = (today - due).days
+        return f"  [OVERDUE by {days} day{'s' if days != 1 else ''} — due {due}]"
+    if due == today:
+        return "  [due TODAY]"
+    return f"  [due: {due}]"
+
+
+def _body_without_due(body: str) -> str:
+    """Strip the due: line from body for display (due status shown in heading line)."""
+    return re.sub(r'^due:\s*\S+\s*\n?', '', body, count=1, flags=re.MULTILINE | re.IGNORECASE).strip()
+
+
 # ---------------------------------------------------------------------------
 # Sync implementations
 # ---------------------------------------------------------------------------
@@ -35,16 +100,54 @@ def _reminders_list() -> str:
    p = _reminders_path()
    if not p.exists() or not p.read_text().strip():
        return "No pending reminders."
-    return p.read_text()
+    sections = _parse_sections(p.read_text())
+    if not sections:
+        return "No pending reminders."
+    lines = []
+    for i, (heading, body) in enumerate(sections, 1):
+        status = _due_label(body)
+        lines.append(f"{i}. {heading}{status}")
+        display_body = _body_without_due(body)
+        if display_body:
+            for bline in display_body.splitlines()[:4]:
+                lines.append(f"   {bline}")
+        lines.append("")
+    return "\n".join(lines).rstrip()


-def _reminders_add(text: str, label: str | None = None) -> str:
+def _reminders_add(text: str, label: str | None = None, due: str | None = None) -> str:
    p = _reminders_path()
    existing = p.read_text() if p.exists() else ""
    heading = label or _now_label()
-    section = f"\n## {heading}\n\n{text.strip()}\n"
+    body = text.strip()
+    if due:
+        body = f"due: {due}\n{body}"
+    section = f"\n## {heading}\n\n{body}\n"
    p.write_text(existing.rstrip() + "\n" + section)
-    return f"Reminder added: {heading}"
+    msg = f"Reminder added: {heading}"
+    if due:
+        msg += f" (due: {due})"
+    return msg
+
+
+def _reminders_remove(index: int) -> str:
+    p = _reminders_path()
+    if not p.exists() or not p.read_text().strip():
+        return "No reminders to remove."
+    sections = _parse_sections(p.read_text())
+    if not sections:
+        return "No reminders to remove."
+    if index < 1 or index > len(sections):
+        return (
+            f"Index {index} is out of range. "
+            f"There {'is' if len(sections) == 1 else 'are'} {len(sections)} "
+            f"reminder{'s' if len(sections) != 1 else ''} (1–{len(sections)}). "
+            "Call reminders_list to see them."
+        )
+    removed_heading = sections[index - 1][0]
+    sections.pop(index - 1)
+    p.write_text(_sections_to_text(sections))
+    return f"Removed reminder {index}: {removed_heading}"


 def _reminders_clear() -> str:
@@ -53,6 +156,31 @@ def _reminders_clear() -> str:
    return "All reminders cleared."


+# ---------------------------------------------------------------------------
+# Public helper for context_loader
+# ---------------------------------------------------------------------------
+
+def load_due_reminders() -> str:
+    """Return REMINDERS.md content filtered to only due and undated sections.
+
+    Called by context_loader at Tier 2+. Future-dated reminders are excluded
+    from the system prompt until their due date arrives.
+    """
+    p = _reminders_path()
+    if not p.exists():
+        return ""
+    text = p.read_text()
+    if not text.strip():
+        return ""
+    sections = _parse_sections(text)
+    due_sections = [(h, b) for h, b in sections if _is_due_or_undated(b)]
+    if not due_sections:
+        return ""
+    # Strip the raw due: line from body — the date is already part of the heading context
+    cleaned = [(h, _body_without_due(b)) for h, b in due_sections]
+    return _sections_to_text(cleaned).strip()
+
+
 # ---------------------------------------------------------------------------
 # Async wrappers
 # ---------------------------------------------------------------------------
@@ -61,9 +189,67 @@ async def reminders_list() -> str:
    return await asyncio.to_thread(_reminders_list)


-async def reminders_add(text: str, label: str | None = None) -> str:
-    return await asyncio.to_thread(_reminders_add, text, label)
+async def reminders_add(text: str, label: str | None = None, due: str | None = None) -> str:
+    return await asyncio.to_thread(_reminders_add, text, label, due)
+
+
+async def reminders_remove(index: int) -> str:
+    return await asyncio.to_thread(_reminders_remove, index)


 async def reminders_clear() -> str:
    return await asyncio.to_thread(_reminders_clear)
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="reminders_add",
+        description=(
+            "Add a new reminder to REMINDERS.md. Reminders are automatically surfaced "
+            "in context at the start of each session (Tier 2+). "
+            "Use this when the user asks you to remember something or follow up on something. "
+            "Set a due date to suppress the reminder until that date — useful for future tasks "
+            "that would be noise today."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "text":  types.Schema(type=types.Type.STRING, description="The reminder text"),
+                "label": types.Schema(type=types.Type.STRING, description="Optional heading (e.g. 'Follow up on NC Talk'). Defaults to current timestamp."),
+                "due":   types.Schema(type=types.Type.STRING, description="Optional due date in YYYY-MM-DD format. Reminder is hidden from context until this date arrives. Omit for an always-visible reminder."),
+            },
+            required=["text"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="reminders_list",
+        description=(
+            "Read all pending reminders, including future-dated ones not yet in context. "
+            "Shows due status for each (due today, overdue, or future date). "
+            "Use this before adding to avoid duplicates, or to show the user what's queued."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+    types.FunctionDeclaration(
+        name="reminders_remove",
+        description=(
+            "Remove a single reminder by its number. "
+            "Call reminders_list first to get the numbered list, then pass the number to remove."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "index": types.Schema(type=types.Type.INTEGER, description="The number of the reminder to remove (1 = first in reminders_list output)."),
+            },
+            required=["index"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="reminders_clear",
+        description=(
+            "Erase all pending reminders from REMINDERS.md. "
+            "Use this after you have acknowledged and acted on the reminders shown in your context."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+]
--- a/cortex/tools/scratch.py
+++ b/cortex/tools/scratch.py
@@ -17,6 +17,7 @@ import asyncio
 from datetime import datetime, timezone
 from pathlib import Path

+from google.genai import types
 from persona import persona_path


@@ -77,3 +78,51 @@ async def scratch_append(content: str, heading: str | None = None) -> str:

 async def scratch_clear() -> str:
    return await asyncio.to_thread(_scratch_clear)
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="scratch_read",
+        description=(
+            "Read the full contents of the scratchpad. "
+            "Use this to recall working notes, mid-task context, or anything previously jotted down. "
+            "The scratchpad is transient — nothing here is distilled or archived."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+    types.FunctionDeclaration(
+        name="scratch_write",
+        description=(
+            "Replace the entire scratchpad with new content. "
+            "Use this to set a clean working note, replacing whatever was there before. "
+            "For adding without replacing, use scratch_append instead."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "content": types.Schema(type=types.Type.STRING, description="The new scratchpad content (markdown supported)"),
+            },
+            required=["content"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="scratch_append",
+        description=(
+            "Add a new section to the bottom of the scratchpad without replacing existing content. "
+            "Each section gets a timestamp heading unless you supply one."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "content": types.Schema(type=types.Type.STRING, description="The content to append (markdown supported)"),
+                "heading": types.Schema(type=types.Type.STRING, description="Optional section heading. Defaults to current UTC timestamp."),
+            },
+            required=["content"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="scratch_clear",
+        description="Erase everything in the scratchpad. Use when the working notes are no longer needed.",
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+]
--- a/cortex/tools/system.py
+++ b/cortex/tools/system.py
@@ -2,13 +2,23 @@
 System tools — local machine operations.

 These tools affect the host system directly. Use with care.
+All tools in this module require the admin role.
 """

 import asyncio
 import logging
+import os
+import subprocess
+from pathlib import Path
+
+from google.genai import types

 logger = logging.getLogger(__name__)

+# Absolute paths — resolved relative to this file so they work regardless of cwd
+_CORTEX_DIR = Path(__file__).parent          # .../Cortex_and_Inara_dev/cortex/
+_PROJECT_ROOT = _CORTEX_DIR.parent           # .../Cortex_and_Inara_dev/
+
 ALLOW_SCRIPT = "/home/scott/.local/bin/claude-allow-dir"


@@ -42,3 +52,283 @@ async def claude_allow_dir(path: str, mode: str = "rw") -> str:
    except Exception as e:
        logger.error("claude_allow_dir error: %s", e)
        return f"Error: {e}"
+
+
+async def shell_exec(command: str, working_dir: str | None = None, timeout: int = 30) -> str:
+    """Execute a shell command on the Cortex host and return combined stdout/stderr."""
+    timeout = min(max(timeout, 1), 120)
+
+    cwd = None
+    if working_dir:
+        cwd = os.path.expanduser(working_dir)
+        if not os.path.isdir(cwd):
+            return f"Error: working_dir '{working_dir}' does not exist or is not a directory"
+
+    try:
+        proc = await asyncio.create_subprocess_shell(
+            command,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE,
+            cwd=cwd,
+        )
+        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
+
+        out = stdout.decode(errors="replace").strip()
+        err = stderr.decode(errors="replace").strip()
+
+        parts = []
+        if out:
+            parts.append(out)
+        if err:
+            parts.append(f"[stderr]\n{err}")
+        combined = "\n".join(parts) if parts else "(no output)"
+
+        if proc.returncode != 0:
+            return f"Exit {proc.returncode}:\n{combined}"
+        return combined
+
+    except asyncio.TimeoutError:
+        return f"Error: command timed out after {timeout}s"
+    except Exception as e:
+        logger.error("shell_exec error: %s", e)
+        return f"Error: {e}"
+
+
+async def cortex_restart() -> str:
+    """Schedule a Cortex service restart 5 seconds from now.
+
+    Uses a detached subprocess so the restart survives the current process being
+    terminated by systemd. The calling session will drop — user should refresh.
+    """
+    subprocess.Popen(
+        ["bash", "-c", "sleep 5 && systemctl --user restart cortex"],
+        start_new_session=True,
+        stdout=subprocess.DEVNULL,
+        stderr=subprocess.DEVNULL,
+        close_fds=True,
+    )
+    logger.info("cortex_restart: restart scheduled in 5 seconds")
+    return (
+        "Cortex restart scheduled in 5 seconds. "
+        "The current connection will drop — please refresh the page after a moment."
+    )
+
+
+async def cortex_logs(lines: int = 50) -> str:
+    """Return recent lines from the Cortex systemd journal."""
+    n = min(max(int(lines), 1), 200)
+    try:
+        proc = await asyncio.create_subprocess_exec(
+            "journalctl", "--user", "-u", "cortex", f"-n{n}", "--no-pager",
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE,
+        )
+        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=15)
+        out = stdout.decode(errors="replace").strip()
+        return out or stderr.decode(errors="replace").strip() or "No log output."
+    except asyncio.TimeoutError:
+        return "Error: journalctl timed out"
+    except Exception as e:
+        logger.error("cortex_logs error: %s", e)
+        return f"Error: {e}"
+
+
+async def cortex_status() -> str:
+    """Return Cortex service status: git branch/commit, ahead/behind remote, and systemctl state."""
+    lines = []
+
+    async def _git(*args: str) -> str:
+        proc = await asyncio.create_subprocess_exec(
+            "git", "-C", str(_PROJECT_ROOT), *args,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.DEVNULL,
+        )
+        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=10)
+        return stdout.decode(errors="replace").strip()
+
+    try:
+        branch  = await _git("rev-parse", "--abbrev-ref", "HEAD")
+        commit  = await _git("log", "--oneline", "-1")
+        # fetch quietly so ahead/behind is current
+        await asyncio.create_subprocess_exec(
+            "git", "-C", str(_PROJECT_ROOT), "fetch", "--quiet",
+            stdout=asyncio.subprocess.DEVNULL, stderr=asyncio.subprocess.DEVNULL,
+        )
+        ahead_behind = await _git("rev-list", "--left-right", "--count", f"HEAD...origin/{branch}")
+        ahead, behind = (ahead_behind.split() + ["?", "?"])[:2]
+
+        lines.append(f"**Branch:** {branch}  |  ahead {ahead} / behind {behind}")
+        lines.append(f"**Commit:** {commit}")
+    except Exception as e:
+        lines.append(f"Git info unavailable: {e}")
+
+    lines.append("")
+
+    try:
+        proc = await asyncio.create_subprocess_exec(
+            "systemctl", "--user", "status", "cortex", "--no-pager", "-l",
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.STDOUT,
+        )
+        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=10)
+        # First 15 lines of systemctl output is enough — avoids log flood
+        status_lines = stdout.decode(errors="replace").splitlines()[:15]
+        lines.extend(status_lines)
+    except Exception as e:
+        lines.append(f"systemctl status unavailable: {e}")
+
+    return "\n".join(lines)
+
+
+async def cortex_update() -> str:
+    """Pull the latest code from git, syntax-check all Python files, and report.
+
+    Does NOT restart automatically — call cortex_restart separately after reviewing
+    the output if you want to apply changes.
+    """
+    lines = []
+
+    async def _run(*cmd: str, cwd: Path = _PROJECT_ROOT, timeout: int = 30) -> tuple[int, str]:
+        proc = await asyncio.create_subprocess_exec(
+            *cmd, cwd=str(cwd),
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.STDOUT,
+        )
+        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
+        return proc.returncode, stdout.decode(errors="replace").strip()
+
+    # 1. Check for incoming commits before pulling
+    try:
+        await _run("git", "fetch", "--quiet")
+        rc, incoming = await _run("git", "log", "--oneline", "HEAD..origin/HEAD")
+        if rc == 0 and not incoming:
+            # Double-check with branch name in case origin/HEAD isn't set
+            branch_rc, branch = await _run("git", "rev-parse", "--abbrev-ref", "HEAD")
+            _, incoming = await _run("git", "log", "--oneline", f"HEAD..origin/{branch.strip()}")
+    except asyncio.TimeoutError:
+        return "Error: git fetch timed out — check network connectivity."
+    except Exception as e:
+        return f"Error during git fetch: {e}"
+
+    if not incoming:
+        rc2, current = await _run("git", "log", "--oneline", "-1")
+        return f"Already up to date.\n\nCurrent commit: {current}"
+
+    lines.append(f"**Incoming commits:**\n{incoming}\n")
+
+    # 2. Pull
+    try:
+        rc, pull_out = await _run("git", "pull", "--ff-only")
+    except asyncio.TimeoutError:
+        return "Error: git pull timed out."
+    except Exception as e:
+        return f"Error during git pull: {e}"
+
+    if rc != 0:
+        return f"git pull failed (exit {rc}):\n{pull_out}"
+
+    lines.append(f"**git pull:**\n{pull_out}\n")
+
+    # 3. Syntax check all Python files under cortex/
+    py_files = sorted(_CORTEX_DIR.rglob("*.py"))
+    errors = []
+    for f in py_files:
+        result = subprocess.run(
+            ["python3", "-m", "py_compile", str(f)],
+            capture_output=True, text=True,
+        )
+        if result.returncode != 0:
+            errors.append(f"  {f.relative_to(_PROJECT_ROOT)}: {result.stderr.strip()}")
+
+    if errors:
+        lines.append(f"**Syntax errors — do NOT restart until fixed:**")
+        lines.extend(errors)
+    else:
+        lines.append(f"**Syntax check:** {len(py_files)} files — all OK.")
+        lines.append("Call `cortex_restart` to apply the update.")
+
+    return "\n".join(lines)
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="shell_exec",
+        description=(
+            "Execute a shell command on the Cortex host machine and return its output. "
+            "Use for system diagnostics: disk usage (df -h), process status (ps aux), "
+            "directory listings (ls), memory (free -h), uptime, network info, log tails, etc. "
+            "Commands run as the Cortex service user. Timeout enforced (default 30s, max 120s). "
+            "Avoid destructive commands — prefer read-only system queries."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "command": types.Schema(type=types.Type.STRING, description="Shell command to run (e.g. 'df -h', 'ls ~/agents_sync/', 'journalctl --user -u cortex -n 50')"),
+                "working_dir": types.Schema(type=types.Type.STRING, description="Optional working directory (e.g. '~/agents_sync/projects'). Defaults to home directory."),
+                "timeout": types.Schema(type=types.Type.INTEGER, description="Timeout in seconds (default 30, max 120)"),
+            },
+            required=["command"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="claude_allow_dir",
+        description=(
+            "Add a directory to Claude Code's auto-allow list so Claude can read or write "
+            "files there without prompting. Edits ~/.claude/settings.json on the local machine. "
+            "Use this when Claude is silently hanging or being blocked from accessing a directory. "
+            "Changes take effect in the next Claude Code session."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "path": types.Schema(type=types.Type.STRING, description="Absolute or home-relative path to the directory (e.g. ~/OSIT_dev/aether_api_fastapi or /home/scott/agents_sync)"),
+                "mode": types.Schema(type=types.Type.STRING, description="Permission mode: 'r' (read-only), 'w' (write-only), or 'rw' (both). Default: rw"),
+            },
+            required=["path"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="cortex_restart",
+        description=(
+            "Restart the Cortex service via systemd. Schedules a restart 5 seconds from now. "
+            "The current connection will drop — inform the user to refresh the page. "
+            "Use after config changes, memory edits, or when the service needs a fresh start. "
+            "ADMIN ONLY."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+    types.FunctionDeclaration(
+        name="cortex_logs",
+        description=(
+            "Fetch recent lines from the Cortex systemd service journal. "
+            "Use for debugging errors, checking startup status, or reviewing recent activity. "
+            "ADMIN ONLY."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "lines": types.Schema(type=types.Type.INTEGER, description="Number of log lines to return (default 50, max 200)"),
+            },
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="cortex_status",
+        description=(
+            "Return Cortex service status: current git branch and commit, how many commits "
+            "ahead/behind the remote, and the systemctl service state. "
+            "Use to check what version is running or whether the service is healthy. "
+            "ADMIN ONLY."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+    types.FunctionDeclaration(
+        name="cortex_update",
+        description=(
+            "Pull the latest code from git, run a syntax check on all Python files, and report "
+            "what changed. Does NOT restart automatically — call cortex_restart separately after "
+            "reviewing the output. Will report syntax errors if the pull introduces broken code. "
+            "ADMIN ONLY. Requires confirmation."
+        ),
+        parameters=types.Schema(type=types.Type.OBJECT, properties={}),
+    ),
+]
--- a/cortex/tools/tasks.py
+++ b/cortex/tools/tasks.py
@@ -20,6 +20,7 @@ import asyncio
 from datetime import datetime, timezone
 from pathlib import Path

+from google.genai import types
 from persona import persona_path


@@ -59,13 +60,15 @@ def _format_task(t: dict) -> str:
 # Sync implementations — called via asyncio.to_thread
 # ---------------------------------------------------------------------------

-def _task_list(status: str | None) -> str:
+def _task_list(status: str | None, priority: str | None) -> str:
    tasks = _load()
    if status:
        tasks = [t for t in tasks if t["status"] == status]
+    if priority:
+        tasks = [t for t in tasks if t.get("priority") == priority]
    if not tasks:
-        label = f"No {status} tasks." if status else "No tasks yet."
-        return label
+        filters = " ".join(f for f in [status, priority] if f)
+        return f"No {filters} tasks." if filters else "No tasks yet."
    lines = [f"Tasks ({len(tasks)}):\n"]
    for t in tasks:
        lines.append(_format_task(t))
@@ -117,8 +120,8 @@ def _task_complete(task_id: str) -> str:
 # Async wrappers
 # ---------------------------------------------------------------------------

-async def task_list(status: str | None = None) -> str:
-    return await asyncio.to_thread(_task_list, status)
+async def task_list(status: str | None = None, priority: str | None = None) -> str:
+    return await asyncio.to_thread(_task_list, status, priority)


 async def task_create(title: str, description: str | None = None,
@@ -133,3 +136,71 @@ async def task_update(task_id: str, status: str | None = None, title: str | None

 async def task_complete(task_id: str) -> str:
    return await asyncio.to_thread(_task_complete, task_id)
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="task_list",
+        description=(
+            "List personal tasks from Inara's task list. "
+            "Use this to check what's on the list, review pending work, or find a task ID. "
+            "Optionally filter by status: 'todo', 'in_progress', or 'done'."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "status": types.Schema(type=types.Type.STRING, description="Filter by status: 'todo', 'in_progress', or 'done'. Omit to list all."),
+                "priority": types.Schema(type=types.Type.STRING, description="Filter by priority: 'low', 'normal', or 'high'. Omit to list all priorities."),
+            },
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="task_create",
+        description=(
+            "Add a new task to Inara's personal task list. "
+            "Use this when the user asks to remember something, add a to-do, or track a follow-up."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "title": types.Schema(type=types.Type.STRING, description="Short task title"),
+                "description": types.Schema(type=types.Type.STRING, description="Optional longer description or context"),
+                "priority": types.Schema(type=types.Type.STRING, description="Priority: 'low', 'normal', or 'high'. Default: normal."),
+            },
+            required=["title"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="task_update",
+        description=(
+            "Update an existing task. Use task_list first to get the task ID. "
+            "Can update status, title, description, or priority. "
+            "To just mark complete, use task_complete instead."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "task_id": types.Schema(type=types.Type.STRING, description="Task ID (e.g. t_abc123) — get from task_list"),
+                "status": types.Schema(type=types.Type.STRING, description="New status: 'todo', 'in_progress', or 'done'"),
+                "title": types.Schema(type=types.Type.STRING, description="Updated title"),
+                "description": types.Schema(type=types.Type.STRING, description="Updated description"),
+                "priority": types.Schema(type=types.Type.STRING, description="Updated priority: 'low', 'normal', or 'high'"),
+            },
+            required=["task_id"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="task_complete",
+        description=(
+            "Mark a task as done. Use task_list first to get the task ID. "
+            "Shorthand for task_update with status='done'."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "task_id": types.Schema(type=types.Type.STRING, description="Task ID (e.g. t_abc123) — get from task_list"),
+            },
+            required=["task_id"],
+        ),
+    ),
+]
--- a/cortex/tools/web.py
+++ b/cortex/tools/web.py
@@ -1,13 +1,17 @@
 """
-Web search tool — DuckDuckGo backend.
-
-Uses the duckduckgo-search library. Set DDG_API_KEY in .env for a paid account
-(higher rate limits). The free unauthenticated tier works for moderate usage.
+Web tools — search (DuckDuckGo), direct HTTP fetch, clean content extraction, and HTTP POST.
 """

 import asyncio
+import json
 import logging
+from urllib.parse import urlparse
+
+import httpx
+from google.genai import types
+
 from config import settings
+from persona import get_user

 logger = logging.getLogger(__name__)

@@ -48,3 +52,216 @@ def _sync_search(query: str, max_results: int) -> list[dict]:
    except Exception as e:
        logger.warning("DuckDuckGo search error: %s", e)
        return []
+
+
+async def http_fetch(
+    url: str,
+    method: str = "GET",
+    body: str | None = None,
+    timeout: int = 15,
+    max_chars: int = 8192,
+) -> str:
+    """Fetch a URL directly and return the raw response body.
+
+    Unlike web_search, this hits a specific URL — useful for health checks,
+    API probing, JSON endpoints, webhook testing, or reading raw page source.
+    For readable article content, use web_read instead.
+    Response body is capped at max_chars (default 8192, max 32768).
+    """
+    method = method.upper()
+    timeout = min(max(int(timeout), 1), 60)
+    max_chars = min(max(int(max_chars), 100), 131072)
+    try:
+        async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
+            resp = await client.request(method, url, content=body)
+            body_text = resp.text[:max_chars]
+            truncated = len(resp.text) > max_chars
+            suffix = f"\n\n[… truncated at {max_chars} chars]" if truncated else ""
+            return f"HTTP {resp.status_code} {resp.url}\n\n{body_text}{suffix}"
+    except httpx.HTTPError as e:
+        return f"HTTP error: {e}"
+    except Exception as e:
+        logger.warning("http_fetch error for %s: %s", url, e)
+        return f"Error: {e}"
+
+
+async def web_read(url: str, max_chars: int = 16000) -> str:
+    """Fetch a URL and extract clean readable text, stripping ads, navigation, and boilerplate.
+
+    Uses trafilatura to extract the main article content — ideal for blog posts,
+    documentation, news articles, and any page where you want the text without
+    surrounding noise. Returns markdown-formatted output.
+    For raw responses (JSON APIs, health checks), use http_fetch instead.
+    """
+    max_chars = min(max(int(max_chars), 1000), 131072)
+    return await asyncio.to_thread(_sync_web_read, url, max_chars)
+
+
+def _sync_web_read(url: str, max_chars: int) -> str:
+    try:
+        import trafilatura
+    except ImportError:
+        return "web_read requires trafilatura — run: pip install trafilatura"
+
+    downloaded = trafilatura.fetch_url(url)
+    if downloaded is None:
+        return f"Failed to download content from: {url}"
+
+    text = trafilatura.extract(downloaded, output_format="markdown", include_links=True, url=url)
+    if not text:
+        text = trafilatura.extract(downloaded, url=url)
+    if not text:
+        return f"Could not extract readable content from: {url}"
+
+    if len(text) > max_chars:
+        text = text[:max_chars] + f"\n\n[… truncated at {max_chars} chars — pass a larger max_chars (up to 131072) to see more]"
+    return f"Content from {url}:\n\n{text}"
+
+
+def _load_http_allowlist(username: str) -> list[str]:
+    """Load per-user HTTP POST allowlist (URL prefixes). Empty list = all blocked."""
+    path = settings.home_root() / username / "http_allowlist.json"
+    try:
+        return [str(p).strip() for p in json.loads(path.read_text()) if str(p).strip()]
+    except FileNotFoundError:
+        return []
+    except Exception as e:
+        logger.warning("failed to read http_allowlist.json for %s: %s", username, e)
+        return []
+
+
+def _http_post_allowed(url: str, allowlist: list[str]) -> bool:
+    """Return True if url starts with any allowlist entry (prefix match)."""
+    for prefix in allowlist:
+        if url.startswith(prefix):
+            return True
+    return False
+
+
+async def http_post(
+    url: str,
+    body: str = "",
+    headers: dict | None = None,
+    max_chars: int = 4096,
+) -> str:
+    """POST to an external URL. Requires the URL to match home/{user}/http_allowlist.json.
+
+    body may be a JSON string or plain text. If body is valid JSON, Content-Type is set
+    to application/json; otherwise text/plain. Override via the headers param.
+    Response is capped at max_chars (default 4096, max 131072).
+    """
+    username = get_user()
+    allowlist = _load_http_allowlist(username)
+    if not allowlist:
+        return (
+            f"http_post blocked — no allowlist configured. "
+            f"Add allowed URL prefixes to home/{username}/http_allowlist.json as a JSON array. "
+            f"Example: [\"https://api.example.com\"]"
+        )
+    if not _http_post_allowed(url, allowlist):
+        return (
+            f"http_post blocked — {url} does not match any allowlist entry for {username}. "
+            f"Add the URL prefix to home/{username}/http_allowlist.json."
+        )
+
+    max_chars = min(max(int(max_chars), 100), 131072)
+
+    # Auto-detect content type from body
+    body_str = body if isinstance(body, str) else json.dumps(body)
+    try:
+        json.loads(body_str)
+        content_type = "application/json"
+    except (json.JSONDecodeError, ValueError):
+        content_type = "text/plain"
+
+    req_headers = {"Content-Type": content_type}
+    if headers:
+        req_headers.update(headers)
+
+    try:
+        async with httpx.AsyncClient(timeout=30, follow_redirects=True) as client:
+            resp = await client.post(url, content=body_str.encode(), headers=req_headers)
+            body_text = resp.text[:max_chars]
+            truncated = len(resp.text) > max_chars
+            suffix = f"\n\n[… truncated at {max_chars} chars]" if truncated else ""
+            return f"HTTP {resp.status_code} {resp.url}\n\n{body_text}{suffix}"
+    except httpx.HTTPError as e:
+        return f"HTTP error: {e}"
+    except Exception as e:
+        logger.warning("http_post error for %s: %s", url, e)
+        return f"Error: {e}"
+
+
+DECLARATIONS = [
+    types.FunctionDeclaration(
+        name="web_search",
+        description=(
+            "Search the web for current information. Use this when you need up-to-date "
+            "facts, news, documentation, or anything not in your training data."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "query": types.Schema(type=types.Type.STRING, description="The search query string"),
+                "max_results": types.Schema(type=types.Type.INTEGER, description="Number of results to return (default 5, max 10)"),
+            },
+            required=["query"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="http_fetch",
+        description=(
+            "Fetch a specific URL and return the raw response body. Unlike web_search, this hits "
+            "a direct URL — useful for health checks, JSON API endpoints, webhook testing, "
+            "or inspecting raw page source. For readable article/doc content, use web_read instead. "
+            "Response body is capped at max_chars (default 8192, max 32768)."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "url": types.Schema(type=types.Type.STRING, description="Full URL to fetch"),
+                "method": types.Schema(type=types.Type.STRING, description="HTTP method: GET (default), POST, HEAD"),
+                "body": types.Schema(type=types.Type.STRING, description="Optional request body (for POST requests)"),
+                "timeout": types.Schema(type=types.Type.INTEGER, description="Request timeout in seconds (default 15, max 60)"),
+                "max_chars": types.Schema(type=types.Type.INTEGER, description="Max characters to return (default 8192, max 131072)"),
+            },
+            required=["url"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="web_read",
+        description=(
+            "Fetch a URL and extract clean readable text, stripping ads, navigation, sidebars, "
+            "and other boilerplate. Returns the main article/document content as markdown. "
+            "Use this for blog posts, documentation, news articles, GitHub READMEs, or any page "
+            "where you want the content without surrounding noise. "
+            "For raw HTTP responses (JSON APIs, health checks, source inspection), use http_fetch."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "url": types.Schema(type=types.Type.STRING, description="Full URL to fetch and extract"),
+                "max_chars": types.Schema(type=types.Type.INTEGER, description="Max characters to return (default 16000, max 131072)"),
+            },
+            required=["url"],
+        ),
+    ),
+    types.FunctionDeclaration(
+        name="http_post",
+        description=(
+            "POST to an external URL. Requires the URL to match the user's http_allowlist.json. "
+            "Use for calling webhooks, triggering automations, posting to APIs, or any HTTP action. "
+            "body is a string — JSON or plain text are both accepted (Content-Type auto-detected). "
+            "Override headers as needed. Response capped at max_chars (default 4096, max 131072)."
+        ),
+        parameters=types.Schema(
+            type=types.Type.OBJECT,
+            properties={
+                "url":       types.Schema(type=types.Type.STRING, description="Full URL to POST to"),
+                "body":      types.Schema(type=types.Type.STRING, description="Request body — JSON string or plain text"),
+                "max_chars": types.Schema(type=types.Type.INTEGER, description="Max response chars (default 4096, max 131072)"),
+            },
+            required=["url"],
+        ),
+    ),
+]
--- a/cortex/usage_tracker.py
+++ b/cortex/usage_tracker.py
@@ -0,0 +1,75 @@
+"""
+API usage and token tracking.
+
+Writes daily buckets to home/{username}/usage.json:
+
+  {
+    "2026-05-01": {
+      "gemini_api/gemini-2.0-flash": {"calls": 3, "prompt_tokens": 8400, "completion_tokens": 520},
+      "local/llama3.2:latest":       {"calls": 2, "prompt_tokens": 1200, "completion_tokens": 310}
+    }
+  }
+
+Claude CLI and Gemini CLI backends produce no structured token data and are not tracked.
+"""
+
+import asyncio
+import json
+import logging
+from datetime import date
+from pathlib import Path
+
+from config import settings
+
+logger = logging.getLogger(__name__)
+
+_LOCK = asyncio.Lock()
+
+
+def _usage_path(username: str) -> Path:
+    return settings.home_root() / username / "usage.json"
+
+
+async def record(
+    username: str,
+    backend: str,
+    model_name: str,
+    prompt_tokens: int,
+    completion_tokens: int,
+) -> None:
+    """Append one call's token counts to the daily usage log for this user.
+
+    backend    — "gemini_api" | "local"
+    model_name — the exact model string (e.g. "gemini-2.0-flash", "llama3.2:latest")
+    """
+    path = _usage_path(username)
+    today = date.today().isoformat()
+    key = f"{backend}/{model_name}"
+
+    async with _LOCK:
+        try:
+            data: dict = json.loads(path.read_text()) if path.exists() else {}
+        except Exception:
+            data = {}
+
+        entry = data.setdefault(today, {}).setdefault(
+            key, {"calls": 0, "prompt_tokens": 0, "completion_tokens": 0}
+        )
+        entry["calls"] += 1
+        entry["prompt_tokens"] += prompt_tokens
+        entry["completion_tokens"] += completion_tokens
+
+        try:
+            path.parent.mkdir(parents=True, exist_ok=True)
+            path.write_text(json.dumps(data, indent=2))
+        except Exception as e:
+            logger.warning("Failed to write usage data to %s: %s", path, e)
+
+
+def read_usage(username: str) -> dict:
+    """Return the full usage dict for this user. Empty dict if no file yet."""
+    path = _usage_path(username)
+    try:
+        return json.loads(path.read_text()) if path.exists() else {}
+    except Exception:
+        return {}
--- a/documentation/ARCH__AE_INTEGRATION.md
+++ b/documentation/ARCH__AE_INTEGRATION.md
@@ -0,0 +1,226 @@
+# Aether Platform Integration — Cortex Tool Layer
+
+> Last updated: 2026-04-30
+> Status: Journal toolset complete — broader AE integration planned
+
+This doc covers how Cortex/Inara integrates with the Aether Platform API, what's
+implemented, what the data model looks like, and what's planned next.
+
+---
+
+## Overview
+
+Cortex connects to the Aether Platform V3 API to give the orchestrator read/write
+access to the user's knowledge base (Journals) and task data. Auth uses the same
+`x-aether-api-key` + `x-account-id` headers as every other Aether client.
+
+Config lives in `.env`:
+```
+AE_API_URL=https://dev-api.oneskyit.com
+AE_API_KEY=...
+AE_ACCOUNT_ID=...
+AE_API_TIMEOUT=15
+```
+
+Tool implementation: `cortex/tools/ae_knowledge.py`
+Tool registrations: `cortex/tools/__init__.py`
+
+---
+
+## V3 Search Engine
+
+### Endpoint
+```
+POST /v3/crud/{obj_type}/search
+```
+For nested objects (journal_entry scoped to a journal):
+```
+POST /v3/crud/journal_entry/search
+  ?for_obj_type=journal&for_obj_id={journal_id}
+```
+
+### Search body
+```json
+{
+  "query_string": "fulltext search term",
+  "and": [
+    { "field": "tags",       "op": "icontains", "value": "networking" },
+    { "field": "created_on", "op": "gte",        "value": "2026-01-01" }
+  ],
+  "or": [...],
+  "page_size": 20,
+  "page": 1,
+  "order_by": "-updated_on"
+}
+```
+
+**`query_string` vs `and` filters on `default_qry_str`:**
+- `query_string` → triggers `MATCH(default_qry_str) AGAINST(... IN BOOLEAN MODE)` — uses the
+  FULLTEXT index. Faster and supports boolean operators (`+word`, `-word`, `"phrase"`).
+- `and` with `icontains` on `default_qry_str` → plain `LIKE '%term%'`. Slower, no index.
+
+**Important:** `query_string` must be present for `and`/`or` filters to apply. When using
+filters without a keyword query, pass `query_string: "%"` as a wildcard to activate the
+filter path without restricting by keyword.
+
+### Supported operators
+| Operator | SQL | Notes |
+|---|---|---|
+| `eq` | `=` | exact match |
+| `ne` | `!=` | not equal |
+| `gt` / `gte` | `>` / `>=` | numeric, dates |
+| `lt` / `lte` | `<` / `<=` | numeric, dates |
+| `contains` / `icontains` | `LIKE '%v%'` | substring; both case-insensitive on MariaDB |
+| `startswith` / `istartswith` | `LIKE 'v%'` | |
+| `endswith` / `iendswith` | `LIKE '%v'` | |
+| `like` | `LIKE` | raw LIKE pattern |
+| `in` | `IN (...)` | value is a list |
+| `is_null` / `is_not_null` | `IS NULL` / `IS NOT NULL` | no value needed |
+
+### Sorting
+`order_by` accepts any indexed field name. Prefix with `-` for descending:
+- `-updated_on` (default for listing)
+- `-created_on`
+- `name`
+- `-priority`
+
+### Pagination
+`page_size` (default 10, max ~100) + `page` (1-based).
+Total count is in `response["meta"]["data_list_count"]` — not a top-level key.
+
+---
+
+## journal_entry Schema
+
+Full table schema from `ae_describe journal_entry --detailed`:
+
+| Field | Type | Indexed | Notes |
+|---|---|---|---|
+| `id_random` | varchar(22) | UNI | DB public ID field — but API responses return this as `journal_entry_id` (the Vision ID convention: `{obj_type}_id`). `id_random` key is `None` in responses. |
+| `journal_id` | int | MUL | FK — use `for_obj_id` param in search |
+| `name` | varchar(250) | MUL | Entry title |
+| `short_name` | varchar(25) | | |
+| `summary` | text | | Short summary (1–2 sentences) |
+| `content` | text | | Full markdown content |
+| `content_html` | text | | HTML version |
+| `content_json` | longtext | | Structured content (editor format) |
+| `content_encrypted` | longtext | | Optional encrypted content |
+| `tags` | varchar(255) | MUL | Comma-separated string — filter with `icontains` |
+| `type` / `type_code` | varchar | | Classification: type |
+| `topic` / `topic_code` | varchar | | Classification: topic |
+| `activity` / `activity_code` | varchar | | Classification: activity |
+| `category_code` | varchar(25) | | Classification: category |
+| `code` | varchar(20) | | Short entry code |
+| `start_datetime` | datetime | MUL | Optional event start |
+| `end_datetime` | datetime | | Optional event end |
+| `seconds` / `hours` | int/decimal | | Duration |
+| `priority` | tinyint | MUL | 1=low → 5=high |
+| `status` | int | MUL | Status code (domain-specific) |
+| `private` / `public` / `personal` / `professional` | tinyint | MUL | Visibility flags |
+| `billable` | tinyint | | Billing flag |
+| `enable` | tinyint NOT NULL | MUL | Soft-delete flag (default 1) |
+| `hide` | tinyint | MUL | UI hide flag |
+| `archive` | tinyint | MUL | Archived flag |
+| `default_qry_str` | text | FULLTEXT | Auto-generated search target (name + content) |
+| `data_json` | longtext | | Arbitrary structured data |
+| `notes` | text | | Internal notes |
+| `created_on` | timestamp NOT NULL | MUL | Auto-set on create |
+| `updated_on` | timestamp | MUL | Auto-updated on change |
+
+### journal Schema (top-level)
+
+| Field | Type | Notes |
+|---|---|---|
+| `id_random` | varchar(22) | DB field — returned in API as `journal_id` |
+| `name` | varchar(250) | Journal name |
+| `summary` / `description` | text | |
+| `type_code` | varchar(25) | Journal type |
+| `enable` | tinyint | |
+| `created_on` / `updated_on` | timestamp | |
+
+---
+
+## Current Tool Inventory
+
+| Tool | Status | Notes |
+|---|---|---|
+| `ae_journal_list` | ✅ | Lists journals with id + name |
+| `ae_journal_search` | ✅ | Fulltext + tag/date/type/status/priority filters; paginated |
+| `ae_journal_entry_read` | ✅ | Full content by entry_id; configurable truncation |
+| `ae_journal_entries_list` | ✅ | Browse a journal newest-first; paginated |
+| `ae_journal_entry_create` | ✅ | Create with title, content, tags, summary |
+| `ae_journal_entry_update` | ✅ | Patch any fields (title, content, tags, summary, enable) |
+| `ae_journal_entry_disable` | ✅ | Soft-delete (enable=false) |
+| `ae_journal_entry_append` | ✅ | Timestamped append to bottom |
+| `ae_journal_entry_prepend` | ✅ | Timestamped prepend to top |
+| `ae_task_list` | ✅ | agents_sync Kanban (admin only) |
+
+---
+
+## ae_journal_search — Current Signature
+
+All filters are optional and combine with AND. At least one should be provided.
+
+```python
+ae_journal_search(
+    query: str = "",          # fulltext via query_string (MATCH/AGAINST)
+    journal_id: str = "",     # scope to a specific journal
+    tags: str = "",           # icontains on tags field
+    type_code: str = "",      # eq on type_code
+    topic_code: str = "",     # eq on topic_code
+    date_from: str = "",      # created_on gte  (YYYY-MM-DD)
+    date_to: str = "",        # created_on lte  (YYYY-MM-DD, exclusive of time — use next day to include full day)
+    sort_by: str = "updated", # updated | created | name | priority
+    sort_order: str = "desc",
+    status: int | None = None,
+    priority: int | None = None,
+    max_results: int = 10,
+    page: int = 1,
+)
+```
+
+**date_to boundary note:** `date_to='2026-01-17'` means `<= 2026-01-17 00:00:00`, which
+excludes entries created later that day. Use `date_to='2026-01-18'` to include all of Jan 17.
+
+---
+
+## Planned: Broader AE Platform Integration
+
+### Phase 1 — Journal Toolset (current)
+Complete read/write/search for Journals and Journal Entries.
+
+### Phase 2 — Tasks & Projects
+- `ae_task_create` / `ae_task_update` / `ae_task_complete` on Aether tasks (not just agents_sync Kanban)
+- Read project/task hierarchy
+
+### Phase 3 — Knowledge Import Pipeline
+- Script to walk markdown dirs, chunk by H2, create Journal entries
+- Dedup via search-before-create pattern
+- Tag and classify entries automatically via orchestrator
+
+### Phase 4 — People & Contacts
+- Read contact records (person, organization)
+- Link journal entries to contacts
+
+### Phase 5 — Calendar / Events
+- `start_datetime` / `end_datetime` already on journal_entry
+- Could expose time-scoped journal queries as a calendar view
+
+---
+
+## Notes on `tags` field
+
+`tags` is stored as a raw comma-separated varchar(255), not a JSON array.
+The API accepts a Python list on write (the `tags` PATCH key takes a list and the backend joins it).
+On read, it comes back as a **string** (e.g. `"shelterluv, api"`), not a list — normalize before
+displaying: `[t.strip() for t in tags_str.split(",") if t.strip()]`.
+For filtering: use `icontains` on `tags` inside the `"and"` list, e.g.:
+`{"field": "tags", "op": "icontains", "value": "networking"}`.
+A tag search for "net" matches "networking" AND "subnet" — acceptable for now.
+True per-tag filtering would require a tags junction table.
+
+## Notes on `default_qry_str`
+
+Auto-populated by the backend from `name` + content fields. Do not write to it directly.
+FULLTEXT index supports boolean mode: `+required -excluded "exact phrase"`.
+The `query_string` key in the search body triggers this path automatically.
--- a/documentation/ARCH__BACKENDS.md
+++ b/documentation/ARCH__BACKENDS.md
@@ -1,18 +1,20 @@
 # Architecture: LLM Backends

 > How Cortex selects and talks to AI models.
-> Last updated: 2026-04-06
+> Last updated: 2026-05-06

 ---

-## Backends
+## Providers

-| Backend | Type | Auth | Notes |
-|---|---|---|---|
-| **Claude CLI** | `claude_cli` | OAuth token from `~/.claude/.credentials.json` | Primary chat; model set via `DEFAULT_MODEL` in `.env` |
-| **Gemini CLI** | `gemini_cli` | Gemini CLI credentials | Fallback / explicit selection |
-| **Gemini API** | `gemini_api` | `GEMINI_API_KEY` in `.env` | Orchestrator tool loop only — not general chat |
-| **Local (OpenAI-compat)** | `local_openai` | API key per host in model registry | Open WebUI, Ollama, OpenRouter, LiteLLM, etc. |
+Cortex supports four model types, each dispatched differently:
+
+| Type | Auth | Use |
+|---|---|---|
+| `claude_cli` | OAuth token from `~/.claude/.credentials.json` | Chat, persona responses |
+| `gemini_cli` | Gemini CLI credentials | Chat fallback / explicit selection |
+| `gemini_api` | API key from registry account or `.env` | Orchestrator tool loop |
+| `local_openai` | API key per host in model registry | Open WebUI, Ollama, OpenRouter, LiteLLM, etc. |

 ---

@@ -26,93 +28,139 @@ request's **role** in the user's model registry. Roles: `chat`, `orchestrator`,

 Resolution order for a role:
 1. User registry: `roles[role].primary → backup_1 → backup_2 → backup_3 → backup_4`
-2. `.env` role default: `ROLE_CHAT=claude_cli`, `ROLE_DISTILL=gemini_api`, etc.
+2. `.env` role default: `ROLE_CHAT=claude_cli`, `ROLE_DISTILL=claude_cli`, etc.
 3. Hardcoded last-resort: `chat/distill/coder → claude_cli`, `orchestrator/research → gemini_api`

 ### Explicit Override

-The UI backend toggle cycles: **auto → claude → gemini → local → auto**
+The **Role** toggle in the Context & Memory panel cycles through configured role slots for the `chat` role: **Primary → Backup 1 → Backup 2 → auto**.

- **auto** (default): role-based routing as above; sends `model: null` to `/chat`
- **claude / gemini / local**: bypasses role routing; forces that specific backend
- When "local" is active, the configured model name appears below the toggle button
+- Each slot shows the configured model label
+- `auto` uses the Primary without forcing a specific backend type
+- The ⚡ Tools toggle is independent — it routes to the `orchestrator` role regardless of the chat role selection

-**Fallback chain** (automatic, on any error):
+**Fallback chain** (automatic, only when no explicit registry entry exists):
 ```
 claude  → gemini
 gemini  → claude
 local   → claude
 ```
+When a model is explicitly configured in the registry, errors surface immediately — no silent fallback.

-Each response includes a model label (bottom-right of the message bubble) showing what
-actually responded. Amber label with `⚡` = fallback was used.
-
-Auth expiry on Claude triggers a UI banner + `claude_auth_expired` SSE event.
+Each response shows a model tag (bottom-right of the message bubble) with the model label and host.

 ---

-## Model Registry
+## Model Registry — V2 Schema

 Per-user configuration stored in `home/{user}/model_registry.json`.

-Hosts and models are managed at **Settings → Model Registry** (`/settings/local`).
-
-### Schema
+Managed at **Settings → Models** (`/settings/models`). Full provider UI coming in Phase 2.

 ```json
 {
-  "version": 1,
+  "version": 2,
+
+  "providers": {
+    "anthropic": {
+      "credentials": [
+        {"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}
+      ]
+    },
+    "google": {
+      "accounts": [
+        {"id": "a1b2", "label": "One Sky IT", "api_key": "AIza..."}
+      ]
+    }
+  },
+
  "hosts": [
    {
      "id": "abc123",
-      "label": "Home ML Laptop",
+      "label": "Gaming Laptop",
      "api_url": "http://192.168.x.x:3000",
-      "api_key": "sk-...",
+      "api_key": "",
      "host_type": "openwebui"
    }
  ],
+
  "models": [
    {
-      "id": "def456",
+      "id": "m1",
+      "type": "claude_cli",
+      "label": "Sonnet 4.6 (CLI)",
+      "model_name": "claude-sonnet-4-6",
+      "provider": "anthropic",
+      "credential_id": "cli",
+      "context_k": 200,
+      "tags": ["chat", "persona"]
+    },
+    {
+      "id": "m2",
+      "type": "gemini_api",
+      "label": "Gemini 2.5 Flash (OSIT)",
+      "model_name": "gemini-2.5-flash",
+      "provider": "google",
+      "account_id": "a1b2",
+      "context_k": 1000,
+      "tags": ["orchestrator", "research"]
+    },
+    {
+      "id": "m3",
      "type": "local_openai",
-      "label": "Gemma Medium",
-      "model_name": "agent-support-gemma-medium",
+      "label": "Gemma 4 E4B",
+      "model_name": "gemma4:e4b",
+      "provider": "local",
      "host_id": "abc123",
-      "context_k": 50,
-      "tags": ["chat", "fast"]
+      "context_k": 72,
+      "max_rounds": 5,
+      "tools": true,
+      "tags": ["fast", "local"]
    }
  ],
+
  "roles": {
-    "chat": {
-      "primary": "def456",
-      "backup_1": "claude_cli"
-    }
+    "chat":         {"primary": "m1", "backup_1": "m2", "backup_2": "m3"},
+    "orchestrator": {"primary": "m2", "backup_1": "m3"},
+    "distill":      {"primary": "m1"}
  }
 }
 ```

-### host_type
+### Optional model fields

-Controls which API path layout is used:
+| Field | Type | Default | Meaning |
+|---|---|---|---|
+| `context_k` | int | 32 | Context window in thousands of tokens. Used for compaction budget (75% of window). |
+| `max_rounds` | int \| null | null | Per-model tool loop cap. `null` = use global `orchestrator_max_rounds`. Effective limit = `min(per_model, global)`. |
+| `tools` | bool | true | Whether this model supports tool calling. `false` = skip tool loop entirely; model gets a plain chat request. |
+
+### host_type (local hosts)

 | `host_type` | Chat endpoint | Models endpoint | Use for |
 |---|---|---|---|
 | `openwebui` (default) | `POST {url}/api/chat/completions` | `GET {url}/api/models` | Open WebUI, Ollama |
 | `openai` | `POST {url}/chat/completions` | `GET {url}/models` | OpenRouter, LiteLLM, Anthropic-compat |

-Set `api_url` to the base path ending just before `/chat/completions`:
+Set `api_url` to the base path before `/chat/completions`:
 - OpenRouter: `https://openrouter.ai/api/v1`
- LiteLLM proxy: `http://host:port`

 ### Built-in model IDs

-Always resolvable without a registry entry:
+Always resolvable without a user-created registry entry. Used as role defaults.

-| ID | Backend |
-|---|---|
-| `claude_cli` | Claude CLI subprocess |
-| `gemini_cli` | Gemini CLI subprocess |
-| `gemini_api` | Gemini API (SDK) — orchestrator only |
+| ID | Type | Notes |
+|---|---|---|
+| `claude_cli` | `claude_cli` | Model from `DEFAULT_MODEL` in `.env` |
+| `gemini_cli` | `gemini_cli` | Gemini CLI subprocess |
+| `gemini_api` | `gemini_api` | Model from `ORCHESTRATOR_MODEL` in `.env`; key from `GEMINI_API_KEY` |
+
+### V1 → V2 migration
+
+Automatic on first load. Changes:
+- Adds `providers` section (Anthropic CLI credential + empty Google accounts)
+- Migrates `gemini_api_key` from `auth.json` → `providers.google.accounts[0]`
+- All existing hosts, models, and role assignments are preserved

 ---

@@ -122,9 +170,9 @@ Runs `claude --print --no-session-persistence --output-format text` as a subproc

 - System prompt passed via `--system-prompt`
 - Conversation history formatted as `<conversation>` block
- Token read live from `~/.claude/.credentials.json` on every call — never relies on the
+- Token read live from `~/.claude/.credentials.json` on every call — never uses the
  env var, which goes stale after `claude auth login`
- Model override via `--model` flag when a specific `model_name` is configured in the registry
+- Model override via `--model` flag when `model_name` is set in the registry entry

 Timeout: `TIMEOUT_CLAUDE=60` seconds (`.env`)

@@ -136,7 +184,7 @@ Runs `gemini --output-format text --extensions "" -p <prompt>` as a subprocess.

 - `--extensions ""` disables all MCP extensions — prevents child processes keeping pipes open
 - `start_new_session=True` puts the process in its own group for clean `os.killpg` on timeout
- Output is cleaned to strip CLI noise lines (loading messages, retry notices, quota warnings)
+- Output is cleaned to strip CLI noise (loading messages, retry notices, quota warnings)

 Timeout: `TIMEOUT_GEMINI=120` seconds (`.env`)

@@ -155,13 +203,23 @@ Timeout: `TIMEOUT_LOCAL=300` seconds (`.env`) — local models may need to load

 ---

+## Gemini API (Orchestrator)
+
+Used by `orchestrator_engine.py` for the ReAct tool loop. Not used for general chat.
+
+API key resolution order:
+1. `api_key` embedded in the resolved orchestrator model config (V2 registry with `account_id`)
+2. `get_user_gemini_key(user)` — reads from `auth.json` (legacy, kept for compat)
+3. `GEMINI_API_KEY` in `.env` (server default)
+
+---
+
 ## Distillation

-Memory distillation uses `role="distill"` for mid and long passes. Configure the distill
-model via the Model Registry → Role Assignments → Distill role.
+Memory distillation uses `role="distill"`. Configure via Model Registry → Role Assignments.
+
+`.env` override: `ROLE_DISTILL=claude_cli` (default).

-`.env` override: `ROLE_DISTILL=claude_cli` (default). Set to any built-in ID or leave blank
-to fall through to the hardcoded default (`claude_cli`).

 ---

@@ -170,7 +228,8 @@ to fall through to the hardcoded default (`claude_cli`).
 | File | Responsibility |
 |---|---|
 | `cortex/llm_client.py` | `complete()` — routing, dispatch, fallback |
-| `cortex/model_registry.py` | Per-user registry CRUD and resolution |
+| `cortex/model_registry.py` | Per-user registry CRUD and resolution (V2) |
 | `cortex/routers/local_llm.py` | Settings UI routes + `/api/models/role` AJAX |
 | `cortex/routers/chat.py` | `_backend_label()`, `fallback_used` flag |
+| `cortex/routers/orchestrator.py` | Engine selection, Gemini API key resolution |
 | `cortex/config.py` | `ROLE_*` env defaults, `DEFINED_ROLES`, `PRIMARY_BACKEND` |
--- a/documentation/ARCH__CHANNELS.md
+++ b/documentation/ARCH__CHANNELS.md
@@ -33,7 +33,7 @@ Single-page app served from `cortex/static/`. All chat happens via `POST /chat`

 **Files panel:** Browse and edit persona markdown files in-browser. Session search at the bottom.

-**Settings:** `/settings` — Gemini API key, Google account, connected status. `/settings/local` — local model hosts and models.
+**Settings:** `/settings` — Gemini API key, Google account, connected status. `/settings/models` — model registry (providers, hosts, models, roles).

 ---

@@ -129,16 +129,24 @@ User-defined scheduled jobs stored in `home/{user}/persona/{name}/CRONS.json`. R

 ## Notification Channel Config

-`notification_channel` in `channels.json` sets the default outbound channel for all proactive messages (distill alerts, cron message/brief jobs):
+`notification_channel` in `channels.json` sets the default outbound channel for all proactive messages (distill alerts, cron jobs, reminder checks):

 ```json
 {
-  "notification_channel": "nextcloud",
-  ...
+  "notification_channel": "web_push",
+  "notification_email": "user@example.com",
+  "nextcloud": { "notification_room": "<token>" },
+  "google_chat": { "outbound_webhook": "https://..." }
 }
 ```

-If absent, defaults to `nextcloud` if configured. Currently only NC Talk is supported for outbound; Google Chat outbound is a future item.
+Supported channels: `web_push` (browser push via VAPID), `email`, `nextcloud` (NC Talk), `google_chat`. Configured via **Settings → Notifications** (`/settings/notifications`).
+
+**Proactive notification triggers:**
+- **Daily 09:00** — `_run_reminder_check()` in `scheduler.py`: reads due/overdue reminders per persona, fires `notify()` with a formatted summary
+- **Memory distillation** — `_run_mid()` / `_run_long()` call `notify()` on completion
+- **Cron jobs** — `message` / `brief` job types call `notify()` directly
+- **On-demand** — `POST /api/push/test` (test notification) and `POST /api/push/reminders/check` (immediate reminder check)

 ---

--- a/documentation/ARCH__FUTURE.md
+++ b/documentation/ARCH__FUTURE.md
@@ -1,7 +1,7 @@
 # Architecture: Planned Features

 > What's next and how it's designed to work.
-> Last updated: 2026-04-04
+> Last updated: 2026-05-11

 For the current task list see `TODO__Agents.md`. For phases and priorities see `ROADMAP.md`.

@@ -9,17 +9,17 @@ For the current task list see `TODO__Agents.md`. For phases and priorities see `

 ## 1. Local Orchestrator

-**Status:** High priority — design complete, not yet built.
+**Status:** Partially built — `openai_orchestrator.py` exists and is wired into `POST /orchestrate`. When the `orchestrator` role in the model registry resolves to a `local_openai` model, it routes there automatically. Remaining work is quality/reliability parity with the Gemini orchestrator, not ground-up design.

-Same ReAct tool loop as the Gemini API orchestrator, but driven by a local model via Open WebUI's OpenAI-compatible API. Enables offline/private agent tasks with no API cost.
+Same ReAct tool loop as the Gemini API orchestrator, driven by a local model via Open WebUI's OpenAI-compatible API. Enables offline/private agent tasks with no API cost.

 **Why local models work for this now:** Gemma 4 E4B and 26B A4B both support OpenAI `tools` / `tool_choice` function calling. The tool schema is nearly identical to Gemini's `FunctionDeclaration` — minor field renaming only.

 **Design:**
 ```
-POST /orchestrate  (mode: "local")
+POST /orchestrate  (role resolves to local_openai model)
    ↓
-local_orchestrator_engine.py
+openai_orchestrator.py
    • converts tools/ to OpenAI tools format
    • POST /api/chat/completions with tools array
    • parse tool_calls response
@@ -34,16 +34,57 @@ Model selection:
 - **Gemma 4 26B A4B** (9 t/s, 50k ctx) — heavier reasoning, background tasks

 Context budget per iteration (system prompt + memory + tool results + history):
- Small model: budget ~40-50k tokens per round
- Medium model: budget ~35-40k tokens per round
+- Small model: budget ~40–50k tokens per round
+- Medium model: budget ~35–40k tokens per round
+
+Context compaction (to implement): automatically trim stale tool results mid-run when
+approaching the budget ceiling, preserving only the most recent N tool exchanges.

 Full API reference: [`docs/OPEN_WEBUI_API.md`](../docs/OPEN_WEBUI_API.md)

 ---

-## 2. Dev Agent Pipeline
+## 2. Orchestrator Tool Expansions

-**Status:** Design complete, not yet built.
+**Status:** Ongoing. Current tool count: 45. Previously planned tools are all complete.
+
+### Completed
+All originally planned tools are live: `cortex_restart`, `cortex_logs`, `http_fetch`,
+`file_list`, `file_write`, `nc_talk_send`, `email_send`, `web_push`, `agent_notes_*`.
+
+### Next additions
+
+**Datetime note:** The current date and time is already injected into every system prompt
+via `context_loader.py` (`--- System --- Current date and time: ...`). A dedicated
+`datetime_now` tool is not needed — the timestamp is always in context.
+
+### Completed Round 2
+| Tool | Notes |
+|---|---|
+| `session_search` | `tools/files.py` — full-text grep across session logs; params: `query`, `limit` (max 20); own sessions only via ContextVars. 2026-05-08 |
+| `reminders due dates` | `tools/reminders.py` — optional `due: YYYY-MM-DD` on `reminders_add`; `load_due_reminders()` suppresses future-dated entries from context. 2026-05-08 |
+| `spawn_agent` | `tools/agents.py` — sync sub-agent via role model; semaphore per host (`max_concurrent` in host schema); `asyncio.wait_for` timeout; admin-only. 2026-05-08 |
+
+### Remaining Round 2
+
+| Tool | Module | Priority | Description |
+|---|---|---|---|
+| `http_post` | `web.py` | Medium | POST to an external URL — for webhooks, REST APIs, form submissions. Requires a per-user host allowlist (same pattern as `email_send`) to prevent misuse. |
+| `nc_talk_history` | `notify.py` | Medium | Read recent messages from a Nextcloud Talk conversation. The bot can send but cannot read — adding read capability gives it full context before replying. |
+| `task_list` priority filter | `tasks.py` | Low | `task_list` accepts `status` but not `priority`. Add `priority` param so the agent can ask "what are my high-priority tasks?" without returning everything. |
+| `http_fetch` max_chars | `web.py` | Low | Currently hardcapped at 8,192 chars. Accept optional `max_chars` param so callers can request more or less content. |
+
+### Not needed / deferred
+- **`datetime_now`** — already in system prompt (see note above)
+- **`memory_read`** — memory files are already loaded into system prompt at Tier 2+; a tool adds no value except at Tier 1, which is a rare edge case
+- **Calculator** — modern models handle arithmetic well; `shell_exec` covers edge cases for admins
+- **Google Calendar** — useful but requires Google API OAuth scope expansion; defer until auth layer supports it
+
+---
+
+## 3. Dev Agent Pipeline
+
+**Status:** Design complete, not yet built. Review §8 (Agent Architecture Patterns) before starting.

 Accept a plain-English task, implement code changes, verify them, and present for human approval before committing.

@@ -64,7 +105,7 @@ Supervisor Agent
 Human approval gate
    • summary in Cortex UI or NC Talk
    • approve → commit (+ optional push)
-    • reject <EFBFBD><EFBFBD> feedback back to specialist
+    • reject → feedback back to specialist
 ```

 **Specialists** (both Claude CLI):
@@ -84,7 +125,7 @@ Human approval gate

 ---

-## 3. Gitea Integration
+## 4. Gitea Integration

 **Status:** Not started. pfSense port forward for SSH already confirmed working.

@@ -97,7 +138,7 @@ SSH clone/push: `git clone ssh://git@git.dgrzone.com:2222/<user>/<repo>.git`

 ---

-## 4. Knowledge Layer (AE Journals)
+## 5. Knowledge Layer (AE Journals)

 **Status:** Tools exist, import script not yet built.

@@ -122,16 +163,19 @@ AE Journals becomes the searchable long-term knowledge base. Complements memory

 ---

-## 5. Intelligent Model Routing
+## 6. Intelligent Model Routing

-**Status:** Deferred. Currently user-toggled.
+**Status:** Partially addressed. Model Registry V2 (2026-04-27) introduced role-based routing —
+`chat`, `orchestrator`, `distill`, `coder`, `research` roles each have their own primary/backup
+model chain, and the UI role toggle lets users manually select which role handles a message.
+Automatic task-characteristic routing (below) is still deferred.

-Route automatically based on task characteristics rather than requiring manual backend selection:
+Route automatically based on task characteristics rather than requiring manual selection:

 | Task type | Backend | Reason |
 |---|---|---|
 | User-facing conversation | Claude | Quality prose, persona fidelity |
-| Tool use / orchestration | Gemini API | Native function calling, free tier |
+| Tool use / orchestration | Gemini API or local | Native function calling |
 | Private / sensitive / offline | Local (Ollama) | No data leaves the network |
 | Long context (>50k tokens) | Gemini 2.0 | 1M token context window |
 | Fast/cheap simple queries | Local (E4B) | 25 t/s, no API cost |
@@ -140,7 +184,7 @@ Routing logic would live in `llm_client.py` or a new `router.py` — map task me

 ---

-## 6. RAG via Open WebUI
+## 7. RAG via Open WebUI

 **Status:** Future — Open WebUI already supports it.

@@ -152,9 +196,9 @@ API reference: [`docs/OPEN_WEBUI_API.md`](../docs/OPEN_WEBUI_API.md) — RAG sec

 ---

-## 8. Agent Architecture Ideas (from Claude Code leak)
+## 8. Agent Architecture Patterns — Research

-**Status:** Research — review before building dev agent pipeline and orchestrator.
+**Status:** Research — review before building dev agent pipeline and local orchestrator.

 The Claude Code system prompt was leaked in early April 2026. Two reimplementation repos are worth reading for design ideas before building out the dev agent pipeline and local orchestrator:

@@ -175,18 +219,326 @@ The Claude Code system prompt was leaked in early April 2026. Two reimplementati

 **File history journaling** — beyond session logs, a journal of what files changed and why, with replay summaries. Different from memory distillation — more like a git log for agent actions. Could complement the supervisor agent's diff review.

-**Plugin/manifest-based tool extensions** — tools declared via manifest rather than hardcoded in `__init__.py`. Would make adding new orchestrator tools less invasive. Worth considering before the tool suite grows much larger.
+**Plugin/manifest-based tool extensions** — tools declared via manifest rather than hardcoded in `__init__.py`. Would make adding new orchestrator tools less invasive. Worth considering before the tool suite grows much larger (currently 27 tools).

 ---

-## 7. Permanent Fleet Hosting
+## 9. Permanent Fleet Hosting

-**Status:** Deferred.
+**Status:** Deferred. Currently running on `scott-lt-i7-rtx` (gaming/agents laptop).

-Currently running on `scott_lpt` (main laptop). Long-term target: home server (always-on, Docker).
+Long-term target: home server (always-on, Docker). `docker-compose.yml` already exists in the project root.

-`docker-compose.yml` already exists in the project root. Deployment path:
+Deployment path:
 1. Copy to home server
 2. Configure reverse proxy (Nginx, already Docker-hosted)
-3. Set subdomain `cortex.dgrzone.com` → home server internal IP
+3. Update `cortex.dgrzone.com` → home server internal IP in pfSense
 4. WireGuard required for all access — not internet-exposed
+5. Update `FLEET_MANIFEST.md` and CLAUDE.md fleet table
+
+---
+
+## 10. Cortex Mesh — Multi-Instance Fleet
+
+**Status:** Concept — no design yet.
+
+Rather than a single Cortex instance, each device in the fleet runs its own instance with its own persona(s), local models, and capabilities. Instances can delegate tasks to each other based on available resources and roles.
+
+**Use cases:**
+- `scott_lpt` (edit/dev node) delegates code tasks to `scott-lt-i7-rtx` (GPU/Ollama host)
+- A background cron on one instance triggers an orchestrated task on another
+- Each instance has its own "best available" model — mesh routing picks the right node automatically
+
+**Design questions to resolve:**
+- Auth between instances (shared JWT secret vs. per-instance API keys)
+- How instances advertise capabilities (model registry over HTTP? shared Syncthing file?)
+- Whether `ae_send_message` / the existing inbox system is the right coordination layer or if a dedicated Cortex-to-Cortex protocol is needed
+- Session continuity — does a conversation that starts on one node stay there, or can it migrate?
+
+The Syncthing-synced `home/` directory and shared `model_registry.json` already provide a natural foundation — instances share persona memory and context without a central DB.
+---
+
+## 11. LLM Wiki — Persistent Knowledge Compilation (Karpathy Pattern)
+
+**Status:** Concept — no design yet. Inspired by [Karpathy's llm-wiki](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) gist.
+
+**Core idea:** Instead of treating AE Journals as an archive you retrieve from, evolve them into a **living wiki** that the LLM incrementally builds and maintains. When a new source is added, the LLM doesn't just index it — it reads it, extracts key information, and integrates it into the existing wiki: updating entity pages, revising topic summaries, flagging contradictions, strengthening or challenging the evolving synthesis. Knowledge is compiled once and kept current, not re-derived on every query.
+
+This is a philosophical shift from our current approach (RAG/retrieval) toward **compounding knowledge** — the wiki gets richer with every source added and every question asked.
+
+### Three-Layer Architecture
+
+```
+Raw Sources (immutable)          ↓
+    → LLM reads, extracts, cross-references
+Wiki (LLM-maintained markdown)  ← the persistent artifact
+    → Human reads, LLM writes
+Schema (CLAUDE.md / AGENTS.md)  ← configuration + conventions
+```
+
+1. **Raw sources** — curated, immutable originals (articles, papers, session logs, transcripts). LLM reads from them, never modifies them.
+2. **The wiki** — directory of LLM-generated markdown files: summaries, entity pages, concept pages, comparisons, synthesis. The LLM owns this layer entirely. Creates pages, updates them when new sources arrive, maintains cross-references.
+3. **Schema** — a configuration document (analogous to our `PROTOCOLS.md`) that tells the LLM how the wiki is structured, what conventions to follow, and what workflows to use when ingesting sources or answering questions. Co-evolved with the human over time.
+
+### Operations
+
+**Ingest.** Drop a new source into the raw collection and tell the LLM to process it. Flow: LLM reads source → discusses key takeaways with human → writes summary page → updates index → updates relevant entity/concept pages (a single source might touch 10-15 pages) → appends to log. Human stays involved, guiding emphasis.
+
+**Query.** Ask questions against the wiki. LLM reads the index to find relevant pages, drills in, synthesizes an answer with citations. **Key insight: good answers get filed back into the wiki as new pages.** A comparison table, an analysis, a connection discovered — these are valuable and shouldn't disappear into chat history.
+
+**Lint.** Periodic health check: contradictions between pages, stale claims superseded by newer sources, orphan pages with no inbound links, missing cross-references, data gaps that could be filled with a web search.
+
+### Index and Log (Two Navigation Files)
+
+**`index.md`** — content-oriented catalog. Every wiki page listed with link, one-line summary, and optional metadata (date, source count). Organized by category. LLM updates on every ingest. At moderate scale (~100 sources, ~hundreds of pages), this replaces the need for embedding-based RAG.
+
+**`log.md`** — chronological, append-only record of what happened and when (ingests, queries, lint passes). Each entry starts with a consistent prefix (e.g. `## [2026-04-02] ingest | Article Title`) making it parseable with simple tools like `grep "^## \[" log.md | tail -5`.
+
+### Applicability to Cortex / Inara
+
+This pattern maps naturally to several existing concepts:
+
+| Karpathy Concept | Cortex Equivalent | Gap |
+|---|---|---|
+| Raw sources | Session logs, imported docs | No curated raw-source collection yet |
+| Wiki pages | AE Journals | Journals are entry-based, not interlinked-wiki-based |
+| Index + Log | No equivalent | Would need `wiki_index.md` and `wiki_log.md` |
+| Schema/Protocols | PROTOCOLS.md, OPERATIONS.md | Not configured for wiki maintenance workflows |
+| Lint operation | No equivalent | No periodic wiki health-check exists |
+| Answers filed back | Session chat history | Answers are lost after session (unless distilled) |
+| Obsidian as IDE | Cortex UI / Files panel | Files panel could serve as the browsing surface |
+
+**Next steps (if pursued):**
+1. Design the wiki directory structure within `agents_sync/` — separate from session logs and memory files
+2. Define the schema document — what goes in a wiki page, cross-reference format, category taxonomy
+3. Build an ingest tool/script that reads a source and updates wiki pages (LLM-driven)
+4. Build a lint cron job that health-checks the wiki periodically
+5. Consider Obsidian compatibility for human browsing of the wiki graph
+
+---
+
+## 13. Multi-Level Agent Management
+
+**Status:** Design complete — implementation not yet started. See `TODO__Agents.md` for the task breakdown.
+
+Cortex personas can spawn specialized sub-agents to handle parallel or long-running work.
+Sub-agents can in turn spawn lightweight support agents for simple subtasks. The hierarchy
+is capped at three levels to prevent runaway delegation.
+
+### Level Definitions
+
+| Level | Name | Created by | Can spawn | Tool scope |
+|---|---|---|---|---|
+| **1** | Cortex Persona (Inara) | HTTP request / cron | Level 2 | Full orchestrator tool set |
+| **2** | Specialized Sub-Agent | Level 1 `spawn_agent` | Level 3 only | Role-scoped; `spawn_agent` auto-restricted so children are Level 3 |
+| **3** | Basic Support Agent | Level 2 `spawn_agent` | Nothing | Narrow tool set; `spawn_agent` and `aider_run` denied |
+
+**Examples:**
+- Level 1 spawns a Level 2 **Coder** agent (has file + git + shell tools; can spawn a Level 3 syntax-checker)
+- Level 1 spawns a Level 2 **Research** agent (web tools only; can spawn a Level 3 web reader for parallel page fetches)
+- Level 2 spawns a Level 3 **Support** agent for a focused subtask (web_search only, no writes, no further delegation)
+
+### Core Problem: Everything is Currently Synchronous
+
+Both `spawn_agent` and `aider_run` block the calling coroutine for their full duration
+(default 120s / 300s respectively). Level 1 (Inara) cannot respond to the user, send
+notifications, or inspect other agents while waiting. For 5-minute Aider runs or multi-step
+research agents this is unusable — the user sees nothing until completion or timeout.
+
+### Design
+
+#### 1. Agent Manager (`cortex/agent_manager.py`)
+
+A lightweight in-process registry of running and recently completed agents. Module-level
+dict protected by `asyncio.Lock()`:
+
+```python
+@dataclass
+class AgentRecord:
+    agent_id: str           # UUID
+    level: int              # 1 / 2 / 3
+    role: str               # e.g. "coder", "research"
+    task: str               # first 200 chars of the task
+    status: str             # running / done / failed / cancelled / timeout
+    started: datetime
+    finished: datetime | None
+    parent_id: str | None   # lineage — which agent spawned this one
+    result: str | None      # populated on completion (first 500 chars)
+    notify: bool            # fire web_push/NC Talk notification on completion
+    user: str
+
+_agents: dict[str, AgentRecord] = {}
+_lock = asyncio.Lock()
+```
+
+On completion, the manager calls `notification.py notify()` if `notify=True` — the same
+function used by reminder checks and cron completions. Completed agents stay in the
+registry for 24 hours then are pruned on next access.
+
+#### 2. Background Mode for `spawn_agent`
+
+Add `background: bool = False` and `notify: bool = False` to `spawn_agent`. When
+`background=False` (default): existing synchronous blocking behaviour — unchanged, no
+regression. When `background=True`: wraps the run in `asyncio.create_task()`, registers
+in the agent manager, returns an `agent_id` string immediately.
+
+```python
+# Level 1 — non-blocking delegation:
+agent_id = await spawn_agent(
+    task="Research Zigbee mesh repeaters; summarize findings to my journal",
+    role="research",
+    background=True,
+    notify=True,        # web_push + NC Talk when done
+)
+# Returns "550e8400-..." immediately. Inara continues responding to the user.
+```
+
+#### 3. Agent Lifecycle Tools
+
+Three new tools, wired into `cortex/tools/__init__.py` under the "Agents" category:
+
+| Tool | Params | Description |
+|---|---|---|
+| `agent_status(agent_id)` | `agent_id: str` | Status, role, task, elapsed, result preview |
+| `agent_list(status=None, limit=10)` | `status: str \| None` | All agents for current user; filter by status |
+| `agent_cancel(agent_id)` | `agent_id: str` | Cancel a running background agent (admin, confirm-required) |
+
+Level 1 can call these between tool rounds to check on delegated work without blocking.
+
+#### 4. Level Enforcement
+
+`agent_level` is passed through `spawn_agent` calls as a ContextVar so each agent knows
+where it sits in the hierarchy. Enforcement is automatic and simple:
+
+- **L1 → spawns L2:** `spawn_agent` called normally. Child agent inherits role tools.
+- **L2 → spawns L3:** `spawn_agent` automatically adds `deny_tools=["spawn_agent", "aider_run"]`
+  to the child's effective tool set. Level 3 agents cannot further delegate.
+- **Level 3:** `spawn_agent` and `aider_run` are never in the tool list.
+
+Level is stored in `AgentRecord.level` — the lineage (`parent_id`) provides a full call tree.
+
+#### 5. `aider_run` Background Mode
+
+Add `background: bool = False` and `notify: bool = False` to `aider_run`. When `True`,
+runs the Aider subprocess via `asyncio.create_task()`, registers in the agent manager,
+returns `agent_id` immediately. When called in background mode, `aider_run` is removed
+from `CONFIRM_REQUIRED` — the user is not blocking on a confirmation gate since the call
+returns instantly.
+
+```python
+# Level 1 or 2 — fire and forget a code change:
+agent_id = await aider_run(
+    project="cortex",
+    task="Add max_chars param to http_fetch in tools/web.py, cap at 32768",
+    background=True,
+    notify=True,
+)
+```
+
+### Implementation Order
+
+1. **`agent_manager.py`** — AgentRecord + registry CRUD + completion notification hook.
+   Foundation for everything else; ~100 lines.
+2. **`spawn_agent` background mode** — `background` + `notify` + `agent_level` params;
+   `asyncio.create_task()`; registers in manager. Existing sync path unchanged.
+3. **`agent_status` / `agent_list` / `agent_cancel`** — wire into `__init__.py`; add to
+   `TOOL_CATEGORIES["Agents"]`, `TOOL_ROLES` (cancel = admin), `CONFIRM_REQUIRED` (cancel).
+4. **Level enforcement** — `agent_level` ContextVar; auto `deny_tools` at L2→L3 boundary.
+5. **`aider_run` background mode** — same pattern as step 2.
+
+### Files to Create/Modify
+
+| File | Change |
+|---|---|
+| `cortex/agent_manager.py` | **New** — AgentRecord, registry dict, start/finish/cancel/list functions |
+| `cortex/tools/agents.py` | Add `background`, `notify`, `agent_level` to `spawn_agent`; add `agent_status`, `agent_list`, `agent_cancel` functions + declarations |
+| `cortex/tools/aider.py` | Add `background`, `notify` params; register with agent_manager when background |
+| `cortex/tools/__init__.py` | Register new agent tools; update TOOL_CATEGORIES, TOOL_ROLES, CONFIRM_REQUIRED |
+
+See §12 for the existing `allow_tools` / `deny_tools` per-call restrictions that level
+enforcement builds on.
+
+---
+
+## 12. Spawner-Level Tool Restrictions — `spawn_agent` Permission Control
+
+**Status:** Design complete, not yet built.
+
+### Problem
+
+`spawn_agent` currently grants sub-agents the full tool set of whatever role they're assigned. The spawning agent (Inara) cannot restrict a sub-agent to a subset of tools — the role config is the only gate. This means every spawned agent implicitly has access to everything the role allows, including potentially destructive operations (`shell_exec`, `file_write`, `cortex_restart`).
+
+### Design
+
+Add two optional parameters to `spawn_agent`: **`allow_tools`** and **`deny_tools`**.
+
+- **`allow_tools`** — explicit allow list. If set, the sub-agent can *only* use tools in this list (intersected with what the role allows). If omitted, the role's full tool set is available.
+- **`deny_tools`** — explicit deny list. If set, these tools are removed from whatever the sub-agent would otherwise have access to. If omitted, nothing is denied beyond what the role already excludes.
+
+**Effective tool set formula:**
+
+```
+effective = (role_base_tools ∩ allow_tools) ∩ (role_base_tools \ deny_tools)
+```
+
+Where `role_base_tools` is the full tool set the role config grants, `allow_tools` is the spawner's allow list (default: full set), and `deny_tools` is the spawner's deny list (default: empty set).
+
+### Usage Examples
+
+```python
+# Research agent — web only, no file access, no shell
+spawn_agent(
+    "Research the latest on Zigbee mesh repeaters",
+    role="chat",
+    allow_tools=["web_search", "web_read", "http_fetch"]
+)
+
+# Code review — read-only, no shell
+spawn_agent(
+    "Review this file for security issues",
+    role="coder",
+    deny_tools=["shell_exec", "file_write", "cortex_restart", "cortex_update"]
+)
+
+# Full access (same as today — omit both params)
+spawn_agent("Refactor the auth module", role="coder")
+
+# Narrow data migration — just file ops, no web
+spawn_agent(
+    "Migrate the task files to the new format",
+    role="coder",
+    allow_tools=["file_read", "file_write", "file_list"]
+)
+```
+
+### Implementation Plan
+
+**1. Model registry / role config — no changes needed.**
+
+The role config (`role_cfg.get("tools")`) remains the authoritative ceiling. No schema changes at this level.
+
+**2. `spawn_agent` function — new parameters + filtering logic.**
+
+File: `cortex/tools/agents.py`. Add `allow_tools` and `deny_tools` as optional `list[str] | None` parameters. After resolving `tool_list` from `role_cfg.get("tools")`, apply the filter:
+
+```python
+if allow_tools is not None:
+    tool_list = [t for t in tool_list if t in allow_tools]
+if deny_tools is not None:
+    tool_list = [t for t in tool_list if t not in deny_tools]
+```
+
+**3. Declaration — update the Gemini `FunctionDeclaration`.**
+
+Add `allow_tools` and `deny_tools` as optional parameters in the declaration so the orchestrator knows they exist.
+
+**4. Confirmation gate behavior — explicit.**
+
+If a sub-agent with restricted tools hits a confirmation gate (e.g., trying `shell_exec` with it denied), the gate blocks as normal — it does not silently fail. The sub-agent returns the "requires user confirmation" message as it already does.
+
+### What Doesn't Change
+
+- Existing `spawn_agent` calls with no `allow_tools`/`deny_tools` continue to work exactly as before
+- Role config remains the authoritative max — no security regression
+- No schema changes to `model_registry.json`
+- No UI changes needed
--- a/documentation/ARCH__PERSONA.md
+++ b/documentation/ARCH__PERSONA.md
@@ -1,7 +1,7 @@
 # Architecture: Persona System & Memory

 > How Inara (and other personas) know who they are and what they remember.
-> Last updated: 2026-04-03
+> Last updated: 2026-05-09

 ---

@@ -44,6 +44,19 @@ Each chat request specifies a tier (default: 2). Higher tiers load more context

 `context_loader.py` assembles the system prompt from these files in order. The resulting prompt is passed to whichever LLM backend handles the request.

+### System Block
+
+Before any persona files, `context_loader.py` prepends a `--- System ---` block with per-request metadata:
+
+```
+--- System ---
+Current date and time: Friday, 2026-05-09 at 02:34 PM EDT
+Current mode: Off The Record — this conversation is private and will not be logged or included in memory distillation
+```
+
+The **date/time line** is always present (unless the role has `inject_datetime: false`).
+The **mode line** is only added when the session is Off The Record — normal Chat mode adds nothing, so the block stays minimal. This mirrors the same principle as the mode indicator in the UI: only signal when something non-default is in effect.
+
 ---

 ## Memory Distillation
--- a/documentation/ARCH__SYSTEM.md
+++ b/documentation/ARCH__SYSTEM.md
@@ -1,7 +1,7 @@
 # Architecture: System Overview

 > How the pieces fit together.
-> Last updated: 2026-04-03
+> Last updated: 2026-05-06

 ---

@@ -56,7 +56,9 @@ Details: [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) | [`ARCH__PERSONA.md`](ARCH__P
 | `context_loader.py` | Builds system prompt from persona files (tiers 1–4) |
 | `llm_client.py` | All LLM backends — Claude, Gemini CLI, Local |
 | `orchestrator_engine.py` | Gemini API ReAct tool loop → Claude handoff |
-| `session_store.py` | In-memory + file session persistence |
+| `openai_orchestrator.py` | OpenAI-compatible ReAct tool loop (local models via Open WebUI/OpenRouter) |
+| `model_registry.py` | Per-user model registry V2 — providers, hosts, models, role assignments |
+| `session_store.py` | In-memory + file session persistence (`session_data/{id}.json`) |
 | `session_logger.py` | Writes session turns to `sessions/YYYY-MM-DD.md` |
 | `memory_distiller.py` | Short/mid/long distill jobs |
 | `scheduler.py` | APScheduler — distill jobs + user crons |
@@ -64,20 +66,23 @@ Details: [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) | [`ARCH__PERSONA.md`](ARCH__P
 | `notification.py` | Outbound channel messages (distill alerts, cron proactive) |
 | `auth_utils.py` | bcrypt passwords, JWT, invite tokens, channel config |
 | `auth_middleware.py` | JWT cookie validation on all routes |
-| `user_settings.py` | Per-user local LLM config (hosts, models, active model) |
+| `tool_audit.py` | JSONL audit log for every orchestrator tool invocation |
+| `usage_tracker.py` | Per-user token usage tracking (daily buckets → `usage.json`) |
 | `event_bus.py` | Internal SSE pub/sub (NC Talk → browser mirror) |
 | `email_utils.py` | SMTP invite emails |
 | `persona_template.py` | Bootstrap a new persona directory from templates |
-| `routers/` | One file per endpoint group (chat, orchestrator, auth, files, channels, ui, settings…) |
-| `tools/` | Orchestrator tool implementations (web, ae_knowledge, tasks, scratch, reminders, cron, system) |
-| `static/` | Web UI — `index.html`, `app.js`, `style.css`, `login.html`, `setup.html`, `HELP.md` |
-| `tests/` | pytest suite (80 tests) |
+| `routers/` | One file per endpoint group — `chat`, `orchestrator`, `auth`, `files`, `ui`, `settings`, `tools_settings`, `local_llm`, `distill`, `audit`, `usage`, `push`, `help`, `onboarding`, `auth_google`, `nextcloud_talk`, `google_chat`, `homeassistant` |
+| `tools/` | 58 orchestrator tools in 15 domain modules — `web`, `files` (project + system scope), `tasks`, `scratch`, `reminders`, `cron`, `system`, `notify`, `ae_knowledge`, `ae_tasks`, `agent_notes`, `agents`, `homeassistant`. Registry and access control in `tools/__init__.py`. |
+| `static/` | Web UI — `index.html`, `app.js`, `style.css`, `login.html`, `setup.html`, `HELP.md`, `local_llm.html`, `settings.html`, `notifications.html`, `tools_settings.html` |
+| `tests/` | pytest suite |

 ---

 ## Key Design Decisions

-**Two-brain pattern** — Gemini API handles tool use (function calling, planning, web search). Claude CLI handles all user-facing responses. Direct chat bypasses the orchestrator entirely.
+**Two-brain pattern (Gemini orchestrator)** — Gemini API handles tool use (function calling, planning, web search). Claude CLI handles all user-facing responses. Direct chat bypasses the orchestrator entirely.
+
+**Single-model pattern (local orchestrator)** — When the `orchestrator` role resolves to a `local_openai` model, `openai_orchestrator.py` runs the full ReAct loop and produces the final response itself. No Claude handoff — the local model does both reasoning and response.

 **Subprocess backends** — Claude and Gemini run as CLI subprocesses (`claude --print`, `gemini -p`). This keeps auth transparent (Claude Code manages tokens) and avoids API costs on the Pro subscription path.

@@ -88,3 +93,40 @@ Details: [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) | [`ARCH__PERSONA.md`](ARCH__P
 **Per-user filesystem layout** — `home/{user}/persona/{name}/` mirrors Linux home directories. Each persona is a directory of markdown files and JSON. No database. Easy to inspect, edit, and back up.

 **No single point of coupling** — tools live in `cortex/tools/`, separate from `ae_*` MCP tools. Channels live in `cortex/routers/`, each self-contained. Adding a channel or tool doesn't touch other subsystems.
+
+**Tool access control (three layers):**
+1. **Role gate** (`TOOL_ROLES` in `tools/__init__.py`) — admin-only tools require `admin` role in `auth.json`.
+2. **Risk policy** (`home/{user}/tool_policy.json`) — `max_risk` auto-includes all tools at or below a level (low/medium/high); `whitelist`/`blacklist` override individual tools. Configurable at `/settings/tools`.
+3. **Model-level tool list** — per-role `tools` field in `local_llm.json`; can only restrict further, never elevate.
+
+All 58 tools carry a `TOOL_RISK` rating (36 low / 12 medium / 10 high) used for auto-filtering. `CONFIRM_REQUIRED` is a separate static set of tools that trigger a user confirmation prompt before executing, independent of risk level.
+
+**Agent private notes** — `AGENT_NOTES.md` per persona, writable only by the orchestrator via `agent_notes_*` tools. Never loaded into user-facing context. Three rolling backups (`bak1`–`bak3`) are visible read-only in the Files panel. Declared in `tools/agent_notes.py`; usage guidance in `PROTOCOLS.md`.
+
+**No black boxes** — Every component, flow, and design decision is documented. Documentation is updated before implementation of significant changes and verified after. HELP.md is the user-facing contract; ARCH__*.md files are the developer contract; PROTOCOLS.md is the agent contract. If any of these drift from reality, that is a bug.
+
+---
+
+## Onboarding Flow
+
+New users are invited via a one-time token and complete a three-step setup before reaching the chat:
+
+```
+1. /setup/{token}         → Set password (POST creates session cookie, consumes token)
+2. /setup/persona         → Create persona (slug, display name, emoji, description)
+3. /setup/model           → Connect a model — OpenRouter recommended
+                            (skip link goes straight to /{user}/{persona})
+```
+
+Step 3 is the planned addition (see `TODO__Agents.md § Guided onboarding`). Before it exists,
+users land in the chat with no model configured and must navigate Settings → Model Registry
+manually — which is confusing for non-technical users.
+
+**After Step 3:**
+- `save_host()` adds OpenRouter (`https://openrouter.ai/api/v1`, type `openai`)
+- `save_model()` creates a model entry for the chosen model
+- `set_role(chat, primary, model_id)` assigns it as the chat role primary
+- Redirect to `/{user}/{persona}`
+
+**Existing users with no model configured** — a dismissable banner is shown in the chat on
+load, linking to `/setup/model` (the Step 3 form works standalone, without step labels).
--- a/documentation/DESIGN__Model_Registry_V2.md
+++ b/documentation/DESIGN__Model_Registry_V2.md
@@ -0,0 +1,209 @@
+# Model Registry V2 — Design Document
+
+> Status: Phase 3 in progress
+> Goal: Unified, provider-agnostic model management with clean role-based routing
+
+---
+
+## Problem Statement
+
+The original system had two classes of models with different treatment:
+
+| Type | How configured | How selected |
+|---|---|---|
+| Claude, Gemini | Hardcoded built-ins (`claude_cli`, `gemini_api`) | Backend toggle string ("claude"/"gemini") |
+| Local (Ollama, Open WebUI) | Configured via `/settings/local` | Backend toggle string "local" |
+
+This breaks down when you want multiple Gemini API keys, OpenRouter alongside local models,
+role assignments spanning all provider types, or a toggle that shows which model is active
+instead of which service.
+
+---
+
+## Architecture
+
+### Core concept: Providers + Credentials + Models + Roles
+
+```
+Providers (built-in, fixed set)
+  └─ Anthropic       ← catalog of Claude model IDs (code constants)
+  └─ Google          ← catalog of Gemini model IDs (code constants)
+  └─ Local Host      ← OpenAI-compatible endpoint (user adds these)
+
+Credentials (user-configured, stored in model_registry.json)
+  └─ Anthropic       ← Claude CLI (OAuth, default) — API key support in Phase 4
+  └─ Google          ← one or more API keys (one per Google account)
+  └─ Local Host      ← api_key stored on the host record
+
+Model Entries (user-registered)
+  └─ Provider + model ID + credential = one usable model entry
+
+Role Assignments (unified — any model entry can fill any role)
+  └─ chat:         primary → backup_1 → backup_2
+  └─ orchestrator: primary → backup_1
+  └─ distill:      primary
+  └─ (etc.)
+```
+
+### Catalog design decision
+
+Catalogs (`ANTHROPIC_CATALOG`, `GOOGLE_CATALOG`) are **Python constants** in
+`model_registry.py`, not stored in the per-user JSON. Updated with each code deploy.
+Per-user catalog customisation is deferred to Phase 4.
+
+### Backend toggle redesign (Phase 3)
+
+**Before:** cycles service type strings — `auto → claude → gemini → local`
+
+**After:** cycles through the chat role's configured models by label:
+```
+Sonnet 4.6 (CLI) → Gemini 2.5 Flash → Gemma 4 E4B → (wraps)
+```
+- Shows the resolved model label on the toggle button
+- If no chat role models are configured: shows "auto", uses existing role routing
+- Click skips empty slots automatically
+- Color: `claude_cli` = default, `gemini_*` = blue, `local_openai` = amber
+
+UI sends `slot: "primary" | "backup_1" | "backup_2"` (not backend type string).
+`llm_client.complete()` resolves that slot from the chat role and dispatches by `type`.
+
+---
+
+## Data Model (V2 Schema)
+
+Stored in `home/{user}/model_registry.json`.
+
+```json
+{
+  "version": 2,
+  "providers": {
+    "anthropic": {
+      "credentials": [{"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}]
+    },
+    "google": {
+      "accounts": [{"id": "a1b2", "label": "One Sky IT", "api_key": "AIza..."}]
+    }
+  },
+  "hosts": [
+    {"id": "h1", "label": "Gaming Laptop", "api_url": "http://...", "api_key": "", "host_type": "openwebui"}
+  ],
+  "models": [
+    {"id": "m1", "type": "claude_cli",   "label": "Sonnet 4.6 (CLI)",     "model_name": "claude-sonnet-4-6",  "provider": "anthropic", "credential_id": "cli",  "context_k": 1000, "tags": []},
+    {"id": "m2", "type": "gemini_api",   "label": "Gemini 2.5 Flash",     "model_name": "gemini-2.5-flash",   "provider": "google",    "account_id": "a1b2",    "context_k": 1000, "tags": []},
+    {"id": "m3", "type": "local_openai", "label": "Gemma 4 E4B",          "model_name": "gemma4:e4b",         "provider": "local",     "host_id": "h1",         "context_k": 72,   "tags": []},
+    {"id": "m4", "type": "local_openai", "label": "DeepSeek: V4 Flash",   "model_name": "deepseek/deepseek-v4-flash", "provider": "local", "host_id": "h1", "context_k": 750, "reasoning_budget_tokens": 4096, "tags": ["frontier"]}
+  ],
+  "roles": {
+    "chat":        {"primary": "m1", "backup_1": "m2", "backup_2": "m3"},
+    "orchestrator":{"primary": "m2", "backup_1": "m3"},
+    "distill":     {"primary": "m1"}
+  }
+}
+```
+
+### Model types and dispatch
+
+| `type` | Dispatches via | Notes |
+|---|---|---|
+| `claude_cli` | Claude CLI subprocess | `~/.claude/.credentials.json` OAuth |
+| `gemini_cli` | Gemini CLI subprocess | |
+| `gemini_api` | Currently: Gemini CLI (gap — see Phase 4) | Should use google-genai SDK |
+| `local_openai` | HTTP to OpenAI-compatible endpoint | host_type controls path |
+
+### Optional model fields
+
+| Field | Type | Default | Meaning |
+|---|---|---|---|
+| `context_k` | int | 32 | Context window in thousands of tokens. Used for compaction budget (75% of window). |
+| `max_rounds` | int \| null | null | Per-model tool loop cap. `null` = use global `orchestrator_max_rounds`. Effective limit = `min(per_model, global)`. |
+| `tools` | bool | true | Whether this model supports tool calling. `false` = skip tool loop entirely; model gets a plain chat request. |
+| `reasoning_budget_tokens` | int \| null | null | Per-model reasoning/thinking budget for models that support it (e.g., DeepSeek V4 via OpenRouter). `null` = no reasoning override. When set, injected as `{"reasoning": {"budget_tokens": <value>}}` in the API call to OpenRouter-compatible endpoints. |
+
+### Built-in model IDs
+
+Always resolvable without a registry entry (used as `.env` role defaults):
+`claude_cli`, `gemini_cli`, `gemini_api`
+
+---
+
+## Resolution Logic
+
+`get_model_for_role(username, role)` — walks `primary → backup_1 → backup_2 → backup_3 → backup_4`, returns first resolved model config with credentials merged in. Falls back to `.env` defaults, then hardcoded last-resort.
+
+`get_model_for_slot(username, role, slot)` — resolves *only* the named slot, no fallback chain. Used by Phase 3 explicit slot selection.
+
+---
+
+## Routing Code
+
+### `llm_client.complete()` (Phase 3 update)
+
+```
+slot: str | None  → resolve specific slot, no fallback (explicit selection)
+model: str | None → legacy backend strings, kept for backward compat
+(neither)         → auto: role-based routing with full fallback chain
+```
+
+Dispatch table (`type` → backend function):
+- `claude_cli`   → `_claude()`
+- `gemini_cli`   → `_gemini()`
+- `gemini_api`   → `_gemini()` ← **gap: should be `_gemini_api()` (Phase 4)**
+- `local_openai` → `_local()`
+
+### `routers/chat.py` (Phase 3 update)
+
+- `ChatRequest` gets `slot: str | None = None`
+- `GET /backend` returns `chat_models: [{slot, label, type}]` for the UI toggle
+- `_stream_chat` resolves model label from slot when `req.slot` is set
+
+### `app.js` (Phase 3 update)
+
+- Loads `chat_models` from `GET /backend` on page init
+- Toggle cycles through `chat_models` by label, sends `slot` in chat payload
+- Agent mode placeholder: remove "Gemini tool loop" hardcode → "orchestrator"
+
+---
+
+## Known Gaps (not yet implemented)
+
+### Gap A — `gemini_api` dispatch in `llm_client` (Phase 4)
+`_TYPE_TO_BACKEND` maps `gemini_api → "gemini"` (CLI subprocess). If a user assigns a
+`gemini_api` type model to the `chat` role, it silently routes to the Gemini CLI instead
+of the Google genai SDK. Fix: add `_gemini_api()` in `llm_client.py` that calls the SDK
+directly, matching how `orchestrator_engine.py` does it. Needs API key from resolved config.
+
+### Gap B — Agent mode placeholder (Phase 3, quick fix)
+`app.js` lines 257–258 hard-code `"Gemini tool loop"`. Should say `"orchestrator"` since
+the orchestrator role can now be a local model.
+
+---
+
+## Phases
+
+### Phase 1 — Data model + routing ✅ 2026-04-27
+- V2 schema with `providers` section
+- Auto migration V1→V2 (pulls gemini_api_key from auth.json → Google accounts)
+- `_resolve_model()` merges account API key for `gemini_api` type
+- `get_google_api_key()`, `save_cloud_model()`, `save/remove_google_account()`
+- Orchestrator router uses model-resolved API key
+
+### Phase 2 — Cloud provider UI ✅ 2026-04-27
+- `/settings/models` (canonical, `/settings/local` redirects)
+- Cloud Providers section: Anthropic info + Google account add/remove
+- Add Model form with provider tabs (Local / Google / Anthropic)
+- Provider badges on model rows (Anthropic / Google / Local)
+- Settings page updated: Gemini Key section replaced by Model Registry card
+
+### Phase 3 — Toggle redesign + routing cleanup 🔄 in progress
+- `model_registry.get_model_for_slot()` — resolve a specific slot without fallback chain
+- `llm_client.complete()` — add `slot` parameter
+- `routers/chat.py` — `ChatRequest.slot`, extend `GET /backend`, slot label in response tag
+- `app.js` — data-driven toggle cycling model labels; send `slot` not backend string
+- Fix Gap B: agent mode placeholder
+
+### Phase 4 — Polish + future providers
+- Fix Gap A: `gemini_api` dispatch in `llm_client` → direct Google genai SDK for chat
+- Claude direct API key support (alternative to CLI OAuth)
+- OpenRouter as a named provider (already works as local host; could be promoted)
+- Per-role "test" button in role assignments UI
+- Per-user catalog additions (extend ANTHROPIC_CATALOG / GOOGLE_CATALOG from UI)
--- a/documentation/MASTER.md
+++ b/documentation/MASTER.md
@@ -1,13 +1,16 @@
-# Cortex / Inara — Master Index
+# Cortex — Master Index

 > Start here. This document is a map, not a manual.
-> Last updated: 2026-04-03
+> Last updated: 2026-06-03
+>
+> **Documentation philosophy:** Cortex is a no-black-box system. Docs must match reality.
+> Update docs before implementing significant changes. Verify they still match after.

 ---

 ## What It Is

-Cortex is a self-hosted personal AI platform. It routes messages from any input channel to AI backends, manages a resident agent (Inara) with persistent memory, and coordinates across a fleet of machines. It is infrastructure, not a product.
+Cortex is a self-hosted personal AI platform. It routes messages from any input channel to AI backends, manages per-user AI personas with persistent memory, and coordinates across a fleet of machines. It is infrastructure, not a product.

 **Running at:** `https://cortex.dgrzone.com` | `systemctl --user restart cortex`

@@ -17,19 +20,33 @@ Cortex is a self-hosted personal AI platform. It routes messages from any input

 | Component | Status | Notes |
 |---|---|---|
-| Web UI | ✅ Live | SPA, dark theme, mobile-responsive, session auth |
+| Web UI | ✅ Live | SPA, dark theme, mobile-responsive, PWA-installable |
 | Nextcloud Talk bot | ✅ Live | HMAC-signed, per-user routing |
 | Google Chat Add-on | ✅ Live | JWT-verified, per-user routing |
 | Claude backend | ✅ Live | Primary — via Claude Code CLI |
 | Gemini backend | ✅ Live | Fallback — via Gemini CLI |
-| Local backend | ✅ Live | Third option — Open WebUI/Ollama on scott_gaming |
-| Gemini orchestrator | ✅ Live | Tool loop → Claude response, Agent mode in UI |
+| Local backend | ✅ Live | Open WebUI/Ollama on scott_gaming; per-user multi-model config |
+| Gemini orchestrator | ✅ Live | Tool loop → Claude response, ⚡ toggle in UI (66 tools) |
+| Local orchestrator | ✅ Live | OpenAI-compatible ReAct loop; used when orchestrator role → local model |
+| Model registry V2 | ✅ Live | Providers (Anthropic/Google/Local), multi-account Gemini, role assignments |
 | Memory distillation | ✅ Live | Short (daily) / Mid (weekly) / Long (monthly) |
 | Multi-user | ✅ Live | Scott, Holly, Brian — each with own personas |
 | Session search | ✅ Live | Full-text search across past session logs |
-| Proactive cron | ✅ Live | `message` and `brief` job types → NC Talk |
+| Proactive cron | ✅ Live | 5 job types: `remind`, `note`, `message`, `brief`, `task` (full orchestrator loop) → NC Talk / web push |
+| Schedules web UI | ✅ Live | `/settings/crons` — view, add, edit, pause/resume, delete jobs without going through the AI |
+| Tool audit log | ✅ Live | Every orchestrator tool call logged to `home/{user}/tool_audit/` |
+| Token usage tracking | ✅ Live | Per-user daily buckets in `home/{user}/usage.json`; visible in Settings |
+| Web push notifications | ✅ Live | VAPID push; `web_push` orchestrator tool; subscribe via ☰ menu |
+| Proactive notifications | ✅ Live | Daily reminder check (09:00); distill/cron completion alerts; dedicated `/settings/notifications` page |
+| Sub-agent spawning | ✅ Live | `spawn_agent` tool — sync or background; `agent_status`/`agent_list`/`agent_cancel`; 3-level hierarchy (L2→L3 enforcement built in) |
+| Aider coding agent | ✅ Live | `aider_run` tool — Aider subprocess; model-agnostic (DeepSeek, Ollama, OpenRouter, etc.) |
+| Agent private notes | ✅ Live | `AGENT_NOTES.md` — orchestrator-only notepad; 3 rolling backups; user-visible as read-only |
+| Distill safety | ✅ Live | Per-persona asyncio lock, per-endpoint cooldowns, Rebuild option |
+| Guided onboarding | ✅ Live | Setup Step 3 for OpenRouter; existing-user banner; settings quick-link |

-**Active users / personas:** scott/inara, scott/developer, holly/tina, brian/wintermute
+**69 orchestrator tools** across 17 domain modules — added 2026-06-03: `agent_status`/`agent_list` (user-level)/`agent_cancel` (admin, confirm-required); background mode for `spawn_agent` (`background=True` returns agent_id immediately; `notify=True` sends push on completion); `agent_manager.py` registry with lineage tracking and 24h pruning; L2→L3 level enforcement auto-denies `spawn_agent`/`aider_run` in Level 3 children. Added 2026-05-23: `aider_run` (Aider coding agent subprocess; project aliases for cortex/aether_api/aether_frontend/aether_container; model-agnostic via `.aider.conf.yml` or env vars; admin-only, confirm-required). `.aider.conf.yml` added to project root (read-only context, Python lint-cmd, auto-commits). Added 2026-05-12: `file_diff`, `git_status` / `git_log` / `git_diff` (read-only git inspection), `ae_db_query` / `ae_db_describe` / `ae_db_show_view` (SELECT-only Aether MariaDB access, admin, per-user credentials). `/settings/integrations` page added (admin-only). File attachments in chat (images for vision-capable local models; text/code files for all backends). Settings pages unified under `pg.css`. Added 2026-05-13: `task` cron type (full orchestrator loop on a schedule); monthly/yearly schedule formats (`monthly`, `monthly:DD:HH:MM`, `yearly:MM:DD:HH:MM`); Schedules web UI at `/settings/crons` (list, add, edit, pause, delete); HA inbound webhook tools toggle (orchestrator vs. direct LLM); Anthropic API key backend (`anthropic_api` model type via Anthropic SDK — alternative to CLI OAuth); Cloud APIs catalog in Model Registry — named provider picker (OpenRouter, OpenAI, Groq, X.ai/Grok, Together.ai, Fireworks.ai, Custom) with auto-filled URLs; hosts split into Cloud APIs / Local Hosts sections. Added 2026-05-15: Per-user custom roles — three required roles (`chat`, `orchestrator`, `distill`) are always present; users can add/remove custom roles (e.g. `coder`, `research`) via the Model Registry UI; existing `.env`-defined roles auto-migrated. Settings pages (`local_llm.html` + all settings pages) migrated to Tailwind CSS CDN (no build step); `preflight: false` preserves `pg.css` base styles; `input[type=checkbox/radio]` global width fix in `pg.css`; `btn-submit` now responsive (`w-full md:w-96`).
+
+**Active users / personas:** scott/inara, holly/tina, brian/wintermute

 ---

@@ -65,6 +82,7 @@ Cortex is a self-hosted personal AI platform. It routes messages from any input
 | [`CLAUDE.md`](../CLAUDE.md) | Project instructions for Claude Code — directory map, run commands, design decisions |
 | [`README.md`](../README.md) | Project root orientation, quick-start, user management |
 | [`cortex/static/HELP.md`](../cortex/static/HELP.md) | In-app help (rendered in UI for all users) |
+| [`SELF_UPDATE.md`](SELF_UPDATE.md) | Bootstrap for agents doing self-maintenance — git, Syncthing, scripts, doc checklist |

 ---

--- a/documentation/PLAN__Tool_Schema_Optimization.md
+++ b/documentation/PLAN__Tool_Schema_Optimization.md
@@ -0,0 +1,362 @@
+# PLAN — Reduce Tool Schema Overhead in Cortex
+
+**Goal:** Eliminate the per-round, per-message transmission of all 45 tool definitions.
+Drop overhead from ~8K-10K tokens per round to near zero for casual chat, and to a
+relevant subset for orchestrated work.
+
+**Status:** Draft — ready for Claude Code implementation.
+
+---
+
+## Background
+
+Every orchestrated (⚡ toggled on) message triggers a ReAct tool loop. The full 45-tool
+schema is rebuilt and transmitted **on every round of every call** — including rounds
+where no tool is invoked and messages where no tool is needed at all. This wastes
+thousands of tokens per interaction.
+
+The architecture already has the building blocks for a fix: role configs support a
+`tools` allow-list, and `get_openai_tools_for_role()` already accepts filtering
+parameters. They're just not being wired together effectively.
+
+---
+
+## Phase 1 — Role-Based Tool Filtering (Foundation)
+
+**Effort:** Small. **Impact:** High.
+
+### What
+
+Define which tools each role actually needs, then enforce the filtering so roles
+only receive their relevant tool subset.
+
+### Implementation
+
+**1. Audit every role and define tool lists.**
+
+| Role | Tools needed | Approx count |
+|------|-------------|-------------|
+| `chat` | None (zero tools — should never be in the orchestration loop) | 0 |
+| `orchestrator` | web, file (admin), shell (admin), tasks, cron, reminders, scratchpad, Aether journals, agent notes, system (admin), spawn_agent, HA, ae_db, git, file_diff, file_syntax_check, notifications (admin) | 25-30 |
+| `distill` | None (pure text processing) | 0 |
+| `coder` | file (admin), shell (admin), git, file_diff, file_syntax_check | 8-10 |
+| `research` | web_search, web_read, http_fetch | 3 |
+| `admin` (role) | All 45 (admin-level access) | 45 |
+
+**2. Store tool lists per role in `config.yaml` or the model registry defaults.**
+The role config already has a `tools` field — populate it with the lists above.
+
+**3. Enforce in `get_openai_tools_for_role()`.**
+The function is called from `openai_orchestrator.py` around line 451. Currently if
+`tools` is empty/missing it returns all tools. Change so that:
+
+- If role config has a `tools` list → return only those tools
+- If role config has `tools: false` → return empty list
+- If role config has no `tools` field → return all (backward compat)
+
+At the call site (`_run_from_messages`), pass the role's tool allow-list into
+`get_openai_tools_for_role()` via the `tool_list` parameter that already exists.
+
+### Files to change
+
+- `cortex/openai_orchestrator.py` — wire role config `tools` into the call to
+  `get_openai_tools_for_role()`
+- `cortex/model_registry.py` — ensure `get_role_config()` returns the `tools` field
+  (it does already, line 487)
+- `cortex/config.py` or `home/{user}/model_registry.json` — define the tool lists
+  per default role
+
+---
+
+## Phase 2 — Dynamic Keyword-Based Tool Routing (High Impact)
+
+**Effort:** Small. **Impact:** Very High.
+
+### What
+
+Before entering the ReAct tool loop, scan the user's message with a lightweight
+keyword classifier to determine which tool categories are relevant. Only include
+tools from matched categories — typically 3-8 tools instead of 45.
+
+This is the **core optimization.** For the 80%+ of messages that only need a narrow
+set of tools (or none at all), this eliminates the bulk of schema overhead on every
+round.
+
+### The Hybrid Stack
+
+```
+User message
+    ↓
+[1] Role filter (Phase 1) — narrows 45 tools → ~25 for orchestrator role
+    ↓
+[2] Keyword classifier (Phase 2) — narrows ~25 → 3-8 relevant tools
+    ↓
+[3] ReAct loop — only transmitting the relevant subset each round
+```
+
+If the keyword classifier matches nothing (e.g. "good morning", "test", "what do you
+think?"), it returns an empty tool set — effectively routing the message as a pure
+chat interaction with zero tool overhead.
+
+### Keyword Category Map
+
+Each category maps keywords → tool names. Simple regex/contains matching.
+
+| Category | Trigger keywords | Tools included |
+|----------|-----------------|---------------|
+| `web` | search, google, look up, what is, who is, weather, forecast, temperature, news, article, website, find, research | web_search, web_read, http_fetch |
+| `web_post` | post to, send to, webhook, trigger, notify | http_post |
+| `file` | read file, show file, open file, list files, directory, grep, find in, search in, diff, compare, syntax check | file_read, file_list, file_write, file_diff, file_grep, file_syntax_check, file_stat |
+| `git` | git, commit, branch, pushed, pulled, merge, repo, repository | git_status, git_log, git_diff |
+| `system` | restart, update, status, logs, deploy, shell, command, run, health, is it running | cortex_status, cortex_logs, cortex_restart, cortex_update, shell_exec |
+| `tasks` | task, todo, to-do, to do, add task, create task, what's on my list, pending | task_list, task_create, task_update, task_complete |
+| `cron` | schedule, cron, every day, every week, recurring, automate, job | cron_list, cron_add, cron_remove, cron_toggle |
+| `reminders` | remind, reminder, remember, don't forget | reminders_add, reminders_list, reminders_remove, reminders_clear |
+| `scratchpad` | scratch, scratchpad, working notes, jot down, notepad | scratch_read, scratch_write, scratch_append, scratch_clear |
+| `ha` | home assistant, light, thermostat, turn on, turn off, kitchen, bedroom, switch, sensor, temperature | ha_get_state, ha_get_states, ha_call_service |
+| `aether` | journal, aether, note entry, log entry, search journals, ae_ | ae_journal_list, ae_journal_search, ae_journal_entry_read, ae_journal_entries_list, ae_journal_entry_create, ae_journal_entry_update, ae_journal_entry_disable, ae_journal_entry_append, ae_journal_entry_prepend |
+| `aether_db` | database, query, sql, select, db, table, schema, maria | ae_db_query, ae_db_describe, ae_db_show_view |
+| `notifications` | notify, push, send email, email, message, talk, nextcloud | web_push, email_send, nc_talk_send, nc_talk_history |
+| `agents` | spawn, sub-agent, delegate, agent | spawn_agent |
+| `notes` | agent notes, private notes, my notes | agent_notes_read, agent_notes_write, agent_notes_append, agent_notes_clear |
+| `session` | remember, session, history, last time, what did we, earlier, yesterday, last week | session_read, session_search |
+| `ae_tasks` | ae task, kanban, board | ae_task_list |
+| `claude` | claude, allow directory, permissions | claude_allow_dir |
+
+### Implementation
+
+In `openai_orchestrator.py`, before the ReAct loop starts:
+
+```python
+def _classify_tool_categories(user_message: str) -> list[str]:
+    """Classify a user message into tool categories based on keywords.
+    
+    Returns a list of category names whose tools should be included.
+    Returns empty list if no categories match (pure chat).
+    """
+    message_lower = user_message.lower()
+    
+    category_keywords = {
+        "web":          ["search", "look up", "what is", "who is", "weather",
+                         "forecast", "news", "find on", "google", "website",
+                         "article", "research", "temperature"],
+        "web_post":     ["post to", "send to", "webhook", "trigger webhook"],
+        "file":         ["read file", "show file", "list file", "directory",
+                         "grep", "search in", "find in", "diff", "compare",
+                         "syntax check", "open file"],
+        "git":          ["git", "commit", "branch", "pulled", "merged",
+                         "repository", "repo"],
+        "system":       ["restart", "update", "status", "logs", "deploy",
+                         "run command", "shell", "is it running", "health"],
+        "tasks":        ["task", "todo", "to-do", "to do", "add task",
+                         "create task", "pending", "what's on my list"],
+        "cron":         ["schedule", "cron", "every day", "every week",
+                         "recurring", "automate", "job"],
+        "reminders":    ["remind", "reminder", "remember", "don't forget"],
+        "scratchpad":   ["scratch", "scratchpad", "working note", "jot down",
+                         "notepad"],
+        "ha":           ["home assistant", "light", "thermostat", "turn on",
+                         "turn off", "switch", "sensor", "temperature in",
+                         "kitchen", "bedroom", "garage"],
+        "aether":       ["journal", "aether journal", "note entry", "log entry",
+                         "search journal", "ae_journal"],
+        "aether_db":    ["database", "query", "sql", "select", "db", "table",
+                         "schema", "maria", "run query"],
+        "notifications":["notify", "push notification", "send email", "email",
+                         "talk message", "nextcloud"],
+        "agents":       ["spawn", "sub-agent", "delegate", "spawn agent"],
+        "notes":        ["agent notes", "private notes", "my notes",
+                         "agent_notes"],
+        "session":      ["remember", "session", "history", "last time",
+                         "what did we", "earlier", "yesterday", "last week",
+                         "previously"],
+        "ae_tasks":     ["ae task", "kanban", "board", "ae_task"],
+        "claude":       ["claude allow", "claude directory"],
+    }
+    
+    matched = []
+    for category, keywords in category_keywords.items():
+        if any(kw in message_lower for kw in keywords):
+            matched.append(category)
+    
+    return matched
+```
+
+Then at the orchestration entry point, after determining the role's base tool list
+(Phase 1), apply the keyword filter:
+
+```python
+# Phase 1: Get role's base tool list
+role_tools = get_role_config(username, role).get("tools")
+
+# Phase 2: Dynamically narrow based on message content
+matched_categories = _classify_tool_categories(user_message)
+if matched_categories:
+    category_tool_map = { ... }  # defined at module level
+    dynamic_tools = []
+    for cat in matched_categories:
+        dynamic_tools.extend(category_tool_map.get(cat, []))
+    # Intersect with role_tools so we never grant more than the role allows
+    if role_tools:
+        dynamic_tools = [t for t in dynamic_tools if t in role_tools]
+    active_tools = get_openai_tools_for_role(
+        role=user_role,
+        tool_list=dynamic_tools or None
+    )
+else:
+    # No keywords matched — likely causal chat route to /chat
+    # or use empty tool list
+    active_tools = []
+```
+
+### Edge Cases to Handle
+
+1. **Multiple categories match:** Union all matched tool sets. The `for cat in matched_categories` loop handles this naturally.
+
+2. **No categories match:** Return empty tool set. The orchestrator loop won't start — this effectively becomes a chat message without incurring the schema tax. If the LLM needs tools anyway, it will respond with a natural language request, and the user can rephrase.
+
+3. **Ambiguous short messages:** "Hey can you check something" — matches nothing, falls through to empty tools. This is correct behavior; the LLM will ask "what do you want me to check?" and the next message will have a clear intent.
+
+4. **Over-broad keywords:** "search" in "search journals" could trigger both `web` and `aether`. The union handles this — both categories' tools are included, which is what you want.
+
+### File to change
+
+- `cortex/openai_orchestrator.py` — add `_classify_tool_categories()` function and
+  wire it into the orchestration entry point before the ReAct loop
+
+---
+
+## Phase 3 — Cache Tool Schema per Session
+
+**Effort:** Medium. **Impact:** Medium.
+
+### What
+
+The tool schema doesn't change between rounds of the same session for a given role.
+After Phase 2 narrows it to, say, 5 tools, those 5 tool definitions are identical
+every round. Cache them.
+
+### Implementation
+
+Add a session-scoped cache in `openai_orchestrator.py`:
+
+```python
+# Module-level cache: key = f"{session_id}:{role}:{sorted_tool_list}"
+_tool_schema_cache: dict[str, list[dict]] = {}
+
+def _get_cached_tool_schema(session_id: str, role: str, tool_list: list[str] | None) -> list[dict]:
+    key = f"{session_id}:{role}:{sorted(tool_list) if tool_list else 'all'}"
+    if key in _tool_schema_cache:
+        return _tool_schema_cache[key]
+    schemas = get_openai_tools_for_role(role=role, tool_list=tool_list)
+    _tool_schema_cache[key] = schemas
+    return schemas
+```
+
+Invalidation: Cache key includes the tool list, so if the dynamic classifier returns
+different categories on the next message, it gets a fresh cache entry. No explicit
+invalidation needed.
+
+### File to change
+
+- `cortex/openai_orchestrator.py` — add cache dict and lookup before calling
+  `get_openai_tools_for_role()`
+
+---
+
+## Phase 4 — Reduce Default Max Rounds
+
+**Effort:** Trivial. **Impact:** Low-to-medium.
+
+### What
+
+Most requests resolve in 1-3 tool calls. A global cap of 10 means up to 7 wasted
+schema transmissions on edge cases.
+
+### Implementation
+
+1. Make `max_rounds` configurable per model in the model registry (it already exists
+   in some model configs — see `home/brian/model_registry.json` line 42).
+2. Read it from the model config during orchestration instead of using the global
+   `.env` value.
+3. Lower the default from 10 to 5.
+
+### Files to change
+
+- `cortex/.env` — change `ORCHESTRATOR_MAX_ROUNDS=10` to `=5`
+- `cortex/openai_orchestrator.py` — read per-model `max_rounds` from `model_cfg`
+  instead of only from settings
+
+---
+
+## Phase 5 — UI Improvements (Independent)
+
+**Effort:** Small. **Impact:** Medium (UX).
+
+### What
+
+Make the tool mode indicator more obvious so the user can quickly tell whether
+they're incurring the tool tax.
+
+### Ideas
+
+- Change ⚡ color: green when tools are on, gray when off
+- Swap icon: ⚡ (tools) vs. 💬 (chat only)
+- Add tooltip: "Tools enabled — all 45 tool schemas sent with each message"
+- Optional: add a "Quick Question" button that sends to `/chat` directly, bypassing
+  the orchestrator entirely
+
+### Files to change
+
+- Svelte UI component — likely `ChatInput.svelte` or the chat mode toggle component
+
+---
+
+## Recommended Execution Order
+
+1. **Phase 1** (role filtering) — foundation. Defines the base tool set per role.
+2. **Phase 2** (keyword routing) — **the big one.** Slashes 45 tools → 3-8 for the
+   vast majority of messages. Builds on Phase 1's role filtering.
+3. **Phase 4** (lower max_rounds) — trivial change, do alongside Phase 2.
+4. **Phase 3** (schema caching) — more involved, compounds savings from Phase 2.
+5. **Phase 5** (UI) — independent UX polish, can be done any time.
+
+### Quick Win Path (Recommended First Session)
+
+Phases 1 + 2 + 4 can be done in a single Claude Code session. They're all in
+`openai_orchestrator.py` and `model_registry.py` — the same few files. Estimated
+effort: 45-60 minutes of coding.
+
+Phase 3 (caching) is a separate, focused session afterward.
+
+---
+
+## Appendix A: Code Locations (from grep audit 2026-05-15)
+
+| What | File | Line |
+|------|------|------|
+| `get_openai_tools_for_role` definition | `cortex/tools.py` | ~540 |
+| Call site (decides active_tools) | `cortex/openai_orchestrator.py` | ~449 |
+| `_run_from_messages()` tool loop | `cortex/openai_orchestrator.py` | ~260 |
+| Role config tools field | `cortex/model_registry.py` | ~487 |
+| `get_role_config()` | `cortex/model_registry.py` | ~473 |
+| `save_role_config()` (tools allow-list) | `cortex/model_registry.py` | ~455 |
+| Global `ORCHESTRATOR_MAX_ROUNDS` | `cortex/.env` | 35 |
+| `REQUIRED_ROLES` | `cortex/model_registry.py` | 163 |
+| `DEFINED_ROLES` config | `cortex/config.py` | 80 |
+| Per-model `max_rounds` example | `home/brian/model_registry.json` | 42 |
+
+## Appendix B: Token Savings Estimate
+
+| Scenario | Before (per round) | After Phase 1 | After Phase 1+2 | After All Phases |
+|----------|-------------------|--------------|-----------------|-----------------|
+| "What's the weather?" | ~9K tokens | ~5K (25 tools) | ~600 (3 web tools) | ~600 (cached) |
+| "Good morning" | ~9K tokens | ~5K (25 tools) | 0 (routed to chat) | 0 |
+| "Turn off kitchen lights" | ~9K tokens | ~5K (25 tools) | ~600 (3 HA tools) | ~600 (cached) |
+| "Search journals for X" | ~9K tokens | ~5K (25 tools) | ~2K (10 aether tools) | ~2K (cached) |
+| "Create a task" | ~9K tokens | ~5K (25 tools) | ~800 (4 task tools) | ~800 (cached) |
+| "Run a SQL query" | ~9K tokens | ~5K (25 tools) | ~600 (3 db tools) | ~600 (cached) |
+
+At 3 rounds per request and 50 requests/day, that's roughly **1.3M tokens/day saved**
+vs. **~13K/day after all optimizations** — a 99% reduction for casual chat, ~90% for
+most tool-using queries.
--- a/documentation/ROADMAP.md
+++ b/documentation/ROADMAP.md
@@ -1,7 +1,7 @@
 # Cortex — Roadmap

 > Phases and priorities. For active tasks see `TODO__Agents.md`.
-> Last updated: 2026-04-03
+> Last updated: 2026-05-09

 ---

@@ -32,14 +32,24 @@

 ## Phase 3 — Intelligence Layer (In Progress)
 - ✅ Gemini API orchestrator (tool loop → Claude responder)
- ✅ Tool suite: web search, AE Journal read/write, tasks, scratch, reminders, cron, system
- ✅ Agent mode in UI (async job, poll for result)
- ✅ Local LLM backend (Open WebUI/Ollama, per-user multi-model config)
+- ✅ Tool suite: web search, AE Journal read/write, tasks, scratch, reminders, cron, system, email_send (+ per-user allowlist), nc_talk_send
+- ✅ Agent mode in UI (async job, poll for result); role-based tool access + confirmation gate
+- ✅ Local LLM backend (Open WebUI/Ollama, per-user multi-model config); inline model edit in registry UI
 - ✅ Proactive cron (`message` / `brief` job types → NC Talk)
 - ✅ Session search (full-text across past session logs)
 - ✅ Distill notifications (NC Talk after mid/long runs)
 - ✅ Local backend for distillation (DISTILL_BACKEND_MID/LONG in .env)
- [ ] **Local orchestrator** — ReAct tool loop using local model (High priority — see `TODO__Agents.md`)
+- ✅ Local orchestrator — OpenAI-compatible ReAct loop; fires when orchestrator role → local model
+- ✅ Web push notifications — VAPID; `web_push` tool; PWA-installable; subscribe via ☰ menu
+- ✅ Proactive notifications — daily reminder check (09:00); `notify()` routes to any configured channel; dedicated settings page
+- ✅ Sub-agent spawning — `spawn_agent` tool; per-host concurrency limit; Gemini API + local OpenAI backends
+- ✅ Web content extraction — `web_read` via trafilatura; strips ads/nav/boilerplate; 128K cap
+- ✅ Session log reader — `session_read(date)` tool; complements `session_search`
+- ✅ `http_post` — POST to external URLs with per-user URL prefix allowlist; admin-only, confirm-required
+- ✅ `nc_talk_history` — read recent NC Talk messages; requires nc_username + nc_app_password in channels.json
+- ✅ Local orchestrator retry — exponential backoff on 429/5xx/connection errors (3 attempts)
+- ✅ Multi-level agent management — `agent_manager.py` (registry + lifecycle), background `spawn_agent`, `agent_status`/`agent_list`/`agent_cancel` tools, 3-level hierarchy enforcement (see `ARCH__FUTURE.md` §13)
+- ✅ `aider_run` background mode — background task + push notification on completion; sync path unchanged
 - [ ] Knowledge import — markdown → AE Journals (import script)
 - [ ] Dev agent pipeline — specialist agents + supervisor + approval gate
 - [ ] Gitea webhook integration + Actions CI
@@ -54,7 +64,6 @@
 ## Phase 5 — Routing Intelligence & Scale
 - [ ] Intelligent model routing (by task type, privacy, context length)
 - [ ] Agent-to-agent task delegation across fleet
- [ ] Permanent hosting on home server (currently on `scott_lpt`)

 ## Phase 6 — Infrastructure
 - [ ] Server DMZ finalized
@@ -64,7 +73,6 @@
 ---

 ## Deferred / Watching
- **Unsloth Gemma 4 GGUFs** — blocked on Ollama v0.20.1 (llama.cpp GGUF metadata issue); switch `agent-support-gemma-*` aliases to Unsloth Q4_K_M when ready
 - **Speculative decoding** — llama.cpp supports it (E4B + E2B draft ≈ 2x speed); Ollama does not yet
 - **RAG via Open WebUI** — feed Nextcloud docs into local knowledge collections; possible complement to AE Journals search
 - **Multi-host local models** — per-user config already supports multiple hosts; routing logic TBD
--- a/documentation/SELF_UPDATE.md
+++ b/documentation/SELF_UPDATE.md
@@ -0,0 +1,144 @@
+# Cortex — Self-Update & Maintenance Bootstrap
+
+> A short reference for Inara (or any agent) performing maintenance, feature additions,
+> or configuration changes on the Cortex codebase.
+> Last updated: 2026-05-09
+
+---
+
+## Where the Code Lives
+
+**Git repository:** `~/agents_sync/projects/Cortex_and_Inara_dev/`
+This is the canonical source. All Python, HTML, config templates, and documentation live here.
+
+**Remote:** `ssh://git@git.dgrzone.com:2222/Scott.Idem/cortex-inara.git`
+
+```bash
+git status          # see uncommitted changes
+git log --oneline -8
+git push            # push to Gitea after committing
+```
+
+---
+
+## Syncthing — How Code Gets to the Fleet
+
+The `~/agents_sync/` directory syncs across all fleet machines in real time via Syncthing.
+Code is edited on `scott_lpt` (main laptop). Changes sync automatically to `scott-lt-i7-rtx`
+(the Agents Laptop, which runs the live Cortex service).
+
+**You do not need to manually copy files.** Edit → Syncthing syncs → restart service.
+
+**Sync is not instantaneous** — allow a few seconds after saving before restarting the service.
+
+---
+
+## Ignore Files
+
+Two layers of ignores apply to this project:
+
+| File | Scope | Purpose |
+|---|---|---|
+| `.gitignore` | Git | Keeps secrets, runtime data, and persona data out of the repo |
+| `.stignore` | Syncthing | Keeps machine-local artifacts from syncing (overlaps `.gitignore`) |
+| `~/agents_sync/.stignore` | Syncthing (root) | Fleet-wide Syncthing ignores (venvs, pyc, system files) |
+
+**Key ignores to be aware of:**
+- `home/` — all persona data (memory, tasks, sessions, credentials). **Never in git.** Backed up via restic.
+- `cortex/.env` — secrets (API keys, JWT secret, VAPID keys). Never committed; `cortex/.env.default` is the template.
+- `cortex/.venv/` — Python virtualenv. Machine-local; recreated by `install.py`.
+- `cortex/data/` — runtime session JSON files. Machine-local.
+
+---
+
+## Helper Scripts
+
+All scripts live in the project root. Run them from `scott_lpt`; they SSH to the service host as needed.
+
+### `install.py` — Set up or update the service
+```bash
+python3 install.py           # install / update (idempotent — safe to re-run)
+python3 install.py --check   # status check only, no changes
+```
+What it does: creates `.venv`, installs `requirements.txt`, writes the systemd user service,
+enables linger, starts/restarts Cortex, checks LLM CLI auth, sets up the daily backup timer.
+
+Run after: cloning the repo on a new machine, adding a new pip dependency, or changing the systemd service definition.
+
+### `dev-restart.sh` — Restart the service and view logs
+```bash
+./dev-restart.sh          # restart on scott-lt-i7-rtx, show last 30 log lines
+./dev-restart.sh logs     # tail live logs (Ctrl-C to stop)
+./dev-restart.sh status   # show service status only
+```
+This SSHes to `scott-lt-i7-rtx` — it does not restart anything locally.
+Run after: any Python file change.
+
+### `backup.sh` — Back up persona data
+```bash
+./backup.sh               # run a restic backup of home/ immediately
+```
+Normally runs automatically via systemd timer (daily 03:00). Run manually to verify backups
+or before a risky change to persona files. Backup location: `~/backups/cortex-home-restic`.
+
+---
+
+## Making a Change — Standard Workflow
+
+1. **Read before writing.** Check `documentation/TODO__Agents.md` for active tasks.
+   Check the relevant `ARCH__*.md` for the component you're changing.
+2. **Edit files** on `scott_lpt`. Syncthing handles distribution.
+3. **Syntax check** before restarting:
+   ```bash
+   python3 -m py_compile cortex/<file>.py
+   # or for all routers/tools at once:
+   for f in cortex/routers/*.py cortex/tools/*.py; do python3 -m py_compile "$f" && echo "OK: $f"; done
+   ```
+4. **Restart:** `./dev-restart.sh` — confirm clean startup in the log output.
+5. **Update docs** — see checklist below.
+6. **Commit and push.**
+
+---
+
+## Documentation Update Checklist
+
+Run through this after any feature or functionality change:
+
+| Doc | Update when |
+|---|---|
+| `CLAUDE.md` | New tool, channel, router, tool count, major design change |
+| `cortex/static/HELP.md` | Any user-visible feature — tools, settings, UI, endpoints |
+| `documentation/TODO__Agents.md` | Mark completed items; add new planned work |
+| `documentation/MASTER.md` | New capability goes live; tool count changes |
+| `documentation/ROADMAP.md` | Phase items completed or added |
+| `documentation/ARCH__CHANNELS.md` | New channel, notification trigger, or scheduler job |
+| `documentation/ARCH__SYSTEM.md` | New module, router, or tools/ file |
+| `README.md` | Architecture diagram, channels table, or setup steps change |
+
+The principle: **stale docs are bugs.** If a feature exists that docs don't mention, or docs
+describe something that doesn't exist, fix it before moving on.
+
+---
+
+## Adding a Python Dependency
+
+1. Add the package to `cortex/requirements.txt`
+2. Install it on the service host:
+   ```bash
+   ssh scott@scott-lt-i7-rtx "~/agents_sync/projects/Cortex_and_Inara_dev/cortex/.venv/bin/pip install <package>"
+   ```
+3. Verify it works, then commit `requirements.txt`
+4. On any new machine setup, `install.py` will install it automatically from `requirements.txt`
+
+---
+
+## Key Paths on the Service Host (`scott-lt-i7-rtx`)
+
+| Path | What it is |
+|---|---|
+| `~/agents_sync/projects/Cortex_and_Inara_dev/` | Project root (synced from `scott_lpt`) |
+| `~/agents_sync/projects/Cortex_and_Inara_dev/cortex/.env` | Live secrets (not in git) |
+| `~/agents_sync/projects/Cortex_and_Inara_dev/home/` | All user persona data (not in git) |
+| `~/.config/systemd/user/cortex.service` | systemd service file (written by `install.py`) |
+| `~/backups/cortex-home-restic/` | Restic backup repository |
+| `~/.config/cortex/restic-password` | Restic encryption key — back this up separately |
--- a/documentation/TODO__Agents.md
+++ b/documentation/TODO__Agents.md
@@ -1,4 +1,4 @@
-# Cortex / Inara — Agent Task List
+# Cortex — Agent Task List

 > Read this file before starting any work on this project.
 > **Status:** Active development — ongoing.
@@ -7,61 +7,301 @@

 ## 🔴 High Priority

-### [Local] Tool-capable local orchestrator
-Design and implement `local_orchestrator_engine.py` — a ReAct tool loop driven by
-a local model via Open WebUI's OpenAI-compatible API, as an alternative to the
-Gemini API orchestrator for private/offline tasks.
+### [UX] User onboarding — guided model setup
+New users complete password + persona setup and land directly in the chat with no working
+AI model configured. This closes that gap with a guided Step 3 and a fallback for existing
+users who skipped it or were onboarded before this existed.

- [ ] Convert existing Cortex tool definitions (`cortex/tools/`) from Gemini
-      `FunctionDeclaration` format to OpenAI `tools` format (minor schema diff)
- [ ] Implement tool loop: send tools → parse `tool_calls` response → execute →
-      append result → loop until `finish_reason: stop`
- [ ] Wire into `routers/orchestrator.py` — new `mode` param: `"local"` vs `"gemini"`
- [ ] UI: Agent mode button routes to local orchestrator when local backend active
- [ ] Recommended models (scott_gaming, 8 GB VRAM):
-      Gemma 4 E4B — 25 t/s, 72k practical ctx — interactive/fast tasks
-      Gemma 4 26B A4B — 9 t/s, 50k practical ctx — heavier reasoning, background tasks
- Reference: `docs/OPEN_WEBUI_API.md` for full tool call request/response format
+Design spec: `documentation/ARCH__SYSTEM.md` § Onboarding Flow
+
+- [x] **Setup Step 3 page** — new `/setup/model` GET/POST in `onboarding.py` — 2026-05-06
+  - Recommends OpenRouter: "one API key, access to Claude, Gemini, and dozens of other models"
+  - API key field + curated model dropdown (claude-3-5-haiku, claude-3-7-sonnet, gemini-2.0-flash, llama-3.3-70b)
+  - On submit: `save_host()` (OpenRouter) + `save_model()` + `set_role(chat, primary, model_id)` in `model_registry.py`
+  - Skip: `POST /setup/model/skip` reads `cx_setup_persona` cookie, redirects to chat; JS fetch on skip-link click
+  - Step labels updated: setup.html "1 of 3" / "2 of 3" / "3 of 3" (was "1 of 2" / "2 of 2")
+  - Standalone: `/setup/model` works without step labels (no `cx_setup_persona` cookie → no label)
+  - Persona creation now redirects to `/setup/model` instead of directly to chat
+- [x] **Existing user banner** — displayed in chat if no role has a model assigned — 2026-05-06
+  - Checks `GET /backend` on load (uses `available_roles` — already does role-resolution)
+  - Dismissable amber callout strip above chat: "No AI model configured — Set up OpenRouter →"
+  - Dismissed via `localStorage` key `cx_no_model_banner_dismissed`; auto-removed when a model is added
+- [x] **Settings quick-link** — amber card in settings Model Registry section — 2026-05-06
+  - Checks `GET /backend` on page load; shown if `available_roles` is empty
+  - Links to `/setup/model`
+- [x] Update `cortex/static/HELP.md` — Getting Started section + model registry quick-connect note — 2026-05-06
+- [x] Update `CLAUDE.md` — documented `/setup/model` endpoint, setup flow description, docs philosophy — 2026-05-06
+
+### [Local] Local orchestrator — reach full parity with Gemini orchestrator
+`openai_orchestrator.py` is partially built and wired into `POST /orchestrate`.
+When the `orchestrator` role resolves to a `local_openai` model it routes there
+automatically. Remaining work is quality/reliability parity, not ground-up design.
+
+- [x] Tool schema conversion — Gemini FunctionDeclaration → OpenAI tools format
+- [x] Context budget: `_context_budget()` uses `context_k * 1000 * 0.75`, min 16k — 2026-05-06
+- [x] Context compaction: `_compact_messages()` trims old tool results before each round and before the confirmation-gate call — 2026-05-06
+- [x] Error handling: malformed tool args caught + logged; tool execution errors returned as strings
+- [x] Retry logic on transient API errors (connection timeout, 429, 503) — 2026-05-09
+  - `_chat_with_retry()` helper in `openai_orchestrator.py`; 3 attempts, exponential backoff (1s, 2s)
+  - Retries on `APIConnectionError` and `APIStatusError` with status 429/500/502/503/504
+- [ ] Test end-to-end with Gemma 4 E4B and 26B A4B on scott_gaming
+- [ ] Review `ARCH__FUTURE.md` agent architecture ideas before finalising design
+- Reference: `docs/OPEN_WEBUI_API.md`, `documentation/ARCH__FUTURE.md` §1

 ---

 ## 🟡 Medium Priority

+### [UI] Progressive Web App (PWA) ✅ — 2026-04-29
+- manifest.json, sw.js, icon-192/512.png, SW registration in app.js
+- `/manifest.json` and `/sw.js` served at root; added to `_PUBLIC` in auth_middleware
+- Tested: install prompt confirmed working in Chromium
+
+### [Tools] Orchestrator tool expansions — Round 1 ✅
+- [x] **`cortex_restart`** — detached subprocess, 5s delay, admin-only, confirm-required — 2026-04-29
+- [x] **`cortex_logs`** — `journalctl --user -u cortex -n N`, admin-only — 2026-04-29
+- [x] **`http_fetch`** — direct URL fetch via httpx, 8192 char cap — 2026-04-29
+- [x] **`file_list`** — directory listing with size, dirs first, 200 entry cap, admin-only — 2026-04-29
+- [x] **`file_write`** — overwrite/append to home_root paths, admin-only, confirm-required — 2026-04-29
+- [x] **`nc_talk_send`** — outbound NC Talk message via notification.py, admin-only — 2026-04-29
+- [x] **`email_send`** — SMTP via email_utils, per-user regex allowlist in `home/{user}/email_allowlist.json`, managed via Settings UI textarea + Files panel raw editor — 2026-04-29
+- [x] **`web_push`** — VAPID push via pywebpush; subscriptions in `home/{user}/push_subscriptions.json`; "Enable notifications" toggle in ☰ menu; sw.js push+notificationclick handlers — 2026-05-05
+
+### [Agents] Multi-Level Agent Management
+
+Design: `documentation/ARCH__FUTURE.md` §13
+
+Three-level hierarchy: Level 1 = Cortex Persona; Level 2 = Specialized Sub-Agent
+(can spawn Level 3); Level 3 = Basic Support Agent (cannot spawn). All spawning is
+currently synchronous and blocking — this makes long-running agents (Aider, research
+pipelines) unusable without freezing the orchestrator.
+
+**Phase 1 — Foundation (build first):**
+- [x] **`cortex/agent_manager.py`** — `AgentRecord` dataclass (agent_id, level, role,
+      task, status, started, parent_id, result, notify, user); module-level registry dict
+      with `asyncio.Lock()`; `register()`, `finish()`, `cancel_agent()`,
+      `list_agents(user, status)` functions; calls `notification.notify()` on completion
+      when `notify=True`; prune records older than 24 hours on next register — 2026-06-03
+- [x] **Background mode for `spawn_agent`** — added `background: bool = False` and
+      `notify: bool = False` params; when `background=True`, wraps `_run()` in
+      `asyncio.create_task()`, registers in agent_manager, returns agent_id immediately;
+      existing sync path unchanged — 2026-06-03
+- [x] **`agent_status(agent_id)` tool** — returns status, role, task excerpt, elapsed
+      seconds, result preview (first 300 chars); user-level — 2026-06-03
+- [x] **`agent_list(status=None, limit=10)` tool** — returns running + recent agents for
+      current user; filter by `status`; user-level — 2026-06-03
+- [x] **`agent_cancel(agent_id)` tool** — cancels background task via stored
+      `asyncio.Task` reference; admin-only, confirm-required — 2026-06-03
+
+**Phase 2 — Level enforcement:**
+- [x] **L2→L3 boundary enforcement** — `spawn_agent` param `_agent_level` (default 2);
+      when `child_level >= 3`, auto-adds `spawn_agent` + `aider_run` to deny_tools so
+      Level 3 children cannot delegate; level stored in AgentRecord — 2026-06-03
+- [ ] **`_agent_level=1` from main orchestrators** — Gemini and OpenAI orchestrators
+      should pass `_agent_level=1` when calling spawn_agent so the hierarchy is rooted
+      correctly; currently defaults to 2 (children become Level 3, which is safe but
+      means Level 1 cannot spawn Level 2 that itself spawns Level 3)
+
+**Phase 3 — `aider_run` async:**
+- [x] **`aider_run` background mode** — added `background: bool = False` and
+      `notify: bool = False` params; runs subprocess via `asyncio.create_task()`, registers
+      in agent_manager, returns agent_id immediately; confirmation still required (correct
+      — user confirms before the tool runs, not during) — 2026-06-03
+- [x] **Register new tools in `__init__.py`** — `agent_status`, `agent_list`, `agent_cancel`
+      in `TOOL_CATEGORIES["Agents"]`; `agent_cancel` in `TOOL_ROLES` (admin) and
+      `CONFIRM_REQUIRED`; added to `_CALLABLES` and `_ALL_DECLARATIONS` — 2026-06-03
+
+**Tests:**
+- [x] **`cortex/tests/test_agent_manager.py`** — 41 tests covering: agent_manager CRUD,
+      prune, notify hook, spawn_agent background mode (returns immediately, completes async,
+      timeout, failure), level enforcement (L1→L2 permits, L2→L3 auto-denies), agent
+      lifecycle tools output, aider_run background mode — 2026-06-03
+      Run: `cd cortex && .venv/bin/python -m pytest tests/test_agent_manager.py -v`
+
+---
+
+### [Tools] Orchestrator tool expansions — Round 2
+Next additions identified 2026-05-08. See `ARCH__FUTURE.md` §2 for design notes.
+
+**Note:** `datetime_now` is NOT needed — current date/time is already injected into every
+system prompt by `context_loader.py` at all tiers.
+
+- [x] **`session_search`** — expose existing session search to the orchestrator — 2026-05-08
+  - Wraps session log grep as a tool callable in `tools/files.py`
+  - Params: `query: str`, `limit: int = 5` (max 20)
+  - Returns: excerpts with session date, newest first; own sessions only via ContextVars
+  - User-level (no TOOL_ROLES entry needed)
+- [x] **`reminders` due-date support** — make reminders time-aware — 2026-05-08
+  - Optional `due: YYYY-MM-DD` on `reminders_add`; stored as `due: date` first line of body
+  - `context_loader.py` calls `load_due_reminders()` — future-dated sections suppressed until due
+  - `reminders_list` shows `[OVERDUE]`, `[due TODAY]`, or `[due: YYYY-MM-DD]` per entry
+  - Backward compatible — existing undated reminders always surface as before
+- [x] **`spawn_agent`** — spawn a synchronous sub-agent using any role's model + tools — 2026-05-08
+  - `cortex/tools/agents.py` — `spawn_agent(task, role, tier, timeout, max_rounds)`
+  - Per-host asyncio semaphore keyed by `host_id` (or model type for cloud); `max_concurrent` field in host schema
+  - Supports `local_openai` and `gemini_api` model types; returns error string for others
+  - Admin-only tool (powerful — can spawn arbitrarily long sub-tasks)
+  - Host UI: "Max parallel" number input in host edit/add forms
+- [x] **`spawn_agent` per-call tool restrictions** — `allow_tools` and `deny_tools` params — 2026-05-12
+  - `allow_tools: list[str]` — intersected with role ceiling; cannot grant beyond role config
+  - `deny_tools: list[str]` — blocked even when role permits; falls back to `confirm_deny` gate when `tool_list` is None
+  - Both params documented in FunctionDeclaration for orchestrator use
+- [x] **`file_diff`** — unified diff between two project-scoped files — 2026-05-12
+  - `cortex/tools/files.py` — `diff -u`, 50 KB output cap, project-scoped path resolution
+- [x] **`git_status` / `git_log` / `git_diff`** — read-only git inspection — 2026-05-12
+  - `cortex/tools/git.py` — new module; all project-scoped, low risk
+  - `git_log(n, path, oneline)` — last N commits with optional path filter
+  - `git_diff(ref_a, ref_b, path, stat_only)` — any ref range; no args = unstaged vs HEAD
+- [x] **`http_post`** — POST to external URLs — 2026-05-09
+  - Params: `url: str`, `body: str`, `headers: dict | None`, `max_chars: int`
+  - Per-user URL prefix allowlist in `home/{user}/http_allowlist.json` (JSON array of prefixes)
+  - Default: blocked if no allowlist or URL doesn't match any prefix
+  - Admin-only, confirm-required
+- [x] **`nc_talk_history`** — read recent Talk messages — 2026-05-09
+  - Params: `conversation_token: str` (optional, defaults to notification_room), `limit: int = 20`
+  - Returns last N messages with sender + timestamp, chronological order
+  - Admin-only; requires `nc_username` and `nc_app_password` in channels.json under `nextcloud`
+- [x] **`task_list` priority filter** — add `priority` param alongside existing `status` — 2026-05-12
+- [x] **`http_fetch` max_chars** — optional param, default 8192, cap at 32768 — 2026-05-09
+- [x] **`web_read(url, max_chars=16000)`** — clean article extraction via trafilatura; strips ads/nav/boilerplate, returns markdown — 2026-05-09
+- [x] **`session_read(date)`** — read a full session log by YYYY-MM-DD date; lists available dates if not found — 2026-05-09
+
+### [Channel] Proactive notifications ✅ — 2026-05-08
+Inara reaches out on her own initiative via NC Talk, Google Chat, email, or browser push.
+- [x] `notification.py` — `notify(username, message, channel=None)` routes to NCT / email / Google Chat / web_push
+- [x] `web_push` added as a routable channel in `notification.py` (was tool-only before)
+- [x] `scheduler.py` — `_run_reminder_check()` daily at 09:00: reads due reminders per persona, fires `notify()` with a summary
+- [x] `cron_runner.py` — already calls `notify()` on job completion (was already wired)
+- [x] `scheduler.py` — distill_mid and distill_long already call `notify()` on completion
+- [x] Settings UI — "Browser Push Notification" option added to Notification Channel selector
+- [x] `notification_channel` accepts `"web_push"` in `routers/settings.py`
+- [x] `GET /settings/notifications` — dedicated Notifications page (channel form + test buttons); Settings page now shows a link card
+- [x] `POST /api/push/test` + `POST /api/push/reminders/check` — on-demand test endpoints
+- [x] `push_utils.py` — fixed `pywebpush` 2.x key deserialisation (use `Vapid.from_pem()` instead of passing PEM string)
+
+### [Channel] Home Assistant integration — design & tools
+Inara can already receive HA events via `POST /webhook/ha/{username}/{webhook_id}` and
+respond via web push. Next steps are deciding what events to send and giving Inara the
+ability to act on HA via the REST API.
+
+- [ ] **Event design** — decide which HA events are worth routing to Inara (security,
+      climate thresholds, low battery, unexpected device state). Avoid flooding with
+      high-frequency sensor polling. Per-automation `"tools": true/false` to choose
+      notify-only vs. agentic response.
+- [ ] **Richer payload template** — update `rest_command` in HA to include
+      `trigger.to_state.attributes`, `area_name`, and `previous_state` so Inara gets
+      full device context automatically.
+- [x] **HA API tools** — `cortex/tools/homeassistant.py` — 2026-05-12
+      - `ha_get_state(entity_id)` — current state + attributes of any entity
+      - `ha_call_service(domain, service, data)` — turn on lights, set HVAC, lock doors, etc.
+      - `ha_get_states(area=None, domain=None)` — list states with optional filter
+      - Auth via Long-Lived Access Token stored in `channels.json` under `homeassistant.token`
+      - HA URL from `channels.json` under `homeassistant.url`
+- [x] **Store HA config in channels.json** — `url`, `token`, `webhook_id` fields under `homeassistant`; managed via `/settings/notifications` — 2026-05-12
+- [x] **`ha_call_service` confirm-required** — 2026-05-12
+
+### [UX] Session delete confirmation
+- [x] Inline "Delete this session? [Delete] [Cancel]" reveal on `×` click in `app.js` — 2026-05-12
+- [x] Message-level delete: "confirm delete / cancel" inline in the actions bar — 2026-05-12
+
+### [UI] File attachments in chat ✅ — 2026-05-12
+Upload an image or document inline and have it flow into context.
+- [x] Attachment button (paperclip) in input area; hidden file input
+- [x] Images sent as base64 inline_data (Gemini API) or image blocks (Claude/local)
+- [x] Text/code files read as UTF-8, injected as fenced code block in message
+- [x] Thumbnail/filename shown above sent message in UI
+
+### [Auth] Encrypted sessions
+Allow users to opt-in to per-session encryption so session logs on disk cannot be
+read without the user's key.
+- [ ] Design key derivation: password-based (PBKDF2/Argon2) or separate passphrase
+- [ ] Encrypt `session_logger.py` output before writing to `sessions/*.md`
+- [ ] Decrypt on read in `session_store.py` (history reload, file browser)
+- [ ] UI toggle in Settings to enable/disable encrypted sessions per persona
+- [ ] Decide: encrypt at rest only, or also in-memory session store?
+- [ ] Consider: how distillation and session search interact with encrypted files
+
+### [Models] Model Registry V2 — Unified Provider System
+See `DESIGN__Model_Registry_V2.md` for full design.
+- [x] **Phase 1** — V2 schema with providers (Anthropic/Google), multi-account Gemini, auto migration, orchestrator uses account API key — 2026-04-27
+- [x] **Phase 2** — Cloud provider UI: Anthropic + Google sections in `/settings/models`, account management, model entry creation for cloud models — 2026-04-27
+- [x] **Phase 3** — Unified roles + toggle redesign: chat toggle cycles chat-role slot models (Primary/Backup 1/Backup 2) by label; slot sent in chat/orchestrate payload — 2026-05-12
+- [ ] **Phase 4** — Polish: Claude API key, OpenRouter as named provider, catalog sync from API
+
 ### [Intelligence] Knowledge consolidation — Phase 1
 See `ARCH__Intelligence_Layer.md` for full design.
+- [x] Tool: `ae_journal_list` — list all journals for the account — 2026-04-28
 - [x] Tool: `ae_journal_search` — search before creating to avoid duplicates
 - [x] Tool: `ae_journal_entry_create` — write a new entry with source metadata
- [ ] Import script: walk a markdown directory, chunk by H2 section, create entries
- [ ] Target: markdown files from `~/DgrZone_Nextcloud/` and `~/OSIT_Nextcloud/`
- [ ] Tag strategy: source path, date, topic tags from frontmatter or filename
-
-### [Distill] Review first auto_distill_long output — 2026-04-01
- Ran April 1 at 04:00 as scheduled
- Manually review `inara/MEMORY_LONG.md` — confirm quality before fully trusting
- Adjust distill prompts in `cortex/memory_distiller.py` if needed
-
-### [Distill] Distill quality review
- Short/mid/long distill prompts live in `cortex/memory_distiller.py`
- After first few automatic runs, review quality and tune
-
-### [Local] Unsloth Gemma 4 variants
- Unsloth Dynamic 2.0 Q4_K_M GGUFs fail with `500: unable to load model` on Ollama v0.20.0
- Root cause: Ollama's bundled llama.cpp doesn't recognize Gemma 4 GGUF architecture metadata from raw files
- Waiting on Ollama point release (v0.20.1+) — then switch Open WebUI to Unsloth variants
- Expected speedup: ~10–20% smaller context footprint vs baseline, same quality
- `agent-support-gemma-small` → Unsloth E4B Q4_K_M; `agent-support-gemma-medium` → Unsloth 26B A4B Q4_K_M
+- [x] Tool: `ae_journal_entry_update` — PATCH any fields on an existing entry — 2026-04-28
+- [x] Tool: `ae_journal_entry_disable` — soft-delete via enable=false — 2026-04-28
+- [x] Tool: `ae_journal_entry_append` — read→append timestamped section→write (running logs) — 2026-04-28
+- [x] Tool: `ae_journal_entry_prepend` — read→prepend timestamped section→write (newest-first logs) — 2026-04-28
+- [x] Import script: walk a markdown directory, chunk by H2 section, create entries — 2026-05-05
+- [x] Target: markdown files from `~/DgrZone_Nextcloud/` and `~/OSIT_Nextcloud/` — 2026-05-05
+- [x] Tag strategy: source path, topic tags from path components — 2026-05-05

 ---

 ## 🟢 Lower Priority / Future

+### [Research] Agent architecture patterns — review before building dev agent pipeline
+The Claude Code system prompt was leaked April 2026. Two reimplementation repos have
+useful design ideas directly applicable to the local orchestrator and dev agent work.
+Read before finalising either design.
+- [ ] Review https://github.com/HarnessLab/claw-code-agent (Python, targets local models)
+- [ ] Review https://github.com/ultraworkers/claw-code (community port, interesting source)
+- Key ideas to evaluate for Cortex:
+  - Tiered permission model (read-only / write / shell / unsafe) — relevant once dev
+    agent is writing and executing code
+  - Agent lineage tracking — which agent spawned which sub-agent; essential for the
+    orchestrator → specialist → supervisor chain
+  - Hard token/cost budgets per operation — local models have fixed context ceilings
+  - Context compaction mid-session — trim stale tool results before hitting limit
+  - Nested agent delegation with dependency-aware batching
+  - Plugin/manifest-based tool registration — worth considering before tool suite grows
+
+### [Backend] API usage / cost tracking
+Multi-user setup with real Gemini/Claude API costs. Track per-user token consumption
+so Scott can see who's spending what.
+- [x] Count input + output tokens — local backend (OpenAI `usage` field) + Gemini API (`usage_metadata`) — 2026-05-05
+- [x] Append to `home/{user}/usage.json` — daily buckets, per-model breakdown — 2026-05-05
+- [x] Expose via `/api/usage` + `/api/usage/summary` + `/api/usage/all` (admin); usage table in Settings — 2026-05-06
+- [ ] Optional: soft spending limit with a warning toast when exceeded
+
+### [Security] Tool call audit log — 2026-05-05
+Every orchestrator tool invocation logged to `home/{user}/tool_audit/YYYY-MM-DD.jsonl`.
+- [x] `tool_audit.py` — JSONL writer with asyncio locks; ContextVars for engine/model set by each orchestrator at run start
+- [x] Hook in `call_tool()` — fire-and-forget `asyncio.create_task`; captures status ok/error/denied, 300-char result snippet, args (truncated at 500 chars)
+- [x] `GET /api/audit/files` — lists available dates for current user (self-service)
+- [x] `GET /api/audit/day?date=` — returns entries for one date (self-service)
+- [x] `GET /api/audit/recent` + `/stats` — cross-user aggregation (admin only)
+- [x] "Audit Log" group in Files panel sidebar (collapsed by default) — read-only table with time/tool/status/model/args/result columns; colour-coded status
+
 ### [Intelligence] Dev agent pipeline
 See `ARCH__Intelligence_Layer.md`. Full design not yet started.
+
+`aider_run` (2026-05-23) provides the execution layer — Cortex dispatches to Aider as
+the coding worker. Aider is model-agnostic (DeepSeek, Ollama, OpenRouter, etc.) and
+fully scriptable via `--message --yes-always`. This replaces the Claude Code subprocess
+dependency for coding tasks. Per-project `.aider.conf.yml` holds read-only context files
+and lint commands; model/key come from env vars (not committed).
+
+- [x] **`aider_run` tool** — `cortex/tools/aider.py`; project aliases + subprocess with `--message --yes-always`; admin-only, confirm-required, high risk — 2026-05-23
+- [x] **`aider_run` async/notify** — background=True fires subprocess via asyncio.create_task(), registers in agent_manager, returns agent_id immediately; notify=True sends push/Talk on completion — 2026-06-03
+- [x] **`.aider.conf.yml`** — project-level Aider config: `read: [CLAUDE.md]`, Python lint-cmd, auto-commits — 2026-05-23
+- [x] **`aider_run` multi-provider credentials** — `_resolve_credentials()` pulls from
+      all configured hosts: OpenRouter/OpenAI/Groq/etc. → `--api-key slug=key`;
+      local Open WebUI/Ollama → `--openai-api-base + key`; Anthropic from
+      `providers.anthropic.credentials`; `host_label` param for explicit host selection;
+      auto-prefixes model with `openai/` for generic endpoints — 2026-06-03
+- [x] **`.gitignore`** — added `.aider.chat.history.md`, `.aider.input.history`, `.aider.llm.history` — 2026-05-23
 - [ ] Specialist agent: frontend (SvelteKit) code changes
 - [ ] Specialist agent: backend (FastAPI) code changes
 - [ ] Supervisor agent: diff review, syntax check, test runner
 - [ ] Gitea webhook integration: trigger on push/PR, report back
 - [ ] Human approval gate before commit
+- [ ] `.aider.conf.yml` for aether_api, aether_frontend, aether_container projects

 ### [Intelligence] Supervisor agent
 - Runs `py_compile`, `svelte-check`, unit tests after specialist agent work
@@ -80,18 +320,112 @@ base accessible to local models. Endpoints documented in `docs/OPEN_WEBUI_API.md
 - `/api/v1/files/` upload + `/api/v1/retrieval/process/web` for URLs
 - Reference in chat via `"files": [{"type": "collection", "id": "..."}]`

-### [Backend] Intelligent model routing
- Currently hardcoded: Claude default, Gemini fallback, local third
- Design direction (now informed by real local model perf):
-  - **Private/offline tasks** → local (Gemma 4 E4B for speed, 26B A4B for reasoning)
-  - **Complex tool tasks / long context** → Gemini (1M token context, strong function calling)
-  - **Final user-facing responses** → Claude (quality prose, persona fidelity)
- Future: auto-route by task type rather than requiring user to toggle backend manually
+### [Backend] Intelligent model routing — automatic task-type dispatch
+Model Registry V2 (2026-04-27) added role-based routing and manual role toggle — that's
+the foundation. What remains is removing the need to toggle manually.
+- [ ] Classify incoming messages by task type (heuristic or lightweight classifier)
+- [ ] Map task type → role → model automatically:
+  - User conversation → `chat` role → Claude (quality prose, persona fidelity)
+  - Tool/research tasks → `orchestrator` role → Gemini API or local
+  - Private/sensitive → `local` role → Ollama (no data leaves network)
+  - Long context (>50k tokens) → Gemini 2.0 (1M ctx window)
+  - Fast/cheap queries → local E4B (25 t/s, no API cost)
+- [ ] Routing logic in `llm_client.py` or new `router.py`; expose override in UI
+
+### [Future] Cortex Mesh — multi-instance fleet coordination
+Each fleet device runs its own Cortex instance. Instances delegate tasks to each
+other based on resources and specialisation. No central coordinator required.
+- Concept only — no design yet. Resolve these questions before building:
+  - Auth between instances (shared JWT secret vs. per-instance API keys)
+  - Capability advertisement (model registry over HTTP? shared Syncthing file?)
+  - Whether `ae_send_message` / the inbox system is the right coordination layer
+  - Session continuity — does a conversation stay on one node or migrate?
+- Natural foundation already in place: Syncthing-synced `home/` and shared
+  `model_registry.json` mean instances share persona memory without a central DB

 ---

 ## ✅ Completed

+### [Tools] email_send tool + per-user email allowlist — 2026-04-29
+- `email_send(to, subject, body)` in `cortex/tools/notify.py` — SMTP via `email_utils.py`
+- Per-user regex allowlist at `home/{user}/email_allowlist.json` (JSON array of patterns)
+- `re.fullmatch(..., re.IGNORECASE)` — supports wildcards like `.*@oneskyit\\.com`
+- Blocked by default (no allowlist = no sends); non-matching addresses silently blocked
+- Registered as admin-only tool in `TOOL_ROLES`
+- **Settings UI**: `POST /settings/email-allowlist` — textarea in Account Settings, one pattern per line
+- **Files panel**: `email_allowlist.json` added to `USER_FILES` in `files.py`; served from `home/{user}/`; appears in new "Settings" group in sidebar
+
+### [Models] Edit existing model entries — 2026-04-29
+- Inline edit form per model row in `local_llm.html` (`.model-row-header` + hidden `.model-edit-form`)
+- "Edit" toggle shows pre-populated form; "Cancel" collapses it
+- "Fetch models" button in edit form — same live-fetch flow as Add Model
+- `POST /settings/local/models/{model_id}/edit` route in `local_llm.py` dispatches to `save_model` / `save_cloud_model` (upsert via `model_id`)
+- Works for both `local_openai` and cloud model types
+
+### [Sessions] Cross-session search — 2026-04-29
+- `GET /sessions/search?q=&user=&persona=&limit=` in `files.py` — full-text grep across `sessions/*.md`, newest first
+- Returns up to `limit` matches with 120-char excerpt and date; `total_files_searched` count
+- UI: search input + results panel below Files sidebar; `Ctrl+F` / search icon shortcut; `marked.parse` highlights matches
+
+### [Tools] Role-based access control + confirmation gate — 2026-04-29
+- `TOOL_ROLES` dict maps tool names to minimum required role (`admin`/`user`)
+- `CONFIRM_REQUIRED` set blocks destructive tools; orchestrator injects confirmation prompt instead
+- `get_tools_for_role(role)` filters both Gemini declarations and callables
+- `get_user_role(username)` added to `auth_utils.py`; passed through both orchestrators
+- `manage_passwords.py role <username> admin|user` — shell-only admin promotion
+- Admin-only tools: `shell_exec`, `claude_allow_dir`, `cortex_restart`, `cortex_logs`,
+  `file_read`, `file_list`, `file_write`, `ae_task_list`, `nc_talk_send`
+- Confirm-required tools: `cortex_restart`, `file_write`, `shell_exec`, `cron_remove`, `reminders_clear`
+
+### [UI] Admin role badge in Account settings — 2026-04-29
+- `GET /settings` now injects `user_role` from `auth.json` into settings page
+- Role shown as a styled pill badge (purple ADMIN, muted USER) below username field
+
+### [Local] Unsloth Gemma 4 variants — resolved 2026-04-29
+- Ollama update resolved the `500: unable to load model` issue
+- Unsloth Dynamic 2.0 Q4_K_M GGUFs loading correctly
+
+### [Distill] Distill quality review — resolved 2026-04-29
+- Short/mid/long output reviewed and quality confirmed acceptable
+- No prompt tuning needed at this time
+
+### [UI] Progressive Web App (PWA) — 2026-04-29
+- `manifest.json`, `sw.js`, PNG icons (192/512) generated via rsvg-convert
+- `/manifest.json` and `/sw.js` served at root via ui.py; exempted in auth_middleware
+- Theme-color meta tag updated dynamically on light/dark toggle
+- Install prompt confirmed working in Chromium desktop; apple-touch-icon for iOS
+
+### [UI] CodeMirror markdown editor for identity/memory files — 2026-04-28
+- Replaced textarea in Files panel with CodeMirror 5 (markdown mode, CDN)
+- Syntax highlighting, line wrapping, Ctrl+S to save, per-file undo history
+
+### [UI] Input area polish — 2026-04-28
+- Single cycling S/M/L button replaces 3 separate height buttons (same UX as font size)
+- S size collapses mode-select to a row (compact); M/L keep vertical column layout
+- Input height minimum derived from setting so empty textarea reflects selected size
+- Context & Memory panel and Settings dropdown are mutually exclusive (closeAllPanels fix)
+- Both panels now use consistent shadow (var(--shadow)) and z-index (200)
+
+### [Tools] Tools toggle — decoupled from Role/Backend — 2026-04-28
+- Removed "Agent" mode from the mode selector; replaced with independent ⚡ toggle
+- `toolsEnabled` persists in localStorage; routes to orchestrator regardless of active mode
+- Layout: column (M/L) or row (S) driven by `data-size` attribute set by JS
+- chat_role flows from UI → OrchestrateRequest → orchestrator_engine.run(response_role=...)
+
+### [Tools] shell_exec tool — 2026-04-28
+- `shell_exec(command, working_dir, timeout)` in `cortex/tools/system.py`
+- Runs any shell command on the Cortex host; timeout clamped 1–120s
+- Use for system diagnostics: `df -h`, `ps aux`, `journalctl`, `free -h`, etc.
+
+### [Tools] Aether Journals full toolkit — 2026-04-28
+- `ae_journal_list` — list all journals + ids for the account
+- `ae_journal_entry_update` — PATCH any fields (title, content, summary, tags, enable)
+- `ae_journal_entry_disable` — soft-delete via enable=false
+- `ae_journal_entry_append` — read→append timestamped section→write (running/data logs)
+- `ae_journal_entry_prepend` — read→prepend timestamped section→write (newest-first)
+- Shared `_get_entry` / `_patch_entry` helpers; OpenAI JSON Schema auto-derived from Gemini declarations
+
 ### [Local] Per-user multi-model local LLM settings — 2026-04-01
 - `home/{username}/local_llm.json` — `hosts[]` + `models[]` + `active_model_id` structure
 - `cortex/user_settings.py` — CRUD functions: save_host, add_model, remove_model, set_active_model, get_active_local_model
@@ -217,3 +551,14 @@ base accessible to local models. Endpoints documented in `docs/OPEN_WEBUI_API.md
 - FastAPI service with streaming SSE response
 - Claude CLI and Gemini CLI subprocess backends
 - Session context management (rolling window, `MAX_HISTORY_MESSAGES`)
+
+
+### [Tools] Orchestrator tool expansions — Round 3
+
+- [x] **`spawn_agent` tool restrictions** — `allow_tools` and `deny_tools` per-call params — 2026-05-12
+  - Role config remains the authoritative ceiling; spawner can only restrict further
+  - `allow_tools`: intersected with role tool list; if role list is None, used directly (role gate still applies)
+  - `deny_tools`: removed from tool list; falls back to `confirm_deny` gate when tool list is unrestricted
+  - Design spec: `ARCH__FUTURE.md` §12
+- [x] **`file_diff`** — unified diff of two project-scoped files (`diff -u`); low risk, no admin — 2026-05-12
+- [x] **`git_status` / `git_log` / `git_diff`** — read-only git inspection tools, project-scoped; `git.py` module — 2026-05-12
--- a/scripts/import_knowledge.py
+++ b/scripts/import_knowledge.py
@@ -0,0 +1,407 @@
+#!/usr/bin/env python3
+"""
+Knowledge base importer — walks a markdown directory and creates AE journal entries.
+
+Uses a local LLM to generate a 1-2 sentence summary for each chunk.
+Tracks progress in a state file so interrupted runs can be resumed.
+
+Usage:
+  python import_knowledge.py --source ~/DgrZone_Nextcloud --journal <journal_id>
+  python import_knowledge.py --source ~/OSIT_Nextcloud --journal <journal_id> --dry-run
+  python import_knowledge.py --source ~/DgrZone_Nextcloud/Notes --journal <journal_id> --limit 5
+
+Reads credentials from cortex/.env (relative to this script's parent directory)
+or from environment variables:
+  AE_API_URL, AE_API_KEY, AE_ACCOUNT_ID
+  LOCAL_API_URL, LOCAL_API_KEY, LOCAL_MODEL
+"""
+
+import argparse
+import hashlib
+import json
+import os
+import re
+import sys
+import time
+from datetime import datetime, timezone
+from pathlib import Path
+
+# ── Bootstrap: load .env from cortex/.env if not already set ─────────────────
+
+_ENV_PATH = Path(__file__).parent.parent / "cortex" / ".env"
+
+def _load_env(path: Path) -> None:
+    if not path.exists():
+        return
+    for line in path.read_text().splitlines():
+        line = line.strip()
+        if not line or line.startswith("#") or "=" not in line:
+            continue
+        key, _, val = line.partition("=")
+        key = key.strip()
+        if key not in os.environ:
+            os.environ[key] = val.strip().strip('"').strip("'")
+
+_load_env(_ENV_PATH)
+
+# ── Constants ─────────────────────────────────────────────────────────────────
+
+# Dirs to skip regardless of source
+_DEFAULT_EXCLUDE = {
+    "temp", "Temp", "tmp", "Tmp", "test", "Test",
+    "Temporary Share", "Test Share", ".obsidian", "media", "Photos",
+}
+
+# Max characters per journal entry chunk
+_DEFAULT_MAX_CHUNK = 8_000
+
+# Delay between API calls (seconds) to avoid hammering the LLM
+_LLM_DELAY = 1.0
+_AE_DELAY = 0.3
+
+
+# ── Path / tag utilities ──────────────────────────────────────────────────────
+
+def _path_tags(source_root: Path, file_path: Path) -> list[str]:
+    """Derive tags from path components relative to the source root."""
+    rel = file_path.relative_to(source_root)
+    parts = list(rel.parts[:-1])  # exclude filename itself
+    tags = []
+    for part in parts:
+        cleaned = re.sub(r"[^a-zA-Z0-9 ]", " ", part).strip().lower()
+        words = cleaned.split()
+        tags.extend(w for w in words if len(w) > 2)
+    # Add source root name as top-level tag
+    tags.insert(0, source_root.name.lower().replace("_nextcloud", "").replace("_", ""))
+    return list(dict.fromkeys(tags))  # deduplicate preserving order
+
+
+def _file_title(file_path: Path, content: str) -> str:
+    """Extract the first H1 heading or fall back to the filename stem."""
+    m = re.search(r"^# (.+)$", content, re.MULTILINE)
+    if m:
+        return m.group(1).strip()
+    return file_path.stem.replace("_", " ").replace("-", " ")
+
+
+# ── Chunking ──────────────────────────────────────────────────────────────────
+
+def _chunk_content(
+    file_path: Path,
+    content: str,
+    source_root: Path,
+    max_size: int,
+) -> list[dict]:
+    """
+    Returns a list of chunk dicts:
+      {
+        "chunk_key": str,   # unique ID for state tracking
+        "title": str,
+        "content": str,
+        "tags": list[str],
+        "path": str,
+      }
+    """
+    base_title = _file_title(file_path, content)
+    base_tags = _path_tags(source_root, file_path)
+    rel_path = str(file_path.relative_to(source_root))
+
+    if len(content) <= max_size:
+        h = hashlib.sha1(content.encode()).hexdigest()[:12]
+        return [{
+            "chunk_key": f"{rel_path}#{h}",
+            "title": base_title,
+            "content": content,
+            "tags": base_tags,
+            "path": rel_path,
+        }]
+
+    # Split by H2 headings
+    sections = re.split(r"^(## .+)$", content, flags=re.MULTILINE)
+    # sections alternates: [preamble, heading, body, heading, body, ...]
+
+    chunks = []
+    preamble = sections[0].strip()
+    pairs = list(zip(sections[1::2], sections[2::2]))
+
+    if not pairs:
+        # No H2 found — hard split by max_size
+        for i in range(0, len(content), max_size):
+            part = content[i:i + max_size]
+            h = hashlib.sha1(part.encode()).hexdigest()[:12]
+            chunks.append({
+                "chunk_key": f"{rel_path}#part{i}_{h}",
+                "title": f"{base_title} (part {i // max_size + 1})",
+                "content": part,
+                "tags": base_tags,
+                "path": rel_path,
+            })
+        return chunks
+
+    for heading, body in pairs:
+        section_title = heading.lstrip("#").strip()
+        section_content = f"{heading}\n{body}".strip()
+
+        # Prepend preamble to first section if it has meaningful content
+        if preamble and chunks == []:
+            section_content = f"{preamble}\n\n{section_content}"
+
+        # If section itself is too big, hard split it
+        if len(section_content) > max_size:
+            for i in range(0, len(section_content), max_size):
+                part = section_content[i:i + max_size]
+                h = hashlib.sha1(part.encode()).hexdigest()[:12]
+                chunks.append({
+                    "chunk_key": f"{rel_path}#{section_title}#part{i}_{h}",
+                    "title": f"{base_title} — {section_title} (part {i // max_size + 1})",
+                    "content": part,
+                    "tags": base_tags + [section_title.lower()[:40]],
+                    "path": rel_path,
+                })
+        else:
+            h = hashlib.sha1(section_content.encode()).hexdigest()[:12]
+            chunks.append({
+                "chunk_key": f"{rel_path}#{section_title}#{h}",
+                "title": f"{base_title} — {section_title}",
+                "content": section_content,
+                "tags": base_tags + [section_title.lower()[:40]],
+                "path": rel_path,
+            })
+
+    return chunks
+
+
+# ── LLM summarization ─────────────────────────────────────────────────────────
+
+def _summarize(content: str, llm_url: str, llm_key: str, llm_model: str) -> str:
+    """Call the local LLM to generate a 1-2 sentence summary. Returns "" on failure."""
+    import urllib.request
+
+    # Truncate for prompt economy
+    snippet = content[:3000] if len(content) > 3000 else content
+
+    prompt = (
+        "Summarize the following note in 1-2 sentences. "
+        "Be specific and factual. Do not start with 'This note' or 'This document'.\n\n"
+        f"{snippet}"
+    )
+
+    payload = json.dumps({
+        "model": llm_model,
+        "messages": [{"role": "user", "content": prompt}],
+        "max_tokens": 150,
+        "temperature": 0.3,
+    }).encode()
+
+    headers = {"Content-Type": "application/json"}
+    if llm_key:
+        headers["Authorization"] = f"Bearer {llm_key}"
+
+    # Try Open WebUI path first, fall back to standard OpenAI path
+    for path in ("/api/chat/completions", "/chat/completions"):
+        url = llm_url.rstrip("/") + path
+        req = urllib.request.Request(url, data=payload, headers=headers, method="POST")
+        try:
+            with urllib.request.urlopen(req, timeout=60) as resp:
+                data = json.loads(resp.read())
+            return data["choices"][0]["message"]["content"].strip()
+        except Exception:
+            continue
+
+    return ""
+
+
+# ── AE API ────────────────────────────────────────────────────────────────────
+
+def _ae_create_entry(
+    journal_id: str,
+    title: str,
+    content: str,
+    summary: str,
+    tags: list[str],
+    ae_url: str,
+    ae_key: str,
+    ae_account: str,
+) -> str:
+    """POST to AE API. Returns the new entry's id_random or raises on error."""
+    import urllib.request
+
+    url = f"{ae_url.rstrip('/')}/v3/crud/journal/{journal_id}/journal_entry/"
+    payload = json.dumps({
+        "name": title,
+        "content": content,
+        "summary": summary,
+        "tags": tags,
+    }).encode()
+    headers = {
+        "Content-Type": "application/json",
+        "x-aether-api-key": ae_key,
+        "x-account-id": ae_account,
+    }
+    req = urllib.request.Request(url, data=payload, headers=headers, method="POST")
+    with urllib.request.urlopen(req, timeout=30) as resp:
+        data = json.loads(resp.read())
+
+    return (
+        data.get("data", {}).get("journal_entry_id")
+        or data.get("data", {}).get("id_random")
+        or data.get("id_random")
+        or "?"
+    )
+
+
+# ── State file ────────────────────────────────────────────────────────────────
+
+def _load_state(state_file: Path) -> dict:
+    if state_file.exists():
+        try:
+            return json.loads(state_file.read_text())
+        except Exception:
+            pass
+    return {"imported": {}}
+
+
+def _save_state(state_file: Path, state: dict) -> None:
+    state_file.write_text(json.dumps(state, indent=2))
+
+
+# ── File walker ───────────────────────────────────────────────────────────────
+
+def _walk_markdown(source: Path, exclude: set[str]) -> list[Path]:
+    files = []
+    for f in sorted(source.rglob("*.md")):
+        if any(part in exclude for part in f.parts):
+            continue
+        if f.stat().st_size < 50:  # skip near-empty files
+            continue
+        files.append(f)
+    return files
+
+
+# ── Main ──────────────────────────────────────────────────────────────────────
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="Import markdown notes into AE Journal")
+    parser.add_argument("--source", required=True, help="Root directory to import from")
+    parser.add_argument("--journal", required=True, help="Target AE journal id_random")
+    parser.add_argument("--dry-run", action="store_true", help="Preview without creating entries")
+    parser.add_argument("--limit", type=int, default=0, help="Stop after N chunks (0 = unlimited)")
+    parser.add_argument("--max-chunk", type=int, default=_DEFAULT_MAX_CHUNK, help="Max chars per chunk")
+    parser.add_argument("--exclude", default="", help="Extra dir names to skip (comma-separated)")
+    parser.add_argument("--state-file", default="import_state.json", help="State tracking file")
+    parser.add_argument("--no-llm", action="store_true", help="Skip LLM summarization (faster)")
+    parser.add_argument("--ae-url", default=os.environ.get("AE_API_URL", ""), help="AE API URL")
+    parser.add_argument("--ae-key", default=os.environ.get("AE_API_KEY", ""), help="AE API key")
+    parser.add_argument("--ae-account", default=os.environ.get("AE_ACCOUNT_ID", ""), help="AE account ID")
+    parser.add_argument("--llm-url", default=os.environ.get("LOCAL_API_URL", ""), help="Local LLM API URL")
+    parser.add_argument("--llm-key", default=os.environ.get("LOCAL_API_KEY", ""), help="Local LLM API key")
+    parser.add_argument("--llm-model", default=os.environ.get("LOCAL_MODEL", ""), help="Local LLM model name")
+    args = parser.parse_args()
+
+    source = Path(args.source).expanduser().resolve()
+    if not source.exists():
+        print(f"ERROR: source directory not found: {source}", file=sys.stderr)
+        sys.exit(1)
+
+    if not args.dry_run:
+        if not args.ae_url or not args.ae_key or not args.ae_account:
+            print("ERROR: AE_API_URL, AE_API_KEY, and AE_ACCOUNT_ID are required (or use --dry-run)", file=sys.stderr)
+            sys.exit(1)
+
+    use_llm = not args.no_llm and bool(args.llm_url) and bool(args.llm_model)
+    if not use_llm and not args.no_llm:
+        print("NOTE: LLM summarization disabled (LOCAL_API_URL or LOCAL_MODEL not set). Use --no-llm to silence this.")
+
+    exclude = _DEFAULT_EXCLUDE | {d.strip() for d in args.exclude.split(",") if d.strip()}
+    state_file = Path(args.state_file)
+    state = _load_state(state_file)
+
+    print(f"Source: {source}")
+    print(f"Journal: {args.journal}")
+    print(f"Dry run: {args.dry_run}")
+    print(f"LLM: {'enabled (' + args.llm_model + ')' if use_llm else 'disabled'}")
+    print(f"State file: {state_file} ({len(state['imported'])} already imported)")
+    print()
+
+    files = _walk_markdown(source, exclude)
+    print(f"Found {len(files)} markdown files")
+
+    created = 0
+    skipped = 0
+    errors = 0
+    total_chunks = 0
+
+    for file_path in files:
+        try:
+            content = file_path.read_text(encoding="utf-8", errors="replace")
+        except Exception as e:
+            print(f"  SKIP (read error): {file_path.name} — {e}")
+            errors += 1
+            continue
+
+        chunks = _chunk_content(file_path, content, source, args.max_chunk)
+        total_chunks += len(chunks)
+
+        for chunk in chunks:
+            key = chunk["chunk_key"]
+
+            if key in state["imported"]:
+                skipped += 1
+                continue
+
+            print(f"  {'[DRY RUN] ' if args.dry_run else ''}IMPORT: {chunk['title'][:70]}")
+
+            summary = ""
+            if use_llm:
+                try:
+                    summary = _summarize(chunk["content"], args.llm_url, args.llm_key, args.llm_model)
+                    time.sleep(_LLM_DELAY)
+                except Exception as e:
+                    print(f"    LLM error (continuing without summary): {e}")
+
+            if not args.dry_run:
+                try:
+                    entry_id = _ae_create_entry(
+                        journal_id=args.journal,
+                        title=chunk["title"],
+                        content=chunk["content"],
+                        summary=summary,
+                        tags=chunk["tags"],
+                        ae_url=args.ae_url,
+                        ae_key=args.ae_key,
+                        ae_account=args.ae_account,
+                    )
+                    state["imported"][key] = {
+                        "entry_id": entry_id,
+                        "imported_at": datetime.now(timezone.utc).isoformat(),
+                        "path": chunk["path"],
+                        "title": chunk["title"],
+                    }
+                    _save_state(state_file, state)
+                    time.sleep(_AE_DELAY)
+                    created += 1
+                except Exception as e:
+                    print(f"    AE API error: {e}")
+                    errors += 1
+            else:
+                created += 1
+
+            if args.limit and created >= args.limit:
+                print(f"\nReached --limit {args.limit}. Stopping.")
+                _print_summary(created, skipped, errors, total_chunks, args.dry_run)
+                return
+
+    _print_summary(created, skipped, errors, total_chunks, args.dry_run)
+
+
+def _print_summary(created: int, skipped: int, errors: int, total: int, dry_run: bool) -> None:
+    label = "Would create" if dry_run else "Created"
+    print(f"\n{'=' * 50}")
+    print(f"{label}: {created} entries")
+    print(f"Skipped (already imported): {skipped}")
+    print(f"Errors: {errors}")
+    print(f"Total chunks processed: {total}")
+
+
+if __name__ == "__main__":
+    main()