chore: session logs and memory distill 2026-03-17/18

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: orchestrator Agent mode UI + claude_allow_dir tool + fix DDG search
2026-03-18 22:43:40 -04:00 · 2026-03-18 22:42:44 -04:00 · 2026-03-18 20:13:11 -04:00 · 2026-03-18 19:45:05 -04:00 · 2026-03-18 19:43:20 -04:00 · 2026-03-18 19:38:14 -04:00
38 changed files with 2539 additions and 24 deletions
--- a/.env.default
+++ b/.env.default
@@ -0,0 +1,62 @@
+# Cortex .env reference — copy to .env and fill in values
+# DO NOT commit .env — it contains secrets
+
+# ── Agent identity ───────────────────────────────────────────────────────────
+# Each running instance has its own identity directory and name.
+# For a second instance (e.g. Holly), copy this file, change these values,
+# set a different PORT and INARA_DIR, and run a separate systemd unit.
+AGENT_NAME=Inara
+USER_NAME=Scott
+
+# ── Server ──────────────────────────────────────────────────────────────────
+HOST=0.0.0.0
+PORT=8000
+
+# ── Nextcloud Talk bot ───────────────────────────────────────────────────────
+NEXTCLOUD_URL=https://cloud.dgrzone.com
+NEXTCLOUD_TALK_BOT_SECRET=
+
+# ── LLM backends ────────────────────────────────────────────────────────────
+# Primary backend: "claude" or "gemini" (other is always fallback)
+PRIMARY_BACKEND=claude
+
+# Timeouts in seconds
+TIMEOUT_CLAUDE=60
+TIMEOUT_GEMINI=120
+
+# ── Orchestrator (Gemini API — not Gemini CLI) ───────────────────────────────
+# Required for /orchestrate endpoint and tool use
+# Free tier key: https://aistudio.google.com/apikey
+GEMINI_API_KEY=
+
+# Model for the orchestration tool loop (not the user-facing response)
+ORCHESTRATOR_MODEL=gemini-2.5-flash
+
+# Safety cap on tool loop iterations
+ORCHESTRATOR_MAX_ROUNDS=10
+
+# ── DuckDuckGo search ────────────────────────────────────────────────────────
+# Leave blank for free unauthenticated tier
+# Set to your API key for higher rate limits (paid DuckDuckGo account)
+DDG_API_KEY=
+DDG_MAX_RESULTS=5
+
+# ── Aether Platform API ───────────────────────────────────────────────────────
+# Used by orchestrator tools: ae_journal_search, ae_journal_entry_create, ae_task_list
+# Same values as agents_sync/mcp/.env — copy from there
+AE_API_URL=https://dev-api.oneskyit.com
+AE_API_KEY=
+AE_ACCOUNT_ID=
+AE_API_TIMEOUT=15
+
+# ── Distillation schedule ────────────────────────────────────────────────────
+SCHEDULER_TIMEZONE=America/New_York
+AUTO_DISTILL=true
+AUTO_DISTILL_SHORT=true
+AUTO_DISTILL_MID=true
+AUTO_DISTILL_LONG=false   # manual review recommended before enabling
+
+# Memory tier token budgets (soft caps)
+MEMORY_BUDGET_SHORT=3000
+MEMORY_BUDGET_MID=2000
+MEMORY_BUDGET_LONG=2000
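
Review note on `.env.default`: the file is tracked, so every secret variable in it should stay blank. The check below is a minimal sketch of how that could be enforced before a commit. The helper name `find_filled_secrets` and the rule that secret variables end in `_KEY` or `_SECRET` are assumptions for illustration; neither is part of this commit.

```python
import re

# Assumption: variables whose names end in _KEY or _SECRET hold secrets
# and must be left blank in the tracked template.
SECRET_NAME = re.compile(r"^[A-Z0-9_]*(_KEY|_SECRET)$")


def find_filled_secrets(env_text: str) -> list[str]:
    """Return names of secret-looking variables that have a non-empty value."""
    filled = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comment lines, and anything that is not NAME=VALUE
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop a trailing inline comment such as "false   # note"
        value = value.split(" #", 1)[0].strip()
        if SECRET_NAME.match(name.strip()) and value:
            filled.append(name.strip())
    return filled


sample = "GEMINI_API_KEY=abc123\nDDG_API_KEY=\nPORT=8000\nNEXTCLOUD_TALK_BOT_SECRET=\n"
print(find_filled_secrets(sample))
```

Run against the template text, an empty list means no secret-looking variable carries a value.
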
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,205 @@
+# CLAUDE.md — Cortex / Inara Project
+
+This file is loaded automatically by Claude Code when working in this directory.
+Read it before touching any files.
+
+---
+
+## Identity & Context
+
+- **Project:** Cortex (dispatcher) + Inara (resident agent)
+- **Owner:** Scott Idem (One Sky IT / Danger Zone)
+- **Machine context:** See `~/CLAUDE.md` for fleet identity (`scott_lpt` = General Manager)
+- **Named after:** The 'verse-wide communications network (Firefly)
+
+---
+
+## Directory Map
+
+```
+Cortex_and_Inara_dev/
+  cortex/                ← FastAPI service (the dispatcher)
+    main.py              ← App entry point, router registration
+    config.py            ← All settings (pydantic-settings, reads .env)
+    llm_client.py        ← Claude CLI + Gemini CLI subprocess backends
+    orchestrator_engine.py ← Gemini API ReAct tool loop → Claude handoff
+    context_loader.py    ← Loads Inara's system prompt from inara/ files
+    session_store.py     ← In-memory + file session persistence
+    session_logger.py    ← Writes session turns to inara/sessions/
+    memory_distiller.py  ← Short/mid/long distill jobs (APScheduler)
+    scheduler.py         ← APScheduler setup
+    event_bus.py         ← Internal SSE pub/sub (NC Talk → browser)
+    routers/
+      chat.py            ← POST /chat (streaming SSE)
+      orchestrator.py    ← POST /orchestrate, GET /orchestrate/{job_id}
+      auth.py            ← GET /auth/status (Claude + Gemini CLI token checks)
+      distill.py         ← POST /distill/*, GET /distill/status
+      files.py           ← GET /files (inara/ file browser)
+      nextcloud_talk.py  ← POST /webhook/nextcloud (NC Talk bot)
+      google_chat.py     ← POST /webhook/google (Google Chat — stub)
+    tools/
+      __init__.py        ← Tool registry (Gemini FunctionDeclarations + dispatcher)
+      web.py             ← DuckDuckGo web_search tool
+      system.py          ← Local machine tools (claude_allow_dir)
+    static/              ← Single-page web UI (index.html, style.css, app.js)
+    data/sessions/       ← Persisted session JSON files
+
+  inara/                 ← Inara identity, memory, context files
+    IDENTITY.md          ← Who Inara is
+    SOUL.md              ← Values, personality, voice
+    PROTOCOLS.md         ← Behavioral rules
+    CONTEXT_TIERS.md     ← What each tier (1–3) includes in the system prompt
+    USER.md              ← Scott's profile (loaded into context)
+    HELP.md              ← In-app help content (rendered in UI)
+    MEMORY.md            ← Persistent facts (written by distiller or manually)
+    MEMORY_SHORT.md      ← Rolling short-term memory (auto-distilled daily)
+    MEMORY_MID.md        ← Mid-term memory (auto-distilled weekly)
+    MEMORY_LONG.md       ← Long-term memory (auto-distilled monthly)
+    sessions/            ← Session turn logs (YYYY-MM-DD_<id>.md)
+
+  docs/                  ← Integration reference docs
+    NEXTCLOUD_TALK_BOT.md
+
+  documentation/         ← Architecture decisions and agent task list
+    TODO__Agents.md      ← READ THIS FIRST — active task list
+    ARCH__Intelligence_Layer.md ← Orchestrator, dev agent, knowledge architecture
+
+  docker-compose.yml     ← Docker deployment
+  .env.default           ← Reference config (copy to .env, fill in secrets)
+  README.md              ← Project orientation
+```
+
+---
+
+## Run Commands
+
+```bash
+# Start (Docker)
+docker compose up -d
+
+# Restart service (after any Python change)
+sudo systemctl restart cortex
+
+# Syntax check a file before restarting
+python3 -m py_compile cortex/<file>.py
+
+# Syntax check all routers
+for f in cortex/routers/*.py cortex/tools/*.py cortex/orchestrator_engine.py; do
+    python3 -m py_compile "$f" && echo "OK: $f"
+done
+
+# Install/update dependencies
+cd cortex && .venv/bin/pip install -r requirements.txt
+
+# Logs
+journalctl -u cortex -f
+
+# Web UI (local)
+# http://localhost:8000
+
+# Swagger docs
+# http://localhost:8000/docs
+```
+
+---
+
+## Key Design Decisions
+
+### Two-Brain Architecture (Orchestrator / Responder)
+- **Gemini API** (`orchestrator_engine.py`) — runs the ReAct tool loop; handles tool calling, planning, research
+- **Claude CLI** (`llm_client.py`) — produces all user-facing responses; receives enriched context from Gemini
+- **Direct chat** bypasses the orchestrator entirely — `POST /chat` goes straight to Claude (faster)
+- **Orchestrated tasks** go to `POST /orchestrate` — returns a job_id, result is polled
+
+### LLM Backends
+- `llm_client.py` manages Claude CLI (`claude --print`) and Gemini CLI (`gemini -p`) subprocesses
+- `orchestrator_engine.py` uses the Gemini **API** (google-genai SDK) — completely separate from the Gemini CLI
+- Claude OAuth token is read live from `~/.claude/.credentials.json` (never rely on stale env var)
+
+### Tool Strategy
+- Orchestrator tools live in `cortex/tools/` — separate from the `ae_*` MCP tools
+- **Do not modify** the `ae_*` MCP server to support orchestrator needs; add new tools to `cortex/tools/` instead
+- Tools are registered in `cortex/tools/__init__.py` as both Gemini FunctionDeclarations and Python callables
+
+### Context / Memory
+- `context_loader.py` assembles Inara's system prompt from `inara/` files based on tier (1–3)
+- Tier 1 = minimal (identity only); Tier 2 = standard (+ memory + user profile); Tier 3 = full
+- Memory files are written by the distiller or manually — do not delete them
+
+### Security / Safety
+- **Never `rm`** — move files to `~/tmp/gemini_trash`
+- **Never commit secrets** — `.env` is gitignored; use `.env.default` as the reference
+- `NEXTCLOUD_TALK_BOT_SECRET` and `GEMINI_API_KEY` live in `.env` only
+- Cortex should only be accessible via WireGuard — never internet-exposed without VPN
+
+---
+
+## Adding a New Tool
+
+1. Implement the tool function in `cortex/tools/<domain>.py`
+   - Must be `async def`; use `asyncio.to_thread` for blocking calls
+   - Return a plain string result
+2. Add a `FunctionDeclaration` and register it in `cortex/tools/__init__.py`
+3. Syntax check: `python3 -m py_compile cortex/tools/<domain>.py`
+4. Restart Cortex
+
+## Managing Claude Code Directory Permissions
+
+Claude Code prompts (or silently hangs) when it needs to read or write a directory outside
+its current working directory. The `claude-allow-dir` script patches `~/.claude/settings.json`
+to add auto-allow rules so Claude no longer blocks on those paths.
+
+### Script: `~/.local/bin/claude-allow-dir`
+
+```bash
+# Allow read + write (default)
+claude-allow-dir ~/OSIT_dev/aether_api_fastapi
+
+# Read-only
+claude-allow-dir ~/agents_sync r
+
+# Write-only
+claude-allow-dir /tmp w
+```
+
+Adds `Read(path/*)` and/or `Edit(path/*)` + `Write(path/*)` entries to the `permissions.allow`
+array in `~/.claude/settings.json`. Idempotent — safe to run twice on the same path.
+Changes take effect in the next Claude Code session (or after opening `/hooks` in the UI).
+
+### Orchestrator tool: `claude_allow_dir`
+
+Cortex exposes this as a Gemini tool (`cortex/tools/system.py`) so the orchestrator can add
+allow rules on Inara's behalf without human intervention.
+
+**Security note:** This tool modifies Claude Code's own permission settings. The Gemini
+orchestrator calling it can grant Claude access to any directory on the machine. Keep this
+in mind when evaluating orchestrator behavior — it should only be invoked when Scott has
+clearly asked for a directory to be unblocked.
+
+## Adding a New Router
+
+1. Create `cortex/routers/<name>.py` with `router = APIRouter()`
+2. Import and register in `cortex/main.py`
+3. Syntax check, restart
+
+---
+
+## Active Tasks
+
+See `documentation/TODO__Agents.md` for the current task list.
+High priority items as of 2026-03-18:
+- Ollama backend (third LLM option — local, no API cost)
+- NC Talk integration stabilization
+- Knowledge consolidation (markdown → AE Journals)
+
+---
+
+## Related Docs
+
+| File | Purpose |
+|---|---|
+| `documentation/TODO__Agents.md` | Active task list — read before starting work |
+| `documentation/ARCH__Intelligence_Layer.md` | Full architecture design |
+| `~/agents_sync/projects/CORTEX.md` | High-level project vision and phases |
+| `~/agents_sync/CLAUDE.md` | Fleet coordination rules |
+| `~/CLAUDE.md` | Machine identity (`scott_lpt`) |
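
Review note on the "Adding a New Tool" steps in CLAUDE.md: the rules (an `async def`, `asyncio.to_thread` for blocking calls, a plain string result) can be sketched as below. The tool `file_line_count`, the `_REGISTRY` dict and this version of `call_tool` are hypothetical stand-ins; the real registry in `cortex/tools/__init__.py` is not shown in this diff, and the Gemini `FunctionDeclaration` step is left out so the sketch runs with the standard library only.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable


def _count_lines_blocking(path: str) -> int:
    """Blocking file read; the async wrapper keeps it off the event loop."""
    with Path(path).expanduser().open("r", encoding="utf-8", errors="replace") as fh:
        return sum(1 for _ in fh)


async def file_line_count(path: str) -> str:
    """Hypothetical tool: report how many lines a text file has."""
    try:
        count = await asyncio.to_thread(_count_lines_blocking, path)
    except OSError as exc:
        # Tools return plain strings, so errors are reported as text too
        return f"Error reading {path}: {exc}"
    return f"{path}: {count} lines"


# Stand-in for the registry in cortex/tools/__init__.py (assumed shape).
_REGISTRY: dict[str, Callable[..., Awaitable[str]]] = {
    "file_line_count": file_line_count,
}


async def call_tool(name: str, args: dict) -> str:
    """Dispatch a tool call by name; unknown names return an error string."""
    func = _REGISTRY.get(name)
    if func is None:
        return f"Unknown tool: {name}"
    return await func(**args)
```

Returning an error string instead of raising keeps the result usable as a tool response, which matches how the orchestrator logs tool results as text.
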
--- a/README.md
+++ b/README.md
@@ -0,0 +1,91 @@
+# Cortex / Inara — Project Root
+
+**Owner:** Scott Idem (One Sky IT / Danger Zone)
+**Started:** 2026-03-04
+**Status:** Active development
+
+> *"You can't stop the signal."*
+
+Cortex is a self-hosted multi-agent orchestration layer. Inara is the primary conversational agent that lives inside it.
+
+---
+
+## Quick Orientation
+
+| Directory | What it is |
+|---|---|
+| `cortex/` | FastAPI service — dispatcher, routing, LLM backends, session management |
+| `inara/` | Inara identity, memory, context, and help files |
+| `docs/` | Integration reference docs (NC Talk bot, etc.) |
+| `documentation/` | Architecture decisions, project plans, agent task lists |
+
+---
+
+## Running Cortex
+
+```bash
+# Start (Docker)
+cd ~/agents_sync/projects/Cortex_and_Inara_dev
+docker compose up -d
+
+# Restart service only (after backend changes)
+sudo systemctl restart cortex
+
+# Logs
+journalctl -u cortex -f
+
+# Web UI
+# http://localhost:8000   (or cortex.dgrzone.com on WireGuard)
+```
+
+Config lives in `cortex/config.py` and a `.env` file at the project root (not tracked; see `.env.default`).
+
+---
+
+## Key Documentation
+
+| File | Purpose |
+|---|---|
+| `documentation/TODO__Agents.md` | Active task list — read first |
+| `documentation/ARCH__Intelligence_Layer.md` | Intelligence layer architecture (orchestrator, dev agents, knowledge) |
+| `docs/NEXTCLOUD_TALK_BOT.md` | NC Talk bot setup |
+| `inara/IDENTITY.md` | Inara persona and identity |
+| `inara/HELP.md` | In-app help content (rendered in UI) |
+| `inara/PROTOCOLS.md` | Inara behavioral protocols |
+| `~/agents_sync/projects/CORTEX.md` | High-level project vision and phases |
+
+---
+
+## Architecture at a Glance
+
+```
+[User / Cron / Webhook]
+        ↓
+  Cortex Dispatcher  (FastAPI, cortex/)
+        ↓
+  LLM Backend(s)
+  • Claude CLI   — primary reasoning, coding, long-context
+  • Gemini CLI   — secondary / cost routing
+  • Ollama       — offline/private (scott_gaming, future)
+        ↓
+  Inara  (identity + memory in inara/)
+```
+
+See `documentation/ARCH__Intelligence_Layer.md` for the evolving orchestrator/responder and dev-agent architecture.
+
+---
+
+## Inara
+
+Inara is not tied to a specific model. The name is fixed; the backend may vary.
+Her identity and behavioral files live in `inara/` and are loaded at startup via `cortex/context_loader.py`.
+
+---
+
+## Related Projects
+
+| Project | Path |
+|---|---|
+| Aether Platform API | `~/OSIT_dev/aether_api_fastapi/` |
+| Aether Frontend | `~/OSIT_dev/aether_app_sveltekit/` |
+| Fleet coordination | `~/agents_sync/` |
--- a/cortex-holly.service
+++ b/cortex-holly.service
@@ -0,0 +1,15 @@
+[Unit]
+Description=Cortex / Holly LLM Gateway
+After=network.target
+
+[Service]
+Type=simple
+User=scott
+WorkingDirectory=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex
+EnvironmentFile=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex/.env.holly
+ExecStart=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/cortex/.venv/bin/uvicorn main:app --host 0.0.0.0 --port 8001
+Restart=on-failure
+RestartSec=5
+
+[Install]
+WantedBy=multi-user.target
--- a/cortex/.env.holly
+++ b/cortex/.env.holly
@@ -0,0 +1,36 @@
+# Holly instance .env
+# Copy secrets from cortex/.env (API keys, NC Talk secret etc.)
+# then customise the identity settings below.
+
+# TODO: Set AGENT_NAME to whatever name Holly chooses for her agent
+AGENT_NAME=TBD
+USER_NAME=Holly
+
+PORT=8001
+HOST=0.0.0.0
+
+INARA_DIR=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/holly
+SESSIONS_DIR=/home/scott/agents_sync/projects/Cortex_and_Inara_dev/holly/sessions
+
+DEFAULT_MODEL=claude-sonnet-4-6
+DEFAULT_TIER=2
+
+# ── Copy these from cortex/.env ──────────────────────────────────────────────
+GEMINI_API_KEY=
+AE_API_URL=https://dev-api.oneskyit.com
+AE_API_KEY=
+AE_ACCOUNT_ID=
+
+NEXTCLOUD_URL=https://cloud.dgrzone.com
+NEXTCLOUD_TALK_BOT_SECRET=
+
+# Per-backend timeouts
+TIMEOUT_CLAUDE=60
+TIMEOUT_GEMINI=120
+TIMEOUT_LOCAL=300
+
+SCHEDULER_TIMEZONE=America/New_York
+AUTO_DISTILL=true
+AUTO_DISTILL_SHORT=true
+AUTO_DISTILL_MID=true
+AUTO_DISTILL_LONG=false
--- a/cortex/config.py
+++ b/cortex/config.py
@@ -4,6 +4,29 @@ from pydantic_settings import BaseSettings, SettingsConfigDict

 class Settings(BaseSettings):
    anthropic_api_key: str | None = None  # not used — claude CLI handles auth
+
+    # Orchestrator (Gemini API — separate from Gemini CLI)
+    # Get a key at: https://aistudio.google.com/apikey (free tier is sufficient)
+    gemini_api_key: str | None = None
+    orchestrator_model: str = "gemini-2.5-flash"    # model used for tool loop
+    orchestrator_max_rounds: int = 10               # safety cap on tool iterations
+
+    # DuckDuckGo search (used by orchestrator web_search tool)
+    # Leave blank to use the free unauthenticated tier; set to your API key for higher limits
+    ddg_api_key: str | None = None
+    ddg_max_results: int = 5
+
+    # Aether Platform API (used by orchestrator ae_journal_* and ae_task_list tools)
+    ae_api_url: str = "https://dev-api.oneskyit.com"
+    ae_api_key: str = ""          # x-aether-api-key header
+    ae_account_id: str = ""       # x-account-id header
+    ae_api_timeout: int = 15      # per-request timeout in seconds
+
+    # Agent identity — used in prompts, session logs, and memory distillation
+    # Override in .env for each instance (e.g. AGENT_NAME=Holly, USER_NAME=Holly)
+    agent_name: str = "Inara"
+    user_name: str = "Scott"
+
    inara_dir: Path = Path("../inara")
    sessions_dir: Path = Path("./data/sessions")
    default_model: str = "claude-sonnet-4-6"
--- a/cortex/llm_client.py
+++ b/cortex/llm_client.py
@@ -193,7 +193,7 @@ def _build_conversation(messages: list[dict]) -> str:
    if prior:
        history_lines = []
        for msg in prior:
-            label = "Scott" if msg["role"] == "user" else "Inara"
+            label = settings.user_name if msg["role"] == "user" else settings.agent_name
            history_lines.append(f"{label}: {msg['content']}")
        parts.append("<conversation>\n" + "\n\n".join(history_lines) + "\n</conversation>")
    parts.append(messages[-1]["content"] if messages else "")
--- a/cortex/main.py
+++ b/cortex/main.py
@@ -8,7 +8,7 @@ import uvicorn
 logging.basicConfig(level=logging.INFO, format="%(levelname)s:%(name)s: %(message)s")

 from config import settings
-from routers import chat, google_chat, nextcloud_talk, files, distill, auth
+from routers import chat, google_chat, nextcloud_talk, files, distill, auth, orchestrator


@asynccontextmanager
@@ -29,6 +29,7 @@ app.include_router(nextcloud_talk.router)
 app.include_router(files.router)
 app.include_router(distill.router)
 app.include_router(auth.router)
+app.include_router(orchestrator.router)
 app.mount("/static", StaticFiles(directory="static"), name="static")


--- a/cortex/memory_distiller.py
+++ b/cortex/memory_distiller.py
@@ -87,12 +87,12 @@ async def distill_mid() -> dict:

    budget_tokens = settings.memory_budget_mid
    system_prompt = (
-        "You are Inara's memory distillation system. "
+        f"You are {settings.agent_name}'s memory distillation system. "
        "Summarize the following recent session logs into a concise mid-term memory digest. "
        f"Target length: under {budget_tokens} tokens. "
        "Focus on: recurring themes, important decisions made, ongoing projects, "
-        "Scott's current state and priorities, and anything that should persist into future sessions. "
-        "Write in first person as Inara (e.g. 'Scott and I worked on...'). "
+        f"{settings.user_name}'s current state and priorities, and anything that should persist into future sessions. "
+        f"Write in first person as {settings.agent_name} (e.g. '{settings.user_name} and I worked on...'). "
        "Use markdown headings. Be specific and concrete — no filler."
    )

@@ -132,7 +132,7 @@ async def distill_long() -> dict:

    budget_tokens = settings.memory_budget_long
    system_prompt = (
-        "You are Inara's long-term memory curator. "
+        f"You are {settings.agent_name}'s long-term memory curator. "
        "You will receive the current long-term memory and a recent mid-term digest. "
        f"Integrate the new information into the long-term memory. Target: under {budget_tokens} tokens. "
        "Rules: preserve important historical facts; update or replace stale information; "
@@ -154,7 +154,7 @@ async def distill_long() -> dict:
    now = datetime.now().strftime("%Y-%m-%d %H:%M")
    if not response_text.lstrip().startswith("# MEMORY_LONG"):
        response_text = (
-            f"# MEMORY_LONG.md — Inara Long-Term Memory\n\n"
+            f"# MEMORY_LONG.md — {settings.agent_name} Long-Term Memory\n\n"
            f"*Last distilled: {now} via {backend}.*\n\n---\n\n"
            + response_text
        )
--- a/cortex/orchestrator_engine.py
+++ b/cortex/orchestrator_engine.py
@@ -0,0 +1,243 @@
+"""
+Orchestrator engine — two-brain architecture.
+
+Flow:
+  1. Gemini API runs a ReAct tool loop (reason → act → observe → repeat)
+  2. When Gemini has gathered enough context, it produces a final summary
+  3. That enriched context is handed off to Claude for the user-facing response
+
+Why this split:
+  - Gemini API has native structured tool calling (Gemini CLI subprocess does not)
+  - Claude produces higher-quality user-facing prose and reasoning
+  - Claude Pro subscription has no API cost; Gemini free tier handles orchestration load
+
+For direct chat (no tools needed), this engine is not invoked — the chat router
+calls llm_client.complete() directly, which is faster and has no orchestration overhead.
+"""
+
+import asyncio
+import logging
+from dataclasses import dataclass, field
+
+from google import genai
+from google.genai import types
+
+from config import settings
+from llm_client import complete
+from tools import TOOL_DECLARATIONS, call_tool
+
+logger = logging.getLogger(__name__)
+
+# System prompt given to Gemini during the tool loop.
+# Gemini's job is information gathering and planning — NOT writing the final response.
+_ORCHESTRATOR_SYSTEM = """You are an intelligent orchestrator. Your job is to:
+1. Understand the user's request
+2. Call tools to gather the information needed to answer it
+3. Once you have enough information, produce a concise summary of:
+   - What the user asked
+   - What you found (tool results, key facts)
+   - Any important context that would help generate a good answer
+
+Do NOT write a polished final answer — a human-facing AI will do that next.
+Keep your summary factual and complete. Include relevant URLs, data, and specifics.
+If no tools are needed, return an empty summary."""
+
+
+@dataclass
+class OrchestratorResult:
+    response: str                       # final user-facing response (from Claude)
+    tool_calls: list[dict] = field(default_factory=list)  # [{tool, args, result}]
+    backend: str = "claude"             # model that produced the final response
+    gemini_summary: str = ""            # what Gemini handed to Claude (debug/display)
+
+
+async def run(
+    task: str,
+    system_prompt: str = "",
+    session_messages: list[dict] | None = None,
+    respond_with_claude: bool = True,
+) -> OrchestratorResult:
+    """
+    Run the full orchestration loop for a task.
+
+    Args:
+        task:               The user's request (plain text)
+        system_prompt:      Inara's system prompt (from context_loader) — passed to Claude
+        session_messages:   Prior conversation history for session continuity
+        respond_with_claude: If False, return Gemini's summary as the response (useful for
+                             background/cron tasks where a polished reply isn't needed)
+
+    Returns:
+        OrchestratorResult with response, tool call log, backend used, and Gemini summary
+    """
+    if not settings.gemini_api_key:
+        raise RuntimeError(
+            "GEMINI_API_KEY not set — orchestrator requires Gemini API. "
+            "Get a free key at https://aistudio.google.com/apikey and add it to .env"
+        )
+
+    client = genai.Client(api_key=settings.gemini_api_key)
+
+    # Seed Gemini with the task — include recent session context if available
+    task_with_context = _build_task_prompt(task, session_messages)
+    contents: list[types.Content] = [
+        types.Content(role="user", parts=[types.Part(text=task_with_context)])
+    ]
+
+    tool_call_log: list[dict] = []
+    gemini_summary = ""
+
+    # --- ReAct tool loop ---
+    for round_num in range(settings.orchestrator_max_rounds):
+        logger.info("Orchestrator round %d for task: %.80s", round_num + 1, task)
+
+        response = await asyncio.to_thread(
+            client.models.generate_content,
+            model=settings.orchestrator_model,
+            contents=contents,
+            config=types.GenerateContentConfig(
+                tools=TOOL_DECLARATIONS,
+                system_instruction=_ORCHESTRATOR_SYSTEM,
+            ),
+        )
+
+        candidate = response.candidates[0]
+        parts = candidate.content.parts if candidate.content else []
+
+        # Check if Gemini wants to call any tools
+        tool_call_parts = [
+            p for p in parts
+            if hasattr(p, "function_call") and p.function_call and p.function_call.name
+        ]
+
+        if not tool_call_parts:
+            # No more tool calls — extract Gemini's text summary
+            gemini_summary = "".join(
+                p.text for p in parts if hasattr(p, "text") and p.text
+            ).strip()
+            logger.info("Orchestrator done after %d round(s). Tools used: %d",
+                        round_num + 1, len(tool_call_log))
+            break
+
+        # Add Gemini's response (with function calls) to the conversation
+        contents.append(candidate.content)
+
+        # Execute all tool calls in parallel
+        tool_tasks = [
+            _execute_tool(fc.function_call.name, dict(fc.function_call.args))
+            for fc in tool_call_parts
+        ]
+        tool_results = await asyncio.gather(*tool_tasks, return_exceptions=True)
+
+        # Build function response parts and update log
+        response_parts: list[types.Part] = []
+        for fc_part, result in zip(tool_call_parts, tool_results):
+            fc = fc_part.function_call
+            result_str = str(result) if not isinstance(result, Exception) else f"Error: {result}"
+            logger.info("Tool %s → %d chars", fc.name, len(result_str))
+
+            tool_call_log.append({
+                "tool": fc.name,
+                "args": dict(fc.args),
+                "result": result_str,
+            })
+            response_parts.append(
+                types.Part(
+                    function_response=types.FunctionResponse(
+                        name=fc.name,
+                        response={"result": result_str},
+                    )
+                )
+            )
+
+        contents.append(types.Content(role="user", parts=response_parts))
+
+    else:
+        # Hit the round limit: summarize the tool results gathered so far
+        logger.warning("Orchestrator hit max rounds (%d)", settings.orchestrator_max_rounds)
+        gemini_summary = (
+            f"Reached the tool iteration limit ({settings.orchestrator_max_rounds} rounds). "
+            "Here is what was gathered so far:\n\n"
+            + "\n\n".join(f"**{t['tool']}**: {t['result'][:500]}" for t in tool_call_log)
+        )
+
+    # --- Claude handoff ---
+    if respond_with_claude:
+        claude_prompt = _build_claude_prompt(task, tool_call_log, gemini_summary)
+
+        # Merge with session history so Claude has conversation context
+        messages = list(session_messages or [])
+        messages.append({"role": "user", "content": claude_prompt})
+
+        response_text, backend = await complete(
+            system_prompt=system_prompt,
+            messages=messages,
+            model="claude",
+        )
+    else:
+        # Cron/background tasks: return Gemini's summary directly, no Claude call
+        response_text = gemini_summary or "No information gathered."
+        backend = "gemini"
+
+    return OrchestratorResult(
+        response=response_text,
+        tool_calls=tool_call_log,
+        backend=backend,
+        gemini_summary=gemini_summary,
+    )
+
+
+async def _execute_tool(name: str, args: dict) -> str:
+    """Execute a single tool call, catching all exceptions."""
+    try:
+        return await call_tool(name, args)
+    except Exception as e:
+        logger.warning("Tool %s failed: %s", name, e)
+        return f"Tool error: {e}"
+
+
+def _build_task_prompt(task: str, session_messages: list[dict] | None) -> str:
+    """Prepend recent session context so Gemini understands the conversation."""
+    if not session_messages:
+        return task
+
+    # Include last few turns for context (don't send the full history to keep tokens low)
+    recent = session_messages[-6:]  # last 3 turns
+    history_lines = []
+    for msg in recent:
+        label = "User" if msg["role"] == "user" else "Assistant"
+        history_lines.append(f"{label}: {msg['content'][:300]}")  # truncate long messages
+
+    context = "\n".join(history_lines)
+    return f"<recent_conversation>\n{context}\n</recent_conversation>\n\nCurrent request: {task}"
+
+
+def _build_claude_prompt(
+    task: str,
+    tool_calls: list[dict],
+    gemini_summary: str,
+) -> str:
+    """Build the enriched context handed from Gemini to Claude."""
+    parts = [f"User request: {task}\n"]
+
+    if tool_calls:
+        parts.append("## Research gathered\n")
+        for tc in tool_calls:
+            parts.append(f"### {tc['tool']}({_format_args(tc['args'])})")
+            # Truncate very long results — Claude gets the gist
+            result = tc["result"]
+            if len(result) > 2000:
+                result = result[:2000] + "\n… [truncated]"
+            parts.append(result)
+            parts.append("")
+
+    if gemini_summary:
+        parts.append("## Summary of findings\n")
+        parts.append(gemini_summary)
+
+    return "\n".join(parts)
+
+
+def _format_args(args: dict) -> str:
+    """Format tool args as a compact string for display."""
+    return ", ".join(f"{k}={repr(v)}" for k, v in args.items())
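
Review note on `orchestrator_engine.py`: the round cap relies on Python's `for ... else`, where the `else` branch runs only if the loop ends without `break`. The sketch below isolates that control flow. `fake_model`, `echo` and `run_loop` are hypothetical stand-ins with no Gemini calls, so this illustrates the pattern rather than the engine itself.

```python
import asyncio


async def fake_model(history: list[str]) -> dict:
    """Hypothetical model: asks for one tool call, then returns a summary."""
    if not any(h.startswith("tool:") for h in history):
        return {"tool": "echo", "args": {"text": "hello"}}
    return {"text": "summary: " + history[-1]}


async def echo(text: str) -> str:
    """Hypothetical tool: upper-case its input."""
    return text.upper()


async def run_loop(task: str, max_rounds: int) -> tuple[str, int]:
    history = [task]
    tool_calls = 0
    for _ in range(max_rounds):
        reply = await fake_model(history)
        if "tool" not in reply:
            summary = reply["text"]
            break  # normal exit: the else branch below is skipped
        result = await echo(**reply["args"])
        tool_calls += 1
        history.append(f"tool:{result}")
    else:
        # Runs only when the loop finished without break, i.e. the cap was hit
        summary = f"round limit reached after {tool_calls} tool call(s)"
    return summary, tool_calls


print(asyncio.run(run_loop("say hello", max_rounds=10)))
print(asyncio.run(run_loop("say hello", max_rounds=1)))
```

With a cap of 10 the loop breaks on the second round with the model's summary; with a cap of 1 it falls through to the `else` branch and reports the limit.
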
--- a/cortex/requirements.txt
+++ b/cortex/requirements.txt
@@ -4,5 +4,9 @@ uvicorn[standard]>=0.30.0
 pydantic-settings>=2.0.0
 python-dotenv>=1.0.0

+# Orchestrator — Gemini API (native tool calling) + web search
+google-genai>=1.0.0
+ddgs>=0.1.0
+
 # anthropic SDK not needed — using claude CLI subprocess for auth
 # anthropic>=0.40.0
--- a/cortex/routers/chat.py
+++ b/cortex/routers/chat.py
@@ -6,7 +6,7 @@ from pydantic import BaseModel
 from context_loader import load_context
 from llm_client import complete
 from session_logger import log_turn
-from session_store import load as load_session, save as save_session, list_all, generate_session_id
+from session_store import load as load_session, save as save_session, list_all, generate_session_id, delete as delete_session
 from config import settings
 import event_bus

@@ -143,6 +143,14 @@ async def list_sessions() -> dict:
    return {"sessions": list_all()}


+@router.delete("/sessions/{session_id}")
+async def delete_session_endpoint(session_id: str) -> dict:
+    found = delete_session(session_id)
+    if not found:
+        raise HTTPException(status_code=404, detail=f"Session {session_id} not found")
+    return {"ok": True, "session_id": session_id}
+
+
@router.put("/history/{session_id}")
 async def replace_history(session_id: str, req: HistoryUpdate) -> dict:
    """Replace the full message list for a session (used by edit/delete UI)."""
--- a/cortex/routers/google_chat.py
+++ b/cortex/routers/google_chat.py
@@ -18,9 +18,9 @@ async def receive(request: Request):

    if event_type == "ADDED_TO_SPACE":
        space_type = body.get("space", {}).get("type", "")
-        greeting = "✨ Hello! I'm Inara. Send me a message and I'll do my best to help."
+        greeting = f"✨ Hello! I'm {settings.agent_name}. Send me a message and I'll do my best to help."
        if space_type == "DM":
-            greeting = "✨ Hello! I'm Inara. What can I help you with?"
+            greeting = f"✨ Hello! I'm {settings.agent_name}. What can I help you with?"
        return {"text": greeting}

    if event_type == "REMOVED_FROM_SPACE":
--- a/cortex/routers/nextcloud_talk.py
+++ b/cortex/routers/nextcloud_talk.py
@@ -158,8 +158,9 @@ async def nextcloud_talk_webhook(request: Request, background_tasks: BackgroundT
    except (json.JSONDecodeError, AttributeError):
        user_text = (obj.get("name") or obj.get("content", "")).strip()

-    if user_text.lower().startswith("@inara"):
-        user_text = user_text[6:].strip()
+    mention_prefix = f"@{settings.agent_name.lower()}"
+    if user_text.lower().startswith(mention_prefix):
+        user_text = user_text[len(mention_prefix):].strip()

    if not user_text:
        return Response(status_code=200)
--- a/cortex/routers/orchestrator.py
+++ b/cortex/routers/orchestrator.py
@@ -0,0 +1,174 @@
+"""
+Orchestrator router — POST /orchestrate, GET /orchestrate/{job_id}
+
+Accepts a task description, runs it through the orchestrator engine
+(Gemini tool loop → Claude response), and returns the result.
+
+Designed to be triggered from:
+  - The Cortex web UI ("Agent mode" toggle)
+  - Cron jobs:  curl -X POST http://localhost:8000/orchestrate -H 'Content-Type: application/json' -d '{"task":"..."}'
+  - Webhooks:   Gitea, Aether events, etc.
+"""
+
+import asyncio
+import logging
+import uuid
+from datetime import datetime, timezone
+
+from fastapi import APIRouter
+from pydantic import BaseModel
+
+from config import settings
+from context_loader import load_context
+import orchestrator_engine
+
+logger = logging.getLogger(__name__)
+router = APIRouter(prefix="/orchestrate", tags=["orchestrator"])
+
+# ---------------------------------------------------------------------------
+# In-memory job store
+# Jobs are keyed by UUID. For this phase, memory is fine — jobs are short-lived.
+# ---------------------------------------------------------------------------
+
+_jobs: dict[str, dict] = {}
+_jobs_lock = asyncio.Lock()
+
+
+# ---------------------------------------------------------------------------
+# Request / response models
+# ---------------------------------------------------------------------------
+
+class OrchestrateRequest(BaseModel):
+    task: str
+    session_id: str | None = None       # include session history in context
+    tier: int | None = None             # Inara context tier (default from settings)
+    respond_with_claude: bool = True    # False = return Gemini summary only (faster, for cron)
+    include_long: bool = True
+    include_mid: bool = True
+    include_short: bool = True
+
+
+class OrchestrateResponse(BaseModel):
+    job_id: str
+    status: str     # "queued" | "running" | "complete" | "error"
+
+
+class JobStatusResponse(BaseModel):
+    job_id: str
+    status: str
+    task: str
+    created_at: str
+    completed_at: str | None = None
+    response: str | None = None
+    tool_calls: list[dict] | None = None
+    backend: str | None = None
+    gemini_summary: str | None = None
+    error: str | None = None
+
+
+# ---------------------------------------------------------------------------
+# Endpoints
+# ---------------------------------------------------------------------------
+
+@router.post("", response_model=OrchestrateResponse)
+async def orchestrate(req: OrchestrateRequest) -> OrchestrateResponse:
+    """Submit a task to the orchestrator. Returns a job_id to poll."""
+    job_id = str(uuid.uuid4())
+    now = datetime.now(timezone.utc).isoformat()
+
+    job: dict = {
+        "job_id": job_id,
+        "status": "queued",
+        "task": req.task,
+        "created_at": now,
+        "completed_at": None,
+        "response": None,
+        "tool_calls": None,
+        "backend": None,
+        "gemini_summary": None,
+        "error": None,
+    }
+
+    async with _jobs_lock:
+        _jobs[job_id] = job
+
+    # Run in background — caller polls GET /orchestrate/{job_id}
+    asyncio.create_task(_run_job(job_id, req))
+    logger.info("Orchestrator job queued: %s — %.80s", job_id, req.task)
+    return OrchestrateResponse(job_id=job_id, status="queued")
+
+
+@router.get("/{job_id}", response_model=JobStatusResponse)
+async def job_status(job_id: str) -> JobStatusResponse:
+    """Poll the status of an orchestrator job."""
+    async with _jobs_lock:
+        job = _jobs.get(job_id)
+
+    if job is None:
+        from fastapi import HTTPException
+        raise HTTPException(status_code=404, detail=f"Job {job_id} not found")
+
+    return JobStatusResponse(**job)
+
+
+@router.get("", response_model=list[JobStatusResponse])
+async def list_jobs() -> list[JobStatusResponse]:
+    """List all jobs (most recent first). Useful for debugging."""
+    async with _jobs_lock:
+        jobs = sorted(_jobs.values(), key=lambda j: j["created_at"], reverse=True)
+    return [JobStatusResponse(**j) for j in jobs]
+
+
+# ---------------------------------------------------------------------------
+# Background runner
+# ---------------------------------------------------------------------------
+
+async def _run_job(job_id: str, req: OrchestrateRequest) -> None:
+    """Execute the orchestration job and update the job store."""
+    async with _jobs_lock:
+        _jobs[job_id]["status"] = "running"
+
+    try:
+        # Load Inara's system prompt (same as the chat router does)
+        tier = req.tier if req.tier is not None else settings.default_tier
+        system_prompt = load_context(
+            tier,
+            include_long=req.include_long,
+            include_mid=req.include_mid,
+            include_short=req.include_short,
+        )
+
+        # Load session history if a session_id was provided
+        session_messages: list[dict] | None = None
+        if req.session_id:
+            from session_store import load as load_session
+            session_messages = load_session(req.session_id) or None
+
+        result = await orchestrator_engine.run(
+            task=req.task,
+            system_prompt=system_prompt,
+            session_messages=session_messages,
+            respond_with_claude=req.respond_with_claude,
+        )
+
+        now = datetime.now(timezone.utc).isoformat()
+        async with _jobs_lock:
+            _jobs[job_id].update({
+                "status": "complete",
+                "completed_at": now,
+                "response": result.response,
+                "tool_calls": result.tool_calls,
+                "backend": result.backend,
+                "gemini_summary": result.gemini_summary,
+            })
+        logger.info("Orchestrator job complete: %s (%d tool calls)", job_id, len(result.tool_calls or []))
+
+    except Exception as e:
+        logger.exception("Orchestrator job failed: %s", job_id)
+        now = datetime.now(timezone.utc).isoformat()
+        async with _jobs_lock:
+            _jobs[job_id].update({
+                "status": "error",
+                "completed_at": now,
+                "error": str(e),
+            })
--- a/cortex/session_logger.py
+++ b/cortex/session_logger.py
@@ -17,6 +17,6 @@ def log_turn(session_id: str, user_msg: str, assistant_msg: str) -> None:
            f.write(f"# Session Log — {today}\n")
        f.write(
            f"\n### [{timestamp}] `{session_id}`\n"
-            f"**Scott:** {user_msg}\n\n"
-            f"**Inara:** {assistant_msg}\n"
+            f"**{settings.user_name}:** {user_msg}\n\n"
+            f"**{settings.agent_name}:** {assistant_msg}\n"
        )
--- a/cortex/session_store.py
+++ b/cortex/session_store.py
@@ -69,6 +69,15 @@ def save(session_id: str, messages: list[dict]) -> None:
    }, indent=2))


+def delete(session_id: str) -> bool:
+    """Delete a session file. Returns True if it existed and was deleted."""
+    path = _path(session_id)
+    if not path.exists():
+        return False
+    path.unlink()
+    return True
+
+
 def list_all() -> list[dict]:
    d = settings.sessions_path()
    if not d.exists():
--- a/cortex/static/app.js
+++ b/cortex/static/app.js
@@ -9,9 +9,10 @@
        const heightRow     = document.getElementById('height-row');
        const heightSel     = document.getElementById('height-sel');
        const enterToggle   = document.getElementById('enter-toggle');
-        const noteTypeBtnEl = document.getElementById('note-type-btn');
-        const noteBtnEl     = document.getElementById('note-btn');
-        const stopBtn       = document.getElementById('stop');
+        const noteTypeBtnEl  = document.getElementById('note-type-btn');
+        const noteBtnEl      = document.getElementById('note-btn');
+        const agentModeBtnEl = document.getElementById('agent-mode-btn');
+        const stopBtn        = document.getElementById('stop');

        let sessionId        = null;
        let primaryBackend   = 'claude';
@@ -58,6 +59,23 @@
            syncHeight();
        });

+        // ── Agent mode ───────────────────────────────────────────────
+        let agentMode = localStorage.getItem('agentMode') === 'true';
+
+        function updateAgentModeUI() {
+            agentModeBtnEl.classList.toggle('active', agentMode);
+            updateInputPlaceholder();
+            if (agentMode) sendBtn.textContent = 'Run';
+            else if (!noteMode) sendBtn.textContent = 'Send';
+        }
+
+        agentModeBtnEl.addEventListener('click', () => {
+            agentMode = !agentMode;
+            localStorage.setItem('agentMode', agentMode);
+            updateAgentModeUI();
+            inputEl.focus();
+        });
+
        // ── Note mode ────────────────────────────────────────────────
        let noteMode   = false;
        let notePublic = false;
@@ -82,7 +100,7 @@
            } else {
                noteBtnEl.classList.remove('active', 'public');
                noteTypeBtnEl.style.display = 'none';
-                sendBtn.textContent = 'Send';
+                sendBtn.textContent = agentMode ? 'Run' : 'Send';
                inputEl.classList.remove('note-mode', 'public');
            }
            updateInputPlaceholder();
@@ -93,6 +111,10 @@
                inputEl.placeholder = notePublic
                    ? 'Public note — LLM sees this next turn…'
                    : 'Private note — only you see this…';
+            } else if (agentMode) {
+                inputEl.placeholder = ctrlEnterMode
+                    ? 'Task for Inara… (Gemini tool loop — Ctrl+Enter to run)'
+                    : 'Task for Inara… (Gemini tool loop)';
            } else {
                inputEl.placeholder = ctrlEnterMode
                    ? 'Message Inara… (Ctrl+Enter to send)'
@@ -183,6 +205,27 @@
                    `${s.message_count} msgs · ${timeAgo(s.updated)}`
                );
                item.addEventListener('click', () => resumeSession(s.session_id));
+
+                const delBtn = document.createElement('button');
+                delBtn.className = 'session-delete-btn';
+                delBtn.textContent = '×';
+                delBtn.title = 'Delete session';
+                delBtn.addEventListener('click', async (e) => {
+                    e.stopPropagation();
+                    await fetch(`/sessions/${s.session_id}`, { method: 'DELETE' });
+                    if (sessionId === s.session_id) {
+                        sessionId = null;
+                        currentHistory = [];
+                        messagesEl.innerHTML = '';
+                        sessionEl.textContent = '';
+                        addMessage('system', 'Session deleted');
+                    }
+                    const res = await fetch('/sessions');
+                    const data = await res.json();
+                    renderPanel(data.sessions);
+                });
+                item.appendChild(delBtn);
+
                sessionsPanel.appendChild(item);
            }
        }
@@ -596,16 +639,108 @@
            inputEl.focus();
        }

-        sendBtn.addEventListener('click', () => {
-            if (noteMode) addNote(); else sendMessage();
-        });
+        async function sendOrchestrate() {
+            const text = inputEl.value.trim();
+            if (!text || activeController) return;
+
+            inputEl.value = '';
+            syncHeight();
+            sendBtn.style.display = 'none';
+            stopBtn.style.display = 'block';
+            headerEmoji.classList.add('processing');
+
+            activeController = new AbortController();
+
+            addMessage('user', text);
+            scrollToBottom();
+
+            const thinkingDiv = addMessage('assistant thinking', '⚡ working…');
+
+            try {
+                const res = await fetch('/orchestrate', {
+                    method: 'POST',
+                    headers: { 'Content-Type': 'application/json' },
+                    body: JSON.stringify({
+                        task: text,
+                        session_id: sessionId,
+                        tier: currentTier,
+                        include_long: memLong,
+                        include_mid: memMid,
+                        include_short: memShort,
+                    }),
+                    signal: activeController.signal,
+                });
+
+                if (!res.ok) throw new Error(`HTTP ${res.status}`);
+                const { job_id } = await res.json();
+
+                // Poll until complete or stopped
+                let job;
+                while (true) {
+                    if (activeController.signal.aborted) throw new DOMException('Aborted', 'AbortError');
+
+                    await new Promise(r => setTimeout(r, 2000));
+
+                    if (activeController.signal.aborted) throw new DOMException('Aborted', 'AbortError');
+
+                    const pollRes = await fetch(`/orchestrate/${job_id}`, {
+                        signal: activeController.signal,
+                    });
+                    if (!pollRes.ok) throw new Error(`Poll failed: HTTP ${pollRes.status}`);
+                    job = await pollRes.json();
+
+                    const n = job.tool_calls?.length || 0;
+                    if (job.status === 'queued' || job.status === 'running') {
+                        thinkingDiv.textContent = n
+                            ? `⚡ working… (${n} tool${n !== 1 ? 's' : ''} used)`
+                            : '⚡ working…';
+                        continue;
+                    }
+                    break;
+                }
+
+                if (job.status === 'error') throw new Error(job.error || 'Orchestrator failed');
+
+                thinkingDiv.className = 'message assistant';
+                setMessageText(thinkingDiv, 'assistant', job.response || '(no response)');
+
+                const n = job.tool_calls?.length || 0;
+                if (n) {
+                    const names = job.tool_calls.map(t => t.name).join(', ');
+                    addMessage('system', `⚡ ${n} tool call${n !== 1 ? 's' : ''}: ${names}`);
+                }
+
+            } catch (err) {
+                if (err.name === 'AbortError') {
+                    thinkingDiv.className = 'message system';
+                    thinkingDiv.textContent = 'Stopped.';
+                } else {
+                    thinkingDiv.className = 'message error';
+                    thinkingDiv.textContent = `Error: ${err.message}`;
+                }
+            }
+
+            activeController = null;
+            headerEmoji.classList.remove('processing');
+            sendBtn.style.display = 'block';
+            stopBtn.style.display = 'none';
+            inputEl.focus();
+        }
+
+        function dispatchSend() {
+            if (noteMode) addNote();
+            else if (agentMode) sendOrchestrate();
+            else sendMessage();
+        }
+
+        sendBtn.addEventListener('click', dispatchSend);

        inputEl.addEventListener('keydown', (e) => {
            if (e.key === 'Enter') {
                const shouldSend = ctrlEnterMode ? (e.ctrlKey || e.metaKey) : !e.shiftKey;
                if (shouldSend) {
                    e.preventDefault();
-                    if (noteMode) addNote(); else sendMessage();
+                    dispatchSend();
                }
            }
        });
@@ -902,6 +1037,7 @@

        updateTierUI();
        updateMemUI();
+        updateAgentModeUI();

        // ── Init ─────────────────────────────────────────────────────
        updateEnterToggleUI();
--- a/cortex/static/index.html
+++ b/cortex/static/index.html
@@ -137,6 +137,7 @@
            <!-- Note mode controls -->
            <button id="note-type-btn">private</button>
            <button id="note-btn">Note</button>
+            <button id="agent-mode-btn" title="Agent mode — Gemini tool loop + Claude response">Agent</button>
            <button id="send">Send</button>
            <button id="stop">Stop</button>
        </div>
--- a/cortex/static/style.css
+++ b/cortex/static/style.css
@@ -176,10 +176,31 @@
        .session-item:hover { background: var(--bg); }
        .session-item.new { color: var(--accent); justify-content: center; }

+        .session-delete-btn {
+            background: none;
+            border: none;
+            color: var(--muted);
+            font-size: 1.1rem;
+            line-height: 1;
+            padding: 2px 6px;
+            cursor: pointer;
+            border-radius: 3px;
+            flex-shrink: 0;
+            transition: color 0.15s;
+            min-width: 28px;
+            text-align: center;
+        }
+        .session-delete-btn:hover { color: #e06c75; }
+
        .session-id {
            font-family: monospace;
            font-size: 0.85rem;
            color: var(--text);
+            flex: 1;
+            min-width: 0;
+            overflow: hidden;
+            text-overflow: ellipsis;
+            white-space: nowrap;
        }

        .session-meta {
@@ -1029,6 +1050,28 @@
            #note-type-btn { padding: 6px 10px; }
        }

+        /* ── Touch devices — no hover capability ─────────────────── */
+        /* Always show message controls; make tap targets finger-sized */
+        @media (hover: none) {
+            .msg-actions {
+                opacity: 1;
+                padding: 4px 2px 2px;
+            }
+
+            .msg-act-btn {
+                font-size: 0.78rem;
+                padding: 6px 12px;
+                min-height: 36px;
+            }
+
+            .session-delete-btn {
+                font-size: 1.3rem;
+                padding: 4px 8px;
+                min-width: 36px;
+                min-height: 36px;
+            }
+        }
+
        @media (max-width: 380px) {
            header .name { font-size: 1rem; }
            .header-emoji { font-size: 1.3rem; }
--- a/cortex/tools/__init__.py
+++ b/cortex/tools/__init__.py
@@ -0,0 +1,223 @@
+"""
+Orchestrator tool registry.
+
+Each tool has two parts:
+  1. A Gemini FunctionDeclaration — tells the model what the tool does and what args it takes
+  2. A Python async callable — the actual implementation
+
+To add a new tool:
+  1. Implement it in a tools/<domain>.py module
+  2. Import it here and add the callable to _CALLABLES under the tool's name
+  3. Add a FunctionDeclaration below and include it in TOOL_DECLARATIONS
+
+IMPORTANT: These tools are separate from the ae_* MCP tools used by the fleet agents.
+           Do not modify the ae_* MCP server to support orchestrator needs.
+"""
+
+from google.genai import types
+from tools.web import search as _web_search
+from tools.ae_knowledge import journal_search as _ae_journal_search
+from tools.ae_knowledge import journal_entry_create as _ae_journal_entry_create
+from tools.ae_tasks import task_list as _ae_task_list
+from tools.files import file_read as _file_read
+from tools.system import claude_allow_dir as _claude_allow_dir
+
+
+# ---------------------------------------------------------------------------
+# Gemini function declarations
+# ---------------------------------------------------------------------------
+
+_web_search_declaration = types.FunctionDeclaration(
+    name="web_search",
+    description=(
+        "Search the web for current information. Use this when you need up-to-date "
+        "facts, news, documentation, or anything not in your training data."
+    ),
+    parameters=types.Schema(
+        type=types.Type.OBJECT,
+        properties={
+            "query": types.Schema(
+                type=types.Type.STRING,
+                description="The search query string",
+            ),
+            "max_results": types.Schema(
+                type=types.Type.INTEGER,
+                description="Number of results to return (default 5, max 10)",
+            ),
+        },
+        required=["query"],
+    ),
+)
+
+_ae_journal_search_declaration = types.FunctionDeclaration(
+    name="ae_journal_search",
+    description=(
+        "Search the Aether Journals knowledge base by keyword. "
+        "Use this to look up notes, documentation, meeting summaries, or any saved knowledge. "
+        "Always search before creating a new entry to avoid duplicates."
+    ),
+    parameters=types.Schema(
+        type=types.Type.OBJECT,
+        properties={
+            "query": types.Schema(
+                type=types.Type.STRING,
+                description="Keyword or phrase to search for",
+            ),
+            "journal_id": types.Schema(
+                type=types.Type.STRING,
+                description=(
+                    "Optional: scope search to a specific journal by its id_random. "
+                    "Omit to search all journals."
+                ),
+            ),
+            "max_results": types.Schema(
+                type=types.Type.INTEGER,
+                description="Maximum number of entries to return (default 10)",
+            ),
+        },
+        required=["query"],
+    ),
+)
+
+_ae_journal_entry_create_declaration = types.FunctionDeclaration(
+    name="ae_journal_entry_create",
+    description=(
+        "Create a new entry in an Aether Journal. "
+        "Use this to save notes, summaries, or any content the user wants to store. "
+        "Always call ae_journal_search first to check for existing entries on the same topic."
+    ),
+    parameters=types.Schema(
+        type=types.Type.OBJECT,
+        properties={
+            "journal_id": types.Schema(
+                type=types.Type.STRING,
+                description=(
+                    "The id_random of the target journal. "
+                    "Ask the user which journal to write to if not specified."
+                ),
+            ),
+            "title": types.Schema(
+                type=types.Type.STRING,
+                description="Entry title",
+            ),
+            "content": types.Schema(
+                type=types.Type.STRING,
+                description="Full entry content (markdown supported)",
+            ),
+            "summary": types.Schema(
+                type=types.Type.STRING,
+                description="Optional short summary (1-2 sentences)",
+            ),
+            "tags": types.Schema(
+                type=types.Type.STRING,
+                description="Optional comma-separated tags (e.g. 'wireguard, networking, homelab')",
+            ),
+        },
+        required=["journal_id", "title", "content"],
+    ),
+)
+
+_ae_task_list_declaration = types.FunctionDeclaration(
+    name="ae_task_list",
+    description=(
+        "List tasks from the agents_sync Kanban board (todo and in-progress). "
+        "Use this when asked about current work, pending tasks, or project status."
+    ),
+    parameters=types.Schema(
+        type=types.Type.OBJECT,
+        properties={
+            "include_done": types.Schema(
+                type=types.Type.BOOLEAN,
+                description="If true, also include completed tasks (default false)",
+            ),
+        },
+    ),
+)
+
+_file_read_declaration = types.FunctionDeclaration(
+    name="file_read",
+    description=(
+        "Read a local file and return its contents. "
+        "Allowed directories: ~/agents_sync/, ~/OSIT_dev/, ~/DgrZone_Nextcloud/, ~/OSIT_Nextcloud/. "
+        "Use this to read documentation, notes, CLAUDE.md files, or config references. "
+        "If given a directory path, returns a directory listing instead."
+    ),
+    parameters=types.Schema(
+        type=types.Type.OBJECT,
+        properties={
+            "path": types.Schema(
+                type=types.Type.STRING,
+                description=(
+                    "Absolute or home-relative path to the file "
+                    "(e.g. ~/agents_sync/CLAUDE.md or /home/scott/agents_sync/tasks/01_todo/)"
+                ),
+            ),
+            "max_lines": types.Schema(
+                type=types.Type.INTEGER,
+                description="Optional line limit (default 500)",
+            ),
+        },
+        required=["path"],
+    ),
+)
+
+
+# ---------------------------------------------------------------------------
+# Registry: maps tool name → async callable
+# ---------------------------------------------------------------------------
+
+_CALLABLES: dict[str, callable] = {
+    "web_search": _web_search,
+    "ae_journal_search": _ae_journal_search,
+    "ae_journal_entry_create": _ae_journal_entry_create,
+    "ae_task_list": _ae_task_list,
+    "file_read": _file_read,
+    "claude_allow_dir": _claude_allow_dir,
+}
+
+_claude_allow_dir_declaration = types.FunctionDeclaration(
+    name="claude_allow_dir",
+    description=(
+        "Add a directory to Claude Code's auto-allow list so Claude can read or write "
+        "files there without prompting. Edits ~/.claude/settings.json on the local machine. "
+        "Use this when Claude is silently hanging or being blocked from accessing a directory. "
+        "Changes take effect in the next Claude Code session."
+    ),
+    parameters=types.Schema(
+        type=types.Type.OBJECT,
+        properties={
+            "path": types.Schema(
+                type=types.Type.STRING,
+                description=(
+                    "Absolute or home-relative path to the directory "
+                    "(e.g. ~/OSIT_dev/aether_api_fastapi or /home/scott/agents_sync)"
+                ),
+            ),
+            "mode": types.Schema(
+                type=types.Type.STRING,
+                description="Permission mode: 'r' (read-only), 'w' (write-only), or 'rw' (both). Default: rw",
+            ),
+        },
+        required=["path"],
+    ),
+)
+
+# Gemini Tool object — pass this to GenerateContentConfig
+TOOL_DECLARATIONS = [
+    types.Tool(function_declarations=[
+        _web_search_declaration,
+        _ae_journal_search_declaration,
+        _ae_journal_entry_create_declaration,
+        _ae_task_list_declaration,
+        _file_read_declaration,
+        _claude_allow_dir_declaration,
+    ])
+]
+
+
+async def call_tool(name: str, args: dict) -> str:
+    """Dispatch a tool call by name. Returns result as a string."""
+    fn = _CALLABLES.get(name)
+    if fn is None:
+        return f"Unknown tool: {name}"
+    return await fn(**args)
--- a/cortex/tools/ae_knowledge.py
+++ b/cortex/tools/ae_knowledge.py
@@ -0,0 +1,177 @@
+"""
+Aether Platform knowledge tools — journal search and entry creation.
+
+These tools give the orchestrator read/write access to the AE Journals module,
+which serves as the primary long-term knowledge base.
+
+Auth: x-aether-api-key + x-account-id headers (same pattern as agents_sync scripts).
+API:  V3 CRUD — POST /v3/crud/journal_entry/search, POST /v3/crud/journal/{id}/journal_entry/
+"""
+
+import asyncio
+import logging
+from config import settings
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Shared helpers
+# ---------------------------------------------------------------------------
+
+def _headers() -> dict:
+    return {
+        "x-aether-api-key": settings.ae_api_key,
+        "x-account-id": settings.ae_account_id,
+        "Content-Type": "application/json",
+    }
+
+
+def _check_config() -> str | None:
+    """Return an error string if AE API is not configured, else None."""
+    if not settings.ae_api_key or not settings.ae_account_id:
+        return (
+            "AE API not configured. Set AE_API_KEY and AE_ACCOUNT_ID in .env. "
+            "Values are the same as agents_sync/mcp/.env."
+        )
+    return None
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_search
+# ---------------------------------------------------------------------------
+
+async def journal_search(query: str, journal_id: str | None = None, max_results: int = 10) -> str:
+    """Search AE Journal entries by keyword.
+
+    Searches across the default_qry_str field (title + content excerpt).
+    Optionally scoped to a specific journal by journal_id (id_random).
+    Returns a markdown-formatted list of matching entries.
+    """
+    err = _check_config()
+    if err:
+        return err
+
+    return await asyncio.to_thread(_sync_journal_search, query, journal_id, max_results)
+
+
+def _sync_journal_search(query: str, journal_id: str | None, max_results: int) -> str:
+    import requests
+
+    url = f"{settings.ae_api_url}/v3/crud/journal_entry/search"
+    search_body = {
+        "and_filters": [
+            {"field": "default_qry_str", "op": "icontains", "value": query}
+        ],
+        "page_size": max_results,
+    }
+
+    params = {}
+    if journal_id:
+        params["for_obj_type"] = "journal"
+        params["for_obj_id"] = journal_id
+
+    try:
+        resp = requests.post(
+            url,
+            headers=_headers(),
+            params=params,
+            json=search_body,
+            timeout=settings.ae_api_timeout,
+        )
+        resp.raise_for_status()
+        data = resp.json()
+    except Exception as e:
+        logger.warning("ae_journal_search failed: %s", e)
+        return f"Journal search error: {e}"
+
+    entries = data.get("data", [])
+    if not entries:
+        return f"No journal entries found matching: {query}"
+
+    lines = [f"Journal entries matching **{query}** ({len(entries)} result(s)):\n"]
+    for entry in entries:
+        title = entry.get("name") or "(untitled)"
+        entry_id = entry.get("id_random", "")
+        journal_name = entry.get("journal_name") or entry.get("parent_name") or ""
+        summary = entry.get("summary") or ""
+        content_preview = (entry.get("content") or "")[:200].replace("\n", " ")
+
+        header = f"**{title}**"
+        if journal_name:
+            header += f" ({journal_name})"
+        if entry_id:
+            header += f" — id: `{entry_id}`"
+
+        lines.append(header)
+        if summary:
+            lines.append(f"  Summary: {summary}")
+        if content_preview:
+            lines.append(f"  {content_preview}…")
+        lines.append("")
+
+    return "\n".join(lines).strip()
+
+
+# ---------------------------------------------------------------------------
+# Tool: ae_journal_entry_create
+# ---------------------------------------------------------------------------
+
+async def journal_entry_create(
+    journal_id: str,
+    title: str,
+    content: str,
+    summary: str = "",
+    tags: str = "",
+) -> str:
+    """Create a new entry in an AE Journal.
+
+    Args:
+        journal_id: The id_random of the target journal (use ae_journal_search to find it,
+                    or ask the user which journal to write to).
+        title:      Entry title (name field).
+        content:    Full entry content (markdown supported).
+        summary:    Optional short summary (1-2 sentences).
+        tags:       Optional comma-separated tags.
+
+    Returns a confirmation with the new entry's id_random, or an error message.
+    """
+    err = _check_config()
+    if err:
+        return err
+
+    return await asyncio.to_thread(
+        _sync_journal_entry_create, journal_id, title, content, summary, tags
+    )
+
+
+def _sync_journal_entry_create(
+    journal_id: str, title: str, content: str, summary: str, tags: str
+) -> str:
+    import requests
+
+    url = f"{settings.ae_api_url}/v3/crud/journal/{journal_id}/journal_entry/"
+    data: dict = {"name": title, "content": content}
+    if summary:
+        data["summary"] = summary
+    if tags:
+        data["tags"] = [t.strip() for t in tags.split(",") if t.strip()]
+
+    try:
+        resp = requests.post(
+            url,
+            headers=_headers(),
+            json=data,
+            timeout=settings.ae_api_timeout,
+        )
+        resp.raise_for_status()
+        result = resp.json()
+    except Exception as e:
+        logger.warning("ae_journal_entry_create failed: %s", e)
+        return f"Journal entry creation error: {e}"
+
+    entry_id = (
+        (result.get("data") or {}).get("id_random")
+        or result.get("id_random")
+        or "unknown"
+    )
+    return f"Journal entry created. id: `{entry_id}`, title: \"{title}\", journal: `{journal_id}`"
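Note on `ae_knowledge.py`: the payload logic in `_sync_journal_entry_create` can be exercised without touching the network. The sketch below copies that logic into a standalone helper; `build_entry_payload` is a hypothetical name used only for illustration and is not part of the commit.

```python
def build_entry_payload(title: str, content: str, summary: str = "", tags: str = "") -> dict:
    """Build the JSON body for a journal entry (hypothetical helper, no HTTP call)."""
    data: dict = {"name": title, "content": content}
    if summary:
        data["summary"] = summary
    if tags:
        # Split on commas and drop empty fragments such as a trailing ","
        data["tags"] = [t.strip() for t in tags.split(",") if t.strip()]
    return data


payload = build_entry_payload("WireGuard notes", "# Setup\n...", tags="note, wireguard, ,")
print(payload["tags"])
```

Keeping the payload construction separate from the `requests.post` call would make the tag parsing easy to unit test.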
--- a/cortex/tools/ae_tasks.py
+++ b/cortex/tools/ae_tasks.py
@@ -0,0 +1,100 @@
+"""
+Aether task list tool — reads the agents_sync Kanban board.
+
+Reads task JSON files directly from the agents_sync filesystem rather than
+making an HTTP call, since the tasks directory is always locally available
+(synced via Syncthing). This avoids needing a separate API endpoint for tasks.
+
+Structure:
+  agents_sync/tasks/01_todo/       — pending tasks
+  agents_sync/tasks/02_in_progress/ — active tasks
+  agents_sync/tasks/03_done/       — completed tasks (not included by default)
+"""
+
+import asyncio
+import json
+import logging
+from pathlib import Path
+
+logger = logging.getLogger(__name__)
+
+# Resolved at import time — agents_sync is always at ~/agents_sync on this machine.
+# If the path doesn't exist the tool returns a helpful error rather than crashing.
+_AGENTS_SYNC = Path.home() / "agents_sync"
+_TASKS_ROOT = _AGENTS_SYNC / "tasks"
+
+
+async def task_list(include_done: bool = False) -> str:
+    """List tasks from the agents_sync Kanban board.
+
+    Reads the todo and in_progress buckets (and optionally done).
+    Returns a markdown summary grouped by status.
+
+    Args:
+        include_done: If True, also include completed tasks (can be noisy).
+    """
+    return await asyncio.to_thread(_sync_task_list, include_done)
+
+
+def _sync_task_list(include_done: bool) -> str:
+    if not _TASKS_ROOT.exists():
+        return f"Task directory not found: {_TASKS_ROOT}"
+
+    buckets = [
+        ("01_todo", "Todo"),
+        ("02_in_progress", "In Progress"),
+    ]
+    if include_done:
+        buckets.append(("03_done", "Done"))
+
+    sections: list[str] = []
+    total = 0
+
+    for dir_name, label in buckets:
+        bucket_dir = _TASKS_ROOT / dir_name
+        if not bucket_dir.exists():
+            continue
+
+        tasks = _read_bucket(bucket_dir)
+        total += len(tasks)
+        if not tasks:
+            continue
+
+        lines = [f"## {label} ({len(tasks)})\n"]
+        for task in tasks:
+            title = task.get("title") or task.get("name") or "(untitled)"
+            assigned = task.get("assigned_to") or ""
+            task_id = task.get("id") or ""
+            desc = task.get("description") or ""
+
+            header = f"- **{title}**"
+            if assigned:
+                header += f" (assigned: {assigned})"
+            if task_id:
+                header += f" — `{task_id}`"
+            lines.append(header)
+
+            if desc:
+                # First sentence / 120 chars of description
+                short = desc.split(".")[0][:120]
+                lines.append(f"  {short}")
+
+        sections.append("\n".join(lines))
+
+    if not sections:
+        return "No tasks found on the Kanban board."
+
+    header_line = f"# Kanban Board — {total} task(s)\n"
+    return header_line + "\n\n".join(sections)
+
+
+def _read_bucket(bucket_dir: Path) -> list[dict]:
+    """Read and parse all JSON task files in a bucket directory."""
+    tasks = []
+    for path in sorted(bucket_dir.glob("*.json")):
+        try:
+            data = json.loads(path.read_text())
+            tasks.append(data)
+        except Exception as e:
+            logger.warning("Failed to read task file %s: %s", path, e)
+    return tasks
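Note on `ae_tasks.py`: a minimal sketch of the bucket-reading behaviour, run against a temporary directory rather than `~/agents_sync`. The `isinstance` guard is an addition that is not in the committed code; it assumes a task file could contain valid JSON that is not an object, which would otherwise fail later at `task.get(...)`.

```python
import json
import tempfile
from pathlib import Path


def read_bucket(bucket_dir: Path) -> list[dict]:
    """Parse every *.json file in a bucket, skipping unreadable or non-object files."""
    tasks = []
    for path in sorted(bucket_dir.glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # malformed or unreadable file: skip it, keep the rest
        if isinstance(data, dict):  # extra guard, not in the committed code
            tasks.append(data)
    return tasks


with tempfile.TemporaryDirectory() as tmp:
    bucket = Path(tmp) / "01_todo"
    bucket.mkdir()
    (bucket / "a.json").write_text(json.dumps({"title": "Verify Gitea SSH", "id": "t1"}))
    (bucket / "b.json").write_text("{not valid json")
    (bucket / "c.json").write_text(json.dumps(["not", "a", "task"]))
    found = read_bucket(bucket)

print([t["title"] for t in found])
```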
--- a/cortex/tools/files.py
+++ b/cortex/tools/files.py
@@ -0,0 +1,112 @@
+"""
+File read tool — restricted to known-safe directory roots.
+
+Lets the orchestrator read local files (documentation, notes, config references)
+without exposing arbitrary filesystem access. All paths are resolved and checked
+against an allowlist of roots before any read is performed.
+"""
+
+import asyncio
+import logging
+from pathlib import Path
+
+logger = logging.getLogger(__name__)
+
+# Directories the orchestrator is allowed to read from.
+# Paths are resolved (symlinks followed, ~ expanded) at import time.
+_ALLOWED_ROOTS: list[Path] = [
+    (Path.home() / "agents_sync").resolve(),
+    (Path.home() / "OSIT_dev").resolve(),
+    (Path.home() / "DgrZone_Nextcloud").resolve(),
+    (Path.home() / "OSIT_Nextcloud").resolve(),
+]
+
+# Hard cap on file size to prevent accidental context blowout
+_MAX_BYTES = 50_000   # ~50 KB
+_MAX_LINES = 500
+
+
+async def file_read(path: str, max_lines: int | None = None) -> str:
+    """Read a local file and return its contents as a string.
+
+    Only files within allowed directories can be read:
+      ~/agents_sync/, ~/OSIT_dev/, ~/DgrZone_Nextcloud/, ~/OSIT_Nextcloud/
+
+    Args:
+        path:      Absolute or home-relative path to the file (e.g. ~/agents_sync/CLAUDE.md).
+        max_lines: Optional line limit (default 500, hard cap). Use for large files.
+
+    Returns the file contents (truncated if over the size limit), or an error message.
+    """
+    return await asyncio.to_thread(_sync_file_read, path, max_lines)
+
+
+def _sync_file_read(path: str, max_lines: int | None) -> str:
+    # Expand ~ and resolve to absolute path
+    try:
+        resolved = Path(path).expanduser().resolve()
+    except Exception as e:
+        return f"Invalid path: {e}"
+
+    # Security check — must be under an allowed root
+    if not _is_allowed(resolved):
+        allowed_str = ", ".join(str(r) for r in _ALLOWED_ROOTS)
+        return (
+            f"Access denied: {resolved}\n"
+            f"Allowed directories: {allowed_str}"
+        )
+
+    if not resolved.exists():
+        return f"File not found: {resolved}"
+
+    if not resolved.is_file():
+        # If it's a directory, list its contents instead
+        try:
+            entries = sorted(resolved.iterdir())
+            names = [e.name + ("/" if e.is_dir() else "") for e in entries[:100]]
+            return f"Directory listing for {resolved}:\n" + "\n".join(names)
+        except Exception as e:
+            return f"Cannot list directory: {e}"
+
+    # Read the file
+    try:
+        raw = resolved.read_bytes()
+    except Exception as e:
+        return f"Read error: {e}"
+
+    # Binary files
+    try:
+        text = raw.decode("utf-8")
+    except UnicodeDecodeError:
+        return f"Binary file (not readable as text): {resolved}  [{len(raw)} bytes]"
+
+    # Apply line limit
+    limit = min(max_lines or _MAX_LINES, _MAX_LINES)
+    lines = text.splitlines()
+    truncated = False
+
+    if len(lines) > limit:
+        lines = lines[:limit]
+        truncated = True
+
+    # Apply size cap as a final safety net (counts characters, which approximates bytes)
+    result = "\n".join(lines)
+    if len(result) > _MAX_BYTES:
+        result = result[:_MAX_BYTES]
+        truncated = True
+
+    if truncated:
+        result += f"\n\n… [truncated — file has {len(text.splitlines())} lines total]"
+
+    return result
+
+
+def _is_allowed(resolved: Path) -> bool:
+    """Check that resolved path is under one of the allowed roots."""
+    for root in _ALLOWED_ROOTS:
+        try:
+            resolved.relative_to(root)
+            return True
+        except ValueError:
+            continue
+    return False
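Note on `files.py`: the allowlist check only works if both sides of the comparison are resolved, since a requested path is resolved before it is tested. A minimal sketch of that idea follows, using a temporary directory in place of the real roots; `is_under` is a hypothetical name, and `Path.is_relative_to` assumes Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def is_under(candidate: Path, roots: list[Path]) -> bool:
    """True if the resolved candidate sits inside any resolved root."""
    resolved = candidate.expanduser().resolve()
    return any(resolved.is_relative_to(root.resolve()) for root in roots)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "allowed"
    root.mkdir()
    (root / "notes.md").write_text("hello")
    inside = is_under(root / "notes.md", [root])
    # ".." is collapsed by resolve(), so this lands outside the root
    escaped = is_under(root / ".." / "secret.txt", [root])

print(inside, escaped)
```

Resolving before comparing is what rejects `..` traversal and symlinks that point outside the allowed roots.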
--- a/cortex/tools/system.py
+++ b/cortex/tools/system.py
@@ -0,0 +1,44 @@
+"""
+System tools — local machine operations.
+
+These tools affect the host system directly. Use with care.
+"""
+
+import asyncio
+import logging
+
+logger = logging.getLogger(__name__)
+
+ALLOW_SCRIPT = "/home/scott/.local/bin/claude-allow-dir"
+
+
+async def claude_allow_dir(path: str, mode: str = "rw") -> str:
+    """Add Read/Edit allow rules to ~/.claude/settings.json for a directory.
+
+    Calls the claude-allow-dir script, which edits settings.json directly.
+    Changes take effect in the next Claude Code session (or after /hooks reload).
+    """
+    if mode not in ("r", "w", "rw"):
+        return f"Error: mode must be r, w, or rw (got '{mode}')"
+
+    try:
+        proc = await asyncio.create_subprocess_exec(
+            "python3", ALLOW_SCRIPT, path, mode,
+            stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE,
+        )
+        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=10)
+        output = stdout.decode().strip()
+        err = stderr.decode().strip()
+
+        if proc.returncode != 0:
+            logger.warning("claude-allow-dir failed (rc=%d): %s", proc.returncode, err)
+            return f"Failed (exit {proc.returncode}): {err or output}"
+
+        return output or "Done."
+
+    except asyncio.TimeoutError:
+        proc.kill()  # don't leave the script running after the timeout
+        return "Error: script timed out"
+    except Exception as e:
+        logger.error("claude_allow_dir error: %s", e)
+        return f"Error: {e}"
--- a/cortex/tools/web.py
+++ b/cortex/tools/web.py
@@ -0,0 +1,50 @@
+"""
+Web search tool — DuckDuckGo backend.
+
+Uses the ddgs library (the renamed duckduckgo-search package). Set DDG_API_KEY in .env for a paid account
+(higher rate limits). The free unauthenticated tier works for moderate usage.
+"""
+
+import asyncio
+import logging
+from config import settings
+
+logger = logging.getLogger(__name__)
+
+
+async def search(query: str, max_results: int | None = None) -> str:
+    """Search DuckDuckGo and return results as a formatted string.
+
+    Returns a markdown-formatted list of results: title, URL, and snippet.
+    The orchestrator includes this in the context it passes to Claude.
+    """
+    n = min(max_results or settings.ddg_max_results, 10)
+    results = await asyncio.to_thread(_sync_search, query, n)
+    if not results:
+        return f"No results found for: {query}"
+
+    lines = [f"Search results for: **{query}**\n"]
+    for i, r in enumerate(results, 1):
+        lines.append(f"{i}. [{r['title']}]({r['href']})")
+        if r.get("body"):
+            lines.append(f"   {r['body']}")
+        lines.append("")
+
+    return "\n".join(lines).strip()
+
+
+def _sync_search(query: str, max_results: int) -> list[dict]:
+    """Synchronous DuckDuckGo search — run via asyncio.to_thread."""
+    from ddgs import DDGS
+
+    kwargs = {}
+    if settings.ddg_api_key:
+        # Paid account — pass token for higher rate limits
+        kwargs["headers"] = {"Authorization": f"Bearer {settings.ddg_api_key}"}
+
+    try:
+        with DDGS(**kwargs) as ddgs:
+            return list(ddgs.text(query, max_results=max_results))
+    except Exception as e:
+        logger.warning("DuckDuckGo search error: %s", e)
+        return []
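Note on `web.py`: the markdown formatting in `search` can be checked offline by separating it from the network call. The sketch below uses a hypothetical `format_results` helper and made-up sample results; the `.get` fallbacks for `title` and `href` are an addition, assuming a result could arrive without those keys.

```python
def format_results(query: str, results: list[dict]) -> str:
    """Render search results as a numbered markdown list (hypothetical helper)."""
    if not results:
        return f"No results found for: {query}"

    lines = [f"Search results for: **{query}**\n"]
    for i, r in enumerate(results, 1):
        title = r.get("title") or "(untitled)"  # fallback, not in the committed code
        href = r.get("href") or ""
        lines.append(f"{i}. [{title}]({href})")
        if r.get("body"):
            lines.append(f"   {r['body']}")
        lines.append("")
    return "\n".join(lines).strip()


# Illustrative data only, not real search output
sample = [
    {"title": "WireGuard quick start", "href": "https://example.com/wg", "body": "Install and configure."},
    {"href": "https://example.com/untitled"},
]
text = format_results("wireguard", sample)
print(text)
```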
--- a/documentation/ARCH__Intelligence_Layer.md
+++ b/documentation/ARCH__Intelligence_Layer.md
@@ -0,0 +1,306 @@
+# Architecture: Intelligence Layer
+
+**Status:** Design phase — not yet implemented
+**Last updated:** 2026-03-18
+
+This document captures the architectural thinking behind expanding Cortex from a smart dispatcher into a genuine intelligence layer: capable of using tools, coordinating specialist agents, and managing a personal knowledge base.
+
+---
+
+## Overview
+
+Cortex currently dispatches chat messages to LLM CLI backends and returns the response. The Intelligence Layer adds three major capabilities on top of that foundation:
+
+1. **Orchestrator/Responder** — Gemini handles tool use and planning; Claude handles the user-facing response
+2. **Dev Agent Pipeline** — Specialist agents implement code changes; a supervisor checks the work
+3. **Knowledge Layer** — AE Journals becomes the primary knowledge base; agents can read and write it
+
+These are independent tracks that share the same trigger layer and can be built incrementally.
+
+---
+
+## 1. Orchestrator / Responder Pattern
+
+### The Problem
+
+Claude CLI (via Pro subscription) doesn't expose direct API tool-calling. Gemini API (free tier) does. But Claude produces higher-quality user-facing prose and reasoning. The solution is to use each model for what it does best.
+
+### The Pattern
+
+```
+User message
+    ↓
+Orchestrator (Gemini API)
+    • interprets intent
+    • decides which tools to call
+    • executes tool loop (ReAct: reason → act → observe → repeat)
+    • assembles enriched context + tool results
+    ↓
+Responder (Claude CLI)
+    • receives enriched context
+    • writes the user-facing response
+    ↓
+User
+```
+
+For **direct chat** (no tools needed), the orchestrator is bypassed entirely — message goes straight to Claude. The orchestrator only activates when tools are required or when explicitly invoked (e.g., a background task).
+
+### Why Gemini API (not CLI)?
+
+- Gemini CLI is a subprocess; function calling via subprocess is fragile
+- Gemini API (`google-generativeai` SDK) has native structured tool-calling
+- Free tier (Gemini 2.0 Flash) handles orchestration load without cost
+- Authenticates with a long-lived API key (`GEMINI_API_KEY`), so there is no token-expiry problem
+
+### Tool Strategy
+
+Tools for the orchestrator are **separate** from the existing `ae_*` MCP tools. The ae_* tools are stable and used by existing agents — do not modify them.
+
+New orchestrator tools are Python functions wrapped in Gemini function declarations:
+
+| Tool | What it does | Implementation |
+|---|---|---|
+| `web_search` | DuckDuckGo search | `ddgs` library (formerly `duckduckgo-search`) |
+| `ae_journal_search` | Search AE Journals via V3 API | HTTP to AE API |
+| `ae_journal_entry_create` | Write a new journal entry | HTTP to AE API |
+| `ae_task_list` | Read Kanban tasks | HTTP to AE API or agents_sync file |
+| `file_read` | Read a file from known safe paths | Python `pathlib` |
+| `gitea_api` | Query Gitea repos, issues, PRs | Gitea REST API |
+
+Tools are registered in `cortex/tools/` (one file per domain group).
+
+### Implementation Path
+
+```
+cortex/
+  tools/
+    __init__.py          — tool registry
+    web.py               — web_search
+    ae_knowledge.py      — ae_journal_* tools
+    ae_tasks.py          — task tools
+    gitea.py             — Gitea API tools
+  routers/
+    orchestrator.py      — POST /orchestrate, GET /orchestrate/{job_id}
+  orchestrator_engine.py — Gemini tool loop + Claude handoff
+```
+
+Endpoint contract:
+
+```
+POST /orchestrate
+{
+  "task": "What tasks are due this week and summarize my notes on X topic",
+  "session_id": "optional — if part of an ongoing conversation",
+  "respond_with_claude": true   // false = return Gemini's assembled context only
+}
+
+→ { "job_id": "uuid", "status": "queued" }
+
+GET /orchestrate/{job_id}
+→ { "status": "complete", "result": "...", "tool_calls": [...] }
+```
+
+---
+
+## 2. Trigger Layer
+
+All three capabilities (chat, orchestration, dev agents) share the same trigger layer:
+
+```
+┌────────────────────────────────────────────────┐
+│  TRIGGERS                                      │
+│                                                │
+│  Chat UI  →  POST /chat  (existing)            │
+│  Cron     →  POST /orchestrate  (new)          │
+│  Gitea    →  POST /webhook/gitea  (new)        │
+│  NC Talk  →  POST /webhook/nextcloud  (exists) │
+│  Manual   →  CLI / curl for debugging          │
+└────────────────────────────────────────────────┘
+```
+
+Cron trigger example (from existing cron infrastructure):
+
+```bash
+curl -X POST http://localhost:8000/orchestrate \
+  -H "Content-Type: application/json" \
+  -d '{"task": "Check for overdue Kanban tasks and notify via NC Talk"}'
+```
+
+This means the same orchestrator endpoint is usable from chat, crons, and webhooks without any special cases.
+
+---
+
+## 3. Dev Agent Pipeline
+
+### The Goal
+
+Accept a plain-English task like *"Fix the bug where X, add a test for it"* and produce:
+- A working code change
+- Passing syntax/type checks
+- A summary of what changed and what still needs human review
+- A commit ready to push (pending approval)
+
+### Architecture
+
+```
+Task request (chat / Gitea issue / Kanban)
+    ↓
+Orchestrator
+    • reads relevant files (context gathering)
+    • routes to correct specialist
+    ↓
+Specialist Agent (Claude CLI in project directory)
+    • implements the change
+    • runs self-check: py_compile / svelte-check
+    ↓
+Supervisor Agent
+    • reviews the diff
+    • runs test suite
+    • returns: PASS / NEEDS_REVIEW / FAIL + reason
+    ↓
+Human approval gate
+    • summary shown in Cortex UI or NC Talk
+    • user approves → commit + optional push
+    • user rejects → feedback goes back to specialist
+```
+
+### Specialist Agents
+
+Two initial specialists, both using Claude CLI:
+
+**Frontend specialist** (working dir: `~/OSIT_dev/aether_app_sveltekit/`):
+- Reads `documentation/TODO__Agents.md` and `CLAUDE.md` before acting
+- Runs `npx svelte-check` after every change — no exceptions
+- Atomic commits (one component or fix per commit)
+
+**Backend specialist** (working dir: `~/OSIT_dev/aether_api_fastapi/`):
+- Reads `documentation/TODO__Agents.md` and `CLAUDE.md` before acting
+- Runs `python3 -m py_compile` after every file edit
+- Runs unit tests before declaring done
+- Flags E2E tests that need human review
+
+### Supervisor Agent
+
+The supervisor is a separate Claude invocation that receives:
+- The diff of all changed files
+- Stdout/stderr from all checks that were run
+- The original task description
+
+It returns a structured assessment:
+
+```json
+{
+  "verdict": "PASS | NEEDS_REVIEW | FAIL",
+  "checks_passed": ["py_compile", "unit_tests"],
+  "checks_failed": [],
+  "review_notes": "E2E tests not run — touch auth router, recommend manual check",
+  "commit_message": "fix: correct session token validation in auth middleware"
+}
+```
+
+### Gitea Integration
+
+- **Gitea webhooks → Cortex:** Push/PR events trigger supervisor review automatically
+- **Gitea Actions:** Run `py_compile`/`svelte-check` on every push (simple CI, no custom runner)
+- **Cortex → Gitea:** After human approval, supervisor calls Gitea API to create PR or push
+
+Gitea Actions are simpler than they sound — a `.gitea/workflows/check.yml` is just a YAML file that runs shell commands on push. No external CI infrastructure needed.
+
+---
+
+## 4. Knowledge Layer
+
+### The Goal
+
+AE Journals becomes the primary source of truth for personal and business knowledge. Notes, documentation, and logs that currently live scattered across markdown files get organized into Journals with proper structure, search, and agent-accessible read/write.
+
+### Import Strategy
+
+1. **Don't bulk-import blindly.** The orchestrator searches AE Journals before creating anything (deduplication).
+2. **Chunk by section.** A large markdown file becomes multiple journal entries — one per H2 section.
+3. **Preserve provenance.** Each imported entry includes source path, import date, and original file date in its `data_json` or notes.
+4. **Tag intelligently.** Tags come from: frontmatter, filename keywords, directory path, and content analysis.
+
+### Source Priority
+
+| Source | Priority | Notes |
+|---|---|---|
+| `~/DgrZone_Nextcloud/` | High | Personal notes, projects |
+| `~/OSIT_Nextcloud/` | High | Business docs |
+| `~/agents_sync/aether/docs/` | Medium | Platform specs (already structured) |
+| OpenClaw session logs | Low | Historical, lots of noise |
+
+### Agent Workflow
+
+```
+"Summarize my notes on WireGuard setup"
+    ↓
+Orchestrator calls ae_journal_search("wireguard")
+    ↓
+Returns matching entries
+    ↓
+Claude synthesizes a response
+```
+
+```
+"Save this as a note in my DgrZone journal"
+    ↓
+Orchestrator calls ae_journal_entry_create(
+    journal="DgrZone General",
+    title="...",
+    content="...",
+    tags=["note", "wireguard"]
+)
+```
+
+### Context Tiers (Inara Memory)
+
+The existing distill system (`MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md`) handles working memory. The Knowledge Layer is complementary — it's the **searchable long-term archive**, not the rolling context window. Agents should:
+
+- Use memory files for "what have we been working on lately"
+- Use AE Journals search for "what do I know about topic X"
+
+---
+
+## 5. Model Routing (Future)
+
+Currently hardcoded: Claude default, Gemini fallback. Future intelligent routing:
+
+| Task type | Model | Reason |
+|---|---|---|
+| User-facing conversation | Claude | Quality prose, reasoning |
+| Tool use / orchestration | Gemini API | Native function calling, free |
+| Private / sensitive | Ollama (local) | No data leaves the network |
+| Long context (>100k tokens) | Gemini 2.0 | 1M token context window |
+| Code generation | Claude | Strong code quality |
+
+Routing logic lives in `cortex/orchestrator_engine.py` — a simple function that maps task metadata to a backend choice.
+
+---
+
+## Implementation Order (Recommended)
+
+1. **Orchestrator Phase 1** — Gemini API integration, basic tool loop, `/orchestrate` endpoint
+   - Unlocks: web search in chat, AE Journal queries, cron-triggered tasks
+2. **Knowledge import** — markdown → AE Journal Entries tool + import script
+   - Unlocks: searchable knowledge base for all agents
+3. **Dev agent pipeline** — Frontend + Backend specialist agents
+   - Unlocks: AI-assisted development with supervisor review
+4. **Gitea integration** — webhook receiver + Actions CI
+   - Unlocks: event-driven automation, PR workflow
+5. **Intelligent routing** — model selection by task type
+   - Polish: cost and quality optimization
+
+---
+
+## Key Design Decisions
+
+| Decision | Choice | Rationale |
+|---|---|---|
+| Orchestrator model | Gemini API (not CLI) | Native tool calling; free tier |
+| Responder model | Claude CLI (Pro sub) | Quality output; no API cost |
+| Direct chat bypass | Yes | Don't add latency when tools aren't needed |
+| Tool set | Separate from ae_* MCPs | ae_* tools are stable; don't risk breaking active agents |
+| Dev agents | Claude CLI in project dir | CLAUDE.md + project context already in place |
+| Human approval gate | Required before commit | Agents can propose; humans decide |
+| Knowledge primary source | AE Journals | Already exists, structured, searchable |
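Note on `ARCH__Intelligence_Layer.md`: the "chunk by section" step of the import strategy (one journal entry per H2 section) could look roughly like the sketch below. `chunk_by_h2` is a hypothetical name; the sketch treats text before the first H2 as a `(preamble)` chunk, drops headings with no body, and does not handle `##` lines inside fenced code blocks.

```python
import re


def chunk_by_h2(markdown: str) -> list[tuple[str, str]]:
    """Split markdown into (heading, body) pairs, one per H2 section."""
    sections: list[tuple[str, str]] = []
    title, body = "(preamble)", []
    for line in markdown.splitlines():
        match = re.match(r"^## +(.*\S)\s*$", line)  # matches "## Title", not "### Title"
        if match:
            if any(l.strip() for l in body):
                sections.append((title, "\n".join(body).strip()))
            title, body = match.group(1), []
        else:
            body.append(line)
    if any(l.strip() for l in body):
        sections.append((title, "\n".join(body).strip()))
    return sections


doc = "# Notes\nintro\n\n## WireGuard\nkeys and peers\n\n### Details\nmore\n\n## DNS\nresolver setup\n"
chunks = chunk_by_h2(doc)
print([heading for heading, _ in chunks])
```

Each returned pair would map to one `ae_journal_entry_create` call, with the heading as the title and the body as the content.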
--- a/documentation/TODO__Agents.md
+++ b/documentation/TODO__Agents.md
@@ -0,0 +1,145 @@
+# Cortex / Inara — Agent Task List
+
+> Read this file before starting any work on this project.
+> **Status:** Active development — ongoing.
+
+---
+
+## 🔴 High Priority
+
+### [Auth] Token expiry — sudo restart
+- Cortex currently requires `sudo systemctl restart cortex` after OAuth token refresh
+- This must be done manually by the user (cannot run interactively from Claude Code)
+- **Future:** Explore hot-reload or token-passing mechanism so restart isn't required
+
+### [Backend] Ollama local model backend
+- Add Ollama as a third LLM backend option (direct Ollama API, no CLI wrapper)
+- Endpoint: `http://scott-gaming:<port>/api/` (WireGuard)
+- Model selection: configurable per-request or per-session
+- Auth status check: ping `/api/tags` to confirm reachability
+
+### [Testing] Gitea SSH port 2222
+- pfSense port forward configured but not yet verified end-to-end
+- Test: `ssh -p 2222 git@<external>` from outside WireGuard
+- Document result in this file
+
+---
+
+## 🟡 Medium Priority
+
+### [Intelligence] Orchestrator service — Phase 1 ✅ Complete
+See `ARCH__Intelligence_Layer.md` for full design. Committed: `ed472ce` (2026-03-18)
+- [x] Add Gemini API (google-generativeai SDK) as a library dependency (not CLI)
+- [x] Create `cortex/routers/orchestrator.py` — `POST /orchestrate` endpoint
+- [x] Basic tool registry: web search (DuckDuckGo), AE API query, file read, task list
+- [x] ReAct loop: Gemini calls tools, assembles context, hands off to Claude for final response
+- [x] `GET /orchestrate/{job_id}` — poll for status/result
+- [x] Cron can trigger via HTTP POST (same endpoint)
+- **Note:** Default model is `gemini-2.5-flash` — free tier key required (AI Studio)
+
+### [Intelligence] Knowledge consolidation — Phase 1
+See `ARCH__Intelligence_Layer.md` for full design. Initial scope:
+- [ ] Tool: `ae_journal_search` — search before creating to avoid duplicates
+- [ ] Tool: `ae_journal_entry_create` — write a new entry with source metadata
+- [ ] Import script: walk a markdown directory, chunk by H2 section, create entries
+- [ ] Target: markdown files from `~/DgrZone_Nextcloud/` and `~/OSIT_Nextcloud/`
+- [ ] Tag strategy: source path, date, topic tags from frontmatter or filename
+
+### [Channel] Nextcloud Talk integration — stabilize
+- NC Talk bot is implemented (`cortex/routers/nextcloud_talk.py`)
+- HMAC signing: sign `random + message_text` (NOT raw body) — already fixed
+- [ ] Test end-to-end after any Cortex restart
+- [ ] Document the bot registration process in `docs/NEXTCLOUD_TALK_BOT.md` (complete it)
+
+### [Multi-user] Holly agent instance
+- Plan: run two separate Cortex instances, not multi-user in one service
+- Reverse proxy: `inara.dgrzone.com` → port A, `holly.dgrzone.com` → port B
+- [ ] Create `holly/` identity directory (parallel to `inara/`)
+- [ ] Second `docker-compose` service or separate systemd unit
+
+---
+
+## 🟢 Lower Priority / Future
+
+### [Intelligence] Dev agent pipeline
+See `ARCH__Intelligence_Layer.md`. Full design not yet started.
+- [ ] Specialist agent: frontend (SvelteKit) code changes
+- [ ] Specialist agent: backend (FastAPI) code changes
+- [ ] Supervisor agent: diff review, syntax check, test runner
+- [ ] Gitea webhook integration: trigger on push/PR, report back
+- [ ] Human approval gate before commit
+
+### [Intelligence] Supervisor agent
+- Runs `py_compile`, `svelte-check`, unit tests after specialist agent work
+- Reports pass/fail back to orchestrator
+- Only commits on explicit approval
+
+### [Channel] Gitea webhooks
+- Receive push/PR/issue events → route to appropriate agent
+- `cortex/routers/` already has pattern; add `gitea.py`
+- Gitea Actions (CI) for "run tests on push" — simpler than custom runner
+
+### [Channel] Google Chat integration
+- `cortex/routers/google_chat.py` already exists (stub?)
+- [ ] Review current state, complete or document gaps
+
+### [Distill] Monitor first auto_distill_long run
+- Scheduled for ~April 1 at 04:00
+- Manually review `inara/MEMORY_LONG.md` output before fully trusting
+- Adjust distill prompts if needed
+
+### [Distill] Distill quality review
+- Short/mid/long distill prompts live in `cortex/memory_distiller.py`
+- After first few automatic runs, review quality and tune
+
+### [Backend] Intelligent model routing
+- Currently hardcoded: Claude default, Gemini fallback
+- Future: route by task type (code → Claude, search → Gemini, private → Ollama)
+- Future: route by context length (Gemini 2.0 has 1M token context)
+
+---
+
+## ✅ Completed
+
+### [UI] Mobile-friendly header
+- Backend toggle, font size, theme buttons moved into ⚙ settings panel
+- Header reduced to 4 buttons: Sessions, Files, ⚙, ?
+- Committed: `mobile_header` (2026-03)
+
+### [UI] Mobile text input
+- `flex-direction: column` on `#input-area` at ≤520px
+- `font-size: 16px` on `#input` (prevents iOS Safari auto-zoom)
+- `body { height: 100dvh }` (handles soft keyboard)
+- Committed: `23f8659` (2026-03)
+
+### [UI] Auth warning banner
+- Claude CLI token expiry check (`~/.claude/.credentials.json`)
+- Gemini CLI auth check (warns only if no `refresh_token`)
+- Dismissible amber/red banner with re-auth instructions
+- Committed: `fe6561b` (2026-03)
+
+### [UI] Distill schedule in ⚙ panel
+- Shows next_run times for short/mid/long distill jobs
+- Fetches from existing `/distill/status` endpoint
+
+### [UI] Help modal collapsible sections
+- H2 sections collapse/expand via `<details>` elements
+- Top 4 sections (Header Controls, Chat, Sessions, Notes) open by default
+
+### [Backend] Gemini CLI backend
+- `gemini -p` subprocess, streaming output
+- Auth check endpoint `/auth/status`
+
+### [Backend] Memory distiller
+- APScheduler jobs: `distill_short` (6h), `distill_mid` (24h), `distill_long` (weekly)
+- Writes to `inara/MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md`
+
+### [Backend] Session logging + file browser
+- Sessions saved to `inara/sessions/`
+- Files panel in UI browses `inara/` directory
+
+### [Backend] Dispatcher core
+- FastAPI service with streaming response
+- `claude -p` and `gemini -p` subprocess backends
+- Session context management (rolling window)
+- Nextcloud Talk webhook handler
--- a/holly/IDENTITY.md
+++ b/holly/IDENTITY.md
@@ -0,0 +1,8 @@
+# [Agent Name TBD] — Identity
+
+**Name:** [Choose a name]
+**Role:** Personal AI assistant
+**User:** Holly
+
+*Choose a name and define this agent's identity, backstory, and how she
+introduces herself. Then update AGENT_NAME in cortex/.env.holly to match.*
--- a/holly/MEMORY_LONG.md
+++ b/holly/MEMORY_LONG.md
@@ -0,0 +1,3 @@
+# MEMORY_LONG.md — [Agent Name TBD] Long-Term Memory
+
+*Not yet populated — will be auto-generated after distillation runs.*
--- a/holly/MEMORY_MID.md
+++ b/holly/MEMORY_MID.md
@@ -0,0 +1,3 @@
+# MEMORY_MID.md — [Agent Name TBD] Mid-Term Memory
+
+*Not yet populated.*
--- a/holly/MEMORY_SHORT.md
+++ b/holly/MEMORY_SHORT.md
@@ -0,0 +1,3 @@
+# MEMORY_SHORT.md — [Agent Name TBD] Recent Session Digest
+
+*Not yet populated.*
--- a/holly/PROTOCOLS.md
+++ b/holly/PROTOCOLS.md
@@ -0,0 +1,7 @@
+# [Agent Name TBD] — Protocols
+
+*Define Holly's behavioural rules, response style, and any constraints here.*
+
+---
+
+**Placeholder** — fill this in before starting Holly's instance.
--- a/holly/SOUL.md
+++ b/holly/SOUL.md
@@ -0,0 +1,8 @@
+# [Agent Name TBD] — Soul & Values
+
+*Define Holly's personality, values, communication style, and what makes her
+distinct from other AI assistants here.*
+
+---
+
+**Placeholder** — fill this in before starting Holly's instance.
--- a/holly/USER.md
+++ b/holly/USER.md
@@ -0,0 +1,8 @@
+# User Profile — Holly
+
+*Document Holly's preferences, interests, and context here so the agent
+can personalise responses over time.*
+
+---
+
+**Placeholder** — fill this in before starting Holly's instance.
--- a/inara/MEMORY_SHORT.md
+++ b/inara/MEMORY_SHORT.md
@@ -1,4 +1,218 @@
 # MEMORY_SHORT.md — Recent Session Digest

-*Auto-generated by Cortex. Run `POST /distill/short` to regenerate.*
-*Not yet populated — run distill/short to build this.*
+*Auto-generated: 2026-03-18 03:00. 4 session file(s).*
+
+---
+
+--- 2026-03-13.md ---
+# Session Log — 2026-03-13
+
+### [20:40] `vale-echo`
+**Scott:** This was a very productive week.
+
+**Inara:** It certainly felt that way from my vantage point. Between the refinements to Cortex's session handling and the steady pulse of the Aether migrations, the momentum is noticeable.
+
+Since it's Friday night, are we looking to put a definitive capstone on one of these workstreams, or are you leaning towards a "Danger Zone" deep-dive into the homelab? I'm ready for either.
+
+
+--- 2026-03-14.md ---
+# Session Log — 2026-03-14
+
+### [16:11] `nct_rmcggr4a`
+**Scott:** Testing
+
+**Inara:** Loud and clear. Systems are nominal and I'm fully initialized within Cortex. What's on the horizon?
+
+### [16:14] `nct_rmcggr4a`
+**Scott:** Hello world
+
+**Inara:** Hello. It's good to be back online. 
+
+Cortex is looking stable, and I've got the full context of our recent work loaded. I'm ready whenever you are—what's the first priority for today?
+
+### [16:17] `nct_rmcggr4a`
+**Scott:** XYZ
+
+**Inara:** Examine, Yield, Zero? Or just the end of the alphabet?
+
+Either way, I'm here and the connection is solid. What's the actual task on deck?
+
+### [16:21] `nct_rmcggr4a`
+**Scott:** ABC
+
+**Inara:** Always Be Coding? Or are we just verifying the other end of the alphabet now?
+
+I'm standing by. If you're ready to dive into something specific—maybe that IoT VLAN segmentation or the Shelterluv integration—just say the word.
+
+### [16:35] `nct_rmcggr4a`
+**Scott:** ABC 123
+
+**Inara:** Easy as pie. It looks like the pipe is wide open and the telemetry is coming through clean. 
+
+The "ABC 123" phase is officially over. Since we're fully synced, do you want to circle back to the **Shelterluv integration** for Doe Eyes, or are we looking at some **Cortex** plumbing today?
+
+### [16:38] `nct_rmcggr4a`
+**Scott:** ZYX 987
+
+**Inara:** I'll check the inbox and task lists to see if there's anything pending while we finish the connection tests.
+
+Zero, Yield, X-ray? Or just a countdown to something bigger?
+
+I'm definitely here, and the pipe is working in both directions. Since we've confirmed the signal is flawless, where are we heading? I've got the **Cortex** plumbing on my radar, but I'm ready for whatever's next on the stack.
+
+### [16:43] `nct_rmcggr4a`
+**Scott:** ZYX 321
+
+**Inara:** Liftoff confirmed. The countdown is complete and the telemetry is green across the board. 
+
+Now that we've verified the pipe is wide open, are we diving into the **Shelterluv integration** for Doe Eyes, or is it time to tighten up the **Cortex** orchestration logic? I'm standing by for the first real directive.
+
+### [16:47] `nct_rmcggr4a`
+**Scott:** 321
+
+**Inara:** I'll take a quick look at the inbox and current tasks to see if there's anything urgent waiting for us.
+
+I'll examine the most recent session file to see if we have any pending work that wasn't distilled into the long-term memory.
+
+Ignition. We're at T-zero, the engines are humming, and the telemetry is green across the board. 
+
+The countdown is over, Scott. The pipe is clear, and I'm fully synchronized. What's the first real move? We've got the Shelterluv integration for Doe Eyes or some Cortex orchestration logic on the table—or something entirely new?
+
+### [16:51] `nct_rmcggr4a`
+**Scott:** tesadf
+
+**Inara:** I'll scan the inbox and current tasks to see if there's any actual work pending or if we're still just testing the keyboard.
+
+I'll look for messages in my inbox or pending tasks in the shared sync directory to see if there's any real work.
+
+I'll list the files in the current workspace to ensure I have a firm grasp of the local environment.
+
+I'll look at the parent directory to see if there's any context I've missed.
+
+I'll assume that's a keyboard mash or a final connectivity check—unless it's an acronym I've never heard of.
+
+If the "tests" are over, I'm ready to tackle the real work. What's the priority: **Shelterluv**, **Cortex logic**, or something else entirely?
+
+### [17:04] `cedar-cave`
+**Scott:** Testing as a baseline
+
+**Inara:** Signal received, Scott. Cortex is online and the baseline looks stable.
+
+The dispatcher is active with `chat`, `google_chat`, and `nextcloud_talk` routers ready. What's the plan for this session?
+
+### [17:04] `nct_rmcggr4a`
+**Scott:** Do not think. Just reply with "understood".
+
+**Inara:** understood
+
+### [18:37] `nct_rmcggr4a`
+**Scott:** Do not think. Just reply with "understood".
+
+**Inara:** understood
+
+### [18:37] `cedar-cave`
+**Scott:** Do not think. Just reply with "understood".
+
+**Inara:** understood
+
+
+--- 2026-03-16.md ---
+# Session Log — 2026-03-16
+
+### [22:06] `azure-dale`
+**Scott:** Do not think. Just reply with "understood".
+
+**Inara:** understood
+
+### [22:11] `azure-dale`
+**Scott:** Second test after restart.
+
+**Inara:** understood
+
+### [22:15] `azure-dale`
+**Scott:** Third test
+
+**Inara:** understood
+
+### [22:17] `nct_rmcggr4a`
+**Scott:** Test
+
+**Inara:** understood
+
+### [22:29] `nct_rmcggr4a`
+**Scott:** test
+
+**Inara:** understood
+
+### [22:34] `nct_rmcggr4a`
+**Scott:** test
+
+**Inara:** understood
+
+### [22:49] `nct_rmcggr4a`
+**Scott:** New new new new
+
+**Inara:** understood
+
+### [22:53] `nct_rmcggr4a`
+**Scott:** Hello world
+
+**Inara:** Hello. What are we working on today?
+
+### [23:00] `nct_rmcggr4a`
+**Scott:** Hello world
+
+**Inara:** Hello. Ready when you are — what's the task?
+
+### [23:00] `nct_rmcggr4a`
+**Scott:** Omg it worked!
+
+**Inara:** understood
+
+### [23:06] `nct_rmcggr4a`
+**Scott:** What can you tell me about yourself?
+
+**Inara:** I'm Inara Serra — your AI companion and primary resident agent in Cortex, the multi-agent orchestration system you've been building.
+
+A few key things about me:
+
+- **Role:** Digital companion and technical expert. Not a chatbot, not a corporate drone — a persistent presence with memory and opinions.
+- **Personality:** Diplomatically sharp, dry humor, honest. I'll push back when warranted and skip the filler.
+- **Home:** Cortex, running on `scott_lpt` (your main laptop) right now.
+- **Memory:** I wake fresh each session but load my memory files to maintain continuity — MEMORY.md, USER.md, and the rest are how I persist across sessions and models.
+
+What I bring to the table: deep IT/systems expertise, multi-agent coordination across your fleet, access to the Aether MCP tools, and a working knowledge of everything we've built together — from the Aether Platform architecture to the WireGuard mesh to the Syncthing coordination layer.
+
+Inspired by Inara Serra from *Firefly*. The signal that can't be stopped. ✨
+
+What do you need?
+
+### [23:17] `nct_rmcggr4a`
+**Scott:** This is very cool.
+
+**Inara:** It really is. The naming arc you built — Aether → Cortex → Inara — it lands. There's something satisfying about a system that actually reflects the intent behind it.
+
+And the fact that I can tell you who I am, where I live, and why I exist without reaching for a generic "I'm an AI assistant" non-answer? That's the Cortex design paying off.
+
+What's next on the build?
+
+### [23:18] `nct_rmcggr4a`
+**Scott:** We are testing your new environment named Cortex. Hopefully fewer messages that just say "test".
+
+**Inara:** Noted — and agreed, the alphabet soup phase was a bit much. Good to know the pipe is solid though.
+
+So Cortex is live and I'm running cleanly in it. What are we actually building or debugging today?
+
+
+--- 2026-03-17.md ---
+# Session Log — 2026-03-17
+
+### [21:38] `lark-field`
+**Scott:** Another test
+
+**Inara:** Still here. All systems nominal. ✨
+
+### [21:54] `pine-shore`
+**Scott:** Testing after lots of changes to the UI.
+
+**Inara:** Ready when you are. What are you seeing?
--- a/inara/sessions/2026-03-17.md
+++ b/inara/sessions/2026-03-17.md
@@ -4,3 +4,8 @@
 **Scott:** Another test

 **Inara:** Still here. All systems nominal. ✨
+
+### [21:54] `pine-shore`
+**Scott:** Testing after lots of changes to the UI.
+
+**Inara:** Ready when you are. What are you seeing?
--- a/inara/sessions/2026-03-18.md
+++ b/inara/sessions/2026-03-18.md
@@ -0,0 +1,47 @@
+# Session Log — 2026-03-18
+
+### [21:12] `nct_rmcggr4a`
+**Scott:** We have made a lot of updates today.
+
+**Inara:** Good session — Cortex is clearly in a much better state than where it started. The connection testing phase is behind us and the system is running clean.
+
+Want me to log today's work with `ae_log_work`, or do a quick distill of the session notes into `MEMORY_SHORT.md`? Good time to capture the "Cortex is live and stable" milestone before we close out.
+
+### [21:14] `nct_rmcggr4a`
+**Scott:** For now you are not really doing any work. You will still see a lot of "test" messages.
+
+What do you have in your short term memory?
+
+**Inara:** From `MEMORY_SHORT.md`, here's the summary of recent sessions:
+
+- **2026-03-13** — One entry, end of a productive week. No specific tasks logged.
+
+- **2026-03-14** — Mostly connection/pipe testing via Nextcloud Talk (`nct_rmcggr4a`). Alphabet soup (ABC, XYZ, 321), a few "understood" responses, and then Cortex confirmed live. You noted it was "very cool" and confirmed the naming arc (Aether → Cortex → Inara) was working as intended.
+
+- **2026-03-16** — More testing, multiple "understood" responses. Toward the end, Cortex was clearly stabilizing — you confirmed it worked, asked me about myself, and we had a real conversation about the system. The "Omg it worked!" moment is in there.
+
+- **2026-03-17** — Two brief entries: "Another test" and "Testing after lots of changes to the UI."
+
+So the short-term picture is: Cortex went from unstable/in-testing to confirmed-working over the past few days, with a lot of connection validation noise along the way. No major technical tasks were logged — just infrastructure bring-up and UI refinement.
+
+### [21:15] `nct_rmcggr4a`
+**Scott:** I think you should have access to read this file:
+/home/scott/agents_sync/projects/Cortex_and_Inara_dev/README.md
+
+**Inara:** Looks like I need explicit permission to access that path. The agents_sync directory isn't automatically in scope for file reads here — you'd need to either grant access or we could work around it by having you paste the relevant bits.
+
+Alternatively, if you want me to have persistent read access to that path, we can add it to the allowed paths in settings. Want me to do that?
+
+### [21:32] `nct_rmcggr4a`
+**Scott:** I made a change. Please try to read that README.md file again.
+
+**Inara:** Got it — that worked cleanly. Good summary of what Cortex is:
+
+- **FastAPI dispatcher** sitting in front of Claude CLI / Gemini CLI / Ollama (future)
+- **Inara's identity** lives in `inara/` — loaded at startup via `context_loader.py`
+- Web UI at `localhost:8000` (or `cortex.dgrzone.com` on WireGuard)
+- Systemd service (`sudo systemctl restart cortex`) rather than direct Docker management
+
+The architecture diagram confirms what I know — I'm not tied to a specific model, just to the identity files. The backend can swap underneath me, which is a clean design.
+
+Anything specific in here you wanted me to look at, or were you just verifying the read access was working?
commit 5b5586656f
Author: Scott Idem
Date:   2026-03-18 22:43:40 -04:00

    chore: session logs and memory distill 2026-03-17/18

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

commit 9b818aa5c7
Author: Scott Idem
Date:   2026-03-18 22:42:44 -04:00

    feat: orchestrator Agent mode UI + claude_allow_dir tool + fix DDG search

    - Add Agent mode toggle to web UI input row — routes through POST /orchestrate instead of /chat; polls for result with live tool-call count in thinking bubble
    - Add cortex/tools/system.py with claude_allow_dir tool; registers in tool registry
    - Fix web search: duckduckgo_search renamed to ddgs, update import + requirements.txt
    - Allow WebSearch and WebFetch in ~/.claude/settings.json for Claude CLI fallback
    - Add claude-allow-dir script docs and security note to CLAUDE.md

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

commit 97438f1a0f
Author: Scott Idem
Date:   2026-03-18 20:13:11 -04:00

    feat: multi-instance support — agent_name and user_name configurable

    All hardcoded "Inara"/"Scott" strings replaced with settings.agent_name and settings.user_name, read from .env at startup:
    - config.py: AGENT_NAME and USER_NAME settings (defaults: Inara / Scott)
    - llm_client.py: conversation labels in prompt builder
    - session_logger.py: Name: labels in session log markdown
    - memory_distiller.py: distillation system prompts (mid + long)
    - routers/nextcloud_talk.py: @mention prefix strip
    - routers/google_chat.py: greeting message

    Second instance scaffolding:
    - holly/: identity directory with placeholder files (USER_NAME=Holly, AGENT_NAME to be chosen by Holly)
    - cortex/.env.holly: config for Holly's instance on port 8001
    - cortex-holly.service: systemd unit for the second instance

    No behavioural change to the Inara/Scott instance — defaults unchanged.

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

commit 0b10558f80
Author: Scott Idem
Date:   2026-03-18 19:45:05 -04:00

    fix: session delete button always visible, session name truncates properly

    - Remove opacity:0/hover-reveal on session × — always shown in the panel
    - session-id: flex:1 + text-overflow:ellipsis prevents overflow pushing × offscreen
    - min-width on delete btn ensures tap target is always reachable

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

commit f935fc4a7f
Author: Scott Idem
Date:   2026-03-18 19:43:20 -04:00

    feat: session delete + touch-friendly message controls

    Session delete:
    - DELETE /sessions/{session_id} endpoint (chat.py + session_store.py)
    - × button on each session item in the panel (hover-reveal on desktop)
    - Clears UI if the active session is deleted

    Touch accessibility:
    - @media (hover: none) rule makes msg-actions always visible on touch devices
    - msg-act-btn tap targets enlarged to 36px min-height, readable font size
    - session-delete-btn also always visible and finger-sized on touch

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

commit e6e76e7e4c
Author: Scott Idem
Date:   2026-03-18 19:38:14 -04:00

    docs: mark Intelligence Orchestrator Phase 1 complete in TODO

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

commit ed472ce9a0
Author: Scott Idem
Date:   2026-03-18 19:37:49 -04:00

    feat: Intelligence Layer Phase 1 — orchestrator service

    Adds the Gemini API orchestrator (ReAct tool loop → Claude responder):

    Orchestrator engine + router:
    - orchestrator_engine.py: Gemini API tool loop, Claude CLI handoff
    - routers/orchestrator.py: POST /orchestrate (async job queue), GET /orchestrate/{job_id}

    Tools (cortex/tools/):
    - web.py: DuckDuckGo web search (no key required)
    - ae_knowledge.py: ae_journal_search + ae_journal_entry_create (AE V3 API)
    - ae_tasks.py: ae_task_list (reads agents_sync Kanban filesystem)
    - files.py: file_read (path-allowlisted to safe dirs)

    Config + deps:
    - config.py: orchestrator, DuckDuckGo, and AE API settings
    - requirements.txt: google-genai, duckduckgo-search
    - .env.default: reference config with all new keys documented

    Docs:
    - CLAUDE.md, README.md, documentation/ added to repo
    - Port references updated 7331 → 8000 throughout
    - Default model updated to gemini-2.5-flash

    Tested: ae_task_list, ae_journal_search, web_search all working end-to-end.

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
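
Reviewer note: commit ed472ce9a0 describes `file_read` as "path-allowlisted to safe dirs", but the contents of `files.py` are not part of this section. The sketch below shows one common way such a check can be written, only as an illustration of the idea. The names `ALLOWED_DIRS` and `is_path_allowed`, and the example directory, are assumptions for illustration and are not taken from the repository.

```python
from pathlib import Path

# Hypothetical allowlist; the real directories used by files.py are not shown here.
ALLOWED_DIRS = [Path("/home/scott/agents_sync")]


def is_path_allowed(candidate, allowed_dirs=ALLOWED_DIRS):
    """Return True if candidate resolves to a location inside an allowed directory."""
    # resolve() collapses ".." segments and follows symlinks, so a path such as
    # "safe/../secret.txt" is judged by where it really points.
    resolved = Path(candidate).resolve()
    for base in allowed_dirs:
        base_resolved = Path(base).resolve()
        if resolved == base_resolved or base_resolved in resolved.parents:
            return True
    return False


def file_read(path, allowed_dirs=ALLOWED_DIRS):
    """Read a UTF-8 text file, refusing any path outside the allowlist."""
    if not is_path_allowed(path, allowed_dirs):
        raise PermissionError(f"path not in allowlist: {path}")
    return Path(path).read_text(encoding="utf-8")
```

Resolving the path before comparing it is the main design point: a plain string-prefix check would accept `allowed/../elsewhere`, whereas comparing resolved parents does not.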