feat: audit log, usage tracking UI, OpenAI orchestrator compaction, onboarding + docs

Tool audit log:
- Every orchestrator tool call logged to home/{user}/tool_audit/YYYY-MM-DD.jsonl
- Files panel sidebar: audit log group (collapsed), date-linked read-only table
- Admin endpoints: /api/audit/files, /api/audit/day, /api/audit/recent, /api/audit/stats
- Engine and model name recorded per entry

OpenAI orchestrator improvements:
- Context budget enforcement: 75% of model context_k (min 16k)
- Message compaction: truncates old tool results when approaching budget
- max_rounds respected per model config (intersected with server cap)

OpenRouter onboarding (setup.html, onboarding.py, app.js, settings.html):
- Step 3 of 3: /setup/model with curated model picker
- Chat banner for users on server-default model (informational, not alarmist)
- Settings quick-link card; /setup/model works standalone for existing users

Model registry + session store:
- set_role_config / get_role_config for per-role tool lists and system_append
- session_store: session rename, session name backfill endpoint

UI updates (app.js, index.html, style.css, local_llm.html):
- Role toggle in context panel
- Off-the-record mode
- Agent notes read-only viewer
- OPERATIONS.md loaded at T2+ in context

Documentation:
- HELP.md: full tool table, per-role tool sets, Agent Notes, usage tracking
- TOOLS.md: Agent Notes section, count corrected to 44
- ARCH__SYSTEM.md, ARCH__BACKENDS.md, MASTER.md updated to match reality
- CLAUDE.md: onboarding flow, documentation philosophy sections
- README.md: stack in practice, DeepSeek TUI mention, architecture diagram updated
- TODO__Agents.md: onboarding task completed with deviation notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 21:26:43 -04:00
parent c02d2462b0
commit f8f7cd75da
25 changed files with 1088 additions and 151 deletions
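
The commit message describes three orchestrator rules: a context budget of 75% of `context_k` with a 16k floor, compaction that truncates old tool results, and a per-model `max_rounds` limited by the server cap. The sketch below is only an illustration of how those rules might fit together. It is not the orchestrator code from this commit: every function name, the 4-characters-per-token estimate, the 200-character keep length, and the reading of "k" as 1,000 tokens are assumptions. The 32k fallback for `context_k = 0` comes from the form tooltip in `local_llm.html`.

```javascript
// Hypothetical sketch; names and heuristics are assumptions, not the commit's code.
const DEFAULT_CONTEXT_K = 32;      // tooltip: "0 = assume 32k"
const MIN_BUDGET_TOKENS = 16000;   // commit message: "min 16k"

// Budget = 75% of the model's context window, never below the floor.
function contextBudget(contextK) {
    const k = contextK > 0 ? contextK : DEFAULT_CONTEXT_K;
    return Math.max(MIN_BUDGET_TOKENS, Math.floor(k * 1000 * 0.75));
}

// A per-model value of 0 means "use the global default"; otherwise the
// smaller of the model setting and the server cap applies.
function effectiveMaxRounds(modelMaxRounds, serverCap) {
    return modelMaxRounds > 0 ? Math.min(modelMaxRounds, serverCap) : serverCap;
}

// Crude size estimate: about 4 characters per token (assumption, not a tokenizer).
function estimateTokens(messages) {
    return messages.reduce((n, m) => n + Math.ceil(String(m.content).length / 4), 0);
}

// Truncate tool results oldest-first until the conversation fits the budget.
// Returns a new array; the input messages are left untouched.
function compactMessages(messages, budget, keepChars = 200) {
    const out = messages.map(m => ({ ...m }));
    for (const m of out) {
        if (estimateTokens(out) <= budget) break;
        if (m.role === 'tool' && typeof m.content === 'string' && m.content.length > keepChars) {
            m.content = m.content.slice(0, keepChars) + ' [truncated]';
        }
    }
    return out;
}
```

Under these assumptions, a model registered with `context_k = 0` gets a 24,000-token budget, which matches the "~24k tokens" figure in the tooltip added to `local_llm.html`.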
--- a/cortex/static/HELP.md
+++ b/cortex/static/HELP.md
@@ -6,7 +6,24 @@
     and are appended automatically by help.html when present.
 -->

-*Last updated: 2026-05-05*
+*Last updated: 2026-05-08*
+
+---
+
+## Getting Started
+
+If this is your first time using Cortex, you need one thing before the chat will work: an AI model connected to your account.
+
+**Fastest path — OpenRouter:**
+OpenRouter gives you access to Claude, Gemini, and dozens of other models with a single API key.
+
+1. Get a free API key at [openrouter.ai/keys](https://openrouter.ai/keys)
+2. Go to **☰ → Account → [Set up OpenRouter →]** (shown automatically if no model is configured)
+3. Paste your key, pick a starting model, click **Connect**
+
+That's it — you're ready to chat.
+
+**Already past setup but seeing errors?** Go to **☰ → Account → Model Registry → Manage models** and confirm a model is assigned to the **Chat** role (Primary slot). If all slots are empty, add a model first.

 ---

@@ -52,19 +69,45 @@ Click the **⚡** button in the input row to enable the Tools toggle. When lit (

 The orchestrator runs a multi-step tool loop:

-1. The **orchestrator model** reasons about the request and calls tools as needed — web search, file reads, task management, shell commands, Aether Journals, and more
+1. The **orchestrator model** reasons about the request and calls tools as needed
 2. It produces an enriched summary of what it found
 3. The **responder model** (set by the active Role) receives that context and writes the final user-facing reply
 4. A `⚡ N tool calls: …` note appears below the response listing what was used

-The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**. By default this is Gemini API.
-
-The full tool reference is in the **Tools** tab. 40 tools across web, files, shell, system, tasks, cron, reminders, scratchpad, notifications, and Aether Journals.
+The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**.

 Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.

 Orchestrated sessions persist to history exactly like regular chat.

+### Available Tools
+
+40 tools across 11 categories. Each tool schema is sent to the model on every orchestrated call — fewer active tools means fewer tokens per call.
+
+| Category | Tools |
+|---|---|
+| **Web** | `web_search`, `http_fetch` |
+| **Files** | `file_read`, `file_list`, `file_write` |
+| **Shell** | `shell_exec`, `claude_allow_dir` |
+| **System** | `cortex_restart`, `cortex_logs`, `cortex_status`, `cortex_update` |
+| **Tasks** | `task_list`, `task_create`, `task_update`, `task_complete` |
+| **Cron** | `cron_list`, `cron_add`, `cron_remove`, `cron_toggle` |
+| **Reminders** | `reminders_add`, `reminders_list`, `reminders_remove`, `reminders_clear` |
+| **Scratchpad** | `scratch_read`, `scratch_write`, `scratch_append`, `scratch_clear` |
+| **Notifications** | `web_push`, `email_send`, `nc_talk_send` |
+| **Aether Journals** | `ae_journal_list/search`, `ae_journal_entries_list`, `ae_journal_entry_read/create/update/disable/append/prepend` |
+| **Agent Notes** | `agent_notes_read`, `agent_notes_write`, `agent_notes_append`, `agent_notes_clear` |
+
+File, Shell, System, and some Notification tools are **admin-only** and not visible to regular users.
+
+### Per-Role Tool Sets
+
+Each role can be configured with a specific subset of tool categories. When a role has a tool subset configured, only those tools are sent to the orchestrator — the rest are invisible to the model for that session.
+
+**Example:** a Coder role might only need Web, Files, Shell, and Agent Notes. A Research role might only need Web. Configuring this avoids sending schemas for 30+ irrelevant tools on every call.
+
+Configure per-role tool sets in **Account → Model Registry → Role Assignments** — expand a role card to see the category checkboxes. The default (no checkboxes selected) sends all tools the user has access to.
+
 ---

 ## Sessions
@@ -123,11 +166,59 @@ Each response shows a **model tag** (bottom-right of message) with the model lab

 ---

+## Account Settings
+
+**Navigate to:** ☰ (top-right menu) → **Account**
+
+| Section | What you can do |
+|---|---|
+| **Account** | View your username, role badge (Admin / User), rename your username |
+| **Connected Accounts** | See which Google account is linked for OAuth sign-in |
+| **Email Allowlist** | Regex patterns controlling which addresses the `email_send` tool can reach |
+| **Notifications** | Set which channel (NC Talk, Google Chat, email) Inara uses for proactive messages |
+| **Tool Permissions** | Allow or block specific orchestrator tools for your account |
+| **Usage** | Token consumption by model — see below |
+| **Browser Cache** | Clear UI preferences stored locally (theme, font size, session ID, etc.) |
+| **Model Registry** | Configure AI providers, local hosts, and role assignments |
+| **Change Password** | Update your login password |
+| **Personas** | List and rename your personas |
+
+---
+
+## Usage
+
+Token consumption is tracked automatically for API-backed models. **Navigate to:** ☰ → **Account** → **Usage** section.
+
+The table shows all-time totals per model key, with columns for:
+
+| Column | Meaning |
+|---|---|
+| **Model** | `backend/model-name` key (e.g. `gemini_api/gemini-2.5-flash`, `local/deepseek-v4`) |
+| **Calls** | Number of API calls made |
+| **Prompt** | Input tokens sent |
+| **Output** | Completion tokens received |
+| **Total** | Prompt + Output |
+
+Values ≥ 1,000 are displayed as `k` (e.g. `24.3k`).
+
+**What is and isn't tracked:**
+
+- ✅ Gemini API calls (orchestrator, distillation)
+- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
+- ✗ Claude CLI — no structured token data is returned by the subprocess
+- ✗ Gemini CLI — same reason
+
+The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.
+
+---
+
 ## Model Registry

 Configure which AI models are available and which handles each task type.

-**Navigate to:** ☰ (top-right menu) → **Account** → scroll to **Model Registry** → **Manage models →**
+**New user quick path:** ☰ → **Account** → **Set up OpenRouter →** (the guided wizard adds a host, model, and role assignment in one step).
+
+**Full manual path:** ☰ → **Account** → scroll to **Model Registry** → **Manage models →**

 ---

@@ -142,10 +233,16 @@ Do this before adding models — models need a provider account or local host to
 2. Enter a label (e.g. "Work", "Personal") and your API key
 3. Get a free key at [aistudio.google.com/apikey](https://aistudio.google.com/apikey)

-**Local hosts** (Open WebUI, Ollama, OpenRouter, etc.):
+**OpenRouter** (recommended for new users — one key for many models):
+1. Get a key at [openrouter.ai/keys](https://openrouter.ai/keys)
+2. Scroll to **Local Hosts** → **+ Add host**
+3. Label: "OpenRouter", URL: `https://openrouter.ai/api/v1`, paste your key, Type: OpenAI-compatible
+4. Click **Fetch models** to verify, then add models from the fetched list
+
+**Other local hosts** (Open WebUI, Ollama, LM Studio, etc.):
 1. Scroll to **Local Hosts** → click **+ Add host** to expand the form
 2. Enter a label, the API URL (e.g. `http://192.168.1.100:3000`), and optional API key
-3. Set **Type**: Open WebUI / Ollama, or OpenAI-compatible (for OpenRouter, LM Studio, etc.)
+3. Set **Type**: Open WebUI / Ollama, or OpenAI-compatible
 4. Click **Fetch models** on the saved host card to verify connectivity

 ---
@@ -178,6 +275,8 @@ Scroll to **Role Assignments** at the bottom of the page. Each role has **Primar

 Leave all slots empty to use the server default.

+**Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).
+
 ---

 ## Nextcloud Talk Bot
@@ -245,12 +344,12 @@ Controls how much context is prepended to each LLM call:

 | Tier | Loads | ~Tokens |
 |---|---|---|
-| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
-| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
-| **T3** | + last 2 raw session logs | ~15,000 |
-| **T4** | + last 7 raw session logs | ~50,000 |
+| **Min** | SOUL + IDENTITY + USER summary | ~1,500 |
+| **Std** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
+| **Ext** | + last 2 raw session logs | ~15,000 |
+| **Full** | + last 7 raw session logs | ~50,000 |

-Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
+Default is **Std**. Use **Min** for small/local models. Use **Ext** or **Full** for complex multi-session tasks.

 ### Memory Layers

@@ -318,6 +417,7 @@ For direct access or scripting:
 | `GET` | `/orchestrate/{job_id}` | Poll job status and result |
 | `GET` | `/settings/models` | Model registry UI |
 | `POST` | `/api/models/role` | Set a role assignment (JSON body) |
+| `POST` | `/api/models/role-config` | Set per-role tool list and system prompt append |
 | `GET` | `/api/push/vapid-key` | VAPID public key (for push subscription) |
 | `POST` | `/api/push/subscribe` | Register a push subscription |
 | `DELETE` | `/api/push/subscribe` | Remove a push subscription |
@@ -325,6 +425,11 @@ For direct access or scripting:
 | `GET` | `/api/audit/day?date=` | Tool call entries for a specific date (own data) |
 | `GET` | `/api/audit/recent` | Recent tool calls across days (admin) |
 | `GET` | `/api/audit/stats` | Tool call counts by tool/status/user (admin) |
+| `GET` | `/api/usage` | Full daily token usage log (own data) |
+| `GET` | `/api/usage/summary` | Per-model token totals, all time (own data) |
+| `GET` | `/api/usage/all` | Per-model totals for all users (admin) |
+| `GET` | `/setup/model` | Guided OpenRouter setup form (Step 3 / standalone) |
+| `POST` | `/setup/model` | Save OpenRouter host + model + assign to chat role |
 | `GET` | `/health` | Health check — returns `{"status": "ok"}` |

 Chat request body (`POST /chat`):
--- a/cortex/static/TOOLS.md
+++ b/cortex/static/TOOLS.md
@@ -1,6 +1,6 @@
 # Tool Reference

-> This reference covers all 40 orchestrator tools available when the ⚡ toggle is on.
+> This reference covers all 44 orchestrator tools available when the ⚡ toggle is on.
 > Tools are invoked automatically by the orchestrator — you don't call them directly.

 ¹ **Admin only** — requires the `admin` role. Invisible to regular users.  
@@ -102,3 +102,14 @@
 | Tool | What it does |
 |---|---|
 | `ae_task_list` ¹ | List tasks from the agents_sync Kanban board |
+
+## Agent Notes
+
+Private, durable notes visible only to the orchestrator — not surfaced to users. Persist across sessions. Only available in orchestrated (tool-enabled) sessions.
+
+| Tool | What it does |
+|---|---|
+| `agent_notes_read` | Read the current private notes file |
+| `agent_notes_write` | Overwrite the notes file completely |
+| `agent_notes_append` | Append a timestamped entry (keeps last 3 backups automatically) |
+| `agent_notes_clear` | Erase all notes (backs up first) |
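
The table above says `agent_notes_append` keeps the last three backups, and the Files panel lists `AGENT_NOTES.bak1.md` through `AGENT_NOTES.bak3.md`. One plausible way to get that behaviour is a simple rotation before each write, sketched here against an in-memory `Map` standing in for the file system. The primary file name `AGENT_NOTES.md`, the choice of `bak1` as the newest backup, the timestamp format, and the function name are all assumptions for illustration, not taken from the server code.

```javascript
// Hypothetical sketch of "append with 3 rotating backups".
// `store` is a Map of filename -> content standing in for real files.
const NOTES = 'AGENT_NOTES.md';   // assumed primary file name
const MAX_BACKUPS = 3;

function appendAgentNote(store, text, now = new Date()) {
    const current = store.get(NOTES) ?? '';
    if (current) {
        // Shift bak2 -> bak3, then bak1 -> bak2; the old bak3 is overwritten.
        for (let i = MAX_BACKUPS; i > 1; i--) {
            const prev = store.get(`AGENT_NOTES.bak${i - 1}.md`);
            if (prev !== undefined) store.set(`AGENT_NOTES.bak${i}.md`, prev);
        }
        // The pre-append content becomes the newest backup.
        store.set('AGENT_NOTES.bak1.md', current);
    }
    store.set(NOTES, `${current}[${now.toISOString()}] ${text}\n`);
}
```

With this scheme the backup count never exceeds three, and `bak1` always holds the notes as they were immediately before the latest append.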
--- a/cortex/static/app.js
+++ b/cortex/static/app.js
@@ -18,6 +18,11 @@
        const settings_dd_el     = document.getElementById('settings-dropdown');
        const sessionsBackdrop   = document.getElementById('sessions-backdrop');

+        // ── Utilities ─────────────────────────────────────────────────
+        function escapeHtml(str) {
+            return String(str).replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/"/g,'&quot;');
+        }
+
        // ── Close all panels/dropdowns (mutual exclusion) ─────────────
        function closeAllPanels() {
            if (mode_dropdown_el)  mode_dropdown_el.classList.remove('open');
@@ -435,8 +440,32 @@
            availableRoles = d.available_roles || [];
            roleIdx        = 0;
            setRoleToggleUI(availableRoles[0] || null);
+            _maybeShowNoBanner(availableRoles);
        });

+        function _maybeShowNoBanner(roles) {
+            const key = 'cx_no_model_banner_dismissed';
+            if (roles.length > 0) { localStorage.removeItem(key); return; }
+            if (localStorage.getItem(key)) return;
+            const banner = document.createElement('div');
+            banner.id = 'no-model-banner';
+            banner.style.cssText = [
+                'background:#1c1a0a','border-bottom:1px solid #78350f',
+                'color:#fbbf24','font-size:0.82rem','padding:0.55rem 1rem',
+                'display:flex','align-items:center','gap:0.75rem','flex-shrink:0',
+            ].join(';');
+            banner.innerHTML = `
+                <span style="flex:1">⚡ Using server default model — add your own for more choices and to track your usage.</span>
+                <a href="/setup/model" style="color:#fbbf24;font-weight:600;white-space:nowrap;">Set up OpenRouter →</a>
+                <button onclick="localStorage.setItem('${key}','1');document.getElementById('no-model-banner').remove();"
+                        style="background:none;border:none;color:#78350f;cursor:pointer;font-size:1rem;line-height:1;padding:0 0.2rem;"
+                        title="Dismiss">✕</button>
+            `;
+            // Insert at the top of #chat-col (or body if not found)
+            const col = document.getElementById('chat-col') || document.body.firstElementChild;
+            col.insertBefore(banner, col.firstChild);
+        }
+
        backendToggle.addEventListener('click', () => {
            if (availableRoles.length <= 1) return;
            roleIdx = (roleIdx + 1) % availableRoles.length;
@@ -1067,6 +1096,19 @@
                            sessionId = data.session_id;
                            sessionEl.textContent = `session: ${sessionId}`;
                            persist_session();
+
+                            // Auto-name the session from the first user message
+                            if (wasNewSession) {
+                                const autoName = text.slice(0, 60).trimEnd() + (text.length > 60 ? '…' : '');
+                                fetch(`/sessions/${sessionId}?${_fileParams}`, {
+                                    method: 'PATCH',
+                                    headers: { 'Content-Type': 'application/json' },
+                                    body: JSON.stringify({ name: autoName }),
+                                }).then(() => {
+                                    sessionEl.textContent = `session: ${autoName}`;
+                                    sessionNames.set(sessionId, autoName);
+                                }).catch(() => {});
+                            }
                            thinkingDiv.className = 'message assistant';
                            setMessageText(thinkingDiv, 'assistant', data.response);
                            const assistHistIdx = currentHistory.length;
@@ -1133,6 +1175,8 @@
            const text = inputEl.value.trim();
            if (!text || activeController) return;

+            const wasNewSession = !sessionId;
+
            inputEl.value = '';
            syncHeight();
            sendBtn.style.display = 'none';
@@ -1357,6 +1401,7 @@
            { label: 'Memory',   files: ['MEMORY_LONG.md', 'MEMORY_MID.md', 'MEMORY_SHORT.md'] },
            { label: 'Profile',  files: ['USER.md', 'HELP.md'] },
            { label: 'Settings', files: ['email_allowlist.json'] },
+            { label: 'Agent Notes (read-only)', files: ['AGENT_NOTES.bak1.md', 'AGENT_NOTES.bak2.md', 'AGENT_NOTES.bak3.md'], collapsed: true },
        ];

        function fmtSize(bytes) {
@@ -1394,7 +1439,7 @@
            fileSidebar.innerHTML = '';

            for (const group of FILE_GROUPS) {
-                const { groupEl, items } = _makeFileGroup(group.label);
+                const { groupEl, items } = _makeFileGroup(group.label, group.collapsed || false);

                for (const fname of group.files) {
                    const f = byName[fname];
@@ -1490,12 +1535,20 @@
            // Restore editor/preview buttons hidden by audit view
            fileRawBtn.style.display = '';
            filePreviewBtn.style.display = '';
-            fileSaveBtn.style.display = '';
            const res = await fetch(`/files/${encodeURIComponent(name)}?${_fileParams}`);
            if (!res.ok) { mdEditor.setValue(`Error loading ${name}`); return; }
            const data = await res.json();
            mdEditor.setValue(data.content);
            mdEditor.clearHistory();
+            if (data.readonly) {
+                mdEditor.setOption('readOnly', 'nocursor');
+                fileSaveBtn.style.display = 'none';
+                document.getElementById('file-modal-title').textContent = name + ' (read-only)';
+            } else {
+                mdEditor.setOption('readOnly', false);
+                fileSaveBtn.style.display = '';
+                document.getElementById('file-modal-title').textContent = name;
+            }
            setFileMode(fileMode);
        }

@@ -1794,11 +1847,13 @@
        let memMid      = localStorage.getItem('mem-mid')   !== 'false';
        let memShort    = localStorage.getItem('mem-short') !== 'false';

+        const TIER_LABELS = { 1: 'Min', 2: 'Std', 3: 'Ext', 4: 'Full' };
+
        function updateTierUI() {
            document.querySelectorAll('.ctx-btn[data-tier]').forEach(btn => {
                btn.classList.toggle('active', parseInt(btn.dataset.tier) === currentTier);
            });
-            ctxOpenBtn.querySelector('.tier-badge').textContent = currentTier;
+            ctxOpenBtn.querySelector('.tier-badge').textContent = TIER_LABELS[currentTier] || currentTier;
        }

        function updateMemUI() {
@@ -1870,33 +1925,46 @@
            memShort = !memShort; localStorage.setItem('mem-short', memShort); updateMemUI();
        });

+        const _distillBtns = () => document.querySelectorAll(
+            '#distill-short-btn, #distill-mid-btn, #distill-long-btn, #distill-all-btn, #distill-rebuild-btn'
+        );
+
        function showDistillStatus(msg, isErr) {
            distillStatus.textContent = msg;
            distillStatus.classList.toggle('err', !!isErr);
            distillStatus.classList.add('show');
-            setTimeout(() => distillStatus.classList.remove('show'), 5000);
+            setTimeout(() => distillStatus.classList.remove('show'), isErr ? 8000 : 5000);
        }

-        async function runDistill(endpoint) {
-            showDistillStatus('distilling…', false);
+        async function runDistill(endpoint, label) {
+            _distillBtns().forEach(b => { b.disabled = true; });
+            showDistillStatus(`${label || endpoint} running…`, false);
            try {
                const res = await fetch(`/distill/${endpoint}?${_fileParams}`, { method: 'POST' });
                const d = await res.json();
-                if (!res.ok || d.ok === false) {
-                    const err = d.error || d.mid?.error || d.long?.error || `HTTP ${res.status}`;
+                if (res.status === 409 || res.status === 429) {
+                    showDistillStatus(`⏳ ${d.detail}`, true);
+                } else if (!res.ok || d.ok === false) {
+                    const err = d.detail || d.error || d.mid?.error || d.long?.error || `HTTP ${res.status}`;
                    showDistillStatus(`✗ ${err}`, true);
                } else {
-                    showDistillStatus(`✓ ${endpoint} done`, false);
+                    showDistillStatus(`✓ ${label || endpoint} complete`, false);
                }
            } catch (err) {
                showDistillStatus(`✗ ${err.message}`, true);
+            } finally {
+                _distillBtns().forEach(b => { b.disabled = false; });
            }
        }

-        document.getElementById('distill-short-btn').addEventListener('click', () => runDistill('short'));
-        document.getElementById('distill-mid-btn').addEventListener('click',   () => runDistill('mid'));
-        document.getElementById('distill-long-btn').addEventListener('click',  () => runDistill('long'));
-        document.getElementById('distill-all-btn').addEventListener('click',   () => runDistill('all'));
+        document.getElementById('distill-short-btn').addEventListener('click', () => runDistill('short', 'Short distill'));
+        document.getElementById('distill-mid-btn').addEventListener('click',   () => runDistill('mid',   'Mid distill'));
+        document.getElementById('distill-long-btn').addEventListener('click',  () => runDistill('long',  'Long distill'));
+        document.getElementById('distill-all-btn').addEventListener('click',   () => runDistill('all',   'Full distill'));
+        document.getElementById('distill-rebuild-btn').addEventListener('click', () => {
+            if (!confirm('Rebuild memory from scratch?\n\nThis will wipe MEMORY_MID and MEMORY_LONG (backups kept) then regenerate them from session logs. Any hand-edited content will be replaced.\n\nContinue?')) return;
+            runDistill('rebuild', 'Memory rebuild');
+        });

        updateTierUI();
        updateMemUI();
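
The session auto-naming added in the `app.js` hunk at line 1096 builds the name inline from the first user message. As a small aside, the same rule can be pulled into a pure function, which makes the truncation behaviour easy to check in isolation. The function name `autoSessionName` and the `maxLen` parameter are hypothetical; the 60-character limit and the trailing ellipsis follow the diff.

```javascript
// Hypothetical helper mirroring the inline expression in the diff:
// take the first 60 characters, drop trailing whitespace, and append
// an ellipsis only when the original text was longer than the limit.
function autoSessionName(text, maxLen = 60) {
    const head = text.slice(0, maxLen).trimEnd();
    return text.length > maxLen ? head + '…' : head;
}
```

A short message is returned unchanged, while a long one is cut at 60 characters and marked with `…`, so a cut that lands just after a space does not leave a dangling blank before the ellipsis.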
--- a/cortex/static/index.html
+++ b/cortex/static/index.html
@@ -87,10 +87,10 @@
            <div class="ctx-section">
                <div class="ctx-section-title">Context Tier</div>
                <div class="ctx-row">
-                    <button class="ctx-btn" data-tier="1" id="tier-1" title="Minimal (~1.5k tokens)">T1</button>
-                    <button class="ctx-btn active" data-tier="2" id="tier-2" title="Standard (~5k tokens)">T2</button>
-                    <button class="ctx-btn" data-tier="3" id="tier-3" title="Extended (~15k tokens)">T3</button>
-                    <button class="ctx-btn" data-tier="4" id="tier-4" title="Full (~50k tokens)">T4</button>
+                    <button class="ctx-btn" data-tier="1" id="tier-1" title="Minimal — identity only (~1.5k tokens)">Min</button>
+                    <button class="ctx-btn active" data-tier="2" id="tier-2" title="Standard — memory + user profile (~5k tokens)">Std</button>
+                    <button class="ctx-btn" data-tier="3" id="tier-3" title="Extended — + last 2 sessions (~15k tokens)">Ext</button>
+                    <button class="ctx-btn" data-tier="4" id="tier-4" title="Full — + last 7 sessions (~50k tokens)">Full</button>
                </div>
            </div>
            <div class="ctx-section">
@@ -108,6 +108,7 @@
                    <button class="ctx-btn" id="distill-mid-btn"   title="Summarize SHORT → MID memory (uses LLM)">Mid</button>
                    <button class="ctx-btn" id="distill-long-btn"  title="Integrate MID → LONG memory (uses LLM)">Long</button>
                    <button class="ctx-btn" id="distill-all-btn"   title="Run Short → Mid → Long in sequence">All</button>
+                    <button class="ctx-btn ctx-btn-danger" id="distill-rebuild-btn" title="⚠ Wipe Mid + Long memories and rebuild from session logs. Hand-edited content will be replaced.">Rebuild</button>
                </div>
                <div id="ctx-distill-status"></div>
                <div id="ctx-schedule"></div>
--- a/cortex/static/local_llm.html
+++ b/cortex/static/local_llm.html
@@ -167,9 +167,11 @@
    .pb-anthropic { background: #1e1b4b; color: #818cf8; }
    .pb-google    { background: #042f2e; color: #34d399; }
    .pb-local     { background: #1e293b; color: #64748b; }
+    .pb-notools   { background: #3b1a1a; color: #f87171; }
    [data-theme="light"] .pb-anthropic { background: #ede9fe; color: #5b21b6; }
    [data-theme="light"] .pb-google    { background: #d1fae5; color: #065f46; }
    [data-theme="light"] .pb-local     { background: #e2e8f0; color: #475569; }
+    [data-theme="light"] .pb-notools   { background: #fee2e2; color: #b91c1c; }

    /* Host & model rows */
    .host-row {
@@ -488,8 +490,22 @@
                   autocomplete="off" data-form-type="other">
          </div>
          <div class="field" style="flex:0 0 auto">
-            <label>Context (k tokens)</label>
-            <input type="number" id="add-context-k" name="context_k" value="0" min="0" max="10000">
+            <label title="Context window size in thousands of tokens. 0 = assume 32k.">Context (k tokens)</label>
+            <input type="number" id="add-context-k" name="context_k" value="0" min="0" max="10000"
+                   title="Context window size in thousands of tokens. 0 = assume 32k (compaction budget ~24k tokens).">
+          </div>
+          <div class="field" style="flex:0 0 auto">
+            <label title="Per-model tool loop cap. 0 = use the global default (orchestrator_max_rounds).">Max rounds</label>
+            <input type="number" name="max_rounds" value="0" min="0"
+                   title="Per-model tool loop cap. 0 = use the global default (orchestrator_max_rounds).">
+          </div>
+          <div class="field" style="flex:0 0 auto">
+            <label title="Whether this model supports tool calling. If not supported, requests skip the tool loop entirely.">Tool calling</label>
+            <select name="tools"
+                    title="Whether this model supports tool calling. If not supported, requests skip the tool loop entirely.">
+              <option value="1" selected>Supported</option>
+              <option value="0">Not supported</option>
+            </select>
          </div>
        </div>
        <div class="field">
--- a/cortex/static/settings.html
+++ b/cortex/static/settings.html
@@ -423,6 +423,18 @@
    </div>

    <!-- Browser cache -->
+    <!-- Usage summary -->
+    <div class="section" id="usage-section">
+      <h2>Usage</h2>
+      <p style="font-size:0.8rem; color:var(--pg-muted); margin-bottom:0.85rem; line-height:1.55;">
+        Token consumption tracked for API-backed models (Gemini API, local OpenAI-compatible).
+        Claude CLI calls are not metered.
+      </p>
+      <div id="usage-table-wrap" style="overflow-x:auto;">
+        <p style="font-size:0.8rem; color:var(--pg-muted);">Loading…</p>
+      </div>
+    </div>
+
    <div class="section">
      <h2>Browser Cache</h2>
      <p style="font-size:0.8rem; color:var(--pg-muted); margin-bottom:0.85rem; line-height:1.55;">
@@ -443,6 +455,25 @@
    <!-- Model Registry link -->
    <div class="section">
      <h2>Model Registry</h2>
+
+      <!-- Quick-start card: shown only when no model is configured for chat role -->
+      <div id="openrouter-quickstart" style="display:none; background:#1c1a0a; border:1px solid #78350f;
+           border-radius:8px; padding:1rem; margin-bottom:1rem;">
+        <p style="font-size:0.82rem; color:#fbbf24; font-weight:600; margin-bottom:0.4rem;">
+          ⚡ You're on the server default model
+        </p>
+        <p style="font-size:0.8rem; color:#d97706; margin-bottom:0.75rem; line-height:1.5;">
+          You can chat now, but adding your own model gives you more choices, lets you pick
+          role-specific models, and tracks your usage separately.
+          OpenRouter is the easiest way to get started — one key, many models.
+        </p>
+        <a href="/setup/model"
+           style="display:inline-block; padding:0.5rem 0.9rem; background:#92400e; border-radius:6px;
+                  color:#fef3c7; font-size:0.85rem; font-weight:600; text-decoration:none;">
+          Set up OpenRouter →
+        </a>
+      </div>
+
      <p style="font-size:0.8rem; color:var(--pg-muted); margin-bottom:0.85rem; line-height:1.55;">
        Configure AI providers (Anthropic, Google), local hosts (Open WebUI, Ollama, OpenRouter, etc.),
        and assign models to roles — chat, orchestrator, distill, and more.
@@ -479,6 +510,22 @@
    </div>

    <!-- Personas -->
+    <!-- Sessions -->
+    <div class="section">
+      <h2>Sessions</h2>
+      <p style="font-size:0.8rem; color:var(--pg-muted); margin-bottom:0.85rem; line-height:1.55;">
+        Auto-name any sessions that still show a random ID, using their first message as the name.
+        Only unnamed sessions are affected — existing names are left alone.
+      </p>
+      <button type="button" id="backfill-names-btn"
+              style="padding:0.5rem 1rem; background:none; border:1px solid var(--pg-border); border-radius:6px;
+                     color:var(--pg-muted); font-size:0.88rem; font-weight:500; cursor:pointer;
+                     transition:border-color 0.15s, color 0.15s;">
+        Auto-name old sessions
+      </button>
+      <span id="backfill-names-ok" style="display:none; margin-left:0.75rem; font-size:0.8rem; color:#4ade80;"></span>
+    </div>
+
    <div class="section">
      <h2>Personas</h2>
      <ul class="persona-list">
@@ -532,6 +579,84 @@
      document.getElementById('clear-ls-ok').style.display = 'inline';
    });

+    // Show OpenRouter quick-start card if no model is configured
+    (async () => {
+      try {
+        const d = await fetch('/backend').then(r => r.json());
+        const roles = d.available_roles || [];
+        if (roles.length === 0) {
+          document.getElementById('openrouter-quickstart').style.display = 'block';
+        }
+      } catch (_) {}
+    })();
+
+    // Usage summary table
+    (async () => {
+      const wrap = document.getElementById('usage-table-wrap');
+      try {
+        const resp = await fetch('/api/usage/summary');
+        if (!resp.ok) throw new Error(resp.statusText);
+        const rows_data = await resp.json();
+        if (!rows_data.length) {
+          wrap.innerHTML = '<p style="font-size:0.8rem;color:var(--pg-muted);">No usage recorded yet.</p>';
+          return;
+        }
+        const fmt = n => n >= 1000 ? (n / 1000).toFixed(1) + 'k' : String(n);
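+        const esc = s => String(s).replace(/[&<>"']/g, c => `&#${c.charCodeAt(0)};`);  // HTML-escape server strings before innerHTML
+        const rows = rows_data.map(d => {
+          const labelCell = d.label !== d.key
+            ? `<span title="${esc(d.key)}">${esc(d.label)}</span>` : `<span>${esc(d.key)}</span>`;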
+          return `<tr>
+            <td style="padding:0.4rem 0.75rem 0.4rem 0; font-size:0.82rem; color:var(--pg-text); white-space:nowrap;">${labelCell}</td>
+            <td style="padding:0.4rem 0.5rem; font-size:0.82rem; color:var(--pg-muted); text-align:right;">${d.calls}</td>
+            <td style="padding:0.4rem 0.5rem; font-size:0.82rem; color:var(--pg-muted); text-align:right;">${fmt(d.prompt_tokens)}</td>
+            <td style="padding:0.4rem 0.5rem; font-size:0.82rem; color:var(--pg-muted); text-align:right;">${fmt(d.completion_tokens)}</td>
+            <td style="padding:0.4rem 0 0.4rem 0.5rem; font-size:0.82rem; color:var(--pg-text); text-align:right; font-weight:600;">${fmt(d.total_tokens)}</td>
+          </tr>`;
+        }).join('');
+        wrap.innerHTML = `<table style="border-collapse:collapse; width:100%; min-width:360px;">
+          <thead>
+            <tr style="border-bottom:1px solid var(--pg-border);">
+              <th style="padding:0.35rem 0.75rem 0.35rem 0; font-size:0.75rem; color:var(--pg-muted); font-weight:600; text-align:left;">Model</th>
+              <th style="padding:0.35rem 0.5rem; font-size:0.75rem; color:var(--pg-muted); font-weight:600; text-align:right;">Calls</th>
+              <th style="padding:0.35rem 0.5rem; font-size:0.75rem; color:var(--pg-muted); font-weight:600; text-align:right;">Prompt</th>
+              <th style="padding:0.35rem 0.5rem; font-size:0.75rem; color:var(--pg-muted); font-weight:600; text-align:right;">Output</th>
+              <th style="padding:0.35rem 0 0.35rem 0.5rem; font-size:0.75rem; color:var(--pg-muted); font-weight:600; text-align:right;">Total</th>
+            </tr>
+          </thead>
+          <tbody>${rows}</tbody>
+        </table>`;
+      } catch (e) {
+        wrap.innerHTML = `<p style="font-size:0.8rem;color:var(--pg-muted);">Could not load usage data.</p>`;
+      }
+    })();
+
+    // Auto-name old sessions backfill
+    document.getElementById('backfill-names-btn').addEventListener('click', async () => {
+      const btn = document.getElementById('backfill-names-btn');
+      const ok  = document.getElementById('backfill-names-ok');
+      btn.disabled = true;
+      btn.textContent = 'Working…';
+      try {
+        const params = new URLSearchParams(window.location.search);
+        const user    = params.get('user')    || document.querySelector('input[value]')?.value || '';
+        const persona = params.get('persona') || '';
+        const qs = user ? `?user=${encodeURIComponent(user)}&persona=${encodeURIComponent(persona)}` : '';
+        const res = await fetch(`/api/sessions/backfill-names${qs}`, { method: 'POST' });
+        const data = await res.json();
+        if (!res.ok) throw new Error(data.detail || res.statusText);
+        const n = data.named ?? 0;
+        ok.textContent = `Named ${n} session${n !== 1 ? 's' : ''}.`;
+        Object.assign(ok.style, { color: '#4ade80', display: 'inline' });  // reset to green in case a previous attempt failed
+      } catch (e) {
+        ok.textContent = 'Error — check console.';
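+        console.error('Session name backfill failed:', e);  // the message above points users at the console
+        Object.assign(ok.style, { color: '#f87171', display: 'inline' });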
+      }
+      btn.textContent = 'Auto-name old sessions';
+      btn.disabled = false;
+    });
+
    // Persona rename toggle
    document.querySelectorAll('.persona-rename-toggle').forEach(btn => {
      btn.addEventListener('click', () => {
--- a/cortex/static/setup.html
+++ b/cortex/static/setup.html
@@ -127,6 +127,36 @@

    .emoji-opt.selected { border-color: #7c3aed; background: #2d1f52; }
    #emoji-hidden { display: none; }
+
+    .provider-badge {
+      display: inline-flex;
+      align-items: center;
+      gap: 0.4rem;
+      background: #2d1f52;
+      border: 1px solid #7c3aed;
+      border-radius: 6px;
+      padding: 0.3rem 0.6rem;
+      font-size: 0.78rem;
+      color: #a78bfa;
+      margin-bottom: 1rem;
+    }
+
+    .skip-link {
+      display: block;
+      text-align: center;
+      margin-top: 1rem;
+      font-size: 0.8rem;
+      color: #64748b;
+      text-decoration: none;
+    }
+    .skip-link:hover { color: #94a3b8; }
+
+    .model-hint {
+      font-size: 0.72rem;
+      color: #64748b;
+      margin-top: 0.75rem;
+      text-align: center;
+    }
  </style>
 </head>
 <body>
@@ -137,10 +167,11 @@
    </div>

    <!-- ERROR -->
+    <!-- ERROR_MODEL -->

    <!-- ── Step 1: password ───────────────────────────────────────── -->
    <div id="step-password">
-      <div class="step-label">Step 1 of 2</div>
+      <div class="step-label">Step 1 of 3</div>
      <h2>Set your password</h2>
      <form method="POST" action="" id="password-form">
        <input type="hidden" name="step" value="password">
@@ -161,7 +192,7 @@

    <!-- ── Step 2: persona ────────────────────────────────────────── -->
    <div id="step-persona" style="display:none">
-      <div class="step-label">Step 2 of 2</div>
+      <div class="step-label">Step 2 of 3</div>
      <h2>Create your persona</h2>
      <form method="POST" action="" id="persona-form">
        <input type="hidden" name="step" value="persona">
@@ -203,6 +234,39 @@
        <button type="submit">Create my persona →</button>
      </form>
    </div>
+
+    <!-- ── Step 3: model connect ─────────────────────────────────── -->
+    <div id="step-model" style="display:none">
+      <div class="step-label"><!-- SETUP_STEP3_LABEL --></div>
+      <h2>Connect an AI model</h2>
+      <div class="provider-badge">⚡ Recommended: OpenRouter</div>
+      <p style="font-size:0.82rem;color:#94a3b8;margin-bottom:1rem;">
+        One API key gives you access to Claude, Gemini, Llama, and dozens of other models.
+        Get a free key at <a href="https://openrouter.ai/keys" target="_blank" rel="noopener noreferrer" style="color:#a78bfa;">openrouter.ai/keys</a>.
+      </p>
+      <form method="POST" action="/setup/model" id="model-form">
+        <div class="field">
+          <label for="api_key">OpenRouter API key</label>
+          <input type="password" id="api_key" name="api_key"
+                 autocomplete="off" placeholder="sk-or-v1-..." required>
+        </div>
+        <div class="field">
+          <label for="model_name">Starting model</label>
+          <select id="model_name" name="model_name">
+            <option value="anthropic/claude-3-5-haiku-20241022">Claude 3.5 Haiku — Fast &amp; affordable</option>
+            <option value="anthropic/claude-3-7-sonnet-20250219">Claude 3.7 Sonnet — Smarter Claude</option>
+            <option value="google/gemini-2.0-flash-001">Gemini 2.0 Flash — Fast Google model</option>
+            <option value="meta-llama/llama-3.3-70b-instruct">Llama 3.3 70B — Open source</option>
+          </select>
+          <p class="hint">You can add more models or switch anytime in Account → Model Registry.</p>
+        </div>
+        <button type="submit">Connect &amp; start chatting →</button>
+      </form>
+      <p class="model-hint">
+        Using Ollama, a local model, or something else?
+        <a href="#" id="skip-model-link" style="color:#64748b;">Skip this step →</a>
+      </p>
+    </div>
  </div>

  <script>
@@ -232,6 +296,11 @@
      document.getElementById('step-password').style.display = 'none';
      document.getElementById('step-persona').style.display  = 'block';
    }
+    if (params.get('step') === '3') {
+      document.getElementById('step-password').style.display = 'none';
+      document.getElementById('step-persona').style.display  = 'none';
+      document.getElementById('step-model').style.display    = 'block';
+    }

    // ── Client-side confirm password check ───────────────────────────
    document.getElementById('password-form').addEventListener('submit', e => {
@@ -243,6 +312,15 @@
      }
    });

+    // ── Skip model setup — navigate to user home ─────────────────────
+    document.getElementById('skip-model-link')?.addEventListener('click', e => {
+      e.preventDefault();
+      // Ask server for skip target (the cx_setup_persona cookie has the path)
+      fetch('/setup/model/skip', { method: 'POST', credentials: 'same-origin' })
+        .then(r => { location.href = r.redirected ? r.url : '/'; })
+        .catch(() => { location.href = '/'; });
+    });
+
    // ── Auto-generate persona slug from display name ─────────────────
    document.getElementById('display_name').addEventListener('input', function() {
      const slugField = document.getElementById('persona_name');
--- a/cortex/static/style.css
+++ b/cortex/static/style.css
@@ -1328,7 +1328,10 @@
        .ctx-btn:hover    { color: var(--text); border-color: var(--muted); }
        .ctx-btn.active   { color: var(--accent); border-color: var(--accent); }
        .ctx-btn.mem-on   { color: var(--success); border-color: var(--success-dim); }
-        .ctx-btn.local-on { color: var(--amber); border-color: var(--amber-border); }
+        .ctx-btn.local-on   { color: var(--amber); border-color: var(--amber-border); }
+        .ctx-btn-danger     { color: #f87171 !important; border-color: #7f1d1d !important; }
+        .ctx-btn-danger:hover { border-color: #f87171 !important; }
+        .ctx-btn:disabled   { opacity: 0.4; cursor: not-allowed; pointer-events: none; }
        #backend-model-hint {
            font-size: 0.68rem; color: var(--amber); opacity: 0.9;
            margin-top: 4px; word-break: break-all; line-height: 1.3;