feat: audit log, usage tracking UI, OpenAI orchestrator compaction, onboarding + docs

Tool audit log:
- Every orchestrator tool call logged to home/{user}/tool_audit/YYYY-MM-DD.jsonl
- Files panel sidebar: audit log group (collapsed), date-linked read-only table
- Admin endpoints: /api/audit/files, /api/audit/day, /api/audit/recent, /api/audit/stats
- Engine and model name recorded per entry

OpenAI orchestrator improvements:
- Context budget enforcement: 75% of model context_k (min 16k)
- Message compaction: truncates old tool results when approaching budget
- max_rounds respected per model config (intersected with server cap)

OpenRouter onboarding (setup.html, onboarding.py, app.js, settings.html):
- Step 3 of 3: /setup/model with curated model picker
- Chat banner for users on server-default model (informational, not alarmist)
- Settings quick-link card; /setup/model works standalone for existing users

Model registry + session store:
- set_role_config / get_role_config for per-role tool lists and system_append
- session_store: session rename, session name backfill endpoint

UI updates (app.js, index.html, style.css, local_llm.html):
- Role toggle in context panel
- Off-the-record mode
- Agent notes read-only viewer
- OPERATIONS.md loaded at T2+ in context

Documentation:
- HELP.md: full tool table, per-role tool sets, Agent Notes, usage tracking
- TOOLS.md: Agent Notes section, count corrected to 44
- ARCH__SYSTEM.md, ARCH__BACKENDS.md, MASTER.md updated to match reality
- CLAUDE.md: onboarding flow, documentation philosophy sections
- README.md: stack in practice, DeepSeek TUI mention, architecture diagram updated
- TODO__Agents.md: onboarding task completed with deviation notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scott Idem
2026-05-08 21:26:43 -04:00
parent c02d2462b0
commit f8f7cd75da
25 changed files with 1088 additions and 151 deletions


@@ -6,7 +6,24 @@
and are appended automatically by help.html when present.
-->
*Last updated: 2026-05-05*
*Last updated: 2026-05-08*
---
## Getting Started
If this is your first time using Cortex, you need one thing before the chat will work: an AI model connected to your account.
**Fastest path — OpenRouter:**
OpenRouter gives you access to Claude, Gemini, and dozens of other models with a single API key.
1. Get a free API key at [openrouter.ai/keys](https://openrouter.ai/keys)
2. Go to **☰ → Account → [Set up OpenRouter →]** (shown automatically if no model is configured)
3. Paste your key, pick a starting model, click **Connect**
That's it — you're ready to chat.
**Already past setup but seeing errors?** Go to **☰ → Account → Model Registry → Manage models** and confirm a model is assigned to the **Chat** role (Primary slot). If all slots are empty, add a model first.
---
@@ -52,19 +69,45 @@ Click the **⚡** button in the input row to enable the Tools toggle. When lit (
The orchestrator runs a multi-step tool loop:
1. The **orchestrator model** reasons about the request and calls tools as needed — web search, file reads, task management, shell commands, Aether Journals, and more
1. The **orchestrator model** reasons about the request and calls tools as needed
2. It produces an enriched summary of what it found
3. The **responder model** (set by the active Role) receives that context and writes the final user-facing reply
4. A `⚡ N tool calls: …` note appears below the response listing what was used
The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**. By default this is Gemini API.
The full tool reference is in the **Tools** tab. 40 tools across web, files, shell, system, tasks, cron, reminders, scratchpad, notifications, and Aether Journals.
The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**.
Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.
Orchestrated sessions persist to history exactly like regular chat.
### Available Tools
40 tools across 11 categories. Each tool schema is sent to the model on every orchestrated call — fewer active tools means fewer tokens per call.
| Category | Tools |
|---|---|
| **Web** | `web_search`, `http_fetch` |
| **Files** | `file_read`, `file_list`, `file_write` |
| **Shell** | `shell_exec`, `claude_allow_dir` |
| **System** | `cortex_restart`, `cortex_logs`, `cortex_status`, `cortex_update` |
| **Tasks** | `task_list`, `task_create`, `task_update`, `task_complete` |
| **Cron** | `cron_list`, `cron_add`, `cron_remove`, `cron_toggle` |
| **Reminders** | `reminders_add`, `reminders_list`, `reminders_remove`, `reminders_clear` |
| **Scratchpad** | `scratch_read`, `scratch_write`, `scratch_append`, `scratch_clear` |
| **Notifications** | `web_push`, `email_send`, `nc_talk_send` |
| **Aether Journals** | `ae_journal_list/search`, `ae_journal_entries_list`, `ae_journal_entry_read/create/update/disable/append/prepend` |
| **Agent Notes** | `agent_notes_read`, `agent_notes_write`, `agent_notes_append`, `agent_notes_clear` |
File, Shell, System, and some Notification tools are **admin-only** and not visible to regular users.
### Per-Role Tool Sets
Each role can be configured with a specific subset of tool categories. When a role has a tool subset configured, only those tools are sent to the orchestrator — the rest are invisible to the model for that session.
**Example:** a Coder role might only need Web, Files, Shell, and Agent Notes. A Research role might only need Web. Configuring this avoids sending schemas for 30+ irrelevant tools on every call.
Configure per-role tool sets in **Account → Model Registry → Role Assignments** — expand a role card to see the category checkboxes. The default (no checkboxes selected) sends all tools the user has access to.
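The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not Cortex's actual implementation: `TOOL_CATEGORIES` is an abbreviated copy of the table, and `tools_for_role` and its parameters are hypothetical names.

```python
# Abbreviated, hypothetical copy of the category table above.
TOOL_CATEGORIES = {
    "Web": ["web_search", "http_fetch"],
    "Files": ["file_read", "file_list", "file_write"],
    "Shell": ["shell_exec", "claude_allow_dir"],
    "Agent Notes": [
        "agent_notes_read",
        "agent_notes_write",
        "agent_notes_append",
        "agent_notes_clear",
    ],
}


def tools_for_role(selected_categories, allowed_tools):
    """Return the tool names to send to the orchestrator for one role.

    selected_categories: categories checked on the role card (may be empty).
    allowed_tools: tools the user is permitted to see at all.
    """
    allowed = set(allowed_tools)
    # No checkboxes selected is the default: send every tool the user may use.
    if not selected_categories:
        return sorted(allowed)
    chosen = set()
    for category in selected_categories:
        chosen.update(TOOL_CATEGORIES.get(category, []))
    # A role subset can never grant a tool the user lacks access to.
    return sorted(chosen & allowed)
```

For example, a regular user without Shell access who picks only **Web** would get `web_search` and `http_fetch`, and picking only **Shell** would yield an empty list.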
---
## Sessions
@@ -123,11 +166,59 @@ Each response shows a **model tag** (bottom-right of message) with the model lab
---
## Account Settings
**Navigate to:** ☰ (top-right menu) → **Account**
| Section | What you can do |
|---|---|
| **Account** | View your username, role badge (Admin / User), rename your username |
| **Connected Accounts** | See which Google account is linked for OAuth sign-in |
| **Email Allowlist** | Regex patterns controlling which addresses the `email_send` tool can reach |
| **Notifications** | Set which channel (NC Talk, Google Chat, email) Inara uses for proactive messages |
| **Tool Permissions** | Allow or block specific orchestrator tools for your account |
| **Usage** | Token consumption by model — see below |
| **Browser Cache** | Clear UI preferences stored locally (theme, font size, session ID, etc.) |
| **Model Registry** | Configure AI providers, local hosts, and role assignments |
| **Change Password** | Update your login password |
| **Personas** | List and rename your personas |
---
## Usage
Token consumption is tracked automatically for API-backed models. **Navigate to:** ☰ → **Account** → **Usage** section.
The table shows all-time totals per model key, with columns for:
| Column | Meaning |
|---|---|
| **Model** | `backend/model-name` key (e.g. `gemini_api/gemini-2.5-flash`, `local/deepseek-v4`) |
| **Calls** | Number of API calls made |
| **Prompt** | Input tokens sent |
| **Output** | Completion tokens received |
| **Total** | Prompt + Output |
Values ≥ 1,000 are displayed as `k` (e.g. `24.3k`).
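A minimal sketch of that display rule, assuming one decimal place as in the `24.3k` example. The function name `format_tokens` is hypothetical, and the real UI may round or trim differently (for instance, showing `1k` rather than `1.0k`).

```python
def format_tokens(n: int) -> str:
    """Format a token count for display, abbreviating thousands as 'k'."""
    if n < 1000:
        # Small values are shown as-is.
        return str(n)
    # Assumption: one decimal place, matching the 24.3k example.
    return f"{n / 1000:.1f}k"


print(format_tokens(24300))  # prints "24.3k"
print(format_tokens(999))    # prints "999"
```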
**What is and isn't tracked:**
- ✅ Gemini API calls (orchestrator, distillation)
- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
- ✗ Claude CLI — no structured token data is returned by the subprocess
- ✗ Gemini CLI — same reason
The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.
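To show how the table columns relate to each other, here is a hedged sketch that folds individual call records into per-model rows. The schema of `usage.json` is not documented here, so the record fields (`model`, `prompt`, `output`) and the `summarize` helper are assumptions made purely for illustration.

```python
from collections import defaultdict


def summarize(records):
    """Aggregate call records into per-model totals.

    Each record is assumed (hypothetically) to look like:
    {"model": "gemini_api/gemini-2.5-flash", "prompt": 1200, "output": 300}
    """
    totals = defaultdict(lambda: {"calls": 0, "prompt": 0, "output": 0, "total": 0})
    for rec in records:
        row = totals[rec["model"]]
        row["calls"] += 1
        row["prompt"] += rec["prompt"]
        row["output"] += rec["output"]
        # Total is always Prompt + Output, as in the Usage table.
        row["total"] = row["prompt"] + row["output"]
    return dict(totals)
```

Two calls to the same model key collapse into one row with `calls == 2`, which is the shape the Usage table displays.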
---
## Model Registry
Configure which AI models are available and which handles each task type.
**Navigate to:** ☰ (top-right menu) → **Account** → scroll to **Model Registry** → **Manage models →**
**New user quick path:** ☰ → **Account** → **Set up OpenRouter →** (the guided wizard adds a host, model, and role assignment in one step).
**Full manual path:** ☰ → **Account** → scroll to **Model Registry** → **Manage models →**
---
@@ -142,10 +233,16 @@ Do this before adding models — models need a provider account or local host to
2. Enter a label (e.g. "Work", "Personal") and your API key
3. Get a free key at [aistudio.google.com/apikey](https://aistudio.google.com/apikey)
**Local hosts** (Open WebUI, Ollama, OpenRouter, etc.):
**OpenRouter** (recommended for new users — one key for many models):
1. Get a key at [openrouter.ai/keys](https://openrouter.ai/keys)
2. Scroll to **Local Hosts** → **+ Add host**
3. Label: "OpenRouter", URL: `https://openrouter.ai/api/v1`, paste your key, Type: OpenAI-compatible
4. Click **Fetch models** to verify, then add models from the fetched list
**Other local hosts** (Open WebUI, Ollama, LM Studio, etc.):
1. Scroll to **Local Hosts** → click **+ Add host** to expand the form
2. Enter a label, the API URL (e.g. `http://192.168.1.100:3000`), and optional API key
3. Set **Type**: Open WebUI / Ollama, or OpenAI-compatible (for OpenRouter, LM Studio, etc.)
3. Set **Type**: Open WebUI / Ollama, or OpenAI-compatible
4. Click **Fetch models** on the saved host card to verify connectivity
---
@@ -178,6 +275,8 @@ Scroll to **Role Assignments** at the bottom of the page. Each role has **Primar
Leave all slots empty to use the server default.
**Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).
---
## Nextcloud Talk Bot
@@ -245,12 +344,12 @@ Controls how much context is prepended to each LLM call:
| Tier | Loads | ~Tokens |
|---|---|---|
| **T1** | SOUL + IDENTITY + USER summary | ~1,500 |
| **T2** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **T3** | + last 2 raw session logs | ~15,000 |
| **T4** | + last 7 raw session logs | ~50,000 |
| **Min** | SOUL + IDENTITY + USER summary | ~1,500 |
| **Std** | + USER full + PROTOCOLS + HELP + memory layers | ~5,000 |
| **Ext** | + last 2 raw session logs | ~15,000 |
| **Full** | + last 7 raw session logs | ~50,000 |
Default is T2. Use T1 for small/local models. Use T3–T4 for complex multi-session tasks.
Default is **Std**. Use **Min** for small/local models. Use **Ext** or **Full** for complex multi-session tasks.
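One way to reason about the choice is to pick the largest tier whose approximate cost fits the tokens you can spare for context. The sketch below uses the rough figures from the table; `largest_tier_within` is a hypothetical helper for illustration and is not part of Cortex.

```python
# Approximate token costs from the tier table, smallest to largest.
TIER_TOKENS = [("Min", 1_500), ("Std", 5_000), ("Ext", 15_000), ("Full", 50_000)]


def largest_tier_within(budget_tokens: int) -> str:
    """Return the largest tier whose approximate cost fits the budget.

    Falls back to "Min" when even the smallest tier exceeds the budget,
    since some context tier always has to be chosen.
    """
    best = "Min"
    for name, cost in TIER_TOKENS:
        if cost <= budget_tokens:
            best = name
    return best


print(largest_tier_within(8_000))  # prints "Std"
```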
### Memory Layers
@@ -318,6 +417,7 @@ For direct access or scripting:
| `GET` | `/orchestrate/{job_id}` | Poll job status and result |
| `GET` | `/settings/models` | Model registry UI |
| `POST` | `/api/models/role` | Set a role assignment (JSON body) |
| `POST` | `/api/models/role-config` | Set per-role tool list and system prompt append |
| `GET` | `/api/push/vapid-key` | VAPID public key (for push subscription) |
| `POST` | `/api/push/subscribe` | Register a push subscription |
| `DELETE` | `/api/push/subscribe` | Remove a push subscription |
@@ -325,6 +425,11 @@ For direct access or scripting:
| `GET` | `/api/audit/day?date=` | Tool call entries for a specific date (own data) |
| `GET` | `/api/audit/recent` | Recent tool calls across days (admin) |
| `GET` | `/api/audit/stats` | Tool call counts by tool/status/user (admin) |
| `GET` | `/api/usage` | Full daily token usage log (own data) |
| `GET` | `/api/usage/summary` | Per-model token totals, all time (own data) |
| `GET` | `/api/usage/all` | Per-model totals for all users (admin) |
| `GET` | `/setup/model` | Guided OpenRouter setup form (Step 3 / standalone) |
| `POST` | `/setup/model` | Save OpenRouter host + model + assign to chat role |
| `GET` | `/health` | Health check — returns `{"status": "ok"}` |
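For scripting, a health probe might look like the following sketch. It relies only on the documented `{"status": "ok"}` response. The base URL is whatever your Cortex server is reachable at (an assumption left to the caller), and both function names are hypothetical.

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True if the body is the documented {"status": "ok"} payload."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers invalid JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """GET {base_url}/health and validate the response.

    urlopen raises urllib.error.URLError (or HTTPError) when the server is
    unreachable or returns an error status; callers may want to catch that.
    """
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.status == 200 and is_healthy(resp.read())
```

Keeping the parsing in `is_healthy` separate from the network call makes the check easy to test without a running server.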
Chat request body (`POST /chat`):