feat: SSH dev routing, model registry UX, chat input toolbar, doc sync

Backend / infrastructure:
- cortex/tools/_projects.py (new): shared project alias registry with ssh_host
  for workstation projects (aether_api, aether_frontend, aether_container)
- cortex/tools/git.py: all git tools route to workstation via SSH when ssh_host set
- cortex/tools/aider.py: aider_run SSH-routes to workstation using bash -l -c
- cortex/routers/local_llm.py: POST /api/models/{id}/edit AJAX endpoint — save
  model edits without page reload or tab reset; returns JSON {ok, label, model_name}
- cortex/llm_client.py: remove Gemini CLI and Claude CLI backends; clean up
  fallback chain and process group tracking (continuation of Gemini CLI removal)
- cortex/routers/auth.py: strip Claude/Gemini CLI auth status checks (CLI removed)
- cortex/routers/chat.py: remove legacy claude/gemini backend fields
- cortex/config.py: clean up CLI-related settings
- cortex/main.py: remove CLI lifecycle hooks
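
The SSH routing noted for `git.py` and `aider.py` can be pictured roughly as follows. This is a minimal sketch, not the code from this commit: `PROJECTS`, `build_command`, the host name `workstation`, and the paths are hypothetical. Only the `ssh_host` field and the `bash -l -c` wrapper come from the notes above.

```python
import shlex

# Hypothetical registry: alias -> settings. Only the ssh_host field is taken
# from the commit notes; the host name and paths are placeholders.
PROJECTS = {
    "aether_api": {"path": "/srv/aether_api", "ssh_host": "workstation"},
    "cortex": {"path": "/srv/cortex", "ssh_host": None},
}


def build_command(alias: str, argv: list[str]) -> list[str]:
    """Return the argv to execute for a project, wrapping it in ssh if needed."""
    project = PROJECTS[alias]
    # Quote each piece so paths and arguments with spaces survive the shell.
    inner = f"cd {shlex.quote(project['path'])} && {shlex.join(argv)}"
    if project["ssh_host"]:
        # A login shell (bash -l) loads the remote user's profile, so PATH
        # entries defined there are visible to the command.
        remote = f"bash -l -c {shlex.quote(inner)}"
        return ["ssh", project["ssh_host"], remote]
    return ["bash", "-c", inner]
```

The returned list could be handed to `subprocess.run`. Quoting the inner command twice (once for the remote shell, once for `bash -c`) is what keeps arguments intact across the SSH hop.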

UI:
- cortex/static/local_llm.html: model edit forms now save via fetch() + toast;
  stay on Models tab; update row header label in place on success
- cortex/static/index.html: restructure input area to column layout — textarea
  above, compact toolbar below (Chat/Tools/Attach + Send); fixes dead space at
  M/L/XL sizes; context panel "Role" → "Model" section label
- cortex/static/style.css: column input-area layout; #input-toolbar; flex:1 →
  width:100% on textarea (fixes scrollHeight in column flex context); compact
  send/stop button padding
- cortex/static/app.js: add XL (720px) to height cycle; default M (240px)

Docs:
- cortex/static/HELP.md: S/M/L → S/M/L/XL; add Rebuild to distill table; fix
  "Role selector" references (no such UI); fix "your active role" → Chat role;
  fix ⚡ toggle description; Model Registry section cleanup
- documentation/ARCH__BACKENDS.md: reflect CLI removal, current backend state

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scott Idem
2026-06-18 22:14:07 -04:00
parent 85223326b0
commit b144d8385f
15 changed files with 378 additions and 586 deletions


@@ -6,7 +6,7 @@
and are appended automatically by help.html when present.
-->
*Last updated: 2026-05-13*
*Last updated: 2026-06-18* <!-- input toolbar refactor; XL size added; help doc sync -->
---
@@ -44,7 +44,7 @@ The **Context & Memory** panel (sliders icon with tier number) contains all conf
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
| **Distill Memory** | Manually trigger Short / Mid / Long / All distillation |
| **Model** | Active chat model — click to cycle through your configured slot models (Primary → Backup 1 → …) |
| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L** cycles input area height · **⌃↵** toggles send shortcut |
| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L/XL** cycles input area height · **⌃↵** toggles send shortcut |
All settings persist in `localStorage` across page refreshes.
@@ -74,7 +74,7 @@ The orchestrator runs a multi-step tool loop:
3. The model produces the final user-facing reply — when the orchestrator role uses Gemini, Claude writes the final response; when it uses a local model, that same model writes it
4. Expandable tool-call cards appear above the response — click any card to see the arguments sent and the result returned
The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**.
The ⚡ toggle routes requests through the **Orchestrator** role model regardless of which chat model is active. Configure it in **Account → Model Registry → Role Assignments → Orchestrator**.
Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.
@@ -156,7 +156,7 @@ Once installed, opening Cortex from the home screen or app launcher skips the br
## Switching Models
The **Model** button in the Context & Memory panel cycles through the slot models configured for your active role (Primary → Backup 1). Click it to switch between models mid-session.
The **Model** button in the Context & Memory panel cycles through the slot models configured for your **Chat** role (Primary → Backup 1). Click it to switch between models mid-session.
- The button label shows the active model (e.g. "GPT-4o", "Gemini 2.5 Flash")
- The selected slot is sent with each chat request so the correct model is used
@@ -205,12 +205,11 @@ The table shows all-time totals per model key, with columns for:
Values ≥ 1,000 are displayed as `k` (e.g. `24.3k`).
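
The abbreviation rule can be sketched as below. This is an illustration only: `format_count` is a hypothetical name, and the one-decimal rounding (including a trailing `.0` for exact thousands) is an assumption based on the `24.3k` example, not confirmed behavior of the usage table.

```python
def format_count(n: int) -> str:
    """Abbreviate counts of 1,000 or more with a 'k' suffix, one decimal place."""
    if n >= 1000:
        # Assumed rounding: one decimal place, e.g. 24300 -> "24.3k".
        return f"{n / 1000:.1f}k"
    # Smaller values are shown as-is.
    return str(n)
```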
**What is and isn't tracked:**
**What is tracked:**
- ✅ Gemini API calls (orchestrator, distillation)
- ✅ Anthropic API calls (direct SDK)
- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
- ✗ Claude CLI — no structured token data is returned by the subprocess
- ✗ Gemini CLI — same reason
- ✅ Gemini API calls (orchestrator, distillation)
The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.
@@ -230,9 +229,10 @@ Configure which AI models are available and which handles each task type.
Do this before adding models — models need a provider account or local host to attach to.
**Anthropic (Claude):** Two options:
- **CLI (OAuth):** Nothing to configure — uses your existing `claude auth login` session. If Claude isn't working, run `claude auth login` in a terminal.
- **Direct API key:** Scroll to **Cloud Providers → Anthropic** → click **+ Add API key**. Enter a label and your `sk-ant-…` key from [console.anthropic.com/keys](https://console.anthropic.com/keys). When you add a model using an API key credential, it routes through the Anthropic SDK instead of the CLI.
**Anthropic (Claude):** Uses a direct API key — no Claude CLI required:
- Scroll to **Cloud Providers → Anthropic** → click **+ Add API key**
- Enter a label and your `sk-ant-…` key from [console.anthropic.com/keys](https://console.anthropic.com/keys)
- Models added with this credential call the Anthropic API directly via the SDK
**Google (Gemini):** Add one entry per API key you want to use:
1. Scroll to **Cloud Providers → Google** → click **+ Add Google account**
@@ -261,7 +261,7 @@ Scroll to **Add Model**. Select the provider tab, fill in the details, click **A
|---|---|
| **Local** | Select a host (from Step 1) → enter model name, or use **Fetch from host** to pick from a live list |
| **Google** | Select a Gemini model from the catalog → select a Google account (from Step 1) |
| **Anthropic** | Select a credential (CLI OAuth or an API key added in Step 1) → select a Claude model from the catalog |
| **Anthropic** | Select an API key credential (from Step 1) → select a Claude model from the catalog |
The label and context window size auto-fill from the catalog — edit them if you want. Tags are optional.
@@ -286,7 +286,7 @@ Scroll to **Role Assignments** at the bottom of the page. Each role has **Primar
| **Coder** | Code-focused tasks — larger context window, code-aware model |
| **Research** | Long-context research — high-token model, web tools prioritized |
Switch roles via the **Role** selector in the Context & Memory panel (⚙). Leave all slots empty to use the server default.
Leave all slots empty to use the server default.
**Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).
@@ -390,6 +390,7 @@ Distillation builds up the memory layers from raw session logs. Runs automatical
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
| **all** | Runs short → mid → long in sequence |
| **Rebuild** | ⚠ Wipes Mid + Long memories and rebuilds from session logs. Use to recover from distillation drift. Hand-edited content will be replaced. |
**Recommended workflow:** run **short** after any productive session; **mid** weekly; **long** monthly.
@@ -462,8 +463,7 @@ For direct access or scripting:
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/chat` | Send a message — returns SSE stream |
| `GET` | `/backend` | Get current primary/fallback backends |
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
| `GET` | `/backend` | Get configured model slots and orchestrator |
| `GET` | `/sessions` | List all sessions |
| `GET` | `/history/{id}` | Get session message history |
| `PUT` | `/history/{id}` | Replace full session history |