feat: SSH dev routing, model registry UX, chat input toolbar, doc sync

Backend / infrastructure:
- cortex/tools/_projects.py (new): shared project alias registry with ssh_host for workstation projects (aether_api, aether_frontend, aether_container)
- cortex/tools/git.py: all git tools route to workstation via SSH when ssh_host set
- cortex/tools/aider.py: aider_run SSH-routes to workstation using bash -l -c
- cortex/routers/local_llm.py: POST /api/models/{id}/edit AJAX endpoint; saves model edits without page reload or tab reset; returns JSON {ok, label, model_name}
- cortex/llm_client.py: remove Gemini CLI and Claude CLI backends; clean up fallback chain and process group tracking (continuation of Gemini CLI removal)
- cortex/routers/auth.py: strip Claude/Gemini CLI auth status checks (CLI removed)
- cortex/routers/chat.py: remove legacy claude/gemini backend fields
- cortex/config.py: clean up CLI-related settings
- cortex/main.py: remove CLI lifecycle hooks

UI:
- cortex/static/local_llm.html: model edit forms now save via fetch() + toast; stay on Models tab; update row header label in place on success
- cortex/static/index.html: restructure input area to column layout, with textarea above and compact toolbar below (Chat/Tools/Attach + Send); fixes dead space at M/L/XL sizes; context panel "Role" → "Model" section label
- cortex/static/style.css: column input-area layout; #input-toolbar; flex:1 → width:100% on textarea (fixes scrollHeight in column flex context); compact send/stop button padding
- cortex/static/app.js: add XL (720px) to height cycle; default M (240px)

Docs:
- cortex/static/HELP.md: S/M/L → S/M/L/XL; add Rebuild to distill table; fix "Role selector" references (no such UI); fix "your active role" → Chat role; fix ⚡ toggle description; Model Registry section cleanup
- documentation/ARCH__BACKENDS.md: reflect CLI removal, current backend state

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 22:14:07 -04:00
parent 85223326b0
commit b144d8385f
15 changed files with 378 additions and 586 deletions
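
The SSH routing described in the commit message can be sketched roughly as below. This is a minimal illustration, not the code in `cortex/tools/_projects.py`: the `Project` dataclass, the `route` helper, the paths, the `workstation` host name, and the local `cortex` alias are all hypothetical. Only the `ssh_host` field, the `aether_api` alias, and the `bash -l -c` wrapping come from the commit message. The login shell is presumably there so the remote profile (and its `PATH`) is loaded before `aider` runs.

```python
from __future__ import annotations

import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Project:
    path: str                       # working directory of the project
    ssh_host: Optional[str] = None  # None means "run on this machine"


# Hypothetical registry: paths and host name are invented for illustration.
PROJECTS = {
    "aether_api": Project(path="/srv/aether/api", ssh_host="workstation"),
    "cortex": Project(path="/srv/cortex"),
}


def route(alias: str, argv: list[str], login_shell: bool = False) -> list[str]:
    """Build the argv that runs `argv` inside the project's directory.

    Projects with an ssh_host are wrapped in an ssh call; others run in a
    local shell. ssh hands the remote shell a single command string, so
    every piece is quoted with shlex before being joined.
    """
    project = PROJECTS[alias]
    command = f"cd {shlex.quote(project.path)} && {shlex.join(argv)}"
    if login_shell:
        # Wrap in a login shell so the profile is sourced first.
        command = shlex.join(["bash", "-l", "-c", command])
    if project.ssh_host is None:
        return ["sh", "-c", command]
    return ["ssh", project.ssh_host, command]


if __name__ == "__main__":
    print(route("aether_api", ["git", "status", "--short"]))
    print(route("aether_api", ["aider", "--message", "fix typo"], login_shell=True))
```

The returned list can be passed straight to `subprocess.run`. Quoting on the caller's side matters because arguments containing spaces (such as the aider message) would otherwise be re-split by the remote shell.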
--- a/cortex/static/HELP.md
+++ b/cortex/static/HELP.md
@@ -6,7 +6,7 @@
     and are appended automatically by help.html when present.
 -->

-*Last updated: 2026-05-13*
+*Last updated: 2026-06-18* <!-- input toolbar refactor; XL size added; help doc sync -->

 ---

@@ -44,7 +44,7 @@ The **Context & Memory** panel (sliders icon with tier number) contains all conf
 | **Memory Layers** | Toggle Long / Mid / Short memory on/off |
 | **Distill Memory** | Manually trigger Short / Mid / Long / All distillation |
 | **Model** | Active chat model — click to cycle through your configured slot models (Primary → Backup 1 → …) |
-| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L** cycles input area height · **⌃↵** toggles send shortcut |
+| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L/XL** cycles input area height · **⌃↵** toggles send shortcut |

 All settings persist in `localStorage` across page refreshes.

@@ -74,7 +74,7 @@ The orchestrator runs a multi-step tool loop:
 3. The model produces the final user-facing reply — when the orchestrator role uses Gemini, Claude writes the final response; when it uses a local model, that same model writes it
 4. Expandable tool-call cards appear above the response — click any card to see the arguments sent and the result returned

-The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**.
+The ⚡ toggle routes requests through the **Orchestrator** role model regardless of which chat model is active. Configure it in **Account → Model Registry → Role Assignments → Orchestrator**.

 Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.

@@ -156,7 +156,7 @@ Once installed, opening Cortex from the home screen or app launcher skips the br

 ## Switching Models

-The **Model** button in the Context & Memory panel cycles through the slot models configured for your active role (Primary → Backup 1). Click it to switch between models mid-session.
+The **Model** button in the Context & Memory panel cycles through the slot models configured for your **Chat** role (Primary → Backup 1). Click it to switch between models mid-session.

 - The button label shows the active model (e.g. "GPT-4o", "Gemini 2.5 Flash")
 - The selected slot is sent with each chat request so the correct model is used
@@ -205,12 +205,11 @@ The table shows all-time totals per model key, with columns for:

 Values ≥ 1,000 are displayed as `k` (e.g. `24.3k`).

-**What is and isn't tracked:**
+**What is tracked:**

-- ✅ Gemini API calls (orchestrator, distillation)
+- ✅ Anthropic API calls (direct SDK)
 - ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
-- ✗ Claude CLI — no structured token data is returned by the subprocess
-- ✗ Gemini CLI — same reason
+- ✅ Gemini API calls (orchestrator, distillation)

 The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.

@@ -230,9 +229,10 @@ Configure which AI models are available and which handles each task type.

 Do this before adding models — models need a provider account or local host to attach to.

-**Anthropic (Claude):** Two options:
-- **CLI (OAuth):** Nothing to configure — uses your existing `claude auth login` session. If Claude isn't working, run `claude auth login` in a terminal.
-- **Direct API key:** Scroll to **Cloud Providers → Anthropic** → click **+ Add API key**. Enter a label and your `sk-ant-…` key from [console.anthropic.com/keys](https://console.anthropic.com/keys). When you add a model using an API key credential, it routes through the Anthropic SDK instead of the CLI.
+**Anthropic (Claude):** Uses a direct API key — no Claude CLI required:
+- Scroll to **Cloud Providers → Anthropic** → click **+ Add API key**
+- Enter a label and your `sk-ant-…` key from [console.anthropic.com/keys](https://console.anthropic.com/keys)
+- Models added with this credential call the Anthropic API directly via the SDK

 **Google (Gemini):** Add one entry per API key you want to use:
 1. Scroll to **Cloud Providers → Google** → click **+ Add Google account**
@@ -261,7 +261,7 @@ Scroll to **Add Model**. Select the provider tab, fill in the details, click **A
 |---|---|
 | **Local** | Select a host (from Step 1) → enter model name, or use **Fetch from host** to pick from a live list |
 | **Google** | Select a Gemini model from the catalog → select a Google account (from Step 1) |
-| **Anthropic** | Select a credential (CLI OAuth or an API key added in Step 1) → select a Claude model from the catalog |
+| **Anthropic** | Select an API key credential (from Step 1) → select a Claude model from the catalog |

 The label and context window size auto-fill from the catalog — edit them if you want. Tags are optional.

@@ -286,7 +286,7 @@ Scroll to **Role Assignments** at the bottom of the page. Each role has **Primar
 | **Coder** | Code-focused tasks — larger context window, code-aware model |
 | **Research** | Long-context research — high-token model, web tools prioritized |

-Switch roles via the **Role** selector in the Context & Memory panel (⚙). Leave all slots empty to use the server default.
+Leave all slots empty to use the server default.

 **Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).

@@ -390,6 +390,7 @@ Distillation builds up the memory layers from raw session logs. Runs automatical
 | **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
 | **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
 | **all** | Runs short → mid → long in sequence |
+| **Rebuild** | ⚠ Wipes Mid + Long memories and rebuilds from session logs. Use to recover from distillation drift. Hand-edited content will be replaced. |

 **Recommended workflow:** run **short** after any productive session; **mid** weekly; **long** monthly.

@@ -462,8 +463,7 @@ For direct access or scripting:
 | Method | Endpoint | Description |
 |---|---|---|
 | `POST` | `/chat` | Send a message — returns SSE stream |
-| `GET` | `/backend` | Get current primary/fallback backends |
-| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
+| `GET` | `/backend` | Get configured model slots and orchestrator |
 | `GET` | `/sessions` | List all sessions |
 | `GET` | `/history/{id}` | Get session message history |
 | `PUT` | `/history/{id}` | Replace full session history |