Tool schema optimization (PLAN__Tool_Schema_Optimization.md Phases 1-3):
- model_registry.py: ROLE_DEFAULT_TOOLS — distill gets [], research/coder get
narrow tool lists by default; applied in get_role_config() when user hasn't
configured a custom list
- openai_orchestrator.py: keyword routing via narrow_tools_by_keywords() — scans
user message + last assistant turn; narrows active schemas to matched categories
only (e.g. "weather" → 3 web tools instead of 69); zero tools sent for pure chat
- openai_orchestrator.py: _get_cached_tools() — module-level schema cache keyed by
(role, sorted_tool_list, risk_params); eliminates redundant schema rebuilds
- openai_orchestrator.py: _TOOL_SCHEMA_OVERHEAD 3000 → 500 tokens (schemas now
excluded from the per-call fixed estimate since they're cached separately)
- tools/__init__.py: CATEGORY_TOOL_MAP + _KEYWORD_CATEGORY_MAP + classify_tool_categories()
+ narrow_tools_by_keywords() — the classifier logic lives here so both orchestrators
can share it
aider_run tool (cortex/tools/aider.py):
- Invokes Aider as a subprocess with --message --yes-always --no-pretty --no-stream
- Project aliases: cortex / aether_api / aether_frontend / aether_container
- Auto-injects OpenRouter API key from Cortex model registry (no ~/.env needed)
- background=True fires async + registers in agent_manager; notify=True sends push
notification on completion
- admin-only, confirm-required, TOOL_RISK=high
- .gitignore: added .aider.chat.history.md / .aider.input.history / .aider.llm.history
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New /settings/tools page: max_risk selector (low/medium/high) + per-tool
override dropdowns (Default / Force include / Force exclude) for all 58 tools
grouped by category with color-coded risk badges; JS updates Auto status live
- get_tools_for_role() + get_openai_tools_for_role() now accept max_risk,
whitelist, blacklist; _apply_risk_policy() handles the filtering logic
- get_risk_policy() helper in auth_utils reads from tool_policy.json
- Risk policy wired through orchestrator.py, openai_orchestrator.py,
orchestrator_engine.py, nextcloud_talk.py, homeassistant.py
- Tools nav link added to settings.html and notifications.html
- CLAUDE.md and ARCH__SYSTEM.md updated: tool count 50→58, risk system docs,
tool access control three-layer model documented
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- reasoning_budget_tokens: optional int field on local_openai models;
when set, injects {"reasoning": {"budget_tokens": N}} via extra_body
into every OpenRouter API call (both tool-loop and confirmation-gate
rounds). Field exposed in the model edit form in Settings.
- session name moved from standalone full-row div between #messages
and #input-area into the persona-switcher block in the header, as a
third dim line under "Cortex · Local". Collapses when empty via
:empty CSS. No JS changes required.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- inject_mode: per-role toggle (parallel to inject_datetime) gates the
"Current mode: Off The Record" line in the system prompt; wired through
model_registry, context_loader, chat router, orchestrator router, and
local_llm settings UI
- OTR orchestrator fix: OrchestrateRequest now carries off_record;
_finalize_job stores it per message and gates log_turn on it; JS
orchestrate payload sends off_record correctly
- Per-message hover metadata: removed always-visible .model-tag; replaced
with .msg-meta strip in the action bar (hover-only); shows model label,
host, fallback indicator, and OTR badge; stored in session JSON
- Send/stop button tooltip: shows role + model and (when tools on)
separate orchestrator model + engine label; live elapsed timer on stop
button via startRunTimer/stopRunTimer
- OrchestratorResult.backend_label: new field; openai_orchestrator fills
it; finalize_job propagates it to job dict and session messages
- GET /backend: exposes orchestrator_model label so the frontend tooltip
can show both models separately
- TODO: session delete confirmation added
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extract orchestrator inner loop into _doOrchestrate() so the retry button
can re-run without re-adding the user message to DOM or history — same
pattern as the existing chat retry.
Also set AsyncOpenAI(timeout=settings.timeout_local) so slow remote models
(OpenRouter/DeepSeek) get the same 300s budget as local chat calls instead
of the SDK default which varies by connection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Protects all models in the Primary/Backup chain regardless of context window:
- _context_budget(): 75% of model_cfg["context_k"] * 1000 (default 32k if unset)
- _estimate_tokens(): char count / 4 + 3k overhead for tool schemas
- _compact_messages(): truncates old tool results to 400 chars, keeps last 6
intact (~2 recent rounds), logs chars saved per compaction pass
- Compaction runs before every API call; log line now shows estimated token count
- Malformed tool call args logged with model/args detail instead of silent {}
- finish_reason check accepts "stop" and None alongside "tool_calls" (some
models return wrong reason even when tool_calls are present)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- tool_audit: ContextVars (engine, model) set at orchestrator run start; fields added to every entry
- orchestrator_engine: tool_audit.set_context("gemini", model_name) at run() start
- openai_orchestrator: tool_audit.set_context("openai", model label) at run() start
- audit table: Model column between Status and Args
- HELP.md: push notifications section, audit log in Files section, tool count 30→40, new API endpoints
- TODO__Agents.md: web_push and audit log marked complete with full detail
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each role in model_registry.json can now carry two optional keys:
system_append — injected into the system prompt at position 7 (after
memory, closest to the turn) for the active chat_role
tools — explicit tool allow-list; intersected with the user's
access-level filter so it can only restrict, never elevate
No changes needed for existing users — missing keys fall back to current
behavior. Add keys to a role to give it a specialty focus:
"coder": {
"primary": "claude_cli",
"system_append": "You are in code-specialist mode...",
"tools": ["web_search", "file_read", "shell_exec", "scratch_write"]
}
Changes:
- model_registry.py: get_role_config() returns system_append + tools
- context_loader.py: role_append param appended as "--- Role Context ---"
- tools/__init__.py: get_tools_for_role/get_openai_tools_for_role accept
optional tool_list and intersect with access-level filter
- orchestrator_engine.py: tool_list threaded through run/resume/checkpoint
- openai_orchestrator.py: tool_list threaded through run/resume/checkpoint;
_build_client now calls get_openai_tools_for_role instead of returning
unfiltered OPENAI_TOOL_SCHEMAS
- routers/orchestrator.py: pulls role_cfg for chat_role, passes both
role_append and tool_list to context loader and engine
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes the broken confirmation gate where users had no way to approve
or deny a blocked tool call in the web UI.
Changes:
- orchestrator_engine.py: add OrchestrateCheckpoint dataclass, extract
loop into _run_from_contents(), add resume() function
- openai_orchestrator.py: same treatment — _run_from_messages(), resume()
- routers/orchestrator.py: POST /{job_id}/confirm and /deny endpoints,
separate _checkpoints store, _resume_job() + _finalize_job() helpers,
"awaiting_confirmation" job status with pending_confirmation payload
- auth_utils.py: get_tool_policy() and save_tool_policy() helpers reading
home/{user}/tool_policy.json (allow/deny lists)
- routers/orchestrator.py: loads tool_policy per user and passes
confirm_allow/confirm_deny to both engines
- app.js: poll loop handles awaiting_confirmation — shows Confirm/Deny
buttons inline, resumes polling after user action
- settings.html + settings.py: Tool Permissions section with allow/deny
textareas, POST /settings/tool-policy route
- style.css: .confirm-gate, .confirm-btn, .deny-btn styles
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- auth_utils: get_user_role() reads role from auth.json (admin|user, default user)
- manage_passwords: new `role` command to promote/demote users (admin-only by convention)
- tools/__init__: TOOL_ROLES map, CONFIRM_REQUIRED set, get_tools_for_role(),
get_openai_tools_for_role() — both orchestrators now filter tools by caller's role
- tools/system: cortex_restart (detached subprocess, 5s delay), cortex_logs (admin-only)
- tools/web: http_fetch — direct URL fetch, distinct from web_search
- tools/files: file_list (directory listing), file_write (restricted paths, admin-only)
- tools/notify: nc_talk_send — proactive outbound via notification.py
- orchestrator_engine + openai_orchestrator: user_role param; CONFIRM_REQUIRED tools
return a confirmation-request result instead of executing — loop breaks after Claude
asks user to confirm in a follow-up message
- home/scott/auth.json: role set to admin
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The AsyncOpenAI client always appends /chat/completions to base_url.
Open WebUI's endpoint is at /api/chat/completions, so for openwebui
host_type the base_url must include the /api prefix — same logic as
_local() in llm_client.py.
Also strip non-standard metadata fields (backend, host, etc.) from
session_messages before passing them to the API.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- openai_orchestrator.py — new ReAct tool loop engine for any
OpenAI-compatible endpoint (OpenRouter, Open WebUI, Ollama, LiteLLM);
model handles both tool loop and final response, no Claude handoff needed
- tools/__init__.py — auto-derive OpenAI JSON Schema from existing Gemini
FunctionDeclarations so tool definitions have a single source of truth
- routers/orchestrator.py — route to openai_orchestrator when model registry
"orchestrator" role resolves to a local_openai type host
- routers/chat.py — pass role to _backend_label(); fix fallback_used logic
(only meaningful for explicit backend overrides, not auto-routing)
- static/app.js — add null/"auto" to backend cycle; fetch local model hint
without overriding the auto default on page load
- model_registry.py — _normalize() back-fills host_type on old registry files
- requirements.txt — add openai>=1.0.0
- ARCH__BACKENDS.md — document OpenAI-compat backend and routing logic
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>