Adds host_type ("openwebui" | "openai") to the host schema so Cortex can
talk to both Open WebUI/Ollama and OpenRouter/standard-OpenAI endpoints.
Path differences per type:
openwebui (default): /api/chat/completions, /api/models
openai: /chat/completions, /models
model_registry.py:
- host_type added to host schema (default "openwebui", backward compat)
- save_host() accepts host_type parameter
- _resolve_model() passes host_type through with the merged host fields
llm_client._local():
- Reads host_type from resolved model_cfg
- Selects correct chat completions path accordingly
routers/local_llm.py:
- save_host route accepts host_type form field
- fetch-models uses /models for openai type, /api/models for openwebui
- Existing host rows show type selector pre-filled from stored value
local_llm.html:
- "Add host" form includes type selector
To use OpenRouter:
- Add host: URL = https://openrouter.ai/api/v1, Type = OpenAI-compatible
- API key from openrouter.ai (store in .env or model_registry.json only)
- Fetch models or add manually (e.g. anthropic/claude-sonnet-4-5-20251022)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces model_registry.py as the single source of truth for all LLM
backend configuration. Replaces scattered backend settings across user_settings,
config distill_backend_*, and the UI toggle.
model_registry.py:
- Per-user home/{user}/model_registry.json with version, hosts, models, roles
- Models have: type (local_openai|claude_cli|gemini_cli|gemini_api), label,
model_name, host_id, context_k (tokens), tags (capability labels)
- Roles map to priority chains: primary, backup_1..backup_4
- Built-in IDs (claude_cli, gemini_cli, gemini_api) always resolvable
- Auto-migrates existing local_llm.json on first access
- CRUD: save_host, remove_host, save_model, remove_model, set_role
- get_model_for_role(): registry → .env default → hardcoded fallback
config.py:
- role_chat/orchestrator/distill/coder/research .env defaults
- defined_roles: comma-separated standard role list (extensible)
- get_defined_roles() and get_role_default() helper methods
llm_client.complete():
- New role= parameter (default "chat") for registry-based routing
- model= still accepted for explicit UI toggle override
- _claude() and _local() accept model_cfg dict instead of raw string
- _local() uses pre-resolved config from registry
memory_distiller.py:
- distill_mid/long now use role="distill" (no more distill_backend_* .env vars needed)
cron_runner.py:
- brief jobs use role="chat"
routers/chat.py + auth.py:
- Use model_registry instead of user_settings for local model info
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix 'undefined' in auth banner: read access_token_hours_remaining (not hours_remaining)
- Fix false-positive warning on fresh tokens: when refresh token present, only warn
within 1 hour of expiry (not 24h) since the CLI should auto-rotate but sometimes misses
- Emit claude_auth_expired SSE event on 401 so UI shows inline red banner immediately
- app.js: handle claude_auth_expired SSE event with persistent top banner + dismiss button
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All hardcoded "Inara"/"Scott" strings replaced with settings.agent_name
and settings.user_name, read from .env at startup:
- config.py: AGENT_NAME and USER_NAME settings (defaults: Inara / Scott)
- llm_client.py: conversation labels in prompt builder
- session_logger.py: **Name:** labels in session log markdown
- memory_distiller.py: distillation system prompts (mid + long)
- routers/nextcloud_talk.py: @mention prefix strip
- routers/google_chat.py: greeting message
Second instance scaffolding:
- holly/: identity directory with placeholder files (USER_NAME=Holly,
AGENT_NAME to be chosen by Holly)
- cortex/.env.holly: config for Holly's instance on port 8001
- cortex-holly.service: systemd unit for the second instance
No behavioural change to the Inara/Scott instance — defaults unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cortex: FastAPI backend serving Inara via Claude/Gemini CLI backends.
Includes SSE streaming chat, session persistence, Google Chat webhook
handler, and Docker support.
Inara: Identity files (persona, soul, protocols, memory, context tiers)
mounted read-only into the container at runtime.
Features in initial cut:
- /chat endpoint with SSE keepalive + LLM fallback
- Session store with rolling history window
- Markdown rendering, copy-to-clipboard, links open in new tab
- Stacked right-column input controls (height selector, enter toggle,
note mode with public/private) — semi-hidden until textarea grows
- /note endpoint for injecting public context into session history
- Docker Compose config (local dev runs natively; Docker for server)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>