context_loader.load_context() now accepts a mode param ("chat"|"otr").
In OTR mode, the --- System --- block gains a second line:
Current mode: Off The Record — this conversation is private
and will not be logged or included in memory distillation
routers/chat.py passes mode="otr" when req.off_record is True.
Normal chat and all orchestrator calls stay at mode="chat" (no change
to the System block). The System block consolidates date/time and mode
in one place, matching the existing timestamp pattern.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
/history/{session_id} now returns a 'name' field alongside messages.
resumeSession() uses data.name first, then the sessionNames map, then
raw ID as fallback — so named sessions display correctly even on page
load before the sessions panel has been opened.
'Resumed session X' message also now shows the friendly name.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The backend toggle now cycles through configured roles (chat, coder,
research, distill, etc.) instead of backup model slots within the chat
role. Each role uses its own primary→backup chain from the registry.
- ChatRequest.slot replaced by chat_role (default "chat")
- GET /backend returns available_roles instead of chat_models
- _available_roles_for_toggle() builds list from defined_roles, excluding
orchestrator (which has its own Agent mode)
- Model label on responses now reflects the actual role's assigned model
- Toggle is inert when only one role is configured (avoids useless cycling)
- Add "Clear browser cache" button to Account Settings (Connected Accounts)
- Add _role_model_label() helper for cleaner response tag labeling
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend toggle now cycles through chat role models by label instead of
cycling service type strings (auto/claude/gemini/local).
- model_registry: get_model_for_slot() — resolves a specific priority
slot without walking the fallback chain
- llm_client: complete() gains slot param; explicit slot selection
dispatches directly to that model with no silent fallback
- routers/chat.py: ChatRequest.slot; GET /backend returns chat_models
[{slot, label, type}] for the UI; _stream_chat uses resolved model
label for the response tag when a slot is pinned
- app.js: toggle loads chat_models from /backend, cycles by label,
sends slot in chat payload; legacy model field removed from payload
- app.js: fix Gap B — agent mode placeholder no longer says "Gemini
tool loop"; now says "orchestrator"
- DESIGN doc: updated to reflect phases 1+2 complete, catalog-as-code
decision, Gap A/B documented, Phase 3 implementation details
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each assistant message in the session JSON now carries:
backend, backend_label, host (platform.node())
These fields are shown as model tags in the UI — on live responses and
when loading session history. Session log entries (sessions/YYYY-MM-DD.md)
include the backend label and host in the turn header.
The local (OpenAI-compat) backend strips non-standard fields before
sending messages to the API so extra fields don't leak upstream.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- openai_orchestrator.py — new ReAct tool loop engine for any
OpenAI-compatible endpoint (OpenRouter, Open WebUI, Ollama, LiteLLM);
model handles both tool loop and final response, no Claude handoff needed
- tools/__init__.py — auto-derive OpenAI JSON Schema from existing Gemini
FunctionDeclarations so tool definitions have a single source of truth
- routers/orchestrator.py — route to openai_orchestrator when model registry
"orchestrator" role resolves to a local_openai type host
- routers/chat.py — pass role to _backend_label(); fix fallback_used logic
(only meaningful for explicit backend overrides, not auto-routing)
- static/app.js — add null/"auto" to backend cycle; fetch local model hint
without overriding the auto default on page load
- model_registry.py — _normalize() back-fills host_type on old registry files
- requirements.txt — add openai>=1.0.0
- ARCH__BACKENDS.md — document OpenAI-compat backend and routing logic
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes:
- app.js was tracking primaryBackend locally but never included
model: primaryBackend in the /chat POST body, so the server always
used settings.primary_backend regardless of what the user clicked.
Now model: primaryBackend is sent on every chat request.
- Responses were only annotated when fallback occurred. Now every
assistant message shows a small model tag at the bottom right.
chat.py:
- _backend_label() resolves human-readable name:
claude → "Claude", gemini → "Gemini",
local → registry label (e.g. "Gemma 4 E4B") or model_name
- SSE payload now includes backend_label field
app.js:
- model: primaryBackend added to /chat fetch body
- After every response, appends .model-tag div with backend_label
- Fallback shows "⚡ fallback → <label>" in amber; normal is muted
- Removed separate system message for fallback (tag covers it)
style.css:
- .model-tag: small muted text, right-aligned, separated by thin line
- .model-tag.fallback: amber (#f59e0b)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces model_registry.py as the single source of truth for all LLM
backend configuration. Replaces scattered backend settings across user_settings,
config distill_backend_*, and the UI toggle.
model_registry.py:
- Per-user home/{user}/model_registry.json with version, hosts, models, roles
- Models have: type (local_openai|claude_cli|gemini_cli|gemini_api), label,
model_name, host_id, context_k (tokens), tags (capability labels)
- Roles map to priority chains: primary, backup_1..backup_4
- Built-in IDs (claude_cli, gemini_cli, gemini_api) always resolvable
- Auto-migrates existing local_llm.json on first access
- CRUD: save_host, remove_host, save_model, remove_model, set_role
- get_model_for_role(): registry → .env default → hardcoded fallback
config.py:
- role_chat/orchestrator/distill/coder/research .env defaults
- defined_roles: comma-separated standard role list (extensible)
- get_defined_roles() and get_role_default() helper methods
llm_client.complete():
- New role= parameter (default "chat") for registry-based routing
- model= still accepted for explicit UI toggle override
- _claude() and _local() accept model_cfg dict instead of raw string
- _local() uses pre-resolved config from registry
memory_distiller.py:
- distill_mid/long now use role="distill" (no more distill_backend_* .env vars needed)
cron_runner.py:
- brief jobs use role="chat"
routers/chat.py + auth.py:
- Use model_registry instead of user_settings for local model info
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a third input mode toggle alongside Note and Agent. When active:
- Textarea gets a subtle purple tint with dashed border
- OTR button highlights purple
- Placeholder reads "Off the record — not logged or distilled…"
- off_record=True is sent to /chat; session_logger is skipped
- In-memory session context is preserved within the session
Switching to Note or Agent mode deactivates OTR, and vice versa.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Session name field: PATCH /sessions/{id} endpoint, inline rename button in UI
- Persona rename: inline ✏ toggle form in settings, POST /settings/persona/rename
- Username rename: inline form in settings, POST /settings/username (renames home dir, forces re-login)
- Help page: dedicated /help route replacing modal, collapsible sections
- Per-persona isolation: files.py and session_store.py now scope to correct user/persona
- Contrast/visibility: muted text bumped to slate-400+, session rename btn at 0.4 opacity
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- session_store: store sessions under home/{user}/persona/{name}/session_data/
instead of the shared cortex/data/sessions/ bucket
- chat endpoints: add user/persona query params to /sessions, /history/*,
/sessions/*, /note so they resolve the correct persona context
- files router: add user/persona query params to /files and /files/{name}
so the file browser loads the right persona's files
- app.js: pass user/persona on all session, history, and file fetches;
move _fileParams to top-level scope so it is available everywhere
- onboarding: fix FastAPI route ordering — register /persona before /{token}
so the literal path wins and does not get captured as a token value
- ui.py: read Emoji field from IDENTITY.md and inject into CORTEX_CONFIG
so the header icon reflects each persona's chosen emoji
- .gitignore: exclude home/**/session_data/ (runtime state)
- migrate scott/inara sessions from cortex/data/sessions/ to session_data/
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Restructures persona storage from a flat personas/{name}/ layout to
home/{username}/persona/{name}/, mirroring Linux home directories.
Changes:
- persona.py: two ContextVars (user + persona), Linux-style name validation,
set_context(), get_user(), get_persona(), validate(), list_users(),
list_user_personas(); persona_path() takes (username, name)
- config.py: replaces personas_dir with home_dir + home_root()
- git mv personas/inara → home/scott/persona/inara (history preserved)
- home/holly/persona/tina/: Holly's persona stub added
- cron_runner.py: all storage functions take (username, persona) params
- tools/cron.py: stamps user + persona on jobs; APScheduler IDs are
{user}:{persona}:{job_id} to prevent collisions across users
- memory_distiller.py: distill_short/mid/long take (username, persona);
added missing Path + settings imports
- scheduler.py: _load_user_crons() iterates home/*/persona/* (two-level)
- routers/chat.py, orchestrator.py: user field added; set_context() called
- tests/conftest.py: home_root fixture with two-level structure;
patches home_dir instead of personas_dir
- tests/test_persona.py: fully rewritten for two-level API
- tests/test_api_files.py: updated fixture name and path
- .env.default: documents HOME_DIR setting; scrubs stale API key
- CLAUDE.md, README.md: directory maps updated for new layout
All 80 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add cortex/persona.py: ContextVar-based per-request routing with
path traversal protection and persona validation
- Migrate inara/ → personas/inara/ (git history preserved via git mv)
- config.py: add personas_root(), inara_path() delegates to personas/inara
- All 14 settings.inara_path() call sites replaced with persona_path()
- ChatRequest + OrchestrateRequest: add persona field (default: "inara")
with validation at request entry before any processing
- memory_distiller: add optional persona param for future per-persona distill
- cron_runner/tools/cron: stamp persona on jobs, prefix APScheduler IDs
(persona:job_id) to prevent collisions across personas
- scheduler: _load_user_crons() iterates all personas at startup
Adding a new persona: create personas/<name>/ with IDENTITY.md + SOUL.md.
Auth: handled at nginx level (inject X-Cortex-Persona header per subdomain).
Future: persona maps to Aether account_id_random for full integration.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Session delete:
- DELETE /sessions/{session_id} endpoint (chat.py + session_store.py)
- × button on each session item in the panel (hover-reveal on desktop)
- Clears UI if the active session is deleted
Touch accessibility:
- @media (hover: none) rule makes msg-actions always visible on touch devices
- msg-act-btn tap targets enlarged to 36px min-height, readable font size
- session-delete-btn also always visible and finger-sized on touch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Edit/delete individual messages from session context with inline editing
(Ctrl+Enter saves, Escape cancels); changes sync to backend via PUT /history
- PUT /history/{session_id} endpoint to replace full message list
- Named sessions: readable slugs (e.g. quiet-spring) instead of UUID fragments
- Scroll no longer snaps to bottom when user has scrolled up to read history
- cortex.service: systemd unit for auto-start and restart-on-failure
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cortex: FastAPI backend serving Inara via Claude/Gemini CLI backends.
Includes SSE streaming chat, session persistence, Google Chat webhook
handler, and Docker support.
Inara: Identity files (persona, soul, protocols, memory, context tiers)
mounted read-only into the container at runtime.
Features in initial cut:
- /chat endpoint with SSE keepalive + LLM fallback
- Session store with rolling history window
- Markdown rendering, copy-to-clipboard, links open in new tab
- Stacked right-column input controls (height selector, enter toggle,
note mode with public/private) — semi-hidden until textarea grows
- /note endpoint for injecting public context into session history
- Docker Compose config (local dev runs natively; Docker for server)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>