18 Commits

Author SHA1 Message Date
Scott Idem
29d8aa4aae feat: tool schema optimization, keyword routing, aider_run coding agent
Tool schema optimization (PLAN__Tool_Schema_Optimization.md Phases 1-3):
- model_registry.py: ROLE_DEFAULT_TOOLS — distill gets [], research/coder get
  narrow tool lists by default; applied in get_role_config() when user hasn't
  configured a custom list
- openai_orchestrator.py: keyword routing via narrow_tools_by_keywords() — scans
  user message + last assistant turn; narrows active schemas to matched categories
  only (e.g. "weather" → 3 web tools instead of 69); zero tools sent for pure chat
- openai_orchestrator.py: _get_cached_tools() — module-level schema cache keyed by
  (role, sorted_tool_list, risk_params); eliminates redundant schema rebuilds
- openai_orchestrator.py: _TOOL_SCHEMA_OVERHEAD 3000 → 500 tokens (schemas now
  excluded from the per-call fixed estimate since they're cached separately)
- tools/__init__.py: CATEGORY_TOOL_MAP + _KEYWORD_CATEGORY_MAP + classify_tool_categories()
  + narrow_tools_by_keywords() — the classifier logic lives here so both orchestrators
  can share it

aider_run tool (cortex/tools/aider.py):
- Invokes Aider as a subprocess with --message --yes-always --no-pretty --no-stream
- Project aliases: cortex / aether_api / aether_frontend / aether_container
- Auto-injects OpenRouter API key from Cortex model registry (no ~/.env needed)
- background=True fires async + registers in agent_manager; notify=True sends push
  notification on completion
- admin-only, confirm-required, TOOL_RISK=high
- .gitignore: added .aider.chat.history.md / .aider.input.history / .aider.llm.history

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-03 22:39:44 -04:00
Scott Idem
7a27190ffe feat: custom roles, Tailwind settings pages, pg.css fixes, doc cleanup
Model Registry:
- Add per-user custom roles (add/remove via UI); required roles chat/orchestrator/distill
  are always present and cannot be removed
- Auto-migrate legacy .env-defined roles to custom_roles on first access
- Role config panel (gear): Remove role button moved inside panel; required badge below name
- Role select: Primary + Backup slots only (was three)

Settings pages — Tailwind CSS migration (CDN, preflight: false):
- local_llm.html, settings.html, help.html, notifications.html, tools_settings.html,
  crons.html, integrations.html all migrated; pg-* color tokens; dark/light via data-theme

pg.css fixes:
- input[type=checkbox/radio]: width: auto — prevents pg.css width:100% from stretching checkboxes
- btn-submit: responsive sizing via Tailwind w-full md:w-96 on each button (no longer full-width on desktop)

Documentation:
- MASTER.md, TODO__Agents.md: remove "/ Inara" from titles; description updated to "per-user AI personas"
- HELP.md: persona-agnostic language throughout (NC Talk, Google Chat, push, schedules, HA sections);
  roles section restructured to show required vs. custom roles with examples
- notifications.html: subtitle and HA description use "your persona" not "Inara"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 21:03:11 -04:00
Scott Idem
a92fd90f0d feat: Anthropic SDK backend — API key alternative to Claude CLI OAuth
Adds `anthropic_api` model type so users can authenticate with a direct
Anthropic API key instead of (or alongside) the CLI OAuth session.

- model_registry.py: `anthropic_api` type; `save/get/remove_anthropic_api_key()`
  mirroring the Google account pattern; `save_cloud_model()` now picks type
  based on credential type (cli → claude_cli, api_key → anthropic_api);
  `_resolve_model()` merges api_key from the credential entry
- llm_client.py: `_anthropic_api()` backend (AsyncAnthropic SDK); dispatch
  and fallback wiring; usage tracking
- routers/local_llm.py: Anthropic API key management routes
  (POST /settings/local/anthropic-key, /anthropic-key/{id}/remove);
  `anthropic_api` badge and edit-form credential selector
- static/local_llm.html: Anthropic Cloud Provider block now shows API key
  management (add/remove); Add Model → Anthropic tab has credential selector
  (CLI vs API key)
- requirements.txt: enable anthropic>=0.40.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 21:30:56 -04:00
Scott Idem
a66c5a7f84 feat: reasoning token budget + session name in header
- reasoning_budget_tokens: optional int field on local_openai models;
  when set, injects {"reasoning": {"budget_tokens": N}} via extra_body
  into every OpenRouter API call (both tool-loop and confirmation-gate
  rounds). Field exposed in the model edit form in Settings.

- session name moved from standalone full-row div between #messages
  and #input-area into the persona-switcher block in the header, as a
  third dim line under "Cortex · Local". Collapses when empty via
  :empty CSS. No JS changes required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 21:35:23 -04:00
Scott Idem
85792a7bcf feat: per-role inject_mode, OTR fixes, hover metadata, send/stop tooltip
- inject_mode: per-role toggle (parallel to inject_datetime) gates the
  "Current mode: Off The Record" line in the system prompt; wired through
  model_registry, context_loader, chat router, orchestrator router, and
  local_llm settings UI

- OTR orchestrator fix: OrchestrateRequest now carries off_record;
  _finalize_job stores it per message and gates log_turn on it; JS
  orchestrate payload sends off_record correctly

- Per-message hover metadata: removed always-visible .model-tag; replaced
  with .msg-meta strip in the action bar (hover-only); shows model label,
  host, fallback indicator, and OTR badge; stored in session JSON

- Send/stop button tooltip: shows role + model and (when tools on)
  separate orchestrator model + engine label; live elapsed timer on stop
  button via startRunTimer/stopRunTimer

- OrchestratorResult.backend_label: new field; openai_orchestrator fills
  it; finalize_job propagates it to job dict and session messages

- GET /backend: exposes orchestrator_model label so the frontend tooltip
  can show both models separately

- TODO: session delete confirmation added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:12:03 -04:00
Scott Idem
09d775b47b feat: spawn_agent tool + host max_concurrent + docs
Adds a synchronous sub-agent spawning tool that lets the orchestrator
delegate tasks to a specific role's model and tool set.

- cortex/tools/agents.py: spawn_agent(task, role, tier, timeout, max_rounds)
  - Supports local_openai and gemini_api model types
  - Per-host asyncio semaphore (keyed by host_id or model type)
  - asyncio.wait_for() enforces timeout; admin-only tool
- cortex/model_registry.py: max_concurrent field in host schema (default 3,
  clamped 1-20); backfilled on _normalize() for existing hosts
- cortex/routers/local_llm.py + local_llm.html: "Max parallel" number input
  in host add/edit forms
- cortex/tools/__init__.py: spawn_agent registered in TOOL_CATEGORIES["Agents"],
  _CALLABLES, TOOL_ROLES (admin), and _ALL_DECLARATIONS
- Docs: TOOLS.md count 44→45, spawn_agent section; HELP.md tool table updated;
  ARCH__FUTURE.md Round 2 completed items; TODO__Agents.md spawn_agent checked;
  CLAUDE.md tool count and list updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 22:48:21 -04:00
Scott Idem
6ad7597db8 feat: per-role inject_datetime toggle for system prompt
Each role can now disable the current date/time header injected into the
system prompt. Default is true (all existing roles unchanged). Useful for
pure processing roles (summarizer, classifier, translator) where temporal
context is irrelevant or could cause unexpected model behavior.

Changes:
- model_registry: set_role_config/get_role_config gain inject_datetime field
- context_loader: load_context gains inject_datetime param (default True)
- orchestrator router: passes inject_datetime from role_cfg to load_context
- local_llm router: reads inject_datetime from POST body, passes to registry;
  role_config_data_js includes the field
- local_llm.html: checkbox in role config panel; populate on open, save on submit

Session logs still timestamp every turn (HH:MM header in YYYY-MM-DD.md files)
regardless of this setting — the toggle only affects the system prompt header.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 21:53:35 -04:00
Scott Idem
f8f7cd75da feat: audit log, usage tracking UI, OpenAI orchestrator compaction, onboarding + docs
Tool audit log:
- Every orchestrator tool call logged to home/{user}/tool_audit/YYYY-MM-DD.jsonl
- Files panel sidebar: audit log group (collapsed), date-linked read-only table
- Admin endpoints: /api/audit/files, /api/audit/day, /api/audit/recent, /api/audit/stats
- Engine and model name recorded per entry

OpenAI orchestrator improvements:
- Context budget enforcement: 75% of model context_k (min 16k)
- Message compaction: truncates old tool results when approaching budget
- max_rounds respected per model config (intersected with server cap)

OpenRouter onboarding (setup.html, onboarding.py, app.js, settings.html):
- Step 3 of 3: /setup/model with curated model picker
- Chat banner for users on server-default model (informational, not alarmist)
- Settings quick-link card; /setup/model works standalone for existing users

Model registry + session store:
- set_role_config / get_role_config for per-role tool lists and system_append
- session_store: session rename, session name backfill endpoint

UI updates (app.js, index.html, style.css, local_llm.html):
- Role toggle in context panel
- Off-the-record mode
- Agent notes read-only viewer
- OPERATIONS.md loaded at T2+ in context

Documentation:
- HELP.md: full tool table, per-role tool sets, Agent Notes, usage tracking
- TOOLS.md: Agent Notes section, count corrected to 44
- ARCH__SYSTEM.md, ARCH__BACKENDS.md, MASTER.md updated to match reality
- CLAUDE.md: onboarding flow, documentation philosophy sections
- README.md: stack in practice, DeepSeek TUI mention, architecture diagram updated
- TODO__Agents.md: onboarding task completed with deviation notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 21:26:43 -04:00
Scott Idem
eab92d876d refactor: split tool declarations into domain files + role config UI
tools/__init__.py shrinks from 1,137 → 250 lines. Each domain file now
owns both its callables and its FunctionDeclarations (DECLARATIONS list),
so adding a new tool only touches one file.

New TOOL_CATEGORIES dict exported from __init__ — used by the UI for
grouped tool checkboxes.

Role config UI (Settings → Model Registry → Role Assignments):
- ⚙ button per role expands an inline configure panel
- Textarea for system_append (injected into system prompt for this role)
- Grouped checkboxes for tool allow-list (all checked = no restriction)
- POST /api/models/role-config saves both fields; updates ROLE_CONFIG_DATA
  in-page so re-open reflects current state without a page reload

Backend:
- model_registry.set_role_config() writes system_append + tools to registry
- TOOL_CATEGORIES exported from tools/__init__ for UI rendering
- TOOLS.md header updated: 30 → 39 tools (ae_journal_* and cortex_* additions)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-01 20:40:50 -04:00
Scott Idem
49123cdd5c feat: per-role tool lists and system prompt overlays
Each role in model_registry.json can now carry two optional keys:
  system_append — injected into the system prompt at position 7 (after
                  memory, closest to the turn) for the active chat_role
  tools         — explicit tool allow-list; intersected with the user's
                  access-level filter so it can only restrict, never elevate

No changes needed for existing users — missing keys fall back to current
behavior. Add keys to a role to give it a specialty focus:

  "coder": {
    "primary": "claude_cli",
    "system_append": "You are in code-specialist mode...",
    "tools": ["web_search", "file_read", "shell_exec", "scratch_write"]
  }

Changes:
- model_registry.py: get_role_config() returns system_append + tools
- context_loader.py: role_append param appended as "--- Role Context ---"
- tools/__init__.py: get_tools_for_role/get_openai_tools_for_role accept
  optional tool_list and intersect with access-level filter
- orchestrator_engine.py: tool_list threaded through run/resume/checkpoint
- openai_orchestrator.py: tool_list threaded through run/resume/checkpoint;
  _build_client now calls get_openai_tools_for_role instead of returning
  unfiltered OPENAI_TOOL_SCHEMAS
- routers/orchestrator.py: pulls role_cfg for chat_role, passes both
  role_append and tool_list to context loader and engine

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-01 20:00:38 -04:00
Scott Idem
962d58d2e2 feat: model registry Phase 3 — slot-based backend toggle
Backend toggle now cycles through chat role models by label instead of
cycling service type strings (auto/claude/gemini/local).

- model_registry: get_model_for_slot() — resolves a specific priority
  slot without walking the fallback chain
- llm_client: complete() gains slot param; explicit slot selection
  dispatches directly to that model with no silent fallback
- routers/chat.py: ChatRequest.slot; GET /backend returns chat_models
  [{slot, label, type}] for the UI; _stream_chat uses resolved model
  label for the response tag when a slot is pinned
- app.js: toggle loads chat_models from /backend, cycles by label,
  sends slot in chat payload; legacy model field removed from payload
- app.js: fix Gap B — agent mode placeholder no longer says "Gemini
  tool loop"; now says "orchestrator"
- DESIGN doc: updated to reflect phases 1+2 complete, catalog-as-code
  decision, Gap A/B documented, Phase 3 implementation details

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:43:08 -04:00
Scott Idem
3bc6b45f9f fix: update Anthropic model catalog — add previous versions, fix context sizes
- Add Claude Opus 4.6 and Sonnet 4.5 (previous versions, still available)
- Fix context_k for Opus 4.7 and Sonnet 4.6: both have 1M context (was 200)
- Haiku 4.5 context_k 200 is correct

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:16:22 -04:00
Scott Idem
ef07596955 fix: update Google model catalog to current models (April 2026)
Remove outdated gemini-2.0-flash and gemini-1.5-pro.
Add gemini-2.5-flash-lite (GA) and the three Gemini 3.x preview
models (gemini-3.1-pro-preview, gemini-3-flash-preview,
gemini-3.1-flash-lite-preview). All have 1M token context windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 21:14:23 -04:00
Scott Idem
f08b033d6c feat: model registry Phase 2 — cloud provider UI (Anthropic + Google)
Adds cloud provider management to /settings/models:
- Google Accounts section: add/remove Gemini API keys with labels
- Add Model form: provider tabs (Local / Google / Anthropic) with
  catalog dropdowns that auto-fill label and context_k
- Provider badges on model rows (Anthropic / Google / Local)
- /settings/local now redirects to /settings/models (canonical URL)
- save_cloud_model() in model_registry for Anthropic/Google entries
- Distill role migration restored in _migrate_from_local_llm
- Test fixes: version assertions updated to V2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 20:41:06 -04:00
Scott Idem
45c95d20ba feat: model registry V2 — provider-aware schema with multi-account support
Adds a providers section to the per-user model registry for Anthropic and
Google as first-class providers alongside local hosts. Google accounts
(API keys) are now stored as a list so multiple Google accounts can coexist.

Changes:
- model_registry.py: V2 schema, auto migration V1→V2 (pulls gemini_api_key
  from auth.json into providers.google.accounts), _resolve_model() merges
  account API key for gemini_api type models
- routers/orchestrator.py: uses model-resolved api_key when orchestrator
  role resolves to a gemini_api model with account_id
- ANTHROPIC_CATALOG and GOOGLE_CATALOG constants for model picker (Phase 2)
- New functions: get_google_api_key(), save/remove_google_account(), get_catalog()
- Documentation: ARCH__BACKENDS.md updated to V2 schema, DESIGN doc added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 20:21:04 -04:00
Scott Idem
d9a322164a feat: OpenAI-compatible orchestrator + backend auto-routing
- openai_orchestrator.py — new ReAct tool loop engine for any
  OpenAI-compatible endpoint (OpenRouter, Open WebUI, Ollama, LiteLLM);
  model handles both tool loop and final response, no Claude handoff needed
- tools/__init__.py — auto-derive OpenAI JSON Schema from existing Gemini
  FunctionDeclarations so tool definitions have a single source of truth
- routers/orchestrator.py — route to openai_orchestrator when model registry
  "orchestrator" role resolves to a local_openai type host
- routers/chat.py — pass role to _backend_label(); fix fallback_used logic
  (only meaningful for explicit backend overrides, not auto-routing)
- static/app.js — add null/"auto" to backend cycle; fetch local model hint
  without overriding the auto default on page load
- model_registry.py — _normalize() back-fills host_type on old registry files
- requirements.txt — add openai>=1.0.0
- ARCH__BACKENDS.md — document OpenAI-compat backend and routing logic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:18:18 -04:00
Scott Idem
a6e404c143 feat: host_type field for OpenRouter / OpenAI-compatible API support
Adds host_type ("openwebui" | "openai") to the host schema so Cortex can
talk to both Open WebUI/Ollama and OpenRouter/standard-OpenAI endpoints.

Path differences per type:
  openwebui (default): /api/chat/completions, /api/models
  openai:              /chat/completions,     /models

model_registry.py:
  - host_type added to host schema (default "openwebui", backward compat)
  - save_host() accepts host_type parameter
  - _resolve_model() passes host_type through with the merged host fields

llm_client._local():
  - Reads host_type from resolved model_cfg
  - Selects correct chat completions path accordingly

routers/local_llm.py:
  - save_host route accepts host_type form field
  - fetch-models uses /models for openai type, /api/models for openwebui
  - Existing host rows show type selector pre-filled from stored value

local_llm.html:
  - "Add host" form includes type selector

To use OpenRouter:
  - Add host: URL = https://openrouter.ai/api/v1, Type = OpenAI-compatible
  - API key from openrouter.ai (store in .env or model_registry.json only)
  - Fetch models or add manually (e.g. anthropic/claude-sonnet-4-5-20251022)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 21:11:22 -04:00
Scott Idem
6a1a1c2686 feat: unified model registry with role-based routing
Introduces model_registry.py as the single source of truth for all LLM
backend configuration. Replaces scattered backend settings across user_settings,
config distill_backend_*, and the UI toggle.

model_registry.py:
- Per-user home/{user}/model_registry.json with version, hosts, models, roles
- Models have: type (local_openai|claude_cli|gemini_cli|gemini_api), label,
  model_name, host_id, context_k (tokens), tags (capability labels)
- Roles map to priority chains: primary, backup_1..backup_4
- Built-in IDs (claude_cli, gemini_cli, gemini_api) always resolvable
- Auto-migrates existing local_llm.json on first access
- CRUD: save_host, remove_host, save_model, remove_model, set_role
- get_model_for_role(): registry → .env default → hardcoded fallback

config.py:
- role_chat/orchestrator/distill/coder/research .env defaults
- defined_roles: comma-separated standard role list (extensible)
- get_defined_roles() and get_role_default() helper methods

llm_client.complete():
- New role= parameter (default "chat") for registry-based routing
- model= still accepted for explicit UI toggle override
- _claude() and _local() accept model_cfg dict instead of raw string
- _local() uses pre-resolved config from registry

memory_distiller.py:
- distill_mid/long now use role="distill" (no more distill_backend_* .env vars needed)

cron_runner.py:
- brief jobs use role="chat"

routers/chat.py + auth.py:
- Use model_registry instead of user_settings for local model info

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 21:25:18 -04:00