feat: unified model registry with role-based routing

Introduces model_registry.py as the single source of truth for all LLM
backend configuration. Replaces scattered backend settings across
user_settings, config distill_backend_*, and the UI toggle.

model_registry.py:
- Per-user home/{user}/model_registry.json with version, hosts, models, roles
- Models have: type (local_openai|claude_cli|gemini_cli|gemini_api), label,
  model_name, host_id, context_k (tokens), tags (capability labels)
- Roles map to priority chains: primary, backup_1..backup_4
- Built-in IDs (claude_cli, gemini_cli, gemini_api) always resolvable
- Auto-migrates existing local_llm.json on first access
- CRUD: save_host, remove_host, save_model, remove_model, set_role
- get_model_for_role(): registry → .env default → hardcoded fallback

config.py:
- role_chat/orchestrator/distill/coder/research .env defaults
- defined_roles: comma-separated standard role list (extensible)
- get_defined_roles() and get_role_default() helper methods

llm_client.complete():
- New role= parameter (default "chat") for registry-based routing
- model= still accepted for explicit UI toggle override
- _claude() and _local() accept model_cfg dict instead of raw string
- _local() uses pre-resolved config from registry

memory_distiller.py:
- distill_mid/long now use role="distill" (no more distill_backend_* .env
  vars needed)

cron_runner.py:
- brief jobs use role="chat"

routers/chat.py + auth.py:
- Use model_registry instead of user_settings for local model info

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 21:25:18 -04:00
parent a4daebdc9b
commit 6a1a1c2686
7 changed files with 541 additions and 33 deletions
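
The resolution order named in the commit message (registry → .env default → hardcoded fallback) can be sketched roughly as follows. This is an illustration only, not the code in model_registry.py: the function name `resolve_model_id`, the exact JSON layout, the host entry, and the sample model are all assumptions built from the fields the message lists. The "claude_cli" fallback mirrors the default used by get_role_default() in the diff below.

```python
# Illustrative sketch only: layout and names are assumptions drawn from the
# commit message, not the actual contents of model_registry.py.
BUILTIN_IDS = {"claude_cli", "gemini_cli", "gemini_api"}
HARDCODED_FALLBACK = "claude_cli"  # same default as get_role_default()
CHAIN_SLOTS = ["primary", "backup_1", "backup_2", "backup_3", "backup_4"]


def resolve_model_id(registry: dict, env_defaults: dict, role: str) -> str:
    """Walk the role's priority chain, then the .env default, then the fallback."""
    models = registry.get("models", {})
    chain = registry.get("roles", {}).get(role, {})
    for slot in CHAIN_SLOTS:
        model_id = chain.get(slot)
        # Built-in IDs resolve even without an entry under "models".
        if model_id and (model_id in models or model_id in BUILTIN_IDS):
            return model_id
    return env_defaults.get(role) or HARDCODED_FALLBACK


# Hypothetical per-user registry with the top-level keys the message names.
example_registry = {
    "version": 1,
    "hosts": {"lan_box": {"url": "http://localhost:8000/v1"}},  # assumed shape
    "models": {
        "local_coder": {
            "type": "local_openai",
            "label": "Local coder",
            "model_name": "example-model",  # placeholder name
            "host_id": "lan_box",
            "context_k": 32,
            "tags": ["code"],
        }
    },
    "roles": {
        # "missing_model" is not registered, so the chain moves to backup_1.
        "coder": {"primary": "missing_model", "backup_1": "local_coder"},
        "chat": {"primary": "gemini_cli"},
    },
}
env_defaults = {"research": "gemini_api"}  # stands in for ROLE_RESEARCH in .env

print(resolve_model_id(example_registry, env_defaults, "coder"))     # local_coder
print(resolve_model_id(example_registry, env_defaults, "chat"))      # gemini_cli
print(resolve_model_id(example_registry, env_defaults, "research"))  # gemini_api
print(resolve_model_id(example_registry, env_defaults, "medical"))   # claude_cli
```

In this sketch a role with no usable registry entry ("research") falls through to the .env default, and a role with neither ("medical") lands on the hardcoded fallback, so callers always receive some resolvable ID.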
--- a/cortex/config.py
+++ b/cortex/config.py
@@ -65,6 +65,20 @@ class Settings(BaseSettings):
    distill_backend_mid: str = ""
    distill_backend_long: str = ""

+    # Model registry: default backend type per role when user registry has no entry.
+    # Values: "claude_cli" | "gemini_cli" | "gemini_api" (builtin IDs)
+    # Override in .env: ROLE_CHAT=claude_cli  ROLE_DISTILL=gemini_api  etc.
+    role_chat: str = "claude_cli"
+    role_orchestrator: str = "gemini_api"
+    role_distill: str = "claude_cli"
+    role_coder: str = "claude_cli"
+    role_research: str = "gemini_api"
+
+    # Comma-separated list of standard roles shown in the model settings UI.
+    # Add custom roles here to extend the UI without code changes.
+    # Example: DEFINED_ROLES=chat,orchestrator,distill,coder,research,medical
+    defined_roles: str = "chat,orchestrator,distill,coder,research"
+
    # Memory tier token budgets — soft caps used during distillation
    # Override in .env: MEMORY_BUDGET_LONG=4000 etc.
    memory_budget_long: int = 2000
@@ -90,6 +104,14 @@ class Settings(BaseSettings):

    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")

+    def get_defined_roles(self) -> list[str]:
+        """Return the ordered list of standard roles from the defined_roles setting."""
+        return [r.strip() for r in self.defined_roles.split(",") if r.strip()]
+
+    def get_role_default(self, role: str) -> str:
+        """Return the .env default backend type for a role (e.g. 'claude_cli')."""
+        return getattr(self, f"role_{role.replace('-', '_')}", "claude_cli")
+
    def home_root(self) -> Path:
        """Resolve home_dir relative to this file's location if not absolute."""
        if self.home_dir.is_absolute():