Scott Idem 45c95d20ba feat: model registry V2 — provider-aware schema with multi-account support

Adds a providers section to the per-user model registry for Anthropic and
Google as first-class providers alongside local hosts. Google accounts
(API keys) are now stored as a list so multiple Google accounts can coexist.

Changes:
- model_registry.py: V2 schema, auto migration V1→V2 (pulls gemini_api_key
  from auth.json into providers.google.accounts), _resolve_model() merges
  account API key for gemini_api type models
- routers/orchestrator.py: uses model-resolved api_key when orchestrator
  role resolves to a gemini_api model with account_id
- ANTHROPIC_CATALOG and GOOGLE_CATALOG constants for model picker (Phase 2)
- New functions: get_google_api_key(), save/remove_google_account(), get_catalog()
- Documentation: ARCH__BACKENDS.md updated to V2 schema, DESIGN doc added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-27 20:21:04 -04:00


Model Registry V2 — Design Document

Status: Planning / Pre-implementation
Goal: Unified, provider-agnostic model management with clean role-based routing

Problem Statement

The current system has two classes of models with different treatment:

| Type | How configured | How selected |
| --- | --- | --- |
| Claude, Gemini | Hardcoded built-ins (`claude_cli`, `gemini_api`) | Backend toggle string ("claude"/"gemini") |
| Local (Ollama, Open WebUI) | Configured via `/settings/local` | Backend toggle string "local" |

This breaks down when you want:

Multiple Gemini API keys (e.g. one per Google account)
Claude via direct API key instead of OAuth CLI
OpenRouter or other hosted providers alongside local models
Role assignments to span all provider types uniformly
A chat toggle that shows "which model" not "which service"

Proposed Architecture

Core concept: Providers + Credentials + Models + Roles

Providers (built-in, fixed set)
  └─ Anthropic       ← has a catalog of Claude model IDs
  └─ Google          ← has a catalog of Gemini model IDs
  └─ Local Host      ← OpenAI-compatible endpoint (user adds these)

Credentials (user-configured, per provider)
  └─ Anthropic       ← Claude CLI (OAuth, default) or API key
  └─ Google          ← one or more API keys (one per Google account)
  └─ Local Host      ← api_key stored on the host record (existing)

Model Entries (user-registered — "I want to use this model")
  └─ Provider + model ID + credential = one usable model entry
  └─ Same model ID with two different accounts = two model entries

Role Assignments (unified — any model entry can fill any role)
  └─ chat:        primary → backup_1 → backup_2
  └─ orchestrator: primary → backup_1
  └─ distill:     primary
  └─ (etc.)

Backend toggle redesign

Current: cycles through service type strings: auto → claude → gemini → local
New: cycles through the chat role's assigned models: Primary → Backup 1 → Backup 2

The toggle displays the active model's label (e.g. "Sonnet 4.6" / "Gemini 2.5 Flash" / "Gemma 4 E4B"). Auto defaults to Primary.

This means the toggle is context-free — it just picks a slot — and all the "what model, what provider, what credentials" logic lives in the registry.

Data Model (V2 Schema)

Stored in home/{user}/model_registry.json.

{
  "version": 2,

  "providers": {
    "anthropic": {
      "catalog": [
        {"id": "claude-opus-4-7",    "label": "Claude Opus 4.7",    "context_k": 200},
        {"id": "claude-sonnet-4-6",  "label": "Claude Sonnet 4.6",  "context_k": 200},
        {"id": "claude-haiku-4-5",   "label": "Claude Haiku 4.5",   "context_k": 200}
      ],
      "credentials": [
        {"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}
      ]
    },
    "google": {
      "catalog": [
        {"id": "gemini-2.5-pro",   "label": "Gemini 2.5 Pro",   "context_k": 1000},
        {"id": "gemini-2.5-flash", "label": "Gemini 2.5 Flash", "context_k": 1000},
        {"id": "gemini-2.0-flash", "label": "Gemini 2.0 Flash", "context_k": 1000},
        {"id": "gemini-1.5-pro",   "label": "Gemini 1.5 Pro",   "context_k": 2000}
      ],
      "accounts": [
        {"id": "osit", "label": "One Sky IT (scott.idem@oneskyit.com)", "api_key": "AIza..."}
      ]
    }
  },

  "hosts": [
    {
      "id": "h1",
      "label": "Gaming Laptop",
      "api_url": "http://192.168.x.x:3000",
      "api_key": "",
      "host_type": "openwebui"
    }
  ],

  "models": [
    {
      "id": "m1",
      "label": "Sonnet 4.6 (CLI)",
      "type": "claude_cli",
      "provider": "anthropic",
      "model_name": "claude-sonnet-4-6",
      "credential_id": "cli",
      "context_k": 200,
      "tags": ["chat", "persona"]
    },
    {
      "id": "m2",
      "label": "Gemini 2.5 Flash (OSIT)",
      "type": "gemini_api",
      "provider": "google",
      "model_name": "gemini-2.5-flash",
      "account_id": "osit",
      "context_k": 1000,
      "tags": ["orchestrator", "research"]
    },
    {
      "id": "m3",
      "label": "Gemma 4 E4B",
      "type": "local_openai",
      "provider": "local",
      "host_id": "h1",
      "model_name": "gemma4:e4b",
      "context_k": 72,
      "tags": ["fast", "local"]
    }
  ],

  "roles": {
    "chat":        {"primary": "m1", "backup_1": "m2", "backup_2": "m3"},
    "orchestrator":{"primary": "m2", "backup_1": "m3"},
    "distill":     {"primary": "m1"}
  }
}
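Because roles point at models, and models point at credentials, accounts, or hosts, a dangling ID is the easiest way to break this file. The following is a minimal sketch of a reference check over the schema above. The function name `find_dangling_refs` is hypothetical (it is not an existing function in `model_registry.py`), and the sample registry is trimmed and uses placeholder values.

```python
def find_dangling_refs(registry: dict) -> list:
    """Return a list of problems; an empty list means every ID resolves."""
    providers = registry.get("providers", {})
    known = {
        "credential_id": {c["id"] for c in providers.get("anthropic", {}).get("credentials", [])},
        "account_id": {a["id"] for a in providers.get("google", {}).get("accounts", [])},
        "host_id": {h["id"] for h in registry.get("hosts", [])},
    }
    # Which reference field each model type is expected to carry.
    ref_field = {
        "claude_cli": "credential_id",
        "gemini_api": "account_id",
        "local_openai": "host_id",
    }

    problems = []
    model_ids = set()
    for model in registry.get("models", []):
        model_ids.add(model["id"])
        field = ref_field.get(model.get("type"))
        if field is None:
            problems.append(f"model {model['id']}: unknown type {model.get('type')!r}")
        elif model.get(field) not in known[field]:
            problems.append(f"model {model['id']}: {field} {model.get(field)!r} not found")

    for role, slots in registry.get("roles", {}).items():
        for slot, model_id in slots.items():
            if model_id not in model_ids:
                problems.append(f"role {role}.{slot}: model {model_id!r} not found")
    return problems


# Trimmed sample; "m9" is deliberately missing from the models list.
example = {
    "providers": {
        "anthropic": {"credentials": [{"id": "cli", "type": "cli"}]},
        "google": {"accounts": [{"id": "osit", "api_key": "example-key"}]},
    },
    "hosts": [{"id": "h1", "api_url": "http://localhost:3000", "api_key": ""}],
    "models": [
        {"id": "m1", "type": "claude_cli", "credential_id": "cli"},
        {"id": "m2", "type": "gemini_api", "account_id": "osit"},
        {"id": "m3", "type": "local_openai", "host_id": "h1"},
    ],
    "roles": {"chat": {"primary": "m1", "backup_1": "m2", "backup_2": "m9"}},
}
print(find_dangling_refs(example))
```

A check like this could run after load or before save, so that a removed Google account or host is reported rather than surfacing later as a failed request.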

Key differences from V1

| V1 | V2 |
| --- | --- |
| Built-ins (`claude_cli`, `gemini_api`) are hardcoded constants | All models are registry entries; built-ins become auto-populated defaults |
| Single Gemini API key in `auth.json` | `providers.google.accounts[]`: a list of accounts |
| Role assignments only work with local models in UI | All models in all roles |
| Host list only for local | Host list stays for local; `providers` section for cloud |
| `type` field existed but only `local_openai` was user-configurable | `type` fully determines dispatch for all models |

Resolution Logic (updated)

get_model_for_role(username, role) keeps the same interface. Internally:

Walk roles[role].primary → backup_1 → backup_2 → backup_3 → backup_4
For each slot: resolve the model entry → merge in credentials
If no registry entry for a role: fall back to .env defaults, then hardcoded defaults
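The slot walk above can be sketched as follows. This is an illustration under assumptions, not the code in `model_registry.py`: the name `first_usable_model` is hypothetical, the resolver and the fallback are passed in as callables so the sketch does not guess at the `.env` variable names, and "move to the next slot" is assumed to mean "the model entry could not be resolved".

```python
SLOT_ORDER = ("primary", "backup_1", "backup_2", "backup_3", "backup_4")


def first_usable_model(registry, role, resolve, fallback=None):
    """Return the first resolvable model config for a role, else the fallback."""
    slots = registry.get("roles", {}).get(role, {})
    for slot in SLOT_ORDER:
        model_id = slots.get(slot)
        if not model_id:
            continue  # slot not assigned
        config = resolve(registry, model_id)
        if config is not None:
            return config  # first slot that resolves wins
    # No assignment resolved: defer to .env / hardcoded defaults (supplied by caller).
    return fallback() if fallback else None


def lookup(registry, model_id):
    """Toy resolver for the demo: find the entry by ID, no credential merge."""
    return next((m for m in registry.get("models", []) if m["id"] == model_id), None)


registry = {
    "models": [{"id": "m2", "type": "gemini_api"}],
    "roles": {"chat": {"primary": "gone", "backup_1": "m2"}},
}
chat_model = first_usable_model(registry, "chat", lookup)
distill_model = first_usable_model(registry, "distill", lookup, fallback=lambda: {"id": "env-default"})
```

Here `chat` skips the unresolvable primary and lands on `m2`, while `distill` has no assignment and takes the fallback.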

_resolve_model(registry, model_id) gains new merge cases:

type == "claude_cli" → merge in credential from providers.anthropic.credentials
type == "gemini_api" → merge in api_key from providers.google.accounts[account_id]
type == "local_openai" → merge host fields (existing logic, unchanged)
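A minimal sketch of the three merge cases, assuming the V2 field names shown in the schema. The function is named `resolve_model` here to make clear it is a stand-in for `_resolve_model()`, and the exact set of host fields copied in the `local_openai` case is an assumption.

```python
def _find(items, item_id):
    """Return the first dict in items whose "id" matches, or None."""
    return next((item for item in items if item.get("id") == item_id), None)


def resolve_model(registry, model_id):
    """Return a copy of the model entry with credentials merged in, or None."""
    model = _find(registry.get("models", []), model_id)
    if model is None:
        return None
    resolved = dict(model)  # never mutate the stored entry
    providers = registry.get("providers", {})

    if model["type"] == "claude_cli":
        creds = providers.get("anthropic", {}).get("credentials", [])
        cred = _find(creds, model.get("credential_id"))
        if cred is None:
            return None
        resolved["credential"] = dict(cred)
    elif model["type"] == "gemini_api":
        accounts = providers.get("google", {}).get("accounts", [])
        account = _find(accounts, model.get("account_id"))
        if account is None:
            return None
        resolved["api_key"] = account["api_key"]
    elif model["type"] == "local_openai":
        host = _find(registry.get("hosts", []), model.get("host_id"))
        if host is None:
            return None
        # Assumed host fields; the existing logic may copy more or fewer.
        resolved["api_url"] = host["api_url"]
        resolved["api_key"] = host.get("api_key", "")
    return resolved


registry = {
    "providers": {"google": {"accounts": [{"id": "osit", "api_key": "example-key"}]}},
    "hosts": [{"id": "h1", "api_url": "http://localhost:3000", "api_key": ""}],
    "models": [
        {"id": "m2", "type": "gemini_api", "model_name": "gemini-2.5-flash", "account_id": "osit"},
        {"id": "m3", "type": "local_openai", "model_name": "gemma4:e4b", "host_id": "h1"},
    ],
}
gemini = resolve_model(registry, "m2")
local = resolve_model(registry, "m3")
```

Returning `None` when the referenced account, credential, or host is missing lets the caller fall through to the next role slot instead of dispatching with an empty key.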

Backend toggle → dispatch

UI sends: slot = "primary" | "backup_1" | "backup_2" | null (auto)

llm_client.complete() resolves the slot against the chat role, gets a full model config, dispatches by type. No more "claude"/"gemini"/"local" string matching.

Routing Code Changes

`llm_client.complete()`

Remove: model: str | None → service type string
Add: slot: str | None = None → role slot override ("primary"/"backup_1"/etc.)
Dispatch table: type → handler
- claude_cli → _claude() (unchanged)
- claude_api → _claude_api() (new, direct Anthropic API — future phase)
- gemini_cli → _gemini() (unchanged)
- gemini_api → _gemini_api() (new, replaces current hardcoded gemini_api built-in)
- local_openai → _local() (unchanged)
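The dispatch table could look roughly like the sketch below. The handlers here are stubs that only echo their name; the real `_claude()`, `_gemini()`, and `_local()` signatures are not shown in this document, so the `(config, prompt)` signature and the `dispatch` helper name are assumptions.

```python
def _stub(name):
    """Build a placeholder handler; the real handlers call out to a model."""
    def handler(config, prompt):
        return f"{name}:{config['model_name']}"
    return handler


# type -> handler, mirroring the list above.
HANDLERS = {
    "claude_cli": _stub("_claude"),
    "claude_api": _stub("_claude_api"),
    "gemini_cli": _stub("_gemini"),
    "gemini_api": _stub("_gemini_api"),
    "local_openai": _stub("_local"),
}


def dispatch(config, prompt):
    """Route a resolved model config to its handler based on its type."""
    model_type = config.get("type")
    handler = HANDLERS.get(model_type)
    if handler is None:
        raise ValueError(f"no handler for model type {model_type!r}")
    return handler(config, prompt)


result = dispatch({"type": "gemini_api", "model_name": "gemini-2.5-flash"}, "hello")
```

Keeping the mapping in one dictionary means a new provider type is one new entry plus one handler, with an explicit error for unknown types rather than a silent default.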

`orchestrator_engine.py` / `openai_orchestrator.py`

Get orchestrator model via get_model_for_role(username, "orchestrator")
Already works — openai_orchestrator.py runs when type is local_openai
orchestrator_engine.py (Gemini) runs when type is gemini_api

Chat router (`routers/chat.py`)

Accept slot instead of model from UI
Pass to llm_client.complete(slot=slot)

Settings UI Redesign

New page structure

/settings/models     ← unified model registry (replaces /settings/local)
  Section 1: Cloud Providers
    Anthropic
      - credential: Claude CLI (OAuth) [default, always there]
      - + Add API Key (future)
      - model catalog [editable list of available Claude models]
    Google
      - accounts: [osit key ●●●●, + Add account]
      - model catalog [editable list of available Gemini models]
  Section 2: Local Hosts
    [existing host cards, unchanged]
  Section 3: Models  
    [unified list — all registered model entries across all providers]
    + Add Model (provider picker first, then model + credential/account dropdowns)

/settings/roles      ← standalone page (or promoted to /settings/models bottom)
  Role Assignments
    chat:         [primary ▾] [backup 1 ▾] [backup 2 ▾]
    orchestrator: [primary ▾] [backup 1 ▾]
    distill:      [primary ▾]
    (all dropdowns show all models from all providers)

Backend toggle in chat UI

Replace the claude → gemini → local → auto cycle with:

[Model label] ▾ (clickable; cycles through chat role slots)

Shows the label of the currently active chat model
Click cycles: Primary → Backup 1 → Backup 2 → Primary
Slots with no model assigned are skipped
Color: same purple/amber/slate theme, based on provider type (optional)
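The cycling rule above, including skipped empty slots, can be sketched as a small pure function. It is written in Python for consistency with the backend examples, although the real toggle would live in the front end. The name `next_slot` is hypothetical, and treating `None` (auto) as "currently on the first assigned slot" is an assumption based on "Auto defaults to Primary".

```python
from typing import Optional

CHAT_SLOTS = ("primary", "backup_1", "backup_2")


def next_slot(chat_role: dict, current: Optional[str]) -> Optional[str]:
    """Return the slot a click should move to, skipping unassigned slots."""
    filled = [slot for slot in CHAT_SLOTS if chat_role.get(slot)]
    if not filled:
        return None  # nothing assigned, nothing to cycle through
    # Auto (None) or a stale slot is treated as sitting on the first filled slot.
    active = current if current in filled else filled[0]
    return filled[(filled.index(active) + 1) % len(filled)]


full = {"primary": "m1", "backup_1": "m2", "backup_2": "m3"}
sparse = {"primary": "m1", "backup_2": "m3"}
```

With `full`, clicking from auto moves to `backup_1` and clicking from `backup_2` wraps to `primary`; with `sparse`, the empty `backup_1` is skipped.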

Migration

V1 → V2 is handled in _load():

Detect version == 1 (or missing)
Synthesize providers.anthropic catalog from hardcoded defaults
Synthesize providers.google — migrate API key from auth.json as first account
Convert built-in role assignments (claude_cli / gemini_api) to new model entry IDs
Existing hosts[] and local_openai models carry over unchanged
Write version: 2 and save

No data loss. Old local_llm.json migration path still works (V0 → V1 → V2).
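The migration steps above might be sketched as follows. Several details are assumptions made for illustration: that V1 roles reference the built-ins by the literal IDs `claude_cli` and `gemini_api`, the new entry IDs `builtin_claude` and `builtin_gemini`, the account ID `default`, and the shortened catalogs. The Gemini key is passed in rather than read from `auth.json` so the sketch stays self-contained.

```python
import copy

# Shortened stand-ins for the hardcoded default catalogs.
ANTHROPIC_DEFAULTS = [{"id": "claude-sonnet-4-6", "label": "Claude Sonnet 4.6", "context_k": 200}]
GOOGLE_DEFAULTS = [{"id": "gemini-2.5-flash", "label": "Gemini 2.5 Flash", "context_k": 1000}]


def migrate_v1_to_v2(v1, gemini_api_key=None):
    """Return a V2 registry built from a V1 one; the input is left untouched."""
    if v1.get("version", 1) >= 2:
        return v1
    v2 = copy.deepcopy(v1)  # hosts[] and local_openai models carry over as-is

    accounts = []
    if gemini_api_key:
        accounts.append({"id": "default", "label": "Migrated from auth.json",
                         "api_key": gemini_api_key})
    v2["providers"] = {
        "anthropic": {
            "catalog": copy.deepcopy(ANTHROPIC_DEFAULTS),
            "credentials": [{"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}],
        },
        "google": {"catalog": copy.deepcopy(GOOGLE_DEFAULTS), "accounts": accounts},
    }

    # Old built-in reference -> new registry entry (hypothetical IDs).
    builtins = {
        "claude_cli": {"id": "builtin_claude", "label": "Sonnet 4.6 (CLI)",
                       "type": "claude_cli", "provider": "anthropic",
                       "model_name": "claude-sonnet-4-6", "credential_id": "cli"},
        "gemini_api": {"id": "builtin_gemini", "label": "Gemini 2.5 Flash",
                       "type": "gemini_api", "provider": "google",
                       "model_name": "gemini-2.5-flash",
                       "account_id": accounts[0]["id"] if accounts else None},
    }
    models = v2.setdefault("models", [])
    existing = {m["id"] for m in models}
    for slots in v2.get("roles", {}).values():
        for slot, ref in slots.items():
            entry = builtins.get(ref)
            if entry is None:
                continue  # already points at a registry entry
            if entry["id"] not in existing:
                models.append(dict(entry))
                existing.add(entry["id"])
            slots[slot] = entry["id"]

    v2.setdefault("hosts", [])
    v2["version"] = 2
    return v2


v1 = {
    "version": 1,
    "hosts": [{"id": "h1", "api_url": "http://localhost:3000", "api_key": ""}],
    "models": [{"id": "m3", "type": "local_openai", "host_id": "h1", "model_name": "gemma4:e4b"}],
    "roles": {"chat": {"primary": "claude_cli", "backup_1": "m3"},
              "orchestrator": {"primary": "gemini_api"}},
}
v2 = migrate_v1_to_v2(v1, gemini_api_key="example-key")
```

Working on a deep copy keeps the original V1 data intact until the caller decides to save, which supports the "no data loss" goal if the migration fails partway.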

Phases

Phase 1 — Data model + backend routing (no UI changes yet)

Extend schema to V2 in model_registry.py
Migration from V1 on first load
Update _resolve_model() to handle gemini_api + account lookup
Update llm_client.complete() to accept slot parameter
Update routers/chat.py to pass slot instead of backend string
Keep backend toggle UI working (map old strings to slots temporarily)
Deliverable: routing works with multi-account Gemini, no UI changes needed yet
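The temporary mapping from old backend strings to slots is not specified here. One possible approach, sketched under that caveat, is to pick the first chat slot whose model belongs to the matching provider. The function name `legacy_backend_to_slot` and the string-to-provider table are assumptions.

```python
from typing import Optional

# Old toggle string -> provider name used in V2 model entries (assumed mapping).
LEGACY_PROVIDER = {"claude": "anthropic", "gemini": "google", "local": "local"}
CHAT_SLOTS = ("primary", "backup_1", "backup_2")


def legacy_backend_to_slot(registry: dict, backend: Optional[str]) -> Optional[str]:
    """Translate an old backend string into a chat slot; None means auto."""
    provider = LEGACY_PROVIDER.get(backend or "auto")
    if provider is None:
        return None  # "auto", missing, or unrecognised string
    models = {m["id"]: m for m in registry.get("models", [])}
    chat = registry.get("roles", {}).get("chat", {})
    for slot in CHAT_SLOTS:
        model = models.get(chat.get(slot))
        if model and model.get("provider") == provider:
            return slot
    return None  # no chat slot uses that provider; fall back to auto


registry = {
    "models": [
        {"id": "m1", "provider": "anthropic"},
        {"id": "m2", "provider": "google"},
        {"id": "m3", "provider": "local"},
    ],
    "roles": {"chat": {"primary": "m1", "backup_1": "m2", "backup_2": "m3"}},
}
slot = legacy_backend_to_slot(registry, "gemini")
```

Falling back to auto when no slot matches keeps the old UI usable even if a user has not assigned a model from every provider to the chat role.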

Phase 2 — Cloud provider UI

Add Anthropic and Google sections to /settings/local (rename to /settings/models)
Google accounts: add/remove API keys with labels
Editable model catalog for Anthropic + Google (add/remove model IDs from the list)
Model entry creation: provider picker → model dropdown (from catalog) → account/credential picker
Deliverable: can register cloud models in the UI just like local models

Phase 3 — Unified role assignments + toggle redesign

Promote role assignments to standalone /settings/roles page (or /settings/models bottom)
All models from all providers appear in role selects
Chat UI toggle: replace service-type cycle with slot cycle, show model label
Deliverable: end-to-end unified experience

Phase 4 — Polish + future providers

Claude direct API key support (optional, CLI is fine for now)
OpenRouter as a named provider (already works as a "local" host with host_type=openai — could be promoted)
Model catalog sync: fetch available models from Anthropic/Google API if keys are present
Per-role "test" button in role assignments UI

Open Questions

Claude direct API key: Is this needed now, or is CLI OAuth sufficient for all users?
- Decision: CLI-only for Phase 1; add API key support in Phase 4 if needed
Catalog management: Should the Anthropic/Google catalogs be server-wide defaults that users can extend, or fully per-user?
- Recommendation: ship sensible defaults in code (updated with each deploy); users can add custom entries if needed
Toggle UX: Cycle through slot labels ("Primary / Backup 1 / Backup 2") or cycle through model labels ("Sonnet 4.6 / Gemini 2.5 Flash / Gemma 4")?
- Model labels are more useful — clearer what you're switching to
Orchestrator mode toggle: Does agent mode also respect the slot toggle, or is it always "use orchestrator role"?
- Keep orchestrator role separate; the UI toggle only affects chat role
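For the catalog recommendation above (defaults shipped in code, extended by user entries), one way to combine the two lists is sketched below. The name `merged_catalog` is hypothetical and the rule that a user entry with a matching ID overrides the default's fields is an assumption; the actual behavior of `get_catalog()` is not described in this document.

```python
def merged_catalog(defaults, user_entries):
    """Combine shipped defaults with user entries, keyed by model ID."""
    merged = {entry["id"]: dict(entry) for entry in defaults}
    for entry in user_entries:
        # A matching ID updates the default's fields; a new ID is appended.
        merged[entry["id"]] = {**merged.get(entry["id"], {}), **entry}
    return list(merged.values())


defaults = [
    {"id": "gemini-2.5-pro", "label": "Gemini 2.5 Pro", "context_k": 1000},
    {"id": "gemini-2.5-flash", "label": "Gemini 2.5 Flash", "context_k": 1000},
]
user_entries = [
    {"id": "gemini-2.5-flash", "label": "Flash (custom label)"},
    {"id": "my-custom-model", "label": "Custom", "context_k": 128},
]
catalog = merged_catalog(defaults, user_entries)
```

Because the defaults are merged at read time rather than copied into each user's file, a deploy that updates the shipped list reaches every user without another migration.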
