feat: SSH dev routing, model registry UX, chat input toolbar, doc sync

Backend / infrastructure:
- cortex/tools/_projects.py (new): shared project alias registry with ssh_host
  for workstation projects (aether_api, aether_frontend, aether_container)
- cortex/tools/git.py: all git tools route to workstation via SSH when ssh_host is set
- cortex/tools/aider.py: aider_run SSH-routes to workstation using bash -l -c
- cortex/routers/local_llm.py: POST /api/models/{id}/edit AJAX endpoint — save
  model edits without page reload or tab reset; returns JSON {ok, label, model_name}
- cortex/llm_client.py: remove Gemini CLI and Claude CLI backends; clean up
  fallback chain and process group tracking (continuation of Gemini CLI removal)
- cortex/routers/auth.py: strip Claude/Gemini CLI auth status checks (CLI removed)
- cortex/routers/chat.py: remove legacy claude/gemini backend fields
- cortex/config.py: clean up CLI-related settings
- cortex/main.py: remove CLI lifecycle hooks

UI:
- cortex/static/local_llm.html: model edit forms now save via fetch() + toast;
  stay on Models tab; update row header label in place on success
- cortex/static/index.html: restructure input area to column layout — textarea
  above, compact toolbar below (Chat/Tools/Attach + Send); fixes dead space at
  M/L/XL sizes; context panel "Role" → "Model" section label
- cortex/static/style.css: column input-area layout; #input-toolbar; flex:1 →
  width:100% on textarea (fixes scrollHeight in column flex context); compact
  send/stop button padding
- cortex/static/app.js: add XL (720px) to height cycle; default M (240px)
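
The in-place save flow above depends on the edit endpoint answering with a small JSON body instead of a redirect. A minimal sketch of that contract, kept free of any web framework: `validate_model_edit` is a hypothetical helper, not a function in the repository, and whether the real endpoint trims `model_name` before echoing it back is an assumption.

```python
def validate_model_edit(mtype: str, label: str, model_name: str,
                        host_id: str = "") -> tuple[int, dict]:
    """Return (http_status, json_body) for a submitted model edit form."""
    if not model_name.strip():
        return 400, {"error": "Model name is required."}
    if mtype == "local_openai" and not host_id.strip():
        # Local OpenAI-compatible models must be bound to a host.
        return 400, {"error": "Select a host for this model."}
    # Success payload in the {ok, label, model_name} shape the page expects,
    # so the row header can be updated without a reload.
    return 200, {"ok": True, "label": label, "model_name": model_name.strip()}
```

A client would show the `error` string in a toast on a 400 response and update the row label from `label` on a 200 response.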

Docs:
- cortex/static/HELP.md: S/M/L → S/M/L/XL; add Rebuild to distill table; fix
  "Role selector" references (no such UI); fix "your active role" → Chat role;
  fix ⚡ toggle description; Model Registry section cleanup
- documentation/ARCH__BACKENDS.md: reflect CLI removal, current backend state

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
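
With the CLI backends gone, role routing no longer has a built-in default: an unconfigured role is an error rather than a silent fallback to Claude. The sketch below illustrates that behaviour under stated assumptions. `resolve_backend` and the plain `registry` dict are hypothetical stand-ins for the model registry lookup, and only the two type mappings that survive in this commit are included.

```python
# Registry model type -> dispatch backend (the two entries kept by this commit).
TYPE_TO_BACKEND = {
    "local_openai": "local",
    "anthropic_api": "anthropic_api",
}


def resolve_backend(role: str, registry: dict[str, dict]) -> str:
    """Pick the dispatch backend for a role, or fail loudly if none is set.

    `registry` is a hypothetical {role: model_config} mapping used only for
    this illustration.
    """
    cfg = registry.get(role)
    if not cfg:
        raise RuntimeError(
            f"No model configured for role '{role}'. "
            "Add one at /settings/models."
        )
    # Unknown types map to "local", mirroring the new default in the diff.
    return TYPE_TO_BACKEND.get(cfg["type"], "local")
```

Raising instead of falling back surfaces a missing configuration immediately, which matches the "no built-in fallback" comment added to the settings.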
@@ -3,7 +3,7 @@ from pydantic_settings import BaseSettings, SettingsConfigDict


 class Settings(BaseSettings):
-    anthropic_api_key: str | None = None # not used — claude CLI handles auth
+    anthropic_api_key: str | None = None # not used — configure via model registry

     # Google OAuth — "Sign in with Google" for all users
     # Create credentials at console.cloud.google.com → APIs & Services → Credentials
@@ -38,7 +38,6 @@ class Settings(BaseSettings):
     default_model: str = "claude-sonnet-4-6"
     default_tier: int = 2
     max_history_messages: int = 40 # rolling window — 20 turns (user + assistant)
-    primary_backend: str = "claude" # "claude" | "local" — gemini CLI removed June 2026

     # Local model backend — OpenAI-compatible API (Open WebUI / Ollama)
     # Set LOCAL_API_URL in .env to enable; leave blank to disable
@@ -46,9 +45,6 @@ class Settings(BaseSettings):
     local_api_key: str = "" # sk-... from Open WebUI → Settings → Account → API Keys
     local_model: str = "" # workspace or model name, e.g. test-agent-simple

-    # Per-backend timeouts in seconds
-    timeout_claude: int = 60
-    timeout_gemini: int = 120 # frequently slow under load
     timeout_local: int = 300 # local models may need to load first

     # Auto-distillation schedule — override in .env
@@ -66,14 +62,13 @@ class Settings(BaseSettings):
     distill_backend_long: str = ""

     # Model registry: default backend type per role when user registry has no entry.
-    # Values: "claude_cli" | "gemini_cli" | "gemini_api" (builtin IDs)
-    # Override in .env: ROLE_CHAT=claude_cli ROLE_DISTILL=gemini_api etc.
-    role_chat: str = "claude_cli"
-    role_orchestrator: str = "gemini_api"
-    role_distill: str = "claude_cli"
-    role_janitor: str = "claude_cli" # assign a cheap/fast model: Haiku 4.5, local Gemma E4B
-    role_coder: str = "claude_cli"
-    role_research: str = "gemini_api"
+    # All roles must be configured via /settings/models — no built-in fallback.
+    role_chat: str = ""
+    role_orchestrator: str = ""
+    role_distill: str = ""
+    role_janitor: str = ""
+    role_coder: str = ""
+    role_research: str = ""

     # Comma-separated list of standard roles shown in the model settings UI.
     # Add custom roles here to extend the UI without code changes.
@@ -122,8 +117,8 @@ class Settings(BaseSettings):
         return [r.strip() for r in self.defined_roles.split(",") if r.strip()]

     def get_role_default(self, role: str) -> str:
-        """Return the .env default backend type for a role (e.g. 'claude_cli')."""
-        return getattr(self, f"role_{role.replace('-', '_')}", "claude_cli")
+        """Return the .env default backend type for a role, or '' if unconfigured."""
+        return getattr(self, f"role_{role.replace('-', '_')}", "")

     def home_root(self) -> Path:
         """Resolve home_dir relative to this file's location if not absolute."""
@@ -1,50 +1,18 @@
 import asyncio
 import logging
-import os
-import signal
-import subprocess
 from config import settings
-import event_bus

 logger = logging.getLogger(__name__)

-# Track active Gemini process group IDs so we can kill them on shutdown
-_active_pgroups: set[int] = set()
-
-
-def _register_pgroup(pid: int) -> None:
-    _active_pgroups.add(pid)
-
-
-def _unregister_pgroup(pid: int) -> None:
-    _active_pgroups.discard(pid)
-
-
-async def cleanup() -> None:
-    """Kill any lingering Gemini process groups. Call from lifespan shutdown."""
-    for pid in list(_active_pgroups):
-        try:
-            os.killpg(pid, signal.SIGKILL)
-            logger.info("Shutdown: killed Gemini process group %d", pid)
-        except ProcessLookupError:
-            pass
-    _active_pgroups.clear()
-
-
-# Map from registry model type → dispatch function key
 _TYPE_TO_BACKEND = {
-    "claude_cli": "claude",
-    "gemini_cli": "gemini", # Gemini CLI is being replaced by Antigravity CLI (June 2026)
-    "gemini_api": "gemini", # routes to CLI subprocess — no users configured; kept for compat
     "local_openai": "local",
     "anthropic_api": "anthropic_api",
 }

-# Explicit UI toggle values (kept for backward compat)
-_EXPLICIT_BACKENDS = ("claude", "gemini", "local")
-# Gemini CLI removed from the claude fallback — it's shutting down June 18 2026.
-# claude failures now surface directly; gemini backend still falls back to claude.
-_FALLBACK: dict[str, str | None] = {"claude": None, "gemini": "claude", "local": "claude", "anthropic_api": "claude"}
+_FALLBACK: dict[str, str | None] = {
+    "local": None,
+    "anthropic_api": None,
+}


 async def complete(
@@ -55,16 +23,15 @@ async def complete(
     slot: str | None = None,
     max_tokens: int = 2048,
     attachment: dict | None = None,
-    token_sink=None, # async (str) -> None; if set, stream tokens as they arrive
+    token_sink=None,
 ) -> tuple[str, str]:
     """
     Returns (response_text, actual_backend_used).

-    slot: Phase 3 — specific role slot ("primary" | "backup_1" | "backup_2").
-          Resolves that exact slot, no fallback chain. Takes priority over model.
-    model: legacy backend override ("claude" | "gemini" | "local") from old toggle.
-           None = resolve via model registry for the given role.
-    role: registry role used for slot/auto routing (default: "chat").
+    slot: explicit role slot ("primary" | "backup_1" | "backup_2").
+          Resolves that exact slot, no fallback chain. Takes priority over role.
+    role: registry role used for auto routing (default: "chat").
+    model: ignored — kept for API compatibility; routing is via slot/role only.
     """
     import model_registry as _reg
     from persona import _user
@@ -73,46 +40,33 @@ async def complete(
     resolved_cfg: dict | None = None

     if slot is not None:
-        # Phase 3: explicit slot selection — no fallback within the role
         resolved_cfg = _reg.get_model_for_slot(username, role, slot)
         if resolved_cfg:
-            primary = _TYPE_TO_BACKEND.get(resolved_cfg["type"], "claude")
+            primary = _TYPE_TO_BACKEND.get(resolved_cfg["type"], "local")
         else:
-            # Slot not configured — fall through to auto routing
             slot = None

     if slot is None:
-        if model in _EXPLICIT_BACKENDS:
-            # Legacy: explicit backend override from old UI toggle
-            if model == "local":
-                resolved_cfg = _reg.get_best_local_model(username, role)
-                if not resolved_cfg:
-                    raise RuntimeError("No local model configured — add one at /settings/models")
-            primary = model
-        else:
-            # Auto: role-based routing via model registry
-            resolved = _reg.get_model_for_role(username, role)
-            if resolved:
-                resolved_cfg = resolved
-                primary = _TYPE_TO_BACKEND.get(resolved["type"], "claude")
-            else:
-                primary = settings.primary_backend
+        resolved = _reg.get_model_for_role(username, role)
+        if resolved:
+            resolved_cfg = resolved
+            primary = _TYPE_TO_BACKEND.get(resolved["type"], "local")
+        else:
+            raise RuntimeError(
+                f"No model configured for role '{role}'. "
+                "Add one at /settings/models."
+            )

-    fallback = _FALLBACK.get(primary, "claude")
+    fallback = _FALLBACK.get(primary)

     try:
         response = await _dispatch(primary, system_prompt, messages, resolved_cfg,
                                    attachment=attachment, token_sink=token_sink)
         return response, primary
     except Exception as e:
-        err_str = str(e)
-        if primary == "claude" and any(k in err_str for k in ("401", "authenticate", "expired", "OAuth")):
-            await event_bus.publish({"type": "claude_auth_expired"})
-        # Surface errors when a model is explicitly configured or a specific slot was pinned.
         if resolved_cfg is not None:
             logger.error("%s failed (no fallback — model explicitly configured): %s", primary, e)
             raise
-        # No fallback defined for this backend — surface the error directly.
         if not fallback:
             logger.error("%s failed (no fallback configured): %s", primary, e)
             raise
@@ -129,9 +83,7 @@ async def _dispatch(
     attachment: dict | None = None,
     token_sink=None,
 ) -> str:
-    if backend == "gemini":
-        text = await _gemini(system_prompt, messages)
-    elif backend == "local":
+    if backend == "local":
         if token_sink:
             return await _local_streaming(token_sink, system_prompt, messages, model_cfg)
         text = await _local(system_prompt, messages, model_cfg, attachment=attachment)
@@ -140,55 +92,12 @@ async def _dispatch(
             return await _anthropic_api_streaming(token_sink, system_prompt, messages, model_cfg)
         text = await _anthropic_api(system_prompt, messages, model_cfg)
     else:
-        text = await _claude(system_prompt, messages, model_cfg)
-    # For non-streaming backends when token_sink is provided, emit the full text as one chunk.
+        raise RuntimeError(f"Unknown backend '{backend}' — check model type in registry")
     if token_sink and text:
         await token_sink(text)
     return text


-def _fresh_claude_token() -> str | None:
-    """Read the current OAuth access token from the Claude credentials file.
-
-    The token in the systemd .env goes stale (it rotates on each login).
-    Reading directly from ~/.claude/.credentials.json always gets the latest.
-    """
-    import json as _json
-    creds_path = os.path.expanduser("~/.claude/.credentials.json")
-    try:
-        with open(creds_path) as f:
-            data = _json.load(f)
-        return data["claudeAiOauth"]["accessToken"]
-    except Exception as e:
-        logger.debug("Could not read Claude credentials file: %s", e)
-        return None
-
-
-async def _claude(system_prompt: str, messages: list[dict], model_cfg: dict | None) -> str:
-    model_name = (model_cfg or {}).get("model_name") if model_cfg else None
-    cmd = [
-        "claude", "--print",
-        "--no-session-persistence",
-        "--output-format", "text",
-    ]
-    # Only pass --model if it's a real model name (not a backend type string)
-    if model_name and model_name not in ("claude", "gemini", "local", ""):
-        cmd.extend(["--model", model_name])
-    if system_prompt:
-        cmd.extend(["--system-prompt", system_prompt])
-    cmd.append(_build_conversation(messages))
-
-    # Always use the freshest token from the credentials file so the systemd
-    # service doesn't break when the env-var token rotates after a login.
-    env = os.environ.copy()
-    token = _fresh_claude_token()
-    if token:
-        env["CLAUDE_CODE_OAUTH_TOKEN"] = token
-        env.pop("ANTHROPIC_API_KEY", None) # never let a stale API key override OAuth
-
-    return await _run(cmd, timeout=settings.timeout_claude, env=env)
-
-
 async def _local(
     system_prompt: str,
     messages: list[dict],
@@ -413,106 +322,3 @@ async def _local_streaming(
     return full_text.strip()


-async def _gemini(system_prompt: str, messages: list[dict]) -> str:
-    # Gemini CLI spawns MCP child processes that keep stdout pipes open after responding.
-    # start_new_session=True puts the whole tree in its own process group so
-    # os.killpg kills everything at once on timeout.
-    cmd = [
-        "gemini",
-        "--output-format", "text",
-        "--extensions", "", # disable all extensions — prevents MCP child processes
-        "-p", _build_prompt(system_prompt, messages),
-    ]
-
-    try:
-        proc = await asyncio.create_subprocess_exec(
-            *cmd,
-            stdout=asyncio.subprocess.PIPE,
-            stderr=asyncio.subprocess.PIPE,
-            start_new_session=True,
-        )
-    except FileNotFoundError:
-        raise RuntimeError("gemini not found in PATH")
-
-    _register_pgroup(proc.pid)
-    timeout = settings.timeout_gemini
-    try:
-        stdout_bytes, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
-        raw = stdout_bytes.decode()
-    except asyncio.TimeoutError:
-        try:
-            os.killpg(proc.pid, signal.SIGKILL)
-        except ProcessLookupError:
-            pass
-        raise RuntimeError(f"Gemini timed out after {timeout}s")
-    except asyncio.CancelledError:
-        try:
-            os.killpg(proc.pid, signal.SIGKILL)
-        except ProcessLookupError:
-            pass
-        raise
-    finally:
-        _unregister_pgroup(proc.pid)
-
-    clean = _clean_gemini_output(raw)
-    if not clean:
-        raise RuntimeError("Gemini returned an empty response")
-    return clean
-
-
-# Lines Gemini CLI writes to stdout that are not part of the actual response
-_GEMINI_NOISE = (
-    "Loaded cached credentials",
-    "Loading extension:",
-    "Server '",
-    "Listening for",
-    "Model is overloaded",
-    "High demand",
-    "Retrying",
-    "retrying",
-    "429",
-    "quota",
-)
-
-
-def _clean_gemini_output(text: str) -> str:
-    lines = [
-        line for line in text.splitlines()
-        if not any(line.strip().startswith(p) for p in _GEMINI_NOISE)
-    ]
-    return "\n".join(lines).strip()
-
-
-async def _run(cmd: list[str], timeout: int = 60, env: dict | None = None) -> str:
-    loop = asyncio.get_running_loop()
-    result = await loop.run_in_executor(
-        None,
-        lambda: subprocess.run(cmd, capture_output=True, text=True, timeout=timeout, env=env),
-    )
-    if result.returncode != 0:
-        detail = result.stderr.strip() or result.stdout.strip() or f"exit code {result.returncode}"
-        raise RuntimeError(f"{cmd[0]} failed: {detail}")
-    return result.stdout.strip()
-
-
-def _build_conversation(messages: list[dict]) -> str:
-    """Conversation only — used for Claude (system prompt passed separately)."""
-    parts = []
-    prior = messages[:-1]
-    if prior:
-        history_lines = []
-        for msg in prior:
-            label = settings.user_name if msg["role"] == "user" else settings.agent_name
-            history_lines.append(f"{label}: {msg['content']}")
-        parts.append("<conversation>\n" + "\n\n".join(history_lines) + "\n</conversation>")
-    parts.append(messages[-1]["content"] if messages else "")
-    return "\n\n".join(parts)
-
-
-def _build_prompt(system_prompt: str, messages: list[dict]) -> str:
-    """Full prompt with system context embedded — used for Gemini."""
-    parts = []
-    if system_prompt:
-        parts.append(f"<system>\n{system_prompt}\n</system>")
-    parts.append(_build_conversation(messages))
-    return "\n\n".join(parts)
@@ -18,8 +18,6 @@ async def lifespan(app: FastAPI):
     scheduler.start()
     yield
     scheduler.stop()
-    from llm_client import cleanup
-    await cleanup()


 app = FastAPI(title="Cortex Dispatcher", lifespan=lifespan)
@@ -1,76 +1,12 @@
 """
-CLI auth status for both Claude and Gemini backends.
-
-GET /auth/status — returns per-backend auth info and warning flags
-
-Claude: warns when OAuth token is < WARN_HOURS from expiry (requires
-        user to re-run `claude` to refresh via browser flow).
-Gemini: warns only when oauth_creds.json is missing or has no
-        refresh_token (access token rotates automatically every ~1h).
+GET /auth/status — returns connectivity status for configured model backends.
 """
-import json
 import logging
-from datetime import datetime, timezone
-from pathlib import Path
 from fastapi import APIRouter
-from config import settings

 logger = logging.getLogger(__name__)
 router = APIRouter(prefix="/auth")
-
-CLAUDE_CREDS = Path.home() / ".claude" / ".credentials.json"
-GEMINI_CREDS = Path.home() / ".gemini" / "oauth_creds.json"
-GEMINI_ACCTS = Path.home() / ".gemini" / "google_accounts.json"
-WARN_HOURS = 24 # no refresh token — warn a day ahead
-WARN_HOURS_REFRESH = 1 # refresh token present — only warn if CLI hasn't rotated in time
-
-
-def _claude_status() -> dict:
-    try:
-        data = json.loads(CLAUDE_CREDS.read_text())
-        oauth = data["claudeAiOauth"]
-        has_refresh = bool(oauth.get("refreshToken"))
-        expires_dt = datetime.fromtimestamp(oauth["expiresAt"] / 1000, tz=timezone.utc)
-        now = datetime.now(tz=timezone.utc)
-        hours_remaining = (expires_dt - now).total_seconds() / 3600
-        # When a refresh token is present the CLI *should* auto-rotate the access
-        # token, but sometimes it doesn't. Use a tight 1-hour window so a fresh
-        # 8-hour token doesn't immediately trigger a warning, but a stale token
-        # that the CLI missed will still surface before it expires.
-        expired = hours_remaining <= 0
-        threshold = WARN_HOURS_REFRESH if has_refresh else WARN_HOURS
-        warning = expired or hours_remaining < threshold
-        return {
-            "ok": True,
-            "has_refresh_token": has_refresh,
-            "access_token_expires_at": expires_dt.isoformat(),
-            "access_token_hours_remaining": round(hours_remaining, 1),
-            "warning": warning,
-            "expired": expired,
-        }
-    except Exception as e:
-        logger.warning("claude auth check failed: %s", e)
-        return {"ok": False, "error": str(e), "warning": True, "expired": False}
-
-
-def _gemini_status() -> dict:
-    try:
-        creds = json.loads(GEMINI_CREDS.read_text())
-        if not creds.get("refresh_token"):
-            return {"ok": True, "authenticated": False, "warning": True, "account": None}
-        account = None
-        try:
-            accts = json.loads(GEMINI_ACCTS.read_text())
-            account = accts.get("active")
-        except Exception:
-            pass
-        return {"ok": True, "authenticated": True, "warning": False, "account": account}
-    except FileNotFoundError:
-        return {"ok": True, "authenticated": False, "warning": True, "account": None}
-    except Exception as e:
-        logger.warning("gemini auth check failed: %s", e)
-        return {"ok": False, "error": str(e), "warning": True, "authenticated": False}


 async def _local_status(username: str = "scott") -> dict:
     """Check reachability of the user's configured local model host."""
@@ -104,7 +40,5 @@ async def _local_status(username: str = "scott") -> dict:
 @router.get("/status")
 async def auth_status() -> dict:
     return {
-        "claude": _claude_status(),
-        "gemini": _gemini_status(),
         "local": await _local_status(),
     }
@@ -21,11 +21,7 @@ router = APIRouter()


 def _backend_label(backend: str, username: str, role: str = "chat") -> str:
-    """Human-readable label for the model that handled a request (legacy path)."""
-    if backend == "claude":
-        return "Claude"
-    if backend == "gemini":
-        return "Gemini"
+    """Human-readable label for the model that handled a request."""
     if backend == "local":
         cfg = model_registry.get_best_local_model(username, role)
         if cfg:
@@ -52,7 +48,7 @@ class ChatRequest(BaseModel):
     message: str
     session_id: str | None = None
     tier: int | None = None
-    model: str | None = None # legacy backend override ("claude"|"gemini"|"local")
+    model: str | None = None # ignored — kept for API compatibility
     slot: str | None = None # Phase 3: explicit slot ("primary"|"backup_1"|"backup_2")
     chat_role: str = "chat" # active role: "chat"|"coder"|"research"|"distill" etc.
     include_long: bool = True
@@ -64,10 +60,6 @@ class ChatRequest(BaseModel):
     attachment: Attachment | None = None # image attachment (text files injected client-side)


-class BackendRequest(BaseModel):
-    primary: str # "claude", "gemini", or "local"
-
-
 class NoteRequest(BaseModel):
     session_id: str
     note: str
@@ -183,9 +175,6 @@ async def _stream_chat(req: ChatRequest):
         yield f"data: {json.dumps({'type': 'error', 'message': str(e)})}\n\n"

     finally:
-        # Ensure the LLM task is cancelled if the generator is torn down
-        # (e.g. client disconnect or server shutdown). This propagates
-        # CancelledError into _gemini() which kills the process group.
         if not task.done():
             task.cancel()
             try:
@@ -203,10 +192,6 @@ async def chat(req: ChatRequest) -> StreamingResponse:
     )


-_BACKEND_CYCLE = ("claude", "gemini", "local")
-_BACKEND_FALLBACK = {"claude": "gemini", "gemini": "claude", "local": "claude"}
-
-
 def _request_user(request: Request) -> str | None:
     """Extract username from JWT cookie, or None."""
     try:
@@ -216,20 +201,6 @@ def _request_user(request: Request) -> str | None:
         return None


-def _local_model_info(request: Request) -> dict | None:
-    """Return the best local model {label, model_name} for the session user, or None."""
-    username = _request_user(request)
-    if not username:
-        return None
-    try:
-        cfg = model_registry.get_best_local_model(username, "chat")
-        if cfg:
-            return {"label": cfg.get("label", ""), "model_name": cfg.get("model_name", "")}
-    except Exception:
-        pass
-    return None
-
-
 def _chat_slot_models(username: str) -> list[dict]:
     """Return [{slot, label, type}] for each configured slot in the chat role, primary first."""
     registry = model_registry.get_registry(username)
@@ -279,7 +250,6 @@ async def get_backend(request: Request) -> dict:
     username = _request_user(request)
     chat_models = _chat_slot_models(username) if username else []
     available_roles = _available_roles_for_toggle(username) if username else []
-    p = settings.primary_backend

     orch_label = None
     if username:
@@ -288,25 +258,9 @@ async def get_backend(request: Request) -> dict:
|
|||||||
orch_label = orch_cfg.get("label") or orch_cfg.get("model_name") or None
|
orch_label = orch_cfg.get("label") or orch_cfg.get("model_name") or None
|
||||||
|
|
||||||
return {
|
return {
|
||||||
"chat_models": chat_models, # Phase 3: [{slot, label, type}] for chat-role slots
|
"chat_models": chat_models,
|
||||||
"available_roles": available_roles, # kept for banner + backward compat
|
"available_roles": available_roles,
|
||||||
"orchestrator_model": orch_label,
|
"orchestrator_model": orch_label,
|
||||||
# Legacy fields kept for backward compat
|
|
||||||
"primary": p,
|
|
||||||
"fallback": _BACKEND_FALLBACK.get(p, "claude"),
|
|
||||||
"local_model": _local_model_info(request),
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
@router.post("/backend")
|
|
||||||
async def set_backend(req: BackendRequest, request: Request) -> dict:
|
|
||||||
if req.primary not in _BACKEND_CYCLE:
|
|
||||||
raise HTTPException(status_code=400, detail="primary must be 'claude', 'gemini', or 'local'")
|
|
||||||
settings.primary_backend = req.primary
|
|
||||||
return {
|
|
||||||
"primary": req.primary,
|
|
||||||
"fallback": _BACKEND_FALLBACK[req.primary],
|
|
||||||
"local_model": _local_model_info(request),
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
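After this change, `GET /backend` returns only `chat_models`, `available_roles`, and `orchestrator_model`; the legacy `primary`, `fallback`, and `local_model` keys are gone. A minimal sketch of how a client might read the new payload, assuming each `chat_models` entry carries `slot`, `label`, and `type` (as the removed inline comment described). The helper name `summarize_backend` and the sample values are hypothetical:

```python
from typing import Any, Dict


def summarize_backend(payload: Dict[str, Any]) -> str:
    """Build a one-line summary from a GET /backend response body."""
    # Entries are assumed to look like {"slot": ..., "label": ..., "type": ...}.
    labels = [
        m.get("label") or m.get("slot", "?")
        for m in payload.get("chat_models", [])
    ]
    chat = ", ".join(labels) if labels else "none configured"
    # orchestrator_model is None when no label or model name is available.
    orch = payload.get("orchestrator_model") or "not set"
    return f"chat: {chat}; orchestrator: {orch}"


# Hypothetical sample payload, not captured from a real server.
sample = {
    "chat_models": [
        {"slot": "primary", "label": "Model A", "type": "local_openai"},
        {"slot": "backup_1", "label": "", "type": "gemini_api"},
    ],
    "available_roles": ["chat"],
    "orchestrator_model": None,
}
print(summarize_backend(sample))
```

Clients that still read `primary` or `fallback` would need a similar guard, since those keys no longer appear in the response.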
||||||
|
|||||||
@@ -744,6 +744,53 @@ async def remove_custom_role_route(
|
|||||||
return RedirectResponse("/settings/models#roles", status_code=303)
|
return RedirectResponse("/settings/models#roles", status_code=303)
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("/api/models/{model_id}/edit")
|
||||||
|
async def edit_model_ajax(
|
||||||
|
request: Request,
|
||||||
|
model_id: str,
|
||||||
|
mtype: str = Form(""),
|
||||||
|
label: str = Form(""),
|
||||||
|
model_name: str = Form(""),
|
||||||
|
context_k: int = Form(0),
|
||||||
|
max_rounds: int = Form(0),
|
||||||
|
tools: int = Form(1),
|
||||||
|
tags: str = Form(""),
|
||||||
|
reasoning_budget_tokens: int = Form(0),
|
||||||
|
host_id: str = Form(""),
|
||||||
|
account_id: str = Form(""),
|
||||||
|
credential_id: str = Form("cli"),
|
||||||
|
) -> JSONResponse:
|
||||||
|
"""AJAX: edit a model entry. Returns JSON {ok, label, model_name} on success."""
|
||||||
|
username = _get_user(request)
|
||||||
|
if not username:
|
||||||
|
return JSONResponse({"error": "Not authenticated"}, status_code=401)
|
||||||
|
if not model_name.strip():
|
||||||
|
return JSONResponse({"error": "Model name is required."}, status_code=400)
|
||||||
|
tag_list = [t.strip() for t in tags.split(",") if t.strip()]
|
||||||
|
max_rounds_ = max_rounds or None
|
||||||
|
tools_bool = tools != 0
|
||||||
|
reasoning_budget_ = reasoning_budget_tokens or None
|
||||||
|
if mtype == "local_openai":
|
||||||
|
if not host_id.strip():
|
||||||
|
return JSONResponse({"error": "Select a host for this model."}, status_code=400)
|
||||||
|
reg.save_model(username, model_id, host_id, label, model_name, context_k, tag_list,
|
||||||
|
max_rounds=max_rounds_, tools=tools_bool,
|
||||||
|
reasoning_budget_tokens=reasoning_budget_)
|
||||||
|
elif mtype == "gemini_api":
|
||||||
|
reg.save_cloud_model(username, model_id, "google", model_name, label,
|
||||||
|
account_id=account_id or None, context_k=context_k, tags=tag_list,
|
||||||
|
max_rounds=max_rounds_, tools=tools_bool)
|
||||||
|
elif mtype in ("claude_cli", "anthropic_api"):
|
||||||
|
reg.save_cloud_model(username, model_id, "anthropic", model_name, label,
|
||||||
|
credential_id=credential_id or "cli", context_k=context_k, tags=tag_list,
|
||||||
|
max_rounds=max_rounds_, tools=tools_bool)
|
||||||
|
else:
|
||||||
|
return JSONResponse({"error": f"Unknown model type: {mtype}"}, status_code=400)
|
||||||
|
display = label.strip() or model_name.strip()
|
||||||
|
logger.info("model edited (ajax): %s / %s (%s)", username, display, mtype)
|
||||||
|
return JSONResponse({"ok": True, "label": display, "model_name": model_name.strip()})
|
||||||
|
|
||||||
|
|
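The endpoint treats `0` as "unset" for `max_rounds` and `reasoning_budget_tokens`, splits `tags` on commas, and reads `tools` as a flag. The same normalization as a standalone sketch; the function name `normalize_model_form` is hypothetical and does not exist in the commit:

```python
from typing import Any, Dict


def normalize_model_form(
    tags: str = "",
    max_rounds: int = 0,
    tools: int = 1,
    reasoning_budget_tokens: int = 0,
) -> Dict[str, Any]:
    """Mirror the field handling in edit_model_ajax (sketch only)."""
    return {
        # Comma-separated string -> list, dropping blanks and surrounding spaces.
        "tags": [t.strip() for t in tags.split(",") if t.strip()],
        # 0 (the form default) means "not set" and is stored as None.
        "max_rounds": max_rounds or None,
        # Any non-zero value enables tools.
        "tools": tools != 0,
        "reasoning_budget_tokens": reasoning_budget_tokens or None,
    }


print(normalize_model_form(tags=" code, ,fast ", max_rounds=0, tools=0))
```

One consequence of this design is that a form cannot store an explicit `0` for `max_rounds` or `reasoning_budget_tokens`; both collapse to `None`.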
||||||
@router.post("/api/models/role")
|
@router.post("/api/models/role")
|
||||||
async def set_role(request: Request) -> JSONResponse:
|
async def set_role(request: Request) -> JSONResponse:
|
||||||
"""AJAX: assign a model to a role priority slot.
|
"""AJAX: assign a model to a role priority slot.
|
||||||
|
|||||||
@@ -6,7 +6,7 @@
|
|||||||
and are appended automatically by help.html when present.
|
and are appended automatically by help.html when present.
|
||||||
-->
|
-->
|
||||||
|
|
||||||
*Last updated: 2026-05-13*
|
*Last updated: 2026-06-18* <!-- input toolbar refactor; XL size added; help doc sync -->
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -44,7 +44,7 @@ The **Context & Memory** panel (sliders icon with tier number) contains all conf
|
|||||||
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
|
| **Memory Layers** | Toggle Long / Mid / Short memory on/off |
|
||||||
| **Distill Memory** | Manually trigger Short / Mid / Long / All distillation |
|
| **Distill Memory** | Manually trigger Short / Mid / Long / All distillation |
|
||||||
| **Model** | Active chat model — click to cycle through your configured slot models (Primary → Backup 1 → …) |
|
| **Model** | Active chat model — click to cycle through your configured slot models (Primary → Backup 1 → …) |
|
||||||
| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L** cycles input area height · **⌃↵** toggles send shortcut |
|
| **Display** | **Aa** cycles font size · **☾** toggles theme · **S/M/L/XL** cycles input area height · **⌃↵** toggles send shortcut |
|
||||||
|
|
||||||
All settings persist in `localStorage` across page refreshes.
|
All settings persist in `localStorage` across page refreshes.
|
||||||
|
|
||||||
@@ -74,7 +74,7 @@ The orchestrator runs a multi-step tool loop:
|
|||||||
3. The model produces the final user-facing reply — when the orchestrator role uses Gemini, Claude writes the final response; when it uses a local model, that same model writes it
|
3. The model produces the final user-facing reply — when the orchestrator role uses Gemini, Claude writes the final response; when it uses a local model, that same model writes it
|
||||||
4. Expandable tool-call cards appear above the response — click any card to see the arguments sent and the result returned
|
4. Expandable tool-call cards appear above the response — click any card to see the arguments sent and the result returned
|
||||||
|
|
||||||
The ⚡ toggle is **independent of the Role selector** — you can use any role (chat, coder, research, etc.) with or without tools. The orchestrator model is configured in **Account → Model Registry → Role Assignments → Orchestrator**.
|
The ⚡ toggle routes requests through the **Orchestrator** role model regardless of which chat model is active. Configure it in **Account → Model Registry → Role Assignments → Orchestrator**.
|
||||||
|
|
||||||
Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.
|
Tools mode is best for tasks requiring research, multi-step reasoning, or side effects (e.g. "search for X", "add a task", "what's on my list?", "append this to my journal"). Regular chat is faster for conversational turns.
|
||||||
|
|
||||||
@@ -156,7 +156,7 @@ Once installed, opening Cortex from the home screen or app launcher skips the br
|
|||||||
|
|
||||||
## Switching Models
|
## Switching Models
|
||||||
|
|
||||||
The **Model** button in the Context & Memory panel cycles through the slot models configured for your active role (Primary → Backup 1). Click it to switch between models mid-session.
|
The **Model** button in the Context & Memory panel cycles through the slot models configured for your **Chat** role (Primary → Backup 1). Click it to switch between models mid-session.
|
||||||
|
|
||||||
- The button label shows the active model (e.g. "GPT-4o", "Gemini 2.5 Flash")
|
- The button label shows the active model (e.g. "GPT-4o", "Gemini 2.5 Flash")
|
||||||
- The selected slot is sent with each chat request so the correct model is used
|
- The selected slot is sent with each chat request so the correct model is used
|
||||||
@@ -205,12 +205,11 @@ The table shows all-time totals per model key, with columns for:
|
|||||||
|
|
||||||
Values ≥ 1,000 are displayed as `k` (e.g. `24.3k`).
|
Values ≥ 1,000 are displayed as `k` (e.g. `24.3k`).
|
||||||
|
|
||||||
**What is and isn't tracked:**
|
**What is tracked:**
|
||||||
|
|
||||||
- ✅ Gemini API calls (orchestrator, distillation)
|
- ✅ Anthropic API calls (direct SDK)
|
||||||
- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
|
- ✅ Local OpenAI-compatible calls (Open WebUI, Ollama, OpenRouter)
|
||||||
- ✗ Claude CLI — no structured token data is returned by the subprocess
|
- ✅ Gemini API calls (orchestrator, distillation)
|
||||||
- ✗ Gemini CLI — same reason
|
|
||||||
|
|
||||||
The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.
|
The raw data lives in `home/{username}/usage.json` and is also accessible via the Files panel or the API.
|
||||||
|
|
||||||
@@ -230,9 +229,10 @@ Configure which AI models are available and which handles each task type.
|
|||||||
|
|
||||||
Do this before adding models — models need a provider account or local host to attach to.
|
Do this before adding models — models need a provider account or local host to attach to.
|
||||||
|
|
||||||
**Anthropic (Claude):** Two options:
|
**Anthropic (Claude):** Uses a direct API key — no Claude CLI required:
|
||||||
- **CLI (OAuth):** Nothing to configure — uses your existing `claude auth login` session. If Claude isn't working, run `claude auth login` in a terminal.
|
- Scroll to **Cloud Providers → Anthropic** → click **+ Add API key**
|
||||||
- **Direct API key:** Scroll to **Cloud Providers → Anthropic** → click **+ Add API key**. Enter a label and your `sk-ant-…` key from [console.anthropic.com/keys](https://console.anthropic.com/keys). When you add a model using an API key credential, it routes through the Anthropic SDK instead of the CLI.
|
- Enter a label and your `sk-ant-…` key from [console.anthropic.com/keys](https://console.anthropic.com/keys)
|
||||||
|
- Models added with this credential call the Anthropic API directly via the SDK
|
||||||
|
|
||||||
**Google (Gemini):** Add one entry per API key you want to use:
|
**Google (Gemini):** Add one entry per API key you want to use:
|
||||||
1. Scroll to **Cloud Providers → Google** → click **+ Add Google account**
|
1. Scroll to **Cloud Providers → Google** → click **+ Add Google account**
|
||||||
@@ -261,7 +261,7 @@ Scroll to **Add Model**. Select the provider tab, fill in the details, click **A
|
|||||||
|---|---|
|
|---|---|
|
||||||
| **Local** | Select a host (from Step 1) → enter model name, or use **Fetch from host** to pick from a live list |
|
| **Local** | Select a host (from Step 1) → enter model name, or use **Fetch from host** to pick from a live list |
|
||||||
| **Google** | Select a Gemini model from the catalog → select a Google account (from Step 1) |
|
| **Google** | Select a Gemini model from the catalog → select a Google account (from Step 1) |
|
||||||
| **Anthropic** | Select a credential (CLI OAuth or an API key added in Step 1) → select a Claude model from the catalog |
|
| **Anthropic** | Select an API key credential (from Step 1) → select a Claude model from the catalog |
|
||||||
|
|
||||||
The label and context window size auto-fill from the catalog — edit them if you want. Tags are optional.
|
The label and context window size auto-fill from the catalog — edit them if you want. Tags are optional.
|
||||||
|
|
||||||
@@ -286,7 +286,7 @@ Scroll to **Role Assignments** at the bottom of the page. Each role has **Primar
|
|||||||
| **Coder** | Code-focused tasks — larger context window, code-aware model |
|
| **Coder** | Code-focused tasks — larger context window, code-aware model |
|
||||||
| **Research** | Long-context research — high-token model, web tools prioritized |
|
| **Research** | Long-context research — high-token model, web tools prioritized |
|
||||||
|
|
||||||
Switch roles via the **Role** selector in the Context & Memory panel (⚙). Leave all slots empty to use the server default.
|
Leave all slots empty to use the server default.
|
||||||
|
|
||||||
**Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).
|
**Per-role tool sets:** Expand any role card to configure which tool categories the orchestrator can use when that role is active. Unchecked categories are hidden from the model entirely — reducing token overhead on every orchestrated call. Leaving all categories unchecked means all tools the user has access to are available (the default).
|
||||||
|
|
||||||
@@ -390,6 +390,7 @@ Distillation builds up the memory layers from raw session logs. Runs automatical
|
|||||||
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
|
| **mid** | LLM summarizes `MEMORY_SHORT.md` → `MEMORY_MID.md` |
|
||||||
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
|
| **long** | LLM integrates `MEMORY_MID.md` → `MEMORY_LONG.md` |
|
||||||
| **all** | Runs short → mid → long in sequence |
|
| **all** | Runs short → mid → long in sequence |
|
||||||
|
| **Rebuild** | ⚠ Wipes Mid + Long memories and rebuilds from session logs. Use to recover from distillation drift. Hand-edited content will be replaced. |
|
||||||
|
|
||||||
**Recommended workflow:** run **short** after any productive session; **mid** weekly; **long** monthly.
|
**Recommended workflow:** run **short** after any productive session; **mid** weekly; **long** monthly.
|
||||||
|
|
||||||
@@ -462,8 +463,7 @@ For direct access or scripting:
|
|||||||
| Method | Endpoint | Description |
|
| Method | Endpoint | Description |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| `POST` | `/chat` | Send a message — returns SSE stream |
|
| `POST` | `/chat` | Send a message — returns SSE stream |
|
||||||
| `GET` | `/backend` | Get current primary/fallback backends |
|
| `GET` | `/backend` | Get configured model slots and orchestrator |
|
||||||
| `POST` | `/backend` | Set primary backend (`{"primary": "claude"}`) |
|
|
||||||
| `GET` | `/sessions` | List all sessions |
|
| `GET` | `/sessions` | List all sessions |
|
||||||
| `GET` | `/history/{id}` | Get session message history |
|
| `GET` | `/history/{id}` | Get session message history |
|
||||||
| `PUT` | `/history/{id}` | Replace full session history |
|
| `PUT` | `/history/{id}` | Replace full session history |
|
||||||
|
|||||||
@@ -140,15 +140,16 @@
|
|||||||
});
|
});
|
||||||
|
|
||||||
// ── Textarea height ──────────────────────────────────────────
|
// ── Textarea height ──────────────────────────────────────────
|
||||||
const HEIGHT_SIZES = [120, 240, 480];
|
const HEIGHT_SIZES = [120, 240, 480, 720];
|
||||||
const HEIGHT_LABELS = ['S', 'M', 'L'];
|
const HEIGHT_LABELS = ['S', 'M', 'L', 'XL'];
|
||||||
const HEIGHT_TITLES = [
|
const HEIGHT_TITLES = [
|
||||||
'Input size: Compact — click to cycle',
|
'Input size: Compact — click to cycle',
|
||||||
'Input size: Medium — click to cycle',
|
'Input size: Medium — click to cycle',
|
||||||
'Input size: Large — click to cycle',
|
'Input size: Large — click to cycle',
|
||||||
|
'Input size: Extra Large — click to cycle',
|
||||||
];
|
];
|
||||||
|
|
||||||
let maxHeight = parseInt(localStorage.getItem('maxHeight') || '120');
|
let maxHeight = parseInt(localStorage.getItem('maxHeight') || '240');
|
||||||
const heightCycleBtn = document.getElementById('height-cycle-btn');
|
const heightCycleBtn = document.getElementById('height-cycle-btn');
|
||||||
|
|
||||||
function syncHeight() {
|
function syncHeight() {
|
||||||
|
|||||||
@@ -115,9 +115,9 @@
|
|||||||
<div id="ctx-schedule"></div>
|
<div id="ctx-schedule"></div>
|
||||||
</div>
|
</div>
|
||||||
<div class="ctx-section">
|
<div class="ctx-section">
|
||||||
<div class="ctx-section-title">Role</div>
|
<div class="ctx-section-title">Model</div>
|
||||||
<div class="ctx-row">
|
<div class="ctx-row">
|
||||||
<button id="backend-toggle" class="ctx-btn" title="Active role — click to cycle">chat</button>
|
<button id="backend-toggle" class="ctx-btn" title="Active model — click to cycle chat role slots">chat</button>
|
||||||
</div>
|
</div>
|
||||||
<div id="backend-model-hint"></div>
|
<div id="backend-model-hint"></div>
|
||||||
</div>
|
</div>
|
||||||
@@ -167,24 +167,6 @@
|
|||||||
<div id="messages"></div>
|
<div id="messages"></div>
|
||||||
|
|
||||||
<div id="input-area">
|
<div id="input-area">
|
||||||
<!-- Mode select — compact dropdown, opens upward, MRU sorted -->
|
|
||||||
<div id="mode-select">
|
|
||||||
<button id="mode-select-btn" title="Input mode">
|
|
||||||
<span id="mode-icon">💬</span>
|
|
||||||
<span id="mode-label">Chat</span>
|
|
||||||
<span class="mode-arrow">▲</span>
|
|
||||||
</button>
|
|
||||||
<!-- Populated dynamically in MRU order -->
|
|
||||||
<div id="mode-dropdown"></div>
|
|
||||||
<!-- Note visibility sub-toggle — only shown when note mode is active -->
|
|
||||||
<button id="note-vis-btn" title="Toggle note visibility (private / public)">prv</button>
|
|
||||||
<!-- Tools toggle — routes through the orchestrator tool loop when active -->
|
|
||||||
<button id="tools-toggle" title="Tools disabled — click to enable">⚡</button>
|
|
||||||
<!-- Attach file — images (vision) or text/code files -->
|
|
||||||
<button id="attach-btn" title="Attach image or text file">📎</button>
|
|
||||||
<input type="file" id="file-input" style="display:none"
|
|
||||||
accept="image/png,image/jpeg,image/webp,image/gif,text/plain,text/markdown,.md,.txt,.py,.js,.ts,.jsx,.tsx,.json,.yaml,.yml,.toml,.html,.css,.sh,.csv,.xml,.rs,.go,.java,.c,.cpp,.h,.rb,.php,.swift,.kt,.sql">
|
|
||||||
</div>
|
|
||||||
<!-- Attachment preview — shown when a file is pending -->
|
<!-- Attachment preview — shown when a file is pending -->
|
||||||
<div id="attachment-row" style="display:none">
|
<div id="attachment-row" style="display:none">
|
||||||
<div id="attachment-preview">
|
<div id="attachment-preview">
|
||||||
@@ -195,7 +177,26 @@
|
|||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<textarea id="input" rows="1" placeholder="Message…" autofocus></textarea>
|
<textarea id="input" rows="1" placeholder="Message…" autofocus></textarea>
|
||||||
<div id="send-col">
|
<!-- Compact toolbar: mode, tools, attach | spacer | send/stop -->
|
||||||
|
<div id="input-toolbar">
|
||||||
|
<div id="mode-select">
|
||||||
|
<button id="mode-select-btn" title="Input mode">
|
||||||
|
<span id="mode-icon">💬</span>
|
||||||
|
<span id="mode-label">Chat</span>
|
||||||
|
<span class="mode-arrow">▲</span>
|
||||||
|
</button>
|
||||||
|
<!-- Populated dynamically in MRU order -->
|
||||||
|
<div id="mode-dropdown"></div>
|
||||||
|
</div>
|
||||||
|
<!-- Note visibility sub-toggle — only shown when note mode is active -->
|
||||||
|
<button id="note-vis-btn" title="Toggle note visibility (private / public)">prv</button>
|
||||||
|
<!-- Tools toggle — routes through the orchestrator tool loop when active -->
|
||||||
|
<button id="tools-toggle" title="Tools disabled — click to enable">⚡</button>
|
||||||
|
<!-- Attach file — images (vision) or text/code files -->
|
||||||
|
<button id="attach-btn" title="Attach image or text file">📎</button>
|
||||||
|
<input type="file" id="file-input" style="display:none"
|
||||||
|
accept="image/png,image/jpeg,image/webp,image/gif,text/plain,text/markdown,.md,.txt,.py,.js,.ts,.jsx,.tsx,.json,.yaml,.yml,.toml,.html,.css,.sh,.csv,.xml,.rs,.go,.java,.c,.cpp,.h,.rb,.php,.swift,.kt,.sql">
|
||||||
|
<div style="flex:1"></div>
|
||||||
<button id="send">Send</button>
|
<button id="send">Send</button>
|
||||||
<button id="stop"><svg data-lucide="square" width="14" height="14" class="btn-icon"></svg> Stop</button>
|
<button id="stop"><svg data-lucide="square" width="14" height="14" class="btn-icon"></svg> Stop</button>
|
||||||
</div>
|
</div>
|
||||||
|
|||||||
@@ -982,6 +982,42 @@
|
|||||||
});
|
});
|
||||||
});
|
});
|
||||||
|
|
||||||
|
// ── Model edit: AJAX save (stay on Models tab) ────────────────────────────
|
||||||
|
document.querySelectorAll('.model-edit-form').forEach(form => {
|
||||||
|
form.addEventListener('submit', async e => {
|
||||||
|
e.preventDefault();
|
||||||
|
const id = form.id.replace('edit-form-', '');
|
||||||
|
const saveBtn = form.querySelector('button[type="submit"]');
|
||||||
|
saveBtn.disabled = true;
|
||||||
|
try {
|
||||||
|
const res = await fetch(`/api/models/${id}/edit`, {method: 'POST', body: new FormData(form)});
|
||||||
|
const data = await res.json();
|
||||||
|
if (data.ok) {
|
||||||
|
// Update the row header label in place
|
||||||
|
const row = document.getElementById('model-' + id);
|
||||||
|
if (row && data.label) {
|
||||||
|
const labelEl = row.querySelector('.model-label');
|
||||||
|
if (labelEl) labelEl.textContent = data.label;
|
||||||
|
}
|
||||||
|
if (row && data.model_name) {
|
||||||
|
const nameEl = row.querySelector('.model-name');
|
||||||
|
if (nameEl) nameEl.textContent = data.model_name;
|
||||||
|
}
|
||||||
|
// Close the edit panel
|
||||||
|
form.style.display = 'none';
|
||||||
|
document.querySelector(`.model-edit-btn[data-id="${id}"]`).textContent = 'Edit';
|
||||||
|
showToast('Model saved');
|
||||||
|
} else {
|
||||||
|
showToast(data.error || 'Save failed', true);
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
showToast(err.message, true);
|
||||||
|
} finally {
|
||||||
|
saveBtn.disabled = false;
|
||||||
|
}
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
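For scripting, the same endpoint can be called outside the browser. A hedged sketch that only builds the request and does not send it: the base URL, model id, and field values are placeholders, and it assumes an authenticated session is attached separately, since the endpoint returns 401 without a user. It sends URL-encoded form data rather than the multipart body that `FormData` produces; FastAPI `Form` parameters accept both encodings.

```python
from typing import Dict
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_edit_request(base_url: str, model_id: str, fields: Dict[str, str]) -> Request:
    """Build (but do not send) a POST to /api/models/{model_id}/edit."""
    body = urlencode(fields).encode("utf-8")
    url = f"{base_url.rstrip('/')}/api/models/{quote(model_id, safe='')}/edit"
    return Request(
        url=url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values for illustration only.
req = build_edit_request(
    "http://localhost:8000/",
    "m1",
    {"mtype": "local_openai", "host_id": "h1", "model_name": "example-model", "tags": "code,fast"},
)
print(req.get_method(), req.full_url)
```

On success the response body is JSON of the form `{ok, label, model_name}`; on failure it carries an `error` string, which is what the page's toast displays.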
||||||
// ── Edit form: fetch from host ────────────────────────────────────────────
|
// ── Edit form: fetch from host ────────────────────────────────────────────
|
||||||
document.querySelectorAll('.edit-fetch-btn').forEach(btn => {
|
document.querySelectorAll('.edit-fetch-btn').forEach(btn => {
|
||||||
btn.addEventListener('click', async () => {
|
btn.addEventListener('click', async () => {
|
||||||
|
|||||||
@@ -735,35 +735,28 @@
|
|||||||
.message.note-private .note-content { color: #c9a84c; white-space: pre-wrap; }
|
.message.note-private .note-content { color: #c9a84c; white-space: pre-wrap; }
|
||||||
.message.note-public .note-content { color: #4abfb0; white-space: pre-wrap; }
|
.message.note-public .note-content { color: #4abfb0; white-space: pre-wrap; }
|
||||||
|
|
||||||
/* ── Input area — 3-col: [mode-toggle] [textarea] [send-col] ── */
|
/* ── Input area — column: [attachment?] [textarea] [toolbar] ── */
|
||||||
#input-area {
|
#input-area {
|
||||||
padding: 12px 20px;
|
padding: 10px 20px 12px;
|
||||||
background: var(--surface);
|
background: var(--surface);
|
||||||
border-top: 1px solid var(--border);
|
border-top: 1px solid var(--border);
|
||||||
display: flex;
|
display: flex;
|
||||||
flex-direction: row;
|
flex-direction: column;
|
||||||
gap: 10px;
|
gap: 6px;
|
||||||
align-items: flex-end;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/* ── Mode select — compact dropdown ─────────────────────────── */
|
/* ── Compact toolbar below the textarea ─────────────────────── */
|
||||||
|
#input-toolbar {
|
||||||
|
display: flex;
|
||||||
|
flex-direction: row;
|
||||||
|
align-items: center;
|
||||||
|
gap: 6px;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* ── Mode select — positioned container for dropdown only ────── */
|
||||||
#mode-select {
|
#mode-select {
|
||||||
position: relative;
|
position: relative;
|
||||||
flex-shrink: 0;
|
flex-shrink: 0;
|
||||||
display: flex;
|
|
||||||
flex-direction: column;
|
|
||||||
align-items: stretch;
|
|
||||||
gap: 4px;
|
|
||||||
}
|
|
||||||
|
|
||||||
/* S: collapse to a single row — mode button + compact tools toggle */
|
|
||||||
#mode-select[data-size="s"] {
|
|
||||||
flex-direction: row;
|
|
||||||
align-items: center;
|
|
||||||
}
|
|
||||||
#mode-select[data-size="s"] #tools-toggle {
|
|
||||||
padding: 3px 7px;
|
|
||||||
font-size: 0.75rem;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
#mode-select-btn {
|
#mode-select-btn {
|
||||||
@@ -874,8 +867,7 @@
|
|||||||
#attach-btn:hover { color: rgba(255,255,255,0.6); border-color: rgba(255,255,255,0.25); }
|
#attach-btn:hover { color: rgba(255,255,255,0.6); border-color: rgba(255,255,255,0.25); }
|
||||||
|
|
||||||
#attachment-row {
|
#attachment-row {
|
||||||
padding: 0.3rem 0.5rem;
|
padding: 0.2rem 0;
|
||||||
border-bottom: 1px solid var(--border);
|
|
||||||
}
|
}
|
||||||
#attachment-preview {
|
#attachment-preview {
|
||||||
display: inline-flex;
|
display: inline-flex;
|
||||||
@@ -914,7 +906,8 @@
|
|||||||
#attachment-clear:hover { color: var(--text); }
|
#attachment-clear:hover { color: var(--text); }
|
||||||
|
|
||||||
#input {
|
#input {
|
||||||
flex: 1;
|
width: 100%;
|
||||||
|
box-sizing: border-box;
|
||||||
background: var(--bg);
|
background: var(--bg);
|
||||||
border: 1px solid var(--border);
|
border: 1px solid var(--border);
|
||||||
border-radius: 8px;
|
border-radius: 8px;
|
||||||
@@ -936,16 +929,7 @@
|
|||||||
#input.mode-note.public:focus { border-color: rgba(40,170,150,0.85); }
|
#input.mode-note.public:focus { border-color: rgba(40,170,150,0.85); }
|
||||||
#input.mode-otr { border-color: rgba(120,80,160,0.4); background: rgba(120,80,160,0.04); }
|
#input.mode-otr { border-color: rgba(120,80,160,0.4); background: rgba(120,80,160,0.04); }
|
||||||
|
|
||||||
/* Send column — right side, stacked */
|
/* Send button — sits in #input-toolbar row */
|
||||||
#send-col {
|
|
||||||
display: flex;
|
|
||||||
flex-direction: column;
|
|
||||||
align-items: stretch;
|
|
||||||
gap: 4px;
|
|
||||||
flex-shrink: 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Send button */
|
|
||||||
#send {
|
#send {
|
||||||
display: flex;
|
display: flex;
|
||||||
align-items: center;
|
align-items: center;
|
||||||
@@ -955,11 +939,12 @@
|
|||||||
border: 1px solid var(--user-border);
|
border: 1px solid var(--user-border);
|
||||||
color: var(--text);
|
color: var(--text);
|
||||||
border-radius: 8px;
|
border-radius: 8px;
|
||||||
padding: 10px 14px;
|
padding: 7px 16px;
|
||||||
cursor: pointer;
|
cursor: pointer;
|
||||||
font-size: 0.9rem;
|
font-size: 0.9rem;
|
||||||
text-align: center;
|
text-align: center;
|
||||||
white-space: nowrap;
|
white-space: nowrap;
|
||||||
|
flex-shrink: 0;
|
||||||
transition: background 0.15s;
|
transition: background 0.15s;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -977,10 +962,11 @@
|
|||||||
border: 1px solid var(--error-border);
|
border: 1px solid var(--error-border);
|
||||||
color: var(--error-text);
|
color: var(--error-text);
|
||||||
border-radius: 8px;
|
border-radius: 8px;
|
||||||
padding: 10px 14px;
|
padding: 7px 14px;
|
||||||
cursor: pointer;
|
cursor: pointer;
|
||||||
font-size: 0.9rem;
|
font-size: 0.9rem;
|
||||||
text-align: center;
|
text-align: center;
|
||||||
|
flex-shrink: 0;
|
||||||
transition: background 0.15s;
|
transition: background 0.15s;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
31
cortex/tools/_projects.py
Normal file
@@ -0,0 +1,31 @@
|
|||||||
|
"""Shared project alias registry for Cortex tools."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class ProjectDef:
|
||||||
|
path: str # path on the host where the project lives (~ is expanded at runtime)
|
||||||
|
ssh_host: str = "" # if set, git/aider commands run via SSH on this host
|
||||||
|
|
||||||
|
|
||||||
|
_CORTEX_ROOT_STR: str = str(Path(__file__).parent.parent.parent.resolve())
|
||||||
|
|
||||||
|
PROJECT_ALIASES: dict[str, ProjectDef] = {
|
||||||
|
"cortex": ProjectDef(path=_CORTEX_ROOT_STR),
|
||||||
|
"aether_api": ProjectDef(
|
||||||
|
path="~/OSIT_dev/aether_api_fastapi",
|
||||||
|
ssh_host="scott-wks-main-i7",
|
||||||
|
),
|
||||||
|
"aether_frontend": ProjectDef(
|
||||||
|
path="~/OSIT_dev/aether_app_sveltekit",
|
||||||
|
ssh_host="scott-wks-main-i7",
|
||||||
|
),
|
||||||
|
"aether_container": ProjectDef(
|
||||||
|
path="~/OSIT_dev/aether_container_env",
|
||||||
|
ssh_host="scott-wks-main-i7",
|
||||||
|
),
|
||||||
|
}
|
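Consumers such as `git.py` and `aider.py` look an alias up in `PROJECT_ALIASES` and otherwise treat the argument as a path. A minimal sketch of that lookup, assuming the fallback behaviour shown in `aider_run`. The registry contents, the host name, and the function name `resolve` are illustrative and are not the ones in the commit:

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Tuple


@dataclass
class ProjectDef:
    path: str           # project location; ~ is expanded at lookup time
    ssh_host: str = ""  # empty string means "run locally"


# Illustrative registry; the alias and host name are made up.
ALIASES: Dict[str, ProjectDef] = {
    "demo_remote": ProjectDef(path="/srv/demo", ssh_host="example-host"),
}


def resolve(project: str) -> Tuple[Path, str]:
    """Return (working directory, ssh_host); ssh_host is "" for local projects."""
    proj = ALIASES.get(project)
    if proj is not None:
        return Path(os.path.expanduser(proj.path)), proj.ssh_host
    # Unknown alias: treat the argument as a filesystem path on this machine.
    return Path(os.path.expanduser(project)), ""


print(resolve("demo_remote"))
print(resolve("/tmp/some/repo"))
```

Keeping the SSH host on the project definition, rather than in each tool, means a new remote project only needs one registry entry.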
||||||
@@ -16,25 +16,16 @@ background=True runs the subprocess asynchronously and returns an agent_id immed
|
|||||||
import asyncio
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import os
|
import os
|
||||||
|
import shlex
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from google.genai import types
|
from google.genai import types
|
||||||
|
|
||||||
import agent_manager
|
import agent_manager
|
||||||
|
from ._projects import PROJECT_ALIASES
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_CORTEX_DIR = Path(__file__).parent # .../Cortex_and_Inara_dev/cortex/
|
|
||||||
_PROJECT_ROOT = _CORTEX_DIR.parent # .../Cortex_and_Inara_dev/
|
|
||||||
|
|
||||||
# Known project aliases — expand before passing to subprocess
|
|
||||||
_PROJECT_ALIASES: dict[str, str] = {
|
|
||||||
"cortex": str(_PROJECT_ROOT),
|
|
||||||
"aether_api": "~/OSIT_dev/aether_api_fastapi",
|
|
||||||
"aether_frontend": "~/OSIT_dev/aether_app_sveltekit",
|
|
||||||
"aether_container": "~/OSIT_dev/aether_container_env",
|
|
||||||
}
|
|
||||||
|
|
||||||
_MAX_OUTPUT_CHARS = 12_000
|
_MAX_OUTPUT_CHARS = 12_000
|
||||||
|
|
||||||
# Maps URL fragments → Aider --api-key provider slug.
|
# Maps URL fragments → Aider --api-key provider slug.
|
||||||
@@ -192,11 +183,16 @@ async def aider_run(
|
|||||||
immediately. Use agent_status(agent_id) to check progress; set notify=True to
|
immediately. Use agent_status(agent_id) to check progress; set notify=True to
|
||||||
receive a push/Talk notification on completion.
|
receive a push/Talk notification on completion.
|
||||||
"""
|
"""
|
||||||
resolved = _PROJECT_ALIASES.get(project, project)
|
proj_def = PROJECT_ALIASES.get(project)
|
||||||
cwd = Path(os.path.expanduser(resolved))
|
if proj_def is not None:
|
||||||
|
cwd = Path(os.path.expanduser(proj_def.path))
|
||||||
|
ssh_host = proj_def.ssh_host
|
||||||
|
else:
|
||||||
|
cwd = Path(os.path.expanduser(project))
|
||||||
|
ssh_host = ""
|
||||||
|
|
||||||
if not cwd.is_dir():
|
if not ssh_host and not cwd.is_dir():
|
||||||
return f"Error: project directory '{resolved}' does not exist."
|
return f"Error: project directory '{cwd}' does not exist."
|
||||||
|
|
||||||
timeout = min(max(int(timeout), 10), 600)
|
timeout = min(max(int(timeout), 10), 600)
|
||||||
|
|
||||||
@@ -232,11 +228,22 @@ async def aider_run(
|
|||||||
cmd += ["--file", f]
|
cmd += ["--file", f]
|
||||||
|
|
||||||
logger.info(
|
logger.info(
|
||||||
"aider_run: project=%s model=%s host_label=%s auto_commit=%s background=%s task=%.120s",
|
"aider_run: project=%s ssh_host=%s model=%s host_label=%s auto_commit=%s background=%s task=%.120s",
|
||||||
project, model, host_label, auto_commit, background, task,
|
project, ssh_host or "local", model, host_label, auto_commit, background, task,
|
||||||
)
|
)
|
||||||
|
|
||||||
async def _run() -> str:
|
async def _run() -> str:
|
||||||
|
if ssh_host:
|
||||||
|
# Run aider natively on the remote host via a login shell so PATH
|
||||||
|
# includes ~/.local/bin where aider is typically installed.
|
||||||
|
inner_cmd = "cd " + shlex.quote(str(cwd)) + " && " + shlex.join(cmd)
|
||||||
|
ssh_cmd = f"bash -l -c {shlex.quote(inner_cmd)}"
|
||||||
|
proc = await asyncio.create_subprocess_exec(
|
||||||
|
"ssh", ssh_host, ssh_cmd,
|
||||||
|
stdout=asyncio.subprocess.PIPE,
|
||||||
|
stderr=asyncio.subprocess.PIPE,
|
||||||
|
)
|
||||||
|
else:
|
||||||
proc = await asyncio.create_subprocess_exec(
|
proc = await asyncio.create_subprocess_exec(
|
||||||
*cmd,
|
*cmd,
|
||||||
cwd=str(cwd),
|
cwd=str(cwd),
|
||||||
@@ -323,6 +330,8 @@ DECLARATIONS = [
|
|||||||
"Credentials are resolved automatically from the Cortex model registry — "
|
"Credentials are resolved automatically from the Cortex model registry — "
|
||||||
"OpenRouter, local Open WebUI/Ollama, Anthropic API, and other configured hosts "
|
"OpenRouter, local Open WebUI/Ollama, Anthropic API, and other configured hosts "
|
||||||
"are all supported. Use host_label to pick a specific host. "
|
"are all supported. Use host_label to pick a specific host. "
|
||||||
|
"aether_api, aether_frontend, and aether_container run aider natively on the "
|
||||||
|
"workstation (scott-wks-main-i7) via SSH — aider must be installed there. "
|
||||||
"Set background=True for long tasks — returns an agent_id immediately and sends "
|
"Set background=True for long tasks — returns an agent_id immediately and sends "
|
||||||
"a notification when done. ADMIN ONLY. Requires confirmation."
|
"a notification when done. ADMIN ONLY. Requires confirmation."
|
||||||
),
|
),
|
||||||
|
|||||||
@@ -13,26 +13,23 @@ Write operations (admin-only, confirm-required):
|
|||||||
All tools accept an optional `project` parameter using the same aliases as aider_run:
|
All tools accept an optional `project` parameter using the same aliases as aider_run:
|
||||||
"cortex" (default), "aether_api", "aether_frontend", "aether_container"
|
"cortex" (default), "aether_api", "aether_frontend", "aether_container"
|
||||||
Or pass an absolute path directly.
|
Or pass an absolute path directly.
|
||||||
|
|
||||||
|
Projects with an ssh_host defined in _projects.py run all git commands on the remote
|
||||||
|
host via SSH, using shlex-quoted commands to handle paths and arguments safely.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import os
|
import os
|
||||||
|
import shlex
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from google.genai import types
|
from google.genai import types
|
||||||
|
|
||||||
|
from ._projects import PROJECT_ALIASES
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_CORTEX_ROOT: Path = Path(__file__).parent.parent.parent.resolve()
|
|
||||||
|
|
||||||
_PROJECT_ALIASES: dict[str, str] = {
|
|
||||||
"cortex": str(_CORTEX_ROOT),
|
|
||||||
"aether_api": "~/OSIT_dev/aether_api_fastapi",
|
|
||||||
"aether_frontend": "~/OSIT_dev/aether_app_sveltekit",
|
|
||||||
"aether_container": "~/OSIT_dev/aether_container_env",
|
|
||||||
}
|
|
||||||
|
|
||||||
_MAX_OUTPUT = 50_000
|
_MAX_OUTPUT = 50_000
|
||||||
|
|
||||||
_PROJECT_PARAM = types.Schema(
|
_PROJECT_PARAM = types.Schema(
|
||||||
@@ -45,16 +42,29 @@ _PROJECT_PARAM = types.Schema(
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
def _resolve_project(project: str) -> Path:
|
def _resolve_project(project: str) -> tuple[Path, str]:
|
||||||
"""Resolve a project alias or path string to an absolute Path."""
|
"""Return (path, ssh_host). path may not exist locally when ssh_host is set."""
|
||||||
if not project:
|
if not project:
|
||||||
return _CORTEX_ROOT
|
d = PROJECT_ALIASES["cortex"]
|
||||||
resolved = _PROJECT_ALIASES.get(project, project)
|
else:
|
||||||
return Path(os.path.expanduser(resolved))
|
d = PROJECT_ALIASES.get(project)
|
||||||
|
if d is None:
|
||||||
|
# Raw path — no SSH routing
|
||||||
|
return Path(os.path.expanduser(project)), ""
|
||||||
|
return Path(os.path.expanduser(d.path)), d.ssh_host
|
||||||
|
|
||||||
|
|
||||||
async def _git(*args: str, cwd: Path, timeout: int = 15) -> tuple[int, str]:
|
async def _git(*args: str, cwd: Path, ssh_host: str = "", timeout: int = 15) -> tuple[int, str]:
|
||||||
"""Run a git command in cwd. Returns (returncode, combined output)."""
|
"""Run a git command locally or via SSH. Returns (returncode, combined output)."""
|
||||||
|
if ssh_host:
|
||||||
|
# Build a single shell-safe command string for the remote shell
|
||||||
|
remote_cmd = shlex.join(["git", "-C", str(cwd)] + list(args))
|
||||||
|
proc = await asyncio.create_subprocess_exec(
|
||||||
|
"ssh", ssh_host, remote_cmd,
|
||||||
|
stdout=asyncio.subprocess.PIPE,
|
||||||
|
stderr=asyncio.subprocess.PIPE,
|
||||||
|
)
|
||||||
|
else:
|
||||||
proc = await asyncio.create_subprocess_exec(
|
proc = await asyncio.create_subprocess_exec(
|
||||||
"git", "-C", str(cwd), *args,
|
"git", "-C", str(cwd), *args,
|
||||||
stdout=asyncio.subprocess.PIPE,
|
stdout=asyncio.subprocess.PIPE,
|
||||||
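The resolution and dispatch logic in this hunk can be sketched in isolation as follows. This is an assumption-laden illustration: the `ProjectDef` dataclass, the sample registry entries, and the `git_argv` helper are hypothetical stand-ins, since the diff only shows that registry entries expose `.path` and `.ssh_host`.

```python
import os
import shlex
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProjectDef:
    """Hypothetical registry entry; the real one lives in _projects.py."""
    path: str
    ssh_host: str = ""


# Sample registry with assumed paths and host alias.
PROJECT_ALIASES = {
    "cortex": ProjectDef("/opt/cortex"),
    "aether_api": ProjectDef("/home/dev/aether_api", ssh_host="workstation"),
}


def resolve_project(project: str) -> tuple[Path, str]:
    """Return (path, ssh_host); an unknown name is treated as a raw local path."""
    entry = PROJECT_ALIASES.get(project or "cortex")
    if entry is None:
        return Path(os.path.expanduser(project)), ""
    return Path(os.path.expanduser(entry.path)), entry.ssh_host


def git_argv(*args: str, cwd: Path, ssh_host: str = "") -> list[str]:
    """Build the argv for a git call, wrapped in ssh when a host is set."""
    local = ["git", "-C", str(cwd), *args]
    if ssh_host:
        # The remote shell re-parses the command, so join it into one quoted string.
        return ["ssh", ssh_host, shlex.join(local)]
    return local


if __name__ == "__main__":
    cwd, host = resolve_project("aether_api")
    print(git_argv("commit", "-m", "fix: it's done", cwd=cwd, ssh_host=host))
```

One point worth checking when adapting this: `os.path.expanduser` expands `~` against the local user's home directory, so a `~`-relative alias resolves correctly on the remote host only if both machines use the same home path.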
@@ -80,10 +90,10 @@ def _cap(text: str) -> str:
|
|||||||
|
|
||||||
async def git_status(project: str = "") -> str:
|
async def git_status(project: str = "") -> str:
|
||||||
"""Return the working tree status for a project."""
|
"""Return the working tree status for a project."""
|
||||||
cwd = _resolve_project(project)
|
cwd, ssh_host = _resolve_project(project)
|
||||||
if not cwd.is_dir():
|
if not ssh_host and not cwd.is_dir():
|
||||||
return f"Error: project directory not found: {cwd}"
|
return f"Error: project directory not found: {cwd}"
|
||||||
rc, out = await _git("status", cwd=cwd)
|
rc, out = await _git("status", cwd=cwd, ssh_host=ssh_host)
|
||||||
if rc != 0:
|
if rc != 0:
|
||||||
return f"git status failed: {out}"
|
return f"git status failed: {out}"
|
||||||
return out or "Working tree clean — nothing to report."
|
return out or "Working tree clean — nothing to report."
|
||||||
@@ -91,8 +101,8 @@ async def git_status(project: str = "") -> str:
|
|||||||
|
|
||||||
async def git_log(n: int = 20, path: str = "", oneline: bool = True, project: str = "") -> str:
|
async def git_log(n: int = 20, path: str = "", oneline: bool = True, project: str = "") -> str:
|
||||||
"""Return recent commit history for a project."""
|
"""Return recent commit history for a project."""
|
||||||
cwd = _resolve_project(project)
|
cwd, ssh_host = _resolve_project(project)
|
||||||
if not cwd.is_dir():
|
if not ssh_host and not cwd.is_dir():
|
||||||
return f"Error: project directory not found: {cwd}"
|
return f"Error: project directory not found: {cwd}"
|
||||||
args = ["log"]
|
args = ["log"]
|
||||||
if oneline:
|
if oneline:
|
||||||
@@ -102,7 +112,7 @@ async def git_log(n: int = 20, path: str = "", oneline: bool = True, project: st
|
|||||||
args += [f"-{max(1, min(n, 200))}"]
|
args += [f"-{max(1, min(n, 200))}"]
|
||||||
if path:
|
if path:
|
||||||
args += ["--", path]
|
args += ["--", path]
|
||||||
rc, out = await _git(*args, cwd=cwd)
|
rc, out = await _git(*args, cwd=cwd, ssh_host=ssh_host)
|
||||||
if rc != 0:
|
if rc != 0:
|
||||||
return f"git log failed: {out}"
|
return f"git log failed: {out}"
|
||||||
return _cap(out) or "No commits found."
|
return _cap(out) or "No commits found."
|
||||||
@@ -110,8 +120,8 @@ async def git_log(n: int = 20, path: str = "", oneline: bool = True, project: st
|
|||||||
|
|
||||||
async def git_diff(ref_a: str = "", ref_b: str = "", path: str = "", stat_only: bool = False, project: str = "") -> str:
|
async def git_diff(ref_a: str = "", ref_b: str = "", path: str = "", stat_only: bool = False, project: str = "") -> str:
|
||||||
"""Show a diff for a project. Defaults to working tree vs HEAD."""
|
"""Show a diff for a project. Defaults to working tree vs HEAD."""
|
||||||
cwd = _resolve_project(project)
|
cwd, ssh_host = _resolve_project(project)
|
||||||
if not cwd.is_dir():
|
if not ssh_host and not cwd.is_dir():
|
||||||
return f"Error: project directory not found: {cwd}"
|
return f"Error: project directory not found: {cwd}"
|
||||||
args = ["diff"]
|
args = ["diff"]
|
||||||
if stat_only:
|
if stat_only:
|
||||||
@@ -122,7 +132,7 @@ async def git_diff(ref_a: str = "", ref_b: str = "", path: str = "", stat_only:
|
|||||||
args += [ref_a]
|
args += [ref_a]
|
||||||
if path:
|
if path:
|
||||||
args += ["--", path]
|
args += ["--", path]
|
||||||
rc, out = await _git(*args, cwd=cwd)
|
rc, out = await _git(*args, cwd=cwd, ssh_host=ssh_host)
|
||||||
# diff exits 1 when differences exist — normal
|
# diff exits 1 when differences exist — normal
|
||||||
if rc not in (0, 1):
|
if rc not in (0, 1):
|
||||||
return f"git diff failed: {out}"
|
return f"git diff failed: {out}"
|
||||||
@@ -133,29 +143,27 @@ async def git_diff(ref_a: str = "", ref_b: str = "", path: str = "", stat_only:
|
|||||||
|
|
||||||
async def git_commit(message: str, project: str = "", files: list[str] | None = None) -> str:
|
async def git_commit(message: str, project: str = "", files: list[str] | None = None) -> str:
|
||||||
"""Stage files and create a commit in a project."""
|
"""Stage files and create a commit in a project."""
|
||||||
cwd = _resolve_project(project)
|
cwd, ssh_host = _resolve_project(project)
|
||||||
if not cwd.is_dir():
|
if not ssh_host and not cwd.is_dir():
|
||||||
return f"Error: project directory not found: {cwd}"
|
return f"Error: project directory not found: {cwd}"
|
||||||
if not message.strip():
|
if not message.strip():
|
||||||
return "Error: commit message is required."
|
return "Error: commit message is required."
|
||||||
|
|
||||||
# Stage specified files or all changes
|
|
||||||
if files:
|
if files:
|
||||||
for f in files:
|
for f in files:
|
||||||
rc, out = await _git("add", "--", f, cwd=cwd)
|
rc, out = await _git("add", "--", f, cwd=cwd, ssh_host=ssh_host)
|
||||||
if rc != 0:
|
if rc != 0:
|
||||||
return f"git add '{f}' failed: {out}"
|
return f"git add '{f}' failed: {out}"
|
||||||
else:
|
else:
|
||||||
rc, out = await _git("add", "-A", cwd=cwd)
|
rc, out = await _git("add", "-A", cwd=cwd, ssh_host=ssh_host)
|
||||||
if rc != 0:
|
if rc != 0:
|
||||||
return f"git add -A failed: {out}"
|
return f"git add -A failed: {out}"
|
||||||
|
|
||||||
# Check that something is actually staged
|
rc, staged = await _git("diff", "--cached", "--stat", cwd=cwd, ssh_host=ssh_host)
|
||||||
rc, staged = await _git("diff", "--cached", "--stat", cwd=cwd)
|
|
||||||
if not staged.strip():
|
if not staged.strip():
|
||||||
return "Nothing staged to commit — working tree already clean."
|
return "Nothing staged to commit — working tree already clean."
|
||||||
|
|
||||||
rc, out = await _git("commit", "-m", message, cwd=cwd)
|
rc, out = await _git("commit", "-m", message, cwd=cwd, ssh_host=ssh_host)
|
||||||
if rc != 0:
|
if rc != 0:
|
||||||
return f"git commit failed: {out}"
|
return f"git commit failed: {out}"
|
||||||
return out or "Committed successfully."
|
return out or "Committed successfully."
|
||||||
@@ -163,15 +171,15 @@ async def git_commit(message: str, project: str = "", files: list[str] | None =
|
|||||||
|
|
||||||
async def git_push(project: str = "", remote: str = "origin", branch: str = "") -> str:
|
async def git_push(project: str = "", remote: str = "origin", branch: str = "") -> str:
|
||||||
"""Push the current branch to a remote."""
|
"""Push the current branch to a remote."""
|
||||||
cwd = _resolve_project(project)
|
cwd, ssh_host = _resolve_project(project)
|
||||||
if not cwd.is_dir():
|
if not ssh_host and not cwd.is_dir():
|
||||||
return f"Error: project directory not found: {cwd}"
|
return f"Error: project directory not found: {cwd}"
|
||||||
|
|
||||||
args = ["push", remote]
|
args = ["push", remote]
|
||||||
if branch:
|
if branch:
|
||||||
args.append(branch)
|
args.append(branch)
|
||||||
|
|
||||||
rc, out = await _git(*args, cwd=cwd, timeout=30)
|
rc, out = await _git(*args, cwd=cwd, ssh_host=ssh_host, timeout=30)
|
||||||
if rc != 0:
|
if rc != 0:
|
||||||
return f"git push failed: {out}"
|
return f"git push failed: {out}"
|
||||||
return out or f"Pushed to {remote} successfully."
|
return out or f"Pushed to {remote} successfully."
|
||||||
@@ -186,6 +194,7 @@ DECLARATIONS = [
|
|||||||
"Show the working tree status for a project: staged changes, unstaged "
|
"Show the working tree status for a project: staged changes, unstaged "
|
||||||
"modifications, and untracked files. Use before committing to see what "
|
"modifications, and untracked files. Use before committing to see what "
|
||||||
"will be included. Defaults to the Cortex project. "
|
"will be included. Defaults to the Cortex project. "
|
||||||
|
"aether_api, aether_frontend, and aether_container run on the workstation via SSH."
|
||||||
),
|
),
|
||||||
parameters=types.Schema(
|
parameters=types.Schema(
|
||||||
type=types.Type.OBJECT,
|
type=types.Type.OBJECT,
|
||||||
@@ -198,6 +207,7 @@ DECLARATIONS = [
|
|||||||
"Show recent commit history for a project. Returns commit hashes, dates, "
|
"Show recent commit history for a project. Returns commit hashes, dates, "
|
||||||
"and messages. Use after aider_run completes to see what was committed. "
|
"and messages. Use after aider_run completes to see what was committed. "
|
||||||
"Defaults to the Cortex project. "
|
"Defaults to the Cortex project. "
|
||||||
|
"aether_api, aether_frontend, and aether_container run on the workstation via SSH."
|
||||||
),
|
),
|
||||||
parameters=types.Schema(
|
parameters=types.Schema(
|
||||||
type=types.Type.OBJECT,
|
type=types.Type.OBJECT,
|
||||||
@@ -227,6 +237,7 @@ DECLARATIONS = [
|
|||||||
"With ref_a and ref_b: changes between the two refs. "
|
"With ref_a and ref_b: changes between the two refs. "
|
||||||
"Use after aider_run (auto_commit=False) to review changes before committing. "
|
"Use after aider_run (auto_commit=False) to review changes before committing. "
|
||||||
"Defaults to the Cortex project. "
|
"Defaults to the Cortex project. "
|
||||||
|
"aether_api, aether_frontend, and aether_container run on the workstation via SSH."
|
||||||
),
|
),
|
||||||
parameters=types.Schema(
|
parameters=types.Schema(
|
||||||
type=types.Type.OBJECT,
|
type=types.Type.OBJECT,
|
||||||
@@ -257,6 +268,7 @@ DECLARATIONS = [
|
|||||||
"Stage files and create a git commit in a project. "
|
"Stage files and create a git commit in a project. "
|
||||||
"Use after reviewing changes with git_diff — especially when aider_run ran "
|
"Use after reviewing changes with git_diff — especially when aider_run ran "
|
||||||
"with auto_commit=False. Stages all changes by default (files=None). "
|
"with auto_commit=False. Stages all changes by default (files=None). "
|
||||||
|
"aether_api, aether_frontend, and aether_container commit on the workstation via SSH. "
|
||||||
"ADMIN ONLY. Requires confirmation."
|
"ADMIN ONLY. Requires confirmation."
|
||||||
),
|
),
|
||||||
parameters=types.Schema(
|
parameters=types.Schema(
|
||||||
@@ -284,6 +296,7 @@ DECLARATIONS = [
|
|||||||
description=(
|
description=(
|
||||||
"Push the current branch to a remote. "
|
"Push the current branch to a remote. "
|
||||||
"Use after git_commit or after aider_run commits to share the changes. "
|
"Use after git_commit or after aider_run commits to share the changes. "
|
||||||
|
"aether_api, aether_frontend, and aether_container push on the workstation via SSH. "
|
||||||
"ADMIN ONLY. Requires confirmation."
|
"ADMIN ONLY. Requires confirmation."
|
||||||
),
|
),
|
||||||
parameters=types.Schema(
|
parameters=types.Schema(
|
||||||
|
|||||||
@@ -1,20 +1,21 @@
|
|||||||
# Architecture: LLM Backends
|
# Architecture: LLM Backends
|
||||||
|
|
||||||
> How Cortex selects and talks to AI models.
|
> How Cortex selects and talks to AI models.
|
||||||
> Last updated: 2026-05-06
|
> Last updated: 2026-06-18
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Providers
|
## Providers
|
||||||
|
|
||||||
Cortex supports four model types, each dispatched differently:
|
Cortex supports two model types, each dispatched differently:
|
||||||
|
|
||||||
| Type | Auth | Use |
|
| Type | Auth | Use |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| `claude_cli` | OAuth token from `~/.claude/.credentials.json` | Chat, persona responses |
|
| `local_openai` | API key per host in model registry | Open WebUI, Ollama, OpenRouter, LiteLLM, any OpenAI-compatible endpoint |
|
||||||
| `gemini_cli` | Gemini CLI credentials | Chat fallback / explicit selection |
|
| `anthropic_api` | API key in model registry (Anthropic cloud provider) | Claude models via Anthropic SDK |
|
||||||
| `gemini_api` | API key from registry account or `.env` | Orchestrator tool loop |
|
|
||||||
| `local_openai` | API key per host in model registry | Open WebUI, Ollama, OpenRouter, LiteLLM, etc. |
|
The Gemini API (`gemini_api`) is a third type used exclusively by the orchestrator engine —
|
||||||
|
it is not dispatched through `llm_client.py` and is not available for chat/distill roles.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -22,40 +23,36 @@ Cortex supports four model types, each dispatched differently:
|
|||||||
|
|
||||||
### Default: Role-Based Routing (Auto)
|
### Default: Role-Based Routing (Auto)
|
||||||
|
|
||||||
When no explicit backend is selected, Cortex routes to the model configured for the
|
All routing goes through the user's model registry. When a request arrives, `complete()` in
|
||||||
request's **role** in the user's model registry. Roles: `chat`, `orchestrator`, `distill`,
|
`llm_client.py` resolves the model for the given role:
|
||||||
`coder`, `research` (extensible via `DEFINED_ROLES` in `.env`).
|
|
||||||
|
|
||||||
Resolution order for a role:
|
|
||||||
1. User registry: `roles[role].primary → backup_1 → backup_2 → backup_3 → backup_4`
|
|
||||||
2. `.env` role default: `ROLE_CHAT=claude_cli`, `ROLE_DISTILL=claude_cli`, etc.
|
|
||||||
3. Hardcoded last-resort: `chat/distill/coder → claude_cli`, `orchestrator/research → gemini_api`
|
|
||||||
|
|
||||||
### Explicit Override
|
|
||||||
|
|
||||||
The **Role** toggle in the Context & Memory panel cycles through configured role slots for the `chat` role: **Primary → Backup 1 → Backup 2 → auto**.
|
|
||||||
|
|
||||||
- Each slot shows the configured model label
|
|
||||||
- `auto` uses the Primary without forcing a specific backend type
|
|
||||||
- The ⚡ Tools toggle is independent — it routes to the `orchestrator` role regardless of the chat role selection
|
|
||||||
|
|
||||||
**Fallback chain** (automatic, only when no explicit registry entry exists):
|
|
||||||
```
|
```
|
||||||
claude → gemini
|
slot specified → resolve that exact slot (primary / backup_1 / backup_2)
|
||||||
gemini → claude
|
no slot → get_model_for_role(username, role)
|
||||||
local → claude
|
no registry entry → RuntimeError: "No model configured for role '...'"
|
||||||
```
|
```
|
||||||
When a model is explicitly configured in the registry, errors surface immediately — no silent fallback.
|
|
||||||
|
|
||||||
Each response shows a model tag (bottom-right of the message bubble) with the model label and host.
|
Roles: `chat`, `orchestrator`, `distill`, `janitor`, `coder`, `research` (extensible via
|
||||||
|
`DEFINED_ROLES` in `.env`).
|
||||||
|
|
||||||
|
There is no implicit fallback to a built-in model. If no model is configured for a role,
|
||||||
|
the request fails with a clear error directing the user to `/settings/models`.
|
||||||
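The resolution rules described above can be sketched as a small lookup over the `roles` section of the registry. This is only an illustration: the function name `resolve_model_id` is hypothetical, and it assumes that a request with no slot resolves to the role's `primary` entry, which the document does not spell out.

```python
from __future__ import annotations


def resolve_model_id(roles: dict, role: str, slot: str | None = None) -> str:
    """Pick a model id for `role`, failing loudly when nothing is configured.

    Assumption: "no slot" means the primary slot. There is no built-in fallback.
    """
    slots = roles.get(role, {})
    model_id = slots.get(slot or "primary")
    if not model_id:
        raise RuntimeError(f"No model configured for role '{role}'")
    return model_id


if __name__ == "__main__":
    roles = {
        "chat": {"primary": "m1", "backup_1": "m2"},
        "orchestrator": {"primary": "m2"},
    }
    print(resolve_model_id(roles, "chat"))
    print(resolve_model_id(roles, "chat", slot="backup_1"))
```

Raising instead of substituting another backend matches the stated behavior that errors surface immediately.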
|
|
||||||
|
### Explicit Slot Selection
|
||||||
|
|
||||||
|
The **Role** toggle in the Context & Memory panel cycles through configured role slots:
|
||||||
|
**Primary → Backup 1 → auto**. Each slot resolves the configured model for that position.
|
||||||
|
|
||||||
|
When a model is explicitly configured (via slot or registry entry), errors surface
|
||||||
|
immediately — no silent fallback to another backend.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Model Registry — V2 Schema
|
## Model Registry Schema
|
||||||
|
|
||||||
Per-user configuration stored in `home/{user}/model_registry.json`.
|
Per-user configuration stored in `home/{user}/model_registry.json`.
|
||||||
|
|
||||||
Managed at **Settings → Models** (`/settings/models`). Full provider UI coming in Phase 2.
|
Managed at **Settings → Models** (`/settings/models`).
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
@@ -64,7 +61,7 @@ Managed at **Settings → Models** (`/settings/models`). Full provider UI coming
|
|||||||
"providers": {
|
"providers": {
|
||||||
"anthropic": {
|
"anthropic": {
|
||||||
"credentials": [
|
"credentials": [
|
||||||
{"id": "cli", "label": "Claude CLI (OAuth)", "type": "cli"}
|
{"id": "key1", "label": "My Anthropic Key", "type": "api_key", "api_key": "sk-ant-..."}
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"google": {
|
"google": {
|
||||||
@@ -77,6 +74,13 @@ Managed at **Settings → Models** (`/settings/models`). Full provider UI coming
|
|||||||
"hosts": [
|
"hosts": [
|
||||||
{
|
{
|
||||||
"id": "abc123",
|
"id": "abc123",
|
||||||
|
"label": "OpenRouter",
|
||||||
|
"api_url": "https://openrouter.ai/api/v1",
|
||||||
|
"api_key": "sk-or-...",
|
||||||
|
"host_type": "openai"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "def456",
|
||||||
"label": "Gaming Laptop",
|
"label": "Gaming Laptop",
|
||||||
"api_url": "http://192.168.x.x:3000",
|
"api_url": "http://192.168.x.x:3000",
|
||||||
"api_key": "",
|
"api_key": "",
|
||||||
@@ -87,23 +91,22 @@ Managed at **Settings → Models** (`/settings/models`). Full provider UI coming
|
|||||||
"models": [
|
"models": [
|
||||||
{
|
{
|
||||||
"id": "m1",
|
"id": "m1",
|
||||||
"type": "claude_cli",
|
"type": "local_openai",
|
||||||
"label": "Sonnet 4.6 (CLI)",
|
"label": "Claude Sonnet 4.6 (OpenRouter)",
|
||||||
"model_name": "claude-sonnet-4-6",
|
"model_name": "anthropic/claude-sonnet-4-6",
|
||||||
"provider": "anthropic",
|
"host_id": "abc123",
|
||||||
"credential_id": "cli",
|
|
||||||
"context_k": 200,
|
"context_k": 200,
|
||||||
"tags": ["chat", "persona"]
|
"tags": ["chat", "persona"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "m2",
|
"id": "m2",
|
||||||
"type": "gemini_api",
|
"type": "anthropic_api",
|
||||||
"label": "Gemini 2.5 Flash (OSIT)",
|
"label": "Claude Sonnet 4.6 (Direct)",
|
||||||
"model_name": "gemini-2.5-flash",
|
"model_name": "claude-sonnet-4-6",
|
||||||
"provider": "google",
|
"provider": "anthropic",
|
||||||
"account_id": "a1b2",
|
"credential_id": "key1",
|
||||||
"context_k": 1000,
|
"context_k": 200,
|
||||||
"tags": ["orchestrator", "research"]
|
"tags": ["chat"]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "m3",
|
"id": "m3",
|
||||||
@@ -111,7 +114,7 @@ Managed at **Settings → Models** (`/settings/models`). Full provider UI coming
|
|||||||
"label": "Gemma 4 E4B",
|
"label": "Gemma 4 E4B",
|
||||||
"model_name": "gemma4:e4b",
|
"model_name": "gemma4:e4b",
|
||||||
"provider": "local",
|
"provider": "local",
|
||||||
"host_id": "abc123",
|
"host_id": "def456",
|
||||||
"context_k": 72,
|
"context_k": 72,
|
||||||
"max_rounds": 5,
|
"max_rounds": 5,
|
||||||
"tools": true,
|
"tools": true,
|
||||||
@@ -120,8 +123,8 @@ Managed at **Settings → Models** (`/settings/models`). Full provider UI coming
|
|||||||
],
|
],
|
||||||
|
|
||||||
"roles": {
|
"roles": {
|
||||||
"chat": {"primary": "m1", "backup_1": "m2", "backup_2": "m3"},
|
"chat": {"primary": "m1", "backup_1": "m2"},
|
||||||
"orchestrator": {"primary": "m2", "backup_1": "m3"},
|
"orchestrator": {"primary": "m2"},
|
||||||
"distill": {"primary": "m1"}
|
"distill": {"primary": "m1"}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -145,52 +148,9 @@ Managed at **Settings → Models** (`/settings/models`). Full provider UI coming
|
|||||||
Set `api_url` to the base path before `/chat/completions`:
|
Set `api_url` to the base path before `/chat/completions`:
|
||||||
- OpenRouter: `https://openrouter.ai/api/v1`
|
- OpenRouter: `https://openrouter.ai/api/v1`
|
||||||
|
|
||||||
### Built-in model IDs
|
|
||||||
|
|
||||||
Always resolvable without a user-created registry entry. Used as role defaults.
|
|
||||||
|
|
||||||
| ID | Type | Notes |
|
|
||||||
|---|---|---|
|
|
||||||
| `claude_cli` | `claude_cli` | Model from `DEFAULT_MODEL` in `.env` |
|
|
||||||
| `gemini_cli` | `gemini_cli` | Gemini CLI subprocess |
|
|
||||||
| `gemini_api` | `gemini_api` | Model from `ORCHESTRATOR_MODEL` in `.env`; key from `GEMINI_API_KEY` |
|
|
||||||
|
|
||||||
### V1 → V2 migration
|
|
||||||
|
|
||||||
Automatic on first load. Changes:
|
|
||||||
- Adds `providers` section (Anthropic CLI credential + empty Google accounts)
|
|
||||||
- Migrates `gemini_api_key` from `auth.json` → `providers.google.accounts[0]`
|
|
||||||
- All existing hosts, models, and role assignments are preserved
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Claude Backend (`_claude()`)
|
## Local/OpenAI-Compatible Backend (`_local()`)
|
||||||
|
|
||||||
Runs `claude --print --no-session-persistence --output-format text` as a subprocess.
|
|
||||||
|
|
||||||
- System prompt passed via `--system-prompt`
|
|
||||||
- Conversation history formatted as `<conversation>` block
|
|
||||||
- Token read live from `~/.claude/.credentials.json` on every call — never uses the
|
|
||||||
env var, which goes stale after `claude auth login`
|
|
||||||
- Model override via `--model` flag when `model_name` is set in the registry entry
|
|
||||||
|
|
||||||
Timeout: `TIMEOUT_CLAUDE=60` seconds (`.env`)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Gemini CLI Backend (`_gemini()`)
|
|
||||||
|
|
||||||
Runs `gemini --output-format text --extensions "" -p <prompt>` as a subprocess.
|
|
||||||
|
|
||||||
- `--extensions ""` disables all MCP extensions — prevents child processes keeping pipes open
|
|
||||||
- `start_new_session=True` puts the process in its own group for clean `os.killpg` on timeout
|
|
||||||
- Output is cleaned to strip CLI noise (loading messages, retry notices, quota warnings)
|
|
||||||
|
|
||||||
Timeout: `TIMEOUT_GEMINI=120` seconds (`.env`)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Local Backend (`_local()`)
|
|
||||||
|
|
||||||
HTTP POST to an OpenAI-compatible endpoint. Model config is resolved via the model registry.
|
HTTP POST to an OpenAI-compatible endpoint. Model config is resolved via the model registry.
|
||||||
|
|
||||||
@@ -199,13 +159,36 @@ HTTP POST to an OpenAI-compatible endpoint. Model config is resolved via the mod
|
|||||||
# host_type "openai": POST {api_url}/chat/completions
|
# host_type "openai": POST {api_url}/chat/completions
|
||||||
```
|
```
|
||||||
|
|
||||||
|
System prompt is sent as the first `{"role": "system", "content": "..."}` message.
|
||||||
|
Image attachments are injected into the last user message as `image_url` content blocks.
|
||||||
|
Token usage is recorded when returned by the endpoint.
|
||||||
|
|
||||||
|
Streaming variant: `_local_streaming()` — SSE line-by-line, yields tokens via `token_sink`.
|
||||||
|
|
||||||
Timeout: `TIMEOUT_LOCAL=300` seconds (`.env`) — local models may need to load from disk.
|
Timeout: `TIMEOUT_LOCAL=300` seconds (`.env`) — local models may need to load from disk.
|
||||||
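As a rough illustration of the request shape described in this section, the sketch below builds a chat-completions payload with the system prompt first and image attachments folded into the last user message. The function name `build_chat_payload` and its parameters are assumptions for illustration; the content-part layout (`type: "text"` and `type: "image_url"`) follows the OpenAI chat completions format.

```python
def build_chat_payload(model_name, system_prompt, messages, image_urls=()):
    """Build an OpenAI-compatible chat payload (hypothetical helper)."""
    # The system prompt always goes first.
    out = [{"role": "system", "content": system_prompt}]
    out += [dict(m) for m in messages]

    if image_urls:
        # Walk backwards to find the most recent user message; index 0 is the system prompt.
        for i in range(len(out) - 1, 0, -1):
            if out[i]["role"] == "user":
                parts = [{"type": "text", "text": out[i]["content"]}]
                parts += [
                    {"type": "image_url", "image_url": {"url": url}}
                    for url in image_urls
                ]
                out[i] = {"role": "user", "content": parts}
                break

    return {"model": model_name, "messages": out}


if __name__ == "__main__":
    payload = build_chat_payload(
        "gemma4:e4b",
        "You are a helpful assistant.",
        [{"role": "user", "content": "What is in this picture?"}],
        image_urls=["data:image/png;base64,AAAA"],  # placeholder data URL
    )
    print(payload["messages"][-1]["content"][1]["type"])
```

The resulting dictionary would be sent as the JSON body of the POST to `{api_url}/chat/completions`.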
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Gemini API (Orchestrator)
|
## Anthropic API Backend (`_anthropic_api()`)
|
||||||
|
|
||||||
Used by `orchestrator_engine.py` for the ReAct tool loop. Not used for general chat.
|
Direct call to the Anthropic Messages API via the `anthropic` Python SDK.
|
||||||
|
|
||||||
|
System prompt passed as top-level `system` field. Messages stripped to `role`/`content` only.
|
||||||
|
Token usage is always recorded from `resp.usage`.
|
||||||
|
|
||||||
|
Streaming variant: `_anthropic_api_streaming()` — uses `client.messages.stream()`, yields
|
||||||
|
tokens via `token_sink`.
|
||||||
|
|
||||||
|
API key comes from the model registry: `providers.anthropic.credentials[n].api_key`.
|
||||||
|
|
||||||
|
Timeout: governed by httpx defaults and the Anthropic SDK's own connection handling.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Gemini API (Orchestrator only)
|
||||||
|
|
||||||
|
Used by `orchestrator_engine.py` for the ReAct tool loop. Not dispatched through
|
||||||
|
`llm_client.py` and not available for chat, distill, or other roles.
|
||||||
|
|
||||||
API key resolution order:
|
API key resolution order:
|
||||||
1. `api_key` embedded in the resolved orchestrator model config (V2 registry with `account_id`)
|
1. `api_key` embedded in the resolved orchestrator model config (V2 registry with `account_id`)
|
||||||
@@ -217,9 +200,7 @@ API key resolution order:
|
|||||||
## Distillation
|
## Distillation
|
||||||
|
|
||||||
Memory distillation uses `role="distill"`. Configure via Model Registry → Role Assignments.
|
Memory distillation uses `role="distill"`. Configure via Model Registry → Role Assignments.
|
||||||
|
Any `local_openai` or `anthropic_api` model can be assigned to the distill role.
|
||||||
`.env` override: `ROLE_DISTILL=claude_cli` (default).
|
|
||||||
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -232,4 +213,4 @@ Memory distillation uses `role="distill"`. Configure via Model Registry → Role
|
|||||||
| `cortex/routers/local_llm.py` | Settings UI routes + `/api/models/role` AJAX |
|
| `cortex/routers/local_llm.py` | Settings UI routes + `/api/models/role` AJAX |
|
||||||
| `cortex/routers/chat.py` | `_backend_label()`, `fallback_used` flag |
|
| `cortex/routers/chat.py` | `_backend_label()`, `fallback_used` flag |
|
||||||
| `cortex/routers/orchestrator.py` | Engine selection, Gemini API key resolution |
|
| `cortex/routers/orchestrator.py` | Engine selection, Gemini API key resolution |
|
||||||
| `cortex/config.py` | `ROLE_*` env defaults, `DEFINED_ROLES`, `PRIMARY_BACKEND` |
|
| `cortex/config.py` | `ROLE_*` env defaults, `DEFINED_ROLES`, `TIMEOUT_LOCAL` |
|
||||||
|
|||||||