feat: Home Assistant API tools (ha_get_state, ha_get_states, ha_call_service)

Register three HA orchestrator tools so Inara can read device states and control devices via the HA REST API. ha_call_service requires admin role and user confirmation. Also includes accumulated UI fixes (setProcessing helper, wasNewSession flag cleanup). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:39:35 -04:00
parent ba91de37c5
commit fc6600c33e
5 changed files with 334 additions and 15 deletions
--- a/documentation/DESIGN__Model_Registry_V2.md
+++ b/documentation/DESIGN__Model_Registry_V2.md
@@ -90,7 +90,8 @@ Stored in `home/{user}/model_registry.json`.
  "models": [
    {"id": "m1", "type": "claude_cli",   "label": "Sonnet 4.6 (CLI)",     "model_name": "claude-sonnet-4-6",  "provider": "anthropic", "credential_id": "cli",  "context_k": 1000, "tags": []},
    {"id": "m2", "type": "gemini_api",   "label": "Gemini 2.5 Flash",     "model_name": "gemini-2.5-flash",   "provider": "google",    "account_id": "a1b2",    "context_k": 1000, "tags": []},
-    {"id": "m3", "type": "local_openai", "label": "Gemma 4 E4B",          "model_name": "gemma4:e4b",         "provider": "local",     "host_id": "h1",         "context_k": 72,   "tags": []}
+    {"id": "m3", "type": "local_openai", "label": "Gemma 4 E4B",          "model_name": "gemma4:e4b",         "provider": "local",     "host_id": "h1",         "context_k": 72,   "tags": []},
+    {"id": "m4", "type": "local_openai", "label": "DeepSeek: V4 Flash",   "model_name": "deepseek/deepseek-v4-flash", "provider": "local", "host_id": "h1", "context_k": 750, "reasoning_budget_tokens": 4096, "tags": ["frontier"]}
  ],
  "roles": {
    "chat":        {"primary": "m1", "backup_1": "m2", "backup_2": "m3"},
@@ -109,6 +110,15 @@ Stored in `home/{user}/model_registry.json`.
 | `gemini_api` | Currently: Gemini CLI (gap — see Phase 4) | Should use google-genai SDK |
 | `local_openai` | HTTP to OpenAI-compatible endpoint | host_type controls path |

+### Optional model fields
+
+| Field | Type | Default | Meaning |
+|---|---|---|---|
+| `context_k` | int | 32 | Context window in thousands of tokens. Used for compaction budget (75% of window). |
+| `max_rounds` | int \| null | null | Per-model tool loop cap. `null` = use global `orchestrator_max_rounds`. Effective limit = `min(per_model, global)`. |
+| `tools` | bool | true | Whether this model supports tool calling. `false` = skip tool loop entirely; model gets a plain chat request. |
+| `reasoning_budget_tokens` | int \| null | null | Per-model reasoning/thinking budget for models that support it (e.g., DeepSeek V4 via OpenRouter). `null` = no reasoning override. When set, injected as `{"reasoning": {"budget_tokens": <value>}}` in the API call to OpenRouter-compatible endpoints. |
+
 ### Built-in model IDs

 Always resolvable without a registry entry (used as `.env` role defaults):
@@ -196,4 +206,4 @@ the orchestrator role can now be a local model.
 - Claude direct API key support (alternative to CLI OAuth)
 - OpenRouter as a named provider (already works as local host; could be promoted)
 - Per-role "test" button in role assignments UI
- Per-user catalog additions (extend ANTHROPIC_CATALOG / GOOGLE_CATALOG from UI)
+- Per-user catalog additions (extend ANTHROPIC_CATALOG / GOOGLE_CATALOG from UI)