feat: add shell_exec tool and fix orchestrator model name resolution

- Add shell_exec to orchestrator tool suite (system.py + __init__.py)
  Runs arbitrary shell commands on the Cortex host with timeout (1–120s),
  combined stdout/stderr output, optional working_dir, and exit code
  reporting. Enables system diagnostics (df, ls, ps, journalctl, etc.)
  from Agent mode.
- Fix orchestrator_engine.run() to use model_name from resolved registry entry
  Previously used settings.orchestrator_model (.env hardcode) regardless of
  what model was assigned to the orchestrator role. Now accepts model_name
  param and falls back to settings value only when registry has no model_name.
- Update ARCH__FUTURE.md: date, running host, local orchestrator status,
  model registry V2 progress, added Cortex Mesh concept (section 9)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
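
For readers without access to system.py, the shell_exec behavior described above (bounded timeout, combined stdout/stderr, optional working_dir, exit code reporting) could look roughly like the following. This is a minimal sketch, not the code in this commit: the function signature, the 30-second default, the choice to clamp rather than reject out-of-range timeouts, and the shape of the returned dict are all assumptions made for illustration.

```python
import subprocess
from typing import Optional

# Bounds taken from the commit message (1-120 s); the names are hypothetical.
MIN_TIMEOUT_S = 1
MAX_TIMEOUT_S = 120


def shell_exec(command: str, timeout: int = 30,
               working_dir: Optional[str] = None) -> dict:
    """Run a shell command and report its combined output and exit code."""
    # Assumption: out-of-range timeouts are clamped instead of rejected.
    timeout = max(MIN_TIMEOUT_S, min(MAX_TIMEOUT_S, int(timeout)))
    try:
        proc = subprocess.run(
            command,
            shell=True,                # arbitrary shell syntax, as the tool intends
            cwd=working_dir,           # None means "inherit the current directory"
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return {"exit_code": None, "output": partial, "timed_out": True}
    return {"exit_code": proc.returncode, "output": proc.stdout, "timed_out": False}
```

Because the tool runs arbitrary commands with `shell=True`, a sketch like this only makes sense behind whatever access control the orchestrator already enforces.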
2026-04-28 20:29:46 -04:00
parent 8baab874f1
commit 1cc7988953
5 changed files with 101 additions and 6 deletions
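
The model name fix in the commit message amounts to a simple precedence rule: prefer the model_name from the resolved registry entry, and use the settings value only when the registry has none. A minimal sketch of that rule follows; `resolve_model_name` and the dict-shaped registry entry are hypothetical stand-ins, not the actual signature of orchestrator_engine.run().

```python
from typing import Optional


def resolve_model_name(registry_entry: Optional[dict],
                       settings_default: str) -> str:
    """Pick the orchestrator model: registry first, settings as fallback."""
    # Assumption: the resolved registry entry is a dict that may carry "model_name".
    name = (registry_entry or {}).get("model_name")
    # An empty string or None counts as "no model_name" and triggers the fallback.
    return name if name else settings_default
```

Treating an empty string the same as a missing key is a design choice in this sketch; the real code may distinguish the two.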
--- a/documentation/ARCH__FUTURE.md
+++ b/documentation/ARCH__FUTURE.md
@@ -1,7 +1,7 @@
 # Architecture: Planned Features

 > What's next and how it's designed to work.
-> Last updated: 2026-04-04
+> Last updated: 2026-04-28

 For the current task list see `TODO__Agents.md`. For phases and priorities see `ROADMAP.md`.

@@ -9,7 +9,7 @@ For the current task list see `TODO__Agents.md`. For phases and priorities see `

 ## 1. Local Orchestrator

-**Status:** High priority — design complete, not yet built.
+**Status:** Partially built — `openai_orchestrator.py` exists and is wired into `POST /orchestrate`. If the `orchestrator` role in the model registry resolves to a `local_openai` model, it routes there automatically. Full parity with the Gemini orchestrator (tool loop quality, error handling, context budget enforcement) is still in progress.

 Same ReAct tool loop as the Gemini API orchestrator, but driven by a local model via Open WebUI's OpenAI-compatible API. Enables offline/private agent tasks with no API cost.

@@ -124,7 +124,7 @@ AE Journals becomes the searchable long-term knowledge base. Complements memory

 ## 5. Intelligent Model Routing

-**Status:** Deferred. Currently user-toggled.
+**Status:** Partially addressed. Model Registry V2 (2026-04-27) introduced role-based routing — `chat`, `orchestrator`, `distill`, `coder`, `research` roles each have their own primary/backup model chain, and the UI role toggle lets users manually select which role handles a message. Automatic task-characteristic routing (below) is still deferred.

 Route automatically based on task characteristics rather than requiring manual backend selection:

@@ -183,10 +183,31 @@ The Claude Code system prompt was leaked in early April 2026. Two reimplementati

 **Status:** Deferred.

-Currently running on `scott_lpt` (main laptop). Long-term target: home server (always-on, Docker).
+Currently running on `scott-lt-i7-rtx` (gaming/agents laptop). Disabled on `scott_lpt` (2026-04-28) — that machine is a dev/editing node only. Long-term target: home server (always-on, Docker).

 `docker-compose.yml` already exists in the project root. Deployment path:
 1. Copy to home server
 2. Configure reverse proxy (Nginx, already Docker-hosted)
 3. Set subdomain `cortex.dgrzone.com` → home server internal IP
 4. WireGuard required for all access — not internet-exposed
+
+---
+
+## 9. Cortex Mesh (Multi-Instance Fleet)
+
+**Status:** Concept — no design yet.
+
+Rather than a single Cortex instance, each device in the fleet runs its own instance with its own persona(s), local models, and capabilities. Instances can delegate tasks to each other based on available resources and roles.
+
+**Use cases:**
+- `scott_lpt` (edit/dev node) delegates code tasks to `scott-lt-i7-rtx` (GPU/Ollama host)
+- A background cron on one instance triggers an orchestrated task on another
+- Each instance has its own "best available" model — mesh routing picks the right node automatically
+
+**Design questions to resolve:**
+- Auth between instances (shared JWT secret vs. per-instance API keys)
+- How instances advertise capabilities (model registry over HTTP? shared Syncthing file?)
+- Whether `ae_send_message` / the existing inbox system is the right coordination layer or if a dedicated Cortex-to-Cortex protocol is needed
+- Session continuity — does a conversation that starts on one node stay there, or can it migrate?
+
+The Syncthing-synced `home/` directory and shared `model_registry.json` already provide a natural foundation — instances share persona memory and context without a central DB.