feat: multi-level agent management — background agents, lifecycle tools, 3-level hierarchy
agent_manager.py (new):
- AgentRecord dataclass: agent_id, level (1/2/3), role, task, status, started,
  parent_id (lineage), finished, result, notify, _task_ref
- register() / finish() / cancel_agent() / list_agents() / get() / set_task_ref()
- Calls notification.notify() on completion when notify=True (same channel as
  reminders and cron completions)
- 24-hour pruning of completed records on each new registration

spawn_agent (tools/agents.py):
- background=True: fires asyncio.create_task(), registers in agent_manager,
  returns agent_id string immediately — sync path unchanged (no regression)
- notify=True: push/Talk notification when the background task completes
- Level enforcement: _agent_level param tracks hierarchy depth; when spawning
  from Level 2, child automatically gets spawn_agent + aider_run denied so
  Level 3 agents cannot delegate further

New lifecycle tools (tools/agents.py + __init__.py):
- agent_status(agent_id) — status, role, level, elapsed, task, result preview; user-level
- agent_list(status, limit) — all agents for current user, newest first; user-level
- agent_cancel(agent_id) — kills background task; admin-only, confirm-required

tests/test_agent_manager.py (new, 41 tests):
- agent_manager CRUD, pruning, notification hook
- spawn_agent background: returns immediately, completes async, timeout, failure
- Level enforcement: L1→L2 permits spawn, L2→L3 auto-denies; explicit tool_list path
- agent_status / agent_list / agent_cancel output formatting
- aider_run background: returns agent_id, completes async, sync path unchanged
- All tests run without browser or Cortex service (~2.5s total)

Run: cd cortex && .venv/bin/python -m pytest tests/test_agent_manager.py -v

Docs: ARCH__FUTURE.md §13 (full design), ROADMAP.md, TODO__Agents.md, MASTER.md,
HELP.md (orchestrator description corrected, tool schema line updated to reflect
keyword routing), CLAUDE.md tool count 66→69.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -317,6 +317,149 @@ This pattern maps naturally to several existing concepts:

---

## 13. Multi-Level Agent Management

**Status:** Design complete — implementation not yet started. See `TODO__Agents.md` for the task breakdown.

Cortex personas can spawn specialized sub-agents to handle parallel or long-running work.
Sub-agents can in turn spawn lightweight support agents for simple subtasks. The hierarchy
is capped at three levels to prevent runaway delegation.

### Level Definitions

| Level | Name | Created by | Can spawn | Tool scope |
|---|---|---|---|---|
| **1** | Cortex Persona (Inara) | HTTP request / cron | Level 2 | Full orchestrator tool set |
| **2** | Specialized Sub-Agent | Level 1 `spawn_agent` | Level 3 only | Role-scoped; `spawn_agent` auto-restricted so children are Level 3 |
| **3** | Basic Support Agent | Level 2 `spawn_agent` | Nothing | Narrow tool set; `spawn_agent` and `aider_run` denied |

**Examples:**
- Level 1 spawns a Level 2 **Coder** agent (has file + git + shell tools; can spawn a Level 3 syntax-checker)
- Level 1 spawns a Level 2 **Research** agent (web tools only; can spawn a Level 3 web reader for parallel page fetches)
- Level 2 spawns a Level 3 **Support** agent for a focused subtask (web_search only, no writes, no further delegation)

### Core Problem: Everything is Currently Synchronous

Both `spawn_agent` and `aider_run` block the calling coroutine for their full duration
(default 120s / 300s respectively). Level 1 (Inara) cannot respond to the user, send
notifications, or inspect other agents while waiting. For 5-minute Aider runs or multi-step
research agents this is unusable — the user sees nothing until completion or timeout.

### Design

#### 1. Agent Manager (`cortex/agent_manager.py`)

A lightweight in-process registry of running and recently completed agents. Module-level
dict protected by `asyncio.Lock()`:

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AgentRecord:
    agent_id: str             # UUID
    level: int                # 1 / 2 / 3
    role: str                 # e.g. "coder", "research"
    task: str                 # first 200 chars of the task
    status: str               # running / done / failed / cancelled / timeout
    started: datetime
    finished: datetime | None
    parent_id: str | None     # lineage — which agent spawned this one
    result: str | None        # populated on completion (first 500 chars)
    notify: bool              # fire web_push/NC Talk notification on completion
    user: str


_agents: dict[str, AgentRecord] = {}
_lock = asyncio.Lock()
```

On completion, the manager calls `notify()` from `notification.py` if `notify=True`; this is
the same function used by reminder checks and cron completions. Completed agents stay in the
registry for 24 hours, then are pruned on next access.

#### 2. Background Mode for `spawn_agent`

Add `background: bool = False` and `notify: bool = False` to `spawn_agent`. When
`background=False` (default): existing synchronous blocking behaviour — unchanged, no
regression. When `background=True`: wraps the run in `asyncio.create_task()`, registers
in the agent manager, returns an `agent_id` string immediately.

```python
# Level 1 — non-blocking delegation:
agent_id = await spawn_agent(
    task="Research Zigbee mesh repeaters; summarize findings to my journal",
    role="research",
    background=True,
    notify=True,   # web_push + NC Talk when done
)
# Returns "550e8400-..." immediately. Inara continues responding to the user.
```

#### 3. Agent Lifecycle Tools

Three new tools, wired into `cortex/tools/__init__.py` under the "Agents" category:

| Tool | Params | Description |
|---|---|---|
| `agent_status(agent_id)` | `agent_id: str` | Status, role, task, elapsed, result preview |
| `agent_list(status=None, limit=10)` | `status: str \| None`, `limit: int` | All agents for current user; filter by status |
| `agent_cancel(agent_id)` | `agent_id: str` | Cancel a running background agent (admin, confirm-required) |

Level 1 can call these between tool rounds to check on delegated work without blocking.

#### 4. Level Enforcement

`agent_level` is passed through `spawn_agent` calls as a ContextVar so each agent knows
where it sits in the hierarchy. Enforcement is automatic and simple:

- **L1 → spawns L2:** `spawn_agent` called normally. Child agent inherits role tools.
- **L2 → spawns L3:** `spawn_agent` automatically adds `deny_tools=["spawn_agent", "aider_run"]`
  to the child's effective tool set. Level 3 agents cannot further delegate.
- **Level 3:** `spawn_agent` and `aider_run` are never in the tool list.

Level is stored in `AgentRecord.level` — the lineage (`parent_id`) provides a full call tree.

#### 5. `aider_run` Background Mode

Add `background: bool = False` and `notify: bool = False` to `aider_run`. When `True`,
runs the Aider subprocess via `asyncio.create_task()`, registers in the agent manager,
returns `agent_id` immediately. When called in background mode, `aider_run` is removed
from `CONFIRM_REQUIRED` — the user is not blocking on a confirmation gate since the call
returns instantly.

```python
# Level 1 or 2 — fire and forget a code change:
agent_id = await aider_run(
    project="cortex",
    task="Add max_chars param to http_fetch in tools/web.py, cap at 32768",
    background=True,
    notify=True,
)
```

### Implementation Order

1. **`agent_manager.py`** — AgentRecord + registry CRUD + completion notification hook.
   Foundation for everything else; ~100 lines.
2. **`spawn_agent` background mode** — `background` + `notify` + `agent_level` params;
   `asyncio.create_task()`; registers in manager. Existing sync path unchanged.
3. **`agent_status` / `agent_list` / `agent_cancel`** — wire into `__init__.py`; add to
   `TOOL_CATEGORIES["Agents"]`, `TOOL_ROLES` (cancel = admin), `CONFIRM_REQUIRED` (cancel).
4. **Level enforcement** — `agent_level` ContextVar; auto `deny_tools` at L2→L3 boundary.
5. **`aider_run` background mode** — same pattern as step 2.

### Files to Create/Modify

| File | Change |
|---|---|
| `cortex/agent_manager.py` | **New** — AgentRecord, registry dict, start/finish/cancel/list functions |
| `cortex/tools/agents.py` | Add `background`, `notify`, `agent_level` to `spawn_agent`; add `agent_status`, `agent_list`, `agent_cancel` functions + declarations |
| `cortex/tools/aider.py` | Add `background`, `notify` params; register with agent_manager when background |
| `cortex/tools/__init__.py` | Register new agent tools; update TOOL_CATEGORIES, TOOL_ROLES, CONFIRM_REQUIRED |

See §12 for the existing `allow_tools` / `deny_tools` per-call restrictions that level
enforcement builds on.

---

## 12. Spawner-Level Tool Restrictions — `spawn_agent` Permission Control

**Status:** Design complete, not yet built.

@@ -1,7 +1,7 @@
# Cortex — Master Index

> Start here. This document is a map, not a manual.
> Last updated: 2026-05-13
> Last updated: 2026-06-03
>
> **Documentation philosophy:** Cortex is a no-black-box system. Docs must match reality.
> Update docs before implementing significant changes. Verify they still match after.
@@ -26,7 +26,7 @@ Cortex is a self-hosted personal AI platform. It routes messages from any input
| Claude backend | ✅ Live | Primary — via Claude Code CLI |
| Gemini backend | ✅ Live | Fallback — via Gemini CLI |
| Local backend | ✅ Live | Open WebUI/Ollama on scott_gaming; per-user multi-model config |
| Gemini orchestrator | ✅ Live | Tool loop → Claude response, ⚡ toggle in UI (62 tools) |
| Gemini orchestrator | ✅ Live | Tool loop → Claude response, ⚡ toggle in UI (66 tools) |
| Local orchestrator | ✅ Live | OpenAI-compatible ReAct loop; used when orchestrator role → local model |
| Model registry V2 | ✅ Live | Providers (Anthropic/Google/Local), multi-account Gemini, role assignments |
| Memory distillation | ✅ Live | Short (daily) / Mid (weekly) / Long (monthly) |
@@ -38,12 +38,13 @@ Cortex is a self-hosted personal AI platform. It routes messages from any input
| Token usage tracking | ✅ Live | Per-user daily buckets in `home/{user}/usage.json`; visible in Settings |
| Web push notifications | ✅ Live | VAPID push; `web_push` orchestrator tool; subscribe via ☰ menu |
| Proactive notifications | ✅ Live | Daily reminder check (09:00); distill/cron completion alerts; dedicated `/settings/notifications` page |
| Sub-agent spawning | ✅ Live | `spawn_agent` tool — synchronous sub-agents via any configured model |
| Sub-agent spawning | ✅ Live | `spawn_agent` tool — sync or background; `agent_status`/`agent_list`/`agent_cancel`; 3-level hierarchy (L2→L3 enforcement built in) |
| Aider coding agent | ✅ Live | `aider_run` tool — Aider subprocess; model-agnostic (DeepSeek, Ollama, OpenRouter, etc.) |
| Agent private notes | ✅ Live | `AGENT_NOTES.md` — orchestrator-only notepad; 3 rolling backups; user-visible as read-only |
| Distill safety | ✅ Live | Per-persona asyncio lock, per-endpoint cooldowns, Rebuild option |
| Guided onboarding | ✅ Live | Setup Step 3 for OpenRouter; existing-user banner; settings quick-link |

**65 orchestrator tools** across 17 domain modules — added 2026-05-12: `file_diff`, `git_status` / `git_log` / `git_diff` (read-only git inspection), `ae_db_query` / `ae_db_describe` / `ae_db_show_view` (SELECT-only Aether MariaDB access, admin, per-user credentials). `/settings/integrations` page added (admin-only). File attachments in chat (images for vision-capable local models; text/code files for all backends). Settings pages unified under `pg.css`. Added 2026-05-13: `task` cron type (full orchestrator loop on a schedule); monthly/yearly schedule formats (`monthly`, `monthly:DD:HH:MM`, `yearly:MM:DD:HH:MM`); Schedules web UI at `/settings/crons` (list, add, edit, pause, delete); HA inbound webhook tools toggle (orchestrator vs. direct LLM); Anthropic API key backend (`anthropic_api` model type via Anthropic SDK — alternative to CLI OAuth); Cloud APIs catalog in Model Registry — named provider picker (OpenRouter, OpenAI, Groq, X.ai/Grok, Together.ai, Fireworks.ai, Custom) with auto-filled URLs; hosts split into Cloud APIs / Local Hosts sections. Added 2026-05-15: Per-user custom roles — three required roles (`chat`, `orchestrator`, `distill`) are always present; users can add/remove custom roles (e.g. `coder`, `research`) via the Model Registry UI; existing `.env`-defined roles auto-migrated. Settings pages (`local_llm.html` + all settings pages) migrated to Tailwind CSS CDN (no build step); `preflight: false` preserves `pg.css` base styles; `input[type=checkbox/radio]` global width fix in `pg.css`; `btn-submit` now responsive (`w-full md:w-96`).
**69 orchestrator tools** across 17 domain modules — added 2026-06-03: `agent_status`/`agent_list` (user-level)/`agent_cancel` (admin, confirm-required); background mode for `spawn_agent` (`background=True` returns agent_id immediately; `notify=True` sends push on completion); `agent_manager.py` registry with lineage tracking and 24h pruning; L2→L3 level enforcement auto-denies `spawn_agent`/`aider_run` in Level 3 children. Added 2026-05-23: `aider_run` (Aider coding agent subprocess; project aliases for cortex/aether_api/aether_frontend/aether_container; model-agnostic via `.aider.conf.yml` or env vars; admin-only, confirm-required). `.aider.conf.yml` added to project root (read-only context, Python lint-cmd, auto-commits). Added 2026-05-12: `file_diff`, `git_status` / `git_log` / `git_diff` (read-only git inspection), `ae_db_query` / `ae_db_describe` / `ae_db_show_view` (SELECT-only Aether MariaDB access, admin, per-user credentials). `/settings/integrations` page added (admin-only). File attachments in chat (images for vision-capable local models; text/code files for all backends). Settings pages unified under `pg.css`. Added 2026-05-13: `task` cron type (full orchestrator loop on a schedule); monthly/yearly schedule formats (`monthly`, `monthly:DD:HH:MM`, `yearly:MM:DD:HH:MM`); Schedules web UI at `/settings/crons` (list, add, edit, pause, delete); HA inbound webhook tools toggle (orchestrator vs. direct LLM); Anthropic API key backend (`anthropic_api` model type via Anthropic SDK — alternative to CLI OAuth); Cloud APIs catalog in Model Registry — named provider picker (OpenRouter, OpenAI, Groq, X.ai/Grok, Together.ai, Fireworks.ai, Custom) with auto-filled URLs; hosts split into Cloud APIs / Local Hosts sections. Added 2026-05-15: Per-user custom roles — three required roles (`chat`, `orchestrator`, `distill`) are always present; users can add/remove custom roles (e.g. `coder`, `research`) via the Model Registry UI; existing `.env`-defined roles auto-migrated. 
Settings pages (`local_llm.html` + all settings pages) migrated to Tailwind CSS CDN (no build step); `preflight: false` preserves `pg.css` base styles; `input[type=checkbox/radio]` global width fix in `pg.css`; `btn-submit` now responsive (`w-full md:w-96`).

**Active users / personas:** scott/inara, holly/tina, brian/wintermute

@@ -48,6 +48,8 @@
- ✅ `http_post` — POST to external URLs with per-user URL prefix allowlist; admin-only, confirm-required
- ✅ `nc_talk_history` — read recent NC Talk messages; requires nc_username + nc_app_password in channels.json
- ✅ Local orchestrator retry — exponential backoff on 429/5xx/connection errors (3 attempts)
- ✅ Multi-level agent management — `agent_manager.py` (registry + lifecycle), background `spawn_agent`, `agent_status`/`agent_list`/`agent_cancel` tools, 3-level hierarchy enforcement (see `ARCH__FUTURE.md` §13)
- ✅ `aider_run` background mode — background task + push notification on completion; sync path unchanged
- [ ] Knowledge import — markdown → AE Journals (import script)
- [ ] Dev agent pipeline — specialist agents + supervisor + approval gate
- [ ] Gitea webhook integration + Actions CI

@@ -67,6 +67,57 @@ automatically. Remaining work is quality/reliability parity, not ground-up desig
- [x] **`email_send`** — SMTP via email_utils, per-user regex allowlist in `home/{user}/email_allowlist.json`, managed via Settings UI textarea + Files panel raw editor — 2026-04-29
- [x] **`web_push`** — VAPID push via pywebpush; subscriptions in `home/{user}/push_subscriptions.json`; "Enable notifications" toggle in ☰ menu; sw.js push+notificationclick handlers — 2026-05-05

### [Agents] Multi-Level Agent Management

Design: `documentation/ARCH__FUTURE.md` §13

Three-level hierarchy: Level 1 = Cortex Persona; Level 2 = Specialized Sub-Agent
(can spawn Level 3); Level 3 = Basic Support Agent (cannot spawn). All spawning is
currently synchronous and blocking — this makes long-running agents (Aider, research
pipelines) unusable without freezing the orchestrator.

**Phase 1 — Foundation (build first):**
- [x] **`cortex/agent_manager.py`** — `AgentRecord` dataclass (agent_id, level, role,
  task, status, started, parent_id, result, notify, user); module-level registry dict
  with `asyncio.Lock()`; `register()`, `finish()`, `cancel_agent()`,
  `list_agents(user, status)` functions; calls `notification.notify()` on completion
  when `notify=True`; prune records older than 24 hours on next register — 2026-06-03
- [x] **Background mode for `spawn_agent`** — added `background: bool = False` and
  `notify: bool = False` params; when `background=True`, wraps `_run()` in
  `asyncio.create_task()`, registers in agent_manager, returns agent_id immediately;
  existing sync path unchanged — 2026-06-03
- [x] **`agent_status(agent_id)` tool** — returns status, role, task excerpt, elapsed
  seconds, result preview (first 300 chars); user-level — 2026-06-03
- [x] **`agent_list(status=None, limit=10)` tool** — returns running + recent agents for
  current user; filter by `status`; user-level — 2026-06-03
- [x] **`agent_cancel(agent_id)` tool** — cancels background task via stored
  `asyncio.Task` reference; admin-only, confirm-required — 2026-06-03

**Phase 2 — Level enforcement:**
- [ ] **`agent_level` ContextVar** — set to 1 in the main orchestrators; passed into
  `spawn_agent` so sub-agents know their level
- [ ] **Auto-deny at L2→L3 boundary** — when Level 2 calls `spawn_agent`, automatically
  add `deny_tools=["spawn_agent", "aider_run"]` so the Level 3 child cannot delegate;
  store level in `AgentRecord` for lineage tracking

**Phase 3 — `aider_run` async:**
- [x] **`aider_run` background mode** — added `background: bool = False` and
  `notify: bool = False` params; runs subprocess via `asyncio.create_task()`, registers
  in agent_manager, returns agent_id immediately; confirmation still required (correct
  — user confirms before the tool runs, not during) — 2026-06-03
- [x] **Register new tools in `__init__.py`** — `agent_status`, `agent_list`, `agent_cancel`
  in `TOOL_CATEGORIES["Agents"]`; `agent_cancel` in `TOOL_ROLES` (admin) and
  `CONFIRM_REQUIRED`; added to `_CALLABLES` and `_ALL_DECLARATIONS` — 2026-06-03

**Tests:**
- [x] **`cortex/tests/test_agent_manager.py`** — 41 tests covering: agent_manager CRUD,
  prune, notify hook, spawn_agent background mode (returns immediately, completes async,
  timeout, failure), level enforcement (L1→L2 permits, L2→L3 auto-denies), agent
  lifecycle tools output, aider_run background mode — 2026-06-03
  Run: `cd cortex && .venv/bin/python -m pytest tests/test_agent_manager.py -v`

---

### [Tools] Orchestrator tool expansions — Round 2
Next additions identified 2026-05-08. See `ARCH__FUTURE.md` §2 for design notes.

@@ -227,11 +278,23 @@ Every orchestrator tool invocation logged to `home/{user}/tool_audit/YYYY-MM-DD.

### [Intelligence] Dev agent pipeline
See `ARCH__Intelligence_Layer.md`. Full design not yet started.

`aider_run` (2026-05-23) provides the execution layer — Cortex dispatches to Aider as
the coding worker. Aider is model-agnostic (DeepSeek, Ollama, OpenRouter, etc.) and
fully scriptable via `--message --yes-always`. This replaces the Claude Code subprocess
dependency for coding tasks. Per-project `.aider.conf.yml` holds read-only context files
and lint commands; model/key come from env vars (not committed).

- [x] **`aider_run` tool** — `cortex/tools/aider.py`; project aliases + subprocess with `--message --yes-always`; admin-only, confirm-required, high risk — 2026-05-23
- [ ] **`aider_run` async/notify** — current implementation blocks for up to 5 min with no UI feedback; convert to background task + web_push/NC Talk notification on completion (same pattern as `/orchestrate` jobs); optionally add `aider_status` tool for mid-task polling
- [x] **`.aider.conf.yml`** — project-level Aider config: `read: [CLAUDE.md]`, Python lint-cmd, auto-commits — 2026-05-23
- [x] **`.gitignore`** — added `.aider.chat.history.md`, `.aider.input.history`, `.aider.llm.history` — 2026-05-23
- [ ] Specialist agent: frontend (SvelteKit) code changes
- [ ] Specialist agent: backend (FastAPI) code changes
- [ ] Supervisor agent: diff review, syntax check, test runner
- [ ] Gitea webhook integration: trigger on push/PR, report back
- [ ] Human approval gate before commit
- [ ] `.aider.conf.yml` for aether_api, aether_frontend, aether_container projects

### [Intelligence] Supervisor agent
- Runs `py_compile`, `svelte-check`, unit tests after specialist agent work
