feat: multi-level agent management — background agents, lifecycle tools, 3-level hierarchy

agent_manager.py (new):
- AgentRecord dataclass: agent_id, level (1/2/3), role, task, status, started,
  parent_id (lineage), finished, result, notify, _task_ref
- register() / finish() / cancel_agent() / list_agents() / get() / set_task_ref()
- Calls notification.notify() on completion when notify=True (same channel as
  reminders and cron completions)
- 24-hour pruning of completed records on each new registration

spawn_agent (tools/agents.py):
- background=True: fires asyncio.create_task(), registers in agent_manager, returns
  agent_id string immediately — sync path unchanged (no regression)
- notify=True: push/Talk notification when the background task completes
- Level enforcement: _agent_level param tracks hierarchy depth; when spawning from
  Level 2, child automatically gets spawn_agent + aider_run denied so Level 3 agents
  cannot delegate further

New lifecycle tools (tools/agents.py + __init__.py):
- agent_status(agent_id) — status, role, level, elapsed, task, result preview; user-level
- agent_list(status, limit) — all agents for current user, newest first; user-level
- agent_cancel(agent_id) — kills background task; admin-only, confirm-required

tests/test_agent_manager.py (new, 41 tests):
- agent_manager CRUD, pruning, notification hook
- spawn_agent background: returns immediately, completes async, timeout, failure
- Level enforcement: L1→L2 permits spawn, L2→L3 auto-denies; explicit tool_list path
- agent_status / agent_list / agent_cancel output formatting
- aider_run background: returns agent_id, completes async, sync path unchanged
- All tests run without browser or Cortex service (~2.5s total)
  Run: cd cortex && .venv/bin/python -m pytest tests/test_agent_manager.py -v

Docs: ARCH__FUTURE.md §13 (full design), ROADMAP.md, TODO__Agents.md, MASTER.md,
HELP.md (orchestrator description corrected, tool schema line updated to reflect
keyword routing), CLAUDE.md tool count 66→69.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scott Idem
2026-06-03 22:40:20 -04:00
parent 29d8aa4aae
commit 658c508925
9 changed files with 1307 additions and 26 deletions

@@ -67,6 +67,57 @@ automatically. Remaining work is quality/reliability parity, not ground-up desig
- [x] **`email_send`** — SMTP via email_utils, per-user regex allowlist in `home/{user}/email_allowlist.json`, managed via Settings UI textarea + Files panel raw editor — 2026-04-29
- [x] **`web_push`** — VAPID push via pywebpush; subscriptions in `home/{user}/push_subscriptions.json`; "Enable notifications" toggle in ☰ menu; sw.js push+notificationclick handlers — 2026-05-05
### [Agents] Multi-Level Agent Management
Design: `documentation/ARCH__FUTURE.md` §13
Three-level hierarchy: Level 1 = Cortex Persona; Level 2 = Specialized Sub-Agent
(can spawn Level 3); Level 3 = Basic Support Agent (cannot spawn). Before Phase 1,
all spawning was synchronous and blocking, which made long-running agents (Aider,
research pipelines) unusable without freezing the orchestrator.
**Phase 1 — Foundation (build first):**
- [x] **`cortex/agent_manager.py`** — `AgentRecord` dataclass (agent_id, level, role,
task, status, started, parent_id, result, notify, user); module-level registry dict
with `asyncio.Lock()`; `register()`, `finish()`, `cancel_agent()`,
`list_agents(user, status)` functions; calls `notification.notify()` on completion
when `notify=True`; prune records older than 24 hours on next register — 2026-06-03
- [x] **Background mode for `spawn_agent`** — added `background: bool = False` and
`notify: bool = False` params; when `background=True`, wraps `_run()` in
`asyncio.create_task()`, registers in agent_manager, returns agent_id immediately;
existing sync path unchanged — 2026-06-03
- [x] **`agent_status(agent_id)` tool** — returns status, role, task excerpt, elapsed
seconds, result preview (first 300 chars); user-level — 2026-06-03
- [x] **`agent_list(status=None, limit=10)` tool** — returns running + recent agents for
current user; filter by `status`; user-level — 2026-06-03
- [x] **`agent_cancel(agent_id)` tool** — cancels background task via stored
`asyncio.Task` reference; admin-only, confirm-required — 2026-06-03
**Phase 2 — Level enforcement:**
- [ ] **`agent_level` ContextVar** — set to 1 in the main orchestrators; passed into
`spawn_agent` so sub-agents know their level
- [ ] **Auto-deny at L2→L3 boundary** — when Level 2 calls `spawn_agent`, automatically
add `deny_tools=["spawn_agent", "aider_run"]` so the Level 3 child cannot delegate;
store level in `AgentRecord` for lineage tracking
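
One way the level rule described above could look, as a hedged sketch rather than the planned implementation: `child_spawn_policy()` and its return shape are hypothetical, and only the `agent_level` ContextVar name and the two denied tool names come from the items above.

```python
import contextvars
from typing import Iterable, Optional

MAX_LEVEL = 3
NO_DELEGATION_TOOLS = ("spawn_agent", "aider_run")

# Defaults to 1 so code running in the main orchestrator counts as Level 1.
agent_level: contextvars.ContextVar[int] = contextvars.ContextVar("agent_level", default=1)


def child_spawn_policy(deny_tools: Optional[Iterable[str]] = None,
                       parent_level: Optional[int] = None) -> tuple[int, list[str]]:
    """Return (child_level, deny_tools) for a child spawned at the current level."""
    level = agent_level.get() if parent_level is None else parent_level
    if not 1 <= level < MAX_LEVEL:
        raise PermissionError(f"Level {level} agents cannot spawn children")
    child_level = level + 1
    denied = list(dict.fromkeys(deny_tools or ()))  # de-duplicate, keep order
    if child_level == MAX_LEVEL:
        # A Level 3 child must not be able to delegate any further.
        for tool in NO_DELEGATION_TOOLS:
            if tool not in denied:
                denied.append(tool)
    return child_level, denied
```

Called from Level 1 with no arguments this yields `(2, [])`; called from Level 2 it appends `spawn_agent` and `aider_run` to whatever the caller already denied.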
**Phase 3 — `aider_run` async:**
- [x] **`aider_run` background mode** — added `background: bool = False` and
`notify: bool = False` params; runs subprocess via `asyncio.create_task()`, registers
in agent_manager, returns agent_id immediately; confirmation still required (correct
— user confirms before the tool runs, not during) — 2026-06-03
- [x] **Register new tools in `__init__.py`**: `agent_status`, `agent_list`, `agent_cancel`
in `TOOL_CATEGORIES["Agents"]`; `agent_cancel` in `TOOL_ROLES` (admin) and
`CONFIRM_REQUIRED`; added to `_CALLABLES` and `_ALL_DECLARATIONS` — 2026-06-03
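
The background subprocess pattern for `aider_run` might look something like the sketch below. It assumes a 300-second timeout and made-up status strings, and `build_aider_command()`, `run_command()`, and `start_background()` are hypothetical helpers; only the `--message --yes-always` flags are taken from this document. Registration in agent_manager and the confirmation step are left out.

```python
import asyncio
from typing import Sequence


def build_aider_command(message: str) -> list[str]:
    """Non-interactive Aider invocation using the flags named in this document."""
    return ["aider", "--message", message, "--yes-always"]


async def run_command(cmd: Sequence[str], timeout: float = 300.0) -> tuple[str, str]:
    """Run cmd to completion and return (status, combined stdout/stderr text)."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        proc.kill()        # do not leave the child running after a timeout
        await proc.wait()
        return "timeout", ""
    status = "done" if proc.returncode == 0 else "failed"
    return status, out.decode(errors="replace")


def start_background(cmd: Sequence[str], timeout: float = 300.0) -> asyncio.Task:
    """Schedule the subprocess run and hand the task back without waiting."""
    return asyncio.create_task(run_command(cmd, timeout))
```

Because the user confirms before the tool is invoked, nothing in the background task needs to prompt; the task only has to record its final status and output when the subprocess exits.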
**Tests:**
- [x] **`cortex/tests/test_agent_manager.py`** — 41 tests covering: agent_manager CRUD,
prune, notify hook, spawn_agent background mode (returns immediately, completes async,
timeout, failure), level enforcement (L1→L2 permits, L2→L3 auto-denies), agent
lifecycle tools output, aider_run background mode — 2026-06-03
Run: `cd cortex && .venv/bin/python -m pytest tests/test_agent_manager.py -v`
---
### [Tools] Orchestrator tool expansions — Round 2
Next additions identified 2026-05-08. See `ARCH__FUTURE.md` §2 for design notes.
@@ -227,11 +278,23 @@ Every orchestrator tool invocation logged to `home/{user}/tool_audit/YYYY-MM-DD.
### [Intelligence] Dev agent pipeline
See `ARCH__Intelligence_Layer.md`. Full design not yet started.
`aider_run` (2026-05-23) provides the execution layer — Cortex dispatches to Aider as
the coding worker. Aider is model-agnostic (DeepSeek, Ollama, OpenRouter, etc.) and
fully scriptable via `--message --yes-always`. This replaces the Claude Code subprocess
dependency for coding tasks. Per-project `.aider.conf.yml` holds read-only context files
and lint commands; model/key come from env vars (not committed).
- [x] **`aider_run` tool** — `cortex/tools/aider.py`; project aliases + subprocess with `--message --yes-always`; admin-only, confirm-required, high risk — 2026-05-23
- [ ] **`aider_run` async/notify** — current implementation blocks for up to 5 min with no UI feedback; convert to background task + web_push/NC Talk notification on completion (same pattern as `/orchestrate` jobs); optionally add `aider_status` tool for mid-task polling
- [x] **`.aider.conf.yml`** — project-level Aider config: `read: [CLAUDE.md]`, Python lint-cmd, auto-commits — 2026-05-23
- [x] **`.gitignore`** — added `.aider.chat.history.md`, `.aider.input.history`, `.aider.llm.history` — 2026-05-23
- [ ] Specialist agent: frontend (SvelteKit) code changes
- [ ] Specialist agent: backend (FastAPI) code changes
- [ ] Supervisor agent: diff review, syntax check, test runner
- [ ] Gitea webhook integration: trigger on push/PR, report back
- [ ] Human approval gate before commit
- [ ] `.aider.conf.yml` for aether_api, aether_frontend, aether_container projects
### [Intelligence] Supervisor agent
- Runs `py_compile`, `svelte-check`, unit tests after specialist agent work