
Scott Idem ed472ce9a0 feat: Intelligence Layer Phase 1 — orchestrator service

Adds the Gemini API orchestrator (ReAct tool loop → Claude responder):

Orchestrator engine + router:
- orchestrator_engine.py: Gemini API tool loop, Claude CLI handoff
- routers/orchestrator.py: POST /orchestrate (async job queue), GET /orchestrate/{job_id}

Tools (cortex/tools/):
- web.py: DuckDuckGo web search (no key required)
- ae_knowledge.py: ae_journal_search + ae_journal_entry_create (AE V3 API)
- ae_tasks.py: ae_task_list (reads agents_sync Kanban filesystem)
- files.py: file_read (path-allowlisted to safe dirs)

Config + deps:
- config.py: orchestrator, DuckDuckGo, and AE API settings
- requirements.txt: google-genai, duckduckgo-search
- .env.default: reference config with all new keys documented

Docs:
- CLAUDE.md, README.md, documentation/ added to repo
- Port references updated 7331 → 8000 throughout
- Default model updated to gemini-2.5-flash

Tested: ae_task_list, ae_journal_search, web_search all working end-to-end.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-18 19:37:49 -04:00


CLAUDE.md — Cortex / Inara Project

This file is loaded automatically by Claude Code when working in this directory. Read it before touching any files.

Identity & Context

Project: Cortex (dispatcher) + Inara (resident agent)
Owner: Scott Idem (One Sky IT / Danger Zone)
Machine context: See ~/CLAUDE.md for fleet identity (scott_lpt = General Manager)
Named after: The 'verse-wide communications network (Firefly)

Directory Map

Cortex_and_Inara_dev/
  cortex/                ← FastAPI service (the dispatcher)
    main.py              ← App entry point, router registration
    config.py            ← All settings (pydantic-settings, reads .env)
    llm_client.py        ← Claude CLI + Gemini CLI subprocess backends
    orchestrator_engine.py ← Gemini API ReAct tool loop → Claude handoff
    context_loader.py    ← Loads Inara's system prompt from inara/ files
    session_store.py     ← In-memory + file session persistence
    session_logger.py    ← Writes session turns to inara/sessions/
    memory_distiller.py  ← Short/mid/long distill jobs (APScheduler)
    scheduler.py         ← APScheduler setup
    event_bus.py         ← Internal SSE pub/sub (NC Talk → browser)
    routers/
      chat.py            ← POST /chat (streaming SSE)
      orchestrator.py    ← POST /orchestrate, GET /orchestrate/{job_id}
      auth.py            ← GET /auth/status (Claude + Gemini CLI token checks)
      distill.py         ← POST /distill/*, GET /distill/status
      files.py           ← GET /files (inara/ file browser)
      nextcloud_talk.py  ← POST /webhook/nextcloud (NC Talk bot)
      google_chat.py     ← POST /webhook/google (Google Chat — stub)
    tools/
      __init__.py        ← Tool registry (Gemini FunctionDeclarations + dispatcher)
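      web.py             ← DuckDuckGo web_search tool
      ae_knowledge.py    ← ae_journal_search + ae_journal_entry_create (AE V3 API)
      ae_tasks.py        ← ae_task_list (reads agents_sync Kanban filesystem)
      files.py           ← file_read (path-allowlisted to safe dirs)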
    static/              ← Single-page web UI (index.html, style.css, app.js)
    data/sessions/       ← Persisted session JSON files

  inara/                 ← Inara identity, memory, context files
    IDENTITY.md          ← Who Inara is
    SOUL.md              ← Values, personality, voice
    PROTOCOLS.md         ← Behavioral rules
    CONTEXT_TIERS.md     ← What each tier (1–3) includes in the system prompt
    USER.md              ← Scott's profile (loaded into context)
    HELP.md              ← In-app help content (rendered in UI)
    MEMORY.md            ← Persistent facts (written by distiller or manually)
    MEMORY_SHORT.md      ← Rolling short-term memory (auto-distilled daily)
    MEMORY_MID.md        ← Mid-term memory (auto-distilled weekly)
    MEMORY_LONG.md       ← Long-term memory (auto-distilled monthly)
    sessions/            ← Session turn logs (YYYY-MM-DD_<id>.md)

  docs/                  ← Integration reference docs
    NEXTCLOUD_TALK_BOT.md

  documentation/         ← Architecture decisions and agent task list
    TODO__Agents.md      ← READ THIS FIRST — active task list
    ARCH__Intelligence_Layer.md ← Orchestrator, dev agent, knowledge architecture

  docker-compose.yml     ← Docker deployment
  .env.default           ← Reference config (copy to .env, fill in secrets)
  README.md              ← Project orientation

Run Commands

# Start (Docker)
docker compose up -d

# Restart service (after any Python change)
sudo systemctl restart cortex

# Syntax check a file before restarting
python3 -m py_compile cortex/<file>.py

# Syntax check all routers, tools, and the orchestrator engine
for f in cortex/routers/*.py cortex/tools/*.py cortex/orchestrator_engine.py; do
    python3 -m py_compile "$f" && echo "OK: $f"
done

# Install/update dependencies
cd cortex && .venv/bin/pip install -r requirements.txt

# Logs
journalctl -u cortex -f

# Web UI (local)
http://localhost:8000

# Swagger docs
http://localhost:8000/docs

Key Design Decisions

Two-Brain Architecture (Orchestrator / Responder)

Gemini API (orchestrator_engine.py) — runs the ReAct tool loop; handles tool calling, planning, research
Claude CLI (llm_client.py) — produces all user-facing responses; receives enriched context from Gemini
Direct chat bypasses the orchestrator entirely — POST /chat goes straight to Claude (faster)
Orchestrated tasks go to POST /orchestrate — returns a job_id, result is polled
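
The submit-then-poll flow for orchestrated tasks could be driven by a client along these lines. This is a minimal sketch, not the project's actual client: the `prompt` request field, the `job_id` and `status` response fields, and the `"done"` / `"error"` status values are all assumptions, since this file does not document the payload shapes. The HTTP call is passed in as a parameter so the polling logic can be exercised without a running server.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "http://localhost:8000"  # port taken from the Run Commands section


def http_json(method: str, url: str, body: Optional[dict] = None) -> dict:
    """Send a JSON request with the standard library and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode())


def orchestrate(
    prompt: str,
    call: Callable[..., dict] = http_json,
    interval: float = 1.0,
    max_polls: int = 60,
) -> dict:
    """Submit a task, then poll its job until it finishes or polling gives up."""
    # ASSUMPTION: the request body uses a "prompt" field and the reply has "job_id".
    job = call("POST", f"{BASE_URL}/orchestrate", {"prompt": prompt})
    job_id = job["job_id"]
    for _ in range(max_polls):
        status = call("GET", f"{BASE_URL}/orchestrate/{job_id}")
        # ASSUMPTION: terminal states are reported as "done" or "error".
        if status.get("status") in ("done", "error"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Bounding the loop with `max_polls` keeps a stuck job from hanging the caller; direct chat via POST /chat needs none of this because it streams its reply.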

LLM Backends

llm_client.py manages Claude CLI (claude --print) and Gemini CLI (gemini -p) subprocesses
orchestrator_engine.py uses the Gemini API (google-genai SDK) — completely separate from the Gemini CLI
Claude OAuth token is read live from ~/.claude/.credentials.json (never rely on a stale env var)
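
Reading the token "live" can be as simple as re-opening the credentials file on every call instead of caching it at startup. The sketch below illustrates that idea only; the key path `claudeAiOauth` / `accessToken` is an assumption about the file's layout, and `read_claude_token` is a hypothetical helper, not a function from llm_client.py.

```python
import json
from pathlib import Path

CREDENTIALS_PATH = Path.home() / ".claude" / ".credentials.json"


def read_claude_token(path: Path = CREDENTIALS_PATH) -> str:
    """Re-read the credentials file on every call so a refreshed token is picked up."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # ASSUMPTION: the token sits at data["claudeAiOauth"]["accessToken"]; the
    # actual key names are not documented in this file.
    try:
        return data["claudeAiOauth"]["accessToken"]
    except (KeyError, TypeError) as exc:
        raise RuntimeError(f"no OAuth token found in {path}") from exc
```

Because nothing is cached, a token refreshed by the Claude CLI between two requests is seen by the second request without restarting Cortex.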

Tool Strategy

Orchestrator tools live in cortex/tools/ — separate from the ae_* MCP tools
Do not modify the ae_* MCP server to support orchestrator needs; add new tools to cortex/tools/ instead
Tools are registered in cortex/tools/__init__.py as both Gemini FunctionDeclarations and Python callables
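
The dual registration described above (one declaration for the model, one callable for the dispatcher) might look roughly like this. It is a sketch under stated assumptions: plain dicts stand in for Gemini FunctionDeclarations so the example runs without the google-genai SDK, and `register`, `dispatch`, and the `echo` tool are hypothetical names rather than the contents of cortex/tools/__init__.py.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ToolFn = Callable[..., Awaitable[str]]

# Plain-dict stand-ins for Gemini FunctionDeclarations (name, description,
# JSON-schema parameters). The real registry would build SDK objects instead.
DECLARATIONS: List[Dict[str, Any]] = []
CALLABLES: Dict[str, ToolFn] = {}


def register(name: str, description: str, parameters: Dict[str, Any]):
    """Decorator that records a tool's declaration and its callable together."""
    def wrap(fn: ToolFn) -> ToolFn:
        DECLARATIONS.append(
            {"name": name, "description": description, "parameters": parameters}
        )
        CALLABLES[name] = fn
        return fn
    return wrap


async def dispatch(name: str, args: Dict[str, Any]) -> str:
    """Run the tool the model asked for; unknown names come back as an error string."""
    fn = CALLABLES.get(name)
    if fn is None:
        return f"error: unknown tool {name!r}"
    return await fn(**args)


@register(
    name="echo",  # hypothetical demo tool, not part of Cortex
    description="Return the text unchanged.",
    parameters={
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
)
async def echo(text: str) -> str:
    return text
```

Registering both halves in one place keeps the schema the model sees and the function that actually runs from drifting apart.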

Context / Memory

context_loader.py assembles Inara's system prompt from inara/ files based on tier (1–3)
Tier 1 = minimal (identity only); Tier 2 = standard (+ memory + user profile); Tier 3 = full
Memory files are written by the distiller or manually — do not delete them
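
Tier-based assembly can be pictured as a lookup from tier number to a list of inara/ files that are then concatenated. The following is an illustrative sketch only: the exact file-to-tier mapping is an assumption (CONTEXT_TIERS.md is the authority), and `build_system_prompt` is a hypothetical name, not the function in context_loader.py.

```python
from pathlib import Path
from typing import Dict, List

# ASSUMPTION: this file-to-tier mapping is illustrative; CONTEXT_TIERS.md is the
# authority on what each tier really includes.
TIER_FILES: Dict[int, List[str]] = {
    1: ["IDENTITY.md"],
    2: ["IDENTITY.md", "MEMORY.md", "USER.md"],
    3: ["IDENTITY.md", "SOUL.md", "PROTOCOLS.md", "MEMORY.md", "USER.md"],
}


def build_system_prompt(inara_dir: Path, tier: int) -> str:
    """Concatenate the files for a tier, skipping any that are missing."""
    if tier not in TIER_FILES:
        raise ValueError(f"tier must be 1, 2, or 3, got {tier}")
    sections = []
    for name in TIER_FILES[tier]:
        path = Path(inara_dir) / name
        if path.is_file():
            sections.append(f"## {name}\n{path.read_text(encoding='utf-8').strip()}")
    return "\n\n".join(sections)
```

Skipping missing files rather than raising means a not-yet-distilled memory file does not block prompt assembly.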

Security / Safety

Never rm — move files to ~/tmp/gemini_trash
Never commit secrets — .env is gitignored; use .env.default as the reference
NEXTCLOUD_TALK_BOT_SECRET and GEMINI_API_KEY live in .env only
Cortex should only be accessible via WireGuard — never internet-exposed without VPN
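
For scripts or tools that need to discard files, the "never rm" rule can be followed with a small move-to-trash helper. This is a sketch, with `move_to_trash` as a hypothetical name; only the ~/tmp/gemini_trash location comes from the rule above, and the timestamp suffix for name collisions is an added assumption.

```python
import shutil
import time
from pathlib import Path

TRASH_DIR = Path.home() / "tmp" / "gemini_trash"


def move_to_trash(target: Path, trash_dir: Path = TRASH_DIR) -> Path:
    """Move a file or directory into the trash folder instead of deleting it."""
    target = Path(target)
    trash_dir = Path(trash_dir)
    trash_dir.mkdir(parents=True, exist_ok=True)
    dest = trash_dir / target.name
    if dest.exists():
        # Avoid clobbering an earlier trashed item with the same name.
        dest = trash_dir / f"{target.name}.{time.time_ns()}"
    shutil.move(str(target), str(dest))
    return dest
```

The function returns the new location, so a caller can log where the item went and restore it by hand if needed.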

Adding a New Tool

1. Implement the tool function in cortex/tools/<domain>.py
   - Must be async def; use asyncio.to_thread for blocking calls
   - Return a plain string result
2. Add a FunctionDeclaration and register it in cortex/tools/__init__.py
3. Syntax check: python3 -m py_compile cortex/tools/<domain>.py
4. Restart Cortex
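
A tool function that meets the two requirements above (async def, plain string result) could be shaped like this. The `file_checksum` tool is hypothetical and exists only to show the pattern: the blocking work lives in a normal function and is handed to `asyncio.to_thread` (Python 3.9+).

```python
import asyncio
import hashlib
from pathlib import Path


def _sha256_blocking(path: str) -> str:
    """Blocking helper: read the whole file and hash it."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


async def file_checksum(path: str) -> str:
    """Hypothetical tool: report a file's SHA-256 as a plain string."""
    try:
        # Run the blocking read in a worker thread so the event loop stays free.
        digest = await asyncio.to_thread(_sha256_blocking, path)
    except OSError as exc:
        # Return errors as text so the model can read them.
        return f"error: {exc}"
    return f"sha256({path}) = {digest}"
```

Returning failures as strings instead of raising is a design choice in this sketch; it lets the orchestrator pass the error back to the model as an ordinary tool result.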

Adding a New Router

1. Create cortex/routers/<name>.py with router = APIRouter()
2. Import and register in cortex/main.py
3. Syntax check, restart

Active Tasks

See documentation/TODO__Agents.md for the current task list. High-priority items as of 2026-03-18:

Ollama backend (third LLM option — local, no API cost)
NC Talk integration stabilization
Knowledge consolidation (markdown → AE Journals)

| File | Purpose |
| --- | --- |
| `documentation/TODO__Agents.md` | Active task list; read before starting work |
| `documentation/ARCH__Intelligence_Layer.md` | Full architecture design |
| `~/agents_sync/projects/CORTEX.md` | High-level project vision and phases |
| `~/agents_sync/CLAUDE.md` | Fleet coordination rules |
| `~/CLAUDE.md` | Machine identity (`scott_lpt`) |
