Adds the Gemini API orchestrator (ReAct tool loop → Claude responder):
Orchestrator engine + router:
- orchestrator_engine.py: Gemini API tool loop, Claude CLI handoff
- routers/orchestrator.py: POST /orchestrate (async job queue), GET /orchestrate/{job_id}
Tools (cortex/tools/):
- web.py: DuckDuckGo web search (no key required)
- ae_knowledge.py: ae_journal_search + ae_journal_entry_create (AE V3 API)
- ae_tasks.py: ae_task_list (reads agents_sync Kanban filesystem)
- files.py: file_read (path-allowlisted to safe dirs)
Config + deps:
- config.py: orchestrator, DuckDuckGo, and AE API settings
- requirements.txt: google-genai, duckduckgo-search
- .env.default: reference config with all new keys documented
Docs:
- CLAUDE.md, README.md, documentation/ added to repo
- Port references updated 7331 → 8000 throughout
- Default model updated to gemini-2.5-flash
Tested: ae_task_list, ae_journal_search, web_search all working end-to-end.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CLAUDE.md — Cortex / Inara Project
This file is loaded automatically by Claude Code when working in this directory. Read it before touching any files.
Identity & Context
- Project: Cortex (dispatcher) + Inara (resident agent)
- Owner: Scott Idem (One Sky IT / Danger Zone)
- Machine context: See ~/CLAUDE.md for fleet identity (scott_lpt = General Manager)
- Named after: The 'verse-wide communications network (Firefly)
Directory Map
Cortex_and_Inara_dev/
cortex/ ← FastAPI service (the dispatcher)
main.py ← App entry point, router registration
config.py ← All settings (pydantic-settings, reads .env)
llm_client.py ← Claude CLI + Gemini CLI subprocess backends
orchestrator_engine.py ← Gemini API ReAct tool loop → Claude handoff
context_loader.py ← Loads Inara's system prompt from inara/ files
session_store.py ← In-memory + file session persistence
session_logger.py ← Writes session turns to inara/sessions/
memory_distiller.py ← Short/mid/long distill jobs (APScheduler)
scheduler.py ← APScheduler setup
event_bus.py ← Internal SSE pub/sub (NC Talk → browser)
routers/
chat.py ← POST /chat (streaming SSE)
orchestrator.py ← POST /orchestrate, GET /orchestrate/{job_id}
auth.py ← GET /auth/status (Claude + Gemini CLI token checks)
distill.py ← POST /distill/*, GET /distill/status
files.py ← GET /files (inara/ file browser)
nextcloud_talk.py ← POST /webhook/nextcloud (NC Talk bot)
google_chat.py ← POST /webhook/google (Google Chat — stub)
tools/
__init__.py ← Tool registry (Gemini FunctionDeclarations + dispatcher)
web.py ← DuckDuckGo web_search tool
static/ ← Single-page web UI (index.html, style.css, app.js)
data/sessions/ ← Persisted session JSON files
inara/ ← Inara identity, memory, context files
IDENTITY.md ← Who Inara is
SOUL.md ← Values, personality, voice
PROTOCOLS.md ← Behavioral rules
CONTEXT_TIERS.md ← What each tier (1–3) includes in the system prompt
USER.md ← Scott's profile (loaded into context)
HELP.md ← In-app help content (rendered in UI)
MEMORY.md ← Persistent facts (written by distiller or manually)
MEMORY_SHORT.md ← Rolling short-term memory (auto-distilled daily)
MEMORY_MID.md ← Mid-term memory (auto-distilled weekly)
MEMORY_LONG.md ← Long-term memory (auto-distilled monthly)
sessions/ ← Session turn logs (YYYY-MM-DD_<id>.md)
docs/ ← Integration reference docs
NEXTCLOUD_TALK_BOT.md
documentation/ ← Architecture decisions and agent task list
TODO__Agents.md ← READ THIS FIRST — active task list
ARCH__Intelligence_Layer.md ← Orchestrator, dev agent, knowledge architecture
docker-compose.yml ← Docker deployment
.env.default ← Reference config (copy to .env, fill in secrets)
README.md ← Project orientation
Run Commands
# Start (Docker)
docker compose up -d
# Restart service (after any Python change)
sudo systemctl restart cortex
# Syntax check a file before restarting
python3 -m py_compile cortex/<file>.py
# Syntax check all routers, tools, and the orchestrator engine
for f in cortex/routers/*.py cortex/tools/*.py cortex/orchestrator_engine.py; do
python3 -m py_compile "$f" && echo "OK: $f"
done
# Install/update dependencies
cd cortex && .venv/bin/pip install -r requirements.txt
# Logs
journalctl -u cortex -f
# Web UI (local)
http://localhost:8000
# Swagger docs
http://localhost:8000/docs
Key Design Decisions
Two-Brain Architecture (Orchestrator / Responder)
- Gemini API (orchestrator_engine.py): runs the ReAct tool loop; handles tool calling, planning, research
- Claude CLI (llm_client.py): produces all user-facing responses; receives enriched context from Gemini
- Direct chat bypasses the orchestrator entirely: POST /chat goes straight to Claude (faster)
- Orchestrated tasks go to POST /orchestrate, which returns a job_id; the result is polled via GET /orchestrate/{job_id}
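The submit-then-poll flow above can be sketched with the standard library alone. This is a minimal illustration, not the code in routers/orchestrator.py: the names JOBS, submit_job, get_job, and poll_until_finished are hypothetical, and the real router may store jobs differently.

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable, Dict, Optional

# Hypothetical in-memory job table: job_id -> status record.
JOBS: Dict[str, Dict[str, Any]] = {}
# Strong references so background tasks are not garbage collected mid-run.
_TASKS: set = set()


async def _run_job(job_id: str, work: Callable[[], Awaitable[str]]) -> None:
    """Run the work coroutine and record the outcome on the job record."""
    JOBS[job_id]["status"] = "running"
    try:
        JOBS[job_id]["result"] = await work()
        JOBS[job_id]["status"] = "done"
    except Exception as exc:  # surface failures to the poller instead of raising
        JOBS[job_id]["error"] = str(exc)
        JOBS[job_id]["status"] = "error"


def submit_job(work: Callable[[], Awaitable[str]]) -> str:
    """Roughly what a POST /orchestrate handler would do: enqueue, return at once."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "queued", "result": None, "error": None}
    task = asyncio.create_task(_run_job(job_id, work))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return job_id


def get_job(job_id: str) -> Optional[Dict[str, Any]]:
    """Roughly what a GET /orchestrate/{job_id} handler would return."""
    return JOBS.get(job_id)


async def poll_until_finished(
    job_id: str, interval: float = 0.01, timeout: float = 5.0
) -> Dict[str, Any]:
    """Client-side loop: poll the job record until it is done or failed."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        job = get_job(job_id)
        if job is None:
            raise KeyError(job_id)
        if job["status"] in ("done", "error"):
            return job
        await asyncio.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

Returning the job_id immediately keeps the HTTP request short even when the tool loop runs for a long time; the trade-off is that an in-memory table like this one is lost on restart.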
LLM Backends
- llm_client.py manages Claude CLI (claude --print) and Gemini CLI (gemini -p) subprocesses
- orchestrator_engine.py uses the Gemini API (google-genai SDK), which is completely separate from the Gemini CLI
- Claude OAuth token is read live from ~/.claude/.credentials.json (never rely on a stale env var)
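A subprocess backend of the kind described above might look like the sketch below. It is an assumption-laden illustration, not the contents of llm_client.py: run_cli is a hypothetical helper, and sending the prompt on stdin is an assumption (a given CLI may expect the prompt as an argument instead, so check its help output).

```python
import asyncio
from typing import Sequence


async def run_cli(argv: Sequence[str], prompt: str, timeout: float = 120.0) -> str:
    """Run a CLI, send the prompt on stdin, and return its stdout as text.

    Hypothetical helper: the real backend may pass the prompt differently.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(
            proc.communicate(prompt.encode("utf-8")), timeout
        )
    except asyncio.TimeoutError:
        # Do not leave a hung CLI process behind.
        proc.kill()
        await proc.wait()
        raise
    if proc.returncode != 0:
        detail = err.decode("utf-8", errors="replace").strip()
        raise RuntimeError(f"{argv[0]} exited with {proc.returncode}: {detail}")
    return out.decode("utf-8", errors="replace").strip()
```

Under the stdin assumption, a call would look like `await run_cli(["claude", "--print"], "Summarize today's tasks")`. Using the async subprocess API keeps the FastAPI event loop free while the CLI runs.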
Tool Strategy
- Orchestrator tools live in cortex/tools/, separate from the ae_* MCP tools
- Do not modify the ae_* MCP server to support orchestrator needs; add new tools to cortex/tools/ instead
- Tools are registered in cortex/tools/__init__.py as both Gemini FunctionDeclarations and Python callables
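The dual registration described above (a declaration for the model plus a callable for the dispatcher) can be sketched as follows. To stay dependency-free, plain dicts stand in for the google-genai FunctionDeclaration objects; register_tool, declarations, dispatch, and the echo_upper tool are hypothetical names, not the actual contents of cortex/tools/__init__.py.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ToolFn = Callable[..., Awaitable[str]]

# Declarations are what the model sees; callables are what the dispatcher runs.
_DECLARATIONS: List[Dict[str, Any]] = []
_CALLABLES: Dict[str, ToolFn] = {}


def register_tool(
    name: str, description: str, parameters: Dict[str, Any], fn: ToolFn
) -> None:
    """Register one tool under both views, keyed by the same name."""
    if name in _CALLABLES:
        raise ValueError(f"tool already registered: {name}")
    _DECLARATIONS.append(
        {"name": name, "description": description, "parameters": parameters}
    )
    _CALLABLES[name] = fn


def declarations() -> List[Dict[str, Any]]:
    """Return a copy of the declaration list to hand to the model."""
    return list(_DECLARATIONS)


async def dispatch(name: str, args: Dict[str, Any]) -> str:
    """Run the tool the model asked for; errors come back as plain strings."""
    fn = _CALLABLES.get(name)
    if fn is None:
        return f"Error: unknown tool '{name}'"
    try:
        return await fn(**args)
    except Exception as exc:
        return f"Error: {name} failed: {exc}"


async def echo_upper(text: str) -> str:
    """Hypothetical example tool."""
    return text.upper()


register_tool(
    "echo_upper",
    "Return the input text in upper case.",
    {"type": "object", "properties": {"text": {"type": "string"}}, "required": ["text"]},
    echo_upper,
)
```

Returning error strings rather than raising lets the tool loop feed the failure back to the model as an observation and continue.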
Context / Memory
- context_loader.py assembles Inara's system prompt from inara/ files based on tier (1–3)
- Tier 1 = minimal (identity only); Tier 2 = standard (+ memory + user profile); Tier 3 = full
- Memory files are written by the distiller or manually — do not delete them
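Tiered prompt assembly can be sketched as a lookup from tier to file list. The mapping below is an assumption made for illustration: the authoritative list lives in inara/CONTEXT_TIERS.md, and TIER_FILES and build_system_prompt are hypothetical names rather than the API of context_loader.py.

```python
from pathlib import Path
from typing import Dict, List

# Assumed mapping; the real tier contents are defined in inara/CONTEXT_TIERS.md.
TIER_FILES: Dict[int, List[str]] = {
    1: ["IDENTITY.md"],
    2: ["IDENTITY.md", "MEMORY.md", "USER.md"],
    3: [
        "IDENTITY.md", "SOUL.md", "PROTOCOLS.md", "USER.md",
        "MEMORY.md", "MEMORY_SHORT.md", "MEMORY_MID.md", "MEMORY_LONG.md",
    ],
}


def build_system_prompt(inara_dir: Path, tier: int) -> str:
    """Concatenate the files for a tier, skipping missing or empty ones."""
    if tier not in TIER_FILES:
        raise ValueError(f"unknown tier: {tier}")
    sections = []
    for name in TIER_FILES[tier]:
        path = inara_dir / name
        if not path.is_file():
            continue  # a memory file may not have been distilled yet
        text = path.read_text(encoding="utf-8").strip()
        if text:
            sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)
```

Skipping absent files means a fresh checkout with no distilled memory still yields a usable prompt.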
Security / Safety
- Never rm; move files to ~/tmp/gemini_trash instead
- Never commit secrets: .env is gitignored; use .env.default as the reference
- NEXTCLOUD_TALK_BOT_SECRET and GEMINI_API_KEY live in .env only
- Cortex should only be accessible via WireGuard; never expose it to the internet without the VPN
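For scripts that need to discard files, the "never rm" rule can be followed with a small helper like the one below. move_to_trash is a hypothetical name and the collision handling is one possible choice; only the ~/tmp/gemini_trash location comes from the rule above.

```python
import shutil
import time
from pathlib import Path
from typing import Union

DEFAULT_TRASH = Path.home() / "tmp" / "gemini_trash"


def move_to_trash(
    path: Union[str, Path], trash_dir: Union[str, Path] = DEFAULT_TRASH
) -> Path:
    """Move a file or directory into the trash directory instead of deleting it."""
    src = Path(path)
    if not src.exists():
        raise FileNotFoundError(src)
    trash = Path(trash_dir)
    trash.mkdir(parents=True, exist_ok=True)
    dest = trash / src.name
    if dest.exists():
        # Keep earlier trashed copies: add a nanosecond timestamp to the name.
        dest = trash / f"{src.stem}.{time.time_ns()}{src.suffix}"
    shutil.move(str(src), str(dest))
    return dest
```

The function returns the new location so the caller can log where the file went or restore it later.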
Adding a New Tool
- Implement the tool function in cortex/tools/<domain>.py
  - Must be async def; use asyncio.to_thread for blocking calls
  - Return a plain string result
- Add a FunctionDeclaration and register it in cortex/tools/__init__.py
- Syntax check: python3 -m py_compile cortex/tools/<domain>.py
- Restart Cortex
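A tool function that follows the rules in the first step (async def, blocking work pushed to a thread, plain string result) could look like this. The file_line_count tool is hypothetical and exists only to show the shape; asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio


def _count_lines_blocking(path: str) -> int:
    """Ordinary blocking file I/O; must not run directly on the event loop."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return sum(1 for _ in fh)


async def file_line_count(path: str) -> str:
    """Hypothetical tool: report how many lines a text file has."""
    try:
        # Run the blocking helper in a worker thread so other requests proceed.
        count = await asyncio.to_thread(_count_lines_blocking, path)
    except OSError as exc:
        # Tools return plain strings, so errors are reported as text too.
        return f"Error: could not read {path}: {exc}"
    return f"{path}: {count} lines"
```

Keeping the blocking part in its own plain function also makes it easy to unit test without an event loop.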
Adding a New Router
- Create cortex/routers/<name>.py with router = APIRouter()
- Import and register in cortex/main.py
- Syntax check, restart
Active Tasks
See documentation/TODO__Agents.md for the current task list.
High-priority items as of 2026-03-18:
- Ollama backend (third LLM option — local, no API cost)
- NC Talk integration stabilization
- Knowledge consolidation (markdown → AE Journals)
Related Docs
| File | Purpose |
|---|---|
| documentation/TODO__Agents.md | Active task list; read before starting work |
| documentation/ARCH__Intelligence_Layer.md | Full architecture design |
| ~/agents_sync/projects/CORTEX.md | High-level project vision and phases |
| ~/agents_sync/CLAUDE.md | Fleet coordination rules |
| ~/CLAUDE.md | Machine identity (scott_lpt) |