# CLAUDE.md — Cortex / Inara Project

This file is loaded automatically by Claude Code when working in this directory.
Read it before touching any files.

---

## Identity & Context

- **Project:** Cortex (dispatcher) + Inara (resident agent)
- **Owner:** Scott Idem (One Sky IT / Danger Zone)
- **Machine context:** See `~/CLAUDE.md` for fleet identity (`scott_lpt` = General Manager)
- **Named after:** The 'verse-wide communications network (Firefly)

---

## Directory Map

```
Cortex_and_Inara_dev/
  cortex/                ← FastAPI service (the dispatcher)
    main.py              ← App entry point, router registration
    config.py            ← All settings (pydantic-settings, reads .env)
    llm_client.py        ← Claude CLI + Gemini CLI subprocess backends
    orchestrator_engine.py ← Gemini API ReAct tool loop → Claude handoff
    context_loader.py    ← Loads Inara's system prompt from inara/ files
    session_store.py     ← In-memory + file session persistence
    session_logger.py    ← Writes session turns to inara/sessions/
    memory_distiller.py  ← Short/mid/long distill jobs (APScheduler)
    scheduler.py         ← APScheduler setup
    event_bus.py         ← Internal SSE pub/sub (NC Talk → browser)
    routers/
      chat.py            ← POST /chat (streaming SSE)
      orchestrator.py    ← POST /orchestrate, GET /orchestrate/{job_id}
      auth.py            ← GET /auth/status (Claude + Gemini CLI token checks)
      distill.py         ← POST /distill/*, GET /distill/status
      files.py           ← GET /files (inara/ file browser)
      nextcloud_talk.py  ← POST /webhook/nextcloud (NC Talk bot)
      google_chat.py     ← POST /webhook/google (Google Chat — stub)
    tools/
      __init__.py        ← Tool registry (Gemini FunctionDeclarations + dispatcher)
      web.py             ← DuckDuckGo web_search tool
    static/              ← Single-page web UI (index.html, style.css, app.js)
    data/sessions/       ← Persisted session JSON files

  inara/                 ← Inara identity, memory, context files
    IDENTITY.md          ← Who Inara is
    SOUL.md              ← Values, personality, voice
    PROTOCOLS.md         ← Behavioral rules
    CONTEXT_TIERS.md     ← What each tier (1–3) includes in the system prompt
    USER.md              ← Scott's profile (loaded into context)
    HELP.md              ← In-app help content (rendered in UI)
    MEMORY.md            ← Persistent facts (written by distiller or manually)
    MEMORY_SHORT.md      ← Rolling short-term memory (auto-distilled daily)
    MEMORY_MID.md        ← Mid-term memory (auto-distilled weekly)
    MEMORY_LONG.md       ← Long-term memory (auto-distilled monthly)
    sessions/            ← Session turn logs (YYYY-MM-DD_<id>.md)

  docs/                  ← Integration reference docs
    NEXTCLOUD_TALK_BOT.md

  documentation/         ← Architecture decisions and agent task list
    TODO__Agents.md      ← READ THIS FIRST — active task list
    ARCH__Intelligence_Layer.md ← Orchestrator, dev agent, knowledge architecture

  docker-compose.yml     ← Docker deployment
  .env.default           ← Reference config (copy to .env, fill in secrets)
  README.md              ← Project orientation
```

---

## Run Commands

```bash
# Start (Docker)
docker compose up -d

# Restart service (after any Python change)
sudo systemctl restart cortex

# Syntax check a file before restarting
python3 -m py_compile cortex/<file>.py

# Syntax check all routers, tools, and the orchestrator engine
for f in cortex/routers/*.py cortex/tools/*.py cortex/orchestrator_engine.py; do
    python3 -m py_compile "$f" && echo "OK: $f"
done

# Install/update dependencies
cd cortex && .venv/bin/pip install -r requirements.txt

# Logs
journalctl -u cortex -f

# Web UI (local): http://localhost:8000

# Swagger docs: http://localhost:8000/docs
```

---

## Key Design Decisions

### Two-Brain Architecture (Orchestrator / Responder)
- **Gemini API** (`orchestrator_engine.py`) — runs the ReAct tool loop; handles tool calling, planning, research
- **Claude CLI** (`llm_client.py`) — produces all user-facing responses; receives enriched context from Gemini
- **Direct chat** bypasses the orchestrator entirely — `POST /chat` goes straight to Claude (faster)
- **Orchestrated tasks** go to `POST /orchestrate`, which returns a `job_id`; poll `GET /orchestrate/{job_id}` for the result
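
The submit-then-poll flow for orchestrated tasks can be sketched from the client side. This is a minimal sketch, not Cortex's actual client: the response fields `job_id`, `status`, and `result`, and the status values `"done"` and `"failed"`, are assumptions for illustration. The HTTP calls are passed in as callables (`submit` for `POST /orchestrate`, `fetch` for `GET /orchestrate/{job_id}`) so the sketch does not assume any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict


def run_orchestrated_task(
    submit: Callable[[str], Dict[str, Any]],
    fetch: Callable[[str], Dict[str, Any]],
    task: str,
    poll_interval: float = 1.0,
    timeout: float = 60.0,
) -> Any:
    """Submit a task, then poll the job until it finishes or times out.

    `submit` stands in for POST /orchestrate and `fetch` for
    GET /orchestrate/{job_id}. The field names below are hypothetical.
    """
    job_id = submit(task)["job_id"]
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        status = job.get("status")  # assumed field name
        if status == "done":
            return job.get("result")  # assumed field name
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout}s")
        time.sleep(poll_interval)
```

Direct chat needs none of this, since `POST /chat` streams its answer back on the same request.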

### LLM Backends
- `llm_client.py` manages Claude CLI (`claude --print`) and Gemini CLI (`gemini -p`) subprocesses
- `orchestrator_engine.py` uses the Gemini **API** (google-genai SDK) — completely separate from the Gemini CLI
- Claude OAuth token is read live from `~/.claude/.credentials.json` (never rely on a stale environment variable)
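
Reading the token "live" means re-opening the credentials file on every use instead of caching a value at startup. The sketch below shows one way to do that. The JSON key names `claudeAiOauth` and `accessToken` are assumptions about the file layout, and `read_claude_token` is a hypothetical helper, not a function known to exist in `llm_client.py`.

```python
import json
from pathlib import Path
from typing import Optional

CREDENTIALS_PATH = Path.home() / ".claude" / ".credentials.json"


def read_claude_token(path: Path = CREDENTIALS_PATH) -> Optional[str]:
    """Return the current OAuth access token, or None if unavailable.

    The file is re-read on every call, so a token refreshed by the
    Claude CLI is picked up without restarting the service.
    """
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    if not isinstance(data, dict):
        return None
    oauth = data.get("claudeAiOauth")  # assumed key name
    if not isinstance(oauth, dict):
        return None
    token = oauth.get("accessToken")  # assumed key name
    return token if isinstance(token, str) and token else None
```

Returning `None` rather than raising lets a caller such as an auth status check report "not logged in" without special-casing a missing file.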

### Tool Strategy
- Orchestrator tools live in `cortex/tools/` — separate from the `ae_*` MCP tools
- **Do not modify** the `ae_*` MCP server to support orchestrator needs; add new tools to `cortex/tools/` instead
- Tools are registered in `cortex/tools/__init__.py` as both Gemini FunctionDeclarations and Python callables

### Context / Memory
- `context_loader.py` assembles Inara's system prompt from `inara/` files based on tier (1–3)
- Tier 1 = minimal (identity only); Tier 2 = standard (+ memory + user profile); Tier 3 = full
- Memory files are written by the distiller or manually — do not delete them
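
Tier-based prompt assembly can be pictured as a mapping from tier number to a list of `inara/` files. The sketch below is illustrative only: the file lists in `TIER_FILES` are guesses based on the one-line tier summary above (the authoritative definition is `inara/CONTEXT_TIERS.md`), and `build_system_prompt` is a hypothetical function, not the real `context_loader.py` API.

```python
from pathlib import Path
from typing import Dict, List

# Assumed tier contents; consult CONTEXT_TIERS.md for the real lists.
TIER_FILES: Dict[int, List[str]] = {
    1: ["IDENTITY.md"],
    2: ["IDENTITY.md", "MEMORY.md", "USER.md"],
    3: [
        "IDENTITY.md", "SOUL.md", "PROTOCOLS.md", "USER.md",
        "MEMORY.md", "MEMORY_SHORT.md", "MEMORY_MID.md", "MEMORY_LONG.md",
    ],
}


def build_system_prompt(inara_dir: Path, tier: int) -> str:
    """Concatenate the files for a tier into one system prompt string."""
    if tier not in TIER_FILES:
        raise ValueError(f"unknown tier: {tier}")
    sections = []
    for name in TIER_FILES[tier]:
        path = inara_dir / name
        if path.is_file():  # skip files that have not been created yet
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n\n{body}")
    return "\n\n".join(sections)
```

Missing files are skipped rather than treated as errors, which fits memory files that only appear once the distiller has run.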

### Security / Safety
- **Never `rm`** — move files to `~/tmp/gemini_trash`
- **Never commit secrets** — `.env` is gitignored; use `.env.default` as the reference
- `NEXTCLOUD_TALK_BOT_SECRET` and `GEMINI_API_KEY` live in `.env` only
- Cortex should only be accessible via WireGuard — never internet-exposed without VPN
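
For scripts that need to discard files, the "never `rm`" rule can be followed with a small move-to-trash helper. This is a sketch, assuming a timestamp prefix is an acceptable way to avoid name collisions; `move_to_trash` is a hypothetical helper and not part of the Cortex codebase.

```python
import shutil
import time
from pathlib import Path

TRASH_DIR = Path.home() / "tmp" / "gemini_trash"


def move_to_trash(path, trash_dir: Path = TRASH_DIR) -> Path:
    """Move a file or directory into the trash folder instead of deleting it."""
    src = Path(path)
    if not src.exists():
        raise FileNotFoundError(src)
    trash_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = trash_dir / f"{stamp}_{src.name}"
    suffix = 1
    # Never overwrite something that is already in the trash.
    while dest.exists():
        dest = trash_dir / f"{stamp}_{suffix}_{src.name}"
        suffix += 1
    shutil.move(str(src), str(dest))
    return dest
```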

---

## Adding a New Tool

1. Implement the tool function in `cortex/tools/<domain>.py`
   - Must be `async def`; use `asyncio.to_thread` for blocking calls
   - Return a plain string result
2. Add a `FunctionDeclaration` and register it in `cortex/tools/__init__.py`
3. Syntax check: `python3 -m py_compile cortex/tools/<domain>.py`
4. Restart Cortex
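
Step 1 can be sketched as follows: an `async def` tool that pushes blocking work onto a thread with `asyncio.to_thread` and returns a plain string. The `word_count` tool and its helper are hypothetical examples chosen because they run offline; registration in `cortex/tools/__init__.py` (step 2) is not shown.

```python
import asyncio
from pathlib import Path


def _count_words_blocking(path: str) -> int:
    """Blocking file read; must not run directly on the event loop."""
    return len(Path(path).read_text(encoding="utf-8").split())


async def word_count(path: str) -> str:
    """Hypothetical tool: report how many words a text file contains."""
    try:
        # to_thread keeps the event loop free while the file is read.
        count = await asyncio.to_thread(_count_words_blocking, path)
    except (OSError, UnicodeDecodeError) as exc:
        # Tools return plain strings, so errors are reported as text too.
        return f"Error reading {path}: {exc}"
    return f"{path} contains {count} words"
```

Returning the error as a string, rather than raising, is a design choice in this sketch: it lets the orchestrator's tool loop read the failure and decide what to do next.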

## Adding a New Router

1. Create `cortex/routers/<name>.py` with `router = APIRouter()`
2. Import and register in `cortex/main.py`
3. Syntax check (`python3 -m py_compile cortex/routers/<name>.py`), then restart Cortex

---

## Active Tasks

See `documentation/TODO__Agents.md` for the current task list.
High-priority items as of 2026-03-18:
- Ollama backend (third LLM option — local, no API cost)
- NC Talk integration stabilization
- Knowledge consolidation (markdown → AE Journals)

---

## Related Docs

| File | Purpose |
|---|---|
| `documentation/TODO__Agents.md` | Active task list — read before starting work |
| `documentation/ARCH__Intelligence_Layer.md` | Full architecture design |
| `~/agents_sync/projects/CORTEX.md` | High-level project vision and phases |
| `~/agents_sync/CLAUDE.md` | Fleet coordination rules |
| `~/CLAUDE.md` | Machine identity (`scott_lpt`) |