Cortex-Inara/documentation/TODO__Agents.md

# Cortex / Inara — Agent Task List

> Read this file before starting any work on this project.
> **Status:** Active development — ongoing.

---

## 🔴 High Priority

### [Auth] Token expiry — sudo restart
- Cortex currently requires `sudo systemctl restart cortex` after OAuth token refresh
- The user must run this manually; it cannot be run interactively from within Claude Code
- **Future:** Explore hot-reload or token-passing mechanism so restart isn't required

### [Backend] Ollama local model backend
- Add Ollama as a third LLM backend option (direct Ollama API, no CLI wrapper)
- Endpoint: `http://scott-gaming:<port>/api/` (WireGuard)
- Model selection: configurable per-request or per-session
- Auth status check: ping `/api/tags` to confirm reachability
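
A minimal sketch of the reachability check described above, assuming `/api/tags` returns JSON with a `models` list whose entries carry a `name` field. The function name `ollama_models` and the injectable `fetch` parameter are hypothetical, and the port is left to configuration because it is not fixed in this file.

```python
import json
from typing import Callable, List, Optional
import urllib.request


def _http_get(url: str, timeout: float) -> bytes:
    """Default fetcher: plain HTTP GET using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def ollama_models(
    base_url: str,
    timeout: float = 3.0,
    fetch: Callable[[str, float], bytes] = _http_get,
) -> Optional[List[str]]:
    """Return the model names Ollama reports, or None if it is unreachable.

    base_url is the API root (the ".../api" endpoint from the task above).
    """
    url = base_url.rstrip("/") + "/tags"
    try:
        payload = json.loads(fetch(url, timeout))
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts (URLError is a
        # subclass); ValueError covers a body that is not valid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return [m["name"] for m in payload.get("models", []) if "name" in m]
```

Returning the model list rather than a bare boolean lets the same call feed both the auth status check and a model picker.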

### [Testing] Gitea SSH port 2222
- pfSense port forward configured but not yet verified end-to-end
- Test: `ssh -p 2222 git@<external>` from outside WireGuard
- Document result in this file

---

## 🟡 Medium Priority

### [Intelligence] Orchestrator service — Phase 1 ✅ Complete
See `ARCH__Intelligence_Layer.md` for full design. Committed: `ed472ce` (2026-03-18)
- [x] Add Gemini API (google-generativeai SDK) as a library dependency (not CLI)
- [x] Create `cortex/routers/orchestrator.py` — `POST /orchestrate` endpoint
- [x] Basic tool registry: web search (DuckDuckGo), AE API query, file read, task list
- [x] ReAct loop: Gemini calls tools, assembles context, hands off to Claude for final response
- [x] `GET /orchestrate/{job_id}` — poll for status/result
- [x] Cron can trigger via HTTP POST (same endpoint)
- **Note:** Default model is `gemini-2.5-flash` — free tier key required (AI Studio)
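
The submit-then-poll flow above could be driven by a small helper like the following. This is a sketch only: the `status` field and its terminal values (`done`, `failed`) are assumptions about the response body, not the confirmed schema, and `wait_for_job` is a hypothetical name. The HTTP call itself is passed in as `get_status` so the helper stays transport-agnostic.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    get_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll GET /orchestrate/{job_id} until the job reaches a terminal state.

    get_status wraps the HTTP request and returns the decoded JSON body.
    """
    terminal = {"done", "failed"}  # hypothetical status values
    deadline = time.monotonic() + timeout
    while True:
        body = get_status(job_id)
        if body.get("status") in terminal:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still running after {timeout}s")
        sleep(interval)
```

A cron-triggered caller would use the same helper after its `POST /orchestrate`, since both paths share the endpoint.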

### [Intelligence] Knowledge consolidation — Phase 1
See `ARCH__Intelligence_Layer.md` for full design. Initial scope:
- [ ] Tool: `ae_journal_search` — search before creating to avoid duplicates
- [ ] Tool: `ae_journal_entry_create` — write a new entry with source metadata
- [ ] Import script: walk a markdown directory, chunk by H2 section, create entries
- [ ] Target: markdown files from `~/DgrZone_Nextcloud/` and `~/OSIT_Nextcloud/`
- [ ] Tag strategy: source path, date, topic tags from frontmatter or filename
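
The "chunk by H2 section" step of the import script might look like this. It is a sketch under stated assumptions: `chunk_by_h2` is a hypothetical name, text before the first H2 is kept under an empty heading, and lines inside fenced code blocks are never treated as headings. Creating the journal entries and deriving tags are left out.

```python
import re
from typing import List, Tuple

# An H2 is exactly two '#' followed by at least one space and a title.
_H2 = re.compile(r"^## +(.*\S)\s*$")


def chunk_by_h2(markdown: str) -> List[Tuple[str, str]]:
    """Split markdown into (heading, body) pairs, one per H2 section."""
    chunks: List[Tuple[str, List[str]]] = [("", [])]
    in_fence = False
    for line in markdown.splitlines():
        # Toggle fence state so '## ...' inside code is left alone.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else _H2.match(line)
        if match:
            chunks.append((match.group(1), []))
        else:
            chunks[-1][1].append(line)
    result = [(title, "\n".join(body).strip()) for title, body in chunks]
    # Drop the leading pseudo-section when the file starts with an H2.
    return [(title, body) for title, body in result if title or body]
```

Each returned pair maps naturally onto one entry, with the heading available as a topic tag alongside the source path.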

### [Channel] Nextcloud Talk integration — stabilize
- NC Talk bot is implemented (`cortex/routers/nextcloud_talk.py`)
- HMAC signing: sign `random + message_text` (NOT raw body) — already fixed
- [ ] Test end-to-end after any Cortex restart
- [ ] Finish the bot registration documentation in `docs/NEXTCLOUD_TALK_BOT.md`
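
For reference when testing, the signing rule noted above (sign `random + message_text`, not the raw body) can be sketched as follows. The digest algorithm is assumed to be HMAC-SHA256 with the bot's shared secret, and both function names are hypothetical; check them against `cortex/routers/nextcloud_talk.py` before relying on this.

```python
import hashlib
import hmac


def sign_talk_message(secret: str, random: str, message_text: str) -> str:
    """Return the hex HMAC-SHA256 of random + message_text.

    The digest covers the random value concatenated with the message
    text, not the raw JSON request body.
    """
    mac = hmac.new(secret.encode("utf-8"), digestmod=hashlib.sha256)
    mac.update(random.encode("utf-8"))
    mac.update(message_text.encode("utf-8"))
    return mac.hexdigest()


def signatures_match(expected_hex: str, received_hex: str) -> bool:
    """Compare two hex digests in constant time, ignoring letter case."""
    return hmac.compare_digest(expected_hex.lower(), received_hex.lower())
```

Signing the JSON body instead of the message text yields a different digest, which matches the failure mode the earlier fix addressed.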

### [Multi-user] Holly agent instance
- Plan: run two separate Cortex instances, not multi-user in one service
- Reverse proxy: `inara.dgrzone.com` → port A, `holly.dgrzone.com` → port B
- [ ] Create `holly/` identity directory (parallel to `inara/`)
- [ ] Second `docker-compose` service or separate systemd unit

---

## 🟢 Lower Priority / Future

### [Intelligence] Dev agent pipeline
See `ARCH__Intelligence_Layer.md`. Full design not yet started.
- [ ] Specialist agent: frontend (SvelteKit) code changes
- [ ] Specialist agent: backend (FastAPI) code changes
- [ ] Supervisor agent: diff review, syntax check, test runner
- [ ] Gitea webhook integration: trigger on push/PR, report back
- [ ] Human approval gate before commit

### [Intelligence] Supervisor agent
- Runs `py_compile`, `svelte-check`, unit tests after specialist agent work
- Reports pass/fail back to orchestrator
- Only commits on explicit approval
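
The `py_compile` step of the supervisor could be sketched like this. `syntax_check` is a hypothetical helper; it covers only the Python syntax check and leaves `svelte-check`, the unit tests, and the approval gate aside. Note that `py_compile.compile` writes bytecode into `__pycache__` as a side effect.

```python
import py_compile
from pathlib import Path
from typing import Dict, Iterable


def syntax_check(paths: Iterable[Path]) -> Dict[str, str]:
    """Byte-compile each Python file and return {path: error} for failures.

    An empty dict means every file compiled cleanly.
    """
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            # doraise=True turns syntax errors into PyCompileError
            # instead of printing them to stderr.
            py_compile.compile(str(path), doraise=True)
        except (py_compile.PyCompileError, OSError) as exc:
            failures[str(path)] = str(exc)
    return failures
```

The pass/fail report to the orchestrator is then simply whether the returned dict is empty, with the messages attached on failure.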

### [Channel] Gitea webhooks
- Receive push/PR/issue events → route to appropriate agent
- `cortex/routers/` already has pattern; add `gitea.py`
- Gitea Actions (CI) for "run tests on push" — simpler than custom runner

### [Channel] Google Chat integration
- `cortex/routers/google_chat.py` already exists; unclear whether it is more than a stub
- [ ] Review current state, complete or document gaps

### [Distill] Monitor first auto_distill_long run
- Scheduled for ~April 1 at 04:00
- Manually review `inara/MEMORY_LONG.md` output before fully trusting
- Adjust distill prompts if needed

### [Distill] Distill quality review
- Short/mid/long distill prompts live in `cortex/memory_distiller.py`
- After first few automatic runs, review quality and tune

### [Backend] Intelligent model routing
- Currently hardcoded: Claude default, Gemini fallback
- Future: route by task type (code → Claude, search → Gemini, private → Ollama)
- Future: route by context length (Gemini 2.0 has 1M token context)
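
One possible shape for the routing rule above, as a sketch. The task-type mapping comes from this list; the token cutoff, the names `choose_backend` and `LONG_CONTEXT_TOKENS`, and the decision that private work stays on Ollama regardless of size are all assumptions to revisit.

```python
from typing import Optional

# Task-type routes taken from the list above.
ROUTES = {"code": "claude", "search": "gemini", "private": "ollama"}
DEFAULT_BACKEND = "claude"       # matches the current hardcoded default
LONG_CONTEXT_BACKEND = "gemini"  # backend with the largest context window
LONG_CONTEXT_TOKENS = 150_000    # hypothetical cutoff, tune per model


def choose_backend(task_type: Optional[str], context_tokens: int = 0) -> str:
    """Pick a backend name from the task type and estimated context size."""
    # Assumption: private work never leaves the local model.
    if task_type == "private":
        return ROUTES["private"]
    # Oversized contexts go to the long-context backend.
    if context_tokens > LONG_CONTEXT_TOKENS:
        return LONG_CONTEXT_BACKEND
    return ROUTES.get(task_type, DEFAULT_BACKEND)
```

Keeping the table as plain data makes it easy to move into configuration later.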

---

## ✅ Completed

### [UI] Mobile-friendly header
- Backend toggle, font size, theme buttons moved into ⚙ settings panel
- Header reduced to 4 buttons: Sessions, Files, ⚙, ?
- Committed: `mobile_header` (2026-03)

### [UI] Mobile text input
- `flex-direction: column` on `#input-area` at ≤520px
- `font-size: 16px` on `#input` (prevents iOS Safari auto-zoom)
- `body { height: 100dvh }` (handles soft keyboard)
- Committed: `23f8659` (2026-03)

### [UI] Auth warning banner
- Claude CLI token expiry check (`~/.claude/.credentials.json`)
- Gemini CLI auth check (warns only if no `refresh_token`)
- Dismissible amber/red banner with re-auth instructions
- Committed: `fe6561b` (2026-03)

### [UI] Distill schedule in ⚙ panel
- Shows `next_run` times for short/mid/long distill jobs
- Fetches from existing `/distill/status` endpoint

### [UI] Help modal collapsible sections
- H2 sections collapse/expand via `<details>` elements
- Top 4 sections (Header Controls, Chat, Sessions, Notes) open by default

### [Backend] Gemini CLI backend
- `gemini -p` subprocess, streaming output
- Auth check endpoint `/auth/status`
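
For orientation, streaming a CLI subprocess line by line can be sketched as below. This is not the actual Cortex implementation: `stream_cli` is a hypothetical helper, and it uses blocking `subprocess.Popen` rather than whatever async mechanism the service uses.

```python
import subprocess
from typing import Iterator, List


def stream_cli(argv: List[str]) -> Iterator[str]:
    """Run a command and yield its output line by line as it is produced."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold errors into the same stream
        text=True,
        bufsize=1,                 # line-buffered on the reading side
    )
    assert proc.stdout is not None
    try:
        for line in proc.stdout:
            yield line
    finally:
        # Runs on normal completion and when the consumer stops early.
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, argv)
```

A caller would invoke it as `stream_cli(["gemini", "-p", prompt])` and forward each yielded line to the streaming response.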

### [Backend] Memory distiller
- APScheduler jobs: `distill_short` (6h), `distill_mid` (24h), `distill_long` (weekly)
- Writes to `inara/MEMORY_SHORT.md`, `MEMORY_MID.md`, `MEMORY_LONG.md`

### [Backend] Session logging + file browser
- Sessions saved to `inara/sessions/`
- Files panel in UI browses `inara/` directory

### [Backend] Dispatcher core
- FastAPI service with streaming response
- `claude -p` and `gemini -p` subprocess backends
- Session context management (rolling window)
- Nextcloud Talk webhook handler
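
The rolling-window idea behind session context management can be illustrated with a bounded deque. This is a sketch of the general technique, not the dispatcher's actual code: the class name `RollingContext`, the per-message window, and the `role: text` prompt format are all assumptions.

```python
from collections import deque
from typing import Deque, Dict


class RollingContext:
    """Keep only the most recent max_turns messages of a session."""

    def __init__(self, max_turns: int = 20) -> None:
        # deque(maxlen=...) silently drops the oldest item when full.
        self._turns: Deque[Dict[str, str]] = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        """Record one message; the oldest is evicted once the window is full."""
        self._turns.append({"role": role, "text": text})

    def as_prompt(self) -> str:
        """Render the retained messages, oldest first, one per line."""
        return "\n".join(f"{t['role']}: {t['text']}" for t in self._turns)
```

A window counted in messages is the simplest variant; a token-budgeted window would need a tokenizer and is out of scope here.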