Scott Idem 2f675ee4bf Initial commit — Cortex API + Inara identity

Cortex: FastAPI backend serving Inara via Claude/Gemini CLI backends.
Includes SSE streaming chat, session persistence, Google Chat webhook
handler, and Docker support.

Inara: Identity files (persona, soul, protocols, memory, context tiers)
mounted read-only into the container at runtime.

Features in initial cut:
- /chat endpoint with SSE keepalive + LLM fallback
- Session store with rolling history window
- Markdown rendering, copy-to-clipboard, links open in new tab
- Stacked right-column input controls (height selector, enter toggle,
  note mode with public/private) — semi-hidden until textarea grows
- /note endpoint for injecting public context into session history
- Docker Compose config (local dev runs natively; Docker for server)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-06 03:41:00 -05:00

1.8 KiB

CONTEXT_TIERS.md — Cortex Dispatcher Loading Spec

This file defines which Inara context files to inject into a session, based on the target model's context window. The dispatcher reads it to decide which files to prepend.

Tier 1 — Minimal (~1,500 tokens)

Target: Local models with ~8k context or less (Qwen 8B small, etc.)

Load:

SOUL.md
IDENTITY.md
USER.md — first 30 lines only (identity + what he cares about)

Notes: Just enough for Inara to know who she is and who Scott is.
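The Tier 1 load list can be sketched as follows. This is a minimal illustration, not the dispatcher's actual code: the function names `read_head` and `build_tier1`, the `identity_dir` parameter, and the blank-line separator between files are all assumptions.

```python
from itertools import islice
from pathlib import Path


def read_head(path: Path, max_lines: int = 30) -> str:
    """Return at most the first `max_lines` lines of a text file."""
    with path.open(encoding="utf-8") as fh:
        return "".join(islice(fh, max_lines))


def build_tier1(identity_dir: Path) -> str:
    """Assemble the Tier 1 context: SOUL.md, IDENTITY.md, first 30 lines of USER.md.

    `identity_dir` is a hypothetical directory holding the Inara identity files.
    """
    parts = [
        (identity_dir / "SOUL.md").read_text(encoding="utf-8"),
        (identity_dir / "IDENTITY.md").read_text(encoding="utf-8"),
        read_head(identity_dir / "USER.md", 30),
    ]
    # Separator between files is an assumption; the spec does not define one.
    return "\n\n".join(parts)
```

Reading with `islice` avoids loading the whole of USER.md when only its head is needed.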

Tier 2 — Standard (~5,000 tokens)

Target: Models with 16k–32k context (Haiku, Gemini Flash, Qwen 8B full)

Load:

SOUL.md
IDENTITY.md
USER.md — full
MEMORY.md
PROTOCOLS.md

Notes: Full operational context. Sufficient for most routine tasks and conversations.

Tier 3 — Extended (~15,000 tokens)

Target: Models with 32k–128k context (Sonnet, Gemini Pro, Qwen 14B, Qwen 30B)

Load:

Everything in Tier 2
~/agents_sync/aether/docs/FLEET_MANIFEST.md
Most recent 2 session files from sessions/
Relevant project doc (e.g., CORTEX.md) if task is project-related
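Picking the most recent session files might look like the sketch below. The spec does not say how "most recent" is determined, so ordering by file modification time is an assumption, as are the function name `recent_session_files` and its parameters.

```python
from pathlib import Path
from typing import List


def recent_session_files(sessions_dir: Path, count: int = 2) -> List[Path]:
    """Return up to `count` files from `sessions_dir`, newest first.

    Recency is judged by modification time (an assumption; a real dispatcher
    might instead parse a date out of the file name).
    """
    files = [p for p in sessions_dir.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:count]
```

The same helper would cover Tier 4 by passing a larger `count`.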

Tier 4 — Full (50,000+ tokens)

Target: Frontier models with 200k+ context (Claude Opus/Sonnet, Gemini 2.5 Pro)

Load:

Everything in Tier 3
Last 5–7 session files
Full project docs as relevant
~/agents_sync/aether/docs/api_v3.md if task involves Aether API
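The mapping from context window to tier can be sketched as a small function. The targets above leave some windows unassigned (between 8k and 16k, and between 128k and 200k) and list 32k under two tiers. The sketch resolves these by rounding down to the smaller tier, which is an assumption rather than something the spec states. An unknown window falls back to Tier 2, following the "when in doubt" rule in this file. The name `select_tier` is hypothetical.

```python
from typing import Optional

DEFAULT_TIER = 2  # "When in doubt, use Tier 2."


def select_tier(context_window: Optional[int]) -> int:
    """Map a model's context window (in tokens) to a context tier from 1 to 4."""
    if context_window is None or context_window <= 0:
        return DEFAULT_TIER  # window unknown
    if context_window >= 200_000:
        return 4
    # Exactly 32k stays in Tier 2 whether counted as 32,000 or 32,768 (assumption).
    if context_window > 32 * 1024:
        return 3  # also covers the unassigned 128k to 200k range
    if context_window >= 16_000:
        return 2
    return 1  # 8k and below, plus the unassigned 8k to 16k range


print(select_tier(8192), select_tier(32768), select_tier(128_000), select_tier(200_000))
```

Rounding down keeps the injected context a small share of the window, in line with the warning that over-loading small models degrades output quality.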

Hard Rules

SOUL.md and IDENTITY.md are always loaded, regardless of tier.
Never inject: .env files, TOOLS.md (contains credentials), or raw session logs older than 30 days.
MEMORY.md must stay under 4,000 tokens — enforce this during distillation.
When in doubt, use Tier 2. Over-loading small models degrades output quality.
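A dispatcher could enforce these rules with checks along the following lines. This is a sketch under stated assumptions: session logs are taken to be any file under a directory named `sessions`, age is measured by modification time, and tokens are estimated with a rough four-characters-per-token heuristic rather than a real tokenizer. All function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

ALWAYS_LOAD = ("SOUL.md", "IDENTITY.md")
NEVER_INJECT_NAMES = {"TOOLS.md"}
MAX_SESSION_AGE = timedelta(days=30)
MEMORY_TOKEN_LIMIT = 4000


def is_injectable(path: Path, now: Optional[datetime] = None) -> bool:
    """Return False for files the hard rules forbid injecting."""
    name = path.name
    if name == ".env" or name.startswith(".env.") or name.endswith(".env"):
        return False
    if name in NEVER_INJECT_NAMES:
        return False
    if "sessions" in path.parts:  # assumption: session logs live under sessions/
        now = now or datetime.now()
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if now - modified > MAX_SESSION_AGE:
            return False
    return True


def estimate_tokens(text: str) -> int:
    """Crude token estimate: about four characters per token (heuristic only)."""
    return (len(text) + 3) // 4


def memory_within_budget(text: str) -> bool:
    """Check the MEMORY.md size rule against the estimated token count."""
    return estimate_tokens(text) < MEMORY_TOKEN_LIMIT
```

Because the token estimate is approximate, a distillation step that relies on it should leave some margin below the 4,000-token limit.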
