feat: multi-persona support (single Cortex, multiple users)

- Add cortex/persona.py: ContextVar-based per-request routing with path traversal protection and persona validation
- Migrate inara/ → personas/inara/ (git history preserved via git mv)
- config.py: add personas_root(), inara_path() delegates to personas/inara
- All 14 settings.inara_path() call sites replaced with persona_path()
- ChatRequest + OrchestrateRequest: add persona field (default: "inara") with validation at request entry before any processing
- memory_distiller: add optional persona param for future per-persona distill
- cron_runner/tools/cron: stamp persona on jobs, prefix APScheduler IDs (persona:job_id) to prevent collisions across personas
- scheduler: _load_user_crons() iterates all personas at startup

Adding a new persona: create personas/<name>/ with IDENTITY.md + SOUL.md.
Auth: handled at nginx level (inject X-Cortex-Persona header per subdomain).
Future: persona maps to Aether account_id_random for full integration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
personas/inara/CONTEXT_TIERS.md
@@ -0,0 +1,65 @@
# CONTEXT_TIERS.md — Cortex Dispatcher Loading Spec

This file defines which Inara context files to inject into a session based on the target model's context window. The dispatcher reads this to decide what to prepend.

---
## Tier 1 — Minimal (~1,500 tokens)

**Target:** Local models with ~8k context or less (Qwen 8B small, etc.)

**Load:**
- `SOUL.md`
- `IDENTITY.md`
- `USER.md` — first 30 lines only (identity + what he cares about)

**Notes:** Just enough for Inara to know who she is and who Scott is.

---
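The Tier 1 load list above, including the "first 30 lines only" rule for `USER.md`, could be assembled along the lines of this sketch. The function names `read_head` and `load_tier1`, the UTF-8 encoding, and the blank-line separator between files are assumptions; the spec only names the files and the 30-line cutoff.

```python
from itertools import islice
from pathlib import Path

USER_HEAD_LINES = 30  # from the Tier 1 spec: USER.md, first 30 lines only


def read_head(path, max_lines=USER_HEAD_LINES):
    """Return at most max_lines lines from the start of a text file."""
    with Path(path).open(encoding="utf-8") as handle:
        # islice stops after max_lines without reading the rest of the file.
        return "".join(islice(handle, max_lines))


def load_tier1(persona_dir):
    """Concatenate the Tier 1 files in the order the spec lists them."""
    persona_dir = Path(persona_dir)
    parts = [
        (persona_dir / "SOUL.md").read_text(encoding="utf-8"),
        (persona_dir / "IDENTITY.md").read_text(encoding="utf-8"),
        read_head(persona_dir / "USER.md"),
    ]
    # Separator is an assumption; the spec does not say how files are joined.
    return "\n\n".join(part.strip() for part in parts)
```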
## Tier 2 — Standard (~5,000 tokens)

**Target:** Models with 16k–32k context (Haiku, Gemini Flash, Qwen 8B full)

**Load:**
- `SOUL.md`
- `IDENTITY.md`
- `USER.md` — full
- `MEMORY.md`
- `PROTOCOLS.md`

**Notes:** Full operational context. Sufficient for most routine tasks and conversations.

---
## Tier 3 — Extended (~15,000 tokens)

**Target:** Models with 32k–128k context (Sonnet, Gemini Pro, Qwen 14B, Qwen 30B)

**Load:**
- Everything in Tier 2
- `~/agents_sync/aether/docs/FLEET_MANIFEST.md`
- Most recent 2 session files from `sessions/`
- Relevant project doc (e.g., `CORTEX.md`) if task is project-related

---
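Picking the "most recent 2 session files" for Tier 3 could be done as in the sketch below. The spec does not say how recency is determined; this version assumes file modification time, and the function name `recent_session_files` is hypothetical. If session files carry a date in their name, sorting by name may be the better choice.

```python
from pathlib import Path


def recent_session_files(sessions_dir, count=2):
    """Return up to count files from sessions_dir, newest first.

    Assumption: "most recent" means latest modification time.
    """
    files = [p for p in Path(sessions_dir).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:count]
```

The same helper would cover Tier 4's larger window by passing a higher `count`, for example `recent_session_files("sessions", count=7)`.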
## Tier 4 — Full (50,000+ tokens)

**Target:** Frontier models with 200k+ context (Claude Opus/Sonnet, Gemini 2.5 Pro)

**Load:**
- Everything in Tier 3
- Last 5–7 session files
- Full project docs as relevant
- `~/agents_sync/aether/docs/api_v3.md` if task involves Aether API

---
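A dispatcher could map a model's context window onto the four tiers with something like the sketch below. The spec leaves two ranges undefined (between 8k and 16k, and between 128k and 200k) and lists 32k under both Tier 2 and Tier 3. This sketch assumes the lower tier in each of those cases, and falls back to Tier 2 for an unknown window, following the "when in doubt" rule in the Hard Rules section. The function name and the exact token thresholds are assumptions.

```python
# Thresholds in tokens. The exact cutoffs are assumptions; the spec only
# gives approximate ranges such as "16k-32k".
TIER2_MIN = 16_000
TIER2_MAX = 32_768   # a 32k model is treated as Tier 2, not Tier 3
TIER4_MIN = 200_000

DEFAULT_TIER = 2     # "When in doubt, use Tier 2."


def select_tier(context_window=None):
    """Pick a context tier (1-4) from a model's context window in tokens."""
    if context_window is None or context_window <= 0:
        return DEFAULT_TIER
    if context_window >= TIER4_MIN:
        return 4
    if context_window > TIER2_MAX:
        # Includes the unspecified 128k-200k gap, rounded down to Tier 3.
        return 3
    if context_window >= TIER2_MIN:
        return 2
    # Includes the unspecified 8k-16k gap, rounded down to Tier 1.
    return 1
```

Rounding down in the gaps is a conservative choice: it keeps the injected context smaller than the model could accept rather than larger.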
## Hard Rules

- `SOUL.md` and `IDENTITY.md` are **always** loaded, regardless of tier.
- **Never inject:** `.env` files, `TOOLS.md` (contains credentials), raw session logs older than 30 days.
- **MEMORY.md must stay under 4,000 tokens** — enforce this during distillation.
- When in doubt, use Tier 2. Over-loading small models degrades output quality.
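The never-inject and token-budget rules above could be enforced with guards like the ones in this sketch. Several details are assumptions, not part of the spec: session logs are identified as files directly inside a `sessions/` directory, their age is taken from the file modification time, `.env` files are matched by name, and tokens are estimated with a crude four-characters-per-token heuristic in place of a real tokenizer. All function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

NEVER_INJECT_NAMES = {"TOOLS.md"}        # contains credentials, per the spec
MAX_SESSION_AGE = timedelta(days=30)     # older session logs are never injected
MEMORY_TOKEN_LIMIT = 4_000               # MEMORY.md must stay under this


def is_env_file(name):
    """Match .env, .env.local, prod.env and similar names (assumed convention)."""
    return name == ".env" or name.startswith(".env.") or name.endswith(".env")


def is_injectable(path, now=None):
    """Return False for any file the hard rules forbid injecting."""
    path = Path(path)
    if path.name in NEVER_INJECT_NAMES or is_env_file(path.name):
        return False
    if path.parent.name == "sessions":
        now = now or datetime.now(timezone.utc)
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if now - modified > MAX_SESSION_AGE:
            return False
    return True


def estimate_tokens(text):
    """Very rough estimate: about four characters per token (an assumption)."""
    return len(text) // 4


def memory_within_budget(memory_text, limit=MEMORY_TOKEN_LIMIT):
    """True if the distilled MEMORY.md text stays strictly under the limit."""
    return estimate_tokens(memory_text) < limit
```

A distiller following this sketch would call `memory_within_budget` after each distillation pass and trim or re-summarize when it returns `False`.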