feat: janitor role — session checkpoint compaction
New cortex/janitor.py runs before each orchestrator dispatch. When a session exceeds 20 user turns or ~12K estimated tokens, the oldest half is summarized by the janitor role model and replaced with a compact checkpoint message. Fail-safe: always returns original history if the model call fails. Config: JANITOR_TURN_THRESHOLD, JANITOR_TOKEN_THRESHOLD in .env. Assign Gemma E4B or Haiku 4.5 to the janitor role for effectively-free compaction. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -71,13 +71,20 @@ class Settings(BaseSettings):
|
||||
role_chat: str = "claude_cli"
|
||||
role_orchestrator: str = "gemini_api"
|
||||
role_distill: str = "claude_cli"
|
||||
role_janitor: str = "claude_cli" # assign a cheap/fast model: Haiku 4.5, local Gemma E4B
|
||||
role_coder: str = "claude_cli"
|
||||
role_research: str = "gemini_api"
|
||||
|
||||
# Comma-separated list of standard roles shown in the model settings UI.
|
||||
# Add custom roles here to extend the UI without code changes.
|
||||
# Example: DEFINED_ROLES=chat,orchestrator,distill,coder,research,medical
|
||||
defined_roles: str = "chat,orchestrator,distill,coder,research"
|
||||
# Example: DEFINED_ROLES=chat,orchestrator,distill,janitor,coder,research,medical
|
||||
defined_roles: str = "chat,orchestrator,distill,janitor,coder,research"
|
||||
|
||||
# Session checkpoint compaction ("janitor") thresholds.
|
||||
# Compaction fires when EITHER threshold is exceeded.
|
||||
# Override in .env: JANITOR_TURN_THRESHOLD=15 JANITOR_TOKEN_THRESHOLD=8000
|
||||
janitor_turn_threshold: int = 20 # user turns (each turn = 1 user + 1 assistant message)
|
||||
janitor_token_threshold: int = 12000 # estimated tokens (chars / 4 heuristic)
|
||||
|
||||
# Memory tier token budgets — soft caps used during distillation
|
||||
# Override in .env: MEMORY_BUDGET_LONG=4000 etc.
|
||||
|
||||
Reference in New Issue
Block a user