feat: janitor role — session checkpoint compaction

New cortex/janitor.py runs before each orchestrator dispatch. When a session
exceeds 20 user turns or ~12K estimated tokens, the oldest half is summarized
by the janitor role model and replaced with a compact checkpoint message.
Fail-safe: always returns original history if the model call fails.

Config: JANITOR_TURN_THRESHOLD, JANITOR_TOKEN_THRESHOLD in .env.
Assign Gemma E4B or Haiku 4.5 to the janitor role for effectively-free compaction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Scott Idem
2026-06-17 21:32:54 -04:00
parent 32585804dd
commit 67f5db70a3
4 changed files with 149 additions and 38 deletions

View File

@@ -257,6 +257,7 @@ async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
try:
from session_store import load as load_session, save as save_session, generate_session_id
from janitor import maybe_checkpoint as janitor_checkpoint
tier = req.tier or settings.default_tier
role_cfg = model_registry.get_role_config(user, req.chat_role)
@@ -272,7 +273,8 @@ async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
)
session_id = req.session_id or generate_session_id()
history = load_session(session_id)
# Compact old session turns before dispatching — no-op on new sessions or short ones.
history = await janitor_checkpoint(session_id) if req.session_id else load_session(session_id)
session_messages = history or None
orch_model = model_registry.get_model_for_role(user, "orchestrator")