feat: janitor role — session checkpoint compaction
New cortex/janitor.py runs before each orchestrator dispatch. When a session exceeds 20 user turns or ~12K estimated tokens, the oldest half is summarized by the janitor role model and replaced with a compact checkpoint message. Fail-safe: always returns original history if the model call fails. Config: JANITOR_TURN_THRESHOLD, JANITOR_TOKEN_THRESHOLD in .env. Assign Gemma E4B or Haiku 4.5 to the janitor role for effectively-free compaction. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -257,6 +257,7 @@ async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
|
||||
|
||||
try:
|
||||
from session_store import load as load_session, save as save_session, generate_session_id
|
||||
from janitor import maybe_checkpoint as janitor_checkpoint
|
||||
|
||||
tier = req.tier or settings.default_tier
|
||||
role_cfg = model_registry.get_role_config(user, req.chat_role)
|
||||
@@ -272,7 +273,8 @@ async def _run_job(job_id: str, req: OrchestrateRequest, user: str) -> None:
|
||||
)
|
||||
|
||||
session_id = req.session_id or generate_session_id()
|
||||
history = load_session(session_id)
|
||||
# Compact old session turns before dispatching — no-op on new sessions or short ones.
|
||||
history = await janitor_checkpoint(session_id) if req.session_id else load_session(session_id)
|
||||
session_messages = history or None
|
||||
|
||||
orch_model = model_registry.get_model_for_role(user, "orchestrator")
|
||||
|
||||
Reference in New Issue
Block a user