Add tiered memory system with manual distillation
- config.py: memory_budget_long/mid/short settings (overridable in .env)
- memory_distiller.py: distill_short (no LLM), distill_mid, distill_long (LLM)
- routers/distill.py: POST /distill/{short,mid,long,all} endpoints
- context_loader.py: rewrote to load long→mid→short order with include_* toggles
- routers/chat.py: ChatRequest gains include_long/mid/short fields
- routers/files.py: MEMORY_LONG/MID/SHORT.md added to ALLOWED set
- main.py: register distill router
- static/index.html: context bar — tier selector, L/M/S memory toggles,
distill buttons with status feedback; send includes tier + memory flags
- inara/MEMORY_LONG.md: migrated from MEMORY.md + Cortex/Talk bot notes
- inara/MEMORY_MID.md, MEMORY_SHORT.md: stubs ready for distillation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -19,6 +19,9 @@ class ChatRequest(BaseModel):
|
||||
session_id: str | None = None
|
||||
tier: int | None = None
|
||||
model: str | None = None # "claude" or "gemini" to override; None = use primary_backend
|
||||
include_long: bool = True
|
||||
include_mid: bool = True
|
||||
include_short: bool = True
|
||||
|
||||
|
||||
class BackendRequest(BaseModel):
|
||||
@@ -49,7 +52,12 @@ async def _stream_chat(req: ChatRequest):
|
||||
session_id = req.session_id or generate_session_id()
|
||||
tier = req.tier or settings.default_tier
|
||||
|
||||
system_prompt = load_context(tier)
|
||||
system_prompt = load_context(
|
||||
tier,
|
||||
include_long=req.include_long,
|
||||
include_mid=req.include_mid,
|
||||
include_short=req.include_short,
|
||||
)
|
||||
history = load_session(session_id)
|
||||
history.append({"role": "user", "content": req.message})
|
||||
|
||||
|
||||
Reference in New Issue
Block a user