Compare commits

7 Commits

Author SHA1 Message Date
Scott Idem
b9a78819ac docs: add LLM wiki concept (Karpathy pattern) to ARCH__FUTURE.md
Inara's exploration of a living-wiki knowledge compilation architecture
as an alternative to RAG — three-layer model, ingest/query/lint ops,
and a mapping to existing Cortex concepts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 13:22:55 -04:00
Scott Idem
3672fa1506 docs: comprehensive doc audit — sync all docs to current state
- MASTER.md: tool count 40→47, add proactive notifications + spawn_agent rows, date bump
- ROADMAP.md: mark local orchestrator/web push/proactive notifs/spawn_agent/web_read/session_read as done, date bump
- ARCH__CHANNELS.md: rewrite notification channel config section — all 4 channels, all triggers, on-demand endpoints
- ARCH__SYSTEM.md: update tools/ module list to include files, agents
- README.md: update LLM backends in architecture diagram, add browser push to channels table
- CLAUDE.md: add doc update checklist to Documentation Philosophy section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 13:13:45 -04:00
Scott Idem
52c19afbcc fix: raise web_read and http_fetch max_chars cap to 128K
Both tools now accept max_chars up to 131072 to accommodate long
documentation pages and large API responses.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 13:08:17 -04:00
Scott Idem
17e8869d12 docs: update tool count (45→47), HELP.md, and TODO for new web/file tools
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 13:05:04 -04:00
Scott Idem
7c3291960a feat: web_read (trafilatura), session_read, http_fetch max_chars
web_read(url, max_chars=16000) — fetches a URL and extracts clean article
text via trafilatura, stripping ads/nav/boilerplate. Returns markdown.

session_read(date) — reads a full session log by YYYY-MM-DD date; lists
available dates if the requested one is not found.

http_fetch gains a max_chars param (default 8192, max 32768) so the cap
is configurable instead of hardcoded.

Tool count: 45 → 47.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 13:04:24 -04:00
Scott Idem
a99ebb8c30 feat: retry button for orchestrator errors + explicit client timeout
Extract orchestrator inner loop into _doOrchestrate() so the retry button
can re-run without re-adding the user message to DOM or history — same
pattern as the existing chat retry.

Also set AsyncOpenAI(timeout=settings.timeout_local) so slow remote models
(OpenRouter/DeepSeek) get the same 300s budget as local chat calls instead
of the SDK default which varies by connection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 12:39:34 -04:00
Scott Idem
ff154b1ec0 docs: update CLAUDE.md, HELP.md, and TODO for notifications page + push fix
- CLAUDE.md: date → 2026-05-08, add Proactive notifications row to channel table
- HELP.md: update Notifications settings entry, expand Push Notifications section
  with channel config link, add test API endpoints to reference table
- TODO__Agents.md: mark notifications dedicated page and pywebpush fix as done

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 23:58:47 -04:00
15 changed files with 295 additions and 56 deletions

View File

@@ -185,6 +185,19 @@ Cortex is a no-black-box system. Docs must match reality — at all times.
- **CLAUDE.md + ARCH__*.md are the developer contract:** Update them as the architecture evolves.
- **Stale docs are bugs.** If you notice drift, fix it before moving on.
### Doc update checklist (run after any significant change)
| Doc | Update when |
|---|---|
| `CLAUDE.md` | New tool, channel, router, major design change, tool count |
| `cortex/static/HELP.md` | Any user-visible feature — tools, settings, UI, API endpoints |
| `documentation/TODO__Agents.md` | Mark completed items; add new planned work |
| `documentation/MASTER.md` | New capability goes live; tool count changes |
| `documentation/ROADMAP.md` | Phase items completed or added |
| `documentation/ARCH__CHANNELS.md` | New channel, notification trigger, or scheduler job |
| `documentation/ARCH__SYSTEM.md` | New module, router, or tools/ file |
| `README.md` | Architecture diagram, channels table, or setup steps change |
---
## Adding a New Tool
@@ -237,7 +250,7 @@ clearly asked for a directory to be unblocked.
---
## Current State (2026-05-06)
## Current State (2026-05-08)
Cortex is running and stable. All channels are live:
@@ -252,11 +265,12 @@ Cortex is running and stable. All channels are live:
| Tool audit log | ✅ Live | Every tool call logged to `home/{user}/tool_audit/YYYY-MM-DD.jsonl` |
| Token usage tracking | ✅ Live | Per-user `home/{user}/usage.json`; summary in Settings |
| Web push | ✅ Live | VAPID push notifications; `web_push` tool; subscribe via ☰ menu |
| Proactive notifications | ✅ Live | Daily reminder check (09:00); distill/cron completions; `GET /settings/notifications` dedicated page |
Active users: scott (inara), holly (tina), brian (wintermute)
**45 orchestrator tools:** web_search, http_fetch,
file_read/list/write/session_search, shell_exec, claude_allow_dir,
**47 orchestrator tools:** web_search, http_fetch, web_read,
file_read/list/write/session_read/session_search, shell_exec, claude_allow_dir,
cortex_restart/logs/status/update,
task_list/create/update/complete, cron_list/add/remove/toggle,
reminders_add/list/remove/clear, scratch_read/write/append/clear,

View File

@@ -182,10 +182,10 @@ Back it up separately — it is required to restore from any snapshot.
└─ POST /channels/google-chat/{username} — Google Chat Add-on (per-user)
LLM Backends
• Claude CLI — primary, all user-facing responses
• Gemini CLI — fallback
• Gemini API — orchestrator tool loop only (not general chat)
• Local — Open WebUI/Ollama on scott_gaming (private/offline)
• Claude CLI — primary, all user-facing responses
• Gemini CLI — fallback
• Gemini API — orchestrator tool loop (two-brain: Gemini plans, Claude responds)
• Local OpenAI — Open WebUI/Ollama on scott_gaming; also runs local orchestrator loop
Persona context loaded from home/{user}/persona/{name}/
```
@@ -213,11 +213,12 @@ Context is loaded at request time from `home/{user}/persona/{name}/` via `cortex
Webhook endpoints are per-user — each user configures their own secrets in `home/{username}/channels.json`.
| Channel | Status | Endpoint |
| Channel | Status | Endpoint / Notes |
|---|---|---|
| Web UI | Live | `https://cortex.dgrzone.com` — session auth (login form + JWT cookie) |
| Nextcloud Talk | Live | `POST /webhook/nextcloud/{username}` — HMAC-signed, async reply |
| Google Chat | Live | `POST /channels/google-chat/{username}` — Workspace Add-on, JWT auth |
| Browser Push | Live | VAPID push notifications — subscribe via ☰ menu; proactive reminders + distill alerts |
See `docs/NEXTCLOUD_TALK_BOT.md` and `docs/GOOGLE_CHAT_BOT.md` for setup instructions.

View File

@@ -405,7 +405,7 @@ def _build_client(
    base_url = api_url.rstrip("/")
    if host_type == "openwebui":
        base_url = base_url + "/api"
    client = AsyncOpenAI(base_url=base_url, api_key=api_key)
    client = AsyncOpenAI(base_url=base_url, api_key=api_key, timeout=settings.timeout_local)
    if model_cfg.get("tools") is False:
        active_tools = []
    else:

View File

@@ -19,6 +19,9 @@ python-multipart>=0.0.9 # required by FastAPI for Form() data
# Async HTTP client — used for local OpenAI-compatible backend (Open WebUI / Ollama)
httpx>=0.27.0
# Web content extraction — strips ads/nav/boilerplate, returns clean article text
trafilatura>=1.6.0
# OpenAI-compatible client — tool calling for OpenRouter / LiteLLM / any OAI-compat host
openai>=1.0.0

View File

@@ -6,7 +6,7 @@
and are appended automatically by help.html when present.
-->
*Last updated: 2026-05-08*
*Last updated: 2026-05-09*
---
@@ -82,12 +82,12 @@ Orchestrated sessions persist to history exactly like regular chat.
### Available Tools
45 tools across 12 categories. Each tool schema is sent to the model on every orchestrated call — fewer active tools means fewer tokens per call.
47 tools across 12 categories. Each tool schema is sent to the model on every orchestrated call — fewer active tools means fewer tokens per call.
| Category | Tools |
|---|---|
| **Web** | `web_search`, `http_fetch` |
| **Files** | `file_read`, `file_list`, `file_write`, `session_search` |
| **Web** | `web_search`, `http_fetch`, `web_read` |
| **Files** | `file_read`, `file_list`, `file_write`, `session_read`, `session_search` |
| **Shell** | `shell_exec`, `claude_allow_dir` |
| **System** | `cortex_restart`, `cortex_logs`, `cortex_status`, `cortex_update` |
| **Tasks** | `task_list`, `task_create`, `task_update`, `task_complete` |
@@ -176,7 +176,7 @@ Each response shows a **model tag** (bottom-right of message) with the model lab
| **Account** | View your username, role badge (Admin / User), rename your username |
| **Connected Accounts** | See which Google account is linked for OAuth sign-in |
| **Email Allowlist** | Regex patterns controlling which addresses the `email_send` tool can reach |
| **Notifications** | Set which channel (NC Talk, Google Chat, email) Inara uses for proactive messages |
| **Notifications** | Dedicated page — set channel (Browser Push, NC Talk, Google Chat, email) for proactive messages; test buttons for instant verification |
| **Tool Permissions** | Allow or block specific orchestrator tools for your account |
| **Usage** | Token consumption by model — see below |
| **Browser Cache** | Clear UI preferences stored locally (theme, font size, session ID, etc.) |
@@ -337,6 +337,8 @@ Cortex can send browser push notifications — even when the tab is closed.
- Click again to disable. Subscriptions are stored per-device.
- The orchestrator's `web_push` tool lets Inara send you a push proactively (e.g. when a long task completes).
**Notification channel settings:** ☰ → **Account** → **Notification settings →** — choose Browser Push, Email, Nextcloud Talk, or Google Chat as the channel Inara uses for scheduled reminders, cron job completions, and memory digests. Use the **Send Test Notification** button to verify your setup, or **Check Reminders Now** to trigger the reminder check immediately.
---
## Context & Memory ( ⚙ panel )
@@ -424,6 +426,8 @@ For direct access or scripting:
| `GET` | `/api/push/vapid-key` | VAPID public key (for push subscription) |
| `POST` | `/api/push/subscribe` | Register a push subscription |
| `DELETE` | `/api/push/subscribe` | Remove a push subscription |
| `POST` | `/api/push/test` | Send a test notification via configured channel |
| `POST` | `/api/push/reminders/check` | Run reminder check immediately; returns `{"reminders_found": n}` |
| `GET` | `/api/audit/files` | List available audit log dates (own data) |
| `GET` | `/api/audit/day?date=` | Tool call entries for a specific date (own data) |
| `GET` | `/api/audit/recent` | Recent tool calls across days (admin) |
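
The two on-demand endpoints added to this table can be driven from a script. The following is a minimal sketch rather than anything shipped in this change: it only builds the request and parses the documented `{"reminders_found": n}` response shape. The helper names and the cookie value are hypothetical, and how a session cookie is obtained is not shown in this diff.

```python
import json
import urllib.request


def build_reminder_check_request(base_url: str, cookie: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the on-demand reminder check endpoint."""
    url = base_url.rstrip("/") + "/api/push/reminders/check"
    # `cookie` is a placeholder for whatever session cookie the login flow issues.
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"Cookie": cookie}
    )


def parse_reminders_found(body: str) -> int:
    """Extract the count from a response shaped like {"reminders_found": n}."""
    return int(json.loads(body)["reminders_found"])
```

Sending is left to the caller, for example `urllib.request.urlopen(req, timeout=15)`, followed by `parse_reminders_found(resp.read().decode())`.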

View File

@@ -1215,24 +1215,9 @@
inputEl.focus();
}
async function sendOrchestrate() {
const text = inputEl.value.trim();
if (!text || activeController) return;
inputEl.value = '';
syncHeight();
sendBtn.style.display = 'none';
stopBtn.style.display = 'flex';
headerEmoji.classList.add('processing');
activeController = new AbortController();
currentHistory.push({ role: 'user', content: text });
const userMsgDiv = addMessage('user', text);
scrollToBottom();
const thinkingDiv = addMessage('assistant thinking', '⚡ working…');
// Extracted so the retry button can call it without re-adding the
// user message to the DOM or currentHistory.
async function _doOrchestrate(text, thinkingDiv, userMsgDiv) {
try {
const res = await fetch('/orchestrate', {
method: 'POST',
@@ -1336,9 +1321,59 @@
thinkingDiv.textContent = 'Stopped.';
} else {
thinkingDiv.className = 'message error';
thinkingDiv.textContent = `Error: ${err.message}`;
thinkingDiv.innerHTML = '';
const errSpan = document.createElement('span');
errSpan.textContent = `Error: ${err.message}`;
thinkingDiv.appendChild(errSpan);
const retryBtn = document.createElement('button');
retryBtn.className = 'retry-btn';
retryBtn.textContent = '↺ Retry';
retryBtn.addEventListener('click', async () => {
if (currentHistory.at(-1)?.role === 'user') currentHistory.pop();
currentHistory.push({ role: 'user', content: text });
thinkingDiv.className = 'message assistant thinking';
thinkingDiv.textContent = '⚡ working…';
activeController = new AbortController();
sendBtn.style.display = 'none';
stopBtn.style.display = 'flex';
headerEmoji.classList.add('processing');
await _doOrchestrate(text, thinkingDiv, userMsgDiv);
activeController = null;
headerEmoji.classList.remove('processing');
sendBtn.style.display = 'block';
stopBtn.style.display = 'none';
inputEl.focus();
});
thinkingDiv.appendChild(retryBtn);
}
}
}
async function sendOrchestrate() {
const text = inputEl.value.trim();
if (!text || activeController) return;
inputEl.value = '';
syncHeight();
sendBtn.style.display = 'none';
stopBtn.style.display = 'flex';
headerEmoji.classList.add('processing');
activeController = new AbortController();
currentHistory.push({ role: 'user', content: text });
const userMsgDiv = addMessage('user', text);
scrollToBottom();
const thinkingDiv = addMessage('assistant thinking', '⚡ working…');
await _doOrchestrate(text, thinkingDiv, userMsgDiv);
activeController = null;
headerEmoji.classList.remove('processing');

View File

@@ -17,7 +17,7 @@ from google.genai import types
# ── Callable imports ──────────────────────────────────────────────────────────
from tools.web import search as _web_search, http_fetch as _http_fetch
from tools.web import search as _web_search, http_fetch as _http_fetch, web_read as _web_read
from tools.ae_knowledge import (
journal_list as _ae_journal_list,
journal_search as _ae_journal_search,
@@ -30,7 +30,7 @@ from tools.ae_knowledge import (
journal_entry_prepend as _ae_journal_entry_prepend,
)
from tools.ae_tasks import task_list as _ae_task_list
from tools.files import file_read as _file_read, file_list as _file_list, file_write as _file_write, session_search as _session_search
from tools.files import file_read as _file_read, file_list as _file_list, file_write as _file_write, session_search as _session_search, session_read as _session_read
from tools.system import (
shell_exec as _shell_exec,
claude_allow_dir as _claude_allow_dir,
@@ -90,8 +90,8 @@ import tools.agents as _mod_agents
# ── Tool categories — used by the Model Registry UI for grouped checkboxes ───
TOOL_CATEGORIES: dict[str, list[str]] = {
"Web": ["web_search", "http_fetch"],
"Files": ["file_read", "file_list", "file_write", "session_search"],
"Web": ["web_search", "http_fetch", "web_read"],
"Files": ["file_read", "file_list", "file_write", "session_read", "session_search"],
"Shell": ["shell_exec", "claude_allow_dir"],
"System": ["cortex_restart", "cortex_logs", "cortex_status", "cortex_update"],
"Tasks": ["task_list", "task_create", "task_update", "task_complete"],
@@ -116,6 +116,7 @@ TOOL_CATEGORIES: dict[str, list[str]] = {
_CALLABLES: dict[str, callable] = {
"web_search": _web_search,
"http_fetch": _http_fetch,
"web_read": _web_read,
"ae_journal_list": _ae_journal_list,
"ae_journal_search": _ae_journal_search,
"ae_journal_entry_read": _ae_journal_entry_read,
@@ -129,6 +130,7 @@ _CALLABLES: dict[str, callable] = {
"file_read": _file_read,
"file_list": _file_list,
"file_write": _file_write,
"session_read": _session_read,
"session_search": _session_search,
"shell_exec": _shell_exec,
"claude_allow_dir": _claude_allow_dir,

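Adding a tool touches both `TOOL_CATEGORIES` and `_CALLABLES`, and the tool count is repeated in several docs. A small consistency check could catch drift between the two registries. The sketch below is illustrative only: `find_registry_drift` is a hypothetical helper, and it assumes every callable is meant to appear in exactly one category, which this diff does not confirm.

```python
def find_registry_drift(
    categories: dict[str, list[str]], callables: dict[str, object]
) -> dict[str, list[str]]:
    """Report tools that are categorised but not callable, and the reverse."""
    categorised = {name for names in categories.values() for name in names}
    registered = set(callables)
    return {
        # Listed in a UI category but with no implementation wired up.
        "missing_callable": sorted(categorised - registered),
        # Implemented but not shown in any category.
        "uncategorised": sorted(registered - categorised),
    }
```

Run against the real dictionaries, an empty result for both keys would suggest the registries agree, and `len(callables)` gives the number to quote in the docs.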
View File

@@ -230,6 +230,34 @@ def _sync_file_write(path: str, content: str, mode: str) -> str:
_SEARCH_EXCERPT_CHARS = 150
async def session_read(date: str) -> str:
    """Read a full session log by date (YYYY-MM-DD).

    Returns the complete session log for that date. If the date is not found,
    lists the most recent available dates instead.
    Only reads the current user's own sessions (per-persona isolation via ContextVars).
    """
    return await asyncio.to_thread(_sync_session_read, date.strip())


def _sync_session_read(date: str) -> str:
    from persona import persona_path
    sessions_dir = persona_path() / "sessions"
    if not sessions_dir.exists():
        return "No session logs found."
    target = sessions_dir / f"{date}.md"
    if target.exists():
        content = target.read_text()
        return f"Session log for {date} ({len(content)} chars):\n\n{content}"
    available = sorted([f.stem for f in sessions_dir.glob("*.md")], reverse=True)
    if not available:
        return "No session logs found."
    recent = "\n".join(f" {d}" for d in available[:15])
    return f"No session log found for '{date}'. Available dates (most recent first):\n{recent}"


async def session_search(query: str, limit: int = 5) -> str:
    """Search past session logs for a keyword or phrase.
@@ -329,6 +357,22 @@ DECLARATIONS = [
required=["path", "content"],
),
),
types.FunctionDeclaration(
name="session_read",
description=(
"Read a full session log by date (YYYY-MM-DD). Returns the complete conversation "
"from that session — useful for continuity, recalling decisions, or reviewing "
"what was discussed on a specific day. If the date is not found, lists available dates. "
"Only reads this user's own sessions."
),
parameters=types.Schema(
type=types.Type.OBJECT,
properties={
"date": types.Schema(type=types.Type.STRING, description="Date in YYYY-MM-DD format (e.g. '2026-05-08')"),
},
required=["date"],
),
),
types.FunctionDeclaration(
name="session_search",
description=(
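
`_sync_session_read` interpolates the caller-supplied `date` directly into a file name. One possible hardening, offered as a sketch and not as part of this change, is to accept only strings that parse as a real `YYYY-MM-DD` calendar date before touching the filesystem. `is_valid_session_date` is a hypothetical helper, not a function in the `tools.files` module.

```python
import re
from datetime import datetime

# ASCII digits only, so look-alike Unicode digits are rejected.
_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_session_date(value: str) -> bool:
    """Return True only for a real calendar date written as YYYY-MM-DD."""
    if not _DATE_RE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Right shape, but not a date that exists (for example 2026-02-30).
        return False
    return True
```

A caller could fall through to the existing "available dates" listing whenever this returns `False`, which keeps the tool's current behaviour for mistyped dates.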

View File

@@ -1,5 +1,5 @@
"""
Web tools — search (DuckDuckGo) and direct HTTP fetch.
Web tools — search (DuckDuckGo), direct HTTP fetch, and clean content extraction.
"""
import asyncio
@@ -56,20 +56,25 @@ async def http_fetch(
    method: str = "GET",
    body: str | None = None,
    timeout: int = 15,
    max_chars: int = 8192,
) -> str:
    """Fetch a URL directly and return the response body.
    """Fetch a URL directly and return the raw response body.
    Unlike web_search, this hits a specific URL — useful for health checks,
    API probing, JSON endpoints, webhook testing, etc.
    Response body is capped at 8 KB.
    API probing, JSON endpoints, webhook testing, or reading raw page source.
    For readable article content, use web_read instead.
    Response body is capped at max_chars (default 8192, max 131072).
    """
    method = method.upper()
    timeout = min(max(int(timeout), 1), 60)
    max_chars = min(max(int(max_chars), 100), 131072)
    try:
        async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
            resp = await client.request(method, url, content=body)
            body_text = resp.text[:8192]
            return f"HTTP {resp.status_code} {resp.url}\n\n{body_text}"
            body_text = resp.text[:max_chars]
            truncated = len(resp.text) > max_chars
            suffix = f"\n\n[… truncated at {max_chars} chars]" if truncated else ""
            return f"HTTP {resp.status_code} {resp.url}\n\n{body_text}{suffix}"
    except httpx.HTTPError as e:
        return f"HTTP error: {e}"
    except Exception as e:
@@ -77,6 +82,39 @@ async def http_fetch(
        return f"Error: {e}"


async def web_read(url: str, max_chars: int = 16000) -> str:
    """Fetch a URL and extract clean readable text, stripping ads, navigation, and boilerplate.

    Uses trafilatura to extract the main article content — ideal for blog posts,
    documentation, news articles, and any page where you want the text without
    surrounding noise. Returns markdown-formatted output.
    For raw responses (JSON APIs, health checks), use http_fetch instead.
    """
    max_chars = min(max(int(max_chars), 1000), 131072)
    return await asyncio.to_thread(_sync_web_read, url, max_chars)


def _sync_web_read(url: str, max_chars: int) -> str:
    try:
        import trafilatura
    except ImportError:
        return "web_read requires trafilatura — run: pip install trafilatura"
    downloaded = trafilatura.fetch_url(url)
    if downloaded is None:
        return f"Failed to download content from: {url}"
    text = trafilatura.extract(downloaded, output_format="markdown", include_links=True, url=url)
    if not text:
        text = trafilatura.extract(downloaded, url=url)
    if not text:
        return f"Could not extract readable content from: {url}"
    if len(text) > max_chars:
        text = text[:max_chars] + f"\n\n[… truncated at {max_chars} chars — pass a larger max_chars (up to 131072) to see more]"
    return f"Content from {url}:\n\n{text}"


DECLARATIONS = [
types.FunctionDeclaration(
name="web_search",
@@ -96,10 +134,10 @@ DECLARATIONS = [
types.FunctionDeclaration(
name="http_fetch",
description=(
"Fetch a specific URL and return the response. Unlike web_search, this hits "
"Fetch a specific URL and return the raw response body. Unlike web_search, this hits "
"a direct URL — useful for health checks, JSON API endpoints, webhook testing, "
"or reading a specific page when you already know the URL. "
"Response body is capped at 8 KB."
"or inspecting raw page source. For readable article/doc content, use web_read instead. "
"Response body is capped at max_chars (default 8192, max 131072)."
),
parameters=types.Schema(
type=types.Type.OBJECT,
@@ -108,6 +146,25 @@ DECLARATIONS = [
"method": types.Schema(type=types.Type.STRING, description="HTTP method: GET (default), POST, HEAD"),
"body": types.Schema(type=types.Type.STRING, description="Optional request body (for POST requests)"),
"timeout": types.Schema(type=types.Type.INTEGER, description="Request timeout in seconds (default 15, max 60)"),
"max_chars": types.Schema(type=types.Type.INTEGER, description="Max characters to return (default 8192, max 131072)"),
},
required=["url"],
),
),
types.FunctionDeclaration(
name="web_read",
description=(
"Fetch a URL and extract clean readable text, stripping ads, navigation, sidebars, "
"and other boilerplate. Returns the main article/document content as markdown. "
"Use this for blog posts, documentation, news articles, GitHub READMEs, or any page "
"where you want the content without surrounding noise. "
"For raw HTTP responses (JSON APIs, health checks, source inspection), use http_fetch."
),
parameters=types.Schema(
type=types.Type.OBJECT,
properties={
"url": types.Schema(type=types.Type.STRING, description="Full URL to fetch and extract"),
"max_chars": types.Schema(type=types.Type.INTEGER, description="Max characters to return (default 16000, max 131072)"),
},
required=["url"],
),
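
Both `http_fetch` and `web_read` follow the same two steps in the hunks above: clamp the requested `max_chars` into an allowed range, then cut the text and append a marker only if something was dropped. A standalone sketch of that pattern follows. The helper names are hypothetical and the marker text is simplified; the bounds in the usage note are the ones visible in this diff.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Force value into the inclusive range [low, high]."""
    return min(max(int(value), low), high)


def truncate_with_notice(text: str, max_chars: int) -> str:
    """Cut text to max_chars, adding a marker only when content was dropped."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + f"\n\n[… truncated at {max_chars} chars]"
```

With the limits shown in the diff, `clamp(requested, 100, 131072)` would mirror `http_fetch` and `clamp(requested, 1000, 131072)` would mirror `web_read`. Clamping rather than rejecting out-of-range values means a model that passes an oversized number still gets a usable response.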

View File

@@ -129,16 +129,24 @@ User-defined scheduled jobs stored in `home/{user}/persona/{name}/CRONS.json`. R
## Notification Channel Config
`notification_channel` in `channels.json` sets the default outbound channel for all proactive messages (distill alerts, cron message/brief jobs):
`notification_channel` in `channels.json` sets the default outbound channel for all proactive messages (distill alerts, cron jobs, reminder checks):
```json
{
"notification_channel": "nextcloud",
...
"notification_channel": "web_push",
"notification_email": "user@example.com",
"nextcloud": { "notification_room": "<token>" },
"google_chat": { "outbound_webhook": "https://..." }
}
```
If absent, defaults to `nextcloud` if configured. Currently only NC Talk is supported for outbound; Google Chat outbound is a future item.
Supported channels: `web_push` (browser push via VAPID), `email`, `nextcloud` (NC Talk), `google_chat`. Configured via **Settings → Notifications** (`/settings/notifications`).
**Proactive notification triggers:**
- **Daily 09:00** — `_run_reminder_check()` in `scheduler.py`: reads due/overdue reminders per persona, fires `notify()` with a formatted summary
- **Memory distillation** — `_run_mid()` / `_run_long()` call `notify()` on completion
- **Cron jobs** — `message` / `brief` job types call `notify()` directly
- **On-demand** — `POST /api/push/test` (test notification) and `POST /api/push/reminders/check` (immediate reminder check)
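
As an illustration of the config shape above, the following sketch reads `notification_channel` from a `channels.json` payload and accepts only the four channel names listed in this section. It is an assumption-laden example, not the project's `notify()` logic: `read_notification_channel` is a hypothetical helper, and what Cortex actually does when the key is missing or unrecognised is not shown in this diff.

```python
import json
from typing import Optional

# Channel names as listed in the section above.
SUPPORTED_CHANNELS = ("web_push", "email", "nextcloud", "google_chat")


def read_notification_channel(raw_json: str) -> Optional[str]:
    """Return the configured channel name, or None if missing or unsupported."""
    config = json.loads(raw_json)
    channel = config.get("notification_channel")
    return channel if channel in SUPPORTED_CHANNELS else None
```

Returning `None` instead of silently picking a default leaves the fallback decision to the caller.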
---

View File

@@ -256,3 +256,61 @@ Rather than a single Cortex instance, each device in the fleet runs its own inst
- Session continuity — does a conversation that starts on one node stay there, or can it migrate?
The Syncthing-synced `home/` directory and shared `model_registry.json` already provide a natural foundation — instances share persona memory and context without a central DB.
---
## 11. LLM Wiki — Persistent Knowledge Compilation (Karpathy Pattern)
**Status:** Concept — no design yet. Inspired by [Karpathy's llm-wiki](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) gist.
**Core idea:** Instead of treating AE Journals as an archive you retrieve from, evolve them into a **living wiki** that the LLM incrementally builds and maintains. When a new source is added, the LLM doesn't just index it — it reads it, extracts key information, and integrates it into the existing wiki: updating entity pages, revising topic summaries, flagging contradictions, strengthening or challenging the evolving synthesis. Knowledge is compiled once and kept current, not re-derived on every query.
This is a philosophical shift from our current approach (RAG/retrieval) toward **compounding knowledge** — the wiki gets richer with every source added and every question asked.
### Three-Layer Architecture
```
Raw Sources (immutable)
    ↓
    → LLM reads, extracts, cross-references
Wiki (LLM-maintained markdown)  ← the persistent artifact
    → Human reads, LLM writes
Schema (CLAUDE.md / AGENTS.md)  ← configuration + conventions
```
1. **Raw sources** — curated, immutable originals (articles, papers, session logs, transcripts). LLM reads from them, never modifies them.
2. **The wiki** — directory of LLM-generated markdown files: summaries, entity pages, concept pages, comparisons, synthesis. The LLM owns this layer entirely. Creates pages, updates them when new sources arrive, maintains cross-references.
3. **Schema** — a configuration document (analogous to our `PROTOCOLS.md`) that tells the LLM how the wiki is structured, what conventions to follow, and what workflows to use when ingesting sources or answering questions. Co-evolved with the human over time.
### Operations
**Ingest.** Drop a new source into the raw collection and tell the LLM to process it. Flow: LLM reads source → discusses key takeaways with human → writes summary page → updates index → updates relevant entity/concept pages (a single source might touch 10-15 pages) → appends to log. Human stays involved, guiding emphasis.
**Query.** Ask questions against the wiki. LLM reads the index to find relevant pages, drills in, synthesizes an answer with citations. **Key insight: good answers get filed back into the wiki as new pages.** A comparison table, an analysis, a connection discovered — these are valuable and shouldn't disappear into chat history.
**Lint.** Periodic health check: contradictions between pages, stale claims superseded by newer sources, orphan pages with no inbound links, missing cross-references, data gaps that could be filled with a web search.
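
One of the lint checks named above, orphan detection, is simple enough to sketch without an LLM. The example below assumes pages cross-reference each other with relative markdown links such as `[Topic](topic.md)`, and that the index page links to everything by design, so its links do not count as inbound. Both are assumptions, since no wiki format has been designed yet, and `find_orphan_pages` is a hypothetical name.

```python
import re

# Matches the target of links like [Topic](topic.md) or [Topic](topic.md#section).
_LINK_RE = re.compile(r"\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def find_orphan_pages(
    pages: dict[str, str], roots: tuple[str, ...] = ("index.md",)
) -> list[str]:
    """Return pages that no non-root page links to (roots themselves are exempt)."""
    linked: set[str] = set()
    for name, body in pages.items():
        if name in roots:
            continue  # the index lists every page, so it proves nothing
        for target in _LINK_RE.findall(body):
            if target != name:  # a self-link is not an inbound link
                linked.add(target)
    return sorted(p for p in pages if p not in linked and p not in roots)
```

Here `pages` maps a file name to its markdown body. Contradiction and staleness checks would still need the LLM; this covers only the structural part of a lint pass.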
### Index and Log (Two Navigation Files)
**`index.md`** — content-oriented catalog. Every wiki page listed with link, one-line summary, and optional metadata (date, source count). Organized by category. LLM updates on every ingest. At moderate scale (~100 sources, ~hundreds of pages), this replaces the need for embedding-based RAG.
**`log.md`** — chronological, append-only record of what happened and when (ingests, queries, lint passes). Each entry starts with a consistent prefix (e.g. `## [2026-04-02] ingest | Article Title`) making it parseable with simple tools like `grep "^## \[" log.md | tail -5`.
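
The same prefix that makes `log.md` greppable also makes it easy to parse in code. A minimal sketch, assuming entries follow the `## [YYYY-MM-DD] operation | Title` shape from the example above and that the operation is a single word. `LogEntry`, `parse_log_heading`, and `last_entries` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    date: str
    operation: str
    title: str


_ENTRY_RE = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def parse_log_heading(line: str) -> Optional[LogEntry]:
    """Parse one heading line, or return None if it is not a log entry."""
    match = _ENTRY_RE.match(line.rstrip("\n"))
    return LogEntry(*match.groups()) if match else None


def last_entries(log_text: str, count: int = 5) -> list[LogEntry]:
    """Return the final `count` entries, oldest first (like grep piped to tail)."""
    entries = [e for e in map(parse_log_heading, log_text.splitlines()) if e]
    return entries[-count:]
```

Because the log is append-only, the last entries in file order are also the most recent ones, so no sorting is needed.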
### Applicability to Cortex / Inara
This pattern maps naturally to several existing concepts:
| Karpathy Concept | Cortex Equivalent | Gap |
|---|---|---|
| Raw sources | Session logs, imported docs | No curated raw-source collection yet |
| Wiki pages | AE Journals | Journals are entry-based, not interlinked-wiki-based |
| Index + Log | No equivalent | Would need `wiki_index.md` and `wiki_log.md` |
| Schema/Protocols | PROTOCOLS.md, OPERATIONS.md | Not configured for wiki maintenance workflows |
| Lint operation | No equivalent | No periodic wiki health-check exists |
| Answers filed back | Session chat history | Answers are lost after session (unless distilled) |
| Obsidian as IDE | Cortex UI / Files panel | Files panel could serve as the browsing surface |
**Next steps (if pursued):**
1. Design the wiki directory structure within `agents_sync/` — separate from session logs and memory files
2. Define the schema document — what goes in a wiki page, cross-reference format, category taxonomy
3. Build an ingest tool/script that reads a source and updates wiki pages (LLM-driven)
4. Build a lint cron job that health-checks the wiki periodically
5. Consider Obsidian compatibility for human browsing of the wiki graph

View File

@@ -72,7 +72,7 @@ Details: [`ARCH__BACKENDS.md`](ARCH__BACKENDS.md) | [`ARCH__PERSONA.md`](ARCH__P
| `email_utils.py` | SMTP invite emails |
| `persona_template.py` | Bootstrap a new persona directory from templates |
| `routers/` | One file per endpoint group — `chat`, `orchestrator`, `auth`, `files`, `ui`, `settings`, `local_llm`, `distill`, `audit`, `usage`, `push`, `help`, `onboarding`, `auth_google`, `nextcloud_talk`, `google_chat` |
| `tools/` | Orchestrator tool implementations — `web`, `tasks`, `scratch`, `reminders`, `cron`, `system`, `notify`, `ae_journals`, `ae_tasks`, `agent_notes` |
| `tools/` | Orchestrator tool implementations — `web` (search/fetch/web_read), `files` (file_read/write/session_read/search), `tasks`, `scratch`, `reminders`, `cron`, `system`, `notify`, `ae_journals`, `ae_tasks`, `agent_notes`, `agents` (spawn_agent) |
| `static/` | Web UI — `index.html`, `app.js`, `style.css`, `login.html`, `setup.html`, `HELP.md`, `local_llm.html`, `settings.html` |
| `tests/` | pytest suite |

View File

@@ -1,7 +1,7 @@
# Cortex / Inara — Master Index
> Start here. This document is a map, not a manual.
> Last updated: 2026-05-06
> Last updated: 2026-05-09
>
> **Documentation philosophy:** Cortex is a no-black-box system. Docs must match reality.
> Update docs before implementing significant changes. Verify they still match after.
@@ -26,7 +26,7 @@ Cortex is a self-hosted personal AI platform. It routes messages from any input
| Claude backend | ✅ Live | Primary — via Claude Code CLI |
| Gemini backend | ✅ Live | Fallback — via Gemini CLI |
| Local backend | ✅ Live | Open WebUI/Ollama on scott_gaming; per-user multi-model config |
| Gemini orchestrator | ✅ Live | Tool loop → Claude response, ⚡ toggle in UI (40 tools) |
| Gemini orchestrator | ✅ Live | Tool loop → Claude response, ⚡ toggle in UI (47 tools) |
| Local orchestrator | ✅ Live | OpenAI-compatible ReAct loop; used when orchestrator role → local model |
| Model registry V2 | ✅ Live | Providers (Anthropic/Google/Local), multi-account Gemini, role assignments |
| Memory distillation | ✅ Live | Short (daily) / Mid (weekly) / Long (monthly) |
@@ -36,6 +36,8 @@ Cortex is a self-hosted personal AI platform. It routes messages from any input
| Tool audit log | ✅ Live | Every orchestrator tool call logged to `home/{user}/tool_audit/` |
| Token usage tracking | ✅ Live | Per-user daily buckets in `home/{user}/usage.json`; visible in Settings |
| Web push notifications | ✅ Live | VAPID push; `web_push` orchestrator tool; subscribe via ☰ menu |
| Proactive notifications | ✅ Live | Daily reminder check (09:00); distill/cron completion alerts; dedicated `/settings/notifications` page |
| Sub-agent spawning | ✅ Live | `spawn_agent` tool — synchronous sub-agents via any configured model |
| Agent private notes | ✅ Live | `AGENT_NOTES.md` — orchestrator-only notepad; 3 rolling backups; user-visible as read-only |
| Distill safety | ✅ Live | Per-persona asyncio lock, per-endpoint cooldowns, Rebuild option |
| Guided onboarding | ✅ Live | Setup Step 3 for OpenRouter; existing-user banner; settings quick-link |

View File

@@ -1,7 +1,7 @@
# Cortex — Roadmap
> Phases and priorities. For active tasks see `TODO__Agents.md`.
> Last updated: 2026-04-29
> Last updated: 2026-05-09
---
@@ -39,7 +39,12 @@
- ✅ Session search (full-text across past session logs)
- ✅ Distill notifications (NC Talk after mid/long runs)
- ✅ Local backend for distillation (DISTILL_BACKEND_MID/LONG in .env)
- [ ] **Local orchestrator** — ReAct tool loop using local model (High priority — see `TODO__Agents.md`)
- ✅ Local orchestrator — OpenAI-compatible ReAct loop; fires when orchestrator role → local model
- ✅ Web push notifications — VAPID; `web_push` tool; PWA-installable; subscribe via ☰ menu
- ✅ Proactive notifications — daily reminder check (09:00); `notify()` routes to any configured channel; dedicated settings page
- ✅ Sub-agent spawning — `spawn_agent` tool; per-host concurrency limit; Gemini API + local OpenAI backends
- ✅ Web content extraction — `web_read` via trafilatura; strips ads/nav/boilerplate; 128K cap
- ✅ Session log reader — `session_read(date)` tool; complements `session_search`
- [ ] Knowledge import — markdown → AE Journals (import script)
- [ ] Dev agent pipeline — specialist agents + supervisor + approval gate
- [ ] Gitea webhook integration + Actions CI

View File

@@ -96,8 +96,11 @@ system prompt by `context_loader.py` at all tiers.
- Params: `conversation_token: str`, `limit: int = 20`
- Returns last N messages with sender + timestamp
- Admin-only (requires NC Talk API credentials from channels.json)
- [ ] **`http_post`** — POST to external URLs with allowlist
- [ ] **`task_list` priority filter** — add `priority` param alongside existing `status`
- [ ] **`http_fetch` max_chars** — optional param, default 8192, cap at 32768
- [x] **`http_fetch` max_chars** — optional param, default 8192, cap at 131072 — 2026-05-09
- [x] **`web_read(url, max_chars=16000)`** — clean article extraction via trafilatura; strips ads/nav/boilerplate, returns markdown — 2026-05-09
- [x] **`session_read(date)`** — read a full session log by YYYY-MM-DD date; lists available dates if not found — 2026-05-09
### [Channel] Proactive notifications ✅ — 2026-05-08
Inara reaches out on her own initiative via NC Talk, Google Chat, email, or browser push.
@@ -108,6 +111,9 @@ Inara reaches out on her own initiative via NC Talk, Google Chat, email, or brow
- [x] `scheduler.py` — distill_mid and distill_long already call `notify()` on completion
- [x] Settings UI — "Browser Push Notification" option added to Notification Channel selector
- [x] `notification_channel` accepts `"web_push"` in `routers/settings.py`
- [x] `GET /settings/notifications` — dedicated Notifications page (channel form + test buttons); Settings page now shows a link card
- [x] `POST /api/push/test` + `POST /api/push/reminders/check` — on-demand test endpoints
- [x] `push_utils.py` — fixed `pywebpush` 2.x key deserialisation (use `Vapid.from_pem()` instead of passing PEM string)
### [UI] File attachments in chat
Upload an image or document inline and have it flow into context. Natural workflow