feat: web_read (trafilatura), session_read, http_fetch max_chars

web_read(url, max_chars=16000) — fetches a URL and extracts clean article
text via trafilatura, stripping ads/nav/boilerplate. Returns markdown.

session_read(date) — reads a full session log by YYYY-MM-DD date; lists
available dates if the requested one is not found.

http_fetch gains a max_chars param (default 8192, max 32768) so the cap
is configurable instead of hardcoded.

Tool count: 45 → 47.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Scott Idem
2026-05-09 13:04:24 -04:00
parent a99ebb8c30
commit 7c3291960a
4 changed files with 119 additions and 13 deletions

View File

@@ -19,6 +19,9 @@ python-multipart>=0.0.9 # required by FastAPI for Form() data
# Async HTTP client — used for local OpenAI-compatible backend (Open WebUI / Ollama)
httpx>=0.27.0
# Web content extraction — strips ads/nav/boilerplate, returns clean article text
trafilatura>=1.6.0
# OpenAI-compatible client — tool calling for OpenRouter / LiteLLM / any OAI-compat host
openai>=1.0.0