Text files (.md, .py, .js, .json, etc.): read client-side and injected
into the message body as a fenced code block — works with all backends
with zero model capability requirements.
Images (PNG/JPG/WebP/GIF, max 5 MB): encoded as base64 data URL on the
client and sent as a separate attachment field. Backend formats them as
OpenAI multimodal content (text + image_url) for local_openai backends.
Claude CLI and Gemini CLI see the text message with a "📎 filename.png"
note; image data is never written to session history.
- index.html: 📎 button + hidden file input in mode-select row;
attachment-row preview area with thumbnail (images) or filename chip
- app.js: _resolveAttachment(), file reader, clearAttachment();
sendMessage/sendOrchestrate updated to allow no-text sends when a
file is pending; attachment spread into chat payload for images
- chat.py: Attachment model; attachment field on ChatRequest;
llm_attachment extracted in _stream_chat and passed to complete()
- llm_client.py: attachment param through complete()/_dispatch()/_local();
_local() builds multimodal content array for vision calls
- style.css: #attach-btn, #attachment-row, #attachment-preview, thumb
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>