feat(tools): add send_document tool for file sharing to channels by penso · Pull Request #464 · moltis-org/moltis

penso · 2026-03-23T10:41:04Z

Summary

Add send_document tool for sending local files (PDF, CSV, DOCX, TXT, JSON, ZIP, etc.) to conversation channels (Telegram, Discord, Slack)
Extract shared file_io module from send_image for consistent host/sandbox file reads with size validation and TOCTOU protection
Deduplicate MIME maps into moltis_media::mime (single source of truth for extension↔MIME mapping)
Add filename field to MediaAttachment so channel outbounds use original filenames instead of generic placeholders
Add Content-Disposition header to media API with proper filename sanitization
Add container_cli() with OnceLock cache for podman/docker auto-detection
Normalize podman localhost/ prefix in sandbox image listings
Document card UI component with file-type icon, filename, size, and download/open button

Resurrects the good parts from #353, cleanly rebased on current main.

Closes

Validation

Completed

cargo check passes
cargo +nightly-2025-11-30 fmt --all -- --check passes
cargo +nightly-2025-11-30 clippy --workspace --all-targets -- -D warnings passes
biome check --write passes
send_document unit tests (21 tests pass)
file_io unit tests (7 tests pass)
media MIME tests (13 tests pass)
common types tests (15 tests pass)
web API content_disposition tests (11 tests pass)

Remaining

cargo test (full suite)
E2E tests: cd crates/web/ui && npx playwright test e2e/specs/send-document.spec.js
./scripts/local-validate.sh

Manual QA

Start gateway, open web UI chat
Ask the LLM to create and send a PDF file — verify document card renders with icon, filename, size, and Open button
Ask the LLM to send a ZIP file — verify Download button (not Open)
Verify documents arrive in connected channels (Telegram/Discord/Slack) with correct filename
Check /api/sessions/<key>/media/<file> returns correct Content-Disposition header

Add a new send_document tool that allows sending local files (PDF, CSV, DOCX, TXT, JSON, ZIP, etc.) to conversation channels. Key changes: - New SendDocumentTool with blocked-extension security list and MIME validation via moltis_media::mime - Extract shared file_io module from send_image for host/sandbox reads with size validation and TOCTOU protection - Media-ref persistence path: saves files to session media dir, avoiding base64 in JSON/WS/persistence; falls back to data URI without store - Add filename field to MediaAttachment for original filenames in channel outbounds (Discord, Slack, Telegram) - Document card UI component with icon, filename, size, and download/open button - Deduplicate MIME maps: single source of truth in moltis_media::mime, used by api.rs, send_image, and send_document - Cache container_cli() with OnceLock to avoid repeated subprocess spawns - Add Content-Disposition header to media API with proper sanitization - Podman support for image_cache, sandbox listing, swift-bridge, and web API check-packages - E2E tests for document card rendering (PDF, ZIP, CSV) Closes #337 Closes #332 Entire-Checkpoint: 5b7cc2977dc0

codspeed-hq · 2026-03-23T10:44:11Z

Merging this PR will not alter performance

✅ 39 untouched benchmarks
⏩ 5 skipped benchmarks¹

_{Comparing feat/send-document-v2 (b1467bc) with main (346e888)}

5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

codecov · 2026-03-23T10:49:32Z

Codecov Report

❌ Patch coverage is 54.93671% with 178 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/chat/src/lib.rs	17.64%	140 Missing ⚠️
crates/web/src/api.rs	80.95%	16 Missing ⚠️
crates/gateway/src/server.rs	0.00%	9 Missing ⚠️
crates/slack/src/outbound.rs	0.00%	5 Missing ⚠️
crates/discord/src/outbound.rs	0.00%	4 Missing ⚠️
crates/telegram/src/outbound.rs	0.00%	4 Missing ⚠️

📢 Thoughts on this report? Let us know!

greptile-apps · 2026-03-23T10:50:06Z

Greptile Summary

This PR adds a send_document tool that allows the LLM to share local files (PDF, CSV, DOCX, TXT, JSON, ZIP, etc.) to connected conversation channels (Telegram, Discord, Slack), with a matching document card UI. The implementation is well-engineered: it extracts a shared file_io module from send_image, deduplicates MIME mapping into moltis_media::mime, adds a filename field to MediaAttachment, introduces Content-Disposition handling for the media API, and replaces hard-coded docker references with auto-detected podman/docker CLI selection cached via OnceLock.

Key points from the review:

Security – extension blocklist gap: .mjs and .cjs (Node.js ES / CommonJS module files) are absent from BLOCKED_EXTENSIONS despite .js being blocked. Both are directly executable with node and carry identical risk.
UX inconsistency – "Open" button for HTML files: The isText check in helpers.js matches text/html, rendering an "Open" button for HTML documents. The server correctly forces Content-Disposition: attachment for HTML, so clicking "Open" triggers a download rather than opening the file — misleading to users.
Non-ASCII filenames in Content-Disposition: The filename sanitization strips control characters but leaves Unicode characters in the quoted filename parameter, which violates RFC 7230. Browsers handle non-ASCII in quoted-string filename inconsistently; RFC 5987/8187 filename* encoding should be used for non-ASCII names.
The startup warmup of the container_cli() OnceLock via spawn_blocking correctly resolves the prior async-blocking concern.
The XSS concern from serving text/html inline has been correctly addressed by removing it from the inline disposition list.

Confidence Score: 4/5

Safe to merge with minor fixes; the .mjs/.cjs gap is the only logical issue and has limited practical impact in a controlled deployment.
The PR is well-tested (55+ unit tests), cleanly architected, and the previous critical concerns (async blocking, stored XSS) have been fixed. The remaining issues are: one logic gap (missing .mjs/.cjs in the blocklist), one UX mismatch (HTML "Open" button), and one RFC compliance issue (non-ASCII Content-Disposition filenames). None of these are blocking for most deployments but the blocklist gap warrants a quick fix before merge.
crates/tools/src/send_document.rs (blocked extensions list), crates/web/src/assets/js/helpers.js (isText check), crates/web/src/api.rs (Content-Disposition filename encoding)

Important Files Changed

Filename	Overview
crates/tools/src/send_document.rs	New `send_document` tool: well-structured with good security practices (extension blocklist, size limits, TOCTOU guard, sandbox routing). One gap: `.mjs`/`.cjs` (Node.js module extensions) are not blocked alongside `.js`.
crates/tools/src/file_io.rs	New shared file-reading module extracted from `send_image`. Clean design with size validation, TOCTOU protection, and proper sandbox routing. Shell quoting via `shell_single_quote` is correct and well-tested.
crates/web/src/api.rs	Adds `Content-Disposition` header with filename sanitization and inline/attachment policy. HTML correctly forced to attachment. Non-ASCII characters in filenames not sanitized per RFC 7230/5987, which could cause browser inconsistencies.
crates/web/src/assets/js/helpers.js	Adds `renderDocument` function for the document card UI. `isText` check includes `text/html`, showing a misleading "Open" button for HTML files despite the server sending `attachment` disposition.
crates/chat/src/lib.rs	Adds document dispatch logic alongside the existing screenshot path. Cleanly separates `document_ref` (media-dir) and legacy `document` (data URI) paths. Properly uses `tokio::spawn` for async channel dispatch without blocking the main loop.
crates/tools/src/sandbox/containers.rs	Adds `container_cli()` with `OnceLock` cache, warmed at startup via `spawn_blocking`. Podman `localhost/` prefix normalization is correct. Previous async-blocking concern has been addressed.
crates/media/src/mime.rs	Expands MIME mapping tables and adds `mime_from_extension` delegating to `mime_guess`. Good deduplication of the previously scattered MIME maps. Well-tested with round-trip assertions.
crates/common/src/types.rs	Adds optional `filename` field to `MediaAttachment` with backward-compatible serde defaults. Straightforward and well-tested.

Sequence Diagram

sequenceDiagram
    participant LLM as LLM Agent
    participant Tool as SendDocumentTool
    participant FileIO as file_io
    participant Store as SessionStore
    participant Chat as chat::run_with_tools
    participant Outbound as ChannelOutbound
    participant API as /api/sessions/.../media

    LLM->>Tool: execute({ path, caption })
    Tool->>Tool: validate extension (blocklist + MIME check)
    Tool->>FileIO: read_file_for_session(router, session, path)
    alt sandboxed session
        FileIO->>FileIO: read_sandbox_file (base64 via exec)
    else host session
        FileIO->>FileIO: read_host_file (size check + TOCTOU guard)
    end
    FileIO-->>Tool: Vec<u8>

    alt SessionStore available
        Tool->>Store: save_media(session, uuid_filename, bytes)
        Store-->>Tool: media_ref ("media/main/abc_report.pdf")
        Tool-->>Chat: { document_ref, mime_type, filename, size_bytes }
        Chat->>Store: read_media(session, ref_filename)
        Store-->>Chat: Vec<u8>
        Chat->>Chat: base64-encode → data URI
    else no SessionStore
        Tool-->>Chat: { document: "data:...", filename }
    end

    Chat->>Outbound: send_media(account, chat, payload, reply_to)
    Outbound-->>LLM: (async, fire-and-forget via tokio::spawn)

    Note over API: Media serving
    API->>Store: read_media(session, filename)
    Store-->>API: Vec<u8>
    API-->>API: mime_from_extension → Content-Type
    API-->>API: media_content_disposition → Content-Disposition
    API-->>LLM: file bytes with headers

Comments Outside Diff (3)

crates/tools/src/send_document.rs, line 1155-1158 (link)

.mjs and .cjs missing from blocked extensions

js is correctly blocked, but .mjs (ES module syntax) and .cjs (CommonJS module syntax) are absent from the list. Both are directly executable via node file.mjs / node file.cjs and carry the same risk as a .js file. A recipient who receives a .mjs attachment can run it immediately with Node.js.
crates/web/src/assets/js/helpers.js, line 2227-2236 (link)

isText matches text/html, showing misleading "Open" button

(mimeType || "").startsWith("text/") is true for text/html, so HTML documents rendered in the document card will show an "⬆ Open" button with target="_blank". However, the server deliberately sends Content-Disposition: attachment for text/html (to prevent stored XSS). When a user clicks "Open", the browser will trigger a download instead of opening the file — a confusing mismatch between the button label and the actual behavior.

Consider explicitly excluding text/html from the inline branch:
crates/web/src/api.rs, line 2011-2016 (link)

Non-ASCII filenames in Content-Disposition are not RFC-compliant

The current sanitization strips only a few ASCII control characters (", \n, \r, ;, \) but leaves arbitrary non-ASCII (Unicode) characters in the filename quoted-string. RFC 7230 requires HTTP header field values to be composed of visible ASCII characters (or horizontal tab). Non-ASCII bytes embedded in a quoted string are technically invalid and are handled inconsistently across browsers — some garble the name, some drop it, and some crash the parse.

The RFC 5987 / RFC 8187 filename* parameter with UTF-8''%XX percent-encoding is the correct mechanism for non-ASCII filenames. A minimal fix is to strip (or percent-encode) all non-ASCII characters from the quoted filename parameter, while adding a filename* parameter for Unicode names:
```
// ASCII-only fallback filename
let ascii_name: String = safe_name.chars().filter(|c| c.is_ascii()).collect();
if ascii_name == safe_name {
    format!("{disposition}; filename=\"{safe_name}\"")
} else {
    let encoded: String = safe_name
        .bytes()
        .flat_map(|b| {
            if b.is_ascii_alphanumeric() || b".-_~".contains(&b) {
                vec![b as char]
            } else {
                format!("%{b:02X}").chars().collect()
            }
        })
        .collect();
    format!(
        "{disposition}; filename=\"{ascii_name}\"; filename*=UTF-8''{encoded}"
    )
}
```

_{Reviews (4): Last reviewed commit: "fix(e2e): use system-event RPC to dispat..." | Re-trigger Greptile}

greptile-apps · 2026-03-23T10:50:09Z

+pub fn container_cli() -> &'static str {
+    static CLI: std::sync::OnceLock<&str> = std::sync::OnceLock::new();
+    CLI.get_or_init(|| {
+        if is_cli_available("podman") {
+            "podman"
+        } else {
+            "docker"
+        }
+    })


Blocking subprocess in OnceLock init from async context

container_cli() calls is_cli_available("podman"), which uses std::process::Command (blocking I/O) inside OnceLock::get_or_init. This function is now called from DockerImageBuilder::new() (a sync constructor) and directly from api_check_packages_handler (an async Axum handler). On the first invocation, it will block whichever Tokio worker thread happens to run it.

CLAUDE.md states "never use block_on inside async context" — the spirit is the same: don't block async workers with synchronous I/O. Consider using tokio::task::spawn_blocking for the detection call, or performing the detection eagerly at server startup (sync context before the runtime starts) rather than lazily inside the lock.

// In async handlers, the call should be deferred: let cli = tokio::task::spawn_blocking(container_cli).await?;

Or, initialize container_cli() from a known-sync startup path (e.g., during prepare_gateway before the first .await) so the OnceLock is already warm before any async context hits it.

Fixed — warmed the container_cli() OnceLock at startup via spawn_blocking in spawn_post_listener_warmups. By the time any async handler runs, the lock is already initialized and returns instantly.

Replaces the hand-maintained 30-arm match in mime_from_extension with mime_guess::from_ext(), keeping manual overrides only for extensions the crate doesn't cover (text, log, ppm). This broadens format support automatically and reduces maintenance surface. Entire-Checkpoint: cf6ef2abee12

Avoids blocking an async worker thread on the first call to container_cli() by eagerly initializing the OnceLock from spawn_post_listener_warmups. Subsequent calls from async handlers just read the cached value. Entire-Checkpoint: 15dd0bc8c257

greptile-apps · 2026-03-23T11:11:06Z

+        || content_type.starts_with("audio/")
+        || matches!(
+            content_type,
+            "application/pdf" | "text/plain" | "text/html" | "text/csv" | "text/markdown"


Serving text/html inline enables stored XSS

text/html is included in the inline disposition list, which tells the browser to render the file directly from the same application origin. Since the LLM can generate and save arbitrary HTML files via send_document (.html is in the supported extensions), this creates a stored XSS path:

An adversarial prompt injection tricks the LLM into generating an HTML file with malicious <script> content.

send_document saves it to the session media directory.

A user clicks "Open" in the document card — their browser renders the HTML on the app's own origin.

The injected script runs with full access to localStorage, cookies, and the session API.

text/csv and text/markdown are less risky because browsers don't execute them, but text/html is the primary XSS attack surface. Consider removing it from the inline list so HTML files always trigger a download:

Suggested change

"application/pdf" | "text/plain" | "text/html" | "text/csv" | "text/markdown"

"application/pdf" | "text/plain" | "text/csv" | "text/markdown"

Fixed — removed text/html from the inline disposition list. HTML files now always trigger a download (attachment) to prevent stored XSS.

greptile-apps · 2026-03-23T11:11:07Z

+const BLOCKED_EXTENSIONS: &[&str] = &[
+    "exe", "bat", "cmd", "com", "msi", "scr", "pif", // Windows executables
+    "sh", "bash", "zsh", "csh", "ksh", // Shell scripts
+    "dll", "so", "dylib", // Shared libraries
+    "app", "dmg", "pkg", // macOS
+    "deb", "rpm", // Linux packages
+    "ps1", "psm1", "psd1", // PowerShell
+    "vbs", "vbe", "js", "jse", "wsf", "wsh", // Script engines
+    "reg", "inf", // Windows registry / setup
+];
+
+/// Document-sending tool.
+#[derive(Default)]


Common scripting languages missing from blocked list

The blocked list targets Windows executables, Unix shell scripts, and Windows script engines, but omits several common interpreted languages that are also executable as scripts on many platforms:

py / pyw — Python (directly executable via python, python3, or #!)

php — PHP scripts

rb — Ruby

pl — Perl

lua — Lua

tcl — Tcl

On a typical Linux/macOS host, python script.py or a #!/usr/bin/env python shebang can execute these just as effectively as a .sh file. If the intent of this list is to prevent sharing executable/scriptable files that could be run on the recipient's machine, these formats should also be blocked.

Suggested change

const BLOCKED_EXTENSIONS: &[&str] = &[

"exe", "bat", "cmd", "com", "msi", "scr", "pif", // Windows executables

"sh", "bash", "zsh", "csh", "ksh", // Shell scripts

"dll", "so", "dylib", // Shared libraries

"app", "dmg", "pkg", // macOS

"deb", "rpm", // Linux packages

"ps1", "psm1", "psd1", // PowerShell

"vbs", "vbe", "js", "jse", "wsf", "wsh", // Script engines

"reg", "inf", // Windows registry / setup

];

/// Document-sending tool.

#[derive(Default)]

const BLOCKED_EXTENSIONS: &[&str] = &[

"exe", "bat", "cmd", "com", "msi", "scr", "pif", // Windows executables

"sh", "bash", "zsh", "csh", "ksh", // Shell scripts

"dll", "so", "dylib", // Shared libraries

"app", "dmg", "pkg", // macOS

"deb", "rpm", // Linux packages

"ps1", "psm1", "psd1", // PowerShell

"vbs", "vbe", "js", "jse", "wsf", "wsh", // Script engines

"reg", "inf", // Windows registry / setup

"py", "pyw", "php", "rb", "pl", "lua", "tcl", // Interpreted scripts

];

Fixed — added py, pyw, php, rb, pl, lua, tcl to the blocked extensions list.

…tensions - Remove text/html from inline Content-Disposition list so HTML files always trigger a download instead of rendering on our origin. - Add interpreted script extensions (py, pyw, php, rb, pl, lua, tcl) to the send_document blocked list. Entire-Checkpoint: 32a224dd48e0

Entire-Checkpoint: 05f0dd7e8156

The tests were calling events.eventListeners.forEach() treating it as an array, but it's an object keyed by event name. Dispatch through eventListeners["chat"] with proper tool_call_start/tool_call_end payloads matching the real WebSocket frame format. Entire-Checkpoint: 71aae0eb4a43

…nt tests The previous approach dispatched events via eventListeners["chat"] which bypasses the internal eventHandlers map. Use the system-event RPC (same pattern as websocket.spec.js) which goes through the full server dispatch chain including handleChatEvent. Entire-Checkpoint: 308b42036d7a

greptile-apps Bot reviewed Mar 23, 2026

View reviewed changes

penso added 2 commits March 23, 2026 10:54

greptile-apps Bot reviewed Mar 23, 2026

View reviewed changes

penso added 4 commits March 23, 2026 11:25

style(media): fix nightly rustfmt trailing comma

a07dc12

Entire-Checkpoint: 05f0dd7e8156

penso merged commit 3e7661e into main Mar 23, 2026
45 of 55 checks passed

penso deleted the feat/send-document-v2 branch March 23, 2026 17:04

github-actions Bot mentioned this pull request Mar 24, 2026

🦞 OpenClaw 生态日报 2026-03-24 gsscsd/big_model_radar#87

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(tools): add send_document tool for file sharing to channels#464

feat(tools): add send_document tool for file sharing to channels#464
penso merged 7 commits intomainfrom
feat/send-document-v2

penso commented Mar 23, 2026

Uh oh!

codspeed-hq Bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Mar 23, 2026 •

edited

Loading

Comments Outside Diff (3)

Uh oh!

greptile-apps Bot Mar 23, 2026

Uh oh!

penso Mar 23, 2026

Uh oh!

greptile-apps Bot Mar 23, 2026

Uh oh!

penso Mar 23, 2026

Uh oh!

greptile-apps Bot Mar 23, 2026

Uh oh!

penso Mar 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	"application/pdf" \| "text/plain" \| "text/html" \| "text/csv" \| "text/markdown"
	"application/pdf" \| "text/plain" \| "text/csv" \| "text/markdown"

Conversation

penso commented Mar 23, 2026

Summary

Closes

Validation

Completed

Remaining

Manual QA

Uh oh!

codspeed-hq Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Footnotes

Uh oh!

codecov Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

greptile-apps Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (3)

Uh oh!

greptile-apps Bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

penso Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

penso Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

penso Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codspeed-hq Bot commented Mar 23, 2026 •

edited

Loading

codecov Bot commented Mar 23, 2026 •

edited

Loading

greptile-apps Bot commented Mar 23, 2026 •

edited

Loading