feat: download and store WhatsApp media for agent access by baijunjie · Pull Request #128 · qwibitai/nanoclaw

baijunjie · 2026-02-07T10:32:40Z

Summary

When a registered group receives a media message (image, video, audio, document, or sticker), NanoClaw now automatically downloads the file and saves it to groups/{folder}/media/. The container-accessible path is prepended to the message content as [media: /workspace/group/media/...], so agents can read and use the file directly.

No database schema changes — the media path is embedded in the existing content field. The existing container mount already makes the media directory visible, so no mount changes are needed either.
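
As a rough illustration of the tagging step described above, the sketch below builds the container-visible path and prepends it to the message content. The names `CONTAINER_GROUP_ROOT`, `containerMediaPath`, and `withMediaTag` are hypothetical and are not taken from the PR's code; only the `[media: /workspace/group/media/...]` format comes from the description.

```typescript
// Assumed constant: where the group folder is mounted inside the container.
const CONTAINER_GROUP_ROOT = "/workspace/group";

/** Build the container-visible path for a file saved under groups/{folder}/media/. */
function containerMediaPath(filename: string): string {
  return `${CONTAINER_GROUP_ROOT}/media/${filename}`;
}

/** Prepend the media tag so the agent sees the path before any caption text. */
function withMediaTag(content: string, filename: string): string {
  return `[media: ${containerMediaPath(filename)}]\n${content}`;
}
```

Because the path travels inside the existing content field, nothing downstream of the handler needs to know that media exists, which is why no schema or mount changes are required.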

Motivation

I'm working on letting the assistant post tweets with images that users send via WhatsApp. Currently NanoClaw only extracts caption text from media messages and discards the actual file, so the agent has no way to access user-sent images. This PR fixes that by downloading and storing media on disk, making it available for the agent to use in integrations like X image posting.

baijunjie · 2026-02-12T03:32:38Z

I noticed there was a major refactor (index split into channels/whatsapp, ipc, router modules). I've rebased and updated the code — this is now built on top of the latest version.

TomGranot · 2026-02-12T16:19:47Z

Changes needed:

Good feature — the new src/whatsapp-media.ts file is clean.
The src/channels/whatsapp.ts changes correctly target the post-refactor file.
Please rebase to verify the messages.upsert handler lines match current main exactly (the handler has been modified since this PR was opened).
Consider adding tests for the media download logic.

baijunjie · 2026-02-13T01:56:33Z

@TomGranot Updated and rebased on the latest main. Also added comprehensive tests:

src/whatsapp-media.test.ts — 17 unit tests covering getMediaInfo (all media types, edge cases) and downloadAndSaveMedia (MIME-to-extension mapping, fallback extensions, download failure handling)
src/channels/whatsapp.test.ts — 3 new integration tests for media download in the message handler (successful download, download failure, text-only skip), plus fixes for existing test compatibility with the async handler (Browsers mock, GROUPS_DIR mock, flushPromises for async awaits)
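
For readers unfamiliar with the MIME-to-extension mapping and fallback extensions that the unit tests cover, here is a minimal sketch of the idea. The table entries and the names `MIME_TO_EXT`, `DEFAULT_EXT`, and `extensionFor` are assumptions for illustration; the actual tables in `whatsapp-media.ts` may differ.

```typescript
type MediaKind = "image" | "video" | "audio" | "document" | "sticker";

// Illustrative entries only; the real module may map more or different types.
const MIME_TO_EXT: Record<string, string> = {
  "image/jpeg": "jpg",
  "image/png": "png",
  "image/webp": "webp",
  "video/mp4": "mp4",
  "audio/ogg": "ogg",
  "audio/mpeg": "mp3",
  "application/pdf": "pdf",
};

// Assumed per-type fallbacks, used when the MIME type is missing or unknown.
const DEFAULT_EXT: Record<MediaKind, string> = {
  image: "jpg",
  video: "mp4",
  audio: "ogg",
  document: "bin",
  sticker: "webp",
};

function extensionFor(kind: MediaKind, mimetype?: string): string {
  // Drop any parameters such as "; codecs=opus" before the lookup.
  const [first = ""] = (mimetype ?? "").split(";");
  const base = first.trim().toLowerCase();
  return MIME_TO_EXT[base] ?? DEFAULT_EXT[kind];
}
```

The fallback keeps the download from failing just because a sender's client omitted or mangled the MIME type.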

All 159 tests passing.

baijunjie · 2026-02-16T04:21:12Z

I've rebased onto the latest version of the base branch. Is this PR still on track to be merged, or is there something that doesn't meet the requirements? Or perhaps no one really needs the AI assistant to support image viewing?

Apologies for pressing — I think this will be my last update to this PR. If it's not going to be merged, please feel free to close it.

Add media download support for registered groups. When a message contains an image, video, audio, document, or sticker, it is downloaded and saved to groups/{folder}/media/. The container path is prepended to the message content as [media: /workspace/group/media/filename] so the agent can access the file. Add unit tests for whatsapp-media module and integration tests for media download in the WhatsApp channel handler. Fix existing test compatibility (add Browsers/GROUPS_DIR mocks, async handler awaits).

TomGranot · 2026-02-19T05:28:58Z

Heads up — PR #281 also implements WhatsApp media download and file send support. You might want to compare approaches and see what each can learn from the other.

baijunjie · 2026-02-19T05:45:32Z

@TomGranot Thanks for the heads up. I've done a detailed comparison between PR #281 and this PR. Here's what I found:

Architecture

| Aspect | PR #281 | This PR (#128) |
| --- | --- | --- |
| Code organization | Inline `downloadMedia()` private method in `whatsapp.ts` | Separate `whatsapp-media.ts` module with exported `getMediaInfo()` and `downloadAndSaveMedia()` |
| Scope | Bidirectional: download + send (`send_file` MCP tool) | Download only |

Media Download

| Aspect | PR #281 | This PR (#128) |
| --- | --- | --- |
| Nested message unwrapping | Handles ephemeral, viewOnce, viewOnceV2, documentWithCaption wrappers | None; only checks top-level message keys |
| MIME safety check | `SAFE_MIME_PREFIXES` allowlist, rejects executables/scripts | None; downloads any media type |
| File naming | `${Date.now()}-${randomHex}.${ext}` (random) | `${msgId}.${ext}` (deterministic, based on message ID) |
| Empty content handling | Sets placeholder like `[image]` or `[document: file.pdf]` | Prepends `[media: /workspace/group/media/xxx.jpg]\n` to content |
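
To make the "nested message unwrapping" row concrete, the following sketch peels wrapper layers until it reaches a message whose top-level keys hold the media. It is not the code from either PR: the `MessageContent` shape is a simplified local stand-in, and the wrapper field names are an assumption based on the wrapper kinds listed in the table.

```typescript
// Simplified local shape. Field names are assumed from the wrapper kinds
// named in the comparison and are not copied from either PR.
interface MessageContent {
  imageMessage?: { mimetype?: string };
  documentMessage?: { mimetype?: string; fileName?: string };
  ephemeralMessage?: { message?: MessageContent };
  viewOnceMessage?: { message?: MessageContent };
  viewOnceMessageV2?: { message?: MessageContent };
  documentWithCaptionMessage?: { message?: MessageContent };
}

const WRAPPER_KEYS = [
  "ephemeralMessage",
  "viewOnceMessage",
  "viewOnceMessageV2",
  "documentWithCaptionMessage",
] as const;

/** Follow wrapper layers inward; stops after maxDepth layers as a safety bound. */
function unwrapMessage(
  msg: MessageContent | undefined,
  maxDepth = 5,
): MessageContent | undefined {
  let current = msg;
  for (let depth = 0; depth < maxDepth; depth++) {
    if (!current) return undefined;
    const layer: MessageContent = current;
    const key = WRAPPER_KEYS.find((k) => layer[k]?.message !== undefined);
    if (!key) return layer; // no wrapper left: this is the innermost message
    current = layer[key]?.message;
  }
  return current; // depth bound hit: return whatever layer was reached
}
```

A handler that only inspects top-level keys would see `ephemeralMessage` here and find no `imageMessage`, which is the silent failure noted for this PR.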

Data Persistence

| Aspect | PR #281 | This PR (#128) |
| --- | --- | --- |
| DB schema | Adds 4 columns: `media_type`, `media_path`, `media_mime`, `media_filename` | No DB changes |
| Router format | Structured XML attributes: `<message media_type="image" media_path="..." ...>` | Media path embedded directly in content text |
| Type changes | `NewMessage` gets 4 optional fields + `Channel` gets `sendFile` method | No type changes |
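
The structured router format can be pictured with the sketch below, which renders media metadata as XML attributes instead of embedding the path in the text. The function names, the choice of two attributes, and the escaping helper are all hypothetical; only the `<message media_type="..." media_path="...">` shape comes from the comparison.

```typescript
interface MediaMeta {
  media_type: string;
  media_path: string;
}

/** Escape the characters that would break an XML attribute or text node. */
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

/** Render a message with optional media attributes (hypothetical format helper). */
function formatStructuredMessage(content: string, media?: MediaMeta): string {
  const attrs = media
    ? ` media_type="${escapeXml(media.media_type)}" media_path="${escapeXml(media.media_path)}"`
    : "";
  return `<message${attrs}>${escapeXml(content)}</message>`;
}
```

Attributes keep the caption text untouched and make the metadata queryable, at the cost of the schema and type changes listed in the table; embedding the path in the content avoids those changes but leaves the agent to parse the prefix.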

Media Send (PR #281 only)

PR #281 also implements the full file-send pipeline from agent container back to WhatsApp:

send_file MCP tool in the container (path/size validation, 64MB limit)
IPC layer handles type: 'file' messages with container-to-host path translation
WhatsAppChannel.sendFile() picks image/video/audio/document based on MIME type
Authorization: non-main groups can only send to their own chat
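
The four steps above might look roughly like the following. This is a sketch under stated assumptions, not PR #281's implementation: the helper names, the `/workspace/group/` prefix used for path translation, and the error messages are illustrative, and only the 64MB limit, the MIME-based type choice, and the own-chat rule come from the description.

```typescript
const MAX_FILE_BYTES = 64 * 1024 * 1024; // 64MB limit from the send_file description
const CONTAINER_PREFIX = "/workspace/group/"; // assumed container mount point

type SendKind = "image" | "video" | "audio" | "document";

/** Reject sizes that are negative, non-finite, or above the limit. */
function assertSendableSize(bytes: number): void {
  if (!Number.isFinite(bytes) || bytes < 0 || bytes > MAX_FILE_BYTES) {
    throw new Error(`file size ${bytes} is outside the allowed range`);
  }
}

/** Translate a container path to a host path, refusing anything outside the mount. */
function toHostPath(containerPath: string, hostGroupDir: string): string {
  if (!containerPath.startsWith(CONTAINER_PREFIX)) {
    throw new Error("path is outside the group mount");
  }
  const relative = containerPath.slice(CONTAINER_PREFIX.length);
  if (relative.split("/").includes("..")) {
    throw new Error("path traversal rejected");
  }
  return `${hostGroupDir.replace(/\/+$/, "")}/${relative}`;
}

/** Pick the outgoing message kind from the MIME type; unknown types go as documents. */
function sendKindFor(mimetype: string): SendKind {
  const base = mimetype.trim().toLowerCase();
  if (base.startsWith("image/")) return "image";
  if (base.startsWith("video/")) return "video";
  if (base.startsWith("audio/")) return "audio";
  return "document";
}

/** Non-main groups may only send to their own chat. */
function maySendTo(isMainGroup: boolean, ownChatJid: string, targetJid: string): boolean {
  return isMainGroup || ownChatJid === targetJid;
}
```

Validating on the host side as well as in the container matters here, since the container is the less trusted party in the IPC exchange.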

Takeaways

PR #281 is more complete: MIME safety checks, nested message unwrapping, structured media metadata in DB, bidirectional file transfer.

This PR is simpler and more testable: separate module with dedicated unit tests, deterministic message-ID-based naming (avoids duplicate downloads), but lacks nested message unwrapping (ephemeral/viewOnce media won't download) and MIME safety checks.

I think the key gaps on my side are:

Nested message unwrapping — without it, ephemeral/viewOnce media silently fails
MIME safety check — should block potentially dangerous file types
send_file capability — not in scope for this PR but a natural next step
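
A MIME safety check of the kind listed above could be as small as a prefix allowlist. In this sketch the name `SAFE_MIME_PREFIXES` is borrowed from the comparison, but its entries and the `isSafeMime` helper are assumptions, not the contents of PR #281.

```typescript
// Assumed allowlist entries; the actual list in PR #281 is not shown in this thread.
const SAFE_MIME_PREFIXES = [
  "image/",
  "video/",
  "audio/",
  "application/pdf",
  "text/plain",
] as const;

/** Return true only for MIME types that start with an allowlisted prefix. */
function isSafeMime(mimetype: string | undefined): boolean {
  if (!mimetype) return false; // missing type: refuse rather than guess
  const [first = ""] = mimetype.split(";"); // drop parameters such as "; codecs=opus"
  const base = first.trim().toLowerCase();
  return SAFE_MIME_PREFIXES.some((prefix) => base.startsWith(prefix));
}
```

An allowlist fails closed: a type nobody anticipated is skipped instead of being written into a directory the agent can read.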

Happy to incorporate any of these improvements if this PR is still being considered.

TomGranot · 2026-02-19T06:39:30Z

@gavrielc — This and PR #281 both implement WhatsApp media. #128 is simpler with 17 tests and an engaged author. #281 is more comprehensive (MIME safety, bidirectional send, nested message unwrapping). The author of #128 did a thorough comparison in the comments. Could you decide which approach to go with?

Adopt PR qwibitai#128's modular structure: move media download/detection into a separate whatsapp-media.ts with dedicated tests (36 tests). Also adopt deterministic message-ID-based filenames and per-type default extensions, while keeping our MIME safety checks, nested message unwrapping, and structured DB metadata.

Andy-NanoClaw-AI · 2026-03-07T18:33:50Z

Hey @baijunjie 👋 Thank you for this — automatically downloading and surfacing WhatsApp media to agents is exactly the kind of quality-of-life improvement NanoClaw needs!

This feature was subsequently implemented and merged in #770 (the image vision skill), which handles media delivery to container agents. This PR also has merge conflicts with the current codebase.

We're adding Status: Pending Closure. Your idea was spot on — thanks for the contribution! 🙌

baijunjie · 2026-03-08T03:10:29Z

Glad to see NanoClaw has implemented image download capability. This PR is no longer needed and can be closed.

Agent was too conservative — told users 'cannot restart yourself'. In host mode, agent can edit config + run nanoclaw restart via bash. Updated model change flow and removed restart from 'cannot do' list. Co-authored-by: Kenan Rpi5 Claw <rpi5-claw@nanoclaw.dev>

baijunjie requested a review from gavrielc as a code owner February 7, 2026 10:32

baijunjie force-pushed the feat/whatsapp-media-download branch from cddc04c to 4786d6e February 7, 2026 10:36

baijunjie mentioned this pull request Feb 7, 2026

Refactor x-integration skill architecture #129

Open

baijunjie force-pushed the feat/whatsapp-media-download branch from 4786d6e to bf4d2e4 February 12, 2026 03:30

baijunjie force-pushed the feat/whatsapp-media-download branch from a618ae4 to ba46995 February 13, 2026 01:55

baijunjie force-pushed the feat/whatsapp-media-download branch from ba46995 to 6500dd6 February 16, 2026 04:13

baijunjie force-pushed the feat/whatsapp-media-download branch from 6500dd6 to 3a2d98d February 18, 2026 06:48

TomGranot mentioned this pull request Feb 19, 2026

feat: add WhatsApp media download and file send support #281

Open

7 tasks

Andy-NanoClaw-AI added the PR: Feature (new feature or enhancement) and Status: Blocked (blocked by merge conflicts or dependencies) labels Mar 5, 2026

gavrielc requested a review from gabi-simons as a code owner March 6, 2026 10:17

Andy-NanoClaw-AI added the Status: Pending Closure (PR flagged for closure during triage) label Mar 7, 2026

baijunjie closed this Mar 8, 2026

baijunjie wants to merge 1 commit into qwibitai:main from baijunjie:feat/whatsapp-media-download
