feat(feishu): add inbound image message support #951

Closed
shikihane wants to merge 2 commits into sipeed:main from shikihane:feat/feishu-inbound-image-pr

@shikihane
Contributor

shikihane commented Mar 1, 2026

Description

Add inbound image message support for the Feishu channel. When users send images (or images with text) via Feishu, the channel now downloads and stores them as media:// refs for the vision pipeline to process.

Depends on: #1020 (vision pipeline — resolves media:// refs to base64 data URLs for the LLM)

Changes

pkg/channels/feishu/feishu_64.go:

  • Pure image messages (MsgTypeImage): extract image_key from {"image_key":"..."} content, download via MessageResource API
  • Rich text post messages (MsgTypePost): parse {"content":[[{"tag":"img","image_key":"..."}]]} format to extract all embedded images
  • downloadImage(): uses MessageResource.Get(messageID, fileKey, type="image") — the correct API for downloading user-sent images (Image.Get only works for bot-uploaded images)
  • No hardcoded extensions: filenames use image_key instead of .jpg — MIME detection is handled downstream by h2non/filetype magic bytes
  • MediaStore integration: downloaded images stored via store.Store() → media:// ref → passed to agent via HandleMessage()
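
The two content formats described above can be parsed with plain encoding/json. The following is a minimal sketch, not the code from this PR: the type names and the functions imageKeyFromContent and imageKeysFromPost are hypothetical, and only the JSON shapes quoted in the list ({"image_key":"..."} and {"content":[[{"tag":"img","image_key":"..."}]]}) are taken from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageContent mirrors the {"image_key":"..."} payload of a pure image message.
type imageContent struct {
	ImageKey string `json:"image_key"`
}

// postElement is one inline element of a rich text post. Only the fields
// needed for image extraction are declared; other fields are ignored.
type postElement struct {
	Tag      string `json:"tag"`
	ImageKey string `json:"image_key"`
}

// postContent mirrors {"content":[[...]]}: a list of paragraphs,
// each of which is a list of inline elements.
type postContent struct {
	Content [][]postElement `json:"content"`
}

// imageKeyFromContent (hypothetical name) returns the image_key of a pure
// image message, or "" if the content is empty, malformed, or lacks the field.
func imageKeyFromContent(raw string) string {
	var c imageContent
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return ""
	}
	return c.ImageKey
}

// imageKeysFromPost (hypothetical name) returns the image_key of every
// img element in a post, in document order.
func imageKeysFromPost(raw string) []string {
	var p postContent
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return nil
	}
	var keys []string
	for _, paragraph := range p.Content {
		for _, el := range paragraph {
			if el.Tag == "img" && el.ImageKey != "" {
				keys = append(keys, el.ImageKey)
			}
		}
	}
	return keys
}

func main() {
	fmt.Println(imageKeyFromContent(`{"image_key":"img_example"}`))
	post := `{"content":[[{"tag":"text","text":"hi"},{"tag":"img","image_key":"img_a"}],[{"tag":"img","image_key":"img_b"}]]}`
	fmt.Println(imageKeysFromPost(post))
}
```

Returning an empty result on malformed input, rather than an error, is an assumption made here so that a bad payload degrades to a text-only message.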

pkg/channels/feishu/feishu_64_test.go (new file):

  • extractFeishuImageKey tests (valid key, empty, invalid JSON, missing field)
  • extractFeishuMessageContent tests (text, image, nil message)

Data flow

Feishu image/post message
  → handleMessageReceive() detects MsgTypeImage or MsgTypePost
  → extractFeishuImageKey() / extractFeishuPostImageKeys()
  → downloadImage() via MessageResource.Get API
  → MediaStore.Store() → media://uuid ref
  → HandleMessage(..., mediaPaths, ...) → InboundMessage.Media
  → (vision pipeline resolves to base64 for LLM)
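
The download-and-store steps of this flow can be sketched as below. Everything here is an assumption for illustration: memStore stands in for the project's MediaStore, downloadFunc stands in for the MessageResource.Get call, a random hex ID replaces the UUID, and skipping an image that fails to download is a design choice of the sketch, not confirmed behavior of the PR.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
	"sync"
)

// memStore is a hypothetical in-memory stand-in for the project's MediaStore.
type memStore struct {
	mu    sync.Mutex
	blobs map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{blobs: make(map[string][]byte)}
}

// Store saves data under a random ID and returns a media:// ref for it.
func (s *memStore) Store(data []byte) (string, error) {
	id := make([]byte, 16)
	if _, err := rand.Read(id); err != nil {
		return "", err
	}
	ref := "media://" + hex.EncodeToString(id)
	s.mu.Lock()
	s.blobs[ref] = data
	s.mu.Unlock()
	return ref, nil
}

// downloadFunc abstracts the image download (hypothetical signature).
type downloadFunc func(messageID, imageKey string) ([]byte, error)

// collectMedia downloads each image key and stores the bytes, returning
// the media:// refs of the images that were fetched and stored successfully.
func collectMedia(messageID string, keys []string, dl downloadFunc, st *memStore) []string {
	var refs []string
	for _, key := range keys {
		data, err := dl(messageID, key)
		if err != nil {
			continue // assumption: one failed image does not drop the message
		}
		ref, err := st.Store(data)
		if err != nil {
			continue
		}
		refs = append(refs, ref)
	}
	return refs
}

// fakeDownload simulates the remote API and fails for the key "bad".
func fakeDownload(messageID, imageKey string) ([]byte, error) {
	if imageKey == "bad" {
		return nil, fmt.Errorf("download of %s failed", imageKey)
	}
	return []byte("bytes-of-" + imageKey), nil
}

func main() {
	refs := collectMedia("om_example", []string{"img_a", "bad", "img_b"}, fakeDownload, newMemStore())
	fmt.Println(len(refs), strings.HasPrefix(refs[0], "media://"))
}
```

The resulting refs correspond to the mediaPaths argument in the flow above; the message text and refs would then be handed to HandleMessage together.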

E2E verification

Tested on Radxa Cubie A7A (arm64) with Feishu channel + Dashscope LLM (kimi-k2.5). Sent image via Feishu, LLM correctly identified image content.

Type of Change

  • ✨ New feature (non-breaking change which adds functionality)

AI Code Generation

  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)

Test Environment

  • Hardware: Radxa Cubie A7A (arm64, 4GB RAM)
  • OS: Debian 11 (bullseye)
  • Model/Provider: Dashscope (kimi-k2.5, vision-capable)
  • Channel: Feishu (WebSocket mode)

Checklist

  • My code follows the style of this project
  • I have performed a self-review of my own changes
  • Backward compatible — non-image messages unaffected
  • All existing tests pass (make test)

@shikihane
Contributor Author

shikihane commented Mar 1, 2026

✅ Implementation Complete & Tested

The full vision pipeline is now working end-to-end on our deployment:

What's been implemented (on dev branch)

  1. Feishu inbound image download (commits 9527e95, 8912dbb)

    • Detects MsgTypeImage, extracts image_key
    • Downloads via MessageResource.Get() API (not Image.Get() — that only works for bot-uploaded images)
    • Stores as media://uuid refs in MediaStore
  2. Agent pipeline Media support (commits 1a55a8a, 09e0220)

    • Message.Media field propagation through context builder
    • serializeMessages() converts Media to OpenAI vision format
  3. Media resolution to base64 (latest commit, not yet pushed)

    • resolveMediaRefs() converts media:// refs to data:image/...;base64,... URLs before LLM call
    • Session saves lightweight media:// refs (not bloated base64)
    • Feishu sets ContentType in MediaMeta for proper MIME inference
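
Steps 2 and 3 can be illustrated together: resolve each media:// ref to a data URL, then place it in an OpenAI-style content-parts array. This is a rough sketch under stated assumptions, not the project's resolveMediaRefs() or serializeMessages(): the names toDataURL, buildParts, and the lookup callback are hypothetical, and the standard library's http.DetectContentType is used as a stand-in for the h2non/filetype detection mentioned in the PR description.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// contentPart follows the OpenAI chat "content parts" layout for vision input.
type contentPart struct {
	Type     string    `json:"type"`
	Text     string    `json:"text,omitempty"`
	ImageURL *imageURL `json:"image_url,omitempty"`
}

type imageURL struct {
	URL string `json:"url"`
}

// toDataURL (hypothetical name) encodes raw bytes as a data: URL. When
// contentType is empty, the MIME type is sniffed from the leading bytes.
func toDataURL(data []byte, contentType string) string {
	if contentType == "" {
		contentType = http.DetectContentType(data) // stdlib stand-in for h2non/filetype
	}
	return "data:" + contentType + ";base64," + base64.StdEncoding.EncodeToString(data)
}

// buildParts (hypothetical name) turns message text plus media:// refs into
// content parts. lookup returns the stored bytes and content type for a ref;
// refs that cannot be resolved are skipped.
func buildParts(text string, refs []string, lookup func(ref string) ([]byte, string, bool)) []contentPart {
	var parts []contentPart
	if text != "" {
		parts = append(parts, contentPart{Type: "text", Text: text})
	}
	for _, ref := range refs {
		if !strings.HasPrefix(ref, "media://") {
			continue
		}
		data, contentType, ok := lookup(ref)
		if !ok {
			continue
		}
		parts = append(parts, contentPart{
			Type:     "image_url",
			ImageURL: &imageURL{URL: toDataURL(data, contentType)},
		})
	}
	return parts
}

func main() {
	stored := map[string][]byte{"media://example": []byte("\x89PNG\r\n\x1a\n")}
	lookup := func(ref string) ([]byte, string, bool) {
		data, ok := stored[ref]
		return data, "", ok // empty content type forces sniffing
	}
	parts := buildParts("What is in this picture?", []string{"media://example"}, lookup)
	out, err := json.MarshalIndent(parts, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Resolving refs only at request time, as in this sketch, is what keeps the saved session small: the stored history holds the short media:// ref, and the base64 payload exists only in the outgoing request.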

Test results

Hardware: Radxa Cubie A7A (arm64, 4GB RAM)
Model: qwen3.5-plus via coding.dashscope.aliyuncs.com/v1 (supports vision)
Channel: Feishu WebSocket

Screenshot attached below: the user sent a dog photo, and the bot correctly identified it, replying "好可爱的柴犬!🐕" ("What a cute Shiba Inu!") with a detailed description.
[Screenshot: Feishu chat showing the bot's reply to the dog photo]

Next steps

Ready to rebase clean commits onto a feat/vision-feishu branch from main for PR submission. Should this PR include:

  • Just Feishu inbound image support, OR
  • Full vision pipeline (Feishu + agent resolution)?

@sipeed-bot added the "type: documentation" (Improvements or additions to documentation) and "domain: channel" labels on Mar 3, 2026
Add image extraction from both pure image messages (MsgTypeImage) and
rich text post messages (MsgTypePost) in the Feishu channel.

Key changes:
- Download user-sent images via MessageResource.Get API (not Image.Get,
  which only works for bot-uploaded images)
- Parse post message format to extract embedded img tags
- Use image_key as filename instead of hardcoded .jpg extension
- Store downloaded images via MediaStore for media:// ref pipeline

Note: This PR depends on the vision pipeline PR for resolving media://
refs to base64 data URLs before sending to the LLM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shikihane force-pushed the feat/feishu-inbound-image-pr branch from c29cfea to 7d32d5a on March 3, 2026 08:53
- Rename shadowed err to mkErr in downloadImage (govet)
- Remove unused strPtr helper in test file (unused)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shikihane
Contributor Author

Closing this PR — the inbound image support is now fully covered by #1020 (Vision Pipeline V2) which has been merged. Thank you!
