Feature Request: Multimodal Image Support via @ File Search #2589

@salah9003

What feature would you like to see?

Multimodal Image Support via @ File Search

Problem: Codex CLI already has the backend infrastructure for the OpenAI Vision API, but interactive mode gives users no way to send images.

Solution: Extend the existing @ file search system to detect image files and display them as [Image #1] placeholders while sending actual image data to the Vision API.

Usage Example:
user> @screenshot.png explain this UI
[TAB to select image]
user> [Image #1] explain this UI

codex> This interface shows a terminal with...
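
The selection step in the example above could be sketched roughly as follows. This is an illustration only, not the prototype's code: `is_image_path`, `ImageAttachments`, and `text_for_selection` are hypothetical names, and the extension list is the one given under Technical Details.

```rust
use std::path::Path;

/// Extensions named in this issue; the helper itself is hypothetical.
const IMAGE_EXTENSIONS: [&str; 6] = ["jpg", "jpeg", "png", "gif", "webp", "bmp"];

/// Returns true when the path ends in one of the listed extensions
/// (compared case-insensitively).
fn is_image_path(path: &str) -> bool {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| IMAGE_EXTENSIONS.iter().any(|known| known.eq_ignore_ascii_case(ext)))
        .unwrap_or(false)
}

/// Hypothetical stand-in for composer state: remembers attached images in order.
#[derive(Default)]
struct ImageAttachments {
    paths: Vec<String>,
}

impl ImageAttachments {
    /// Text to insert for a selected path: a numbered placeholder for images,
    /// the unchanged path for every other file.
    fn text_for_selection(&mut self, path: &str) -> String {
        if is_image_path(path) {
            self.paths.push(path.to_string());
            format!("[Image #{}]", self.paths.len())
        } else {
            path.to_string()
        }
    }
}
```

Numbering by position in the attachment list keeps the placeholder and the underlying file linked without storing the path in the visible text.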

Benefits:

  • Unlocks existing unused multimodal capabilities
  • Uses familiar @ syntax with TAB completion
  • Clean visual feedback with numbered placeholders
  • Zero breaking changes to existing functionality
  • Maintains chat history readability

Are you interested in implementing this feature?

Yes, I have a working prototype that extends the existing file search system. I will wait for acknowledgement before opening a PR.

Additional information

The implementation builds on existing infrastructure:

  • InputItem::LocalImage already exists in the protocol
  • The OpenAI Vision API integration already works
  • The @ file search popup system already exists
  • The only missing piece is image detection in the file search UI

The implementation preserves all existing functionality while adding image support.

Technical Details:

  • Supports jpg, jpeg, png, gif, webp, bmp formats
  • Extends ChatComposer::insert_selected_path() to detect images
  • Creates numbered placeholders [Image #1], [Image #2] for display
  • Sends the cleaned text together with the actual image data to the model
  • Maintains backward compatibility with existing @ file search
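
To make the "cleaned text plus image data" step concrete, here is a minimal sketch of what could happen at submit time. It assumes that "cleaned" means the placeholders are stripped from the text and that `[Image #N]` maps to the N-th attached path; `PreparedInput` and `prepare_input` are hypothetical names, and converting the collected paths into `InputItem::LocalImage` values is left out because that type's exact shape is not shown in this issue.

```rust
/// Hypothetical result of preparing a message for submission.
#[derive(Debug, PartialEq)]
struct PreparedInput {
    text: String,
    image_paths: Vec<String>,
}

/// Removes `[Image #N]` placeholders from `text` and collects the matching paths.
/// Assumption: `attached[N - 1]` is the file behind `[Image #N]`.
fn prepare_input(text: &str, attached: &[String]) -> PreparedInput {
    const PREFIX: &str = "[Image #";
    let mut cleaned = String::new();
    let mut image_paths = Vec::new();
    let mut rest = text;

    while let Some(start) = rest.find(PREFIX) {
        let after_prefix = &rest[start + PREFIX.len()..];
        // A known placeholder is PREFIX, a 1-based index into `attached`, then `]`.
        let parsed = after_prefix.find(']').and_then(|end| {
            let index = after_prefix[..end].parse::<usize>().ok()?;
            let path = attached.get(index.checked_sub(1)?)?;
            Some((end, path.clone()))
        });
        match parsed {
            Some((end, path)) => {
                cleaned.push_str(&rest[..start]);
                image_paths.push(path);
                rest = &after_prefix[end + 1..];
            }
            None => {
                // Not a known placeholder: keep the text verbatim and move on.
                cleaned.push_str(&rest[..start + PREFIX.len()]);
                rest = after_prefix;
            }
        }
    }
    cleaned.push_str(rest);

    PreparedInput {
        // Simplification: collapses all whitespace runs (including newlines)
        // left behind by removed placeholders.
        text: cleaned.split_whitespace().collect::<Vec<_>>().join(" "),
        image_paths,
    }
}
```

Text that merely looks like a placeholder but has no matching attachment is passed through unchanged, which keeps ordinary messages (and existing @ file paths) unaffected.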

Labels: enhancement (New feature or request)
