[model-gateway] multimodality initialization#13350
Conversation
slin1237 commented on Nov 15, 2025
- Format your code according to "Format code with pre-commit".
- Add unit tests according to "Run and add unit tests".
- Update documentation according to "Write documentations".
- Provide accuracy and speed benchmark results according to "Test the accuracy" and "Benchmark the speed".
Summary of Changes

Hello @slin1237, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request lays the groundwork for supporting multimodal models within the model gateway.

Highlights
Code Review
This pull request introduces significant new functionality for multimodal support, which is a substantial and well-architected addition. The new modules for media handling, model-specific processing, and input tracking are logically structured. The use of a Python bridge to interface with HuggingFace processors is a practical choice, and the inclusion of parity tests against vLLM is commendable for ensuring compatibility. My review identifies a few areas for improvement, including a potential performance issue with blocking I/O, the use of deprecated APIs, and a type inconsistency, which I've detailed in the comments.
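As a side note on the lowercased domain allow-list built in the constructor quoted below: normalizing entries to lowercase at construction time lets lookups be case-insensitive. A minimal sketch of such a membership check — the helper name `is_domain_allowed` is illustrative, not from this PR:

```rust
use std::collections::HashSet;

// Hypothetical helper: queries an allow-list that was lowercased when it
// was built (as the constructor below does). Name is illustrative only.
fn is_domain_allowed(allowed: Option<&HashSet<String>>, host: &str) -> bool {
    match allowed {
        // No allow-list configured: every domain is permitted.
        None => true,
        // Normalize the candidate host the same way the list was built.
        Some(set) => set.contains(&host.to_ascii_lowercase()),
    }
}

fn main() {
    let set: HashSet<String> = ["example.com".to_string()].into_iter().collect();
    assert!(is_domain_allowed(Some(&set), "EXAMPLE.com"));
    assert!(!is_domain_allowed(Some(&set), "other.com"));
    assert!(is_domain_allowed(None, "anything.net"));
    println!("ok");
}
```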
```rust
pub fn new(client: Client, config: MediaConnectorConfig) -> Result<Self, MediaConnectorError> {
    let allowed_domains = config.allowed_domains.map(|domains| {
        domains
            .into_iter()
            .map(|d| d.to_ascii_lowercase())
            .collect::<HashSet<_>>()
    });

    let allowed_local_media_path = if let Some(path) = config.allowed_local_media_path {
        Some(std::fs::canonicalize(path)?)
    } else {
        None
    };

    Ok(Self {
        client,
        allowed_domains,
        allowed_local_media_path,
        fetch_timeout: config.fetch_timeout,
    })
}
```
The use of std::fs::canonicalize within the new function is a blocking I/O operation. When this constructor is called from an asynchronous context, it can block the Tokio runtime's worker thread, leading to reduced throughput and potential performance bottlenecks. It is highly recommended to use the asynchronous equivalent, tokio::fs::canonicalize, and make the new function async. This will ensure that the application remains fully non-blocking. Note that this change will require updating the call sites of MediaConnector::new to await its result.
```rust
pub async fn new(client: Client, config: MediaConnectorConfig) -> Result<Self, MediaConnectorError> {
    let allowed_domains = config.allowed_domains.map(|domains| {
        domains
            .into_iter()
            .map(|d| d.to_ascii_lowercase())
            .collect::<HashSet<_>>()
    });

    let allowed_local_media_path = if let Some(path) = config.allowed_local_media_path {
        Some(tokio::fs::canonicalize(path).await?)
    } else {
        None
    };

    Ok(Self {
        client,
        allowed_domains,
        allowed_local_media_path,
        fetch_timeout: config.fetch_timeout,
    })
}
```
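As context for the review point above: `std::fs::canonicalize` blocks because it performs real filesystem I/O — it resolves symlinks and fails for paths that do not exist, unlike pure string normalization. A minimal std-only illustration:

```rust
use std::env;
use std::fs;

fn main() {
    // canonicalize touches the filesystem: it returns an error for a
    // path that does not exist, rather than just normalizing the string.
    let missing = fs::canonicalize("/definitely/not/a/real/path/hopefully");
    assert!(missing.is_err());

    // For an existing directory it returns an absolute, symlink-free path.
    let cwd = env::current_dir().expect("current dir");
    let canon = fs::canonicalize(&cwd).expect("canonicalize cwd");
    assert!(canon.is_absolute());
    println!("ok");
}
```

This I/O is exactly what makes the call unsuitable on an async worker thread; `tokio::fs::canonicalize` performs the same operation on a blocking thread pool and exposes it as a future.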