Hi there,
First of all, thank you for this great repository! I have successfully deployed and run the entire pipeline locally, and the functionality works perfectly.
I am writing to ask whether you could share some baseline performance data so I can verify that my local inference results are within the expected range. Specifically, I would like to compare the original Qwen-Image model against the TensorRT-accelerated version on:
1. Inference Speed
2. GPU Memory Usage
Knowing the hardware environment (e.g., A100, 4090) you tested on would also be very helpful.
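In case it helps align methodology, here is a minimal sketch of how I am timing inference locally. The `fn` callable is a placeholder for one pipeline invocation (the pipeline names in the usage comment are illustrative, not from this repository):

```python
import time

def benchmark(fn, warmup=2, iters=5):
    """Average seconds per call of `fn`, after a few warmup runs.

    `fn` stands in for a single inference call; warmup runs are
    excluded so one-time initialization does not skew the average.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# Hypothetical usage comparing the two variants:
#   avg_orig = benchmark(lambda: pipe_orig(prompt))
#   avg_trt  = benchmark(lambda: pipe_trt(prompt))
# For peak GPU memory, I call torch.cuda.reset_peak_memory_stats()
# before inference and read torch.cuda.max_memory_allocated() after.
```

If you measured differently (e.g., with CUDA events or `nvidia-smi`), knowing your method would also help me compare apples to apples.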
Thanks again for your work!