DPG-Bench introduced dense-prompt evaluation for text-to-image (T2I) benchmarking and has become one of the most widely used benchmarks in the field. However, as stronger image generation models continue to emerge, evaluation needs to extend beyond dense prompts alone: aspects such as stylization, text rendering, reasoning, and multilingual support now require detailed assessment.
To address this, the newly proposed OneIG-Bench (https://arxiv.org/abs/2506.07977) conducts an Omni-dimensional Nuanced Evaluation for the image generation task.
Key Features of OneIG-Bench:
Comprehensive Prompt Sets:
- Six specialized categories:
  - 245 Anime & Stylization prompts (EN/ZH)
  - 244 Portrait prompts (EN/ZH)
  - 206 General Object prompts (EN/ZH)
  - 200 Text Rendering prompts (EN/ZH)
  - 225 Knowledge & Reasoning prompts (EN/ZH)
  - 200 Multilingualism prompts
- Bilingual coverage: the first five sets are available in both English and Chinese
- Designed for holistic evaluation of modern text-to-image models

Systematic Quantitative Framework:
- Enables objective capability ranking via standardized metrics
- Ensures direct cross-model comparability
- Dimension-specific evaluation protocol:
  - Models generate images only for prompts within one evaluation dimension
  - Performance is assessed exclusively within that targeted dimension
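The dimension-specific protocol above can be sketched as follows. This is a minimal illustration, not the actual OneIG-Bench code: the dimension names, the prompt lists, and the `generate_image`/`score_image` stubs are all hypothetical placeholders for a real T2I model and the benchmark's real metrics.

```python
# Hypothetical sketch of dimension-specific evaluation: a model is run
# only on the prompts of one dimension, and scored only within it.

DIMENSIONS = {
    "anime_stylization": ["a watercolor-style fox in a misty forest"],
    "text_rendering": ['a shop sign that reads "OPEN"'],
}

def generate_image(model: str, prompt: str) -> str:
    # Placeholder: a real model returns an image; here we echo the prompt.
    return f"{model}:{prompt}"

def score_image(image: str, dimension: str) -> float:
    # Placeholder: a real benchmark applies a dimension-specific metric
    # (e.g. OCR accuracy for text rendering, style similarity for anime).
    return 1.0 if image else 0.0

def evaluate_dimension(model: str, dimension: str) -> float:
    """Average score over the prompts of a single evaluation dimension."""
    prompts = DIMENSIONS[dimension]
    scores = [score_image(generate_image(model, p), dimension) for p in prompts]
    return sum(scores) / len(scores)

print(evaluate_dimension("toy-model", "text_rendering"))  # -> 1.0
```

Keeping generation and scoring confined to one dimension at a time is what makes per-dimension rankings directly comparable across models.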
Here is the evaluation visualization of the most representative SOTA T2I models⬇️