VLEU: a Method for Automatic Evaluation for Generalizability of Text-to-Image Models

This repository contains the code and data for the paper "VLEU: a Method for Automatic Evaluation for Generalizability of Text-to-Image Models" presented at EMNLP 2024.

Files

code/: Contains the Python scripts and notebooks used for the project.
data/: Contains the datasets used for training and evaluation.

Quick Start

Sampling Prompts

1. GPT

code/Visual Text Sampling/sample_prompts_gpt.py

python sample_prompts_gpt.py \
  --api_key YOUR_KEY \
  --n_prompts 1000 \
  --output prompts.json \
  --n_threads 50 \
  --step 30 \
  --key_word dog

2. LLaMA

code/Visual Text Sampling/sample_prompts_llama.py

python sample_prompts_llama.py \
  --model meta-llama/Llama-2-13b-chat-hf \
  --n_prompts 1000 \
  --output prompts.json \
  --num_return_sequences 2

T2I Generation

1. Stable Diffusion

code/Visual Text Sampling/text2img_sd.py

python text2img_sd.py \
  --model stabilityai/stable-diffusion-2-1 \
  --prompt_json_path prompts.json \
  --output_dir image_output \
  --num 1000 \
  --batch_size 4

2. Stable Diffusion XL

code/Visual Text Sampling/text2img_sdxl.py

python text2img_sdxl.py \
  --model stabilityai/stable-diffusion-xl-base-1.0 \
  --prompt_json_path prompts.json \
  --output_dir image_output \
  --num 1000 \
  --batch_size 4
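
When several T2I models are evaluated on the same prompt set, it can be convenient to script these invocations instead of typing them by hand. The following is a minimal sketch that only assembles the command line from the flags shown above; `build_command` is a hypothetical helper for illustration and is not part of this repository.

```python
import shlex


def build_command(script, **flags):
    """Build an argv list for a script from keyword flags (hypothetical helper)."""
    argv = ["python", script]
    for name, value in flags.items():
        # Each keyword becomes "--name value", mirroring the commands in this README.
        argv += [f"--{name}", str(value)]
    return argv


# Same flags as the Stable Diffusion example above.
sd_cmd = build_command(
    "text2img_sd.py",
    model="stabilityai/stable-diffusion-2-1",
    prompt_json_path="prompts.json",
    output_dir="image_output",
    num=1000,
    batch_size=4,
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(sd_cmd))
```

To execute the command rather than print it, the list can be passed to `subprocess.run(sd_cmd, check=True)` from the directory that holds the script.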

VLEU Calculation

1. CLIP

code/VLEU Calculation/cal_vleu_clip.py

python cal_vleu_clip.py \
  --model openai/clip-vit-base-patch16 \
  --prompt_json_path prompts.json \
  --image_dir image_output

2. OpenCLIP

code/VLEU Calculation/cal_vleu_openclip.py

python cal_vleu_openclip.py \
  --model ViT-L-14 \
  --pretrained open_clip_pytorch_model.bin \
  --prompt_json_path prompts.json \
  --image_dir image_output
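
This README does not spell out the metric itself. As described in the paper, VLEU compares the marginal distribution of the visual text with the conditional distribution given each generated image, using the Kullback-Leibler divergence. The sketch below shows one way to compute such a score from a matrix of image-text similarities. It is an illustration under stated assumptions, not the code in `cal_vleu_clip.py`: the function name, the softmax temperature of 0.01, and the choice to exponentiate the mean divergence are all assumptions made here.

```python
import numpy as np


def vleu_from_similarities(sim, temperature=0.01):
    """Sketch of a VLEU-style score (assumed form, not the repository's code).

    sim: array of shape (n_images, n_prompts) holding image-text
         cosine similarities, e.g. from a CLIP model.
    """
    logits = np.asarray(sim, dtype=np.float64) / temperature
    # Subtract the row maximum so the softmax is numerically stable.
    logits = logits - logits.max(axis=1, keepdims=True)
    cond = np.exp(logits)
    cond /= cond.sum(axis=1, keepdims=True)      # p(text | image)
    marginal = cond.mean(axis=0, keepdims=True)  # p(text), averaged over images
    eps = 1e-12                                  # avoids log(0)
    kl = np.sum(cond * (np.log(cond + eps) - np.log(marginal + eps)), axis=1)
    # Exponentiated mean KL divergence (assumed aggregation).
    return float(np.exp(kl.mean()))


# Each image matches exactly one distinct prompt.
score_diverse = vleu_from_similarities(np.eye(4))
# Every image is equally similar to every prompt.
score_flat = vleu_from_similarities(np.zeros((3, 5)))
print(score_diverse, score_flat)
```

Under these assumptions the score ranges from 1, when images carry no information about which prompt produced them, up to the number of prompts, when every image matches a different prompt.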

Citation

If you use this code or data in your research, please cite our paper:

@misc{cao2024vleumethodautomaticevaluation,
      title={VLEU: a Method for Automatic Evaluation for Generalizability of Text-to-Image Models}, 
      author={Jingtao Cao and Zheng Zhang and Hongru Wang and Kam-Fai Wong},
      year={2024},
      eprint={2409.14704},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2409.14704}, 
}
