[SIGGRAPH Asia 2025] The official repo for the conference paper "MV-Performer: Taming Video Diffusion Model for Faithful and Synchronized Multi-view Performer Synthesis". [Paper]
```bash
# Create and activate conda environment
conda create -n mvperformer python=3.10 -y
conda activate mvperformer
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
pip install -r ./requirements.txt
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder pytorch3d==0.7.8+pt2.6.0cu124
pip install -e ./DiffSynth-Studio
# Install ffmpeg
```

The Wan model will be downloaded automatically into `wan_models/`.
If you have already downloaded the Wan2.1-T2V-1.3B model, please link it to `wan_models/Wan-AI/Wan2.1-T2V-1.3B`.
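For a local copy, the link can be created as follows; `/path/to/Wan2.1-T2V-1.3B` is a placeholder for wherever your download actually lives:

```shell
# Link an existing local Wan2.1-T2V-1.3B download into the expected location.
# /path/to/Wan2.1-T2V-1.3B is a placeholder path.
mkdir -p wan_models/Wan-AI
ln -sfn /path/to/Wan2.1-T2V-1.3B wan_models/Wan-AI/Wan2.1-T2V-1.3B
```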
We provide our DiT checkpoint on OneDrive. Please download it and place it at `checkpoints/mv-performer/dit/diffusion_pytorch_model.bin`:
```bash
wget https://cuhko365-my.sharepoint.com/:u:/g/personal/223010099_link_cuhk_edu_cn/IQBPNKjkFpu_RJ1sMhNBk8-GAaaWCkEPzKd28Qn3dvu1ANs?download\=1 -O ./checkpoints/mv-performer/dit/diffusion_pytorch_model.bin
```

We have uploaded the validation set to OneDrive, which includes the raw data of 10 MVHumanNet actors and 10 DNA-Rendering actors, along with their extracted video latents. Please download each `{human_id}.zip` into either `data/val_data/dna` or `data/val_data/mvhuman` and unzip it. `val_data` has the following folder structure:
```
├── dna / mvhuman
│ ├── {human_id}
│ │ ├── cam.pkl
│ │ ├── crop_gt
│ │ ├── depths
│ │ ├── images
│ │ ├── masks
│ │ ├── partial_render
│ │ ├── smpl_mesh
│ │ └── smpl_params
│ ├── .....
├── val_cache
│ ├── dna
│ ├── mvhuman
```
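As a sanity check after unzipping, here is a minimal sketch that walks this layout and loads each actor's `cam.pkl`. The helper name `list_actors` is ours, and the pickle's exact contents are not documented here, so the sketch just returns whatever each `cam.pkl` holds:

```python
import os
import pickle

def list_actors(val_root):
    """Walk a data/val_data-style layout ({dataset}/{human_id}/cam.pkl ...)
    and return {(dataset, human_id): contents of cam.pkl}."""
    actors = {}
    for dataset in ("dna", "mvhuman"):
        ds_dir = os.path.join(val_root, dataset)
        if not os.path.isdir(ds_dir):
            continue
        for human_id in sorted(os.listdir(ds_dir)):
            cam_path = os.path.join(ds_dir, human_id, "cam.pkl")
            if not os.path.isfile(cam_path):
                continue  # skip stray files or incomplete downloads
            with open(cam_path, "rb") as f:
                actors[(dataset, human_id)] = pickle.load(f)
    return actors
```

Running `list_actors("data/val_data")` should report one entry per unzipped `{human_id}` folder; a missing entry usually means the zip was extracted one directory level too deep.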
If you want to construct more cases, please refer to this for more details.
```bash
# Generate novel-view results
python val.py --data_type dna
python val.py --data_type mvhuman
# The results will be stored in ./outputs/val_results
```

To compute FVD, we need to download `i3d_pretrained_400.pt`:

```bash
wget https://raw.githubusercontent.com/SongweiGe/TATS/main/tats/fvd/i3d_pretrained_400.pt -O ./checkpoints/fvd/i3d_pretrained_400.pt
```
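For reference, FVD is the Fréchet distance between Gaussians fitted to I3D features of real and generated videos. A minimal sketch of that final distance step (the I3D feature extraction with the checkpoint above is omitted, and `frechet_distance` is our own helper name):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrtm(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    # sqrtm can pick up tiny imaginary parts from numerical error
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical feature statistics give a distance of zero, which is a quick way to check a wiring mistake in the evaluation pipeline.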
Run the evaluation:

```bash
# Compute metrics on MVHumanNet
python evaluation.py --data_root outputs/val_results/mvhuman/dit_step50 --gt_root data/val_data/mvhuman
# Compute metrics on DNA-Rendering
python evaluation.py --data_root outputs/val_results/dna/dit_step50 --gt_root data/val_data/dna
```

We put the processed monocular videos here; please download and unzip them into `data/wild_data/`:
```bash
wget https://cuhko365-my.sharepoint.com/:u:/g/personal/223010099_link_cuhk_edu_cn/IQDtQUpCyG3FSbnPx8ARGPXDAU2Oq5n5kYVRdYALmn-G900\?download\=1 -O data/test_data.zip
# Unzip
unzip data/test_data.zip -d data/test_data
```

```bash
python infer.py --vid_name vid01
# The results will be stored in ./outputs/wild_results
```

The preprocessing scripts rely on many third-party prior models, and we are still cleaning up the code. Stay tuned.
We thank the authors of CogVideoX, SynCamMaster, DiffSynth-Studio, MonoSDF, ViewCrafter, Pi3, SAMURAI, MoGe, and others for their great work. We use their code in our project.
If you find this code useful for your research, please use the following BibTeX entry.
```bibtex
@inproceedings{zhi2025mv,
  title={MV-Performer: Taming Video Diffusion Model for Faithful and Synchronized Multi-view Performer Synthesis},
  author={Zhi, Yihao and Li, Chenghong and Liao, Hongjie and Yang, Xihe and Sun, Zhengwentai and Chang, Jiahao and Cun, Xiaodong and Feng, Wensen and Han, Xiaoguang},
  booktitle={Proceedings of the SIGGRAPH Asia 2025 Conference Papers},
  pages={1--14},
  year={2025}
}
```
