Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models

TL;DR:

We present Frame Guidance, a training-free framework that supports diverse control tasks using frame-level signals.

This is the official implementation of the paper 'Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models'.

[ICLR 2026] Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
Sangwon Jang*, Taekyung Ki*, Jaehyeong Jo, Jaehong Yoon, Soo Ye Kim, Zhe Lin, Sung Ju Hwang
(* indicates equal contribution)

Installation

2026.02.11: 🚨 There is an installation error with openai-CLIP. Please refer to: openai/CLIP#528.

2026.02.12: 🚨 There is a Wan model loading error with transformers==5.0.0. Please use transformers==4.57.3 until this issue is fixed.

Please refer to `settings.sh` for the conda environment setup.
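
Before loading a Wan model, the `transformers` pin from the note above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `transformers_status` is hypothetical, and it only compares the installed version string against `4.57.3`.

```python
from importlib import metadata

# Version recommended in the installation note above.
REQUIRED_TRANSFORMERS = "4.57.3"


def transformers_status(installed=None):
    """Return 'ok', 'mismatch', or 'missing' for the transformers install.

    `installed` may be passed explicitly (useful for testing); otherwise the
    version is read from the active Python environment.
    """
    if installed is None:
        try:
            installed = metadata.version("transformers")
        except metadata.PackageNotFoundError:
            return "missing"
    return "ok" if installed == REQUIRED_TRANSFORMERS else "mismatch"


if __name__ == "__main__":
    print(f"transformers: {transformers_status()} (want {REQUIRED_TRANSFORMERS})")
```

If the check reports `mismatch`, reinstalling the pinned version in the conda environment should resolve the Wan loading error described above.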

Inference

| 🧩 Task | 🔧 Base model | 📂 Code |
| --- | --- | --- |
| 🎯 Keyframe-guided, Color block, Depth, Sketch | CogX-I2V | `keyframe_cogx.ipynb` |
| 🎨 Stylized, 🔁 Loop | CogX-T2V | `others_cogx.ipynb` |

Wan2.1 version will be updated!

| 🧩 Task | 🔧 Base model | 📂 Code |
| --- | --- | --- |
| 🎯 Keyframe-guided, Color block, Depth, Sketch | Wan-I2V | `keyframe_wan.ipynb` |
| 🎨 Stylized, 🔁 Loop | Wan-T2V | `others_wan.ipynb` |

| Parameter | Description | Default |
| --- | --- | --- |
| `--video` | Input conditions for guidance (list: `[img0, img1, ... imgL]`) | required for I2V |
| `--guidance_lr` | Schedule for the guidance step size η | `3e0` |
| `--guidance_step` | Schedule for the number of guidance steps M | see `.ipynb` file |
| `--fixed_frames` | Frames on which to apply frame guidance (e.g., `[25,48]` applies guidance to the 25th and 48th frames) | required |
| `--strength` | V2V strength (sometimes helps keyframe guidance converge faster) | `0` |
| `--loss_fn` | Loss design for each task [`frame`, `style`, `depth`, `lineart`, `loop` ...] | required |
| `--travel_time` | When the time-travel (stochastic) step is applied | CogX: (5, 20), Wan: (3, 10) |

See each task-specific example for details.
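
To make the parameter table easier to reason about, the sketch below collects the listed options in one Python object and fills in the per-model `travel_time` default. It is an illustration only: `FrameGuidanceArgs`, `base_model`, and `DEFAULT_TRAVEL_TIME` are hypothetical names that do not come from the repository, and the notebooks may organize these values differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Default time-travel settings per base model, copied from the table above.
DEFAULT_TRAVEL_TIME = {"cogx": (5, 20), "wan": (3, 10)}


@dataclass
class FrameGuidanceArgs:  # hypothetical container, not part of the repository
    fixed_frames: List[int]                 # required: frames to guide, e.g. [25, 48]
    loss_fn: str                            # required: e.g. "frame", "style", "depth"
    base_model: str = "cogx"                # hypothetical selector: "cogx" or "wan"
    video: Optional[list] = None            # guidance inputs; required for I2V
    guidance_lr: float = 3e0                # guidance step size (eta)
    guidance_step: Optional[object] = None  # schedule for M; see the notebooks
    strength: float = 0.0                   # V2V strength
    travel_time: Optional[Tuple[int, int]] = None

    def __post_init__(self):
        if not self.fixed_frames:
            raise ValueError("fixed_frames is required and must not be empty")
        if not self.loss_fn:
            raise ValueError("loss_fn is required")
        if self.base_model not in DEFAULT_TRAVEL_TIME:
            raise ValueError(f"unknown base_model: {self.base_model!r}")
        # Fall back to the per-model default listed in the table.
        if self.travel_time is None:
            self.travel_time = DEFAULT_TRAVEL_TIME[self.base_model]


if __name__ == "__main__":
    args = FrameGuidanceArgs(fixed_frames=[25, 48], loss_fn="frame", base_model="wan")
    print(args.travel_time)
```

Keeping the required fields (`fixed_frames`, `loss_fn`) without defaults means a missing value fails at construction time rather than partway through sampling.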

@inproceedings{
  jang2026frame,
  title={Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models},
  author={Sangwon Jang and Taekyung Ki and Jaehyeong Jo and Jaehong Yoon and Soo Ye Kim and Zhe Lin and Sung Ju Hwang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=y39XbEp1vK}
}
