Hi,
I want to elaborate on #2:
The sampling algorithm in your code is a bit different from what is shown in the paper.
The paper suggests this sampling step:

while you do this:

The clipping is done here (diffusion/diffusion_tf/diffusion_utils.py, line 172 in 1e0dceb):

    x_recon = tf.clip_by_value(x_recon, -1., 1.)

Now I checked and indeed, without the clipping, the two equations are the same.
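
The equivalence can be checked numerically. The sketch below is a minimal NumPy version, not the repo's code. It assumes the usual DDPM notation (beta_t, alpha_t = 1 - beta_t, and alpha-bar_t as the cumulative product of alpha), a linear beta schedule with illustrative values, and the hypothetical function names `mean_from_eps` and `mean_via_x0`.

```python
import numpy as np

# Assumed linear beta schedule; the values are illustrative only.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_cumprod = np.cumprod(alphas)
alphas_cumprod_prev = np.append(1.0, alphas_cumprod[:-1])


def mean_from_eps(x_t, eps, t):
    """Mean of the reverse step written directly in terms of the predicted noise."""
    coef = betas[t] / np.sqrt(1.0 - alphas_cumprod[t])
    return (x_t - coef * eps) / np.sqrt(alphas[t])


def mean_via_x0(x_t, eps, t, clip=False):
    """Same mean, computed by first reconstructing x_0 (optionally clipped)."""
    x_recon = (np.sqrt(1.0 / alphas_cumprod[t]) * x_t
               - np.sqrt(1.0 / alphas_cumprod[t] - 1.0) * eps)
    if clip:
        # Mirrors the tf.clip_by_value call: keep x_0 in the image range [-1, 1].
        x_recon = np.clip(x_recon, -1.0, 1.0)
    coef_x0 = betas[t] * np.sqrt(alphas_cumprod_prev[t]) / (1.0 - alphas_cumprod[t])
    coef_xt = (1.0 - alphas_cumprod_prev[t]) * np.sqrt(alphas[t]) / (1.0 - alphas_cumprod[t])
    return coef_x0 * x_recon + coef_xt * x_t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_t = rng.standard_normal((4, 8))
    eps = rng.standard_normal((4, 8))
    t = 999
    # Compare the two forms without and with clipping of the reconstruction.
    print(np.allclose(mean_from_eps(x_t, eps, t), mean_via_x0(x_t, eps, t)))
    print(np.allclose(mean_from_eps(x_t, eps, t), mean_via_x0(x_t, eps, t, clip=True)))
```

With random inputs at a large t, the reconstructed x_0 falls far outside [-1, 1], so this is where the clipped and unclipped versions would be expected to diverge.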
Can you give any interpretation or intuition for the clipping and why it is needed?
It seems to be crucial for training, yet it is not mentioned in the paper.
Thanks