https://arxiv.org/abs/1711.00937
Abstract
- paper proposes a model (VQ-VAE) that learns "discrete representations"
- differs from VAEs in two ways
- encoder network outputs discrete codes (i.e. not continuous ones)
- prior is learnt rather than static
- vector quantization lets the model circumvent "posterior collapse"
- posterior collapse: latents are ignored when paired with a powerful autoregressive decoder (typically observed in the VAE framework)
1. Introduction
- generic representations trained in an unsupervised fashion are still far from being the dominant approach
- goal: a model that conserves the important features of the data in its latent space while optimising for maximum likelihood
- paper concentrates on discrete representations
- images can often be described concisely by language
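- the paper combines the VAE framework with discrete latent representations through a new parameterization of the posterior distribution of the latents given an observation; this parameterization relies on vector quantization (VQ)
- posterior collapse means the latents are ignored by the decoder
- because VQ-VAE makes effective use of the latent space, it can model important features that span many dimensions in data space (e.g. objects span many pixels in an image)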
Model features
- simple and unsupervised
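- uses discrete latents, does not suffer from posterior collapse, and has no variance issues
- performs as well as its continuous counterparts in log-likelihood
- when paired with a powerful prior, samples are coherent and high quality on a wide variety of applications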
2. Related work
3. VQ-VAE

Order
- in a VAE, the encoder parameterises a posterior distribution q(z|x) of discrete latent random variables z given the input data x
- posteriors and priors in VAEs are typically assumed to be normally distributed with diagonal covariance, which allows the Gaussian reparameterization trick to be used [32, 23]
- extensions of this setup include:
- autoregressive prior and posterior models [14]
- normalizing flows [10]
- inverse autoregressive posteriors [22]
3.1. Discrete Latent variables
- K is the size of the discrete latent space (i.e. a K-way categorical)
- D is the dimensionality of each latent embedding vector e_i
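
The step that turns an encoder output into a discrete code can be sketched as a nearest-neighbour search over the K embedding vectors. The snippet below is a minimal pure-Python sketch, not the paper's implementation; the names `squared_distance`, `quantize`, and `codebook`, as well as the toy values, are assumptions made for illustration.

```python
# Minimal sketch of vector quantization (nearest-neighbour codebook lookup).
# Assumption: `codebook` is a list of K embedding vectors e_i, each of length D.
# Function names and toy values are illustrative only.


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(z_e, codebook):
    """Return (k, e_k): index and vector of the embedding nearest to z_e."""
    k = min(range(len(codebook)),
            key=lambda i: squared_distance(z_e, codebook[i]))
    return k, codebook[k]


# Toy codebook with K = 3 embeddings of dimensionality D = 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
z_e = [0.9, 1.2]  # hypothetical encoder output
k, z_q = quantize(z_e, codebook)
print(k, z_q)
```

The index `k` is the discrete latent; the decoder would receive the embedding `z_q` rather than the raw encoder output.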

3.2. Learning
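- the loss has three terms
- reconstruction loss: optimises the decoder and the encoder
- codebook (VQ) loss: uses a stop-gradient on the encoder output so that only the embeddings move towards it
- commitment loss: uses a stop-gradient on the embedding so that the encoder output stays close to the chosen embedding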
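
The loss terms listed above combine into the training objective as the paper writes it (its Equation 3). Here sg denotes the stop-gradient operator, z_e(x) the encoder output, z_q(x) the quantized latent, e the selected embedding, and β the commitment weight (the paper reports using β = 0.25):

```latex
L = \log p(x \mid z_q(x))
  + \lVert \mathrm{sg}[z_e(x)] - e \rVert_2^2
  + \beta \, \lVert z_e(x) - \mathrm{sg}[e] \rVert_2^2
```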

4. Experiments

5. Conclusion
- VQ-VAEs are capable of modelling very long-term dependencies through their compressed discrete latent space
- VQ-VAEs capture important features of the data in a completely unsupervised manner
My Comments
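- "discrete" in VQ-VAE refers to selecting one embedding e_i from the codebook
- training may be hard because the hyperparameters in the loss function have to be considered
- the stop-gradient operator can be implemented with tf.stop_gradient in TensorFlow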
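
What the stop-gradient does during training can be illustrated without any framework. The sketch below is a hand-worked scalar example under stated assumptions: an identity decoder, a squared-error reconstruction term, and hypothetical names and values throughout. It shows the straight-through estimator, where the decoder's gradient is copied unchanged to the encoder output, and which variable each stop-gradient term is allowed to update. It deliberately avoids TensorFlow so that it stays dependency-free.

```python
# Hand-computed gradients for a single scalar latent.
# Assumptions (not from the paper's code): identity decoder (x_hat = z_q),
# squared-error reconstruction, illustrative names and values.

BETA = 0.25  # commitment weight; the paper reports using 0.25


def vq_step_gradients(x, z_e, codebook, beta=BETA):
    """Return (k, grad wrt encoder output z_e, grad wrt chosen embedding e_k)."""
    # Forward pass: pick the nearest codebook entry (the discrete index k).
    k = min(range(len(codebook)), key=lambda i: (z_e - codebook[i]) ** 2)
    e_k = codebook[k]
    z_q = e_k  # quantized latent that is passed to the decoder

    # Reconstruction term (x_hat - x)^2 with x_hat = z_q.
    grad_z_q = 2.0 * (z_q - x)

    # Straight-through estimator: z_q behaves like z_e + sg[e_k - z_e],
    # so the gradient wrt z_q is copied to z_e and e_k receives none of it.
    grad_z_e = grad_z_q

    # Commitment term beta * (z_e - sg[e_k])^2: gradient reaches z_e only.
    grad_z_e += beta * 2.0 * (z_e - e_k)

    # Codebook term (sg[z_e] - e_k)^2: gradient reaches e_k only.
    grad_e_k = 2.0 * (e_k - z_e)

    return k, grad_z_e, grad_e_k


k, grad_z_e, grad_e_k = vq_step_gradients(x=1.0, z_e=0.8,
                                          codebook=[0.0, 0.5, 1.5])
print(k, grad_z_e, grad_e_k)
```

A gradient-descent step on `grad_e_k` moves the chosen embedding towards the encoder output, while the step on `grad_z_e` mixes the reconstruction signal with a pull towards the embedding.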