Neural Discrete Representation Learning #23

@flrngel

https://arxiv.org/abs/1711.00937

Abstract

  • the paper proposes a model (VQ-VAE) that learns "discrete representations"
  • it differs from standard VAEs
    • the encoder network outputs discrete codes (i.e. not continuous ones)
    • the prior is learnt rather than static
    • it circumvents the issue of posterior collapse
      • the latents are ignored by the decoder (typically observed in other VAEs)

1. Introduction

  • useful generic representations learnt in an unsupervised fashion are still lacking
  • the goal is a model that conserves the important features of the data in its latent space while optimising for maximum likelihood
  • the paper concentrates on discrete representations
  • images can often be described concisely by language
  • VQ-VAE combines the VAE framework with discrete latent representations through a new parameterisation of the posterior distribution of the latents given an observation; it relies on vector quantization (VQ)
  • posterior collapse means the latents are ignored by the decoder
  • the latent space can capture important features that span many dimensions in data space

Model features

  • simple and unsupervised
  • uses discrete latents, does not suffer from posterior collapse and has no variance issues
  • performs as well as its continuous counterparts
  • samples are coherent and high quality on a wide variety of applications

2. Related work

3. VQ-VAE


Order

  1. The encoder parameterises a posterior distribution q(z|x) of discrete latent random variables z given the input data x
  2. Posteriors and priors in VAEs are typically assumed to be normally distributed with diagonal covariance, which allows the Gaussian reparameterization trick to be used [32, 23]
  • extensions include autoregressive prior and posterior models [14]
  • normalizing flows [10]
  • inverse autoregressive posteriors [22]
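
For reference, the Gaussian reparameterization trick mentioned in step 2 can be sketched in a few lines. This is a minimal, framework-free illustration and not code from the paper; the function name `reparameterize` and the choice to pass the log-variance are assumptions made for the example.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) per dimension.

    mu and log_var are lists of equal length (diagonal covariance).
    The randomness lives in eps, so z is a deterministic,
    differentiable function of mu and log_var.
    """
    z = []
    for m, lv in zip(mu, log_var):
        sigma = math.exp(0.5 * lv)   # standard deviation from log-variance
        eps = rng.gauss(0.0, 1.0)    # noise independent of the parameters
        z.append(m + sigma * eps)
    return z


# Hypothetical 2-dimensional posterior parameters.
sample = reparameterize([0.0, 1.0], [0.0, -1.0], rng=random.Random(0))
```

A discrete latent has no such smooth sampling path, which is why VQ-VAE needs a different way to pass gradients through the bottleneck.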

3.1. Discrete Latent variables

  • K is the size of the discrete latent space (i.e. a K-way categorical)
  • D is the dimensionality of each latent embedding vector e_i
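
As a rough sketch of how K and D are used: the encoder output z_e(x) is mapped to the index of its nearest embedding e_k, and that embedding is passed to the decoder. The helper names `squared_distance` and `quantize` and the toy codebook values below are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(z_e, codebook):
    """Return (k, e_k): the index and the embedding closest to z_e.

    codebook is a list of K vectors, each of dimensionality D.
    """
    k = min(range(len(codebook)),
            key=lambda j: squared_distance(z_e, codebook[j]))
    return k, codebook[k]


# Toy codebook with K = 3 embeddings of dimensionality D = 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
k, z_q = quantize([0.9, 1.2], codebook)
print(k, z_q)  # 1 [1.0, 1.0]
```

The discrete code is the index k; the decoder only ever sees one of the K embedding vectors.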

3.2. Learning

  • Loss (three terms)
    • reconstruction loss
    • codebook (VQ) loss, which uses the stop-gradient operator
    • commitment loss
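
The three terms can be written as L = reconstruction + ||sg[z_e(x)] - e||² + β ||z_e(x) - sg[e]||², where sg is the stop-gradient operator. The sketch below only evaluates the loss value for a single latent vector; it assumes a squared-error reconstruction term and β = 0.25 as a default, and the function names are hypothetical. Because sg is the identity in the forward pass, the codebook and commitment terms have the same squared distance and differ only in which side receives gradients.

```python
def sq_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_vae_loss(x, x_recon, z_e, e, beta=0.25):
    """Return (total, parts) for one input and one latent vector.

    x, x_recon : input and its reconstruction
    z_e        : encoder output
    e          : nearest codebook embedding
    beta       : commitment weight (assumed default)
    """
    recon = sq_error(x, x_recon)        # assumed squared-error likelihood
    codebook = sq_error(z_e, e)         # ||sg[z_e] - e||^2, trains e only
    commit = beta * sq_error(z_e, e)    # beta * ||z_e - sg[e]||^2, trains encoder only
    return recon + codebook + commit, (recon, codebook, commit)


total, parts = vq_vae_loss(
    x=[1.0, 0.0], x_recon=[0.5, 0.0],
    z_e=[0.0, 2.0], e=[0.0, 1.0],
)
print(total, parts)  # 1.5 (0.25, 1.0, 0.25)
```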

4. Experiments


5. Conclusion

  • VQ-VAEs are capable of modeling very long-term dependencies through their compressed discrete latent space
  • VQ-VAEs capture important features of the data in an unsupervised manner

My Comments

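  • the word "discrete" in VQ-VAE comes from the embeddings (e_i): each encoder output is replaced by one of them
  • training could be hard because the loss function has hyperparameters that need tuning
  • the stop-gradient operator can be implemented with tf.stop_gradient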
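
To make the effect of the stop-gradient operator concrete, here is a hand-written gradient sketch for one latent vector, with no autodiff framework. It assumes the straight-through approach described in the paper (the reconstruction gradient with respect to the decoder input z_q is copied to the encoder output z_e) and β = 0.25 as a default; `vq_gradients` and its arguments are hypothetical names.

```python
def vq_gradients(z_e, e, grad_recon_wrt_z_q, beta=0.25):
    """Return (grad_z_e, grad_e) for the VQ-VAE loss on one latent vector.

    z_e                : encoder output
    e                  : selected codebook embedding
    grad_recon_wrt_z_q : gradient of the reconstruction loss w.r.t. z_q
    """
    # Encoder side: the reconstruction gradient is copied straight through
    # the quantization step, plus the commitment term
    # d/dz_e [beta * ||z_e - sg[e]||^2] = 2 * beta * (z_e - e).
    grad_z_e = [g + 2.0 * beta * (a - b)
                for g, a, b in zip(grad_recon_wrt_z_q, z_e, e)]
    # Codebook side: only the codebook term reaches e, since sg[z_e] is
    # treated as a constant: d/de ||sg[z_e] - e||^2 = 2 * (e - z_e).
    grad_e = [2.0 * (b - a) for a, b in zip(z_e, e)]
    return grad_z_e, grad_e


grad_z_e, grad_e = vq_gradients(
    z_e=[0.0, 2.0], e=[0.0, 1.0], grad_recon_wrt_z_q=[1.0, -1.0]
)
print(grad_z_e, grad_e)  # [1.0, -0.5] [0.0, -2.0]
```

In TensorFlow the same routing is usually obtained by wrapping the relevant tensors in `tf.stop_gradient` and letting autodiff do the rest.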
