Open Access · Posted Content
A Tutorial on Deep Latent Variable Models of Natural Language
TL;DR: This tutorial explores in depth, through the lens of variational inference, the issues that arise when conditional likelihoods in latent variable models are parameterized with powerful function approximators.
Abstract:
There has been much recent, exciting work on combining the complementary strengths of latent variable models and deep learning. Latent variable modeling makes it easy to explicitly specify model constraints through conditional independence properties, while deep learning makes it possible to parameterize these conditional likelihoods with powerful function approximators. While these "deep latent variable" models provide a rich, flexible framework for modeling many real-world phenomena, difficulties exist: deep parameterizations of conditional likelihoods usually make posterior inference intractable, and latent variable objectives often complicate backpropagation by introducing points of non-differentiability. This tutorial explores these issues in depth through the lens of variational inference.
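The intractable-posterior and non-differentiability issues the abstract mentions are commonly addressed with amortized variational inference and the Gaussian reparameterization trick. A minimal NumPy sketch of that trick and the closed-form KL term of the ELBO (function names here are illustrative, not taken from the tutorial):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) as a differentiable function of (mu, log_var)."""
    eps = rng.standard_normal(mu.shape)  # noise is drawn outside the computation graph
    return mu + np.exp(0.5 * log_var) * eps

def gaussian_kl(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), the regularizer term of the ELBO."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

rng = np.random.default_rng(0)
mu, log_var = np.zeros(4), np.zeros(4)
z = reparameterize(mu, log_var, rng)   # a differentiable sample from the posterior
kl = gaussian_kl(mu, log_var)          # 0 when the posterior equals the prior
```

Because the randomness enters only through `eps`, gradients with respect to `mu` and `log_var` flow through the sample, which is what makes backpropagation through the latent variable possible.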
Citations
Posted Content
Evaluation of Text Generation: A Survey
TL;DR: This paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models.
Proceedings ArticleDOI
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
TL;DR: This paper proposes the first large-scale language VAE model, Optimus, a universal latent embedding space for sentences that is first pre-trained on a large text corpus and then fine-tuned for various language generation and understanding tasks.
Proceedings ArticleDOI
ARNOR: Attention Regularization based Noise Reduction for Distant Supervision Relation Classification
TL;DR: This paper proposes ARNOR, a novel Attention Regularization based NOise Reduction framework for distant supervision relation classification that assumes that a trustable relation label should be explained by the neural attention model.
Proceedings ArticleDOI
Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence
TL;DR: This work proposes to explicitly segment target text into fragment units and align them with their data correspondences; the model maintains the same expressive power as neural attention models while generating fully interpretable outputs at several times lower computational cost.
Posted Content
Paraphrase Generation with Latent Bag of Words
TL;DR: This work proposes a latent bag of words (BOW) model for paraphrase generation that grounds the semantics of a discrete latent variable in the target BOW, building a fully differentiable content planning and surface realization pipeline.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
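The adaptive moment estimates described above can be sketched as a single Adam update step. This is a minimal NumPy illustration using the paper's default hyperparameters; `adam_step` is an illustrative name, not an API from any library:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: running estimates of the first (m) and second (v) moments."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)          # bias correction for the running mean
    v_hat = v / (1 - b2**t)          # bias correction for the running uncentered variance
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t)
# theta moves from 1.0 toward the minimum at 0
```

Note that the effective step size is roughly bounded by `lr` regardless of the gradient's scale, since `m_hat / sqrt(v_hat)` is close to ±1 when gradients have a consistent sign.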
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
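The gated cell update behind the "constant error carousel" can be sketched as one LSTM step. This is a minimal NumPy illustration; the fused weight layout and names are assumptions of convenience, not the paper's notation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates regulate the additively updated cell state c."""
    z = W @ x + U @ h + b                          # all four gate pre-activations at once
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input, forget, output gates
    g = np.tanh(g)                                 # candidate cell update
    c = f * c + i * g                              # additive update: error flows through c
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 5  # input and hidden sizes, chosen arbitrarily for the sketch
W = rng.standard_normal((4 * H, D))
U = rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for _ in range(10):
    h, c = lstm_step(rng.standard_normal(D), h, c, W, U, b)
```

The additive form of the cell update `c = f * c + i * g` is what lets gradients propagate across long time lags without vanishing, which is the property the TL;DR refers to.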
Journal ArticleDOI
Maximum likelihood from incomplete data via the EM algorithm
Journal ArticleDOI
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
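The adversarial objective pairs two losses, one per model. A minimal NumPy sketch of the discriminator loss and the non-saturating generator loss (the latter is the practical variant suggested in the paper; function names here are illustrative):

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator maximizes log D(x) + log(1 - D(G(z)))."""
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def g_loss(d_fake):
    """Non-saturating generator loss: maximize log D(G(z))."""
    return -np.log(d_fake).mean()

# At the theoretical equilibrium the discriminator outputs 0.5 everywhere,
# and the discriminator loss equals log 4.
d_eq = d_loss(np.full(8, 0.5), np.full(8, 0.5))
```

The `log 4` equilibrium value follows from the paper's analysis: when the generator matches the data distribution, the optimal discriminator cannot do better than chance.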
Journal Article
Dropout: a simple way to prevent neural networks from overfitting
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
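Dropout as described can be sketched with a random binary mask. This sketch uses the "inverted" variant, which rescales activations at training time so the test-time forward pass is unchanged; the paper instead rescales the weights at test time, so treat this as a common modern convenience, not the paper's exact procedure:

```python
import numpy as np

def dropout(x, p, rng, train=True):
    """Inverted dropout: zero each unit with probability p, rescale survivors by 1/(1-p)."""
    if not train:
        return x  # no-op at test time thanks to the training-time rescaling
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones(10000)
y = dropout(x, 0.5, rng)
# About half the units are zeroed; rescaling keeps the mean close to 1.0.
```

Randomly removing units during training prevents co-adaptation, which is the mechanism behind the regularization effect described above.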