Open Access · Proceedings Article
Agreement-Based Learning
Percy Liang, Dan Klein, Michael I. Jordan
Vol. 20, pp. 913–920
TL;DR: An objective function is proposed for the approach, EM-style algorithms for parameter estimation are derived, and their effectiveness is demonstrated on three challenging real-world learning tasks.
Abstract: The learning of probabilistic models with many hidden variables and non-decomposable dependencies is an important and challenging problem. In contrast to traditional approaches based on approximate inference in a single intractable model, our approach is to train a set of tractable submodels by encouraging them to agree on the hidden variables. This allows us to capture non-decomposable aspects of the data while still maintaining tractability. We propose an objective function for our approach, derive EM-style algorithms for parameter estimation, and demonstrate their effectiveness on three challenging real-world learning tasks.
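The abstract's core idea, training tractable submodels so that they agree on the hidden variables, can be illustrated with a small sketch: two class-conditional tables are fit with an EM-style loop whose E-step takes the product of the submodels' posteriors over the hidden label, which is the agreement term. The toy two-view data, the naive table submodels, and the update rules below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 2    # number of hidden states
V = 4    # observation vocabulary size for each view
N = 200  # number of examples

# Toy data: two "views" of each example, correlated through a latent z.
z_true = rng.integers(0, K, size=N)
x1 = (z_true * 2 + rng.integers(0, 2, size=N)) % V
x2 = (z_true * 2 + rng.integers(0, 2, size=N)) % V

# Each submodel is a class-conditional table p_m(x | z) with uniform p(z).
theta1 = rng.dirichlet(np.ones(V), size=K)   # shape (K, V)
theta2 = rng.dirichlet(np.ones(V), size=K)

for _ in range(50):
    # E-step: posterior over z proportional to the PRODUCT of the two
    # submodels' likelihoods -- this product is the agreement term.
    post = theta1[:, x1] * theta2[:, x2]      # shape (K, N)
    post /= post.sum(axis=0, keepdims=True)

    # M-step: re-estimate each submodel from the shared posterior.
    for theta, x in [(theta1, x1), (theta2, x2)]:
        counts = np.zeros((K, V))
        np.add.at(counts.T, x, post.T)        # counts[k, v] += post[k, n] for x[n] == v
        theta[:] = (counts + 1e-6) / (counts + 1e-6).sum(axis=1, keepdims=True)

# After training, the two submodels should largely agree on the hidden label.
pred1 = theta1[:, x1].argmax(axis=0)
pred2 = theta2[:, x2].argmax(axis=0)
agreement = (pred1 == pred2).mean()
```

On this cleanly separable toy data the product E-step drives both tables toward a consistent labeling, so `agreement` ends up high; with a single intractable joint model the same coupling would require approximate inference.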
Citations
Proceedings Article
Learning Bilingual Lexicons from Monolingual Corpora
TL;DR: It is shown that high-precision lexicons can be learned in a variety of language pairs and from a range of corpus types.
Proceedings Article
Online EM for Unsupervised Models
Percy Liang, Dan Klein
TL;DR: It is shown that online variants provide significant speedups and can even find better solutions than those found by batch EM on four unsupervised tasks: part-of-speech tagging, document classification, word segmentation, and word alignment.
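The online variants summarized above can be sketched as a stepwise update: instead of recomputing sufficient statistics over the whole batch, they are interpolated with per-example statistics under a decaying stepsize. The interpolation is the core idea; the toy single-view model and the particular stepsize schedule below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K, V, N = 2, 4, 500
z_true = rng.integers(0, K, size=N)
x = (z_true * 2 + rng.integers(0, 2, size=N)) % V

theta = rng.dirichlet(np.ones(V), size=K)    # p(x | z), uniform p(z)
stats = np.ones((K, V))                      # smoothed running sufficient statistics

for t, xi in enumerate(x):
    # E-step for a single example: posterior over the hidden state.
    post = theta[:, xi] / theta[:, xi].sum()
    # Decaying stepsize schedule, eta_t = (t + 2)^(-alpha) with alpha = 0.7.
    eta = (t + 2) ** -0.7
    # Per-example sufficient statistics, scaled up to batch size.
    si = np.zeros((K, V))
    si[:, xi] = post
    # Stepwise interpolation of the running statistics (the online update).
    stats = (1 - eta) * stats + eta * N * si
    # M-step: re-normalize the running statistics into parameters.
    theta = stats / stats.sum(axis=1, keepdims=True)
```

Each example triggers an immediate parameter refresh, which is where the speedups over batch EM reported in the summary come from.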
Proceedings Article
An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators
Percy Liang, Michael I. Jordan
TL;DR: This paper presents a unified framework for studying parameter estimators that allows their relative (statistical) efficiencies to be compared; the analysis suggests that modeling more of the data tends to reduce variance, but at the cost of greater sensitivity to model misspecification.
Proceedings Article
Multi-Head Attention with Disagreement Regularization
TL;DR: The authors introduce a disagreement regularization to explicitly encourage diversity among multiple attention heads, which is shown to be effective on the WMT14 English–German and WMT17 Chinese–English translation tasks.
Proceedings Article
Self-Training for Jointly Learning to Ask and Answer Questions
Mrinmaya Sachan, Eric P. Xing
TL;DR: This work proposes a self-training method for jointly learning to ask as well as answer questions, leveraging unlabeled text along with labeled question–answer pairs, and demonstrates significant improvements over a number of established baselines.
References
Journal Article
A fast learning algorithm for deep belief nets
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Book
Numerical Solution of Stochastic Differential Equations
Peter E. Kloeden, Eckhard Platen
TL;DR: This book develops time-discrete approximations of stochastic differential equations within the stochastic calculus, based on stochastic Taylor expansions and strong Taylor approximation schemes.
Journal Article
The mathematics of statistical machine translation: parameter estimation
TL;DR: The authors describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another.
Book
Graphical Models, Exponential Families, and Variational Inference
TL;DR: The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.