Open Access Proceedings Article

Agreement-Based Learning

TLDR
An objective function is proposed for the approach, EM-style algorithms for parameter estimation are derived, and their effectiveness on three challenging real-world learning tasks is demonstrated.
Abstract
The learning of probabilistic models with many hidden variables and non-decomposable dependencies is an important and challenging problem. In contrast to traditional approaches based on approximate inference in a single intractable model, our approach is to train a set of tractable submodels by encouraging them to agree on the hidden variables. This allows us to capture non-decomposable aspects of the data while still maintaining tractability. We propose an objective function for our approach, derive EM-style algorithms for parameter estimation, and demonstrate their effectiveness on three challenging real-world learning tasks.
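The objective function itself is not reproduced on this page. As a rough sketch of the kind of objective the abstract describes, with notation assumed for illustration (K tractable submodels p_k over observed data x and shared hidden variables z):

```latex
% Sketch only; the notation (p_k, z, \mathcal{D}) is assumed, not quoted from the paper.
% Each submodel is tractable on its own; the sum over z rewards hidden-variable
% configurations on which all submodels jointly place probability mass, i.e. agreement.
\max_{\theta_1,\dots,\theta_K} \;\; \sum_{x \in \mathcal{D}} \log \sum_{z} \prod_{k=1}^{K} p_k(x, z;\, \theta_k)
```

An EM-style procedure for such an objective alternates an E-step that combines the submodels' beliefs about z with separate, tractable M-steps for each submodel.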



Citations
Proceedings Article

Learning Bilingual Lexicons from Monolingual Corpora

TL;DR: It is shown that high-precision lexicons can be learned in a variety of language pairs and from a range of corpus types.
Proceedings Article (DOI)

Online EM for Unsupervised Models

TL;DR: It is shown that online variants provide significant speedups and can even find better solutions than those found by batch EM on four unsupervised tasks: part-of-speech tagging, document classification, word segmentation, and word alignment.
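As a concrete illustration of the online idea, here is a minimal stepwise-EM sketch for a toy two-component Gaussian mixture; the interpolation of sufficient statistics is the generic stepwise formulation, and the toy data, initialization, and step-size schedule are assumptions rather than the exact recipe evaluated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 1-D mixture of two unit-variance Gaussians with means -2 and +2.
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(2, 1, 500)])
rng.shuffle(x)

# Stepwise (online) EM for the two means, with mixing weights and variances held fixed.
s_n = np.array([1e-3, 1e-3])   # running average of responsibilities per component
s_x = np.array([0.0, 0.0])     # running average of responsibility-weighted data
mu = np.array([-0.5, 0.5])     # initial means

for t, xt in enumerate(x, start=1):
    # E-step for one point: posterior responsibility of each component.
    logp = -0.5 * (xt - mu) ** 2
    r = np.exp(logp - logp.max())
    r /= r.sum()
    # Stepwise update: interpolate old and new sufficient statistics.
    eta = (t + 2) ** -0.7          # step size; exponent in (0.5, 1] is a common choice
    s_n = (1 - eta) * s_n + eta * r
    s_x = (1 - eta) * s_x + eta * r * xt
    # M-step from the current statistics.
    mu = s_x / s_n

print("estimated means:", np.sort(mu))  # should approach [-2, 2] on this toy data
```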
Proceedings Article (DOI)

An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators

TL;DR: This paper presents a unified framework for studying parameter estimators that allows their relative (statistical) efficiencies to be compared, and suggests that modeling more of the data tends to reduce variance, but at the cost of greater sensitivity to model misspecification.
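For reference, the three estimators being compared can be written in their standard forms (notation assumed here rather than quoted from the paper):

```latex
% Standard forms of the generative, discriminative, and pseudolikelihood estimators:
\hat{\theta}_{\mathrm{gen}} = \arg\max_{\theta} \sum_{(x,y)} \log p_{\theta}(x, y), \qquad
\hat{\theta}_{\mathrm{dis}} = \arg\max_{\theta} \sum_{(x,y)} \log p_{\theta}(y \mid x), \qquad
\hat{\theta}_{\mathrm{pl}}  = \arg\max_{\theta} \sum_{(x,y)} \sum_{i} \log p_{\theta}(y_i \mid y_{-i}, x).
% The generative estimator "models more of the data" (the full joint), which tends to lower
% asymptotic variance but ties the estimate to the joint model being well specified.
```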
Proceedings Article (DOI)

Multi-Head Attention with Disagreement Regularization

TL;DR: The authors introduce a disagreement regularization that explicitly encourages diversity among multiple attention heads, and show it to be effective on WMT14 English-German and WMT17 Chinese-English translation tasks.
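One plausible form of such a regularizer, computed on the heads' output vectors, is sketched below; the paper considers several variants, so the exact formula and the toy numbers here are illustrative assumptions.

```python
import numpy as np

def output_disagreement(head_outputs: np.ndarray) -> float:
    """Disagreement score for one position: negative mean pairwise cosine
    similarity between the H head-output vectors (shape: H x d).
    Larger values mean the heads carry more diverse signals."""
    h = head_outputs / (np.linalg.norm(head_outputs, axis=1, keepdims=True) + 1e-9)
    cos = h @ h.T                         # H x H matrix of cosine similarities
    n_heads = cos.shape[0]
    return -float(cos.sum() / (n_heads * n_heads))

# Usage sketch: reward diversity by subtracting lambda * disagreement from the
# translation loss being minimized (equivalently, adding it to the objective).
rng = np.random.default_rng(0)
heads = rng.standard_normal((8, 64))      # toy: 8 heads with 64-dim outputs
translation_loss = 0.0                    # stand-in for the real model loss
lam = 1.0
total_loss = translation_loss - lam * output_disagreement(heads)
```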
Proceedings Article (DOI)

Self-Training for Jointly Learning to Ask and Answer Questions

TL;DR: This work proposes a self-training method for jointly learning to ask and answer questions, leveraging unlabeled text along with labeled question-answer pairs, and demonstrates significant improvements over a number of established baselines.
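At a high level, such a procedure follows the usual self-training template. The sketch below is generic, and the three callables it expects (train_qa, generate_qa_pairs, confidence) are hypothetical placeholders supplied by the caller, not the paper's actual interface.

```python
def self_train(train_qa, generate_qa_pairs, confidence,
               labeled_pairs, unlabeled_texts, rounds=3, threshold=0.9):
    """Generic self-training loop. The caller supplies:
    train_qa(pairs) -> model, generate_qa_pairs(model, text) -> list of QA pairs,
    confidence(model, pair) -> float. All three are placeholders for illustration."""
    model = train_qa(labeled_pairs)            # fit asker + answerer on labeled data
    data = list(labeled_pairs)
    for _ in range(rounds):
        # Ask: generate candidate question-answer pairs from unlabeled text.
        candidates = [qa for text in unlabeled_texts
                      for qa in generate_qa_pairs(model, text)]
        # Keep only pairs the current model is confident about (self-labeling).
        data += [qa for qa in candidates if confidence(model, qa) >= threshold]
        model = train_qa(data)                 # retrain on the augmented set
    return model
```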
References
Journal Article (DOI)

A fast learning algorithm for deep belief nets

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
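A compact sketch of the greedy layer-wise idea, using one-step contrastive divergence (CD-1) to train each restricted Boltzmann machine in turn; the toy data, layer sizes, and hyperparameters are assumptions, and the up-down fine-tuning pass described in the paper is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, epochs=10, lr=0.05):
    """Train one binary RBM with 1-step contrastive divergence (CD-1).
    data: (N, n_visible) array with entries in [0, 1]."""
    n_vis = data.shape[1]
    W = 0.01 * rng.standard_normal((n_vis, n_hidden))
    b_v, b_h = np.zeros(n_vis), np.zeros(n_hidden)
    for _ in range(epochs):
        v0 = data
        ph0 = sigmoid(v0 @ W + b_h)                        # P(h = 1 | v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)   # sampled hidden states
        pv1 = sigmoid(h0 @ W.T + b_v)                      # reconstruction of v
        ph1 = sigmoid(pv1 @ W + b_h)
        n = len(data)
        W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n           # CD-1 weight update
        b_v += lr * (v0 - pv1).mean(axis=0)
        b_h += lr * (ph0 - ph1).mean(axis=0)
    return W, b_v, b_h

def greedy_pretrain(data, layer_sizes):
    """Greedy layer-wise stack: train an RBM, then feed its hidden
    activations to the next RBM, one layer at a time."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b_v, b_h = train_rbm(x, n_hidden)
        layers.append((W, b_v, b_h))
        x = sigmoid(x @ W + b_h)     # activations become the next layer's "data"
    return layers

# Toy usage: 200 random binary vectors stacked into a 64-32-16 network.
toy = (rng.random((200, 64)) < 0.3).astype(float)
stack = greedy_pretrain(toy, [32, 16])
```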
Book

Numerical Solution of Stochastic Differential Equations

TL;DR: This book develops time-discrete approximations of stochastic differential equations within the framework of stochastic calculus, based on stochastic Taylor expansions and strong Taylor approximation schemes.
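The flavor of such schemes is easiest to see in the simplest one, the Euler-Maruyama method (the strong Taylor scheme of order 0.5); the process and parameter values below are only a toy illustration.

```python
import numpy as np

# Euler-Maruyama: the simplest strong time-discrete scheme (strong order 0.5) for an SDE
#   dX_t = a(X_t) dt + b(X_t) dW_t.
# Illustrated on an Ornstein-Uhlenbeck process dX = -theta*X dt + sigma dW;
# the process choice and parameter values are toy assumptions.
rng = np.random.default_rng(0)
theta, sigma = 1.0, 0.3
T, n_steps = 5.0, 1000
dt = T / n_steps

x = np.empty(n_steps + 1)
x[0] = 1.0
for k in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))              # Brownian increment ~ N(0, dt)
    x[k + 1] = x[k] + (-theta * x[k]) * dt + sigma * dW

print(x[-1])  # end point of one sample path; higher-order strong Taylor schemes
              # (e.g. Milstein) keep further terms of the stochastic Taylor expansion
```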
Journal Article

The mathematics of statistical machine translation: parameter estimation

TL;DR: The authors describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another.
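A minimal EM sketch in the spirit of the simplest of those models, a purely lexical word-translation model; the toy corpus and the uniform-ish initialization are assumptions made purely for illustration.

```python
from collections import defaultdict

# Toy parallel corpus (English -> French), chosen only to make the sketch runnable.
corpus = [("the house".split(), "la maison".split()),
          ("the book".split(),  "le livre".split()),
          ("a book".split(),    "un livre".split())]

# t[f][e] approximates P(f | e): the probability that source word e produces target word f.
t = defaultdict(lambda: defaultdict(lambda: 0.25))

for _ in range(20):
    count = defaultdict(lambda: defaultdict(float))
    total = defaultdict(float)
    for e_sent, f_sent in corpus:
        for f in f_sent:
            z = sum(t[f][e] for e in e_sent)       # normalize over possible alignments
            for e in e_sent:
                c = t[f][e] / z                    # expected alignment count (E-step)
                count[f][e] += c
                total[e] += c
    for f in count:                                # re-estimate P(f | e) (M-step)
        for e in count[f]:
            t[f][e] = count[f][e] / total[e]

print(round(t["maison"]["house"], 3))  # rises toward 1.0 as EM iterates on this toy data
```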
Book

Graphical Models, Exponential Families, and Variational Inference

TL;DR: The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
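The central bound behind that variational view can be stated compactly, in standard exponential-family notation (assumed here):

```latex
% For sufficient statistics \phi and any distribution q over x:
\log Z(\theta) \;=\; \log \sum_{x} \exp\{\langle \theta, \phi(x) \rangle\}
\;\ge\; \langle \theta, \mathbb{E}_{q}[\phi(x)] \rangle + H(q),
% with equality when q equals the model distribution p_{\theta}. Mean-field and related
% methods restrict q to a tractable family, turning inference into deterministic optimization.
```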