Open Access · Proceedings Article
Agreement-Based Learning
Percy Liang, Dan Klein, Michael I. Jordan
Vol. 20, pp. 913–920
TL;DR: An objective function is proposed for the approach, EM-style algorithms for parameter estimation are derived, and their effectiveness is demonstrated on three challenging real-world learning tasks.
Abstract: The learning of probabilistic models with many hidden variables and non-decomposable dependencies is an important and challenging problem. In contrast to traditional approaches based on approximate inference in a single intractable model, our approach is to train a set of tractable submodels by encouraging them to agree on the hidden variables. This allows us to capture non-decomposable aspects of the data while still maintaining tractability. We propose an objective function for our approach, derive EM-style algorithms for parameter estimation, and demonstrate their effectiveness on three challenging real-world learning tasks.
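The abstract's core idea, training tractable submodels so that they agree on the hidden variables, can be illustrated with a small sketch: two class-conditional tables are fit with an EM-style loop whose E-step takes the product of the submodels' posteriors over the hidden label, which is the agreement term. The toy two-view data, the naive table submodels, and the update rules below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 2    # number of hidden states
V = 4    # observation vocabulary size for each view
N = 200  # number of examples

# Toy data: two "views" of each example, correlated through a latent z.
z_true = rng.integers(0, K, size=N)
x1 = (z_true * 2 + rng.integers(0, 2, size=N)) % V
x2 = (z_true * 2 + rng.integers(0, 2, size=N)) % V

# Each submodel is a class-conditional table p_m(x | z) with uniform p(z).
theta1 = rng.dirichlet(np.ones(V), size=K)   # shape (K, V)
theta2 = rng.dirichlet(np.ones(V), size=K)

for _ in range(50):
    # E-step: posterior over z proportional to the PRODUCT of the two
    # submodels' likelihoods -- this product is the agreement term.
    post = theta1[:, x1] * theta2[:, x2]      # shape (K, N)
    post /= post.sum(axis=0, keepdims=True)

    # M-step: re-estimate each submodel from the shared posterior.
    for theta, x in [(theta1, x1), (theta2, x2)]:
        counts = np.zeros((K, V))
        np.add.at(counts.T, x, post.T)        # counts[k, v] += post[k, n] for x[n] == v
        theta[:] = (counts + 1e-6) / (counts + 1e-6).sum(axis=1, keepdims=True)

# After training, the two submodels should largely agree on the hidden label.
pred1 = theta1[:, x1].argmax(axis=0)
pred2 = theta2[:, x2].argmax(axis=0)
agreement = (pred1 == pred2).mean()
```

On this cleanly separable toy data the product E-step drives both tables toward a consistent labeling, so `agreement` ends up high; with a single intractable joint model the same coupling would require approximate inference.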
Citations
Proceedings Article
Learning Bilingual Lexicons from Monolingual Corpora
TL;DR: It is shown that high-precision lexicons can be learned in a variety of language pairs and from a range of corpus types.
Proceedings Article
Online EM for Unsupervised Models
Percy Liang, Dan Klein
TL;DR: It is shown that online variants provide significant speedups and can even find better solutions than those found by batch EM on four unsupervised tasks: part-of-speech tagging, document classification, word segmentation, and word alignment.
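The online variants summarized above can be sketched as a stepwise update: instead of recomputing sufficient statistics over the whole batch, they are interpolated with per-example statistics under a decaying stepsize. The interpolation is the core idea; the toy single-view model and the particular stepsize schedule below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K, V, N = 2, 4, 500
z_true = rng.integers(0, K, size=N)
x = (z_true * 2 + rng.integers(0, 2, size=N)) % V

theta = rng.dirichlet(np.ones(V), size=K)    # p(x | z), uniform p(z)
stats = np.ones((K, V))                      # smoothed running sufficient statistics

for t, xi in enumerate(x):
    # E-step for a single example: posterior over the hidden state.
    post = theta[:, xi] / theta[:, xi].sum()
    # Decaying stepsize schedule, eta_t = (t + 2)^(-alpha) with alpha = 0.7.
    eta = (t + 2) ** -0.7
    # Per-example sufficient statistics, scaled up to batch size.
    si = np.zeros((K, V))
    si[:, xi] = post
    # Stepwise interpolation of the running statistics (the online update).
    stats = (1 - eta) * stats + eta * N * si
    # M-step: re-normalize the running statistics into parameters.
    theta = stats / stats.sum(axis=1, keepdims=True)
```

Each example triggers an immediate parameter refresh, which is where the speedups over batch EM reported in the summary come from.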
Proceedings Article
An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators
Percy Liang, Michael I. Jordan
TL;DR: This paper presents a unified framework for studying parameter estimators that allows their relative (statistical) efficiencies to be compared; the analysis suggests that modeling more of the data tends to reduce variance, but at the cost of greater sensitivity to model misspecification.
Proceedings Article
Multi-Head Attention with Disagreement Regularization
TL;DR: The authors introduce a disagreement regularization to explicitly encourage diversity among multiple attention heads, which is shown to be effective on the WMT14 English–German and WMT17 Chinese–English translation tasks.
Proceedings Article
Self-Training for Jointly Learning to Ask and Answer Questions
Mrinmaya Sachan, Eric P. Xing
TL;DR: This work proposes a self-training method for jointly learning to ask as well as answer questions, leveraging unlabeled text along with labeled question–answer pairs, and demonstrates significant improvements over a number of established baselines.
References
Journal Article
A fast learning algorithm for deep belief nets
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Book
Numerical Solution of Stochastic Differential Equations
Peter E. Kloeden, Eckhard Platen
TL;DR: This book develops time-discrete approximations of stochastic differential equations within the stochastic calculus, based on stochastic Taylor expansions and strong Taylor approximation schemes.
Journal Article
The mathematics of statistical machine translation: parameter estimation
TL;DR: The authors describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another.
Book
Graphical Models, Exponential Families, and Variational Inference
TL;DR: The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.