Statistical phrase-based translation
Philipp Koehn,Franz Josef Och,Daniel Marcu +2 more
- pp 48-54
TLDR
The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translation.Abstract:
We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several, previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models out-perform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models does not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.read more
Citations
More filters
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings ArticleDOI
Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation
Kyunghyun Cho,Bart van Merriënboer,Caglar Gulcehre,Dzmitry Bahdanau,Fethi Bougares,Holger Schwenk,Yoshua Bengio,Yoshua Bengio,Yoshua Bengio +8 more
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Posted Content
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings ArticleDOI
Effective Approaches to Attention-based Neural Machine Translation
TL;DR: A global approach which always attends to all source words and a local one that only looks at a subset of source words at a time are examined, demonstrating the effectiveness of both approaches on the WMT translation tasks between English and German in both directions.
Proceedings ArticleDOI
Moses: Open Source Toolkit for Statistical Machine Translation
Philipp Koehn,Hieu Hoang,Alexandra Birch,Chris Callison-Burch,Marcello Federico,Nicola Bertoldi,Brooke Cowan,Wade Shen,C. Corbett Moran,Richard Zens,Chris Dyer,Ondrej Bojar,Alexandra Elena Constantin,Evan Herbst +13 more
TL;DR: An open-source toolkit for statistical machine translation whose novel contributions are support for linguistically motivated factors, confusion network decoding, and efficient data formats for translation models and language models.
References
More filters
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Journal Article
The mathematics of statistical machine translation: parameter estimation
TL;DR: The authors describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another.
Book
Statistical methods for speech recognition
TL;DR: The speech recognition problem hidden Markov models the acoustic model basic language modelling the Viterbi search hypothesis search on a tree and the fast match elements of information theory.
Proceedings ArticleDOI
Improved statistical alignment models
Franz Josef Och,Hermann Ney +1 more
TL;DR: It is shown that models with a first-order dependence and a fertility model lead to significantly better results than the simple models IBM-1 or IBM-2, which are not able to go beyond zero-order dependencies.
Journal Article
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora
TL;DR: A novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and the concept of bilingual parsing with a variety of parallel corpus analysis applications are introduced.