Findings of the 2015 Workshop on Statistical Machine Translation
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, Marco Turchi
pp. 1-46
TL;DR
The WMT15 shared task included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task.
Abstract
This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included, and all were then evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams submitting 7 entries.
Citations
Posted Content
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TL;DR: This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.
Proceedings ArticleDOI
Improving Neural Machine Translation Models with Monolingual Data
TL;DR: The authors exploit target-side monolingual data for NMT by pairing it with automatic back-translations, treating the result as additional parallel training data, and obtain state-of-the-art performance on several NMT tasks without changing the network architecture.
Proceedings ArticleDOI
Findings of the 2016 Conference on Machine Translation
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri
TL;DR: The results of the WMT16 shared tasks are presented, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), and an automatic post-editing task and bilingual document alignment task.
Proceedings ArticleDOI
Transfer Learning for Low-Resource Neural Machine Translation
TL;DR: A transfer learning method is presented that significantly improves BLEU scores across a range of low-resource languages by first training a high-resource language pair, then transferring some of the learned parameters to the low-resource pair to initialize and constrain training.
Proceedings ArticleDOI
Unsupervised Pretraining for Sequence to Sequence Learning
TL;DR: This article proposed a general unsupervised learning method to improve the accuracy of sequence-to-sequence (seq2seq) models and achieved state-of-the-art results on the WMT English→German task.
References
Journal ArticleDOI
The measurement of observer agreement for categorical data
J. R. Landis, Gary G. Koch
TL;DR: A general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies is presented; tests for interobserver bias are given in terms of first-order marginal homogeneity, and measures of interobserver agreement are developed as generalized kappa-type statistics.
Journal ArticleDOI
A Coefficient of agreement for nominal Scales
TL;DR: In this article, the authors present a procedure for having two or more judges independently categorize a sample of units and for determining the degree and significance of their agreement; the resulting coefficient quantifies the extent to which these judgments are reproducible, i.e., reliable.
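The agreement coefficient this entry describes is Cohen's kappa: observed agreement corrected for the agreement expected by chance. A minimal sketch follows; the function name `cohens_kappa` and the toy label lists are illustrative, not from the paper:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same units."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of units both annotators labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance overlap given each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["yes", "yes", "no", "yes", "no", "no"]
b = ["yes", "no", "no", "yes", "no", "yes"]
print(round(cohens_kappa(a, b), 3))  # → 0.333
```

With 4/6 observed agreement and 0.5 expected by chance, kappa is (2/3 - 1/2) / (1 - 1/2) = 1/3, illustrating how the coefficient discounts agreement that would occur anyway.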
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
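The method described above combines clipped (modified) n-gram precision with a brevity penalty. A minimal sentence-level sketch, under the assumptions of a single reference and no smoothing (the paper's formulation is corpus-level and supports multiple references):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU sketch: clipped n-gram precisions,
    geometric mean, brevity penalty. Single reference only."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision zeroes the score
        precisions.append(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(candidate) > len(reference) else math.exp(
        1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

Clipping is what makes the precision "modified": a candidate that repeats a reference word many times gets credit only up to the reference count.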
Proceedings ArticleDOI
Moses: Open Source Toolkit for Statistical Machine Translation
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, C. Corbett Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Elena Constantin, Evan Herbst
TL;DR: An open-source toolkit for statistical machine translation whose novel contributions are support for linguistically motivated factors, confusion network decoding, and efficient data formats for translation models and language models.
Proceedings Article
METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments
Satanjeev Banerjee, Alon Lavie
TL;DR: METEOR is described, an automatic metric for machine translation evaluation that is based on a generalized concept of unigram matching between the machine-produced translation and human-produced reference translations and can be easily extended to include more advanced matching strategies.