Findings of the 2015 Workshop on Statistical Machine Translation
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, Marco Turchi
pp. 1-46
TL;DR
The WMT15 shared task included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task.
Abstract
This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included, and all were then evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams submitting 7 entries.
Citations
Posted Content
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TL;DR: This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.
Proceedings ArticleDOI
Improving Neural Machine Translation Models with Monolingual Data
TL;DR: The authors exploit target-side monolingual data for NMT by pairing it with automatic back-translations, treating the result as additional parallel training data, and obtain state-of-the-art performance on several NMT tasks without changing the network architecture.
Proceedings ArticleDOI
Findings of the 2016 Conference on Machine Translation
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri
TL;DR: The results of the WMT16 shared tasks are presented, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), and an automatic post-editing task and bilingual document alignment task.
Proceedings ArticleDOI
Transfer Learning for Low-Resource Neural Machine Translation
TL;DR: A transfer learning method is presented that significantly improves BLEU scores across a range of low-resource languages by first training a high-resource language pair, then transferring some of the learned parameters to the low-resource pair to initialize and constrain training.
Proceedings ArticleDOI
Unsupervised Pretraining for Sequence to Sequence Learning
TL;DR: This article proposed a general unsupervised learning method to improve the accuracy of sequence-to-sequence (seq2seq) models and achieved state-of-the-art results on the WMT English→German task.
References
Journal ArticleDOI
The measurement of observer agreement for categorical data
J. R. Landis, Gary G. Koch
TL;DR: A general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies is presented; tests for interobserver bias are given in terms of first-order marginal homogeneity, and measures of interobserver agreement are developed as generalized kappa-type statistics.
Journal ArticleDOI
A Coefficient of agreement for nominal Scales
TL;DR: In this article, the authors present a procedure for having two or more judges independently categorize a sample of units and for determining the degree and significance of their agreement; the resulting coefficient quantifies the extent to which these judgments are reproducible, i.e., reliable.
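The agreement coefficient this entry describes is Cohen's kappa: observed agreement corrected for the agreement expected by chance. A minimal sketch follows; the function name `cohens_kappa` and the toy label lists are illustrative, not from the paper:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same units."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of units both annotators labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance overlap given each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["yes", "yes", "no", "yes", "no", "no"]
b = ["yes", "no", "no", "yes", "no", "yes"]
print(round(cohens_kappa(a, b), 3))  # → 0.333
```

With 4/6 observed agreement and 0.5 expected by chance, kappa is (2/3 - 1/2) / (1 - 1/2) = 1/3, illustrating how the coefficient discounts agreement that would occur anyway.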
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
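The method described above combines clipped (modified) n-gram precision with a brevity penalty. A minimal sentence-level sketch, under the assumptions of a single reference and no smoothing (the paper's formulation is corpus-level and supports multiple references):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU sketch: clipped n-gram precisions,
    geometric mean, brevity penalty. Single reference only."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision zeroes the score
        precisions.append(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(candidate) > len(reference) else math.exp(
        1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

Clipping is what makes the precision "modified": a candidate that repeats a reference word many times gets credit only up to the reference count.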
Proceedings ArticleDOI
Moses: Open Source Toolkit for Statistical Machine Translation
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, C. Corbett Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Elena Constantin, Evan Herbst
TL;DR: An open-source toolkit for statistical machine translation whose novel contributions are support for linguistically motivated factors, confusion network decoding, and efficient data formats for translation models and language models.
Proceedings Article
METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments
Satanjeev Banerjee, Alon Lavie
TL;DR: METEOR is described, an automatic metric for machine translation evaluation that is based on a generalized concept of unigram matching between the machine-produced translation and human-produced reference translations and can be easily extended to include more advanced matching strategies.