Open Access · Proceedings Article
PET: a Tool for Post-editing and Assessing Machine Translation
Wilker Aziz,Sheila Castilho,Lucia Specia +2 more
pp. 3982–3987
TLDR
This work describes a standalone tool with two main purposes: to facilitate the post-editing of translations from any MT system so that they reach publishable quality, and to collect sentence-level information from the post-editing process, e.g. post-editing time and detailed keystroke statistics.
Abstract
Given the significant improvements in Machine Translation (MT) quality and the increasing demand for translations, post-editing of automatic translations is becoming a popular practice in the translation industry. It has been shown to allow much larger volumes of translations to be produced, saving time and costs. In addition, the post-editing of automatic translations can help identify problems in such translations, which can serve as feedback for researchers and developers to improve MT systems. Finally, post-editing can be used as a way of evaluating the quality of translations in terms of how much post-editing effort they require. We describe a standalone tool that has two main purposes: to facilitate the post-editing of translations from any MT system so that they reach publishable quality, and to collect sentence-level information from the post-editing process, e.g. post-editing time and detailed keystroke statistics.
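PET logs post-editing time and keystrokes per sentence; a standard downstream measure of post-editing effort (HTER, in the spirit of the Translation Edit Rate work cited in the References below) is the word-level edit distance between the raw MT output and its post-edited version, normalized by the length of the edited sentence. A minimal sketch of that computation (function names are illustrative, not part of PET's own API):

```python
def word_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance over word tokens (insert/delete/substitute, cost 1)."""
    ta, tb = a.split(), b.split()
    prev = list(range(len(tb) + 1))
    for i, wa in enumerate(ta, 1):
        cur = [i]
        for j, wb in enumerate(tb, 1):
            cur.append(min(prev[j] + 1,                  # delete wa
                           cur[j - 1] + 1,               # insert wb
                           prev[j - 1] + (wa != wb)))    # substitute (free if equal)
        prev = cur
    return prev[-1]

def hter(mt_output: str, post_edited: str) -> float:
    """Edits needed to turn MT output into its post-edited version,
    normalized by the post-edited length: a rough proxy for effort."""
    return word_edit_distance(mt_output, post_edited) / max(len(post_edited.split()), 1)
```

For example, `hter("a cat sat on mat", "the cat sat on the mat")` is 2/6: one substitution plus one insertion over six edited words. A sentence needing no edits scores 0.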
Citations
Proceedings ArticleDOI
Findings of the 2016 Conference on Machine Translation
Ondřej Bojar,Rajen Chatterjee,Christian Federmann,Yvette Graham,Barry Haddow,Matthias Huck,Antonio Jimeno Yepes,Philipp Koehn,Varvara Logacheva,Christof Monz,Matteo Negri,Aurélie Névéol,Mariana Neves,Martin Popel,Matt Post,Raphael Rubino,Carolina Scarton,Lucia Specia,Marco Turchi,Karin Verspoor,Marcos Zampieri +20 more
TL;DR: The results of the WMT16 shared tasks are presented, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), and an automatic post-editing task and bilingual document alignment task.
Posted Content
Evaluation of Text Generation: A Survey
TL;DR: This paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models.
Proceedings ArticleDOI
Accurate Evaluation of Segment-level Machine Translation Metrics
TL;DR: Three segment-level metrics — METEOR, NLEPOR and SENTBLEU-MOSES — are found to correlate with human assessment at a level not significantly outperformed by any other metric, both in the individual language pair assessment for Spanish-to-English and in the aggregated set of 9 language pairs.
Proceedings ArticleDOI
compare-mt: A Tool for Holistic Comparison of Language Generation Systems
TL;DR: The compare-mt tool provides a high-level and coherent view of the salient differences between language generation systems, which can then be used to guide further analysis or system improvement for machine translation.
Proceedings ArticleDOI
Findings of the WMT 2018 Shared Task on Quality Estimation
TL;DR: The WMT18 shared task on Quality Estimation is reported, i.e. the task of predicting the quality of the output of machine translation systems at various granularity levels: word, phrase, sentence and document.
References
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Proceedings ArticleDOI
Moses: Open Source Toolkit for Statistical Machine Translation
Philipp Koehn,Hieu Hoang,Alexandra Birch,Chris Callison-Burch,Marcello Federico,Nicola Bertoldi,Brooke Cowan,Wade Shen,C. Corbett Moran,Richard Zens,Chris Dyer,Ondrej Bojar,Alexandra Elena Constantin,Evan Herbst +13 more
TL;DR: An open-source toolkit for statistical machine translation whose novel contributions are support for linguistically motivated factors, confusion network decoding, and efficient data formats for translation models and language models.
Proceedings Article
A Study of Translation Edit Rate with Targeted Human Annotation
TL;DR: A new, intuitive measure for evaluating machine translation output that avoids the knowledge intensiveness of more meaning-based approaches, and the labor-intensiveness of human judgments is defined.
Proceedings Article
Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation
Chris Callison-Burch,Philipp Koehn,Christof Monz,Kay Peterson,Mark A. Przybocki,Omar F. Zaidan +5 more
TL;DR: A large-scale manual evaluation of 104 machine translation systems and 41 system combination entries was conducted, which used the ranking of these systems to measure how strongly automatic metrics correlate with human judgments of translation quality for 26 metrics.
Exploiting Objective Annotations for Minimising Translation Post-editing Effort
TL;DR: It is shown that estimations based on post-editing time, a simple and objective annotation, can reliably indicate translation post-editing effort in a practical, task-based scenario.