A novel string-to-string distance measure with applications to machine translation evaluation

TLDR
The authors introduce a string-to-string distance measure that extends the edit distance with block transpositions as a constant-cost edit operation, and demonstrate how this distance measure can be used as an evaluation criterion in machine translation.
Abstract
We introduce a string-to-string distance measure which extends the edit distance by block transpositions as a constant-cost edit operation. An algorithm for calculating this distance measure in polynomial time is presented. We then demonstrate how this distance measure can be used as an evaluation criterion in machine translation. The correlation between this evaluation criterion and human judgment is systematically compared with that of other automatic evaluation measures on two translation tasks. In general, like other automatic evaluation measures, the criterion shows low correlation at the sentence level but good correlation at the system level.
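For context, the measure builds on the plain word-level edit distance (Levenshtein distance). Below is a minimal sketch of that baseline only; the block-transposition extension and the paper's polynomial-time algorithm are not reproduced here, and the example sentences are illustrative.

```python
# Baseline word-level edit distance (Levenshtein); the paper's measure extends
# this with block transpositions as a constant-cost edit operation.
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    m, n = len(hyp), len(ref)
    # dist[i][j] = cost of transforming hyp[:i] into ref[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i                      # deletions only
    for j in range(1, n + 1):
        dist[0][j] = j                      # insertions only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if hyp[i - 1] == ref[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # delete hyp word
                             dist[i][j - 1] + 1,        # insert ref word
                             dist[i - 1][j - 1] + sub)  # match / substitution
    return dist[m][n]


if __name__ == "__main__":
    hyp = "the cat sat on the mat".split()
    ref = "on the mat the cat sat".split()
    # Plain edit distance charges 6 word-level edits for this block swap;
    # a constant-cost block transposition would treat it as a single operation.
    print(edit_distance(hyp, ref))
```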


Citations
Proceedings ArticleDOI

Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics

TL;DR: Two new objective automatic evaluation methods for machine translation are described: one based on the longest common subsequence between a candidate translation and a set of reference translations, and one that relaxes strict n-gram matching to skip-bigram matching.
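A hedged sketch of the two statistics named above: LCS length and skip-bigram overlap between a candidate and a reference. The function names and the F1 weighting are illustrative choices, not the exact formulation of the cited paper.

```python
from collections import Counter
from itertools import combinations

def lcs_length(cand, ref):
    """Length of the longest common subsequence of two token lists."""
    m, n = len(cand), len(ref)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if cand[i - 1] == ref[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]

def skip_bigrams(tokens):
    """All ordered word pairs in sentence order, arbitrary gaps allowed."""
    return Counter(combinations(tokens, 2))

def skip_bigram_f1(cand, ref):
    """Harmonic mean of skip-bigram precision and recall (illustrative weighting)."""
    cand_sb, ref_sb = skip_bigrams(cand), skip_bigrams(ref)
    overlap = sum((cand_sb & ref_sb).values())      # clipped pair matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_sb.values())
    recall = overlap / sum(ref_sb.values())
    return 2 * precision * recall / (precision + recall)
```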
Proceedings ArticleDOI

ORANGE: a method for evaluating automatic evaluation metrics for machine translation

TL;DR: A new method, ORANGE, is introduced for evaluating automatic machine translation evaluation metrics automatically, with no human involvement beyond a set of reference translations.
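The core idea, as summarised, is to judge a metric by how highly it ranks the human references when they are pooled with machine translations of the same source sentence. The sketch below illustrates that ranking scheme under simplifying assumptions; the `metric` callable and the averaging are placeholders, not the paper's exact procedure.

```python
def average_reference_rank(segments, metric):
    """
    segments: iterable of (machine_outputs, references) pairs, one per source
              sentence, each element being a candidate translation (token list).
    metric:   callable(candidate) -> score, higher = better (placeholder interface).
    Returns the mean rank of the human references among all pooled candidates;
    a lower value suggests the metric prefers reference-like translations.
    """
    ranks = []
    for machine_outputs, references in segments:
        pool = machine_outputs + references
        ordered = sorted(pool, key=metric, reverse=True)   # best-scoring first
        for reference in references:
            ranks.append(ordered.index(reference) + 1)     # 1-based rank
    return sum(ranks) / len(ranks)
```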
Proceedings ArticleDOI

Results of the WMT19 metrics shared task: segment-level and strong MT systems pose big challenges

TL;DR: Metrics were asked to score the outputs of the translation systems competing in the WMT19 News Translation Task. The metrics were evaluated at the system level (how well a given metric correlates with the WMT19 official manual ranking) and at the segment level (how well the metric correlates with human judgements of segment quality).
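System-level meta-evaluation of this kind typically reduces to correlating per-system metric scores with per-system human scores. A minimal sketch, assuming Pearson correlation and made-up placeholder numbers:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

if __name__ == "__main__":
    metric_scores = [0.31, 0.28, 0.35, 0.30]   # placeholder per-system metric scores
    human_scores = [72.1, 69.5, 75.0, 71.2]    # placeholder per-system human scores
    print(pearson(metric_scores, human_scores))
```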
Proceedings Article

CDER: Efficient MT Evaluation Using Block Movements.

TL;DR: A new evaluation measure is presented that explicitly models block reordering as an edit operation and can be calculated exactly in quadratic time; it is also shown how some evaluation measures can be improved by introducing word-dependent substitution costs.
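To illustrate the flavour of such a measure, the sketch below adds a constant-cost "long jump" to the standard edit-distance dynamic programme as a crude stand-in for block reordering. It is a simplification for illustration, not the exact CDER recursion from the cited paper.

```python
def longjump_edit_distance(hyp, ref, jump_cost=1):
    """Word-level Levenshtein distance plus a constant-cost 'jump' that lets the
    alignment restart anywhere in the reference after each hypothesis word.
    Runs in O(len(hyp) * len(ref)) time."""
    m, n = len(hyp), len(ref)
    prev = list(range(n + 1))               # row for zero hypothesis words consumed
    for i in range(1, m + 1):
        curr = [prev[0] + 1] + [0] * n
        for j in range(1, n + 1):
            sub = 0 if hyp[i - 1] == ref[j - 1] else 1
            curr[j] = min(prev[j] + 1,      # delete hyp word
                          curr[j - 1] + 1,  # insert ref word
                          prev[j - 1] + sub)
        # Long jump: from the cheapest cell in this row, allow restarting the
        # alignment at any reference position for a single constant cost.
        best = min(curr)
        curr = [min(c, best + jump_cost) for c in curr]
        prev = curr
    return prev[n]
```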
Book ChapterDOI

The significance of recall in automatic metrics for MT evaluation

TL;DR: This work shows that correlation with human judgments is highest when almost all of the weight is assigned to recall, and that stemming is significantly beneficial not just to simpler unigram precision- and recall-based metrics but also to BLEU and NIST.
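The recall-weighting finding can be illustrated with a weighted harmonic mean of unigram precision and recall. The value alpha = 0.9 below is an illustrative choice, not the weight tuned in the paper.

```python
from collections import Counter

def unigram_prf(cand, ref, alpha=0.9):
    """Return (precision, recall, weighted F) for clipped unigram matches."""
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    # Weighted harmonic mean; alpha close to 1 means recall dominates.
    f_weighted = precision * recall / (alpha * precision + (1 - alpha) * recall)
    return precision, recall, f_weighted
```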
References
Proceedings ArticleDOI

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
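For context, here is a compact, single-reference, unsmoothed sketch of the BLEU computation (clipped n-gram precisions, geometric mean, brevity penalty). Production implementations such as sacreBLEU additionally handle multiple references and smoothing.

```python
from collections import Counter
from math import exp, log

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(cand, ref, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = Counter(ngrams(cand, n)), Counter(ngrams(ref, n))
        clipped = sum((cand_counts & ref_counts).values())   # clipped n-gram matches
        total = max(sum(cand_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0                                           # unsmoothed: any zero precision kills the score
    brevity = min(1.0, exp(1 - len(ref) / len(cand)))        # penalise overly short candidates
    return brevity * exp(sum(log(p) for p in precisions) / max_n)
```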
Book

Rank correlation methods

TL;DR: This book introduces the measurement of rank correlation, covering tied ranks, tests of significance, the problem of m rankings, and the use of variate values to measure rank correlation.
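Since this reference is cited for rank correlation, here is a minimal sketch of Kendall's tau computed from concordant and discordant pairs. Ties are ignored for brevity; the book also treats tied ranks and significance tests.

```python
def kendall_tau(xs, ys):
    """Kendall's tau-a between two equal-length score lists (no tie handling)."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```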
Proceedings ArticleDOI

Automatic evaluation of machine translation quality using n-gram co-occurrence statistics

TL;DR: NIST was commissioned to develop an MT evaluation facility based on the IBM work; the resulting facility is now available from NIST and serves as the primary evaluation measure for TIDES MT research.