A novel string-to-string distance measure with applications to machine translation evaluation

TLDR
The authors introduce a string-to-string distance measure that extends the edit distance with block transpositions as a constant-cost edit operation, and demonstrate how this distance measure can be used as an evaluation criterion in machine translation.
Abstract
We introduce a string-to-string distance measure which extends the edit distance by block transpositions as a constant-cost edit operation. An algorithm for calculating this distance measure in polynomial time is presented. We then demonstrate how this distance measure can be used as an evaluation criterion in machine translation. The correlation between this evaluation criterion and human judgment is systematically compared with that of other automatic evaluation measures on two translation tasks. In general, like other automatic evaluation measures, the criterion shows low correlation at the sentence level but good correlation at the system level.
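For context, the measure builds on the plain word-level edit distance (Levenshtein distance). Below is a minimal sketch of that baseline only; the block-transposition extension and the paper's polynomial-time algorithm are not reproduced here, and the example sentences are illustrative.

```python
# Baseline word-level edit distance (Levenshtein); the paper's measure extends
# this with block transpositions as a constant-cost edit operation.
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    m, n = len(hyp), len(ref)
    # dist[i][j] = cost of transforming hyp[:i] into ref[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i                      # deletions only
    for j in range(1, n + 1):
        dist[0][j] = j                      # insertions only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if hyp[i - 1] == ref[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # delete hyp word
                             dist[i][j - 1] + 1,        # insert ref word
                             dist[i - 1][j - 1] + sub)  # match / substitution
    return dist[m][n]


if __name__ == "__main__":
    hyp = "the cat sat on the mat".split()
    ref = "on the mat the cat sat".split()
    # Plain edit distance charges 6 word-level edits for this block swap;
    # a constant-cost block transposition would treat it as a single operation.
    print(edit_distance(hyp, ref))
```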


Citations
Proceedings ArticleDOI

Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics

TL;DR: Two new objective automatic evaluation methods for machine translation are described: one based on the longest common subsequence between a candidate translation and a set of reference translations, and one that relaxes strict n-gram matching to skip-bigram matching.
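A hedged sketch of the two statistics named above: LCS length and skip-bigram overlap between a candidate and a reference. The function names and the F1 weighting are illustrative choices, not the exact formulation of the cited paper.

```python
from collections import Counter
from itertools import combinations

def lcs_length(cand, ref):
    """Length of the longest common subsequence of two token lists."""
    m, n = len(cand), len(ref)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if cand[i - 1] == ref[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]

def skip_bigrams(tokens):
    """All ordered word pairs in sentence order, arbitrary gaps allowed."""
    return Counter(combinations(tokens, 2))

def skip_bigram_f1(cand, ref):
    """Harmonic mean of skip-bigram precision and recall (illustrative weighting)."""
    cand_sb, ref_sb = skip_bigrams(cand), skip_bigrams(ref)
    overlap = sum((cand_sb & ref_sb).values())      # clipped pair matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_sb.values())
    recall = overlap / sum(ref_sb.values())
    return 2 * precision * recall / (precision + recall)
```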
Proceedings ArticleDOI

ORANGE: a method for evaluating automatic evaluation metrics for machine translation

TL;DR: A new method, ORANGE, is introduced for evaluating automatic machine translation evaluation metrics automatically, with no human involvement beyond a set of reference translations.
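The core idea, as summarised, is to judge a metric by how highly it ranks the human references when they are pooled with machine translations of the same source sentence. The sketch below illustrates that ranking scheme under simplifying assumptions; the `metric` callable and the averaging are placeholders, not the paper's exact procedure.

```python
def average_reference_rank(segments, metric):
    """
    segments: iterable of (machine_outputs, references) pairs, one per source
              sentence, each element being a candidate translation (token list).
    metric:   callable(candidate) -> score, higher = better (placeholder interface).
    Returns the mean rank of the human references among all pooled candidates;
    a lower value suggests the metric prefers reference-like translations.
    """
    ranks = []
    for machine_outputs, references in segments:
        pool = machine_outputs + references
        ordered = sorted(pool, key=metric, reverse=True)   # best-scoring first
        for reference in references:
            ranks.append(ordered.index(reference) + 1)     # 1-based rank
    return sum(ranks) / len(ranks)
```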
Proceedings ArticleDOI

Results of the WMT19 metrics shared task: segment-level and strong MT systems pose big challenges

TL;DR: Metrics were asked to score the outputs of the translation systems competing in the WMT19 News Translation Task. The metrics were evaluated at the system level (how well a given metric correlates with the WMT19 official manual ranking) and at the segment level (how well the metric correlates with human judgements of segment quality).
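System-level meta-evaluation of this kind typically reduces to correlating per-system metric scores with per-system human scores. A minimal sketch, assuming Pearson correlation and made-up placeholder numbers:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

if __name__ == "__main__":
    metric_scores = [0.31, 0.28, 0.35, 0.30]   # placeholder per-system metric scores
    human_scores = [72.1, 69.5, 75.0, 71.2]    # placeholder per-system human scores
    print(pearson(metric_scores, human_scores))
```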
Proceedings Article

CDER: Efficient MT Evaluation Using Block Movements.

TL;DR: A new evaluation measure is presented that explicitly models block reordering as an edit operation and can be calculated exactly in quadratic time; it is also shown how some evaluation measures can be improved by introducing word-dependent substitution costs.
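To illustrate the flavour of such a measure, the sketch below adds a constant-cost "long jump" to the standard edit-distance dynamic programme as a crude stand-in for block reordering. It is a simplification for illustration, not the exact CDER recursion from the cited paper.

```python
def longjump_edit_distance(hyp, ref, jump_cost=1):
    """Word-level Levenshtein distance plus a constant-cost 'jump' that lets the
    alignment restart anywhere in the reference after each hypothesis word.
    Runs in O(len(hyp) * len(ref)) time."""
    m, n = len(hyp), len(ref)
    prev = list(range(n + 1))               # row for zero hypothesis words consumed
    for i in range(1, m + 1):
        curr = [prev[0] + 1] + [0] * n
        for j in range(1, n + 1):
            sub = 0 if hyp[i - 1] == ref[j - 1] else 1
            curr[j] = min(prev[j] + 1,      # delete hyp word
                          curr[j - 1] + 1,  # insert ref word
                          prev[j - 1] + sub)
        # Long jump: from the cheapest cell in this row, allow restarting the
        # alignment at any reference position for a single constant cost.
        best = min(curr)
        curr = [min(c, best + jump_cost) for c in curr]
        prev = curr
    return prev[n]
```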
Book ChapterDOI

The significance of recall in automatic metrics for MT evaluation

TL;DR: This work shows that correlation with human judgments is highest when almost all of the weight is assigned to recall, and that stemming is significantly beneficial not just to simpler unigram precision- and recall-based metrics but also to BLEU and NIST.
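The recall-weighting finding can be illustrated with a weighted harmonic mean of unigram precision and recall. The value alpha = 0.9 below is an illustrative choice, not the weight tuned in the paper.

```python
from collections import Counter

def unigram_prf(cand, ref, alpha=0.9):
    """Return (precision, recall, weighted F) for clipped unigram matches."""
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    # Weighted harmonic mean; alpha close to 1 means recall dominates.
    f_weighted = precision * recall / (alpha * precision + (1 - alpha) * recall)
    return precision, recall, f_weighted
```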
References
Proceedings ArticleDOI

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
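For context, here is a compact, single-reference, unsmoothed sketch of the BLEU computation (clipped n-gram precisions, geometric mean, brevity penalty). Production implementations such as sacreBLEU additionally handle multiple references and smoothing.

```python
from collections import Counter
from math import exp, log

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(cand, ref, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = Counter(ngrams(cand, n)), Counter(ngrams(ref, n))
        clipped = sum((cand_counts & ref_counts).values())   # clipped n-gram matches
        total = max(sum(cand_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0                                           # unsmoothed: any zero precision kills the score
    brevity = min(1.0, exp(1 - len(ref) / len(cand)))        # penalise overly short candidates
    return brevity * exp(sum(log(p) for p in precisions) / max_n)
```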
Book

Rank correlation methods

TL;DR: This book introduces the measurement of rank correlation, covering tied ranks, tests of significance, the problem of m rankings, and the use of variate values to measure rank correlation.
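Since this reference is cited for rank correlation, here is a minimal sketch of Kendall's tau computed from concordant and discordant pairs. Ties are ignored for brevity; the book also treats tied ranks and significance tests.

```python
def kendall_tau(xs, ys):
    """Kendall's tau-a between two equal-length score lists (no tie handling)."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```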
Proceedings ArticleDOI

Automatic evaluation of machine translation quality using n-gram co-occurrence statistics

TL;DR: NIST was commissioned to develop an MT evaluation facility based on the IBM work; the resulting facility is now available from NIST and serves as the primary evaluation measure for TIDES MT research.