Open Access
A novel string-to-string distance measure with applications to machine translation evaluation
Abstract
We introduce a string-to-string distance measure which extends the edit distance by block transpositions as a constant-cost edit operation. An algorithm for calculating this distance measure in polynomial time is presented. We then demonstrate how this distance measure can be used as an evaluation criterion in machine translation. The correlation between this evaluation criterion and human judgment is systematically compared with that of other automatic evaluation measures on two translation tasks. In general, like other automatic evaluation measures, the criterion shows low correlation at the sentence level but good correlation at the system level.
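The base operation set here is the standard Levenshtein edit distance (insertions, deletions, substitutions), which the paper extends with constant-cost block transpositions. For reference, a minimal dynamic-programming sketch of that base distance over word sequences (the block-transposition extension itself is not shown; function name is illustrative):

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b, O(len(a) * len(b))."""
    m, n = len(a), len(b)
    # dp[i][j] = edit distance between a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # match / substitution
    return dp[m][n]

print(edit_distance(list("kitten"), list("sitting")))        # -> 3
print(edit_distance("the cat sat".split(),
                    "the dog sat".split()))                  # -> 1
```

In MT evaluation the sequences are word tokens rather than characters, as in the second call above; the paper's extension additionally allows moving a contiguous block of words at constant cost.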
Citations
Proceedings Article
Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics
Chin-Yew Lin, Franz Josef Och, et al.
TL;DR: Two new objective automatic evaluation methods for machine translation are described: one based on the longest common subsequence between a candidate translation and a set of reference translations, and one that relaxes strict n-gram matching to skip-bigram matching.
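The LCS statistic underlying the first of these methods is computed with the standard quadratic dynamic program. A minimal sketch over word sequences (function name is illustrative, not from the cited paper):

```python
def lcs_length(ref, cand):
    """Length of the longest common subsequence of ref and cand, O(len(ref) * len(cand))."""
    m, n = len(ref), len(cand)
    # dp[i][j] = LCS length of ref[:i] and cand[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if ref[i - 1] == cand[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1   # extend a common subsequence
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

ref = "the cat sat on the mat".split()
cand = "the cat was sat on mat".split()
print(lcs_length(ref, cand))  # -> 5 ("the cat sat on mat")
```

Unlike strict n-gram matching, LCS rewards in-order matches even when they are not contiguous, which is what makes it attractive for scoring translations with word reorderings and insertions.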
Proceedings Article
ORANGE: a method for evaluating automatic evaluation metrics for machine translation
Chin-Yew Lin, Franz Josef Och, et al.
TL;DR: A new evaluation method, Orange, is introduced for evaluating automatic machine translation evaluation metrics automatically without extra human involvement other than using a set of reference translations.
Proceedings Article
Results of the WMT19 metrics shared task: segment-level and strong MT systems pose big challenges
TL;DR: Automatic metrics were asked to score the outputs of the translation systems competing in the WMT19 News Translation Task; the metrics were evaluated both at the system level, on how well a given metric correlates with the WMT19 official manual ranking, and at the segment level, on how well the metric correlates with human judgements of segment quality.
Proceedings Article
CDER: Efficient MT Evaluation Using Block Movements
TL;DR: A new evaluation measure is presented which explicitly models block reordering as an edit operation and can be calculated exactly in quadratic time; it is also shown how some evaluation measures can be improved by introducing word-dependent substitution costs.
Book Chapter
The significance of recall in automatic metrics for MT evaluation
TL;DR: This work shows that correlation with human judgments is highest when almost all of the weight is assigned to recall, and that stemming is significantly beneficial not only to simpler unigram precision- and recall-based metrics but also to BLEU and NIST.
References
Proceedings Article
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Journal Article
Binary codes capable of correcting deletions, insertions, and reversals
Book
Rank correlation methods
TL;DR: This book introduces the measurement of rank correlation, covering tied ranks, tests of significance, the problem of m rankings, and rank correlation with variate values.
Proceedings Article
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
TL;DR: DARPA commissioned NIST to develop an MT evaluation facility based on the IBM (BLEU) work; the resulting measure is now available from NIST and serves as the primary evaluation measure for TIDES MT research.
Related Papers
METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments
Satanjeev Banerjee, Alon Lavie, et al.