Journal Article

Statistical Analysis of Machine Translation Evaluation Systems for English-Hindi Language Pair

TLDR
The importance of automatic machine translation evaluation is discussed, and several machine translation evaluation metrics are compared through statistical analysis of metric scores and human evaluations to find out which metric has the highest correlation with human scores.
Abstract
Automatic Machine Translation (AMT) evaluation metrics have become popular in the machine translation community in recent times, driven by the growing popularity of machine translation engines and of machine translation as a field. Translation is an important tool for breaking barriers between communities, especially in countries like India, where people speak 22 different languages and their many variations. With the rise of machine translation engines, there is a need for a system that evaluates how well they perform, and this is the role of machine translation evaluation. This paper discusses the importance of automatic machine translation evaluation and compares several evaluation metrics by statistically analyzing metric scores and human evaluations to find out which metric has the highest correlation with human scores. The correlation between the automatic and human evaluation scores, and the correlation among the five automatic evaluation scores, are examined at the sentence level. In addition, a hypothesis is set up and p-values are calculated to determine how significant these correlations are. The results are presented as graphs that show the trend of the correlation between the automatic metric scores and human scores. Of the five metrics considered in the study, METEOR shows the highest correlation with human scores.
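The sentence-level analysis described in the abstract can be illustrated with a small sketch. The abstract does not say which correlation coefficient or significance test was used, so the code below assumes Pearson's r and estimates a two-sided p-value with a permutation test for the null hypothesis of no correlation. The function names and the score lists are hypothetical and are not data from the paper.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=10_000, seed=0):
    """Two-sided p-value for H0 (no correlation), estimated by shuffling ys."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical sentence-level scores (illustrative only, not from the paper).
human_scores = [0.90, 0.40, 0.75, 0.20, 0.60, 0.85, 0.30, 0.55]
metric_scores = [0.82, 0.35, 0.70, 0.30, 0.50, 0.88, 0.25, 0.60]

r = pearson_r(human_scores, metric_scores)
p = permutation_p_value(human_scores, metric_scores)
print(f"r = {r:.3f}, p = {p:.4f}")
```

Repeating this for each metric against the human scores, and for each pair of metrics, would give the two kinds of correlation the abstract mentions. A rank-based coefficient such as Spearman's could be substituted if the human scores are ordinal.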


References
Proceedings Article

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Proceedings Article

METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments

TL;DR: This paper describes METEOR, an automatic metric for machine translation evaluation that is based on a generalized concept of unigram matching between the machine-produced translation and human-produced reference translations and can be easily extended to include more advanced matching strategies.
Proceedings Article

Automatic evaluation of machine translation quality using n-gram co-occurrence statistics

TL;DR: DARPA commissioned NIST to develop an MT evaluation facility based on the IBM work, which is now available from NIST and serves as the primary evaluation measure for TIDES MT research.
Report

Evaluation of Machine Translation and its Evaluation

TL;DR: The unigram-based F-measure has significantly higher correlation with human judgments than recently proposed alternatives and has an intuitive graphical interpretation, which can facilitate insight into how MT systems might be improved.
Proceedings Article

AMBER: A Modified BLEU, Enhanced Ranking Metric

TL;DR: A new automatic machine translation evaluation metric is proposed: AMBER, which is based on the metric BLEU but incorporates recall, extra penalties, and some text processing variants and achieves state-of-the-art performance.
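
Several of the metrics listed above build on unigram matching between a machine-produced translation and a human reference. As a rough illustration of that shared idea only, the sketch below computes unigram precision, recall, and their harmonic mean on whitespace tokens. It is not an implementation of METEOR, the F-measure metric, BLEU, NIST, or AMBER, each of which adds further components. The function name and the example sentences are hypothetical.

```python
from collections import Counter


def unigram_prf(candidate, reference):
    """Unigram precision, recall, and F1 between two whitespace-tokenized strings."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection clips each word's matches to the smaller count.
    matches = sum((cand_counts & ref_counts).values())
    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())
    precision = matches / cand_total if cand_total else 0.0
    recall = matches / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example pair (illustrative only).
precision, recall, f1 = unigram_prf(
    "the cat sat on the mat",
    "the cat is on the mat",
)
print(f"P = {precision:.3f}, R = {recall:.3f}, F1 = {f1:.3f}")
```

Here five of the six candidate tokens match the reference, so precision and recall are both 5/6. Sentence-level scores of this kind are what would be correlated against human judgments.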