Open Access · Proceedings Article · DOI

Automatic evaluation of summaries using N-gram co-occurrence statistics

TL;DR: The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations across various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
Abstract
Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations across various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
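The unigram co-occurrence idea described in the abstract can be sketched as a recall-oriented overlap score. This is a simplified illustration under assumed tokenization (lowercased whitespace splitting); the paper's exact formulation may differ:

```python
from collections import Counter

def unigram_cooccurrence(candidate: str, reference: str) -> float:
    """Recall-oriented unigram overlap between a candidate summary and a
    reference summary, with clipped counts (hypothetical sketch, not the
    paper's exact metric)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Count each candidate unigram at most as often as it occurs in the reference.
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    # Normalize by reference length, making the score recall-oriented.
    return overlap / max(sum(ref.values()), 1)

score = unigram_cooccurrence("the cat sat on the mat",
                             "the cat lay on the mat")
# Five of the six reference unigrams are matched: score = 5/6
```

Normalizing by the reference length (recall) rather than the candidate length (precision) is what distinguishes this family of summary-evaluation scores from BLEU's precision orientation.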


Citations
Proceedings Article

ROUGE: A Package for Automatic Evaluation of Summaries

TL;DR: Four different ROUGE measures are introduced: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S, all included in the ROUGE summarization evaluation package, together with evaluations of each.
Proceedings Article

TextRank: Bringing Order into Text

Rada Mihalcea et al.
TL;DR: TextRank, a graph-based ranking model for text processing, is introduced and it is shown how this model can be successfully used in natural language applications.
Journal ArticleDOI

LexRank: graph-based lexical centrality as salience in text summarization

TL;DR: LexRank, a stochastic graph-based method for computing the relative importance of textual units in natural language processing, is introduced; it is based on the concept of eigenvector centrality.
Journal ArticleDOI

Multimodal Machine Learning: A Survey and Taxonomy

TL;DR: This paper surveys recent advances in multimodal machine learning and presents them in a common taxonomy, enabling researchers to better understand the state of the field and identify directions for future research.
Journal ArticleDOI

Inter-coder agreement for computational linguistics

TL;DR: It is argued that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in computational linguistics, may be more appropriate for many corpus annotation tasks—but that their use makes the interpretation of the value of the coefficient even harder.
References
Proceedings ArticleDOI

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Journal ArticleDOI

An algorithm for suffix stripping

TL;DR: An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL, and performs slightly better than a much more elaborate system with which it has been compared.
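The suffix-stripping idea can be illustrated with a deliberately simplified single rule pass. This is a hypothetical sketch, not the Porter algorithm itself, which applies five ordered phases of rules guarded by stem-measure conditions:

```python
def strip_suffix(word: str) -> str:
    """One simplified suffix-stripping pass (illustration only; the real
    Porter stemmer uses ordered rule phases with measure conditions)."""
    # Rules are tried longest-suffix-first, as in Porter's step 1a.
    rules = [("sses", "ss"), ("ies", "i"), ("ing", ""), ("ed", ""), ("s", "")]
    for suffix, replacement in rules:
        # Require a remaining stem of at least two characters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: len(word) - len(suffix)] + replacement
    return word

print(strip_suffix("caresses"))  # caress
print(strip_suffix("ponies"))    # poni
print(strip_suffix("cats"))      # cat
```

Trying longer suffixes before shorter ones matters: without that ordering, "caresses" would lose only its final "s" and stem to "caresse".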
Proceedings ArticleDOI

Automatic evaluation of machine translation quality using n-gram co-occurrence statistics

TL;DR: DARPA commissioned NIST to develop an MT evaluation facility based on the IBM work; the resulting facility is now available from NIST and serves as the primary evaluation measure for TIDES MT research.
Book

Evaluating Natural Language Processing Systems: An Analysis and Review

TL;DR: This comprehensive state-of-the-art book is the first devoted to the important and timely issue of evaluating NLP systems, and provides a wide-ranging and careful analysis of evaluation concepts, reinforced with extensive illustrations.
Proceedings Article

Tracking and summarizing news on a daily basis with Columbia's Newsblaster

TL;DR: Columbia's Newsblaster system for online news summarization is presented, a system that crawls the web for news articles, clusters them on specific topics and produces multidocument summaries for each cluster.