Open Access Proceedings Article

ROUGE: A Package for Automatic Evaluation of Summaries

TLDR
Four different ROUGE measures are introduced: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S, all included in the ROUGE summarization evaluation package, along with their evaluations.
Abstract
ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. The measures count the number of overlapping units such as n-grams, word sequences, and word pairs between the computer-generated summary to be evaluated and the ideal summaries created by humans. This paper introduces four different ROUGE measures: ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S, included in the ROUGE summarization evaluation package, and their evaluations. Three of them have been used in the Document Understanding Conference (DUC) 2004, a large-scale summarization evaluation sponsored by NIST.
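To make the overlap counting concrete, the sketch below computes a ROUGE-N style n-gram recall and the longest-common-subsequence length that underlies ROUGE-L. This is a minimal illustration assuming whitespace tokenization and a single reference summary; the function names are illustrative and are not the actual ROUGE package API.

```python
# Minimal sketch of ROUGE-style overlap counting (illustrative, not the
# official ROUGE package API). Assumes whitespace tokenization and one
# reference summary.
from collections import Counter


def ngrams(tokens, n):
    """Return the multiset (Counter) of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N style recall: clipped overlapping n-grams / reference n-grams."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists,
    the quantity on which ROUGE-L is based."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


candidate = "the cat was found under the bed"
reference = "the cat was under the bed"
print(rouge_n_recall(candidate, reference, n=1))          # unigram recall
print(lcs_length(candidate.split(), reference.split()))   # LCS word count
```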



Citations
Proceedings Article

A Neural Attention Model for Abstractive Sentence Summarization

TL;DR: The authors propose a local attention-based model that generates each word of the summary conditioned on the input sentence, which shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
Posted Content

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering.

TL;DR: A combined bottom-up and top-down attention mechanism that enables attention to be calculated at the level of objects and other salient image regions is proposed, demonstrating the broad applicability of this approach to VQA.
Posted Content

Microsoft COCO Captions: Data Collection and Evaluation Server

TL;DR: The Microsoft COCO Caption dataset and evaluation server are described and several popular metrics, including BLEU, METEOR, ROUGE and CIDEr are used to score candidate captions.
Proceedings Article

fairseq: A Fast, Extensible Toolkit for Sequence Modeling

TL;DR: fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks, and that supports distributed training across multiple GPUs and machines.
References
Book

Introduction to Algorithms

TL;DR: The updated edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures; it presents a rich variety of algorithms and covers them in considerable depth while keeping their design and analysis accessible to readers at all levels.
Proceedings Article

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Book

Bootstrap Methods and Their Application

TL;DR: This book gives broad and up-to-date coverage of bootstrap methods, with numerous applied examples, developed in a coherent way with the necessary theoretical basis, along with a disk of purpose-written S-Plus programs for implementing the methods described in the text.
Proceedings Article

Automatic evaluation of summaries using N-gram co-occurrence statistics

TL;DR: The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations based on various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.