
Showing papers on "Multi-document summarization" published in 2004


Proceedings Article
25 Jul 2004
TL;DR: Introduces the four ROUGE measures (ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S) included in the ROUGE summarization evaluation package, together with their evaluations.
Abstract: ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It includes measures to automatically determine the quality of a summary by comparing it to other (ideal) summaries created by humans. The measures count the number of overlapping units such as n-grams, word sequences, and word pairs between the computer-generated summary to be evaluated and the ideal summaries created by humans. This paper introduces four different ROUGE measures (ROUGE-N, ROUGE-L, ROUGE-W, and ROUGE-S) included in the ROUGE summarization evaluation package, together with their evaluations. Three of them have been used in the Document Understanding Conference (DUC) 2004, a large-scale summarization evaluation sponsored by NIST.
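
A minimal sketch of the ROUGE-N idea, n-gram recall against a set of reference summaries (the function names are illustrative, not the official package's API):

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset of word n-grams."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, references, n=2):
    """Recall-oriented n-gram overlap: clipped matches over reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    match = total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        # Count each reference n-gram at most as often as it occurs in the candidate.
        match += sum(min(c, cand.get(g, 0)) for g, c in ref_counts.items())
    return match / total if total else 0.0
```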

9,293 citations


Journal ArticleDOI
TL;DR: LexRank is a stochastic graph-based method for computing the relative importance of textual units for Natural Language Processing (NLP), based on the concept of eigenvector centrality.
Abstract: We introduce a stochastic graph-based method for computing the relative importance of textual units for Natural Language Processing. We test the technique on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We consider a new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences. In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences. Our system, based on LexRank, ranked in first place in more than one task in the recent DUC 2004 evaluation. In this paper, we present a detailed analysis of our approach and apply it to a larger data set including data from earlier DUC evaluations. We discuss several methods to compute centrality using the similarity graph. The results show that degree-based methods (including LexRank) outperform both centroid-based methods and other systems participating in DUC in most cases. Furthermore, LexRank with threshold outperforms the other degree-based techniques, including continuous LexRank. We also show that our approach is quite insensitive to noise in the data that may result from an imperfect topical clustering of documents.
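
The thresholded LexRank computation can be sketched as TF-IDF cosine similarity plus power iteration over the resulting graph; the damping factor, threshold, and tolerance below are illustrative assumptions, not the paper's tuned settings:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lexrank(sentences, threshold=0.1, damping=0.85, tol=1e-6):
    """Score sentences by eigenvector centrality of their cosine-similarity graph."""
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    adj = (sim >= threshold).astype(float)      # thresholded adjacency matrix
    adj /= adj.sum(axis=1, keepdims=True)       # make rows stochastic
    n = len(sentences)
    scores = np.full(n, 1.0 / n)
    for _ in range(100):                        # power iteration
        new = (1 - damping) / n + damping * adj.T @ scores
        converged = np.abs(new - scores).sum() < tol
        scores = new
        if converged:
            break
    return scores
```

The highest-scoring sentences are then extracted, typically after a redundancy check, to form the summary.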

2,367 citations


Journal ArticleDOI
TL;DR: A multi-document summarizer, MEAD, is presented, which generates summaries using cluster centroids produced by a topic detection and tracking system and an evaluation scheme based on sentence utility and subsumption is applied.
Abstract: We present a multi-document summarizer, MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We describe two new techniques, a centroid-based summarizer, and an evaluation scheme based on sentence utility and subsumption. We have applied this evaluation to both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-document summarization.
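
A toy sketch of centroid-based sentence scoring in the spirit of MEAD; the full system combines the centroid value with positional and other features, which are omitted here:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def centroid_scores(cluster_sentences):
    """Score each sentence by the centroid weight of the words it contains."""
    X = TfidfVectorizer().fit_transform(cluster_sentences).toarray()
    centroid = X.mean(axis=0)            # pseudo-sentence for the cluster
    # A sentence's score is the centroid mass of the terms it shares with it.
    return [(centroid * (row > 0)).sum() for row in X]
```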

1,121 citations


Proceedings ArticleDOI
01 Jan 2004
TL;DR: It is argued that the method presented is reliable, predictive, and diagnostic, and thus improves considerably on the human evaluation method currently used in the Document Understanding Conference.
Abstract: We present an empirically grounded method for evaluating content selection in summarization. It incorporates the idea that no single best model summary exists for a collection of documents. Our method quantifies the relative importance of the facts to be conveyed. We argue that it is reliable, predictive, and diagnostic, and thus improves considerably on the shortcomings of the human evaluation method currently used in the Document Understanding Conference.
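
One way to make "quantifies the relative importance of facts" concrete: weight each content unit by the number of model summaries that express it, and normalize a peer summary's total weight by the best total achievable with the same number of units. A sketch under those assumptions (identifying the units themselves is a manual step):

```python
from collections import Counter

def content_score(peer_units, model_unit_sets):
    """Score a peer summary's content units against several model summaries."""
    # A unit's weight = how many model summaries express it.
    weights = Counter(u for units in model_unit_sets for u in set(units))
    peer = set(peer_units)
    observed = sum(weights.get(u, 0) for u in peer)
    # Best achievable total for a summary with the same number of units.
    ideal = sum(sorted(weights.values(), reverse=True)[:len(peer)])
    return observed / ideal if ideal else 0.0
```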

640 citations


Proceedings ArticleDOI
01 May 2004
TL;DR: The functionality of MEAD is described, a comprehensive, public domain, open source, multidocument multilingual summarization environment that has thus far been downloaded by more than 500 organizations.
Abstract: This paper describes the functionality of MEAD, a comprehensive, public domain, open source, multidocument multilingual summarization environment that has been thus far downloaded by more than 500 organizations. MEAD has been used in a variety of summarization applications ranging from summarization for mobile devices to Web page summarization within a search engine and to novelty detection.

378 citations


Proceedings ArticleDOI
25 Jul 2004
TL;DR: This paper gives empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms and proposes a new Web summarization-based classification algorithm that achieves an approximately 8.8% improvement over pure-text-based methods.
Abstract: Web-page classification is much more difficult than pure-text classification due to the large variety of noisy information embedded in Web pages. In this paper, we propose a new Web-page classification algorithm based on Web summarization for improving accuracy. We first give empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms. We then propose a new Web summarization-based classification algorithm and evaluate it, along with several other state-of-the-art text summarization algorithms, on the LookSmart Web directory. Experimental results show that our proposed summarization-based classification algorithm achieves an approximately 8.8% improvement compared to a pure-text-based classification algorithm. We further introduce an ensemble classifier using the improved summarization algorithm and show that it achieves about a 12.9% improvement over pure-text-based methods.

204 citations


Proceedings ArticleDOI
06 May 2004
TL;DR: A semantic abstraction approach to automatic summarization in the biomedical domain relies on a semantic processor that functions as the source interpreter and produces a list of predications, ultimately generating a conceptual condensate for a disorder input topic.
Abstract: We explore a semantic abstraction approach to automatic summarization in the biomedical domain. The approach relies on a semantic processor that functions as the source interpreter and produces a list of predications. A transformation stage then generalizes and condenses this list, ultimately generating a conceptual condensate for a disorder input topic. The final condensate is displayed in graphical form. We provide a set of principles for the transformation stage and describe the application of this approach to multidocument input. Finally, we examine the characteristics and quality of the condensates produced.

126 citations


01 Jul 2004
TL;DR: SmartMail, a prototype system for automatically identifying action items (tasks) in email messages, presents the user with a task-focused summary of a message that contains a list of action items extracted from the message.
Abstract: We describe SmartMail, a prototype system for automatically identifying action items (tasks) in email messages. SmartMail presents the user with a task-focused summary of a message. The summary consists of a list of action items extracted from the message. The user can add these action items to their “to do” list.

119 citations


Proceedings ArticleDOI
23 Aug 2004
TL;DR: It is shown how simplifying parentheticals by removing relative clauses and appositives results in improved sentence clustering, by forcing clustering based on central rather than background information.
Abstract: In this paper, we explore the use of automatic syntactic simplification for improving content selection in multi-document summarization. In particular, we show how simplifying parentheticals by removing relative clauses and appositives results in improved sentence clustering, by forcing clustering based on central rather than background information. We argue that the inclusion of parenthetical information in a summary is a reference-generation task rather than a content-selection one, and implement a baseline reference rewriting module. We perform our evaluations on the test sets from the 2003 and 2004 Document Understanding Conference and report that simplifying parentheticals results in significant improvement on the automated evaluation metric Rouge.
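
A rough approximation of the simplification step using a modern dependency parser (spaCy here, not the tooling the authors used in 2004): drop any subtree attached as an appositive or relative clause:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def simplify(sentence):
    """Remove appositives and relative clauses before clustering."""
    doc = nlp(sentence)
    drop = set()
    for tok in doc:
        if tok.dep_ in ("appos", "relcl"):      # parenthetical attachments
            drop.update(t.i for t in tok.subtree)
    # Note: leftover punctuation around removed spans is not cleaned up here.
    return " ".join(t.text for t in doc if t.i not in drop)
```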

106 citations


Proceedings ArticleDOI
02 May 2004
TL;DR: The new multilingual version of the Columbia Newsblaster news summarization system automatically collects, organizes, and summarizes news in multiple source languages, allowing the user to browse news topics with English summaries, and compare perspectives from different countries on the topics.
Abstract: We present the new multilingual version of the Columbia Newsblaster news summarization system. The system addresses the problem of user access to news in multiple languages from multiple sites on the Internet. It automatically collects, organizes, and summarizes news in multiple source languages, allowing the user to browse news topics with English summaries and compare perspectives from different countries on those topics.

84 citations


Journal ArticleDOI
TL;DR: This article proposes, in addition to the classification capacity of clustering techniques, the possibility of offering an indicative extract of the contents of several sources by means of multidocument summarization techniques.
Abstract: An increasingly common problem in effective information access is the presence in the same corpus of multiple documents that contain similar information. Generally, users may be interested in locating, for a topic addressed by a group of similar documents, one or several particular aspects. This kind of task, called instance or aspectual retrieval, has been explored in several TREC Interactive Tracks. In this article, we propose, in addition to the classification capacity of clustering techniques, the possibility of offering an indicative extract of the contents of several sources by means of multidocument summarization techniques. Two kinds of summaries are provided. The first covers the similarities of each cluster of documents retrieved. The second shows the particularities of each document with respect to the common topic in the cluster. The multitopic structure of the documents has been used to determine similarities and differences of topics in the cluster of documents. The system is independent of document domain and genre. An evaluation of the proposed system with users shows significant improvements in effectiveness. The results of previous experiments comparing clustering algorithms are also reported.

01 Jan 2004
TL;DR: LetSum (Legal text Summarizer), a prototype system, is described, which determines the thematic structure of a judgment in four themes (INTRODUCTION, CONTEXT, JURIDICAL ANALYSIS, and CONCLUSION) and identifies the relevant sentences for each theme.
Abstract: This paper presents our work on the development of a new methodology for the automatic summarization of court decisions. We describe LetSum (Legal text Summarizer), a prototype system, which determines the thematic structure of a judgment in four themes: INTRODUCTION, CONTEXT, JURIDICAL ANALYSIS, and CONCLUSION. It then identifies the relevant sentences for each theme. We discuss the evaluation of the produced summaries with a statistical method and also a human evaluation based on jurists' judgments. The results so far indicate good performance of the system when compared with other summarization technologies.

Journal ArticleDOI
01 Jul 2004
TL;DR: The outline of Text Summarization Challenge 2 (TSC2 hereafter), a sequel text summarization evaluation conducted as one of the tasks at the NTCIR Workshop 3, is reported.
Abstract: We report the outline of Text Summarization Challenge 2 (TSC2 hereafter), a sequel text summarization evaluation conducted as one of the tasks at the NTCIR Workshop 3. First, we briefly describe the previous evaluation, Text Summarization Challenge (TSC1), as an introduction to TSC2. Then we explain TSC2, including the participants, the two tasks in TSC2, the data used, the evaluation methods for each task, and a brief report on the results. Lastly, we describe plans for the next evaluation, TSC3.

01 Jan 2004
TL;DR: Describes the structure of the system and the various compaction techniques developed in order to produce 10-word summaries of news articles, and presents the scores obtained using two different machine translation systems.
Abstract: This paper describes the Arabic summarization system that we have developed and evaluated on the very short summary of noisy text task of DUC 2004. We describe the structure of the system and the various compaction techniques we developed in order to produce 10-word summaries of news articles. We also present the scores we obtained using two different machine translation systems.


01 Jan 2004
TL;DR: A sentence extraction system that produces two sorts of multi-document summaries: the first is a general-purpose summary of a cluster of related documents, while the second is an entity-based summary of documents related to a particular person.
Abstract: We describe a sentence extraction system that produces two sorts of multi-document summaries: the first is a general-purpose summary of a cluster of related documents, while the second is an entity-based summary of documents related to a particular person. The general-purpose summary is generated by a process that ranks sentences based on their document and cluster "worthiness". The personality-based summary is constructed by a process that ranks sentences according to a metric that uses coreference and lexical information in a person profile. In both cases, a process of redundancy removal is applied to exclude repeated information.
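
The abstract does not spell out the redundancy-removal mechanism; a common realization is a greedy similarity filter over the ranked sentences, sketched here with TF-IDF cosine similarity and an assumed threshold:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def drop_redundant(ranked_sentences, max_sim=0.6):
    """Keep high-ranked sentences that are not too similar to already kept ones."""
    kept = []
    for sent in ranked_sentences:               # assumed sorted by worthiness
        if kept:
            sims = cosine_similarity(
                TfidfVectorizer().fit_transform(kept + [sent]))[-1, :-1]
            if sims.max() >= max_sim:
                continue                        # too close to a kept sentence
        kept.append(sent)
    return kept
```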

Proceedings ArticleDOI
28 Aug 2004
TL;DR: FarsiSum is an attempt to create an automatic text summarization system for Persian that uses modules implemented in an existing summarizer geared towards the Germanic languages, a Persian stop-list in Unicode format and a small set of heuristic rules.
Abstract: FarsiSum is an attempt to create an automatic text summarization system for Persian. The system is implemented as an HTTP client/server application written in Perl. It uses modules implemented in an existing summarizer geared towards the Germanic languages, a Persian stop-list in Unicode format, and a small set of heuristic rules.

Book ChapterDOI
22 Nov 2004
TL;DR: The goal of the paper is to investigate the effectiveness of Genetic Algorithm (GA)-based attribute selection in improving the performance of classification algorithms solving the automatic text summarization task.
Abstract: The task of automatic text summarization consists of generating a summary of the original text that allows the user to obtain the main pieces of information available in that text, but with a much shorter reading time. This is an increasingly important task in the current era of information overload, given the huge amount of text available in documents. In this paper, automatic text summarization is cast as a classification (supervised learning) problem, so that machine learning-oriented classification methods are used to produce summaries for documents based on a set of attributes describing those documents. The goal of the paper is to investigate the effectiveness of Genetic Algorithm (GA)-based attribute selection in improving the performance of classification algorithms solving the automatic text summarization task. Computational results are reported for experiments with a document base formed by news extracted from The Wall Street Journal of the TIPSTER collection, a collection that is often used as a benchmark in the text summarization literature.
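
A minimal GA for attribute selection in this setting: individuals are attribute bitmasks, and fitness is the cross-validated accuracy of a classifier trained on the selected attributes. The classifier choice and GA parameters below are placeholders, not the paper's configuration:

```python
import random
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def ga_select(X, y, pop_size=20, generations=30, p_mut=0.05):
    """Evolve attribute bitmasks toward higher classification accuracy."""
    n = X.shape[1]

    def fitness(mask):
        if not mask.any():
            return 0.0
        return cross_val_score(GaussianNB(), X[:, mask], y, cv=3).mean()

    pop = [np.random.rand(n) < 0.5 for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
        pop = list(parents)                     # elitism: keep the best half
        while len(pop) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)        # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= np.random.rand(n) < p_mut  # bit-flip mutation
            pop.append(child)
    return max(pop, key=fitness)
```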

Proceedings ArticleDOI
02 May 2004
TL;DR: Empirically characterize human-written summaries provided in a widely used summarization corpus and suggest that extraction-based techniques which have been successful for single-document summarization may not be sufficient when summarizing multiple documents.
Abstract: Although single-document summarization is a well-studied task, the nature of multi-document summarization is only beginning to be studied in detail. While close attention has been paid to what technologies are necessary when moving from single to multi-document summarization, the properties of human-written multi-document summaries have not been quantified. In this paper, we empirically characterize human-written summaries provided in a widely used summarization corpus by attempting to answer the questions: Can multi-document summaries that are written by humans be characterized as extractive or generative? Are multi-document summaries less extractive than single-document summaries? Our results suggest that extraction-based techniques which have been successful for single-document summarization may not be sufficient when summarizing multiple documents.


Proceedings ArticleDOI
14 Sep 2004
TL;DR: Preliminary experimental results show that the proposed method outperforms the conventional basic summarization method under the evaluation scheme when dealing with diverse genres of Chinese documents with free writing style and flexible topic distribution.
Abstract: Automatic summarization is an important research issue in natural language processing. This paper presents a summarization method that generates a single-document summary with maximum topic completeness and minimum redundancy. It first builds semantic-class-based vector representations of various kinds of linguistic units in a document by means of HowNet (an existing ontology), which can improve the representation quality of the traditional term-based vector space model to a certain degree. Then, by adopting the K-means clustering algorithm together with a clustering analysis algorithm, we can adaptively capture the number of distinct latent topic regions in a document. Finally, topic-representative sentences are selected from each topic region to form the final summary. In order to evaluate the effectiveness of the proposed summarization method, a novel metric known as representation entropy is used for summarization redundancy evaluation. Preliminary experimental results show that the proposed method outperforms the conventional basic summarization method under this evaluation scheme when dealing with diverse genres of Chinese documents with free writing styles and flexible topic distributions.
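
The cluster-then-select step can be sketched as follows, with plain TF-IDF vectors standing in for the paper's HowNet-based semantic-class representations and a fixed k in place of its adaptive cluster-number analysis (both assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def topic_summary(sentences, k=3):
    """Cluster sentences into topic regions and pick one representative each."""
    X = TfidfVectorizer().fit_transform(sentences)
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    picks = []
    for c in range(k):
        idx = np.where(km.labels_ == c)[0]
        # Representative = sentence closest to its cluster center.
        dists = np.linalg.norm(X[idx].toarray() - km.cluster_centers_[c], axis=1)
        picks.append(idx[dists.argmin()])
    return [sentences[i] for i in sorted(picks)]   # restore document order
```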

Proceedings ArticleDOI
23 Aug 2004
TL;DR: A large-scale test collection for multiple document summarization, the Text Summarization Challenge 3 (TSC3) corpus, which annotates not only the important sentences in a document set, but also those among them that have the same content.
Abstract: In this paper, we introduce a large-scale test collection for multiple document summarization, the Text Summarization Challenge 3 (TSC3) corpus. We detail the corpus construction and evaluation measures. The significant feature of the corpus is that it annotates not only the important sentences in a document set, but also those among them that have the same content. Moreover, we define new evaluation metrics taking redundancy into account and discuss the effectiveness of redundancy minimization.

Book ChapterDOI
30 Nov 2004
TL;DR: The Lagrangian multiplier approach was employed to optimize the allocation of time-lengths for all the segmented shots and to obtain the best perceived motion activity in the summarized video.
Abstract: In this paper, an efficient and effective summarization algorithm for MPEG news video, based on the extraction and analysis of spatial and motion features, is proposed. We focus on video feature analysis techniques that work in the compressed domain (i.e., on MVs and DCT coefficients), without the need to transform back to the pixel domain. To give viewers a quick but sufficient browse of the news content, we adopt a strategy in which the anchor audio is overlaid on the summarized news video. Hence, the detection of anchor shots and the summarization of news segments subject to a time-budget constraint constitute the two main tasks in this paper. In summarizing news segments, the Lagrangian multiplier approach is employed to optimally allocate time-lengths to all the segmented shots and to obtain the best perceived motion activity in the summarized video. Experiments show that our summarized news videos achieve an average MOS score above 4.0 in a subjective test.
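
The constrained time allocation can be illustrated with a generic Lagrangian water-filling scheme; the concave utility w*log(1+t) below is a stand-in for the paper's perceived-motion-activity objective, so treat this purely as a sketch of the optimization pattern:

```python
def allocate_times(weights, budget, lo=1e-6, hi=1e6, iters=60):
    """Maximize sum_i w_i*log(1+t_i) subject to sum_i t_i = budget.

    Setting the derivative w_i/(1+t_i) equal to the multiplier lam gives
    t_i = max(0, w_i/lam - 1); bisect on lam until the budget is met.
    """
    def total(lam):
        return sum(max(0.0, w / lam - 1.0) for w in weights)

    for _ in range(iters):
        mid = (lo + hi) / 2
        if total(mid) > budget:
            lo = mid        # allocation too generous: raise the multiplier
        else:
            hi = mid
    lam = (lo + hi) / 2
    return [max(0.0, w / lam - 1.0) for w in weights]
```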

Journal ArticleDOI
TL;DR: A relatively simple NLP method for extracting temporal information from Korean news articles, with the goal of improving performance of TDT tasks and showing that time information extracted from the text does indeed help to significantly improve both precision and recall.
Abstract: Temporal information plays an important role in natural language processing (NLP) applications such as information extraction, discourse analysis, automatic summarization, and question-answering. In the topic detection and tracking (TDT) area, the temporal information often used is the publication date of a message, which is readily available but limited in its usefulness. We developed a relatively simple NLP method for extracting temporal information from Korean news articles, with the goal of improving performance of TDT tasks. To extract temporal information, we make use of finite state automata and a lexicon containing time-revealing vocabulary. Extracted information is converted into a canonicalized representation of a time point or a time duration. We first evaluated and investigated the extraction and canonicalization methods for their accuracy and the extent to which temporal information extracted as such can help TDT tasks. The experimental results show that time information extracted from the text does indeed help to significantly improve both precision and recall.
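
A toy illustration of lexicon-driven canonicalization against the publication date, using English stand-ins for the Korean time-revealing vocabulary and a far simpler matcher than the paper's finite state automata:

```python
import re
from datetime import date, timedelta

# Tiny lexicon of time-revealing words mapped to day offsets.
RELATIVE = {"today": 0, "yesterday": -1, "tomorrow": 1}

def canonicalize(text, pub_date):
    """Resolve relative time expressions to canonical dates via the publication date."""
    points = []
    for word, offset in RELATIVE.items():
        if re.search(rf"\b{word}\b", text, re.IGNORECASE):
            points.append(pub_date + timedelta(days=offset))
    return points

# canonicalize("The minister resigned yesterday.", date(2004, 7, 1))
# -> [datetime.date(2004, 6, 30)]
```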

01 Jul 2004
TL;DR: This evaluation system can not only grade the quality of a video summary, but also compare different automatic summarization algorithms and make stepwise improvements on algorithms, without the need for new user feedback.
Abstract: This paper describes a system for automated performance evaluation of video summarization algorithms. We call it SUPERSIEV (System for Unsupervised Performance Evaluation of Ranked Summarization in Extended Videos). It is primarily designed for evaluating video summarization algorithms that perform frame ranking. The task of summarization is viewed as a kind of database retrieval, and we adopt some of the concepts developed for performance evaluation of retrieval in database systems. First, ground truth summaries are gathered in a user study from many assessors and for several video sequences. For each video sequence, these summaries are combined to generate a single reference file that represents the majority of the assessors' opinions. Then the system determines the best target reference frame for each frame of the whole video sequence and computes matching scores to form a lookup table that rates each frame. Given a summary from a candidate summarization algorithm, the system can then evaluate this summary from different aspects by computing recall, cumulated average precision, redundancy rate, and average closeness. With this evaluation system, we can not only grade the quality of a video summary, but also (1) compare different automatic summarization algorithms and (2) make stepwise improvements on algorithms, without the need for new user feedback.

01 Jan 2004
TL;DR: A summarization system is proposed that automatically classifies the type of a document set and summarizes the set with an appropriate summarization mechanism.
Abstract: In this paper, we propose a summarization system that automatically classifies the type of a document set and summarizes the set with an appropriate summarization mechanism. The system classifies a document set into three types: (a) single-topic, (b) multi-topic, and (c) others. These types are identified using information about high-frequency nouns and named entities. In our multi-document summarization system, unnecessary parts are deleted after each document is summarized, and then the multi-document summary is generated. In type (a), the unnecessary parts are the parts that are similar across the single-document summaries. In type (b), the unnecessary parts are the dissimilar parts of the documents. In type (c), unnecessary parts are identified using the scores employed for single-document summarization.

01 Jul 2004
TL;DR: It is argued that the message that the graphic designer intended to convey must play a major role in determining the content of the summary, and the approach to identifying this intended message and using it to construct the summary is outlined.
Abstract: Information graphics (non-pictorial graphics such as bar charts or line graphs) are an important component of multimedia documents. Often such graphics convey information that is not contained elsewhere in the document. Thus document summarization must be extended to include summarization of information graphics. This paper addresses our work on graphic summarization. It argues that the message that the graphic designer intended to convey must play a major role in determining the content of the summary, and it outlines our approach to identifying this intended message and using it to construct the summary.

Book ChapterDOI
22 Nov 2004
TL;DR: A new method for temporal web page summarization based on trend and variance analysis is presented, which can be also used for summarization of dynamic collections of topically related web pages.
Abstract: In recent years the Web has become an important medium for communication and information storage. As this trend is predicted to continue, it is necessary to provide efficient solutions for retrieving and processing information found on the WWW. In this paper we present a new method for temporal web page summarization based on trend and variance analysis. In temporal summarization, web documents are treated as dynamic objects with changing contents and characteristics. The sequential versions of a single web page are retrieved during a predefined time interval for which the summary is to be constructed. The resulting summary should represent the most popular, evolving concepts found in the web document versions. The proposed method can also be used for summarization of dynamic collections of topically related web pages.

01 Jul 2004
TL;DR: The focus is on diagrams (line drawings) because they allow parsing techniques to be used, in contrast to the difficulties of general image understanding, and the advances in raster image vectorization and parsing needed to produce corpora for diagram summarization.
Abstract: Some document genres contain a large number of figures. This position paper outlines approaches to diagram summarization that can augment the many well-developed techniques of text summarization. We discuss figures as surrogates for entire documents, thumbnails, extraction, the relations between text and figures as well as how automation might be achieved. The focus is on diagrams (line drawings) because they allow parsing techniques to be used, in contrast to the difficulties of general image understanding. We describe the advances in raster image vectorization and parsing needed to produce corpora for diagram summarization.

01 Jan 2004
TL;DR: CL Research's participation in the Document Understanding Conference for 2004 was primarily intended to conduct further experiments in the use of XML-tagged documents containing increasingly richer characterizations of texts; the Knowledge Management System was extended to include a refined capability for identifying multiword units for use in keyword generation.
Abstract: CL Research's participation in the Document Understanding Conference for 2004 was primarily intended to conduct further experiments in the use of XML-tagged documents containing increasingly richer characterizations of texts. We extended the Knowledge Management System to include (1) a refined capability for identifying multiword units (phrases) for use in keyword generation, (2) the incorporation of word-sense disambiguation to tag senses and identify semantic types, and (3) the integration of question-answering functionality into the summarization framework. We did not devote much effort to refining our system to create summaries for the five tasks, but achieved reasonable levels of performance. We viewed the length restrictions imposed on the tasks as not providing sufficient flexibility to investigate different modes of summarization, and the tasks of summarizing machine translations of poor quality as not very interesting. We used Tasks 1 and 3 to develop and refine a keyword generation capability, achieving levels of fourth of 18 and fourth of 10 among priority 1 systems. In the more general summarization tasks, our performance was near the bottom of the participating systems, but still reached acceptable levels. We performed much better on quality measures with our extraction-based summaries, with an overall level of third of 14 systems for Task 5. For several quality measures, our performance was somewhat lower; these levels identify specifically those areas of summarization analysis where the use of an XML representation is particularly amenable to improvement. While we will continue to improve our summarization capability within the general guidelines, we believe that summarization is only one part of document understanding and may not represent the needs of users for document exploration at a much deeper level.