Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Proceedings Article
01 Mar 2016
TL;DR: An empirical evaluation of the proposed summarization model on a standard dataset from the Document Understanding Conference showed that the approach outperformed the baseline comparators in terms of ROUGE scores.
Abstract: This paper proposes a graph-based text summarization model for generic single- and multi-document summarization. The approach involves four processing stages: parsing sentences semantically using Semantic Role Labeling (SRL); grouping semantic arguments while matching semantic roles to Wikipedia concepts; constructing a weighted semantic graph for each document; and linking its sentences (nodes) through the semantic relatedness of the Wikipedia concepts. An iterative ranking algorithm is then applied to the document graphs to extract the most important sentences, which form the summary. An empirical evaluation of the proposed summarization model on a standard dataset from the Document Understanding Conference (DUC) showed that the approach outperformed the baseline comparators in terms of ROUGE scores.

10 citations
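
The iterative ranking step described in the abstract above can be illustrated with a minimal sketch. It assumes a weighted PageRank-style update, which is one common choice for ranking sentence graphs; the abstract does not say which ranking algorithm the authors use. The function names, the damping factor, and the edge weights are hypothetical, and the weights stand in for the Wikipedia-based semantic relatedness scores.

```python
def rank_sentences(weights, damping=0.85, tol=1e-6, max_iter=100):
    """Score nodes of a weighted sentence graph with a PageRank-style update.

    weights[i][j] is the (assumed) relatedness between sentences i and j.
    """
    n = len(weights)
    # Total edge weight leaving each node, used to normalise contributions.
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:  # stop once the scores have stabilised
            break
    return scores


def top_sentences(weights, k):
    """Return the indices of the k best-scoring sentences, in document order."""
    scores = rank_sentences(weights)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical relatedness matrix for four sentences (illustrative values only).
weights = [
    [0.0, 0.8, 0.6, 0.1],
    [0.8, 0.0, 0.5, 0.0],
    [0.6, 0.5, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
]
summary_indices = top_sentences(weights, 2)
print(summary_indices)
```

Returning the selected indices in document order keeps the extracted sentences in their original sequence, which usually reads better than ordering them by score.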

Patent
20 Nov 2015
TL;DR: This patent describes multimedia document summarization techniques that extract relevant text segments and relevant image segments from a document, subject to constraints on the amount of text and the number and size of images in the summary.
Abstract: Multimedia document summarization techniques are described. That is, given a document that includes text and a set of images, various implementations generate a summary by extracting relevant text segments in the document and relevant segments of images with constraints on the amount of text and number/size of images in the summary.

10 citations

Proceedings Article
01 Aug 2006
TL;DR: A term vector is adopted to represent the linguistic units in Chinese documents, which yields, to a certain extent, higher representation quality than the traditional word-based vector space model.
Abstract: This paper proposes a strategy for Chinese multi-document summarization based on clustering and sentence extraction. It adopts a term vector to represent the linguistic units in Chinese documents, which yields, to a certain extent, higher representation quality than the traditional word-based vector space model. For clustering, we propose two heuristics to automatically detect the proper number of clusters: the first makes full use of the summary length fixed by the user; the second is a stability method, which has been applied to other unsupervised learning problems. We also discuss a global search method for selecting sentences from the clusters. To evaluate our summarization strategy, an extrinsic evaluation method based on a classification task is adopted. Experimental results on a news document set show that the new strategy can significantly enhance the performance of Chinese multi-document summarization.

10 citations
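
The clustering-and-extraction idea in the abstract above can be sketched as follows. The abstract does not give the paper's actual heuristic or its global search method, so this sketch rests on labeled assumptions: the number of clusters is estimated as the summary length divided by the average sentence length, and each cluster contributes the sentence closest to its centroid (a simple stand-in, not the authors' method). All function names are hypothetical, and sentences are assumed to be already segmented into whitespace-separated tokens, which Chinese text would need a word segmenter to produce.

```python
import math
from collections import Counter


def term_vector(sentence):
    """Bag-of-terms vector; assumes tokens are whitespace-separated."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def estimate_cluster_count(sentences, summary_length):
    """Assumed heuristic: one cluster per sentence that fits in the summary."""
    avg_len = sum(len(s.split()) for s in sentences) / len(sentences)
    return max(1, min(len(sentences), round(summary_length / avg_len)))


def pick_representatives(clusters):
    """Pick, from each cluster, the sentence most similar to the centroid."""
    summary = []
    for cluster in clusters:
        centroid = Counter()
        for sentence in cluster:
            centroid.update(term_vector(sentence))
        summary.append(
            max(cluster, key=lambda s: cosine(term_vector(s), centroid))
        )
    return summary


# Illustrative data only: one pre-formed cluster of three short sentences.
example_cluster = ["the cat sat", "the cat sat on the mat", "dogs bark"]
print(estimate_cluster_count(example_cluster, summary_length=8))
print(pick_representatives([example_cluster]))
```

The clustering step itself is left out on purpose: any method that groups the term vectors into the estimated number of clusters could feed `pick_representatives`.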

Proceedings Article
24 Aug 2002
TL;DR: This paper focuses on subject shift and presents a method for extracting key paragraphs from documents that discuss the same event. The method uses the results of event tracking, which starts from a few sample documents and finds all subsequent documents that discuss the same event.
Abstract: For multi-document summarization where documents are collected over an extended period of time, the subject in a document changes over time. This paper focuses on subject shift and presents a method for extracting key paragraphs from documents that discuss the same event. Our extraction method uses the results of event tracking which starts from a few sample documents and finds all subsequent documents that discuss the same event. The method was tested on the TDT1 corpus, and the result shows the effectiveness of the method.

10 citations

Proceedings Article
01 May 2014
TL;DR: This work proposes an extensible framework META to enable analysts to easily and selectively extract and summarize events from different views with different resolutions, and defines a summarization language that includes a set of atomic operators to manipulate the meta-data.
Abstract: Event summarization is an effective process that mines and organizes event patterns to represent the original events. It allows analysts to quickly gain a general idea of the events. In recent years, several event summarization algorithms have been proposed, but they all focus on how to find the optimal summarization results, and are designed for one-time analysis. As event summarization is a comprehensive analysis task, merely handling this problem with a single optimal algorithm is not enough. In the absence of an integrated summarization solution, we propose an extensible framework, META, to enable analysts to easily and selectively extract and summarize events from different views with different resolutions. In this framework, we store the original events in a carefully designed data structure that enables efficient storage and multiresolution analysis. On top of the data model, we define a summarization language that includes a set of atomic operators to manipulate the meta-data. Furthermore, we present 5 commonly used summarization tasks, and show that all these tasks can be easily expressed in the language. Experimental evaluation on both real and synthetic datasets demonstrates the efficiency and effectiveness of our framework.

10 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52