Topic
Multi-document summarization
About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.
Papers
01 Mar 2016 TL;DR: An empirical evaluation of the proposed summarization model on a standard dataset from the Document Understanding Conference showed that the approach outperformed the baseline comparators in terms of ROUGE scores.
Abstract: This paper proposes an innovative graph-based text summarization model for generic single- and multi-document summarization. The approach involves four processing stages: parsing sentences semantically using Semantic Role Labeling (SRL), grouping semantic arguments while matching semantic roles to Wikipedia concepts, constructing a weighted semantic graph for each document, and linking its sentences (nodes) through the semantic relatedness of the Wikipedia concepts. An iterative ranking algorithm is then applied to the document graphs to extract the most important sentences, which together form the summary. An empirical evaluation of the proposed model on a standard dataset from the Document Understanding Conference (DUC) showed that the approach outperformed the baseline comparators in terms of ROUGE scores.
10 citations
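The iterative ranking step mentioned in the abstract above can be illustrated with a generic weighted PageRank-style power iteration. This is a minimal sketch, not the authors' algorithm: the abstract does not name the ranking method, and the function names `rank_sentences` and `top_sentences`, the damping factor of 0.85, the fixed iteration count, and the toy weight matrix are all assumptions made for illustration. In the paper the edge weights come from the semantic relatedness of Wikipedia concepts; here they are simply given as numbers.

```python
def rank_sentences(weights, damping=0.85, iterations=50):
    """Weighted PageRank-style ranking over a sentence graph.

    weights[i][j] is a hypothetical relatedness score between
    sentences i and j (0 means no edge).
    """
    n = len(weights)
    # Total edge weight leaving each node, used to normalise contributions.
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score
            # proportional to the weight of the edge j -> i.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def top_sentences(weights, k):
    """Return the indices of the k best-ranked sentences in document order."""
    scores = rank_sentences(weights)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy graph (assumed values): sentence 0 is related to both others.
toy_weights = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
print(top_sentences(toy_weights, 1))
```

Returning the selected indices in document order keeps the extracted sentences in their original sequence, which is a common choice for extractive summaries but is also an assumption here.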
20 Nov 2015 TL;DR: The authors describe multimedia document summarization techniques that extract relevant text segments and relevant image segments from a document, subject to constraints on the amount of text and the number and size of images in the summary.
Abstract: Multimedia document summarization techniques are described. That is, given a document that includes text and a set of images, various implementations generate a summary by extracting relevant text segments in the document and relevant segments of images with constraints on the amount of text and number/size of images in the summary.
10 citations
01 Aug 2006 TL;DR: The term vector is adopted to represent the linguistic unit in Chinese documents, which achieves higher representation quality than the traditional word-based vector space model to a certain extent.
Abstract: This paper proposes a strategy for Chinese multi-document summarization based on clustering and sentence extraction. It adopts the term vector to represent the linguistic unit in Chinese documents, which achieves higher representation quality than the traditional word-based vector space model to a certain extent. For clustering, we propose two heuristics to automatically detect the proper number of clusters: the first makes full use of the summary length fixed by the user; the second is a stability method, which has been applied to other unsupervised learning problems. We also discuss a global search method for selecting sentences from the clusters. To evaluate our summarization strategy, we adopt an extrinsic evaluation method based on a classification task. Experimental results on a news document set show that the new strategy can significantly enhance the performance of Chinese multi-document summarization.
10 citations
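The clustering-and-extraction idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not give the formula that derives the cluster count from the summary length, so `estimate_cluster_count` uses a hypothetical rule (length budget divided by average sentence length). Likewise, `representative` picks the sentence nearest the cluster centroid, which is a simple local choice swapped in for the paper's global search method. Sentences are assumed to be already segmented into whitespace-separated terms.

```python
import math
from collections import Counter


def estimate_cluster_count(summary_length, sentences):
    """Hypothetical heuristic: one cluster per sentence that fits the budget."""
    avg_len = sum(len(s) for s in sentences) / len(sentences)
    return max(1, min(len(sentences), round(summary_length / avg_len)))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def representative(cluster):
    """Return the sentence whose term vector is closest to the cluster centroid."""
    vectors = [Counter(s.split()) for s in cluster]
    # The centroid is the summed term-frequency vector; cosine similarity
    # is scale-invariant, so dividing by the cluster size is unnecessary.
    centroid = Counter()
    for v in vectors:
        centroid.update(v)
    best = max(range(len(cluster)), key=lambda i: cosine(vectors[i], centroid))
    return cluster[best]


# Toy cluster with pre-segmented terms (assumed data).
print(representative(["a b c", "a b", "x y"]))
```

A summary would then be assembled by clustering the sentences into the estimated number of groups and taking one representative from each; the clustering step itself is left out of this sketch.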
24 Aug 2002 TL;DR: This paper focuses on subject shift and presents a method for extracting key paragraphs from documents that discuss the same event, using the results of event tracking, which starts from a few sample documents and finds all subsequent documents that discuss the same event.
Abstract: For multi-document summarization where documents are collected over an extended period of time, the subject in a document changes over time. This paper focuses on subject shift and presents a method for extracting key paragraphs from documents that discuss the same event. Our extraction method uses the results of event tracking which starts from a few sample documents and finds all subsequent documents that discuss the same event. The method was tested on the TDT1 corpus, and the result shows the effectiveness of the method.
10 citations
01 May 2014 TL;DR: This work proposes an extensible framework, META, to enable analysts to easily and selectively extract and summarize events from different views with different resolutions, and defines a summarization language that includes a set of atomic operators to manipulate the meta-data.
Abstract: Event summarization is an effective process that mines and organizes event patterns to represent the original events. It allows analysts to quickly gain a general idea of the events. In recent years, several event summarization algorithms have been proposed, but they all focus on finding the optimal summarization results and are designed for one-time analysis. As event summarization is a comprehensive analysis task, handling it with a single optimal algorithm is not enough. In the absence of an integrated summarization solution, we propose an extensible framework, META, to enable analysts to easily and selectively extract and summarize events from different views with different resolutions. In this framework, we store the original events in a carefully designed data structure that enables efficient storage and multiresolution analysis. On top of the data model, we define a summarization language that includes a set of atomic operators to manipulate the meta-data. Furthermore, we present five commonly used summarization tasks and show that all of them can be easily expressed in the language. Experimental evaluation on both real and synthetic datasets demonstrates the efficiency and effectiveness of our framework.
10 citations