Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Book Chapter (DOI)
18 Dec 2011
TL;DR: This paper incorporates a deeper semantic analysis of the source documents to select important concepts by using a predefined list of important aspects that act as a guide for selecting the most relevant sentences into the summaries.
Abstract: Recently, there has been increased interest in topic-focused multi-document summarization where the task is to produce automatic summaries in response to a given topic or specific information requested by the user. In this paper, we incorporate a deeper semantic analysis of the source documents to select important concepts by using a predefined list of important aspects that act as a guide for selecting the most relevant sentences into the summaries. We exploit these aspects and build a novel methodology for topic-focused multi-document summarization that operates on a Markov chain tuned to extract the most important sentences by following a random walk paradigm. Our evaluations suggest that augmenting the random walk model with important aspects can raise summary quality over the plain random walk model by up to 19.22%.

5 citations
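
The random-walk sentence selection mentioned in the abstract above can be illustrated with a generic topic-biased walk over a sentence-similarity graph. This is a minimal sketch, not the authors' method: the abstract does not specify the aspect list or the Markov chain, so relevance to a query string stands in for the "important aspects", and the function names, the tiny stopword list, the bag-of-words cosine similarity, and the damping value are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "in", "of", "and", "after", "to"}

def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_sentences(sentences, query, damping=0.85, iterations=100):
    """Score sentences with a topic-biased random walk (personalized-PageRank style)."""
    n = len(sentences)
    if n == 0:
        return []
    vecs = [Counter(tokenize(s)) for s in sentences]
    q = Counter(tokenize(query))

    # Restart distribution: relevance of each sentence to the query (uniform fallback).
    rel = [cosine(v, q) for v in vecs]
    total = sum(rel)
    prior = [r / total for r in rel] if total else [1.0 / n] * n

    # Row-normalized similarity matrix without self-loops; isolated rows jump to the prior.
    trans = []
    for i in range(n):
        row = [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        s = sum(row)
        trans.append([x / s for x in row] if s else list(prior))

    # Power iteration toward the stationary distribution of the walk.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) * prior[j] + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

A summary would then be assembled from the highest-scoring sentences, for example by sorting indices on `scores` and taking the top few subject to a length budget.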

Proceedings Article
17 Apr 2015
TL;DR: This paper presents an automatic way to mine domain-specific patterns from text documents so that scenario-based document summarization can both filter irrelevant documents and create summaries for relevant documents within the specified domain.
Abstract: Single-document summarization aims to reduce the size of a text document while preserving the most important information. Much work has been done on open-domain summarization. This paper presents an automatic way to mine domain-specific patterns from text documents. With a small amount of effort required for manual selection, these patterns can be used for domain-specific scenario-based document summarization and information extraction. Our evaluation shows that scenario-based document summarization can both filter irrelevant documents and create summaries for relevant documents within the specified domain.

5 citations

Book Chapter (DOI)
18 Sep 2005
TL;DR: Results indicate that the system-generated output achieves good precision and recall while extracting important concepts from each document, as well as good clusters of similar concepts from the set of documents.
Abstract: The design, implementation and evaluation of a multi-document summarization system for sociology dissertation abstracts are described. The system focuses on extracting variables and their relationships from different documents, integrating the extracted information, and presenting the integrated information using a variable-based framework. Two important summarization steps, information extraction and information integration, were evaluated by comparing system-generated output against human-generated output. Results indicate that the system-generated output achieves good precision and recall while extracting important concepts from each document, as well as good clusters of similar concepts from the set of documents.

5 citations

Proceedings Article (DOI)
01 Dec 2015
TL;DR: Several effective formulations of proximity-based cues for use in the sentence modeling process involved in the LM-based summarization framework are explored and several well-practiced state-of-the-art methods are analyzed and compared extensively.
Abstract: Extractive speech summarization refers to automatic selection of an indicative set of sentences from a spoken document so as to offer a concise digest covering the most salient aspects of the original document. The language modeling (LM) framework alongside the pseudo-relevance feedback (PRF) technique has emerged as a promising line of research for conducting extractive speech summarization in an unsupervised manner, showing some preliminary success. This paper extends such a general line of research and its main contributions are two-fold. First, we explore several effective formulations of proximity-based cues for use in the sentence modeling process involved in the LM-based summarization framework. Second, the utilities of the methods instantiated from the LM-based summarization framework and several well-practiced state-of-the-art methods are analyzed and compared extensively. The empirical results suggest the effectiveness of our methods.

5 citations
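
The unsupervised LM-based sentence ranking that the abstract above builds on can be sketched as follows. This is one common formulation and only an assumption about the general framework: each sentence is treated as a unigram language model, smoothed with a background model, and scored by the log-likelihood it assigns to the whole document. The paper's pseudo-relevance feedback and proximity-based cues are not reproduced here, and the function names, the smoothing weight `lam`, and the use of the document itself as the background model are illustrative choices.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def lm_rank(sentences, lam=0.5):
    """Return log P(document | sentence model) for every sentence.

    P(w | S) = lam * P_ml(w | S) + (1 - lam) * P(w | background),
    with the document itself serving as the background model (an assumption).
    lam must be strictly below 1 so that every probability stays positive.
    """
    sent_counts = [Counter(tokenize(s)) for s in sentences]
    doc_counts = Counter()
    for counts in sent_counts:
        doc_counts.update(counts)
    doc_len = sum(doc_counts.values())

    scores = []
    for counts in sent_counts:
        sent_len = sum(counts.values())
        log_likelihood = 0.0
        for word, n in doc_counts.items():
            p_sent = counts[word] / sent_len if sent_len else 0.0
            p_bg = n / doc_len
            log_likelihood += n * math.log(lam * p_sent + (1 - lam) * p_bg)
        scores.append(log_likelihood)
    return scores
```

Sentences with higher (less negative) scores explain the document's word distribution better and would be selected first for the summary.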


Network Information
Related Topics (5)
Topic                            Papers   Citations   Related
Natural language                 31.1K    806.8K      85%
Ontology (information science)   57K      869.1K      84%
Web page                         50.3K    975.1K      83%
Recurrent neural network         29.2K    890K        83%
Graph (abstract data type)       69.9K    1.2M        83%
Performance Metrics
No. of papers in the topic in previous years
Year   Papers
2023   74
2022   160
2021   52
2020   61
2019   47
2018   52