Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Journal Article
TL;DR: This work proposes a model of video summarization based on three parameters: Priority, Continuity, and non-Repetition (CPR). It shows examples of how the CPR parameters can be computed and provides algorithms to find optimal summaries under the CPR approach.
Abstract: Most past work on video summarization has been based on selecting key frames from videos. We propose a model of video summarization based on three important parameters: Priority (of frames), Continuity (of the summary), and non-Repetition (of the summary). In short, a summary must include high priority frames and must be continuous and non-repetitive. An optimal summary is one that maximizes an objective function based on these three parameters. We show examples of how CPR parameters can be computed and provide algorithms to find optimal summaries based on the CPR approach. Finally, we briefly report on the performance of these algorithms.

18 citations
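
The abstract above describes an optimal summary as one that maximizes an objective built from priority, continuity, and repetition, but it does not give the objective itself. The following is a minimal sketch under stated assumptions, not the authors' formulation: the weighted-sum form, the `weights` tuple, the pairwise `continuity` and `repetition` callables, and the exhaustive search in `best_summary` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def cpr_score(frames, priority, continuity, repetition, weights=(1.0, 1.0, 1.0)):
    """Score a candidate summary (a tuple of frame indices in temporal order).

    Assumed objective (hypothetical, not taken from the paper):
        alpha * total priority
      + beta  * continuity summed over consecutive selected frames
      - gamma * repetition summed over all pairs of selected frames
    """
    alpha, beta, gamma = weights
    p = sum(priority[f] for f in frames)
    c = sum(continuity(a, b) for a, b in zip(frames, frames[1:]))
    r = sum(repetition(a, b) for a, b in combinations(frames, 2))
    return alpha * p + beta * c - gamma * r


def best_summary(n_frames, k, priority, continuity, repetition,
                 weights=(1.0, 1.0, 1.0)):
    """Exhaustively search all k-frame summaries and return the best one.

    This brute-force search is only practical for very small inputs.
    """
    candidates = combinations(range(n_frames), k)
    return max(
        candidates,
        key=lambda s: cpr_score(s, priority, continuity, repetition, weights),
    )


if __name__ == "__main__":
    priority = [5, 1, 1, 4]
    no_effect = lambda a, b: 0.0
    adjacent = lambda a, b: 1.0 if b - a == 1 else 0.0
    # With priority alone, the two highest-priority frames win.
    print(best_summary(4, 2, priority, no_effect, no_effect))
    # A heavy continuity weight favours neighbouring frames instead.
    print(best_summary(4, 2, priority, adjacent, no_effect, weights=(1.0, 10.0, 1.0)))
```

The exhaustive search grows combinatorially with the number of frames, so it only serves to make the trade-off between the three terms concrete; the paper reports its own algorithms for finding optimal summaries.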

Posted Content
TL;DR: In this paper, the authors use Positive Pointwise Mutual Information (PPMI) to assign a weight to each entry in the Term-Sentence-Matrix (TSM). The Sentence-Rank-Matrix generated from this weighted TSM is then used to extract a summary from the document.
Abstract: The degree of success in document summarization processes depends on the performance of the method used in identifying significant sentences in the documents. The collection of unique words characterizes the major signature of the document, and forms the basis for the Term-Sentence-Matrix (TSM). The Positive Pointwise Mutual Information, which works well for measuring semantic similarity in the Term-Sentence-Matrix, is used in our method to assign weights for each entry in the Term-Sentence-Matrix. The Sentence-Rank-Matrix generated from this weighted TSM is then used to extract a summary from the document. Our experiments show that such a method would outperform most of the existing methods in producing summaries from large documents.

18 citations
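
The PPMI weighting described in the abstract above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it uses the standard PPMI definition, max(0, log2(p(t, s) / (p(t) p(s)))), over a raw term-by-sentence count matrix. The abstract does not say how the Sentence-Rank-Matrix is built, so `rank_sentences` simply assumes that a sentence's score is the sum of its PPMI weights; both function names are hypothetical.

```python
import math


def ppmi_matrix(counts):
    """Convert a term-by-sentence count matrix into PPMI weights.

    counts[i][j] is how often term i occurs in sentence j.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]        # term totals
    col_sums = [sum(col) for col in zip(*counts)]  # sentence totals
    weights = []
    for i, row in enumerate(counts):
        weighted_row = []
        for j, c in enumerate(row):
            if c == 0:
                # PMI is undefined for zero counts; PPMI treats it as 0.
                weighted_row.append(0.0)
                continue
            pmi = math.log2(c * total / (row_sums[i] * col_sums[j]))
            weighted_row.append(max(0.0, pmi))
        weights.append(weighted_row)
    return weights


def rank_sentences(weights):
    """Rank sentence indices by the sum of their PPMI weights (assumed scoring)."""
    scores = [sum(col) for col in zip(*weights)]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


if __name__ == "__main__":
    # Rows are terms, columns are sentences.
    tsm = [
        [2, 0, 1],
        [0, 3, 0],
        [1, 1, 1],
    ]
    weighted = ppmi_matrix(tsm)
    print(rank_sentences(weighted))
```

An extractive summary would then take the top-ranked sentence indices and emit those sentences in their original order.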

Journal Article
TL;DR: A novel compressive sensing based multi-document summarization with group sparse learning (SGS) framework is proposed, which can maximally reconstruct the original documents via minimizing the approximation error and jointly select summary sentences with the learnt group structure information among sentences.

18 citations

Journal Article
TL;DR: This survey provides a comprehensive overview of the research on long document summarization and a systematic evaluation across the three principal components of its research setting: benchmark datasets, summarization models, and evaluation metrics.
Abstract: Long documents such as academic articles and business reports have been the standard format for detailing important issues and complicated subjects that require extra attention. An automatic summarization system that can effectively condense long documents into short and concise texts that encapsulate the most important information would thus be significant in aiding the reader’s comprehension. Recently, with the advent of neural architectures, significant research efforts have been made to advance automatic text summarization systems, and numerous studies on the challenges of extending these systems to the long document domain have emerged. In this survey, we provide a comprehensive overview of the research on long document summarization and a systematic evaluation across the three principal components of its research setting: benchmark datasets, summarization models, and evaluation metrics. For each component, we organize the literature within the context of long document summarization and conduct an empirical analysis to broaden the perspective on current research progress. The empirical analysis includes a study on the intrinsic characteristics of benchmark datasets, a multi-dimensional analysis of summarization models, and a review of the summarization evaluation metrics. Based on the overall findings, we conclude by proposing possible directions for future exploration in this rapidly growing field.

18 citations

Proceedings Article
19 Jul 2009
TL;DR: This paper presents a transductive approach to learning ranking functions for extractive multi-document summarization. The approach identifies topic themes within a document collection, uses them to separate sentences into relevant and irrelevant sets, and iteratively trains a ranking function over these two sets.
Abstract: This paper presents a transductive approach to learn ranking functions for extractive multi-document summarization. In the first stage, the proposed approach identifies topic themes within a document collection, which help to identify two sets of sentences: those relevant and those irrelevant to a question. It then iteratively trains a ranking function over these two sets of sentences by optimizing a ranking loss and fitting a prior model built on keywords. The output of the function is used to find further relevant and irrelevant sentences. This process is repeated until a desired stopping criterion is met.

18 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52