Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.


Papers
Posted Content
TL;DR: This paper proposes neural models to train computers not just to pay attention to specific regions and content of input documents with attention models, but also distract them to traverse between different content of a document so as to better grasp the overall meaning for summarization.
Abstract: Distributed representation learned with neural networks has recently been shown to be effective in modeling natural languages at fine granularities such as words, phrases, and even sentences. Whether and how such an approach can be extended to help model larger spans of text, e.g., documents, is intriguing, and further investigation would still be desirable. This paper aims to enhance neural network models for such a purpose. A typical problem of document-level modeling is automatic summarization, which aims to model documents in order to generate summaries. In this paper, we propose neural models to train computers not just to pay attention to specific regions and content of input documents with attention models, but also to distract them to traverse between different content of a document so as to better grasp the overall meaning for summarization. Without engineering any features, we train the models on two large datasets. The models achieve state-of-the-art performance, and they significantly benefit from the distraction modeling, particularly when input documents are long.

61 citations

Journal ArticleDOI
TL;DR: A novel Cuckoo search based multi-document summarizer (MDSCSA) is proposed to address the problem of multi-document summarization, and the reported results indicate that the proposed approach outperforms the other summarizers included in the study.

61 citations

Journal ArticleDOI
TL;DR: A document summarization model which extracts salient sentences from given documents while reducing redundant information in the summaries and maximizing the summary relevancy is presented.
Abstract: Multi-document summarization is used to extract the main ideas of a set of documents and put them into a short summary. In multi-document summarization, it is important to reduce redundant information in the summaries and to extract sentences that are common to the given documents. This paper presents a document summarization model which extracts salient sentences from given documents while reducing redundant information in the summaries and maximizing the summary relevancy. The model is represented as a modified p-median problem. The proposed approach expresses not only sentence-to-sentence relationships, but also summary-to-document and summary-to-subtopics relationships. To solve the optimization problem, a new differential evolution algorithm based on self-adaptive mutation and crossover parameters, called DESAMC, is proposed. Experimental studies on DUC benchmark data show the good performance of the proposed model and its potential in summarization tasks.
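The trade-off this abstract describes, preferring salient sentences while penalizing redundancy, can be illustrated with a minimal sketch. This is not the paper's modified p-median model or its DESAMC solver: it swaps in a plain greedy selection over bag-of-words cosine similarity, and every name in it (`bag_of_words`, `cosine`, `select_sentences`, `redundancy_weight`) is hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple stand-in for real features."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(sentences, k=2, redundancy_weight=0.5):
    """Greedily pick up to k sentences, scoring each candidate as
    relevance to the whole collection minus weighted similarity to
    the sentences already chosen (an assumed scoring rule)."""
    vectors = [bag_of_words(s) for s in sentences]

    # The centroid of all sentences stands in for "the documents".
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)

    chosen = []
    while len(chosen) < min(k, len(sentences)):
        best_index, best_score = None, -math.inf
        for i, vec in enumerate(vectors):
            if i in chosen:
                continue
            relevance = cosine(vec, centroid)
            redundancy = max((cosine(vec, vectors[j]) for j in chosen), default=0.0)
            score = relevance - redundancy_weight * redundancy
            if score > best_score:
                best_index, best_score = i, score
        chosen.append(best_index)

    # Return the picks in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

With a higher `redundancy_weight`, a sentence that repeats an already selected one is less likely to be picked again, which is the redundancy-reduction behavior the abstract refers to. A global optimizer such as the one the paper proposes would search over whole summaries rather than committing to one sentence at a time.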

61 citations

Journal ArticleDOI
TL;DR: A novel type of approach is outlined to be developed in the future, taking into account the generic components of a news story in order to generate a better summary.
Abstract: Problem statement: Text summaries can differ in nature, ranging from an indicative summary, which identifies the topics of the document, to an informative summary, which is meant to give a concise description of the original document and an idea of what its whole content is about. Approach: Single-document summarization seems to capture both kinds of information well, but this has not been the case for multi-document summarization, where the overall quality of informative summaries is often lacking. It is found that most of the existing methods tend to focus on sentence scoring, and less consideration is given to the contextual information content in multiple documents. Results: In this study, a survey of multi-document summarization approaches is presented. We focus notably on four well-known approaches to multi-document summarization, namely the feature-based method, cluster-based method, graph-based method and knowledge-based method. The general ideas behind these methods are described. Conclusion: Besides the general ideas and concepts, we discuss the benefits and limitations of these methods. With the aim of enhancing multi-document summarization, specifically of news documents, a novel type of approach is outlined to be developed in the future, taking into account the generic components of a news story in order to generate a better summary.

61 citations

Proceedings ArticleDOI
28 Jun 2009
TL;DR: This paper proposes an approach for multi-document video summarization by exploring the redundancy between different videos and shows that multi-document video summarization presents more elegant and informative summaries compared with the single-document approach.
Abstract: Most previous work on video summarization targets a single video document. With the popularity of video corpora (e.g. news video archives) and web videos, users frequently encounter video articles that consist of a set of relevant videos. With traditional single-document summarization, these videos are treated independently and the results are usually redundant due to the lack of inter-video analysis. To efficiently manage video articles, in this paper, we propose an approach for multi-document video summarization by exploring the redundancy between different videos. The importance of keyframes is first measured by the content inclusion based on intra- and inter-video similarities. We then propose a Minimum Description Length (MDL) for automatically determining the appropriate length of the summary. Finally, a video summary is generated for users to browse the content of the whole video article. We show that multi-document video summarization presents more elegant and informative summaries compared with the single-document approach.

61 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52