Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
01 Jul 2019
TL;DR: This paper proposed a similarity measure inspired by capsule networks to measure redundancy between a pair of sentences based on surface form and semantic information and showed that the improved similarity measure performs competitively, outperforming strong summarization baselines on benchmark datasets.
Abstract: The most important obstacles facing multi-document summarization include excessive redundancy in source descriptions and the looming shortage of training data. These obstacles prevent encoder-decoder models from being used directly, but optimization-based methods such as determinantal point processes (DPPs) are known to handle them well. In this paper we seek to strengthen a DPP-based method for extractive multi-document summarization by presenting a novel similarity measure inspired by capsule networks. The approach measures redundancy between a pair of sentences based on surface form and semantic information. We show that our DPP system with improved similarity measure performs competitively, outperforming strong summarization baselines on benchmark datasets. Our findings are particularly meaningful for summarizing documents created by multiple authors containing redundant yet lexically diverse expressions.

46 citations
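
The DPP-based sentence selection described in the abstract above can be illustrated with a small sketch. It is not the authors' system: it assumes the pairwise similarity matrix is already given (the paper's capsule-network-inspired measure is not reproduced here), builds the usual L-ensemble kernel from per-sentence quality scores, and uses greedy determinant maximization, a common approximation to DPP MAP inference. The names `determinant` and `greedy_dpp_select` are hypothetical.

```python
from typing import List, Sequence


def determinant(matrix: Sequence[Sequence[float]]) -> float:
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0  # determinant of the empty matrix is 1 by convention
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def greedy_dpp_select(
    quality: Sequence[float],
    similarity: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Greedily pick up to k sentence indices that maximize det(L_Y).

    Assumption: L[i][j] = quality[i] * similarity[i][j] * quality[j],
    the standard quality/similarity decomposition of a DPP kernel.
    """
    n = len(quality)
    kernel = [
        [quality[i] * similarity[i][j] * quality[j] for j in range(n)]
        for i in range(n)
    ]
    selected: List[int] = []
    while len(selected) < k:
        best, best_det = None, 0.0
        for cand in range(n):
            if cand in selected:
                continue
            idx = selected + [cand]
            sub = [[kernel[r][c] for c in idx] for r in idx]
            d = determinant(sub)
            if d > best_det:
                best, best_det = cand, d
        if best is None:  # every remaining candidate is fully redundant
            break
        selected.append(best)
    return selected


# Sentences 0 and 1 are near-duplicates; sentence 2 is distinct.
quality = [1.0, 0.9, 0.8]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
summary = greedy_dpp_select(quality, similarity, k=2)
```

In this toy input the second-best sentence by quality is skipped in favor of the less similar one, which is the redundancy-avoiding behavior the abstract attributes to DPPs.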

Proceedings ArticleDOI
01 Jan 2015
TL;DR: This work proposes a unified sentence scoring model that measures representativeness and diversity at the same time in multi-document summarization, and demonstrates that the MDS method outperforms the DUC04 best method and the existing clustering-based methods.
Abstract: Multi-document Summarization (MDS) is of great value to many real-world applications. Many scoring models have been proposed to select appropriate sentences from documents to form the summary, among which clustering-based methods are popular. In this work, we propose a unified sentence scoring model which measures representativeness and diversity at the same time. Experimental results on DUC04 demonstrate that our MDS method outperforms the DUC04 best method and the existing clustering-based methods, and it yields results close to those of state-of-the-art generic MDS methods. The advantages of the proposed MDS method are two-fold: (1) The density peaks clustering algorithm is adopted for the first time, and it is effective and fast. (2) No external resources such as WordNet and Wikipedia or complex language parsing algorithms are used, making reproduction and deployment very easy in real environments.

46 citations
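
As a rough illustration of scoring representativeness and diversity together in the spirit of density peaks clustering, the sketch below may help. It is an assumption-laden simplification, not the paper's scoring model: representativeness is taken as the fraction of other sentences whose similarity exceeds a threshold, diversity as one minus the highest similarity to any denser sentence, and the two are simply multiplied. The function name `density_peak_scores` and the `threshold` parameter are hypothetical.

```python
from typing import List, Sequence


def density_peak_scores(
    similarity: Sequence[Sequence[float]],
    threshold: float = 0.3,
) -> List[float]:
    """Score each sentence by representativeness * diversity.

    Assumptions (not taken from the paper): similarities lie in [0, 1],
    density is a thresholded neighbor count, and ties in density are
    broken in favor of the lower index.
    """
    n = len(similarity)
    # Representativeness: share of other sentences that are "close".
    rep = [
        sum(1 for j in range(n) if j != i and similarity[i][j] > threshold)
        / max(n - 1, 1)
        for i in range(n)
    ]
    scores = []
    for i in range(n):
        # Similarities to sentences that rank as denser than sentence i.
        denser = [
            similarity[i][j]
            for j in range(n)
            if j != i and (rep[j], -j) > (rep[i], -i)
        ]
        if denser:
            div = 1.0 - max(denser)
        else:
            # The densest sentence gets the largest possible distance,
            # mirroring the convention used in density peaks clustering.
            div = 1.0 - min(
                (similarity[i][j] for j in range(n) if j != i), default=0.0
            )
        scores.append(rep[i] * div)
    return scores


# Sentences 0-2 form one cluster; sentence 3 is an outlier.
sim = [
    [1.0, 0.8, 0.6, 0.1],
    [0.8, 1.0, 0.5, 0.1],
    [0.6, 0.5, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
]
scores = density_peak_scores(sim)
best = scores.index(max(scores))
```

Multiplying the two terms means a sentence scores highly only if it is both central to a cluster and dissimilar to sentences that are more central, so near-duplicates of an already strong sentence are pushed down.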

Proceedings ArticleDOI
28 Aug 2004
TL;DR: FarsiSum is an attempt to create an automatic text summarization system for Persian that uses modules implemented in an existing summarizer geared towards the Germanic languages, a Persian stop-list in Unicode format and a small set of heuristic rules.
Abstract: FarsiSum is an attempt to create an automatic text summarization system for Persian. The system is implemented as an HTTP client/server application written in Perl. It uses modules implemented in an existing summarizer geared towards the Germanic languages, a Persian stop-list in Unicode format and a small set of heuristic rules.

46 citations

Journal ArticleDOI
TL;DR: This article presents eight different methods of generating multidocument summaries and evaluates each of these methods on a large set of topics used in past DUC workshops, showing a significant improvement in the quality of summaries based on topic themes over MDS methods that use other alternative topic representations.
Abstract: The problem of using topic representations for multidocument summarization (MDS) has received considerable attention recently. Several topic representations have been employed for producing informative and coherent summaries. In this article, we describe five previously known topic representations and introduce two novel representations of topics based on topic themes. We present eight different methods of generating multidocument summaries and evaluate each of these methods on a large set of topics used in past DUC workshops. Our evaluation results show a significant improvement in the quality of summaries based on topic themes over MDS methods that use other alternative topic representations.

46 citations

DOI
01 Jan 2001
TL;DR: The use of multidocument summarization as a post-processing step in document retrieval is proposed and the use of the summary as a replacement to the standard ranked list is examined.
Abstract: In this paper, we propose the use of multidocument summarization as a post-processing step in document retrieval. We examine the use of the summary as a replacement to the standard ranked list. The form of the summary is novel because it has both informative and indicative elements, designed to help different users perform their tasks better. Our summary uses the documents' topical structure as a backbone for its own structure, as it was deemed the most useful document feature in our study of a corpus of summaries.

46 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52