Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Posted Content
TL;DR: Li et al. propose an agreement-oriented multi-document summarization (MDS) task, where the goal is to provide abstractive summaries that represent information common and faithful to all input articles.
Abstract: We aim to renew interest in a particular multi-document summarization (MDS) task which we call AgreeSum: agreement-oriented multi-document summarization. Given a cluster of articles, the goal is to provide abstractive summaries that represent information common and faithful to all input articles. Given the lack of existing datasets, we create a dataset for AgreeSum, and provide annotations on article-summary entailment relations for a subset of the clusters in the dataset. We aim to create strong baselines for the task by applying the top-performing pretrained single-document summarization model PEGASUS onto AgreeSum, leveraging both annotated clusters by supervised losses, and unannotated clusters by T5-based entailment-related and language-related losses. Compared to other baselines, both automatic evaluation and human evaluation show better article-summary and cluster-summary entailment in generated summaries. On a separate note, we hope that our article-summary entailment annotations contribute to the community's effort in improving abstractive summarization faithfulness.

1 citations

Journal ArticleDOI
TL;DR: This paper highlights the implicit information conveyed by argumentative connectives such as "but", "even", and "yet" and their effect on the summary, and presents a system focused on acquiring implicit knowledge.
Abstract: So far, in trying to reach human capabilities, research in automatic summarization has been based on hypotheses that are both enabling and limiting. Some of these limitations are: how to take into account and reflect (in the generated summary) the implicit information conveyed in the text, the author's intention, the reader's intention, the influence of context, the general world knowledge, etc. Thus, if we want machines to mimic human abilities, they will need access to this same large variety of knowledge. The implicit affects the orientation and the argumentation of the text and consequently its summary. Most text summarizers (TS) work by compressing the initial data, and they necessarily suffer from information loss. They focus only on features of the text, not on what the author intended or why the reader is reading the text. In this paper, we address this problem and present a system focusing on acquiring knowledge that is implicit. We principally spotlight the implicit information conveyed by argumentative connectives such as "but", "even", "yet", etc., and their effect on the summary.

1 citations

Journal Article
TL;DR: An automatic text summarization approach based on textual unit association network that can achieve better summarization performance than the existing methods and can be used for keyword extraction, text classification and clustering and other information retrieval tasks.
Abstract: An automatic text summarization approach is proposed based on textual unit association network. The word-based and sentence-based association networks are constructed respectively. For the word, a new approach is used to compute the word weights and then the weight of the sentence is evaluated based on the weights of words contained in the sentence. For the sentence, a new approach is presented to weight the salience of a sentence based on its co-occurrence information. Finally, salient sentences are extracted into the output summary till the desired summary length is satisfied. Experimental results show that the proposed approach can achieve better summarization performance than the existing methods. Moreover, the proposed scheme of term weighting can be used for keyword extraction, text classification and clustering and other information retrieval tasks.

1 citations
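
The word-to-sentence weighting pipeline described in the abstract above can be illustrated with a minimal sketch. The abstract does not specify how the word weights are computed, so this example substitutes a simple stand-in: each word is weighted by its degree in a sentence-level co-occurrence graph, and a sentence is scored by the mean weight of its words. The function names (`split_sentences`, `tokenize`, `word_weights`, `summarize`) and the `max_words` budget are hypothetical and are not taken from the paper.

```python
import re
from collections import defaultdict
from itertools import combinations


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return lowercase alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def word_weights(sentences):
    """Weight each word by its degree in a co-occurrence graph.

    Assumption: two words are linked if they share a sentence. Degree
    centrality is a stand-in for the paper's unspecified weighting.
    """
    neighbours = defaultdict(set)
    for sent in sentences:
        for a, b in combinations(sorted(set(tokenize(sent))), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {word: len(ns) for word, ns in neighbours.items()}


def summarize(text, max_words):
    """Greedily pick the highest-scoring sentences that fit the word budget."""
    sentences = split_sentences(text)
    weights = word_weights(sentences)

    def score(sent):
        tokens = tokenize(sent)
        if not tokens:
            return 0.0
        return sum(weights.get(t, 0) for t in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n = len(sentences[i].split())
        if used + n <= max_words:  # skip sentences that would exceed the budget
            chosen.append(i)
            used += n
    # Restore the original document order for readability.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(text, max_words=50)` returns the selected sentences in their original order. A sentence that does not fit the remaining budget is skipped rather than truncated, which is one of several reasonable ways to read "till the desired summary length is satisfied".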

Proceedings ArticleDOI
20 Jul 2011
TL;DR: The main objective of this paper is to provide cluster summarization of huge text documents; the dynamic nature of the proposed system facilitates a scalable cluster wherein peers may join or leave the group at will.
Abstract: The main objective of this paper is to provide cluster summarization of huge text documents. The mining process includes the sharing of large-scale amounts of data from various sources, which concludes with the mined data. In distributed data mining, adopting a flat node distribution model can affect scalability, modularity, and flexibility; these problems are overcome by using dynamic peer-to-peer document clustering and cluster summarization. The Dynamic P2P document clustering and cluster summarization (DP2PCS) architecture is based upon bonus words and stigma words. For document clustering applications, the system summarizes the distributed document clusters using a distributed key-phrase extraction algorithm, thus providing interpretation of the clusters. Document summarization is used for fast information retrieval. Compared to existing systems, the dynamic nature of the proposed system facilitates a scalable cluster wherein the peers may join or leave the group at will. The summarization process reduces the original documents' content by 63 percent on average, based on word count.

1 citations
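
The word-count reduction figure quoted in the abstract above can be made concrete with a small sketch. It assumes that "reduction" means the share of whitespace-separated words removed relative to the original text; the abstract does not give the exact formula, and the function name `word_count_reduction` is hypothetical.

```python
def word_count_reduction(original, summary):
    """Return the percentage of words removed when going from original to summary.

    Assumption: words are whitespace-separated tokens.
    """
    n_original = len(original.split())
    if n_original == 0:
        raise ValueError("original text has no words")
    n_summary = len(summary.split())
    return 100.0 * (n_original - n_summary) / n_original
```

Under this reading, a 100-word document summarized in 37 words has a 63 percent reduction. An average over a collection would be the mean of this value across its documents.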


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52