Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over the lifetime, 2270 publications have been published within this topic receiving 71850 citations.


Papers
Posted Content
TL;DR: The authors analyzed the paragraph-level attention weights of GraphSum's multi-heads and decoding layers in order to improve the explainability of a transformer-based multi-document summarization (MDS) model.
Abstract: Modern multi-document summarization (MDS) methods are based on transformer architectures. They generate state-of-the-art summaries but lack explainability. We focus on graph-based transformer models for MDS, as they have gained recent popularity. We aim to improve the explainability of graph-based MDS by analyzing their attention weights. In a graph-based MDS model such as GraphSum, vertices represent the textual units, while the edges form a similarity graph over the units. We compare GraphSum's performance utilizing different textual units, i.e., sentences versus paragraphs, on two news benchmark datasets, namely WikiSum and MultiNews. Our experiments show that paragraph-level representations provide the best summarization performance. Thus, we subsequently focus on analyzing the paragraph-level attention weights of GraphSum's multi-heads and decoding layers in order to improve the explainability of a transformer-based MDS model. As a reference metric, we calculate the ROUGE scores between the input paragraphs and each sentence in the generated summary, which indicate source origin information via text similarity. We observe a high correlation between the attention weights and this reference metric, especially in the later decoding layers of the transformer architecture. Finally, we investigate whether the generated summaries follow a pattern of positional bias by extracting which paragraph provided the most information for each generated summary. Our results show a high correlation between the position in the summary and the source origin.

1 citation
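The reference metric described above can be reproduced in a few lines. Below is a minimal sketch, assuming a simple ROUGE-1 recall as the text-similarity measure and a random placeholder for the attention matrix (in the paper, these weights would be pooled from GraphSum's decoding layers); the function names and toy inputs are illustrative only.

```python
# Sketch of the reference metric: ROUGE-1 recall between each input paragraph
# and each generated summary sentence, correlated with (placeholder) attention.
from collections import Counter

import numpy as np
from scipy.stats import spearmanr

def rouge1_recall(reference: str, candidate: str) -> float:
    """Unigram recall: fraction of reference tokens covered by the candidate."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum(min(c, cand[t]) for t, c in ref.items())
    return overlap / max(sum(ref.values()), 1)

def source_origin_matrix(paragraphs, summary_sentences):
    """ROUGE-1 recall for every (paragraph, summary sentence) pair."""
    return np.array([[rouge1_recall(p, s) for s in summary_sentences]
                     for p in paragraphs])

# Toy inputs standing in for WikiSum/MultiNews paragraphs and a model summary.
paragraphs = ["the storm hit the coast on friday", "officials ordered evacuations"]
summary = ["a storm hit the coast", "evacuations were ordered by officials"]

ref_metric = source_origin_matrix(paragraphs, summary)

# Assumed (n_paragraphs x n_summary_sentences) attention weights pooled from a
# decoding layer; random here purely as a stand-in for real model output.
attention = np.random.rand(len(paragraphs), len(summary))
rho, _ = spearmanr(ref_metric.ravel(), attention.ravel())
print(f"Spearman correlation (attention vs. ROUGE reference): {rho:.3f}")
```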

Journal ArticleDOI
TL;DR: The evaluation results show that the combination of PageRank along with rhetorical relations does help to improve the quality of extractive summarization.
Abstract: This paper presents link analysis based on rhetorical relations with the aim of performing extractive summarization for multiple documents. We first extracted sentences with salient terms from each individual document using a statistical model. We then ranked the extracted sentences by measuring their relative importance according to their connectivity among the sentences in the document set, using PageRank based on the rhetorical relations. The rhetorical relations were examined beforehand to determine which relations are crucial to this task, and the relations among sentences from the documents were automatically identified by SVMs. We used the relations to emphasize important sentences during sentence ranking by PageRank and to eliminate redundancy from the summary candidates. Our framework requires no sentences fully annotated by humans, and the evaluation results show that the combination of PageRank with rhetorical relations does help to improve the quality of extractive summarization.
Keywords: probability model, N-grams, link-based analysis, support vector machine, rhetorical relation

1 citation
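The ranking step described above reduces to PageRank over a relation-weighted sentence graph. The sketch below assumes precomputed rhetorical-relation weights (in the paper these would come from the SVM classifiers); the sentences and edge weights are hand-set placeholders.

```python
# Minimal sketch: PageRank over a sentence graph whose edges carry assumed
# rhetorical-relation weights, so strongly related sentences rank higher.
import networkx as nx

sentences = [
    "The company reported record profits.",
    "Profits rose because of strong overseas sales.",
    "The CEO announced a new product line.",
]

# Hypothetical relation edges: (i, j, weight), where weight reflects how
# crucial the identified rhetorical relation is for summarization.
relation_edges = [(0, 1, 0.9), (0, 2, 0.4), (1, 2, 0.3)]

graph = nx.Graph()
graph.add_nodes_from(range(len(sentences)))
graph.add_weighted_edges_from(relation_edges)

# Weighted PageRank lets strong relations reinforce connected sentences.
scores = nx.pagerank(graph, weight="weight")
for i in sorted(scores, key=scores.get, reverse=True):
    print(f"{scores[i]:.3f}  {sentences[i]}")
```

With stronger relation edges, connected sentences reinforce each other's scores, which is how the rhetorical relations emphasize important sentences during ranking.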

Book ChapterDOI
26 Apr 2017
TL;DR: This paper modeled the input documents as a sentence dissimilarity graph and a given query as a query-to-sentence similarity vector in order to formalize sentence selection/extraction as a multi-facility location problem (mFLP).
Abstract: In this paper we propose a query-focused multi-document summarization method based on the facility location problem. To formalize sentence selection/extraction as a multi-facility location problem (mFLP), we modeled the input documents as a sentence dissimilarity graph and a given query as a query-to-sentence similarity vector. In mFLP terminology, the former is known as the cost-to-serve matrix and the latter as the cost-to-establish vector. By formulating the mFLP as a mixed integer linear programming problem, we were able to optimally select sentences (facilities) that minimize the weighted sum of distances from each demand point to its nearest facility, plus the sum of opening costs of the facilities (query-to-sentence similarity). The performance of this new method has been tested on the DUC2005 and DUC2006 corpora. The effectiveness of the technique is measured using the ROUGE score. The results indicate that the presented methodology is a promising research direction.

1 citation
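The MILP formulation can be sketched concretely. The snippet below uses PuLP with binary opening and assignment variables; the cost matrices are toy placeholders standing in for the paper's sentence dissimilarity graph (cost-to-serve) and query dissimilarity (cost-to-establish), so this is a sketch of the formulation, not the authors' implementation.

```python
# Sketch of sentence selection as a facility location MILP: sentences act as
# both demand points and candidate facilities; selected facilities form the
# summary. All cost values below are illustrative placeholders.
import pulp

n = 4  # number of sentences
serve = [  # cost-to-serve matrix: pairwise sentence dissimilarity
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
open_cost = [0.5, 0.4, 0.6, 0.9]  # cost-to-establish vector (query dissimilarity)

prob = pulp.LpProblem("sentence_mFLP", pulp.LpMinimize)
y = [pulp.LpVariable(f"open_{j}", cat="Binary") for j in range(n)]
x = [[pulp.LpVariable(f"assign_{i}_{j}", cat="Binary") for j in range(n)]
     for i in range(n)]

# Objective: assignment (dissimilarity) costs plus opening (query) costs.
prob += (pulp.lpSum(serve[i][j] * x[i][j] for i in range(n) for j in range(n))
         + pulp.lpSum(open_cost[j] * y[j] for j in range(n)))

for i in range(n):
    prob += pulp.lpSum(x[i][j] for j in range(n)) == 1  # every sentence covered
    for j in range(n):
        prob += x[i][j] <= y[j]  # only selected sentences can serve others

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("selected sentences:", [j for j in range(n) if y[j].value() == 1])
```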

Journal ArticleDOI
TL;DR: A new Bayesian theory-based Hybrid Learning Model (BHLM) is proposed in this paper, which performs multi-document summarization and assigns class labels with the aid of mean, variance and probability measures.
Abstract: To understand and organize documents efficiently, multi-document summarization has become a prominent technique in the Internet world. Because the amount of available information is so large, documents must be summarized to obtain condensed information. To perform multi-document summarization, a new Bayesian theory-based Hybrid Learning Model (BHLM) is proposed in this paper. Initially, the input documents are preprocessed, removing the stop words. Then, sentence features are extracted to determine the sentence score for summarizing the document. The extracted features are fed into the hybrid learning model for learning. Subsequently, the learned features, training error and correlation coefficient are integrated with the Bayesian model to develop the BHLM. The proposed method also assigns the class label with the aid of mean, variance and probability measures. Finally, based on the class label, the sentences are sorted ou...

1 citation
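The Bayesian labeling step can be illustrated under heavy assumptions. The sketch below uses a simple two-feature sentence representation and hand-set per-class means and variances; the actual BHLM integrates learned features, training error and a correlation coefficient, none of which are modeled here.

```python
# Rough sketch: stop-word removal, simple sentence features, and a Gaussian
# Bayes class assignment from assumed per-class mean/variance parameters.
import math

STOP_WORDS = {"the", "a", "an", "of", "in", "is", "and", "to"}

def sentence_features(sentence: str, position: int, doc_len: int):
    tokens = [t for t in sentence.lower().split() if t not in STOP_WORDS]
    return [len(tokens),                        # content-word count
            1.0 - position / max(doc_len, 1)]   # earlier sentences score higher

def gaussian_log_prob(x, mean, var):
    var = max(var, 1e-6)  # guard against zero variance
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

# Assumed per-class (mean, variance) per feature, e.g. estimated from training.
class_params = {
    "summary":     [(6.0, 4.0), (0.8, 0.05)],
    "non_summary": [(3.0, 4.0), (0.4, 0.05)],
}
class_prior = {"summary": 0.3, "non_summary": 0.7}

def assign_label(features):
    best, best_score = None, -math.inf
    for label, params in class_params.items():
        score = math.log(class_prior[label]) + sum(
            gaussian_log_prob(x, m, v) for x, (m, v) in zip(features, params))
        if score > best_score:
            best, best_score = label, score
    return best

doc = ["Multi-document summarization condenses large document sets.",
       "The weather was pleasant that day."]
for i, s in enumerate(doc):
    print(assign_label(sentence_features(s, i, len(doc))), "|", s)
```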

Journal ArticleDOI
TL;DR: This paper presents an effective way to summarize Hindi text documents using the TextRank algorithm, focusing on summarizing a single Hindi text document at a time based on natural language processing (NLP).
Abstract: The amount of information accessible in digital form has grown rapidly, and retrieving useful documents from such a large pool of information is difficult, so summarizing these text documents is crucial. Text summarization is the process of reducing an original source document to the essential information it contains. It eliminates redundant, less important content and provides the vital information in a shorter version, usually about half the length of the original text. Creating a manual summary is a very time-consuming task; automatic summarization helps in getting the gist of the information in a particular document in a very short time. Compared with other Indian regional languages, very little work has been done on summarization of Hindi documents. This paper presents an effective way to summarize using the TextRank algorithm. It focuses on summarizing a single Hindi text document at a time based on natural language processing (NLP).

1 citation
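The TextRank procedure the paper relies on is straightforward to sketch. The version below uses plain word-overlap similarity in place of Hindi-specific preprocessing, so it only illustrates the graph-ranking core, not the paper's full pipeline; all inputs are toy examples.

```python
# Minimal TextRank-style extractive summarizer: build a sentence similarity
# graph, run PageRank, and return the top-ranked sentences in document order.
import networkx as nx

def overlap_similarity(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / (len(sa) + len(sb))

def textrank_summary(sentences, k=1):
    g = nx.Graph()
    g.add_nodes_from(range(len(sentences)))
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            w = overlap_similarity(sentences[i], sentences[j])
            if w > 0:
                g.add_edge(i, j, weight=w)
    scores = nx.pagerank(g, weight="weight")
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # preserve original order

doc = [
    "Text summarization condenses a document to its essential information.",
    "Manual summarization is time-consuming.",
    "Automatic summarization extracts the gist of a document quickly.",
]
print(textrank_summary(doc, k=1))
```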


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations (85% related)
Ontology (information science): 57K papers, 869.1K citations (84% related)
Web page: 50.3K papers, 975.1K citations (83% related)
Recurrent neural network: 29.2K papers, 890K citations (83% related)
Graph (abstract data type): 69.9K papers, 1.2M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52