Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.

Papers published on a yearly basis

[Chart: number of papers per year; years shown range from 1982 to 2023]

Papers

Summarization of Multiple Cooking Recipes

Nanba Hidetsugu, Doi Yoko, Tsujita Miho, Takezawa Toshiyuki, Sumiya Kazutoshi (+1 more)

27 Nov 2013

4 citations

Journal Article

Enhancing multi-document summarization using concepts

Pattabhi R. K. Rao¹, S Lalitha Devi¹

Anna University¹

10 Mar 2018, Sadhana: Academy Proceedings in Engineering Sciences

TL;DR: The conceptual graph (CG) formalism as proposed by Sowa is modified and extended to represent the concepts and their relationships in the documents to generate an objective summary of all relevant documents.

Abstract: In this paper we propose a methodology to mine concepts from documents and use these concepts to generate an objective summary of all relevant documents. We use the conceptual graph (CG) formalism as proposed by Sowa to represent the concepts and their relationships in the documents. In the present work we have modified and extended the definition of the concept given by Sowa. The modified and extended definition is discussed in detail in section 2 of this paper. A CG of a set of relevant documents can be considered as a semantic network. The semantic network is generated by automatically extracting CG for each document and merging them into one. We discuss (i) generation of semantic network using CGs and (ii) generation of multi-document summary. Here we use restricted Boltzmann machines, a deep learning technique, for automatically extracting CGs. We have tested our methodology using MultiLing 2015 corpus. We have obtained encouraging results, which are comparable to those from the state of the art systems.
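
The merging step described in this abstract can be pictured with a small sketch. It assumes each document's conceptual graph has already been extracted and is represented as (concept, relation, concept) triples; that representation, the relation names, and the function `merge_concept_graphs` are illustrative assumptions, not the authors' implementation, and the extraction step with restricted Boltzmann machines is not shown.

```python
from collections import Counter

def merge_concept_graphs(document_graphs):
    """Merge per-document triple lists into one network.

    Returns a Counter mapping each (concept, relation, concept) triple
    to the number of documents that contain it.
    """
    network = Counter()
    for triples in document_graphs:
        # set() so a triple repeated inside one document is counted once
        network.update(set(triples))
    return network

# Hypothetical triples for two documents about the same event.
doc_a = [("storm", "agent_of", "flooding"), ("flooding", "location", "coast")]
doc_b = [("storm", "agent_of", "flooding"), ("storm", "time", "monday")]

network = merge_concept_graphs([doc_a, doc_b])
print(network.most_common(1))
```

Triples supported by several documents are natural candidates for a summary; how the paper actually selects content from the merged network is described in the paper itself, not here.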

4 citations

Proceedings Article

A Hybrid Solution To Abstractive Multi-Document Summarization Using Supervised and Unsupervised Learning

Gaurav Bhagchandani¹, Deep Bodra¹, Abhishek Gangan¹, Nikahat Mulla¹

Sardar Patel Institute of Technology¹

15 May 2019

TL;DR: This work hybridizes three components, viz. clustering, word graphs, and neural networks.

Abstract: In this work, we aim to develop an abstractive summarization system in the multi-document setup. The main challenge in this kind of system is the identification of redundant information. Our approach hybridizes three components, viz. clustering, word graphs, and neural networks. In clustering, all the information from multiple documents is divided amongst clusters based on context and importance analysis, such that each cluster holds sentences of a similar context (redundancy identification). Further, shortest path detection in word graphs reduces the text. Along with that, we use sequence-to-sequence sentence compression and perform paraphrasing using a supervised recurrent neural network to generate an almost completely abstractive summary. Results on the DUC 2004 dataset indicate that the proposed system outperforms other systems in terms of metrics like ROUGE[1] and BLEU[2].
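
The word-graph step in this abstract can be illustrated with a minimal sketch of one common formulation: the sentences of a cluster are merged into a directed graph of word transitions, and the cheapest path from a start marker to an end marker gives a compressed sentence. The markers `<s>` and `</s>`, the edge weight of 1/count, and the merging of words by lowercase surface form are all assumptions made for illustration; the paper may construct and weight its graph differently.

```python
import heapq
from collections import defaultdict

START, END = "<s>", "</s>"

def build_word_graph(sentences):
    """Build a graph whose edge weights are 1 / transition frequency."""
    edge_counts = defaultdict(int)
    for sentence in sentences:
        words = [START] + sentence.lower().split() + [END]
        for a, b in zip(words, words[1:]):
            edge_counts[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), count in edge_counts.items():
        graph[a][b] = 1.0 / count  # frequent transitions are cheap
    return graph

def shortest_path(graph, source=START, target=END):
    """Plain Dijkstra; returns the words on the cheapest path."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path[1:-1]  # drop the start and end markers
        if node in visited:
            continue
        visited.add(node)
        for nxt, weight in graph[node].items():
            if nxt not in visited:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return []

cluster = [
    "heavy storm hits northern coast",
    "storm hits coast on monday",
    "a storm hits coast",
]
compressed = shortest_path(build_word_graph(cluster))
print(compressed)  # → ['storm', 'hits', 'coast']
```

Merging words purely by surface form lets a repeated word such as "the" create shortcuts through the graph, so practical systems usually add constraints such as a minimum path length; this sketch omits them.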

4 citations

Proceedings Article

GameWikiSum: a Novel Large Multi-Document Summarization Dataset

Diego Antognini¹, Boi Faltings¹

École Polytechnique Fédérale de Lausanne¹

01 May 2020

TL;DR: This paper proposes GameWikiSum, a new domain-specific dataset for multi-document summarization, which is one hundred times larger than commonly used datasets, and in another domain than news.

Abstract: Today’s research progress in the field of multi-document summarization is obstructed by the small number of available datasets. Since the acquisition of reference summaries is costly, existing datasets contain only hundreds of samples at most, resulting in heavy reliance on hand-crafted features or necessitating additional, manually annotated data. The lack of large corpora therefore hinders the development of sophisticated models. Additionally, most publicly available multi-document summarization corpora are in the news domain, and no analogous dataset exists in the video game domain. In this paper, we propose GameWikiSum, a new domain-specific dataset for multi-document summarization, which is one hundred times larger than commonly used datasets, and in another domain than news. Input documents consist of long professional video game reviews as well as references of their gameplay sections in Wikipedia pages. We analyze the proposed dataset and show that both abstractive and extractive models can be trained on it. We release GameWikiSum for further research: https://github.com/Diego999/GameWikiSum.

4 citations

Fast and accurate query-based multi-document summarization

Frank Schilder, Ravi Kondadadi

01 Jan 2008

TL;DR: A fast query-based multi-document summarizer based solely on word-frequency features of clusters, documents and topics called FastSum, which can rely on a minimal set of features leading to fast processing times: 1250 news documents in 60 seconds.

Abstract: We present a fast query-based multi-document summarizer called FastSum based solely on word-frequency features of clusters, documents and topics. Summary sentences are ranked by a regression SVM. The summarizer does not use any expensive NLP techniques such as parsing, tagging of names or even part of speech information. Still, the achieved accuracy is comparable to the best systems presented in recent academic competitions (i.e., Document Understanding Conference (DUC)). Because of a detailed feature analysis using Least Angle Regression (LARS), FastSum can rely on a minimal set of features leading to fast processing times: 1250 news documents in 60 seconds.
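
To make the idea of ranking sentences from word-frequency features concrete, here is a minimal sketch. It computes two hypothetical features per sentence (average relative frequency of its words in the whole cluster, and the share of its words that appear in the topic description) and combines them with fixed weights. The feature definitions, the function names, and the fixed weights are assumptions for illustration; FastSum itself learns the ranking with a regression SVM and uses its own feature set, neither of which is reproduced here.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and keep alphabetic runs; a deliberately crude tokenizer."""
    return re.findall(r"[a-z]+", text.lower())

def sentence_features(sentence, cluster_counts, cluster_total, topic_words):
    """Return (cluster_frequency, topic_overlap) for one sentence."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0, 0.0
    # Average relative frequency of the sentence's words in the cluster.
    cluster_freq = sum(cluster_counts[t] for t in tokens) / (cluster_total * len(tokens))
    # Fraction of the sentence's words that also occur in the topic text.
    topic_overlap = sum(t in topic_words for t in tokens) / len(tokens)
    return cluster_freq, topic_overlap

def rank_sentences(documents, topic, weights=(1.0, 1.0)):
    """Rank all sentences in the cluster by a fixed-weight linear score.

    The fixed weights stand in for a trained regression model.
    """
    sentences = [
        s.strip()
        for d in documents
        for s in re.split(r"(?<=[.!?])\s+", d)
        if s.strip()
    ]
    cluster_counts = Counter(t for d in documents for t in tokenize(d))
    cluster_total = sum(cluster_counts.values())
    topic_words = set(tokenize(topic))
    scored = []
    for s in sentences:
        freq, overlap = sentence_features(s, cluster_counts, cluster_total, topic_words)
        scored.append((weights[0] * freq + weights[1] * overlap, s))
    return [s for _, s in sorted(scored, key=lambda pair: pair[0], reverse=True)]

docs = [
    "The storm hit the coast. Officials closed schools.",
    "The storm caused flooding on the coast. Weather was mild elsewhere.",
]
ranked = rank_sentences(docs, topic="storm damage coast")
print(ranked[0])  # → The storm hit the coast.
```

Because the features need only token counts, no parsing or tagging is involved, which is the property the abstract credits for the fast processing times.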

4 citations

Metrics

Papers: 2,507
Citations: 81,726

No. of papers in the topic in previous years
Year	Papers
2023	74
2022	160
2021	52
2020	61
2019	47
2018	52
