Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
13 Dec 2014
TL;DR: This paper proposes a new method to evaluate a sentence subset based on its capacity to reproduce term projections on the right singular vectors, and demonstrates the method's effectiveness on the DUC2002 and DUC2004 datasets.
Abstract: Multi-document summarization plays an increasingly important role given the exponential growth of documents on the web. Among traditional multi-document summarization techniques, latent semantic analysis (LSA) is unique due to its use of latent semantic information instead of the original features, which results in better performance. However, because LSA-based approaches evaluate and select sentences individually, none of them can remove redundant sentences. In this paper, we propose a new method to evaluate a sentence subset based on its capacity to reproduce term projections on the right singular vectors. Experiments on the DUC2002 and DUC2004 datasets validate the effectiveness of the proposed method.

9 citations
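To make the LSA machinery concrete, here is a minimal sketch of the classic LSA selection scheme this line of work builds on: build a term-sentence matrix, take its SVD, and for each leading right singular vector pick the sentence that loads most strongly on it. This illustrates the general LSA approach, not the paper's subset-evaluation criterion; the example sentences are invented.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def lsa_summarize(sentences, num_sentences=2):
    # Term-sentence matrix A: rows are terms, columns are sentences.
    A = TfidfVectorizer().fit_transform(sentences).T.toarray()
    # Rows of Vt (right singular vectors) give each sentence's projection
    # onto the latent topics.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    chosen = []
    for k in range(min(num_sentences, Vt.shape[0])):
        # For the k-th latent topic, take the highest-loading unchosen sentence.
        for i in np.argsort(-np.abs(Vt[k])):
            if i not in chosen:
                chosen.append(i)
                break
    return [sentences[i] for i in sorted(chosen)]

sentences = [
    "The storm hit the coast on Monday.",
    "Coastal towns were evacuated before the storm arrived.",
    "Officials praised the early evacuation effort.",
]
print(lsa_summarize(sentences))
```

Because this baseline scores sentences one latent topic at a time, two near-duplicate sentences can both be selected, which is exactly the redundancy problem the paper's subset-level evaluation targets.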

Posted Content
TL;DR: In this article, the authors present a new dataset for multi-document summarization that is large both in the total number of document clusters and in the size of individual clusters, and they build this dataset by leveraging the Wikipedia Current Events Portal (WCEP), which provides concise and neutral human-written summaries of news events with links to external source articles.
Abstract: Multi-document summarization (MDS) aims to compress the content in large document collections into short summaries and has important applications in story clustering for newsfeeds, presentation of search results, and timeline generation. However, there is a lack of datasets that realistically address such use cases at a scale large enough for training supervised models for this task. This work presents a new dataset for MDS that is large both in the total number of document clusters and in the size of individual clusters. We build this dataset by leveraging the Wikipedia Current Events Portal (WCEP), which provides concise and neutral human-written summaries of news events, with links to external source articles. We also automatically extend these source articles by looking for related articles in the Common Crawl archive. We provide a quantitative analysis of the dataset and empirical results for several state-of-the-art MDS techniques.

9 citations
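As a hedged illustration of the Common Crawl step, the sketch below queries the public Common Crawl CDX index for crawled captures of a source-article URL. The snapshot name is illustrative, and the paper's actual related-article search is more elaborate than a per-URL lookup; none of this is taken from the authors' pipeline.

```python
import json
import requests

# Illustrative snapshot; a real pipeline would search several crawls and then
# match candidate articles against the news event, which this sketch omits.
CDX_ENDPOINT = "https://index.commoncrawl.org/CC-MAIN-2020-16-index"

def find_captures(url, limit=5):
    # The CDX index returns one JSON record per crawled capture of the URL.
    resp = requests.get(
        CDX_ENDPOINT,
        params={"url": url, "output": "json", "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    return [json.loads(line) for line in resp.text.splitlines() if line]

for record in find_captures("example.com/news/some-article"):
    print(record.get("timestamp"), record.get("url"))
```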

Journal ArticleDOI
TL;DR: This paper provides researchers with a comprehensive survey of DL-based abstractive summarization, highlights some open challenges in the abstractive summarization task, and outlines future research trends.
Abstract: With the rapid development of the Internet, web textual data has grown exponentially, bringing considerable challenges to downstream tasks such as document management, text classification, and information retrieval. Automatic text summarization (ATS) is becoming an extremely important means of addressing this problem. The core of ATS is to mine the gist of the original text and automatically generate a concise and readable summary. Recently, to better balance these two aspects, conciseness and readability, deep learning (DL)-based abstractive summarization models have been developed. At present, almost all state-of-the-art (SOTA) models for ATS tasks are based on DL architectures. However, a comprehensive literature survey is still lacking in the field of DL-based abstractive text summarization. To fill this gap, this paper provides researchers with a comprehensive survey of DL-based abstractive summarization. We first give an overview of abstractive summarization and DL. Then, we summarize several typical frameworks of abstractive summarization. After that, we compare several popular datasets commonly used for training, validation, and testing. We further analyze the performance of several typical abstractive summarization systems on common datasets. Finally, we highlight some open challenges in the abstractive summarization task and outline future research trends. We hope that these explorations will provide researchers with new insights into DL-based abstractive summarization.

9 citations
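For readers who want a concrete starting point, a typical DL-based abstractive summarizer can be driven in a few lines with Hugging Face Transformers. The BART checkpoint below is one common choice for illustration, not a model endorsed or evaluated by the survey.

```python
from transformers import pipeline

# Load a pretrained seq2seq summarizer (downloads the model on first use).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "Automatic text summarization condenses a source document into a short, "
    "readable summary. Abstractive systems generate new sentences instead of "
    "copying them, and most state-of-the-art models are neural seq2seq networks."
)
result = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```

Unlike the extractive systems elsewhere on this page, the decoder here generates new sentences token by token, which is what makes the approach abstractive.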

01 Jan 2009
TL;DR: This paper introduces Parsumist, a text summarization system for Persian documents that exploits a combination of statistical, semantic, and heuristic-improved methods to generate generic or topic/query-driven extract summaries for single or multiple Persian documents.
Abstract: The rapid growth of online information services has created the problem of information explosion. Automatic text summarization techniques are essential for dealing with this problem. Text summarization is the process of compacting a source document to reduce its complexity and length while retaining its most important content. This paper introduces Parsumist, a text summarization system for Persian documents. It exploits a combination of statistical, semantic, and heuristic-improved methods, and can generate generic or topic/query-driven extract summaries for single or multiple Persian documents. We first review related work in this field, especially for Persian text summarization. We then present the architecture of Parsumist, its components, and its features. The last section evaluates the system and compares it with existing systems.

9 citations
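As a rough illustration of the statistical side of such a system, the sketch below scores sentences by average word frequency (a Luhn-style heuristic). Parsumist itself combines this kind of signal with semantic and heuristic features that are not reproduced here, and a real Persian system would need language-aware tokenization.

```python
import re
from collections import Counter

def frequency_extract(text, num_sentences=1):
    # Naive sentence split; Persian punctuation would need its own rules.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sent):
        # Average corpus frequency of the sentence's tokens.
        tokens = re.findall(r"\w+", sent.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    top = set(sorted(sentences, key=score, reverse=True)[:num_sentences])
    # Emit the selected sentences in their original document order.
    return [s for s in sentences if s in top]

text = ("The flood damaged the old bridge. The bridge had stood for a century. "
        "Repairs to the bridge will start next month.")
print(frequency_extract(text, num_sentences=1))
```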

Posted Content
TL;DR: A new method for summarizing similarities and differences in a pair of related documents using a graph representation of text, with a spreading activation technique to discover nodes semantically related to the topic.
Abstract: We describe a new method for summarizing similarities and differences in a pair of related documents using a graph representation for text. Concepts denoted by words, phrases, and proper names in the document are represented positionally as nodes in the graph along with edges corresponding to semantic relations between items. Given a perspective in terms of which the pair of documents is to be summarized, the algorithm first uses a spreading activation technique to discover, in each document, nodes semantically related to the topic. The activated graphs of each document are then matched to yield a graph corresponding to similarities and differences between the pair, which is rendered in natural language. An evaluation of these techniques has been carried out.

9 citations
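The spreading activation step can be pictured with a toy graph: activation starts at topic nodes and decays as it flows along edges, so concepts reachable through well-connected paths end up highly activated. The adjacency map below is invented for illustration; the paper's graphs encode words, phrases, and proper names positionally, with typed semantic relations on the edges.

```python
def spread_activation(graph, seeds, decay=0.5, iterations=3):
    # graph: node -> list of neighboring nodes; seeds: topic nodes.
    activation = {node: 0.0 for node in graph}
    for s in seeds:
        activation[s] = 1.0
    for _ in range(iterations):
        incoming = {node: 0.0 for node in graph}
        for node, neighbors in graph.items():
            if activation[node] > 0 and neighbors:
                # Each node passes a decayed share of its activation
                # equally to its neighbors.
                share = decay * activation[node] / len(neighbors)
                for nb in neighbors:
                    incoming[nb] += share
        for node in graph:
            activation[node] += incoming[node]
    return activation

graph = {
    "storm": ["coast", "evacuation"],
    "coast": ["storm", "towns"],
    "evacuation": ["storm", "officials"],
    "towns": ["coast"],
    "officials": ["evacuation"],
}
print(spread_activation(graph, seeds=["storm"]))
```

Running the activated graphs of two documents through a matching step, as the abstract describes, would then surface which highly activated concepts are shared and which are unique to each document.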


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
2023: 74
2022: 160
2021: 52
2020: 61
2019: 47
2018: 52