scispace - formally typeset

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications on this topic have been published, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
01 Nov 2020
TL;DR: A unified model for single-document and multi-document summarization is built by fully sharing the encoder and decoder and utilizing a decoding controller to aggregate the decoder’s outputs for multiple input documents.
Abstract: Single-document and multi-document summarization are closely related in both task definition and solution method. In this work, we propose to improve neural abstractive multi-document summarization by jointly learning an abstractive single-document summarizer. We build a unified model for single-document and multi-document summarization by fully sharing the encoder and decoder and utilizing a decoding controller to aggregate the decoder’s outputs for multiple input documents. We evaluate our model on two multi-document summarization datasets: Multi-News and DUC-04. Experimental results show the efficacy of our approach: it substantially outperforms several strong baselines. We also verify the helpfulness of single-document summarization to the abstractive multi-document summarization task.
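As a rough illustration of how a decoding controller might aggregate per-document decoder outputs at each generation step, consider the sketch below. The function name and the simple weighted-average scheme are hypothetical, not the paper's actual architecture:

```python
import numpy as np

def controller_decode_step(doc_logits, doc_weights=None):
    """Aggregate per-document decoder logits into a single next-token
    distribution. A minimal sketch of a decoding controller; the real
    model may use a learned gating mechanism instead of fixed weights.

    doc_logits: array of shape (n_docs, vocab_size), one row per input doc.
    doc_weights: optional per-document importance weights (hypothetical).
    """
    doc_logits = np.asarray(doc_logits, dtype=float)
    if doc_weights is None:
        doc_weights = np.ones(len(doc_logits)) / len(doc_logits)
    # Weighted average of per-document logits, then a softmax.
    mixed = np.average(doc_logits, axis=0, weights=doc_weights)
    probs = np.exp(mixed - mixed.max())
    return probs / probs.sum()
```

At each decoding step the shared decoder is run once per document, and the controller mixes the resulting distributions before a token is sampled or selected greedily.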

6 citations

Journal ArticleDOI
TL;DR: This paper focuses on the Segmented Bushy Path, a method that represents the main subtopics from the source texts in a summary while preserving their informativeness, and achieves state-of-the-art results, competing with the most sophisticated deep summarization methods in the area.
Abstract: In this paper we adapt and explore strategies for generating multi-document summaries based on relationship maps, which represent texts as graphs (maps) of interrelated segments and apply different traversal techniques to produce the summaries. In particular, we focus on the Segmented Bushy Path, a method that tries to represent in the summary the main subtopics from the source texts while preserving their informativeness. In addition, we investigate some well-known subtopic segmentation and clustering techniques in order to select the most relevant information to compose the final summary. We show that this subtopic-based method outperforms other methods for multi-document summarization and achieves state-of-the-art results, competing with the most sophisticated deep summarization methods in the area.
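A minimal sketch of the relationship-map idea: text segments become graph nodes, edges link sufficiently similar segments, and the "bushiest" (highest-degree) nodes seed the summary. The similarity measure, threshold, and selection rule here are illustrative simplifications, not the paper's actual traversal strategy:

```python
from collections import Counter
import math

def _cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    num = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return num / (na * nb) if na and nb else 0.0

def bushy_summary(segments, threshold=0.2, k=2):
    """Build a relationship map over segments and pick the k highest-degree
    ('bushiest') nodes, returned in original text order."""
    vecs = [Counter(s.lower().split()) for s in segments]
    degree = [0] * len(segments)
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if _cosine(vecs[i], vecs[j]) >= threshold:
                degree[i] += 1
                degree[j] += 1
    top = sorted(sorted(range(len(segments)), key=lambda i: -degree[i])[:k])
    return [segments[i] for i in top]
```

The Segmented Bushy Path additionally groups segments by subtopic before traversal, so each subtopic contributes to the summary; that step is omitted here.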

6 citations

01 Jan 2005
TL;DR: A multi-document summarization method based on Latent Semantic Indexing (LSI) that combines several reports on the same issue into a matrix of terms and sentences, and uses Singular Value Decomposition (SVD) to reduce the dimension of the matrix and extract features.
Abstract: A multi-document summarization method based on Latent Semantic Indexing (LSI) is proposed. The method combines several reports on the same issue into a matrix of terms and sentences, uses Singular Value Decomposition (SVD) to reduce the dimension of the matrix and extract features, and then computes sentence similarity. The sentences are clustered according to their similarity, and a centroid sentence is selected from each cluster. Finally, the selected sentences are ordered to generate the summary. The evaluation and results are presented, showing that the proposed method is effective.
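The LSI step described above can be sketched with NumPy: build a term-sentence count matrix, truncate its SVD to rank k, and use the resulting low-dimensional sentence vectors for similarity and clustering. Function names and the raw-count weighting are illustrative assumptions, not details from the paper:

```python
import numpy as np

def lsi_sentence_vectors(sentences, k=2):
    """Term-sentence matrix -> rank-k SVD -> one feature vector per sentence."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    # A[i, j] = count of term i in sentence j.
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w], j] += 1
    # Truncated SVD: each column of diag(S_k) @ Vt_k represents a sentence
    # in the k-dimensional latent topic space.
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(S))
    return (np.diag(S[:k]) @ Vt[:k]).T  # shape (n_sentences, k)

def similarity(u, v):
    # Cosine similarity between two sentence vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))
```

Clustering then operates on these vectors (e.g., grouping sentences whose pairwise similarity exceeds a threshold), and one centroid sentence per cluster enters the summary.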

6 citations

Proceedings ArticleDOI
01 Nov 2018
TL;DR: A model for improving the quality of the scoring step, which benefits sentence selection for extracting high-quality summaries and achieves substantial improvements over traditional methods and competitive results with state-of-the-art deep learning models.
Abstract: Sentence scoring is a vital step in an extractive summarization system. This paper presents a model for improving the quality of the scoring step, which benefits sentence selection for extracting high-quality summaries. Unlike previous methods, our model takes advantage of both local information (inside a single document) and global information (across the whole corpus). The combination allows defining a rich set of features used for learning. Under a learning-to-rank formulation, the model learns to estimate the importance of sentences. After ranking, summaries are extracted by selecting top-ranked sentences while accounting for diversity. Experiments on three benchmark datasets (DUC 2001, 2002, and 2004) indicate that our model achieves substantial improvements over traditional methods and results competitive with state-of-the-art deep learning models.
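The final "top-ranked with diversity" step can be sketched as MMR-style greedy selection. Here the learned importance scores are assumed to be given (the paper obtains them via learning-to-rank); the trade-off parameter and similarity measure are illustrative:

```python
from collections import Counter
import math

def _cos(a, b):
    num = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return num / (na * nb) if na and nb else 0.0

def select_diverse(sentences, scores, k=2, lam=0.7):
    """Greedily pick k sentences, balancing learned importance (scores)
    against redundancy with sentences already chosen."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    chosen, candidates = [], list(range(len(sentences)))
    while candidates and len(chosen) < k:
        def mmr(i):
            redundancy = max((_cos(vecs[i], vecs[j]) for j in chosen),
                             default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        chosen.append(best)
        candidates.remove(best)
    return [sentences[i] for i in chosen]
```

With lam close to 1 the selection is purely score-driven; lowering it penalizes sentences that overlap with the summary built so far.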

6 citations

Proceedings ArticleDOI
06 Oct 2005
TL;DR: An evaluation method based on convolution kernels, which measure the similarities between texts in terms of their substructures, is presented; it correlates more closely with human evaluations and is more robust.
Abstract: To promote the study of automatic summarization and translation, we need an accurate automatic evaluation method that is close to human evaluation. In this paper, we present an evaluation method based on convolution kernels, which measure the similarities between texts in terms of their substructures. We conducted an experiment using the automatic summarization evaluation data developed for Text Summarization Challenge 3 (TSC-3). A comparison with conventional techniques shows that our method correlates more closely with human evaluations and is more robust.
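One simple member of the convolution-kernel family is the n-gram spectrum kernel, which scores two texts by their shared n-gram substructures. The sketch below illustrates the general idea only; the paper's kernels operate on richer substructures than flat word n-grams:

```python
from collections import Counter

def ngram_spectrum_kernel(a, b, n=2):
    """Count shared word n-grams between two texts: the inner product of
    their n-gram count vectors. A minimal convolution-kernel example."""
    def grams(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    ga, gb = grams(a), grams(b)
    return sum(ga[g] * gb[g] for g in ga if g in gb)
```

An evaluation metric of this shape compares a system summary against reference summaries via the kernel value, usually after normalization, so that shared substructure rather than exact string match drives the score.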

6 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52