Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.


Papers
Posted Content
TL;DR: This paper proposes an approach that extends a neural abstractive model trained on large-scale SDS data to the MDS task, using only a small number of multi-document summaries for fine-tuning.
Abstract: To date, neural abstractive summarization methods have achieved great success for single document summarization (SDS). However, due to the lack of large-scale multi-document summaries, such methods can hardly be applied to multi-document summarization (MDS). In this paper, we investigate neural abstractive methods for MDS by adapting a state-of-the-art neural abstractive summarization model for SDS. We propose an approach to extend the neural abstractive model trained on large-scale SDS data to the MDS task. Our approach only makes use of a small number of multi-document summaries for fine-tuning. Experimental results on two benchmark DUC datasets demonstrate that our approach can outperform a variety of baseline neural models.

30 citations
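The abstract above does not spell out an implementation, but the fine-tuning idea can be sketched. The snippet below is a minimal illustration, assuming a Hugging Face seq2seq checkpoint already trained on single-document data (facebook/bart-large-cnn is used only as a stand-in) and a hypothetical mds_train list of (documents, reference summary) pairs; each document cluster is naively concatenated before fine-tuning.

```python
# Minimal sketch (not the authors' code): adapt an SDS-pretrained abstractive
# summarizer to MDS by fine-tuning on a small set of multi-document summaries.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/bart-large-cnn"  # assumed SDS-pretrained checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# `mds_train` is a hypothetical list of (list_of_documents, reference_summary) pairs.
model.train()
for epoch in range(3):
    for docs, reference in mds_train:
        source = " ".join(docs)  # naive concatenation of the document cluster
        batch = tokenizer(source, truncation=True, max_length=1024,
                          return_tensors="pt")
        labels = tokenizer(reference, truncation=True, max_length=256,
                           return_tensors="pt").input_ids
        loss = model(**batch, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```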

23 Jun 2011
TL;DR: A novel unsupervised approach to the problem of multi-document summarization of scientific articles, in which the document collection is a list of papers cited together within the same source article, otherwise known as a co-citation, is presented.
Abstract: We present a novel unsupervised approach to the problem of multi-document summarization of scientific articles, in which the document collection is a list of papers cited together within the same source article, otherwise known as a co-citation. At the heart of the approach is a topic based clustering of fragments extracted from each co-cited article and relevance ranking using a query generated from the context surrounding the co-cited list of papers. This analysis enables the generation of an overview of common themes from the co-cited papers that relate to the context in which the co-citation was found. We present a system called SciSumm that embodies this approach and apply it to the 2008 ACL Anthology. We evaluate this summarization system for relevant content selection using gold standard summaries prepared on principle based guidelines. Evaluation with gold standard summaries demonstrates that our system performs better in content selection than an existing summarization system (MEAD). We present a detailed summary of our findings and discuss possible directions for future research.

30 citations
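The SciSumm pipeline itself is not reproduced in the abstract, so the following is only a rough sketch of the two stages it describes (topic-based clustering of fragments, then relevance ranking against a query built from the citation context), written with scikit-learn; the function name and parameters are illustrative, not the authors' design.

```python
# Illustrative sketch (not the SciSumm implementation): cluster text fragments
# from co-cited papers into topics, then rank each cluster's fragments against
# a query derived from the citing context.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def cocitation_overview(fragments, citation_context, n_topics=4, per_topic=2):
    """fragments: passages pooled from the co-cited papers.
    citation_context: text surrounding the co-citation in the citing paper."""
    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(fragments)

    # Topic-based clustering of the pooled fragments.
    labels = KMeans(n_clusters=n_topics, n_init=10, random_state=0).fit_predict(X)

    # Relevance ranking against the citation-context query.
    query = vectorizer.transform([citation_context])
    scores = cosine_similarity(X, query).ravel()

    overview = []
    for topic in range(n_topics):
        idx = [i for i, label in enumerate(labels) if label == topic]
        idx.sort(key=lambda i: scores[i], reverse=True)
        overview.extend(fragments[i] for i in idx[:per_topic])
    return overview
```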

Journal ArticleDOI
TL;DR: An effective integrated framework that uses both summary and category information, achieves significant improvements in text summarization and classification, and has the advantages of easy implementation and language independence.
Abstract: Highlights: an effective integrated framework using both summary and category information; the summarization technique utilizes category information from classification; the classification technique utilizes summary information from summarization; the integrated framework achieves significant improvements. Text summarization and classification are core techniques for analyzing huge amounts of text data in the big data environment. Moreover, as the need to read texts on smartphones, tablets, and televisions as well as personal computers continues to grow, text summarization and classification techniques become more important; both perform essential processes for text analysis in many applications. Traditional text summarization and classification techniques have been treated as separate research fields in the literature. However, we find that they can help each other: text summarization can make use of category information from text classification, and text classification can make use of summary information from text summarization. Therefore, in this paper we propose an effective integrated learning framework that uses both summary and category information. In this framework, the feature-weighting method for text summarization utilizes a language model to combine feature distributions in each category and text, while the one for text classification uses the sentence importance scores estimated by the text summarization. In the experiments, the performance of the integrated framework is better than that of individual text summarization and classification. In addition, the framework has the advantages of easy implementation and language independence because it is based only on simple statistical approaches and a POS tagger.

30 citations
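As a rough illustration of the mutual-reinforcement idea described above (and not the paper's exact formulation), the sketch below lets per-category word distributions score sentences for summarization, while classification weights its decision by those sentence scores; all names are hypothetical.

```python
# Hypothetical sketch of summarization and classification reinforcing each other.
from collections import Counter

def category_word_probs(labeled_docs):
    """labeled_docs: list of (tokens, category). Returns P(word | category)."""
    counts = {}
    for tokens, cat in labeled_docs:
        counts.setdefault(cat, Counter()).update(tokens)
    probs = {}
    for cat, cnt in counts.items():
        total = sum(cnt.values())
        probs[cat] = {w: c / total for w, c in cnt.items()}
    return probs

def sentence_score(sentence_tokens, category, word_probs):
    """Summarization side: average category-conditioned word probability."""
    probs = word_probs.get(category, {})
    return sum(probs.get(w, 0.0) for w in sentence_tokens) / max(len(sentence_tokens), 1)

def classify(sentences, word_probs):
    """Classification side: pick the category whose word distribution gives the
    document's sentences the highest total importance."""
    best, best_score = None, float("-inf")
    for cat in word_probs:
        score = sum(sentence_score(s, cat, word_probs) for s in sentences)
        if score > best_score:
            best, best_score = cat, score
    return best
```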

Proceedings ArticleDOI
27 Oct 2014
TL;DR: ROUGE-N is used to evaluate the generated summaries against the abstractive summaries of the DUC 2002 dataset, and results for combinations of various scoring features are discussed.
Abstract: Automatic text summarization is a major area of research in the domain of information systems. Most of the methods require domain knowledge in order to produce a coherent and meaningful summary. In extractive text summarization, sentences are scored on some features. A large number of feature-based scoring methods have been proposed for extractive automatic text summarization by researchers. This paper reviews features for sentence scoring. The results on combinations of various features for scoring are discussed. ROUGE-N is used to evaluate the generated summaries against the abstractive summaries of the DUC 2002 dataset.

30 citations
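Since the evaluation relies on ROUGE-N overlap with DUC 2002 reference summaries, a simplified version of the metric is sketched below; published results would normally use the official ROUGE toolkit or an established package, so this is for illustration only.

```python
# Simplified ROUGE-N recall: fraction of reference n-grams found in the candidate.
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate_tokens, reference_tokens, n=2):
    cand, ref = ngrams(candidate_tokens, n), ngrams(reference_tokens, n)
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())

# Example: ROUGE-2 of an extracted summary against an abstractive reference.
print(rouge_n_recall("the cat sat on the mat".split(),
                     "the cat lay on the mat".split(), n=2))
```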

Proceedings ArticleDOI
03 Sep 2012
TL;DR: A method of personalized text summarization is proposed that improves conventional automatic text summarization methods by taking readers' differing characteristics into account, using annotations added by readers as one of the sources of personalization.
Abstract: Automatic text summarization aims to address the information overload problem by extracting the most important information from a document, which can help a reader to decide whether it is relevant or not. In this paper we propose a method of personalized text summarization which improves the conventional automatic text summarization methods by taking into account the differences in readers' characteristics. We use annotations added by readers as one of the sources of personalization. We have experimentally evaluated the proposed method in the domain of learning, obtaining better summaries capable of extracting important concepts explained in the document when considering the relevant domain terms in the process of summarization.

30 citations
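The abstract describes reader annotations as one source of personalization; the sketch below shows one plausible way to bias generic sentence scores toward annotated terms. It is not the authors' system, and the function names and the boost parameter are assumptions.

```python
# Illustrative sketch: boost sentences that mention terms the reader annotated.
def personalized_scores(sentences, base_scores, annotated_terms, boost=0.5):
    """sentences: list of token lists; base_scores: generic importance scores;
    annotated_terms: terms taken from the reader's annotations."""
    annotated = {t.lower() for t in annotated_terms}
    scores = []
    for tokens, base in zip(sentences, base_scores):
        overlap = sum(1 for t in tokens if t.lower() in annotated)
        scores.append(base + boost * overlap / max(len(tokens), 1))
    return scores

def summarize(sentences, base_scores, annotated_terms, k=3):
    scores = personalized_scores(sentences, base_scores, annotated_terms)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # keep original document order
```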


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52