Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings Article
01 Jan 2022
TL;DR: This work introduces a human-annotated data set EntSUM for controllable summarization with a focus on named entities as the aspects to control and proposes extensions to state-of-the-art summarization approaches that achieve substantially better results.
Abstract: Controllable summarization aims to provide summaries that take into account user-specified aspects and preferences to better assist users with their information needs, as opposed to the standard summarization setup, which builds a single generic summary of a document. We introduce a human-annotated data set EntSUM for controllable summarization with a focus on named entities as the aspects to control. We conduct an extensive quantitative analysis to motivate the task of entity-centric summarization and show that existing methods for controllable summarization fail to generate entity-centric summaries. We propose extensions to state-of-the-art summarization approaches that achieve substantially better results on our data set. Our analysis and results show the challenging nature of this task and of the proposed data set.

8 citations

Journal Article
TL;DR: This paper proposes the notions of macro- and micro-level information and argues that both levels together form the "important information" that affects the modeling and evaluation of automatic summarization systems.

8 citations

Journal Article
TL;DR: This paper provides a detailed state-of-the-art analysis of text summarization concepts such as summarization approaches, techniques used, standard datasets, evaluation metrics, and future research directions.
Abstract: One of the most pressing issues to arise from the rapid growth of the Internet is information overload. Because material on any topic is plentiful on the Internet, condensing the relevant information into a summary will assist many people. Manually summarizing massive amounts of text is quite challenging for humans, which has increased the need for more complex and powerful summarizers. Researchers have been trying to improve approaches for creating summaries since the 1950s, such that the machine-generated summary matches the human-created summary. This study provides a detailed state-of-the-art analysis of text summarization concepts such as summarization approaches, techniques used, standard datasets, evaluation metrics, and future research directions. The most commonly accepted approaches, extractive and abstractive, are studied in detail in this work. Evaluating summaries and developing reusable resources and infrastructure aid in comparing and replicating findings, adding competition to improve the outcomes. Different evaluation methods for generated summaries are also discussed in this study. Finally, several challenges and research opportunities in text summarization are mentioned that may be useful for researchers working in this area.
Keywords: automatic text summarization, natural language processing, categorization of text summarization systems, abstractive text summarization, extractive text summarization, hybrid text summarization, evaluation of text summarization systems

8 citations

Book Chapter
20 Feb 2011
TL;DR: This paper presents a co-clustering based multi-document summarization method that makes full use of the diverse and redundant content within topically related articles to generate a multi-document summary.
Abstract: Two issues are crucial to multi-document summarization: diversity and redundancy. Content within topically related articles is usually redundant, while the topic is delivered from diverse perspectives. This paper presents a co-clustering based multi-document summarization method that makes full use of the diverse and redundant content. A multi-document summary is generated in three steps. First, the sentence-term co-occurrence matrix is designed to reflect diversity and redundancy. Second, the co-clustering algorithm is performed on the matrix to find globally optimal clusters for sentences and terms in an iterative manner. Third, a more accurate summary is generated by selecting representative sentences from the optimal clusters. Experiments on the DUC2004 dataset show that the co-clustering based multi-document summarization method is promising.

8 citations
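
The three steps named in the co-clustering abstract above (build a sentence-term matrix, co-cluster sentences and terms iteratively, pick representative sentences) can be illustrated with a minimal sketch. This is not the paper's algorithm: the alternating reassignment heuristic, the function names (`tokenize`, `build_matrix`, `cocluster`, `summarize`), the round-robin initialization, and the scoring rule are all assumptions made for illustration, and the heuristic does not guarantee globally optimal clusters.

```python
import re
from collections import Counter

def tokenize(sentence):
    # Lowercase word tokens; no stemming or stop-word removal in this sketch.
    return re.findall(r"[a-z]+", sentence.lower())

def build_matrix(sentences):
    # Step 1: sentence-term co-occurrence counts, one Counter per sentence.
    rows = [Counter(tokenize(s)) for s in sentences]
    vocab = sorted({t for row in rows for t in row})
    return rows, vocab

def cocluster(rows, vocab, k, max_iter=20):
    # Step 2 (simplified, hypothetical heuristic): alternately reassign
    # sentences and terms until the assignments stop changing.
    term_cluster = {t: i % k for i, t in enumerate(vocab)}  # deterministic start
    sent_cluster = [0] * len(rows)
    for _ in range(max_iter):
        # Assign each sentence to the term cluster holding most of its counts.
        new_sent = []
        for row in rows:
            mass = [0] * k
            for t, c in row.items():
                mass[term_cluster[t]] += c
            new_sent.append(mass.index(max(mass)))
        # Assign each term to the sentence cluster where it occurs most often.
        new_term = {}
        for t in vocab:
            mass = [0] * k
            for row, sc in zip(rows, new_sent):
                mass[sc] += row.get(t, 0)
            new_term[t] = mass.index(max(mass))
        if new_sent == sent_cluster and new_term == term_cluster:
            break
        sent_cluster, term_cluster = new_sent, new_term
    return sent_cluster, term_cluster

def summarize(sentences, k=2):
    # Step 3: from each non-empty cluster, keep the sentence whose terms
    # overlap most with that cluster's terms. Assumes k >= 1.
    rows, vocab = build_matrix(sentences)
    sent_cluster, term_cluster = cocluster(rows, vocab, k)
    chosen = []
    for c in range(k):
        best, best_score = None, -1
        for i, row in enumerate(rows):
            if sent_cluster[i] != c:
                continue
            score = sum(n for t, n in row.items() if term_cluster[t] == c)
            if score > best_score:
                best, best_score = i, score
        if best is not None:
            chosen.append(best)
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(list_of_sentences, k=3)` returns at most three sentences in their original order. A production system would likely weight terms (for example with TF-IDF) and use an established co-clustering method rather than this toy reassignment loop.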

Proceedings Article
16 Oct 2014
TL;DR: This paper implements an efficient text summarization technique intended to reduce computational cost and time as well as storage requirements.
Abstract: With huge numbers of documents and web pages now available, reading content in full is difficult. Summarization produces a condensed form of a large document so that its gist can be communicated easily. Current research in automatic summarization is dominated by some effective, yet naive approaches: summarization through extraction, summarization through abstraction, and multi-document summarization. These techniques are used to build a summary of a document. A number of techniques have been implemented for summarizing text from a single document, from online web data, or in any language. In this paper, we implement an efficient technique for text summarization to reduce computational cost and time as well as storage requirements.

8 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52