Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.


Papers
Proceedings Article
01 Jul 1995
TL;DR: An approach for estimating the number of elements needed from the basic rankings to compute a given number of elements of the resulting ranking, and experiments with a large text database prove the applicability of this approach.
Abstract: In this paper, we consider vague queries in text and fact databases. A vague query can be formulated as a combination of vague criteria. A single database object can meet a vague criterion to a certain degree. We confine ourselves to queries for which the answer can be computed efficiently by (perhaps repetitive) combination of rankings to new rankings. Since users usually will inspect some of the best answer objects only, the corresponding rankings need to be computed just as far as necessary to generate these first answer objects. In this contribution we describe an approach for estimating the number of elements needed from the basic rankings to compute a given number of elements of the resulting ranking. Experiments with a large text database prove the applicability of our approach.
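
The idea that basic rankings only need to be read "as far as necessary" can be illustrated with a small sketch. This is not the estimation approach from the paper: it is a generic threshold-style scan, assuming scores are non-negative, the combination function is a plain sum, and each object's score can be looked up in every ranking. The function name `top_k_combined` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Ranking = List[Tuple[str, float]]  # (object id, score), sorted by descending score


def top_k_combined(rankings: List[Ranking], k: int) -> Tuple[Ranking, int]:
    """Return the top-k objects by summed score and the depth read per ranking.

    Assumption: scores are non-negative and an object missing from a
    ranking contributes 0 to the sum.
    """
    lookup = [dict(r) for r in rankings]  # random access to scores
    combined: Dict[str, float] = {}
    max_depth = max(len(r) for r in rankings)

    def current_top() -> Ranking:
        # Sort by descending combined score, ties broken by object id.
        return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

    for depth in range(max_depth):
        # Upper bound on the combined score of any object not yet seen.
        threshold = 0.0
        for r in rankings:
            if depth < len(r):
                obj, score = r[depth]
                threshold += score
                if obj not in combined:
                    combined[obj] = sum(lk.get(obj, 0.0) for lk in lookup)
        top = current_top()
        # Stop once the k-th best seen object cannot be beaten by an unseen one.
        if len(top) == k and top[-1][1] >= threshold:
            return top, depth + 1
    return current_top(), max_depth


# Hypothetical basic rankings for one text criterion and one fact criterion.
text_rank = [("d1", 0.9), ("d2", 0.8), ("d3", 0.3), ("d4", 0.1)]
fact_rank = [("d2", 0.95), ("d1", 0.6), ("d4", 0.5), ("d3", 0.2)]
best, depth_read = top_k_combined([text_rank, fact_rank], k=1)
print(best, depth_read)
```

With this sample data, the scan stops after reading two of the four entries in each basic ranking, which is the kind of saving the abstract refers to.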

15 citations

Proceedings Article
26 Nov 2008
TL;DR: It is shown that all RST-based methods outperform the extractive summarizer and that hybrid methods produce worse summaries, and that Mann and Thompson's strong assumption in the summarization and RST research area is not helpful in the way previously imagined.
Abstract: Motivated by governmental, commercial and academic interests, the automatic text summarization area has seen a growing body of research and products, which has led to a large number of summarization methods. In this paper, we present a comprehensive comparative evaluation of the main automatic text summarization methods based on rhetorical structure theory (RST), claimed to be among the best ones. We also propose new methods and compare our results to an extractive summarizer, which belongs to a summarization paradigm with severe limitations. To the best of our knowledge, most of our results are new in the area and reveal very interesting conclusions. The simplest RST-based method is among the best ones, although all of them present comparable results. We show that all RST-based methods outperform the extractive summarizer and that hybrid methods produce worse summaries. Finally, we verify that Mann and Thompson's strong assumption in the summarization and RST research area is not helpful in the way previously imagined.

15 citations

Proceedings Article
Zheng Yuan, Taoran Lu, Dapeng Wu, Yu Huang, Heather Yu
07 Dec 2011
TL;DR: This work formulates video summarization as an integer programming problem, gives a ranking-based solution, and proposes a novel method to discover the latent concepts by spectral clustering of bag-of-words features.
Abstract: A compelling video summarization should allow viewers to understand the summary content and recover the original plot correctly. To this end, we materialize the abstract elements that are cognitively informative for viewers as concepts. They implicitly convey the semantic structure and are instantiated by semantically redundant instances. We then argue that a good summary should i) keep various concepts complete and balanced so as to give viewers comparable cognitive clues from a complete perspective, and ii) pursue maximal saliency so that the rendered summary is attractive to human perception. We then formulate video summarization as an integer programming problem and give a ranking-based solution. We also propose a novel method to discover the latent concepts by spectral clustering of bag-of-words features. Experimental results on human evaluation scores demonstrate that our summarization approach performs well in terms of informativeness, enjoyability and scalability.

15 citations

Proceedings Article
08 Sep 2015
TL;DR: This paper presents a qualitative and quantitative assessment of the 22 state-of-the-art extractive summarization systems using the CNN corpus, a dataset of 3,000 news articles.
Abstract: Text summarization is the process of automatically creating a shorter version of one or more text documents. This paper presents a qualitative and quantitative assessment of the 22 state-of-the-art extractive summarization systems using the CNN corpus, a dataset of 3,000 news articles.

15 citations

Journal Article
TL;DR: To improve the accuracy of term frequency, SBGA employs a novel method, TFS, which takes word sense into account while calculating term frequency. Experiments show that the strategy is effective and that the ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
Abstract: The multi-document summarizer using genetic algorithm-based sentence extraction (SBGA) regards the summarization process as an optimization problem where the optimal summary is chosen among a set of summaries formed by combining sentences from the original articles. To solve this NP-hard optimization problem, SBGA adopts a genetic algorithm, which can search for the optimal summary globally. To improve the accuracy of term frequency, SBGA employs a novel method, TFS, which takes word sense into account while calculating term frequency. The experiments on DUC04 data show that our strategy is effective and that the ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
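
Treating sentence extraction as an optimization problem solved by a genetic algorithm can be sketched roughly as follows. This is not SBGA itself: the fitness below simply counts distinct words covered within a word budget, standing in for the word-sense-aware term frequency (TFS) that the abstract mentions but does not specify. The names `fitness` and `ga_summary`, the parameters, and the sample sentences are all hypothetical.

```python
import random
from typing import List


def fitness(mask: List[int], sentences: List[str], budget: int) -> float:
    """Score a candidate summary encoded as a 0/1 mask over sentences."""
    chosen = [s for s, bit in zip(sentences, mask) if bit]
    if sum(len(s.split()) for s in chosen) > budget:
        return -1.0  # over the length budget: infeasible
    # Assumed stand-in objective: number of distinct words covered.
    return float(len({w.lower() for s in chosen for w in s.split()}))


def ga_summary(sentences: List[str], budget: int, generations: int = 50,
               pop_size: int = 20, seed: int = 0) -> List[str]:
    """Select sentences with a simple elitist genetic algorithm."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    n = len(sentences)  # assumes at least two sentences
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    pop.append([0] * n)  # the empty summary is always feasible
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, sentences, budget), reverse=True)
        parents = pop[: pop_size // 2]  # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] ^= 1    # flip one bit as mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, sentences, budget))
    return [s for s, bit in zip(sentences, best) if bit]


docs = [
    "The storm hit the coast on Monday",
    "Thousands lost power after the storm",
    "Officials opened shelters for residents",
    "The storm hit the coast",
]
print(ga_summary(docs, budget=12))
```

Because the better half of each generation is carried over unchanged, the best feasible candidate is never lost, so the returned summary always respects the word budget.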

15 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52