Topic
Multi-document summarization
About: Multi-document summarization is a research topic. Over the lifetime, 2270 publications have been published within this topic receiving 71850 citations.
Papers
••
01 Jul 1995
TL;DR: An approach for estimating the number of elements needed from the basic rankings to compute a given number of elements of the resulting ranking; experiments with a large text database prove the applicability of this approach.
Abstract: In this paper, we consider vague queries in text and fact databases. A vague query can be formulated as a combination of vague criteria. A single database object can meet a vague criterion to a certain degree. We confine ourselves to queries for which the answer can be computed efficiently by (perhaps repetitive) combination of rankings to new rankings. Since users usually will inspect some of the best answer objects only, the corresponding rankings need to be computed just as far as necessary to generate these first answer objects. In this contribution we describe an approach for estimating the number of elements needed from the basic rankings to compute a given number of elements of the resulting ranking. Experiments with a large text database prove the applicability of our approach.
15 citations
••
26 Nov 2008
TL;DR: It is shown that all RST-based methods outperform the extractive summarizer, that hybrid methods produce worse summaries, and that Mann and Thompson's strong assumption in the summarization and RST research area is not helpful in the way previously imagined.
Abstract: Motivated by governmental, commercial and academic interests, the area of automatic text summarization has seen a growing number of studies and products, which has led to countless summarization methods. In this paper, we present a comprehensive comparative evaluation of the main automatic text summarization methods based on rhetorical structure theory (RST), claimed to be among the best ones. We also propose new methods and compare our results to an extractive summarizer, which belongs to a summarization paradigm with severe limitations. To the best of our knowledge, most of our results are new in the area and reveal very interesting conclusions. The simplest RST-based method is among the best ones, although all of them present comparable results. We show that all RST-based methods outperform the extractive summarizer and that hybrid methods produce worse summaries. Finally, we verify that Mann and Thompson's strong assumption in the summarization and RST research area is not helpful in the way previously imagined.
15 citations
••
07 Dec 2011
TL;DR: This work formulates video summarization as an integer programming problem, gives a ranking-based solution, and proposes a novel method to discover the latent concepts by spectral clustering of bag-of-words features.
Abstract: A compelling video summarization should allow viewers to understand the summary content and recover the original plot correctly. To this end, we materialize the abstract elements that are cognitively informative for viewers as concepts. They implicitly convey the semantic structure and are instantiated by semantically redundant instances. We then argue that a good summary should i) keep the various concepts complete and balanced, so as to give viewers comparable cognitive clues from a complete perspective, and ii) pursue the highest saliency, so that the rendered summary is attractive to human perception. We then formulate video summarization as an integer programming problem and give a ranking-based solution. We also propose a novel method to discover the latent concepts by spectral clustering of bag-of-words features. Experimental results on human evaluation scores demonstrate that our summarization approach performs well in terms of informativeness, enjoyability and scalability.
15 citations
••
08 Sep 2015
TL;DR: This paper presents a qualitative and quantitative assessment of the 22 state-of-the-art extractive summarization systems using the CNN corpus, a dataset of 3,000 news articles.
Abstract: Text summarization is the process of automatically creating a shorter version of one or more text documents. This paper presents a qualitative and quantitative assessment of the 22 state-of-the-art extractive summarization systems using the CNN corpus, a dataset of 3,000 news articles.
15 citations
•
TL;DR: To improve the accuracy of term frequency, SBGA employs a novel method, TFS, which takes word sense into account while calculating term frequency; experiments show that the strategy is effective and that the ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
Abstract: The multi-document summarizer using genetic algorithm-based sentence extraction (SBGA) regards the summarization process as an optimization problem in which the optimal summary is chosen among a set of summaries formed by combining the original articles' sentences. To solve this NP-hard optimization problem, SBGA adopts a genetic algorithm, which can choose the optimal summary from a global perspective. To improve the accuracy of term frequency, SBGA employs a novel method, TFS, which takes word sense into account while calculating term frequency. The experiments on DUC04 data show that our strategy is effective and that the ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
15 citations
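The idea of treating sentence extraction as an optimization problem solved by a genetic algorithm, as described in the SBGA abstract above, can be illustrated with a minimal sketch. This is not the SBGA system: the fitness function (plain term-frequency coverage under a word budget), the operators (tournament selection, one-point crossover, bit-flip mutation), and every name and parameter (`ga_summarize`, `fitness`, `max_words`, `pop_size`, and so on) are assumptions made for illustration. The word-sense-aware TFS weighting from the paper is not reproduced.

```python
import random
from collections import Counter


def fitness(mask, sentences, tf, max_words):
    """Score a candidate summary (hypothetical objective, not the paper's).

    The score is the summed corpus frequency of the distinct words covered
    by the selected sentences; candidates over the word budget score 0.
    """
    chosen = [s for bit, s in zip(mask, sentences) if bit]
    if sum(len(s) for s in chosen) > max_words:
        return 0.0
    covered = {w for s in chosen for w in s}
    return float(sum(tf[w] for w in covered))


def ga_summarize(sentences, max_words, pop_size=30, generations=60,
                 mutation_rate=0.05, seed=0):
    """Pick sentence indices with a simple genetic algorithm.

    `sentences` is a list of token lists pooled from all input documents.
    Each individual is a 0/1 mask over the sentences.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    tf = Counter(w for s in sentences for w in s)
    n = len(sentences)

    def score(mask):
        return fitness(mask, sentences, tf, max_words)

    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    # Start from the empty summary so the best-so-far is always within budget.
    best = [0] * n
    for _ in range(generations):
        candidate = max(population, key=score)
        if score(candidate) > score(best):
            best = candidate[:]
        next_pop = [best[:]]  # elitism: carry the best individual forward
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1) if n > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_pop.append(child)
        population = next_pop
    candidate = max(population, key=score)
    if score(candidate) > score(best):
        best = candidate[:]
    return [i for i, bit in enumerate(best) if bit]
```

A call such as `ga_summarize([s.split() for s in texts], max_words=100)` returns the indices of the selected sentences. Infeasible candidates are scored as zero rather than repaired, which keeps the sketch short; a practical system would more likely use a graded length penalty or a repair step.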