Topic
Multi-document summarization
About: Multi-document summarization is a research topic. Over its lifetime, 2270 publications have been published within this topic, receiving 71850 citations.
Papers
04 Jun 2009
TL;DR: It is shown that the summarizer built is able to outperform most systems participating in task focused summarization evaluations at Text Analysis Conferences (TAC) 2008 and would perform better at producing short summaries than longer summaries.
Abstract: In this paper, we describe a sentence-position-based summarizer built on a sentence position policy, which is created from the evaluation testbed of recent summarization tasks at the Document Understanding Conferences (DUC). We show that the resulting summarizer outperforms most systems participating in the task-focused summarization evaluations at the Text Analysis Conference (TAC) 2008. Our experiments also show that such a method performs better at producing short summaries (up to 100 words) than longer ones. Further, we discuss the baselines traditionally used for summarization evaluation and suggest reviving an old baseline to suit the current summarization task at TAC: the Update Summarization task.
36 citations
22 Jan 2007
TL;DR: The Document Understanding Conference (DUC) 2005 evaluation had a single user-oriented, question-focused summarization task, which was to synthesize from a set of 25--50 documents a well-organized, fluent answer to a complex question as discussed by the authors.
Abstract: The Document Understanding Conference (DUC) 2005 evaluation had a single user-oriented, question-focused summarization task, which was to synthesize from a set of 25--50 documents a well-organized, fluent answer to a complex question. The evaluation shows that the best summarization systems have difficulty extracting relevant sentences in response to complex questions (as opposed to representative sentences that might be appropriate to a generic summary). The relatively generous allowance of 250 words for each answer also reveals how difficult it is for current summarization systems to produce fluent text from multiple documents.
36 citations
01 Dec 2012
TL;DR: This paper presents an approach to query-focused multi-document summarization that combines single-document summaries using sentence clustering; its average F-measure on the DUC 2002 multi-document dataset is comparable to the three best-performing systems reported on the same dataset.
Abstract: This paper presents an approach to query-focused multi-document summarization that combines single-document summaries using sentence clustering. Both syntactic and semantic similarity between sentences are used for clustering. Each single-document summary is generated using a document feature, a sentence reference index feature, a location feature, and a concept similarity feature. Sentences from the single-document summaries are clustered, and the topmost sentences from each cluster are used to create the multi-document summary. We observed an average F-measure of 0.33774 on the DUC 2002 multi-document dataset, which is comparable to the three best-performing systems reported on the same dataset.
36 citations
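The cluster-then-select step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the Jaccard word-overlap similarity, the single-pass greedy clustering, the `threshold` parameter, and the function name `cluster_and_select` are all hypothetical stand-ins (the paper uses syntactic and semantic similarity and its own sentence scores).

```python
def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity; an assumed stand-in for the paper's similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_and_select(scored_sentences, threshold=0.5):
    """Greedily cluster sentences, then keep the top-scored one per cluster.

    scored_sentences: list of (sentence, score) pairs, e.g. sentences taken
    from single-document summaries together with their summary scores.
    """
    clusters = []  # each cluster is a list of (sentence, score, word_set)
    for sentence, score in scored_sentences:
        words = set(sentence.lower().split())
        for cluster in clusters:
            # Compare against the first member of the cluster (its seed).
            if jaccard(words, cluster[0][2]) >= threshold:
                cluster.append((sentence, score, words))
                break
        else:
            # No sufficiently similar cluster found: start a new one.
            clusters.append([(sentence, score, words)])
    # One representative per cluster: the highest-scoring sentence.
    return [max(c, key=lambda member: member[1])[0] for c in clusters]


scored = [
    ("the storm hit the coast", 0.6),
    ("the storm hit the coast hard", 0.9),
    ("rescue teams arrived quickly", 0.5),
]
print(cluster_and_select(scored))
```

Here the two near-duplicate storm sentences fall into one cluster, so only the higher-scored one survives alongside the unrelated rescue sentence.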
13 Oct 1998
TL;DR: A method for combining query-relevance with information-novelty in the context of text retrieval and summarization, where the clearest advantage is demonstrated in the automated construction of large document and non-redundant multi-document summaries, where MMR results are clearly superior to non-MMR passage selection.
Abstract: This paper develops a method for combining query-relevance with information-novelty in the context of text retrieval and summarization. The Maximal Marginal Relevance (MMR) criterion strives to reduce redundancy while maintaining query relevance in reranking retrieved documents and in selecting appropriate passages for text summarization. Preliminary results indicate some benefits for MMR diversity ranking in ad-hoc query and in single document summarization. The latter are borne out by the trial-run (unofficial) TREC-style evaluation of summarization systems. However, the clearest advantage is demonstrated in the automated construction of large document and non-redundant multi-document summaries, where MMR results are clearly superior to non-MMR passage selection. This paper also discusses our preliminary evaluation of summarization methods for single documents.
36 citations
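As a rough illustration of the MMR criterion described in the abstract above, the sketch below greedily picks the passage that maximizes `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to any passage already selected. The bag-of-words cosine similarity, the whitespace tokenizer, and the names `mmr_select`, `bow`, and `lam` are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Bag-of-words vector from whitespace tokens (an assumed representation)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query: str, passages: list, k: int, lam: float = 0.7) -> list:
    """Greedy MMR: trade off query relevance against similarity to picks so far."""
    q = bow(query)
    vecs = [bow(p) for p in passages]
    selected = []                      # indices already chosen
    candidates = list(range(len(passages)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], q)
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [passages[i] for i in selected]


passages = ["storm hits coast", "storm hits coast", "rescue teams arrive"]
print(mmr_select("storm coast rescue", passages, k=2, lam=0.5))
```

With `lam=0.5` the duplicate passage is penalized and the rescue passage is chosen second; with `lam=1.0` the redundancy term vanishes and selection reduces to plain relevance ranking.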
TL;DR: A frequent-term-based text summarization algorithm, implemented using open source technologies such as Java, DISCO, and the Porter stemmer, and verified on a standard text mining corpus.
Abstract: Text summarization is an important activity in the analysis of high volumes of text documents. It has a number of applications, and a growing number of them use text summarization to improve text analysis and knowledge representation. In this paper, a frequent-term-based text summarization algorithm is designed and implemented in Java. The algorithm works in three steps. In the first step, the document to be summarized is processed by eliminating stop words and applying a stemmer. In the second step, term-frequency data is calculated from the document and frequent terms are selected; semantically equivalent terms are also generated for the selected words. In the third step, all sentences in the document that contain the frequent or semantically equivalent terms are filtered for the summary. The algorithm is implemented using open source technologies such as Java, DISCO, and the Porter stemmer, and verified on a standard text mining corpus.
36 citations
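The three steps in the abstract above can be sketched in a few lines. The paper's implementation is in Java with DISCO and the Porter stemmer; the Python below is only a simplified illustration in which `crude_stem` is a naive suffix stripper (not the Porter algorithm), `SYNONYMS` is a hand-written hypothetical table standing in for DISCO, and `STOP_WORDS`, `top_n`, and `summarize` are likewise assumed names.

```python
import re
from collections import Counter

# Assumed toy resources; a real system would use full stop-word and synonym data.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "in", "and", "to", "it", "for", "on"}
SYNONYMS = {"storm": {"hurricane"}}  # hypothetical stand-in for DISCO lookups


def crude_stem(word: str) -> str:
    """Naive suffix stripping; NOT the Porter stemmer, just a placeholder."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(sentence: str) -> list:
    # Step 1: drop stop words and stem the remaining tokens.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def summarize(text: str, top_n: int = 2) -> list:
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    processed = [preprocess(s) for s in sentences]
    # Step 2: count terms, keep the most frequent, and add semantic equivalents.
    counts = Counter(t for tokens in processed for t in tokens)
    frequent = {term for term, _ in counts.most_common(top_n)}
    expanded = set(frequent)
    for term in frequent:
        expanded |= {crude_stem(s) for s in SYNONYMS.get(term, set())}
    # Step 3: keep sentences containing a frequent or equivalent term.
    return [s for s, tokens in zip(sentences, processed) if expanded & set(tokens)]


text = (
    "The storm damaged the coast. A hurricane formed offshore. "
    "Markets were calm. Storms damaged roads."
)
print(summarize(text))
```

In this toy input the sentence about markets is dropped, while the hurricane sentence is kept only because of the synonym expansion in step 2.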