scispace - formally typeset
Search or ask a question
Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over the lifetime, 2270 publications have been published within this topic receiving 71850 citations.


Papers
More filters
Proceedings ArticleDOI
24 Aug 2002
TL;DR: A framework for the evaluation of summaries in English and Chinese using similarity measures that can be used to evaluate extractive, non-extractive, single and multi-document summarization is described.
Abstract: We describe a framework for the evaluation of summaries in English and Chinese using similarity measures. The framework can be used to evaluate extractive, non-extractive, single and multi-document summarization. We focus on the resources developed that are made available for the research community.

62 citations

Journal ArticleDOI
TL;DR: This survey investigates several research studies that have been conducted in the field of Arabic text summarization, and addresses summarization and evaluation methods, as well as the corpora used in those studies.
Abstract: This survey investigates several research studies that have been conducted in the field of Arabic text summarization. Specifically, it addresses summarization and evaluation methods, as well as the corpora used in those studies. The literature in this field is fairly limited and relatively new compared to the available literature on other languages, such as English. Therefore, there exists a great opportunity for further research in Arabic text summarization. In addition, one of the largest problems in Arabic summarization was the absence of Arabic gold standard summaries, although this situation is beginning to change, especially with the inclusion of Arabic language as a part of the corpora and tasks in the TAC 2011 MultiLing Pilot and ACL 2013 MultiLing Workshop. Finally, providing the required corpora and adopting them in Arabic summarization studies is an essential demand.

62 citations

Proceedings ArticleDOI
Inderjeet Mani1
05 Oct 2001
TL;DR: The significance of some recent developments in summarization technology is discussed, often bundled with information retrieval tools, as well as from the need for corporate knowledge management.
Abstract: With the explosion in the quantity of on-line text and multimedia information in recent years, demand for text summarization technology is growing. Increased pressure for technology advances is coming from users of the web, on-line information sources, and new mobile devices, as well as from the need for corporate knowledge management. Commercial companies are increasingly starting to offer text summarization capabilities, often bundled with information retrieval tools. In this paper, I will discuss the significance of some recent developments in summarization technology.

62 citations

Journal ArticleDOI
TL;DR: A two-stage scene-based movie summarization method based on mining the relationship between role-communities, which achieves better subjective performance than attention-based and role-based summarization methods in terms of semantic content preservation for a movie summary.
Abstract: Video summarization techniques aim at condensing a full-length video to a significantly shortened version that still preserves the major semantic content of the original video. Movie summarization, being a special class of video summarization, is particularly challenging since a large variety of movie scenarios and film styles complicate the problem. In this paper, we propose a two-stage scene-based movie summarization method based on mining the relationship between role-communities since the role-communities in earlier scenes are usually used to develop the role relationship in later scenes. In the analysis stage, we construct a social network to characterize the interactions between role-communities. As a result, the social power of each role-community is evaluated by the community's centrality value and the role communities are clustered into relevant groups based on the centrality values. In the summarization stage, a set of feasible summary combinations of scenes is identified and an information-rich summary is selected from these candidates based on social power preservation. Our evaluation results show that in at most test cases the proposed method achieves better subjective performance than attention-based and role-based summarization methods in terms of semantic content preservation for a movie summary.

62 citations

Proceedings ArticleDOI
25 Jun 2007
TL;DR: A principled comparison between the two most commonly used schemes for assigning importance to words in the context of query focused multi-document summarization finds that log-likelihood ratio is more suitable for query-focused summarization since, unlike raw frequency, it is more sensitive to the integration of the information need defined by the user.
Abstract: The increasing complexity of summarization systems makes it difficult to analyze exactly which modules make a difference in performance. We carried out a principled comparison between the two most commonly used schemes for assigning importance to words in the context of query focused multi-document summarization: raw frequency (word probability) and log-likelihood ratio. We demonstrate that the advantages of log-likelihood ratio come from its known distributional properties which allow for the identification of a set of words that in its entirety defines the aboutness of the input. We also find that LLR is more suitable for query-focused summarization since, unlike raw frequency, it is more sensitive to the integration of the information need defined by the user.

62 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202374
2022160
202152
202061
201947
201852