Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared on this topic, receiving 71,850 citations.

Papers

Proceedings Article

Text Summarization of Turkish Texts using Latent Semantic Analysis

Makbule Gulcin Ozsoy, Ilyas Cicekli¹, Ferda Nur Alpaslan

Bilkent University¹

23 Aug 2010

TL;DR: Two new LSA-based summarization algorithms are proposed and evaluated on Turkish documents, with their performance compared using ROUGE-L scores.

Abstract: Text summarization solves the problem of extracting important information from huge amounts of text data. There are various methods in the literature that aim to produce well-formed summaries. One of the most commonly used methods is Latent Semantic Analysis (LSA). In this paper, different LSA-based summarization algorithms are explained and two new LSA-based summarization algorithms are proposed. The algorithms are evaluated on Turkish documents, and their performances are compared using their ROUGE-L scores. One of our algorithms produces the best scores.
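ROUGE-L, the metric named in the abstract above, scores a candidate summary by the longest common subsequence (LCS) it shares with a reference summary. The following is a minimal sketch of the sentence-level form, not the paper's evaluation setup: whitespace tokenization, lowercasing, a single reference, and beta = 1 are all simplifying assumptions, and the function names `lcs_length` and `rouge_l` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous row of the DP table
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-score between two strings.

    Assumptions (illustrative only): lowercase whitespace tokens,
    one reference, and beta = 1 so precision and recall weigh equally.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

For example, `rouge_l("the cat sat on the mat", "the cat is on the mat")` finds the five-token subsequence "the cat on the mat" in two six-token sentences, so precision and recall are both 5/6.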

63 citations

Journal Article

Summarizing large text collection using topic modeling and clustering based on MapReduce framework

Naresh Kumar Nagwani¹

National Institute of Technology, Raipur¹

26 Jun 2015 - Journal of Big Data

TL;DR: A novel MapReduce-based framework is proposed for summarizing large text collections, combining semantic-similarity-based clustering with topic modeling using Latent Dirichlet Allocation (LDA).

Abstract: Document summarization provides an instrument for faster understanding of a collection of text documents and has a number of real-life applications. Semantic similarity and clustering can be utilized efficiently to generate effective summaries of large text collections. Summarizing a large volume of text is a challenging and time-consuming problem, particularly when semantic similarity computation is part of the summarization process. Summarization of a text collection involves intensive text processing and computation to generate the summary. MapReduce is a proven state-of-the-art technology for handling Big Data. In this paper, a novel framework based on MapReduce technology is proposed for summarizing large text collections. The proposed technique is designed using semantic-similarity-based clustering and topic modeling using Latent Dirichlet Allocation (LDA) for summarizing large text collections over the MapReduce framework. The summarization task is performed in four stages and provides a modular implementation of multi-document summarization. The presented technique is evaluated in terms of scalability, and various text summarization parameters, namely compression ratio, retention ratio, ROUGE, and Pyramid score, are also measured. The advantages of the MapReduce framework are clearly visible from the experiments, and it is also demonstrated that MapReduce provides a faster implementation of summarizing large text collections and is a powerful tool in Big Text Data analysis.

63 citations

Proceedings Article

Document update summarization using incremental hierarchical clustering

Dingding Wang¹, Tao Li¹

Florida International University¹

26 Oct 2010

TL;DR: A new summarization method based on an incremental hierarchical clustering framework is proposed to update summaries as soon as a new document arrives; experiments demonstrate its effectiveness and efficiency.

Abstract: Document summarization has become a hot topic in recent years. However, most existing summarization methods work on a batch of documents and do not consider that documents may arrive in a sequence and that the corresponding summaries need to be updated in real time. In this paper, we propose a new summarization method based on an incremental hierarchical clustering framework to update summaries as soon as a new document arrives. Extensive experimental results demonstrate the effectiveness and efficiency of our proposed method.

63 citations

Proceedings Article

Summarization of Spontaneous Conversations

Xiaodan Zhu, Gerald Penn

01 Jan 2006

TL;DR: This paper summarizes spontaneous conversations using a wide variety of previously unexplored features, and examines the role of disfluencies in summarization, which all previous work either did not handle explicitly or removed as noise.

Abstract: Most speech summarization research is conducted on broadcast news. In our view, spontaneous conversations are a more "typical" speech source that distinguishes speech summarization from text summarization, and hence a more appropriate domain for studying speech summarization. For example, spontaneous conversations contain more spoken-language characteristics, e.g. disfluencies and false starts. They are also more vulnerable to ASR errors. Previous research has studied some aspects of this type of data, but this paper addresses the problem further in several important respects. First, we summarize spontaneous conversations with a wide variety of features that have not been explored before. Second, we examine the role of disfluencies in summarization, which in all previous work was either not explicitly handled or removed as noise. Third, we break down and analyze the impact of WER on the individual features for summarization.

Index Terms: speech summarization, utterance selection, spontaneous conversations

62 citations

Journal Article

Multidocument summarization: An added value to clustering in interactive retrieval

Manuel J. Maña-López¹, Manuel de Buenaga², José M. Gómez-Hidalgo²

University of Vigo¹, European University of Madrid²

01 Apr 2004 - ACM Transactions on Information Systems

TL;DR: This article proposes complementing the classification capacity of clustering techniques with an indicative extract of the contents of several sources, produced by multidocument summarization techniques.

Abstract: An increasingly common problem in effective information access is the presence in the same corpus of multiple documents that contain similar information. Generally, users may be interested in locating, for a topic addressed by a group of similar documents, one or several particular aspects. This kind of task, called instance or aspectual retrieval, has been explored in several TREC Interactive Tracks. In this article, we propose, in addition to the classification capacity of clustering techniques, the possibility of offering an indicative extract about the contents of several sources by means of multidocument summarization techniques. Two kinds of summaries are provided. The first one covers the similarities of each cluster of documents retrieved. The second one shows the particularities of each document with respect to the common topic in the cluster. The document multitopic structure has been used in order to determine similarities and differences of topics in the cluster of documents. The system is independent of document domain and genre. An evaluation of the proposed system with users shows significant improvements in effectiveness. The results of previous experiments that have compared clustering algorithms are also reported.

62 citations

Network Information

Metrics

Papers: 2,507

Citations: 81,726

No. of papers in the topic in previous years
Year	Papers
2023	74
2022	160
2021	52
2020	61
2019	47
2018	52