Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings Article
11 Sep 2009
TL;DR: The paper shows that the summarization result depends not only on the sentence features but also on the sentence similarity measure, and that the proposed approach can improve performance compared to other summarization methods.
Abstract: Automatic text summarization plays an important role in information retrieval and text classification, and may provide a solution to the information overload problem. Text summarization is the process of reducing the size of a text while preserving its information content. This paper proposes a summarization approach based on sentence clustering. The proposed approach consists of three steps: first, it clusters the sentences based on the semantic distance among sentences in the document; then, for each cluster, it calculates the accumulative sentence similarity using a multi-feature combination method; finally, it chooses the topic sentences by some extraction rules. The purpose of the paper is to show that the summarization result depends not only on the sentence features but also on the sentence similarity measure. The experimental results on the DUC 2003 dataset show that the proposed approach can improve performance compared to other summarization methods.

78 citations
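
The three steps named in the abstract above (cluster sentences, accumulate within-cluster similarity, pick topic sentences) can be illustrated with a minimal sketch. This is not the authors' method: the bag-of-words cosine similarity, the greedy threshold clustering, the `threshold` parameter, and the one-sentence-per-cluster rule are all assumptions standing in for the semantic distance, multi-feature combination, and extraction rules that the abstract does not specify.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercased word counts; an assumed stand-in for the paper's sentence features."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_sentences(vectors, threshold=0.3):
    """Step 1 (assumed greedy scheme): join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of sentence indices; element 0 is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar cluster found, so start a new one
    return clusters


def summarize(sentences, threshold=0.3):
    """Return one representative sentence per cluster, in original order."""
    vectors = [bag_of_words(s) for s in sentences]
    chosen = []
    for cluster in cluster_sentences(vectors, threshold):
        # Step 2: accumulated similarity of each sentence to the rest of its cluster.
        def accumulated(i, members=cluster):
            return sum(cosine(vectors[i], vectors[j]) for j in members if j != i)

        # Step 3 (assumed rule): keep the highest-scoring sentence of the cluster.
        chosen.append(max(cluster, key=accumulated))
    return [sentences[i] for i in sorted(chosen)]


example = [
    "The cat sat on the mat.",
    "A cat sat on a mat.",
    "Stock prices fell sharply today.",
    "Stock prices fell today.",
]
summary = summarize(example)
```

With these four example sentences, the two cat sentences fall into one cluster and the two stock sentences into another, so the sketch returns one sentence from each group. A real system would replace the word-overlap similarity with the semantic measure the paper argues is decisive.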

Journal Article
TL;DR: A coherent graph-based semantic clustering and summarization approach for biomedical literature that takes advantage of ontology-enriched graphical representations significantly improves the quality of document clusters and understandability of documents through summaries.
Abstract: Background: A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. To make the biomedical information in this free text easier to use, document clustering and text summarization are used together as a solution to the text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature.

77 citations

Proceedings Article
01 Jun 2014
TL;DR: A hybrid method to generate summaries of product and services reviews by combining natural language generation and salient sentence selection techniques is presented.
Abstract: We present a hybrid method to generate summaries of product and services reviews by combining natural language generation and salient sentence selection techniques. Our system, STARLET-H, receives as input textual reviews with associated rated topics, and produces as output a natural language document summarizing the opinions expressed in the reviews. STARLET-H operates as a hybrid

77 citations

Journal Article
TL;DR: A system, SumGen, is described, which selects key information from an event database by reasoning about event frequencies, frequencies of relations between events, and domain specific importance measures and then aggregates similar information and plans a summary presentation tailored to a stereotypical user.
Abstract: Summarization entails analysis of source material, selection of key information, condensation of this information, and generation of a compact summary form. While there have been many investigations into the automatic summarization of text, relatively little attention has been given to the summarization of information from structured information sources such as data or knowledge bases, despite this being a desirable capability for a number of application areas including report generation from databases (e.g. weather, financial, medical) and simulations (e.g. military, manufacturing, economic). After a brief introduction indicating the main elements of summarization and referring to some illustrative approaches to it, this article considers specific issues in the generation of text summaries of event data. It describes a system, SumGen, which selects key information from an event database by reasoning about event frequencies, frequencies of relations between events, and domain-specific importance measures. The article describes how SumGen then aggregates similar information and plans a summary presentation tailored to a stereotypical user. Finally, the article evaluates SumGen's performance, and also that of a much more limited second summariser, by assessing information extraction by 22 human subjects from both source and summary texts. This evaluation shows that the use of SumGen reduces average sentence length by approx. 15%, document length by 70%, and time to perform information extraction by 58%.

77 citations

Journal Article
TL;DR: A document summarization model which extracts key sentences from given documents while reducing redundant information in the summaries is presented, and an innovative aspect of the model lies in its ability to remove redundancy while selecting representative sentences.
Abstract: For effective multi-document summarization, it is important to reduce redundant information in the summaries and to extract sentences that are common to the given documents. This paper presents a document summarization model which extracts key sentences from given documents while reducing redundant information in the summaries. An innovative aspect of our model lies in its ability to remove redundancy while selecting representative sentences. The model is represented as a discrete optimization problem. To solve this discrete optimization problem, an adaptive DE algorithm is created. We implemented our model on the multi-document summarization task. Experiments have shown that the proposed model is preferable to other summarization systems. We also showed that the resulting summarization system based on the proposed optimization approach is competitive on the DUC2002 and DUC2004 datasets.

76 citations
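
The trade-off described in the abstract above, selecting representative sentences while removing redundancy, can be sketched with a simple greedy selector. This deliberately swaps in a greedy heuristic for the paper's adaptive DE optimizer, which the abstract does not detail. The scoring formula, the `redundancy_weight` parameter, and the word-count centroid used as a relevance proxy are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Lowercased word counts for one sentence."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(sentences, k=2, redundancy_weight=0.5):
    """Greedily pick k sentences that are relevant to the collection but not redundant.

    `sentences` is the pooled list of sentences from all input documents.
    """
    vectors = [vectorize(s) for s in sentences]

    # Assumed relevance proxy: similarity to the summed word counts of all sentences.
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)

    selected = []
    while len(selected) < min(k, len(sentences)):
        best, best_score = None, -math.inf
        for i, vec in enumerate(vectors):
            if i in selected:
                continue
            relevance = cosine(vec, centroid)
            # Redundancy: highest similarity to any sentence already in the summary.
            redundancy = max((cosine(vec, vectors[j]) for j in selected), default=0.0)
            score = (1 - redundancy_weight) * relevance - redundancy_weight * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return [sentences[i] for i in sorted(selected)]


pooled = [
    "Floods damaged the city bridge.",
    "Floods damaged the city bridge.",
    "Officials opened emergency shelters.",
]
picked = select_sentences(pooled, k=2)
```

In this example the duplicated flood sentence is selected once, and the second slot goes to the shelter sentence because the repeated sentence is penalized for redundancy. A global optimizer such as the one the paper describes would search over whole summaries rather than adding one sentence at a time.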


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52