Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Journal ArticleDOI
TL;DR: An optimized semantic technique for multi-document abstractive summarization that utilizes the benefits of semantic role labeling, clustering and Particle Swarm Optimization to rank predicate argument structures (semantic representation) in each cluster using optimized features.
Abstract: Background/Objective: Multi-document summarization produces a concise summary from several topically related online documents. A major challenge in this domain is the information overlap among documents that come from different sources. This paper introduces an optimized semantic technique for multi-document abstractive summarization. Methods/Statistical Analysis: Linguistic and semantic approaches are usually employed for abstractive summarization of multiple documents. Linguistic approaches lack a semantic representation of the source text, while semantic approaches mostly rely on human experts to construct a domain ontology and rules, which requires immense time and effort. The technique in this paper utilizes the benefits of semantic role labeling, clustering and Particle Swarm Optimization (PSO) to rank predicate argument structures (the semantic representation) in each cluster using optimized features. Findings: Summary quality is sensitive to the text features, i.e., different features have varied importance for summary generation. Therefore, integrating the optimal feature weights obtained with PSO into the semantic technique to rank the semantic representation improved summarization results. The performance of the technique is evaluated against benchmark summarization systems using pyramid evaluation measures (mean coverage score, precision and F-measure). A paired-samples t-test is carried out to validate the summarization results. Applications/Improvements: The experiment is performed on DUC-2002, a benchmark data set for text summarization. Experimental results confirm that the proposed technique yields better results than the other summarization models compared, in terms of mean coverage score and average F-measure.

2 citations
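
The idea in the abstract above of using PSO to find feature weights for ranking can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-value feature vectors, the weighted-sum scoring, the fitness function (fraction of the top-k ranked units marked relevant), and the PSO constants (inertia 0.7, acceleration coefficients 1.5). The names `score`, `rank`, `fitness` and `pso_weights` are hypothetical.

```python
import random

def score(weights, features):
    """Weighted sum of feature values for one candidate unit."""
    return sum(w * f for w, f in zip(weights, features))

def rank(weights, candidates):
    """Return candidate indices sorted from highest to lowest score."""
    return sorted(range(len(candidates)),
                  key=lambda i: score(weights, candidates[i]),
                  reverse=True)

def fitness(weights, candidates, relevant, k):
    """Assumed objective: fraction of the top-k candidates marked relevant."""
    top = rank(weights, candidates)[:k]
    return sum(1 for i in top if i in relevant) / k

def pso_weights(candidates, relevant, k, n_particles=20, n_iters=50, seed=0):
    """Search for feature weights in [0, 1] with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(candidates[0])
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                      # personal bests
    best_fit = [fitness(p, candidates, relevant, k) for p in pos]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g][:], best_fit[g]          # global best
    inertia, c1, c2 = 0.7, 1.5, 1.5                     # assumed constants
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Clamp each weight to the [0, 1] search range.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            f = fitness(pos[i], candidates, relevant, k)
            if f > best_fit[i]:
                best_fit[i], best_pos[i] = f, pos[i][:]
                if f > g_fit:
                    g_fit, g_pos = f, pos[i][:]
    return g_pos, g_fit

# Toy data (made up): feature 0 tracks relevance, feature 1 does not.
candidates = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
relevant = {0, 1}
weights, best = pso_weights(candidates, relevant, k=2)
```

In a real system the candidates would be predicate argument structures with features extracted from the text, and the fitness would be computed against reference summaries; the toy data only shows the mechanics of the search.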

Journal Article
TL;DR: This work proposes a two-phase algorithm in which the paragraphs in the documents are first classified according to given topics and then each topic is summarized to constitute the automatically generated report.
Abstract: The expansion of online text with the rapid growth of the Internet calls for Data Mining techniques to reveal the information embedded in these documents. Text classification and text summarization are therefore two of the most important application areas. In this work, we attempt to integrate these two techniques to help the user compile and extract the information that is needed. Specifically, we propose a two-phase algorithm in which the paragraphs in the documents are first classified according to given topics and then each topic is summarized to constitute the automatically generated report.

2 citations
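
The two-phase structure described in the abstract above (classify paragraphs by topic, then summarize each topic) can be sketched as follows. The abstract does not say which classifier or summarizer is used, so this sketch swaps in simple stand-ins that are purely assumptions: keyword overlap for classification and average word frequency for sentence selection. The names `classify`, `summarize`, `build_report` and the `topic_keywords` mapping are hypothetical.

```python
import re
from collections import Counter, defaultdict

def tokens(text):
    """Lowercase alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())

def classify(paragraph, topic_keywords):
    """Phase 1 (assumed method): pick the topic sharing the most keywords.

    Ties, including zero overlap, go to the first topic in the mapping.
    """
    words = set(tokens(paragraph))
    return max(topic_keywords, key=lambda t: len(words & topic_keywords[t]))

def summarize(paragraphs, n_sentences=1):
    """Phase 2 (assumed method): keep sentences with the highest
    average word frequency within the topic."""
    sentences = [s.strip() for p in paragraphs
                 for s in re.split(r"(?<=[.!?])\s+", p) if s.strip()]
    freq = Counter(w for s in sentences for w in tokens(s))

    def avg_freq(sentence):
        ws = tokens(sentence)
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    return sorted(sentences, key=avg_freq, reverse=True)[:n_sentences]

def build_report(paragraphs, topic_keywords, n_sentences=1):
    """Group paragraphs by topic, then summarize each group."""
    by_topic = defaultdict(list)
    for p in paragraphs:
        by_topic[classify(p, topic_keywords)].append(p)
    return {t: summarize(ps, n_sentences) for t, ps in by_topic.items()}

# Toy input (made up for illustration).
topic_keywords = {"sports": {"match", "team"}, "finance": {"market", "stock"}}
paragraphs = ["The team won the match.", "The stock market fell."]
report = build_report(paragraphs, topic_keywords)
```

Keeping the two phases as separate functions means either stand-in can be replaced by a trained classifier or a stronger summarizer without changing `build_report`.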

Proceedings ArticleDOI
23 May 2013
TL;DR: An automatic summarization algorithm for Chinese documents is implemented and is reported to achieve remarkable results using the information in Word 2007 format documents.
Abstract: With the rapid development of information technology and office automation, fast and effective automatic document summarization is becoming more and more important. Therefore, based on the ICTCLAS segmentation techniques from the Institute of Computing Technology of the CAS and Microsoft's Open XML technology, this paper implements an automatic summarization algorithm for Chinese documents and obtains remarkable results using the information in Word 2007 format documents. The paper provides a feasible scheme for automatic abstracting needs in document management.

2 citations

Book ChapterDOI
01 Jan 2014
TL;DR: This chapter concentrates on studying and developing techniques for summarizing Webpages in the field of contextual advertising, the task of automatically suggesting ads within the content of a generic Webpage.
Abstract: Recently, there has been a renewed interest in automatic text summarization techniques. The Internet has caused a continuous growth of information overload, focusing attention on retrieval and filtering needs. Since digitally stored information is more and more available, users need suitable tools able to select, filter, and extract only relevant information. This chapter concentrates on studying and developing techniques for summarizing Webpages. In particular, the focus is the field of contextual advertising, the task of automatically suggesting ads within the content of a generic Webpage. Several novel text summarization techniques are proposed and compared with state-of-the-art techniques to assess whether they can be successfully applied to contextual advertising. Comparative experimental results are also reported and discussed. Results highlight the improvements of the proposals with respect to well-known text summarization techniques.

2 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52