Topic

Multi-document summarization

About: Multi-document summarization is a research topic. To date, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Book Chapter
01 Jan 2019
TL;DR: This work focuses on multi-document summarization based on a context score, using a Bernoulli model of randomness to assign an informativeness score to bigram terms based on lexical association.
Abstract: Automatic text summarization is a leading topic in information retrieval research because of the growing online transfer of information. Access to this large volume of information is limited by the constraints of memory devices and access time. Existing summarization systems use sentence extraction, in which the important sentences are extracted and presented as the summary. Many of the summarization methods in use do not take context into consideration. The proposed system focuses on multi-document summarization based on a context score. A Bernoulli model of randomness is used to assign an informativeness score to bigram terms based on lexical association. The resulting weights are then used in a graph-based iterative algorithm to generate a summary. Experiments were conducted on a self-generated set of 100 documents and on the benchmark DUC data sets. The results show that the proposed system outperforms existing methods.

2 citations
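
The graph-based iterative step mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' method: the abstract does not give the Bernoulli-based weighting, so a plain bigram-overlap (Jaccard) weight is swapped in as an assumption, and the iteration is a generic PageRank-style update. The function names `bigrams`, `rank_sentences`, and `summarize` are hypothetical.

```python
from itertools import combinations


def bigrams(sentence):
    """Return the set of adjacent word pairs in a whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration over a sentence graph.

    Edge weights are bigram Jaccard overlap (an assumption standing in for
    the Bernoulli-based informativeness score described in the abstract).
    """
    n = len(sentences)
    if n == 0:
        return []
    grams = [bigrams(s) for s in sentences]
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        union = grams[i] | grams[j]
        if union:
            w = len(grams[i] & grams[j]) / len(union)
            weights[i][j] = weights[j][i] = w
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            incoming = sum(
                weights[j][i] * scores[j] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    """Pick the k highest-scoring sentences, kept in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In this sketch a sentence that shares no bigrams with any other sentence receives only the baseline score `(1 - damping) / n`, so it is the last to be selected.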

Book Chapter
01 Jan 2018
TL;DR: The system-generated summaries are evaluated using ROUGE; the results show that the new summarizer outperforms the other summarization techniques and takes relatively little time to generate summaries compared with the other summarizers.
Abstract: In this study, we address the multi-document summarization challenge. We propose a summarizer application that implements three well-known multi-document summarization techniques: the Topic-word summarizer, the LexPageRank summarizer, and the Centroid summarizer. The contribution of this study is a fourth summarization technique that builds on the knowledge and experiments gained from these three techniques. The system-generated summaries are evaluated using ROUGE [1]. The results show that the new summarizer outperforms the other summarization techniques and takes relatively little time to generate summaries compared with the other summarizers. However, the LexPageRank summarizer scored better in evaluation than the new summarizer; the cost of that better score is time, since the LexPageRank summarizer needs a long time to generate summaries compared with the other summarizers. In this study, DUC04 is used as the corpus for implementing and testing the proposed application.

2 citations
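
As a rough illustration of the ROUGE evaluation mentioned in the abstract above, the following sketch computes ROUGE-N recall, precision, and F1 for one candidate summary against one reference. It is a simplified, single-reference version with whitespace tokenization and no stemming or stopword handling; it is not the official ROUGE toolkit, and the function names `ngram_counts` and `rouge_n` are hypothetical.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count the n-grams in a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f1) for n-gram overlap with clipped counts."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = recall + precision
    f1 = 2 * recall * precision / total if total else 0.0
    return recall, precision, f1
```

For example, `rouge_n("the cat lay on the mat", "the cat sat on the mat", n=2)` compares the bigrams of the two sentences; three of the five reference bigrams are matched.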

Proceedings Article
08 Dec 2016
TL;DR: This paper proposes a combination of similarity measures for sentence comparison; the resulting model significantly outperforms recent work on English and achieves competitive results on Vietnamese.
Abstract: The key task in extractive summarization is to determine the importance of each sentence in the input. Several recent studies have focused on comparing the similarity between sentences to assess their significance efficiently. Each comparison method has its strengths and weaknesses. In this paper, we propose a combination of similarity measures for sentence comparison. Experiments conducted on both English and Vietnamese datasets demonstrate the efficiency of the proposed approach. Our model outperforms recent work on English by a significant margin (9.4 ROUGE-2 F1-score) and achieves competitive results on Vietnamese.

2 citations
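
One way to picture combining similarity measures for sentence comparison is a weighted sum of two standard measures. The abstract above does not say which measures the authors combine or how, so the choice of cosine similarity over term counts, Jaccard similarity over word sets, and the mixing weight `alpha` are all assumptions made for illustration; the function names are hypothetical.

```python
import math
from collections import Counter


def cosine_sim(tokens_a, tokens_b):
    """Cosine similarity between term-frequency vectors of two token lists."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def jaccard_sim(tokens_a, tokens_b):
    """Jaccard similarity between the word sets of two token lists."""
    sa, sb = set(tokens_a), set(tokens_b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def combined_sim(sentence_a, sentence_b, alpha=0.5):
    """Weighted sum of cosine and Jaccard similarity (alpha is an assumed weight)."""
    a = sentence_a.lower().split()
    b = sentence_b.lower().split()
    return alpha * cosine_sim(a, b) + (1 - alpha) * jaccard_sim(a, b)
```

The combined score can then stand in for a single similarity measure wherever sentences are compared, for instance as the edge weight in a sentence graph.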

Journal Article
TL;DR: Existing methods and the state of the art in automatic summarisation systems, drawn from recent articles, are discussed, along with the achievements and challenges involved.
Abstract: Information is knowledge if it is rightly applied. Information is stored in different formats in databases, but retrieving it from different documents has been a challenge. People want ready-made information for decision making in minimal time and therefore crave summaries of information. Automatic summarization helps in mining data and delivering timely and cogent information to users. These systems attempt to address the issue of data mining using different summarization methods. This paper discusses existing methods and the state of the art in automatic summarisation systems, drawn from recent articles. The achievements and challenges involved are also discussed.

2 citations

Journal Article
TL;DR: This article presents a dynamic pattern-driven approach to summarizing social network content and topology via pattern utilities and ranking (SPUR), and describes variants that take the implicit graph of connections into account to realize the graph-based SPUR variant (G-SPUR).
Abstract: The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. The data can be classified into two categories: the textual content written by the users and the topological structure of the connections among users. Real-time analytics on such data is challenging, with most current efforts focusing largely on the efficient querying and retrieval of recently produced data. In this article, we present a dynamic pattern-driven approach to summarizing social network content and topology. The resulting family of algorithms relies on the common principles of summarization via pattern utilities and ranking (SPUR). SPUR and its dynamic variant (D-SPUR) rely on an in-memory summary while retaining sufficient information to facilitate a range of user-specific and topic-specific temporal analytics. We then describe variants that take the implicit graph of connections into account to realize the graph-based SPUR variant (G-SPUR). Finally, we describe scalable algorithms for implementing these ideas on commercial GPU-based systems. We examine the effectiveness of the summarization approaches along the axes of storage cost, query accuracy, and efficiency using real data from Twitter.

2 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52