Topic
Multi-document summarization
About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.
Papers
••
01 Jan 2019
TL;DR: This work focuses on multi-document summarization based on a context score; a Bernoulli model of randomness is used to provide an informative score for bi-gram terms based on lexical association.
Abstract: Automatic text summarization is a leading topic of information retrieval research due to the increasing online transfer of information. Access to this large volume of information is limited by the constraints of memory devices and access time. Existing summarization systems use the sentence extraction technique, in which the important sentences are extracted and presented as a summary. Many of these summarization methods do not take context into consideration. The proposed system focuses on multi-document summarization based on a context score. A Bernoulli model of randomness is used to provide an informative score for bi-gram terms based on lexical association. The resulting weight is then used in a graph-based iterative algorithm to generate a summary. Experiments have been conducted on a self-generated set of 100 documents and on the benchmark DUC data sets. The proposed system is shown to outperform the existing methods.
2 citations
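The abstract above feeds bi-gram weights into a graph-based iterative algorithm. The following is a minimal sketch of that general idea, not the authors' method: the edge weight is a plain Jaccard overlap of bi-gram sets standing in for the Bernoulli-based context score (which the abstract does not specify), and the names `bigrams`, `edge_weight`, `rank_sentences`, and `summarize` are hypothetical.

```python
import re
from itertools import combinations


def bigrams(sentence):
    """Return the set of adjacent word pairs in a sentence."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return set(zip(tokens, tokens[1:]))


def edge_weight(a, b):
    # Assumption: Jaccard overlap of bi-gram sets, used here only as a
    # stand-in for the paper's Bernoulli-based context score.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration on a weighted graph."""
    n = len(sentences)
    if n == 0:
        return []
    grams = [bigrams(s) for s in sentences]
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        weights[i][j] = weights[j][i] = edge_weight(grams[i], grams[j])
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours in proportion
        # to the normalized edge weight, plus a uniform teleport term.
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Pick the k highest-scoring sentences, kept in document order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In this sketch a sentence that shares no bi-grams with any other sentence keeps only the teleport score, so it ranks below sentences that are connected in the graph.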
••
01 Jan 2018
TL;DR: The system-generated summaries are evaluated using ROUGE; results showed that the new summarizer outperforms the other summarization techniques and takes a relatively short time to generate summaries compared to other summarizers.
Abstract: In this study, we address the multi-document summarization challenge. We propose a summarizer application that implements three well-known multi-document summarization techniques: the Topic-word summarizer, the LexPageRank summarizer, and the Centroid summarizer. The contribution of this study is a fourth summarization technique built on the knowledge acquired from, and the experiments performed on, these three techniques. The system-generated summaries are evaluated using ROUGE [1]; results showed that the new summarizer outperforms the other summarization techniques and takes a relatively short time to generate summaries compared to other summarizers. However, the LexPageRank summarizer scored better than the new summarizer in the evaluation. The cost of that better score was the time needed to generate the summaries: the LexPageRank summarizer needs a long time to generate summaries compared to other summarizers. In this study, DUC04 is used as the corpus for implementing and testing the proposed application.
2 citations
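Since the entry above relies on ROUGE for evaluation, a simplified sketch of ROUGE-N against a single reference may help make the metric concrete. It assumes whitespace tokenization with lowercasing and no stemming or stop-word removal, so its numbers will not match the official ROUGE toolkit; the function names `ngram_counts` and `rouge_n` are hypothetical.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count the n-grams of a whitespace-tokenized, lowercased text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=2):
    """Return ROUGE-N recall, precision, and F1 for one reference summary."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    # Counter intersection keeps the minimum count of each shared n-gram,
    # which is the clipped overlap that ROUGE-N uses.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)` shares three of the five reference bi-grams, giving a recall of 0.6.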
••
08 Dec 2016
TL;DR: This paper proposes a combination of similarity measures for sentence comparison; the model outperforms recent work in English with a significant improvement and achieves a competitive result in Vietnamese.
Abstract: The key task in extractive summarization is to determine the importance of each sentence in the input. Several recent studies have focused on comparing the similarity between sentences to assess their significance efficiently. Each comparison method has its strengths and weaknesses. In this paper, we propose a combination of similarity measures for sentence comparison. Experiments conducted on both English and Vietnamese datasets demonstrate the efficiency of our proposed approach. Our model outperforms recent work in English with a significant improvement (9.4 ROUGE-2 F1-score) and achieves a competitive result in Vietnamese.
2 citations
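The abstract above does not say which similarity measures are combined or how. As a rough illustration of the general idea only, the sketch below takes a weighted average of two common measures, cosine similarity over term frequencies and Jaccard similarity over word sets. The choice of measures, the equal default weights, and the function names are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine_tf(a_tokens, b_tokens):
    """Cosine similarity between raw term-frequency vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a_tokens, b_tokens):
    """Jaccard similarity between the sets of distinct words."""
    a, b = set(a_tokens), set(b_tokens)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(s1, s2, weights=(0.5, 0.5)):
    # Assumption: a simple weighted average; the paper's actual
    # combination scheme is not given in the abstract.
    t1, t2 = s1.lower().split(), s2.lower().split()
    w_cos, w_jac = weights
    total = w_cos + w_jac
    if total == 0:
        return 0.0
    return (w_cos * cosine_tf(t1, t2) + w_jac * jaccard(t1, t2)) / total
```

Combining measures in this way lets one measure compensate where another is weak, for instance when repeated words inflate the cosine score but leave the Jaccard score unchanged.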
••
TL;DR: Existing methods and the state of the art in automatic summarisation systems from recent articles are discussed, along with the achievements and challenges involved.
Abstract: Information is knowledge if it is rightly applied. Information is stored in different formats in databases, but retrieving it from different documents has been a challenge. People want ready-made information for decision making in minimal time and therefore crave summaries of information. Automatic summarization helps in mining data and delivering timely and cogent information to users. These systems attempt to address the issue of data mining using different summarization methods. This paper discusses existing methods and the state of the art in automatic summarisation systems from recent articles. The achievements and challenges involved are also discussed.
2 citations
•
TL;DR: This article presents a dynamic pattern driven approach to summarize social network content and topology via pattern utilities and ranking (SPUR), and describes variants that take the implicit graph of connections into account to realize the Graph-based SPUR variant (G-SPUR).
Abstract: The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. The data can be classified into two categories: the textual content written by the users and the topological structure of the connections among users. Real-time analytics on such data is challenging, with most current efforts largely focusing on the efficient querying and retrieval of data produced recently. In this article, we present a dynamic pattern driven approach to summarize social network content and topology. The resulting family of algorithms relies on the common principles of summarization via pattern utilities and ranking (SPUR). SPUR and its dynamic variant (D-SPUR) rely on an in-memory summary while retaining sufficient information to facilitate a range of user-specific and topic-specific temporal analytics. We then follow up by describing variants that take the implicit graph of connections into account to realize the Graph-based SPUR variant (G-SPUR). Finally, we describe scalable algorithms for implementing these ideas on commercial GPU-based systems. We examine the effectiveness of the summarization approaches along the axes of storage cost, query accuracy, and efficiency using real data from Twitter.
2 citations