Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Book Chapter
01 Jan 2021
TL;DR: In this article, the authors proposed Abstractive Text Summarization using Deep Learning with Attention Mechanism, which removes duplicate data and generates new sentences by rephrasing them or adding words originally absent in the source text.
Abstract: With the advancement of the Internet, many people now depend mostly on the Web to get the information they need. As data is increasing exponentially, there is a high chance of duplication; manually reading all the documents, rejecting the duplicates, and extracting the useful information is difficult and tedious. One solution to this issue is “Text Summarization,” through which a huge volume of data can be read quickly; but summarizing documents manually is very hard, which necessitates an automatic tool for the task. Abstractive Text Summarization is one such automated technique for producing a short and accurate summary of a document while preserving its essential information and overall meaning. In this paper, Abstractive Text Summarization using Deep Learning with Attention Mechanism is proposed. The designed framework removes duplicate data and generates new sentences by rephrasing them or adding words originally absent from the source text. Experimental results on the Amazon Fine Food Review dataset are evaluated using performance metrics such as ROUGE scores.

1 citations
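
The ROUGE scores mentioned in the abstract above compare a generated summary with a reference summary by n-gram overlap. The following is a minimal sketch of ROUGE-N recall only, assuming simple lowercase word tokenization; the function names `ngrams` and `rouge_n_recall` are illustrative and are not taken from the paper or from any ROUGE toolkit.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Count the word n-grams of a text (lowercased, punctuation dropped)."""
    toks = re.findall(r"\w+", text.lower())
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate.

    Matches are clipped: an n-gram counts at most as often as it
    occurs in the candidate summary.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the food was great and fresh"
    candidate = "the food was fresh"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

For this toy pair, four of the six reference unigrams and two of the five reference bigrams are matched. Published evaluations usually also report precision and F-measure and may apply stemming, which this sketch omits.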

Journal Article
TL;DR: The design of generic text summarization model based on sentence extraction has been redirected into a more semantic measure reflecting individually the two significant objectives: content coverage and diversity when generating summaries from multiple documents as an explicit optimization model.
Abstract: Currently, the prominence of the automatic multi-document summarization task stems from the rapid growth of information on the Internet. Automatic document summarization technology is progressing and may offer a solution to the problem of information overload. An automatic text summarization system faces the challenge of producing a high-quality summary. In this study, the design of a generic text summarization model based on sentence extraction has been redirected into a more semantic measure reflecting individually the two significant objectives, content coverage and diversity, when generating summaries from multiple documents as an explicit optimization model. The two proposed models have then been coupled and defined as a single-objective optimization problem. Also, to improve the performance of the proposed model, different integrations of two similarity measures have been introduced and applied to the proposed model, along with the single similarity measures that are based on using Cosine, Dice and similarity measures for measuring text similarity. To solve the proposed model, a Genetic Algorithm (GA) has been used. Document sets supplied by the Document Understanding Conference 2002 (DUC 2002) have been used as the evaluation dataset for the proposed system. Also, as an evaluation metric, the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit has been used for performance evaluation of the proposed method. Experimental results have illustrated the positive impact of measuring text similarity using double integration of similarity measures against a single similarity measure when applied to the proposed model, wherein the best performance in terms of and has been recorded for the integration of Cosine similarity and similarity.

1 citations
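
The Cosine and Dice measures named in the abstract above can be sketched as follows. This is a minimal illustration, assuming bag-of-words term counts for Cosine and word sets for Dice. The abstract does not say how two measures are integrated, so `combined_similarity` and its `weight` parameter are purely hypothetical (a weighted average), not the authors' formulation.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased word tokens; a deliberately simple tokenizer."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two texts."""
    ta, tb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ta[w] * tb[w] for w in ta.keys() & tb.keys())
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    return dot / norm if norm else 0.0


def dice_similarity(a, b):
    """Dice coefficient over word sets: 2|A n B| / (|A| + |B|)."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def combined_similarity(a, b, weight=0.5):
    """Hypothetical integration: a weighted average of the two measures."""
    return weight * cosine_similarity(a, b) + (1 - weight) * dice_similarity(a, b)


if __name__ == "__main__":
    s1 = "summaries cover content"
    s2 = "summaries cover topics"
    print(cosine_similarity(s1, s2), dice_similarity(s1, s2))
    print(combined_similarity(s1, s2))
```

In a coverage-and-diversity setting, a measure like this would typically score how close a candidate sentence is to the document set (coverage) and to already selected sentences (redundancy); the optimization itself is not shown here.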

Proceedings Article
23 Mar 2022
TL;DR: In this paper, various techniques used to extract useful information are discussed at length; the central objective is to highlight the issues or drawbacks of each technique, try to resolve them, and select one method that could be used to summarize textual content.
Abstract: The Internet is home to an indeterminate and ever-growing body of information. Text summarization has become a critical requirement for effectively managing this overload of data. The technique of retrieving relevant information from various unorganized sources is commonly known as text mining. Multiple methods, such as the term-based approach and the phrase-based approach, are applied to structure data. This research paper discusses at length numerous techniques that are used to extract useful information. The central objective is to highlight the issues or drawbacks of each technique, try to resolve them, and select one method that could be used to summarize textual content. The algorithm used for text summarization is SumBasic, a text summarization process that creates multi-document summaries. Its primary concept is that, when producing a summary, terms that occur frequently in a document should be prioritized above less frequently appearing words.

1 citations
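
The frequency-based idea described in the abstract above can be sketched as a simplified SumBasic loop: score each sentence by the average probability of its words, pick the best one, then square the probabilities of the words just used so that later picks favour new content. This is a minimal sketch under stated assumptions, not the paper's implementation: it skips stop-word removal and omits the original algorithm's rule that the chosen sentence must contain the currently most probable word. The names `sumbasic` and `max_sentences` are illustrative.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercased word tokens; no stop-word filtering in this sketch."""
    return re.findall(r"\w+", text.lower())


def sumbasic(sentences, max_sentences=2):
    """Greedy frequency-based selection in the spirit of SumBasic."""
    tokenized = [tokenize(s) for s in sentences]
    counts = Counter(w for toks in tokenized for w in toks)
    total = sum(counts.values())
    # Word probability = relative frequency across all input sentences.
    prob = {w: c / total for w, c in counts.items()}

    chosen = []
    candidates = [i for i, toks in enumerate(tokenized) if toks]
    while candidates and len(chosen) < max_sentences:
        # Sentence score = average probability of its words.
        best = max(
            candidates,
            key=lambda i: sum(prob[w] for w in tokenized[i]) / len(tokenized[i]),
        )
        chosen.append(best)
        candidates.remove(best)
        # Down-weight words already covered to discourage redundancy.
        for w in set(tokenized[best]):
            prob[w] **= 2
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat ate the fish",
        "dogs bark loudly",
    ]
    print(sumbasic(docs, max_sentences=2))
```

On this toy input the second pick is the unrelated third sentence rather than the other "cat" sentence, because the probabilities of "the" and "cat" were squared after the first pick.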

Proceedings Article
25 Jan 2023
TL;DR: In this paper, the authors investigated the performance of two unsupervised methods, Latent Semantic Analysis (LSA) and Maximal Marginal Relevance (MMR), in the summarization of Persian broadcast news.
Abstract: The methods of automatic speech summarization are classified into two groups: supervised and unsupervised methods. Supervised methods are based on a set of features, while unsupervised methods perform summarization based on a set of rules. Latent Semantic Analysis (LSA) and Maximal Marginal Relevance (MMR) are considered the most important and well-known unsupervised methods in automatic speech summarization. This study set out to investigate the performance of these two unsupervised methods in summarizing transcriptions of Persian broadcast news. The results show that LSA outperforms MMR in generic summarization, while MMR outperforms LSA in query-based summarization of broadcast news.

1 citations
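
Maximal Marginal Relevance, one of the two methods compared in the abstract above, greedily picks the sentence that best balances relevance to a query against similarity to sentences already selected. The sketch below assumes bag-of-words cosine similarity and English toy sentences; the study itself works on Persian speech transcripts, and its features and settings are not given here. The names `mmr_select`, `k`, and `lam` are illustrative.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a text."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedy MMR: lam * relevance - (1 - lam) * max similarity to picks."""
    vecs = [bow(s) for s in sentences]
    q = bow(query)
    selected = []
    remaining = list(range(len(sentences)))
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * cosine(vecs[i], q) - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [sentences[i] for i in selected]


if __name__ == "__main__":
    news = [
        "election results announced today",
        "election results announced this morning",
        "weather forecast predicts rain",
    ]
    print(mmr_select(news, "election results", k=2, lam=0.7))
    print(mmr_select(news, "election results", k=2, lam=0.3))
```

With `lam=0.7` relevance dominates and both election sentences are chosen; with `lam=0.3` the redundancy penalty dominates and the second pick becomes the weather sentence.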

01 Jan 2004
TL;DR: A method to calculate sentence importance using scores produced by a Question-Answering engine in response to multiple questions and an integration of it into a generic multi-document summarization system is described.
Abstract: In recent years, answer-focused summarization has attracted attention as a technology complementary to information retrieval and question answering. To realize multi-document summarization focused on multiple questions, we propose a method that calculates sentence importance using scores produced by a Question-Answering engine in response to those questions. We also describe its integration into a generic multi-document summarization system. The evaluation results show that the proposed method performs better than not only several baselines but also other participants’ systems in the NTCIR4 TSC3 Formal Run evaluation workshop, although it should be noted that some of the other systems do not use the question information.

1 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52