scispace - formally typeset

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
13 Jun 2005
TL;DR: Experimental results indicate that the proposed approach effectively produces concise spoken-sentence summaries containing important words with semantic dependency.
Abstract: For the purpose of wireless data transmission, spoken document summarization can efficiently reduce redundant content. This study presents a voice-activated spoken document summarization and retrieval scheme using text and speech analysis. In this method, prosody, speech recognition confidence, word significance, word trigrams, and semantic dependency are combined in the summarization score. A dynamic programming algorithm is used to find the best summarization result. Experimental results indicate that the proposed approach effectively produces concise spoken-sentence summaries containing important words with semantic dependency.

1 citation
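The dynamic-programming selection step described above can be illustrated with a minimal knapsack-style sketch: choosing the subset of sentences that maximizes a combined importance score under a word budget. The sentence scores here are placeholders; the paper's actual score combines prosody, recognition confidence, word significance, trigram, and semantic-dependency features.

```python
def summarize_dp(sentences, scores, budget):
    """Select sentences maximizing total score under a word budget
    via 0/1-knapsack dynamic programming (illustrative sketch)."""
    n = len(sentences)
    lengths = [len(s.split()) for s in sentences]
    # dp[i][b] = best total score using the first i sentences within budget b
    dp = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = lengths[i - 1], scores[i - 1]
        for b in range(budget + 1):
            dp[i][b] = dp[i - 1][b]
            if w <= b and dp[i - 1][b - w] + v > dp[i][b]:
                dp[i][b] = dp[i - 1][b - w] + v
    # Backtrack to recover the chosen sentences, preserving document order.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if dp[i][b] != dp[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    return [sentences[i] for i in reversed(chosen)]
```

With a budget of 5 words, the two highest-value short sentences are preferred over one long one.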

01 Jan 2011
TL;DR: A novel algorithm called TriangleSum is proposed for single-document summarization based on graph theory; it builds a dependency graph for the document from syntactic dependency analysis and identifies triangles of nodes that capture the main document information.
Abstract: Document summarization is a technique aimed at automatically extracting the main ideas from electronic documents. With the fast increase of electronic documents available on the network, techniques for making efficient use of them become increasingly important. In this paper, we propose a novel algorithm, called TriangleSum, for single-document summarization based on graph theory. The algorithm builds a dependency graph for the document based on syntactic dependency analysis. Nodes represent high-frequency words or phrases, and edges represent dependency relations between them. A modified version of the clustering coefficient is then used to measure the strength of connection between nodes in the graph. By identifying triangles of nodes, a part of the dependency graph can be extracted. Finally, a set of key sentences representing the main document information is extracted.

1 citation
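The triangle-identification step can be sketched as plain 3-clique enumeration over a word graph. The edge list below is hypothetical; TriangleSum derives its edges from syntactic dependency analysis.

```python
def find_triangles(edges):
    """Enumerate triangles (3-cliques) in an undirected graph given as
    a list of (u, v) edges -- a sketch of the triangle-extraction step."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    triangles = set()
    for u, v in edges:
        # Any common neighbour of an edge's endpoints closes a triangle.
        for w in adj[u] & adj[v]:
            triangles.add(tuple(sorted((u, v, w))))
    return sorted(triangles)
```

Nodes that participate in many triangles are the densely connected ones a clustering-coefficient measure would favor.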

01 Jan 2014
TL;DR: An integrated approach is proposed that overcomes the drawback of existing cluster-based methods, which do not rank sentences expressing the same meaning through different terms, and aims to improve clustering results over existing methods.
Abstract: Multi-document summarization is an automatic procedure aimed at extracting information from multiple texts written about the same topic; the output is a paragraph-length summary. Multi-document summarization is very useful for presenting and organizing search results. Existing cluster-based summarization approaches use clustering results to select representative sentences and generate summaries, but they do not rank sentences that express the same meaning with different words or terms across the document set. We propose an integrated approach that overcomes this drawback by providing such a ranking. The proposed approach aims to improve clustering results compared with existing methods.

1 citation
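The cluster-then-select idea behind such approaches can be sketched with a greedy single-pass clustering over term-frequency vectors. This is a simplification under assumed inputs; it does not implement the paper's synonym-aware ranking.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    num = sum(a[t] * b[t] for t in a if t in b)
    da = sqrt(sum(v * v for v in a.values()))
    db = sqrt(sum(v * v for v in b.values()))
    return num / (da * db) if da and db else 0.0

def cluster_representatives(sentences, threshold=0.5):
    """Greedy single-pass clustering of sentences by cosine similarity;
    returns one representative per cluster (illustrative sketch of the
    cluster-then-select idea, not the paper's exact method)."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    reps = []  # (index, vector) of each cluster representative
    for i, v in enumerate(vecs):
        # Start a new cluster only if v is unlike every representative so far.
        if all(cosine(v, rv) < threshold for _, rv in reps):
            reps.append((i, v))
    return [sentences[i] for i, _ in reps]
```

Near-duplicate sentences collapse onto one representative; dissimilar ones start new clusters.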

Book ChapterDOI
24 Nov 2015
TL;DR: An extractive Maximum Coverage KnaPsack (MCKP)-based model is proposed for query-based multi-document summarization, integrating three monotone and submodular measures of sentence importance: Coverage, Relevance, and Compression.
Abstract: In this paper, we propose an extractive Maximum Coverage KnaPsack (MCKP)-based model for query-based multi-document summarization which integrates three monotone and submodular measures of sentence importance: Coverage, Relevance, and Compression. We apply an efficient, scalable greedy algorithm to generate a summary, which has a near-optimal solution when its scoring functions are monotone nondecreasing and submodular. We use the DUC 2007 dataset to evaluate the proposed method, and the results show improvement over two closely related works.

1 citation
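Greedy selection for a budgeted monotone submodular objective can be sketched with a cost-benefit coverage heuristic: repeatedly pick the sentence with the best marginal concept coverage per unit cost. The concept sets and costs below are hypothetical; the paper's objective combines Coverage, Relevance, and Compression.

```python
def greedy_coverage(concepts, costs, budget):
    """Cost-benefit greedy for budgeted maximum coverage. Coverage is
    monotone submodular, so this style of greedy enjoys a constant-factor
    approximation guarantee (illustrative sketch of MCKP-style selection)."""
    covered, chosen, spent = set(), [], 0
    remaining = set(range(len(concepts)))
    while remaining:
        best, best_ratio = None, 0.0
        for i in remaining:
            if spent + costs[i] > budget:
                continue  # would exceed the summary-length budget
            marginal = len(concepts[i] - covered)  # newly covered concepts
            ratio = marginal / costs[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # nothing affordable adds new coverage
        chosen.append(best)
        covered |= concepts[best]
        spent += costs[best]
        remaining.discard(best)
    return chosen
```

With budget 4, the greedy takes the cheap high-coverage items and skips the one that no longer fits.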

Book ChapterDOI
01 Jan 2023
TL;DR: In this paper, the authors propose a Fuzzy Bi-GRU model for automatic text summarization (ATS) that extracts the most useful and relevant information from large document collections.
Abstract: As a massive amount of information is produced on the internet nowadays, extracting the most useful and relevant content from it is an attractive research problem, addressed by automatic text summarization (ATS). Summarization is classified into single-document and multi-document variants based on the number of source documents; multi-document summarization, where multiple sources convey similar information, is the bigger challenge in ATS. This motivates us to work on long multi-document inputs by computing sentence scores with a fuzzy inference system. Redundancy among the extracted sentences is then removed with a Bi-GRU, and an abstractive summary is generated from the remaining sentences. The proposed system is validated and tested on standard datasets, namely DUC, BBC News, and CNN/Daily Mail. The proposed Fuzzy Bi-GRU is compared with other cutting-edge models, and empirical results indicate that it outperforms them in terms of ROUGE-N and ROUGE-L scores.

1 citation
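The fuzzy-inference scoring step can be sketched with two hypothetical Mamdani-style rules over normalized sentence features. The membership functions, rules, and output centroids here are illustrative assumptions, not the paper's system.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_sentence_score(tf_norm, pos_norm):
    """Fuzzify two normalized features (term frequency, sentence position),
    fire two hypothetical rules, and defuzzify by weighted average of the
    rule outputs' centroids -- a tiny Mamdani-style sketch."""
    high_tf = tri(tf_norm, 0.4, 1.0, 1.6)   # "term frequency is high"
    early = tri(pos_norm, -0.6, 0.0, 0.6)   # "sentence appears early"
    # Rule 1: high tf -> important (output centroid 0.9)
    # Rule 2: early position -> important (output centroid 0.7)
    strengths = [high_tf, early]
    centroids = [0.9, 0.7]
    total = sum(strengths)
    return sum(s * c for s, c in zip(strengths, centroids)) / total if total else 0.0
```

A sentence with maximal term frequency at the document start fires both rules fully and scores the average of the two centroids.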


Network Information
Related Topics (5)
- Natural language: 31.1K papers, 806.8K citations (85% related)
- Ontology (information science): 57K papers, 869.1K citations (84% related)
- Web page: 50.3K papers, 975.1K citations (83% related)
- Recurrent neural network: 29.2K papers, 890K citations (83% related)
- Graph (abstract data type): 69.9K papers, 1.2M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years

Year  Papers
2023  74
2022  160
2021  52
2020  61
2019  47
2018  52