Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
01 Dec 2009
TL;DR: A generic summarization method that uses cluster refinement and non-negative matrix factorization (NMF) is introduced to extract meaningful sentences from documents; a weighted semantic variable is used to select sentences so that the extracted sentences cover the major topics of the documents.
Abstract: In this paper, a generic summarization method that uses cluster refinement and NMF is introduced to extract meaningful sentences from documents. The proposed method uses cluster refinement to improve the quality of document clustering, since it makes it easy to remove dissimilar information and prevents the biased inherent semantics of the documents from being reflected in the clusters produced by NMF. In addition, it uses a weighted semantic variable to select meaningful sentences, so that the extracted sentences cover the major topics of the documents well. Experimental results demonstrate that the proposed method outperforms comparable methods.

3 citations
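The selection step described in the abstract above can be sketched in miniature. This is an illustrative reconstruction, not the authors' code: it factors a term-sentence matrix with a small multiplicative-update NMF and scores each sentence by a weighted semantic variable (its topic weights in H, weighted by each topic's overall strength in W). All function names and the tiny example matrix are our own assumptions.

```python
# Illustrative sketch (not the authors' implementation): NMF-based sentence
# selection. A is a term-by-sentence count matrix; we factor A ~ W.H and
# score sentence j by sum_t weight(topic t) * H[t][j].
import random

def matmul(X, Y):
    # naive matrix product over nested lists
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(r) for r in zip(*X)]

def nmf(A, k, iters=200, eps=1e-9):
    # classic multiplicative-update NMF (Lee & Seung style)
    random.seed(0)
    m, n = len(A), len(A[0])
    W = [[random.random() for _ in range(k)] for _ in range(m)]
    H = [[random.random() for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        WT = transpose(W)
        num, den = matmul(WT, A), matmul(matmul(WT, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)] for i in range(k)]
        HT = transpose(H)
        num, den = matmul(A, HT), matmul(W, matmul(H, HT))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(m)]
    return W, H

def select_sentences(A, k, top):
    W, H = nmf(A, k)
    topic_weight = [sum(col) for col in transpose(W)]  # overall strength of each topic
    scores = [sum(topic_weight[t] * H[t][j] for t in range(k))
              for j in range(len(A[0]))]               # weighted semantic variable
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:top]
```

The score of sentence j approximates the column mass of the reconstruction W·H, so sentences carrying the most topic-weighted term mass rank first.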

Journal ArticleDOI
TL;DR: The results show that GTASum outperforms many extractive and abstractive approaches in terms of ROUGE scores, and an ablation study shows that the model captures the original subject and correct information, improving the factual accuracy of the summarization.
Abstract: The purpose of text summarization is to compress a text document into a summary containing its key information. Abstractive approaches are challenging: a mechanism is needed to effectively extract salient information from the source text and then generate a summary. However, most existing abstractive approaches struggle to capture global semantics, ignoring the impact of global information on obtaining important content. To address this problem, this paper proposes a Graph-Based Topic-Aware Abstractive Text Summarization (GTASum) framework. Specifically, GTASum seamlessly incorporates a neural topic model to discover latent topic information, which provides document-level features for generating summaries. In addition, the model integrates a graph neural network that effectively captures the relationships between sentences through a graph-structured document representation, updating local and global information simultaneously. Further analysis shows that latent topics help the model capture salient content. We conducted experiments on two datasets, and the results show that GTASum is superior to many extractive and abstractive approaches in terms of ROUGE scores. An ablation study shows that the model captures the original subject and correct information, improving the factual accuracy of the summarization.

3 citations
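GTASum's graph component is a neural network and cannot be reproduced here, but the underlying idea that graph structure over sentences surfaces globally central content can be shown with a classical stand-in: a TextRank-style power iteration over a sentence-similarity graph. This is explicitly a simplified analogue, not the GTASum model; the similarity measure and all names are our own choices.

```python
# Classical stand-in (NOT the GTASum neural model): score sentences by
# centrality in a similarity graph via damped power iteration, the same
# intuition of letting global graph structure decide salience.

def similarity(a, b):
    # Jaccard overlap of word sets, a crude similarity proxy
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / (len(wa | wb) or 1)

def rank_sentences(sentences, damping=0.85, iters=50):
    n = len(sentences)
    sim = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):
        # each sentence receives score from its neighbours, normalized by
        # the neighbour's total outgoing similarity
        scores = [(1 - damping) / n + damping *
                  sum(sim[j][i] * scores[j] / (sum(sim[j]) or 1) for j in range(n))
                  for i in range(n)]
    return scores
```

A sentence sharing vocabulary with many others accumulates score; an off-topic sentence with no edges stays at the baseline (1 - damping) / n.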

Book ChapterDOI
13 Dec 2004
TL;DR: The results of the experiment show that relevance feedback effectively improves the performance of automatic fractal summarization.
Abstract: As a result of the recent information explosion, there is an increasing demand for automatic summarization, and human abstractors often synthesize summaries based on sentences that have been extracted by machine. However, the quality of machine-generated summaries is not high. As with other information retrieval systems, the precision of automatic summarization can be improved by user relevance feedback, in which the human abstractor directs the sentence extraction process so that useful information is retrieved efficiently. Automatic summarization with relevance feedback is thus a helpful tool for assisting professional abstractors in generating summaries, and in this work we propose a relevance feedback model for fractal summarization. Experimental results show that relevance feedback effectively improves the performance of automatic fractal summarization.

3 citations
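The feedback loop described above can be illustrated with a generic Rocchio-style update, which is a standard relevance-feedback mechanism and not the authors' fractal model: sentences the abstractor marks relevant pull the query vector toward them, and the remaining sentences are re-ranked against the updated query. All names and weights here are our own.

```python
# Generic Rocchio-style relevance feedback (illustrative; not the paper's
# fractal summarization model): q' = alpha*q + beta*mean(relevant sentences),
# then re-rank sentences by cosine similarity to q'.
from collections import Counter

def vec(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def feedback_rerank(query, sentences, relevant_idx, alpha=1.0, beta=0.75):
    q = Counter({t: alpha * c for t, c in vec(query).items()})
    for i in relevant_idx:  # move the query toward user-approved sentences
        for t, c in vec(sentences[i]).items():
            q[t] += beta * c / len(relevant_idx)
    return sorted(range(len(sentences)),
                  key=lambda i: cosine(q, vec(sentences[i])), reverse=True)
```

After one round of feedback, sentences sharing vocabulary with the approved ones rise in the ranking even when they match the original query only weakly.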

Proceedings ArticleDOI
28 Mar 2011
TL;DR: A concept of hierarchical topics is proposed for the multi-document summarization task, using a multi-layer topic-tree structure to represent the text set; the structure can accurately describe the similarity between sentences at different levels of granularity.
Abstract: A concept of hierarchical topics is proposed for the multi-document automatic summarization task, using a multi-layer topic-tree structure to represent the text set. Each node in the topic tree represents a specific topic and contains multiple similar sentences from the text set. The hierarchical topic structure can accurately describe the similarity between sentences at different levels of granularity; it therefore reflects the real content of the text set better than a single-layer topic set, and can be used to find the important sentences in the important topics that compose the summary. Concretely, a series of algorithms is proposed, covering construction of the hierarchical topic tree, key-sentence extraction based on the tree, and summary generation. The summarization system is evaluated in a series of experiments and shows good results.

3 citations
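One simple way to realize the multi-layer topic tree sketched above is greedy agglomerative merging: start with one leaf per sentence and repeatedly merge the two most similar nodes, so inner nodes become progressively broader topics. This is our own illustrative construction under that assumption, not the paper's algorithm; the single-linkage-style choice and names are ours.

```python
# Illustrative sketch (not the paper's algorithm): build a binary topic tree
# bottom-up. Each node is (word_set, subtree); leaves carry sentence indices,
# inner nodes are topics covering their descendants.

def jaccard(a, b):
    return len(a & b) / (len(a | b) or 1)

def build_topic_tree(sentences):
    nodes = [(set(s.lower().split()), i) for i, s in enumerate(sentences)]
    while len(nodes) > 1:
        # greedily merge the most similar pair of current nodes
        i, j = max(((a, b) for a in range(len(nodes))
                    for b in range(a + 1, len(nodes))),
                   key=lambda p: jaccard(nodes[p[0]][0], nodes[p[1]][0]))
        merged = (nodes[i][0] | nodes[j][0], (nodes[i][1], nodes[j][1]))
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][1]
```

Sentences about the same subject end up under a common low-level node, while unrelated material only joins near the root, giving coarse-to-fine granularity for sentence selection.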

Book ChapterDOI
18 Aug 2010
TL;DR: This work improves summary generation by incorporating semantic information into the machine-learning process, and finds that the improvements are more significant when query-term occurrences in the document are relatively low.
Abstract: A query-biased summary is a brief, query-centered representation of a document. In many scenarios, query-biased summarization can be accomplished by query-customized ranking of the sentences within a web page. However, generating such a summary is difficult, because it is hard to assess the similarity between the query and the sentences of a particular document, given the lack of information and background knowledge behind these short texts. We focus on this problem and improve summary generation by incorporating semantic information into the machine-learning process. We find that these improvements are more significant when query-term occurrences in the document are relatively low.

3 citations
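The problem the abstract describes, that plain query-term overlap fails when the query terms are rare in the document, can be made concrete with a toy sketch: an overlap baseline plus a synonym table standing in for semantic knowledge. The half-weight for semantic matches, the synonym table, and all names are our own assumptions, not the paper's feature set.

```python
# Illustrative sketch (not the paper's method): query-biased sentence ranking.
# Direct query-term matches count fully; "semantic" matches via a toy synonym
# table count at half weight, standing in for richer semantic features.

def query_biased_rank(query, sentences, synonyms=None):
    synonyms = synonyms or {}
    terms = set(query.lower().split())
    expanded = terms | {s for t in terms for s in synonyms.get(t, ())}
    def score(sent):
        words = set(sent.lower().split())
        return len(terms & words) + 0.5 * len((expanded - terms) & words)
    return sorted(sentences, key=score, reverse=True)
```

With the synonym table, a sentence mentioning "automobile" still ranks above an off-topic one for the query "car", which a pure-overlap ranker would miss entirely.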


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years:

Year   Papers
2023   74
2022   160
2021   52
2020   61
2019   47
2018   52
201852