Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
01 Jan 2016
TL;DR: The tweet contextualization process aims at generating a short summary from Wikipedia documents related to the tweet by combining Information Retrieval and Automatic Text Summarization methods to generate the tweet context.
Abstract: In this paper we describe our participation in the INEX 2016 Tweet Contextualization track. The tweet contextualization process aims at generating a short summary from Wikipedia documents related to the tweet. In our approach, we analyzed tweets and created a query to retrieve the most relevant Wikipedia article. We combine Information Retrieval and Automatic Text Summarization methods to generate the tweet context.
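
The retrieve-then-summarize flow described in this abstract can be sketched roughly as below. This is only an illustrative sketch, not the authors' system: the function names (`make_query`, `retrieve`, `contextualize`), the stopword list, the term-overlap scoring, and the in-memory `articles` dictionary are all assumptions standing in for a real Wikipedia index and summarizer.

```python
import re

# Small illustrative stopword list (assumed, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "to", "and", "at", "rt"}


def tokens(text):
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def make_query(tweet):
    """Strip URLs and @mentions, keep hashtag words, drop stopwords."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return {t for t in tokens(cleaned) if t not in STOPWORDS}


def retrieve(query, articles):
    """Return the title of the article sharing the most terms with the query."""
    return max(articles, key=lambda title: len(query & tokens(articles[title])))


def contextualize(tweet, articles, max_sentences=2):
    """Keep the sentences of the best article that overlap the query most."""
    query = make_query(tweet)
    text = articles[retrieve(query, articles)]
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    ranked = sorted(sentences, key=lambda s: len(query & tokens(s)), reverse=True)
    top = set(ranked[:max_sentences])
    # Emit the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)
```

Term overlap is used here only because it needs no external index; a real system would plug in a proper retrieval model at `retrieve`.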

4 citations

Proceedings Article
09 Sep 2010
TL;DR: A sentence-extraction algorithm for Web document summarization (WDS) is presented that considers both Web formats and hyperlink attributes; the relative weight of words and structures is learned with a machine learning approach.
Abstract: Web document summarization (WDS) is becoming one of the hot subjects in the text summarization field due to the rapidly increasing number of documents on the Web. WDS is different from traditional text summarization because it must process hyperlinked texts. This paper first analyses the features of Web documents, then gives a definition for WDS, and finally presents an algorithm for WDS based on sentence extraction. Each sentence's weight is a weighted sum of its words' weight and its sentence-structure weight. The former weight is adjusted by a document class graph, and the latter weight considers both the Web formats and hyperlink attributes. The weight proportion of words and structures is learned by a machine learning approach. Experiments on 2,000 Web documents show that the algorithm is feasible.
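
The weighted-sum scoring described in this abstract might look like the following minimal sketch. It is not the paper's implementation: the `Sentence` fields, the toy structure scores, and the fixed mixing parameter `alpha` are assumptions. The paper learns the word/structure proportion and adjusts word weights with a document class graph, neither of which is modelled here.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    in_heading: bool = False  # hypothetical Web-format feature
    has_link: bool = False    # hypothetical hyperlink feature


def word_weight(sentence, term_weights):
    """Average weight of the sentence's words (unknown words count as 0)."""
    words = sentence.text.lower().split()
    if not words:
        return 0.0
    return sum(term_weights.get(w, 0.0) for w in words) / len(words)


def structure_weight(sentence):
    """Toy structure score from format and hyperlink flags (values assumed)."""
    return 0.6 * sentence.in_heading + 0.4 * sentence.has_link


def sentence_weight(sentence, term_weights, alpha=0.7):
    """Weighted sum of word and structure weights; alpha is assumed, not learned."""
    return (alpha * word_weight(sentence, term_weights)
            + (1 - alpha) * structure_weight(sentence))


def extract(sentences, term_weights, k=1, alpha=0.7):
    """Return the k highest-weighted sentences in their original order."""
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_weight(sentences[i], term_weights, alpha),
        reverse=True,
    )[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Keeping `alpha` as an explicit parameter marks the place where a learned proportion would be substituted.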

4 citations

Book Chapter
20 Sep 2006
TL;DR: A big gap remains to be overcome before summarization can be applied directly to question-answering tasks, although it was hypothesized that topic-oriented summarization techniques could produce more accurate answers.
Abstract: In this paper, we present and analyze the results of the application of a text summarization system - GistSumm - to the task of monolingual question answering at CLEF 2006 for Portuguese texts. We hypothesized that topic-oriented summarization techniques could produce more accurate answers. However, our results showed that there is a big gap to be overcome for summarization to be directly applied to question-answering tasks.

4 citations

Proceedings Article
TL;DR: This article proposes SummPip, an unsupervised method for multi-document summarization that converts the original documents to a sentence graph, taking both linguistic and deep representation into account, then applies spectral clustering to obtain multiple clusters of sentences, and finally compresses each cluster to generate the final summary.
Abstract: Obtaining training data for multi-document summarization (MDS) is time consuming and resource-intensive, so recent neural models can only be trained for limited domains. In this paper, we propose SummPip: an unsupervised method for multi-document summarization, in which we convert the original documents to a sentence graph, taking both linguistic and deep representation into account, then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary. Experiments on Multi-News and DUC-2004 datasets show that our method is competitive to previous unsupervised methods and is even comparable to the neural supervised approaches. In addition, human evaluation shows our system produces consistent and complete summaries compared to human written ones.
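
The graph, cluster, compress pipeline outlined in this abstract can be illustrated with a much simplified sketch. It is not SummPip itself, and three substitutions are made to keep it dependency-free: bag-of-words cosine similarity replaces the linguistic and deep representations, connected components replace spectral clustering, and picking the shortest sentence of each cluster replaces cluster compression. All function names and the `threshold` value are assumptions.

```python
import math
from collections import Counter


def bow(sentence):
    """Bag-of-words vector; a stand-in for richer sentence representations."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_graph(sentences, threshold=0.3):
    """Connect sentences whose similarity reaches the (assumed) threshold."""
    vecs = [bow(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if cosine(vecs[i], vecs[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def clusters(adj):
    """Connected components, used here in place of spectral clustering."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        result.append(sorted(comp))
    return result


def summarize(sentences, threshold=0.3):
    """One sentence per cluster: the shortest, a crude proxy for compression."""
    adj = build_graph(sentences, threshold)
    return [min((sentences[i] for i in comp), key=len) for comp in clusters(adj)]
```

Each stage is a separate function so that a real clustering or compression step could replace its stand-in without touching the rest.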

4 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52