Topic

Multi-document summarization

About: Multi-document summarization is a research topic. To date, 2,270 publications on this topic have received 71,850 citations.


Papers
Journal Article
TL;DR: A summarization method that transforms online content for delivery to small devices and induces a hierarchical structure based on the relative importance of sentences within the document is presented.
Abstract: Access to information via handheld devices supports decision making away from one's computer. However, limitations include small screens and constrained wireless bandwidth. We present a summarization method that transforms online content for delivery to small devices. Unlike previous algorithms, ours assumes nothing about document formatting, and induces a hierarchical structure based on the relative importance of sentences within the document. As compared to delivering full documents, the method reduces the bytes transferred by half. An experiment also demonstrates that when given hierarchical summaries, users are no less accurate in answering questions about the documents.

16 citations

Proceedings Article
Tingting He1, Fang Li1, Wei Shao1, Jinguang Chen1, Liang Ma1 
23 Jul 2008
TL;DR: This paper proposes a feature-fusion-based sentence selection strategy to identify sentences with high query relevance and high information density, and adopts MMR for sentence extraction.
Abstract: The most important step in query-focused extractive summarization is deciding which sentences to include in the final summary. In this paper, we propose a feature-fusion-based sentence selection strategy to identify sentences with high query relevance and high information density. We score each sentence by computing its similarity and skip-bigram co-occurrence with the query. These two features measure query relevance in terms of content and structure, respectively. We then re-score the sentences using an information density feature obtained from a text graph, which can provide position information. Finally, we adopt MMR (maximal marginal relevance) for sentence extraction. Experimental results indicate that this method is effective in capturing important sentences. The ROUGE-2 and ROUGE-SU4 scores are 0.0640 and 0.1233, which are at the top of the DUC2005 scores.

16 citations
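
The final MMR step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine similarity, the function names `cosine` and `mmr_select`, and the trade-off parameter `lam` are assumptions chosen for illustration, and the paper's feature fusion and text-graph re-scoring are not reproduced here.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedy MMR: trade query relevance against redundancy.

    lam (hypothetical default 0.7) weights relevance; 1 - lam weights
    the maximum similarity to any sentence already selected.
    """
    vecs = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    selected = []                      # indices of chosen sentences
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * cosine(vecs[i], q) - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return [sentences[i] for i in selected]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "dogs bark loudly at night",
    ]
    print(mmr_select(docs, "cat mat dogs", k=2, lam=0.7))
```

With `lam` set to 1.0 the redundancy penalty disappears and the selection reduces to plain relevance ranking, which is a quick way to see what the penalty contributes.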

Proceedings Article
01 Dec 2016
TL;DR: A performance comparison reveals that the summaries generated by the proposed system achieve comparable results in terms of the ROUGE metric, and show improvements in readability by human evaluation.
Abstract: Summarization aims to represent source documents by a shortened passage. Existing methods focus on the extraction of key information, but often neglect coherence. Hence the generated summaries suffer from a lack of readability. To address this problem, we have developed a graph-based method by exploring the links between text to produce coherent summaries. Our approach involves finding a sequence of sentences that best represent the key information in a coherent way. In contrast to the previous methods that focus only on salience, the proposed method addresses both coherence and informativeness based on textual linkages. We conduct experiments on the DUC2004 summarization task data set. A performance comparison reveals that the summaries generated by the proposed system achieve comparable results in terms of the ROUGE metric, and show improvements in readability by human evaluation.

16 citations

Proceedings Article
17 Jul 2006
TL;DR: The validity of the algorithm has been tested, showing that the novel automatic text summarization algorithm matches the author's intended subjects and is free of redundant information.
Abstract: As the amount of textual information available grows rapidly, automatic text summarization methods are becoming increasingly important. Based on subject information from a term co-occurrence graph and linkage information between different subjects, a novel automatic summarization algorithm is proposed in this paper. The algorithm produces better summaries and adapts to different document styles. It can also pick up subject information, whose significance is evaluated according to the rules presented in this paper. In addition, it can dynamically decide the summary size. The validity of the algorithm has been tested, showing that the novel automatic text summarization algorithm matches the author's intended subjects and is free of redundant information.

16 citations
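
As a rough illustration of what a term co-occurrence graph looks like, the sketch below links terms that appear in the same sentence and ranks them by total edge weight. The abstract does not specify how the graph is built or how subjects are scored, so the sentence-level window, the names `cooccurrence_graph` and `top_terms`, and the weighted-degree ranking are all assumptions, not the paper's rules.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences, stopwords=frozenset()):
    """Build an undirected weighted graph: an edge weight counts the
    number of sentences in which two distinct terms co-occur."""
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        terms = sorted({t for t in sentence.lower().split() if t not in stopwords})
        for a, b in combinations(terms, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def top_terms(graph, k=3):
    """Rank terms by weighted degree (assumed proxy for subject importance);
    ties are broken alphabetically so the result is deterministic."""
    strength = {term: sum(nbrs.values()) for term, nbrs in graph.items()}
    return sorted(strength, key=lambda t: (-strength[t], t))[:k]


if __name__ == "__main__":
    g = cooccurrence_graph(
        ["graph summarization works", "graph ranking works", "ranking matters"]
    )
    print(top_terms(g, k=2))
```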

Journal Article
01 Jun 2016
TL;DR: The results show that incorporating social information into the summary generation process can improve the accuracy of the summary, and that user preference for the tweet-biased summary (TBS) was significantly higher than for the generic summary (GS).
Abstract: We examined whether the microblog comments given by people after reading a web document could be exploited to improve the accuracy of a web document summarization system. We examined the effect of social information (i.e., tweets) on the accuracy of the generated summaries by comparing user preference for the tweet-biased summary (TBS) with that for the generic summary (GS). The result of a crowdsourcing-based evaluation shows that user preference for TBS was significantly higher than for GS. We also took random samples of the documents to assess the summaries in a traditional evaluation using ROUGE, in which TBS was, in general, also shown to be better than GS. We further analyzed the influence of the number of tweets pointing to a web document on summarization accuracy, finding a moderate positive correlation between the number of tweets pointing to a web document and the performance of the generated TBS as measured by user preference. The results show that incorporating social information into the summary generation process can improve the accuracy of the summary. The reasons people gave for choosing one summary over another in the crowdsourcing-based evaluation are also presented in this article.

16 citations
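
Several of the abstracts above report ROUGE scores. As a simplified sketch of the idea behind ROUGE-N recall, the code below counts the clipped n-gram overlap between a candidate summary and a single reference, divided by the number of reference n-grams. The whitespace tokenization, the single-reference setup, and the names `ngrams` and `rouge_n_recall` are assumptions; the official toolkit adds options such as stemming and multiple references that are not shown here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=2):
    """Clipped n-gram overlap divided by the reference n-gram count."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (Counter returns 0 for missing n-grams).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, n=2))
```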


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52