Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings Article
Liang Ma, Tingting He, Fang Li, Zhuomin Gui, Jinguang Chen
12 Dec 2008
TL;DR: This paper proposes a summary sentence selection strategy for query-focused multi-document summarization: it extracts keywords from the relevant document set by calculating a query-related feature and a topic-related feature for every word in the set, and obtains each word's importance by combining the two features.
Abstract: This paper proposes a summary sentence selection strategy for query-focused multi-document summarization based on extracting keywords from the relevant document set. It calculates a query-related feature and a topic-related feature for every word in the relevant document set, then obtains the importance of each word by combining the two features. The score of a candidate sentence is computed from the importance of the words it contains, and a modified MMR (maximal marginal relevance) technique is used to adjust that score. The candidate sentence with the highest score is then selected as a summary sentence, and selection repeats until the summary reaches the required length. Experimental results show that the method performs very well on the DUC 2005 and DUC 2006 corpora.
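
The selection loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the query-related feature (word occurs in the query), the topic-related feature (relative frequency in the document set), the linear weight `alpha`, and the Jaccard-based redundancy penalty are all hypothetical stand-ins, since the abstract does not specify the features or how MMR was modified.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple, hypothetical tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_importance(sentences, query, alpha=0.5):
    """Combine a query-related and a topic-related feature for every word.

    Assumed features (not taken from the paper):
      query feature = 1.0 if the word occurs in the query, else 0.0
      topic feature = word count / highest word count in the document set
    """
    counts = Counter(w for s in sentences for w in tokenize(s))
    if not counts:
        return {}
    max_count = max(counts.values())
    query_words = set(tokenize(query))
    return {
        w: alpha * (1.0 if w in query_words else 0.0) + (1 - alpha) * c / max_count
        for w, c in counts.items()
    }


def jaccard(a, b):
    """Word-set overlap, used here as the redundancy measure (an assumption)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def select_summary(sentences, query, max_words=50, lam=0.7, alpha=0.5):
    """Greedy MMR-style selection until the word budget is used up."""
    importance = word_importance(sentences, query, alpha)
    tokens = [tokenize(s) for s in sentences]
    # Base score: mean importance of the words the sentence contains.
    base = [sum(importance[w] for w in t) / len(t) if t else 0.0 for t in tokens]
    selected, length = [], 0
    remaining = set(range(len(sentences)))

    def mmr(i):
        # Penalize similarity to sentences already in the summary.
        redundancy = max((jaccard(tokens[i], tokens[j]) for j in selected), default=0.0)
        return lam * base[i] - (1 - lam) * redundancy

    while remaining:
        best = max(sorted(remaining), key=mmr)
        remaining.remove(best)
        if length + len(tokens[best]) > max_words:
            continue  # sentence does not fit the remaining budget
        selected.append(best)
        length += len(tokens[best])
    return [sentences[i] for i in selected]
```

With a lower `lam`, redundancy counts for more, so an exact duplicate of an already selected sentence is passed over in favor of a less similar one even when the duplicate has the higher base score.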

7 citations

Proceedings Article
18 Dec 2006
TL;DR: A series of experiments applying varying summarization schemes to IMF staff reports is presented, and the summaries produced are evaluated by comparison with the staff-written executive summaries included in the original reports.
Abstract: In this paper we present a series of experiments that apply varying summarization schemes to IMF staff reports. The summaries produced by the system are evaluated by comparing them to the staff-written executive summaries included in the original reports. The results and lessons learned are analyzed and discussed.
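
The abstract does not say which measure was used for the comparison. As one plausible illustration only, a system summary can be scored against a human-written reference by unigram recall, in the spirit of ROUGE-1. The function name and tokenization below are hypothetical.

```python
import re
from collections import Counter


def unigram_recall(candidate, reference):
    """Fraction of reference word occurrences that also appear in the candidate.

    Counts are clipped, so a word repeated in the candidate is credited
    at most as many times as it occurs in the reference.
    """
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())
```

A score of 1.0 means every word of the reference summary is covered by the system summary; the measure ignores word order and says nothing about fluency.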

7 citations

Journal Article
TL;DR: The results show that using semantic discourse knowledge for content selection improves the informativeness of automatic summaries.
Abstract: Automatic multi-document summarization aims at reducing the size of texts while preserving the important content. In this paper, we propose some methods for automatic summarization based on two semantic discourse models: Rhetorical Structure Theory (RST) and Cross-document Structure Theory (CST). These models are chosen in order to properly address the relevance of information, multi-document phenomena and subtopical distribution in the source texts. The results show that using semantic discourse knowledge for content selection improves the informativeness of automatic summaries.

7 citations

Proceedings Article
01 Oct 2006
TL;DR: A method that collects original news text from online sources and automatically extracts summary sentences from it, and uses WML (Wireless Markup Language) to build a news website that lets mobile devices browse the news summaries.
Abstract: Large volumes of lengthy online information are poorly suited to mobile devices. To solve this problem, we propose a method that collects original news text from online sources and automatically extracts summary sentences from it. On this basis, we use WML (Wireless Markup Language) to build a news website that lets mobile devices browse the news summaries. The system consists mainly of two components, Automatic News Collection and Automatic Text Summarization. Our experimental results demonstrate the effectiveness of the method. Keywords: World Wide Web; Automatic Text Summarization; Automatic News Collection

7 citations

Book Chapter
11 Mar 2012
TL;DR: Summarization is treated as the selection of top-ranked sentences from ranked sentence clusters, and a Wikipedia anchor-text-based phrase mapping scheme is introduced to handle the many noisy entries in the text.
Abstract: Similar to the traditional approach, we consider the task of summarization as the selection of top-ranked sentences from ranked sentence clusters. To achieve this goal, we rank the sentence clusters using the importance of words, calculated by running the PageRank algorithm on a reverse directed word graph of the sentences. Next, to rank the sentences in every cluster, we introduce the use of a weighted clustering coefficient, which is calculated from the PageRank scores of the words. Finally, the most important issue is the presence of many noisy entries in the text, which degrades the performance of most text mining algorithms. To solve this problem, we introduce a Wikipedia anchor-text-based phrase mapping scheme. Our experimental results on the DUC-2002 and DUC-2004 datasets show that our system performs better than unsupervised systems and better than or comparable to novel supervised systems in this area.
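
The cluster-ranking step can be sketched as follows, under assumptions the abstract leaves open: here the reverse directed word graph links each word to the word that precedes it in a sentence, PageRank is computed by plain power iteration, and a cluster is scored by the mean PageRank of its words. The function names are hypothetical, and the weighted clustering coefficient and the Wikipedia phrase mapping are not covered.

```python
def reversed_word_graph(token_lists):
    """Edge from each word to the word preceding it (an assumed construction)."""
    graph = {}
    for tokens in token_lists:
        for word in tokens:
            graph.setdefault(word, set())
        for prev, nxt in zip(tokens, tokens[1:]):
            if prev != nxt:
                graph[nxt].add(prev)
    return {word: sorted(targets) for word, targets in graph.items()}


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of out-links."""
    nodes = sorted(graph)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {w: 1.0 / n for w in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[w] for w in nodes if not graph[w])
        new_rank = {w: (1 - damping) / n + damping * dangling / n for w in nodes}
        for w in nodes:
            for target in graph[w]:
                new_rank[target] += damping * rank[w] / len(graph[w])
        rank = new_rank
    return rank


def rank_clusters(clusters, rank):
    """Return cluster indices, best first, by mean PageRank of their words.

    clusters: list of clusters, each a list of tokenized sentences.
    """
    def score(cluster):
        words = [w for tokens in cluster for w in tokens]
        return sum(rank.get(w, 0.0) for w in words) / len(words) if words else 0.0

    return sorted(range(len(clusters)), key=lambda i: score(clusters[i]), reverse=True)
```

Because the edges are reversed, rank flows toward words that tend to open word sequences; whether that matches the graph used in the paper cannot be confirmed from the abstract alone.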

7 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52