Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Proceedings Article
10 Aug 1998
TL;DR: The role of automated document summarization in building effective search statements is investigated, and the results of the latest evaluation of the system at the annual Text Retrieval Conference (TREC) are discussed.
Abstract: We discuss a semi-interactive approach to information retrieval which consists of two tasks performed in sequence. First, the system assists the searcher in building a comprehensive statement of information need, using automatically generated topical summaries of sample documents. Second, the detailed statement of information need is automatically processed by a series of natural language processing routines in order to derive an optimal search query for a statistical information retrieval system. In this paper, we investigate the role of automated document summarization in building effective search statements. We also discuss the results of the latest evaluation of our system at the annual Text Retrieval Conference (TREC).

21 citations

01 Dec 2008
TL;DR: This paper proposes a method using fuzzy logic for sentence extraction and compares its results with a baseline summarizer and the Microsoft Word 2007 summarizer, showing that the fuzzy method yields the highest average precision, recall, and F-mean.
Abstract: Automatic text summarization compresses the original text into a shorter version and helps the user quickly understand large volumes of information. This paper focuses on automatic text summarization by sentence extraction with important features based on fuzzy logic. In our experiment, we used 6 test documents from the DUC2002 data set. Each document is prepared by a preprocessing step: sentence segmentation, tokenization, stop-word removal, and stemming. Then, we use 8 important features and calculate their scores for each sentence. We propose a method using fuzzy logic for sentence extraction and compare our results with a baseline summarizer and the Microsoft Word 2007 summarizer. The results show that the fuzzy method yields the highest average precision, recall, and F-mean for the summaries.

20 citations
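
The preprocessing steps and evaluation measures named in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the stop-word list, the crude_stem suffix stripper (a toy stand-in for a real stemmer), and the sentence-level precision_recall_f function are all assumptions, and F-mean is assumed to denote the usual harmonic mean of precision and recall. The paper's 8 sentence features and its fuzzy rules are not reproduced here.

```python
import re

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "into"}


def split_sentences(text):
    """Sentence segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Tokenization: lowercase the sentence and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", sentence.lower())


def crude_stem(word):
    """Toy suffix stripper, standing in for a real stemmer (hypothetical helper)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Return one list of stemmed, stop-word-free tokens per sentence."""
    return [
        [crude_stem(t) for t in tokenize(s) if t not in STOP_WORDS]
        for s in split_sentences(text)
    ]


def precision_recall_f(selected, reference):
    """Compare extracted sentence indices against a reference extract.

    Assumes evaluation by sentence overlap, which may differ from the
    measure used in the paper.
    """
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(reference) if reference else 0.0
    denom = precision + recall
    f_mean = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_mean
```

For example, preprocess("Cats are sleeping. The dog barked!") returns [["cat", "sleep"], ["dog", "bark"]]. A sentence scorer would then operate on these token lists rather than on the raw text.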

Proceedings Article
Cuneyt M. Taskiran
TL;DR: The summary evaluation problem is examined for video summaries; text summarization is described as the oldest and most successful summarization domain, and some parallels between these two domains, along with methods and terminology, are shown.
Abstract: Compact representations of video data, or video summaries, greatly enhance efficient video browsing. However, rigorous evaluation of video summaries generated by automatic summarization systems is a complicated process. In this paper we examine the summary evaluation problem. Text summarization is the oldest and most successful summarization domain. We show some parallels between these two domains and introduce methods and terminology. Finally, we present results for a comprehensive evaluation summary that we have performed.

20 citations

Proceedings Article
24 Oct 2016
TL;DR: This paper analyzes Twitter data and discovers two social contexts which are important for topic generation and dissemination, namely (i) CrowdExp topic score that captures the influence of both the crowd and the expert users in Twitter and (ii) Retweet topic score that captures the influence of Twitter users' actions.
Abstract: While social data is being widely used in various applications such as sentiment analysis and trend prediction, its sheer size also presents great challenges for storing, sharing and processing such data. These challenges can be addressed by data summarization which transforms the original dataset into a smaller, yet still useful, subset. Existing methods find such subsets with objective functions based on data properties such as representativeness or informativeness but do not exploit social contexts, which are distinct characteristics of social data. Further, to date very little work has focused on topic-preserving data summarization, despite the abundant work on topic modeling. This is a challenging task for two reasons. First, since topic models are based on latent variables, existing methods are not well-suited to capture latent topics. Second, it is difficult to find such social contexts that provide valuable information for building an effective topic-preserving summarization model. To tackle these challenges, in this paper, we focus on exploiting social contexts to summarize social data while preserving topics in the original dataset. We take Twitter data as a case study. Through analyzing Twitter data, we discover two social contexts which are important for topic generation and dissemination, namely (i) CrowdExp topic score that captures the influence of both the crowd and the expert users in Twitter and (ii) Retweet topic score that captures the influence of Twitter users' actions. We conduct extensive experiments on two real-world Twitter datasets using two applications. The experimental results show that, by leveraging social contexts, our proposed solution can enhance topic-preserving data summarization and improve application performance by up to 18%.

20 citations

Proceedings Article
01 Nov 2010
TL;DR: This paper proposes a method for multi-aspect review summarization based on evaluative sentence extraction that combines ratings of aspects, the tf-idf value, and the number of mentions with a similar topic, and applies a clustering algorithm.
Abstract: The development of Web services has recently let many users easily share their opinions. Automatic summarization of this enormous volume of sentiment is therefore in demand. Intuitively, we can summarize a review with traditional document summarization methods. However, such methods do not adequately address "aspects". Basically, a review consists of sentiments about various aspects. We summarize reviews for each aspect so that the summary presents information without bias toward a specific topic. In this paper, we propose a method for multi-aspect review summarization based on evaluative sentence extraction. We handle three features: ratings of aspects, the tf-idf value, and the number of mentions with a similar topic. For estimating the number of mentions, we apply a clustering algorithm. By integrating these features, we generate a more appropriate summary. The experimental results show the effectiveness of our method.

20 citations
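
As a rough illustration of how the tf-idf feature mentioned in the abstract above could be computed per sentence and then merged with the other two features, consider the sketch below. The abstract does not say how the features are integrated, so the linear combine function and its weights are purely hypothetical, as is the choice to treat each sentence as its own "document" when computing document frequencies. The clustering step used to count mentions is not shown.

```python
import math
from collections import Counter


def tfidf_sentence_scores(sentences):
    """Score each tokenized sentence by the mean tf-idf of its distinct terms.

    Each sentence is treated as one 'document' (an assumption made for
    this sketch, not necessarily the paper's setup).
    """
    n = len(sentences)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for s in sentences for term in set(s))
    scores = []
    for s in sentences:
        if not s:
            scores.append(0.0)
            continue
        tf = Counter(s)
        total = sum(
            (count / len(s)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(total / len(tf))
    return scores


def combine(rating, tfidf, mentions, weights=(1.0, 1.0, 1.0)):
    """Hypothetical weighted sum of the three features named in the abstract."""
    w_rating, w_tfidf, w_mentions = weights
    return w_rating * rating + w_tfidf * tfidf + w_mentions * mentions
```

With this scoring, a term that occurs in every sentence gets an idf of log(1) = 0 and so contributes nothing, which is why a sentence made up only of such terms scores 0.0.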


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52