scispace - formally typeset
Topic: Multi-document summarization

About: Multi-document summarization is the task of automatically producing a single concise summary from a set of related documents. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
31 May 2003
TL;DR: A summarization system based on users' annotations; experiments show that annotations substantially improve summarization performance over ignoring them, and a variety of techniques are proposed to evaluate the relationship between annotations and summaries.
Abstract: Current summarization systems produce a single, uniform summary of a document for all users. Personalized summaries are needed to reflect users' preferences and interests. Annotation is becoming important for document sharing and collaborative filtering; unlike traditional static profiles, annotations record users' dynamic behavior. In this paper we introduce a new summarization system based on users' annotations. Annotations and their contexts are extracted as sentence features, which are weighted differently in the representation of the document. Our system produces two summaries for each document: a generic summary that ignores annotations and an annotation-based summary. Since annotations are personal data, the annotation-based summary is tailored to the user's interests to some extent. Our experiments show that annotations substantially improve summarization performance compared with ignoring them. We also make an extensive study of users' annotating behavior and the distribution of annotations, and propose a variety of techniques to evaluate the relationships between annotations and summaries, such as how the number of annotations affects summarization performance. Finally, we evaluate summarization based on the annotations of similar users in a collaborative-filtering setting.

27 citations

Proceedings ArticleDOI
01 Apr 2007
TL;DR: An approach that draws on methods from information retrieval, topical summarization, and Information Extraction, with a comparison of its effectiveness against a query-focused summarization approach.
Abstract: This paper addresses the task of providing extended responses to questions regarding specialized topics. This task is an amalgam of information retrieval, topical summarization, and Information Extraction (IE). We present an approach which draws on methods from each of these areas, and compare its effectiveness with that of a query-focused summarization approach. The two systems are evaluated in the context of prosecution queries like those in the DARPA GALE distillation evaluation.

27 citations

01 Sep 2015
TL;DR: This paper presents a RL method, which takes into account intermediate steps during the creation of a summary, and introduces a new feature set, which describes sentences with respect to already selected sentences.
Abstract: Reinforcement Learning (RL) is a generic framework for modeling decision-making processes and as such is well suited to the task of automatic summarization. In this paper we present an RL method that takes into account intermediate steps during the creation of a summary. Furthermore, we introduce a new feature set that describes sentences with respect to the sentences already selected. We carry out a range of experiments on various data sets, including several DUC data sets as well as scientific publications and encyclopedic articles. Our results show that our approach a) successfully adapts to data sets from various domains, b) outperforms previous RL-based methods for summarization and state-of-the-art summarization systems in general, and c) can be equally applied to single- and multi-document summarization across domains and document lengths.
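The key ingredient above — features that describe a candidate sentence relative to the sentences already selected — can be sketched with a redundancy feature and a greedy stand-in for the learned policy. This is a hedged illustration under assumed names (`redundancy_feature`, `greedy_policy`, the `weights` keys), not the paper's RL method; a real RL system would learn the weights from rewards rather than fix them.

```python
def redundancy_feature(candidate, selected):
    """Fraction of the candidate's words already covered by the selected
    sentences -- a state-dependent feature of the kind the abstract describes."""
    cand = set(candidate.lower().split())
    if not cand:
        return 0.0
    covered = {w for s in selected for w in s.lower().split()}
    return len(cand & covered) / len(cand)

def greedy_policy(sentences, weights, length_budget):
    """Greedy stand-in for a learned policy: at each intermediate step,
    score every remaining sentence with state-dependent features and
    append the best one, until the word budget is reached."""
    selected, remaining = [], list(sentences)
    while remaining and sum(len(s.split()) for s in selected) < length_budget:
        def score(s):
            novelty = 1.0 - redundancy_feature(s, selected)
            return weights["novelty"] * novelty + weights["length"] * len(s.split())
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because `redundancy_feature` depends on what has already been chosen, the score of each sentence changes at every step — exactly why intermediate steps matter during summary creation.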

27 citations

Journal ArticleDOI
TL;DR: An empirical investigation of the merits of two schools of training criteria that alleviate the negative effects of imbalanced data and of the mismatch between classification accuracy and summarization evaluation, and that boost summarization performance.
Abstract: The purpose of extractive speech summarization is to automatically select a number of indicative sentences, paragraphs, or audio segments from the original spoken document according to a target summarization ratio and then concatenate them into a concise summary. Much work on extractive summarization has developed machine-learning approaches that cast important-sentence selection as a two-class classification problem, and these have been applied with some success to a number of speech summarization tasks. However, the imbalanced-data problem sometimes yields a trained speech summarizer with unsatisfactory performance. Furthermore, training the summarizer to improve the associated classification accuracy does not always lead to better summarization evaluation performance. In view of these phenomena, we present in this paper an empirical investigation of the merits of two schools of training criteria that alleviate the negative effects caused by the aforementioned problems and boost summarization performance. One learns the classification capability of a summarizer from the pair-wise ordering of sentences in a training document according to their degree of importance. The other trains the summarizer by directly maximizing the associated evaluation score or optimizing an objective linked to the ultimate evaluation. Experimental results on the broadcast news summarization task suggest that these training criteria can give substantial improvements over a few existing summarization methods.
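The first school of criteria — learning from pair-wise ordering of sentences by importance — can be sketched as a hinge-loss perceptron over sentence pairs. This is a generic ranking-criterion sketch, not the paper's model; the function name, feature layout (`docs` as lists of `(feature_vector, importance_label)` pairs), and hyperparameters are assumptions.

```python
import random

def train_pairwise_ranker(docs, epochs=20, lr=0.1, seed=0):
    """Pairwise ranking criterion: for every (more-important, less-important)
    sentence pair in a training document, require w . x_pos >= w . x_neg + 1;
    when the margin is violated, nudge w toward (x_pos - x_neg)."""
    rng = random.Random(seed)
    dim = len(docs[0][0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for doc in docs:
            # all ordered pairs where the first sentence is more important
            pairs = [(a, b) for a in doc for b in doc if a[1] > b[1]]
            rng.shuffle(pairs)
            for (x_pos, _), (x_neg, _) in pairs:
                margin = sum(wi * (p - n) for wi, p, n in zip(w, x_pos, x_neg))
                if margin < 1.0:  # violated pair: hinge-loss update
                    w = [wi + lr * (p - n) for wi, p, n in zip(w, x_pos, x_neg)]
    return w
```

Unlike a two-class classifier, this objective never compares a sentence against a fixed threshold, only against other sentences of the same document, which sidesteps the class-imbalance issue the abstract raises.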

27 citations


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations (85% related)
Ontology (information science): 57K papers, 869.1K citations (84% related)
Web page: 50.3K papers, 975.1K citations (83% related)
Recurrent neural network: 29.2K papers, 890K citations (83% related)
Graph (abstract data type): 69.9K papers, 1.2M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:
2023: 74
2022: 160
2021: 52
2020: 61
2019: 47
2018: 52