scispace - formally typeset
Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared within this topic, receiving 71,850 citations.


Papers
Journal Article
01 Jun 2018
TL;DR: An elaborate user evaluation study to determine human preferences in forum summarization and to create a reference data set is presented and shows that even for a summarization task with low inter-rater agreement, a model can be trained that generates sensible summaries.
Abstract: In this paper we address extractive summarization of long threads in online discussion fora. We present an elaborate user evaluation study to determine human preferences in forum summarization and to create a reference data set. We showed long threads to ten different raters and asked them to create a summary by selecting the posts that they considered to be the most important for the thread. We study the agreement between human raters on the summarization task, and we show how multiple reference summaries can be combined to develop a successful model for automatic summarization. We found that although the inter-rater agreement for the summarization task was slight to fair, the automatic summarizer obtained reasonable results in terms of precision, recall, and ROUGE. Moreover, when human raters were asked to choose between the summary created by another human and the summary created by our model in a blind side-by-side comparison, they judged the model’s summary equal to or better than the human summary in over half of the cases. This shows that even for a summarization task with low inter-rater agreement, a model can be trained that generates sensible summaries. In addition, we investigated the potential for personalized summarization. However, the results for the three raters involved in this experiment were inconclusive. We release the reference summaries as a publicly available dataset.

16 citations

Proceedings Article
01 Nov 1999
TL;DR: Techniques from the field of data analysis may be applied to generate summaries of query results efficiently; such techniques should permit the incorporation of classification hierarchies in order to provide powerful browsing environments for digital library users.
Abstract: Summarization of intermediary query result sets plays an important role when users browse through digital library collections. Summarization enables users to quickly digest the results of their queries, and provides users with important information they can use to narrow their search interactively. Techniques from the field of data analysis may be applied to the problem of generating summaries of query results efficiently. Such techniques should permit the incorporation of classification hierarchies in order to provide powerful browsing environments for digital library users.

16 citations

Proceedings Article
23 May 2011
TL;DR: This position paper discusses how text summarization techniques could ease the analysis of candidate traceability links by selecting the most important features of the software artifacts that the developers would investigate.
Abstract: Analyzing candidate traceability links is a difficult, time-consuming, and error-prone task, as it usually requires a detailed study of a long list of software artifacts of various kinds. One option to alleviate this problem is to select the most important features of the software artifacts that the developers would investigate. We discuss in this position paper how text summarization techniques could be used to address this problem. The potential gains in using summaries are both in terms of time and correctness of the traceability link recovery process.

16 citations

Journal Article
01 Jul 2010
TL;DR: The keyword extraction techniques are presented, exploring the effects that part of speech tagging has on the summarization procedure of an existing system, and the profiling features that are used as an extension to an already constructed news indexing system, PeRSSonal are described.
Abstract: Text summarization and categorization, as well as personalization of the results, have always been some of the most demanding information retrieval tasks. Deploying a generalized, multi-functional mechanism that produces good results for the aforementioned tasks seems to be a panacea for most of the text-based, information retrieval needs. In this article, we present the keyword extraction techniques, exploring the effects that part of speech tagging has on the summarization procedure of an existing system. Moreover, we describe the profiling features that are used as an extension to an already constructed news indexing system, PeRSSonal. We are thus enhancing the personalization algorithm that the system utilizes with various features derived from the user's profile, such as the list of viewed articles and the time spent on them. In addition, we analyze the system's interconnection channels that are used with the client-side desktop application that was developed and we evaluate the approaches that we propose.

16 citations

Book Chapter
07 Dec 2009
TL;DR: QA@INEX aims to evaluate a complex question-answering task and presents the groundwork carried out in 2009 to determine the tasks and a novel evaluation methodology that will be used in 2010.
Abstract: QA@INEX aims to evaluate a complex question-answering task. In such a task, the set of questions is composed of factoid, precise questions that expect short answers, as well as more complex questions that can be answered by several sentences or by an aggregation of texts from different documents. Question-answering, XML/passage retrieval and automatic summarization are combined in order to get closer to real information needs. This paper presents the groundwork carried out in 2009 to determine the tasks and a novel evaluation methodology that will be used in 2010.

16 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52