Topic
Multi-document summarization
About: Multi-document summarization is a research topic. Over its lifetime, 2270 publications have been published within this topic, receiving 71850 citations.
Papers
01 Jan 2015
TL;DR: This paper addresses the summarization of forum threads with domain-independent and language-independent methodology, using recurrent neural networks, and evaluates the system on data from four different web forums, covering different domains, languages and user communities.
Abstract: In the DISCOSUMO project, we aim to develop a computational toolkit to automatically summarize discussion forum threads. In this paper, we present the initial design of the toolkit, the data that we work with and the challenges we face. Discussion threads on a single topic can easily consist of hundreds or even thousands of individual contributions, with no obvious way to gain a quick overview of what kind of information is contained within the thread. We address the summarization of forum threads with domain-independent and language-independent methodology. We evaluate our system on data from four different web forums, covering different domains, languages and user communities. Our approach is largely unsupervised, using recurrent neural networks. Evaluation of the first version should point out where in the pipeline supervised techniques and/or heuristics are required to improve our summarization toolbox. If successful, the automatic summarization of discussion forum threads will play an important role in facilitating easy participation in online discussions.
1 citations
01 Jan 2009
TL;DR: The result reveals that the approach based on proposed event-oriented ontology outperformed the traditional text summarization approach in capturing conceptual and procedural knowledge, but the latter was still better in delivering factual knowledge.
Abstract: Document summarization is an important function for knowledge management as a digital library of text documents grows. It allows documents to be presented in a concise manner for easy reading and understanding. Traditionally, document summarization adopts sentence-based mechanisms that identify and extract key sentences from long documents and assemble them together. Although that approach is useful in providing an abstract of documents, it cannot extract the relationship or sequence of a set of related events (also called episodes). This paper proposes an event-oriented ontology approach to constructing episodic knowledge to facilitate the understanding of documents. We also empirically evaluated the proposed approach by using instruments developed based on Bloom’s Taxonomy. The result reveals that the approach based on the proposed event-oriented ontology outperformed the traditional text summarization approach in capturing conceptual and procedural knowledge, but the latter was still better in delivering factual knowledge.
1 citations
20 Sep 2010
TL;DR: The focus of this paper is a question answering system, where the answers are retrieved from a collection of textual documents, which includes automatic document summarization and document visualization by means of a semantic graph.
Abstract: The focus of this paper is a question answering system, where the answers are retrieved from a collection of textual documents. The system also includes automatic document summarization and document visualization by means of a semantic graph. The information extracted from the documents is stored as subject-predicate-object triplets, and the indexed terms are expanded using Cyc, a large common sense ontology.
1 citations
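The triplet storage mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's system: the class name `TripleIndex`, its methods, and the `expansions` dictionary are hypothetical, and the dictionary only stands in for the term expansion that the paper performs with Cyc (no Cyc API is called here).

```python
from collections import defaultdict


class TripleIndex:
    """Toy store for subject-predicate-object triplets with term expansion."""

    def __init__(self, expansions=None):
        # expansions maps a term to broader terms; a hypothetical stand-in
        # for the ontology lookup (the paper uses Cyc for this step).
        self.expansions = expansions or {}
        self.triples = []
        self.index = defaultdict(set)  # term -> positions in self.triples

    def add(self, subject, predicate, obj):
        position = len(self.triples)
        self.triples.append((subject, predicate, obj))
        for term in (subject, predicate, obj):
            key = term.lower()
            self.index[key].add(position)
            # Also index the triplet under every expanded term.
            for broader in self.expansions.get(key, ()):
                self.index[broader.lower()].add(position)

    def query(self, term):
        # Return matching triplets in insertion order.
        positions = self.index.get(term.lower(), ())
        return [self.triples[i] for i in sorted(positions)]
```

With `expansions={"dog": ["animal"]}`, a triplet added as `("dog", "chases", "cat")` is returned for a query on either `dog` or `animal`.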
TL;DR: A two-stage sentence selection approach that generates a summary by deleting sentences from a candidate set is proposed; its ROUGE results on a test corpus support its validity compared with the traditional method of sentence selection, and the influence of the token chosen on the quality of the generated summaries is analyzed.
Abstract: In contrast to the traditional method of building a summary by adding sentences in multi-document summarization, a two-stage sentence selection approach is proposed that generates the summary by deleting sentences from a candidate sentence set. The two stages are the acquisition of a candidate sentence set and the optimum selection of sentences. In the first stage, the candidate sentence set is obtained by a redundancy-based sentence selection approach. In the second stage, sentences are deleted from the candidate sentence set according to their contribution to the whole set until the specified summary length is reached. On a test corpus, the ROUGE values of summaries produced by the proposed approach demonstrate its validity compared with the traditional method of sentence selection. The influence of the token chosen in the two-stage sentence selection approach on the quality of the generated summaries is also analyzed.
1 citations
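The two-stage selection described in the abstract above can be sketched roughly as follows. The abstract does not specify the scoring or contribution measures, so everything concrete here is an assumption made for illustration: the average word frequency score, the Jaccard redundancy test with `max_sim=0.5`, the coverage-based contribution, the candidate budget of twice the target length, and all function names.

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"


def tokenize(sentence):
    # Lowercased whitespace tokens with edge punctuation stripped.
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w]


def jaccard(a, b):
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_candidates(tokens, freq, budget_words, max_sim=0.5):
    # Stage 1 (assumed form): take sentences in order of average word
    # frequency, skipping any that overlap too much with one already chosen.
    def score(i):
        words = set(tokens[i])
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    chosen, length = [], 0
    for i in sorted(range(len(tokens)), key=score, reverse=True):
        if length >= budget_words:
            break
        if any(jaccard(tokens[i], tokens[j]) > max_sim for j in chosen):
            continue
        chosen.append(i)
        length += len(tokens[i])
    return chosen


def prune(chosen, tokens, freq, max_words):
    # Stage 2 (assumed form): repeatedly delete the sentence whose removal
    # loses the least coverage, until the length limit is met.
    chosen = list(chosen)

    def coverage(idxs):
        return sum(freq[w] for w in {w for i in idxs for w in tokens[i]})

    while chosen and sum(len(tokens[i]) for i in chosen) > max_words:
        full = coverage(chosen)
        victim = min(
            chosen,
            key=lambda i: full - coverage([j for j in chosen if j != i]),
        )
        chosen.remove(victim)
    return sorted(chosen)  # restore original sentence order


def summarize(sentences, max_words):
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(w for t in tokens for w in set(t))
    candidates = select_candidates(tokens, freq, budget_words=2 * max_words)
    return [sentences[i] for i in prune(candidates, tokens, freq, max_words)]
```

Calling `summarize(sentences, max_words=15)` on a list of sentence strings returns a subset of them, in their original order, whose total token count does not exceed 15.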
09 Dec 2008
TL;DR: This paper considers the problem of generating a summary from Web reviews and the rank (usefulness) assigned to these reviews by other users, and proposes a technique which takes ranked reviews as input and generates a summary.
Abstract: We propose a technique for summarizing Web reviews. Information summarization has become an important problem in the current content-saturated world. One such example is the World Wide Web, which provides a platform to publish and evaluate information. This collaborative nature of the Web has enabled users to write their opinions on certain topics and also evaluate others' opinions by assigning ranks. In this paper we show that this aspect of the Web can be utilized to generate more useful summaries. We consider the problem of generating a summary from Web reviews and the rank (usefulness) assigned to these reviews by other users. We study the usefulness of user ranks in the summarization task. Based on the study, we propose a technique which takes ranked reviews as input and generates a summary. We experiment with different variations of the proposed technique and evaluate them based on different criteria.
1 citations