Topic
Multi-document summarization
About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have appeared on this topic, receiving 71,850 citations.
Papers
••
01 Aug 2017
TL;DR: This paper develops an extractive summarization system for Indonesian parliamentary meeting minutes using rule-based information extraction with regular expressions and achieves a ROUGE-2 F-measure of 0.718.
Abstract: Meeting minutes contain many important decisions and facts from a meeting. Since meeting minutes are unstructured documents, summarization is needed to get at the main information easily. Some work in this research area has been done for meeting minutes in English, but none has yet been conducted for Indonesian. Therefore, this paper aims to develop an extractive summarization system for Indonesian parliamentary meeting minutes using rule-based information extraction with regular expressions. The summary structure is defined based on summaries from the House of Representatives of the Republic of Indonesia. Regular expressions are designed to recognize patterns in meeting minutes and then to fill 24 slot values in summary templates. Our summarizer consists of four main processes: preprocessing, information extraction, postprocessing, and template filling. Our experimental results, evaluated with the ROUGE summarization metrics, achieve a ROUGE-2 F-measure of 0.718.
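The four processes named in the abstract (preprocessing, information extraction, postprocessing, template filling) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the three slot names, the regular expressions, the English field labels, and the template sentence are hypothetical, and the paper's actual 24 slots and Indonesian patterns are not reproduced.

```python
import re

# Hypothetical slot patterns (illustrative only, not the paper's 24 slots).
SLOT_PATTERNS = {
    "meeting_date": re.compile(r"Date\s*:\s*(\d{1,2} \w+ \d{4})"),
    "chair": re.compile(r"Chaired by\s*:\s*(.+)"),
    "agenda": re.compile(r"Agenda\s*:\s*(.+)"),
}

# Hypothetical summary template with one placeholder per slot.
TEMPLATE = "On {meeting_date}, a meeting chaired by {chair} discussed {agenda}."


def preprocess(text):
    """Collapse runs of spaces/tabs and trim each line."""
    return "\n".join(
        re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()
    )


def extract_slots(text):
    """Apply each pattern; a slot with no match is left as None."""
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        slots[name] = match.group(1) if match else None
    return slots


def postprocess(slots):
    """Strip trailing periods and replace missing values with 'unknown'."""
    return {
        name: (value.rstrip(".").strip() if value else "unknown")
        for name, value in slots.items()
    }


def summarize_minutes(minutes):
    """Run the four stages in order and fill the template."""
    return TEMPLATE.format(**postprocess(extract_slots(preprocess(minutes))))
```

Calling `summarize_minutes` on a text containing lines such as `Date : 12 March 2015`, `Chaired by: A. Example`, and `Agenda: the state budget.` fills all three placeholders; any field the patterns fail to find is rendered as `unknown` rather than raising an error.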
2 citations
••
TL;DR: In this paper, a text document is compressed by a summarization system into a new form that conveys the core idea of its content.
2 citations
•
01 May 2004
TL;DR: This paper describes the process of developing a taxonomy of cohesion problems and corrective revision operators that address such problems, as well as an annotation schema for a corpus of 240 extractive, multi-document summaries that have been manually revised to promote cohesion.
Abstract: Multi-document summaries produced via sentence extraction often suffer from a number of cohesion problems, including dangling anaphora, sudden shifts in topic, and incorrect or awkward chronological ordering. Therefore, the development of an automated revision process to correct such problems is a research area of current interest. We present the RevisionBank, a corpus of 240 extractive, multi-document summaries that have been manually revised to promote cohesion. The summaries were revised by six linguistics students using a constrained set of revision operations that we previously developed. In the current paper, we describe the process of developing a taxonomy of cohesion problems and corrective revision operators that address such problems, as well as an annotation schema for our corpus. Finally, we discuss how our taxonomy and corpus can be used for the study of revision-based multi-document summarization as well as for summary evaluation.
2 citations
••
29 Jun 2018
TL;DR: This paper proposes an inter- and intra-cluster approach consisting of four weighted criteria functions (coherence, coverage, diversity, and inter-cluster analysis), optimized with SaDE (Self-Adaptive Differential Evolution) to get the best summary result.
Abstract: Multi-document summarization is a more challenging task than single-document summarization because of its larger space and the differing content of each document. Hence, some optimization algorithms consider criteria such as relevancy, content coverage, and diversity in producing the best summary. Those weighted criteria rest on the assumption that the documents are already located in the same cluster. However, in some conditions the documents span many categories, which need to be considered too. In this paper, we propose an inter- and intra-cluster approach consisting of four weighted criteria functions (coherence, coverage, diversity, and inter-cluster analysis), optimized using SaDE (Self-Adaptive Differential Evolution) to get the best summary result. The proposed method therefore deals not only with the compactness of each cluster but also with the separation between clusters. Experimental results on the Text Analysis Conference (TAC) 2008 datasets yield better summaries, with average ROUGE-1 precision, recall, and F-measure of 0.77, 0.07, and 0.12, compared with another method that considers only intra-cluster analysis.
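Since the results above are reported as ROUGE-1 precision, recall, and F-measure, a simplified sketch of how those three quantities relate may be useful. This is not the official ROUGE toolkit: it assumes a single reference summary, lowercased whitespace tokenization, and no stemming or stopword removal, and the function names are chosen here for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) for n-gram overlap with one reference.

    Simplifying assumptions: whitespace tokenization, lowercasing,
    a single reference, and a balanced F-measure.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as in either text.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Precision is the share of the candidate's n-grams that also occur in the reference, recall is the share of the reference's n-grams that the candidate recovers, and the F-measure is their harmonic mean, so a very short candidate can score high precision with low recall. Passing `n=2` gives the ROUGE-2 variant under the same assumptions.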
2 citations
••
01 Dec 2016
TL;DR: This paper studies the effectiveness of event elements for automatic summarization on the CEC corpus and reports better recall and precision than many other methods, with an average F value of 0.63.
Abstract: Traditional automatic summarization suffers from information redundancy and incomplete content coverage, and mainstream automatic summarization currently leans towards extracting words. This paper studies the effectiveness of event elements at the scale of the event. First, the event elements are obtained from the tagged CEC corpus; then an event elements network is built and the importance of each node in the network is calculated; finally, concise summary sentences are selected and output as the text summary in the order of the original text. Experiments on the CEC corpus show better recall and precision than many other methods, and the average F value of this method rises to 0.63, indicating that it generalizes the text content better.
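The pipeline in this abstract (event elements, network, node importance, sentence selection in original order) can be sketched as follows. The abstract does not say how the network is built or how node importance is calculated, so this sketch assumes a co-occurrence network with weighted degree as the importance measure; the input format and all function names are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_network(sentence_elements):
    """Link two event elements each time they occur in the same sentence."""
    weights = defaultdict(int)
    for elements in sentence_elements:
        for a, b in combinations(sorted(set(elements)), 2):
            weights[(a, b)] += 1
    return weights


def node_importance(weights):
    """Assumed measure: weighted degree (sum of incident edge weights)."""
    importance = defaultdict(int)
    for (a, b), weight in weights.items():
        importance[a] += weight
        importance[b] += weight
    return importance


def summarize_events(sentences, sentence_elements, k=2):
    """Pick the k highest-scoring sentences and return them in source order."""
    importance = node_importance(build_network(sentence_elements))
    scores = [
        sum(importance[e] for e in set(elements))
        for elements in sentence_elements
    ]
    # Rank by score (ties broken by position), then restore original order.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]
```

Here `sentence_elements[i]` is the list of tagged event elements for `sentences[i]`, which in the paper's setting would come from the CEC annotations. Another centrality measure could be substituted in `node_importance` without changing the rest of the sketch.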
2 citations