Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2270 publications have been published within this topic, receiving 71850 citations.


Papers
Proceedings ArticleDOI
01 Dec 2015
TL;DR: SENCLUS, a novel approach for automatic extractive text summarization, uses a genetic clustering algorithm to cluster sentences as a close representation of the text's topics via a fitness function based on redundancy and coverage, then applies a scoring function to select the most relevant sentences of each topic for the extractive summary.
Abstract: Automatic text summarization has become a relevant topic due to information overload. This automation aims to help humans and machines deal with the vast amount of text data (structured and unstructured) offered on the web and deep web. This paper presents SENCLUS, a novel approach for automatic extractive text summarization. Using a genetic clustering algorithm, SENCLUS clusters the sentences as a close representation of the text topics using a fitness function based on redundancy and coverage, and applies a scoring function to select the most relevant sentences of each topic to be part of the extractive summary. The approach was validated using the DUC2002 data set and ROUGE summary quality measures. The results show that the approach is competitive with state-of-the-art methods for extractive automatic text summarization.
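The coverage-and-redundancy idea behind SENCLUS can be illustrated with a much simpler stand-in: the sketch below is not the paper's genetic algorithm, but a greedy word-overlap clustering followed by picking the sentence that best covers each cluster's vocabulary. The stopword list, Jaccard threshold, and coverage score are all assumptions for illustration.

```python
# Illustrative sketch only (not SENCLUS itself): group sentences into rough
# "topics" by word overlap, then select the best-covering sentence per topic.

STOPWORDS = {"the", "a", "of", "to", "and", "in", "is", "for", "on"}

def words(sentence):
    # Crude tokenization: lowercase, strip trailing punctuation, drop stopwords.
    return {w.strip(".,").lower() for w in sentence.split()} - STOPWORDS

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(sentences, threshold=0.2):
    clusters = []  # each cluster is a list of (sentence, word-set) pairs
    for s in sentences:
        ws = words(s)
        for c in clusters:
            if jaccard(ws, c[0][1]) >= threshold:  # compare against cluster seed
                c.append((s, ws))
                break
        else:
            clusters.append([(s, ws)])
    return clusters

def summarize(sentences, threshold=0.2):
    summary = []
    for c in cluster(sentences, threshold):
        vocab = set().union(*(ws for _, ws in c))
        if not vocab:
            continue
        # Pick the sentence covering the largest share of the cluster vocabulary.
        best, _ = max(c, key=lambda sw: len(sw[1] & vocab) / len(vocab))
        summary.append(best)
    return summary
```

Here each cluster contributes exactly one sentence, which is where the redundancy control comes from; SENCLUS optimizes the clustering itself with a genetic fitness function rather than a fixed threshold.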

4 citations

Proceedings ArticleDOI
01 Sep 2018
TL;DR: This paper surveys several current methods for extractive text summarization over multi-document input sets, presents a way to create a heterogeneous multi-document corpus, and discusses the limitations of each method.
Abstract: Text summarization has long been one of the key research areas in Natural Language Processing (NLP). The various methods to summarize one or more documents can be broadly classified into extractive and abstractive text summarization: the former involves selecting key parts of the document and embedding them into the summary while balancing salience against redundancy; the latter involves creating new sentences to summarize the documents. Extractive summarization can further be done in a supervised manner with human involvement or in an unsupervised manner without any human intervention. This paper surveys several current methods for extractive text summarization where the input is a multi-document set. Multi-document summarization can consider two types of document sets: a homogeneous set of documents which share a common topic or theme, and a heterogeneous set where the main topics of the documents are unrelated but they contain some form of information related to the summary. The first method uses sentence regression, performing sentence ranking along with sentence relations followed by a greedy selection process. The second is an unsupervised paragraph embedding method utilizing density peaks clustering. The third proposes document-level reconstruction using a neural document model. The fourth is a query-focused, joint neural network based model with an attention mechanism. The fifth concentrates on coherence, providing a graph-based model which does not require discourse analysis as a prerequisite. We also present a way to create a heterogeneous multi-document corpus, along with the limitations of each of these methods.
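The salience-versus-redundancy balance and the greedy selection process mentioned above can be sketched in the spirit of Maximal Marginal Relevance. The overlap measure, the lambda weight, and the externally supplied salience scores are assumptions for illustration, not taken from any of the surveyed papers.

```python
# Minimal MMR-style greedy selection: reward salient sentences, penalise
# similarity to sentences already chosen for the summary.

def overlap(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def greedy_select(sentences, salience, k=2, lam=0.7):
    """Pick k sentences; lam trades salience against redundancy."""
    chosen, candidates = [], list(sentences)
    while candidates and len(chosen) < k:
        def score(s):
            # Redundancy = worst-case overlap with anything already selected.
            redundancy = max((overlap(s, c) for c in chosen), default=0.0)
            return lam * salience[s] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        chosen.append(best)
        candidates.remove(best)
    return chosen
```

Note how a slightly less salient but dissimilar sentence can beat a near-duplicate of an already-selected one; that is exactly the balance the abstract describes.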

4 citations

Proceedings ArticleDOI
01 Nov 2006
TL;DR: A Chinese news events summarization system based on a predefined template that automatically collects query-related news online, applies a language model to detect event-related information, calculates word frequencies based on semantic meaning, and uses information fusion techniques to merge the results.
Abstract: Summarization helps users absorb large amounts of information. This paper introduces a Chinese news events summarization system based on a predefined template. The system automatically collects query-related news online, applies a language model to detect event-related information, calculates word frequencies based on semantic meaning, and uses information fusion techniques to merge the results. In experiments on 50 events, the system achieved an average precision of 82.1% based on human judgment.

4 citations

Proceedings Article
01 Jan 2005
TL;DR: Describes the overall architecture of KMS and how it permits examination of the question-answering task and strategies not only within TREC but also in a real-world application in the bioterrorism domain.
Abstract: CL Research participated in the question answering track in TREC 2004, submitting runs for the main task, the document relevance task, and the relationship task. The tasks were performed using the Knowledge Management System (KMS), which provides a single interface for question answering, text summarization, information extraction, and document exploration. These tasks are based on creating and exploiting an XML representation of the texts in the AQUAINT collection. Question answering is performed directly within KMS, which answers questions either from the collection or from the Internet projected back onto the collection. For the main task, we submitted one run and our average per-series score was 0.136, with scores of 0.180 for factoid questions, 0.026 for list questions, and 0.152 for “other” questions. For the document ranking task, the average precision was 0.2253 and the R-precision was 0.2405. For the relationship task, we submitted two runs, with scores of 0.276 and 0.216; the first run was the best score on this task. We describe the overall architecture of KMS and how it permits examination of the question-answering task and strategies not only within TREC but also in a real-world application in the bioterrorism domain. We also raise some issues concerning the judgments used for evaluating TREC results and their possible relevance in a wider context.

4 citations

Journal ArticleDOI
TL;DR: An adaptive extractive multi-document generic (EMDG) methodology for automatic text summarization that relies on a novel approach for sentence similarity measure, a discriminative sentence selection method for sentence scoring and a reordering technique for the extracted sentences after removing the redundant sentences.
Abstract: Due to the increasing accessibility of online data and the availability of thousands of documents on the Internet, it has become very difficult for a human to review and analyze each document manually. The sheer size of such documents and data presents a significant challenge for users. Providing automatic summaries of specific topics helps users overcome this problem. Most current extractive multi-document summarization systems can successfully extract summary sentences; however, many limitations exist, including the degree of redundancy, inaccurate extraction of important sentences, low coverage, and poor coherence among the selected sentences. This paper introduces an adaptive extractive multi-document generic (EMDG) methodology for automatic text summarization. The framework of this methodology relies on a novel approach for sentence similarity measure, a discriminative sentence selection method for sentence scoring, and a reordering technique for the extracted sentences after removing the redundant sentences.
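The EMDG framework hinges on a sentence similarity measure. The paper's measure is novel and not reproduced here; as a stand-in, the sketch below shows the common baseline such measures are compared against: cosine similarity over term-frequency vectors.

```python
import math
from collections import Counter

# Baseline sentence similarity: cosine of term-frequency vectors.
# Returns 1.0 for identical bags of words, 0.0 for disjoint vocabularies.

def cosine_sim(s1, s2):
    v1, v2 = Counter(s1.lower().split()), Counter(s2.lower().split())
    dot = sum(v1[w] * v2[w] for w in v1)
    n1 = math.sqrt(sum(c * c for c in v1.values()))
    n2 = math.sqrt(sum(c * c for c in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

Redundancy removal of the kind the abstract mentions typically drops a candidate sentence whenever its similarity to an already-selected sentence exceeds a threshold.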

4 citations


Network Information
Related Topics (5)
- Natural language: 31.1K papers, 806.8K citations (85% related)
- Ontology (information science): 57K papers, 869.1K citations (84% related)
- Web page: 50.3K papers, 975.1K citations (83% related)
- Recurrent neural network: 29.2K papers, 890K citations (83% related)
- Graph (abstract data type): 69.9K papers, 1.2M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52