Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over the lifetime, 2270 publications have been published within this topic receiving 71850 citations.


Papers
Journal ArticleDOI
TL;DR: A new image summarization algorithm is proposed for automatically summarizing a user's snapshot photos taken in a virtual environment, based on the user's context information and educational content, and for presenting the summarized photos shortly after the user's virtual reality experience.
Abstract: In this paper, we propose a new image summarization algorithm designed to automatically summarize a user's snapshot photos taken in a virtual environment, based on the user's context information and educational content, and to present the summarized photos shortly after the user's virtual reality experience. While other image summarization algorithms use date, location, and keywords to summarize large collections of photos effectively, this algorithm is intended to improve users' memory retention by recalling their interests and important educational content. The paper first describes criteria for extracting meaningful images to improve learning effects and the identification rate calculations, followed by the system architecture that integrates the virtual environment and the viewer interface. It also discusses a user study to model the algorithm's optimal identification rate, and outlines future research directions. Key Words: Image Summarization Algorithm, Memory Improvement, Educational Virtual Environment
Journal ArticleDOI
TL;DR: This survey focuses on some of the existing techniques of statistical document summarization, as well as summarization using semantic approaches, and discusses improvements that can be made to extractive text summarization.
Abstract: Generating a summary from text has been a key research area in recent years. Automatic text summarization reduces the human effort required to produce a summary from text document(s) with the help of computer programs. Various approaches, methods, and systems have been suggested and developed to date. This survey focuses on some of the existing techniques of statistical document summarization, as well as summarization using semantic approaches, and discusses improvements that can be made to extractive text summarization.
Posted Content
TL;DR: The authors adopt the Longformer architecture with appropriate input transformation and global attention to fit multi-document inputs, and use a Gap Sentence Generation objective with a new salient-sentence selection strategy for the whole cluster, called Entity Pyramid, to teach the model to select and aggregate information across a cluster of related documents.
Abstract: Recently proposed pre-trained generation models achieve strong performance on single-document summarization benchmarks. However, most of them are pre-trained with general-purpose objectives and mainly aim to process single-document inputs. In this paper, we propose PRIMER, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data. Specifically, we adopt the Longformer architecture with appropriate input transformation and global attention to fit multi-document inputs, and we use a Gap Sentence Generation objective with a new strategy, called Entity Pyramid, to select salient sentences for the whole cluster, teaching the model to select and aggregate information across a cluster of related documents. With extensive experiments on 6 multi-document summarization datasets from 3 different domains in the zero-shot, few-shot, and fully-supervised settings, our model, PRIMER, outperforms current state-of-the-art models in most of these settings by large margins. Code and pre-trained models are released at this https URL
Proceedings ArticleDOI
01 Jan 2022
TL;DR: Automatic evaluation indicates that removing straplines and noise from the training data of a news summarizer results in higher-quality summaries, with improvements as high as 7 ROUGE points.
Abstract: Recent improvements in automatic news summarization fundamentally rely on large corpora of news articles and their summaries. These corpora are often constructed by scraping news websites, which results in including not only summaries but also other kinds of texts. Apart from more generic noise, we identify straplines as a form of text scraped from news websites that commonly turns out not to be a summary. The presence of these non-summaries threatens the validity of scraped corpora as benchmarks for news summarization. We have annotated extracts from two news sources that form part of the Newsroom corpus (Grusky et al., 2018), labeling those which were straplines, those which were summaries, and those which were both. We present a rule-based strapline detection method that achieves good performance on a manually annotated test set. Automatic evaluation indicates that removing straplines and noise from the training data of a news summarizer results in higher-quality summaries, with improvements as high as 7 ROUGE points.
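The ROUGE improvement reported above is typically measured with the official ROUGE toolkit; as a hedged illustration of what the metric captures, here is a minimal ROUGE-1 F1 sketch (a simplification: real evaluations add stemming, stopword handling, and multiple references):

```python
# Simplified ROUGE-1 F1: unigram overlap between a candidate summary and a
# reference summary. This is an illustrative sketch, not the official scorer.
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped unigram overlap: each word counts at most as often as it
    # appears in the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f1("the cat sat", "the cat sat on the mat")` gives precision 1.0 and recall 0.5, so an F1 of about 0.67; a 7-point improvement on this 0–100-scaled metric is a substantial gain.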
Journal Article
TL;DR: This article proposed a hybrid model for sentence ordering in extractive multi-document summarization that combines four relations between sentences, treating each sentence as a vertex and the combined relation as an edge of a directed graph on which an approximately optimal ordering can be generated.
Abstract: Ordering information is a critical task for multi-document summarization because it heavily influences the coherence of the generated summary. In this paper, we propose a hybrid model for sentence ordering in extractive multi-document summarization that combines four relations between sentences. This model treats each sentence as a vertex and the combined relation as an edge of a directed graph, on which an approximately optimal ordering can be generated with PageRank analysis. Evaluation of our hybrid model shows a significant improvement in ordering over strategies that omit some of the relations, and the results also indicate that the hybrid model is robust for articles of different genres.
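The graph-based ordering idea above can be sketched as follows. This is a hypothetical toy implementation, not the paper's code: the combined-relation matrix and its weights are invented for illustration, and a plain power-iteration PageRank ranks the sentence vertices.

```python
# Hypothetical sketch: sentences are vertices, the combined pairwise relation
# score is the weighted edge, and power-iteration PageRank yields a ranking
# from which an ordering is read off. Illustrative only.

def pagerank(weights, damping=0.85, iters=50):
    """weights[i][j] = edge weight from sentence i to sentence j."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) or 1.0 for row in weights]  # avoid division by zero
    for _ in range(iters):
        new = []
        for j in range(n):
            # Mass flowing into vertex j from every other vertex.
            rank = sum(scores[i] * weights[i][j] / out_sums[i] for i in range(n))
            new.append((1 - damping) / n + damping * rank)
        scores = new
    return scores

# Toy combined-relation matrix for three extracted sentences (made-up values).
w = [
    [0.0, 0.8, 0.2],
    [0.1, 0.0, 0.9],
    [0.5, 0.5, 0.0],
]
scores = pagerank(w)
ordering = sorted(range(len(w)), key=lambda i: scores[i], reverse=True)
```

In the paper the edge weights combine four inter-sentence relations; here a single made-up matrix stands in for that combination.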

Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year	Papers
2023	74
2022	160
2021	52
2020	61
2019	47
2018	52