Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
06 May 2009
TL;DR: Presents a scalable multimedia model that structures the multimedia scene into incremental Spatial, Temporal and Interactive layers and progressively provides presentation details; the approach has been technically validated on PowerPoint-like documents using a generic MPEG-21-based adaptation framework.
Abstract: The summarization of a multimedia document is a challenge that requires the summarization of the media elements combined into a document, but also relies on an appropriate adaptation of its presentation. In this paper, we present a scalable multimedia model that structures the multimedia scene into incremental Spatial, Temporal and Interactive layers and progressively provides presentation details. Our proposal consists of summarizing such scalable multimedia documents based on three adaptation parameters: a targeted level of expertise, a preferred duration and a level of expectation for extended information. Our approach has been technically validated on PowerPoint-like documents using a generic MPEG-21-based adaptation framework.
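
The sketch below illustrates the layer-selection idea described in this abstract: layers carrying increasing presentation detail are kept or dropped according to the three adaptation parameters. It is a minimal Python approximation, not the paper's MPEG-21 tooling; the Layer class and summarize_scalable_document function are hypothetical names introduced for this example.

```python
# Minimal sketch of the layered-selection idea: a scalable document is split into
# layers of increasing presentation detail, and only the layers compatible with the
# three adaptation parameters are kept. All names here are illustrative.
from dataclasses import dataclass

@dataclass
class Layer:
    kind: str          # "spatial", "temporal" or "interactive"
    detail_level: int  # 0 = core content, higher = more presentation detail
    duration: float    # seconds this layer adds to the presentation

def summarize_scalable_document(layers, expertise, preferred_duration, want_extended):
    """Select layers by target expertise, preferred duration and desire for extras."""
    selected, total = [], 0.0
    # Consider coarse (low-detail) layers first so the core content is always kept.
    for layer in sorted(layers, key=lambda l: l.detail_level):
        if layer.detail_level > expertise and not want_extended:
            continue  # skip details beyond the viewer's level of expertise
        if total + layer.duration > preferred_duration:
            break     # respect the preferred overall duration
        selected.append(layer)
        total += layer.duration
    return selected

# Example: a short, non-expert summary without extended information.
doc = [Layer("spatial", 0, 10), Layer("temporal", 1, 20), Layer("interactive", 2, 30)]
print(summarize_scalable_document(doc, expertise=1, preferred_duration=35, want_extended=False))
```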

1 citation

Proceedings ArticleDOI
13 Apr 2015
TL;DR: A simple, inexpensive and domain-independent system architecture is created for adding semantic analysis to the summarization process; it outperforms the baseline system by more than ten rankings and shows that semantic analysis and light-weight, open-domain techniques have potential.
Abstract: Excess amounts of unstructured data are quickly and easily accessible in digital format, yet there is no way for a human reader to 'ingest and digest' them as quickly. This information overload places too heavy a burden on society for its analysis and execution needs. Focused (i.e. topic, query, question, category, etc.) multi-document summarization is an information reduction solution that has reached a state of the art and now demands exploration of further techniques to model human summarization activity. Such techniques have been mainly extractive, relying on distributional statistics and complex machine learning over corpora in order to perform closely to humans. Consequently, the field needs to move toward more abstractive approaches that model human ways of summarizing. A simple, inexpensive and domain-independent system architecture is created for adding semantic analysis to the summarization process. Our system is novel for two reasons: first, its use of a semantic cue-words feature and semantic class weighting, as a new semantic analysis metric, to determine which sentences carry important information; second, its use of semantic triples clustering to decompose natural language sentences into their most basic meaning, reducing the complexity of processing sentences and capturing more of the likely semantically related information. In competition against the gold-standard baseline system from the Text Analysis Conference on the standardized summarization evaluation metric ROUGE, this work outperforms the baseline system by more than ten rankings. This work shows that semantic analysis and light-weight, open-domain techniques have potential.
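
As a rough illustration of the two ideas named in this abstract, the Python sketch below scores sentences with a semantic cue-word feature and groups subject-predicate-object triples that express the same fact. The cue-word list and the pre-extracted triples are placeholder inputs for the example, not the paper's actual resources.

```python
# Toy sketch: (1) score sentences with a semantic cue-word feature,
# (2) cluster subject-predicate-object triples so near-identical facts are grouped.
from collections import defaultdict

CUE_WORDS = {"because", "therefore", "announced", "caused", "resulted"}  # illustrative list

def cue_word_score(sentence: str) -> int:
    """Count cue words as a crude importance signal for a sentence."""
    return sum(1 for token in sentence.lower().split() if token in CUE_WORDS)

def cluster_triples(triples):
    """Group (subject, predicate, object) triples that share subject and predicate."""
    clusters = defaultdict(list)
    for s, p, o in triples:
        clusters[(s.lower(), p.lower())].append(o)
    return clusters

sentences = [
    "The storm caused severe flooding in the region.",
    "Officials announced an evacuation because rivers kept rising.",
]
ranked = sorted(sentences, key=cue_word_score, reverse=True)
print(ranked[0])
print(cluster_triples([("storm", "caused", "flooding"), ("Storm", "caused", "damage")]))
```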

1 citation

Journal Article
TL;DR: A comparative study is conducted on two types of summarization: opinion summarization using the proposed method, which uses two different sentiment lexicons (VADER and SentiWordNet), against extractive summarization using the established methods Luhn, Latent Semantic Analysis (LSA) and LexRank.
Abstract: Opinion summarization summarizes the opinions in texts, while extractive summarization summarizes texts without considering the opinions they contain. Can opinion summarization be used to produce a better extractive summary? This paper proposes to determine the effectiveness of opinion summarization against extractive text summarization. Sentiment, including emotion, which indicates whether a sentence is positive, negative or neutral, is considered. Sentences with strong sentiment, either positive or negative, are deemed important in text summarization for capturing the sentiments in a story text. Thus, a comparative study is conducted on two types of summarization: opinion summarization using the proposed method, which uses two different sentiment lexicons (VADER and SentiWordNet), against extractive summarization using the established methods Luhn, Latent Semantic Analysis (LSA) and LexRank. An experiment was performed on 20 news stories, comparing summaries generated by the proposed opinion summarization method against summaries generated by the established extractive summarization methods. In the experiment, the VADER sentiment analyzer produced the best score of 0.51 when evaluated against the LSA method using the ROUGE-1 metric. This implies that opinion summarization converges with extractive summarization.
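
A small sketch of sentiment-driven sentence selection with the VADER analyzer (from the vaderSentiment package) is shown below. Keeping the sentences with the strongest compound scores is one plausible reading of the opinion-summarization step described above, not necessarily the authors' exact procedure.

```python
# Sketch of sentiment-driven sentence selection with VADER (pip install vaderSentiment).
# Sentences with the strongest positive or negative compound score form the summary.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def opinion_summary(sentences, top_n=3):
    analyzer = SentimentIntensityAnalyzer()
    # polarity_scores returns {'neg', 'neu', 'pos', 'compound'}; |compound| measures strength.
    scored = [(abs(analyzer.polarity_scores(s)["compound"]), s) for s in sentences]
    scored.sort(reverse=True)
    return [s for _, s in scored[:top_n]]

story = [
    "The council approved the new park.",
    "Residents were thrilled by the decision.",
    "Critics called the budget overrun a disaster.",
    "Construction starts next spring.",
]
print(opinion_summary(story, top_n=2))
```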

1 citation

Journal ArticleDOI
TL;DR: This research work proposes an Entity Aware Text Summarization using Document Clustering (EASDC) technique to extract a summary from multiple documents; it showed an improvement of 1.6 percentage points when compared with the baseline methods TextRank and LexRank.
Abstract: Due to the rapid development of internet technology, social media and popular research article databases have generated large amounts of open textual information. This large amount of textual information leads to 'Big Data'. Textual information about an event or topic can be recorded repeatedly on different websites. Text summarization (TS) is an emerging research field that helps produce a summary from a single document or from multiple documents. Because much of the information in the documents is redundant, some or all of the redundant sentences may be omitted without changing the gist of the document. TS can be organized as an extractive process that collects salient passages from their original positions, rather than being semantic in nature. Preprocessing steps such as handling non-ASCII characters and punctuation, tokenization and lemmatization are involved in generating a summary. This research work proposes an Entity Aware Text Summarization using Document Clustering (EASDC) technique to extract a summary from multiple documents. Named Entity Recognition (NER) plays a vital part in the proposed work: topics and key terms are identified using NER. Extracted entities are ranked with Zipf's law, and sentence clusters are formed using k-means clustering. A cosine-similarity-based technique is used to eliminate similar sentences across the documents and produce a unique summary. The proposed EASDC technique was evaluated on a CNN dataset and showed an improvement of 1.6 percentage points when compared with the baseline methods TextRank and LexRank.
Keywords: Named entity recognition; text summarization; k-means clustering; Zipf's law
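
The sketch below approximates the pipeline stages named in the abstract (NER, Zipf-style entity weighting, k-means sentence clustering, cosine-similarity deduplication) using spaCy and scikit-learn. It is an assumption-laden reconstruction for illustration, not the published EASDC implementation, and it requires the en_core_web_sm spaCy model.

```python
# Rough approximation of the stages named in the abstract; not the published EASDC code.
# Requires: pip install spacy scikit-learn, plus the en_core_web_sm model.
from collections import Counter
import spacy
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

nlp = spacy.load("en_core_web_sm")

def easdc_like_summary(documents, n_clusters=3, dedup_threshold=0.8):
    sentences = [s.text.strip() for d in documents for s in nlp(d).sents]

    # 1) NER + Zipf-style weighting: the k-th most frequent entity gets weight 1/k.
    entity_counts = Counter(e.text.lower() for d in documents for e in nlp(d).ents)
    weights = {e: 1.0 / rank
               for rank, (e, _) in enumerate(entity_counts.most_common(), start=1)}

    # 2) Cluster sentences with TF-IDF + k-means, keep one representative per cluster.
    tfidf = TfidfVectorizer(stop_words="english")
    X = tfidf.fit_transform(sentences)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)

    def entity_score(sent):
        return sum(w for e, w in weights.items() if e in sent.lower())

    reps = []
    for c in range(n_clusters):
        members = [s for s, l in zip(sentences, labels) if l == c]
        if members:
            reps.append(max(members, key=entity_score))

    # 3) Drop near-duplicate representatives via cosine similarity.
    summary = []
    for s in reps:
        vec = tfidf.transform([s])
        if all(cosine_similarity(vec, tfidf.transform([kept]))[0, 0] < dedup_threshold
               for kept in summary):
            summary.append(s)
    return summary
```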

1 citation

01 Dec 2005
TL;DR: Redundancy in large text collections, such as the web, creates both problems and opportunities for natural language systems and can be exploited to identify important and accurate information for applications such as summarization and question answering.
Abstract: Redundancy in large text collections, such as the web, creates both problems and opportunities for natural language systems. On the one hand, the presence of numerous sources conveying the same information causes difficulties for end users of search engines and news providers; they must read the same information over and over again. On the other hand, redundancy can be exploited to identify important and accurate information for applications such as summarization and question answering.
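
One simple way to exploit this redundancy, sketched below in Python, is to score each sentence by how many other documents contain a near-duplicate of it, measured with TF-IDF cosine similarity. The representation and threshold are arbitrary choices for illustration, not the method described in the paper.

```python
# Illustrative sketch of exploiting redundancy: a sentence is considered important
# if several *other* documents contain a highly similar sentence.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def redundancy_scores(docs, threshold=0.6):
    """docs: list of documents, each a list of sentence strings."""
    sentences = [(i, s) for i, doc in enumerate(docs) for s in doc]
    texts = [s for _, s in sentences]
    sims = cosine_similarity(TfidfVectorizer(stop_words="english").fit_transform(texts))

    scores = []
    for a, (doc_a, sent_a) in enumerate(sentences):
        # Count distinct other documents that restate this sentence.
        supporting = {doc_b for b, (doc_b, _) in enumerate(sentences)
                      if doc_b != doc_a and sims[a, b] >= threshold}
        scores.append((len(supporting), sent_a))
    return sorted(scores, reverse=True)

docs = [
    ["The merger was approved on Monday.", "Shares rose slightly."],
    ["Regulators approved the merger on Monday.", "The CEO will step down."],
    ["The merger received approval Monday.", "Analysts expect layoffs."],
]
print(redundancy_scores(docs)[0])  # the widely repeated merger sentence scores highest
```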

1 citation


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations, 85% related
Ontology (information science): 57K papers, 869.1K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Recurrent neural network: 29.2K papers, 890K citations, 83% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 74
2022: 160
2021: 52
2020: 61
2019: 47
2018: 52