scispace - formally typeset
Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Proceedings ArticleDOI
13 Dec 2010
TL;DR: By defining a proper distortion measure and a new representation method, the combination of the last two models (the linear representation model and the facility location model) achieves good experimental results on the DUC2002 and DUC2004 datasets.
Abstract: Document summarization plays an important role in natural language processing and text mining. This paper proposes several novel information-theoretic models for multi-document summarization. These models treat document summarization as a transmission system and assume that the best summary is the one with minimum distortion. By defining a proper distortion measure and a new representation method, the combination of the last two models (the linear representation model and the facility location model) achieves good experimental results on the DUC2002 and DUC2004 datasets. We also show that the model is highly interpretable and extensible.

18 citations
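The facility location model described above can be illustrated with a small sketch. The paper's actual distortion measure and representation method are not given here, so this example assumes cosine similarity over term-frequency vectors and a standard greedy facility-location selection: pick the summary sentences that best "cover" every document sentence (each sentence is represented by its most similar selected sentence). All function names are hypothetical.

```python
from collections import Counter
import math

def tf_vector(sentence):
    # Term-frequency vector over lowercase whitespace tokens.
    return Counter(sentence.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def facility_location_summary(sentences, k):
    """Greedy facility-location selection: choose k sentences that
    maximize the total similarity between every sentence and its
    closest selected representative (i.e. minimize distortion)."""
    vecs = [tf_vector(s) for s in sentences]
    selected = []
    cover = [0.0] * len(sentences)  # similarity to nearest selected sentence
    for _ in range(min(k, len(sentences))):
        best_gain, best_idx, best_cover = -1.0, None, None
        for i in range(len(sentences)):
            if i in selected:
                continue
            new_cover = [max(cover[j], cosine(vecs[i], vecs[j]))
                         for j in range(len(sentences))]
            gain = sum(new_cover) - sum(cover)
            if gain > best_gain:
                best_gain, best_idx, best_cover = gain, i, new_cover
        selected.append(best_idx)
        cover = best_cover
    return [sentences[i] for i in sorted(selected)]
```

Greedy selection is the usual choice here because coverage objectives of this form are submodular, so the greedy summary comes with a constant-factor approximation guarantee.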

Proceedings ArticleDOI
Roy Bar-Haim, Yoav Kantor, Lilach Eden, Roni Friedman, Dan Lahav, Noam Slonim
01 Nov 2020
TL;DR: This work develops a method for automatic extraction of key points, enabling fully automatic analysis with performance comparable to a human expert, and demonstrates that the applicability of key point analysis goes well beyond argumentation data.
Abstract: When summarizing a collection of views, arguments or opinions on some topic, it is often desirable not only to extract the most salient points, but also to quantify their prevalence. Work on multi-document summarization has traditionally focused on creating textual summaries, which lack this quantitative aspect. Recent work has proposed to summarize arguments by mapping them to a small set of expert-generated key points, where the salience of each key point corresponds to the number of its matching arguments. The current work advances key point analysis in two important respects: first, we develop a method for automatic extraction of key points, which enables fully automatic analysis, and is shown to achieve performance comparable to a human expert. Second, we demonstrate that the applicability of key point analysis goes well beyond argumentation data. Using models trained on publicly available argumentation datasets, we achieve promising results in two additional domains: municipal surveys and user reviews. An additional contribution is an in-depth evaluation of argument-to-key point matching models, where we substantially outperform previous results.

18 citations
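The key point analysis workflow above — match each argument to a key point, then report key-point prevalence as the number of matched arguments — can be sketched in miniature. The actual system uses trained argument-to-key-point matching models; this toy version substitutes a Jaccard token-overlap score and a match threshold, and all names and the threshold value are illustrative assumptions.

```python
def jaccard(a, b):
    # Token-overlap similarity between two strings.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def key_point_analysis(arguments, key_points, threshold=0.2):
    """Map each argument to its best-matching key point (if the score
    clears the threshold) and return key points sorted by prevalence,
    i.e. by the number of matching arguments."""
    counts = {kp: 0 for kp in key_points}
    for arg in arguments:
        best_kp = max(key_points, key=lambda kp: jaccard(arg, kp))
        if jaccard(arg, best_kp) >= threshold:
            counts[best_kp] += 1
    return sorted(counts.items(), key=lambda kv: -kv[1])
```

For example, three municipal-survey comments about park benches and traffic noise would collapse into two key points with prevalence counts 2 and 1, which is the quantitative aspect the abstract argues plain textual summaries lack.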

Proceedings ArticleDOI
24 Nov 2003
TL;DR: Experiments on different genres of musical video, and comparisons with summaries based only on the music track or the video track, indicate that the proposed method produces summaries that better match users' expectations.
Abstract: In this paper, we propose a novel approach to automatically summarize musical videos. The proposed summarization scheme is different from current methods for video summarization. The musical video is separated into the musical and visual tracks. A music summary is created by analyzing the music content based on music features, an adaptive clustering algorithm and musical domain knowledge. Then, shots are detected and clustered in the visual track. Finally, the music video summary is created by aligning the music summary and the clustered video shots. Subjective studies by experienced users have been conducted to evaluate the quality of summarization. Experiments on different genres of musical video, and comparisons with summaries based only on the music track or the video track, indicate that the proposed method produces summaries that better match users' expectations.

18 citations
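The visual-track half of the pipeline above (detect shots, cluster them, then align representatives against the music summary's duration) can be sketched as follows. The paper's shot features and adaptive clustering algorithm are not specified here, so this sketch assumes shots are already reduced to small feature vectors and uses a simple greedy threshold clustering; every name, threshold, and the one-representative-per-cluster policy are assumptions.

```python
def euclid(a, b):
    # Euclidean distance between two equal-length feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_shots(features, threshold):
    """Greedy clustering: each shot joins the first cluster whose seed
    (first member) is within the distance threshold, else it starts a
    new cluster. Returns clusters as lists of shot indices."""
    clusters = []
    for i, f in enumerate(features):
        for c in clusters:
            if euclid(features[c[0]], f) <= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def align_to_music(clusters, shot_durations, target_duration):
    """Pick one representative shot per cluster, in order, until the
    combined duration covers the music-summary length."""
    picked, total = [], 0.0
    for c in clusters:
        if total >= target_duration:
            break
        rep = c[0]
        picked.append(rep)
        total += shot_durations[rep]
    return picked
```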

Proceedings ArticleDOI
21 Mar 2011
TL;DR: Evaluation with the pyramid method indicates that adding a corpus-specific vocabulary to traditional summarization methods improves performance, but not significantly, while the results show that the state-of-the-art summarization method LexRank is not feasible for scientific corpus summarization because of its high computational cost.
Abstract: In this paper, we investigate four approaches to scientific corpus summarization when only gold-standard keyterms are available. MEAD with its built-in default vocabulary, MEAD with a corpus-specific vocabulary extracted by the Keyphrase Extraction Algorithm (KEA), LexRank (a state-of-the-art summarization algorithm based on random walks) and W3SS (a summarization algorithm based on keyword density) are tested on two Computer Science research paper collections. We use a content evaluation method, the pyramid method, instead of the well-known ROUGE metrics, since no gold-standard summaries are available for our data. Evaluation with the pyramid method indicates that adding a corpus-specific vocabulary to the traditional summarization methods improves performance, but not significantly. On the other hand, visual inspection shows that current content evaluation methods, which use only the gold-standard keyterm information, are not intuitive, and the focus must shift toward better evaluation techniques, especially for the multi-document summarization problem. Even though the pyramid method looks for important keyterms in the resulting summaries, it cannot distinguish between a general introductory sentence about the area and a specific sentence on the core idea if they both contain the same keyterm. Our results also show that the state-of-the-art summarization method LexRank is not feasible for scientific corpus summarization because of its high computational cost.

18 citations
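LexRank, the random-walk baseline discussed above, scores each sentence by its centrality in a sentence-similarity graph. A minimal sketch, assuming cosine similarity over term-frequency vectors and a PageRank-style power iteration (parameter values are illustrative): note the pairwise similarity matrix, which makes the method quadratic in the number of sentences — the source of the computational cost the abstract reports for large scientific corpora.

```python
from collections import Counter
import math

def _tf(s):
    return Counter(s.lower().split())

def _cos(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def lexrank(sentences, threshold=0.1, damping=0.85, iters=50):
    """Rank sentences by centrality: build a similarity graph with
    edges above the threshold, row-normalize it into a stochastic
    matrix, and run power iteration. Returns indices, best first."""
    n = len(sentences)
    vecs = [_tf(s) for s in sentences]
    adj = [[1.0 if i != j and _cos(vecs[i], vecs[j]) > threshold else 0.0
            for j in range(n)] for i in range(n)]
    for row in adj:                       # row-normalize; dangling
        s = sum(row)                      # rows become uniform
        for j in range(n):
            row[j] = row[j] / s if s else 1.0 / n
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - damping) / n
                  + damping * sum(adj[j][i] * scores[j] for j in range(n))
                  for i in range(n)]
    return sorted(range(n), key=lambda i: -scores[i])
```

Sentences that share vocabulary with many others receive high scores, while an off-topic sentence ends up isolated in the graph and ranks last.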

Proceedings ArticleDOI
29 Aug 2013
TL;DR: A novel technique for generating the summarization of domain specific text from a single Web document by using statistical NLP techniques on the text in a reference corpus and on the web document is presented.
Abstract: Due to the exponential growth of web data, tools and mechanisms for automatic summarization of Web documents have become critical. Web data can be accessed from multiple sources, e.g., different Web pages, which makes searching for relevant pieces of information a difficult task. Therefore, an automatic summarizer is vital for reducing human effort. Text summarization is an important activity in the analysis of high volumes of text documents and is currently a major research topic in Natural Language Processing. It is the process of generating a summary of an input document by extracting its representative sentences. In this paper, we present a novel technique for summarizing domain-specific text from a single Web document by applying statistical NLP techniques to the text of a reference corpus and of the web document. The proposed summarizer generates a summary based on a calculated Sentence Weight (SW), which combines the rank of a sentence in the document's content, the number of terms and words in the sentence, and term frequency in the input corpus.

18 citations
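A minimal sketch of the Sentence Weight idea above. The paper's exact SW formula is not given here, so this version assumes a simple combination of two of the listed signals — average reference-corpus term frequency and sentence position in the document — and the function names, the multiplicative combination, and the position scoring are all illustrative assumptions.

```python
from collections import Counter

def sentence_weight(sentence, position, n_sentences, corpus_tf):
    """Hypothetical Sentence Weight (SW): the average reference-corpus
    frequency of the sentence's words, scaled by how early the
    sentence appears in the document."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    tf_score = sum(corpus_tf.get(w, 0) for w in words) / len(words)
    position_score = 1.0 - position / n_sentences  # earlier = higher
    return tf_score * position_score

def summarize(document_sentences, reference_corpus, k=2):
    """Score every sentence, keep the top k, and return them in
    original document order."""
    corpus_tf = Counter(w for s in reference_corpus
                        for w in s.lower().split())
    scored = [(sentence_weight(s, i, len(document_sentences), corpus_tf), i, s)
              for i, s in enumerate(document_sentences)]
    top = sorted(scored, reverse=True)[:k]
    return [s for _, i, s in sorted(top, key=lambda t: t[1])]
```

Returning the selected sentences in document order, rather than score order, keeps the extractive summary readable.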


Network Information
Related Topics (5)
Natural language: 31.1K papers, 806.8K citations (85% related)
Ontology (information science): 57K papers, 869.1K citations (84% related)
Web page: 50.3K papers, 975.1K citations (83% related)
Recurrent neural network: 29.2K papers, 890K citations (83% related)
Graph (abstract data type): 69.9K papers, 1.2M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52