Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published on this topic, receiving 71,850 citations.


Papers
Journal Article (DOI)
TL;DR: Experiments on benchmark datasets show that the proposed summarization approach significantly outperforms relevant state-of-the-art baselines and the Semantic Link Network plays an important role in representing and understanding documents.
Abstract: The key to realizing advanced document summarization is the semantic representation of documents. This paper investigates the role of the Semantic Link Network in representing and understanding documents for multi-document summarization. It proposes a novel abstractive multi-document summarization framework that first transforms documents into a Semantic Link Network of concepts and events and then transforms the Semantic Link Network into the summary of the documents, based on the selection of important concepts and events while keeping semantic coherence. Experiments on benchmark datasets show that the proposed summarization approach significantly outperforms relevant state-of-the-art baselines and that the Semantic Link Network plays an important role in representing and understanding documents.

19 citations
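To make the pipeline above more concrete, here is a minimal sketch of its graph-based selection step: documents become a network of linked units, node importance is computed over that network, and the most important content is kept. Everything below is an assumption for illustration only: plain word co-occurrence stands in for the paper's concept/event extraction and typed semantic links, PageRank stands in for its importance measure, and the output is extractive rather than abstractive.

```python
# Sketch only: co-occurrence graph + PageRank as a stand-in for the paper's
# Semantic Link Network construction and abstractive generation.
import itertools
import networkx as nx

def _terms(sentence):
    return {w.strip(".,!?").lower() for w in sentence.split() if len(w) > 3}

def build_link_graph(sentences):
    """Link terms that co-occur in a sentence (a crude proxy for semantic links)."""
    g = nx.Graph()
    for sent in sentences:
        for a, b in itertools.combinations(sorted(_terms(sent)), 2):
            prev = g.get_edge_data(a, b, {"weight": 0})["weight"]
            g.add_edge(a, b, weight=prev + 1)
    return g

def select_important(sentences, k=2):
    """Rank sentences by the PageRank mass of the terms they cover."""
    rank = nx.pagerank(build_link_graph(sentences), weight="weight")
    score = lambda s: sum(rank.get(t, 0.0) for t in _terms(s))
    return sorted(sentences, key=score, reverse=True)[:k]

docs = [
    "The flood damaged several bridges in the northern region.",
    "Rescue teams evacuated residents after the flood hit the region.",
    "Officials announced new funding to rebuild the damaged bridges.",
]
print(select_important(docs))
```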

Proceedings Article
01 May 2006
TL;DR: A novel method for automatic text summarization through text extraction, using computational semantics, to construct a summarizer that can be quickly assembled, with the use of only a very few basic language tools, for languages that lack large amounts of structured or annotated data or advanced tools for linguistic processing.
Abstract: In this paper we present a novel method for automatic text summarization through text extraction, using computational semantics. The new idea is to view all the extracted text as a whole and compute a score for the total impact of the summary, instead of ranking, for instance, individual sentences. A greedy search strategy is used to search through the space of possible summaries and select the summary with the highest score among those found. The aim has been to construct a summarizer that can be quickly assembled, with the use of only a very few basic language tools, for languages that lack large amounts of structured or annotated data or advanced tools for linguistic processing. The proposed method is largely language independent, though we only evaluate it on English in this paper, using ROUGE scores on texts from, among others, the DUC 2004 task 2. On this task our method performs better than several of the systems evaluated there, but worse than the best systems.

19 citations
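The core idea of this method, scoring the summary as a whole and searching greedily over candidate summaries, can be sketched in a few lines. The impact function below (weighted term coverage minus a redundancy penalty) and the word budget are placeholders chosen for illustration, not the paper's actual scoring based on computational semantics.

```python
# Sketch of whole-summary scoring with greedy search; the score function is an
# assumed stand-in for the paper's semantics-based impact measure.
from collections import Counter

def summary_score(summary_sents, term_weights):
    """Score the summary as a whole: weighted coverage of document terms,
    lightly penalized for redundant (repeated) words."""
    covered, total_words = set(), 0
    for s in summary_sents:
        words = [w.lower() for w in s.split()]
        covered.update(words)
        total_words += len(words)
    coverage = sum(term_weights.get(t, 0.0) for t in covered)
    redundancy = max(total_words - len(covered), 0)
    return coverage - 0.1 * redundancy

def greedy_summarize(sentences, term_weights, max_words=25):
    """Greedily add the sentence that most improves the whole-summary score."""
    summary = []
    while True:
        base = summary_score(summary, term_weights)
        best, best_gain = None, 0.0
        for s in sentences:
            if s in summary:
                continue
            if sum(len(x.split()) for x in summary + [s]) > max_words:
                continue
            gain = summary_score(summary + [s], term_weights) - base
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:
            return summary
        summary.append(best)

sentences = [
    "The storm closed the regional airport for two days.",
    "Two days of closure stranded thousands of passengers.",
    "Airlines offered refunds and hotel vouchers to stranded passengers.",
]
term_weights = Counter(w.lower() for s in sentences for w in s.split())
print(greedy_summarize(sentences, term_weights))
```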

Journal Article (DOI)
TL;DR: This work integrates automatically acquired information from external search engines with a semi-supervised probabilistic graphical model, and shows that this helps to achieve significantly better classification performance without the need for much training data.

19 citations

Posted Content
TL;DR: This work proposes to extract a set of candidate summaries for each document set based on an ILP framework and then leverage Ranking SVM for summary reranking; evaluation results on the benchmark DUC datasets validate the efficacy and robustness of the proposed approach.
Abstract: Existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets. Conversely, a baseline summarization model may produce high-quality summaries for some document sets. Based on the above observations, we treat the summaries produced by different summarization models as candidate summaries and then explore discriminative reranking techniques to identify high-quality summaries from the candidates for different document sets. We propose to extract a set of candidate summaries for each document set based on an ILP framework, and then leverage Ranking SVM for summary reranking. Various useful features have been developed for the reranking process, including word-level features, sentence-level features and summary-level features. Evaluation results on the benchmark DUC datasets validate the efficacy and robustness of our proposed approach.

19 citations
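A rough sketch of the reranking step only, under several assumptions: the ILP-based candidate extraction is not shown (candidates are represented by made-up feature vectors and quality scores), and Ranking SVM is approximated with the standard pairwise-difference transformation trained with scikit-learn's LinearSVC rather than whatever SVM-rank implementation the authors actually used.

```python
# Pairwise Ranking-SVM-style reranking of candidate summaries (sketch).
import numpy as np
from itertools import combinations
from sklearn.svm import LinearSVC

def pairwise_transform(X, y, groups):
    """Turn within-group ranking into binary classification over feature differences."""
    Xp, yp = [], []
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        for i, j in combinations(idx, 2):
            if y[i] == y[j]:
                continue
            diff = X[i] - X[j]
            sign = 1 if y[i] > y[j] else -1
            Xp.append(diff * sign)    # oriented so the better candidate is positive
            yp.append(1)
            Xp.append(-diff * sign)
            yp.append(-1)
    return np.array(Xp), np.array(yp)

# Toy data: 2 document sets x 3 candidate summaries, 4 features each;
# y plays the role of ROUGE-like quality labels (made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
y = np.array([0.30, 0.42, 0.25, 0.38, 0.31, 0.45])
groups = [0, 0, 0, 1, 1, 1]

Xp, yp = pairwise_transform(X, y, groups)
ranker = LinearSVC(C=1.0).fit(Xp, yp)
scores = X @ ranker.coef_.ravel()         # higher score = preferred candidate

for g in sorted(set(groups)):
    idx = [i for i, gi in enumerate(groups) if gi == g]
    best = max(idx, key=lambda i: scores[i])
    print(f"document set {g}: best candidate index {best} (score {scores[best]:.3f})")
```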

Proceedings Article (DOI)
27 Aug 2011
TL;DR: This work presents techniques for automatically summarizing the topical content of an audio corpus; an example summarization of conversational data from the Fisher corpus is presented and evaluated to demonstrate the effectiveness of the approach.
Abstract: This work presents techniques for automatically summarizing the topical content of an audio corpus. Probabilistic latent semantic analysis (PLSA) is used to learn a set of latent topics in an unsupervised fashion. These latent topics are ranked by their relative importance in the corpus and a summary of each topic is generated from signature words that aptly describe the content of that topic. This paper presents techniques for producing a high quality summarization. An example summarization of conversational data from the Fisher corpus that demonstrates the effectiveness of our approach is presented and evaluated. Index Terms: latent topic modeling, speech summarization

19 citations
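The recipe in the abstract (learn latent topics, rank them by corpus-wide importance, describe each with signature words) can be sketched as follows. scikit-learn ships no PLSA estimator, so LatentDirichletAllocation is deliberately substituted here, and the toy strings stand in for ASR transcripts of the audio corpus; none of this reproduces the paper's actual setup.

```python
# Topic summarization sketch: LDA substituted for PLSA, placeholder transcripts.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

transcripts = [
    "the game went into overtime and the crowd cheered",
    "the team scored late in the game to win the title",
    "the recipe calls for butter flour and a pinch of salt",
    "bake the dough until golden and let it cool",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(transcripts)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)            # per-document topic weights

importance = doc_topic.sum(axis=0)          # rank topics by corpus-wide mass
terms = vec.get_feature_names_out()
for t in importance.argsort()[::-1]:
    top = lda.components_[t].argsort()[::-1][:5]   # signature words for this topic
    print(f"topic {t} (weight {importance[t]:.2f}):",
          ", ".join(terms[i] for i in top))
```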


Network Information
Related Topics (5)
- Natural language: 31.1K papers, 806.8K citations, 85% related
- Ontology (information science): 57K papers, 869.1K citations, 84% related
- Web page: 50.3K papers, 975.1K citations, 83% related
- Recurrent neural network: 29.2K papers, 890K citations, 83% related
- Graph (abstract data type): 69.9K papers, 1.2M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52