Topic

Multi-document summarization

About: Multi-document summarization is a research topic. Over its lifetime, 2,270 publications have been published within this topic, receiving 71,850 citations.


Papers
Book Chapter
07 Nov 2022
TL;DR: In this paper, an integrated approach to text summarization using images, table labels, etc. is presented: the first step is text processing, and the second step uses graph-based algorithms to extract the most important sentences from the document.
Abstract: Analyzing text and image data is always time-consuming, and with the rapid growth in the amount of data, important meanings of the information may be lost. Natural language processing (NLP) and graph-based methods can be used to summarize large documents. The system proposed in this chapter is an integrated approach to text summarization using images, table labels, etc. The first step is text processing, which extracts important text information. The second step uses graph-based algorithms to extract the most important sentences from the document. Techniques such as node ranking aim to extract the meaningful information in the document. A machine learning (ML) model is used to train the system and provide an accurate document summary. NLP-based text summarization opens up various application areas, such as e-learning, meeting summarization, e-news systems, social network data analysis, rapid decision support in business analysis, and many more.
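The node-ranking step this abstract describes is in the spirit of TextRank-style graph ranking. Below is a minimal, self-contained sketch of that idea, assuming a word-overlap similarity measure and a PageRank-style power iteration; the chapter's exact graph construction, similarity function, and ML re-ranking step are not reproduced here.

```python
# Minimal sketch of graph-based extractive summarization via node ranking
# (TextRank-style). The similarity measure and damping factor are illustrative
# assumptions, not the chapter's exact configuration.
import re
from math import log


def split_sentences(text):
    """Naive sentence splitter; a real system would use an NLP toolkit."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Word-overlap similarity normalised by sentence length."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if len(wa) < 2 or len(wb) < 2:
        return 0.0
    return len(wa & wb) / (log(len(wa)) + log(len(wb)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Power iteration over the sentence-similarity graph (PageRank variant)."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                out = sum(weights[j])
                if weights[j][i] > 0 and out > 0:
                    rank += weights[j][i] / out * scores[j]
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(text, k=3):
    """Return the k highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))
```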
Proceedings Article
04 Dec 2014
TL;DR: This work proposes a mechanism that generates update summaries affordably, overcoming two shortcomings of existing methods: summaries that depend on the order in which new documents are added to the pool, and the need to process the whole set of documents each time the summary is updated.
Abstract: Documents published both online and offline are considered the primary source of information. The astonishing growth of documentation and communication systems tends to flood these pools of information sources with an enormous number of documents. In such a scenario, it is critical to envisage algorithms and methodologies that can convert these huge collections of documents into the best possible summaries. This will help cater to information hunters who need only abstract summaries in a fully digestible form. The different methods that can perform this task can be compared on the basis of the quality of the summaries they generate and the amount of processing power they demand. Existing methods are capable of generating summaries incrementally (updating summaries as and when new documents are added to the pool). But their inability to keep summaries unaffected by the order in which new documents are added to the pool, and their need to process the whole set of documents (together with those already summarized) each time the summary needs to be updated, pull them back from their potential applications. We propose a mechanism that can overcome these difficulties and generate update summaries in an affordable way.
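As an illustration of the two requirements this abstract names, the sketch below keeps only a commutative aggregate (corpus term counts plus the candidate sentence pool) between updates, so the resulting summary is independent of document arrival order and each update touches only the new document. This is an assumed baseline design, not the paper's actual mechanism.

```python
# Order-insensitive update summarization sketch: the state kept between
# updates is a commutative aggregate, so summaries do not depend on the
# order in which documents arrive, and previously seen documents are
# never re-processed.
from collections import Counter


class UpdateSummarizer:
    def __init__(self, summary_length=3):
        self.term_counts = Counter()   # corpus-wide term frequencies
        self.sentences = []            # candidate sentences seen so far
        self.summary_length = summary_length

    def add_document(self, text):
        """Process only the new document; earlier documents are untouched."""
        for sentence in text.split("."):
            sentence = sentence.strip()
            if sentence:
                self.sentences.append(sentence)
                self.term_counts.update(sentence.lower().split())

    def summary(self):
        """Score every candidate against the current aggregate and pick the top-k."""
        total = sum(self.term_counts.values()) or 1

        def score(sentence):
            words = sentence.lower().split()
            return sum(self.term_counts[w] / total for w in words) / len(words)

        ranked = sorted(self.sentences, key=score, reverse=True)
        return ranked[: self.summary_length]
```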
Dissertation
30 Jun 2011
TL;DR: The development of a fully automated evaluation method, and of an automatic summarization system which draws on the conceptual idea of the Pyramid evaluation scheme and the techniques developed for the proposed evaluation system.
Abstract: Summarization is the process of creating a more compact textual representation of a document or a collection of documents. In view of the vast increase in electronically available information sources in the last decade, filters such as automatically generated summaries are becoming ever more important to facilitate the efficient acquisition and use of required information. Different methods using natural language processing (NLP) techniques are being used to this end. One of the shallowest approaches is the clustering of available documents and the representation of the resulting clusters by one of the documents; an example of this approach is the Google News website. It is also possible to augment the clustering of documents with a summarization process, which would result in a more balanced representation of the information in the cluster, NewsBlaster being an example. However, while some systems are already available on the web, summarization is still considered a difficult problem in the NLP community. One of the major problems hampering the development of proficient summarization systems is the evaluation of the (true) quality of system-generated summaries. This is exemplified by the fact that the current state-of-the-art evaluation method to assess the information content of summaries, the Pyramid evaluation scheme, is a manual procedure. In this light, this thesis has three main objectives.
1. The development of a fully automated evaluation method. The proposed scheme is rooted in the ideas underlying the Pyramid evaluation scheme and makes use of deep syntactic information and lexical semantics. Its performance improves notably on previous automated evaluation methods.
2. The development of an automatic summarization system which draws on the conceptual idea of the Pyramid evaluation scheme and the techniques developed for the proposed evaluation system. The approach features the algorithm for determining the pyramid and bases importance on the number of occurrences of the variable-sized contributors of the pyramid, as opposed to word-based methods exploited elsewhere.
3. The development of a text coherence component that can be used for obtaining the best ordering of the sentences in a summary.
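For readers unfamiliar with the Pyramid scheme this thesis builds on, the sketch below shows its core scoring idea: content units (SCUs) are weighted by how many model summaries express them, and a candidate summary is scored against the best weight achievable with the same number of SCUs. SCU identification is reduced to exact matching here for simplicity; the thesis automates that step with deep syntactic information and lexical semantics.

```python
# Pyramid-style content scoring sketch. SCUs are represented as short labels;
# real Pyramid annotation identifies them in free text.
from collections import Counter


def build_pyramid(model_scu_sets):
    """Weight each SCU by the number of model (reference) summaries containing it."""
    weights = Counter()
    for scus in model_scu_sets:
        weights.update(set(scus))
    return weights


def pyramid_score(candidate_scus, weights):
    """Observed SCU weight over the ideal weight for an equally sized summary."""
    observed = sum(weights.get(scu, 0) for scu in set(candidate_scus))
    top_weights = sorted(weights.values(), reverse=True)[: len(set(candidate_scus))]
    ideal = sum(top_weights)
    return observed / ideal if ideal else 0.0


# Example: three model summaries and one candidate.
models = [{"A", "B", "C"}, {"A", "B"}, {"A", "D"}]
weights = build_pyramid(models)            # A:3, B:2, C:1, D:1
print(pyramid_score({"A", "C"}, weights))  # (3 + 1) / (3 + 2) = 0.8
```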
07 Dec 2018
TL;DR: This work states that with the recent success of sequence-to-sequence techniques, which are based on recurrent neural networks, there has been an increase in the amount of attention paid to automatic abstractive summarization techniques.
Abstract: Abstractive summarization, where new words and sentences that may not have been present in the original document or collection of documents are generated in the summary, is a difficult task, because it requires an understanding of the contents of the documents. As such, more research has historically been done in extractive summarization techniques than in abstractive ones. However, with the recent success of sequence-to-sequence techniques, which are based on recurrent neural networks, there has been an increase in the amount of attention paid to automatic abstractive summarization techniques. [2]
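The sequence-to-sequence setup referred to here pairs an RNN encoder over the source document with an RNN decoder that generates the summary token by token. Below is a minimal sketch of that architecture in PyTorch (a framework choice assumed for illustration); attention, copy mechanisms, and beam search, which practical abstractive systems rely on, are omitted.

```python
# Minimal RNN encoder-decoder (sequence-to-sequence) summarizer sketch.
import torch
import torch.nn as nn


class Seq2SeqSummarizer(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, source_ids, target_ids):
        # Encode the document; its final hidden state seeds the decoder.
        _, hidden = self.encoder(self.embed(source_ids))
        decoded, _ = self.decoder(self.embed(target_ids), hidden)
        return self.out(decoded)  # logits over the vocabulary at each step


# Toy usage: a batch of 2 documents (length 20) and target summaries (length 5).
model = Seq2SeqSummarizer(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 20)), torch.randint(0, 1000, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 1000])
```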
01 Jan 2015
TL;DR: This paper analyses the automatic generation of presentation slides from academic papers and proposes the PPSGen system, which has several advantages over baseline methods.
Abstract: Presentation-based learning is an effective way of learning in which students or employees of an organization receive continuous feedback from teammates or coaches. It is a pictorial way to represent the work. Before presenting their work, presenters have to prepare slides. These slides are usually made from an article, academic papers, or material gathered from the internet. As a result, more time is wasted creating slides than preparing the presentation itself. In this paper, we analyse the automatic generation of presentation slides from academic papers, so that presenters can prepare their formal slides more quickly. We therefore propose the PPSGen system to address this shortcoming of existing systems. PPSGen has several advantages over baseline methods.
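To make the task concrete, the sketch below drafts one slide per paper section and fills it with the highest-scoring sentences, using a plain word-frequency score. This is only an assumed illustration of slide generation from a paper's sections; PPSGen itself is a learning-based system and is not reproduced here.

```python
# Naive section-to-slide drafting sketch: one slide per section, bullets
# chosen by a simple word-frequency importance score.
from collections import Counter


def generate_slides(sections, bullets_per_slide=3):
    """sections: list of (title, text) pairs taken from the paper."""
    slides = []
    for title, text in sections:
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        counts = Counter(w for s in sentences for w in s.lower().split())

        def score(sentence):
            words = sentence.lower().split()
            return sum(counts[w] for w in words) / len(words)

        bullets = sorted(sentences, key=score, reverse=True)[:bullets_per_slide]
        slides.append({"title": title, "bullets": bullets})
    return slides


slides = generate_slides([("Introduction", "Summarization saves time. "
                           "Automatic slide generation builds on summarization.")])
print(slides[0]["title"], slides[0]["bullets"])
```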

Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
85% related
Ontology (information science)
57K papers, 869.1K citations
84% related
Web page
50.3K papers, 975.1K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
83% related
Graph (abstract data type)
69.9K papers, 1.2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    74
2022    160
2021    52
2020    61
2019    47
2018    52