
Showing papers on "Multi-document summarization published in 2023"


Proceedings ArticleDOI
10 Feb 2023
TL;DR: In this paper, PDSum, a prototype-driven continuous summarization algorithm, is proposed for summarizing evolving multi-document set streams; it builds a lightweight prototype of each document set and exploits it to adapt to new documents while preserving knowledge accumulated from previous documents.
Abstract: Summarizing text-rich documents has been long studied in the literature, but most of the existing efforts have been made to summarize a static and predefined multi-document set. With the rapid development of online platforms for generating and distributing text-rich documents, there arises an urgent need for continuously summarizing dynamically evolving multi-document sets where the composition of documents and sets is changing over time. This is especially challenging as the summarization should be not only effective in incorporating relevant, novel, and distinctive information from each concurrent multi-document set, but also efficient in serving online applications. In this work, we propose a new summarization problem, Evolving Multi-Document sets stream Summarization (EMDS), and introduce a novel unsupervised algorithm PDSum with the idea of prototype-driven continuous summarization. PDSum builds a lightweight prototype of each multi-document set and exploits it to adapt to new documents while preserving accumulated knowledge from previous documents. To update new summaries, the most representative sentences for each multi-document set are extracted by measuring their similarities to the prototypes. A thorough evaluation with real multi-document sets streams demonstrates that PDSum outperforms state-of-the-art unsupervised multi-document summarization algorithms in EMDS in terms of relevance, novelty, and distinctiveness and is also robust to various evaluation settings.
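The prototype idea can be illustrated with a small sketch. This is not the authors' implementation (PDSum operates over learned representations); it is a minimal illustration assuming bag-of-words vectors as a stand-in for embeddings, with hypothetical names such as `PrototypeSummarizer`:

```python
import math
from collections import Counter

def bow(text):
    # bag-of-words vector; a stand-in for the learned sentence
    # embeddings a real system would use
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) \
        * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

class PrototypeSummarizer:
    """Keeps one lightweight prototype per document set; new documents
    update the prototype instead of reprocessing the full history."""

    def __init__(self):
        self.prototype = Counter()

    def add_document(self, sentences):
        for s in sentences:
            self.prototype.update(bow(s))  # accumulate knowledge

    def summarize(self, sentences, k=1):
        # extract the k sentences most similar to the prototype
        ranked = sorted(sentences,
                        key=lambda s: cosine(bow(s), self.prototype),
                        reverse=True)
        return ranked[:k]
```

The key property mirrored here is that updating a summary requires only the prototype and the new sentences, which is what makes continuous summarization of a stream affordable.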

2 citations



Journal ArticleDOI
TL;DR: This paper presents an effective way to summarize a single Hindi text document at a time using the TextRank algorithm, based on natural language processing (NLP).
Abstract: The availability of information accessible in digital form has accelerated. Retrieving useful documents from such a large pool of information is difficult, so summarizing these text documents is crucial. Text summarization is the process of condensing an original source document to capture its essential information. It eliminates redundant, less important content and provides the vital information in a shorter version, usually about half the length of the original text. Creating a manual summary is a very time-consuming task; automatic summarization helps in getting the gist of the information present in a particular document in a very short time. Compared with other Indian regional languages, relatively little work has been done on summarization of Hindi documents. This paper presents an effective way to summarize using the TextRank algorithm. It focuses on summarizing a single Hindi text document at a time based on natural language processing (NLP).
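The TextRank approach the paper applies to Hindi can be sketched language-independently: sentences become graph vertices, edge weights come from word overlap, and a PageRank-style iteration scores the vertices. A minimal sketch (using the standard Mihalcea-and-Tarau overlap similarity; function names are illustrative):

```python
import math

def overlap(s1, s2):
    # TextRank-style similarity: shared words, normalized by
    # sentence lengths to avoid favoring long sentences
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    shared = len(w1 & w2)
    if shared == 0 or len(w1) < 2 or len(w2) < 2:
        return 0.0
    return shared / (math.log(len(w1)) + math.log(len(w2)))

def textrank(sentences, d=0.85, iters=50):
    n = len(sentences)
    sim = [[0.0 if i == j else overlap(a, b)
            for j, b in enumerate(sentences)]
           for i, a in enumerate(sentences)]
    scores = [1.0] * n
    for _ in range(iters):  # power iteration, as in PageRank
        scores = [(1 - d) + d * sum(sim[j][i] / sum(sim[j]) * scores[j]
                                    for j in range(n) if sim[j][i] > 0)
                  for i in range(n)]
    return scores
```

The highest-scoring sentences form the extract. Nothing here is language-specific, which is why the same pipeline applies to Hindi once tokenization is handled.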

1 citation


Journal ArticleDOI
TL;DR: In this article, the authors present a classification and analysis of video summarization approaches, with a focus on real-time video summarization (RVS) techniques that can be used to summarize videos.
Abstract: With the massive expansion of videos on the internet, searching through millions of them has become quite challenging. Smartphones, recording devices, and file sharing are all examples of ways to capture massive amounts of real time video. In smart cities, there are many surveillance cameras, which has created a massive volume of video data whose indexing, retrieval, and administration is a difficult problem. Exploring such results takes time and degrades the user experience. In this case, video summarization is extremely useful. Video summarization allows for the efficient storing, retrieval, and browsing of huge amounts of information from video without sacrificing key features. This article presents a classification and analysis of video summarization approaches, with a focus on real-time video summarization (RVS) domain techniques that can be used to summarize videos. The current study will be useful in integrating essential research findings and data for quick reference, laying the preliminaries, and investigating prospective research directions. A variety of practical uses, including aberrant detection in a video surveillance system, have made successful use of video summarization in smart cities.

1 citation



Proceedings ArticleDOI
25 Jan 2023
TL;DR: In this paper, the authors investigate the performance of two unsupervised methods, Latent Semantic Analysis (LSA) and Maximal Marginal Relevance (MMR), in summarizing transcriptions of Persian broadcast news.
Abstract: The methods of automatic speech summarization are classified into two groups: supervised and unsupervised methods. Supervised methods are based on a set of features, while unsupervised methods perform summarization based on a set of rules. Latent Semantic Analysis (LSA) and Maximal Marginal Relevance (MMR) are considered the most important and well-known unsupervised methods in automatic speech summarization. This study set out to investigate the performance of two aforementioned unsupervised methods in transcriptions of Persian broadcast news summarization. The results show that in generic summarization, LSA outperforms MMR, and in query-based summarization, MMR outperforms LSA in broadcast news summarization.
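MMR, one of the two methods compared above, greedily balances relevance to a query against redundancy with already-selected sentences. A minimal sketch of the standard MMR selection rule (the Jaccard similarity here is a toy stand-in for whatever sentence similarity a real system uses):

```python
def jaccard(a, b):
    x, y = set(a.lower().split()), set(b.lower().split())
    return len(x & y) / len(x | y) if x | y else 0.0

def mmr(candidates, query, sim, lam=0.7, k=2):
    """Greedy Maximal Marginal Relevance: each pick maximizes
    lam * relevance-to-query minus (1 - lam) * redundancy with
    the sentences already selected."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        best = max(pool, key=lambda s: lam * sim(s, query)
                   - (1 - lam) * max((sim(s, t) for t in selected),
                                     default=0.0))
        selected.append(best)
        pool.remove(best)
    return selected
```

In generic summarization the query can be replaced by a centroid of the whole document, which is one way the same rule serves both the generic and query-based settings the paper evaluates.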

1 citation


Book ChapterDOI
01 Jan 2023
TL;DR: In this paper, the authors propose a Fuzzy Bi-GRU model for automatic text summarization (ATS) that scores sentences with a fuzzy inference system, removes redundancy with a Bi-GRU, and then generates an abstractive summary.
Abstract: As a massive amount of information is produced on the internet nowadays, extracting the most useful and relevant information from that mass is an attractive research problem, and it is possible through a mechanism called automatic text summarization (ATS). Summarization is classified into single-document and multi-document based on the number of source documents. When multiple source documents communicate similar information, they are called multi-documents, and summarizing them is the biggest challenge in the field of ATS. This motivates us to work on long multi-documents by calculating sentence scores using a fuzzy inference system. From the extracted sentences, similarity or redundancy is removed using a Bi-GRU, and then an abstractive summary is generated from the identified sentences. The proposed system is validated and tested using standard datasets, namely DUC, BBC News, and CNN/Daily Mail. The proposed Fuzzy Bi-GRU is compared with other cutting-edge models, and empirical results indicate that it outperforms all of them in terms of ROUGE-N and ROUGE-L scores.

1 citation


Journal ArticleDOI
TL;DR: This paper proposes an extractive summarization model based on a graph neural network (GNN) that learns cross-sentence relationships using a graph-structured document representation.
Abstract: Extractive text summarization selects the most important sentences from a document, preserves their original meaning, and produces an objective and fact-based summary. It is faster and less computationally intensive than abstractive summarization techniques. Learning cross-sentence relationships is crucial for extractive text summarization. However, most of the language models currently in use process text data sequentially, which makes it difficult to capture such inter-sentence relations, especially in long documents. This paper proposes an extractive summarization model based on the graph neural network (GNN) to address this problem. The model effectively represents cross-sentence relationships using a graph-structured document representation. In addition to sentence nodes, we introduce two node types of different granularity in the graph structure, words and topics, which bring different levels of semantic information. The node representations are updated by a graph attention network (GAT). The final summary is obtained by binary classification of the sentence nodes. Our text summarization method was demonstrated to be highly effective, as supported by the results of our experiments on the CNN/DM and NYT datasets. Specifically, our approach outperformed baseline models of the same type in terms of ROUGE scores on both datasets, indicating the potential of our proposed model for enhancing text summarization tasks.

Book ChapterDOI
TL;DR: This paper presents SmartEDU, a platform for drafting slides for a textual document, and the research that led to its development; evaluations found a Distillbart model preferable to unsupervised summarization.
Abstract: Slide decks are a common medium for presenting a topic. To reduce the time required for their preparation, we present SmartEDU, a platform for drafting slides for a textual document, and the research that led to its development. Drafts are Powerpoint files generated in three steps: pre-processing, for acquiring or discovering section titles; summarization, for compressing the contents of each section; and slide composition, for organizing the summaries into slides. The resulting file may be further edited by the user. Several summarization methods were tested on public datasets of presentations and on Wikipedia articles. Based on automatic evaluation measures and collected human opinions, we conclude that a Distillbart model is preferred to unsupervised summarization, especially when it comes to overall draft quality.

Journal ArticleDOI
TL;DR: This article presents a method for summarizing multi-document news web pages based on similarity models and sentence ranking, in which relevant sentences are extracted from English-language articles collected from five news websites covering the same topic and event.
Abstract: In the area of text summarization, there have been significant advances recently. In the meantime, the current trend in text summarization is focused more on news summarization. Therefore, developing a synthesis approach capable of extracting, comparing, and ranking sentences is vital to create a summary of various news articles in the context of erroneous online data. It is necessary, however, for the news summarization system to be able to deal with multi-document summaries due to content redundancy. This paper presents a method for summarizing multi-document news web pages based on similarity models and sentence ranking, where relevant sentences are extracted from the original article. English-language articles are collected from five news websites that cover the same topic and event. According to our experimental results, our approach provides better results than other recent methods for summarizing news.

Posted ContentDOI
07 Feb 2023
TL;DR: FINDSum is the first large-scale dataset for long text and multi-table summarization; built on 21,125 annual reports from 3,794 companies, it has two subsets for summarizing each company's results of operations and liquidity.
Abstract: Automatic document summarization aims to produce a concise summary covering the input document's salient information. Within a report document, the salient information can be scattered in the textual and non-textual content. However, existing document summarization datasets and methods usually focus on the text and filter out the non-textual content. Missing tabular data can limit produced summaries' informativeness, especially when summaries require covering quantitative descriptions of critical metrics in tables. Existing datasets and methods cannot meet the requirements of summarizing long text and multiple tables in each report. To deal with the scarcity of available data, we propose FINDSum, the first large-scale dataset for long text and multi-table summarization. Built on 21,125 annual reports from 3,794 companies, it has two subsets for summarizing each company's results of operations and liquidity. To summarize the long text and dozens of tables in each report, we present three types of summarization methods. Besides, we propose a set of evaluation metrics to assess the usage of numerical information in produced summaries. Dataset analyses and experimental results indicate the importance of jointly considering input textual and tabular data when summarizing report documents.


Journal ArticleDOI
09 May 2023 - PLOS ONE
TL;DR: This paper proposes a graph-based extractive single-document summarization method for Hausa text that modifies the existing PageRank algorithm by using the normalized common bigram count between adjacent sentences as the initial vertex score.
Abstract: Automatic text summarization is one of the most promising solutions to the ever-growing challenges of textual data as it produces a shorter version of the original document with fewer bytes, but the same information as the original document. Despite the advancements in automatic text summarization research, research involving the development of automatic text summarization methods for documents written in Hausa, a Chadic language widely spoken in West Africa by approximately 150,000,000 people as either their first or second language, is still in the early stages of development. This study proposes a novel graph-based extractive single-document summarization method for Hausa text by modifying the existing PageRank algorithm using the normalized common bigram count between adjacent sentences as the initial vertex score. The proposed method is evaluated using a primarily collected Hausa summarization evaluation dataset comprising 113 Hausa news articles, using the ROUGE evaluation toolkit. The proposed approach outperformed the standard methods on the same dataset: it outperformed the TextRank method by 2.1%, LexRank by 12.3%, the centroid-based method by 19.5%, and the BM25 method by 17.4%.
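The paper's modification can be sketched concretely: instead of seeding every vertex with the uniform 1/N score, each sentence starts with its normalized count of bigrams shared with adjacent sentences, and a PageRank-style iteration then refines the scores. This is a sketch reconstructed from the abstract's description; the paper's exact normalization and edge weighting may differ:

```python
def bigrams(sentence):
    toks = sentence.lower().split()
    return set(zip(toks, toks[1:]))

def initial_scores(sentences):
    # initial vertex score: bigrams shared with adjacent sentences,
    # normalized by the sentence's own bigram count
    out = []
    for i, s in enumerate(sentences):
        bg = bigrams(s)
        shared = sum(len(bg & bigrams(sentences[j]))
                     for j in (i - 1, i + 1) if 0 <= j < len(sentences))
        out.append(shared / max(len(bg), 1))
    return out

def ranked(sentences, d=0.85, iters=30):
    n = len(sentences)
    w = [[0 if i == j else len(bigrams(a) & bigrams(b))
          for j, b in enumerate(sentences)]
         for i, a in enumerate(sentences)]
    scores = initial_scores(sentences)  # replaces the uniform 1/N seed
    for _ in range(iters):
        scores = [(1 - d) + d * sum(w[j][i] / max(sum(w[j]), 1) * scores[j]
                                    for j in range(n)) for i in range(n)]
    return scores
```

Because bigram overlap only requires tokenization, the approach carries over to low-resource languages like Hausa without any trained components.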

Book ChapterDOI
01 Jan 2023
TL;DR: In this paper, the authors investigate extractive and abstractive methods for text summarization, in which the importance of sentences is calculated using linguistic and statistical characteristics, and survey many efforts in automatic summarization, particularly recent ones.
Abstract: The volume of data on the Internet has increased at an exponential rate during the previous decade. Consequently, the need emerges for a method of converting this massive amount of raw data into meaningful information that a human brain can comprehend. Text summarization is a common research technique that aids in dealing with a massive quantity of data. Automatic summarization is a well-known approach for distilling the important ideas in a document; it works by creating a shortened form of the text while preserving important information. Techniques for text summarization are classified as extractive or abstractive. Extractive summarization methods reduce the burden of summarization by choosing a few relevant sentences from the original text, where the importance of sentences is calculated using linguistic and statistical characteristics. This paper investigates extractive and abstractive methods for text summarization. We also explore many efforts in automatic summarization, particularly recent ones.

Journal ArticleDOI
TL;DR: Automatic summarization is the act of computationally condensing a set of data to produce a subset (a summary) that captures the key ideas or information within the original text.
Abstract: Automatic summarization is the act of computationally condensing a set of data to produce a subset (a summary) that captures the key ideas or information within the original text. To do this, artificial intelligence algorithms tailored to diverse sorts of data are frequently created and used. Ten research articles from databases such as IEEE, Scopus, and Springer Nature are considered, and the paradigm shift that AI has created in the field of automatic text summarization is discussed in detail.

Proceedings ArticleDOI
01 May 2023
TL;DR: In this article, the authors leverage the power of two popular natural language processing techniques, Bidirectional Encoder Representations from Transformers (BERT) and Gated Recurrent Unit (GRU), for multi-document summarization.
Abstract: Multi-document summarization has been a challenging task due to the difficulties in capturing essential information from multiple sources and generating coherent and non-redundant summaries. In this proposed model, we address these challenges by leveraging the power of two popular natural language processing techniques, Bidirectional Encoder Representations from Transformers (BERT) and Gated Recurrent Unit (GRU). The Document Understanding Conference (DUC) dataset, a widely recognized benchmark dataset for multi-document summarization, was used to train and evaluate the model. By using BERT to generate contextual embeddings and GRU to capture sequence information, the proposed method outperforms previous methods in terms of summarization quality metrics such as ROUGE (Recall-Oriented Understudy for Gisting Evaluation). The proposed model has significant potential for use in various applications, such as news summarization, document summarization, and automated content creation. This study demonstrates that combining BERT and GRU models can effectively capture the contextual and sequential information in multi-document summarization, leading to high-quality summaries that overcome the limitations of previous methods.

Proceedings ArticleDOI
03 Mar 2023
TL;DR: In this article, the authors explore the possible approaches and techniques available for generating a user-adaptive video summary and present a comparative analysis of the techniques to provide insight to researchers working in this area.
Abstract: Video summarization shortens video content by extracting its most significant parts and presenting them in a summarized form, which may be a collection of keyframes or key shots in temporal sequence. In the recent past, various techniques have been suggested for automatic summarization of videos. It has been observed that video summarization is a subjective task, and traditional approaches, though capable of generating generic summaries, are often incapable of generating the most appropriate and customized summary desired by the user. A user-intuitive and adaptive approach makes it possible to summarize the video as per the user's preference. In this paper, we discuss various frameworks for generating a user-preference-based summary from a video. We explore the possible approaches and techniques available for generating a user-adaptive video summary and present a comparative analysis of the techniques to provide insight to researchers working in this area.

Proceedings ArticleDOI
30 Apr 2023
TL;DR: This paper proposes HISum, a new hyperbolic interaction model for extractive multi-document summarization, which first learns document and candidate summary representations in the same hyperbolic space to capture latent hierarchical structures, and then estimates the importance scores of candidates by jointly modeling interactions between each candidate and the document from global and local views.
Abstract: Extractive summarization helps provide a short description or a digest of news or other web texts. It enhances the reading experience of users, especially when they are reading on small displays (e.g., mobile phones). Matching-based methods have recently been proposed for the extractive summarization task; they extract a summary from a global view via a document-summary matching framework. However, these methods only calculate similarities between candidate summaries and the entire document embeddings, insufficiently capturing interactions between different contextual information in the document to accurately estimate the importance of candidates. In this paper, we propose a new hyperbolic interaction model for extractive multi-document summarization (HISum). Specifically, HISum first learns document and candidate summary representations in the same hyperbolic space to capture latent hierarchical structures and then estimates the importance scores of candidates by jointly modeling interactions between each candidate and the document from global and local views. Finally, the importance scores are used to rank and extract the best candidate as the extracted summary. Experimental results on several benchmarks show that HISum outperforms the state-of-the-art extractive baselines.

Proceedings ArticleDOI
30 Apr 2023
TL;DR: In this paper, the authors propose a multi-step process for aspect-based summarization of legal case files related to regulating bodies, which allows different stakeholders to efficiently consume the information of interest therein.
Abstract: Aspect-based summarization of a legal case file related to regulating bodies allows different stakeholders to consume information of interest therein efficiently. In this paper, we propose a multi-step process to achieve the same. First, we explore the semantic sentence segmentation of SEBI case files via classification. We also propose a dataset of Indian legal adjudicating orders which contain tags from carefully crafted domain-specific sentence categories with the help of legal experts. We experiment with various machine learning and deep learning methods for this multi-class classification. Then, we examine the performance of numerous summarization methods on the segmented document to generate persona-specific summaries. Finally, we develop a pipeline making use of the best methods in both sub-tasks to achieve high recall.

Book ChapterDOI
01 Jan 2023
TL;DR: In this article, a summarization and paraphrasing technique using the LexRank algorithm and the PEGASUS transformer for abstractive summarization of legal documents is proposed and compared on six different documents.
Abstract: Legal documents are generally verbose and contain lots of dense legal text. Lawyers often must study prior cases, and reading such documents can be time-consuming. Such documents may also be incomprehensible to the ordinary public, who lack legal understanding. Enormous amounts of legal data online have made access to case files and documents simple. Hence, automatic summarization and paraphrasing of such documents using machine learning (ML) has become an important area of research, making the documents comprehensible for lawyers and the ordinary public. In this paper, we review previous approaches to automatic summarization and compare their output on six different documents. We also provide a summarization and paraphrasing technique using the LexRank algorithm and the PEGASUS transformer for abstractive summarization. The summary produced by the LexRank model outperforms the other models tested by obtaining a higher ROUGE-F1 score. The final summary retains the important information from the source document and is paraphrased into simpler language.

Journal ArticleDOI
TL;DR: In this article, an extractive video summarizer is presented that utilizes state-of-the-art pre-trained ML models and open-source libraries at its core.
Abstract: Video summarization aims to produce a high-quality text-based summary of videos so that it can convey all the important information, or the gist, of the videos to users. The process involves the conversion of video files to audio files, which are then converted to text files, accompanied by the use of the transformer architecture from natural language processing. Although many studies have been carried out on text summarization, we present our model, an extractive video summarizer, which utilizes state-of-the-art pre-trained ML models and open-source libraries at its core. It uses the following regime: (I) preparation of a multidisciplinary dataset of videos, (II) extraction of audio from video files, (III) text generation from audio files, (IV) text summarization using extractive summarizers, and (V) entity extraction. We conducted our research primarily on two widely used languages in India, Hindi and English. To conclude, our model performs significantly well and generates appropriate tags for videos.

Journal ArticleDOI
TL;DR: In this paper, text recognized from scanned documents via Optical Character Recognition (OCR) is summarized extractively using the TextRank algorithm, an unsupervised summarization technique.
Abstract: With the explosion of unstructured textual data circulating the digital space in present times, there is an increased need for tools that perform automatic text summarization, allowing people to easily gain insights and extract significant and essential information. Text summarization tools improve the readability of documents and reduce the time spent researching for information. In this project, extractive summarization is performed on text recognized from scanned documents via Optical Character Recognition (OCR), using the TextRank algorithm, an unsupervised technique for extractive text summarization.

Journal ArticleDOI
TL;DR: This paper presents a centroid-based clustering algorithm for email summarization that combines word embeddings with a clustering approach; results show the approach is close to existing methods in summary quality while also being computationally efficient.
Abstract: Extractive text summarization is the process of identifying the most important information from a large text and presenting it in a condensed form. One popular approach to this problem is the use of centroid-based clustering algorithms, which group together similar sentences based on their content and then select representative sentences from each cluster to form a summary. In this research, we present a centroid-based clustering algorithm for email summarization that combines the use of word embeddings with a clustering algorithm. We compare our algorithm to existing summarization techniques. Our results show that our approach stands close to existing methods in terms of summary quality, while also being computationally efficient. Overall, our work demonstrates the potential of centroid-based clustering algorithms for extractive text summarization and suggests avenues for further research in this area.
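The centroid-based pipeline described above can be sketched end to end: embed sentences, cluster them, then pick the sentence nearest each cluster centroid. This sketch substitutes simple count vectors for the paper's word embeddings and uses a deterministic k-means seed; all names are illustrative:

```python
import math

def vectorize(sentences):
    # count vectors over a shared vocabulary; a stand-in for the
    # word embeddings the paper combines with clustering
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    idx = {w: i for i, w in enumerate(vocab)}
    vecs = []
    for s in sentences:
        v = [0.0] * len(vocab)
        for w in s.lower().split():
            v[idx[w]] += 1.0
        vecs.append(v)
    return vecs

def kmeans(vecs, k, iters=20):
    # deterministic farthest-point initialization, then Lloyd updates
    centroids = [vecs[0]]
    while len(centroids) < k:
        centroids.append(max(vecs, key=lambda v: min(math.dist(v, c)
                                                     for c in centroids)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vecs:
            nearest = min(range(k), key=lambda c: math.dist(v, centroids[c]))
            clusters[nearest].append(v)
        centroids = [[sum(col) / len(cl) for col in zip(*cl)] if cl
                     else centroids[c] for c, cl in enumerate(clusters)]
    return centroids

def summarize(sentences, k=2):
    # one representative sentence per cluster: the one nearest its centroid
    vecs = vectorize(sentences)
    picks = {min(range(len(vecs)), key=lambda i: math.dist(vecs[i], c))
             for c in kmeans(vecs, k)}
    return [sentences[i] for i in sorted(picks)]
```

The computational-efficiency claim follows from the shape of the algorithm: clustering plus nearest-centroid lookup avoids the pairwise sentence graph that ranking-based extractors build.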

Journal ArticleDOI
TL;DR: The authors compare and assess the effectiveness of two optimizers widely employed in text summarization, Adam and RMSprop, on a variety of datasets.

Book ChapterDOI
TL;DR: This paper proposes UGraphNet, a topic-aware graph-based neural interest summarization method that enhances user semantic mining by unearthing potential user relations and jointly learning latent topic representations of posts to facilitate user interest learning.
Abstract: User-generated content is produced daily in social media, so user interest summarization is critical to distill salient information from massive streams. Since the interested messages (e.g., tags or posts) from a single user are usually sparse, which becomes a bottleneck for existing methods, we propose UGraphNet, a topic-aware graph-based neural interest summarization method that enhances user semantic mining by unearthing potential user relations and jointly learning latent topic representations of posts, which facilitates user interest learning. Experiments on two datasets collected from well-known social media platforms demonstrate the superior performance of our model on the tasks of user interest summarization and item recommendation. Further discussion also shows that exploiting latent topic representations and user relations is conducive to automatic user language understanding.

Journal ArticleDOI
TL;DR: The authors propose a set of new requirements for Opinion-Topic-Sentence, which are essential for performing opinion summarization, along with four submodular functions and two optimization algorithms with proven performance bounds.
Abstract: This paper focuses on opinion summarization for constructing subjective and concise summaries representing essential opinions of online text reviews. As previous works rarely focus on the relationship between opinions, topics, and sentences, we propose a set of new requirements for Opinion-Topic-Sentence, which are essential for performing opinion summarization. We prove that Opinion-Topic-Sentence can be theoretically analyzed by submodular information measures. Thus, our proposed method can reduce redundant information, strengthen the relevance to given topics, and informatively represent the underlying emotional variations. While conventional methods require human-labeled topics for extractive summarization, we use unsupervised topic modeling methods to generate topic features. We propose four submodular functions and two optimization algorithms with proven performance bounds that can maximize opinion summarization's utility. An automatic evaluation metric, Topic-based Opinion Variance, is also derived to compensate for ROUGE-based metrics of opinion summarization evaluation. Four large, diversified, and representative corpora, OPOSUM, Opinosis, Yelp, and Amazon reviews, are used in our study. The results on these online review texts corroborate the efficacy of our proposed metric and framework.
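The optimization machinery behind such methods can be sketched generically: repeatedly pick the sentence with the largest marginal gain of a monotone submodular objective. The toy coverage objective below is not one of the paper's four functions, only an illustration of the pattern; for monotone submodular objectives this greedy rule carries the classic (1 - 1/e) approximation guarantee:

```python
def coverage(selected):
    # toy monotone submodular objective: distinct words covered
    return len({w for s in selected for w in s.lower().split()})

def greedy_submodular(sentences, f, k):
    """Pick k sentences, each maximizing the marginal gain of f."""
    selected, pool = [], list(sentences)
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda s: f(selected + [s]) - f(selected))
        selected.append(best)
        pool.remove(best)
    return selected
```

Diminishing marginal gains are what make redundancy reduction fall out naturally: once a word (or, in the paper's setting, an opinion facet) is covered, adding a sentence that repeats it contributes nothing.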

Posted ContentDOI
19 May 2023
TL;DR: The authors propose a unified topic encoder, which jointly discovers latent topics from the document and various kinds of side information and guides the information flow in a graph encoder through a topic-aware interaction, together with a triplet contrastive learning mechanism that aligns single- and multi-modal information into a unified semantic space.
Abstract: Automatic summarization plays an important role in the exponential document growth on the Web. On content websites such as CNN.com and WikiHow.com, there often exist various kinds of side information along with the main document for attention attraction and easier understanding, such as videos, images, and queries. Such information can be used for better summarization, as they often explicitly or implicitly mention the essence of the article. However, most of the existing side-aware summarization methods are designed to incorporate either single-modal or multi-modal side information, and cannot effectively adapt to each other. In this paper, we propose a general summarization framework, which can flexibly incorporate various modalities of side information. The main challenges in designing a flexible summarization model with side information include: (1) the side information can be in textual or visual format, and the model needs to align and unify it with the document into the same semantic space, (2) the side inputs can contain information from various aspects, and the model should recognize the aspects useful for summarization. To address these two challenges, we first propose a unified topic encoder, which jointly discovers latent topics from the document and various kinds of side information. The learned topics flexibly bridge and guide the information flow between multiple inputs in a graph encoder through a topic-aware interaction. We secondly propose a triplet contrastive learning mechanism to align the single-modal or multi-modal information into a unified semantic space, where the summary quality is enhanced by better understanding the document and side information. Results show that our model significantly surpasses strong baselines on three public single-modal or multi-modal benchmark summarization datasets.

Book ChapterDOI
01 Jan 2023
TL;DR: The main idea of this paper is to summarize text and to show how transformers work for text summarization, the process of creating a condensed form of a text document that maintains significant information and the general meaning of the source text.
Abstract: Text summarization is the process of creating a condensed form of a text document that maintains significant information and the general meaning of the source text. Automatic text summarization has become an important way of finding relevant information precisely in large texts in a short time with little effort. There are two main strategies: abstractive and extractive. In the extractive method, the algorithm generates the summary by picking up words and lines from the corpus; in the abstractive method, the algorithm generates the summary by rewriting the sentences. The main idea of this paper is to summarize text and to show how transformers work for text summarization.

Journal ArticleDOI
TL;DR: In this article, the authors address the generic and update text summarization tasks for a set of documents as a combinatorial optimization problem, using a genetic algorithm and unsupervised textual features.
Abstract: In this paper, we address the generic and update text summarization tasks for a set of documents as a combinatorial optimization problem, using a genetic algorithm and unsupervised textual features. In the news domain in particular, the input documents are a set of articles of varying sizes covering the same event. The main advantage of the proposed method is that it is language-independent. The experimental results demonstrate that the method performs well for both kinds of summarization. Moreover, we calculate heuristics for update text summarization as a benchmark for comparison with state-of-the-art methods.
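The combinatorial framing can be sketched as follows: a chromosome is a binary mask over sentences, fitness is an unsupervised objective under a length budget, and standard selection, crossover, and mutation search the space. This is a minimal illustration, not the paper's method; its fitness is a toy word-coverage score, whereas the paper combines richer unsupervised textual features:

```python
import random

def fitness(mask, sentences, budget):
    # toy unsupervised fitness: word coverage, invalid if over budget
    chosen = [s for m, s in zip(mask, sentences) if m]
    if len(chosen) > budget:
        return -1.0
    return float(len({w for s in chosen for w in s.lower().split()}))

def evolve(sentences, budget=2, pop_size=20, gens=30, seed=1):
    random.seed(seed)
    n = len(sentences)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda m: fitness(m, sentences, budget), reverse=True)
        parents = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(n)            # point mutation
            child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, sentences, budget))
```

Language independence comes for free here: nothing in the chromosome encoding or the search operators depends on the language of the sentences, only the fitness features do.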

Journal ArticleDOI
TL;DR: This paper proposes an audiovisual neural network that takes advantage of spatio-temporal visual and auditory information to better simulate human attention and to exploit richer information.