
Showing papers on "Multi-document summarization published in 2017"


Journal ArticleDOI
TL;DR: A comprehensive survey of extractive text summarization approaches developed in the last decade is presented, along with a discussion of useful future directions that can help researchers identify areas where further research is needed.
Abstract: As information on every topic is available in abundance on the internet, condensing the important information in the form of a summary would benefit a number of users. Hence, there is growing interest among the research community in developing new approaches to automatically summarize text. An automatic text summarization system generates a summary, i.e. a short text that includes all the important information of the document. Since the advent of text summarization in the 1950s, researchers have been trying to improve techniques for generating summaries so that machine-generated summaries match human-made summaries. Summaries can be generated through extractive as well as abstractive methods. Abstractive methods are highly complex as they need extensive natural language processing. Therefore, the research community is focusing more on extractive summaries, trying to achieve more coherent and meaningful summaries. Over the past decade, several extractive approaches have been developed for automatic summary generation that implement a number of machine learning and optimization techniques. This paper presents a comprehensive survey of recent extractive text summarization approaches developed in the last decade. Their needs are identified and their advantages and disadvantages are listed in a comparative manner. A few abstractive and multilingual text summarization approaches are also covered. Summary evaluation is another challenging issue in this research field. Therefore, both intrinsic and extrinsic methods of summary evaluation are described in detail, along with text summarization evaluation conferences and workshops. Furthermore, evaluation results of extractive summarization approaches are presented on some shared DUC datasets. Finally, this paper concludes with a discussion of useful future directions that can help researchers identify areas where further research is needed.

581 citations


Journal ArticleDOI
TL;DR: This study proposes a novel multi-text summarization technique for identifying the top-k most informative sentences of hotel reviews, and develops a new sentence importance metric.
Abstract: Text summarization technique can extract essential information from online reviews. Our method can identify top-k most informative sentences from online hotel reviews. We jointly considered author, review time, usefulness, and opinion factors. Online hotel reviews were collected from TripAdvisor in experimental evaluation. The results show that our approach provides more comprehensive hotel information. Online travel forums and social networks have become the most popular platform for sharing travel information, with enormous numbers of reviews posted daily. Automatically generated hotel summaries could aid travelers in selecting hotels. This study proposes a novel multi-text summarization technique for identifying the top-k most informative sentences of hotel reviews. Previous studies on review summarization have primarily examined content analysis, which disregards critical factors like author credibility and conflicting opinions. We considered such factors and developed a new sentence importance metric. Both the content and sentiment similarities were used to determine the similarity of two sentences. To identify the top-k sentences, the k-medoids clustering algorithm was used to partition sentences into k groups. The medoids from these groups were then selected as the final summarization results. To evaluate the performance of the proposed method, we collected two sets of reviews for the two hotels posted on TripAdvisor.com. A total of 20 subjects were invited to review the text summarization results from the proposed approach and two conventional approaches for the two hotels. The results indicate that the proposed approach outperforms the other two, and most of the subjects believed that the proposed approach can provide more comprehensive hotel information.
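As a rough illustration of the clustering step this abstract describes, the sketch below clusters review sentences with a small k-medoids loop over a combined content/sentiment distance and returns the medoids as the summary. It is a minimal sketch, not the paper's method: TF-IDF cosine similarity, the fixed weight `alpha`, and the placeholder `sentiment_score()` are assumptions standing in for the authors' sentence importance metric and similarity measures.

```python
# Minimal sketch: cluster review sentences with k-medoids over a combined
# content/sentiment distance, then return the medoids as the summary.
# The paper's exact importance metric is NOT reproduced; TF-IDF cosine and a
# hypothetical sentiment_score() stand in for its content/sentiment factors.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def sentiment_score(sentence: str) -> float:
    """Placeholder polarity in [-1, 1]; swap in any sentiment model."""
    return 0.0

def combined_distance(sentences, alpha=0.7):
    tfidf = TfidfVectorizer().fit_transform(sentences)
    content_sim = cosine_similarity(tfidf)                     # content similarity
    pol = np.array([sentiment_score(s) for s in sentences])
    sent_sim = 1.0 - np.abs(pol[:, None] - pol[None, :]) / 2   # sentiment similarity
    sim = alpha * content_sim + (1 - alpha) * sent_sim
    return 1.0 - sim                                           # distance matrix

def k_medoids(dist, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(dist[:, medoids], axis=1)           # assign to nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                continue
            # new medoid = member minimising total distance to its cluster
            costs = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids

def summarize(sentences, k=5):
    dist = combined_distance(sentences)
    return [sentences[i] for i in k_medoids(dist, min(k, len(sentences)))]
```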

243 citations


Proceedings ArticleDOI
01 Aug 2017
TL;DR: The authors employed a Graph Convolutional Network (GCN) on the relation graphs, with sentence embeddings obtained from Recurrent Neural Networks as input node features for salience estimation.
Abstract: We propose a neural multi-document summarization system that incorporates sentence relation graphs. We employ a Graph Convolutional Network (GCN) on the relation graphs, with sentence embeddings obtained from Recurrent Neural Networks as input node features. Through multiple layer-wise propagation, the GCN generates high-level hidden sentence features for salience estimation. We then use a greedy heuristic to extract salient sentences while avoiding redundancy. In our experiments on DUC 2004, we consider three types of sentence relation graphs and demonstrate the advantage of combining sentence relations in graphs with the representation power of deep neural networks. Our model improves upon traditional graph-based extractive approaches and the vanilla GRU sequence model with no graph, and it achieves competitive results against other state-of-the-art multi-document summarization systems.
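To make the pipeline concrete, here is a toy numpy sketch of the computation pattern described above: one pass of symmetrically normalised GCN propagation over a sentence relation graph, a linear salience head, and greedy redundancy-aware extraction. The weights are random and untrained, and the random `X` stands in for GRU sentence embeddings, so this only illustrates the forward computation, not the authors' trained model.

```python
# Sketch of the salience pipeline described above: GCN-style propagation over a
# sentence relation graph, a linear salience head, and a greedy, redundancy-aware
# extraction step. Weights are random (untrained); the GRU sentence encoder is
# assumed to have produced `X` already.
import numpy as np

def gcn_layer(A, X, W):
    """Symmetric-normalised propagation: relu(D^-1/2 (A+I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

def greedy_extract(scores, sim, budget=3, max_sim=0.6):
    """Pick high-salience sentences, skipping any too similar to ones already picked."""
    chosen = []
    for i in np.argsort(-scores):
        if len(chosen) == budget:
            break
        if all(sim[i, j] < max_sim for j in chosen):
            chosen.append(int(i))
    return chosen

rng = np.random.default_rng(0)
n, d, h = 6, 32, 16
X = rng.normal(size=(n, d))              # sentence embeddings (e.g. from a GRU)
A = (rng.random((n, n)) > 0.5).astype(float)
A = np.triu(A, 1); A = A + A.T           # symmetric relation graph, no self-loops
H = gcn_layer(A, X, rng.normal(size=(d, h)))
H = gcn_layer(A, H, rng.normal(size=(h, h)))
scores = H @ rng.normal(size=(h,))       # linear salience head
norms = np.linalg.norm(X, axis=1)
cos = X @ X.T / (norms[:, None] * norms[None, :])
print(greedy_extract(scores, cos))
```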

148 citations


Journal ArticleDOI
TL;DR: Significant contributions made in recent years are emphasized, including progress on modern sentence extraction approaches that improve concept coverage, information diversity and content coherence, as well as attempts from summarization frameworks that integrate sentence compression, and more abstractive systems that are able to produce completely new sentences.
Abstract: The task of automatic document summarization aims at generating short summaries for originally long documents. A good summary should cover the most important information of the original document or a cluster of documents, while being coherent, non-redundant and grammatically readable. Numerous approaches for automatic summarization have been developed to date. In this paper, we give a self-contained, broad overview of recent progress made in document summarization within the last five years. Specifically, we emphasize significant contributions made in recent years that represent the state of the art of document summarization, including progress on modern sentence extraction approaches that improve concept coverage, information diversity and content coherence, as well as summarization frameworks that integrate sentence compression, and more abstractive systems that are able to produce completely new sentences. In addition, we review progress made in document summarization in domains, genres and applications that differ from traditional settings. We also point out some of the latest trends and highlight a few possible future directions.

143 citations


Posted Content
TL;DR: The main approaches to automatic text summarization are described and the effectiveness and shortcomings of the different methods are discussed.
Abstract: In recent years, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods.

142 citations


Proceedings ArticleDOI
01 Jan 2017
TL;DR: This paper reviews extractive text summarization methods that aim for summaries that are less redundant, cohesive, coherent, and rich in information.
Abstract: Text Summarization is the process of obtaining salient information from an authentic text document. In this technique, the extracted information is produced as a summarized report and presented to the user as a concise summary. It is very crucial for humans to understand and describe the content of the text. Text summarization techniques are classified into abstractive and extractive summarization. The extractive summarization technique focuses on choosing paragraphs, important sentences, etc., that reproduce the original documents in a precise form. The importance of sentences is determined based on linguistic and statistical features. In this work, a comprehensive review of extractive text summarization methods has been carried out. The various techniques, popular benchmark datasets and challenges of extractive summarization are reviewed. This paper discusses extractive text summarization methods that aim for summaries that are less redundant, cohesive, coherent, and rich in information.

142 citations


Posted Content
TL;DR: This work argues that faithfulness is also a vital prerequisite for a practical abstractive summarization system and proposes a dual-attention sequence-to-sequence framework to force the generation conditioned on both the source text and the extracted fact descriptions.
Abstract: Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which tends to create fake facts. Our preliminary study reveals that nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on improving informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system. To avoid generating fake facts in a summary, we leverage open information extraction and dependency parsing technologies to extract actual fact descriptions from the source text. A dual-attention sequence-to-sequence framework is then proposed to force the generation to be conditioned on both the source text and the extracted fact descriptions. Experiments on the Gigaword benchmark dataset demonstrate that our model can greatly reduce fake summaries by 80%. Notably, the fact descriptions also bring significant improvement in informativeness since they often condense the meaning of the source text.

132 citations


Journal ArticleDOI
TL;DR: A novel word-sentence co-ranking model named CoRank is proposed, which combines the word-sentence relationship with the graph-based unsupervised ranking model and can serve as an important building block of intelligent summarization systems.
Abstract: A principled word-sentence co-ranking model called CoRank is proposed. The convergence of CoRank with matrix notation is proved. A redundancy elimination technique is presented to further improve the performance of CoRank. Extractive summarization aims to automatically produce a short summary of a document by concatenating several sentences taken exactly from the original material. Due to their simplicity and ease of use, extractive summarization methods have become the dominant paradigm in the realm of text summarization. In this paper, we address the sentence scoring technique, a key step of extractive summarization. Specifically, we propose a novel word-sentence co-ranking model named CoRank, which combines the word-sentence relationship with a graph-based unsupervised ranking model. CoRank is quite concise in terms of matrix operations, and its convergence can be theoretically guaranteed. Moreover, a redundancy elimination technique is presented as a supplement to CoRank, so that the quality of automatic summarization can be further enhanced. As a result, CoRank can serve as an important building block of intelligent summarization systems. Experimental results on two real-life datasets including nearly 600 documents demonstrate the effectiveness of the proposed methods.
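The exact CoRank update rules and their convergence proof are in the paper; the sketch below only illustrates the general word-sentence mutual-reinforcement idea in matrix form that such co-ranking models build on. The uniform initialisation and L1 normalisation are assumptions made for the sake of a small, runnable example.

```python
# Generic word-sentence mutual-reinforcement ranking in matrix form. This is
# NOT the paper's exact CoRank update; it only illustrates the co-ranking idea:
# sentence scores and word scores reinforce each other through the
# word-in-sentence relation matrix until the iteration stabilises.
import numpy as np

def co_rank(W, iters=100, tol=1e-8):
    """W[i, j] = weight of word j in sentence i (e.g. TF-IDF)."""
    n_sent, n_word = W.shape
    s = np.ones(n_sent) / n_sent            # sentence scores
    w = np.ones(n_word) / n_word            # word scores
    for _ in range(iters):
        s_new = W @ w                       # sentences scored by their words
        w_new = W.T @ s                     # words scored by their sentences
        s_new /= s_new.sum() or 1.0         # L1 normalisation keeps scores bounded
        w_new /= w_new.sum() or 1.0
        if np.abs(s_new - s).sum() < tol and np.abs(w_new - w).sum() < tol:
            break
        s, w = s_new, w_new
    return s, w

# toy usage: 3 sentences, 4 words
W = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.5, 0.0, 1.0, 1.0]])
sent_scores, word_scores = co_rank(W)
print(np.argsort(-sent_scores))  # sentence ranking
```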

96 citations


Posted Content
TL;DR: This model improves upon other traditional graph-based extractive approaches and the vanilla GRU sequence model with no graph, and it achieves competitive results against other state-of-the-art multi-document summarization systems.
Abstract: We propose a neural multi-document summarization (MDS) system that incorporates sentence relation graphs. We employ a Graph Convolutional Network (GCN) on the relation graphs, with sentence embeddings obtained from Recurrent Neural Networks as input node features. Through multiple layer-wise propagation, the GCN generates high-level hidden sentence features for salience estimation. We then use a greedy heuristic to extract salient sentences while avoiding redundancy. In our experiments on DUC 2004, we consider three types of sentence relation graphs and demonstrate the advantage of combining sentence relations in graphs with the representation power of deep neural networks. Our model improves upon traditional graph-based extractive approaches and the vanilla GRU sequence model with no graph, and it achieves competitive results against other state-of-the-art multi-document summarization systems.

84 citations


Journal ArticleDOI
TL;DR: This study proposes a novel Cat Swarm Optimization (CSO) based multi-document summarizer that outperforms the other summarizers included in the study in terms of the non-redundancy, cohesiveness and readability of the summary.
Abstract: Today, the World Wide Web has brought us an enormous quantity of on-line information. As a result, extracting relevant information from massive data has become a challenging issue. In the recent past, text summarization has been recognized as one of the solutions for extracting useful information from vast amounts of documents. Based on the number of documents considered for summarization, it is categorized as single-document or multi-document summarization. Compared with single-document summarization, multi-document summarization is more challenging for researchers, who must derive an accurate summary from multiple documents. Hence, in this study, a novel Cat Swarm Optimization (CSO) based multi-document summarizer is proposed to address the problem of multi-document summarization. The proposed CSO based model is also compared with two other nature-inspired summarizers, namely a Harmony Search (HS) based summarizer and a Particle Swarm Optimization (PSO) based summarizer. On the benchmark Document Understanding Conference (DUC) datasets, the performance of all algorithms is compared in terms of different evaluation metrics such as ROUGE score, F score, sensitivity, positive predictive value, summary accuracy, inter-sentence similarity and a readability metric, to validate the non-redundancy, cohesiveness and readability of the summary, respectively. The experimental analysis clearly reveals that the proposed approach outperforms the other summarizers included in the study.

65 citations


Proceedings Article
12 Feb 2017
TL;DR: An unsupervised data reconstruction framework is proposed, which jointly considers the reconstruction for latent semantic space and observed term vector space and can capture the salience of sentences from these two different and complementary vector spaces.
Abstract: We propose a new unsupervised sentence salience framework for Multi-Document Summarization (MDS), which can be divided into two components: latent semantic modeling and salience estimation. For latent semantic modeling, a neural generative model called Variational Auto-Encoders (VAEs) is employed to describe the observed sentences and the corresponding latent semantic representations. Neural variational inference is used for the posterior inference of the latent variables. For salience estimation, we propose an unsupervised data reconstruction framework, which jointly considers the reconstruction for latent semantic space and observed term vector space. Therefore, we can capture the salience of sentences from these two different and complementary vector spaces. Thereafter, the VAEs-based latent semantic model is integrated into the sentence salience estimation component in a unified fashion, and the whole framework can be trained jointly by back-propagation via multi-task learning. Experimental results on the benchmark datasets DUC and TAC show that our framework achieves better performance than the state-of-the-art models.

Journal ArticleDOI
TL;DR: This work presents a framework for scientific summarization which takes advantage of the citations and the scientific discourse structure, and proposes three approaches for contextualizing citations which are based on query reformulation, word embeddings, and supervised learning.
Abstract: The rapid growth of scientific literature has made it difficult for the researchers to quickly learn about the developments in their respective fields. Scientific document summarization addresses this challenge by providing summaries of the important contributions of scientific papers. We present a framework for scientific summarization which takes advantage of the citations and the scientific discourse structure. Citation texts often lack the evidence and context to support the content of the cited paper and are even sometimes inaccurate. We first address the problem of inaccuracy of the citation texts by finding the relevant context from the cited paper. We propose three approaches for contextualizing citations which are based on query reformulation, word embeddings, and supervised learning. We then train a model to identify the discourse facets for each citation. We finally propose a method for summarizing scientific papers by leveraging the faceted citations and their corresponding contexts. We evaluate our proposed method on two scientific summarization datasets in the biomedical and computational linguistics domains. Extensive evaluation results show that our methods can improve over the state of the art by large margins.

Journal ArticleDOI
TL;DR: The experiment results show that compared to those from other candidate algorithms, each automatic summary generated by the proposed topic modeling based approach has not only a higher compression ratio, but also better summarization quality.
Abstract: A topic model based approach for novel summarization is proposed. An importance evaluation function for sentence candidates is designed. A summary smoothing approach is presented to improve the summary readability. Most existing automatic text summarization algorithms target multi-document inputs of relatively short length, and are thus difficult to apply directly to novels, which are long and loosely structured. In this paper, aiming at novel documents, we propose a topic modeling based approach to extractive automatic summarization, so as to achieve a good balance among compression ratio, summarization quality and machine readability. First, based on topic modeling, we extract the candidate sentences associated with topic words from a preprocessed novel. Second, with the goals of compression ratio and topic diversity, we design an importance evaluation function to select the most important sentences from the candidates and thus generate an initial summary. Finally, we smooth the initial summary to overcome the semantic confusion caused by ambiguous or synonymous words, so as to improve the summary readability. We evaluate our proposed approach experimentally on a real novel dataset. The experimental results show that, compared to those from other candidate algorithms, each automatic summary generated by our approach has not only a higher compression ratio but also better summarization quality.
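A minimal sketch of the first stage described here, under stated assumptions: fit an LDA topic model over the novel's sentences, collect the top words of each topic, and keep sentences containing them as candidates. The paper's importance evaluation function and smoothing step are not reproduced, and treating individual sentences as LDA documents is an illustrative simplification.

```python
# Sketch of the candidate-extraction stage only: fit a topic model, take the
# top words of each topic, and keep sentences containing them. The paper's
# importance function and smoothing step are not reproduced here.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def candidate_sentences(sentences, n_topics=5, top_words=10):
    vec = CountVectorizer(stop_words="english")
    X = vec.fit_transform(sentences)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(X)
    vocab = vec.get_feature_names_out()
    topic_words = set()
    for topic in lda.components_:                 # one weight vector per topic
        top = topic.argsort()[::-1][:top_words]
        topic_words.update(vocab[i] for i in top)
    # keep sentences that mention at least one top topic word
    analyze = vec.build_analyzer()
    return [s for s in sentences if topic_words & set(analyze(s))]
```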

Proceedings ArticleDOI
07 Aug 2017
TL;DR: A novel unsupervised query-focused multi-document summarization approach that generates a summary by extracting a subset of sentences using the Cross-Entropy (CE) Method is presented.
Abstract: We present a novel unsupervised query-focused multi-document summarization approach. To this end, we generate a summary by extracting a subset of sentences using the Cross-Entropy (CE) Method. The proposed approach is generic and requires no domain knowledge. Using an evaluation over the DUC 2005-2007 datasets against several other state-of-the-art baseline methods, we demonstrate that our approach is both effective and efficient.
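The Cross-Entropy method itself is standard, so a small sketch can show the selection loop: keep a Bernoulli inclusion probability per sentence, sample subsets, score them, and re-estimate the probabilities from the elite samples. The toy `score_fn` below is a placeholder, not the paper's query-focused objective, and the hyperparameters are illustrative only.

```python
# Minimal Cross-Entropy (CE) method for selecting a sentence subset. Each
# sentence has a Bernoulli inclusion probability; subsets are sampled, scored,
# and the probabilities are re-estimated from the elite samples. The scoring
# function here is a placeholder, not the paper's query-focused objective.
import numpy as np

def ce_select(score_fn, n_sentences, samples=200, elite_frac=0.1,
              iters=30, smooth=0.7, seed=0):
    rng = np.random.default_rng(seed)
    p = np.full(n_sentences, 0.5)                     # inclusion probabilities
    n_elite = max(1, int(samples * elite_frac))
    for _ in range(iters):
        draws = rng.random((samples, n_sentences)) < p    # sampled subsets
        scores = np.array([score_fn(mask) for mask in draws])
        elite = draws[np.argsort(-scores)[:n_elite]]      # best subsets
        p = smooth * elite.mean(axis=0) + (1 - smooth) * p
    return p > 0.5                                    # final selection

# toy objective: prefer subsets of ~3 sentences with high toy relevance
relevance = np.array([0.9, 0.1, 0.8, 0.3, 0.7, 0.2])
def score_fn(mask):
    return relevance[mask].sum() - 0.5 * abs(mask.sum() - 3)

print(np.where(ce_select(score_fn, len(relevance)))[0])
```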

Proceedings Article
12 Feb 2017
TL;DR: This paper proposes a novel summarization system called TCSum, which leverages plentiful text classification data to improve the performance of multi-document summarization, and also utilizes the classification results to produce summaries of different styles.
Abstract: Multi-document summarization has reached a bottleneck due to the lack of sufficient training data and of diverse categories of documents. Text classification can make up for these deficiencies. In this paper, we propose a novel summarization system called TCSum, which leverages plentiful text classification data to improve the performance of multi-document summarization. TCSum projects documents onto distributed representations which act as a bridge between text classification and summarization. It also utilizes the classification results to produce summaries of different styles. Extensive experiments on DUC generic multi-document summarization datasets show that TCSum achieves state-of-the-art performance without using any hand-crafted features and has the capability to capture the variations of summary styles across different text categories.

Journal ArticleDOI
30 Jun 2017
TL;DR: A comparative study of various text summarization techniques is given, covering extractive methods and abstractive methods, which aim to understand the main concepts in a given document and then express those concepts in clear natural language.
Abstract: Text summarization compresses the source text into a shorter version while conserving its information content and overall meaning. Because of the great amount of information available and the development of Internet technologies, text summarization has become an important tool for interpreting text information. Text summarization methods can be classified into extractive and abstractive summarization. An extractive summarization method involves selecting high-ranked sentences from the document based on word and sentence features and putting them together to generate a summary. The importance of the sentences is decided based on statistical and linguistic features of the sentences. Abstractive summarization aims to understand the main concepts in a given document and then express those concepts in clear natural language. This paper gives a comparative study of various text summarization techniques.

Proceedings ArticleDOI
14 Jul 2017
TL;DR: A novel coarse-to-fine attention model that hierarchically reads a document, using coarse attention to select top-level chunks of text and fine attention to read the words of the chosen chunks, which achieves the desired behavior of sparsely attending to subsets of the document for generation.
Abstract: Sequence-to-sequence models with attention have been successful for a variety of NLP problems, but their speed does not scale well for tasks with long source sequences such as document summarization. We propose a novel coarse-to-fine attention model that hierarchically reads a document, using coarse attention to select top-level chunks of text and fine attention to read the words of the chosen chunks. While the computation for training standard attention models scales linearly with source sequence length, our method scales with the number of top-level chunks and can handle much longer sequences. Empirically, we find that while coarse-to-fine attention models lag behind state-of-the-art baselines, our method achieves the desired behavior of sparsely attending to subsets of the document for generation.
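A toy numpy sketch of the two-level attention pattern described above: coarse attention over chunk representations selects a chunk, and fine attention then runs only over that chunk's words, so cost grows with the number of chunks rather than the full source length. Random vectors stand in for learned encoder/decoder states; this shows the computation shape, not the trained model.

```python
# Toy sketch of coarse-to-fine attention: coarse attention over chunk
# representations picks one chunk, then fine attention runs only over the words
# of that chunk. Random vectors stand in for learned states.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, chunk_len, n_chunks = 16, 20, 8
words = rng.normal(size=(n_chunks, chunk_len, d))   # word states per chunk
chunks = words.mean(axis=1)                         # coarse chunk representations
query = rng.normal(size=d)                          # decoder state

coarse = softmax(chunks @ query)                    # attention over chunks
top = int(np.argmax(coarse))                        # hard selection of one chunk
fine = softmax(words[top] @ query)                  # attention over its words only
context = fine @ words[top]                         # context vector fed to the decoder
print(top, context.shape)
```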

Journal ArticleDOI
TL;DR: This paper investigates the generation of extractive and abstractive summaries of opinions, studies and compares some well-known methods in the area, and develops new methods that build on the main advantages of the earlier ones.
Abstract: Research about some aspect-based opinion summarization methods. A new content selection strategy to produce extractive summaries is proposed. A novel NLG template-based system to generate abstractive summaries is proposed. Extractive and abstractive opinion summarization methods are compared. In recent years, the opinion summarization task has gained much importance because of the large amount of online information and the increasing interest in learning the users' evaluation of products, services, companies, and people. Although there are many works in this area, there is room for improvement, as the results are far from ideal. In this paper, we present our investigations into generating extractive and abstractive summaries of opinions. We study some well-known methods in the area and compare them. Besides using these methods, we also develop new methods that build on the main advantages of the earlier ones. We evaluate them according to three traditional summarization evaluation measures: informativeness, linguistic quality, and utility of the summary. We show that we produce interesting results and that our methods outperform some methods from the literature.

Journal ArticleDOI
TL;DR: This paper introduces for the first time the features of the human visual system within the summarization framework itself to allow for the emphasis of perceptually significant events while simultaneously eliminating perceptual redundancy from the summaries.
Abstract: The enormous growth of video content in recent times has raised the need to abbreviate the content for human consumption. Thus, there is a need for summaries of a quality that meets the requirements of human users. This also means that the summarization must incorporate the peculiar features of human perception. We present a new framework for video summarization in this paper. Unlike many available summarization algorithms that utilize only statistical redundancy, we introduce for the first time the features of the human visual system within the summarization framework itself to allow for the emphasis of perceptually significant events while simultaneously eliminating perceptual redundancy from the summaries. The framework has been evaluated with both subjective and objective evaluation scores.

Proceedings Article
12 Feb 2017
TL;DR: This paper introduces Active Video Summarization (AVS), an interactive approach to gather the user's preferences while creating the summary, and introduces a new dataset for customized video summarization (CSumm).
Abstract: To facilitate the browsing of long videos, automatic video summarization provides an excerpt that represents its content. In the case of egocentric and consumer videos, due to their personal nature, adapting the summary to specific user's preferences is desirable. Current approaches to customizable video summarization obtain the user's preferences prior to the summarization process. As a result, the user needs to manually modify the summary to further meet the preferences. In this paper, we introduce Active Video Summarization (AVS), an interactive approach to gather the user's preferences while creating the summary. AVS asks questions about the summary to update it on-line until the user is satisfied. To minimize the interaction, the best segment to inquire about next is inferred from the previous feedback. We evaluate AVS on the commonly used UTEgo dataset. We also introduce a new dataset for customized video summarization (CSumm) recorded with a Google Glass. The results show that AVS achieves an excellent compromise between usability and quality. In 41% of the videos, AVS is considered the best of all tested baselines, including manually generated summaries. Also, when looking for specific events in the video, AVS provides a higher average level of satisfaction than all other baselines after only six questions to the user.

Journal ArticleDOI
TL;DR: This paper proposes a population-based multicriteria optimization method with multiple objective functions which generates an extractive generic summary with maximum relevance and minimum redundancy by representing each sentence of the input document as a vector of words in the Proper Noun, Noun, Verb and Adjective set.
Abstract: Multi-document summarization is the process of extracting salient information from a set of source texts and presenting that information to the user in a condensed form. In this paper, we propose a multi-document summarization system which generates an extractive generic summary with maximum relevance and minimum redundancy by representing each sentence of the input document as a vector of words in the Proper Noun, Noun, Verb and Adjective set. Five features, namely TF_ISF, Aggregate Cross Sentence Similarity, Title Similarity, Proper Noun and Sentence Length, are extracted for each sentence, and scores are assigned to sentences based on these features. The weights assigned to different features may vary depending upon the nature of the document, and it is hard to discover the most appropriate weight for each feature, which makes generation of a good summary a very tough task without human intelligence. The multi-document summarization problem has a large number of decision parameters and a large number of possible solutions from which the most optimal summary is to be generated. A generated summary may not guarantee the essential quality and may be far from the ideal human-generated summary. To address this issue, we propose a population-based multicriteria optimization method with multiple objective functions. Three objective functions are selected to determine an optimal summary, with maximum relevance, diversity, and novelty, from a global population of summaries by considering both the statistical and semantic aspects of the documents. Semantic aspects are considered using Latent Semantic Analysis (LSA) and Non-negative Matrix Factorization (NMF) techniques. Experiments have been performed on the DUC 2002, DUC 2004 and DUC 2006 datasets using the ROUGE toolkit. Experimental results show that our system outperforms state-of-the-art works in terms of Recall and Precision.
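As a concrete illustration of the feature-scoring step, the sketch below combines three of the listed features (TF_ISF, Title Similarity, Sentence Length) with illustrative weights. The POS-restricted word vectors, Aggregate Cross Sentence Similarity, Proper Noun feature and the multicriteria optimization itself are not reproduced here; the weighting and TF-IDF proxy for TF_ISF are assumptions, not the paper's formulation.

```python
# Sketch of the sentence-scoring step: three of the listed features (TF_ISF,
# title similarity, sentence length) combined with illustrative weights. The
# remaining features and the multi-objective optimisation are not reproduced.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_sentences(sentences, title, weights=(0.5, 0.3, 0.2)):
    vec = TfidfVectorizer()
    X = vec.fit_transform(sentences + [title])
    sent_vecs, title_vec = X[:-1], X[-1]
    tf_isf = np.asarray(sent_vecs.mean(axis=1)).ravel()      # mean TF-ISF weight (proxy)
    title_sim = cosine_similarity(sent_vecs, title_vec).ravel()
    lengths = np.array([len(s.split()) for s in sentences], dtype=float)
    length_score = lengths / lengths.max()                   # favour longer sentences
    feats = [tf_isf / (tf_isf.max() or 1.0), title_sim, length_score]
    return sum(w * f for w, f in zip(weights, feats))

sents = ["The cat sat on the mat.",
         "Multi-document summarization extracts salient sentences.",
         "Weights for each feature are hard to choose by hand."]
print(score_sentences(sents, "Multi-document summarization"))
```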

Journal ArticleDOI
TL;DR: A novel technique for linguistic summarization of event logs is presented, which generates linguistic summaries that are concise enough to be used in a practical setting, while at the same time enriching the summaries that are produced by also enabling conjunctive statements.

Proceedings Article
01 Nov 2017
TL;DR: A new model for concept-map-based multi-document summarization is proposed that learns to identify and merge coreferent concepts to reduce redundancy, determines their importance with a strong supervised model and finds an optimal summary concept map via integer linear programming.
Abstract: Concept-map-based multi-document summarization is a variant of traditional summarization that produces structured summaries in the form of concept maps. In this work, we propose a new model for the task that addresses several issues in previous methods. It learns to identify and merge coreferent concepts to reduce redundancy, determines their importance with a strong supervised model and finds an optimal summary concept map via integer linear programming. It is also computationally more efficient than previous methods, allowing us to summarize larger document sets. We evaluate the model on two datasets, finding that it outperforms several approaches from previous work.
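The final selection step can be pictured as a small integer linear program; the PuLP sketch below maximises total concept importance under a size budget. This is only a hedged illustration of the ILP idea: the paper's actual objective also handles relations between concepts and merged coreferent mentions, which are not modelled here, and the concepts and importance values are made up.

```python
# Sketch of the final ILP selection step: choose concepts maximising total
# importance under a map-size budget. The paper's full objective (relations
# between concepts, merged coreferent mentions) is not modelled here.
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum, value

concepts = ["neural networks", "summarization", "concept maps", "ILP"]
importance = [0.9, 0.8, 0.7, 0.4]     # e.g. from the supervised importance model
max_concepts = 2                      # size budget of the summary concept map

prob = LpProblem("concept_map_summary", LpMaximize)
x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(len(concepts))]
prob += lpSum(importance[i] * x[i] for i in range(len(concepts)))   # objective
prob += lpSum(x) <= max_concepts                                    # budget constraint
prob.solve()

print([c for c, var in zip(concepts, x) if value(var) == 1])
```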

Proceedings ArticleDOI
01 Jun 2017
TL;DR: A rank based sentence selection using continuous vector representations along with key-phrases is implemented and a model to tackle summary coherence for increasing readability is proposed.
Abstract: In this work, we aim at developing an extractive summarizer in the multi-document setting. We implement a rank-based sentence selection using continuous vector representations along with key-phrases. Furthermore, we propose a model to tackle summary coherence for increasing readability. We conduct experiments on the Document Understanding Conference (DUC) 2004 datasets using the ROUGE toolkit. Our experiments demonstrate that the methods bring significant improvements over state-of-the-art methods in terms of informativeness and coherence.

Journal ArticleDOI
TL;DR: The main challenges for Arabic text summarization are described and the various methodologies and systems in the literature are surveyed; this survey would be a good basis for the design of an Arabic automatic text summarization system that combines the various "good" features of the existing systems and dismisses the "not-so-good" ones.

Proceedings ArticleDOI
01 Jul 2017
TL;DR: A new supervised framework is presented that learns to estimate automatic Pyramid scores and uses them for optimization-based extractive multi-document summarization and there is much room for improvement in comparison with the upper-bound for automatic Pyramid.
Abstract: We present a new supervised framework that learns to estimate automatic Pyramid scores and uses them for optimization-based extractive multi-document summarization. For learning automatic Pyramid scores, we developed a method for automatic training data generation which is based on a genetic algorithm using automatic Pyramid as the fitness function. Our experimental evaluation shows that our new framework significantly outperforms strong baselines regarding automatic Pyramid, and that there is much room for improvement in comparison with the upper-bound for automatic Pyramid.

Journal ArticleDOI
TL;DR: A novel unsupervised integrated score framework to generate generic extractive multi-document summaries by ranking sentences based on a dynamic programming (DP) strategy that comprehensively takes the relevance, diversity, informativeness and length constraint of sentences into consideration.
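One way to picture DP-based extraction under a length constraint is as a 0/1 knapsack over precomputed sentence scores, as in the sketch below. This is an assumption for illustration only: the framework's integrated relevance/diversity/informativeness score and its actual DP formulation are not reproduced.

```python
# Sketch of DP-based extraction under a length budget, treated as a 0/1
# knapsack: maximise the summed sentence scores subject to a word limit. The
# framework's integrated score is assumed to be precomputed into `scores`.
def dp_select(scores, lengths, budget):
    n = len(scores)
    # best[j] = (total score, chosen indices) using at most j words
    best = [(0.0, [])] * (budget + 1)
    for i in range(n):
        new_best = best[:]
        for j in range(lengths[i], budget + 1):
            cand = best[j - lengths[i]][0] + scores[i]
            if cand > new_best[j][0]:
                new_best[j] = (cand, best[j - lengths[i]][1] + [i])
        best = new_best
    return max(best, key=lambda t: t[0])[1]

scores = [2.1, 1.4, 3.0, 0.9]       # integrated sentence scores (precomputed)
lengths = [18, 12, 25, 9]           # sentence lengths in words
print(dp_select(scores, lengths, budget=40))   # indices of selected sentences
```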

Proceedings ArticleDOI
20 Jul 2017
TL;DR: This research focuses on improving summarization accuracy with sentiment analysis of movie review posts using RapidMiner operators, including the Aylien Text Analysis extension.
Abstract: Automatic text summarization is one of the important challenges in natural language processing. It helps readers save time by automatically extracting the important information from a lengthy document. Sentiment analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text to determine whether the writer's attitude towards a particular topic, product, etc., is positive, negative, or neutral. This research focuses on improving summarization accuracy with sentiment analysis of movie review posts using RapidMiner operators. The first summarization model is built using the Aylien Text Analysis extension. The proposed second model is built using the Text Processing extension. For both methods, sentiment analysis is done using the same Aylien Text Analysis extension to evaluate the summarization results. An accuracy of 90% is achieved for sentiment analysis using the first model and 96% for the second model.

Journal ArticleDOI
TL;DR: This paper provides a methodology to model temporal context globally and locally, and proposes a novel unsupervised summarization framework with social-temporal context for Twitter data; experiments on a manually labeled dataset demonstrate the importance of social-temporal context in Twitter summarization.
Abstract: Twitter is one of the most popular social media platforms for online users to create and share information. Tweets are short, informal, and large-scale, which makes it difficult for online users to find reliable and useful information, giving rise to the problem of Twitter summarization. On the one hand, tweets are short and highly unstructured, which makes it difficult for traditional document summarization methods to handle Twitter data. On the other hand, Twitter provides rich social-temporal context beyond texts, bringing about new opportunities. In this paper, we investigate how to exploit social-temporal context for Twitter summarization. In particular, we provide a methodology to model temporal context globally and locally, and propose a novel unsupervised summarization framework with social-temporal context for Twitter data. To assess the proposed framework, we manually label a real-world Twitter dataset. Experimental results on the dataset demonstrate the importance of social-temporal context in Twitter summarization.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A novel interactive summarization system is presented that is based on abstractive summarization, derived from a recent consolidated knowledge representation for multiple texts, providing a bullet-style summary while allowing users to obtain the most important information first and interactively drill down to more specific details.
Abstract: We present a novel interactive summarization system that is based on abstractive summarization, derived from a recent consolidated knowledge representation for multiple texts. We incorporate a couple of interaction mechanisms, providing a bullet-style summary while allowing users to obtain the most important information first and interactively drill down to more specific details. A usability study of our implementation, for event news tweets, suggests the utility of our approach for text exploration.