Book Chapter DOI

Information Extraction and Sentiment Analysis to Gain Insight into the COVID-19 Crisis

About: The article was published on 2022-01-01 and has received 8 citations to date. It focuses on the topics of sentiment analysis and information extraction.
Citations
Book Chapter DOI
TL;DR: The goal of this chapter is to review the literature on artificial intelligence and machine learning algorithms for assessing a person's mental health using patient health records, and to explain the use of artificial intelligence in treating and monitoring patients with mental illness through telemedicine.
Abstract: Artificial intelligence is a major part of the healthcare industry, with applications in oncology, cardiology, dermatology, and many other fields. Mental healthcare is another area where AI is steadily making inroads, integrating machine learning to evaluate data generated by mobile and IoT devices. AI aids in the diagnosis of mentally ill individuals and in tailoring their therapy at various stages. Artificial intelligence and machine learning methods utilize electronic health records, mood rating scales, brain images, and mobile-device monitoring data for the prediction, classification, and grouping of mental health issues, mainly psychiatric illness, suicide attempts, schizophrenia, and depression. The goal of this chapter is to review the literature on artificial intelligence and machine learning algorithms for assessing a person's mental health using patient health records. In addition, the chapter explains the use of artificial intelligence in treating and monitoring patients with mental illness through telemedicine.

6 citations

Journal Article DOI
TL;DR: In this article, an incremental topic model with word embedding (ITMWE) is proposed that processes large text data in an incremental environment and extracts latent topics that best describe the document collections.
Abstract: The usage of various software applications has grown tremendously with the onset of Industry 4.0, giving rise to the accumulation of all forms of data. Scientific, biological, and social media text collections demand efficient machine learning methods for data interpretability, which organizations need for decision-making of all sorts. Topic models can be applied in text mining of biomedical articles, scientific articles, Twitter data, and blog posts. This paper analyzes and compares the performance of Latent Dirichlet Allocation (LDA), Dynamic Topic Model (DTM), and Embedded Topic Model (ETM) techniques. An incremental topic model with word embedding (ITMWE) is proposed that processes large text data in an incremental environment and extracts latent topics that best describe the document collections. Experiments in both offline and online settings on large real-world document collections such as CORD-19, NIPS papers, and Tweet datasets show that, while LDA and DTM are good models for discovering word-level topics, ITMWE discovers better document-level topic groups more efficiently in a dynamic environment, which is crucial in text mining applications.
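To make the incremental setting concrete, here is a minimal sketch using plain gensim LDA (not the authors' ITMWE): train on an initial batch, then fold in a later batch with update() instead of retraining from scratch. The toy documents are placeholders.

```python
# Minimal sketch of incremental topic modelling with plain gensim LDA;
# ITMWE itself adds word embeddings, which are not reproduced here.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [
    ["topic", "models", "find", "latent", "themes"],
    ["embeddings", "capture", "semantic", "similarity"],
    ["tweets", "arrive", "as", "a", "stream"],
]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=5)

# A later batch is folded in without retraining; unseen words are dropped
# by doc2bow, since the vocabulary was fixed at training time.
new_batch = [["stream", "of", "tweets", "and", "topic", "drift"]]
lda.update([dictionary.doc2bow(d) for d in new_batch])
print(lda.print_topics())
```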

2 citations

Book Chapter DOI
01 Jan 2022
TL;DR: In this article, a preprocessing framework is proposed that combines the best preprocessing techniques for both Hindi and English text classification; experiments on Tweets, Movie Reviews, and Product Reviews datasets reveal that selecting optimal pairings of preprocessing tasks, rather than enabling or removing them wholesale, can significantly improve classification accuracy.
Abstract: Good-quality text data gives a better interpretation of results and yields efficient learning models. Real-world text data comes with many irregularities and errors that require good preprocessing strategies to obtain a quality text corpus. Preprocessing before text classification involves various steps of transformation and manipulation of the documents. The proposed preprocessing framework combines the best preprocessing techniques for both Hindi and English. This study examines the significance of preprocessing methods for text classification from different perspectives, including classification accuracy, text language, and feature selection. Experiments on Tweets, Movie Reviews, and Product Reviews datasets reveal that, depending on the domain and language under consideration, selecting optimal pairings of preprocessing tasks, rather than enabling or removing them wholesale, can significantly improve classification accuracy. According to our findings, some preprocessing steps are useful for Hindi-language texts but not for English-language texts.
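A hedged sketch of the toggleable-pipeline idea: each preprocessing task is a switch, so optimal pairings can be grid-searched per language and domain. The step choices, stopword list, and regex are illustrative assumptions, not the chapter's exact framework.

```python
# Illustrative toggleable preprocessing pipeline; the concrete steps and the
# tiny stopword list are placeholders, not the chapter's exact framework.
import re

EN_STOPWORDS = {"the", "is", "a", "of", "and"}  # tiny illustrative list

def preprocess(text, lowercase=True, strip_punct=True, drop_stopwords=True):
    if lowercase:
        text = text.lower()
    if strip_punct:
        # \w is Unicode-aware in Python 3, so Devanagari letters survive
        text = re.sub(r"[^\w\s]", " ", text)
    tokens = text.split()
    if drop_stopwords:
        tokens = [t for t in tokens if t not in EN_STOPWORDS]
    return tokens

# Pairings of toggles can then be compared by downstream classifier accuracy:
print(preprocess("The movie is great!", drop_stopwords=False))
```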

1 citation

Journal Article DOI
TL;DR: In this paper, a framework is proposed that extracts coherent aspects from reviews and applies extractive summarization to generate summaries, providing insight into reviews of tourist attractions through aspect-based sentiment analysis.
Abstract: Reading massive amounts of user-generated text and pulling out the relevant aspects and opinions is a complicated process. Summaries, on the other hand, help busy readers get the gist of the information quickly. Text summarization produces a shorter version of the original text that preserves its informational value and main idea. Humans have a hard time summarizing long texts by hand. Summarization approaches can be grouped under the broader techniques of extractive and abstractive summarization. The paper discusses the need for generating aspect-based summaries and for sentiment analysis. A framework is proposed that extracts coherent aspects from the reviews and applies extractive summarization to generate summaries. In addition, aspect-based sentiment analysis provides insights into the reviews of tourist attractions. The results are evaluated using crowdsourcing, Fairsumm, and the centroid method. The crowdsourcing method gives the best result on aspect-based summaries.
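As an illustration of the centroid baseline named in the abstract, the sketch below scores review sentences by cosine similarity to the TF-IDF centroid and keeps the top-k. The sentences and k are toy placeholders, and this is not the authors' full aspect-based pipeline.

```python
# Sketch of centroid-style extractive summarization (one of the evaluation
# baselines named in the abstract); data and top_k are toy placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    "The fort offers a stunning view of the city.",
    "Tickets were cheap but the queue was very long.",
    "Guides explain the history of the monument well.",
]

tfidf = TfidfVectorizer().fit_transform(sentences)
centroid = np.asarray(tfidf.mean(axis=0))            # centroid of all sentences
scores = cosine_similarity(tfidf, centroid).ravel()  # closeness to the centroid
top_k = 2
summary = [sentences[i] for i in np.argsort(scores)[::-1][:top_k]]
print(" ".join(summary))
```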

1 citation

Book Chapter DOI
01 Jan 2022
TL;DR: In this article, a real-time framework is presented for assessing depression and suicidal tendencies among people during COVID-19; the framework offers an alternative way to reduce suicidal tendencies during the pandemic through retweeting and other real-time interventions.
Abstract: The assessment of depression and suicidal tendencies among people due to COVID-19 has been little explored. The paper presents a real-time framework for the assessment of depression during the COVID-19 pandemic. This approach offers an alternative means of reducing suicidal tendencies during the pandemic through retweeting and other real-time interventions. Hence, the main objective of the present work is to develop a real-time framework to analyse sentiment and depression in people due to COVID-19. The experimental investigation is carried out on tweets streamed in real time from Twitter, adopting lexicon-based and machine learning (ML) approaches. Linear regression, K-nearest neighbour (KNN), and Naive Bayes models are trained and tested on 1,000 tweets to ascertain the accuracy of the sentiment distribution. Comparatively, the decision tree (98.75%) and Naive Bayes (80.33%) models show better accuracy; the data are visualised with word clouds to draw inferences from the sentiments.
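A minimal sketch of the lexicon-plus-ML pattern the abstract describes: label tweets with a lexicon polarity score, then train a Naive Bayes classifier on them. TextBlob stands in for the paper's unnamed lexicon, and the tweets are invented examples, not the study's data.

```python
# Hedged sketch: lexicon labels (TextBlob) feeding a Naive Bayes classifier.
# The lexicon choice and tweets are assumptions, not the paper's exact setup.
from textblob import TextBlob
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

tweets = [
    "This lockdown is terrible and depressing",
    "So happy and grateful for the vaccine, great news",
    "Awful week, everything feels hopeless",
    "Wonderful support from friends, feeling better today",
]

# Lexicon step: polarity in [-1, 1] turned into a coarse label
labels = ["negative" if TextBlob(t).sentiment.polarity < 0 else "positive"
          for t in tweets]

# ML step: bag-of-words Naive Bayes trained on the lexicon-derived labels
X = CountVectorizer().fit_transform(tweets)
clf = MultinomialNB().fit(X, labels)
print(clf.predict(X))
```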
References
Journal Article DOI
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
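For readers who want to try the model, here is a minimal sketch of fitting LDA with the variational inference the abstract describes, via scikit-learn's implementation; the four toy documents are placeholders.

```python
# Minimal LDA fit with variational inference (scikit-learn's implementation);
# the corpus is a toy placeholder.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "genes dna sequencing biology",
    "stocks market trading finance",
    "dna genome biology research",
    "finance banks market economy",
]

X = CountVectorizer().fit_transform(docs)      # per-document word counts
lda = LatentDirichletAllocation(n_components=2, learning_method="batch",
                                random_state=0).fit(X)
theta = lda.transform(X)   # per-document topic mixture: the "explicit
print(theta.round(2))      # representation of a document" from the abstract
```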

30,570 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations

22 May 2010
TL;DR: This work describes a Natural Language Processing software framework which is based on the idea of document streaming, i.e. processing corpora document after document, in a memory independent fashion, and implements several popular algorithms for topical inference, including Latent Semantic Analysis and Latent Dirichlet Allocation in a way that makes them completely independent of the training corpus size.
Abstract: Large corpora are ubiquitous in today's world, and memory quickly becomes the limiting factor in practical applications of the Vector Space Model (VSM). We identify a gap in existing VSM implementations: their scalability and ease of use. We describe a Natural Language Processing software framework based on the idea of document streaming, i.e. processing corpora document after document, in a memory-independent fashion. In this framework, we implement several popular algorithms for topical inference, including Latent Semantic Analysis and Latent Dirichlet Allocation, in a way that makes them completely independent of the training corpus size. Particular emphasis is placed on straightforward and intuitive framework design, so that modifications and extensions of the methods, and their application by interested practitioners, are effortless. We demonstrate the usefulness of our approach on a real-world scenario of computing document similarities within the existing digital library DML-CZ.
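The document-streaming idea can be sketched in a few lines of gensim: the corpus is a generator, so only one document is held in memory at a time. The file name and the one-document-per-line convention are illustrative assumptions.

```python
# Sketch of memory-independent document streaming with gensim; "corpus.txt"
# (one document per line) is an assumed placeholder file.
from gensim.corpora import Dictionary
from gensim.models import TfidfModel

def stream_docs(path="corpus.txt"):
    with open(path, encoding="utf-8") as f:
        for line in f:                  # one document per line
            yield line.lower().split()  # tokenize lazily; never load all docs

dictionary = Dictionary(stream_docs())  # first streaming pass builds the vocab
bow_stream = (dictionary.doc2bow(doc) for doc in stream_docs())
tfidf = TfidfModel(bow_stream)          # second pass fits the model
```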

3,965 citations

Proceedings Article
16 May 2014
TL;DR: Interestingly, using the authors' parsimonious rule-based model to assess the sentiment of tweets, it is found that VADER outperforms individual human raters, and generalizes more favorably across contexts than any of their benchmarks.
Abstract: The inherent nature of social media content poses serious challenges to practical applications of sentiment analysis. We present VADER, a simple rule-based model for general sentiment analysis, and compare its effectiveness to eleven typical state-of-practice benchmarks including LIWC, ANEW, the General Inquirer, SentiWordNet, and machine learning oriented techniques relying on Naive Bayes, Maximum Entropy, and Support Vector Machine (SVM) algorithms. Using a combination of qualitative and quantitative methods, we first construct and empirically validate a gold-standard list of lexical features (along with their associated sentiment intensity measures) which are specifically attuned to sentiment in microblog-like contexts. We then combine these lexical features with consideration for five general rules that embody grammatical and syntactical conventions for expressing and emphasizing sentiment intensity. Interestingly, using our parsimonious rule-based model to assess the sentiment of tweets, we find that VADER outperforms individual human raters (F1 Classification Accuracy = 0.96 and 0.84, respectively), and generalizes more favorably across contexts than any of our benchmarks.
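Usage is a few lines with the published vaderSentiment package; the example sentence is invented, and the ±0.05 compound-score cutoffs follow the thresholds conventionally recommended by the VADER authors.

```python
# Minimal VADER usage sketch; the sentence is an invented example and the
# +/-0.05 compound cutoffs are the conventionally recommended thresholds.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
scores = analyzer.polarity_scores("The service was great, but SO slow!!!")

# 'compound' aggregates the lexicon-and-rules score into [-1, 1]
label = ("positive" if scores["compound"] >= 0.05
         else "negative" if scores["compound"] <= -0.05
         else "neutral")
print(scores, label)
```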

3,299 citations

Proceedings Article DOI
25 Jun 2006
TL;DR: A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections, and dynamic topic models provide a qualitative window into the contents of a large document collection.
Abstract: A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics. Variational approximations based on Kalman filters and nonparametric wavelet regression are developed to carry out approximate posterior inference over the latent topics. In addition to giving quantitative, predictive models of a sequential corpus, dynamic topic models provide a qualitative window into the contents of a large document collection. The models are demonstrated by analyzing the OCR'ed archives of the journal Science from 1880 through 2000.
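A hedged sketch of a dynamic topic model over time slices, using gensim's LdaSeqModel as a stand-in DTM implementation; the documents and slice sizes are toy placeholders, not the paper's Science archive.

```python
# Toy dynamic topic model with gensim's LdaSeqModel; real runs need far
# more documents per time slice than this placeholder corpus.
from gensim.corpora import Dictionary
from gensim.models import LdaSeqModel

docs = [
    ["atom", "force", "energy"], ["cell", "theory", "energy"],  # era 1
    ["quantum", "energy", "field"], ["gene", "cell", "dna"],    # era 2
]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]

# time_slice gives the number of documents in each consecutive era
dtm = LdaSeqModel(corpus=corpus, id2word=dictionary,
                  time_slice=[2, 2], num_topics=2)
print(dtm.print_topics(time=0))  # each topic's word distribution in era 1
```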

2,410 citations