scispace - formally typeset
Search or ask a question
Author

Rajib Bag

Bio: Rajib Bag is an academic researcher. The author has contributed to research in topics: Salt-and-pepper noise & Impulse noise. The author has an hindex of 8, co-authored 29 publications receiving 207 citations.

Papers
More filters
Journal ArticleDOI
01 Dec 2020
TL;DR: The research demonstrates that though people have tweeted mostly positive regarding COVID-19, yet netizens were busy engrossed in re-tweeting the negative tweets and that no useful words could be found in WordCloud or computations using word frequency in tweets.
Abstract: COVID-19 originally known as Corona VIrus Disease of 2019, has been declared as a pandemic by World Health Organization (WHO) on 11th March 2020. Unprecedented pressures have mounted on each country to make compelling requisites for controlling the population by assessing the cases and properly utilizing available resources. The rapid number of exponential cases globally has become the apprehension of panic, fear and anxiety among people. The mental and physical health of the global population is found to be directly proportional to this pandemic disease. The current situation has reported more than twenty four million people being tested positive worldwide as of 27th August, 2020. Therefore, it is the need of the hour to implement different measures to safeguard the countries by demystifying the pertinent facts and information. This paper aims to bring out the fact that tweets containing all handles related to COVID-19 and WHO have been unsuccessful in guiding people around this pandemic outbreak appositely. This study analyzes two types of tweets gathered during the pandemic times. In one case, around twenty three thousand most re-tweeted tweets within the time span from 1st Jan 2019 to 23rd March 2020 have been analyzed and observation says that the maximum number of the tweets portrays neutral or negative sentiments. On the other hand, a dataset containing 226,668 tweets collected within the time span between December 2019 and May 2020 have been analyzed which contrastingly show that there were a maximum number of positive and neutral tweets tweeted by netizens. The research demonstrates that though people have tweeted mostly positive regarding COVID-19, yet netizens were busy engrossed in re-tweeting the negative tweets and that no useful words could be found in WordCloud or computations using word frequency in tweets. The claims have been validated through a proposed model using deep learning classifiers with admissible accuracy up to 81%. Apart from these the authors have proposed the implementation of a Gaussian membership function based fuzzy rule base to correctly identify sentiments from tweets. The accuracy for the said model yields up to a permissible rate of 79%.

226 citations

Journal ArticleDOI
TL;DR: The process of capturing data from social media over the years along with the similarity detection based on similar choices of the users in social networks are addressed.
Abstract: In the current era of automation, machines are constantly being channelized to provide accurate interpretations of what people express on social media. The human race nowadays is submerged in the idea of what and how people think and the decisions taken thereafter are mostly based on the drift of the masses on social platforms. This article provides a multifaceted insight into the evolution of sentiment analysis into the limelight through the sudden explosion of plethora of data on the internet. This article also addresses the process of capturing data from social media over the years along with the similarity detection based on similar choices of the users in social networks. The techniques of communalizing user data have also been surveyed in this article. Data, in its different forms, have also been analyzed and presented as a part of survey in this article. Other than this, the methods of evaluating sentiments have been studied, categorized, and compared, and the limitations exposed in the hope that this shall provide scope for better research in the future.

82 citations

Journal ArticleDOI
09 Aug 2020
TL;DR: A deep learning-based ‘You Only Look Once (YOLO)’ algorithm, which is based on the application of DCNNs to detect melanoma from dermoscopic and digital images and offer faster and more precise output as compared to conventional CNNs.
Abstract: Melanoma or malignant melanoma is a type of skin cancer that develops when melanocyte cells, damaged by excessive exposure to harmful UV radiations, start to grow out of control. Though less common than some other kinds of skin cancers, it is more dangerous because it rapidly metastasizes if not diagnosed and treated at an early stage. The distinction between benign and melanocytic lesions could at times be perplexing, but the manifestations of the disease could fairly be distinguished by a skilled study of its histopathological and clinical features. In recent years, deep convolutional neural networks (DCNNs) have succeeded in achieving more encouraging results yet faster and computationally effective systems for detection of the fatal disease are the need of the hour. This paper presents a deep learning-based ‘You Only Look Once (YOLO)’ algorithm, which is based on the application of DCNNs to detect melanoma from dermoscopic and digital images and offer faster and more precise output as compared to conventional CNNs. In terms with the location of the identified object in the cell, this network predicts the bounding box of the detected object and the class confidence score. The highlight of the paper, however, lies in its infusion of certain resourceful concepts like two phase segmentation done by a combination of the graph theory using minimal spanning tree concept and L-type fuzzy number based approximations and mathematical extraction of the actual affected area of the lesion region during feature extraction process. Experimented on a total of 20250 images from three publicly accessible datasets—PH2, International Symposium on Biomedical Imaging (ISBI) 2017 and The International Skin Imaging Collaboration (ISIC) 2019, encouraging results have been obtained. It achieved a Jac score of 79.84% on ISIC 2019 dataset and 86.99% and 88.64% on ISBI 2017 and PH2 datasets, respectively. Upon comparison of the pre-defined parameters with recent works in this area yielded comparatively superior output in most cases.

53 citations

Book ChapterDOI
22 Feb 2018
TL;DR: Google’s algorithm Word2Vec has been applied on a large movie review dataset to classify text so that the semantic associations between the terms stay conserved, and a comparative study of the performances of some notable clustering algorithms is demonstrated.
Abstract: This paper provides an insight to one of the recent additions in the turf of Machine Learning culture - the process of learning representation or features, known as Deep Learning It is highly anticipated that Deep Learning will fare much better than the traditional machine learning algorithms not only because of scalability but also of its ability to perform automatic feature extraction from raw data This paper deals with the analyzing of sentiments on a set of movie reviews, which is considered to be the most demanding facet of NLP’s In this paper, Google’s algorithm Word2Vec has been applied on a large movie review dataset to classify text so that the semantic associations between the terms stay conserved A comparative study of the performances of some notable clustering algorithms is demonstrated concerning their application involving a variable number of features and classifier types as well as variable number of clusters

27 citations


Cited by
More filters
Journal ArticleDOI
01 Dec 2020
TL;DR: The research demonstrates that though people have tweeted mostly positive regarding COVID-19, yet netizens were busy engrossed in re-tweeting the negative tweets and that no useful words could be found in WordCloud or computations using word frequency in tweets.
Abstract: COVID-19 originally known as Corona VIrus Disease of 2019, has been declared as a pandemic by World Health Organization (WHO) on 11th March 2020. Unprecedented pressures have mounted on each country to make compelling requisites for controlling the population by assessing the cases and properly utilizing available resources. The rapid number of exponential cases globally has become the apprehension of panic, fear and anxiety among people. The mental and physical health of the global population is found to be directly proportional to this pandemic disease. The current situation has reported more than twenty four million people being tested positive worldwide as of 27th August, 2020. Therefore, it is the need of the hour to implement different measures to safeguard the countries by demystifying the pertinent facts and information. This paper aims to bring out the fact that tweets containing all handles related to COVID-19 and WHO have been unsuccessful in guiding people around this pandemic outbreak appositely. This study analyzes two types of tweets gathered during the pandemic times. In one case, around twenty three thousand most re-tweeted tweets within the time span from 1st Jan 2019 to 23rd March 2020 have been analyzed and observation says that the maximum number of the tweets portrays neutral or negative sentiments. On the other hand, a dataset containing 226,668 tweets collected within the time span between December 2019 and May 2020 have been analyzed which contrastingly show that there were a maximum number of positive and neutral tweets tweeted by netizens. The research demonstrates that though people have tweeted mostly positive regarding COVID-19, yet netizens were busy engrossed in re-tweeting the negative tweets and that no useful words could be found in WordCloud or computations using word frequency in tweets. The claims have been validated through a proposed model using deep learning classifiers with admissible accuracy up to 81%. Apart from these the authors have proposed the implementation of a Gaussian membership function based fuzzy rule base to correctly identify sentiments from tweets. The accuracy for the said model yields up to a permissible rate of 79%.

226 citations

05 Feb 2006
TL;DR: In this article, Naive Bayes Classifiers are used for Naïve Bayes classifiers to train a classifier for classifier training, and the classifier is evaluated.
Abstract: 在Naive Bayes Classifiers模型中,要求父节点下的子节点(特征变量)之间相对独立,然而在现实世界中,特征与特征之间是非独立的、相关的。提出一种预处理方法,实验结果表明,该方法明显地提高了分类精度。

213 citations

Book Chapter
01 Dec 2001
TL;DR: In this article, a summary of the issues discussed during the one day workshop on SVM Theory and Applications organized as part of the Advanced Course on Artificial Intelligence (ACAI ’99) in Chania, Greece is presented.
Abstract: This chapter presents a summary of the issues discussed during the one day workshop on “Support Vector Machines (SVM) Theory and Applications” organized as part of the Advanced Course on Artificial Intelligence (ACAI ’99) in Chania, Greece [19]. The goal of the chapter is twofold: to present an overview of the background theory and current understanding of SVM, and to discuss the papers presented as well as the issues that arose during the workshop.

170 citations

Journal ArticleDOI
25 Feb 2021-PLOS ONE
TL;DR: In this paper, the authors performed Covid-19 tweets sentiment analysis using a supervised machine learning approach using a bag-of-words and the term frequency-inverse document frequency.
Abstract: The spread of Covid-19 has resulted in worldwide health concerns. Social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to utilize resources optimally and appropriately. In this research, we perform Covid-19 tweets sentiment analysis using a supervised machine learning approach. Identification of Covid-19 sentiments from tweets would allow informed decisions for better handling the current pandemic situation. The used dataset is extracted from Twitter using IDs as provided by the IEEE data port. Tweets are extracted by an in-house built crawler that uses the Tweepy library. The dataset is cleaned using the preprocessing techniques and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set. This set is formed by concatenating the bag-of-words and the term frequency-inverse document frequency. Tweets are classified as positive, neutral, or negative. Performance of classifiers is evaluated on the accuracy, precision, recall, and F1 score. For completeness, further investigation is made on the dataset using the Long Short-Term Memory (LSTM) architecture of the deep learning model. The results show that Extra Trees Classifiers outperform all other models by achieving a 0.93 accuracy score using our proposed concatenated features set. The LSTM achieves low accuracy as compared to machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the Vader sentiment analysis technique based on the GloVe feature extraction approach.

144 citations

Journal ArticleDOI
TL;DR: This paper used topic identification and sentiment analysis to explore a large number of tweets in both countries with a high number of spreading and deaths by COVID-19, Brazil, and the USA.

139 citations