Journal ArticleDOI

A Survey of Sentiment Analysis from Social Media Data

TL;DR: The process of capturing data from social media over the years is addressed, along with similarity detection based on the similar choices of users in social networks.
Abstract: In the current era of automation, machines are constantly being channelized to provide accurate interpretations of what people express on social media. The human race nowadays is submerged in the idea of what and how people think, and the decisions taken thereafter are mostly based on the drift of the masses on social platforms. This article provides a multifaceted insight into how sentiment analysis came into the limelight through the sudden explosion of a plethora of data on the internet. It also addresses the process of capturing data from social media over the years, along with similarity detection based on the similar choices of users in social networks. The techniques for communalizing user data are also surveyed, and data, in its different forms, is analyzed and presented as part of the survey. Beyond this, the methods of evaluating sentiments are studied, categorized, and compared, and their limitations exposed, in the hope that this shall provide scope for better research in the future.
Citations
Journal ArticleDOI
01 Dec 2020
TL;DR: The research demonstrates that although people tweeted mostly positively about COVID-19, netizens were largely engaged in re-tweeting the negative tweets, and no useful words could be found through word clouds or word-frequency computations on the tweets.
Abstract: COVID-19, short for COronaVIrus Disease of 2019, was declared a pandemic by the World Health Organization (WHO) on 11th March 2020. Unprecedented pressure has mounted on every country to control the outbreak by assessing cases and properly utilizing available resources. The rapid exponential growth of cases globally has spread panic, fear, and anxiety among people, and the mental and physical health of the global population has been directly affected by the pandemic. More than twenty-four million people had tested positive worldwide as of 27th August 2020. It is therefore the need of the hour to implement measures that safeguard countries by demystifying the pertinent facts and information. This paper aims to show that tweets containing handles related to COVID-19 and WHO have been unsuccessful in appositely guiding people through the pandemic outbreak. The study analyzes two sets of tweets gathered during the pandemic. In one case, around twenty-three thousand of the most re-tweeted tweets from 1st January 2019 to 23rd March 2020 are analyzed, and the observations show that most of these tweets carry neutral or negative sentiments. In the other case, a dataset of 226,668 tweets collected between December 2019 and May 2020 is analyzed and, in contrast, shows that netizens posted mostly positive and neutral tweets. The research demonstrates that although people tweeted mostly positively about COVID-19, netizens were largely engaged in re-tweeting the negative tweets, and that no useful words could be found through word clouds or word-frequency computations on the tweets. The claims are validated through a proposed model using deep learning classifiers with an admissible accuracy of up to 81%. In addition, the authors propose a Gaussian membership function based fuzzy rule base to correctly identify sentiments from tweets; this model yields an accuracy of up to 79%.
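The Gaussian-membership fuzzy rule base is not detailed in this listing; as a rough illustration of the idea, the sketch below maps a tweet's polarity score in [-1, 1] onto "negative", "neutral", and "positive" fuzzy sets via Gaussian membership functions. The means and spreads are hypothetical, not taken from the paper.

```python
import numpy as np

def gaussian_mf(x, mean, sigma):
    """Gaussian membership function: degree of membership of x in a fuzzy set."""
    return np.exp(-((x - mean) ** 2) / (2 * sigma ** 2))

# Hypothetical fuzzy sets over a polarity score in [-1, 1]: (mean, sigma)
FUZZY_SETS = {
    "negative": (-1.0, 0.4),
    "neutral":  ( 0.0, 0.25),
    "positive": ( 1.0, 0.4),
}

def classify_sentiment(polarity):
    """Return the fuzzy label with the highest membership degree, plus all degrees."""
    memberships = {label: gaussian_mf(polarity, m, s)
                   for label, (m, s) in FUZZY_SETS.items()}
    return max(memberships, key=memberships.get), memberships

label, degrees = classify_sentiment(0.35)  # e.g. a mildly positive tweet score
print(label, degrees)
```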

226 citations


Cites background from "A Survey of Sentiment Analysis from..."

  • ...standing the demanding situation especially on social media [7]....

    [...]

05 Feb 2006
TL;DR: In this article, a preprocessing method is proposed to handle correlated features in Naive Bayes classifiers, which assume conditional independence between features, and the method is shown to improve classification accuracy.
Abstract: In the Naive Bayes classifier model, the child nodes (feature variables) under a parent node are required to be relatively independent of one another; in the real world, however, features are not independent but correlated. A preprocessing method is proposed, and experimental results show that it clearly improves classification accuracy.
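The proposed preprocessing method itself is not reproduced in this listing; the sketch below only illustrates the baseline Naive Bayes setup whose conditional-independence assumption the paper addresses, using scikit-learn on a toy corpus (texts and labels are invented for illustration).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus; multinomial Naive Bayes treats each token count as
# conditionally independent given the class label.
texts  = ["great phone, love it", "terrible battery, hate it",
          "love the screen", "hate the lag"]
labels = ["pos", "neg", "pos", "neg"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["love the battery"]))  # expected: ['pos']
```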

213 citations

Book Chapter
01 Dec 2001
TL;DR: In this article, a summary of the issues discussed during the one-day workshop on SVM Theory and Applications, organized as part of the Advanced Course on Artificial Intelligence (ACAI ’99) in Chania, Greece, is presented.
Abstract: This chapter presents a summary of the issues discussed during the one-day workshop on “Support Vector Machines (SVM) Theory and Applications” organized as part of the Advanced Course on Artificial Intelligence (ACAI ’99) in Chania, Greece [19]. The goal of the chapter is twofold: to present an overview of the background theory and current understanding of SVM, and to discuss the papers presented as well as the issues that arose during the workshop.

170 citations

Journal ArticleDOI
TL;DR: A comparative review of state-of-the-art deep learning methods is provided and several commonly used benchmark data sets, evaluation metrics, and the performance of the existing deep learning methods are introduced.
Abstract: Sentiment analysis is a process of analyzing, processing, summarizing, and drawing inferences from subjective texts carrying sentiment. Companies use sentiment analysis for understanding public opinion, performing market research, analyzing brand reputation, recognizing customer experiences, and studying social media influence. According to the granularity required, it can be divided into document-level, sentence-level, and aspect-based analysis. This article summarizes recently proposed methods for the aspect-based sentiment analysis problem. At present, there are three mainstream approaches: lexicon-based, traditional machine learning, and deep learning methods. In this survey article, we provide a comparative review of state-of-the-art deep learning methods. Several commonly used benchmark data sets, evaluation metrics, and the performance of the existing deep learning methods are introduced. Finally, existing problems and some future research directions are presented and discussed.
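As a rough illustration of the lexicon-based end of that spectrum, the sketch below scores an aspect term by summing the polarities of words in a small window around it. The lexicon, window size, and example sentence are invented for illustration, and real aspect-based systems are considerably more sophisticated.

```python
# Toy polarity lexicon for illustration; real systems use resources such as
# SentiWordNet or VADER rather than a hand-written dictionary.
LEXICON = {"good": 1.0, "great": 1.5, "excellent": 2.0,
           "bad": -1.0, "poor": -1.5, "terrible": -2.0}

def aspect_sentiment(sentence, aspect, window=2):
    """Score an aspect by summing lexicon polarities of words near its mention."""
    tokens = sentence.lower().split()
    if aspect not in tokens:
        return 0.0
    i = tokens.index(aspect)
    context = tokens[max(0, i - window): i + window + 1]
    return sum(LEXICON.get(tok, 0.0) for tok in context)

sentence = "the battery is terrible but the screen is great"
print(aspect_sentiment(sentence, "battery"))  # negative score
print(aspect_sentiment(sentence, "screen"))   # positive score
```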

59 citations


Cites background from "A Survey of Sentiment Analysis from..."

  • ...many statistical surveys and studies conducted in this area [82]....

    [...]

Journal ArticleDOI
01 Jul 2020
TL;DR: This article aims to summarize the recent advancement in the fake account detection methodology on social networking websites, and discusses the challenges and limitations of the existing models in brief.
Abstract: This article aims to summarize recent advancements in fake account detection methodology on social networking websites. Over the past decade, social networking websites have received huge attention from users all around the world. As a result, popular websites such as Facebook, Twitter, LinkedIn, Instagram, and others have seen an unexpected rise in registered users. However, researchers claim that not all registered accounts are real; many of them are fake and created for specific purposes. The primary purpose of fake accounts is to spread spam content, rumors, and other unauthentic messages on the platform. Hence, fake accounts need to be filtered out, but doing so involves many challenges. In the past few years, researchers have applied many advanced technologies to identify fake accounts. In the survey presented in this article, we summarize the recent development of fake account detection technologies and briefly discuss the challenges and limitations of the existing models. The survey may help future researchers identify the gaps in the current literature and develop a generalized framework for fake profile detection on social networking websites.

30 citations

References
Journal ArticleDOI
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.
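The leading terms of that asymptotic expansion give the now-standard Bayesian information criterion, which in its usual form selects the model minimizing

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L},
```

where $k$ is the number of free parameters, $n$ the sample size, and $\hat{L}$ the maximized likelihood of the model.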

38,681 citations

01 Jan 2005
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

36,760 citations


"A Survey of Sentiment Analysis from..." refers methods in this paper

  • ...With the recent outburst of social data, problem of finding recurrent itemsets in large data has been solved by MapReduce model [23], which deploys k-means clustering algorithm [24] to preprocess the data and frequent data sets are mined through a priori [25] and Eclat [26] algorithms [27]....

    [...]

Journal ArticleDOI
01 Apr 1998
TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
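The link-analysis ranking described in this paper is PageRank; the following is a minimal power-iteration sketch on a hypothetical three-page graph (the damping factor and link matrix are illustrative, not taken from the paper).

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=50):
    """Power iteration over a column-stochastic link matrix (illustrative only)."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=0)
    out_deg[out_deg == 0] = 1.0   # avoid division by zero for dangling pages
    M = adj / out_deg             # column j: distribution over page j's outgoing links
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - damping) / n + damping * M @ rank
    return rank

# Toy 3-page web: adj[i, j] = 1 if page j links to page i.
# Page 0 -> 1, page 1 -> 2, page 2 -> 0 and 1.
links = np.array([[0, 0, 1],
                  [1, 0, 1],
                  [0, 1, 0]], dtype=float)
print(pagerank(links))
```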

14,696 citations


"A Survey of Sentiment Analysis from..." refers methods in this paper

  • ...Accordingly, the framework of Google was explained in detail [6] along with its incorporated features of scalability and sturdiness so that they yield perfect search results....

    [...]

Journal ArticleDOI
TL;DR: This work proposes a heuristic method that is shown to outperform all other known community detection methods in terms of computation time, while the quality of the communities detected is very good, as measured by the so-called modularity.
Abstract: We propose a simple method to extract the community structure of large networks. Our method is a heuristic based on modularity optimization. It is shown to outperform all other known community detection methods in terms of computation time. Moreover, the quality of the communities detected is very good, as measured by the so-called modularity. This is shown first by identifying language communities in a Belgian mobile phone network of 2.6 million customers and then by analyzing a web graph of 118 million nodes and more than one billion links. The accuracy of our algorithm is also verified on ad hoc modular networks.
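The quantity optimized by this heuristic (the Louvain method) is the standard modularity of a partition:

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),
```

where $A$ is the adjacency matrix, $k_i$ the degree of node $i$, $m$ the total number of edges, and $\delta(c_i, c_j) = 1$ when nodes $i$ and $j$ are assigned to the same community.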

13,519 citations


"A Survey of Sentiment Analysis from..." refers background in this paper

  • ...Other notable works that need mention in the community detection field are that of techniques in which heuristics were used to optimize the modularity [48], genetics supported approach...

    [...]

Journal Article
TL;DR: Google, as discussed by the authors, is a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.

13,327 citations