Journal Article•DOI•

Machine Learning Techniques for Sentiment Analysis: A Review

About: This article is published in SAMRIDDHI : A Journal of Physical Sciences, Engineering and Technology. The article was published on 2020-12-30 and is currently open access. It has received 58 citations to date. The article focuses on the topic: Sentiment analysis.
Citations
Journal Article•DOI•
TL;DR: This paper proposes a new sentiment analysis model, SLCABG, which is based on the sentiment lexicon and combines a Convolutional Neural Network (CNN) with an attention-based Bidirectional Gated Recurrent Unit (BiGRU).
Abstract: In recent years, with the rapid development of Internet technology, online shopping has become a mainstream way for users to purchase and consume. Sentiment analysis of the large number of user reviews on e-commerce platforms can effectively improve user satisfaction. This paper proposes a new sentiment analysis model, SLCABG, which is based on the sentiment lexicon and combines a Convolutional Neural Network (CNN) with an attention-based Bidirectional Gated Recurrent Unit (BiGRU). In terms of methods, the SLCABG model combines the advantages of the sentiment lexicon and deep learning techniques, and overcomes the shortcomings of existing sentiment analysis models for product reviews. First, the sentiment lexicon is used to enhance the sentiment features in the reviews. Then the CNN and the Gated Recurrent Unit (GRU) network extract the main sentiment features and context features from the reviews, and an attention mechanism weights them. Finally, the weighted sentiment features are classified. In terms of data, this paper crawls and cleans real book reviews from dangdang.com, a well-known Chinese e-commerce website, for training and testing; all of the data is in Chinese. The data set reaches a scale on the order of 100,000 and can be widely used in the field of Chinese sentiment analysis. The experimental results show that the model can effectively improve the performance of text sentiment analysis.
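The attention weighting step mentioned in this abstract can be illustrated with a generic softmax attention pooling sketch. This is not the SLCABG implementation: the function name `attention_pool`, the toy feature vectors, and the relevance scores are assumptions made for illustration, and in a trained model the scores would come from learned parameters.

```python
import math


def attention_pool(features, scores):
    """Softmax-normalize the scores and return (weights, weighted sum of features).

    features: list of equal-length feature vectors (one per time step).
    scores:   one unnormalized relevance score per feature vector.
    """
    # Subtract the maximum score for numerical stability before exponentiating.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum over time steps, computed dimension by dimension.
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled


# Toy input (hypothetical): two time steps with equal scores get equal weight.
weights, pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With equal scores the weights are uniform, so the pooled vector is the plain average of the inputs; a larger score shifts weight toward its time step.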

242 citations


Cites methods from "Machine Learning Techniques for Sen..."

  • ...Traditional machine learning methods commonly used include naive bayes, support vector machine, maximum entropy, random forest and conditional random fields model [18], [19]....

    [...]

Journal Article•DOI•
TL;DR: This systematic overview of affective computing reviews recent advances, surveys and taxonomizes state-of-the-art unimodal affect recognition and multimodal affective analysis in terms of their detailed architectures and performances, and concludes with an indication of the most promising future directions.

61 citations

Journal Article•DOI•
TL;DR: Several machine learning classification techniques are used to predict software defects in twelve widely used NASA datasets; the results can serve as a baseline for other research, so that any claimed improvement in prediction from a new technique, model or framework can be compared and verified.
Abstract: Defect prediction at the early stages of the software development life cycle is a crucial activity of the quality assurance process and has been broadly studied in the last two decades. The early prediction of defective modules in developing software can help the development team to utilize the available resources efficiently and effectively to deliver a high quality software product in limited time. Until now, many researchers have developed defect prediction models by using machine learning and statistical techniques. The machine learning approach is an effective way to identify defective modules, and works by extracting the hidden patterns among software attributes. In this study, several machine learning classification techniques are used to predict software defects in twelve widely used NASA datasets. The classification techniques include: Naive Bayes (NB), Multi-Layer Perceptron (MLP), Radial Basis Function (RBF), Support Vector Machine (SVM), K Nearest Neighbor (KNN), kStar (K*), One Rule (OneR), PART, Decision Tree (DT), and Random Forest (RF). The performance of the classification techniques is evaluated using measures such as Precision, Recall, F-Measure, Accuracy, MCC, and ROC Area. The detailed results in this research can be used as a baseline for other research, so that any claimed improvement in prediction from a new technique, model or framework can be compared and verified.
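Several of the evaluation measures named in this abstract follow directly from the binary confusion matrix. The sketch below uses the standard textbook definitions of precision, recall, F-measure (F1), accuracy, and MCC; the function name `binary_metrics` and the toy labels are assumptions for illustration and are not taken from the paper.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute common binary classification metrics from 0/1 label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    accuracy = safe_div(tp + tn, len(pairs))
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Toy labels (hypothetical): 1 = defective module, 0 = clean module.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
```

ROC Area is omitted here because it needs ranked scores rather than hard 0/1 predictions.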

59 citations


Cites methods from "Machine Learning Techniques for Sen..."

  • ...During the training process these techniques make rules to classify the unseen data (test data) [18-19], [20-23], [26-27]....

    [...]

Journal Article•DOI•
TL;DR: This study proposes a technique to tune SVM performance for sentiment analysis using the grid search method; the proposed technique is evaluated using three information retrieval metrics: precision, recall and f-measure.
Abstract: Exponential growth in mobile technology and mini computing devices has led to a massive increase in social media users, who continuously post their views and comments about the products and services they use. These views and comments can be extremely beneficial for companies that are interested in knowing the public opinion regarding their products or services. This type of public opinion can otherwise be obtained via questionnaires and surveys, which is a difficult and complex task. So, the valuable information in the form of comments and posts from micro-blogging sites can be used by companies to eliminate flaws and to improve their products or services according to customer needs. However, manually extracting a general opinion from a staggering number of users’ comments is not feasible. A solution is to use an automatic method for sentiment mining. Support Vector Machine (SVM) is one of the widely used classification techniques for polarity detection from textual data. This study proposes a technique to tune SVM performance for sentiment analysis using the grid search method. In this paper, three datasets are used for the experiment, and the performance of the proposed technique is evaluated using three information retrieval metrics: precision, recall and f-measure.
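Grid search, as named in this abstract, is an exhaustive evaluation of every combination of candidate hyperparameter values, keeping the best-scoring one. The sketch below shows only that search loop. It makes several assumptions: `grid_search` and `toy_score` are hypothetical names, `toy_score` is a stand-in for "train an SVM with these settings and return a validation F-measure", and `C` and `gamma` are common SVM hyperparameters that the abstract does not confirm were the ones tuned.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # Cartesian product over all candidate values, one tuple per combination.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, gamma):
    # Hypothetical stand-in for a validation score; it peaks at C=10, gamma=0.1.
    return -((C - 10) ** 2) / 100.0 - (gamma - 0.1) ** 2


best_params, best_score = grid_search(
    {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]},
    toy_score,
)
```

In practice the scoring function would wrap model training and cross-validation, which makes the cost grow with the product of the grid sizes.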

53 citations


Cites background or methods from "Machine Learning Techniques for Sen..."

  • ...Today, there are fundamentally three approaches available to extract the opinion from text: Lexicon driven [1], machine learning based [2] and finally the hybrid of both [3]....

    [...]

  • ...For training purpose pre-classified/pre-labeled data (training data) is used and then it can be capable to classify the real input data (test data) [2], [4], [5]....

    [...]

Journal Article•DOI•
TL;DR: A novel Google App numeric reviews & ratings contradiction prediction framework using Deep Learning approaches is proposed that significantly predicts the unbiased star rating of an app.
Abstract: Nowadays online reviews play a significant role in influencing the decisions of consumers. Consumers share their experience and information about product quality in their reviews. Platforms ranging from product reviews on Amazon to restaurant reviews on Yelp face problems with fake reviews and fake numeric ratings. Online reviews typically consist of qualitative (text) and quantitative (rating) formats. In the case of the Google Play store, fake numeric ratings can play a big role in the success of apps. People tend to believe that a high star rating is associated with a good review. However, the star rating given by a user does not always match the text of the review. Despite many efforts to resolve this issue, the Apple App Store and Google Play Store are still facing this problem. This study proposes a novel Google App numeric reviews & ratings contradiction prediction framework using Deep Learning approaches. The framework consists of two phases. In the first phase, the polarity of reviews is predicted using a sentiment analysis tool to build the ground truth. In the second phase, star ratings are predicted from the text of reviews after training deep learning models on the ground truth obtained in the first phase. Experimental results demonstrate that, based on actual user reviews, the proposed framework significantly predicts the unbiased star rating of an app.

52 citations

References
Journal Article•DOI•
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

40,826 citations

Book•
Bo Pang, Lillian Lee•
08 Jul 2008
TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.
Abstract: An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

7,452 citations

01 Jan 2002
TL;DR: In this paper, the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, was considered and three machine learning methods (Naive Bayes, maximum entropy classification, and support vector machines) were employed.
Abstract: We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.

6,980 citations

Proceedings Article•DOI•
06 Jul 2002
TL;DR: This work considers the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, and concludes by examining factors that make the sentiment classification problem more challenging.
Abstract: We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.
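Of the three methods named in this abstract, Naive Bayes is the simplest to sketch. The example below is a minimal multinomial Naive Bayes classifier with add-one (Laplace) smoothing over whitespace tokens. It is not the authors' setup: the function names `train_nb` and `predict_nb`, the tokenization, and the four toy reviews are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs: list of (text, label). Returns (class_counts, word_counts, vocab)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, float("-inf")
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        logp = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, label) pairs from zeroing out.
            logp += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Toy training reviews (made up for this sketch).
model = train_nb([
    ("a great and moving film", "pos"),
    ("great acting and a wonderful story", "pos"),
    ("a dull and boring film", "neg"),
    ("boring plot and terrible acting", "neg"),
])
label_a = predict_nb(model, "wonderful great story")
label_b = predict_nb(model, "terrible boring film")
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.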

6,626 citations

Proceedings Article•
01 May 2010
TL;DR: This paper shows how to automatically collect a corpus for sentiment analysis and opinion mining purposes and builds a sentiment classifier that is able to determine positive, negative and neutral sentiments for a document.
Abstract: Microblogging today has become a very popular communication tool among Internet users. Millions of users share opinions on different aspects of life every day. Therefore microblogging web-sites are rich sources of data for opinion mining and sentiment analysis. Because microblogging has appeared relatively recently, only a few research works have been devoted to this topic. In our paper, we focus on using Twitter, the most popular microblogging platform, for the task of sentiment analysis. We show how to automatically collect a corpus for sentiment analysis and opinion mining purposes. We perform linguistic analysis of the collected corpus and explain discovered phenomena. Using the corpus, we build a sentiment classifier that is able to determine positive, negative and neutral sentiments for a document. Experimental evaluations show that our proposed techniques are efficient and perform better than previously proposed methods. In our research, we worked with English; however, the proposed technique can be used with any other language.

2,570 citations