Journal ArticleDOI

Sentiment Analysis of Movie Reviews using Machine Learning Classifiers

11 Apr 2019-International Journal of Computer Applications (Foundation of Computer Science (FCS), NY, USA)-Vol. 182, Iss: 50, pp 25-28
TL;DR: The main focus is to analyse the reviews that viewers have written about various movies and to use this analysis to understand customers’ sentiments and market behaviour for a better customer experience.
Abstract: In today’s world, it has become customary to collect opinions and reviews from people through surveys, polls, and social media platforms and to analyse them in order to understand customers’ preferences. Understanding customers’ sentiments and their views on the services offered by producers therefore requires an accurate and canonical mechanism for predicting sentiments that can have a positive or negative impact on the market, which makes this kind of analysis important for both producers and consumers. In this paper, the main focus is to analyse the reviews that viewers have written about various movies and to use this analysis to understand customers’ sentiments and market behaviour for a better customer experience.


Citations
Journal ArticleDOI
TL;DR: In this paper, the authors investigated the effectiveness of e-learning by analyzing people’s sentiments about online learning using a Twitter dataset containing 17,155 tweets about e-learning, and found that uncertainty about campus opening dates, children’s inability to grasp online education, and the lack of efficient networks for online education are the top three problems.
Abstract: Amid the worldwide COVID-19 pandemic lockdowns, the closure of educational institutes led to an unprecedented rise in online learning. To limit the impact of COVID-19 and curb its spread, educational institutions closed their campuses immediately and moved academic activities to e-learning platforms. The effectiveness of e-learning is a critical concern for both students and parents, specifically in terms of its suitability to students and teachers and its technical feasibility with respect to different social scenarios. Such concerns must be reviewed from several aspects before e-learning can be adopted at such a large scale. This study endeavors to investigate the effectiveness of e-learning by analyzing the sentiments of people about e-learning. Due to the recent rise of social media as an important mode of communication, people’s views can be found on platforms such as Twitter, Instagram, Facebook, etc. This study uses a Twitter dataset containing 17,155 tweets about e-learning. Machine learning and deep learning approaches have shown their suitability, capability, and potential for image processing, object detection, and natural language processing tasks, and text analysis is no exception. Machine learning approaches have been widely used for annotation as well as for text and sentiment analysis. Keeping in view the adequacy and efficacy of machine learning models, this study adopts TextBlob, VADER (Valence Aware Dictionary for Sentiment Reasoning), and SentiWordNet to analyze the polarity and subjectivity scores of the tweets’ text. Furthermore, bearing in mind that machine learning models display high classification accuracy, various machine learning models have been used for sentiment classification. Two feature extraction techniques, TF-IDF (Term Frequency-Inverse Document Frequency) and BoW (Bag of Words), have been used to build and evaluate the models. All the models have been evaluated in terms of important performance metrics such as accuracy, precision, recall, and F1 score. The results reveal that the random forest and support vector machine classifiers achieve the highest accuracy of 0.95 when used with BoW features. Performance comparison is carried out for the results of TextBlob, VADER, and SentiWordNet, as well as the classification results of machine learning models and deep learning models such as CNN (Convolutional Neural Network), LSTM (Long Short Term Memory), CNN-LSTM, and Bi-LSTM (Bidirectional-LSTM). Additionally, topic modeling is performed to find the problems associated with e-learning, which indicates that uncertainty about campus opening dates, children’s inability to grasp online education, and the lack of efficient networks for online education are the top three problems.
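The two feature extraction techniques named in this abstract, BoW and TF-IDF, can be illustrated with a minimal sketch. It is not the authors' pipeline: the three example sentences, the whitespace tokenizer, and the function names `bag_of_words` and `tf_idf` are assumptions made for illustration, and the weighting is the plain textbook form tf * log(N / df), whereas library implementations often use smoothed or normalized variants.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(c / total) * math.log(n_docs / df[j]) for j, c in enumerate(row)]
        )
    return vocab, vectors


# Hypothetical tweets, invented for this example only.
docs = [
    "online classes are great",
    "online classes are tiring",
    "network lag ruins classes",
]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
```

In this toy corpus the word "classes" occurs in every document, so its TF-IDF weight is zero, while its BoW count stays at one. That difference is the practical distinction between the two representations: BoW keeps raw counts, TF-IDF discounts terms that do not discriminate between documents.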

62 citations

Journal ArticleDOI
25 Apr 2021
TL;DR: An aspect-based analysis of hotel reviews is carried out to make it easier for tourists to choose the right hotel based on the best category aspects; the Hybrid Classifier achieves better accuracy than classification using a single algorithm on multi-aspect data.
Abstract: The hotel tourism sector cannot be separated from the role of social media, because tourists tend to share their experiences of the services and products offered by a hotel by adding pictures, reviews, and ratings, for example on the online platform TripAdvisor, and these serve as references for other tourists. However, the sheer number of tourist experiences shared about a hotel makes it confusing for some people to determine the right hotel to visit. Therefore, in this study, an aspect-based analysis of reviews on hotels is carried out, which will make it easier for tourists to determine the right hotel based on the best category aspects. The dataset used is the TripAdvisor Hotel Reviews dataset, which is available on the Kaggle website and has five aspects, namely Room, Location, Cleanliness, Registration, and Service. To solve this problem, the reviews were classified into positive and negative categories using the Random Forest, SVM, and Naive Bayes based Hybrid Classifier methods. In this study, the Hybrid Classifier method achieves better accuracy than classification using a single algorithm on multi-aspect data: the Hybrid Classifier achieved an average accuracy of 84%, Naive Bayes 82.4%, Random Forest 82.2%, and SVM 81%.

2 citations


Cites background from "Sentiment Analysis of Movie Reviews..."

  • ..., the maximum number of similar courses produced by various trees is considered to be output from Random Forest [12]....

    [...]

Journal ArticleDOI
13 Feb 2020
TL;DR: A machine learning model is proposed to improve classification accuracy by using a hybrid feature selection method, filter-based Chi-square plus wrapper-based binary coordinate ascent (Chi-2 + BCA), to select an optimal subset of features from term frequency-inverse document frequency (TF-IDF) generated features for classification through support vector machine (SVM), and Bag of words generated features for logistic regression (LR) classifiers using different n-gram ranges.
Abstract: Nowadays, people from every part of the world use social media and social networks to express their feelings toward different topics and aspects. One of the trendiest social media platforms is Twitter, a microblogging website that provides a platform for its users to share their views and feelings about products, services, events, etc., in public. This makes Twitter one of the most valuable sources for researchers and developers to collect and analyze data in order to reveal people’s sentiment about different topics and services, such as the products and services of commercial companies and well-known people such as politicians and athletes, by classifying those sentiments into positive and negative. Classification of people’s sentiment can be automated by using machine learning algorithms and can be enhanced by using appropriate feature selection methods. We collected the most recent tweets about (Amazon, Trump, Chelsea FC, CR7) using the Twitter Application Programming Interface and assigned sentiment scores using a lexicon rule-based approach, then proposed a machine learning model to improve classification accuracy by using a hybrid feature selection method, namely, filter-based feature selection method Chi-square (Chi-2) plus wrapper-based binary coordinate ascent (Chi-2 + BCA), to select an optimal subset of features from term frequency-inverse document frequency (TF-IDF) generated features for classification through support vector machine (SVM), and Bag of words generated features for logistic regression (LR) classifiers using different n-gram ranges. After comparing the hybrid (Chi-2 + BCA) method with (Chi-2) selected features, and also with the classifiers without feature subset selection, results show that the hybrid feature selection method increases classification accuracy in all cases. The maximum attained accuracy with LR is 86.55% using the (1 + 2 + 3-g) range, and with SVM is 85.575% using the unigram range, both in the CR7 dataset.
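The n-gram ranges mentioned in this abstract can be sketched as follows. The sketch assumes that the "(1 + 2 + 3-g)" range means unigrams, bigrams, and trigrams pooled into one feature set, which the abstract does not spell out; the function name `ngrams` and the sample sentence are hypothetical.

```python
def ngrams(tokens, n_min=1, n_max=3):
    """Return all contiguous n-grams with n_min <= n <= n_max, as strings."""
    features = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features


# Hypothetical tweet text, invented for this example only.
tokens = "great goal by ronaldo".split()
unigrams = ngrams(tokens, 1, 1)   # unigram range
combined = ngrams(tokens, 1, 3)   # unigrams + bigrams + trigrams
```

For a four-token sentence the combined range yields nine features (four unigrams, three bigrams, two trigrams), which shows how quickly wider ranges enlarge the feature space that a selection method such as Chi-2 + BCA then has to prune.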

1 citation


Cites background from "Sentiment Analysis of Movie Reviews..."

  • ...[14], tweets have been gathered from Twitter’s API first....

    [...]

Book ChapterDOI
17 Dec 2020
TL;DR: In this paper, a machine learning-based intelligent system was developed to evaluate the ratings from users' reviews and provide a reflection about the products which are popular simply by analyzing those reviews.
Abstract: When it comes to deciding which product to buy, knowing the overall reviews from other users is very helpful. Evaluating this from user ratings is simple. Although a machine can evaluate recommendations simply by calculating user ratings, it is sometimes difficult to provide accurate and efficient results this way. Evaluating users’ comments therefore usually means assigning humans to read all the comments one by one and decide how useful the product seems. This is a tedious process that wastes valuable time and resources because there is no way of automating it. On the other hand, selecting the most valuable product from an enormous number of reviews becomes a hectic task for consumers. Considering all of the above, we have developed a machine learning based intelligent system which not only evaluates the ratings from users’ reviews but also provides a reflection about which products are popular simply by analyzing those reviews.

Posted Content
TL;DR: The authors used Logistic Regression, Random Forest, Naive Bayes (NB), and Support Vector Machines (SVM) classifiers to detect sentiment in poems written in the Misurata Arabic sub-dialect spoken in Misurata, Libya, and found that the traditional classifiers score a higher level of accuracy than Mazajak, which is built on an algorithm that includes deep learning techniques.
Abstract: Over recent decades, there has been a significant increase in and development of resources for Arabic natural language processing. This includes the task of exploring Arabic Language Sentiment Analysis (ALSA) from Arabic utterances in both Modern Standard Arabic (MSA) and different Arabic dialects. This study focuses on detecting sentiment in poems written in the Misurata Arabic sub-dialect spoken in Misurata, Libya. The tools used to detect sentiment from the dataset are Sklearn and the Mazajak sentiment tool. Logistic Regression, Random Forest, Naive Bayes (NB), and Support Vector Machines (SVM) classifiers are used with Sklearn, while the Convolutional Neural Network (CNN) is implemented with Mazajak. The results show that the traditional classifiers score a higher level of accuracy than Mazajak, which is built on an algorithm that includes deep learning techniques. More research is suggested to analyze Arabic sub-dialect poetry in order to investigate the aspects that contribute to sentiments in these multi-line texts; for example, the use of figurative language such as metaphors.
References
Journal ArticleDOI
TL;DR: A hybrid approach that involves a sentiment analyzer that includes machine learning and a comparison of techniques of sentiment analysis in the analysis of political views by applying supervised machine-learning algorithms such as Naive Bayes and support vector machines (SVM).
Abstract: Growth in the area of opinion mining and sentiment analysis has been rapid and aims to explore the opinions or text present on different platforms of social media through machine-learning techniques with sentiment, subjectivity analysis or polarity calculations. Despite the use of various machine-learning techniques and tools for sentiment analysis during elections, there is a dire need for a state-of-the-art approach. To deal with these challenges, the contribution of this paper includes the adoption of a hybrid approach that involves a sentiment analyzer that includes machine learning. Moreover, this paper also provides a comparison of techniques of sentiment analysis in the analysis of political views by applying supervised machine-learning algorithms such as Naive Bayes and support vector machines (SVM).

289 citations

Proceedings Article
08 Jul 2012
TL;DR: This work uses character-level translation trained on n-gram-character-aligned bitexts and tuned using word-level BLEU, augmented with character-based transliteration at the word level and combined with a word-level translation model.
Abstract: We propose several techniques for improving statistical machine translation between closely-related languages with scarce resources. We use character-level translation trained on n-gram-character-aligned bitexts and tuned using word-level BLEU, which we further augment with character-based transliteration at the word level and combine with a word-level translation model. The evaluation on Macedonian-Bulgarian movie subtitles shows an improvement of 2.84 BLEU points over a phrase-based word-level baseline.

119 citations

Journal ArticleDOI
TL;DR: This paper aims to review some papers regarding research in sentiment analysis on Twitter, describing the methodologies adopted and models applied, along with describing a generalized Python-based approach.
Abstract: Twitter is a platform widely used by people to express their opinions and display sentiments on different occasions. Sentiment analysis is an approach to analyzing data and retrieving the sentiment that it embodies. Twitter sentiment analysis is an application of sentiment analysis to data from Twitter (tweets), in order to extract the sentiments conveyed by the user. In the past decades, research in this field has grown consistently. The reason behind this is the challenging format of tweets, which makes processing difficult. Tweets are very short, which creates a whole new dimension of problems such as the use of slang and abbreviations. In this paper, we aim to review some papers regarding research in sentiment analysis on Twitter, describing the methodologies adopted and models applied, along with describing a generalized Python-based approach.

77 citations

Proceedings ArticleDOI
29 Apr 2016
TL;DR: This paper presents the sentiment analysis process to classify highly unstructured data on Twitter, discusses various techniques to carry out sentiment analysis on Twitter data in detail, and presents a parametric comparison of the discussed techniques based on the identified parameters.
Abstract: The World Wide Web has given rise to a novel way for people to express their views and opinions about different topics, trends and issues. The user-generated content present on different mediums such as internet forums, discussion groups, and blogs serves as a concrete and substantial base for decision making in various fields such as advertising, political polls, scientific surveys, market prediction and business intelligence. Sentiment analysis relates to the problem of mining the sentiments from online available data and categorizing the opinion expressed by an author towards a particular entity into at most three preset categories: positive, negative and neutral. In this paper, firstly we present the sentiment analysis process to classify highly unstructured data on Twitter. Secondly, we discuss various techniques to carry out sentiment analysis on Twitter data in detail. Moreover, we present a parametric comparison of the discussed techniques based on our identified parameters.

54 citations

Journal Article
TL;DR: In this paper, sentiment primitive was introduced into sentiment terms identification, and then semantic values of phrases were obtained and the improved method based on semantic comprehension was proposed.
Abstract: Text sentiment classification is widely used in areas such as information filtering, information security and information recommendation. This paper proposed an improved method based on semantic comprehension. In this paper, sentiment primitive was introduced into sentiment terms identification, and then semantic values of phrases were obtained. Furthermore, this paper analysed adverbs and their influence on the identification of text orientation at the semantic level and achieved text sentiment classification. The experimental results show that the proposed approach is suitable for judging sentiment orientation.

14 citations