Journal Article

Analysing user sentiment of Indian movie reviews: A probabilistic committee selection model

Shrawan Kumar Trivedi, +1 more
- 29 Oct 2018 - 
- Vol. 36, Iss: 4, pp 590-606
TLDR
A novel probabilistic committee selection classifier (PCC) is proposed and used for classifying movie reviews, and is found to be highly effective in comparison with other state-of-the-art classifiers.
Abstract
Purpose: To remain sustainable and competitive in the current business environment, it is useful to understand users’ sentiment towards products and services. This critical task can be achieved via natural language processing and machine learning classifiers. This paper proposes a novel probabilistic committee selection classifier (PCC) to analyse and classify the sentiment polarities of movie reviews.

Design/methodology/approach: An Indian movie review corpus is assembled for this study. A second, publicly available movie review polarity corpus is used to validate the results. The greedy stepwise search method is used to extract the features (words) of the reviews. The performance of the proposed classifier is measured using several metrics: F-measure, false positive rate, receiver operating characteristic (ROC) curve and training time. The proposed classifier is also compared with other popular machine-learning classifiers: Bayesian, Naive Bayes, Decision Tree (J48), Support Vector Machine and Random Forest.

Findings: The results show that the proposed classifier is good at predicting the positive or negative polarity of movie reviews. The accuracy and ROC value of the PCC are found to be the most suitable among all the classifiers tested in this study. The classifier is also found to be efficient at identifying positive sentiments of reviews, giving low false positive rates on both the Indian Movie Review and Review Polarity corpora. Its training time is found to be slightly higher than that of Bayesian, Naive Bayes and J48.

Research limitations/implications: Only movie reviews written in English are considered. In addition, the proposed committee selection classifier is built only from a committee of probabilistic classifiers; other classifier committees could also be built, tested and compared under the present experimental scenario.

Practical implications: A novel probabilistic approach is proposed and used for classifying movie reviews, and is found to be highly effective in comparison with other state-of-the-art classifiers. The classifier may be tested in different applications and may provide new insights for developers and researchers.

Social implications: The proposed PCC may be used to classify different product reviews, and hence may help organizations justify users’ reviews about specific products or services. By using authentic positive and negative sentiments of users, the credibility of a specific product, service or event may be enhanced. The PCC may also be applied to other tasks, such as spam detection, blog mining, news mining and various other data-mining applications.

Originality/value: The constructed PCC is novel and was tested on Indian movie review data.
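The abstract does not describe how the PCC selects or combines its committee members, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a committee whose members each return a probability that a review is positive, combines them by simple averaging (soft voting), and then computes two of the metrics named above, F-measure and false positive rate. The `keyword_member` helper, the word lists and the 0.5 threshold are hypothetical and exist purely for illustration.

```python
from typing import Callable, Sequence, Set, Tuple

# A committee member maps a review text to an estimate of P(positive).
Member = Callable[[str], float]


def keyword_member(positive: Set[str], negative: Set[str]) -> Member:
    """Hypothetical toy member: smoothed share of positive cue words."""
    def member(text: str) -> float:
        tokens = text.lower().split()
        pos = sum(tok in positive for tok in tokens)
        neg = sum(tok in negative for tok in tokens)
        # Add-one smoothing keeps the estimate strictly between 0 and 1.
        return (pos + 1) / (pos + neg + 2)
    return member


def committee_probability(members: Sequence[Member], text: str) -> float:
    """Average the members' positive-class probabilities (soft voting)."""
    return sum(m(text) for m in members) / len(members)


def classify(members: Sequence[Member], text: str, threshold: float = 0.5) -> int:
    """Return 1 (positive) or 0 (negative); the threshold is an assumption."""
    return 1 if committee_probability(members, text) >= threshold else 0


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count true positives, false positives, true negatives, false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of truly negative reviews that were labelled positive."""
    return fp / (fp + tn) if (fp + tn) else 0.0


if __name__ == "__main__":
    committee = [
        keyword_member({"good", "great"}, {"bad", "boring"}),
        keyword_member({"brilliant", "great"}, {"dull", "bad"}),
    ]
    # Made-up reviews with made-up labels (1 = positive, 0 = negative).
    data = [("a great and brilliant film", 1), ("bad plot and dull acting", 0)]
    preds = [classify(committee, text) for text, _ in data]
    tp, fp, tn, fn = confusion_counts([label for _, label in data], preds)
    print("F-measure:", f_measure(tp, fp, fn))
    print("False positive rate:", false_positive_rate(fp, tn))
```

Averaging probabilities is only one way to combine a committee; a selection-based scheme, as the name PCC suggests, might instead choose a subset of members or a single member per review.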

