A boosted SVM based sentiment analysis approach for online opinionated text

doi:10.1145/2513228.2513311

Proceedings Article•DOI•

A boosted SVM based sentiment analysis approach for online opinionated text

Anuj Sharma¹, Shubhamoy Dey¹•Institutions (1)

Indian Institute of Management Ahmedabad¹

01 Oct 2013-pp 28-34

TL;DR: The proposed model exploits the classification performance of two techniques (Boosting and SVM) applied to the task of sentiment-based classification of online reviews, and shows that an SVM ensemble with bagging or boosting significantly outperforms a single SVM in terms of accuracy of sentiment-based classification.

Abstract: The opinionated text available on the Internet and Web 2.0 social media has created ample research opportunities related to mining and analyzing public sentiments. At the same time, the large volume of such data poses severe data processing and sentiment extraction related challenges. Different contemporary solutions based on machine learning, dictionary, statistical, and semantic based approaches have been proposed in literature for sentiment analysis of online user-generated data. Recent research studies have proved that supervised machine learning techniques like Naive Bayes (NB) and Support Vector Machines (SVM) are very effective for sentiment based classification of opinionated text. This paper proposes a hybrid sentiment classification model based on Boosted SVM. The proposed model exploits classification performance of two techniques (Boosting and SVM) applied for the task of sentiment based classification of online reviews. The results on movies and hotel review corpora of 2000 reviews have shown that the proposed approach has succeeded in improving performance of SVM when used as a weak learner for sentiment based classification. Specifically, the results show that SVM ensemble with bagging or boosting significantly outperforms a single SVM in terms of accuracy of sentiment based classification.
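
The abstract does not spell out how the SVM is boosted. As a rough illustration only, the sketch below pairs discrete AdaBoost with a linear SVM weak learner trained by weighted subgradient descent on the hinge loss. The function names (`train_linear_svm`, `adaboost_svm`, `ensemble_predict`), the hyperparameters, and the toy bag-of-words vectors are all assumptions made for this example, not the authors' implementation or data.

```python
import math

def train_linear_svm(X, y, sample_weight, epochs=50, lr=0.1, lam=0.01):
    """Weighted linear SVM via subgradient descent on the regularised hinge loss.
    Labels are in {-1, +1}; sample_weight is a distribution over the samples."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for xi, yi, si in zip(X, y, sample_weight):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if margin < 1:  # hinge loss is active for this sample
                step = lr * si * n  # uniform weights reduce to plain SGD
                w = [wj + step * yi * xj for wj, xj in zip(w, xi)]
                b += step * yi
    return w, b

def svm_predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def adaboost_svm(X, y, rounds=5):
    """Discrete AdaBoost that uses the weighted linear SVM as its weak learner."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        model = train_linear_svm(X, y, weights)
        preds = [svm_predict(model, xi) for xi in X]
        err = sum(wi for wi, p, yi in zip(weights, preds, y) if p != yi)
        if err >= 0.5:
            break  # weak learner is no better than chance
        alpha = 0.5 * math.log((1 - err + 1e-10) / (err + 1e-10))
        ensemble.append((alpha, model))
        if err == 0:
            break  # training set already separated; more rounds add nothing
        # Raise the weight of misclassified samples, lower the rest, renormalise.
        weights = [wi * math.exp(-alpha * yi * p)
                   for wi, yi, p in zip(weights, y, preds)]
        total = sum(weights)
        weights = [wi / total for wi in weights]
    return ensemble

def ensemble_predict(ensemble, x):
    score = sum(alpha * svm_predict(model, x) for alpha, model in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical term counts over the vocabulary [good, great, bad, boring].
X = [[2, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0],
     [0, 0, 1, 1], [0, 0, 2, 0], [0, 0, 0, 1]]
y = [1, 1, 1, -1, -1, -1]  # +1 = positive review, -1 = negative review
ensemble = adaboost_svm(X, y)
```

On this linearly separable toy set the first SVM already classifies every review, so boosting stops after one round; on noisier data, later rounds would reweight the misclassified reviews before retraining.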

Citations

Journal Article•DOI•

Sentiment Analysis of Tweets using SVM

Munir Ahmad, Shabib Aftab, Iftikhar Ali

15 Nov 2017-International Journal of Computer Applications

TL;DR: To analyze the performance of SVM, two pre classified datasets of tweets are used and for comparative analysis, three measures are used: Precision, Recall and F-Measure.

Abstract: Community's view and feedback have always proved to be the most essential and valuable resource for companies and organizations. With social media being the emerging trend among everyone, it paves way for unprecedented analysis and evaluation of various aspects for which organizations had to rely on unconventional, time consuming and error prone methods earlier. This technique of analysis directly falls under the domain of "sentiment analysis". Sentiment analysis encompasses the vast field of effective classification of user generated text under defined polarities. There are several tools and algorithms available to perform sentiment detection and analysis, including supervised machine learning algorithms, which perform classification on the target corpus after being trained with training data; lexical techniques, which perform classification on the basis of a dictionary-based annotated corpus; and hybrid tools, which combine machine learning and lexicon-based algorithms. In this paper we have used Support Vector Machine (SVM) for sentiment analysis in Weka. SVM is one of the widely used supervised machine learning algorithms for textual polarity detection. To analyze the performance of SVM, two pre-classified datasets of tweets are used and for comparative analysis, three measures are used: Precision, Recall and F-Measure. Results are shown in the form of tables and graphs.
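
The three measures named in the abstract have standard definitions, which the short Python sketch below computes for one class. The study itself used Weka; the function name, the label strings and the six example predictions here are hypothetical and serve only to show the arithmetic.

```python
def precision_recall_f1(y_true, y_pred, positive="pos"):
    """Return (precision, recall, F-measure) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical gold labels and classifier output for six tweets.
gold = ["pos", "pos", "neg", "neg", "pos", "neg"]
pred = ["pos", "neg", "neg", "pos", "pos", "neg"]
precision, recall, f1 = precision_recall_f1(gold, pred)
```

Here two of the three predicted positives are correct and two of the three true positives are found, so precision, recall and F-measure all equal 2/3.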

83 citations

Proceedings Article•DOI•

A Comparison of SVM Versus Naive-Bayes Techniques for Sentiment Analysis in Tweets: A Case Study with the 2013 FIFA Confederations Cup

André Luiz Firmino Alves¹, Cláudio de Souza Baptista¹, Anderson Almeida Firmino¹, Maxwell Guimarães de Oliveira¹, Anselmo Cardoso de Paiva²•Institutions (2)

Federal University of Campina Grande¹, Federal University of Maranhão²

18 Nov 2014

TL;DR: A case study is carried out to compare two techniques for sentiment analysis, SVM and Naive-Bayes classifiers; the results indicated that the SVM technique surpassed the Naive-Bayes one, concerning performance issues.

Abstract: The widespread use of social communication media on the Web has made available a large volume of opinionated textual data stored in digital format. These media constitute a rich source for sentiment analysis and understanding of the opinions spontaneously expressed. Traditional techniques for sentiment analysis are based on a POS tagger. Considering the Portuguese language, the use of POS tagging ends up being too costly, due to the complex grammatical structure of this language. Faced with this problem, a case study is carried out in order to compare two techniques for sentiment analysis: SVM versus Naive-Bayes classifiers. Our study focused on tweets written in Portuguese during the 2013 FIFA Confederations Cup, although our technique could be applied to any other language. The achieved results indicated that the SVM technique surpassed the Naive-Bayes one, concerning performance issues.

40 citations

Cites background or methods from "A boosted SVM based sentiment analy..."

...vectors by separating it into positive and negative classes with a hyperplane, which can be further extended to nonlinear decision boundaries using various kernels [27]....
[...]
...Other sentiment analysis studies applied to the English language obtained, at the best scenarios, an accuracy of around 95% for detection of sentiment polarity [27]....
[...]
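
For reference, the hyperplane and kernel description in the first excerpt corresponds to the standard SVM decision rule. The symbols below (weight vector \mathbf{w}, bias b, dual coefficients \alpha_i, labels y_i \in \{-1, +1\}, kernel K) are conventional notation and do not appear in the excerpt itself.

```latex
f(\mathbf{x}) = \operatorname{sign}\left(\mathbf{w}^{\top}\mathbf{x} + b\right)
\qquad \text{(linear hyperplane)}
\\[4pt]
f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i=1}^{n} \alpha_i\, y_i\, K(\mathbf{x}_i, \mathbf{x}) + b\right)
\qquad \text{(kernel extension)}
```

With the linear kernel K(\mathbf{x}_i, \mathbf{x}) = \mathbf{x}_i^{\top}\mathbf{x}, the second form reduces to the first with \mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i.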

Journal Article•DOI•

Sentiment Analysis using SVM: A Systematic Literature Review

Munir Ahmad, Shabib Aftab, Muhammad Salman Bashir, Noureen Hameed

01 Jan 2018-International Journal of Advanced Computer Science and Applications

TL;DR: This systematic review will serve the scholars and researchers to analyze the latest work of sentiment analysis with SVM as well as provide them a baseline for future trends and comparisons.

Abstract: The world has revolutionized and phased into a new era, an era which upholds the true essence of technology and digitalization. As the market has evolved at a staggering scale, it is a must to exploit and inherit the advantages and opportunities it provides. With the advent of web 2.0, considering the scalability and unbounded reach that it provides, it is detrimental for an organization not to adopt the new techniques in the competitive stakes that this emerging virtual world has set along with its advantages. The transformed and highly intelligent data mining approaches now allow organizations to collect, categorize, and analyze users’ reviews and comments from micro-blogging sites regarding their services and products. This type of analysis makes those organizations capable of assessing what the consumers want, what they disapprove of, and what measures can be taken to sustain and improve the performance of products and services. This study focuses on critical analysis of the literature from year 2012 to 2017 on sentiment analysis by using SVM (support vector machine). SVM is one of the widely used supervised machine learning techniques for text classification. This systematic review will serve the scholars and researchers to analyze the latest work of sentiment analysis with SVM as well as provide them a baseline for future trends and comparisons.

36 citations

Cites background or methods from "A boosted SVM based sentiment analy..."

...All selected papers [26]–[33] have used one or more techniques in comparison with SVM....
[...]
...Authors in [33] proposed a hybrid sentiment classification model....
[...]

Journal Article•DOI•

Rainfall prediction in Lahore City using data mining techniques

Shabib Aftab, Munir Ahmad, Noureen Hameed, Muhammad Salman Bashir, Iftikhar Ali, Zahid Nawaz

01 Jan 2018-International Journal of Advanced Computer Science and Applications

TL;DR: Performance of used data mining techniques is analyzed in terms of precision, recall and f-measure with various ratios of training and test data.

Abstract: Rainfall prediction has extreme significance in countless aspects and scopes. It can be very helpful to reduce the effects of sudden and extreme rainfall by taking effective security measures in advance. Due to climate variations, an accurate rainfall prediction has become more complex than before. Data mining techniques can predict the rainfall through extracting the hidden patterns among weather attributes of past data. This research contributes by exploring the use of various data mining techniques for rainfall prediction in Lahore city. Techniques include: Support Vector Machine (SVM), Naive Bayes (NB), k Nearest Neighbor (kNN), Decision Tree (J48) and Multilayer Perceptron (MLP). The dataset is obtained from a weather forecasting website and consists of several atmospheric attributes. For effective prediction, pre-processing technique is used which consists of cleaning and normalization processes. Performance of used data mining techniques is analyzed in terms of precision, recall and f-measure with various ratios of training and test data.

25 citations

Cites methods from "A boosted SVM based sentiment analy..."

...The output result after processing is compared with the known class and performance is measured in terms of precision, recall and f measure [1], [20], [21], [24], [26]....
[...]

Book Chapter•DOI•

Reviewing Classification Approaches in Sentiment Analysis

Nor Nadiah Yusof¹, Azlinah Mohamed¹, Shuzlina Abdul-Rahman¹•Institutions (1)

Universiti Teknologi MARA¹

02 Sep 2015

TL;DR: An overview of classification approaches in sentiment analysis is presented and various advantages and limitations of the sentiment classification approaches based on several criteria such as domain, classification type and accuracy are discussed.

Abstract: The advancement of web technologies has changed the way people share and express their opinions. People enthusiastically share their thoughts and opinions via online media such as forums, blogs and social networks. The overwhelming volume of online opinionated data has gained much attention from researchers, especially in the field of text mining and natural language processing (NLP), who study sentiment analysis in depth. There are several methods for classifying sentiment, including the lexicon-based approach and the machine learning approach. Each approach has its own advantages and disadvantages. However, few studies compare the two approaches. This paper presents an overview of classification approaches in sentiment analysis. Various advantages and limitations of the sentiment classification approaches based on several criteria such as domain, classification type and accuracy are also discussed in this paper.

24 citations

References

Journal Article•DOI•

Bagging predictors

Leo Breiman

01 Aug 1996

TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.

Abstract: Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
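
The procedure in this abstract (bootstrap replicates of the learning set, one predictor per replicate, plurality vote for class prediction) can be sketched in a few lines of Python. The base learner here is a one-dimensional threshold rule chosen purely for brevity; it, the helper names and the eight toy points are assumptions for illustration, whereas Breiman's experiments used trees and subset selection.

```python
import random
from collections import Counter

def bootstrap_replicate(data, rng):
    """Draw len(data) items from the learning set with replacement."""
    return [rng.choice(data) for _ in data]

def fit_threshold(data):
    """Illustrative base learner: choose the threshold t and polarity that
    minimise training error for the rule 'x >= t -> hi, otherwise 1 - hi'."""
    best_err, best_rule = None, None
    for t in sorted({x for x, _ in data}):
        for hi in (0, 1):
            err = sum(1 for x, y in data if (hi if x >= t else 1 - hi) != y)
            if best_err is None or err < best_err:
                best_err, best_rule = err, (t, hi)
    t, hi = best_rule
    return lambda x: hi if x >= t else 1 - hi

def bagging_fit(data, n_estimators, seed=0):
    """Train one predictor per bootstrap replicate of the learning set."""
    rng = random.Random(seed)
    return [fit_threshold(bootstrap_replicate(data, rng))
            for _ in range(n_estimators)]

def bagging_predict(models, x):
    """Aggregate the versions by plurality vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical learning set of (feature, class) pairs.
data = [(0.1, 0), (0.4, 0), (0.5, 0), (0.9, 0),
        (1.1, 1), (1.5, 1), (1.8, 1), (2.3, 1)]
models = bagging_fit(data, n_estimators=25)
low, high = bagging_predict(models, 0.0), bagging_predict(models, 3.0)
```

For a numerical outcome the vote in `bagging_predict` would be replaced by an average over the models' outputs, as the abstract describes.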

16,118 citations

Proceedings Article•

Experiments with a new boosting algorithm

Yoav Freund¹, Robert E. Schapire¹•Institutions (1)

AT&T¹

03 Jul 1996

TL;DR: This paper describes experiments carried out to assess how well AdaBoost with and without pseudo-loss, performs on real learning problems and compared boosting to Breiman's "bagging" method when used to aggregate various classifiers.

Abstract: In an earlier paper, we introduced a new "boosting" algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a "pseudo-loss" which is a method for forcing a learning algorithm of multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost with and without pseudo-loss, performs on real learning problems. We performed two sets of experiments. The first set compared boosting to Breiman's "bagging" method when used to aggregate various classifiers (including decision trees and single attribute-value tests). We compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.

7,601 citations

"A boosted SVM based sentiment analy..." refers background or methods in this paper

...SVM with boosting and SVM with AdaBoost outperformed the other two methods....
[...]
...The best accuracy of 92% was achieved by SVM with AdaBoost, and classical single SVM was the worst performer in all four SVM implementations....
[...]
...Some popular methods for selecting the representative training samples from a collection of datasets are bagging, boosting, randomization, stacking and dagging [9]....
[...]
...them lies in the way the training set is prepared by taking samples from the population [9]....
[...]
...3 Adaptive Boosting (AdaBoost) One of the most popular Boosting methods, AdaBoost [9] creates a collection of weak learners by computing a set of weights over training samples in each iteration instead of performing random sampling....
[...]
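
The per-iteration sample weighting mentioned in the last excerpt can be written out explicitly. The block below gives one common binary formulation of AdaBoost, with labels y_i \in \{-1, +1\}, weak learner h_t, weight distribution D_t over the n training samples and normaliser Z_t; this notation is an assumption for illustration, and neither the excerpt nor the 1996 paper is quoted here.

```latex
\begin{aligned}
D_1(i) &= \tfrac{1}{n}, \\
\varepsilon_t &= \sum_{i=1}^{n} D_t(i)\,\mathbb{1}\left[h_t(x_i) \neq y_i\right], \\
\alpha_t &= \tfrac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}, \\
D_{t+1}(i) &= \frac{D_t(i)\,\exp\left(-\alpha_t\, y_i\, h_t(x_i)\right)}{Z_t}, \\
H(x) &= \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right).
\end{aligned}
```

Misclassified samples have y_i h_t(x_i) = -1, so their weight is multiplied by e^{\alpha_t} > 1 whenever \varepsilon_t < 1/2, which is what steers the next weak learner toward the hard cases.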

Book•

Opinion Mining and Sentiment Analysis

Bo Pang¹, Lillian Lee²•Institutions (2)

Yahoo!¹, Cornell University²

08 Jul 2008

TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.

Abstract: An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

7,452 citations

"A boosted SVM based sentiment analy..." refers background in this paper

...Different contemporary solutions based on different machine learning, dictionary, statistical, and semantic based approaches have been proposed for sentiment analysis of online textual data [6, 18, 27]....
[...]
...com provide reviews for more or less every product category in the consumer market, ranging from mobile phones, books, movies to cars and hotel services [18]....
[...]
...The details of work apart from machine learning approaches are out of scope of this study and can be found in recent surveys [18, 27]....
[...]

Experiment with a new boosting algorithm

Y. Freund

01 Jan 1996

7,386 citations

Thumbs up? Sentiment Classification using Machine Learning Techniques

Bo Pang, Lillian Lee, Shivakumar Vaithyanathan

01 Jan 2002

TL;DR: In this paper, the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative, was considered and three machine learning methods (Naive Bayes, maximum entropy classification, and support vector machines) were employed.

Abstract: We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.
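
Of the three methods named, Naive Bayes is the simplest to show end to end. The sketch below is a minimal multinomial Naive Bayes with add-one smoothing over bag-of-words tokens; the abstract does not describe the features or preprocessing used, so the whitespace tokenisation, the smoothing choice, the function names and the four toy reviews are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs is a list of (tokens, label). Returns class document counts,
    per-class word counts and the vocabulary."""
    class_docs = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def predict_nb(model, tokens):
    """Return the label with the highest log posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            if tok not in vocab:
                continue  # words never seen in training carry no evidence
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Four hypothetical movie reviews, already tokenised on whitespace.
train = [
    ("a great and moving film".split(), "pos"),
    ("great acting great story".split(), "pos"),
    ("a boring and dull film".split(), "neg"),
    ("dull plot boring acting".split(), "neg"),
]
model = train_nb(train)
label = predict_nb(model, "great story".split())
```

With these counts, "great story" scores higher under the positive class because both words occur only in the positive reviews.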

6,980 citations