Author

Mohd Ridzwan Yaakub

Other affiliations: Queensland University of Technology

Bio: Mohd Ridzwan Yaakub is an academic researcher from the National University of Malaysia. The author has contributed to research on sentiment analysis and feature selection, has an h-index of 9, and has co-authored 37 publications receiving 287 citations. Previous affiliations of Mohd Ridzwan Yaakub include Queensland University of Technology.

Papers

Journal Article•DOI•

A review of feature selection techniques in sentiment analysis

Siti Rohaidah Ahmad¹, Azuraliza Abu Bakar², Mohd Ridzwan Yaakub²•Institutions (2)

National Defence University of Malaysia¹, National University of Malaysia²

01 Jan 2019

69 citations

Proceedings Article•DOI•

Metaheuristic algorithms for feature selection in sentiment analysis

Siti Rohaidah Ahmad¹, Azuraliza Abu Bakar², Mohd Ridzwan Yaakub²•Institutions (2)

National Defence University of Malaysia¹, National University of Malaysia²

28 Jul 2015

TL;DR: It can be concluded that metaheuristic based algorithms have the potential to be implemented in sentiment analysis research and can produce an optimal subset of features by eliminating features that are irrelevant and redundant.

Abstract: Sentiment analysis functions by analyzing and extracting opinions from documents, websites, blogs, discussion forums and others to identify sentiment patterns on opinions expressed by consumers. It analyzes people's sentiment and identifies types of sentiment in comments expressed by consumers on certain matters. This paper highlights comparative studies on the types of feature selection in sentiment analysis based on natural language processing and modern methods such as Genetic Algorithm and Rough Set Theory. This study compares feature selection in text classification based on traditional and sentiment analysis methods. Feature selection is an important step in sentiment analysis because a suitable feature selection can identify the actual product features criticized or discussed by consumers. It can be concluded that metaheuristic based algorithms have the potential to be implemented in sentiment analysis research and can produce an optimal subset of features by eliminating features that are irrelevant and redundant.

56 citations

Journal Article•DOI•

Hybrid N-gram model using Naïve Bayes for classification of political sentiments on Twitter

Jamilu Awwalu¹, Azuraliza Abu Bakar¹, Mohd Ridzwan Yaakub¹•Institutions (1)

National University of Malaysia¹

01 Jan 2019-Neural Computing and Applications

TL;DR: This study hybridizes two n-gram models, unigram and n-gram, and applies Laplace smoothing to the Naïve Bayesian classifier and Katz back-off to the model in order to address the limited precision and recall of n-gram models caused by the ‘zero count problem.’

Abstract: Twitter, an online micro-blogging and social networking service, lets registered users write anything they wish in 140 characters, giving them the opportunity to express their opinions and sentiments on events taking place. Politically sentimental tweets are top-trending tweets; whenever an election is near, users tweet about their favorite candidates or political parties and at times give their reasons. In this study, we hybridize two n-gram models, unigram and n-gram, and apply Laplace smoothing to the Naïve Bayesian classifier and Katz back-off to the model, in order to smooth the model and address the limited precision and recall of n-gram models caused by the ‘zero count problem.’ Here, "unigram" refers to the lowest-order n-gram and "n-gram" refers to the highest-order (full sentence or tweet level) n-gram. Results from our baseline model show an increase of 6.05% in average F-Harmonic accuracy in comparison with the n-gram model and a 1.75% increase in comparison with the semantic-topic model proposed in a previous study on the same dataset, i.e., the Obama–McCain dataset.

35 citations

Journal Article•DOI•

Ant colony optimization for text feature selection in sentiment analysis

Siti Rohaidah Ahmad¹, Azuraliza Abu Bakar², Mohd Ridzwan Yaakub²•Institutions (2)

National Defence University of Malaysia¹, National University of Malaysia²

01 Jan 2019

31 citations

Journal Article•DOI•

Movie Revenue Prediction Based on Purchase Intention Mining Using YouTube Trailer Reviews

Ibrahim Said Ahmad¹, Azuraliza Abu Bakar², Mohd Ridzwan Yaakub²•Institutions (2)

Bayero University Kano¹, National University of Malaysia²

01 Sep 2020-Information Processing and Management

TL;DR: This paper builds a model for movie revenue prediction prior to the movie's release using YouTube trailer reviews, reports that the approach outperforms three baseline approaches, and achieves a relative absolute error of 29.65%.

Abstract: The increase in acceptability and popularity of social media has made extracting information from the data generated on social media an emerging field of research. An important branch of this field is predicting future events using social media data. This paper is focused on predicting the box-office revenue of a movie by mining people's intention to purchase a movie ticket, termed purchase intention, from trailer reviews. Movie revenue prediction is important because of the risks and high costs involved in movie production. Previous studies in this domain focus on the use of Twitter data and IMDb reviews for the prediction of movies that have already been released. In this paper, we build a model for movie revenue prediction prior to the movie's release using YouTube trailer reviews. Our model consists of novel methods of calculating purchase intention, positive-to-negative sentiment ratio, and like-to-dislike ratio for movie revenue prediction. Our experimental results show the superiority of our approach compared to three baseline approaches, with a relative absolute error of 29.65%.

29 citations

Cited by

Journal Article•DOI•

I and i

Kevin Barraclough

08 Dec 2001-BMJ

TL;DR: There is, I think, something ethereal about i, the square root of minus one, which seemed an odd beast at first: an intruder hovering on the edge of reality.

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Computational Thinking 計算論的思考

Jeannette M. Wing, translated by Hideyuki Nakashima (中島秀之)

15 May 2015

TL;DR: This article presents computational thinking as a universally applicable attitude and skill set that everyone, not just computer scientists, would be eager to learn and use.

Abstract: It represents a universally applicable attitude and skill set everyone, not just computer scientists, would be eager to learn and use.

430 citations

Journal Article•DOI•

A review of unsupervised feature selection methods

Saúl Solorio-Fernández, J. Ariel Carrasco-Ochoa, José Fco. Martínez-Trinidad

01 Feb 2020-Artificial Intelligence Review

TL;DR: A comprehensive and structured review of the most relevant and recent unsupervised feature selection methods reported in the literature is provided and a taxonomy of these methods is presented.

Abstract: In recent years, unsupervised feature selection methods have raised considerable interest in many research areas; this is mainly due to their ability to identify and select relevant features without needing class label information. In this paper, we provide a comprehensive and structured review of the most relevant and recent unsupervised feature selection methods reported in the literature. We present a taxonomy of these methods and describe the main characteristics and the fundamental ideas they are based on. Additionally, we summarized the advantages and disadvantages of the general lines in which we have categorized the methods analyzed in this review. Moreover, an experimental comparison among the most representative methods of each approach is also presented. Finally, we discuss some important open challenges in this research area.

325 citations

Journal Article•DOI•

An up-to-date comparison of state-of-the-art classification algorithms

Chongsheng Zhang¹, Changchang Liu¹, Xiangliang Zhang², George Almpanidis¹•Institutions (2)

Henan University¹, King Abdullah University of Science and Technology²

01 Oct 2017-Expert Systems With Applications

TL;DR: It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines and Random Forests, while being the fastest algorithm in terms of prediction efficiency.

Abstract: Up-to-date report on the accuracy and efficiency of state-of-the-art classifiers. We compare the accuracy of 11 classification algorithms pairwise and groupwise. We examine separately the training, parameter-tuning, and testing time. GBDT and Random Forests yield highest accuracy, outperforming SVM. GBDT is the fastest in testing, Naive Bayes the fastest in training. Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top-5, alongside GBDT, RF, SVM, and C4.5 but this performance varies widely across all data sets. Unsurprisingly, top accuracy performers have average or slow training time efficiency. DL is the worst performer in terms of accuracy but second fastest in prediction efficiency. SRC shows good accuracy performance but it is the slowest classifier in both training and testing.

307 citations

Journal Article•

Mining Opinion Features in Customer Reviews.

J. Keziya Rani

17 May 2016-International Journal of Research

TL;DR: A survey on the techniques used for designing software to mine opinion features in reviews and how Natural Language Processing techniques such as NLTK for Python can be applied to raw customer reviews and keywords can be extracted.

Abstract: Nowadays, e-commerce systems have become extremely important. Large numbers of customers are choosing online shopping because of its convenience, reliability, and cost. Client-generated information, and especially item reviews, are significant sources of data for consumers to make informed buying choices and for makers to keep track of customers' opinions. It is difficult for customers to make purchasing decisions based on only pictures and short product descriptions. On the other hand, mining product reviews has become a hot research topic, and prior research is mostly based on pre-specified product features to analyse the opinions. Natural Language Processing (NLP) techniques such as NLTK for Python can be applied to raw customer reviews and keywords can be extracted. This paper presents a survey on the techniques used for designing software to mine opinion features in reviews. Eleven IEEE papers are selected and a comparison is made between them. These papers are representative of the significant improvements in opinion mining in the past decade.

229 citations
