Author
Mohd Shamrie Sainin
Other affiliations: Florida State University College of Arts and Sciences, Universiti Utara Malaysia
Bio: Mohd Shamrie Sainin is an academic researcher from Universiti Malaysia Sabah. The author has contributed to research in topics: Feature selection & Ensemble learning. The author has an h-index of 5 and has co-authored 29 publications receiving 88 citations. Previous affiliations of Mohd Shamrie Sainin include Florida State University College of Arts and Sciences & Universiti Utara Malaysia.
Papers
28 Jun 2011
TL;DR: A genetic-based wrapper approach that optimizes the feature selection process embedded in a classification technique called a supervised Nearest Neighbour Distance Matrix (NNDM), and demonstrates a significant impact on predictive accuracy when feature selection is combined with the supervised NNDM to classify new instances.
Abstract: Feature selection for data mining optimization is in high demand, especially for high-dimensional feature vectors. Feature selection is a method used to select the best feature (or combination of features) for the data in order to achieve a similar or better classification rate. Currently, there are three types of feature selection methods: filter, wrapper and embedded. This paper describes a genetic-based wrapper approach that optimizes the feature selection process embedded in a classification technique called a supervised Nearest Neighbour Distance Matrix (NNDM). This method is implemented and tested on several datasets obtained from the UCI Machine Learning Repository and other datasets. The results demonstrate a significant impact on the predictive accuracy for feature selection combined with the supervised NNDM in classifying new instances. Therefore, it can be used in other applications that require feature dimension reduction, such as image and bioinformatics classification.
18 citations
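The genetic wrapper idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the NNDM classifier, so a plain leave-one-out 1-nearest-neighbour classifier stands in for it, and every name and parameter here (`loo_1nn_accuracy`, `ga_feature_selection`, `pop_size`, `mutation_rate`) is hypothetical. Each individual is a bit mask over the features, and its fitness is the accuracy of the stand-in classifier on the selected features.

```python
import random


def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-NN classifier on the features where mask[j] == 1.

    Assumption: plain 1-NN is a stand-in for the NNDM classifier, which the
    abstract does not specify.
    """
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue  # leave the query instance out
            dist = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def ga_feature_selection(X, y, pop_size=20, generations=30,
                         mutation_rate=0.1, seed=0):
    """Wrapper feature selection with a simple genetic algorithm.

    Requires at least two features (one-point crossover needs a cut point).
    Returns (best_mask, best_accuracy).
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:  # the wrapper step: run the classifier on this subset
            cache[key] = loo_1nn_accuracy(X, y, mask)
        return cache[key]

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_population = ranked[:2]  # elitism: keep the two best masks
        while len(next_population) < pop_size:
            # Binary tournament selection for each parent.
            parent_a = max(rng.sample(population, 2), key=fitness)
            parent_b = max(rng.sample(population, 2), key=fitness)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
    best = max(population, key=fitness)
    return best, fitness(best)
```

Calling `ga_feature_selection(X, y)` on a list of numeric feature vectors `X` and labels `y` returns the best mask found and its leave-one-out accuracy. Swapping in another classifier only requires replacing the fitness function, which is the defining property of a wrapper method.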
01 Aug 2014
TL;DR: A novel framework to identify and classify tropical medicinal plants in Malaysia based on patterns extracted from the leaf is presented, with an ensemble classifier called Direct Ensemble Classifier for Imbalanced Multiclass Learning (DECIML) used as the classifier.
Abstract: Malaysian medicinal plants may be abundant natural resources, but little research has been done on preserving knowledge of these plants in a way that enables the general public to identify a leaf using computing capability. Therefore, in this preliminary study, a novel framework to identify and classify tropical medicinal plants in Malaysia based on patterns extracted from the leaf is presented. The patterns are extracted from the medicinal plant leaf based on several angle features. However, the extracted features create quite a large number of attributes (features), which degrades the performance of most classifiers. Thus, feature selection is applied to the leaf data to investigate whether the performance of a classifier can be improved. Wrapper-based genetic algorithm (GA) feature selection is used to select the features, and the ensemble classifier called Direct Ensemble Classifier for Imbalanced Multiclass Learning (DECIML) is used as the classifier. The performance of the feature selection is compared with two feature selection methods from Weka. In the experiment, five species of Malaysian medicinal plants are identified and classified, represented by 65 images. This study is important in helping the local community to utilize the knowledge and application of Malaysian medicinal plants for future generations.
15 citations
12 Aug 2014
TL;DR: This study proposes a framework to identify and classify tropical medicinal plants in Malaysia based on patterns extracted from the leaf using several angle features.
Abstract: Malaysian medicinal plants may be abundant natural resources, but little research has been done on preserving knowledge of these plants in a way that enables the general public to identify a leaf using computing capability. This study proposes a framework to identify and classify tropical medicinal plants in Malaysia based on patterns extracted from the leaf. The patterns are extracted from the medicinal plant leaf based on several angle features. Five classifiers obtained from WEKA and an ensemble classifier called Direct Ensemble Classifier for Imbalanced Multiclass Learning (DECIML) are compared on their accuracy over this data. In this experiment, five species of Malaysian medicinal plants are identified and classified, with each species represented by 65 images. This study is important in helping the local community to utilize the knowledge discovery and application of Malaysian medicinal plants for future generations. In this paper, a preliminary study is conducted to
12 citations
06 Dec 2005
TL;DR: The paper explains the use of the basic naive Bayes algorithm to classify forum text messages into two classes, clean and bad, and discusses applying the same algorithm to conference paper classification, where automated classification can reduce decision time.
Abstract: The basic text classification technique in forum applications has been discussed in Sainin (2005a) and Sainin (2005b). The paper explains the use of the basic naive Bayes algorithm to classify forum text messages into two classes, namely clean and bad. In document text classification for conference papers, papers on various themes are normally classified manually by the conference management. Once the classification of the papers is ready, the parallel presentation sessions are scheduled according to the themes. This process is time consuming and may classify papers into unrelated themes. Given this situation, automated text document classification can replace manual classification and hence reduce the decision time. In this paper, the same algorithm that was applied in the previous experiment on forum message classification is discussed in the context of an experiment on conference paper classification.
7 citations
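The naive Bayes classification of forum messages into clean and bad classes, described in the abstract above, can be sketched as follows. This is a generic multinomial naive Bayes with Laplace smoothing, written under assumptions the abstract does not confirm: whitespace tokenization, lowercase folding, and a smoothing constant `alpha`. The class name `NaiveBayesText` and the sample messages are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial naive Bayes for short texts (illustrative sketch only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (an assumption)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for word in doc.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training messages, not data from the paper.
train_docs = [
    "thanks for the helpful answer",
    "great tutorial very helpful",
    "you are a stupid idiot",
    "stupid spam idiot post",
]
train_labels = ["clean", "clean", "bad", "bad"]
model = NaiveBayesText().fit(train_docs, train_labels)
```

The same class could be trained on paper abstracts labelled by conference theme instead of forum messages, which is the reuse the abstract describes. Working in log space avoids numerical underflow when messages are long.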
15 Oct 2012
TL;DR: An ensemble learner called a Direct Ensemble Classifier for Imbalanced Multiclass Learning (DECIML) that combines simple nearest neighbour and Naive Bayes algorithms is proposed, and a combiner method called OR-tree is used to combine the decisions obtained from the ensemble classifiers.
Abstract: Researchers have shown that although traditional direct classifier algorithms can be easily applied to multiclass classification, the performance of a single classifier decreases in the presence of imbalanced data in multiclass classification tasks. Thus, ensembles of classifiers have emerged as one of the hot topics in multiclass classification for the imbalance problem in the data mining and machine learning domains. Ensemble learning is an effective technique that has increasingly been adopted to combine multiple learning algorithms to improve overall prediction accuracy, and it may outperform any single sophisticated classifier. In this paper, an ensemble learner called a Direct Ensemble Classifier for Imbalanced Multiclass Learning (DECIML) that combines simple nearest neighbour and Naive Bayes algorithms is proposed. A combiner method called OR-tree is used to combine the decisions obtained from the ensemble classifiers. The DECIML framework has been tested with several benchmark datasets and shows promising results.
7 citations
Cited by
01 Jan 2021
TL;DR: A systematic review approach was used to analyse 53 articles from recognised digital databases to provide a comprehensive understanding of prior research related to the use of Chatbots in education, including information on existing studies, benefits, and challenges.
Abstract: The introduction of Artificial Intelligence technology enables the integration of Chatbot systems into various aspects of education. This technology is increasingly being used for educational purposes. Chatbot technology has the potential to provide quick and personalised services to everyone in the sector, including institutional employees and students. This paper presents a systematic review of previous studies on the use of Chatbots in education. A systematic review approach was used to analyse 53 articles from recognised digital databases. The review results provide a comprehensive understanding of prior research related to the use of Chatbots in education, including information on existing studies, benefits, and challenges, as well as future research areas on the implementation of Chatbot technology in the field of education. The implications of the findings were discussed, and suggestions were made.
98 citations
TL;DR: A survey of the latest research on intelligent data processing technology applied in agriculture, particularly in rice production, can be found in this paper, where the authors describe the data captured and elaborate on the role of machine learning algorithms in paddy rice smart agriculture.
Abstract: Big Data (BD), Machine Learning (ML) and Internet of Things (IoT) are expected to have a large impact on Smart Farming and involve the whole supply chain, particularly for rice production. The increasing amount and variety of data captured and obtained by these emerging technologies in IoT offer the rice smart farming strategy new abilities to predict changes and identify opportunities. The quality of data collected from sensors greatly influences the performance of the modelling processes using ML algorithms. These three elements (i.e., BD, ML and IoT) have been used extensively to improve all areas of rice production processes in agriculture, transforming traditional rice farming practices into a new era of rice smart farming or rice precision agriculture. In this paper, we perform a survey of the latest research on intelligent data processing technology applied in agriculture, particularly in rice production. We describe the data captured and elaborate on the role of machine learning algorithms in paddy rice smart agriculture by analyzing the applications of machine learning in various scenarios: smart irrigation for paddy rice, predicting paddy rice yield, monitoring paddy rice growth, monitoring paddy rice disease, assessing the quality of paddy rice, and paddy rice sample classification. This paper also presents a framework that maps the activities defined in rice smart farming, the data used in data modelling, and the machine learning algorithms used for each activity defined in the production and post-production phases of paddy rice. Based on the proposed mapping framework, our conclusion is that an efficient and effective integration of all three technologies is crucial to transforming traditional rice cultivation practices into a new perspective of intelligence in rice precision agriculture. Finally, this paper also summarizes the challenges and technological trends towards the exploitation of multiple sources in the era of big data in agriculture.
59 citations
TL;DR: A novel shape recognition method based on radial basis probabilistic neural network (RBPNN) that achieves a higher recognition rate and better classification efficiency than radial basis function neural network, BP neural network and multi-layer perceptron network for plant species identification.
Abstract: In this paper, a novel shape recognition method based on radial basis probabilistic neural network (RBPNN) is proposed. The orthogonal least square algorithm (OLSA) is used to train the RBPNN, and the recursive OLSA is adopted to optimize the structure of the RBPNN. A leaf image database is used to test the proposed method, and a modified Fourier method is applied to describe the shape of the plant leaf. The experimental result shows that the RBPNN achieves a higher recognition rate and better classification efficiency than the radial basis function neural network (RBFNN), BP neural network (BPNN) and multi-layer perceptron network (MLPN) for plant species identification.
55 citations