Journal ArticleDOI

Ensemble Machine Learning Model for Higher Learning Scholarship Award Decisions

01 Jan 2020-International Journal of Advanced Computer Science and Applications (The Science and Information (SAI) Organization Limited)-Vol. 11, Iss: 5
TL;DR: An ensemble knowledge model is proposed to support the scholarship award decision made by the organization; it generates a list of eligible candidates to reduce human error and the time taken to select eligible candidates manually.
Abstract: The role of higher learning in Malaysia is to ensure a high-quality educational ecosystem that develops individual potential and fulfils the national aspiration. To carry out this role successfully, scholarship offers are an important part of the strategic plan. Because the number of undergraduate students increases every year, the government must adopt a systematic strategy for managing scholarship offers so that recipients are selected effectively. Predictive models have been shown to support such decisions effectively. In this paper, an ensemble knowledge model is proposed to support the scholarship award decisions made by the organization. It generates a list of eligible candidates, reducing human error and the time taken to select eligible candidates manually. Two ensemble approaches are presented: first, ensembles of models and, second, ensembles of rule-based knowledge. Ensemble learning techniques, namely boosting, bagging, voting and a rule-based ensemble technique, together with five base learner algorithms, namely J48, Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes (NB) and Random Tree (RT), are used to develop the model. A total of 87,000 scholarship application records are used in the modelling process. The results on accuracy, precision, recall and F-measure show that the ensemble voting technique gives the best accuracy of 86.9% compared with the other techniques. This study also explores the rules obtained from the rule-based models J48 and Apriori, and the best rules were selected to develop an ensemble rule-based model that improves the classification model for scholarship award.
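As a rough illustration of the voting approach described in the abstract, the sketch below combines scikit-learn analogues of the five base learners it names (DecisionTreeClassifier standing in for J48, SVC, MLPClassifier standing in for ANN, GaussianNB, and ExtraTreeClassifier standing in for Random Tree) in a majority-vote ensemble. The paper's own tooling, features, and data are not reproduced here, so the file name, column names and split are assumptions.

```python
# Minimal sketch of a majority-vote ensemble over five base learners,
# assuming a pre-processed, numeric tabular dataset of scholarship
# applications with a hypothetical binary "eligible" label.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import classification_report

data = pd.read_csv("scholarship_applications.csv")   # hypothetical file name
X = data.drop(columns=["eligible"])
y = data["eligible"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

ensemble = VotingClassifier(
    estimators=[
        ("j48", DecisionTreeClassifier()),   # stand-in for WEKA's J48
        ("svm", SVC()),
        ("ann", MLPClassifier(max_iter=500)),
        ("nb", GaussianNB()),
        ("rt", ExtraTreeClassifier()),       # stand-in for Random Tree
    ],
    voting="hard",  # simple majority vote over predicted class labels
)
ensemble.fit(X_train, y_train)

# Reports accuracy, precision, recall and F-measure, the metrics used in the study.
print(classification_report(y_test, ensemble.predict(X_test)))
```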


References
Journal ArticleDOI
Robi Polikar
TL;DR: This paper reviews conditions under which ensemble-based systems may be more beneficial than their single-classifier counterparts, algorithms for generating the individual components of ensemble systems, and various procedures through which the individual classifiers can be combined.
Abstract: In matters of great importance that have financial, medical, social, or other implications, we often seek a second opinion before making a decision, sometimes a third, and sometimes many more. In doing so, we weigh the individual opinions, and combine them through some thought process to reach a final decision that is presumably the most informed one. The process of consulting "several experts" before making a final decision is perhaps second nature to us; yet, the extensive benefits of such a process in automated decision making applications have only recently been discovered by the computational intelligence community. Also known under various other names, such as multiple classifier systems, committee of classifiers, or mixture of experts, ensemble based systems have been shown to produce favorable results compared to those of single-expert systems for a broad range of applications and under a variety of scenarios. Design, implementation and application of such systems are the main topics of this article. Specifically, this paper reviews conditions under which ensemble based systems may be more beneficial than their single classifier counterparts, algorithms for generating individual components of the ensemble systems, and various procedures through which the individual classifiers can be combined. We discuss popular ensemble based algorithms, such as bagging, boosting, AdaBoost, stacked generalization, and hierarchical mixture of experts; as well as commonly used combination rules, including algebraic combination of outputs, voting based techniques, behavior knowledge space, and decision templates. Finally, we look at current and future research directions for novel applications of ensemble systems. Such applications include incremental learning, data fusion, feature selection, learning with missing features, confidence estimation, and error correcting output codes; all areas in which ensemble systems have shown great promise.
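The combination rules surveyed in this abstract (voting-based techniques and algebraic combination of outputs) reduce to a few lines once each base classifier's outputs are available. The snippet below is a generic illustration, not code from either paper; the arrays stand for hypothetical per-classifier predictions.

```python
import numpy as np

def majority_vote(predictions):
    """Hard combination: each row holds one classifier's predicted labels for
    all samples; the ensemble output is the most frequent label per sample."""
    predictions = np.asarray(predictions)      # shape: (n_classifiers, n_samples)
    return np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), axis=0, arr=predictions)

def mean_rule(probabilities):
    """Algebraic combination: average each class's posterior estimates across
    classifiers, then pick the class with the highest mean."""
    probabilities = np.asarray(probabilities)  # shape: (n_classifiers, n_samples, n_classes)
    return probabilities.mean(axis=0).argmax(axis=1)

# Toy example: three classifiers voting on four samples.
labels = [[1, 0, 1, 1], [1, 1, 0, 1], [0, 0, 1, 1]]
print(majority_vote(labels))                   # -> [1 0 1 1]
```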

2,628 citations


"Ensemble Machine Learning Model for..." refers background in this paper

  • ...In conclusion, the disadvantages of an ensemble learning model depend on factors such as bias, noise and variance [21][22][12]....


Journal ArticleDOI
TL;DR: This paper reviews existing ensemble techniques and can serve as a tutorial for practitioners who are interested in building ensemble-based systems.
Abstract: The idea of ensemble methodology is to build a predictive model by integrating multiple models. It is well known that ensemble methods can be used to improve prediction performance. Researchers from various disciplines, such as statistics and AI, have considered the use of ensemble methodology. This paper reviews existing ensemble techniques and can serve as a tutorial for practitioners who are interested in building ensemble-based systems.

2,273 citations


"Ensemble Machine Learning Model for..." refers background or methods in this paper

  • ...This is because, using this method, the final model is capable of combining the characteristics of the single classifiers used, with either the same or different functions [10], and gives a better result than a single classifier [10][13][12][11]....


  • ...In conclusion, the disadvantages of an ensemble learning model depend on factors such as bias, noise and variance [21][22][12]....


  • ...Many studies, as shown in [11][10][12][31], have proven that using this method may overcome the weaknesses of a single classification method and that a more robust final model can be developed....


Book ChapterDOI
Robert E. Schapire
01 Jan 2003
TL;DR: This chapter overviews some of the recent work on boosting, including analyses of AdaBoost's training error and generalization error; boosting's connection to game theory and linear programming; the relationship between boosting and logistic regression; extensions of AdaBoost for multiclass classification problems; methods of incorporating human knowledge into boosting; and experimental and applied work using boosting.
Abstract: Boosting is a general method for improving the accuracy of any given learning algorithm. Focusing primarily on the AdaBoost algorithm, this chapter overviews some of the recent work on boosting including analyses of AdaBoost’s training error and generalization error; boosting’s connection to game theory and linear programming; the relationship between boosting and logistic regression; extensions of AdaBoost for multiclass classification problems; methods of incorporating human knowledge into boosting; and experimental and applied work using boosting.
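As a minimal sketch of the boosting procedure this chapter focuses on, the snippet below runs AdaBoost over decision stumps with scikit-learn. The dataset is synthetic and the hyperparameters are illustrative only, not values from either paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data standing in for any tabular task.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# AdaBoost over decision stumps: each round reweights the training examples
# that the weak learners trained so far have misclassified.
stump = DecisionTreeClassifier(max_depth=1)
booster = AdaBoostClassifier(estimator=stump,   # `base_estimator` in older scikit-learn
                             n_estimators=200, random_state=0)

print(cross_val_score(booster, X, y, cv=5).mean())
```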

1,979 citations


"Ensemble Machine Learning Model for..." refers background in this paper

  • ...The model performance on examples not seen during training demonstrates the actual capabilities of the model [14]....


Journal ArticleDOI
TL;DR: In this article, a linear combination of simple rules derived from the data is used for general regression and classification models, where each rule consists of a conjunction of a small number of simple statements concerning the values of individual input variables.
Abstract: General regression and classification models are constructed as linear combinations of simple rules derived from the data. Each rule consists of a conjunction of a small number of simple statements concerning the values of individual input variables. These rule ensembles are shown to produce predictive accuracy comparable to the best methods. However, their principal advantage lies in interpretation. Because of its simple form, each rule is easy to understand, as is its influence on individual predictions, selected subsets of predictions, or globally over the entire space of joint input variable values. Similarly, the degree of relevance of the respective input variables can be assessed globally, locally in different regions of the input space, or at individual prediction points. Techniques are presented for automatically identifying those variables that are involved in interactions with other variables, the strength and degree of those interactions, as well as the identities of the other variables with which they interact. Graphical representations are used to visualize both main and interaction effects.
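A common way to approximate the rule-ensemble idea described above is to harvest leaf memberships from shallow trees, treat each leaf as a conjunction of simple conditions on individual input variables, and fit a sparse linear model over those binary rule features. The sketch below follows that pattern on synthetic data; it is a simplified stand-in, not the authors' implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=3000, n_features=15, random_state=1)

# Shallow boosted trees: every leaf corresponds to a conjunction of a few
# simple conditions on individual input variables, i.e. a candidate rule.
trees = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=1)
trees.fit(X, y)

# apply() returns, for each sample, the leaf reached in every tree;
# one-hot encoding turns leaf membership into binary rule features.
leaves = trees.apply(X).reshape(X.shape[0], -1)
rules = OneHotEncoder().fit_transform(leaves)

# An L1-penalised linear model keeps only the most useful rules, which is
# what makes the resulting ensemble interpretable.
linear = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
linear.fit(rules, y)
print("rules kept:", np.count_nonzero(linear.coef_))
```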

874 citations

Journal ArticleDOI
TL;DR: An overview is provided of the data mining techniques that have been used to predict students' performance and of how prediction algorithms can be used to identify the most important attributes in students' data.

558 citations


"Ensemble Machine Learning Model for..." refers background in this paper

  • ...For student-related issues, there are many studies involved, such as classification and prediction models for monitoring student academic performance based on existing student achievement records [2][3][4], selecting the best students who are eligible to be offered certain university programmes, and selecting the best study programmes or courses for students to register for....
