Proceedings ArticleDOI

Using Ensemble Learning and Association Rules to Help Car Buyers Make Informed Choices

10 Nov 2016 · 8 pp.
TL;DR: Bagging, boosting, and voting ensemble learning are used to improve classification accuracy, and class association rules are evaluated to see whether they perform better than collaborative filtering for suggesting items to the user.
Abstract: Cars are an essential part of our everyday life. Nowadays a wide range of cars is produced by many companies in all segments. Buyers have to consider many factors when buying a car, which makes the whole process considerably more difficult. In this paper we therefore develop an ensemble learning method to aid people in making the decision. Bagging, boosting, and voting ensemble learning are used to improve classification accuracy. We also apply class association rules to see whether they perform better than collaborative filtering for suggesting items to the user.
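As a minimal sketch of the hard-voting idea the abstract describes (the dataset, base learners, and parameters here are illustrative, not the paper's actual setup), scikit-learn's `VotingClassifier` combines heterogeneous models by majority vote:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for a car-evaluation-style dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hard voting: each base learner casts one vote, the majority label wins.
vote = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="hard",
)
vote.fit(X_tr, y_tr)
acc = accuracy_score(y_te, vote.predict(X_te))
```

Bagging and boosting follow the same pattern with `BaggingClassifier` and `AdaBoostClassifier` wrapping a single base learner.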
Citations
Journal ArticleDOI
TL;DR: This study found that the classification model from the SVM algorithm provided the best result, with 86.45% accuracy in correctly classifying candidates' 'Eligible' status, while RT was the weakest model, with the lowest accuracy for this purpose.
Abstract: Scholarship is a financial facility given to eligible students to pursue higher education. Limited funding sources and a growing number of applicants force the Government to find solutions that speed up and facilitate the selection of eligible students through a systematic approach. In this study, a data mining approach was used to propose a classification model for determining scholarship award results. A dataset of successful and unsuccessful applicants was processed into training and testing data for the modelling process. Five algorithms were employed to develop the classification model: J48, SVM, NB, ANN, and RT. Each model was evaluated using technical evaluation metrics such as contingency-table metrics, accuracy, precision, and recall. As a result, the best models fell into two categories: the best model for classifying ‘Eligible’ status and the best model for classifying ‘Not Eligible’ status. The knowledge obtained from the rules-based model was evaluated through analysis conducted by technical and domain experts. The study found that the SVM classification model provided the best result, with 86.45% accuracy in correctly classifying candidates’ ‘Eligible’ status, while RT was the weakest model for this purpose, with only 82.9% accuracy. The model with the highest accuracy for ‘Not Eligible’ status was the NB model, whereas the SVM model was the weakest at classifying ‘Not Eligible’ status. In addition, analysis of the decision tree model showed that some of the new information derived from this research may help stakeholders in designing new policies and scholarship programmes in the future.
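The per-class evaluation described above (a best model per status label) reduces to computing precision and recall with each label in turn treated as the positive class. A small stand-alone sketch with made-up labels (not the study's data):

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision and recall treating `positive` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy ground truth vs. a hypothetical classifier's output.
y_true = ["Eligible", "Eligible", "Not Eligible", "Not Eligible", "Eligible"]
y_pred = ["Eligible", "Not Eligible", "Not Eligible", "Eligible", "Eligible"]
p, r = per_class_metrics(y_true, y_pred, "Eligible")
```

Swapping the `positive` argument to "Not Eligible" gives the metrics for the other category, which is how one model can be "best" for one status and weakest for the other.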

5 citations

Journal ArticleDOI
TL;DR: An ensemble knowledge model is proposed to support the scholarship award decision made by the organization; it generates a list of eligible candidates to reduce the human error and time taken to select candidates manually.
Abstract: The role of higher learning in Malaysia is to ensure high-quality educational ecosystems that develop individual potential to fulfil the national aspiration. Scholarship offers are an important part of the strategic plan to implement this role successfully. With the number of undergraduate students increasing every year, the government must apply a systematic strategy to manage scholarship offers so that recipients are selected effectively. Predictive models have been shown to be effective for this purpose. In this paper, an ensemble knowledge model is proposed to support the scholarship award decisions made by the organization. It generates a list of eligible candidates, reducing the human error and time involved in selecting candidates manually. Two ensemble approaches are presented: ensembles of models and ensembles of rule-based knowledge. Ensemble learning techniques, namely boosting, bagging, voting, and a rules-based ensemble technique, and five base-learner algorithms, namely J48, Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes (NB), and Random Tree (RT), are used to develop the model. A total of 87,000 scholarship application records are used in the modelling process. Results on accuracy, precision, recall, and F-measure show that the ensemble voting technique gives the best accuracy, 86.9%, compared to the other techniques. The study also explores the rules obtained from the rules-based J48 and Apriori models and selects the best rules to develop an ensemble rules-based model, which improves the classification model for scholarship awards.
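The "ensemble of rule-based knowledge" idea can be sketched as several independent decision rules whose votes are aggregated by majority. The rules, thresholds, and applicant fields below are entirely hypothetical, invented for illustration:

```python
from collections import Counter

# Each "model" is a hypothetical eligibility rule: applicant dict -> label.
rules = [
    lambda a: "Eligible" if a["cgpa"] >= 3.5 else "Not Eligible",
    lambda a: "Eligible" if a["income"] < 3000 else "Not Eligible",
    lambda a: "Eligible" if a["cocurricular"] >= 7 else "Not Eligible",
]

def vote(applicant):
    """Majority vote over the rule ensemble."""
    labels = [rule(applicant) for rule in rules]
    return Counter(labels).most_common(1)[0][0]

# Two of the three rules fire "Eligible" for this applicant.
applicant = {"cgpa": 3.8, "income": 4500, "cocurricular": 8}
decision = vote(applicant)
```

In the study the individual rules come from trained models (J48, Apriori) rather than being hand-written, but the aggregation step is the same majority-vote pattern.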

1 citation


Cites methods from "Using Ensemble Learning and Associa..."

  • ...Based on the review by [17] the single-core tree algorithm and decision tree will produce different tree outputs....

    [...]

References
Journal ArticleDOI
01 Aug 1996
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
Abstract: Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
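A minimal sketch of bagging as the abstract describes it (bootstrap replicates of the learning set, plurality vote over trees), using scikit-learn on a noisy synthetic dataset rather than the paper's real data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Label noise (flip_y) makes a single unpruned tree unstable,
# which is exactly the regime where Breiman says bagging helps.
X, y = make_classification(n_samples=600, n_features=10, flip_y=0.1, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

single = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)
bagged = BaggingClassifier(
    DecisionTreeClassifier(random_state=1),  # base learner
    n_estimators=50,                         # bootstrap replicates
    random_state=1,
).fit(X_tr, y_tr)

acc_single = accuracy_score(y_te, single.predict(X_te))
acc_bagged = accuracy_score(y_te, bagged.predict(X_te))
```

For regression, the same wrapper averages predictions instead of voting (`BaggingRegressor`).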

16,118 citations

Journal ArticleDOI
TL;DR: Simulation studies show that the performance of the combining techniques is strongly affected by the small sample size properties of the base classifier: boosting is useful for large training sample sizes, while bagging and the random subspace method are useful for critical training sample sizes.
Abstract: Recently bagging, boosting and the random subspace method have become popular combining techniques for improving weak classifiers. These techniques are designed for, and usually applied to, decision trees. In this paper, in contrast to a common opinion, we demonstrate that they may also be useful in linear discriminant analysis. Simulation studies, carried out for several artificial and real data sets, show that the performance of the combining techniques is strongly affected by the small sample size properties of the base classifier: boosting is useful for large training sample sizes, while bagging and the random subspace method are useful for critical training sample sizes. Finally, a table describing the possible usefulness of the combining techniques for linear classifiers is presented.
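The random subspace method with a linear discriminant base classifier, as studied above, can be approximated in scikit-learn by a bagging wrapper that samples features rather than examples (the dataset and sizes here are illustrative, not the paper's experimental setup):

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Many features relative to sample size: the "critical" regime for LDA.
X, y = make_classification(n_samples=300, n_features=30, n_informative=10,
                           random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

# Random subspace: every LDA sees all samples but only half the features.
subspace = BaggingClassifier(
    LinearDiscriminantAnalysis(),
    n_estimators=25,
    bootstrap=False,        # keep all training samples
    max_features=0.5,       # random 50% feature subset per member
    random_state=2,
).fit(X_tr, y_tr)

acc = accuracy_score(y_te, subspace.predict(X_te))
```

Setting `bootstrap=True` and `max_features=1.0` instead would recover ordinary bagging of the same base classifier.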

449 citations

01 Jan 2011
TL;DR: Results show that Naive Bayes is the best classifier compared with several common classifiers (such as decision trees, neural networks, and support vector machines) in terms of accuracy and computational efficiency.
Abstract: Document classification is a growing interest in text mining research. Correctly assigning documents to a particular category remains challenging because of the large number of features in the dataset. Among existing classification approaches, Naive Bayes is potentially well suited to serve as a document classification model due to its simplicity. The aim of this paper is to highlight the performance of Naive Bayes in document classification. Results show that Naive Bayes is the best classifier, compared with several common classifiers (such as decision trees, neural networks, and support vector machines), in terms of accuracy and computational efficiency.
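A toy version of the Naive Bayes document classifier discussed above (the corpus and category names are invented; real evaluations use large labelled collections) can be built with scikit-learn's bag-of-words pipeline:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up training corpus with two categories.
docs = [
    "cheap loans win money",
    "win money cheap prize",
    "meeting agenda project deadline",
    "project report meeting schedule",
]
labels = ["spam", "spam", "work", "work"]

# Bag-of-words features + multinomial NB with Laplace smoothing (the default).
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(docs, labels)

pred = clf.predict(["win cheap money now"])[0]
```

The simplicity the abstract highlights is visible here: training is a single pass counting word frequencies per class, with no iterative optimisation.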

140 citations

Posted Content
TL;DR: This paper conducts a study comparing several collaborative filtering techniques, both classic and recent state-of-the-art, in a variety of experimental contexts to identify which algorithms work well and under what conditions.
Abstract: Collaborative filtering is a rapidly advancing research area. Every year several new techniques are proposed, and yet it is not clear which of the techniques work best and under what conditions. In this paper we conduct a study comparing several collaborative filtering techniques, both classic and recent state-of-the-art, in a variety of experimental contexts. Specifically, we report conclusions controlling for number of items, number of users, sparsity level, performance criteria, and computational complexity. Our conclusions identify which algorithms work well and in what conditions, and contribute both to the industrial deployment of collaborative filtering algorithms and to the research community.
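As a rough illustration of the classic user-based collaborative filtering baseline such studies compare against (all users, items, and ratings below are made up), a cosine-similarity predictor in plain Python:

```python
import math

# Hypothetical user -> {item: rating} matrix, sparse by construction.
ratings = {
    "alice": {"sedan": 5, "suv": 3, "hatchback": 4},
    "bob":   {"sedan": 5, "suv": 2, "hatchback": 5, "coupe": 4},
    "carol": {"suv": 5, "coupe": 1},
}

def cosine(u, v):
    """Cosine similarity restricted to co-rated items."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = (math.sqrt(sum(u[i] ** 2 for i in common))
           * math.sqrt(sum(v[i] ** 2 for i in common)))
    return num / den

def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        s = cosine(ratings[user], r)
        num += s * r[item]
        den += abs(s)
    return num / den if den else None

score = predict("alice", "coupe")
```

The paper's point is that how well this kind of scheme works depends heavily on sparsity and on the numbers of users and items, which a toy matrix like this cannot show.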

130 citations

Proceedings Article
08 Jul 1997
TL;DR: A new machine learning method is presented that, given a set of training examples, induces a definition of the target concept in terms of a hierarchy of intermediate concepts and their definitions, which effectively decomposes the problem into smaller, less complex problems.
Abstract: We present a new machine learning method that, given a set of training examples, induces a definition of the target concept in terms of a hierarchy of intermediate concepts and their definitions. This effectively decomposes the problem into smaller, less complex problems. The method is inspired by the Boolean function decomposition approach to the design of digital circuits. To cope with high time complexity of finding an optimal decomposition, we propose a suboptimal heuristic algorithm. The method, implemented in program HINT (HIerarchy Induction Tool), is experimentally evaluated using a set of artificial and real-world learning problems. It is shown that the method performs well both in terms of classification accuracy and discovery of meaningful concept hierarchies.
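The core idea of function decomposition can be shown on a tiny hand-picked Boolean concept (this is only the flavour of the approach, not the HINT algorithm, which must discover the hierarchy automatically):

```python
from itertools import product

# Hypothetical target concept over four binary attributes.
def target(a, b, c, d):
    return (a and b) or (c and d)

# Decomposition into two intermediate concepts plus a top-level combiner:
# each sub-problem involves fewer attributes than the original.
def g1(a, b):
    return a and b

def g2(c, d):
    return c and d

def decomposed(a, b, c, d):
    return g1(a, b) or g2(c, d)

# The hierarchy reproduces the target concept on every possible input.
matches = all(bool(target(*v)) == bool(decomposed(*v))
              for v in product([0, 1], repeat=4))
```

Here the decomposition is given by hand; the paper's contribution is a heuristic that searches for such intermediate concepts from training examples alone.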

85 citations


"Using Ensemble Learning and Associa..." refers methods in this paper

  • ...Data Mining [3], [10] is the process of discovering patterns in large datasets by using methods from various fields of interest....

    [...]

  • ...In [10] proper use of car evaluation dataset is demonstrated by developing a new machine learning method....

    [...]