Journal ArticleDOI

Reinforced quasi-random forest

01 Oct 2019-Pattern Recognition (Pergamon)-Vol. 94, pp 13-24
TL;DR: A reinforced quasi-random forest for the classification task is proposed; it assigns an importance to each attribute and identifies the attributes that cause the misclassification of data points during training.
About: This article was published in Pattern Recognition on 2019-10-01. It has received 6 citations to date.
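
The TL;DR above describes two ingredients: per-attribute importances maintained during training, and the identification of attributes implicated when training points are misclassified. A minimal Python sketch of that general idea follows; the weighted feature subsampling, the out-of-bag (OOB) error signal, and the `penalty` update rule are assumptions for illustration, not the paper's algorithm.

```python
# Sketch: grow trees on weighted feature subspaces; attributes used by
# trees that misclassify OOB points are down-weighted for later trees.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n, d = X.shape
weights = np.ones(d) / d      # attribute importances, updated during training
penalty = 0.9                 # assumed down-weighting factor
trees, feats = [], []

for t in range(100):
    # Draw a feature subspace with probability proportional to importance.
    f = rng.choice(d, size=int(np.sqrt(d)), replace=False, p=weights)
    boot = rng.integers(0, n, size=n)           # bootstrap row indices
    oob = np.setdiff1d(np.arange(n), boot)      # out-of-bag rows
    tree = DecisionTreeClassifier(random_state=t).fit(X[np.ix_(boot, f)], y[boot])
    # Attributes of trees that misclassify OOB points lose importance.
    err = (tree.predict(X[np.ix_(oob, f)]) != y[oob]).mean()
    weights[f] *= penalty ** err
    weights /= weights.sum()
    trees.append(tree)
    feats.append(f)

# Majority vote over the forest (binary labels 0/1 here).
votes = np.stack([tr.predict(X[:, f]) for tr, f in zip(trees, feats)])
print("training accuracy:", (votes.mean(axis=0).round() == y).mean())
```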
Citations
Journal ArticleDOI
TL;DR: There is some potential for AI techniques to contribute towards better frailty identification within residential care, but the potential benefits will need to be weighed against the administrative burden, data quality concerns, and the presence of potential bias.

30 citations

Journal ArticleDOI
TL;DR: The proposed method for few-shot diagnosis of diseases and conditions from chest X-rays using discriminative ensemble learning is modular and easily adaptable to new tasks, requiring retraining of only the saliency-based classifier.

29 citations

Journal ArticleDOI
TL;DR: This work proposes to use multiple features at a node for splitting the data, as in the axis-parallel method, and empirically shows that the performance of the MaRF improves due to the improvement in the strength of the M-ary trees.
Abstract: Random Forest (RF) is composed of decision trees as base classifiers. In general, a decision tree recursively partitions the feature space into two disjoint subspaces using a single feature as an axis-parallel split at each internal node. The oblique decision tree instead uses a linear combination of features (forming a hyperplane) to partition the feature space into two subspaces; computing the best-suited hyperplane for this latter approach is NP-hard. In this work, we propose to use multiple features at a node for splitting the data, as in the axis-parallel method. Each feature independently divides the space into two subspaces, and this is done for multiple features at one node. The given space is thus divided into multiple subspaces simultaneously, which in turn constructs M-ary trees; the resulting forest is named the M-ary Random Forest (MaRF). To measure task performance in MaRF, we extend the notion of tree strength from regression trees. We empirically show that the performance of the MaRF improves due to the improvement in the strength of the M-ary trees. We report performance on a wide range of datasets, including UCI datasets, a hyperspectral dataset, MNIST, Caltech 101, and Caltech 256. The efficiency of the MaRF approach is found to be satisfactory compared with state-of-the-art methods.

7 citations
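
A minimal sketch of the M-ary split described in the abstract above: m features are thresholded independently at one node, so a sample is routed to one of up to 2**m child cells at once. The random feature choice and median thresholds here are illustrative assumptions; the paper's split criterion is not reproduced.

```python
# Route samples through one M-ary node: each (feature, threshold) pair
# contributes one bit of the child-cell index.
import numpy as np

def mary_route(X, feat_idx, thresholds):
    """Return the child cell index (0 .. 2**m - 1) for each row of X.

    Bit j of the index is 1 iff X[:, feat_idx[j]] > thresholds[j].
    """
    bits = (X[:, feat_idx] > thresholds).astype(int)   # shape (n, m)
    powers = 2 ** np.arange(len(feat_idx))
    return bits @ powers

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 6))
feat_idx = np.array([1, 4])                  # m = 2 features at this node
thresholds = np.median(X[:, feat_idx], axis=0)
print(mary_route(X, feat_idx, thresholds))   # 4-way (2**2) partition
```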

Journal ArticleDOI
TL;DR: In this article, three models were constructed using different algorithms: Naïve Bayes (NB), Random Forest (RF), and J48; the NB model achieved the highest prediction accuracy, 99.34%.
Abstract: A problem that pervades students' careers is poor performance in high school. Predicting students' academic performance helps educational institutions in many ways: knowing and identifying the factors that can affect students' academic performance early on can help institutions achieve their educational goals by providing support to students sooner. The aim of this study was to predict the achievement of early secondary students. Two datasets were used, covering high school students who graduated in the Al-Baha region of the Kingdom of Saudi Arabia. In this study, three models were constructed using different algorithms: Naïve Bayes (NB), Random Forest (RF), and J48. Moreover, the Synthetic Minority Oversampling Technique (SMOTE) was applied to balance the data, and features were extracted using the correlation coefficient. The performance of the prediction models was validated using 10-fold cross-validation and a direct train/test partition, in addition to various evaluation metrics: the accuracy curve, true positive (TP) rate, false positive (FP) rate, accuracy, recall, F-measure, and the receiver operating characteristic (ROC) curve. The NB model achieved a prediction accuracy of 99.34%, followed by the RF model with 98.7%.

5 citations
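
A minimal sketch of the pipeline described in the abstract above: SMOTE to balance the classes, correlation-based feature selection, then 10-fold cross-validation. The study's Al-Baha student data is not public, so a synthetic stand-in is used; J48 (Weka's C4.5 implementation) is omitted, the scikit-learn models stand in for the originals, and the 0.1 correlation threshold is an assumption.

```python
# Sketch: balance with SMOTE, select features by correlation with the
# label, then score NB and RF with 10-fold cross-validation.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=600, n_features=20, weights=[0.85],
                           random_state=0)          # imbalanced stand-in data
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)

# Keep features whose absolute correlation with the label exceeds a threshold.
corr = np.array([abs(np.corrcoef(X_bal[:, j], y_bal)[0, 1])
                 for j in range(X_bal.shape[1])])
X_sel = X_bal[:, corr > 0.1]

for name, model in [("NB", GaussianNB()),
                    ("RF", RandomForestClassifier(random_state=0))]:
    acc = cross_val_score(model, X_sel, y_bal, cv=10, scoring="accuracy")
    print(f"{name}: {acc.mean():.3f} (10-fold CV accuracy)")
```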

Journal ArticleDOI
TL;DR: In this paper, a local quadratic embedding learning (LQEL) algorithm for regression problems, based on metric learning and neural networks (NNs), is proposed; it makes full use of the information implied in target labels to find more reliable reference instances.
References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to AdaBoost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 1996, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

79,257 citations
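
The internal estimates discussed in the abstract above are exposed directly by scikit-learn's random forest implementation. A minimal sketch follows: the out-of-bag (OOB) error and impurity-based variable importance are read off a fitted forest while the number of features tried at each split is varied; the dataset and parameter grid are illustrative choices.

```python
# Sketch: OOB error as an internal estimate of generalization error,
# for an increasing number of features considered at each split.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
for m in (1, 3, 5, 10):            # features tried at each split
    rf = RandomForestClassifier(n_estimators=300, max_features=m,
                                oob_score=True, random_state=0).fit(X, y)
    print(f"max_features={m:2d}  OOB error={1 - rf.oob_score_:.4f}")

# Variable importance, derived from the same internal machinery.
top = rf.feature_importances_.argsort()[::-1][:3]
print("three most important features:", top)
```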

Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, graph transformer networks (GTNs) are proposed for globally training multi-module recognition systems; given an appropriate architecture, gradient-based learning can synthesize a complex decision surface that classifies high-dimensional patterns such as handwritten characters.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules, including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTNs), allows such multi-module systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations
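
A minimal sketch of a convolutional network for handwritten digit recognition in the spirit of the architecture discussed above. This is an illustrative PyTorch version, not the paper's exact LeNet-5: it assumes 28x28 inputs and substitutes ReLU and max-pooling for the original's sigmoidal units and subsampling layers.

```python
# Sketch: a small LeNet-style CNN for 28x28 grayscale digit images.
import torch
import torch.nn as nn

class LeNetLike(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),   # 28x28 -> 28x28
            nn.ReLU(), nn.MaxPool2d(2),                  # -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),             # -> 10x10
            nn.ReLU(), nn.MaxPool2d(2),                  # -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = LeNetLike()
digits = torch.randn(8, 1, 28, 28)       # a batch of dummy digit images
print(model(digits).shape)               # torch.Size([8, 10])
```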

Proceedings ArticleDOI
01 Jul 1992
TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented; it is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions.
Abstract: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns, which are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained compared with other learning algorithms.

11,211 citations
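
A minimal sketch of the maximal-margin idea described above, using scikit-learn's SVC: the same training algorithm is run with linear, polynomial, and RBF decision functions, and the support vectors (the training patterns closest to the decision boundary) are counted. The dataset and C value are illustrative choices.

```python
# Sketch: one margin-maximizing trainer, three families of decision functions.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
for kernel in ("linear", "poly", "rbf"):
    clf = SVC(kernel=kernel, C=1.0).fit(X, y)
    print(f"{kernel:6s} accuracy={clf.score(X, y):.3f} "
          f"support vectors={len(clf.support_)}")
```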

Proceedings Article
Yoav Freund1, Robert E. Schapire1
03 Jul 1996
TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Abstract: In an earlier paper, we introduced a new "boosting" algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a "pseudo-loss", which is a method for forcing a learning algorithm of multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems. We performed two sets of experiments. The first set compared boosting to Breiman's "bagging" method when used to aggregate various classifiers (including decision trees and single attribute-value tests); we compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.

7,601 citations
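
A minimal sketch of the boosting-versus-bagging comparison described above, with both methods aggregating the same base learner on a standard benchmark. scikit-learn's AdaBoostClassifier stands in for the original algorithm; the depth-1 base trees, the dataset, and the `estimator` keyword (scikit-learn >= 1.2) are assumptions.

```python
# Sketch: boosting vs. bagging of decision stumps, scored by 10-fold CV.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
stump = DecisionTreeClassifier(max_depth=1)
for name, ens in [("boosting", AdaBoostClassifier(estimator=stump,
                                                  n_estimators=100)),
                  ("bagging ", BaggingClassifier(estimator=stump,
                                                 n_estimators=100))]:
    acc = cross_val_score(ens, X, y, cv=10)
    print(f"{name}: mean accuracy {acc.mean():.3f}")
```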

01 Jan 1996

7,386 citations