Open Access Journal Article

An up-to-date comparison of state-of-the-art classification algorithms

TL;DR
It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines and Random Forests, while being the fastest algorithm in terms of prediction efficiency.
Highlights

- Up-to-date report on the accuracy and efficiency of state-of-the-art classifiers.
- We compare the accuracy of 11 classification algorithms pairwise and groupwise.
- We examine separately the training, parameter-tuning, and testing time.
- GBDT and Random Forests yield the highest accuracy, outperforming SVM.
- GBDT is the fastest in testing; Naive Bayes is the fastest in training.

Abstract

Current benchmark reports of classification algorithms generally concern common classifiers and their variants but do not include many algorithms that have been introduced in recent years. Moreover, important properties such as the dependency on the number of classes and features and CPU running time are typically not examined. In this paper, we carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at the UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies. It is found that Stochastic Gradient Boosting Trees (GBDT) matches or exceeds the prediction performance of Support Vector Machines (SVM) and Random Forests (RF), while being the fastest algorithm in terms of prediction efficiency. ELM also yields good accuracy results, ranking in the top five alongside GBDT, RF, SVM, and C4.5, but its performance varies widely across data sets. Unsurprisingly, the top accuracy performers have average or slow training-time efficiency. DL is the worst performer in terms of accuracy but the second fastest in prediction efficiency. SRC shows good accuracy but is the slowest classifier in both training and testing.
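As a rough illustration of the protocol the abstract describes (timing training and prediction separately while comparing accuracy), here is a minimal sketch using scikit-learn implementations of three of the eleven classifiers. The library, the toy dataset, and the default hyperparameters are assumptions for illustration, not the paper's experimental setup.

```python
# Minimal sketch of the comparison protocol: train and test three of the
# studied classifiers, timing the training and prediction phases separately.
# scikit-learn and the digits toy dataset are assumptions, not the paper's setup.
import time

from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

classifiers = {
    "GBDT": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}

for name, clf in classifiers.items():
    t0 = time.perf_counter()
    clf.fit(X_train, y_train)        # training time
    train_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    y_pred = clf.predict(X_test)     # prediction (testing) time
    test_time = time.perf_counter() - t0

    print(f"{name}: accuracy={accuracy_score(y_test, y_pred):.3f} "
          f"train={train_time:.2f}s test={test_time:.4f}s")
```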


Citations
Proceedings Article

An Expeditious kNN Algorithm for Massive IoT Data Classification

Sneha Jacob et al.
TL;DR: Experiments show that increasing the size of the dataset does not affect the execution time or performance of the proposed method, which is 8× faster than traditional kNN.
Journal Article

Using Machine Learning Techniques to Predict Learner Drop-out Rate in Higher Educational Institutions

TL;DR: In this article, the authors deploy a machine learning algorithm with high model accuracy to predict students' drop-out rates and to identify the dominant attributes that affect learner attrition and retention in tertiary education.
Journal Article

An Approach to Growth Delimitation of Straight Line Segment Classifiers Based on a Minimum Bounding Box.

TL;DR: In this paper, the authors propose an approach for adjusting the placements of labeled straight-line segment extremities to build reliable classifiers in a constrained search space (tuned by a scale-factor parameter) that restricts the segments' lengths.
Book Chapter

Single-Subject vs. Cross-Subject Motor Imagery Models

TL;DR: In this article, the performance of machine learning algorithms trained and tested on single-subject EEG data is compared with that of models trained on nine-subject cross-subject EEG data from the BCI IV 2a dataset.
Journal Article

Kernel-based Nonlinear Manifold Learning for EEG-based Functional Connectivity Analysis and Channel Selection with Application to Alzheimer’s Disease

TL;DR: In this article, the authors used kernel-based nonlinear manifold learning to measure the functional connectivity (FC) between EEG channels and found significant differences in FC between bipolar channels of the occipital region and other regions (i.e., parietal, centro-parietal, and fronto-central) between healthy controls and patients with mild to moderate AD.
References
Journal Article

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the forest, and they are also applicable to regression.
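The "internal estimates" here refer to out-of-bag (OOB) error, which a Random Forest computes without a separate validation set. Below is a minimal sketch of monitoring the OOB error while increasing the number of features tried at each split, assuming scikit-learn (Breiman's original experiments used his own code):

```python
# Minimal sketch: monitor the out-of-bag (OOB) error, the forest's internal
# error estimate, as the number of features considered per split grows.
# scikit-learn and the digits dataset are illustrative assumptions.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)

for max_features in (1, 2, 4, 8, 16, 32):
    rf = RandomForestClassifier(
        n_estimators=200,
        max_features=max_features,  # features tried at each split
        oob_score=True,             # enable the internal OOB estimate
        random_state=0,
    )
    rf.fit(X, y)
    print(f"max_features={max_features:2d}  OOB error={1 - rf.oob_score_:.3f}")
```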
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: State-of-the-art ImageNet classification performance was achieved by a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Journal ArticleDOI

LIBSVM: A library for support vector machines

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
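In practice, the parameter selection this reference discusses usually means a cross-validated grid search over the penalty C and the RBF kernel width gamma. A minimal sketch, assuming scikit-learn's SVC (which wraps LIBSVM); the grid values are illustrative assumptions, not the library's recommendation:

```python
# Minimal sketch of SVM parameter selection: cross-validated grid search
# over C and gamma. SVC wraps LIBSVM; the grid values are assumptions.
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [1e-4, 1e-3, 1e-2, 1e-1],
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best params:", search.best_params_)
print(f"best CV accuracy: {search.best_score_:.3f}")
```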
Journal ArticleDOI

Support-Vector Networks

TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Book

C4.5: Programs for Machine Learning

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.
Frequently Asked Questions (2)
Q1. What have the authors contributed in "An up-to-date comparison of state-of-the-art classification algorithms"?

Moreover, important properties such as the dependency on number of classes and features and CPU running time are typically not examined. In this paper, the authors carry out a comparative empirical study on both established classifiers and more recently proposed ones on 71 data sets originating from different domains, publicly available at the UCI and KEEL repositories. The list of 11 algorithms studied includes Extreme Learning Machine (ELM), Sparse Representation based Classification (SRC), and Deep Learning (DL), which have not been thoroughly investigated in existing comparative studies.

In future work, the authors will further investigate the performance of the 11 classifiers in specific application domains and with different feature selection methods.

Trending Questions (1)
What are the state of the art video classification algorithms?

The paper does not mention any specific video classification algorithms.