Journal Article

Benchmarking state-of-the-art classification algorithms for credit scoring

Abstract
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g., logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but that simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
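The two performance measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the labels and scores are hypothetical toy data, the coding of class 1 as the positive class and the 0.5 cut-off are assumptions, and the AUC is computed through its pairwise-ranking interpretation (the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half).

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of cases classified correctly at the given score cut-off.

    The 0.5 default threshold is an assumption made for illustration.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the share of (positive, negative) pairs in which the positive
    case scores higher; ties contribute one half.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.55, 0.1, 0.4]

print("accuracy:", accuracy(y_true, scores))
print("AUC:", auc(y_true, scores))
```

Accuracy depends on the chosen cut-off, whereas the AUC summarises ranking quality across all cut-offs, which is one reason the two measures are often reported side by side.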


Citations
Journal Article

Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings

TL;DR: A framework for comparative software defect prediction experiments is proposed and applied in a large-scale empirical comparison of 22 classifiers over 10 public domain data sets from the NASA Metrics Data repository, showing an appealing degree of predictive accuracy, which supports the view that metric-based classification is useful.
Journal Article

Credit scoring with a data mining approach based on support vector machines

TL;DR: Three strategies are used to construct hybrid SVM-based credit scoring models, and experimental results show that SVM is a promising addition to the existing data mining methods.
Journal Article

The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients

TL;DR: Among the six data mining techniques, the artificial neural network is the only one that can accurately estimate the real probability of default: its regression intercept is close to zero and its regression coefficient is close to one.
Journal Article

Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research

TL;DR: The study of Baesens et al. (2003) is updated and several novel classification algorithms are compared to the state of the art in credit scoring, providing an independent assessment of recent scoring methods and offering a new baseline to which future approaches can be compared.
Journal Article

An experimental comparison of classification algorithms for imbalanced credit scoring data sets

TL;DR: The results from this empirical study indicate that the random forest and gradient boosting classifiers perform very well in a credit scoring context and are able to cope comparatively well with pronounced class imbalances in these data sets.
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Statistical learning theory

TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Book

C4.5: Programs for Machine Learning

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.
Book

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

TL;DR: The authors describe the important ideas in these areas in a common conceptual framework; the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.
Book

Neural networks for pattern recognition

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.