Open Access Journal Article

A state of the art survey of data mining-based fraud detection and credit scoring

TLDR
This survey provides a dense review of up-to-date techniques for fraud detection and credit scoring, a general analysis of the results achieved, and the upcoming challenges for further research.
Abstract
Credit risk has been a widespread and deeply rooted problem for centuries. However, only after various credit derivatives and products were developed and novel technologies began radically changing human society did fraud detection, credit scoring and other risk management systems become so important, not only to specific firms but to industries and governments worldwide. Fraud and unpredictable defaults cost billions of dollars each year, forcing financial institutions to continuously improve their systems to reduce losses. In the past twenty years, many studies have proposed the use of data mining techniques to detect fraud, score credit and manage risk, but issues such as data selection, algorithm design, and hyperparameter optimization affect the perceived ability of the proposed solutions, and it is difficult for auditors and researchers to determine the current state of the art in this area. This survey focuses on recently developed data mining techniques for fraud detection and credit scoring. Several outstanding experiments are recorded and highlighted, and the corresponding techniques, which are mostly based on supervised learning, unsupervised learning, semi-supervised learning, ensemble learning, transfer learning, or hybrids of these, are explained and analysed. The goal of this paper is to provide a dense review of up-to-date techniques for fraud detection and credit scoring, a general analysis of the results achieved, and the upcoming challenges for further research.


Citations
Posted Content

Deep Learning for Anomaly Detection: A Survey.

TL;DR: A structured and comprehensive overview of research methods in deep learning-based anomaly detection, grouping state-of-the-art techniques into categories based on their underlying assumptions and the approach adopted.
Journal Article

An Improved K-Means Clustering Algorithm

TL;DR: The simulation experiment with IRIS data set shows that the proposed algorithm converges faster and the value k found is close to the actual value, which proves the validity of the algorithm.
Journal Article

AESMOTE: Adversarial Reinforcement Learning With SMOTE for Anomaly Detection

TL;DR: The goal is not only to exploit the auto-learning ability of the reinforcement-learning loop but also to address the dataset imbalance problem, which is pervasive in existing learning-based solutions.
Journal Article

Intelligent financial fraud detection practices in post-pandemic era.

TL;DR: A comprehensive overview of intelligent financial fraud detection practices can be found in this article, where the authors analyze the new features of fraud risk caused by the recent coronavirus pandemic and review the development of data types used in fraud detection from quantitative tabular data to various unstructured data.
Proceedings Article

An Efficient Real Time Model For Credit Card Fraud Detection Based On Deep Learning

TL;DR: This paper proposes a live credit card fraud detection system based on an auto-encoder deep neural network, which classifies credit card transactions as legitimate or fraudulent in real time.
References
Journal Article

Latent Dirichlet allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Proceedings Article

Latent Dirichlet Allocation

TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Journal Article

Statistical Fraud Detection: A Review

TL;DR: This work describes the tools available for statistical fraud detection and the areas in which fraud detection technologies are most used, and notes that statistics and machine learning provide effective technologies for fraud detection.
Journal Article

A comparative study of efficient initialization methods for the k-means clustering algorithm

TL;DR: It is demonstrated that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods, and eight commonly used linear time complexity initialization methods are compared.
Journal Article

The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature

TL;DR: The main data mining techniques used for FFD are logistic models, neural networks, the Bayesian belief network, and decision trees, all of which provide primary solutions to the problems inherent in the detection and classification of fraudulent data.