ADASYN: Adaptive synthetic sampling approach for imbalanced learning
Haibo He, Yang Bai, E. A. Garcia, Shutao Li
pp. 1322–1328
TL;DR
Simulation analyses on several machine learning data sets show the effectiveness of the ADASYN sampling approach across five evaluation metrics.

Abstract
This paper presents a novel adaptive synthetic (ADASYN) sampling approach for learning from imbalanced data sets. The essential idea of ADASYN is to use a weighted distribution over the minority class examples according to their level of difficulty in learning: more synthetic data is generated for minority class examples that are harder to learn than for those that are easier to learn. As a result, the ADASYN approach improves learning with respect to the data distributions in two ways: (1) it reduces the bias introduced by the class imbalance, and (2) it adaptively shifts the classification decision boundary toward the difficult examples. Simulation analyses on several machine learning data sets show the effectiveness of this method across five evaluation metrics.
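The weighting scheme the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, not the authors' reference implementation: the function name `adasyn`, the parameter names (`beta`, `k`), and the brute-force nearest-neighbour search are all assumptions.

```python
import numpy as np

def adasyn(X_min, X_maj, beta=1.0, k=5, rng=None):
    """Sketch of ADASYN oversampling (names and defaults assumed).

    X_min: minority samples (n_min, d); X_maj: majority samples (n_maj, d).
    beta: desired balance level after sampling (1.0 -> fully balanced).
    """
    rng = np.random.default_rng(rng)
    X_all = np.vstack([X_min, X_maj])
    n_min = len(X_min)

    # G: total number of synthetic samples to generate
    G = int((len(X_maj) - n_min) * beta)

    # r_i: fraction of majority points among the k nearest neighbours of
    # each minority sample in the full data set -- its "difficulty"
    d_all = np.linalg.norm(X_min[:, None, :] - X_all[None, :, :], axis=2)
    nn_all = np.argsort(d_all, axis=1)[:, 1:k + 1]   # column 0 is the sample itself
    r = (nn_all >= n_min).mean(axis=1)               # indices >= n_min are majority

    if r.sum() == 0:                                  # no hard-to-learn samples
        return np.empty((0, X_min.shape[1]))
    g = np.rint(r / r.sum() * G).astype(int)          # per-sample synthetic budget

    # k nearest *minority* neighbours, used for SMOTE-style interpolation
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    nn_min = np.argsort(d_min, axis=1)[:, 1:k + 1]

    synth = []
    for i, gi in enumerate(g):
        for _ in range(gi):
            j = rng.choice(nn_min[i])
            lam = rng.random()
            synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synth) if synth else np.empty((0, X_min.shape[1]))
```

Because each synthetic point is a convex combination of two minority samples, the generated data stays inside the minority class's bounding box, concentrated near samples with many majority neighbours.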
Citations
Journal Article
Learning from Imbalanced Data
Haibo He, E. A. Garcia
TL;DR: A critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario is provided.
Journal Article
Learning from imbalanced data: open challenges and future directions
TL;DR: Seven vital areas of research in this topic are identified, covering the full spectrum of learning from imbalanced data: classification, regression, clustering, data streams, big data analytics and applications, e.g., in social media and computer vision.
Proceedings Article
Class-Balanced Loss Based on Effective Number of Samples
TL;DR: This work designs a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, yielding a class-balanced loss, and introduces a novel theoretical framework that measures data overlap by associating with each sample a small neighboring region rather than a single point.
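The "effective number of samples" re-weighting summarized above reduces to a short formula: E_n = (1 − β^n) / (1 − β) for a class with n samples, with the class weight proportional to its inverse. A minimal sketch, where the function name and the normalization convention (weights summing to the number of classes) are assumptions:

```python
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    """Per-class weights from the effective number of samples.

    E_n = (1 - beta**n) / (1 - beta); the weight is its inverse,
    normalized so the weights sum to the number of classes (assumed).
    """
    counts = np.asarray(counts, dtype=float)
    eff_num = (1.0 - beta ** counts) / (1.0 - beta)  # effective sample count
    w = 1.0 / eff_num                                 # rarer class -> larger weight
    return w / w.sum() * len(counts)
```

As β approaches 1, E_n approaches n and the scheme recovers plain inverse-frequency weighting; at β = 0 every class gets equal weight.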
Journal Article
An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics
TL;DR: This work carries out a thorough discussion on the main issues related to using data intrinsic characteristics in this classification problem, and introduces several approaches and recommendations to address these problems in conjunction with imbalanced data.
Journal Article
SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary
TL;DR: The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered the "de facto" standard in the framework of learning from imbalanced data because of its simplicity of design, as well as its robustness when applied to different types of problems.
References
Journal Article
SMOTE: synthetic minority over-sampling technique
TL;DR: In this article, a method of over-sampling the minority class by creating synthetic minority class examples is proposed; it is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
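The synthetic-example generation this TL;DR refers to, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched briefly. The function name, parameter names, and brute-force neighbour search are assumptions, not the original implementation:

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE sketch (names assumed): each new point lies on the
    segment between a random minority sample and one of its k nearest
    minority neighbours."""
    rng = np.random.default_rng(rng)
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]           # column 0 is the sample itself
    i = rng.integers(0, len(X_min), n_new)           # pick base samples uniformly
    j = nn[i, rng.integers(0, k, n_new)]             # pick one neighbour each
    lam = rng.random((n_new, 1))                     # interpolation factor in [0, 1)
    return X_min[i] + lam * (X_min[j] - X_min[i])
```

Unlike ADASYN, this sketch allocates new points uniformly over the minority class rather than concentrating them on hard-to-learn samples.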
Journal Article
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
Yoav Freund, Robert E. Schapire
TL;DR: The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update Littlestone–Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.
Proceedings Article
Experiments with a new boosting algorithm
Yoav Freund, Robert E. Schapire
TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.