Proceedings ArticleDOI

Application of Natural Neighbor-based Algorithm on Oversampling SMOTE Algorithms

TLDR
In this article, the authors propose using the Natural Neighbor algorithm to suggest a value for the parameter k, the number of nearest neighbors considered around a given data point.
Abstract
Classification performance depends highly on data distribution. In real life, data often come imbalanced, with one class occurring far more often than the others. SMOTE-based algorithms are commonly used to handle this class imbalance problem. One key parameter that algorithms in the SMOTE family require is k, the number of nearest neighbors with respect to a certain data point; the k that best fits the dataset yields the optimum performance. This paper proposes an approach that uses the Natural Neighbor algorithm to suggest a value for k. Datasets are balanced by four SMOTE-based algorithms: standard SMOTE, Safe-Level-SMOTE, Modified-SMOTE, and Weighted-SMOTE. The F-measure and Recall metrics are used to evaluate the classification performance of a Support Vector Machine classifier on six datasets with different imbalance ratios. The results show that the average classification performance achieved with the proposed values of k is closer to the optimum than the performance obtained with the default value of k.
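The Natural Neighbor search underlying the proposed approach can be sketched as follows. This is a minimal illustration, not the authors' code, assuming the standard formulation: grow the neighborhood round r until every point has at least one mutual (reciprocal) r-nearest neighbor, and take the final r as the suggested k (the "natural neighbor eigenvalue"). The function name is hypothetical.

```python
import numpy as np

def natural_neighbor_k(X):
    """Return the smallest r at which every point has a mutual r-NN."""
    n = len(X)
    # Pairwise Euclidean distances; argsort orders each point's neighbors.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(d, axis=1)[:, 1:]  # drop column 0 (the point itself)
    has_mutual = np.zeros(n, dtype=bool)
    for r in range(1, n):
        knn = order[:, :r]  # each point's r nearest neighbors
        for i in range(n):
            if not has_mutual[i]:
                # i and j are natural neighbors if each is in the other's r-NN
                has_mutual[i] = any(i in knn[j] for j in knn[i])
        if has_mutual.all():
            return r
    return n - 1
```

Because neighbor sets only grow with r, a point that gains a mutual neighbor keeps it, so the flags can be cached across rounds. The returned r would then be passed as k to a SMOTE-family oversampler.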


Citations
Book ChapterDOI

Detecting Sybil Node in Intelligent Transport System

TL;DR: In this article, the authors propose a method to detect Sybil vehicles using machine learning algorithms; they use Simulation of Urban MObility (SUMO) and OpenStreetMap (OSM) to collect the dataset and features for training the machine learning models.
Journal ArticleDOI

Image Classification Under Class-Imbalanced Situation

TL;DR: In this article, the authors summarize the recent literature on class-imbalanced image classification and analyze the methods from both the data level and the algorithm level.
Book ChapterDOI

Improving Multi-class Text Classification Using Balancing Techniques

Jin Youxin
TL;DR: In this article, an ensemble of mathematical balancing techniques is introduced to increase the efficiency of BERT-based sentiment analysis models. The obtained results are significant: the two main metrics, AVG-Recall and F1-PN, are 17% and 19% higher, respectively, than the classifiers' results on the imbalanced dataset.
Book ChapterDOI

ASNN: Accelerated Searching for Natural Neighbors

TL;DR: ASNN, as presented in this paper, is based on the assumption that if remote objects have natural neighbors (NaNs), the remaining objects certainly do as well. It therefore first extracts the remote points and searches only their neighborhoods, rather than those of all points, so that the natural neighbor eigenvalue can be obtained quickly.
Proceedings ArticleDOI

Predicting Insurance Churn to Reduce Clawback

TL;DR: In this article, the authors use machine learning techniques to predict policy churn prior to payment of the commission to the agent, which avoids having the agent repay the commission (a process termed clawback) in such cases.
References
Journal ArticleDOI

Support-Vector Networks

TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Journal ArticleDOI

SMOTE: synthetic minority over-sampling technique

TL;DR: In this article, a method of over-sampling the minority class by creating synthetic minority-class examples is proposed; it is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Journal ArticleDOI

OpenML: networked science in machine learning

TL;DR: OpenML as discussed by the authors is a place for machine learning researchers to share and organize data in fine detail, so that they can work more effectively, be more visible, and collaborate with others to tackle harder problems.
Book ChapterDOI

Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem

TL;DR: The technique called Safe-Level-SMOTE carefully samples minority instances along the same line with different weight degrees, called safe levels, and achieves better accuracy than SMOTE and Borderline-SMOTE.
Proceedings ArticleDOI

MSMOTE: Improving Classification Performance When Training Data is Imbalanced

TL;DR: A modified approach (MSMOTE) for learning from imbalanced data sets, based on the SMOTE algorithm, which not only considers the distribution of minority class samples, but also eliminates noise samples by adaptive mediation.
Trending Questions (1)
How does SMOTE help to improve the performance of SVM on imbalanced data?

SMOTE helps to improve the performance of SVM on imbalanced data by generating synthetic samples for the minority class, thus balancing the dataset.
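The interpolation step described above can be sketched in a few lines. This is a minimal SMOTE-style sketch, not the reference implementation: each synthetic point is placed on the line segment between a minority sample and one of its k nearest minority-class neighbors. The function name and parameters are illustrative.

```python
import numpy as np

def smote_sample(X_min, k=5, n_new=10, rng=None):
    """Generate n_new synthetic minority samples by k-NN interpolation."""
    rng = np.random.default_rng(rng)
    # Pairwise distances among minority samples only.
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=2)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]  # each sample's k nearest neighbors
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))          # pick a random minority sample
        j = nn[i, rng.integers(nn.shape[1])]  # pick one of its k neighbors
        gap = rng.random()                    # interpolation factor in [0, 1)
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synth)
```

The synthetic samples are appended to the minority class until the classes are balanced, after which a classifier such as an SVM is trained on the augmented set; the choice of k here is exactly the parameter the paper above proposes to set via the Natural Neighbor algorithm.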