Local neighbourhood extension of SMOTE for mining imbalanced data
Citations
905 citations
Cites background from "Local neighbourhood extension of SM..."
...The interpolation mechanisms can be range restricted (Han et al., 2005; Bunkhumpornpat et al., 2009; Maciejewski & Stefanowski, 2011), for example by looking not only for nearest neighbours from the minority class but also from majority class; creating new examples closer to the selected instance…...
[...]
873 citations
Cites background from "Local neighbourhood extension of SM..."
...To counter the negative effects, one often chooses from a few available options, which have been extensively studied in the past [7, 9, 11, 17, 18, 30, 40, 41, 46, 48]....
[...]
...To address this, SMOTE [7] creates new non-replicated examples by interpolating neighboring minority class instances....
[...]
...Several variants of SMOTE [17, 30] followed for improvements....
[...]
...Related Work Previous efforts to tackle the class imbalance problem can be mainly divided into two groups: data re-sampling [7, 11, 17, 18, 30] and cost-sensitive learning [9, 40, 41, 46, 48]....
[...]
...Jeatrakul et al. [21] treated the Complementary Neural Network as an under-sampling technique, and combined it with SMOTE over-sampling to balance training data....
[...]
References
10,306 citations
"Local neighbourhood extension of SM..." refers methods in this paper
...In order to globally compare performance of a pair of methods on all data sets we used the Wilcoxon Signed Ranks Test – a nonparametric test for significant differences between paired observations – see details of its calculations described in [15]....
[...]
6,320 citations
"Local neighbourhood extension of SM..." refers background or methods in this paper
...Typical examples of such problems are diagnosing dangerous illnesses, analysing financial risk, detecting oil spills in satellite images, predicting technical equipment failures, or information filtering [1], [2]....
[...]
...As the overall classification accuracy is biased towards the majority classes [2], in most of the studies on imbalanced data, measures defined for two-class classification are considered, where typically the class label of the minority class is called positive and the class label of the majority class is negative....
[...]
...Learning from imbalanced data has received growing research interest in the last decade and several specialized methods have been proposed (see [2], [3] for a review)....
[...]
2,914 citations
"Local neighbourhood extension of SM..." refers methods or result in this paper
...[7], [1], [6]) we remove difficult noisy examples from the majority class in the first step before applying LN-SMOTE to the modified data....
[...]
...ENN and Tomek links [7], but the strategy described above gave the best results....
[...]
...We chose them because decision trees are known to be sensitive to class imbalance and have often been used in studies of SMOTE and its extensions [7], [9], [1], [8]....
[...]
...Although experiments confirmed its usefulness [1], [7], some of the assumptions behind this technique could still be questioned....
[...]