Journal Article
Feature Selection with Missing Labels Using Multilabel Fuzzy Neighborhood Rough Sets and Maximum Relevance Minimum Redundancy
TL;DR
A feature selection algorithm is designed to improve performance on multilabel data with missing labels; it is effective not only for recovering missing labels but also for selecting significant features with better classification performance.
Abstract
Recently, multilabel classification has generated considerable research interest. However, the high dimensionality of multilabel data incurs high costs; moreover, in many real applications, a number of labels of training samples are randomly missing. Multilabel classification can therefore involve great complexity and ambiguity, under which some feature selection methods exhibit poor robustness and yield low prediction accuracy. To address these issues, this paper presents a novel feature selection method based on multilabel fuzzy neighborhood rough sets (MFNRS) and maximum relevance minimum redundancy (MRMR) that can be applied to multilabel data with missing labels. First, to handle missing labels, a sample relation coefficient, a label complement matrix, and a label-specific feature matrix are constructed and incorporated into a linear regression model that recovers the missing labels. Second, a margin-based fuzzy neighborhood radius, a fuzzy neighborhood similarity relation, and fuzzy neighborhood information granules are developed, and the MFNRS model is built by combining multilabel neighborhood rough sets with fuzzy neighborhood rough sets. From both the algebra and information views, fuzzy neighborhood entropy-based uncertainty measures are proposed for MFNRS, and a fuzzy neighborhood mutual information-based MRMR model with label correlation is developed to evaluate candidate features. Finally, a feature selection algorithm is designed to improve performance on multilabel data with missing labels. Experiments on twenty datasets verify that the method is effective both at recovering missing labels and at selecting significant features with better classification performance.
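The selection stage described in the abstract can be sketched in simplified form. The code below is a hypothetical illustration, not the authors' implementation: it uses a fixed neighborhood radius instead of the margin-based one, omits the label-recovery step, and scores candidate features with a fuzzy-neighborhood version of mutual information inside a greedy MRMR loop. All function names and the toy construction are assumptions for illustration.

```python
import numpy as np

def fuzzy_similarity(X, radius=0.2):
    """Fuzzy neighborhood similarity matrix: 1 - normalized distance inside
    the neighborhood radius, 0 outside. (The paper derives a margin-based
    radius; a fixed one is used here for simplicity.)"""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    d = d / (d.max() + 1e-12)                  # normalize distances to [0, 1]
    return np.where(d <= radius, 1.0 - d, 0.0)

def fn_entropy(sim):
    """Fuzzy neighborhood entropy: -mean_i log2(|granule_i| / n), where the
    fuzzy cardinality |granule_i| is the row sum of the similarity matrix."""
    n = sim.shape[0]
    return -np.mean(np.log2(sim.sum(axis=1) / n))

def fn_mutual_information(sim_a, sim_b):
    """Fuzzy neighborhood mutual information; granule intersection is taken
    as the elementwise minimum of the two similarity matrices."""
    return fn_entropy(sim_a) + fn_entropy(sim_b) - fn_entropy(np.minimum(sim_a, sim_b))

def mrmr_select(X, Y, k=2, radius=0.2):
    """Greedy MRMR: repeatedly add the feature whose relevance to the labels
    minus its mean redundancy with already-selected features is maximal."""
    m = X.shape[1]
    sims = [fuzzy_similarity(X[:, [j]], radius) for j in range(m)]
    sim_y = fuzzy_similarity(Y, radius)
    selected, remaining = [], list(range(m))
    while len(selected) < k and remaining:
        def score(j):
            rel = fn_mutual_information(sims[j], sim_y)
            red = (np.mean([fn_mutual_information(sims[j], sims[s])
                            for s in selected]) if selected else 0.0)
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

On a toy dataset where one feature determines the labels and another is constant, the constant feature receives zero fuzzy-neighborhood mutual information with the labels and is never preferred.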
Citations
Journal Article
Feature selection using Fisher score and multilabel neighborhood rough sets for multilabel classification
TL;DR: A filter-wrapper preprocessing algorithm for feature selection using an improved Fisher score model is proposed to decrease the spatiotemporal complexity of multilabel data, and a heuristic feature selection algorithm is designed to improve classification performance on multilabel datasets.
Journal Article
PrimePatNet87: Prime pattern and tunable q-factor wavelet transform techniques for automated accurate EEG emotion recognition.
Abdullah Dogan, Merve Akay, Prabal Datta Barua, Mehmet Baygin, Sengul Dogan, Turker Tuncer, Ali H. Dogru, U. Rajendra Acharya +10 more
TL;DR: In this article, novel prime pattern and tunable q-factor wavelet transform (TQWT) techniques were used to classify human emotions from electroencephalogram (EEG) signals.
Journal Article
Feature reduction for imbalanced data classification using similarity-based feature clustering with adaptive weighted K-nearest neighbors
TL;DR: Zhang et al. present a novel feature reduction method for imbalanced data classification using similarity-based feature clustering with adaptive weighted k-nearest neighbors (AWKNN).
Journal Article
Feature selection techniques in the context of big data: taxonomy and analysis
TL;DR: A comprehensive review of the latest FS approaches in the context of big data along with a structured taxonomy, which categorizes the existing methods based on their nature, search strategy, evaluation process, and feature structure and highlights the research issues and open challenges related to FS.
Journal Article
Practical multi-party private collaborative k-means clustering
TL;DR: Wang et al. propose a protocol for collaborative k-means clustering that protects the privacy of each data record; it is suitable for multi-party collaboration, allowing cluster centers to be updated without leaking data privacy.
References
Journal Article
Statistical Comparisons of Classifiers over Multiple Data Sets
TL;DR: A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.
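Both recommended tests are available in SciPy. A minimal sketch on synthetic accuracy scores (the numbers are purely illustrative, not results from any paper):

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# Accuracies of three classifiers over ten datasets (synthetic, for illustration).
rng = np.random.default_rng(42)
base = rng.uniform(0.70, 0.90, size=10)
clf_a = base                 # baseline
clf_b = base + 0.05          # consistently better
clf_c = base - 0.02          # consistently worse

# Friedman test: do the three classifiers perform equivalently across datasets?
f_stat, f_p = friedmanchisquare(clf_a, clf_b, clf_c)

# Wilcoxon signed-ranks test: pairwise comparison of two classifiers.
w_stat, w_p = wilcoxon(clf_a, clf_b)
```

With one classifier consistently ahead and one consistently behind, both tests reject the null hypothesis of equivalent performance; in practice the Friedman test is followed by post-hoc tests (e.g., Nemenyi) when it rejects.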
Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy
TL;DR: This work derives an equivalent form, called the minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection, and presents a two-stage feature selection algorithm that combines mRMR with other, more sophisticated feature selectors (e.g., wrappers).
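The first-order incremental criterion can be sketched with discrete mutual information (hypothetical toy data; the two-stage combination with wrapper selectors is omitted):

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())

def mutual_info(a, b):
    # I(A; B) = H(A) + H(B) - H(A, B)
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))

def mrmr(features, target, k):
    """Incremental mRMR: at each step add the feature j maximizing
    I(f_j; target) minus the mean of I(f_j; f_s) over selected features s."""
    selected, remaining = [], list(range(len(features)))
    while len(selected) < k and remaining:
        def score(j):
            rel = mutual_info(features[j], target)
            red = (sum(mutual_info(features[j], features[s]) for s in selected)
                   / len(selected)) if selected else 0.0
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A feature identical to the target carries one full bit of mutual information with it, so it is selected first ahead of independent or constant features.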
Journal Article
ML-KNN: A lazy learning approach to multi-label learning
Min-Ling Zhang, Zhi-Hua Zhou +1 more
TL;DR: Experiments on three different real-world multi-label learning problems, i.e. yeast gene functional analysis, natural scene classification and automatic web page categorization, show that ML-KNN achieves superior performance to some well-established multi-label learning algorithms.
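The core of ML-KNN is, per label, a MAP decision based on how many of a sample's k nearest neighbors carry that label, with frequencies estimated from the training set. The sketch below is a simplified rendering of that idea, not Zhang and Zhou's exact implementation:

```python
import numpy as np

def mlknn_predict(X_train, Y_train, X_test, k=3, s=1.0):
    """Simplified ML-KNN with Laplace smoothing parameter s."""
    n, q = Y_train.shape

    def knn(x, exclude=None):
        d = np.linalg.norm(X_train - x, axis=1)
        if exclude is not None:
            d[exclude] = np.inf                 # leave-one-out for training stats
        return np.argsort(d)[:k]

    # Smoothed prior probability that each label is present / absent.
    prior1 = (s + Y_train.sum(axis=0)) / (2 * s + n)
    prior0 = 1.0 - prior1

    # c1[l, j]: training samples that HAVE label l and have j neighbors with label l;
    # c0[l, j]: the same count for samples that LACK label l.
    c1 = np.zeros((q, k + 1))
    c0 = np.zeros((q, k + 1))
    for i in range(n):
        cnt = Y_train[knn(X_train[i], exclude=i)].sum(axis=0).astype(int)
        for l in range(q):
            (c1 if Y_train[i, l] == 1 else c0)[l, cnt[l]] += 1

    preds = np.zeros((len(X_test), q), dtype=int)
    for t, x in enumerate(X_test):
        cnt = Y_train[knn(x)].sum(axis=0).astype(int)
        for l in range(q):
            # Posterior is proportional to prior times the smoothed likelihood
            # of observing this neighbor count; decide by MAP per label.
            p1 = prior1[l] * (s + c1[l, cnt[l]]) / (s * (k + 1) + c1[l].sum())
            p0 = prior0[l] * (s + c0[l, cnt[l]]) / (s * (k + 1) + c0[l].sum())
            preds[t, l] = int(p1 >= p0)
    return preds
```

On two well-separated clusters with one label each, the classifier recovers the cluster label of a new point from its neighbors' label counts.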
Journal Article
Feature selection for multi-label naive Bayes classification
TL;DR: This paper proposes a method called Mlnb which adapts traditional naive Bayes classifiers to deal with multi-label instances and achieves comparable performance to other well-established multi-label learning algorithms.
Journal Article
Multilabel dimensionality reduction via dependence maximization
Yin Zhang, Zhi-Hua Zhou +1 more
TL;DR: Zhang et al. propose a multilabel dimensionality reduction method, MDDM, with two kinds of projection strategies, projecting the original data into a lower-dimensional feature space that maximizes the dependence between the original feature description and the associated class labels.
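The dependence measure maximized by MDDM is the Hilbert-Schmidt Independence Criterion (HSIC). A minimal empirical version with linear kernels is sketched below; the projection-learning (eigendecomposition) step of MDDM itself is omitted:

```python
import numpy as np

def hsic(X, Y):
    """Empirical HSIC with linear kernels: measures dependence between a
    feature representation X and a label matrix Y (both n x d arrays)."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    K = X @ X.T                           # kernel over features
    L = Y @ Y.T                           # kernel over labels
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

A representation identical to the labels yields a strictly positive HSIC, while one uncorrelated with them (after centering) yields zero, which is what makes HSIC a usable objective for choosing a projection.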