Linguistic Hedges Fuzzy Feature Selection for Differential Diagnosis of Erythemato-Squamous Diseases.
01 Jan 2012, pp. 487-500
TL;DR: In this article, a feature selection method based on a Linguistic Hedges Neural-Fuzzy classifier is presented for the diagnosis of erythemato-squamous diseases, and the performance of this system is evaluated using four training-test partition models: 50-50, 60-40, 70-30, and 80-20%.
Abstract: The differential diagnosis of erythemato-squamous diseases is a real challenge in dermatology. A biopsy is vital for diagnosing these diseases, but they also share many histopathological features. Another difficulty is that one disease may show the features of another disease at the beginning stage and develop its own characteristic features only at later stages. In this paper, a new feature selection method based on a Linguistic Hedges Neural-Fuzzy classifier is presented for the diagnosis of erythemato-squamous diseases. The performance of this system is evaluated using four training-test partition models: 50–50%, 60–40%, 70–30% and 80–20%. The highest classification accuracy, 95.7746%, was achieved for the 80–20% training-test partition using 3 clusters and 18 fuzzy rules, followed by 93.820% for the 50–50% partition using 3 clusters and 18 fuzzy rules, 92.5234% for the 70–30% partition using 5 clusters and 30 fuzzy rules, and 91.6084% for the 60–40% partition using 6 clusters and 36 fuzzy rules. The 80–20% training-test partition using 3 clusters and 18 fuzzy rules therefore gives the best classification accuracy, with an RMSE of 6.5139e-013. This research demonstrated that the proposed method can be used to reduce the dimension of the feature space and to obtain fast automatic diagnostic systems for other diseases.
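The evaluation protocol in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: `split_indices` and `rule_count` are hypothetical helpers, the sample count of 100 is a placeholder, and the relation "rules = clusters x 6" is an inference from the reported pairs (3/18, 5/30, 6/36) together with the assumption that the data set has six disease classes. The abstract itself does not state how the rule count is derived.

```python
import random

# The four partition models named in the abstract, as training fractions.
PARTITIONS = {"50-50": 0.5, "60-40": 0.6, "70-30": 0.7, "80-20": 0.8}

# Assumption (not stated in the abstract): six disease classes,
# with one fuzzy rule per (class, cluster) pair.
N_CLASSES = 6


def split_indices(n_samples, train_fraction, seed=0):
    """Shuffle the sample indices and cut them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def rule_count(clusters_per_class, n_classes=N_CLASSES):
    """Number of fuzzy rules if every class gets the same number of clusters."""
    return clusters_per_class * n_classes


if __name__ == "__main__":
    n_samples = 100  # hypothetical data set size, for illustration only
    for name, fraction in PARTITIONS.items():
        train, test = split_indices(n_samples, fraction)
        print(name, "train:", len(train), "test:", len(test))
    for clusters in (3, 5, 6):
        print(clusters, "clusters ->", rule_count(clusters), "rules")
```

A classifier would be trained on the `train` indices and scored on the `test` indices once per partition model; that step is omitted because the abstract gives no implementation details.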
Citations
TL;DR: A particle swarm optimization-based approach to training the NN (NN-PSO), capable of predicting structural failure of multistoried reinforced concrete buildings by detecting the future failure possibility of the building structure.
Abstract: Faulty structural design may cause multistory reinforced concrete (RC) buildings to collapse suddenly. Every effort is made to avoid structural failure, as it endangers human life and wastes time and property. Using traditional methods to predict structural failure of RC buildings is time-consuming and complex. Recent research has shown the potential of artificial neural networks (ANNs) for solving various real-life problems. Traditional learning algorithms tend to become trapped in local optima and converge prematurely, so it is challenging to achieve the expected accuracy when using them to train an ANN. To solve this problem, the present work proposes a particle swarm optimization-based approach to train the NN (NN-PSO). The PSO is employed to find a weight vector with minimum root-mean-square error (RMSE) for the NN. The proposed NN-PSO classifier is capable of predicting structural failure of multistoried reinforced concrete buildings by detecting the future failure possibility of the multistoried RC building structure. A database of 150 multistoried RC building structures was used in the experiments. The PSO algorithm was used to select the optimal weights for the NN classifier. Fifteen features were extracted from the structural design, of which nine were selected to perform the classification. Moreover, the NN-PSO model was compared with NN and MLP-FFN (multilayer perceptron feed-forward network) classifiers to assess its performance. The experimental results established the superiority of the proposed NN-PSO over the NN and MLP-FFN classifiers. The NN-PSO achieved 90% accuracy with 90% precision, 94.74% recall and 92.31% F-Measure.
252 citations
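As a quick consistency check on the figures reported above, the F-measure can be recomputed from the stated precision and recall. The sketch assumes the usual F1 definition (harmonic mean of precision and recall); the abstract does not say which F-measure variant was used, and `f_measure` is an illustrative helper name.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract.
    f1 = f_measure(0.90, 0.9474)
    print(f"F-measure: {100 * f1:.2f}%")
```

Under that assumption, the result agrees with the reported 92.31% to two decimal places.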
01 Apr 2015
TL;DR: Linguistic hedges neuro-fuzzy classifier with selected features (LHNFCSF) is presented for dimensionality reduction, feature selection and classification; the results suggest that the proposed method not only helps reduce the dimensionality of large data sets but can also speed up the computation time of a learning algorithm and simplify the classification tasks.
Abstract: Massive and complex data are generated every day in many fields. Complex data refer to data sets that are so large that conventional database management and data analysis tools are insufficient to deal with them. Management and analysis of medical big data involve many different issues regarding their structure, storage and analysis. In this paper, linguistic hedges neuro-fuzzy classifier with selected features (LHNFCSF) is presented for dimensionality reduction, feature selection and classification. Four real-world data sets are used to demonstrate the performance of the proposed neuro-fuzzy classifier. The new classifier is compared with other classifiers on different classification problems. The results indicate that applying LHNFCSF not only reduces the dimensions of the problem, but also improves classification performance by discarding redundant, noise-corrupted, or unimportant features. The results strongly suggest that the proposed method not only helps reduce the dimensionality of large data sets but can also speed up the computation time of a learning algorithm and simplify the classification tasks.
154 citations
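How a linguistic hedge can act as a feature selector may be easier to see in code. The sketch below assumes the common formulation in which a hedge is an exponent applied to a Gaussian membership degree, so that an exponent of 0 makes a feature irrelevant to a rule. The abstract does not give these details, and all function names and parameter values here are illustrative rather than taken from the paper.

```python
import math


def gaussian_membership(x, center, width):
    """Degree to which x belongs to a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def hedged_membership(x, center, width, hedge):
    """Apply a linguistic hedge as a power on the membership degree.

    hedge > 1 concentrates the set ("very"), 0 < hedge < 1 dilates it
    ("more or less"), and hedge == 0 gives 1.0 for every x, which
    removes the feature's influence on the rule.
    """
    return gaussian_membership(x, center, width) ** hedge


def rule_firing_strength(sample, centers, widths, hedges):
    """Product of the hedged memberships over all features of one rule."""
    strength = 1.0
    for x, c, w, p in zip(sample, centers, widths, hedges):
        strength *= hedged_membership(x, c, w, p)
    return strength


if __name__ == "__main__":
    sample = [0.2, 5.0]
    # The second feature has hedge 0, so its value does not matter.
    print(rule_firing_strength(sample, [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]))
    print(rule_firing_strength([0.2, -3.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]))
```

In this reading, features whose learned hedge values stay close to zero are candidates for removal, which is one way the dimensionality reduction described above could come about.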
TL;DR: A supervised feature selection method based on Rough Set Quick Reduct hybridized with Improved Harmony Search algorithm to deal with issues of high dimensionality in the medical dataset is presented.
Abstract: Feature selection is a process of selecting optimal features that produce the most prognostic outcome. It is one of the essential steps in knowledge discovery. The problem is that not all features are important. Most of the features may be redundant, and the rest may be irrelevant and noisy. This paper presents a novel feature selection approach to deal with issues of high dimensionality in medical datasets. Medical datasets are typically characterized by a large number of measurements and a comparatively small number of patient records. Most of these measurements are irrelevant or noisy. This paper proposes a supervised feature selection method based on Rough Set Quick Reduct hybridized with the Improved Harmony Search algorithm. Rough set theory is one of the most thriving methods used for feature selection. The Rough Set Improved Harmony Search Quick Reduct (RS-IHS-QR) algorithm is a relatively new population-based meta-heuristic optimization algorithm. This approach imitates the music improvisation process, where each musician improvises their instrument's pitch by searching for a perfect state of harmony. The quality of the reduced data is measured by the classification performance. The proposed algorithm is experimentally compared with the existing algorithms Rough Set Quick Reduct (RS-QR) and Rough Set Particle Swarm Optimization Quick Reduct (RS-PSO-QR). The number of features selected by the proposed method is comparatively low. The proposed algorithm achieves more than 90% classification accuracy in most cases, and the time taken to reduce the dataset is also lower than with the existing methods. The experimental results demonstrate the efficiency and effectiveness of the proposed algorithm.
112 citations
TL;DR: Jaya-based k-means is applied to divide the feature set into two mutually exclusive clusters and fire the fuzzy rule, and LH-based feature selecting capability of the proposed classifier not only reduces computation time but also improves the accuracy by discarding irrelevant features.
Abstract: The brain-computer interface (BCI) identifies brain patterns to translate thoughts into action. The identification relies on the performance of the classifier. In this paper, identification and monitoring of electroencephalogram-based BCI for motor imagery (MI) tasks is proposed using an efficient adaptive neuro-fuzzy classifier (NFC). The Jaya optimization algorithm is integrated with adaptive neuro-fuzzy inference systems to enhance classification accuracy. The linguistic hedge (LH) is used for proper elicitation and pruning of the fuzzy rules, and the network is trained using the scaled conjugate gradient (SCG) and speeding up SCG (SSCG) techniques. Jaya-based k-means is applied to divide the feature set into two mutually exclusive clusters and fire the fuzzy rule. The performance of the proposed classifier, a Jaya-based NFC that uses SSCG as its training algorithm and is powered by LH (JayaNFCSSCGLH), is compared with four different NFCs for classifying two-class MI-based tasks. We observed that computation time per iteration was 57.78% shorter with SSCG than with the SCG training technique. The LH-based feature selection capability of the proposed classifier not only reduces computation time but also improves accuracy by discarding irrelevant features. Lower computation time, fast convergence and high accuracy among the considered NFCs make it a suitable choice for real-time applications. The supremacy of JayaNFCSSCGLH among the considered classifiers is validated through the Friedman test. The classification result is used to control the switching of a light emitting diode, turning thoughts into action.
35 citations
TL;DR: Improved dominance-based rough set for classification of medical data is suggested, which can accurately classify medical datasets collected from UCI repository Web sites and gives higher accuracy.
Abstract: Feature selection and classification are widely used in many areas of science and engineering, as large datasets become increasingly common. In particular, bioscience and medical datasets routinely contain several thousand features. For effective data mining in such databases, many methods and techniques have been developed. Rough set is a mathematical theory for dealing with uncertainty. In the dominance-based extension of rough set, the set of objects is partitioned into pre-defined and preference-ordered classes, and the new rough set approach is able to approximate this partition by means of dominance relations. This paper suggests an improved dominance-based rough set for classification of medical data. Dominance-based rough set can handle ordinal attributes. This paper proposes a technique for applying dominance-based rough set to nominal attributes. The proposed work uses a decision table to determine the dominance relation, and then the improved dominance-based rough set is applied to find lower, upper and boundary approximations in the entire dataset. Attribute reduction based on the proposed technique is then applied to find the essential attributes required for classification. The proposed method can accurately classify medical datasets collected from the UCI repository Web sites. It is applied to seven different datasets: the heart disease dataset, Pima Indian diabetes dataset, Breast cancer Wisconsin dataset, heart valve dataset, jaundice dataset, dermatology dataset and lung cancer dataset. Compared in classification accuracy with rule-based classifiers (Zero R, decision table), tree-based classifiers (J48, Random forest, Random Tree), a neural network-based classifier (multilayer perceptron), lazy classifiers (IBk, KStar, LWL), a Bayesian-based classifier (Naive Bayes), the benchmark algorithm k-nearest-neighbour, and the classical rough set approach, the improved dominance-based rough set gives higher accuracy.
33 citations
References
01 Jan 1985
TL;DR: A mathematical tool to build a fuzzy model of a system where fuzzy implications and reasoning are used is presented and two applications of the method to industrial processes are discussed: a water cleaning process and a converter in a steel-making process.
Abstract: A mathematical tool to build a fuzzy model of a system where fuzzy implications and reasoning are used is presented. The premise of an implication is the description of fuzzy subspace of inputs and its consequence is a linear input-output relation. The method of identification of a system using its input-output data is then shown. Two applications of the method to industrial processes are also discussed: a water cleaning process and a converter in a steel-making process.
18,803 citations
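The model structure described in this abstract (fuzzy premise, linear consequent) can be illustrated with a small inference sketch. Several choices here are assumptions made for brevity and are not taken from the abstract: Gaussian membership functions, the product as the conjunction operator, and the two hand-written rules in `RULES`.

```python
import math


def gaussian(x, center, width):
    """Membership degree of x in a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


# Each rule has a fuzzy premise (one fuzzy set per input) and a linear
# consequent y = bias + sum(coeffs[i] * x[i]). Values are illustrative.
RULES = [
    {"centers": [0.0], "widths": [1.0], "coeffs": [1.0], "bias": 0.0},
    {"centers": [4.0], "widths": [1.0], "coeffs": [-1.0], "bias": 8.0},
]


def ts_output(inputs, rules):
    """Weighted average of the rule consequents, weighted by firing strength."""
    weighted_sum = 0.0
    weight_total = 0.0
    for rule in rules:
        # Firing strength: product of memberships (an assumed conjunction).
        strength = 1.0
        for x, c, w in zip(inputs, rule["centers"], rule["widths"]):
            strength *= gaussian(x, c, w)
        # Linear input-output relation of this rule.
        y = rule["bias"] + sum(a * x for a, x in zip(rule["coeffs"], inputs))
        weighted_sum += strength * y
        weight_total += strength
    return weighted_sum / weight_total


if __name__ == "__main__":
    for x in (0.0, 2.0, 4.0):
        print(x, ts_output([x], RULES))
```

Identification, as described in the abstract, would then amount to fitting the premise parameters and the linear coefficients to input-output data; the sketch covers inference only.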
01 May 1993
TL;DR: The architecture and learning procedure underlying ANFIS (adaptive-network-based fuzzy inference system) is presented, which is a fuzzy inference system implemented in the framework of adaptive networks.
Abstract: The architecture and learning procedure underlying ANFIS (adaptive-network-based fuzzy inference system) is presented, which is a fuzzy inference system implemented in the framework of adaptive networks. By using a hybrid learning procedure, the proposed ANFIS can construct an input-output mapping based on both human knowledge (in the form of fuzzy if-then rules) and stipulated input-output data pairs. In the simulation, the ANFIS architecture is employed to model nonlinear functions, identify nonlinear components on-line in a control system, and predict a chaotic time series, all yielding remarkable results. Comparisons with artificial neural networks and earlier work on fuzzy modeling are listed and discussed. Other extensions of the proposed ANFIS and promising applications to automatic control and signal processing are also suggested.
15,085 citations
TL;DR: Experiments show that SCG is considerably faster than BP, CGL, and BFGS, and avoids a time consuming line search.
Abstract: A supervised learning algorithm (Scaled Conjugate Gradient, SCG) is introduced. The performance of SCG is benchmarked against that of the standard back propagation algorithm (BP) (Rumelhart, Hinton, & Williams, 1986), the conjugate gradient algorithm with line search (CGL) (Johansson, Dowla, & Goodman, 1990) and the one-step Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton algorithm (BFGS) (Battiti, 1990). SCG is fully automated, includes no critical user-dependent parameters, and avoids a time-consuming line search, which CGL and BFGS use in each iteration in order to determine an appropriate step size. Experiments show that SCG is considerably faster than BP, CGL, and BFGS.
3,882 citations
01 Mar 1995
TL;DR: The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models; fuzzy models under this framework (ANFIS) possess certain advantages over neural networks.
Abstract: Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models. The fuzzy models under the framework of adaptive networks are called adaptive-network-based fuzzy inference systems (ANFIS), which possess certain advantages over neural networks. We introduce the design methods for ANFIS in both modeling and control applications. Current problems and future directions for neuro-fuzzy approaches are also addressed.
2,260 citations
Book
01 Jan 2006
TL;DR: This book discusses Feature Extraction for Classification of Proteomic Mass Spectra, Sequence Motifs: Highly Predictive Features of Protein Function, and Combining a Filter Method with SVMs.
Abstract: An Introduction to Feature Extraction.- An Introduction to Feature Extraction.- Feature Extraction Fundamentals.- Learning Machines.- Assessment Methods.- Filter Methods.- Search Strategies.- Embedded Methods.- Information-Theoretic Methods.- Ensemble Learning.- Fuzzy Neural Networks.- Feature Selection Challenge.- Design and Analysis of the NIPS2003 Challenge.- High Dimensional Classification with Bayesian Neural Networks and Dirichlet Diffusion Trees.- Ensembles of Regularized Least Squares Classifiers for High-Dimensional Problems.- Combining SVMs with Various Feature Selection Strategies.- Feature Selection with Transductive Support Vector Machines.- Variable Selection using Correlation and Single Variable Classifier Methods: Applications.- Tree-Based Ensembles with Dynamic Soft Feature Selection.- Sparse, Flexible and Efficient Modeling using L1 Regularization.- Margin Based Feature Selection and Infogain with Standard Classifiers.- Bayesian Support Vector Machines for Feature Ranking and Selection.- Nonlinear Feature Selection with the Potential Support Vector Machine.- Combining a Filter Method with SVMs.- Feature Selection via Sensitivity Analysis with Direct Kernel PLS.- Information Gain, Correlation and Support Vector Machines.- Mining for Complex Models Comprising Feature Selection and Classification.- Combining Information-Based Supervised and Unsupervised Feature Selection.- An Enhanced Selective Naive Bayes Method with Optimal Discretization.- An Input Variable Importance Definition based on Empirical Data Probability Distribution.- New Perspectives in Feature Extraction.- Spectral Dimensionality Reduction.- Constructing Orthogonal Latent Features for Arbitrary Loss.- Large Margin Principles for Feature Selection.- Feature Extraction for Classification of Proteomic Mass Spectra: A Comparative Study.- Sequence Motifs: Highly Predictive Features of Protein Function.
1,593 citations