Shaimaa Ahmed El-Said
Other affiliations: Princess Nora bint Abdul Rahman University
Bio: Shaimaa Ahmed El-Said is an academic researcher from Zagazig University. The author has contributed to research in topics: Wireless sensor network & Intrusion detection system. The author has an h-index of 10 and has co-authored 25 publications receiving 476 citations. Previous affiliations of Shaimaa Ahmed El-Said include Princess Nora bint Abdul Rahman University.
TL;DR: The experimental results reveal that these SVM classifiers achieve very fast, simple, and efficient breast cancer diagnosis and strongly suggest that LPSVM can aid in the diagnosis of breast cancer.
Abstract: Support vector machine (SVM) is a supervised machine learning approach that was recognized as a statistical learning apotheosis for the small-sample database. SVM has shown excellent learning and generalization ability and has been extensively employed in many areas. This paper presents a performance analysis of six types of SVMs for the diagnosis of the classical Wisconsin breast cancer problem from a statistical point of view. The classification performance of the standard SVM (St-SVM) is analyzed and compared with that of modified classifiers: proximal support vector machine (PSVM), Lagrangian support vector machine (LSVM), finite Newton method for Lagrangian support vector machine (NSVM), linear programming support vector machine (LPSVM), and smooth support vector machine (SSVM). The experimental results reveal that these SVM classifiers achieve very fast, simple, and efficient breast cancer diagnosis. The training results indicated that LSVM has the lowest accuracy (95.6107 %), while St-SVM performed better than the other methods on all performance indices (accuracy = 97.71 %) and is closely followed by LPSVM (accuracy = 97.3282 %). However, in the validation phase, the overall accuracy of LPSVM reached 97.1429 %, which was superior to LSVM (95.4286 %), SSVM (96.5714 %), PSVM (96 %), NSVM (96.5714 %), and St-SVM (94.86 %). The ROC and MCC values for LPSVM were 0.9938 and 0.9369, respectively, which outperformed the other classifiers. The results strongly suggest that LPSVM can aid in the diagnosis of breast cancer.
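As a rough illustration of the two headline metrics in this abstract, accuracy and the Matthews correlation coefficient (MCC) can both be computed from a binary confusion matrix. The sketch below is not the paper's code. The four counts are invented for illustration; only their total of 175 is chosen deliberately, because 170 correct predictions out of 175 happens to reproduce the 97.1429 % validation figure arithmetically. The abstract itself does not state the validation set size.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is taken as 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical counts (not from the paper): 170 correct out of 175.
tp, tn, fp, fn = 58, 112, 3, 2

print(f"{100 * accuracy(tp, tn, fp, fn):.4f}")  # 97.1429
print(mcc(tp, tn, fp, fn))
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy on class-imbalanced medical data sets.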
TL;DR: Three classification algorithms, multi-layer perceptron (MLP), radial basis function (RBF) and probabilistic neural networks (PNN), are applied for the detection and classification of breast cancer; PNN was the best classifier, achieving accuracy rates of 100 and 97.66 % in the training and testing phases, respectively.
Abstract: Among cancers, breast cancer causes the second-highest number of deaths in women. To reduce the high number of unnecessary breast biopsies, several computer-aided diagnosis systems have been proposed in recent years. These systems help physicians decide whether to perform a breast biopsy on a suspicious lesion seen in a mammogram or to perform a short-term follow-up examination instead. In clinical diagnosis, artificial intelligence techniques such as neural networks have shown great potential in this field. In this paper, three classification algorithms, multi-layer perceptron (MLP), radial basis function (RBF) and probabilistic neural networks (PNN), are applied for the detection and classification of breast cancer. Decision making is performed in two stages: training the classifiers with features from the Wisconsin Breast Cancer database and then testing. The performance of the proposed structure is evaluated in terms of sensitivity, specificity, accuracy and ROC. The results revealed that PNN was the best classifier, achieving accuracy rates of 100 and 97.66 % in the training and testing phases, respectively. MLP ranked second, achieving 97.80 and 96.34 % classification accuracy in the training and validation phases, respectively, using the scaled conjugate gradient learning algorithm. RBF performed better than MLP in the training phase but achieved the lowest accuracy in the validation phase.
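For readers unfamiliar with the PNN mentioned above, the following is a minimal sketch of the general technique, not the paper's implementation. Each class score is the average Gaussian kernel response over that class's training samples, and the class with the highest score wins. The function name `pnn_predict`, the smoothing parameter `sigma=0.5`, and the toy two-feature data are all assumptions made for illustration.

```python
import math

def pnn_predict(x, train, sigma=0.5):
    """Classify x with a probabilistic neural network (Parzen-window estimate).

    train maps each class label to a list of training feature vectors.
    """
    scores = {}
    for label, samples in train.items():
        total = 0.0
        for s in samples:
            # Pattern layer: Gaussian kernel centred on one training sample.
            d2 = sum((a - b) ** 2 for a, b in zip(x, s))
            total += math.exp(-d2 / (2 * sigma ** 2))
        # Summation layer: average kernel response for this class.
        scores[label] = total / len(samples)
    # Decision layer: pick the class with the largest estimated density.
    return max(scores, key=scores.get)

# Toy, made-up data with two features per sample.
train = {
    "benign": [[1.0, 1.0], [1.2, 0.9]],
    "malignant": [[5.0, 5.0], [4.8, 5.1]],
}
print(pnn_predict([1.1, 1.0], train))  # benign
```

Because every training sample sits at the centre of its own kernel, a PNN with a small `sigma` tends to reproduce its training labels closely, so a training accuracy near 100 % does not by itself indicate good generalization.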
TL;DR: A comparison between hard and fuzzy clustering algorithms on a thyroid disease data set in order to find the optimal number of clusters; some recommendations are formulated to improve the determination of the actual number of clusters present in the data set.
Abstract: Thyroid hormones produced by the thyroid gland help regulate the body's metabolism. A variety of methods have been proposed in the literature for thyroid disease classification. As far as we know, clustering techniques have not so far been applied to thyroid disease data sets. This paper proposes a comparison between hard and fuzzy clustering algorithms on a thyroid disease data set in order to find the optimal number of clusters. Different scalar validity measures are used to compare the performance of the proposed clustering systems. To demonstrate the performance of each algorithm, the feature values that represent thyroid disease are used as input to the system. Several runs are carried out and recorded, with a different number of clusters specified for each run (between 2 and 11), so as to establish the optimum number of clusters. To find the optimal number of clusters, the so-called elbow criterion is applied. The experimental results revealed that, for all algorithms, the elbow was located at c=3. The clustering results for all algorithms are then visualized by the Sammon mapping method, which finds a low-dimensional (normally 2D or 3D) representation of a set of points distributed in a high-dimensional pattern space. At the end of this study, some recommendations are formulated to improve the determination of the actual number of clusters present in the data set.
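The elbow criterion described above can be sketched as follows. This is one common heuristic, not necessarily the procedure used in the paper: given a clustering cost for each consecutive cluster count c, choose the c where the cost curve bends most sharply, measured here by the largest second difference. The cost values are made up for illustration, and the function name `elbow` is hypothetical.

```python
def elbow(costs):
    """Return the cluster count with the sharpest bend in the cost curve.

    costs maps consecutive cluster counts c to a clustering cost, such as
    the within-cluster sum of squares. The bend at c is the drop in cost
    gained by reaching c minus the drop gained by going one step further.
    """
    cs = sorted(costs)
    best_c, best_bend = None, float("-inf")
    for prev, cur, nxt in zip(cs, cs[1:], cs[2:]):
        bend = (costs[prev] - costs[cur]) - (costs[cur] - costs[nxt])
        if bend > best_bend:
            best_c, best_bend = cur, bend
    return best_c

# Hypothetical costs for c = 2..11 (not taken from the paper).
costs = {2: 100.0, 3: 40.0, 4: 35.0, 5: 31.0, 6: 28.0,
         7: 26.0, 8: 24.5, 9: 23.5, 10: 23.0, 11: 22.8}
print(elbow(costs))  # 3
```

Note that the first and last cluster counts in the range can never be selected by this heuristic, since a second difference needs a neighbour on each side.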
TL;DR: The results strongly suggest that ANFCLH can aid in the diagnosis of breast cancer and can be very helpful to the physicians for their final decision on their patients.
Abstract: Although the adaptive neuro-fuzzy inference system (ANFIS) has a very fast convergence time, it is not suitable for classification problems because its outputs are not integers. To overcome this problem, this paper provides four adaptive neuro-fuzzy classifiers: adaptive neuro-fuzzy classifier with linguistic hedges (ANFCLH), linguistic hedges neuro-fuzzy classifier with selected features (LHNFCSF), scaled conjugate gradient neuro-fuzzy classifier (SCGNFC) and speeding up scaled conjugate gradient neuro-fuzzy classifier (SSCGNFC). These classifiers are used to achieve very fast, simple and efficient breast cancer diagnosis. Both the SCGNFC and SSCGNFC systems are optimized by scaled conjugate gradient algorithms. In these two systems, the k-means algorithm is used to initialize the fuzzy rules. Also, only the Gaussian membership function is used for fuzzy set descriptions, because of its simple derivative expressions. The other two systems are based on linguistic hedges (LH) tuned by scaled conjugate gradient. The classifiers' performances are analyzed and compared by applying them to breast cancer diagnosis. The results indicated that SCGNFC, SSCGNFC and ANFCLH achieved the same accuracy of 97.6608 % in the training phase, while LHNFCSF performed better than the other methods in the training phase by achieving an accuracy of 100 %. In the testing phase, the overall accuracy of LHNFCSF reached 97.8038 %, which is also superior to the other methods. Applying LHNFCSF not only reduces the dimensions of the problem, but also improves classification performance by discarding redundant, noise-corrupted or unimportant features. Also, the k-means clustering algorithm was used to determine the membership functions of each feature. LHNFCSF achieved a mean RMSE value of 0.0439 in the training phase after feature selection and gives the best recognition rates of 98.8304 and 98.0469 % during the training and testing phases, respectively, using two clusters for each class. The results strongly suggest that ANFCLH can aid in the diagnosis of breast cancer and can be very helpful to the physicians for their final decision on their patients.
01 Sep 2015
TL;DR: The results indicate that the proposed OFCM algorithm can effectively decrease the mean square deviation of color quantization and preserve the overall structure and local characteristic detail in image reconstruction; its performance is compared with that of three other quantization algorithms.
Abstract: Most image compression algorithms suffer from several drawbacks: high computational complexity, moderate reconstructed picture quality, and a variable bit rate. In this paper, an efficient color image quantization technique based on an optimized Fuzzy C-means (OFCM) algorithm is proposed. It exploits the optimization capability of the improved artificial fish swarm algorithm to overcome the shortcomings of the Fuzzy C-means algorithm. It uses error diffusion algorithms to obtain perceptually better images after quantization. Experiments are carried out to estimate the performance of the proposed OFCM algorithm in image compression using a standard image set. The results indicate that the algorithm can effectively decrease the mean square deviation of color quantization and preserve the overall structure and local characteristic detail in image reconstruction. The performance of the proposed technique is compared with that of three other quantization algorithms. The comparative results confirmed that OFCM has potential in terms of both accuracy and perceptual quality compared with recent methods in the literature.
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
TL;DR: New features based on the 2D and 3D PSRs of IMFs have been proposed for classification of epileptic seizure and seizure-free EEG signals.
Abstract: We propose new features for classification of epileptic seizure EEG signals. Features were extracted from PSR of IMFs of EEG signals. We define ellipse area of 2D PSR and IQR of Euclidean distance of 3D PSR as features. LS-SVM classifier has been used for classification with the proposed features. Results were compared with other existing methods studied on the same EEG dataset. Epileptic seizure is the most common disorder of the human brain, which is generally detected from electroencephalogram (EEG) signals. In this paper, we have proposed new features based on the phase space representation (PSR) for classification of epileptic seizure and seizure-free EEG signals. The EEG signals are first decomposed using empirical mode decomposition (EMD), and the phase space has been reconstructed for the obtained intrinsic mode functions (IMFs). For the purpose of classification of epileptic seizure and seizure-free EEG signals, two-dimensional (2D) and three-dimensional (3D) PSRs have been used. New features based on the 2D and 3D PSRs of IMFs have been proposed for classification of epileptic seizure and seizure-free EEG signals. Two measures have been defined, namely the 95% confidence ellipse area for the 2D PSR and the interquartile range (IQR) of the Euclidean distances for the 3D PSR of IMFs of EEG signals. These measured parameters show a significant difference between epileptic seizure and seizure-free EEG signals. The combination of these measured parameters for different IMFs has been utilized to form the feature set for classification of epileptic seizure EEG signals. Least squares support vector machine (LS-SVM) has been employed for classification of epileptic seizure and seizure-free EEG signals, and its classification performance has been evaluated using different kernels, namely radial basis function (RBF), Mexican hat wavelet and Morlet wavelet kernels. Simulation results with various performance parameters of the classifier have been included to show the effectiveness of the proposed method for classification of epileptic seizure and seizure-free EEG signals.
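The 3D PSR feature described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' code. It assumes the phase space is built by time-delay embedding with a delay of one sample and that the "Euclidean distances" are the distances of the embedded points from the origin. The EMD step is omitted, so the input here is a plain synthetic signal standing in for one IMF. The function names are hypothetical.

```python
import math
import statistics

def phase_space(signal, dim=3, delay=1):
    """Time-delay embedding: each point is (x[n], x[n+delay], x[n+2*delay], ...)."""
    n = len(signal) - (dim - 1) * delay
    return [tuple(signal[i + k * delay] for k in range(dim)) for i in range(n)]

def iqr_of_distances(points):
    """Interquartile range of the Euclidean distances of the points from the origin."""
    dists = [math.sqrt(sum(c * c for c in p)) for p in points]
    q1, _, q3 = statistics.quantiles(dists, n=4)
    return q3 - q1

# Synthetic stand-in for one IMF (a real pipeline would obtain IMFs via EMD).
imf = [math.sin(0.3 * i) for i in range(200)]
feature = iqr_of_distances(phase_space(imf, dim=3, delay=1))
print(feature)
```

Scaling the signal amplitude scales this feature by the same factor, which hints at why it could separate seizure from seizure-free segments when their IMF amplitudes differ.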
TL;DR: Efficient detection of epileptic seizure is achieved when seizure events appear for long duration in hours-long EEG recordings, and the proposed method develops a time–frequency plane for multivariate signals and builds patient-specific models for EEG seizure detection.
Abstract: Objective: This paper investigates the multivariate oscillatory nature of electroencephalogram (EEG) signals in adaptive frequency scales for epileptic seizure detection. Methods: The empirical wavelet transform (EWT) has been explored for multivariate signals in order to determine the joint instantaneous amplitudes and frequencies in signal-adaptive frequency scales. The proposed multivariate extension of EWT has been studied on a multivariate multicomponent synthetic signal, as well as on multivariate EEG signals of the Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) scalp EEG database. In a moving-window-based analysis, 2-s-duration multivariate EEG signal epochs containing five automatically selected channels have been decomposed, and three features have been extracted from each 1-s part of the 2-s-duration joint instantaneous amplitudes of multivariate EEG signals. The extracted features from each oscillatory level have been processed using a proposed feature processing step, and joint features have been computed in order to achieve better discrimination of seizure and seizure-free EEG signal epochs. Results: The proposed detection method has been evaluated over 177 h of EEG records using six classifiers. We have achieved average sensitivity, specificity, and accuracy values of 97.91%, 99.57%, and 99.41%, respectively, using the tenfold cross-validation method, which are higher than those of the compared state-of-the-art methods studied on this database. Conclusion: Efficient detection of epileptic seizure is achieved when seizure events appear for long duration in hours-long EEG recordings. Significance: The proposed method develops a time–frequency plane for multivariate signals and builds patient-specific models for EEG seizure detection.
TL;DR: The proposed method is able to differentiate focal and non-focal EEG signals with an average classification accuracy of 87%, and its entropy measures can be useful in assessing the nonlinear interrelation and complexity of focal and non-focal EEG signals.
Abstract: The brain is a complex structure made up of interconnected neurons, and its electrical activities can be evaluated using electroencephalogram (EEG) signals. The characteristics of the brain area affected by partial epilepsy can be studied using focal and non-focal EEG signals. In this work, a method for the classification of focal and non-focal EEG signals is presented using entropy measures. These entropy measures can be useful in assessing the nonlinear interrelation and complexity of focal and non-focal EEG signals. The EEG signals are first decomposed using the empirical mode decomposition (EMD) method to extract intrinsic mode functions (IMFs). The entropy features, namely average Shannon entropy (ShEnAvg), average Renyi's entropy (RenEnAvg), average approximate entropy (ApEnAvg), average sample entropy (SpEnAvg) and average phase entropies (S1Avg and S2Avg), are computed from different IMFs of focal and non-focal EEG signals. These entropies are used as the input feature set for the least squares support vector machine (LS-SVM) classifier to classify the signals as focal or non-focal. Experimental results show that the proposed method is able to differentiate focal and non-focal EEG signals with an average classification accuracy of 87%.
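Of the entropy features listed above, Shannon entropy is the simplest to illustrate. The sketch below is a generic histogram-based estimate and an average over several components, offered only as an assumption about how such a feature could be computed. The bin count of 8, the function names, and the toy stand-in "IMFs" are all hypothetical, and the EMD step is not implemented.

```python
import math
from collections import Counter

def shannon_entropy(signal, bins=8):
    """Histogram-based Shannon entropy of a signal, in bits."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no uncertainty
    width = (hi - lo) / bins
    # Assign each sample to a bin; the maximum value goes in the last bin.
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in signal)
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def average_entropy(imfs, bins=8):
    """Mean Shannon entropy over a list of components (stand-ins for IMFs)."""
    return sum(shannon_entropy(imf, bins) for imf in imfs) / len(imfs)

# Toy stand-ins for IMFs (a real pipeline would obtain them via EMD).
imfs = [
    [math.sin(0.2 * i) for i in range(256)],
    [math.sin(0.05 * i) for i in range(256)],
]
print(average_entropy(imfs))
```

With 8 bins the estimate is bounded above by 3 bits, which is reached only when the samples are spread evenly across all bins.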
TL;DR: A particle swarm optimization-based approach to train the NN (NN-PSO), capable of tackling the problem of predicting structural failure of multistoried reinforced concrete buildings by detecting the possibility of future failure of the building structure.
Abstract: Faulty structural design may cause multistory reinforced concrete (RC) buildings to collapse suddenly. Every effort is directed at avoiding structural failure, as it endangers human life and wastes time and property. Using traditional methods to predict structural failure of RC buildings is time-consuming and complex. Recent research has demonstrated the potential of artificial neural networks (ANNs) in solving various real-life problems. Traditional learning algorithms suffer from becoming trapped in local optima with premature convergence. Thus, it is a challenging task to achieve the expected accuracy while using traditional learning algorithms to train an ANN. To solve this problem, the present work proposes a particle swarm optimization-based approach to train the NN (NN-PSO). PSO is employed to find a weight vector with minimum root-mean-square error (RMSE) for the NN. The proposed NN-PSO classifier is capable of tackling the problem of predicting structural failure of multistoried reinforced concrete buildings by detecting the possibility of future failure of the multistoried RC building structure. A database of 150 multistoried RC building structures was employed in the experiments. The PSO algorithm was used to select the optimal weights for the NN classifier. Fifteen features were extracted from the structural design, of which nine were selected to perform the classification process. Moreover, the NN-PSO model was compared with NN and MLP-FFN (multilayer perceptron feed-forward network) classifiers to assess its effectiveness. The experimental results established the superiority of the proposed NN-PSO over the NN and MLP-FFN classifiers. The NN-PSO achieved 90 % accuracy with 90 % precision, 94.74 % recall and 92.31 % F-Measure.
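The reported F-Measure can be checked directly from the reported precision and recall, since the F-Measure is their harmonic mean. The sketch below does that arithmetic. The confusion-matrix counts in its second half are an assumption: they are one small set of counts that happens to reproduce all four reported figures, and the abstract does not state the test set size.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1 score)."""
    return 2 * precision * recall / (precision + recall)

# Check using the figures reported in the abstract.
print(f"{100 * f_measure(0.90, 0.9474):.2f}")  # 92.31

# Hypothetical counts (not from the paper) consistent with those figures.
tp, tn, fp, fn = 18, 9, 2, 1
precision = tp / (tp + fp)                   # 18 / 20
recall = tp / (tp + fn)                      # 18 / 19
acc = (tp + tn) / (tp + tn + fp + fn)        # 27 / 30
print(f"{100 * precision:.2f} {100 * recall:.2f} {100 * acc:.2f}")  # 90.00 94.74 90.00
```

That the four figures are mutually consistent is a useful sanity check, but it says nothing about how the test samples were actually drawn from the 150-building database.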