Author

Hany A. Elsalamony

Other affiliations: Salman bin Abdulaziz University
Bio: Hany A. Elsalamony is an academic researcher from Helwan University. The author has contributed to research in the topics of object detection and matching (statistics). The author has an h-index of 7 and has co-authored 15 publications receiving 167 citations. Previous affiliations of Hany A. Elsalamony include Salman bin Abdulaziz University.

Papers
Journal ArticleDOI
TL;DR: Introduces analyses and applications of four of the most important techniques in data mining: the multilayer perceptron neural network (MLPNN); tree augmented Naive Bayes (TAN), a form of Bayesian network; nominal or logistic regression (LR); and Ross Quinlan's decision tree model C5.0.
Abstract: Bank marketing campaigns depend on customers' huge stores of electronic data, far too large for a human analyst to extract the interesting information that supports decision-making. Data mining models are therefore a great help in running these campaigns. This paper introduces analyses and applications of four of the most important techniques in data mining: the multilayer perceptron neural network (MLPNN); tree augmented Naive Bayes (TAN), a form of Bayesian network; nominal or logistic regression (LR); and Ross Quinlan's decision tree model C5.0. The objective is to examine the performance of the MLPNN, TAN, LR and C5.0 techniques on real-world bank deposit subscription data. The purpose is to increase campaign effectiveness by identifying the main characteristics that affect success (the deposit being subscribed by the client). The experimental results demonstrate, with high accuracies, the success of these models in predicting the best campaign contacts with clients likely to subscribe to a deposit. Performance is measured by three statistical measures: classification accuracy, sensitivity, and specificity.
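The paper's comparison can be reproduced in outline with off-the-shelf classifiers. Below is a minimal sketch, assuming a CSV file with a binary "deposit" target column; the file name and column names are placeholders. scikit-learn ships no TAN or C5.0 implementation, so Gaussian Naive Bayes and a CART decision tree stand in for them, and the three reported measures are derived from the confusion matrix.

```python
# Hedged sketch: comparing the four classifier families named in the paper on a
# bank-marketing-style dataset. "bank_marketing.csv" and "deposit" are
# placeholders; NB and CART stand in for TAN and C5.0.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

df = pd.read_csv("bank_marketing.csv")            # hypothetical file
X = pd.get_dummies(df.drop(columns=["deposit"]))  # one-hot encode categoricals
y = df["deposit"]                                 # 1 = client subscribed

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "MLPNN": MLPClassifier(max_iter=500),
    "NB (TAN stand-in)": GaussianNB(),
    "LR": LogisticRegression(max_iter=1000),
    "Tree (C5.0 stand-in)": DecisionTreeClassifier(),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, model.predict(X_te)).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    print(f"{name}: acc={accuracy:.3f} sens={sensitivity:.3f} spec={specificity:.3f}")
```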

63 citations

Journal ArticleDOI
01 Apr 2016-Micron
TL;DR: A proposed algorithm for detecting and counting three types of anaemia-infected red blood cells in a microscopic colour image using the circular Hough transform and morphological tools; the algorithm has demonstrated high accuracy in analysing healthy and unhealthy cells.
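The core detection step can be sketched with OpenCV. This is a minimal sketch, not the paper's pipeline: all parameter values are illustrative and "smear.png" is a placeholder path.

```python
# Hedged sketch of the detection-and-counting idea: morphological cleanup
# followed by a circular Hough transform on a microscope image.
import cv2

img = cv2.imread("smear.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)                       # suppress speckle noise

# Morphological opening removes small debris before circle detection.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
clean = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)

# Circular Hough transform: each detected circle is a candidate red blood cell.
circles = cv2.HoughCircles(clean, cv2.HOUGH_GRADIENT, dp=1.2, minDist=18,
                           param1=60, param2=30, minRadius=8, maxRadius=25)
count = 0 if circles is None else circles.shape[1]
print(f"detected {count} circular cells")
```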

43 citations

Journal ArticleDOI
TL;DR: The objective of this paper is to examine the performance of recently invented decision tree modelling algorithms, compared with that achieved by a radial basis function kernel support vector machine (RBFSVM), on the diagnosis of breast cancer using a cytologically proven tumour dataset.
Abstract: Breast cancer is the most common type of cancer in women today and the second most important cause of cancer deaths among them. Disease diagnosis is one of the applications where data mining tools are producing successful results. Data mining with decision trees is a popular and effective classification approach. Decision trees can generate understandable classification rules, which are a very efficient tool for transferring knowledge to physicians and medical specialists. In essence, they provide a way to find rules that can be evaluated for separating the input samples into one of several groups without having to state the functional relationship directly. The objective of this paper is to examine the performance of recently invented decision tree modelling algorithms, compared with that achieved by a radial basis function kernel support vector machine (RBFSVM), on the diagnosis of breast cancer using a cytologically proven tumour dataset. Four decision tree models have been evaluated, among them Chi-squared Automatic Interaction Detection (CHAID) and Classification and Regression Tree (CRT); performance is measured by classification accuracy, sensitivity, and specificity.
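The head-to-head comparison can be sketched in a few lines. A minimal sketch, assuming scikit-learn's bundled Wisconsin breast cancer data as a stand-in for the paper's cytological dataset, with a CART tree standing in for CHAID/CRT, scored with the same three measures:

```python
# Hedged sketch: decision tree vs. RBF-kernel SVM on a breast cancer dataset,
# reporting accuracy, sensitivity, and specificity from the confusion matrix.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "CART tree (CHAID/CRT stand-in)": DecisionTreeClassifier(max_depth=5),
    "RBF-SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, m.predict(X_te)).ravel()
    print(f"{name}: acc={(tp + tn) / (tp + tn + fp + fn):.3f} "
          f"sens={tp / (tp + fn):.3f} spec={tn / (tn + fp):.3f}")
```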

31 citations

Journal ArticleDOI
TL;DR: The proposed algorithm presents the circular Hough transform, watershed segmentation, and mathematical morphology functions as effective methods for detecting normal blood cells, while the anaemia types are classified based on their shape signatures.
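Watershed segmentation is what splits touching cells into separate regions. A minimal sketch of that step, assuming scikit-image and scipy; the thresholds are illustrative and "smear.png" is a placeholder:

```python
# Hedged sketch of watershed splitting of touching cells: threshold, distance
# transform, marker seeding, then watershed.
import numpy as np
from scipy import ndimage as ndi
from skimage import io, color, filters
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

img = io.imread("smear.png")
gray = color.rgb2gray(img)
mask = gray < filters.threshold_otsu(gray)         # cells darker than background

distance = ndi.distance_transform_edt(mask)        # peaks ~ cell centres
coords = peak_local_max(distance, min_distance=10, labels=mask)
markers = np.zeros(distance.shape, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)

labels = watershed(-distance, markers, mask=mask)  # one label per cell
print(f"segmented {labels.max()} cells")
```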

22 citations

Journal ArticleDOI
TL;DR: A proposed algorithm is presented for detecting some anaemia types, such as sickle cells and elliptocytosis, and counting them together with healthy cells in human red blood smears, based on the circular Hough transform and morphological tools.
Abstract: The principal way to diagnose anaemia is to measure the level of haemoglobin and to classify red blood cells by microscopic examination of blood smears. This paper presents a proposed algorithm for detecting some anaemia types, such as sickle cells and elliptocytosis, and counting them together with healthy cells in human red blood smears, based on the circular Hough transform and morphological tools. Some cells with unknown shapes (neither platelets nor white cells) have also been detected. The data extracted by the detection process has been analysed by a neural network. The experimental results have demonstrated high accuracy: the proposed algorithm detected around 98.9% of all the cells in 27 microscopic images, and effectiveness rates of up to 100%, 98%, and 99.3% have been achieved by the neural networks for sickle cells, elliptocytosis, and cells with unknown shapes, respectively.
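The second stage, feeding extracted per-cell data to a neural network, can be sketched as follows. This is a hedged sketch, not the paper's configuration: the features (circularity and eccentricity), the labels, and the tiny training arrays are all hypothetical placeholders.

```python
# Hedged sketch: per-cell shape features fed to a small neural network.
import numpy as np
from skimage.measure import label, regionprops
from sklearn.neural_network import MLPClassifier

def shape_features(binary_mask):
    """Circularity and eccentricity for each connected cell region."""
    feats = []
    for r in regionprops(label(binary_mask)):
        circularity = 4 * np.pi * r.area / (r.perimeter ** 2 + 1e-9)
        feats.append([circularity, r.eccentricity])
    return np.array(feats)

# Hypothetical training rows of [circularity, eccentricity] with labels
# 0 = healthy, 1 = sickle, 2 = elliptocytosis; real features would come from
# shape_features() applied to the detected cell masks.
X_train = np.array([[0.95, 0.2], [0.40, 0.9], [0.70, 0.8]])
y_train = np.array([0, 1, 2])
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000).fit(X_train, y_train)
```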

15 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
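The mail-filtering example in the fourth category is easy to make concrete. A toy sketch, assuming scikit-learn; the messages and labels are made up for illustration:

```python
# Hedged toy version of the per-user mail filter: learn the filtering rules
# from labelled examples instead of hand-writing them.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = ["win a free prize now", "meeting moved to 3pm",
            "cheap loans click here", "lunch tomorrow?"]
labels = [1, 0, 1, 0]                     # 1 = rejected by this user

filter_model = make_pipeline(CountVectorizer(), MultinomialNB())
filter_model.fit(messages, labels)        # the "learned prediction rules"
print(filter_model.predict(["free loans, click now"]))   # likely [1]
```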

13,246 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are presented, along with a discussion of combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
Hakan Gunduz
TL;DR: Experimental results show that the second framework is very promising: it learns deep features from each feature set via parallel convolution layers, and these features are successful at distinguishing PD patients from healthy individuals and effective in boosting the discriminative power of the classifiers.
Abstract: Parkinson's Disease (PD) is a progressive neurodegenerative disease with multiple motor and non-motor characteristics. PD patients commonly face vocal impairments during the early stages of the disease, so diagnosis systems based on vocal disorders are at the forefront of recent PD detection studies. Our study proposes two frameworks based on Convolutional Neural Networks (CNNs) to classify PD using sets of vocal (speech) features. Although both frameworks combine various feature sets, they differ in how the combination is performed: the first framework combines the feature sets before they are given to a 9-layered CNN as input, whereas the second passes each feature set to a parallel input layer that is directly connected to its own convolution layers, so that deep features from each parallel branch are extracted simultaneously before being combined in the merge layer. The proposed models are trained with a dataset taken from the UCI Machine Learning Repository, and their performance is validated with Leave-One-Person-Out Cross Validation (LOPO CV). Due to the imbalanced class distribution in the data, the F-Measure and Matthews Correlation Coefficient metrics are used for assessment along with accuracy. Experimental results show that the second framework is very promising, since it is able to learn deep features from each feature set via parallel convolution layers. The extracted deep features are not only successful at distinguishing PD patients from healthy individuals but also effective in boosting the discriminative power of the classifiers.
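The second framework's topology, parallel branches merged before the classifier head, is straightforward to express in the Keras functional API. A minimal sketch, assuming TensorFlow/Keras; the feature-set sizes and layer widths are illustrative, not the paper's 9-layer configuration:

```python
# Hedged sketch of the parallel-branch design: one input and convolution stack
# per vocal feature set, merged in a Concatenate ("merge") layer.
from tensorflow import keras
from tensorflow.keras import layers

def branch(n_features, name):
    inp = keras.Input(shape=(n_features, 1), name=name)
    x = layers.Conv1D(16, 3, activation="relu")(inp)
    x = layers.MaxPooling1D(2)(x)
    x = layers.Flatten()(x)
    return inp, x

in_a, feat_a = branch(64, "feature_set_a")   # hypothetical feature-set sizes
in_b, feat_b = branch(128, "feature_set_b")

merged = layers.Concatenate()([feat_a, feat_b])   # the merge layer
x = layers.Dense(32, activation="relu")(merged)
out = layers.Dense(1, activation="sigmoid")(x)    # PD vs. healthy

model = keras.Model(inputs=[in_a, in_b], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```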

132 citations

Journal ArticleDOI
TL;DR: The novel contribution of this paper is to explore the application of extreme gradient boosting (XGBoost) as an improvement on traditional supervised classifiers, specifically in its ability to generalize on the noise-ridden data that is prevalent in this domain.
Abstract: Employee turnover has been identified as a key issue for organizations because of its adverse impact on workplace productivity and long-term growth strategies. To address this problem, organizations use machine learning techniques to predict employee turnover. Accurate predictions enable organizations to take action for retention or succession planning of employees. However, the data for this modeling problem comes from HR Information Systems (HRIS), which are typically under-funded compared to the information systems of other domains in the organization that are directly related to its priorities. This leads to the prevalence of noise in the data, which renders predictive models prone to over-fitting and hence inaccurate. This is the key challenge that is the focus of this paper, and one that has not been addressed historically. The novel contribution of this paper is to explore the application of the Extreme Gradient Boosting (XGBoost) technique, which is more robust because of its regularization formulation. Data from the HRIS of a global retailer is used to compare XGBoost against six historically used supervised classifiers and to demonstrate its significantly higher accuracy for predicting employee turnover.
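The regularization the paper credits for robustness to noise corresponds to explicit hyperparameters in the xgboost library. A minimal sketch, assuming a hypothetical HRIS export with a binary "left_company" column; all file, column, and hyperparameter values are placeholders:

```python
# Hedged sketch: a regularized XGBoost turnover model on noisy HR data.
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("hris_extract.csv")               # hypothetical HRIS export
X = pd.get_dummies(df.drop(columns=["left_company"]))
y = df["left_company"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# reg_alpha (L1) and reg_lambda (L2) are the regularization terms in XGBoost's
# objective; shallow trees and subsampling also limit over-fitting to noise.
model = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                          reg_alpha=0.1, reg_lambda=1.0, subsample=0.8)
model.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```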

123 citations

Journal ArticleDOI
TL;DR: This research proposes lightweight deep learning models that classify erythrocytes into three classes: circular (normal), elongated (sickle cells), and other blood content; the models differ in the number of layers and learnable filters.
Abstract: Sickle cell anemia, also called sickle cell disease (SCD), is a hematological disorder that causes occlusion in blood vessels, leading to painful episodes and even death. The key function of red blood cells (erythrocytes) is to supply all parts of the human body with oxygen. Red blood cells (RBCs) form a crescent or sickle shape when sickle cell anemia affects them; this abnormal shape makes it difficult for sickle cells to move through the bloodstream, decreasing the oxygen flow. Precise classification of RBCs is the first step toward accurate diagnosis, which aids in evaluating the danger level of sickle cell anemia. Manual classification of erythrocytes requires immense time and is prone to errors throughout the classification stage. Traditional computer-aided techniques for erythrocyte classification are based on handcrafted features, so their performance relies on the selected features; they are also very sensitive to differing sizes, colors, and complex shapes, yet microscopy images of erythrocytes show exactly this variation and complexity. To this end, this research proposes lightweight deep learning models that classify erythrocytes into three classes: circular (normal), elongated (sickle cells), and other blood content. These models differ in the number of layers and learnable filters. The available datasets of red blood cells with sickle cell disease are very small for training deep learning models, so addressing the lack of training data is the main aim of this paper. To tackle this issue and optimize performance, the transfer learning technique is utilized. Transfer learning does not significantly affect performance on medical image tasks when the source domain is completely different from the target domain, and in some cases it can even degrade performance. Hence, we have applied same-domain transfer learning, unlike other methods that used the ImageNet dataset. To minimize the overfitting effect, we have utilized several data augmentation techniques. Our model obtained state-of-the-art performance and outperformed the latest methods, achieving an accuracy of 99.54% with our model alone and 99.98% with our model plus a multiclass SVM classifier on the erythrocytesIDB dataset, and 98.87% on the collected dataset.
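The "model plus multiclass SVM" variant can be sketched as training a small CNN and then fitting an SVM on its penultimate-layer activations. This is a hedged sketch, not the paper's architecture: the image size, layer sizes, and random stand-in data are placeholders, and the same-domain pretraining step is omitted.

```python
# Hedged sketch: a lightweight CNN whose learned features feed a multiclass SVM.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.svm import SVC

cnn = keras.Sequential([
    keras.Input(shape=(80, 80, 3)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu", name="features"),
    layers.Dense(3, activation="softmax"),   # circular / elongated / other
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Placeholder data; in practice these would be augmented erythrocyte images.
x_train = np.random.rand(32, 80, 80, 3)
y_train = np.random.randint(0, 3, 32)
cnn.fit(x_train, y_train, epochs=1, verbose=0)

# Replace the softmax head with a multiclass SVM on the learned features.
extractor = keras.Model(cnn.inputs, cnn.get_layer("features").output)
svm = SVC(kernel="rbf").fit(extractor.predict(x_train, verbose=0), y_train)
```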

92 citations