scispace - formally typeset
Author

Abid Ishaq

Bio: Abid Ishaq is an academic researcher from University of Engineering and Technology, Lahore. The author has contributed to research in topics: Computer science & Artificial intelligence. The author has an h-index of 2 and has co-authored 5 publications receiving 22 citations.

Papers
Journal ArticleDOI
TL;DR: In this paper, the authors analyzed heart failure survivors in a dataset of 299 hospitalized patients to find significant features and effective data mining techniques that can boost the accuracy of survival prediction for cardiovascular patients.
Abstract: Cardiovascular disease is a substantial cause of mortality and morbidity in the world. In clinical data analytics, predicting the survival of heart disease patients is a major challenge. Data mining transforms huge amounts of raw data generated by the health industry into useful information that can help in making informed decisions. Various studies have shown that significant features play a key role in improving the performance of machine learning models. This study analyzes heart failure survivors in a dataset of 299 patients admitted to hospital. The aim is to find significant features and effective data mining techniques that can boost the accuracy of survival prediction for cardiovascular patients. To predict patient survival, this study employs nine classification models: Decision Tree (DT), Adaptive boosting classifier (AdaBoost), Logistic Regression (LR), Stochastic Gradient classifier (SGD), Random Forest (RF), Gradient Boosting classifier (GBM), Extra Tree Classifier (ETC), Gaussian Naive Bayes classifier (G-NB) and Support Vector Machine (SVM). The class imbalance problem is handled by the Synthetic Minority Oversampling Technique (SMOTE). Furthermore, the machine learning models are trained on the highest-ranked features selected by RF. The results are compared with those obtained by the same algorithms using the full set of features. Experimental results demonstrate that ETC outperforms the other models and achieves an accuracy of 0.9262 with SMOTE in predicting heart patients’ survival.
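The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbors. This is not the authors' code. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and a real study would more likely rely on an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbors than there are other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (the base itself is excluded)
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class with two hypothetical features per patient (illustrative values only).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already covered by the minority class.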

162 citations

Journal ArticleDOI
TL;DR: Results reveal that the proposed model, which combines word embedding with LSTM, performs better than the existing state-of-the-art models and shows an accuracy of 97%, precision of 83%, recall of 71%, and F1-score of 76.53%.
Abstract: User reviews on social networks have rapidly gained interest as a subject of sentiment analysis, serving as feedback to governments and to public and private companies. Text mining has a wide variety of applications such as sentiment analysis, spam detection, sarcasm detection, and news classification. Classifying reviews by user sentiment is an important and collaborative task for many organizations. In recent years, text classification has mostly been studied with machine learning models and hand-crafted features, which are not able to give promising results on short text classification. In this research, a deep neural network-based model, Long Short-Term Memory (LSTM), with word embedding features is proposed. The proposed model has been evaluated on a large dataset of hotel reviews based on accuracy, precision, recall, and F1-score. This research is a classification study on the hotel review sentiments given by guests of the hotel. The results reveal that the proposed model, which combines word embedding with LSTM, performs better than the existing state-of-the-art models and shows an accuracy of 97%, precision of 83%, recall of 71%, and F1-score of 76.53%. These promising results reveal the effectiveness of the proposed model on any type of review classification task.
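The reported F1-score can be checked against the reported precision and recall, since F1 is defined as their harmonic mean. The short sketch below only reproduces that arithmetic; the function name `f1_score` is chosen here for illustration and is not taken from the paper.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (83% and 71%).
reported_f1 = f1_score(0.83, 0.71)
print(f"{reported_f1:.4f}")  # 0.7653
```

The result agrees with the 76.53% stated in the abstract.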

18 citations

Journal ArticleDOI
TL;DR: A novel technique, RIFS, is introduced that integrates two types of feature selection: ROI-based image filtering and the wrapper-based FFS technique. ETC outperformed all other applied ML models with an accuracy of 0.992, while VGG16 outperformed the other CNN models with an accuracy of 0.986.

16 citations

Journal ArticleDOI
TL;DR: In this article, the authors presented an ensemble of machine learning and deep learning models by combining Random Forest and Convolutional Neural Network called RFCNN for the prediction of road accident severity.
Abstract: Traffic accidents on highways are a leading cause of death despite the development of traffic safety measures. The burden of casualties and damage caused by road accidents is very high for developing countries. Many factors are associated with traffic accidents, some of which are more significant than others in determining the severity of accidents. Data mining techniques can help in predicting influential factors related to crash severity. In this study, significant factors that are strongly correlated with accident severity on highways are identified by Random Forest. Top features affecting accident severity include distance, temperature, wind_Chill, humidity, visibility, and wind direction. This study presents an ensemble of machine learning and deep learning models, called RFCNN, that combines Random Forest and a Convolutional Neural Network for the prediction of road accident severity. The performance of the proposed approach is compared with several base learner classifiers. The data used in the analysis include accident records of the USA from February 2016 to June 2020. The results demonstrate that RFCNN enhanced the decision-making process and outperformed other models with 0.991 accuracy, 0.974 precision, 0.986 recall, and 0.980 F-score using the 20 most significant features in predicting the severity of accidents.

13 citations

Journal ArticleDOI
TL;DR: This study proposes the novel use of a convolutional neural network (CNN) for feature extraction: the CNN enlarges the feature set used to train linear models (stochastic gradient descent classifier, logistic regression, and support vector machine) that make up a soft-voting ensemble.
Abstract: Cardiovascular diseases (CVDs) are regarded as the leading cause of death, accounting for 32% of total deaths around the world. Owing to the large number of symptoms related to age, gender, demographics, and ethnicity, diagnosing CVDs is a challenging and complex task. Furthermore, the lack of experienced staff and medical experts, and the non-availability of appropriate testing equipment put the lives of millions of people at risk, especially in under-developed and developing countries. Electronic health records (EHRs) have been utilized for diagnosing several diseases recently and show potential for CVD diagnosis as well. However, the accuracy and efficacy of EHR-based CVD diagnosis are limited by the lack of an appropriate feature set. Often, the feature set is very small and unable to provide enough features for machine learning models to obtain a good fit. This study addresses this problem by proposing the novel use of feature extraction from a convolutional neural network (CNN). An ensemble model is designed in which a CNN model is used to enlarge the feature set for training linear models, including stochastic gradient descent classifier, logistic regression, and support vector machine, that comprise the soft-voting based ensemble model. Extensive experiments are performed to analyze the performance of different ratios of feature sets to the training dataset. Performance analysis is carried out using four different datasets and results are compared with recent approaches used for CVDs. Results show the superior performance of the proposed model with an accuracy of 0.93 and scores of 0.92 each for precision, recall, and F1. Results indicate both the superiority of the proposed approach and the generalization of the ensemble model across multiple datasets.
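Soft voting, the combination rule named in the abstract, averages the class probabilities predicted by each base model and picks the class with the highest average. The sketch below shows only that rule. The function `soft_vote` and the probability values are hypothetical and are not taken from the paper, which does not state per-model outputs or voting weights (equal weights are assumed here).

```python
def soft_vote(probabilities):
    """Average per-class probabilities across models; return (winning class index, averages)."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    avg = [sum(m[c] for m in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg


# Hypothetical outputs of three base models for one patient: [P(class 0), P(class 1)]
model_probs = [
    [0.90, 0.10],  # stochastic gradient descent classifier
    [0.40, 0.60],  # logistic regression
    [0.45, 0.55],  # support vector machine
]
label, avg = soft_vote(model_probs)
```

In this made-up case two of the three models lean toward class 1, so a hard majority vote would return class 1, while soft voting returns class 0 because the first model is far more confident than the other two.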

7 citations


Cited by
Journal ArticleDOI
TL;DR: Wang et al. proposed an efficient approach to detecting credit card fraud using a neural network ensemble classifier and a hybrid data resampling method; the ensemble classifier is obtained by using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting technique.
Abstract: Recent advancements in electronic commerce and communication systems have significantly increased the use of credit cards for both online and regular transactions. However, there has been a steady rise in fraudulent credit card transactions, costing financial companies huge losses every year. The development of effective fraud detection algorithms is vital in minimizing these losses, but it is challenging because most credit card datasets are highly imbalanced. Also, using conventional machine learning algorithms for credit card fraud detection is inefficient due to their design, which involves a static mapping of the input vector to output vectors. Therefore, they cannot adapt to the dynamic shopping behavior of credit card clients. This paper proposes an efficient approach to detect credit card fraud using a neural network ensemble classifier and a hybrid data resampling method. The ensemble classifier is obtained using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting (AdaBoost) technique. Meanwhile, the hybrid resampling is achieved using the synthetic minority oversampling technique and edited nearest neighbor (SMOTE-ENN) method. The effectiveness of the proposed method is demonstrated using publicly available real-world credit card transaction datasets. The performance of the proposed approach is benchmarked against the following algorithms: support vector machine (SVM), multilayer perceptron (MLP), decision tree, traditional AdaBoost, and LSTM. The experimental results show that the classifiers performed better when trained with the resampled data, and the proposed LSTM ensemble outperformed the other algorithms by obtaining a sensitivity and specificity of 0.996 and 0.998, respectively.
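The cleaning half of SMOTE-ENN can be sketched as follows: edited nearest neighbor (ENN) removes samples whose label disagrees with their neighborhood, which trims noisy or overlapping points after oversampling. The version below is a minimal sketch that uses a majority vote over the k nearest neighbors and applies it to every class. Those choices, the function name, and the toy data are assumptions, and library implementations may use different defaults.

```python
import math
from collections import Counter


def edited_nearest_neighbors(X, y, k=3):
    """Drop every sample whose label disagrees with the majority of its k nearest neighbors."""
    keep_X, keep_y = [], []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Indices of all other samples, nearest first.
        order = sorted((j for j in range(len(X)) if j != i),
                       key=lambda j: math.dist(xi, X[j]))
        votes = Counter(y[j] for j in order[:k])
        majority_label = votes.most_common(1)[0][0]
        if majority_label == yi:
            keep_X.append(xi)
            keep_y.append(yi)
    return keep_X, keep_y


# Toy 1-D data: the sample at 0.15 is labeled 1 but sits inside the class-0 cluster.
X = [[0.0], [0.1], [0.2], [0.15], [1.0], [1.1], [1.2], [1.3]]
y = [0, 0, 0, 1, 1, 1, 1, 1]
X_clean, y_clean = edited_nearest_neighbors(X, y, k=3)
```

Only the mislabeled-looking sample at 0.15 is removed here; the two well-separated clusters are left intact.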

28 citations

Journal ArticleDOI
TL;DR: Results show that the classifiers performed better when trained with the resampled data, and the proposed LSTM ensemble outperformed the other algorithms by obtaining a sensitivity and specificity of 0.996 and 0.998, respectively.
Abstract: Recent advancements in electronic commerce and communication systems have significantly increased the use of credit cards for both online and regular transactions. However, there has been a steady rise in fraudulent credit card transactions, costing financial companies huge losses every year. The development of effective fraud detection algorithms is vital in minimizing these losses, but it is challenging because most credit card datasets are highly imbalanced. Also, using conventional machine learning algorithms for credit card fraud detection is inefficient due to their design, which involves a static mapping of the input vector to output vectors. Therefore, they cannot adapt to the dynamic shopping behavior of credit card clients. This paper proposes an efficient approach to detect credit card fraud using a neural network ensemble classifier and a hybrid data resampling method. The ensemble classifier is obtained using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting (AdaBoost) technique. Meanwhile, the hybrid resampling is achieved using the synthetic minority oversampling technique and edited nearest neighbor (SMOTE-ENN) method. The effectiveness of the proposed method is demonstrated using publicly available real-world credit card transaction datasets. The performance of the proposed approach is benchmarked against the following algorithms: support vector machine (SVM), multilayer perceptron (MLP), decision tree, traditional AdaBoost, and LSTM. The experimental results show that the classifiers performed better when trained with the resampled data, and the proposed LSTM ensemble outperformed the other algorithms by obtaining a sensitivity and specificity of 0.996 and 0.998, respectively.

23 citations

Journal ArticleDOI
TL;DR: This comprehensive investigation portrays a vivid visualization of the applicability and compatibility of different machine learning algorithms in such an imbalanced dataset and presents the role of the SMOTE-ENN algorithm and hyperparameter optimization in enhancing the performance of the machine learning algorithms.
Abstract: Heart failure is a chronic cardiac condition characterized by reduced supply of blood to the body due to impaired contractile properties of the muscles of the heart. Like any other cardiac disorder, heart failure is a serious ailment limiting the activities and curtailing the lifespan of the patient, most often resulting in death sooner or later. Predicting the survival of patients with heart failure is the path to effective intervention and good prognosis in terms of both treatment and quality of life of the patient. Machine learning techniques can be critical in this regard since they can be used to predict the survival of patients with heart failure in advance, allowing patients to receive appropriate treatment. Hence, six supervised machine learning algorithms have been studied and applied to analyze a dataset of 299 individuals from the UCI Machine Learning Repository and predict their survivability from heart failure. Three distinct approaches have been followed using Decision Tree Classifier, Logistic Regression, Gaussian Naïve Bayes, Random Forest Classifier, K-Nearest Neighbors, and Support Vector Machine algorithms. Data scaling has been performed as a preprocessing step utilizing the standard and min-max scaling methods. Grid search cross-validation and random search cross-validation techniques have been employed to optimize the hyperparameters. Additionally, the synthetic minority oversampling technique and edited nearest neighbor (SMOTE-ENN) data resampling technique are utilized, and the performances of all the approaches have been compared extensively. The experimental results clearly indicate that Random Forest Classifier (RFC) surpasses all other approaches with a test accuracy of 90% when used in combination with SMOTE-ENN and the standard scaling technique. Therefore, this comprehensive investigation portrays a vivid visualization of the applicability and compatibility of different machine learning algorithms in such an imbalanced dataset and presents the role of the SMOTE-ENN algorithm and hyperparameter optimization in enhancing the performance of the machine learning algorithms.
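The two scaling methods mentioned in the abstract differ in what they normalize: min-max scaling maps a feature onto the range [0, 1], while standard scaling shifts it to zero mean and unit variance. The sketch below applies both to a single feature. The function names and the sample values are illustrative assumptions, and both functions assume the feature is not constant (otherwise the denominator would be zero).

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one numeric feature (not taken from the dataset).
ages = [40.0, 50.0, 60.0, 70.0, 80.0]
ages_min_max = min_max_scale(ages)
ages_standard = standard_scale(ages)
```

In practice the scaling statistics would be computed on the training split only and then reused on the test split, so that no information from the test data leaks into preprocessing.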

23 citations

Journal ArticleDOI
TL;DR: In this paper, a heart failure dataset consisting only of numerical values is converted into image data so that it can be analyzed with CNNs; the highest accuracy, 95.13%, is obtained with the ResNet18 model and is superior to that of previous studies using the raw numerical data.

19 citations