
Showing papers in "Journal of Computer Science in 2021"


Journal ArticleDOI
TL;DR: One-hot encoding was used with DL and ML techniques to classify emails as phishing or non-phishing, demonstrating the effectiveness of semantic analysis in phishing email detection.
Abstract: Representation of text is a significant task in Natural Language Processing (NLP) and in recent years Deep Learning (DL) and Machine Learning (ML) have been widely used in various NLP tasks like topic classification, sentiment analysis and language translation. Until very recently, little work has been devoted to semantic analysis in phishing detection or phishing email detection. The novelty of this study is in using deep semantic analysis to capture inherent characteristics of the text body. One-hot encoding was used with DL and ML techniques to classify emails as phishing or non-phishing. A comparison of various parameters and hyperparameters was performed for DL. The results of various ML models, Naïve Bayes, SVM, Decision Tree, as well as DL models, Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), were presented. The DL models performed better than the ML models in terms of accuracy, but the ML models performed better than the DL models in terms of computation time. CNN with Word Embedding performed the best in terms of accuracy (96.34%), demonstrating the effectiveness of semantic analysis in phishing email detection.
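As a rough illustration of the pipeline this abstract describes, here is a minimal Keras sketch of a CNN text classifier with a word-embedding layer; the vocabulary size, layer widths and placeholder data are assumptions, not the authors' configuration.

```python
# Hypothetical sketch of a CNN email classifier with word embeddings;
# all hyperparameters and the random placeholder data are assumptions.
import numpy as np
from tensorflow.keras import layers, models

vocab_size, max_len = 20000, 500              # assumed vocabulary and email length
model = models.Sequential([
    layers.Input((max_len,)),                 # integer-encoded token sequence
    layers.Embedding(vocab_size, 128),
    layers.Conv1D(64, 5, activation="relu"),  # n-gram-like feature detectors
    layers.GlobalMaxPooling1D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # phishing vs. non-phishing
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

X = np.random.randint(0, vocab_size, size=(100, max_len))  # placeholder emails
y = np.random.randint(0, 2, size=(100,))                   # placeholder labels
model.fit(X, y, epochs=1, batch_size=32)
```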

15 citations


Journal ArticleDOI
TL;DR: This study proposes a model to detect and discover emotions/opinions of YouTube users on herbal treatment videos through an analysis of user comments by using machine learning classifiers and introduces a new Arabic Dataset on Herbal Treatments for Diabetes (ADHTD).
Abstract: Social media platforms are extensively used in exchanging and sharing information and user experience, thereby resulting in massive outspread and viewing of personal experiences in many fields of life. Thus, informative health-related videos on YouTube are highly perceptible. Many users tend to procure medical treatments and health-related information from social media, particularly from YouTube, when searching for chronic illness treatments. Sometimes, these sources contain misinformation that causes fatal effects on the users’ health. Many sentiment analyses and classifications have been conducted on social media platforms to study user posts and comments in many life science fields. However, no study has been conducted on the analysis of Arabic user comments, which provide details on herbal treatments for people with diabetes. Therefore, this study proposes a model to detect and discover the emotions/opinions of YouTube users on herbal treatment videos through an analysis of user comments by using machine learning classifiers. In addition, a new Arabic Dataset on Herbal Treatments for Diabetes (ADHTD), which is based on user comments from several YouTube videos, is introduced. This study examines the impact of four representation methods on ADHTD to show the performance of machine learning classifiers. These methods remove repeating characters in the Arabic dialect and the character extension known as ‘TATAWEEL’ or ‘MAD’, stem Arabic words, remove Arabic stop words and apply N-grams to Arabic words. Experiments have been conducted based on the aforementioned methods to handle the imbalanced proposed dataset and identify the best machine learning classifiers over Arabic dialect textual data. The model achieved a higher accuracy, reaching 95%, when the Synthetic Minority Oversampling TEchnique (SMOTE) was used to balance the dataset than on the imbalanced dataset.
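The class-balancing step the abstract credits for the 95% accuracy can be sketched as follows; the synthetic data and classifier below are placeholders standing in for the ADHTD features, and SMOTE parameters are left at imblearn defaults.

```python
# Minimal sketch of balancing a dataset with SMOTE before training a
# classifier; data and classifier choice are placeholders.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic imbalanced data standing in for the ADHTD comment features
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_train, y_train)
clf = LinearSVC().fit(X_bal, y_bal)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```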

13 citations


Journal ArticleDOI
TL;DR: This study compares the effectiveness of stock price and return as input features in directional forecasting models using 10-year historical data of ten large cap US companies and concludes that price is generally a more potent input feature than return value in predicting the direction of price movement.
Abstract: Forecasting directional movement of stock price using machine learning tools has attracted a considerable amount of research. Two of the most common input features in a directional forecasting model are stock price and return. The choice between the former and the latter variables is often subjective. In this study, we compare the effectiveness of stock price and return as input features in directional forecasting models. We perform an extensive comparison of the two input features using 10-year historical data of ten large-cap US companies. We employ four popular classification algorithms as the basis of the forecasting models used in our study. The results show that stock price is a more effective standalone input feature than return. The effectiveness of stock price and return equalizes when we add technical indicators to the input feature set. We conclude that price is generally a more potent input feature than return value in predicting the direction of price movement. Our results should aid researchers and practitioners interested in applying machine learning models to stock price forecasting.
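A minimal sketch of the experimental setup being compared: one feature matrix built from raw prices and one from returns over the same sliding windows, with a directional classifier scored on each. The random-walk series, window length and classifier are illustrative assumptions.

```python
# Sketch of the two candidate input features: raw prices vs. returns,
# with next-day direction as the label. All specifics are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

prices = np.cumsum(np.random.randn(1000)) + 100   # placeholder price series
returns = np.diff(prices) / prices[:-1]           # simple returns

w = 10                                            # assumed window length
T = range(w, len(prices) - 1)
X_price = np.array([prices[t - w + 1 : t + 1] for t in T])  # last w prices
X_ret = np.array([returns[t - w : t] for t in T])           # last w returns
y = np.array([prices[t + 1] > prices[t] for t in T]).astype(int)

for name, X in [("price", X_price), ("return", X_ret)]:
    clf = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
    print(name, "direction accuracy:", round(clf.score(X[800:], y[800:]), 3))
```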

13 citations


Journal ArticleDOI
TL;DR: This paper proposes a stock prediction model using Generative Adversarial Network (GAN) with Gated Recurrent Units (GRU) used as a generator that inputs historical stock price and generates future stock prices and Convolutional Neural Network (CNN) as a discriminator to discriminate between the real stockprice and generated stock price.
Abstract: Deep learning is an exciting topic. It has been utilized in many areas owing to its strong potential. For example, it has been widely used in the financial area, which is vital to society, such as high-frequency trading, portfolio optimization, fraud detection and risk management. Stock market prediction is one of the most popular and valuable areas in finance. This paper proposes a stock prediction model using a Generative Adversarial Network (GAN), with Gated Recurrent Units (GRU) used as a generator that inputs historical stock prices and generates future stock prices, and a Convolutional Neural Network (CNN) as a discriminator to discriminate between the real stock price and the generated stock price. Unlike traditional methods, which limit forecasting to one step ahead, the deep learning algorithm makes it possible to conduct multi-step-ahead prediction more accurately. In this study, the Apple Inc. stock closing price was chosen as the target price, with features such as the S&P 500 index, NASDAQ Composite index, U.S. Dollar index, etc. In addition, FinBERT has been utilized to generate a news sentiment index for Apple Inc. as an additional predicting feature. Finally, this paper compares the proposed GAN model results with the baseline model.
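The generator/discriminator pairing described here can be sketched as follows; the window length, feature count, horizon and layer sizes are assumptions, and the adversarial training loop is omitted.

```python
# Hypothetical sketch of the GAN building blocks: a GRU generator that
# maps a window of historical features to future prices and a 1-D CNN
# discriminator; all dimensions are illustrative assumptions.
from tensorflow.keras import layers, models

n_steps, n_features, n_ahead = 30, 13, 3   # assumed window, features, horizon

def build_generator():
    return models.Sequential([
        layers.Input((n_steps, n_features)),
        layers.GRU(64),                        # summarize the history window
        layers.Dense(n_ahead),                 # multi-step-ahead price forecast
    ])

def build_discriminator():
    return models.Sequential([
        layers.Input((n_steps + n_ahead, 1)),  # history + (real or fake) future
        layers.Conv1D(32, 3, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(1, activation="sigmoid"), # real vs. generated sequence
    ])

generator, discriminator = build_generator(), build_discriminator()
generator.summary()
discriminator.summary()
```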

13 citations


Journal ArticleDOI
TL;DR: An accurate, fast and reliable strawberry and cherry fruit detection and classification system for automated strawberry and cherry yield estimation; the fine-tuned MobileNet CNN model performs quite well, with higher fruit classification accuracy at less computational cost.
Abstract: This paper proposes an accurate, fast and reliable strawberry and cherry fruit detection and classification system for automated strawberry and cherry yield estimation. A state-of-the-art deep learning-based fine-tuned MobileNet Convolutional Neural Network is developed to detect and classify strawberry and cherry fruit types in the outdoor field. The proposed CNN model is trained on 4250 strawberry fruit images and 3878 cherry fruit images and tested on 990 strawberry fruit images and 1012 cherry fruit images. To capture features and classify fruit type, a fine-tuned MobileNet Convolutional Neural Network model is presented in this study. The original MobileNet CNN model has 88 layers, which is computationally intensive and has more parameters. In the fine-tuned MobileNet CNN model, top layers are frozen and a few layers are replaced with other layers such as a depthwise layer, pointwise layer, ReLU and Batch Normalization layer and a global average pooling layer. The fully connected layer is removed. The fine-tuned MobileNet CNN model performs quite well, with higher fruit classification accuracy at less computational cost. The proposed CNN model performs classification and labels the fruits as Blueberry, Huckleberry, Mulberry, Raspberry, Strawberry, Strawberry Wedge, Cherry Brown, Cherry Red, Cherry Rainier, Cherry Wax Black, Cherry Wax Red and Cherry Wax Yellow. The proposed model's average validation accuracy is about 98.60% and the loss rate is about 0.38%. The fruit images acquired from the cultivation field include fruits that are occluded by foliage, under shadow and with some degree of overlap of strawberry and cherry flowers.
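A hedged sketch of the transfer-learning recipe the abstract outlines: a frozen MobileNet base, added depthwise/pointwise, BatchNormalization and ReLU layers, global average pooling instead of a fully connected layer and a 12-way softmax head. The exact replaced layers and sizes are assumptions.

```python
# Sketch of fine-tuning MobileNet roughly as described; layer choices
# beyond what the abstract names are assumptions.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNet

base = MobileNet(weights="imagenet", include_top=False,
                 input_shape=(224, 224, 3))
base.trainable = False                        # freeze pretrained layers

model = models.Sequential([
    base,
    layers.DepthwiseConv2D(3, padding="same"),
    layers.Conv2D(256, 1),                    # pointwise convolution
    layers.BatchNormalization(),
    layers.ReLU(),
    layers.GlobalAveragePooling2D(),          # replaces the fully connected layer
    layers.Dense(12, activation="softmax"),   # 12 strawberry/cherry classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```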

12 citations


Journal ArticleDOI
TL;DR: This research uses a machine learning technique in which the SQL injection detector is trained on a training dataset and then evaluated against a testing dataset; the results show that the proposed technique produces high accuracy in recognizing malicious and benign web requests.
Abstract: A lack of secure code in web apps leads to cyber-attacks through vulnerabilities. Statistics show that the highest number of data-theft-related cyber-attacks occur through the SQL injection technique. Hence, effective SQL injection detection is needed in any web system to combat this threat. In this research, a machine learning technique is used in which the SQL injection detector is trained on a training dataset and then evaluated against a testing dataset. The research relies on the preparation of the training and testing datasets. The training set is used by the detector to establish the knowledge base and the test set is used to evaluate the performance of the detector. The result of the detection shows that the proposed technique produces high accuracy in recognizing malicious and benign web requests.
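A minimal sketch of such a detector: vectorize raw web requests and train a classifier on labeled benign/malicious examples. Character n-grams and Naive Bayes are one common choice and an assumption here; the tiny inline dataset is purely illustrative.

```python
# Sketch of an ML-based SQL injection detector on raw request strings.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_requests = [
    "id=42&name=alice",                       # benign
    "q=books&page=2",                         # benign
    "id=1' OR '1'='1",                        # SQL injection
    "user=admin'; DROP TABLE users;--",       # SQL injection
]
train_labels = [0, 0, 1, 1]

detector = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),  # char n-grams
    MultinomialNB(),
)
detector.fit(train_requests, train_labels)
print(detector.predict(["id=7' UNION SELECT password FROM users--"]))
```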

10 citations


Journal ArticleDOI
TL;DR: The results demonstrate that the state of usability attributes of the PSAU mobile application is acceptable, and should provide IT deanships and related policymakers at the university level with empirical evidence about the issues and problems that face users of mobile applications in higher educational institutions in KSA.
Abstract: This study investigates the extent to which the usability attributes, namely, effectiveness, efficiency, learnability and memorability, satisfaction, errors and cognitive load, of the PSAU mobile application exist from the point of view of students who were enrolled in the academic year 2019-2020 in the College of Business Administration (CBA) at Prince Sattam Bin Abdulaziz University. The study employs the People at the Center of Mobile Application Development (PACMAD) usability model to determine the extent to which the usability attributes of the PSAU mobile application are available. A survey-based methodology is used to collect data from a random sample of 137 enrolled students in the College of Business Administration (CBA) at Prince Sattam bin Abdulaziz University. The results demonstrate that the state of usability attributes of the PSAU mobile application is acceptable; the highest mean was 3.3 for the cognitive load dimension, followed by the learnability and memorability dimensions with a mean of 3.0. The lowest mean is 2.4 for the efficiency dimension. The overall mean for usability is 2.8, which reflects the level of usability of the PSAU mobile application. The results of this study should provide IT deanships and related policymakers at the university level with empirical evidence about the issues and problems that face users of mobile applications in higher educational institutions in KSA, and help in developing high-quality mobile applications.

6 citations



Journal ArticleDOI
TL;DR: A Systematic Literature Review (SLR) of software defect prediction using deep learning models focused on identifying the studies that use the semantics of the source code for improving defect prediction.
Abstract: The approaches associated with software defect prediction are used to reduce the time and cost of discovering software defects in source code and to improve software quality in organizations. There are two approaches to reveal the software defects in the source code. The first approach concentrates on traditional features such as lines of code, code complexity, etc. However, these features fail to extract the semantics of the source code. The second one concentrates on revealing these semantics. This paper presents a Systematic Literature Review (SLR) of software defect prediction using deep learning models. This SLR is focused on identifying the studies that use the semantics of the source code for improving defect prediction. It aims to analyze the datasets, models and frameworks used, and to identify the evaluation metrics to ensure their applicability in software defect prediction. The IEEE Xplore, Scopus and Web of Science digital libraries were used to select the suitable primary studies. Forty (40) primary studies published by 15 December 2020 were selected for analysis based on the quality criteria. The project levels applied in the studies were: Within-project 52.5%, cross-project 17.5% and both within-project and cross-project 30%. The datasets used were: The Promise dataset 68.18% and other datasets 31.82%. The most used deep learning model in the primary studies was the Convolutional Neural Network (CNN), at 35%. The most used evaluation metrics were F-measure and Area Under the Curve (AUC). Software defect prediction using deep learning models is still a valuable topic and requires many more research studies to enhance the performance of defect prediction.

5 citations




Journal ArticleDOI
TL;DR: This study focused on finding the best approach for sentiment analysis using a series of tweets related to Hajj, one of the most important rituals performed by Muslims, for which the companies responsible for the pilgrimage season seek to complete the season in the best way every year.
Abstract: About forty-five percent of the world's population uses social networks, making these platforms a natural place to find people's opinions and feelings on various topics. Companies that offer their services and products to customers focus on this subject for future improvement. Thus, serious thinking began on analyzing the views of people across different social platforms and on developing the best ways to analyze these views. In this study, we focused on finding the best approach for sentiment analysis using a series of Hajj-related tweets; Hajj is one of the most important rituals performed by Muslims, for which the companies responsible for the pilgrimage season seek to complete the season in the best way every year. We used Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Naïve Bayes (NB) as supervised algorithms for the machine-learning approach and the TextBlob analyzer for the lexicon-based approach. Findings show that machine learning techniques worked better than the lexicon approach in the classification and analysis of Hajj-related tweets. Even with the limited availability of a Hajj tweets corpus dataset, SVM reached the best accuracy, which was 84%.
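The two approaches being compared can be sketched side by side; the toy English tweets below are placeholders for the labeled Arabic Hajj corpus.

```python
# Sketch contrasting a supervised SVM classifier with the lexicon-based
# TextBlob analyzer; examples and labels are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from textblob import TextBlob

tweets = ["the pilgrimage services were excellent",
          "long delays and poor organization",
          "very smooth and well organized season",
          "terrible crowding at the site"]
labels = [1, 0, 1, 0]                          # 1 = positive, 0 = negative

svm = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(tweets, labels)
print("SVM:", svm.predict(["excellent organization this year"]))

# Lexicon-based approach: polarity in [-1, 1], no training required
print("TextBlob:", TextBlob("excellent organization this year").sentiment.polarity)
```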

Journal ArticleDOI
TL;DR: Two metrics designed to measure the data uniformity of acceptance tests in FitNesse and Gherkin notations are presented to identify projects with lots of random and meaningless data; the results suggest that test data are more irregular in FitNesse features than in Gherkin features.
Abstract: This paper presents two metrics designed to measure the data uniformity of acceptance tests in FitNesse and Gherkin notations. The objective is to measure the data uniformity of acceptance tests in order to identify projects with lots of random and meaningless data. Random data in acceptance tests hinder communication between stakeholders and increase the volume of glue code. The main contribution of this paper is the implementation of the proposed metrics. This paper also evaluates the uniformity of test data from several FitNesse and Gherkin projects found on GitHub, as a means to verify if the metrics are applicable. First, the metrics were applied to 18 FitNesse project repositories and 18 Gherkin project repositories. The measurements taken from these repositories were used to present cases of irregular and uniform test data. Then, we have compared the notations from FitNesse and Gherkin in terms of projects and features. In terms of projects, no significant difference was observed, that is, FitNesse projects have a level of uniformity similar to Gherkin projects. However, in terms of features and test documents, there was a significant difference. The uniformity scores of FitNesse and Gherkin features are 0.16 and 0.26, respectively. These uniformity scores are very low, which means that test data for both notations are very irregular. Thus, we can infer that test data are more irregular in FitNesse features than in Gherkin features. The evaluation also shows that 28 of 36 projects (78%) did not reach the minimum recommended measure, i.e., 0.45 of test data uniformity. In general, we can observe that there are still many challenges in improving the quality of acceptance tests, especially in relation to the uniformity of test data.

Journal ArticleDOI
TL;DR: It is suggested that IT outsourcing vendors focus on these factors, policymakers implement new strategies to establish their presence in global outsourcing industries and researchers incorporate these variables in their future research.
Abstract: Outsourcing gained popularity as several large U.S. companies in the 1980s began delegating IT work to foreign firms. In outsourcing, one party (the customer) asks another party (the vendor) to do a particular job. Outsourcing has its own advantages, like reducing cost and time and increasing performance and satisfaction. This study has focused on the critical success factors in information technology outsourcing for the emerging market and has considered the perspectives of the vendor. A snowball sampling technique was used to generate quantitative data among respondents inside the Kathmandu valley and variables were drawn from the available literature. Respondents included outsourcing vendors, freelancers, consultants and policymakers. Data were properly tested for reliability using Cronbach's Alpha and results were validated using convergent and discriminant validity. The analysis included Structural Equation Modeling and estimation was done using maximum likelihood and partial least squares. The study identified 21 critical success factors for the emerging market under seven categories: System quality, communication quality, service quality, system use, satisfaction, individual benefit and organizational benefit, which is the main contribution of the paper. It is suggested that IT outsourcing vendors focus on these factors, policymakers implement new strategies to establish their presence in global outsourcing industries and researchers incorporate these variables in their future research.
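The Cronbach's Alpha reliability test mentioned above follows a standard formula that can be computed directly; the toy item-response matrix below is illustrative.

```python
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance),
# computed on a toy matrix (rows = respondents, columns = survey items).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

responses = np.array([[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]])
print(round(cronbach_alpha(responses), 3))
```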

Journal ArticleDOI
TL;DR: A Word Sense Disambiguation method is proposed and used in finding a sense-oriented sentence semantic relatedness measure, which combines an edge-based score between words depending on the context of the sentence, a sense-based score which finds sentences having similar senses and a word order score.
Abstract: Finding the semantic relatedness score between two sentences is useful in many research areas. Existing relatedness methods do not consider word sense while computing the semantic relatedness score between two sentences. In this study, a Word Sense Disambiguation (WSD) method is proposed and used in finding a sense-oriented sentence semantic relatedness measure. The WSD method is used to find the correct sense of a word present in a sentence. The proposed method uses both the WordNet lexical dictionary and the Wikipedia corpus. The sense-oriented sentence semantic relatedness measure combines an edge-based score between words depending on the context of the sentence, a sense-based score which finds sentences having similar senses and a word order score. We have evaluated the proposed WSD method on publicly available English WSD corpora. We have compared our proposed sense-oriented sentence semantic relatedness measure on standard datasets. Experimental analysis illustrates the significance of the proposed method over many baseline and current systems like Lesk, UKB, IMS and Babelfy.
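For context, the Lesk baseline the paper compares against is available in NLTK; a minimal usage sketch with WordNet senses:

```python
# Dictionary-based WSD with NLTK's Lesk implementation.
# Requires: nltk.download("wordnet"); nltk.download("punkt")
from nltk.tokenize import word_tokenize
from nltk.wsd import lesk

sentence = "I went to the bank to deposit my money"
sense = lesk(word_tokenize(sentence), "bank", pos="n")  # pick a WordNet synset
print(sense, "-", sense.definition())
```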

Journal ArticleDOI
TL;DR: A deep learning framework to perform face verification in videos is proposed and an average two percent increase is obtained in the accuracy of the face verification models by applying these methods.
Abstract: Person re-identification in surveillance camera videos is attracting widespread interest due to its increasing number of applications. It is being applied in the fields of security, healthcare, product manufacturing, product sales and more. Though there are a variety of methods for person re-identification, face verification-based methods are very effective. In this study, a deep learning framework to perform face verification in videos is proposed. Face verification deep learning model development includes different stages like face recognition, cropping, alignment, augmentation, image enhancement and face image selection for model training. The authors have put forward innovative methods to be adopted in various stages of this sequence to improve the performance of the models. The focus of this study is on these image preprocessing stages of the process, rather than the deep learning part, which makes the approach unique. The overall model is improved by increasing the efficiency of each of these stages, by adopting methods like face recognition and cropping based on face landmarks, effective training image selection using face landmark symmetry, various image augmentation techniques including perspective transformation and image enhancement methods like contrast stretching and histogram equalization. An average two percent increase is obtained in the accuracy of the face verification models by applying these methods.
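Two of the named enhancement steps, histogram equalization and contrast stretching, can be sketched with OpenCV and NumPy; the file path and percentile cut-offs are placeholders.

```python
# Sketch of two image enhancement steps from the abstract; "face.jpg"
# is a placeholder input path and the 2/98 percentiles are assumptions.
import cv2
import numpy as np

img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)

equalized = cv2.equalizeHist(img)             # histogram equalization

lo, hi = np.percentile(img, (2, 98))          # contrast stretching to [0, 255]
stretched = np.clip((img - lo) * 255.0 / (hi - lo), 0, 255).astype(np.uint8)

cv2.imwrite("face_equalized.jpg", equalized)
cv2.imwrite("face_stretched.jpg", stretched)
```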

Journal ArticleDOI
TL;DR: An analysis of one of the fundamental parts of Advanced Persistent Threat (APT) attacks and the importance of attribution in the analysis of APTs, with an extension developed for the MICTIC framework.
Abstract: This paper analyzes one of the fundamental parts of Advanced Persistent Threat (APT) attacks: the phases of APTs and their framing with the identification of the criminals. This type of attack normally requires resources only available to a state (state hacking). The paper discusses the importance of attribution in the analysis of APTs and the unique and differentiating characteristics of this type of attack compared to traditional cyber-attacks. It develops an extension for one of the few frameworks applied to attribution in APTs, the MICTIC.

Journal ArticleDOI
TL;DR: It could be concluded that BiLSTM is the best architecture suited for automatic music transcription.
Abstract: Automatic Music Transcription (AMT) is becoming more and more popular by the day, and it has piqued the interest of many beyond academic research. A successful AMT system would be able to bridge multiple ranges of interactions between people and music, including music education. The goal of this research is to transcribe an audio input into music notation. The research was conducted by training multiple neural network architectures on different kinds of cases. The evaluation used two approaches: Objective evaluation and subjective evaluation. The result of this research was an achievement of a 74.80% F1 score, and 73.3% of 30 respondents claimed that Bidirectional Long Short-Term Memory (BiLSTM) has the best result. It can be concluded that BiLSTM is the architecture best suited for automatic music transcription.
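A hedged sketch of a frame-wise BiLSTM transcription model of the kind evaluated here; the spectrogram dimensions and the 88-key piano output are assumptions, not the paper's exact architecture.

```python
# Hypothetical BiLSTM transcription model: spectrogram frames in,
# per-frame multi-label note activations out; all sizes are illustrative.
from tensorflow.keras import layers, models

n_frames, n_bins, n_notes = 100, 229, 88

model = models.Sequential([
    layers.Input((n_frames, n_bins)),            # spectrogram frames
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    layers.TimeDistributed(layers.Dense(n_notes, activation="sigmoid")),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```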

Journal ArticleDOI
TL;DR: This study provides an analysis of what AGI protection scholars have written about the essence of human beliefs and proposes several well-supported hypotheses to indicate the difficulty of describing the character of human beliefs, arguing that a few meta-level theories are needed.
Abstract: The defense sphere of Artificial General Intelligence (AGI) is developing exponentially. Notwithstanding, the character of human beliefs pertaining to AGI associations remains under-defined. Different AGI protection scholars have formulated numerous hypotheses regarding the existence of human beliefs, but contradictions exist. This study provides an analysis of what AGI protection scholars, up to the beginning of 2019, have written about the essence of human beliefs. It is generally advised to use a theory classification system, where the ideas are evaluated according to the degree of their sophistication, their position on the behaviorist-internalist spectrum and the scope of their consensus about mankind. We propose several well-supported hypotheses to indicate the difficulty of describing the character of human beliefs and argue that a few meta-level theories are needed.

Journal ArticleDOI
TL;DR: This study has shown that, for effective CBM application in industry, there is a need to develop a systematic methodology for design and selection of adequate data preparation steps and techniques with the proposed ML algorithms.
Abstract: Using Machine Learning (ML) prediction to achieve a successful, cost-effective, Condition-Based Maintenance (CBM) strategy has become very attractive in the context of Industry 4.0. In other fields, it is well known that in order to benefit from the prediction capability of ML algorithms, the data preparation phase must be well conducted. Thus, the objective of this paper is to investigate the effect of data preparation on the ML prediction accuracy of Gas Turbine (GT) performance decay. First, a data cleaning technique for robust Linear Regression imputation is proposed based on Mixed Integer Linear Programming. Then, experiments are conducted to compare the effect of commonly used data cleaning, normalization and reduction techniques on the ML prediction accuracy. Results revealed that the best prediction accuracy of GT decay, found with the k-Nearest Neighbors ML algorithm, considerably deteriorates when changing the data preparation steps and/or techniques. This study has shown that, for effective CBM application in industry, there is a need to develop a systematic methodology for the design and selection of adequate data preparation steps and techniques with the proposed ML algorithms.
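The core experiment, the same ML model scored under different data preparation pipelines, can be sketched as follows; the synthetic data and pipeline choices are placeholders, not the paper's GT dataset or exact steps.

```python
# Sketch: one k-NN model evaluated under several preparation pipelines.
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = make_regression(n_samples=500, n_features=16, noise=5, random_state=0)

pipelines = {
    "raw": make_pipeline(KNeighborsRegressor()),
    "standardized": make_pipeline(StandardScaler(), KNeighborsRegressor()),
    "minmax+pca": make_pipeline(MinMaxScaler(), PCA(n_components=8),
                                KNeighborsRegressor()),
}
for name, pipe in pipelines.items():
    print(name, cross_val_score(pipe, X, y, cv=5).mean().round(3))
```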



Journal ArticleDOI
TL;DR: This article focuses on the group of cancer patients, comprehensively considers the social, medical resources and family issues to analyze the possible impact of the epidemic on cancer patients' drug treatment and health and makes recommendations forcancer patients' management.
Abstract: Since December 2019, many unexplained viral pneumonia cases have been found in Wuhan City, Hubei Province, China. It was later confirmed that the outbreak's causative agent was a new coronavirus. The virus was temporarily named "2019-novel coronavirus" (2019-nCoV) by the World Health Organization (WHO). The disease caused by 2019-nCoV was called "new coronavirus pneumonia" (Novel Coronavirus Pneumonia, NCP) by the National Health Commission of China and was named "Coronavirus disease 2019" (COVID-19) by the WHO. The outbreak of NCP seriously affected the lives of the public. This article focuses on the group of cancer patients and comprehensively considers social and medical resources and family issues to analyze the possible impact of the epidemic on cancer patients' drug treatment and health, making recommendations for cancer patients' management.

Journal ArticleDOI
TL;DR: The proposed Interval Type-2 fuzzy association rule mining approach is able to eliminate redundant rules, which reduces the number of generated rules by 39.5% and memory usage by 22.6%, and discovers associative rules with a minimum number of symptoms at confidence values as high as 91%.
Abstract: In the literature, several methods explored to analyze breast cancer datasets have failed to sufficiently handle the quantitative attribute sharp-boundary problem and to resolve inter- and intra-uncertainties in breast cancer dataset analysis. In this study, an Interval Type-2 fuzzy association rule mining approach is proposed for pattern discovery in a breast cancer dataset. In the first part of this analysis, the Interval Type-2 fuzzification of the breast cancer dataset is carried out using the Hao and Mendel approach. In the second part, the FP-growth algorithm is adopted for associative pattern discovery from the fuzzified dataset of the first part. To define the intuitive words for breast cancer determinant factors and the expert data intervals, thirty (30) medical experts from specialized hospitals were consulted through a questionnaire polling method. To establish the adequacy of the linguistic words defined by the experts, the Jaccard similarity measure is used. This analysis is able to discover associative rules with a minimum number of symptoms at confidence values as high as 91%. It also identifies High Bare Nuclei and High Uniformity of Cell Shape as strong determinant factors for diagnosing breast cancer. The proposed approach performed better in terms of rules generated when compared with traditional quantitative association rule mining. It is able to eliminate redundant rules, which reduces the number of generated rules by 39.5% and memory usage by 22.6%. The discovered rules are viable for building a comprehensive and compact expert-driven knowledge base for breast cancer decision support or expert systems.
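The second stage, FP-growth followed by rule extraction, can be sketched with mlxtend on one-hot transactions; the toy items below merely stand in for the fuzzified breast cancer attributes.

```python
# Sketch of FP-Growth plus association-rule extraction via mlxtend;
# transactions and thresholds are illustrative placeholders.
import pandas as pd
from mlxtend.frequent_patterns import association_rules, fpgrowth

transactions = pd.DataFrame({
    "high_bare_nuclei":     [1, 1, 0, 1, 1],
    "high_cell_shape_unif": [1, 1, 0, 1, 0],
    "malignant":            [1, 1, 0, 1, 0],
}).astype(bool)

frequent = fpgrowth(transactions, min_support=0.4, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.9)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```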

Journal ArticleDOI
TL;DR: The aim of the paper was to show how a static data mining model is used to extract defects together with the PSO algorithm, and to develop an optimized software flaw prediction system based on the data mining techniques Association Rule Mining, Decision Tree, Naive Bayes and Classification integrated with the Particle Swarm Optimization technique.
Abstract: The implementation of software must produce flawless results without any inadequacies. A software imperfection evaluation scheme determines defective components in software. The final product should have minor or negligible shortcomings in order to yield high-quality software. Software quality metrics are a division of software metrics that spotlight the quality aspects of the product. A software flaw prediction system helps in the early discovery of flaws, contributing to their timely removal and producing a quality software system through numerous metrics. The aim of the paper was to show how a static data mining model is used to extract defects together with the PSO algorithm. Another aim of the research was to develop an optimized software flaw prediction system based on the data mining techniques Association Rule Mining, Decision Tree, Naive Bayes and Classification integrated with the Particle Swarm Optimization technique. The proposed software flaw prediction system, built on data mining techniques with the Particle Swarm Optimization algorithm, has been verified and the results compared. The proposed system is very useful for identifying the relationships between quality metrics and potentially defective modules. The optimized data mining systems achieved accurate prediction of these defective modules. In the future, optimized data mining systems can be improved by the use of different platforms and particularly by improving data mining using PSO algorithms. It is necessary to develop algorithms that can identify faults in advance, which will minimize costs and promote the quality of developed software systems. Future optimized data mining systems will improve the relationship between quality metrics and potentially defective modules, which will lead to improved performance, productivity and lower operation costs.
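The Particle Swarm Optimization component follows a standard update loop, sketched below on a toy objective; the inertia and acceleration coefficients are common defaults, not the paper's settings.

```python
# Minimal PSO loop minimizing a toy quadratic; w, c1, c2 are assumptions.
import numpy as np

def pso(f, dim=2, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    x = rng.uniform(-5, 5, (n_particles, dim))      # positions
    v = np.zeros_like(x)                            # velocities
    pbest, pbest_val = x.copy(), np.apply_along_axis(f, 1, x)
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

print(pso(lambda p: (p ** 2).sum()))
```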

Journal ArticleDOI
TL;DR: This literature review paper has analyzed topic modeling methods from different aspects and identified the research gap between topic modeling in English and Bangla language and identified several types of topic modeling techniques, such as Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), Support Vector Machine (SVM), Bi-term Topic Modeling (BTM).
Abstract: Due to the enormous growth of information and technology, digitized texts and data are being generated at an immense rate. Therefore, identifying the main topics in a vast collection of documents by humans is merely impossible. Topic modeling is a statistical framework that infers the latent and underlying topics from text documents, corpora, or electronic archives through a probabilistic approach. It is a promising field in Natural Language Processing (NLP). Though many researchers have researched this field, only a little significant research has been done for Bangla. In this literature review paper, we have followed a systematic approach for reviewing topic modeling studies published from 2003 to 2020. We have analyzed topic modeling methods from different aspects and identified the research gap between topic modeling in the English and Bangla languages. After analyzing these papers, we have identified several types of topic modeling techniques, such as Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), Support Vector Machine (SVM) and Bi-term Topic Modeling (BTM). Furthermore, this review paper also highlights the real-world applications of topic modeling. Several evaluation methods were used to evaluate these models’ performances, which we have discussed in this study. We conclude by mentioning the huge future research scope for topic modeling in Bangla.
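A minimal sketch of LDA, the most widely used of the listed techniques, on a toy corpus; a Bangla corpus would need its own tokenization.

```python
# LDA topic modeling with scikit-learn on an illustrative toy corpus.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["cricket match score runs wicket",
        "election vote party government minister",
        "runs wicket innings cricket team",
        "minister parliament government policy vote"]

counts = CountVectorizer().fit(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts.transform(docs))

terms = counts.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-3:]]
    print(f"topic {k}:", top)
```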

Journal ArticleDOI
TL;DR: The results show that ECCHC performs poorly in the security analysis on grayscale and RGB images, which leads to the conclusion that it is not suitable for encrypting grayscale and RGB images.
Abstract: The advancement of communication technology helps individuals share images over the Internet. However, sharing through insecure channels may expose the images to certain attacks that will compromise their confidentiality. Image encryption is one of the methods used to protect against this confidentiality threat. The Hill Cipher has been applied in image encryption because of its simple operation and fast computation, but it also possesses a weak security level, as it requires the sender and receiver to use and share the same private key over an insecure channel. Thus, many solutions have been proposed utilizing hybrid approaches to the Hill Cipher, one of which is the Elliptic Curve Cryptosystem together with Hill Cipher (ECCHC), to utilize the beauty of the Hill Cipher while managing its weaknesses. However, ECCHC had only been tested on four images, which leads to inaccuracy of the results. Thus, this study extended the experiments to 209 images from the USC-SIPI database in order to investigate the efficiency of ECCHC. The results show that ECCHC performs poorly in the security analysis on grayscale and RGB images, which leads to the conclusion that it is not suitable for encrypting grayscale and RGB images.
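The Hill Cipher core that ECCHC builds on is a block-wise matrix multiplication modulo 256, sketched below; the key matrix is an illustrative choice (its determinant must be coprime with 256, i.e., odd, for decryption to be possible).

```python
# Sketch of the basic Hill Cipher step on image data: encrypt pixel
# blocks by multiplying with a key matrix modulo 256.
import numpy as np

key = np.array([[3, 2], [5, 7]])   # det = 11, odd => invertible mod 256

def hill_encrypt(pixels: np.ndarray, key: np.ndarray) -> np.ndarray:
    flat = pixels.astype(np.int64).reshape(-1, key.shape[0])  # pixel blocks
    return (flat @ key % 256).astype(np.uint8).reshape(pixels.shape)

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)  # placeholder image
print(hill_encrypt(img, key))
```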

Journal ArticleDOI
TL;DR: This study presents a two-stage heuristic feature selection method to classify sports articles using Tabu search and Cuckoo search via Levy flight, and shows significant improvements in the overall accuracy.
Abstract: Sentiment analysis is one of the most popular domains for natural language text classification, crucial for improving information extraction. However, massive data availability is one of the biggest problems for opinion mining due to accuracy considerations. Selecting high discriminative features from an opinion mining database is still an ongoing research topic. This study presents a two-stage heuristic feature selection method to classify sports articles using Tabu search and Cuckoo search via Levy flight. Levy flight is used to prevent the solution from being trapped at local optima. Comparative results on a benchmark dataset prove that our method shows significant improvements in the overall accuracy from 82.6% up to 89.5%.
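The Lévy flight component the abstract credits with escaping local optima is commonly generated with Mantegna's algorithm, sketched below; beta = 1.5 is a conventional choice, not necessarily the paper's.

```python
# Lévy flight step sizes via Mantegna's algorithm, as used in Cuckoo search.
import math
import numpy as np

def levy_step(dim: int, beta: float = 1.5, rng=np.random.default_rng()):
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta
                * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)        # heavy-tailed step sizes

print(levy_step(5))
```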

Journal ArticleDOI
TL;DR: This study presents a Sequence-to-Sequence (Seq2Seq) parsing model for the NL to SQL task, powered by the Transformers Architecture exploring the two Language Models (LM): Text-To-Text Transfer Transformer (T5) and the Multilingual pre-trained Text- to-Text Trans transformer (mT5).
Abstract: Using Natural Language (NL) to interact with relational databases allows users from any background to easily query and analyze large amounts of data. This requires a system that understands user questions and automatically converts them into a structured query language such as SQL. The best performing Text-to-SQL systems use supervised learning (usually formulated as a classification problem) by approaching this task as a sketch-based slot-filling problem, or by first converting questions into an Intermediate Logical Form (ILF) and then converting it to the corresponding SQL query. However, non-supervised modeling that directly converts questions to SQL queries has proven more difficult. In this sense, we propose an approach to directly translate NL questions into SQL statements. In this study, we present a Sequence-to-Sequence (Seq2Seq) parsing model for the NL-to-SQL task, powered by the Transformer architecture, exploring two Language Models (LM): The Text-To-Text Transfer Transformer (T5) and the Multilingual pre-trained Text-To-Text Transformer (mT5). Besides, we adopt the transformation-based learning algorithm to update the aggregation predictions based on association rules. The resulting model achieves a new state-of-the-art on the WikiSQL dataset for weakly supervised SQL generation.
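A hedged sketch of Seq2Seq inference with T5 via Hugging Face transformers; "t5-small" is a generic checkpoint and the prompt format is an assumption, not the authors' fine-tuned WikiSQL model.

```python
# Seq2Seq inference skeleton for an NL-to-SQL style prompt; a model
# fine-tuned on WikiSQL would replace the generic checkpoint below.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

question = ("translate to SQL: How many players are taller than 190 cm? "
            "columns: name, height_cm, team")      # assumed prompt format
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```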

Journal ArticleDOI
TL;DR: The proposed Improved FP-tree construction algorithm has immensely improved tree construction time by resourcefully using the node links maintained in the header table to manage the list of same-item nodes in the FP-tree.
Abstract: Incremental mining of frequent patterns has attracted the attention of researchers in the last two decades. Researchers have explored the frequent pattern mining from incremental database problems by considering that the complete database to be processed can be accommodated in the system's main memory even after the database is updated very frequently. The FP-tree-based approaches were able to draw more interest because of their compact representation and requirement of a minimum number of database scans. Researchers have developed a few FP-tree-based methods to handle the incremental scenario by adjusting or restructuring the tree prefix paths. Although these approaches have managed to solve the re-computation problem by constructing a complete pattern tree data structure using only one database scan, restructuring the prefix paths for each transaction is a computationally costly task, leading to high tree construction time. If the FP-tree construction process can be supported with suitable data structures, reconstruction of the FP-tree from scratch may be less time consuming than the restructuring approaches in the incremental scenario. In this study, we have proposed a tree data structure called the Improved Frequent Pattern tree (Improved FP-tree). The proposed Improved FP-tree construction algorithm has immensely improved tree construction time by resourcefully using the node links maintained in the header table to manage the list of same-item nodes in the FP-tree. The experimental results emphasize the significance of the proposed Improved FP-tree construction algorithm over a few conventional incremental FP-tree construction algorithms with prefix path restructuring.
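The header-table node-link idea can be sketched in a few lines: each new tree node is prepended to its item's chain, so all same-item nodes are reachable without traversing the tree. This is a generic FP-tree sketch, not the paper's Improved FP-tree algorithm.

```python
# FP-tree nodes chained through node links held in a header table.
class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children, self.link = 1, {}, None  # link -> next same-item node

def insert(transaction, root, header):
    node = root
    for item in transaction:
        if item in node.children:
            node.children[item].count += 1
        else:
            child = FPNode(item, node)
            node.children[item] = child
            # prepend the new node to the item's node-link chain
            child.link, header[item] = header.get(item), child
        node = node.children[item]

root, header = FPNode(None, None), {}
for t in [["a", "b"], ["a", "c"], ["a", "b", "c"]]:
    insert(t, root, header)

n = header["b"]                   # walk all "b" nodes via the node links
while n:
    print("b node with count", n.count)
    n = n.link
```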

Journal ArticleDOI
TL;DR: A CDR-based CA modelling that involves the Cumulonimbus clouds by considering three airplane maneuvers, i.e., Velocity, angle Turn and Altitude level Change (VTAC) and results show that collisions between airplanes and clouds can be avoided with minimum change of the initial airplane velocity, angle and altitude levels.
Abstract: An Air Traffic Controller (ATC) system aims to manage airline traffic to prevent airplane collisions, a task called Collision Avoidance (CA). The study of CA, called Conflict Detection and Resolution (CDR), becomes more critical as airline traffic grows significantly each year. Previous studies used optimization algorithms for CDR and did not involve the presence of cumulonimbus clouds. Many such clouds can be found in tropical regions like Indonesia. Therefore, involving such clouds in CDR optimization algorithms will be significant in Indonesia. We developed a CDR-based CA model that involves Cumulonimbus (CB) clouds by considering three airplane maneuvers, i.e., Velocity, angle Turn and Altitude level Change (VTAC). Our optimization algorithm is developed based on a Mixed-Integer Programming (MIP) solver due to its efficiency. This proposed algorithm requires two input data, namely the initial airplane and cloud states and the flight parameters such as velocity, angle and altitude levels. The outputs of our VTAC optimization algorithm are the optimum speed, altitude and angle turn of an airplane, determined based on the currently calculated variables. Extensive experiments have been conducted to validate the proposed approach and the experiment results show that collisions between airplanes and clouds can be avoided with minimum change of the initial airplane velocity, angle and altitude levels. The VTAC algorithm produced a longer distance to avoid collision between airplanes, by at least 1 Nautical Mile (NM), compared to the VAC algorithm. The addition of the angle maneuver in the VTAC algorithm improved the result significantly.
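A toy MIP in the spirit of the resolution step, using PuLP: choose minimal speed adjustments for two airplanes so their along-track gap stays above a separation minimum. The numbers and single-constraint model are illustrative only, far simpler than the paper's VTAC formulation.

```python
# Toy conflict-resolution program: minimize total speed-change magnitude
# subject to a separation constraint; all values are placeholders.
from pulp import LpMinimize, LpProblem, LpVariable, lpSum

prob = LpProblem("conflict_resolution", LpMinimize)
dv = [LpVariable(f"dv{i}", -50, 50) for i in range(2)]   # speed changes (knots)
mag = [LpVariable(f"mag{i}", 0) for i in range(2)]       # |dv_i| via linearization

t, gap0, sep = 0.1, 4.0, 5.0       # hours ahead, current gap (NM), required gap (NM)
prob += gap0 + t * (dv[0] - dv[1]) >= sep   # keep at least 5 NM apart at time t
for i in range(2):                           # mag_i >= |dv_i|
    prob += mag[i] >= dv[i]
    prob += mag[i] >= -dv[i]
prob += lpSum(mag)                           # objective: minimal total maneuver
prob.solve()
print([v.value() for v in dv])
```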