
Showing papers in "The Journal of Supercomputing in 2023"



Journal ArticleDOI
TL;DR: YOLOv5-CA, as discussed by the authors, adds a coordinate attention (CA) module into the baseline backbone following two different schemes, namely YOLOv5s-CA and YOLOv5s-C3CA.
Abstract: Using face masks is one of the most effective deterrent methods to prevent the spread of the virus during the COVID-19 pandemic. Deep learning face mask detection networks have been implemented into COVID-19 monitoring systems to provide effective supervision for public areas. However, previous works have limitations: the challenge of real-time performance (i.e., fast inference with low accuracy) and of training datasets. The current study aims to propose a comprehensive solution by creating a new face mask dataset and improving the YOLOv5 baseline to balance accuracy and detection time. In particular, we improve YOLOv5 by adding a coordinate attention (CA) module into the baseline backbone following two different schemes, namely YOLOv5s-CA and YOLOv5s-C3CA. In detail, we train three models with a Kaggle dataset of 853 images consisting of three categories: without mask ("NM"), with mask ("M"), and incorrectly worn mask ("IWM"). The experimental results show that our modified YOLOv5 with the CA module achieves the highest accuracy, mAP@0.5 of 93.9% compared with 87% for the baseline, with a detection time per image of 8.0 ms (125 FPS). In addition, we build an integrated system combining the improved YOLOv5-CA and an auto-labeling module to create a new face mask dataset of 7110 images, with more than 3500 labels for the three categories, from YouTube videos. Our proposed YOLOv5-CA and the state-of-the-art detection models (i.e., YOLOX, YOLOv6, and YOLOv7) are trained on our 7110-image dataset. On our dataset, YOLOv5-CA performance improves further, reaching mAP@0.5 of 96.8%. The results indicate the improvement of the proposed YOLOv5-CA model compared with several state-of-the-art works.
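
The coordinate attention (CA) block referred to above is a published attention mechanism (Hou et al., 2021). As a rough illustration of what such a module looks like when dropped into a detector backbone, here is a minimal PyTorch sketch; the class name, reduction ratio, and activation are assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Minimal coordinate attention block (after Hou et al., 2021) for illustration."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # pool along width  -> (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # pool along height -> (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                               # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)           # (B, C, W, 1)
        y = torch.cat([x_h, x_w], dim=2)                   # encode both directions together
        y = self.act(self.bn1(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                              # height attention
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))          # width attention
        return x * a_h * a_w

# e.g. applied to a 256-channel backbone feature map
out = CoordinateAttention(256)(torch.randn(1, 256, 40, 40))
```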

4 citations




Journal ArticleDOI
TL;DR: In this paper, an enhanced golden jackal optimization (EGJO) method with the elite opposition-based learning technique and the simplex technique is proposed to address adaptive infinite impulse response system identification.
Abstract: Golden jackal optimization (GJO) is inspired by the cooperative attacking behavior of golden jackals and mainly simulates searching for prey, stalking and enclosing prey, and pouncing on prey to solve complicated optimization problems. However, the basic GJO has the disadvantages of premature convergence, a slow convergence rate and low computation precision. To enhance the overall search and optimization abilities, an enhanced golden jackal optimization (EGJO) method with the elite opposition-based learning technique and the simplex technique is proposed to address adaptive infinite impulse response system identification. The intention is to minimize the error fitness value and obtain the appropriate control parameters. The elite opposition-based learning technique boosts population diversity, enhances the exploration ability, extends the search range and avoids search stagnation. The simplex technique accelerates the search process, enhances the exploitation ability, improves the computational precision and increases the optimization depth. EGJO can not only achieve complementary advantages to avoid search stagnation but also balance exploration and exploitation to arrive at the best value. Three sets of experiments are used to verify the effectiveness and feasibility of EGJO. The experimental results clearly demonstrate that the optimization efficiency and recognition accuracy of EGJO are superior to those of AOA, GTO, HHO, MDWA, RSO, WOA, TSA and GJO. EGJO has a faster convergence rate, higher computation precision, better control parameters and better fitness value, and it is stable and resilient in solving the IIR system identification problem.
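
Elite opposition-based learning, one of the two techniques combined with GJO above, reflects each elite solution around the elites' dynamic bounds to widen the search. The following NumPy sketch illustrates that rule under assumed details (elite fraction, a single random scaling factor, clipping to the original bounds); it is not the paper's exact formulation.

```python
import numpy as np

def elite_opposition(population, fitness, lb, ub, elite_frac=0.2):
    """Generate opposite solutions of the elite individuals inside the
    elites' dynamic bounds (a common elite opposition-based learning rule).
    population: (N, dim) array; fitness: (N,) array (minimization assumed)."""
    n_elite = max(1, int(elite_frac * len(population)))
    elite = population[np.argsort(fitness)[:n_elite]]      # best individuals
    d_lo, d_hi = elite.min(axis=0), elite.max(axis=0)      # dynamic bounds of the elites
    k = np.random.rand()                                   # random scaling factor
    opposite = k * (d_lo + d_hi) - elite                   # opposition in the dynamic space
    return np.clip(opposite, lb, ub)                       # keep inside the search bounds
```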

3 citations



Journal ArticleDOI
TL;DR: In this paper, the authors extend and improve the SPBench benchmarking framework to support dynamic micro-batching and data stream frequency management, and propose a set of algorithms that generate the most commonly used frequency patterns for benchmarking stream processing in related work.
Abstract: Latency and throughput are often critical performance metrics in stream processing. Applications’ performance can fluctuate depending on the input stream. This unpredictability is due to the variety in data arrival frequency and size, complexity, and other factors. Researchers are constantly investigating new ways to mitigate the impact of these variations on performance with self-adaptive techniques involving elasticity or micro-batching. However, there is a lack of benchmarks capable of creating test scenarios to further evaluate these techniques. This work extends and improves the SPBench benchmarking framework to support dynamic micro-batching and data stream frequency management. We also propose a set of algorithms that generate the most commonly used frequency patterns for benchmarking stream processing in related work, allowing the creation of a wide variety of test scenarios. To validate our solution, we use SPBench to create custom benchmarks and evaluate the impact of micro-batching and data stream frequency on the performance of Intel TBB and FastFlow, two libraries that leverage stream parallelism for multi-core architectures. Our results demonstrate that our test cases did not benefit from micro-batches on multi-cores. For different data stream frequency configurations, TBB ensured the lowest latency, while FastFlow assured higher throughput in shorter pipelines.
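
To give a concrete feel for the data stream frequency management described above, the sketch below generates one of the commonly used patterns, a sinusoidal ("wave") items-per-second schedule. The function name and parameters are hypothetical and do not correspond to SPBench's actual API.

```python
import math

def wave_frequency_pattern(duration_s, period_s, min_ips, max_ips, step_s=1.0):
    """Return a list of target item-per-second rates following a sine wave,
    one value per step_s interval over duration_s seconds."""
    rates = []
    t = 0.0
    while t < duration_s:
        phase = math.sin(2 * math.pi * t / period_s)             # -1 .. 1
        rate = min_ips + (max_ips - min_ips) * (phase + 1) / 2   # rescale to [min, max]
        rates.append(rate)
        t += step_s
    return rates

# e.g. a 60 s run oscillating between 100 and 1000 items/s with a 20 s period
schedule = wave_frequency_pattern(60, 20, 100, 1000)
```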

2 citations





Journal ArticleDOI
TL;DR: In this article, the authors present techniques for dynamic load balancing of parallel cellular automata execution for the case of a domain space partitioned along two dimensions; starting from general closed-form expressions that allow the optimal workload assignment to be computed dynamically when partitioning takes place along only one dimension, they tailor the procedure to allow partitioning and balancing along both dimensions.
Abstract: In this paper, techniques for dynamic load balancing of parallel cellular automata execution are presented for the case of a domain space partitioned along two dimensions. Starting from general closed-form expressions that allow the optimal workload assignment to be computed dynamically when partitioning takes place along only one dimension, we tailor the procedure to allow partitioning and balancing along both dimensions. Both qualitative and quantitative experiments are carried out to assess the performance improvement from applying load balancing to a two-dimensionally partitioned domain, especially when the load balancing takes place along both dimensions.
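
The one-dimensional balancing idea the paper starts from can be read as: assign each process a number of rows inversely proportional to its measured per-row cost, so that all processes finish at roughly the same time, then apply the same principle along the second dimension. The sketch below illustrates only that proportional-assignment idea and is not the paper's closed-form expressions.

```python
def balanced_rows(total_rows, per_row_times):
    """Assign rows to processes proportionally to their measured speed
    (1 / per-row time), so all processes take roughly the same time."""
    speeds = [1.0 / t for t in per_row_times]
    total_speed = sum(speeds)
    rows = [int(round(total_rows * s / total_speed)) for s in speeds]
    rows[-1] += total_rows - sum(rows)   # fix rounding so rows sum to total_rows
    return rows

# e.g. 1000 rows over 4 processes whose per-row step times were measured as:
print(balanced_rows(1000, [0.8, 1.0, 1.2, 1.0]))
```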


Journal ArticleDOI
TL;DR: In this paper, the authors conduct an extensive comparison and evaluation of homomorphic cryptosystems' performance based on their experimental results and discuss the resilience of HE schemes to different kinds of attacks, such as indistinguishability under chosen plaintext attack and integer factorization attacks on classical and quantum computers.
Abstract: With the increased need for data confidentiality in various applications of our daily life, homomorphic encryption (HE) has emerged as a promising cryptographic topic. HE enables computations to be performed directly on encrypted data (ciphertexts) without prior decryption. Since the results of the calculations remain encrypted and can only be decrypted by the data owner, confidentiality is guaranteed and any third party can operate on ciphertexts without access to decrypted data (plaintexts). Applying a homomorphic cryptosystem in a real-world application depends on its resource efficiency. Several works have compared different HE schemes and outlined the stakes of this research field. However, the existing works either do not deal with recently proposed HE schemes (such as CKKS) or focus only on one type of HE. In this paper, we conduct an extensive comparison and evaluation of homomorphic cryptosystems’ performance based on their experimental results. The study covers all three families of HE, including several notable schemes such as BFV, BGV, FHEW, TFHE, CKKS, RSA, El-Gamal, and Paillier, as well as their implementation specifics in widely used HE libraries, namely Microsoft SEAL, PALISADE, and HElib. In addition, we also discuss the resilience of HE schemes to different kinds of attacks, such as indistinguishability under chosen plaintext attack and integer factorization attacks on classical and quantum computers.
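
To make the homomorphic property concrete, here is a toy example of the (additively homomorphic) Paillier scheme mentioned above, with insecurely small hard-coded primes; production use relies on the benchmarked libraries (Microsoft SEAL, PALISADE, HElib) with proper parameters.

```python
from math import gcd

# Toy Paillier keypair with insecurely small primes (illustration only).
p, q = 11, 13
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)
g = n + 1

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)             # modular inverse mod n

def encrypt(m, r):
    # c = g^m * r^n mod n^2 ; r must be coprime with n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

c1, c2 = encrypt(7, 5), encrypt(12, 8)
# Multiplying ciphertexts adds the underlying plaintexts (mod n).
assert decrypt((c1 * c2) % n2) == (7 + 12) % n
```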

Journal ArticleDOI
TL;DR: In this paper, the authors present a new backend for RDataFrame distributed over the OSCAR tool, an open source framework that supports serverless computing, and introduce new ways, relative to the AWS Lambda-based prototype, to synchronize the work of functions.
Abstract: CERN (Centre Européen pour la Recherche Nucléaire) is the largest research centre for high-energy physics (HEP). It offers unique computational challenges as a result of the large amount of data generated by the Large Hadron Collider. CERN has developed and supports a software framework called ROOT, which is the de facto standard for HEP data analysis. This framework offers a high-level and easy-to-use interface called RDataFrame, which allows managing and processing large data sets. In recent years, its functionality has been extended to take advantage of distributed computing capabilities. Thanks to its declarative programming model, the user-facing API can be decoupled from the actual execution backend. This decoupling allows physics analyses to scale automatically to thousands of computational cores over various types of distributed resources. In fact, the distributed RDataFrame module already supports the use of established general industry engines such as Apache Spark or Dask. Notwithstanding the foregoing, these current solutions will not be sufficient to meet future requirements in terms of the amount of data that the new projected accelerators will generate. It is of interest, for this reason, to investigate a different approach, the one offered by serverless computing. Based on a first prototype using AWS Lambda, this work presents the creation of a new backend for RDataFrame distributed over the OSCAR tool, an open source framework that supports serverless computing. The implementation introduces new ways, relative to the AWS Lambda-based prototype, to synchronize the work of functions.
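
The execution pattern that all distributed RDataFrame backends share, splitting the event range into chunks, running the declared analysis on each chunk in an independent worker (a Spark/Dask task, or a Lambda/OSCAR function), and merging the partial results, can be illustrated schematically with local futures standing in for serverless functions. This is an analogy only, not the ROOT or OSCAR API.

```python
from concurrent.futures import ProcessPoolExecutor

def split_ranges(n_events, n_chunks):
    """Split [0, n_events) into contiguous chunks, one per worker."""
    step = -(-n_events // n_chunks)          # ceiling division
    return [(i, min(i + step, n_events)) for i in range(0, n_events, step)]

def mapper(rng):
    """Stand-in for one worker/function: run the declared analysis on its
    event range and return a partial result (here, a simple sum)."""
    start, stop = rng
    return sum(range(start, stop))

def reducer(partials):
    """Merge partial results into the final one (histograms would be added)."""
    return sum(partials)

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        partials = list(pool.map(mapper, split_ranges(1_000_000, 8)))
    print(reducer(partials))
```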




Journal ArticleDOI
TL;DR: In this paper, the authors proposed a transfer learning method to tackle the classification and regression problems simultaneously, which might improve performance on each of the two problems, and also proposed a modification to current deep learning models for multi-tasking.
Abstract: Non-intrusive load monitoring (NILM) is the problem of predicting the status or consumption of individual domestic appliances only from knowledge of the aggregated power load. NILM is often formulated as a classification (ON/OFF) problem for each device. However, the training datasets gathered by smart meters do not contain these labels, but only the electric consumption at every time interval. This paper addresses a fundamental methodological problem in how a NILM problem is posed, namely how the different possible thresholding methods lead to different classification problems. Standard datasets and NILM deep learning models are used to illustrate how the choice of thresholding method affects the output results. Some criteria that should be considered for the choice of such methods are also proposed. Finally, we propose a slight modification to current deep learning models for multi-tasking, i.e., tackling the classification and regression problems simultaneously. Transfer learning between both problems might improve performance on each of them.
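
The paper's central point is that ON/OFF labels are not present in the data but are produced by a thresholding choice. The sketch below shows two simple such choices, a fixed power threshold and a variant that also discards activations shorter than a minimum duration; the threshold values are illustrative only.

```python
import numpy as np

def fixed_threshold(power, threshold_w=15.0):
    """Label an appliance ON wherever its power exceeds a fixed threshold."""
    return (power > threshold_w).astype(int)

def min_duration_threshold(power, threshold_w=15.0, min_on_samples=3):
    """Same as above, but discard ON activations shorter than min_on_samples."""
    on = fixed_threshold(power, threshold_w)
    out = on.copy()
    start = None
    for i, v in enumerate(np.append(on, 0)):    # sentinel 0 to close the last run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start < min_on_samples:
                out[start:i] = 0                # too short: treat as OFF
            start = None
    return out

power = np.array([0, 2, 40, 55, 3, 60, 0, 0, 70, 72, 68, 1], dtype=float)
print(fixed_threshold(power), min_duration_threshold(power))
```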

Journal ArticleDOI
TL;DR: In this article, a data resampling technique is proposed based on Adaptive Synthetic (ADASYN) and Tomek Links algorithms in combination with different deep learning models to mitigate the class imbalance problem.
Abstract: Network intrusion detection systems (NIDS) are the most common tool used to detect malicious attacks on a network. They help prevent the ever-increasing variety of attacks and provide better security for the network. NIDS are classified into signature-based and anomaly-based detection. The most common type of NIDS is the anomaly-based NIDS, which is based on machine learning models and is able to detect attacks with high accuracy. However, in recent years, NIDS have achieved even better results in detecting already known and novel attacks with the adoption of deep learning models. Benchmark datasets in intrusion detection try to simulate real network traffic by including more normal traffic samples than attack samples. This causes the training data to be imbalanced and makes it difficult for the NIDS to detect certain types of attacks. In this paper, a data resampling technique based on the Adaptive Synthetic (ADASYN) and Tomek Links algorithms is proposed in combination with different deep learning models to mitigate the class imbalance problem. The proposed model is evaluated on the benchmark NSL-KDD dataset using accuracy, precision, recall and F-score metrics. The experimental results show that in binary classification, the proposed method improves the performance of the NIDS and outperforms state-of-the-art models with an achieved accuracy of 99.8%. In multi-class classification, the results were also improved, outperforming state-of-the-art models with an achieved accuracy of 99.98%.
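
The resampling combination described above can be reproduced with the imbalanced-learn library by applying ADASYN oversampling followed by Tomek-link cleaning. The sketch below uses a synthetic dataset and default parameters as stand-ins for the paper's actual NSL-KDD preprocessing.

```python
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import ADASYN
from imblearn.under_sampling import TomekLinks

# Synthetic imbalanced data standing in for NSL-KDD features.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)

X_res, y_res = ADASYN(random_state=0).fit_resample(X, y)   # oversample the minority class
X_res, y_res = TomekLinks().fit_resample(X_res, y_res)     # remove Tomek-link pairs

print(Counter(y), Counter(y_res))
```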


Journal ArticleDOI
TL;DR: In this paper, a framework based on the Elastic Stack (ELK) is proposed to process and store log data in real time from different users and applications, exploiting the advantages of the ELK-based software architecture and of the Kubernetes platform.
Abstract: Nowadays, user and application logs arrive so quickly that it is almost impossible to analyse them in real time without using high-performance systems and platforms. In cybersecurity, human behaviour is responsible, directly or indirectly, for the most common attacks (i.e. ransomware and phishing). To monitor user behaviour, it is necessary to process fast user logs coming from different and heterogeneous sources, even when part of the data or some entire sources are missing. A framework based on the Elastic Stack (ELK) to process and store log data in real time from different users and applications is proposed for this aim. This system generates an ensemble of models to classify user behaviour and detect anomalies in real time, exploiting the advantages of the ELK-based software architecture and of the Kubernetes platform. In addition, a distributed evolutionary algorithm is used to classify the users by exploiting their digital footprints derived from many data sources. Experiments conducted on two real-life data sets verify the approach’s effectiveness in detecting anomalies in user behaviour, coping with missing data and lowering the number of false alarms.

Journal ArticleDOI
TL;DR: In this article, a machine learning model is applied to reduce the number of VM migrations and energy consumption in the cloud computing environment; the proposed approach is based on improving the VM migration process and selection, and has been benchmarked against the JVCMMD and EVSP solutions.
Abstract: Cloud computing is a paradigm allowing access to physical and application resources online via the Internet. These resources are virtualized using virtualization software to make them available to users as a service. The virtual machine (VM) migration technique provided by virtualization technology impacts the performance of the cloud and is a significant concern in this environment. When allocating resources, the distribution of VMs is unbalanced, and their movement from one server to another can increase energy consumption and network overhead, necessitating an improvement in VM migrations. This paper addresses the VM migration issue by applying a machine learning model to reduce the number of VM migrations and energy consumption. The proposed algorithm (named VMLM) is based on improving the VM migration process and selection. It has been benchmarked against the JVCMMD and EVSP solutions. The simulation results demonstrate the efficiency of our proposal, which comprises two phases: a machine learning preparation stage and a VM migration stage.


Journal ArticleDOI
TL;DR: In this article, a concatenated fusion between the proposed CNN and a fully connected model has been formulated, utilized, and tested, which can enhance recognition performance in a superior manner compared with the latest state-of-the-art studies.
Abstract: In the last decade, the need for a non-contact biometric model for recognizing candidates has increased, especially after the COVID-19 pandemic appeared and spread worldwide. This paper presents a novel deep convolutional neural network (CNN) model that guarantees quick, safe, and precise human authentication via their poses and walking style. A concatenated fusion between the proposed CNN and a fully connected model has been formulated, utilized, and tested. The proposed CNN extracts human features from two main sources: (1) human silhouette images, following a model-free approach, and (2) human joints, limbs, and static joint distances, following a model-based approach via a novel fully connected deep-layer structure. The most commonly used datasets, the CASIA gait families, have been utilized and tested. Numerous performance metrics have been evaluated to measure the system quality, including accuracy, specificity, sensitivity, false negative rate, and training time. Experimental results reveal that the proposed model can enhance recognition performance in a superior manner compared with the latest state-of-the-art studies. Moreover, the suggested system provides robust real-time authentication under any covariate conditions, scoring 99.8% and 99.6% accuracy in identifying the CASIA (B) and CASIA (A) datasets, respectively.
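
The concatenated fusion described above, a CNN branch over silhouette images joined with a fully connected branch over joint/limb distance features, can be sketched in PyTorch roughly as follows; the layer sizes, input resolution, and number of identities are illustrative assumptions rather than the authors' architecture.

```python
import torch
import torch.nn as nn

class GaitFusionNet(nn.Module):
    def __init__(self, n_joint_features=30, n_classes=124):
        super().__init__()
        # CNN branch: model-free features from silhouette images (1 x 64 x 64).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
        )
        # Fully connected branch: model-based joint/limb distance features.
        self.fc = nn.Sequential(nn.Linear(n_joint_features, 64), nn.ReLU())
        # Classifier over the concatenated (fused) representation.
        self.head = nn.Linear(128 + 64, n_classes)

    def forward(self, silhouette, joints):
        fused = torch.cat([self.cnn(silhouette), self.fc(joints)], dim=1)
        return self.head(fused)

model = GaitFusionNet()
logits = model(torch.randn(2, 1, 64, 64), torch.randn(2, 30))
```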



Journal ArticleDOI
TL;DR: In this paper, an efficient meta-heuristic approach named Multi-objective Artificial Algae (MAA) algorithm is presented for scheduling scientific workflows in a hierarchical fog-cloud environment.
Abstract: With the development of current computing technology, workflow applications have become more important in a variety of fields, including research, education, health care, and scientific experimentation. A workflow application consists of a group of tasks with complicated dependency relationships. It can be difficult to create an acceptable execution sequence while maintaining precedence constraints. Workflow scheduling algorithms (WSA) are gaining more attention from researchers as a real-time concern. Even though a variety of research perspectives have been demonstrated for WSAs, it remains challenging to develop a single coherent algorithm that simultaneously meets a variety of criteria. There is very little research available on WSAs in heterogeneous computing systems. Classical scheduling techniques, evolutionary optimisation algorithms, and other methodologies are the available solutions to this problem. The workflow scheduling problem is regarded as NP-complete. This problem is constrained by various factors, such as Quality of Service, interdependence between tasks, and user deadlines. In this paper, an efficient meta-heuristic approach named the Multi-objective Artificial Algae (MAA) algorithm is presented for scheduling scientific workflows in a hierarchical fog-cloud environment. In the first phase, the algorithm pre-processes the scientific workflow and prepares two task lists. In order to speed up execution, bottleneck tasks are executed with high priority. The MAA algorithm is used to schedule tasks in the following stage to reduce execution time, energy consumption and overall cost. In order to use fog resources effectively, the algorithm also utilises a weighted sum-based multi-objective function. The proposed approach is evaluated using five benchmark scientific workflow datasets. To verify the performance, the proposed algorithm's results are compared to those of conventional and specialised WSAs. In comparison to previous methodologies, the average results demonstrate significant improvements of about 43% in execution time, 28% in energy consumption and 10% in total cost without any trade-offs.
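
The weighted sum-based multi-objective function mentioned above can be expressed as a single scalar fitness over normalized objectives, as in the generic sketch below; the weights and normalization bounds are assumptions, not the paper's exact formulation.

```python
def weighted_sum_fitness(makespan, energy, cost, bounds, weights=(0.4, 0.3, 0.3)):
    """Scalarize (makespan, energy, cost) with a weighted sum after
    min-max normalizing each objective to [0, 1]; lower is better."""
    objectives = (makespan, energy, cost)
    normalized = [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, (lo, hi) in zip(objectives, bounds)
    ]
    return sum(w * v for w, v in zip(weights, normalized))

# e.g. a schedule with makespan 120 s, energy 450 J and cost 3.2 units,
# normalized against the best/worst values observed so far for each objective
score = weighted_sum_fitness(120, 450, 3.2,
                             bounds=[(80, 200), (300, 900), (1.0, 8.0)])
```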