
Showing papers in "Neural Computing and Applications in 2020"


Journal ArticleDOI
TL;DR: A new TCNN(ResNet-50) with a depth of 51 convolutional layers is proposed for fault diagnosis and achieves state-of-the-art results, demonstrating that TCNN(ResNet-50) outperforms other DL models and traditional methods.
Abstract: With the rapid development of smart manufacturing, data-driven fault diagnosis has attracted increasing attention. As one of the most popular methods applied in fault diagnosis, deep learning (DL) has achieved remarkable results. However, because the volume of labeled samples in fault diagnosis is small, the depths of DL models for fault diagnosis are shallow compared with convolutional neural networks in other areas (including ImageNet), which limits their final prediction accuracies. In this research, a new TCNN(ResNet-50) with a depth of 51 convolutional layers is proposed for fault diagnosis. By combining with transfer learning, TCNN(ResNet-50) applies ResNet-50 trained on ImageNet as a feature extractor for fault diagnosis. Firstly, a signal-to-image method is developed to convert time-domain fault signals to RGB image format as the input datatype of ResNet-50. Then, a new structure of TCNN(ResNet-50) is proposed. Finally, the proposed TCNN(ResNet-50) has been tested on three datasets, including the bearing damage dataset provided by KAT datacenter, the motor bearing dataset provided by Case Western Reserve University (CWRU) and a self-priming centrifugal pump dataset. It achieved state-of-the-art results. The prediction accuracies of TCNN(ResNet-50) are as high as 98.95% ± 0.0074, 99.99% ± 0 and 99.20% ± 0, which demonstrates that TCNN(ResNet-50) outperforms other DL models and traditional methods.
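The abstract does not spell out the signal-to-image conversion, so the following is only a minimal sketch under the assumption that the 1-D time-domain signal is min-max scaled to [0, 255] and reshaped into a square grayscale matrix replicated across three channels to match an RGB input:

```python
# Hypothetical signal-to-image conversion (the paper's exact method is
# not detailed in the abstract): normalize a 1-D time-domain signal to
# [0, 255], reshape it into a square matrix, and replicate the plane
# three times as an RGB-like input for ResNet-50.
import math

def signal_to_image(signal, size):
    """Convert the first size*size samples of `signal` into a
    size x size x 3 nested-list 'image' with values in 0..255."""
    samples = signal[:size * size]
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1.0  # avoid division by zero for flat signals
    pixels = [round(255 * (x - lo) / span) for x in samples]
    gray = [pixels[r * size:(r + 1) * size] for r in range(size)]
    return [[[gray[r][c]] * 3 for c in range(size)] for r in range(size)]

sig = [math.sin(0.1 * t) for t in range(16)]
img = signal_to_image(sig, 4)
```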

319 citations


Journal ArticleDOI
TL;DR: An automated detection system for Parkinson’s disease employing a thirteen-layer convolutional neural network (CNN), which overcomes the need for conventional feature representation stages, is proposed.
Abstract: An automated detection system for Parkinson’s disease (PD) employing a convolutional neural network (CNN) is proposed in this study. PD is characterized by the gradual degradation of motor function in the brain. Since it is related to brain abnormality, electroencephalogram (EEG) signals are usually considered for early diagnosis. In this work, we have used the EEG signals of twenty PD and twenty normal subjects. A thirteen-layer CNN architecture, which overcomes the need for the conventional feature representation stages, is implemented. The developed model has achieved a promising performance of 88.25% accuracy, 84.71% sensitivity, and 91.77% specificity. The developed classification model should be validated on a larger population before deployment for clinical usage.

317 citations


Journal ArticleDOI
TL;DR: A new deep learning forecasting model is proposed for the accurate prediction of gold price and movement that exploits the ability of convolutional layers for extracting useful knowledge and learning the internal representation of time-series data as well as the effectiveness of long short-term memory layers for identifying short-term and long-term dependencies.
Abstract: Gold price volatilities have a significant impact on many financial activities of the world. The development of a reliable prediction model could offer insights in gold price fluctuations, behavior and dynamics and ultimately could provide the opportunity of gaining significant profits. In this work, we propose a new deep learning forecasting model for the accurate prediction of gold price and movement. The proposed model exploits the ability of convolutional layers for extracting useful knowledge and learning the internal representation of time-series data as well as the effectiveness of long short-term memory (LSTM) layers for identifying short-term and long-term dependencies. We conducted a series of experiments and evaluated the proposed model against state-of-the-art deep learning and machine learning models. The preliminary experimental analysis illustrated that the utilization of LSTM layers along with additional convolutional layers could provide a significant boost in increasing the forecasting performance.
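A CNN-LSTM forecaster of the kind described above is usually trained on fixed-length windows of past prices paired with the next price. The window length below is a made-up illustrative value; the abstract does not give one:

```python
# Sketch of time-series window preparation for one-step-ahead price
# forecasting: each input is a window of past prices, each target the
# price immediately after that window.

def make_windows(series, window):
    """Return (inputs, targets) for one-step-ahead forecasting."""
    inputs = [series[i:i + window] for i in range(len(series) - window)]
    targets = [series[i + window] for i in range(len(series) - window)]
    return inputs, targets

prices = [1500, 1510, 1505, 1520, 1530, 1525, 1540]  # toy gold prices
X, y = make_windows(prices, window=3)
```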

310 citations


Journal ArticleDOI
TL;DR: It is argued that, beyond improving model interpretability as a goal in itself, machine learning needs to integrate the medical experts in the design of data analysis interpretation strategies; otherwise, machine learning is unlikely to become a part of routine clinical and health care practice.
Abstract: In a short period of time, many areas of science have made a sharp transition towards data-dependent methods. In some cases, this process has been enabled by simultaneous advances in data acquisition and the development of networked system technologies. This new situation is particularly clear in the life sciences, where data overabundance has sparked a flurry of new methodologies for data management and analysis. This can be seen as a perfect scenario for the use of machine learning and computational intelligence techniques to address problems in which more traditional data analysis approaches might struggle. However, this scenario also poses some serious challenges. One of them is model interpretability and explainability, especially for complex nonlinear models. In some areas such as medicine and health care, not addressing such a challenge might seriously limit the chances of adoption, in real practice, of computer-based systems that rely on machine learning and computational intelligence methods for data analysis. In this paper, we reflect on recent investigations about the interpretability and explainability of machine learning methods and discuss their impact on medicine and health care. We pay specific attention to one of the ways in which interpretability and explainability in this context can be addressed, which is through data and model visualization. We argue that, beyond improving model interpretability as a goal in itself, we need to integrate the medical experts in the design of data analysis interpretation strategies. Otherwise, machine learning is unlikely to become a part of routine clinical and health care practice.

272 citations


Journal ArticleDOI
TL;DR: A survey of techniques for implementing and optimizing CNN algorithms on FPGA is presented and is expected to be useful for researchers in the area of artificial intelligence, hardware architecture and system design.
Abstract: Deep convolutional neural networks (CNNs) have recently shown very high accuracy in a wide range of cognitive tasks and, due to this, have received significant interest from researchers. Given the high computational demands of CNNs, custom hardware accelerators are vital for boosting their performance. The high energy efficiency, computing capabilities and reconfigurability of FPGAs make them a promising platform for hardware acceleration of CNNs. In this paper, we present a survey of techniques for implementing and optimizing CNN algorithms on FPGAs. We organize the works into several categories to bring out their similarities and differences. This paper is expected to be useful for researchers in the areas of artificial intelligence, hardware architecture and system design.

248 citations


Journal ArticleDOI
TL;DR: It has been confirmed by experimental results that DEL produces dynamic NN ensembles of appropriate architecture and diversity that demonstrate good generalization ability.
Abstract: This paper presents a novel dynamic ensemble learning (DEL) algorithm for designing ensembles of neural networks (NNs). The DEL algorithm determines the size of the ensemble, i.e., the number of individual NNs, employing a constructive strategy; the number of hidden nodes of individual NNs, employing a constructive–pruning strategy; and different training samples for each individual NN’s learning. For diversity, negative correlation learning has been introduced, and the training samples have been varied across individual NNs, which provides better learning from the whole training set. The major benefits of the proposed DEL compared to existing ensemble algorithms are (1) automatic design of the ensemble; (2) maintaining accuracy and diversity of NNs at the same time; and (3) a minimum number of parameters to be defined by the user. The DEL algorithm is applied to a set of real-world classification problems such as the cancer, diabetes, heart disease, thyroid, credit card, glass, gene, horse, letter recognition, mushroom, and soybean datasets. It has been confirmed by experimental results that DEL produces dynamic NN ensembles of appropriate architecture and diversity that demonstrate good generalization ability.

222 citations


Journal ArticleDOI
TL;DR: The overall comparisons suggest that AEO outperforms other state-of-the-art counterparts, especially on real-world engineering problems, where it is more competitive than other reported methods in terms of both convergence rate and computational effort.
Abstract: A novel nature-inspired meta-heuristic optimization algorithm, named artificial ecosystem-based optimization (AEO), is presented in this paper. AEO is a population-based optimizer motivated by the flow of energy in an ecosystem on the earth, and this algorithm mimics three unique behaviors of living organisms: production, consumption, and decomposition. AEO is tested on thirty-one mathematical benchmark functions and eight real-world engineering design problems. The overall comparisons suggest that AEO outperforms other state-of-the-art counterparts. Especially for real-world engineering problems, AEO is more competitive than other reported methods in terms of both convergence rate and computational effort. The application of AEO to the identification of hydrogeological parameters is also considered in this study to further evaluate its effectiveness in practice, demonstrating its potential for tackling difficult problems with unknown search spaces. The codes are available at https://www.mathworks.com/matlabcentral/fileexchange/72685-artificial-ecosystem-based-optimization-aeo.
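The three behaviors named above can be caricatured in a few lines. This is a heavily simplified sketch, not the authors' published update rules (see their released code for those): production moves the worst individual toward a random point, consumption moves individuals toward better ones, decomposition perturbs around the best, and greedy acceptance is added here so the best cost never worsens:

```python
import random

def sphere(x):
    """Toy benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def aeo_sketch(dim=3, pop=8, iters=50, seed=0):
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop)]
    X.sort(key=sphere)                      # X[0] best, X[-1] worst
    best = X[0][:]
    for _ in range(iters):
        for i in range(pop):
            if i == pop - 1:                # "production": worst blends with a random point
                a = rng.random()
                cand = [(1 - a) * x + a * rng.uniform(-5, 5) for x in X[i]]
            elif i > 0:                     # "consumption": move toward a better individual
                j = rng.randrange(i)
                cand = [x + rng.gauss(0, 0.5) * (X[j][k] - x)
                        for k, x in enumerate(X[i])]
            else:                           # "decomposition": perturb around the best
                cand = [x + rng.gauss(0, 0.1) * x for x in X[i]]
            if sphere(cand) < sphere(X[i]):  # greedy acceptance (our addition)
                X[i] = cand
        X.sort(key=sphere)
        if sphere(X[0]) < sphere(best):
            best = X[0][:]
    return best
```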

207 citations


Journal ArticleDOI
TL;DR: The main focus of this paper is on the family of evolutionary algorithms and their real-life applications, and each technique is presented in the pseudo-code form, which can be used for its easy implementation in any programming language.
Abstract: The main focus of this paper is on the family of evolutionary algorithms and their real-life applications. We present the following algorithms: genetic algorithms, genetic programming, differential evolution, evolution strategies, and evolutionary programming. Each technique is presented in the pseudo-code form, which can be used for its easy implementation in any programming language. We present the main properties of each algorithm described in this paper. We also show many state-of-the-art practical applications and modifications of the early evolutionary methods. The open research issues are indicated for the family of evolutionary algorithms.
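In the spirit of the pseudo-code the survey presents, here is a compact genetic algorithm on the classic OneMax toy problem (binary chromosomes, tournament selection, one-point crossover, bit-flip mutation); all parameter values are illustrative choices, not taken from the paper:

```python
import random

def genetic_algorithm(n_bits=20, pop_size=30, generations=40, seed=1):
    rng = random.Random(seed)
    fitness = sum  # OneMax: fitness = number of 1-bits
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        nxt = [max(pop, key=fitness)]                  # elitism: keep the best
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_bits)             # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(n_bits):                    # bit-flip mutation
                if rng.random() < 1.0 / n_bits:
                    child[i] = 1 - child[i]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = genetic_algorithm()
```

Swapping the fitness function and encoding is all it takes to retarget the same loop, which is why the surveyed algorithms share a common pseudo-code skeleton.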

207 citations


Journal ArticleDOI
TL;DR: This paper investigates the security of medical images in IoT by utilizing an innovative cryptographic model with optimization strategies, identifying the encryption algorithm and optimization method that achieve the highest peak signal-to-noise ratio values.
Abstract: The development of the Internet of Things (IoT) is predicted to change the healthcare industry and might lead to the rise of the Internet of Medical Things. The IoT revolution is surpassing present-day human services with promising technological, financial, and social prospects. This paper investigates the security of medical images in IoT by utilizing an innovative cryptographic model with optimization strategies. For the most part, patient data are stored on a cloud server in the hospital, which makes security vital. Therefore, a new framework is required for the secure transmission and effective storage of medical images interleaved with patient information. To increase the security level of the encryption and decryption process, the optimal key is chosen using hybrid swarm optimization, i.e., grasshopper optimization and particle swarm optimization, in elliptic curve cryptography. Using this method, the medical images are secured in the IoT framework. The results are compared and contrasted with diverse encryption algorithms and their optimization methods from the literature, with the proposed approach achieving the highest peak signal-to-noise ratio value, i.e., 59.45 dB, and a structural similarity index of 1.
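The PSNR figure quoted above is a standard fidelity metric; for reference, here is its usual definition for 8-bit images (flattened pixel lists here for brevity; this is the generic formula, not code from the paper):

```python
import math

def psnr(original, restored, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(peak ** 2 / mse)

img = [10, 200, 30, 255]
assert psnr(img, img) == float("inf")  # perfect reconstruction
```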

200 citations


Journal ArticleDOI
TL;DR: The comparison of the ANN-derived results with the experimental findings, which are in very good agreement, demonstrates the ability of ANNs to estimate the compressive strength of concrete in a reliable and robust manner.
Abstract: The non-destructive testing of concrete structures with methods such as ultrasonic pulse velocity and the Schmidt rebound hammer test is of utmost technical importance. Non-destructive testing methods do not require sampling, and they are simple, fast to perform, and efficient. However, these methods result in large dispersion of the values they estimate, with significant deviation from the actual (experimental) values of compressive strength. In this paper, the application of artificial neural networks (ANNs) for predicting the compressive strength of concrete in existing structures has been investigated. ANNs have been systematically used for predicting the compressive strength of concrete, utilizing both the ultrasonic pulse velocity and the Schmidt rebound hammer experimental results, which are available in the literature. The comparison of the ANN-derived results with the experimental findings, which are in very good agreement, demonstrates the ability of ANNs to estimate the compressive strength of concrete in a reliable and robust manner. Thus, the (quantitative) values of the weights for the proposed neural network model are provided, so that the proposed model can be readily implemented in a spreadsheet and be accessible to everyone interested in the procedure of simulation.
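The spreadsheet-style use the authors describe amounts to a fixed-weight forward pass: the two non-destructive measurements go in, a strength estimate comes out. The weights and network size below are made-up placeholders, not the published values:

```python
import math

# Tiny feed-forward ANN: one hidden tanh layer, linear output.
def mlp_forward(inputs, hidden_w, hidden_b, out_w, out_b):
    hidden = [math.tanh(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    return sum(w * h for w, h in zip(out_w, hidden)) + out_b

# Hypothetical weights for a 2-input, 3-hidden-node, 1-output network;
# inputs are UPV (km/s) and rebound number, output a strength in MPa.
strength = mlp_forward(
    inputs=[4.2, 32.0],
    hidden_w=[[0.5, 0.02], [-0.3, 0.05], [0.1, -0.01]],
    hidden_b=[0.1, -0.2, 0.0],
    out_w=[12.0, 8.0, 5.0],
    out_b=20.0,
)
```

Because the forward pass is just sums and a tanh, the same arithmetic can indeed be reproduced cell-by-cell in a spreadsheet once the trained weights are known.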

197 citations


Journal ArticleDOI
TL;DR: The proposed work, based on 744 segments of ECG signal obtained from the MIT-BIH Arrhythmia database, can be applied in cloud computing or implemented in mobile devices to evaluate cardiac health immediately with high precision.
Abstract: Heart disease is one of the most serious health problems in today’s world. Over 50 million persons have cardiovascular diseases around the world. Our proposed work is based on 744 segments of ECG signal obtained from the MIT-BIH Arrhythmia database (strongly imbalanced data) for one lead (modified lead II), from 29 people. In this work, we have used long-duration (10 s) ECG signal segments (13 times fewer classifications/analyses). The spectral power density was estimated based on Welch’s method and the discrete Fourier transform to strengthen the characteristic ECG signal features. Our main contribution is the design of a novel three-layer (48 + 4 + 1) deep genetic ensemble of classifiers (DGEC). The developed method is a hybrid which combines the advantages of: (1) ensemble learning, (2) deep learning, and (3) evolutionary computation. The novel system was developed by the fusion of three normalization types, four Hamming window widths, four classifier types, stratified tenfold cross-validation, genetic feature (frequency components) selection, layered learning, genetic optimization of classifier parameters, and a new genetic layered training (expert votes selection) to connect classifiers. The developed DGEC system achieved a recognition sensitivity of 94.62% (40 errors/744 classifications), accuracy = 99.37%, specificity = 99.66% with a classification time per single sample of 0.8736 s in detecting 17 arrhythmia ECG classes. The proposed model can be applied in cloud computing or implemented in mobile devices to evaluate cardiac health immediately with high precision.
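The Welch-style spectral estimation mentioned above splits the signal into Hamming-windowed segments, takes the power spectrum of each, and averages the periodograms. A compact sketch (a naive DFT is used for clarity over speed; segment length and overlap handling are simplified relative to the paper):

```python
import cmath
import math

def welch_psd(signal, seg_len):
    """Average one-sided power spectra of Hamming-windowed segments."""
    hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (seg_len - 1))
               for n in range(seg_len)]
    n_segs = len(signal) // seg_len
    avg = [0.0] * (seg_len // 2)
    for s in range(n_segs):
        seg = [x * w for x, w in
               zip(signal[s * seg_len:(s + 1) * seg_len], hamming)]
        for k in range(seg_len // 2):       # naive DFT, one-sided
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / seg_len)
                        for n, x in enumerate(seg))
            avg[k] += abs(coeff) ** 2 / n_segs
    return avg

# A sine with exactly 4 cycles per 32-sample segment peaks at DFT bin 4.
sig = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
psd = welch_psd(sig, seg_len=32)
```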

Journal ArticleDOI
TL;DR: Compared with other ant colony algorithms in different robot mobile simulation environments, the results showed that the global optimal search ability and the convergence speed are improved greatly and the number of lost ants is less than one-third that of the others.
Abstract: To solve the problems of local optima, slow convergence speed and low search efficiency in the ant colony algorithm, an improved ant colony optimization algorithm is proposed. An unequal allocation of the initial pheromone is constructed to avoid blind search at early planning. A pseudo-random state transition rule is used to select the path: the state transition probability is calculated according to the current optimal solution and the number of iterations, and the proportion of deterministic versus random selections is adjusted adaptively. The optimal solution and the worst solution are introduced to improve the global pheromone updating method. A dynamic punishment method is introduced to solve the problem of deadlock. Compared with other ant colony algorithms in different robot mobile simulation environments, the results showed that the global optimal search ability and the convergence speed are improved greatly and the number of lost ants is less than one-third that of the others. This verifies the effectiveness and superiority of the improved ant colony algorithm.
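The pseudo-random state transition rule described above typically works as follows: with probability q0 the ant exploits the best edge, otherwise it samples by roulette wheel over pheromone-weighted desirabilities. A sketch in which the paper's adaptive schedule for q0 is abstracted into a plain parameter:

```python
import random

def choose_next(candidates, tau, eta, alpha=1.0, beta=2.0, q0=0.7, rng=random):
    """Pick the next node: exploit with probability q0, else explore."""
    weights = [(tau[c] ** alpha) * (eta[c] ** beta) for c in candidates]
    if rng.random() < q0:                    # deterministic exploitation
        return candidates[weights.index(max(weights))]
    r = rng.uniform(0, sum(weights))         # roulette-wheel exploration
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return c
    return candidates[-1]

tau = {"A": 0.9, "B": 0.1, "C": 0.1}         # pheromone levels
eta = {"A": 1.0, "B": 1.0, "C": 1.0}         # heuristic desirability
```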

Journal ArticleDOI
TL;DR: An exhaustive and comprehensive review of the so-called salp swarm algorithm (SSA) that discusses its main characteristics, including its variants, such as binary, modified and multi-objective versions.
Abstract: This paper introduces an exhaustive and comprehensive review of the so-called salp swarm algorithm (SSA) and discusses its main characteristics. SSA is one of the most efficient recent meta-heuristic optimization algorithms and has been successfully utilized in a wide range of optimization problems in different fields, such as machine learning, engineering design, wireless networking, image processing, and power energy. This review covers the available literature on SSA, including its variants, such as binary, modified and multi-objective versions, followed by its applications, assessment and evaluation, and finally the conclusions, which focus on the current work on SSA and suggest possible future research directions.

Journal ArticleDOI
TL;DR: This review introduces disease prevention and its challenges followed by traditional prevention methodologies, and summarizes state-of-the-art data analytics algorithms used for classification of disease, clustering, anomalies detection, and association as well as their respective advantages, drawbacks and guidelines.
Abstract: Medical data is one of the most rewarding and yet most complicated data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard and incomplete healthcare data. It not only forecasts but also helps in decision making, and it is increasingly seen as a breakthrough in ongoing advancement whose goal is to improve the quality of patient care and reduce healthcare costs. The aim of this study is to provide a comprehensive and structured overview of extensive research on the advancement of data analytics methods for disease prevention. This review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We summarize state-of-the-art data analytics algorithms used for classification of disease, clustering (unusually high incidence of a particular disease), anomaly detection (detection of disease) and association, as well as their respective advantages, drawbacks and guidelines for selection of a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations.

Journal ArticleDOI
TL;DR: A DNN-based prediction model is designed based on the PSR method and long short-term memory (LSTM) networks for DL and used to predict stock prices; a comparison of the results shows that the proposed prediction model has higher prediction accuracy.
Abstract: Understanding the pattern of financial activities and predicting their development and changes are research hotspots in academic and financial circles. Because financial data contain complex, incomplete and fuzzy information, predicting their development trends is an extremely difficult challenge. Fluctuations in financial data depend on a myriad of correlated constantly changing factors. Therefore, predicting and analysing financial data are a nonlinear, time-dependent problem. Deep neural networks (DNNs) combine the advantages of deep learning (DL) and neural networks and can be used to solve nonlinear problems more satisfactorily compared to conventional machine learning algorithms. In this paper, financial product price data are treated as a one-dimensional series generated by the projection of a chaotic system composed of multiple factors into the time dimension, and the price series is reconstructed using the time series phase-space reconstruction (PSR) method. A DNN-based prediction model is designed based on the PSR method and long short-term memory (LSTM) networks for DL and used to predict stock prices. The proposed model and several other prediction models are used to predict multiple stock indices for different periods. A comparison of the results shows that the proposed prediction model has higher prediction accuracy.
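Phase-space reconstruction as described above is delay embedding: the scalar price series is turned into m-dimensional state vectors built from time-lagged samples. A minimal sketch (how the embedding dimension m and delay tau are chosen is an assumption here; the abstract does not say):

```python
def phase_space_reconstruct(series, m, tau):
    """Return the list of m-dimensional delay vectors
    [x(t), x(t + tau), ..., x(t + (m - 1) * tau)]."""
    n_vectors = len(series) - (m - 1) * tau
    return [[series[i + j * tau] for j in range(m)] for i in range(n_vectors)]

series = [1, 2, 3, 4, 5, 6, 7, 8]
vectors = phase_space_reconstruct(series, m=3, tau=2)
```

In practice m and tau would be picked with standard tools such as false nearest neighbors and mutual information; the reconstructed vectors then serve as the DNN's inputs.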

Journal ArticleDOI
TL;DR: The hybrid method is based on using both image processing and deep learning for improved results; the introduced method is efficient and successful at diagnosing diabetic retinopathy from retinal fundus images.
Abstract: The objective of this study is to propose an alternative, hybrid solution method for diagnosing diabetic retinopathy from retinal fundus images. In detail, the hybrid method is based on using both image processing and deep learning for improved results. In medical image processing, reliable diabetic retinopathy detection from digital fundus images is known as an open problem and needs alternative solutions to be developed. In this context, manual interpretation of retinal fundus images requires a great deal of work, expertise, and processing time. So, doctors need support from imaging and computer vision systems, and the next step is widely associated with the use of intelligent diagnosis systems. The solution method proposed in this study includes employment of image processing with histogram equalization, and the contrast limited adaptive histogram equalization techniques. Next, the diagnosis is performed by the classification of a convolutional neural network. The method was validated using 400 retinal fundus images within the MESSIDOR database, and average values for different performance evaluation parameters were obtained as accuracy 97%, sensitivity (recall) 94%, specificity 98%, precision 94%, FScore 94%, and GMean 95%. In addition to those results, a general comparison with some previously carried out studies has also shown that the introduced method is efficient and successful enough at diagnosing diabetic retinopathy from retinal fundus images. By employing the related image processing techniques and deep learning for diagnosing diabetic retinopathy, the proposed method and the research results are valuable contributions to the associated literature.
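Of the two enhancement steps named above, plain histogram equalization is the simpler one (CLAHE adds tiling and contrast clipping on top of the same idea). A sketch on a flat list of 8-bit grayscale pixels, using the classic CDF-based mapping:

```python
def equalize(pixels, levels=256):
    """Histogram-equalize 8-bit pixel values via the classic CDF mapping."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:                # cumulative distribution function
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    denom = (len(pixels) - cdf_min) or 1   # guard for constant images
    # Spread the CDF over the full intensity range.
    return [round((cdf[p] - cdf_min) / denom * (levels - 1)) for p in pixels]

out = equalize([50, 50, 51, 52])   # a low-contrast patch
```

The narrow input range 50–52 is stretched across 0–255, which is exactly the contrast boost that helps the downstream CNN see vessel and lesion detail.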

Journal ArticleDOI
TL;DR: It is confirmed that investors’ emotional tendency is effective to improve the predicted results; the introduction of EMD can improve the predictability of stock price sequences; and the attention mechanism can help LSTM to efficiently extract specific information and current mission objectives from the information ocean.
Abstract: Stock market prediction has been identified as a very important practical problem in the economic field. However, the timely prediction of the market is generally regarded as one of the most challenging problems due to the stock market’s characteristics of noise and volatility. To address these challenges, we propose a deep learning-based stock market prediction model that considers investors’ emotional tendency. First, we propose to involve investors’ sentiment for stock prediction, which can effectively improve the model prediction accuracy. Second, the stock pricing sequence is a complex time sequence with different scales of fluctuations, making the accurate prediction very challenging. We propose to gradually decompose the complex sequence of stock price by adopting empirical modal decomposition (EMD), which yields better prediction accuracy. Third, we adopt LSTM due to its advantages of analyzing relationships among time-series data through its memory function. We further revised it by adopting attention mechanism to focus more on the more critical information. Experiment results show that the revised LSTM model can not only improve prediction accuracy, but also reduce time delay. It is confirmed that investors’ emotional tendency is effective to improve the predicted results; the introduction of EMD can improve the predictability of stock price sequences; and the attention mechanism can help LSTM to efficiently extract specific information and current mission objectives from the information ocean.

Journal ArticleDOI
TL;DR: FEMa—a finite element machine classifier—for supervised learning problems, where each training sample is the center of a basis function, and the whole training set is modeled as a probabilistic manifold for classification purposes, is presented.
Abstract: Machine learning has played an essential role in the past decades and has been in lockstep with the main advances in computer technology. Given the massive amount of data generated daily, there is a need for even faster and more effective machine learning algorithms that can provide updated models for real-time applications and on-demand tools. This paper presents FEMa—a finite element machine classifier—for supervised learning problems, where each training sample is the center of a basis function, and the whole training set is modeled as a probabilistic manifold for classification purposes. FEMa has its theoretical basis in the finite element method, which is widely used for numerical analysis in engineering problems. It is shown that FEMa is parameterless and has quadratic complexity for both the training and classification phases when basis functions satisfying certain properties are used. The proposed classifier yields very competitive results when compared to some state-of-the-art supervised pattern recognition techniques.
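The core idea summarized above, with every training sample anchoring a basis function and class scores formed as basis-weighted sums over training labels, can be sketched with inverse-distance (Shepard-like) weights. This is an illustrative stand-in, not necessarily the exact basis family used in the paper:

```python
def fema_like_predict(train_x, train_y, query, n_classes, k=2.0):
    """Classify `query` by inverse-distance-weighted votes of all
    training samples; each sample acts as a basis-function center."""
    scores = [0.0] * n_classes
    for x, y in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        if d2 == 0:
            return y                       # exact hit: return its label
        scores[y] += 1.0 / d2 ** (k / 2)   # inverse-distance basis weight
    total = sum(scores)
    probs = [s / total for s in scores]    # probabilistic class estimate
    return probs.index(max(probs))

X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
y = [0, 0, 1, 1]
label = fema_like_predict(X, y, query=[4.9, 5.1], n_classes=2)
```

Note there is no training phase beyond storing the samples, which mirrors the "parameterless" character claimed in the abstract; each prediction costs one pass over the training set.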

Journal ArticleDOI
TL;DR: This work surveys the available literature on the grasshopper optimization algorithm, including its modifications, hybridizations, and generalization to the binary, chaotic, and multi-objective cases.
Abstract: The grasshopper optimization algorithm is one of the dominant modern meta-heuristic optimization algorithms. It has been successfully applied to various optimization problems in several fields, including engineering design, wireless networking, machine learning, image processing, control of power systems, and others. We survey the available literature on the grasshopper optimization algorithm, including its modifications, hybridizations, and generalization to the binary, chaotic, and multi-objective cases. We review its applications, evaluate the algorithms, and provide conclusions.

Journal ArticleDOI
TL;DR: A thorough and comprehensive review of the so-called moth–flame optimization (MFO) algorithm that analyzes its main characteristics, focuses on the current work on MFO, highlights its weaknesses, and suggests possible future research directions.
Abstract: This paper presents a thorough and comprehensive review of the so-called moth–flame optimization (MFO) algorithm and analyzes its main characteristics. MFO is considered one of the promising metaheuristic algorithms and has been successfully applied to various optimization problems in a wide range of fields, such as power and energy systems, economic dispatch, engineering design, image processing and medical applications. This manuscript describes the available literature on MFO, including its variants and hybridizations, the growth of MFO publications, MFO application areas, theoretical analysis and comparisons of MFO with other algorithms. The conclusions focus on the current work on MFO, highlight its weaknesses, and suggest possible future research directions. Researchers and practitioners of MFO belonging to different fields, like the domains of optimization, medicine, engineering, clustering and data mining, among others, will benefit from this study.

Journal ArticleDOI
TL;DR: The outcomes show that ResNet50 is the most appropriate deep learning model to detect the COVID-19 from limited chest CT dataset using the classical data augmentation with testing accuracy of 82.91%, sensitivity 77.66%, and specificity of 87.62%.
Abstract: Coronavirus disease 2019 (COVID-19) is a fast-spreading disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The detection of COVID-19 using artificial intelligence techniques, and especially deep learning, will help to detect this virus in its early stages, which will increase the opportunities for fast recovery of patients worldwide and relieve the pressure on healthcare systems around the world. In this research, classical data augmentation techniques along with Conditional Generative Adversarial Nets (CGAN) based on a deep transfer learning model for COVID-19 detection in chest CT scan images are presented. The limited benchmark datasets for COVID-19, especially in chest CT images, are the main motivation of this research. The main idea is to collect all the possible images for COVID-19 that existed at the time of writing and use classical data augmentations along with CGAN to generate more images to help in the detection of COVID-19. In this study, five different deep convolutional neural network-based models (AlexNet, VGGNet16, VGGNet19, GoogleNet, and ResNet50) have been selected for the investigation to detect Coronavirus-infected patients using chest CT radiograph digital images. The classical data augmentations along with CGAN improve the performance of classification in all selected deep transfer models. The outcomes show that ResNet50 is the most appropriate deep learning model to detect COVID-19 from the limited chest CT dataset using classical data augmentation, with a testing accuracy of 82.91%, sensitivity of 77.66%, and specificity of 87.62%.
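"Classical" augmentation of the kind mentioned above typically means simple geometric transforms that multiply the effective dataset size. A sketch with flips and a 90-degree rotation on a 2-D nested list standing in for a CT slice (which transforms the paper actually used is an assumption here):

```python
def hflip(img):
    """Mirror each row (horizontal flip)."""
    return [row[::-1] for row in img]

def vflip(img):
    """Reverse row order (vertical flip)."""
    return img[::-1]

def rot90(img):
    """90-degree counter-clockwise rotation."""
    return [[img[r][c] for r in range(len(img))]
            for c in range(len(img[0]) - 1, -1, -1)]

def augment(img):
    """Return the original plus three flipped/rotated variants."""
    return [img, hflip(img), vflip(img), rot90(img)]

variants = augment([[1, 2], [3, 4]])
```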

Journal ArticleDOI
TL;DR: This review paper presents a comprehensive and full review of the so-called multi-verse optimizer algorithm (MOA) and reviews its main characteristics and procedures.
Abstract: This review paper presents a comprehensive and full review of the so-called multi-verse optimizer algorithm (MOA) and reviews its main characteristics and procedures. This optimizer is one of the most recent powerful nature-inspired meta-heuristic algorithms and has been successfully implemented and utilized in optimization problems in a variety of fields, which are covered in this context, such as benchmark test functions, machine learning applications, engineering applications, network applications, parameter control, and other applications of MOA. This paper covers all the available publications that have used MOA, including the variants of MOA such as binary, modified, hybridized, chaotic, and multi-objective versions, followed by its applications, assessment and evaluation, and finally the conclusions, which focus on the current works on the optimization algorithm and recommend potential future research directions.

Journal ArticleDOI
TL;DR: The authors showed that even if one does not expect in principle a 50:50 pronominal gender distribution, Google Translate yields male defaults much more frequently than what would be expected from demographic data alone, in particular for fields typically associated with unbalanced gender distributions or stereotypes such as STEM (Science, Technology, Engineering and Mathematics) jobs.
Abstract: Recently there has been growing concern in academia, industrial research laboratories and the mainstream commercial media about the phenomenon dubbed machine bias, where trained statistical models, unbeknownst to their creators, grow to reflect controversial societal asymmetries, such as gender or racial bias. A significant number of Artificial Intelligence tools have recently been suggested to be harmfully biased toward some minority, with reports of racist criminal-behavior predictors, Apple's iPhone X failing to differentiate between two distinct Asian people, and the now infamous case of Google Photos mistakenly classifying black people as gorillas. Although a systematic study of such biases can be difficult, we believe that automated translation tools can be exploited through gender-neutral languages to yield a window into the phenomenon of gender bias in AI. In this paper, we start with a comprehensive list of job positions from the U.S. Bureau of Labor Statistics (BLS) and use it to build sentences of the form "He/She is an Engineer" (where "Engineer" is replaced by the job position of interest) in 12 different gender-neutral languages, such as Hungarian, Chinese, and Yoruba. We translate these sentences into English using the Google Translate API and collect statistics about the frequency of female, male and gender-neutral pronouns in the translated output. We then show that Google Translate exhibits a strong tendency toward male defaults, in particular for fields typically associated with unbalanced gender distributions or stereotypes, such as STEM (Science, Technology, Engineering and Mathematics) jobs. We ran these statistics against BLS data on the frequency of female participation in each job position, showing that Google Translate fails to reproduce the real-world distribution of female workers.
In summary, we provide experimental evidence that even if one does not expect a 50:50 pronominal gender distribution in principle, Google Translate yields male defaults much more frequently than demographic data alone would predict. We believe that our study can shed further light on the phenomenon of machine bias, and we hope that it will ignite a debate about the need to augment current statistical translation tools with debiasing techniques, which can already be found in the scientific literature.
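The pronoun-counting methodology can be sketched as follows; the translation table is a hypothetical stand-in for Google Translate API output, not actual data from the paper:

```python
from collections import Counter

# hypothetical stand-in for Google Translate API output on gender-neutral
# source sentences (illustrative strings, not data from the paper)
translations = {
    "ő mérnök":  "he is an engineer",
    "ő ápoló":   "she is a nurse",
    "ő tanár":   "he is a teacher",
}

PRONOUNS = {"he": "male", "she": "female", "they": "neutral", "it": "neutral"}

def tally(sentences):
    """Count the gender of the leading pronoun in each translated sentence."""
    counts = Counter()
    for s in sentences:
        counts[PRONOUNS.get(s.split()[0].lower(), "other")] += 1
    return counts

stats = tally(translations.values())
```

The resulting frequencies per job position would then be compared against the BLS female-participation figures.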

Journal ArticleDOI
TL;DR: This study introduces heterogeneity into the international expansion behavior model of Chinese enterprises in the context of global value chain specialization and develops a new concept of enterprise advantage based on value chain status, extending the research category of international trade theory.
Abstract: Since the beginning of the new millennium, productivity differences among enterprises within an industry, namely the heterogeneity of enterprises, have been incorporated into the general equilibrium trade model. This brings a new focus to trade theory: analyzing the organizational-structure choices of individual enterprises through their individual characteristics. Outward foreign direct investment (OFDI) and international trade are today's two most important international economic activities, and the development of OFDI has a certain impact on export trade. The relationship between OFDI and export trade differs across countries because of their specific national conditions. Against this background, we study the impact of China's expanding OFDI on export trade. China should make better use of OFDI to promote the development of its export trade and to improve on an export-oriented economic mode that relies solely on exports to expand into the international market. This question not only has theoretical value, but also has practical guiding significance for the formulation of China's "going out" economic and trade policy. This study introduces heterogeneity into the international expansion behavior model of Chinese enterprises in the context of global value chain specialization. Taking the position of Chinese enterprises in the value chain as the location dimension, we integrate and expand a new concept of enterprise advantage based on value chain status from the research category of international trade theory. We introduce the behavior of competitors into the dimension of the enterprise decision space and construct an endogenous dynamic equilibrium model of Chinese enterprises' export trade and OFDI.
The purpose is to provide theoretical explanations and practical guidance for Chinese enterprises' export trade and endogenous optimization decision making in the process of the deepening division of labor in the global value chain.

Journal ArticleDOI
TL;DR: The results of the proposed 10-layer CNN model, which incorporates three advanced techniques (parametric rectified linear unit (PReLU), batch normalization, and dropout), show that its performance is better than seven state-of-the-art approaches.
Abstract: Alcoholism changes the structure of the brain. Several somatic marker hypothesis network-related regions are known to be damaged in chronic alcoholism. Neuroimaging approaches can help us better understand the impairment discovered in alcohol-dependent subjects. In this research, we recruited subjects from participating hospitals. In total, 188 abstinent long-term chronic alcoholic participants (95 men and 93 women) and 191 non-alcoholic control participants (95 men and 96 women) were enrolled in our experiment; a computerized diagnostic interview schedule version IV and a medical history interview were employed to determine whether applicants should be enrolled or excluded. A Siemens Verio Tim 3.0 T MR scanner (Siemens Medical Solutions, Erlangen, Germany) was employed to scan the subjects. We then proposed a 10-layer convolutional neural network for imaging-based diagnosis, incorporating three advanced techniques: parametric rectified linear unit (PReLU), batch normalization, and dropout. The structure of the network was fine-tuned. The results show that our method secured a sensitivity of 97.73 ± 1.04%, a specificity of 97.69 ± 0.87%, and an accuracy of 97.71 ± 0.68%. We observed that PReLU gives better performance than ordinary ReLU, clipped ReLU, and leaky ReLU. Batch normalization and dropout yielded enhanced performance, as batch normalization overcame internal covariate shift and dropout reduced overfitting. The results of our proposed 10-layer CNN model show that its performance is better than seven state-of-the-art approaches.
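The four activation functions compared in the study can be sketched as simple forward passes; in the actual network the PReLU slope `a` is a learned parameter, whereas here it is passed explicitly, and the clipping ceiling is an illustrative choice:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def clipped_relu(x, ceiling=6.0):
    return np.minimum(np.maximum(0.0, x), ceiling)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def prelu(x, a):
    # in PReLU the negative-side slope `a` is learned during training,
    # unlike the fixed slope of leaky ReLU
    return np.where(x > 0, x, a * x)

x = np.array([-2.0, -0.5, 0.0, 3.0, 8.0])
```

The learnable slope lets each channel keep a tuned amount of negative-side signal, which is one explanation for the gain over the fixed variants reported above.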

Journal ArticleDOI
TL;DR: An LSTM-based recurrent network model is shown to perform subjectively well on direct performance generation: jointly predicting the notes and also their expressive timing and dynamics.
Abstract: Music generation has generally been focused on either creating scores or interpreting them. We discuss differences between these two problems and propose that, in fact, it may be valuable to work in the space of direct performance generation: jointly predicting the notes and also their expressive timing and dynamics. We consider the significance and qualities of the dataset needed for this. Having identified both a problem domain and characteristics of an appropriate dataset, we show an LSTM-based recurrent network model that subjectively performs quite well on this task. Critically, we provide generated examples. We also include feedback from professional composers and musicians about some of these examples.
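The abstract does not specify a data representation; one common event-based encoding for joint note/timing/dynamics prediction in this line of work can be sketched as follows (the token names and the 10 ms shift quantum are assumptions, not details from the paper):

```python
def encode_performance(notes, shift_ms=10):
    """Encode (pitch, velocity, start_s, end_s) notes as a sequence of
    NOTE_ON / NOTE_OFF / TIME_SHIFT / VELOCITY tokens for an LSTM."""
    boundaries = []                       # (time, order, event)
    for pitch, vel, start, end in notes:
        boundaries.append((start, 0, ("VELOCITY", vel)))
        boundaries.append((start, 1, ("NOTE_ON", pitch)))
        boundaries.append((end, 0, ("NOTE_OFF", pitch)))
    events, clock = [], 0.0
    for t, _, ev in sorted(boundaries):
        steps = round((t - clock) * 1000 / shift_ms)
        if steps > 0:                     # expressive timing as explicit shifts
            events.append(("TIME_SHIFT", steps))
            clock = t
        events.append(ev)
    return events

seq = encode_performance([(60, 80, 0.00, 0.50), (64, 96, 0.25, 0.75)])
```

An LSTM trained on such sequences predicts the next token, so timing and dynamics are generated jointly with the notes rather than added in a separate interpretation step.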

Journal ArticleDOI
TL;DR: Principal component analysis and a GRNN neural network are used to construct a gesture recognition system that reduces the redundant information of EMG signals, reduces the signal dimension, improves recognition efficiency and accuracy, and enhances the feasibility of real-time recognition.
Abstract: Principal component analysis and a GRNN neural network are used to construct a gesture recognition system that reduces the redundant information of EMG signals, reduces the signal dimension, improves recognition efficiency and accuracy, and enhances the feasibility of real-time recognition. By extracting key information about human motion, specific action modes are identified. In this paper, nine static gestures are taken as samples; the surface EMG signal of the arm is collected by an electromyography instrument, and four kinds of features are extracted from the signal. After dimension reduction and neural network learning, the overall recognition rate of the system reached 95.1%, and the average recognition time was 0.19 s.
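A minimal sketch of the PCA-plus-GRNN pipeline, assuming a GRNN in the sense of Specht's kernel-weighted regression; the toy features stand in for the four EMG feature types, which the abstract does not enumerate:

```python
import numpy as np

def pca_fit(X, k):
    """Principal components via SVD of the centered data."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def pca_transform(X, mu, W):
    return (X - mu) @ W.T

def grnn_predict(Xtr, Ytr, Xte, sigma=0.3):
    """GRNN output = Gaussian-kernel-weighted average of training targets;
    with one-hot targets, argmax of the output gives the predicted gesture."""
    d2 = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ Ytr) / (w.sum(axis=1, keepdims=True) + 1e-12)

# toy stand-in for 4 EMG features from 2 of the 9 gestures
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 4)) + [1, 0, 0, 0],
               rng.normal(0, 0.1, (20, 4)) + [0, 1, 0, 0]])
labels = np.array([0] * 20 + [1] * 20)

mu, W = pca_fit(X, 2)                      # 4 features -> 2 components
Z = pca_transform(X, mu, W)
pred = grnn_predict(Z, np.eye(2)[labels], Z).argmax(axis=1)
```

Because a GRNN has no iterative training, the only work at recognition time is the kernel evaluation over stored patterns, which is what makes the reported 0.19 s real-time budget plausible.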

Journal ArticleDOI
TL;DR: This paper presents robust modeling of static signs for sign language recognition using deep learning-based convolutional neural networks (CNNs) and demonstrates its effectiveness over earlier works in which only a few hand signs were considered for recognition.
Abstract: Sign language is an efficacious means of communication for humans, and vital research on it is in progress in computer vision systems. The earliest work in Indian Sign Language (ISL) recognition considered the recognition of significantly differentiable hand signs and therefore often selected only a few signs from the ISL for recognition. This paper deals with robust modeling of static signs in the context of sign language recognition using deep learning-based convolutional neural networks (CNNs). In this research, a total of 35,000 sign images of 100 static signs were collected from different users. The efficiency of the proposed system is evaluated on approximately 50 CNN models. The results are also evaluated for different optimizers, and it has been observed that the proposed approach achieved the highest training accuracies of 99.72% and 99.90% on colored and grayscale images, respectively. The performance of the proposed system has also been evaluated in terms of precision, recall and F-score. The system also demonstrates its effectiveness over earlier works in which only a few hand signs were considered for recognition.
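The per-class precision, recall and F-score used to evaluate the system can be computed as follows (toy labels, not data from the paper):

```python
def precision_recall_f1(y_true, y_pred, cls):
    """Per-class precision, recall and F-score from predicted labels."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f = precision_recall_f1(["A", "A", "B", "B", "A"],
                              ["A", "B", "B", "B", "A"], "A")
```

With 100 sign classes, these per-class scores would typically be macro-averaged to summarize the whole system.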

Journal ArticleDOI
TL;DR: It is argued that the exploitation tendency of WOA is limited and can be considered one of the main drawbacks of this algorithm, and the exploitative and exploratory capabilities of a modified WOA are improved in conjunction with a learning mechanism.
Abstract: The whale optimization algorithm (WOA) is a recent nature-inspired metaheuristic that mimics the cooperative life of humpback whales and their spiral-shaped hunting mechanism. In this research, it is first argued that the exploitation tendency of WOA is limited and can be considered one of the main drawbacks of this algorithm. In order to mitigate premature convergence and stagnation, the exploitative and exploratory capabilities of a modified WOA are improved in conjunction with a learning mechanism. In this regard, the proposed WOA with associative learning approaches is combined with a recent variant of hill-climbing local search to further enhance the exploitation process. The improved algorithm is then employed to tackle a wide range of numerical optimization problems. The results are compared with those of different well-known and novel techniques on multi-dimensional classic problems and the new CEC 2017 test suite. The extensive experiments and statistical tests show the superiority of the proposed BMWOA compared to WOA and several well-established algorithms.
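A compact sketch of the baseline WOA update that the paper modifies (shrinking encircling, random-whale exploration, and the spiral bubble-net move); parameter values are the common defaults, not the paper's settings:

```python
import numpy as np

def woa(fitness, dim, n=20, iters=200, lb=-5.0, ub=5.0, b=1.0, seed=0):
    """Compact baseline WOA sketch; parameters are illustrative."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n, dim))
    best = min(X, key=fitness).copy()
    for t in range(iters):
        a = 2.0 * (1 - t / iters)                 # decreases linearly 2 -> 0
        for i in range(n):
            if rng.random() < 0.5:                # shrinking encircling mechanism
                A = 2 * a * rng.random(dim) - a
                C = 2 * rng.random(dim)
                if np.all(np.abs(A) < 1):         # exploit: move around the best
                    X[i] = best - A * np.abs(C * best - X[i])
                else:                             # explore: move around a random whale
                    rand = X[rng.integers(n)]
                    X[i] = rand - A * np.abs(C * rand - X[i])
            else:                                 # spiral bubble-net move
                l = rng.uniform(-1, 1)
                X[i] = np.abs(best - X[i]) * np.exp(b * l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], lb, ub)
            if fitness(X[i]) < fitness(best):
                best = X[i].copy()
        # (the paper's modification adds associative learning and
        #  hill-climbing local search around this point)
    return best, fitness(best)

best_x, best_f = woa(lambda x: float((x ** 2).sum()), dim=5)
```

Since both exploitation branches contract toward the incumbent best, one can see why the paper argues for an extra local-search phase to strengthen exploitation.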

Journal ArticleDOI
TL;DR: In this article, the authors propose a zero-bit watermarking algorithm that makes use of adversarial model examples, which allows subsequent extraction of the watermark using only few queries.
Abstract: The state-of-the-art performance of deep learning models comes at a high cost for companies and institutions, due to the tedious data collection and the heavy processing requirements. Recently, Nagai et al. (Int J Multimed Inf Retr 7(1):3–16, 2018) and Uchida et al. (Embedding watermarks into deep neural networks, ICMR, 2017) proposed to watermark convolutional neural networks for image classification by embedding information into their weights. While this is clear progress toward model protection, the technique solely allows for extracting the watermark from a network that one accesses locally and entirely. Instead, we aim at allowing the extraction of the watermark from a neural network (or any other machine learning model) that is operated remotely and available through a service API. To this end, we propose to mark the model's action itself, tweaking its decision frontiers slightly so that a set of specific queries conveys the desired information. In the present paper, we formally introduce the problem and propose a novel zero-bit watermarking algorithm that makes use of adversarial model examples. While limiting the loss of performance of the protected model, this algorithm allows subsequent extraction of the watermark using only a few queries. We experimented with the approach on three neural networks designed for image classification, in the context of the MNIST digit recognition task.
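A toy illustration of the frontier-tweaking idea on a linear classifier, assuming an idealized stand-in for the paper's adversarial-example construction: key inputs near the decision frontier are assigned secret bits, the model is fine-tuned to answer them accordingly, and the watermark is read back with a few remote queries. All names and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy two-class data with a wide margin
X = np.vstack([rng.normal([2, 2], 0.3, (50, 2)),
               rng.normal([-2, -2], 0.3, (50, 2))])
y = np.array([1] * 50 + [0] * 50)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit(X, y, w=None, steps=2000, lr=0.5):
    """Plain logistic regression by gradient descent (bias folded in)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(3) if w is None else w
    for _ in range(steps):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(int)

w = fit(X, y)

# key inputs near the decision frontier (stand-in for adversarial examples)
keys = rng.normal(0.0, 0.5, (8, 2))
# secret bits: the keys' labels under a slightly perturbed frontier
bits = predict(w + np.array([0.3, -0.3, 0.1]), keys)

# mark: nudge the frontier so the key queries answer with the secret bits
w_marked = fit(np.vstack([X, keys]), np.concatenate([y, bits]),
               w=w.copy(), steps=4000)

# remote zero-bit verification: query the API-exposed model on the keys
match = (predict(w_marked, keys) == bits).mean()   # watermark agreement
acc = (predict(w_marked, X) == y).mean()           # utility preserved
```

In the paper the marked model is a deep network and the keys are adversarial examples straddling its frontiers; the linear model here only illustrates the query-based extraction protocol and the utility/watermark trade-off.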