
Showing papers on "Activation function" published in 2021


Journal ArticleDOI
TL;DR: A taxonomy of trainable activation functions is proposed, common and distinctive properties of recent and past models are highlighted, the main advantages and limitations of this type of approach are discussed, and it is shown that many of the proposed approaches are equivalent to adding neuron layers which use fixed activation functions and some simple local rule that constrains the corresponding weight layers.

162 citations


Journal ArticleDOI
TL;DR: A novel particle swarm optimization (PSO) algorithm is put forward where a sigmoid-function-based weighting strategy is developed to adaptively adjust the acceleration coefficients, inspired by the activation function of neural networks.
Abstract: In this paper, a novel particle swarm optimization (PSO) algorithm is put forward where a sigmoid-function-based weighting strategy is developed to adaptively adjust the acceleration coefficients. The newly proposed adaptive weighting strategy takes into account both the distances from the particle to the global best position and from the particle to its personal best position, thereby having the distinguishing feature of enhancing the convergence rate. Inspired by the activation function of neural networks, the new strategy is employed to update the acceleration coefficients by using the sigmoid function. The search capability of the developed adaptive weighting PSO (AWPSO) algorithm is comprehensively evaluated via eight well-known benchmark functions including both the unimodal and multimodal cases. The experimental results demonstrate that the designed AWPSO algorithm substantially improves the convergence rate of the particle swarm optimizer and also outperforms some currently popular PSO algorithms.
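As a rough illustration of the adaptive weighting idea, here is a minimal Python/NumPy sketch in which the two acceleration coefficients are driven through a sigmoid of the particle's distances to its personal-best and global-best positions. The exact weighting formula, the coefficient range, and which distance drives which coefficient are not given in the abstract, so they are assumptions rather than the authors' AWPSO update.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_coefficients(x, pbest, gbest, c_min=0.5, c_max=2.5):
    """Illustrative sigmoid-based weighting of PSO acceleration coefficients.

    The cognitive coefficient c1 grows with the distance to the personal best
    and the social coefficient c2 with the distance to the global best, each
    squashed through a sigmoid so it stays inside [c_min, c_max]. This is a
    plausible stand-in, not the formula from the paper.
    """
    d_personal = np.linalg.norm(x - pbest)
    d_global = np.linalg.norm(x - gbest)
    c1 = c_min + (c_max - c_min) * sigmoid(d_personal)
    c2 = c_min + (c_max - c_min) * sigmoid(d_global)
    return c1, c2

# The coefficients then enter the usual PSO velocity update:
# v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
```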

160 citations


Journal ArticleDOI
TL;DR: This article develops a new activation function, i.e., adaptively parametric rectifier linear units, and inserts the activation function into deep residual networks to improve the feature learning ability, so that each input signal is trained to have its own set of nonlinear transformations.
Abstract: Vibration signals under the same health state often have large differences due to changes in operating conditions. Likewise, the differences among vibration signals under different health states can be small under some operating conditions. Traditional deep learning methods apply fixed nonlinear transformations to all the input signals, which have a negative impact on the discriminative feature learning ability, i.e., projecting the intraclass signals into the same region and the interclass signals into distant regions. Aiming at this issue, this article develops a new activation function, i.e., adaptively parametric rectifier linear units, and inserts the activation function into deep residual networks to improve the feature learning ability, so that each input signal is trained to have its own set of nonlinear transformations. To be specific, a subnetwork is inserted as an embedded module to learn slopes to be used in the nonlinear transformation. The slopes are dependent on the input signal, and thereby the developed method has more flexible nonlinear transformations than the traditional deep learning methods. Finally, the improved performance of the developed method in learning discriminative features has been validated through fault diagnosis applications.
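One plausible PyTorch reading of such an adaptively parametric ReLU is sketched below: an embedded subnetwork maps per-channel statistics of the input feature map to slopes in (0, 1), which then scale the negative part of the signal. The subnetwork layout, the pooling of the statistics, and the layer sizes are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class AdaptivelyParametricReLU(nn.Module):
    """Sketch: a small embedded subnetwork learns an input-dependent,
    per-channel slope for the negative part of the feature map."""

    def __init__(self, channels):
        super().__init__()
        self.slope_net = nn.Sequential(
            nn.Linear(channels, channels),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
            nn.Linear(channels, channels),
            nn.Sigmoid(),          # keeps the learned slopes in (0, 1)
        )

    def forward(self, x):          # x: (batch, channels, ...) feature map
        pos = torch.clamp(x, min=0.0)
        neg = torch.clamp(x, max=0.0)
        # Global average of the absolute feature map summarises each channel.
        stats = torch.abs(x).flatten(2).mean(dim=2)        # (batch, channels)
        slope = self.slope_net(stats)                      # (batch, channels)
        slope = slope.view(*slope.shape, *([1] * (x.dim() - 2)))
        return pos + slope * neg

aprelu = AdaptivelyParametricReLU(channels=16)
out = aprelu(torch.randn(8, 16, 32, 32))   # same shape as the input
```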

116 citations


Book ChapterDOI
TL;DR: This research paper will evaluate the commonly used activation functions, such as swish, ReLU, Sigmoid, and so forth, followed by their properties, their pros and cons, and recommendations on where each formula applies.
Abstract: Activation functions are the primary decision-making units of neural networks. They determine the output of a network's neural nodes and are therefore essential to the performance of the whole network. Hence, it is critical to choose the most appropriate activation function for a neural network's calculations. Acharya et al. (2018) suggest that numerous recipes have been formulated over the years, though some of them are now considered deprecated because they fail to operate properly under some conditions. These functions have a variety of characteristics deemed essential to successful learning, such as their monotonicity, their derivatives, and the finiteness of their range. This research paper will evaluate the commonly used activation functions, such as swish, ReLU, Sigmoid, and so forth, followed by their properties, their pros and cons, and recommendations on where each formula applies.
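For reference, the textbook forms of three of the functions named above, written in NumPy; these are the standard definitions, not formulas taken from the chapter itself.

```python
import numpy as np

def sigmoid(x):
    # Bounded, monotonic, smooth; saturates for large |x|.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Unbounded above, zero for negative inputs; cheap to compute.
    return np.maximum(0.0, x)

def swish(x, beta=1.0):
    # Smooth and non-monotonic; approaches ReLU as beta grows.
    return x * sigmoid(beta * x)

x = np.linspace(-5, 5, 11)
for fn in (sigmoid, relu, swish):
    print(fn.__name__, np.round(fn(x), 3))
```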

113 citations


Journal ArticleDOI
TL;DR: In this paper, the authors study nonasymptotic high-probability bounds for deep feedforward neural networks and their use in semiparametric inference, and demonstrate the approach with an empirical application to direct mail marketing.
Abstract: We study deep neural networks and their use in semiparametric inference. We establish novel nonasymptotic high probability bounds for deep feedforward neural nets. These deliver rates of convergence that are sufficiently fast (in some cases minimax optimal) to allow us to establish valid second‐step inference after first‐step estimation with deep learning, a result also new to the literature. Our nonasymptotic high probability bounds, and the subsequent semiparametric inference, treat the current standard architecture: fully connected feedforward neural networks (multilayer perceptrons), with the now‐common rectified linear unit activation function, unbounded weights, and a depth explicitly diverging with the sample size. We discuss other architectures as well, including fixed‐width, very deep networks. We establish the nonasymptotic bounds for these deep nets for a general class of nonparametric regression‐type loss functions, which includes as special cases least squares, logistic regression, and other generalized linear models. We then apply our theory to develop semiparametric inference, focusing on causal parameters for concreteness, and demonstrate the effectiveness of deep learning with an empirical application to direct mail marketing.

96 citations


Journal ArticleDOI
TL;DR: In this article, an artificial neural network model was developed based on information from the well-tested HYDRUS-2D/3D model, and the methodological process for defining the drainage retention capacity of surface layers under conditions of unsteady-state groundwater flow was demonstrated.
Abstract: The methodological process for defining the drainage retention capacity of surface layers under conditions of unsteady-state groundwater flow was demonstrated. An artificial neural network model was developed based on information from the well-tested HYDRUS-2D/3D model. Artificial neural network modeling is reported as an alternative to physically based modeling of subsurface water distribution from trickle emitters. Three options are explored to create input-output functional relations from information generated by a numerical model (HYDRUS-2D). Artificial neural networks are effective tools for modeling non-linear systems in various engineering fields. Each artificial neural network includes an input layer and an output layer, between which there are one or more hidden layers, and each layer contains one or several processing elements, or neurons. The neurons of the input layer are the independent variables of the problem under study and the neurons of the output layer are its dependent variables. By applying weights to the inputs and passing the result through an activation function, an artificial neural network attempts to achieve the desired output. In this research, artificial neural networks were used to calculate the drain spacing under unsteady-state conditions in a region situated in the northeast of Ahwaz, Iran, with different soil properties and drain spacings. The neurons in the input layer were the specific yield, the hydraulic conductivity, the depth of the impermeable layer, and the height of the water table in the middle of the interval between the drains at two time steps; the neuron in the output layer was the drain spacing. The network designed in this research included one hidden layer with four neurons. The drain spacing computed via this method agreed well with real values and was more precise than other methods. This was done for three types of activation functions: linear, hyperbolic tangent, and sigmoid, with mean errors of 0.1455, 0.092, and 0.0491, respectively.
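A minimal NumPy sketch of the network shape described above (one hidden layer with four neurons and a single output neuron for the drain spacing), comparing the three activation functions mentioned. Five inputs are assumed on the reading that the water-table height enters at two time steps, and the weights are random placeholders rather than the trained values.

```python
import numpy as np

def forward(x, w_hidden, b_hidden, w_out, b_out, activation):
    """One-hidden-layer network: inputs -> 4 hidden neurons -> drain spacing."""
    h = activation(x @ w_hidden + b_hidden)
    return h @ w_out + b_out

rng = np.random.default_rng(0)
n_inputs = 5                                 # assumed input dimension
x = rng.normal(size=(1, n_inputs))           # placeholder, unscaled inputs
w_hidden = rng.normal(size=(n_inputs, 4))    # placeholder weights
b_hidden = np.zeros(4)
w_out = rng.normal(size=(4, 1))
b_out = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for name, act in [("linear", lambda z: z), ("tanh", np.tanh), ("sigmoid", sigmoid)]:
    print(name, forward(x, w_hidden, b_hidden, w_out, b_out, act))
```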

88 citations


Journal ArticleDOI
TL;DR: A general novel methodology, the scaled polynomial constant unit activation function "SPOCU," is introduced and shown to work satisfactorily on a variety of problems, and it is shown that SPOCU can outperform previously introduced activation functions with good properties on generic problems.
Abstract: We address the following problem: given a set of complex images or a large database, the numerical and computational complexity and the quality of approximation for a neural network may differ drastically from one activation function to another. A general novel methodology, the scaled polynomial constant unit activation function "SPOCU," is introduced and shown to work satisfactorily on a variety of problems. Moreover, we show that SPOCU can outperform activation functions with already established good properties, e.g., SELU and ReLU, on generic problems. In order to explain the good properties of SPOCU, we provide several theoretical and practical motivations, including a tissue growth model and memristive cellular nonlinear networks. We also provide an estimation strategy for the SPOCU parameters and its relation to the generation of a random-type Sierpinski carpet, related to the [pppq] model. One of the attractive properties of SPOCU is its genuine normalization of the output of layers. We illustrate the SPOCU methodology on cancer discrimination, including mammary and prostate cancer and data from the Wisconsin Diagnostic Breast Cancer dataset. Moreover, we compared SPOCU with SELU and ReLU on the large MNIST dataset, where its very good performance further justifies its usefulness.

67 citations


Journal ArticleDOI
TL;DR: Overall, the findings identify potential causes for issues in the training procedure of deep learning such as no guaranteed convergence, explosion of parameters, and slow convergence.
Abstract: We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties. It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to $L^p$-norms, $0 < p < \infty$, for all practically used activation functions, and also not closed with respect to the $L^\infty$-norm for all practically used activation functions except for the ReLU and the parametric ReLU. Finally, the function that maps a family of weights to the function computed by the associated network is not inverse stable for every practically used activation function. In other words, if $f_1, f_2$ are two functions realized by neural networks and if $f_1, f_2$ are close in the sense that $\Vert f_1 - f_2\Vert_{L^\infty} \le \varepsilon$ for $\varepsilon > 0$, it is, regardless of the size of $\varepsilon$, usually not possible to find weights $w_1, w_2$ close together such that each $f_i$ is realized by a neural network with weights $w_i$. Overall, our findings identify potential causes for issues in the training procedure of deep learning, such as no guaranteed convergence, explosion of parameters, and slow convergence.

66 citations


Journal ArticleDOI
TL;DR: This paper extends a method, called the bilinear neural network method (BNNM), to obtain exact solutions of nonlinear partial differential equations, and new test functions are constructed by using this method.
Abstract: This paper extends a method, called the bilinear neural network method (BNNM), to obtain exact solutions of nonlinear partial differential equations. New test functions are constructed by using this method. These test functions are composed of specific activation functions of the single-layer model, specific activation functions of the "2-2" model, and arbitrary functions of the "2-2-3" model. By means of the BNNM, nineteen sets of exact analytical solutions and twenty-four arbitrary-function solutions of the dimensionally reduced p-gBKP equation are obtained via symbolic computation with the help of Maple. Fractal soliton waves are obtained by choosing appropriate values, and the self-similar characteristics of these waves are observed by reducing the observation range and amplifying part of the picture. By giving a specific activation function in the single-layer neural network model, exact periodic waves and breathers are obtained. Via various three-dimensional plots, contour plots and density plots, the evolution characteristics of these waves are exhibited.

62 citations


Journal ArticleDOI
TL;DR: It is established that allowing the networks to have certain types of “skip connections” does not change the resulting approximation spaces, and some functions of very low Besov smoothness can nevertheless be well approximated by neural networks, if these networks are sufficiently deep.
Abstract: We study the expressivity of deep neural networks. Measuring a network's complexity by its number of connections or by its number of neurons, we consider the class of functions for which the error of best approximation with networks of a given complexity decays at a certain rate when increasing the complexity budget. Using results from classical approximation theory, we show that this class can be endowed with a (quasi)-norm that makes it a linear function space, called approximation space. We establish that allowing the networks to have certain types of "skip connections" does not change the resulting approximation spaces. We also discuss the role of the network's nonlinearity (also known as activation function) on the resulting spaces, as well as the role of depth. For the popular ReLU nonlinearity and its powers, we relate the newly constructed spaces to classical Besov spaces. The established embeddings highlight that some functions of very low Besov smoothness can nevertheless be well approximated by neural networks, if these networks are sufficiently deep.

58 citations


Journal ArticleDOI
TL;DR: The goal of this work is to propose an ensemble of Convolutional Neural Networks trained using several different activation functions; moreover, a novel activation function is proposed here for the first time to improve the performance of Convolutional Neural Networks on small/medium-sized biomedical datasets.
Abstract: Activation functions play a vital role in the training of Convolutional Neural Networks. For this reason, developing efficient and well-performing functions is a crucial problem in the deep learning community. The idea of these approaches is to allow reliable parameter learning while avoiding vanishing gradient problems. The goal of this work is to propose an ensemble of Convolutional Neural Networks trained using several different activation functions. Moreover, a novel activation function is here proposed for the first time. Our aim is to improve the performance of Convolutional Neural Networks on small/medium-sized biomedical datasets. Our results clearly show that the proposed ensemble outperforms Convolutional Neural Networks trained with the standard ReLU activation function. The proposed ensemble outperforms each tested stand-alone activation function with a p-value of 0.01; for a reliable performance comparison we tested our approach on more than 10 datasets, using two well-known Convolutional Neural Networks: Vgg16 and ResNet50.
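A minimal sketch of the ensembling step, assuming the class probabilities produced by networks trained with different activation functions are fused by simple averaging (a sum rule); the abstract does not state the paper's actual fusion rule, and the values below are placeholders.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-model class probabilities.

    prob_list: list of arrays of shape (n_samples, n_classes), one per
    network trained with a different activation function.
    """
    return np.mean(np.stack(prob_list, axis=0), axis=0)

# Placeholder probabilities from three networks trained with different activations.
p_relu  = np.array([[0.7, 0.3], [0.2, 0.8]])
p_swish = np.array([[0.6, 0.4], [0.1, 0.9]])
p_new   = np.array([[0.8, 0.2], [0.3, 0.7]])
print(ensemble_predict([p_relu, p_swish, p_new]).argmax(axis=1))
```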

Journal ArticleDOI
TL;DR: A novel, scalable, and efficient technique based on the simplex method, extended to handle the non-convex Rectified Linear Unit (ReLU) activation function, which is a crucial ingredient in many modern neural networks.
Abstract: Deep neural networks have emerged as a widely used and effective means for tackling complex, real-world problems. However, a major obstacle in applying them to safety-critical systems is the great difficulty in providing formal guarantees about their behavior. We present a novel, scalable, and efficient technique for verifying properties of deep neural networks (or providing counter-examples). The technique is based on the simplex method, extended to handle the non-convex Rectified Linear Unit (ReLU) activation function, which is a crucial ingredient in many modern neural networks. The verification procedure tackles neural networks as a whole, without making any simplifying assumptions. We evaluated our technique on a prototype deep neural network implementation of the next-generation airborne collision avoidance system for unmanned aircraft (ACAS Xu). Results show that our technique can successfully prove properties of networks that are an order of magnitude larger than the largest networks that could be verified previously.
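The key ingredient is the case split on each ReLU: the unit is either inactive (output 0, pre-activation at most 0) or active (output equal to a non-negative pre-activation), and within each case the network is linear. The toy sketch below makes this concrete by brute-force enumerating the activation patterns of a two-neuron hidden layer and solving one linear program per pattern with SciPy; the actual verifier performs these splits lazily inside a modified simplex procedure and scales far beyond such naive enumeration. The network weights and the property checked here are arbitrary placeholders.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Tiny ReLU network: y = w2 @ relu(W1 @ x + b1) + b2, with x in [-1, 1]^2.
W1 = np.array([[1.0, -1.0], [0.5, 2.0]])
b1 = np.array([0.0, -0.5])
w2 = np.array([1.0, 1.0])
b2 = 0.0
threshold = 2.5          # property to check: y <= threshold for all admissible x

violated = False
for pattern in itertools.product([False, True], repeat=2):   # ReLU inactive/active
    active = np.array(pattern)
    # Linear constraints that enforce the chosen phase of each ReLU.
    A_ub = np.vstack([-W1[active], W1[~active]])
    b_ub = np.concatenate([b1[active], -b1[~active]])
    # With the pattern fixed, y is linear in x; maximise it (minimise -y).
    c = -(w2[active] @ W1[active]) if active.any() else np.zeros(2)
    offset = (w2[active] @ b1[active] + b2) if active.any() else b2
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(-1, 1), (-1, 1)])
    if res.success and -res.fun + offset > threshold:
        violated = True
print("property violated" if violated else "property holds")
```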

Journal ArticleDOI
TL;DR: The results indicate that, for the same input dimension, the BRNN divides the input space into a greater number of linear regions than the ReLU network, which explains to a certain extent why the BRNN has superior approximation ability.

Journal ArticleDOI
TL;DR: The proposed RSigELU activation functions can overcome the vanishing gradient and negative region problems and can be effective in the positive, negative, and linear activation regions.
Abstract: In deep learning models, the inputs to the network are processed using activation functions to generate the output corresponding to these inputs. Deep learning models are of particular importance in analyzing big data with numerous parameters and forecasting and are useful for image processing, natural language processing, object recognition, and financial forecasting. Sigmoid and tangent activation functions, which are traditional activation functions, are widely used in deep learning models. However, the sigmoid and tangent activation functions face the vanishing gradient problem. In order to overcome this problem, the ReLU activation function and its derivatives were proposed in the literature. However, there is a negative region problem in these activation functions. In this study, novel RSigELU activation functions, such as single-parameter RSigELU (RSigELUS) and double-parameter (RSigELUD), which are a combination of ReLU, sigmoid, and ELU activation functions, were proposed. The proposed RSigELUS and RSigELUD activation functions can overcome the vanishing gradient and negative region problems and can be effective in the positive, negative, and linear activation regions. Performance evaluation of the proposed RSigELU activation functions was performed on the MNIST, Fashion MNIST, CIFAR-10, and IMDb Movie benchmark datasets. Experimental evaluations showed that the proposed activation functions perform better than other activation functions.

Journal ArticleDOI
TL;DR: In this paper, it is shown that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations; for non-smooth initial conditions these solutions are high-dimensional and non-smooth, so their approximation would ordinarily suffer from a curse of dimension.
Abstract: We demonstrate that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations. For non-smooth initial conditions, the solutions of these PDEs are high-dimensional and non-smooth; therefore, approximation of these functions suffers from a curse of dimension. We demonstrate that, through their inherent compositionality, deep neural networks can resolve the characteristic flow underlying the transport equations and thereby allow approximation rates independent of the parameter dimension.

Journal ArticleDOI
TL;DR: In this article, an approach for the generation of an adaptive sigmoid-like and PReLU nonlinear activation function of an all-optical perceptron, exploiting the bistability of an injection-locked Fabry-Perot semiconductor laser, is presented.
Abstract: We present an approach for the generation of an adaptive sigmoid-like and PReLU nonlinear activation function of an all-optical perceptron, exploiting the bistability of an injection-locked Fabry–Perot semiconductor laser. The profile of the activation function can be tailored by adjusting the injection-locked side-mode order, frequency detuning of the input optical signal, Henry factor, or bias current. The universal fitting function for both families of the activation functions is presented.

Journal ArticleDOI
TL;DR: New theoretical results on multistability and complete stability of recurrent neural networks with a sinusoidal activation function are presented, and criteria for complete stability and instability of equilibria are derived for recurrent neural networks without time delay.
Abstract: This article presents new theoretical results on multistability and complete stability of recurrent neural networks with a sinusoidal activation function. Sufficient criteria are provided for ascertaining the stability of recurrent neural networks with various numbers of equilibria, such as a unique equilibrium, finite, and countably infinite numbers of equilibria. Multiple exponential stability criteria of equilibria are derived, and the attraction basins of equilibria are estimated. Furthermore, criteria for complete stability and instability of equilibria are derived for recurrent neural networks without time delay. In contrast to the existing stability results with a finite number of equilibria, the new criteria, herein, are applicable for both finite and countably infinite numbers of equilibria. Two illustrative examples with finite and countably infinite numbers of equilibria are elaborated to substantiate the results.

Journal ArticleDOI
TL;DR: A characterization, a representation, a construction method, and an existence result are presented, each of which applies to any universal approximator on most function spaces of practical interest; together, these improve the known capabilities of the feed-forward architecture.
Abstract: The universal approximation property of various machine learning models is currently only understood on a case-by-case basis, limiting the rapid development of new theoretically justified neural network architectures and blurring our understanding of our current models’ potential. This paper works towards overcoming these challenges by presenting a characterization, a representation, a construction method, and an existence result, each of which applies to any universal approximator on most function spaces of practical interest. Our characterization result is used to describe which activation functions allow the feed-forward architecture to maintain its universal approximation capabilities when multiple constraints are imposed on its final layers and its remaining layers are only sparsely connected. These include a rescaled and shifted Leaky ReLU activation function but not the ReLU activation function. Our construction and representation result is used to exhibit a simple modification of the feed-forward architecture, which can approximate any continuous function with non-pathological growth, uniformly on the entire Euclidean input space. This improves the known capabilities of the feed-forward architecture.

Journal ArticleDOI
TL;DR: In this article, the dendritic neuron model (DNM) is extended from the real-valued domain to the complex-valued domain, and the resulting complex-valued DNM (CDNM) is evaluated on a complex XOR problem, a non-minimum-phase equalization problem, and a real-world wind prediction task.
Abstract: A single dendritic neuron model (DNM) that owns the nonlinear information processing ability of dendrites has been widely used for classification and prediction. Complex-valued neural networks that consist of a number of multiple/deep-layer McCulloch-Pitts neurons have achieved great successes so far since neural computing was utilized for signal processing. Yet no complex-valued representations appear in single-neuron architectures. In this article, we first extend the DNM from the real-valued domain to a complex-valued one. The performance of the complex-valued DNM (CDNM) is evaluated through a complex XOR problem, a non-minimum phase equalization problem, and a real-world wind prediction task. Also, a comparative analysis of a set of elementary transcendental functions as activation functions is implemented, and preparatory experiments are carried out to determine the hyperparameters. The experimental results indicate that the proposed CDNM significantly outperforms the real-valued DNM, a complex-valued multi-layer perceptron, and other complex-valued neuron models.

Journal Article
TL;DR: This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance, and discovers both general activation functions and specialized functions for different architectures, consistently improving accuracy over ReLU and other activation functions by significant margins.
Abstract: Recent studies have shown that the choice of activation function can significantly affect the performance of deep learning networks. However, the benefits of novel activation functions have been inconsistent and task-dependent, and therefore the rectified linear unit (ReLU) is still the most commonly used. This paper proposes a technique for customizing activation functions automatically, resulting in reliable improvements in performance. Evolutionary search is used to discover the general form of the function, and gradient descent to optimize its parameters for different parts of the network and over the learning process. Experiments with three different neural network architectures on the CIFAR-100 image classification dataset show that this approach is effective. It discovers different activation functions for different architectures, and consistently improves accuracy over ReLU and other recently proposed activation functions by significant margins. The approach can therefore be used as an automated optimization step in applying deep learning to new tasks.

Journal ArticleDOI
TL;DR: The purpose of implementing a CMOS-based design for a hyperbolic tangent activation function (Tanh) to be used in memristive-based neuromorphic architectures is to decrease power dissipation and area usage and increase the overall speed of computation in ANNs.
Abstract: Recently, enormous datasets have made power dissipation and area usage lie at the heart of designs for artificial neural networks (ANNs). Considering the significant role of activation functions in neurons and the growth of hardware-based neural networks like memristive neural networks, this work proposes a novel design for a hyperbolic tangent activation function (Tanh) to be used in memristive-based neuromorphic architectures. The purpose of implementing a CMOS-based design for Tanh is to decrease power dissipation and area usage. This design also increases the overall speed of computation in ANNs, while keeping the accuracy in an acceptable range. The proposed design is one of the first analog designs for the hyperbolic tangent, and its performance is analyzed using two well-known datasets, the Modified National Institute of Standards and Technology (MNIST) dataset and Fashion-MNIST. The direct implementation of the proposed Tanh design is investigated via software and hardware modeling.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a novel activation function named the Tanh Exponential Activation Function (TanhExp), which can significantly improve the performance of lightweight neural networks on image classification tasks.
Abstract: Lightweight or mobile neural networks used for real-time computer vision tasks contain fewer parameters than normal networks, which leads to constrained performance. In this work, we propose a novel activation function named the Tanh Exponential Activation Function (TanhExp), which can significantly improve the performance of these networks on image classification tasks. The definition of TanhExp is f(x) = x · tanh(e^x). We demonstrate the simplicity, efficiency, and robustness of TanhExp on various datasets and network models, and TanhExp outperforms its counterparts in both convergence speed and accuracy. Its behaviour also remains stable even when noise is added or the dataset is altered. We show that, without increasing the size of the network, the capacity of lightweight neural networks can be enhanced by TanhExp with only a few training epochs and no extra parameters added.
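Since the abstract gives the closed form, a one-line NumPy version is easy to check:

```python
import numpy as np

def tanhexp(x):
    # TanhExp as defined in the abstract: f(x) = x * tanh(exp(x)).
    return x * np.tanh(np.exp(x))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(np.round(tanhexp(x), 4))
# Close to the identity for large positive x (tanh(e^x) -> 1) and
# decaying towards 0 for large negative x.
```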

Journal ArticleDOI
TL;DR: In this paper, two nonlinear-function-activated zeroing neural networks are employed to solve the Jacobian estimation problem and the trajectory tracking problem, respectively; theoretical analysis proves that the proposed control scheme achieves finite-time convergence when employing nonlinear activation functions and that the tracking error will not exceed the upper bound under bounded noise interference.

Journal ArticleDOI
TL;DR: A special kind of transfer learning based on the electromagnetic property from the attributed scattering center model is applied in networks to modulate the first convolutional layer and shows a better performance in terms of classification accuracy compared to random weight initialization.
Abstract: Considering that synthetic aperture radar (SAR) images obtained directly after signal processing are in the form of complex matrices, we propose a complex convolutional network for SAR target recognition. In this article, we give a brief introduction to complex convolutional networks and compare them with the real counterpart. A complex activation function is applied to analyze the influence of phase information in complex neural networks. Inspired by the theory of network visualization, a special kind of transfer learning based on the electromagnetic property from the attributed scattering center model is applied in our networks to modulate the first convolutional layer. The experiment shows a better performance in terms of classification accuracy compared to random weight initialization.

Journal ArticleDOI
TL;DR: The results of the experimental study indicate that it is possible to create self-consistent cell-signalling compendia based on AKT protein data that have been computationally simulated to provide valuable insights for cell survival/death regulation.

Journal ArticleDOI
TL;DR: This paper considers the self-synchronization and tracking synchronization issues for a class of nonidentically coupled neural networks model with unknown parameters and diffusion effects using the special structure of neural networks with global Lipschitz activation function.
Abstract: This paper considers the self-synchronization and tracking synchronization issues for a class of nonidentically coupled neural networks with unknown parameters and diffusion effects. Using the special structure of neural networks with a globally Lipschitz activation function, nonidentical terms are treated as external disturbances, which can then be compensated via robust adaptive control techniques. For the case where no common reference trajectory is given in advance, a distributed adaptive controller is proposed to drive the synchronization error to an adjustable bounded area. For the case where a reference trajectory is predesigned, two distributed adaptive controllers are proposed, respectively, to address the tracking synchronization problem with bounded and unbounded reference trajectories; different decomposition methods are given to extract the heterogeneous characteristics. To avoid the appearance of global information, such as the spectrum of the coupling matrix, corresponding adaptive designs on the coupling strengths are also provided for both cases. Moreover, the upper bounds of the final synchronization errors can be gradually adjusted according to the parameters of the adaptive designs. Finally, numerical examples are given to test the effectiveness of the control algorithms.

Journal ArticleDOI
TL;DR: The authors derived bounds on the error in high-order Sobolev norms incurred by neural networks with the hyperbolic tangent activation function, and provided explicit estimates on the approximation error with respect to the size of the neural networks.

Journal ArticleDOI
05 Nov 2021-Energies
TL;DR: In this paper, an optimized neural network (NN) model was proposed to predict battery average Nusselt number (Nuavg) data using four activation functions, including Sigmoidal, Gaussian, Tanh, and Linear functions.
Abstract: The focus of this work is to computationally obtain an optimized neural network (NN) model to predict battery average Nusselt number (Nuavg) data using four activation functions. The battery Nuavg is highly nonlinear, as reported in the literature, and depends mainly on flow velocity, coolant type, heat generation, thermal conductivity, battery length-to-width ratio, and the space between the parallel battery packs. Nuavg is modeled at first using only one hidden layer in the network (NN1). The number of neurons in NN1 is varied from 1 to 10 with the activation functions Sigmoidal, Gaussian, Tanh, and Linear to obtain the optimized NN1. Similarly, a deep NN (NND) was also analyzed with varying neurons and activation functions to find the optimized number of hidden layers to predict Nuavg. RMSE (root mean square error) and R-squared (R²) are assessed to select the optimized NN model. From this computational experiment, it is found that both NN1 and NND accurately predict the battery data. Six neurons in the hidden layer give the best predictions for NN1, with the Sigmoidal and Gaussian functions providing the best results and outperforming the Tanh and Linear functions; the Linear function, on the other hand, was unable to forecast the battery data adequately. In the NND model, the optimized configuration is obtained at different hidden layers and neurons for each activation function, and the Gaussian and Linear functions outperformed the other two functions. Overall, the deep NN (NND) model predicted better than the single-layered NN (NN1) model for each activation function.
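For context, a small sketch of the two selection metrics reported above, RMSE and R-squared, as they might be computed when comparing candidate activation functions; the Nuavg values below are placeholders, not data from the paper.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Placeholder Nu_avg predictions from one candidate model.
nu_true = np.array([12.1, 14.8, 18.3, 21.7])
nu_pred = np.array([12.4, 14.5, 18.9, 21.2])
print("RMSE:", rmse(nu_true, nu_pred), "R^2:", r_squared(nu_true, nu_pred))
```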

Journal ArticleDOI
TL;DR: A parameter-dependent reciprocally convex inequality (PDRCI) is presented, which encompasses some existing results as special cases; by overcoming the restrictions on slack matrices, it directly leads to performance improvement and reduced conservativeness in the estimator solution.

Journal ArticleDOI
01 Oct 2021
TL;DR: A novel network circuit based on memristor synapses is proposed for bidirectional associative memory with in-situ learning method and an analog neuron circuit is designed to emulate the cubic activation function of neural networks.
Abstract: The memristor is considered a promising synaptic device for neural networks because of its tunable and non-volatile resistance states, which are similar to biological synapses. In this article, a novel network circuit based on memristor synapses is proposed for bidirectional associative memory with an in-situ learning method. An analog neuron circuit is designed to emulate the cubic activation function of neural networks. A memristive synapse circuit is constructed to map both positive and negative weights onto a single memristor. Moreover, an in-situ learning circuit fitting the memristor's nonlinear characteristic is proposed. A feedback control strategy is incorporated in this learning circuit to adjust the resistance of the memristor and avoid encoding errors in the memristor's write voltage. The performance of the proposed network circuit is verified by training and recall simulations, and a comparison with related works demonstrates the advantage of the proposed circuit design.