
Showing papers in "IEEE Transactions on Neural Networks in 2020"


Journal ArticleDOI
TL;DR: In this paper, the authors propose sparse ternary compression (STC), a new compression framework that is specifically designed to meet the requirements of the federated learning environment, which extends the existing compression technique of top- $k$ gradient sparsification with a novel mechanism to enable downstream compression as well as ternarization and optimal Golomb encoding of the weight updates.
Abstract: Federated learning allows multiple parties to jointly train a deep learning model on their combined data, without any of the participants having to reveal their local data to a centralized server. This form of privacy-preserving collaborative learning, however, comes at the cost of a significant communication overhead during training. To address this problem, several compression methods have been proposed in the distributed training literature that can reduce the amount of required communication by up to three orders of magnitude. These existing methods, however, are only of limited utility in the federated learning setting, as they either only compress the upstream communication from the clients to the server (leaving the downstream communication uncompressed) or only perform well under idealized conditions, such as i.i.d. distribution of the client data, which typically cannot be found in federated learning. In this article, we propose sparse ternary compression (STC), a new compression framework that is specifically designed to meet the requirements of the federated learning environment. STC extends the existing compression technique of top- $k$ gradient sparsification with a novel mechanism to enable downstream compression as well as ternarization and optimal Golomb encoding of the weight updates. Our experiments on four different learning tasks demonstrate that STC distinctively outperforms federated averaging in common federated learning scenarios. These results advocate for a paradigm shift in federated optimization toward high-frequency low-bitwidth communication, in particular in the bandwidth-constrained learning environments.
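To make the compression pipeline concrete, here is a minimal NumPy sketch of the top-$k$ sparsification and ternarization steps applied to one weight update, written from the abstract's description; the function name, the fixed sparsity rate, and the omission of the Golomb encoding and downstream-compression mechanisms are illustrative assumptions, not the authors' reference implementation.

```python
# Hedged sketch of sparse ternary compression of one weight update (illustration only).
import numpy as np

def sparse_ternary_compress(delta_w, p=0.01):
    """Keep the top-p fraction of |delta_w| and replace the survivors by +/- mu."""
    flat = delta_w.ravel()
    k = max(1, int(p * flat.size))
    topk_idx = np.argpartition(np.abs(flat), -k)[-k:]    # k largest-magnitude entries
    mu = np.abs(flat[topk_idx]).mean()                   # shared magnitude for survivors
    compressed = np.zeros_like(flat)
    compressed[topk_idx] = mu * np.sign(flat[topk_idx])  # ternary values {-mu, 0, +mu}
    return compressed.reshape(delta_w.shape)

update = np.random.randn(256, 128)
print(np.count_nonzero(sparse_ternary_compress(update)))  # roughly 1% of the entries
```

Because only the positions of the surviving entries and a single shared magnitude need to be transmitted, the result lends itself to entropy coding, which is where the Golomb encoding mentioned in the abstract comes in.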

618 citations


Journal ArticleDOI
TL;DR: In this article, a review of state-of-the-art scalable Gaussian process regression (GPR) models is presented, covering global approximations that distill the entire data and local approximations that divide the data for subspace learning.
Abstract: The vast quantity of information brought by big data as well as the evolving computer hardware encourages success stories in the machine learning community. Meanwhile, it poses challenges for Gaussian process regression (GPR), a well-known nonparametric and interpretable Bayesian model, which suffers from cubic complexity in the data size. To improve the scalability while retaining desirable prediction quality, a variety of scalable GPs have been presented. However, they have not yet been comprehensively reviewed and analyzed to be well understood by both academia and industry. The review of scalable GPs in the GP community is timely and important due to the explosion of data size. To this end, this article is devoted to reviewing state-of-the-art scalable GPs involving two main categories: global approximations that distill the entire data and local approximations that divide the data for subspace learning. Particularly, for global approximations, we mainly focus on sparse approximations comprising prior approximations that modify the prior but perform exact inference, posterior approximations that retain the exact prior but perform approximate inference, and structured sparse approximations that exploit specific structures in the kernel matrix; for local approximations, we highlight the mixture/product of experts that conducts model averaging from multiple local experts to boost predictions. To present a complete review, recent advances for improving the scalability and capability of scalable GPs are reviewed. Finally, the extensions and open issues of scalable GPs in various scenarios are reviewed and discussed to inspire novel ideas for future research avenues.
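As one concrete instance of the global sparse approximations surveyed, here is a small NumPy sketch of a Nystrom-style low-rank factorization of an RBF kernel matrix built from $m$ inducing points, which reduces the cubic cost in $n$ to roughly $O(nm^2)$; the kernel, the random inducing-point selection, and all variable names are illustrative choices on our part, not a specific method from the review.

```python
# Hedged sketch: rank-m Nystrom-style surrogate of an n x n RBF kernel matrix.
import numpy as np

def rbf(a, b, lengthscale=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))                            # n = 2000 training inputs
Z = X[rng.choice(len(X), size=50, replace=False)]         # m = 50 inducing points
K_nm = rbf(X, Z)
K_mm = rbf(Z, Z) + 1e-6 * np.eye(len(Z))                  # jitter for numerical stability
K_approx = K_nm @ np.linalg.solve(K_mm, K_nm.T)           # low-rank surrogate of K_nn
```

In practice the full surrogate matrix is never formed explicitly; the low-rank factors are used directly inside the GP predictive equations.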

381 citations


Journal ArticleDOI
TL;DR: The results demonstrate that the proposed asynchronous federated deep learning outperforms the baseline algorithm both in terms of communication cost and model accuracy.
Abstract: Federated learning obtains a central model on the server by aggregating models trained locally on clients. As a result, federated learning does not require clients to upload their data to the server, thereby preserving the data privacy of the clients. One challenge in federated learning is to reduce the client–server communication since the end devices typically have very limited communication bandwidth. This article presents an enhanced federated learning technique by proposing an asynchronous learning strategy on the clients and a temporally weighted aggregation of the local models on the server. In the asynchronous learning strategy, different layers of the deep neural networks (DNNs) are categorized into shallow and deep layers, and the parameters of the deep layers are updated less frequently than those of the shallow layers. Furthermore, a temporally weighted aggregation strategy is introduced on the server to make use of the previously trained local models, thereby enhancing the accuracy and convergence of the central model. The proposed algorithm is empirically evaluated on two data sets with different DNNs. Our results demonstrate that the proposed asynchronous federated deep learning outperforms the baseline algorithm in terms of both communication cost and model accuracy.
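The aggregation step described above can be sketched as follows; weighting each local model by its data size times an exponential decay in its age is our reading of "temporally weighted aggregation", and the decay base, names, and shapes are illustrative assumptions rather than the paper's exact scheme.

```python
# Hedged sketch of temporally weighted aggregation of local models on the server.
import numpy as np

def temporally_weighted_average(local_models, trained_rounds, current_round, data_sizes):
    """local_models: flat parameter vectors; trained_rounds: round in which each was last updated."""
    weights = np.array([n * 0.5 ** (current_round - t)          # older models count less
                        for n, t in zip(data_sizes, trained_rounds)], dtype=float)
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, local_models))

models = [np.random.randn(10) for _ in range(3)]
central = temporally_weighted_average(models, trained_rounds=[5, 3, 1],
                                      current_round=5, data_sizes=[100, 80, 120])
```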

364 citations


Journal ArticleDOI
TL;DR: The proposed model defeats the state-of-the-art deep learning approaches applied to place recognition and is easily trained via the standard backpropagation method.
Abstract: We propose an end-to-end place recognition model based on a novel deep neural network. First, we propose to exploit the spatial pyramid structure of the images to enhance the vector of locally aggregated descriptors (VLAD) such that the enhanced VLAD features can reflect the structural information of the images. To encode this feature extraction into the deep learning method, we build a spatial pyramid-enhanced VLAD (SPE-VLAD) layer. Next, we impose weight constraints on the terms of the traditional triplet loss (T-loss) function such that the weighted T-loss (WT-loss) function avoids the suboptimal convergence of the learning process. The loss function can work well under weakly supervised scenarios in that it determines the semantically positive and negative samples of each query through not only the GPS tags but also the Euclidean distance between the image representations. The SPE-VLAD layer and the WT-loss layer are integrated with the VGG-16 network or ResNet-18 network to form a novel end-to-end deep neural network that can be easily trained via the standard backpropagation method. We conduct experiments on three benchmark data sets, and the results demonstrate that the proposed model defeats the state-of-the-art deep learning approaches applied to place recognition.
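As a rough illustration of the weighted triplet-loss idea (each triplet term carries its own weight rather than contributing equally), here is a hedged PyTorch sketch; the actual WT-loss weighting scheme and the GPS-plus-distance mining of positives and negatives are not reproduced, and all names and shapes below are assumptions.

```python
# Hedged sketch of a triplet loss with per-negative weights (illustration only).
import torch
import torch.nn.functional as F

def weighted_triplet_loss(query, positive, negatives, weights, margin=0.5):
    """query, positive: (D,); negatives: (K, D); weights: (K,) per-term weights."""
    d_pos = F.pairwise_distance(query.unsqueeze(0), positive.unsqueeze(0))           # (1,)
    d_neg = F.pairwise_distance(query.unsqueeze(0).expand_as(negatives), negatives)  # (K,)
    return (weights * F.relu(d_pos + margin - d_neg)).mean()

q, p = torch.randn(128), torch.randn(128)
negs, w = torch.randn(10, 128), torch.ones(10)
print(weighted_triplet_loss(q, p, negs, w))
```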

281 citations


Journal ArticleDOI
TL;DR: A finite-time controller, which is capable of ensuring the semiglobal practical finite-time stability for the closed-loop systems, is developed using the adaptive neural networks control method, adding one power integrator technique and backstepping scheme.
Abstract: This article addresses the finite-time optimal control problem for a class of nonlinear systems whose powers are positive odd rational numbers. First of all, a finite-time controller, which is capable of ensuring the semiglobal practical finite-time stability for the closed-loop systems, is developed using the adaptive neural networks (NNs) control method, adding one power integrator technique and backstepping scheme. Second, the corresponding design parameters are optimized, and the finite-time optimal control property is obtained by means of minimizing the well-defined and designed cost function. Finally, a numerical simulation example is given to further validate the feasibility and effectiveness of the proposed optimal control strategy.

269 citations


Journal ArticleDOI
TL;DR: This paper proposes a pattern-balanced semisupervised framework to extract and preserve diverse latent patterns of activities from multimodal wearable sensory data, and exploits the independence of multi-modalities of sensory data and attentively identifies salient regions that are indicative of human activities from inputs by the authors' recurrent convolutional attention networks.
Abstract: Recent years have witnessed the success of deep learning methods in human activity recognition (HAR). The longstanding shortage of labeled activity data inherently calls for a plethora of semisupervised learning methods, and one of the most challenging and common issues with semisupervised learning is the imbalanced distribution of labeled data over classes. Although the problem has long existed in broad real-world HAR applications, it is rarely explored in the literature. In this paper, we propose a semisupervised deep model for imbalanced activity recognition from multimodal wearable sensory data. We aim to address not only the challenges of multimodal sensor data (e.g., interperson variability and interclass similarity) but also the limited labeled data and class-imbalance issues simultaneously. In particular, we propose a pattern-balanced semisupervised framework to extract and preserve diverse latent patterns of activities. Furthermore, we exploit the independence of multi-modalities of sensory data and attentively identify salient regions that are indicative of human activities from inputs by our recurrent convolutional attention networks. Our experimental results demonstrate that the proposed model achieves a competitive performance compared to a multitude of state-of-the-art methods, both semisupervised and supervised ones, with 10% labeled training data. The results also show the robustness of our method over imbalanced, small training data sets.

245 citations


Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed work can provide an efficient model and architecture for large-scale biologically meaningful networks, while the hardware synthesis results demonstrate low area utilization and high computational speed that supports the scalability of the approach.
Abstract: Multicompartment emulation is an essential step to enhance the biological realism of neuromorphic systems and to further understand the computational power of neurons. In this paper, we present a hardware-efficient, scalable, and real-time computing strategy for the implementation of large-scale biologically meaningful neural networks with one million multi-compartment neurons (CMNs). The hardware platform uses four Altera Stratix III field-programmable gate arrays, and both the cellular and the network levels are considered, which provides an efficient implementation of a large-scale spiking neural network with biophysically plausible dynamics. At the cellular level, a cost-efficient multi-CMN model is presented, which can reproduce the detailed neuronal dynamics with representative neuronal morphology. A set of efficient neuromorphic techniques for single-CMN implementation is presented, eliminating the hardware cost of memory and multiplier resources and enhancing the computational speed by 56.59% in comparison with the classical digital implementation method. At the network level, a scalable network-on-chip (NoC) architecture is proposed with a novel routing algorithm to enhance the NoC performance, including throughput and computational latency, leading to higher computational efficiency and capability in comparison with state-of-the-art projects. The experimental results demonstrate that the proposed work can provide an efficient model and architecture for large-scale biologically meaningful networks, while the hardware synthesis results demonstrate low area utilization and high computational speed that support the scalability of the approach.

240 citations


Journal ArticleDOI
TL;DR: The classification accuracy of the subject-independent (or calibration-free) model outperforms that of subject-dependent models using various methods [common spatial pattern (CSP), common spatiospectral pattern (CSSP), filter bank CSP, and Bayesian spatio-spectral filter optimization (BSSFO)].
Abstract: For a brain–computer interface (BCI) system, a calibration procedure is required for each individual user before he/she can use the BCI. This procedure requires approximately 20–30 min to collect enough data to build a reliable decoder. It is, therefore, an interesting topic to build a calibration-free, or subject-independent, BCI. In this article, we construct a large motor imagery (MI)-based electroencephalography (EEG) database and propose a subject-independent framework based on deep convolutional neural networks (CNNs). The database is composed of 54 subjects performing the left- and right-hand MI on two different days, resulting in 21 600 trials for the MI task. In our framework, we formulated the discriminative feature representation as a combination of the spectral–spatial input embedding the diversity of the EEG signals, as well as a feature representation learned from the CNN through a fusion technique that integrates a variety of discriminative brain signal patterns. To generate spectral–spatial inputs, we first consider the discriminative frequency bands in an information-theoretic observation model that measures the power of the features in two classes. From discriminative frequency bands, spectral–spatial inputs that include the unique characteristics of brain signal patterns are generated and then transformed into a covariance matrix as the input to the CNN. In the process of feature representations, spectral–spatial inputs are individually trained through the CNN and then combined by a concatenation fusion technique. In this article, we demonstrate that the classification accuracy of our subject-independent (or calibration-free) model outperforms that of subject-dependent models using various methods [common spatial pattern (CSP), common spatiospectral pattern (CSSP), filter bank CSP (FBCSP), and Bayesian spatio-spectral filter optimization (BSSFO)].
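The step of turning a band-limited EEG trial into a covariance-matrix input can be sketched roughly as below; the band edges, filter order, and trace normalization are illustrative choices and not necessarily the exact preprocessing used in the article.

```python
# Hedged sketch: band-pass filter one MI trial, then take the channel covariance matrix.
import numpy as np
from scipy.signal import butter, filtfilt

def spectral_spatial_input(trial, fs=250, band=(8, 12), order=4):
    """trial: (channels, samples) array for a single motor-imagery trial."""
    nyq = fs / 2.0
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, trial, axis=1)          # zero-phase band-pass filtering
    cov = np.cov(filtered)                            # (channels, channels) spatial covariance
    return cov / np.trace(cov)                        # trace normalization (common practice)

trial = np.random.randn(62, 1000)                     # e.g., 62 channels, 4 s at 250 Hz
print(spectral_spatial_input(trial).shape)            # (62, 62)
```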

229 citations


Journal ArticleDOI
TL;DR: In this paper, the consensus tracking problem is investigated for a class of continuous switched stochastic nonlinear multiagent systems with an event-triggered control strategy and a new protocol design framework is proposed for the underlying systems.
Abstract: In this paper, the consensus tracking problem is investigated for a class of continuous switched stochastic nonlinear multiagent systems with an event-triggered control strategy. For continuous stochastic multiagent systems via event-triggered protocols, it is rather difficult to avoid the Zeno behavior by the existing methods. Thus, we propose a new protocol design framework for the underlying systems. It is proven that follower agents can almost surely track the given leader signal with bounded errors and no agent exhibits the Zeno behavior by the given control scheme. Finally, two numerical examples are given to illustrate the effectiveness and advantages of the new design techniques.

223 citations


Journal ArticleDOI
TL;DR: Li et al. as discussed by the authors proposed a lightweight pyramid network (LPNet) for single-image deraining, adopting recursive and residual network structures to build a model with fewer than 8K parameters while still achieving state-of-the-art performance on rain removal.
Abstract: Existing deep convolutional neural networks (CNNs) have found major success in image deraining, but at the expense of an enormous number of parameters. This limits their potential applications, e.g., in mobile devices. In this paper, we propose a lightweight pyramid network (LPNet) for single-image deraining. Instead of designing a complex network structure, we use domain-specific knowledge to simplify the learning process. In particular, we find that by introducing the mature Gaussian–Laplacian image pyramid decomposition technology to the neural network, the learning problem at each pyramid level is greatly simplified and can be handled by a relatively shallow network with few parameters. We adopt recursive and residual network structures to build the proposed LPNet, which has fewer than 8K parameters while still achieving state-of-the-art performance on rain removal. We also discuss the potential value of LPNet for other low- and high-level vision tasks.
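The Gaussian–Laplacian decomposition that LPNet feeds to its per-level subnetworks can be sketched as follows; OpenCV is used here only for the blur/resample operations, which is our choice for illustration rather than the authors' implementation.

```python
# Hedged sketch of a Gaussian-Laplacian image pyramid (band-pass levels + coarse residual).
import cv2
import numpy as np

def laplacian_pyramid(img, levels=5):
    pyramid, current = [], img.astype(np.float32)
    for _ in range(levels - 1):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)   # band-pass detail at this scale
        current = down
    pyramid.append(current)            # coarsest Gaussian level
    return pyramid

levels = laplacian_pyramid(np.random.rand(240, 360), levels=5)
print([lvl.shape for lvl in levels])
```

Since a Laplacian pyramid is invertible (upsample the coarsest level and add the details back in), the outputs of the per-level subnetworks can be recombined into a full-resolution derained image.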

221 citations


Journal ArticleDOI
TL;DR: This paper designs a neural network-based adaptive control method that can provide effective control for both actuated and unactuated state variables, based on the original nonlinear ship-mounted crane dynamics without any linearizing operations.
Abstract: As a type of indispensable oceanic transportation tools, ship-mounted crane systems are widely employed to transport cargoes and containers on vessels due to their extraordinary flexibility. However, various working requirements and the oceanic environment may cause some uncertain and unfavorable factors for ship-mounted crane control. In particular, to accomplish different control tasks, some plant parameters (e.g., boom lengths, payload masses, and so on) frequently change; hence, most existing model-based controllers cannot ensure satisfactory control performance any longer. For example, inaccurate gravity compensation may result in positioning errors. Additionally, due to ship roll motions caused by sea waves, residual payload swing generally exists, which may result in safety risks in practice. To solve the above-mentioned issues, this paper designs a neural network-based adaptive control method that can provide effective control for both actuated and unactuated state variables based on the original nonlinear ship-mounted crane dynamics without any linearizing operations. In particular, the proposed update law availably compensates parameter/structure uncertainties for ship-mounted crane systems. Based on a 2-D sliding surface, the boom and rope can arrive at their preset positions in finite time, and the payload swing can be completely suppressed. Furthermore, the problem of nonlinear input dead zones is also taken into account. The stability of the equilibrium point of all state variables in ship-mounted crane systems is theoretically proven by a rigorous Lyapunov-based analysis. The hardware experimental results verify the practicability and robustness of the presented control approach.

Journal ArticleDOI
TL;DR: This paper provides a review of neuromorphic CMOS-memristive architectures that can be integrated into edge computing devices, discusses why the neuromorphic architectures are useful for edge devices, and shows the advantages, drawbacks, and open problems in the field of neuromemristive circuits for edge computing.
Abstract: The volume, veracity, variability, and velocity of data produced from the ever-increasing network of sensors connected to the Internet pose challenges for power management, scalability, and sustainability of cloud computing infrastructure. Increasing the data processing capability of edge computing devices at lower power requirements can reduce several overheads for cloud computing solutions. This paper provides a review of neuromorphic CMOS-memristive architectures that can be integrated into edge computing devices. We discuss why the neuromorphic architectures are useful for edge devices and show the advantages, drawbacks, and open problems in the field of neuromemristive circuits for edge computing.

Journal ArticleDOI
TL;DR: A threshold flux-controlled memristor is presented and its frequency-dependent pinched hysteresis loops are examined, validating the physical mechanism of biological neuron and the reliability of electronic neuron.
Abstract: Memristors can be employed to mimic biological neural synapses or to describe electromagnetic induction effects. To exhibit the threshold effect of electromagnetic induction, this paper presents a threshold flux-controlled memristor and examines its frequency-dependent pinched hysteresis loops. Using an electromagnetic induction current generated by the threshold memristor to replace the external current in the 2-D Hindmarsh–Rose (HR) neuron model, a 3-D memristive HR (mHR) neuron model with global hidden oscillations is established and the corresponding numerical simulations are performed. It is found that, having no equilibrium point, the obtained mHR neuron model always operates in hidden bursting firing patterns, including coexisting hidden bursting firing patterns with bistability. In addition, the model exhibits complex dynamics of the actual neuron electrical activities, acting much like the 3-D HR neuron model, which indicates its feasibility. In particular, by constructing the fold and Hopf bifurcation sets of the fast-scale subsystem, the bifurcation mechanisms of hidden bursting firings are expounded. Finally, circuit experiments on hardware breadboards are deployed, and the captured results match well with the numerical results, validating the physical mechanism of the biological neuron and the reliability of the electronic neuron.

Journal ArticleDOI
TL;DR: A novel adaptive protocol is proposed for the switched nonlinear MASs based on the developed design framework and the neural network method and a numerical example is presented to demonstrate the effectiveness of the proposed control scheme.
Abstract: In this brief, the practical finite-time consensus (FTC) problem is investigated for second-order heterogeneous switched nonlinear multi-agent systems (MASs), where the subsystems and the switching signal for each agent are different. Mainly because the agents' dynamics are switched and the unknown nonlinearities in the systems are more general, the practical FTC problem of the MASs is rather difficult to solve with existing methods. As such, a new protocol design framework for the FTC problem is developed. Then, a novel adaptive protocol is proposed for the switched nonlinear MASs based on the developed design framework and the neural network method. The sufficient conditions for the practical FTC of nonlinear MASs under arbitrary switching are given. Finally, a numerical example is presented to demonstrate the effectiveness of the proposed control scheme.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed algorithm outperforms state-of-the-art hand-crafted CNNs and the CNNs designed by automatic peer competitors in terms of classification performance and achieves a competitive classification accuracy against semiautomatic peer competitors.
Abstract: The performance of convolutional neural networks (CNNs) highly relies on their architectures. In order to design a CNN with promising performance, extensive expertise in both CNNs and the investigated problem domain is required, which is not necessarily available to every interested user. To address this problem, we propose to automatically evolve CNN architectures by using a genetic algorithm (GA) based on ResNet and DenseNet blocks. The proposed algorithm is completely automatic in designing CNN architectures. In particular, neither preprocessing before it starts nor postprocessing in terms of CNNs is needed. Furthermore, the proposed algorithm does not require users to have domain knowledge of CNNs, the investigated problem, or even GAs. The proposed algorithm is evaluated on the CIFAR10 and CIFAR100 benchmark data sets against 18 state-of-the-art peer competitors. Experimental results show that the proposed algorithm outperforms state-of-the-art hand-crafted CNNs and the CNNs designed by automatic peer competitors in terms of classification performance and achieves a competitive classification accuracy against semiautomatic peer competitors. In addition, the proposed algorithm consumes far fewer computational resources than most peer competitors in finding the best CNN architectures.

Journal ArticleDOI
TL;DR: In this article, the authors investigate anomaly detection in an unsupervised framework and introduce long short-term memory (LSTM) neural network-based algorithms, where given variable length data sequences, they first pass these sequences through their LSTM-based structure and obtain fixed length sequences.
Abstract: We investigate anomaly detection in an unsupervised framework and introduce long short-term memory (LSTM) neural network-based algorithms. In particular, given variable length data sequences, we first pass these sequences through our LSTM-based structure and obtain fixed-length sequences. We then find a decision function for our anomaly detectors based on the one-class support vector machines (OC-SVMs) and support vector data description (SVDD) algorithms. For the first time in the literature, we jointly train and optimize the parameters of the LSTM architecture and the OC-SVM (or SVDD) algorithm using highly effective gradient and quadratic programming-based training methods. To apply the gradient-based training method, we modify the original objective criteria of the OC-SVM and SVDD algorithms, where we prove the convergence of the modified objective criteria to the original criteria. We also provide extensions of our unsupervised formulation to the semisupervised and fully supervised frameworks. Thus, we obtain anomaly detection algorithms that can process variable length data sequences while providing high performance, especially for time series data. Our approach is generic so that we also apply this approach to the gated recurrent unit (GRU) architecture by directly replacing our LSTM-based structure with the GRU-based structure. In our experiments, we illustrate significant performance gains achieved by our algorithms with respect to the conventional methods.
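The data flow described above, variable-length sequences mapped to fixed-length vectors by an LSTM and then scored by a one-class SVM, can be sketched as below; the joint gradient/quadratic-programming training of both parts, which is the article's key contribution, is deliberately omitted, and all sizes and names are illustrative.

```python
# Hedged sketch of the LSTM -> fixed-length embedding -> one-class SVM pipeline.
import torch
from sklearn.svm import OneClassSVM

lstm = torch.nn.LSTM(input_size=3, hidden_size=16, batch_first=True)

def embed(sequences):
    """sequences: list of (T_i, 3) tensors with varying lengths T_i."""
    feats = []
    with torch.no_grad():
        for seq in sequences:
            _, (h_n, _) = lstm(seq.unsqueeze(0))       # final hidden state summarizes the sequence
            feats.append(h_n.squeeze(0).squeeze(0).numpy())
    return feats

train = [torch.randn(int(torch.randint(20, 50, (1,))), 3) for _ in range(100)]
detector = OneClassSVM(nu=0.05, kernel="rbf").fit(embed(train))
scores = detector.decision_function(embed(train))      # lower scores = more anomalous
```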

Journal ArticleDOI
TL;DR: Experimental results indicate that the proposed optimization method is able to find optimized neural network models that can not only significantly reduce communication costs but also improve the learning performance of federated learning compared with the standard fully connected neural networks.
Abstract: Federated learning is an emerging technique used to prevent the leakage of private information. Unlike centralized learning that needs to collect data from users and store them collectively on a cloud server, federated learning makes it possible to learn a global model while the data are distributed on the users’ devices. However, compared with the traditional centralized approach, the federated setting consumes considerable communication resources of the clients, which is indispensable for updating global models and prevents this technique from being widely used. In this paper, we aim to optimize the structure of the neural network models in federated learning using a multi-objective evolutionary algorithm to simultaneously minimize the communication costs and the global model test errors. A scalable method for encoding network connectivity is adapted to federated learning to enhance the efficiency in evolving deep neural networks. Experimental results on both multilayer perceptrons and convolutional neural networks indicate that the proposed optimization method is able to find optimized neural network models that can not only significantly reduce communication costs but also improve the learning performance of federated learning compared with the standard fully connected neural networks.

Journal ArticleDOI
TL;DR: This paper addresses the problem of adaptive neural output-feedback decentralized control for a class of strongly interconnected nonlinear systems suffering from stochastic disturbances, for which an observer-based adaptive backstepping decentralized controller is developed.
Abstract: This paper addresses the problem of adaptive neural output-feedback decentralized control for a class of strongly interconnected nonlinear systems suffering from stochastic disturbances. A state observer is designed to estimate the unmeasurable state signals. Using the approximation capability of radial basis function neural networks (NNs) and employing a classic adaptive control strategy, an observer-based adaptive backstepping decentralized controller is developed. In the control design process, NNs are applied to model the uncertain nonlinear functions, and adaptive control and backstepping are combined to construct the controller. The developed control scheme can guarantee that all signals in the closed-loop systems are semiglobally uniformly ultimately bounded in the fourth moment. The simulation results demonstrate the effectiveness of the presented control scheme.

Journal ArticleDOI
TL;DR: This article tackles the recursive filtering problem for a class of stochastic nonlinear time-varying complex networks suffering from both the state saturations and the deception attacks, and designs a state-saturated recursive filter such that a certain upper bound is guaranteed on the filtering error covariance and is then minimized at each time instant.
Abstract: This article tackles the recursive filtering problem for a class of stochastic nonlinear time-varying complex networks (CNs) suffering from both the state saturations and the deception attacks. The nonlinear inner coupling and the state saturations are taken into account to characterize the nonlinear nature of CNs. From the defender’s perspective, the randomly occurring deception attack is governed by a set of Bernoulli binary distributed white sequence with a given probability. The objective of the addressed problem is to design a state-saturated recursive filter such that, in the simultaneous presence of the state saturations and the randomly occurring deception attacks, a certain upper bound is guaranteed on the filtering error covariance, and such an upper bound is then minimized at each time instant. By employing the induction method, an upper bound on the filtering error variance is first constructed in terms of the solutions to a set of matrix difference equations. Subsequently, the filter parameters are appropriately designed to minimize such an upper bound. Finally, a numerical simulation example is provided to demonstrate the feasibility and usefulness of the proposed filtering scheme.

Journal ArticleDOI
TL;DR: This brief addresses the fixed-time event/self-triggered leader–follower consensus problems for networked multi-agent systems subject to nonlinear dynamics and proposes two new self-triggered control strategies to avoid continuous triggering condition monitoring.
Abstract: This brief addresses the fixed-time event/self-triggered leader–follower consensus problems for networked multi-agent systems subject to nonlinear dynamics. First, we present an event-triggered control strategy to achieve the fixed-time consensus, and a new measurement error is designed to avoid Zeno behavior. Then, two new self-triggered control strategies are presented to avoid continuous triggering condition monitoring. Moreover, under the proposed self-triggered control strategies, a strictly positive minimal triggering interval of each follower is given to exclude Zeno behavior. Compared with the existing fixed-time event-triggered results, we propose two new self-triggered control strategies, and the nonlinear term is more general. Finally, the performances of the consensus tracking algorithms are illustrated by a simulation example.

Journal ArticleDOI
TL;DR: Simulation and experiment are carried out to indicate the excellent static and dynamic performances of the proposed DHLRNN-based adaptive global sliding-mode controller, verifying its best approximation performance and the most stable internal state compared with other schemes.
Abstract: In this paper, a full-regulated neural network (NN) with a double hidden layer recurrent neural network (DHLRNN) structure is designed, and an adaptive global sliding-mode controller based on the DHLRNN is proposed for a class of dynamic systems. Theoretical guidance and adaptive adjustment mechanism are established to set up the base width and central vector of the Gaussian function in the DHLRNN structure, where six sets of parameters can be adaptively stabilized to their best values according to different inputs. The new DHLRNN can improve the accuracy and generalization ability of the network, reduce the number of network weights, and accelerate the network training speed due to the strong fitting and presentation ability of two-layer activation functions compared with a general NN with a single hidden layer. Since the neurons of input layer can receive signals which come back from the neurons of output layer in the output feedback neural structure, it can possess associative memory and rapid system convergence, achieving better approximation and superior dynamic capability. Simulation and experiment on an active power filter are carried out to indicate the excellent static and dynamic performances of the proposed DHLRNN-based adaptive global sliding-mode controller, verifying its best approximation performance and the most stable internal state compared with other schemes.

Journal ArticleDOI
TL;DR: The proposed CASC is a joint framework that performs cross-modal attention for local alignment and multilabel prediction for global semantic consistence and directly extracts semantic labels from available sentence corpus without additional labor cost, which provides a global similarity constraint for the aggregated region-word similarity obtained by the local alignment.
Abstract: The task of image–text matching refers to measuring the visual-semantic similarity between an image and a sentence. Recently, fine-grained matching methods that explore the local alignment between the image regions and the sentence words have shown advances in inferring the image–text correspondence by aggregating pairwise region-word similarity. However, the local alignment is hard to achieve as some important image regions may be inaccurately detected or even missing. Meanwhile, some words with high-level semantics cannot strictly correspond to a single image region. To tackle these problems, we address the importance of exploiting the global semantic consistence between image regions and sentence words as complementary to the local alignment. In this article, we propose a novel hybrid matching approach named Cross-modal Attention with Semantic Consistency (CASC) for image–text matching. The proposed CASC is a joint framework that performs cross-modal attention for local alignment and multilabel prediction for global semantic consistence. It directly extracts semantic labels from the available sentence corpus without additional labor cost, which further provides a global similarity constraint for the aggregated region-word similarity obtained by the local alignment. Extensive experiments on the Flickr30k and Microsoft COCO (MSCOCO) data sets demonstrate the effectiveness of the proposed CASC in preserving global semantic consistence along with the local alignment and further show its superior image–text matching performance compared with more than 15 state-of-the-art methods.

Journal ArticleDOI
TL;DR: In this paper, a teacher-student curriculum learning (TSCL) framework is proposed, where the student tries to learn a complex task, and the teacher automatically chooses subtasks from a given set for the student to train on.
Abstract: We propose Teacher–Student Curriculum Learning (TSCL), a framework for automatic curriculum learning, where the Student tries to learn a complex task, and the Teacher automatically chooses subtasks from a given set for the Student to train on. We describe a family of Teacher algorithms that rely on the intuition that the Student should practice more those tasks on which it makes the fastest progress, i.e., where the slope of the learning curve is highest. In addition, the Teacher algorithms address the problem of forgetting by also choosing tasks where the Student's performance is getting worse. We demonstrate that TSCL matches or surpasses the results of carefully hand-crafted curricula in two tasks: addition of decimal numbers with long short-term memory (LSTM) and navigation in Minecraft. Our automatically ordered curriculum of submazes enabled solving a Minecraft maze that could not be solved at all when training directly on that maze, and the learning was an order of magnitude faster than with uniform sampling of those submazes.
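A bare-bones version of the Teacher heuristic described above (practice the subtask whose recent learning-curve slope is largest in absolute value, so that both fast-improving and regressing tasks are selected) might look like this; the epsilon-greedy exploration and the window size are illustrative assumptions rather than the paper's specific Teacher algorithms.

```python
# Hedged sketch of a slope-based Teacher for automatic curriculum learning.
import random

class Teacher:
    def __init__(self, n_tasks, window=10, eps=0.1):
        self.history = [[] for _ in range(n_tasks)]
        self.window, self.eps = window, eps

    def _slope(self, scores):
        if len(scores) < 2:
            return float("inf")                 # force initial exploration of every task
        recent = scores[-self.window:]
        return (recent[-1] - recent[0]) / (len(recent) - 1)

    def choose_task(self):
        if random.random() < self.eps:
            return random.randrange(len(self.history))
        slopes = [abs(self._slope(s)) for s in self.history]
        return max(range(len(slopes)), key=slopes.__getitem__)

    def report(self, task, score):
        self.history[task].append(score)

teacher = Teacher(n_tasks=4)
for step in range(100):
    task = teacher.choose_task()
    teacher.report(task, score=random.random())  # stand-in for the Student's evaluation score
```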

Journal ArticleDOI
TL;DR: Some LMI stabilization criteria are developed for the first time with the help of the newly established fractional-order differential inequality, and the obtained LMI results provide new insights into the research of delayed fractional-order nonlinear systems.
Abstract: This paper addresses the global stabilization of fractional-order memristor-based neural networks (FMNNs) with time delay. The voltage threshold type memristor model is considered, and the FMNNs are represented by fractional-order differential equations with discontinuous right-hand sides. Then, the problem is addressed based on fractional-order differential inclusions and set-valued maps, together with the aid of Lyapunov functions and the comparison principle. Two types of control laws (delayed state feedback control and coupling state feedback control) are designed. Accordingly, two types of stabilization criteria [algebraic form and linear matrix inequality (LMI) form] are established. There are two groups of adjustable parameters included in the delayed state feedback control, which can be selected flexibly to achieve the desired global asymptotic stabilization or global Mittag–Leffler stabilization. Since the existing LMI-based stability analysis techniques for fractional-order systems are not applicable to delayed fractional-order nonlinear systems, a fractional-order differential inequality is established to overcome this difficulty. Based on the coupling state feedback control, some LMI stabilization criteria are developed for the first time with the help of the newly established fractional-order differential inequality. The obtained LMI results provide new insights into the research of delayed fractional-order nonlinear systems. Finally, three numerical examples are presented to illustrate the effectiveness of the proposed theoretical results.

Journal ArticleDOI
Jiliang Zhang, Chen Li
TL;DR: The concept, cause, characteristics, and evaluation metrics of AEs are introduced, and then a survey on the state-of-the-art AE generation methods is given with a discussion of their advantages and disadvantages.
Abstract: Deep neural networks (DNNs) have shown huge superiority over humans in image recognition, speech processing, autonomous vehicles, and medical diagnosis. However, recent studies indicate that DNNs are vulnerable to adversarial examples (AEs), which are designed by attackers to fool deep learning models. Different from real examples, AEs can mislead the model into predicting incorrect outputs while being hardly distinguishable by human eyes, thereby threatening security-critical deep-learning applications. In recent years, the generation and defense of AEs have become a research hotspot in the field of artificial intelligence (AI) security. This article reviews the latest research progress on AEs. First, we introduce the concept, cause, characteristics, and evaluation metrics of AEs, and then give a survey on the state-of-the-art AE generation methods with a discussion of their advantages and disadvantages. After that, we review the existing defenses and discuss their limitations. Finally, future research opportunities and challenges on AEs are prospected.
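For readers unfamiliar with AEs, the classic fast gradient sign method (FGSM) is a compact way to see the "small perturbation, wrong prediction" idea in code; it is given purely as background illustration and is not meant to represent the specific generation methods this survey analyzes.

```python
# Illustrative FGSM attack: perturb the input along the sign of the loss gradient.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()        # bounded, visually imperceptible perturbation
    return x_adv.clamp(0.0, 1.0).detach()
```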

Journal ArticleDOI
TL;DR: A novel optimizer, diffGrad, is proposed based on the difference between the present and the immediate past gradient; experiments show that diffGrad outperforms other optimizers and performs uniformly well for training CNNs using different activation functions.
Abstract: Stochastic gradient descent (SGD) is one of the core techniques behind the success of deep neural networks. The gradient provides information on the direction in which a function has the steepest rate of change. The main problem with basic SGD is that it changes all parameters by equal-sized steps, irrespective of the gradient behavior. Hence, an efficient way of deep network optimization is to have adaptive step sizes for each parameter. Recently, several attempts have been made to improve gradient descent methods, such as AdaGrad, AdaDelta, RMSProp, and adaptive moment estimation (Adam). These methods rely on the square roots of exponential moving averages of squared past gradients. Thus, these methods do not take advantage of local change in gradients. In this article, a novel optimizer is proposed based on the difference between the present and the immediate past gradient (i.e., diffGrad). In the proposed diffGrad optimization technique, the step size is adjusted for each parameter in such a way that it has a larger step size for faster gradient-changing parameters and a lower step size for slower gradient-changing parameters. The convergence analysis is done using the regret-bound approach of the online learning framework. In this article, a thorough analysis is made over three synthetic complex nonconvex functions. The image categorization experiments are also conducted over the CIFAR10 and CIFAR100 data sets to observe the performance of diffGrad with respect to state-of-the-art optimizers such as SGDM, AdaGrad, AdaDelta, RMSProp, AMSGrad, and Adam. The residual unit (ResNet)-based convolutional neural network (CNN) architecture is used in the experiments. The experiments show that diffGrad outperforms other optimizers. Also, we show that diffGrad performs uniformly well for training CNNs using different activation functions. The source code is made publicly available at https://github.com/shivram1987/diffGrad .
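Our reading of the rule sketched in the abstract, written as a single NumPy update step, is below; applying the sigmoid of the absolute gradient difference as an element-wise friction on the Adam first moment is how the description is interpreted here, so treat this as a sketch rather than the official optimizer (the released source is linked above).

```python
# Hedged sketch of a diffGrad-style step: Adam damped by a gradient-difference friction term.
import numpy as np

def diffgrad_step(theta, grad, state, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    m, v, prev_grad, t = state
    t += 1
    m = betas[0] * m + (1 - betas[0]) * grad
    v = betas[1] * v + (1 - betas[1]) * grad ** 2
    m_hat = m / (1 - betas[0] ** t)
    v_hat = v / (1 - betas[1] ** t)
    xi = 1.0 / (1.0 + np.exp(-np.abs(prev_grad - grad)))      # friction coefficient in [0.5, 1)
    theta = theta - lr * xi * m_hat / (np.sqrt(v_hat) + eps)  # small xi damps slowly-changing gradients
    return theta, (m, v, grad.copy(), t)

theta, state = np.zeros(4), (np.zeros(4), np.zeros(4), np.zeros(4), 0)
theta, state = diffgrad_step(theta, np.array([0.5, -0.2, 0.1, 0.0]), state)
```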

Journal ArticleDOI
TL;DR: Experimental results, conducted using three large-scale benchmark data sets, demonstrate that the newly proposed SCCov network exhibits very competitive or superior classification performance when compared with the current state-of-the-art RSSC techniques, using a much lower amount of parameters.
Abstract: This paper proposes a novel end-to-end learning model, called skip-connected covariance (SCCov) network, for remote sensing scene classification (RSSC). The innovative contribution of this paper is to embed two novel modules into the traditional convolutional neural network (CNN) model, i.e., skip connections and covariance pooling. The advantages of the newly developed SCCov are twofold. First, by means of the skip connections, the multi-resolution feature maps produced by the CNN are combined together, which provides important benefits to address the presence of large-scale variance in RSSC data sets. Second, by using covariance pooling, we can fully exploit the second-order information contained in such multi-resolution feature maps. This allows the CNN to achieve more representative feature learning when dealing with RSSC problems. Experimental results, conducted using three large-scale benchmark data sets, demonstrate that our newly proposed SCCov network exhibits very competitive or superior classification performance when compared with the current state-of-the-art RSSC techniques, using a much lower number of parameters. Specifically, our SCCov only needs 10% of the parameters used by its counterparts.
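The covariance-pooling module can be illustrated with a short sketch: a C-channel feature map is summarized by the C x C covariance of its spatial descriptors instead of a first-order average. How SCCov fuses this with the skip-connected multi-resolution maps is not reproduced here, and the vectorization of the upper triangle is an illustrative choice.

```python
# Hedged sketch of second-order (covariance) pooling of a CNN feature map.
import numpy as np

def covariance_pool(feature_map):
    """feature_map: (C, H, W) activations from one CNN layer."""
    c, h, w = feature_map.shape
    x = feature_map.reshape(c, h * w)                 # C descriptors over H*W spatial positions
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / (h * w - 1)                       # (C, C) second-order statistics
    return cov[np.triu_indices(c)]                    # upper triangle as a pooled feature vector

print(covariance_pool(np.random.randn(8, 7, 7)).shape)  # (36,)
```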

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed structured sparsity regularization (SSR) to reduce the memory overhead of CNNs, which can be well supported by various off-the-shelf deep learning libraries.
Abstract: The success of convolutional neural networks (CNNs) in computer vision applications has been accompanied by a significant increase in computation and memory costs, which prohibits their usage in resource-limited environments, such as mobile systems or embedded devices. To this end, research on CNN compression has recently emerged. In this paper, we propose a novel filter pruning scheme, termed structured sparsity regularization (SSR), to simultaneously speed up the computation and reduce the memory overhead of CNNs, which can be well supported by various off-the-shelf deep learning libraries. Concretely, the proposed scheme incorporates two different regularizers of structured sparsity into the original objective function of filter pruning, which fully coordinates the global output and local pruning operations to adaptively prune filters. We further propose an alternative updating with Lagrange multipliers (AULM) scheme to efficiently solve its optimization. AULM follows the principle of the alternating direction method of multipliers (ADMM) and alternates between promoting the structured sparsity of CNNs and optimizing the recognition loss, which leads to a very efficient solver (a $2.5\times$ speedup over the most recent work that directly solves the group sparsity-based regularization). Moreover, by imposing the structured sparsity, the online inference is extremely memory-light since the number of filters and the output feature maps are simultaneously reduced. The proposed scheme has been deployed to a variety of state-of-the-art CNN structures, including LeNet, AlexNet, VGGNet, ResNet, and GoogLeNet, over different data sets. Quantitative results demonstrate that the proposed scheme achieves superior performance over the state-of-the-art methods. We further demonstrate the proposed compression scheme for the task of transfer learning, including domain adaptation and object detection, which also shows exciting performance gains over the state-of-the-art filter pruning methods.
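To show the kind of structured-sparsity regularizer such a pruning objective adds, here is a hedged PyTorch sketch of a group-lasso penalty with one group per output filter, which pushes whole filters toward zero so they can be removed; the AULM/ADMM solver and the paper's second regularizer are not reproduced, and the function name and coefficient are assumptions.

```python
# Hedged sketch of a filter-wise group-lasso penalty for structured pruning.
import torch

def filter_group_lasso(conv_weight, lam=1e-4):
    """conv_weight: (out_channels, in_channels, kH, kW); one sparsity group per output filter."""
    per_filter_norm = conv_weight.flatten(start_dim=1).norm(dim=1)  # L2 norm of each filter
    return lam * per_filter_norm.sum()                              # L2,1 (group-lasso) penalty

w = torch.randn(64, 32, 3, 3, requires_grad=True)
penalty = filter_group_lasso(w)
penalty.backward()   # gradients act on whole filters, driving some entirely toward zero
```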

Journal ArticleDOI
TL;DR: This paper studies the online adaptive optimal controller design for a class of nonlinear systems through a novel policy iteration (PI) algorithm developed with the online linearization and the two-step iteration, i.e., policy evaluation and policy improvement.
Abstract: This paper studies the online adaptive optimal controller design for a class of nonlinear systems through a novel policy iteration (PI) algorithm. By using the technique of neural network linear differential inclusion (LDI) to linearize the nonlinear terms in each iteration, the optimal law for controller design can be solved through the relevant algebraic Riccati equation (ARE) without using the system internal parameters. Based on PI approach, the adaptive optimal control algorithm is developed with the online linearization and the two-step iteration, i.e., policy evaluation and policy improvement. The convergence of the proposed PI algorithm is also proved. Finally, two numerical examples are given to illustrate the effectiveness and applicability of the proposed method.

Journal ArticleDOI
TL;DR: Through extensive experiments on two real-world databases, this article shows that MAGRM remarkably outperforms the state-of-the-art methods in solving a group recommendation problem.
Abstract: Group recommendation research has recently received much attention in the recommender system community. Currently, several deep-learning-based methods are used in group recommendation to learn the preferences of groups on items and predict the next ones in which groups may be interested. However, their recommendation effectiveness is disappointing. To address this challenge, this article proposes a novel model called a multiattention-based group recommendation model (MAGRM). It makes good use of multiattention-based deep neural network structures to achieve accurate group recommendation. We train its two closely related modules: vector representation for group features and preference learning for groups on items. The former is proposed to learn to accurately represent each group's deep semantic features. It integrates four aspects of subfeatures: group co-occurrence, group description, and external and internal social features. In particular, we employ multiattention networks to learn to capture internal social features for groups. The latter employs a neural attention mechanism to depict preference interactions between each group and its members and then combines group and item features to accurately learn group preferences on items. Through extensive experiments on two real-world databases, we show that MAGRM remarkably outperforms the state-of-the-art methods in solving the group recommendation problem.