
Showing papers in "IEEE Transactions on Neural Networks in 2021"


Journal ArticleDOI
TL;DR: This article provides a comprehensive overview of graph neural networks (GNNs) in the data mining and machine learning fields and proposes a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs.
Abstract: Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications, where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on the existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this article, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs. We further discuss the applications of GNNs across various domains and summarize the open-source codes, benchmark data sets, and model evaluation of GNNs. Finally, we propose potential research directions in this rapidly growing field.

4,584 citations
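
For orientation, the "convolutional GNNs" category of this taxonomy is most often instantiated by the GCN propagation rule. Below is a minimal, illustrative sketch of one such layer in NumPy; the symmetric normalization and ReLU are common defaults, not something prescribed by the survey itself.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolutional layer: each node aggregates normalized
    neighbor features (plus its own, via a self-loop) and applies a
    shared linear map followed by a ReLU nonlinearity.
    A: (n, n) adjacency matrix; H: (n, d) node features; W: (d, d') weights."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # D^(-1/2) for normalization
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)          # propagate, transform, activate
```

Recurrent GNNs iterate a similar neighborhood aggregation toward a fixed point, while spatial–temporal GNNs interleave it with updates along the time axis.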


Journal ArticleDOI
TL;DR: A comprehensive review of the knowledge graph covering overall research topics on: 1) knowledge graph representation learning; 2) knowledge acquisition and completion; 3) temporal knowledge graphs; and 4) knowledge-aware applications, summarizing recent breakthroughs and perspective directions to facilitate future research.
Abstract: Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction toward cognition and human-level intelligence. In this survey, we provide a comprehensive review of the knowledge graph covering overall research topics about: 1) knowledge graph representation learning; 2) knowledge acquisition and completion; 3) temporal knowledge graph; and 4) knowledge-aware applications and summarize recent breakthroughs and perspective directions to facilitate future research. We propose a full-view categorization and new taxonomies on these topics. Knowledge graph embedding is organized from four aspects of representation space, scoring function, encoding models, and auxiliary information. For knowledge acquisition, especially knowledge graph completion, embedding methods, path inference, and logical rule reasoning are reviewed. We further explore several emerging topics, including metarelational learning, commonsense reasoning, and temporal knowledge graphs. To facilitate future research on knowledge graphs, we also provide a curated collection of data sets and open-source libraries on different tasks. In the end, we have a thorough outlook on several promising research directions.

1,025 citations
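
As a concrete instance of the "scoring function" axis in this categorization, the classic translational model TransE scores a triple (h, r, t) by how closely the relation embedding translates the head onto the tail. A minimal sketch (TransE is a well-known representative model, not a method proposed by this survey):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE scoring function f(h, r, t) = -||h + r - t||:
    a plausible triple should score higher (closer to zero)
    than a corrupted one. h, r, t: embedding vectors."""
    return -np.linalg.norm(h + r - t)

h, r, t = np.random.randn(3, 50)
print(transe_score(h, r, h + r))  # 0.0: a perfect translation
print(transe_score(h, r, t))      # negative for a random tail
```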


Journal ArticleDOI
TL;DR: A review of the interpretabilities suggested by different research works is provided and the approaches are categorized, in the hope that insight into interpretability will develop with more consideration for medical practice and that initiatives to push forward data-based, mathematically grounded, and technically grounded medical education will be encouraged.
Abstract: Recently, artificial intelligence and machine learning in general have demonstrated remarkable performances in many tasks, from image processing to natural language processing, especially with the advent of deep learning (DL). Along with research progress, they have encroached upon many different fields and disciplines. Some of them require high level of accountability and thus transparency, for example, the medical sector. Explanations for machine decisions and predictions are thus needed to justify their reliability. This requires greater interpretability, which often means we need to understand the mechanism underlying the algorithms. Unfortunately, the blackbox nature of the DL is still unresolved, and many machine decisions are still poorly understood. We provide a review on interpretabilities suggested by different research works and categorize them. The different categories show different dimensions in interpretability research, from approaches that provide “obviously” interpretable information to the studies of complex patterns. By applying the same categorization to interpretability in medical research, it is hoped that: 1) clinicians and practitioners can subsequently approach these methods with caution; 2) insight into interpretability will be born with more considerations for medical practices; and 3) initiatives to push forward data-based, mathematically grounded, and technically grounded medical education are encouraged.

810 citations


Journal ArticleDOI
TL;DR: Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models; this survey covers several core linguistic processing issues in addition to many applications of computational linguistics.
Abstract: Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This article provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to many applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.

783 citations


Journal ArticleDOI
Deng-Ping Fan, Zheng Lin, Zhao Zhang, Menglong Zhu, Ming-Ming Cheng
TL;DR: It is demonstrated that D3Net can be used to efficiently extract salient object masks from real scenes, enabling effective background-changing application with a speed of 65 frames/s on a single GPU.
Abstract: The use of RGB-D information for salient object detection (SOD) has been extensively explored in recent years. However, relatively few efforts have been put toward modeling SOD in real-world human activity scenes with RGB-D. In this article, we fill the gap by making the following contributions to RGB-D SOD: 1) we carefully collect a new Salient Person (SIP) data set that consists of ~1K high-resolution images that cover diverse real-world scenes from various viewpoints, poses, occlusions, illuminations, and backgrounds; 2) we conduct a large-scale (and, so far, the most comprehensive) benchmark comparing contemporary methods, which has long been missing in the field and can serve as a baseline for future research, and we systematically summarize 32 popular models and evaluate 18 of the 32 models on seven data sets containing a total of about 97k images; and 3) we propose a simple general architecture, called the deep depth-depurator network (D3Net). It consists of a depth depurator unit (DDU) and a three-stream feature learning module (FLM), which perform low-quality depth map filtering and cross-modal feature learning, respectively. These components form a nested structure and are elaborately designed to be learned jointly. D3Net exceeds the performance of any prior contenders across all five metrics under consideration, thus serving as a strong model to advance research in this field. We also demonstrate that D3Net can be used to efficiently extract salient object masks from real scenes, enabling an effective background-changing application at a speed of 65 frames/s on a single GPU. All the saliency maps, our new SIP data set, the D3Net model, and the evaluation tools are publicly available at https://github.com/DengPingFan/D3NetBenchmark.

423 citations


Journal ArticleDOI
TL;DR: This work presents a deep subdomain adaptation network (DSAN) that learns a transfer network by aligning the relevant subdomain distributions of domain-specific layer activations across different domains based on a local maximum mean discrepancy (LMMD).
Abstract: For a target task where the labeled data are unavailable, domain adaptation can transfer a learner from a different source domain. Previous deep domain adaptation methods mainly learn a global domain shift, i.e., align the global source and target distributions without considering the relationships between two subdomains within the same category of different domains, leading to unsatisfying transfer learning performance without capturing the fine-grained information. Recently, more and more researchers pay attention to subdomain adaptation that focuses on accurately aligning the distributions of the relevant subdomains. However, most of them are adversarial methods that contain several loss functions and converge slowly. Based on this, we present a deep subdomain adaptation network (DSAN) that learns a transfer network by aligning the relevant subdomain distributions of domain-specific layer activations across different domains based on a local maximum mean discrepancy (LMMD). Our DSAN is very simple but effective, which does not need adversarial training and converges fast. The adaptation can be achieved easily with most feedforward network models by extending them with LMMD loss, which can be trained efficiently via backpropagation. Experiments demonstrate that DSAN can achieve remarkable results on both object recognition tasks and digit classification tasks. Our code will be available at https://github.com/easezyc/deep-transfer-learning .

357 citations
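
The LMMD at the heart of DSAN is an MMD computed per class (subdomain), with source samples weighted by their true labels and target samples weighted by the network's soft predictions. A minimal NumPy sketch under simplifying assumptions (a single-bandwidth RBF kernel; the released implementation may differ in details):

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise RBF kernel between the rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lmmd(Xs, ys, Xt, pt, num_classes, sigma=1.0):
    """Local maximum mean discrepancy: a class-wise weighted MMD.
    Xs: (ns, d) source features; ys: (ns,) source labels;
    Xt: (nt, d) target features; pt: (nt, C) target soft predictions."""
    Kss, Ktt, Kst = (rbf_kernel(Xs, Xs, sigma),
                     rbf_kernel(Xt, Xt, sigma),
                     rbf_kernel(Xs, Xt, sigma))
    loss = 0.0
    for c in range(num_classes):
        ws = (ys == c).astype(float)
        wt = pt[:, c].astype(float)
        if ws.sum() == 0 or wt.sum() == 0:
            continue                      # subdomain absent from the batch
        ws, wt = ws / ws.sum(), wt / wt.sum()
        loss += ws @ Kss @ ws + wt @ Ktt @ wt - 2 * ws @ Kst @ wt
    return loss / num_classes
```

In training, this loss is computed on the activations of a domain-specific layer and added to the classification loss, which is why plain feedforward backpropagation suffices and no adversarial game is needed.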


Journal ArticleDOI
Zewen Li, Fan Liu, Wenjie Yang, Shouheng Peng, Jun Zhou
TL;DR: In this article, the authors provide an overview of various convolutional neural network (CNN) models and provide several rules of thumb for functions and hyperparameter selection, as well as open issues and promising directions for future work.
Abstract: A convolutional neural network (CNN) is one of the most significant networks in the deep learning field. Since CNN has made impressive achievements in many areas, including but not limited to computer vision and natural language processing, it has attracted much attention from both industry and academia in the past few years. The existing reviews mainly focus on CNN's applications in different scenarios without considering CNN from a general perspective, and some novel ideas proposed recently are not covered. In this review, we aim to provide some novel ideas and prospects in this fast-growing field. Besides, not only 2-D convolution but also 1-D and multidimensional ones are involved. First, this review introduces the history of CNN. Second, we provide an overview of various convolutions. Third, some classic and advanced CNN models are introduced, especially the key points that enable them to reach state-of-the-art results. Fourth, through experimental analysis, we draw some conclusions and provide several rules of thumb for function and hyperparameter selection. Fifth, the applications of 1-D, 2-D, and multidimensional convolution are covered. Finally, some open issues and promising directions for CNN are discussed as guidelines for future work.

342 citations
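
Since the review covers 1-D as well as 2-D and multidimensional convolution, a tiny worked example fixes the output-size arithmetic they all share: output length = (input length - kernel length) // stride + 1 for "valid" padding. This is generic background rather than code from the review:

```python
import numpy as np

def conv1d(x, k, stride=1):
    """Valid 1-D convolution (cross-correlation, as in deep learning).
    Output length = (len(x) - len(k)) // stride + 1."""
    out_len = (len(x) - len(k)) // stride + 1
    return np.array([x[i * stride: i * stride + len(k)] @ k
                     for i in range(out_len)])

x = np.array([1., 2., 3., 4., 5.])
k = np.array([1., 0., -1.])   # a simple edge-detecting kernel
print(conv1d(x, k))           # [-2. -2. -2.]; length (5 - 3) // 1 + 1 = 3
```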


Journal ArticleDOI
TL;DR: An adaptive neural network (NN) output-feedback optimized control design is proposed for a class of strict-feedback nonlinear systems that contain unknown internal dynamics and states that are immeasurable and constrained within predefined compact sets.
Abstract: This article proposes an adaptive neural network (NN) output feedback optimized control design for a class of strict-feedback nonlinear systems that contain unknown internal dynamics and the states that are immeasurable and constrained within some predefined compact sets. NNs are used to approximate the unknown internal dynamics, and an adaptive NN state observer is developed to estimate the immeasurable states. By constructing a barrier type of optimal cost functions for subsystems and employing an observer and the actor-critic architecture, the virtual and actual optimal controllers are developed under the framework of backstepping technique. In addition to ensuring the boundedness of all closed-loop signals, the proposed strategy can also guarantee that system states are confined within some preselected compact sets all the time. This is achieved by means of barrier Lyapunov functions which have been successfully applied to various kinds of nonlinear systems such as strict-feedback and pure-feedback dynamics. Besides, our developed optimal controller requires less conditions on system dynamics than some existing approaches concerning optimal control. The effectiveness of the proposed optimal control approach is eventually validated by numerical as well as practical examples.

337 citations
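
The barrier Lyapunov functions invoked here are typically log-type barriers. One standard form (a common choice in the constrained-control literature; the article's exact construction may differ) for an error signal $z$ required to satisfy $|z| < k_b$ is:

```latex
V_b(z) = \frac{1}{2}\log\frac{k_b^2}{k_b^2 - z^2},
\qquad V_b(z) \to \infty \ \text{as} \ |z| \to k_b .
```

Because $V_b$ blows up at the constraint boundary, proving that $V_b$ stays bounded along closed-loop trajectories automatically keeps $z$ strictly inside the constraint set.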


Journal ArticleDOI
TL;DR: A novel event-triggered control protocol is constructed, which ensures that the outputs of all followers converge to a neighborhood of the leader’s output and that all signals in the closed-loop system are bounded.
Abstract: This article addresses the adaptive event-triggered neural control problem for nonaffine pure-feedback nonlinear multiagent systems with dynamic disturbance, unmodeled dynamics, and dead-zone input. Radial basis function neural networks are applied to approximate the unknown nonlinear function. A dynamic signal is constructed to deal with the design difficulties in the unmodeled dynamics. Moreover, to reduce the communication burden, we propose an event-triggered strategy with a varying threshold. Based on the Lyapunov function method and adaptive neural control approach, a novel event-triggered control protocol is constructed, which realizes that the outputs of all followers converge to a neighborhood of the leader’s output and ensures that all signals are bounded in the closed-loop system. An illustrative simulation example is applied to verify the usefulness of the proposed algorithms.

308 citations
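
A varying threshold of the kind described is commonly built from a relative term plus a constant offset. The rule below is an illustrative assumption for exposition, not the article's exact trigger:

```latex
t_{k+1} = \inf\{\, t > t_k : |e(t)| \ge \delta\,|u(t)| + m_1 \,\},
\qquad e(t) = w(t) - u(t),
```

where $w(t)$ is the continuously computed control law, $u(t) = w(t_k)$ is the last transmitted value, and $0 < \delta < 1$, $m_1 > 0$ are design constants; a larger $\delta$ reduces communication at the price of a larger convergence neighborhood.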


Journal ArticleDOI
TL;DR: This article proposes a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output.
Abstract: Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present the examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.

272 citations
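
The four dimensions of the taxonomy map directly onto the computational skeleton that attention variants share: score, normalize, aggregate. A minimal sketch in which scaled dot-product compatibility and a softmax distribution are just one choice per axis:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())   # shift for numerical stability
    return e / e.sum()

def attention(keys, values, query):
    """Generic attention skeleton:
    1) a compatibility function scores each key against the query;
    2) a distribution function turns scores into weights;
    3) the output is the weighted sum of the values."""
    scores = keys @ query / np.sqrt(query.shape[0])  # compatibility: scaled dot product
    weights = softmax(scores)                        # distribution: softmax
    return weights @ values                          # context vector
```

Swapping the compatibility function (additive, multiplicative, and so on) or the distribution function (e.g., sparsemax) moves a model around the taxonomy without changing this skeleton.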


Journal ArticleDOI
TL;DR: Clustered FL (CFL) exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions; because clustering is performed only after FL has converged, it can be viewed as a postprocessing method that will always achieve greater or equal performance than conventional FL by allowing clients to arrive at more specialized models.
Abstract: Federated learning (FL) is currently the most widely adopted framework for collaborative training of (deep) machine learning models under privacy constraints. Albeit its popularity, it has been observed that FL yields suboptimal results if the local clients’ data distributions diverge. To address this issue, we present clustered FL (CFL), a novel federated multitask learning (FMTL) framework, which exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions. In contrast to existing FMTL approaches, CFL does not require any modifications to the FL communication protocol to be made, is applicable to general nonconvex objectives (in particular, deep neural networks), does not require the number of clusters to be known a priori , and comes with strong mathematical guarantees on the clustering quality. CFL is flexible enough to handle client populations that vary over time and can be implemented in a privacy-preserving way. As clustering is only performed after FL has converged to a stationary point, CFL can be viewed as a postprocessing method that will always achieve greater or equal performance than conventional FL by allowing clients to arrive at more specialized models. We verify our theoretical analysis in experiments with deep convolutional and recurrent neural networks on commonly used FL data sets.
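
The clustering signal CFL uses can be pictured as follows: once conventional FL has converged, clients with incongruent data distributions are left with weight updates pointing in systematically different directions, and the pairwise cosine similarity of those updates separates them. A deliberately simplified greedy bipartition sketch (the paper's criterion and recursive splitting are more involved):

```python
import numpy as np

def cluster_split(client_updates):
    """Split clients into two groups by the pairwise cosine similarity
    of their (flattened) weight updates, computed after FL convergence.
    client_updates: list of 1-D numpy arrays, one per client."""
    n = len(client_updates)
    sim = np.ones((n, n))
    for i in range(n):
        for j in range(n):
            a, b = client_updates[i], client_updates[j]
            sim[i, j] = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    # Seed the two clusters with the least similar pair of clients,
    # then assign every other client to its more similar seed.
    i0, j0 = np.unravel_index(np.argmin(sim), sim.shape)
    c1, c2 = [i0], [j0]
    for k in range(n):
        if k != i0 and k != j0:
            (c1 if sim[k, i0] >= sim[k, j0] else c2).append(k)
    return c1, c2
```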

Journal ArticleDOI
TL;DR: This survey covers the main steps of deep learning-based BTC methods, including preprocessing, features extraction, and classification, along with their achievements and limitations, and investigates the state-of-the-art convolutional neural network models for BTC by performing extensive experiments using transfer learning with and without data augmentation.
Abstract: Brain tumor is one of the most dangerous cancers in people of all ages, and its grade recognition is a challenging problem for radiologists in health monitoring and automated diagnosis. Recently, numerous methods based on deep learning have been presented in the literature for brain tumor classification (BTC) in order to assist radiologists in a better diagnostic analysis. In this overview, we present an in-depth review of the surveys published so far and of recent deep learning-based methods for BTC. Our survey covers the main steps of deep learning-based BTC methods, including preprocessing, feature extraction, and classification, along with their achievements and limitations. We also investigate the state-of-the-art convolutional neural network models for BTC by performing extensive experiments using transfer learning with and without data augmentation. Furthermore, this overview describes available benchmark data sets used for the evaluation of BTC. Finally, this survey does not only look into the past literature on the topic but also builds on it to delve into the future of this area and enumerates some research directions that should be followed in the future, especially for personalized and smart healthcare.

Journal ArticleDOI
TL;DR: A new semisupervised method for medical image segmentation is presented, where the network is optimized by a weighted combination of a common supervised loss on the labeled inputs only and a regularization loss on both the labeled and unlabeled data.
Abstract: A common shortfall of supervised deep learning for medical imaging is the lack of labeled data, which is often expensive and time consuming to collect. This article presents a new semisupervised method for medical image segmentation, where the network is optimized by a weighted combination of a common supervised loss only for the labeled inputs and a regularization loss for both the labeled and unlabeled data. To utilize the unlabeled data, our method encourages consistent predictions of the network-in-training for the same input under different perturbations. With the semisupervised segmentation tasks, we introduce a transformation-consistent strategy in the self-ensembling model to enhance the regularization effect for pixel-level predictions. To further improve the regularization effects, we extend the transformation in a more generalized form including scaling and optimize the consistency loss with a teacher model, which is an averaging of the student model weights. We extensively validated the proposed semisupervised method on three typical yet challenging medical image segmentation tasks: 1) skin lesion segmentation from dermoscopy images in the International Skin Imaging Collaboration (ISIC) 2017 data set; 2) optic disk (OD) segmentation from fundus images in the Retinal Fundus Glaucoma Challenge (REFUGE) data set; and 3) liver segmentation from volumetric CT scans in the Liver Tumor Segmentation Challenge (LiTS) data set. Compared with state-of-the-art, our method shows superior performance on the challenging 2-D/3-D medical images, demonstrating the effectiveness of our semisupervised method for medical image segmentation.
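
Two ingredients carry the method: a teacher whose weights are an exponential moving average (EMA) of the student's, and a transformation-consistent regularizer suited to pixel-level prediction. A schematic PyTorch sketch under simple assumptions (MSE consistency and a generic `transform`; this is not the authors' released code):

```python
import torch

def ema_update(teacher, student, alpha=0.99):
    # Teacher weights track an exponential moving average of the student's.
    for tp, sp in zip(teacher.parameters(), student.parameters()):
        tp.data.mul_(alpha).add_(sp.data, alpha=1 - alpha)

def consistency_loss(student, teacher, x, transform):
    """Transformation-consistent regularization for segmentation:
    transforming the input should transform the prediction the same way,
    so student(T(x)) is pulled toward T(teacher(x))."""
    with torch.no_grad():
        target = transform(teacher(x))   # transform the teacher's prediction
    pred = student(transform(x))         # predict on the transformed input
    return torch.mean((pred - target) ** 2)
```

Here `transform` could be, for example, `lambda t: torch.rot90(t, 1, dims=(-2, -1))`; the total objective is the supervised loss on labeled inputs plus a weighted sum of this consistency term over labeled and unlabeled inputs.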

Journal ArticleDOI
TL;DR: A novel HSI and MSI fusion method based on the subspace representation and convolutional neural network (CNN) denoiser, i.e., a well-trained CNN for gray image denoising, which has superior performance over the state-of-the-art fusion methods.
Abstract: Hyperspectral image (HSI) and multispectral image (MSI) fusion, which fuses a low-spatial-resolution HSI (LR-HSI) with a higher resolution multispectral image (MSI), has become a common scheme to obtain a high-resolution HSI (HR-HSI). This article presents a novel HSI and MSI fusion method (called CNN-Fus), which is based on the subspace representation and a convolutional neural network (CNN) denoiser, i.e., a well-trained CNN for gray image denoising. Our method only needs to train the CNN on the more accessible gray images and can be directly used for any HSI and MSI data sets without retraining. First, to exploit the high correlations among the spectral bands, we approximate the desired HR-HSI with the low-dimensional subspace multiplied by the coefficients, which can not only speed up the algorithm but also lead to more accurate recovery. Since the spectral information mainly exists in the LR-HSI, we learn the subspace from it via singular value decomposition. Due to the powerful learning performance and high speed of CNNs, we use the well-trained CNN for gray image denoising to regularize the estimation of the coefficients. Specifically, we plug the CNN denoiser into the alternating direction method of multipliers (ADMM) algorithm to estimate the coefficients. Experiments demonstrate that our method has superior performance over the state-of-the-art fusion methods.
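
The "plug the CNN denoiser into ADMM" step follows the plug-and-play pattern: the proximal operator of the (implicit) image prior is replaced by a pretrained denoiser. A generic skeleton in which `fidelity_prox` and `denoiser` are placeholder callables, and which glosses over the fact that CNN-Fus operates on subspace coefficients rather than directly on pixels:

```python
import numpy as np

def pnp_admm(fidelity_prox, denoiser, x0, iters=30):
    """Plug-and-play ADMM skeleton.
    fidelity_prox(v): solves argmin_x f(x) + (rho/2)||x - v||^2 for the
                      data-fidelity term f (rho folded into the solver);
    denoiser(v):      a pretrained (e.g., CNN) denoiser standing in for
                      the proximal operator of the implicit image prior."""
    x = z = x0
    u = np.zeros_like(x0)
    for _ in range(iters):
        x = fidelity_prox(z - u)   # data-fidelity step
        z = denoiser(x + u)        # prior step via the denoiser
        u = u + x - z              # scaled dual update
    return z
```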

Journal ArticleDOI
TL;DR: In this paper, a novel double-layer switching regulation containing a Markov chain and persistent dwell-time switching regulation (PDTSR) is used for singularly perturbed coupled neural networks (SPCNNs) affected by nonlinear constraints and gain uncertainties.
Abstract: This work explores the $H_{\infty}$ synchronization issue for singularly perturbed coupled neural networks (SPCNNs) affected by both nonlinear constraints and gain uncertainties, in which a novel double-layer switching regulation containing a Markov chain and persistent dwell-time switching regulation (PDTSR) is used. The first layer of switching regulation is the Markov chain, which characterizes the switching stochastic properties of systems suffering from random component failures and sudden environmental disturbances. Meanwhile, PDTSR, as the second-layer switching regulation, is used to depict the variations in the transition probability of the aforementioned Markov chain. For systems under double-layer switching regulation, the purpose of the addressed issue is to design a mode-dependent synchronization controller for the network with the desired controller gains calculated by solving convex optimization problems. As such, new sufficient conditions are established to ensure that the synchronization error systems are mean-square exponentially stable with a specified level of $H_{\infty}$ performance. Eventually, the solvability and validity of the proposed control scheme are illustrated through a numerical simulation.

Journal ArticleDOI
TL;DR: This article presents a survey of DRL approaches developed for cyber security, including DRL-based security methods for cyber-physical systems, autonomous intrusion detection techniques, and multiagent D RL-based game theory simulations for defense strategies against cyberattacks.
Abstract: The scale of Internet-connected systems has increased considerably, and these systems are being exposed to cyberattacks more than ever. The complexity and dynamics of cyberattacks require protecting mechanisms to be responsive, adaptive, and scalable. Machine learning, or more specifically deep reinforcement learning (DRL), methods have been proposed widely to address these issues. By incorporating deep learning into traditional RL, DRL is highly capable of solving complex, dynamic, and especially high-dimensional cyber defense problems. This article presents a survey of DRL approaches developed for cyber security. We touch on different vital aspects, including DRL-based security methods for cyber-physical systems, autonomous intrusion detection techniques, and multiagent DRL-based game theory simulations for defense strategies against cyberattacks. Extensive discussions and future research directions on DRL-based cyber security are also given. We expect that this comprehensive review provides the foundations for and facilitates future studies on exploring the potential of emerging DRL to cope with increasingly complex cyber security problems.

Journal ArticleDOI
TL;DR: The finite-time consensus fault-tolerant control (FTC) tracking problem is studied for the nonlinear multi-agent systems (MASs) in the nonstrict feedback form and the Nussbaum function is used to address the output dead zones and unknown control directions problems.
Abstract: The finite-time consensus fault-tolerant control (FTC) tracking problem is studied for the nonlinear multi-agent systems (MASs) in the nonstrict feedback form. The MASs are subject to unknown symmetric output dead zones, actuator bias and gain faults, and unknown control coefficients. According to the properties of the neural network (NN), the unstructured uncertainties problem is solved. The Nussbaum function is used to address the output dead zones and unknown control directions problems. By introducing an arbitrarily small positive number, the “singularity” problem caused by combining the finite-time control and backstepping design is solved. According to the backstepping design and Lyapunov stability theory, a finite-time adaptive NN FTC controller is obtained, which guarantees that the tracking error converges to a small neighborhood of zero in a finite time, and all signals in the closed-loop system are bounded. Finally, the effectiveness of the proposed method is illustrated via a physical example.
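
The Nussbaum function used here for unknown control directions is, by definition, any function whose running average oscillates unboundedly between plus and minus infinity; two textbook choices are shown below (standard examples, not specific to this article):

```latex
\limsup_{s\to\infty}\frac{1}{s}\int_0^s N(\tau)\,d\tau = +\infty,
\qquad
\liminf_{s\to\infty}\frac{1}{s}\int_0^s N(\tau)\,d\tau = -\infty,
\qquad
\text{e.g., } N(s) = s^2\cos(s) \ \text{ or } \ N(s) = e^{s^2}\cos\!\left(\tfrac{\pi}{2}s\right).
```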

Journal ArticleDOI
TL;DR: A new feature selection method based on the Dempster–Shafer theory is proposed, which takes into consideration the distribution of features and results in a significant increase in the performance of MI-based BCI systems.
Abstract: The common spatial pattern (CSP) algorithm is a well-recognized spatial filtering method for feature extraction in motor imagery (MI)-based brain–computer interfaces (BCIs). However, due to the influence of nonstationary in electroencephalography (EEG) and inherent defects of the CSP objective function, the spatial filters, and their corresponding features are not necessarily optimal in the feature space used within CSP. In this work, we design a new feature selection method to address this issue by selecting features based on an improved objective function. Especially, improvements are made in suppressing outliers and discovering features with larger interclass distances. Moreover, a fusion algorithm based on the Dempster–Shafer theory is proposed, which takes into consideration the distribution of features. With two competition data sets, we first evaluate the performance of the improved objective functions in terms of classification accuracy, feature distribution, and embeddability. Then, a comparison with other feature selection methods is carried out in both accuracy and computational time. Experimental results show that the proposed methods consume less additional computational cost and result in a significant increase in the performance of MI-based BCI systems.
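
For context, the CSP baseline that this work improves on extracts spatial filters from the two classes' average covariance matrices via a generalized eigendecomposition; the improved objective and the Dempster–Shafer fusion then act on features produced this way. A sketch of the classic baseline only:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(X1, X2, n_filters=3):
    """Classic CSP: spatial filters that maximize variance for one class
    while minimizing it for the other, via the generalized eigenproblem
    S1 w = lambda (S1 + S2) w.  X1, X2: (trials, channels, samples)."""
    S1 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X1], axis=0)
    S2 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X2], axis=0)
    vals, vecs = eigh(S1, S1 + S2)          # ascending eigenvalues in [0, 1]
    # Filters at both ends of the spectrum discriminate the classes best.
    return np.hstack([vecs[:, :n_filters], vecs[:, -n_filters:]])
```

Log-variances of the spatially filtered signals are the usual CSP features then passed on to feature selection and classification.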

Journal ArticleDOI
Jun Fu1, Jing Liu1, Jie Jiang1, Yong Li, Yongjun Bao, Hanqing Lu1 
TL;DR: A Dual Relation-aware Attention Network (DRANet) is proposed to handle the task of scene segmentation and designs two types of compact attention modules, which model the contextual dependencies in spatial and channel dimensions, respectively.
Abstract: In this article, we propose a Dual Relation-aware Attention Network (DRANet) to handle the task of scene segmentation. How to efficiently exploit context is essential for pixel-level recognition. To address the issue, we adaptively capture contextual information based on the relation-aware attention mechanism. Specifically, we append two types of attention modules on the top of the dilated fully convolutional network (FCN), which model the contextual dependencies in spatial and channel dimensions, respectively. In the attention modules, we adopt a self-attention mechanism to model semantic associations between any two pixels or channels. Each pixel or channel can adaptively aggregate context from all pixels or channels according to their correlations. To reduce the high cost of computation and memory caused by the abovementioned pairwise association computation, we further design two types of compact attention modules. In the compact attention modules, each pixel or channel is associated with only a small number of gathering centers and obtains the corresponding context aggregation over these gathering centers. Meanwhile, we add a cross-level gating decoder to selectively enhance spatial details that boost the performance of the network. We conduct extensive experiments to validate the effectiveness of our network and achieve new state-of-the-art segmentation performance on four challenging scene segmentation data sets, i.e., the Cityscapes, ADE20K, PASCAL Context, and COCO Stuff data sets. In particular, a mean IoU score of 82.9% on the Cityscapes test set is achieved without using extra coarse annotated data.

Journal ArticleDOI
TL;DR: A novel heterogeneous GNN framework based on an attention mechanism is proposed, in which the neighbor features of an entity are first aggregated under each relation-path and the importance of different relation-paths is then learned through the relation features.
Abstract: Knowledge graph (KG) embedding aims to study the embedding representation to retain the inherent structure of KGs. Graph neural networks (GNNs), as an effective graph representation technique, have shown impressive performance in learning graph embeddings. However, KGs have an intrinsic property of heterogeneity: they contain various types of entities and relations. How to address complex graph data and aggregate multiple types of semantic information simultaneously is a critical issue. In this article, a novel heterogeneous GNN framework based on an attention mechanism is proposed. Specifically, the neighbor features of an entity are first aggregated under each relation-path. Then the importance of different relation-paths is learned through the relation features. Finally, the relation-path-based features with the learned weight values are aggregated to generate the embedding representation. Thus, the proposed method not only aggregates entity features from different semantic aspects but also allocates appropriate weights to them. This method can capture various types of semantic information and selectively aggregate informative features. The experimental results on three real-world KGs demonstrate superior performance when compared with several state-of-the-art methods.

Journal ArticleDOI
TL;DR: This article proposes a novel graph LSTM-in-LSTM (GLIL) for group activity recognition by modeling the person-level actions and the group-level activity simultaneously, and introduces a residual LSTM with the residual connection to learn the person-level residual features, consisting of temporal features and static features.
Abstract: This article aims to tackle the problem of group activity recognition in the multiple-person scene. To model the group activity with multiple persons, most long short-term memory (LSTM)-based methods first learn the person-level action representations by several LSTMs and then integrate all the person-level action representations into the following LSTM to learn the group-level activity representation. This type of solution is a two-stage strategy, which neglects the “host–parasite” relationship between the group-level activity (“host”) and person-level actions (“parasite”) in spatiotemporal space. To this end, we propose a novel graph LSTM-in-LSTM (GLIL) for group activity recognition by modeling the person-level actions and the group-level activity simultaneously. GLIL is a “host–parasite” architecture, which can be seen as several person LSTMs (P-LSTMs) in the local view or a graph LSTM (G-LSTM) in the global view. Specifically, P-LSTMs model the person-level actions based on the interactions among persons. Meanwhile, G-LSTM models the group-level activity, where the person-level motion information in multiple P-LSTMs is selectively integrated and stored into G-LSTM based on their contributions to the inference of the group activity class. Furthermore, to use the person-level temporal features instead of the person-level static features as the input of GLIL, we introduce a residual LSTM with the residual connection to learn the person-level residual features, consisting of temporal features and static features. Experimental results on two public data sets illustrate the effectiveness of the proposed GLIL compared with state-of-the-art methods.

Journal ArticleDOI
TL;DR: This article presents an alternative way to remove the feasibility condition that most BLF-based controllers must meet, and designs a control scheme on the premise that constraint violation may happen due to control input saturation.
Abstract: This article presents a control scheme for the robot manipulator’s trajectory tracking task considering output error constraints and control input saturation. We provide an alternative way to remove the feasibility condition that most BLF-based controllers should meet and design a control scheme on the premise that constraint violation may happen due to the control input saturation. A bounded barrier Lyapunov function is proposed and adopted to handle the output error constraints. Besides, to suppress the input saturation effect, an auxiliary system is designed and merged into the control scheme. Moreover, a simplified RBFNN structure is adopted to approximate the lumped uncertainties. Simulation and experimental results demonstrate the effectiveness of the proposed control scheme.

Journal ArticleDOI
TL;DR: This article provides a systematic review of existing compelling DL architectures applied to LiDAR point clouds, detailing specific tasks in autonomous driving, such as segmentation, detection, and classification.
Abstract: Recently, the advancement of deep learning (DL) in discriminative feature learning from 3-D LiDAR data has led to rapid development in the field of autonomous driving. However, automatically processing uneven, unstructured, noisy, and massive 3-D point clouds is a challenging and tedious task. In this article, we provide a systematic review of existing compelling DL architectures applied to LiDAR point clouds, detailing specific tasks in autonomous driving, such as segmentation, detection, and classification. Although several published research articles focus on specific topics in computer vision for autonomous vehicles, to date, no general survey on DL applied to LiDAR point clouds for autonomous vehicles exists. Thus, the goal of this article is to narrow the gap in this topic. More than 140 key contributions from the recent five years are summarized in this survey, including the milestone 3-D deep architectures; the remarkable DL applications in 3-D semantic segmentation, object detection, and classification; specific data sets; evaluation metrics; and the state-of-the-art performance. Finally, we conclude with the remaining challenges and future research directions.

Journal ArticleDOI
TL;DR: This article proposes automatic extraction and classification of features through the use of different convolutional neural networks (CNNs) and shows that a configurable CNN requires far fewer learnable parameters while achieving better accuracy.
Abstract: Emotions are composed of cognizant logical reactions toward various situations. Such mental responses stem from physiological, cognitive, and behavioral changes. Electroencephalogram (EEG) signals provide a noninvasive and nonradioactive solution for emotion identification. Accurate and automatic classification of emotions can boost the development of human–computer interfaces. This article proposes automatic extraction and classification of features through the use of different convolutional neural networks (CNNs). At first, the proposed method converts the filtered EEG signals into an image using a time–frequency representation. The smoothed pseudo-Wigner–Ville distribution is used to transform time-domain EEG signals into images. These images are fed to pretrained AlexNet, ResNet50, and VGG16 along with a configurable CNN. The performance of the four CNNs is evaluated by measuring the accuracy, precision, Matthews correlation coefficient, F1-score, and false-positive rate. The results obtained by evaluating the four CNNs show that the configurable CNN requires far fewer learnable parameters while achieving better accuracy. Accuracy scores of 90.98%, 91.91%, 92.71%, and 93.01% obtained by AlexNet, ResNet50, VGG16, and the configurable CNN show that the proposed method outperforms other existing methods.

Journal ArticleDOI
Xiaofeng Yuan, Chen Ou, Yalin Wang, Chunhua Yang, Weihua Gui
TL;DR: A layer-wise data augmentation (LWDA) strategy is proposed for the pretraining of deep learning networks and soft sensor modeling and the proposed LWDA-SAE model is applied to predict the 10% and 50% boiling points of the aviation kerosene in an industrial hydrocracking process.
Abstract: In industrial processes, inferential sensors have been extensively applied for prediction of quality variables that are difficult to measure online directly by hard sensors. Deep learning is a recently developed technique for feature representation of complex data, which has great potentials in soft sensor modeling. However, it often needs a large number of representative data to train and obtain a good deep network. Moreover, layer-wise pretraining often causes information loss and generalization degradation of high hidden layers. This greatly limits the implementation and application of deep learning networks in industrial processes. In this article, a layer-wise data augmentation (LWDA) strategy is proposed for the pretraining of deep learning networks and soft sensor modeling. In particular, the LWDA-based stacked autoencoder (LWDA-SAE) is developed in detail. Finally, the proposed LWDA-SAE model is applied to predict the 10% and 50% boiling points of the aviation kerosene in an industrial hydrocracking process. The results show that the LWDA-SAE-based soft sensor is superior to multilayer perceptron, traditional SAE, and the SAE with data augmentation only for its input layer (IDA-SAE). Moreover, LWDA-SAE can converge at a faster speed with a lower learning error than the other methods.

Journal ArticleDOI
TL;DR: This article proposes an L₁-and-L₂-norm-oriented latent factor (L³F) model, which adopts twofold ideas: aggregating the L₁ norm's robustness and the L₂ norm's stability to form its loss, and adaptively adjusting the weights of the L₁ and L₂ norms in its loss.
Abstract: A recommender system (RS) is highly efficient in filtering people's desired information from high-dimensional and sparse (HiDS) data. To date, a latent factor (LF)-based approach has become highly popular when implementing a RS. However, current LF models mostly adopt a single distance-oriented loss like an L₂ norm-oriented one, which ignores target data's characteristics described by other metrics like an L₁ norm-oriented one. To investigate this issue, this article proposes an L₁-and-L₂-norm-oriented LF (L³F) model. It adopts twofold ideas: 1) aggregating the L₁ norm's robustness and the L₂ norm's stability to form its loss and 2) adaptively adjusting the weights of the L₁ and L₂ norms in its loss. By doing so, it achieves fine aggregation effects with the L₁ norm-oriented loss's robustness and the L₂ norm-oriented loss's stability to precisely describe HiDS data with outliers. Experimental results on nine HiDS datasets generated by real systems show that an L³F model significantly outperforms state-of-the-art models in prediction accuracy for missing data of an HiDS dataset. Its computational efficiency is also comparable with the most efficient LF models. Hence, it has good potential for addressing HiDS data from real applications.
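
The twofold idea reduces, per rating error, to a mixture of the two norms. A minimal sketch with a fixed mixing weight (the L³F model adapts this weight during training, which is the part this sketch omits):

```python
import numpy as np

def l3f_loss(err, alpha):
    """Aggregated loss on a prediction error err = r - r_hat:
    a convex combination of an L1 term (robust to outliers) and an
    L2 term (smooth and stable); alpha in [0, 1] balances the two."""
    return alpha * np.abs(err) + (1 - alpha) * err ** 2
```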

Journal ArticleDOI
TL;DR: This study proposes to adjust the scaling factor of the nonnegative multiplicative update (NMU) scheme via a linear or nonlinear strategy, thereby implementing several scaling-factor-adjusted NMU schemes that achieve a significant accuracy gain in community detection over state-of-the-art community detectors.
Abstract: Community detection is a popular yet thorny issue in social network analysis. A symmetric and nonnegative matrix factorization (SNMF) model based on a nonnegative multiplicative update (NMU) scheme is frequently adopted to address it. Current research mainly focuses on integrating additional information into it without considering the effects of a learning scheme. This study aims to implement highly accurate community detectors via the connections between an SNMF-based community detector's detection accuracy and an NMU scheme's scaling factor. The main idea is to adjust such scaling factor via a linear or nonlinear strategy, thereby innovatively implementing several scaling-factor-adjusted NMU schemes. They are applied to SNMF and graph-regularized SNMF models to achieve four novel SNMF-based community detectors. Theoretical studies indicate that with the proposed schemes and proper hyperparameter settings, each model can: 1) keep its loss function nonincreasing during its training process and 2) converge to a stationary point. Empirical studies on eight social networks show that they achieve significant accuracy gain in community detection over the state-of-the-art community detectors.
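
To make the scaling factor concrete: for the SNMF objective min ||A - X X^T||^2 with X >= 0, the usual multiplicative update exposes a factor gamma, and the study's schemes adjust gamma by linear or nonlinear strategies. A sketch with gamma held fixed (gamma = 0.5 is the classical choice):

```python
import numpy as np

def snmf_nmu(A, k, gamma=0.5, iters=200, eps=1e-10):
    """SNMF A ~= X X^T via a nonnegative multiplicative update (NMU)
    whose scaling factor gamma is exposed as a knob.
    A: (n, n) nonnegative adjacency matrix; returns X: (n, k) >= 0."""
    X = np.random.rand(A.shape[0], k)
    for _ in range(iters):
        numer = A @ X
        denom = X @ (X.T @ X) + eps
        X *= (1 - gamma) + gamma * numer / denom   # scaling-factor-weighted NMU
    return X
```

Community memberships can then be read off as `np.argmax(X, axis=1)`; the proposed detectors replace the fixed `gamma` with linearly or nonlinearly adjusted schedules.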

Journal ArticleDOI
TL;DR: In this article, the problem of tracking control for a class of nonlinear time-varying full state constrained systems is investigated, and the intelligent controller and adaptive law are developed.
Abstract: In this article, the problem of tracking control for a class of nonlinear time-varying full state constrained systems is investigated. By constructing the time-varying asymmetric barrier Lyapunov function (BLF) and combining it with the backstepping algorithm, the intelligent controller and adaptive law are developed. Neural networks (NNs) are utilized to approximate the uncertain function. It is well known that in the past research of nonlinear systems with state constraints, the state constraint boundary is either a constant or a time-varying function. In this article, the constraint boundaries both related to state and time are investigated, which makes the design of control algorithm more complex and difficult. Furthermore, by employing the Lyapunov stability analysis, it is proven that all signals in the closed-loop system are bounded and the time-varying full state constraints are not violated. In the end, the effectiveness of the control algorithm is verified by numerical simulation.

Journal ArticleDOI
TL;DR: A novel FS framework with two continuous constraints is proposed to select the exact top-k features in the unsupervised, semisupervised, and supervised scenarios; it can be optimized by the alternating direction method of multipliers (ADMM).
Abstract: Feature selection (FS), which identifies the relevant features in a data set to facilitate subsequent data analysis, is a fundamental problem in machine learning and has been widely studied in recent years. Most FS methods rank the features in order of their scores based on a specific criterion and then select the $k$ top-ranked features, where $k$ is the number of desired features. However, these features are usually not the top-$k$ features and may present a suboptimal choice. To address this issue, we propose a novel FS framework in this article to select the exact top-$k$ features in the unsupervised, semisupervised, and supervised scenarios. The new framework utilizes the $\ell_{0,2}$-norm as the matrix sparsity constraint rather than its relaxations, such as the $\ell_{1,2}$-norm. Since the $\ell_{0,2}$-norm constrained problem is difficult to solve, we transform the discrete $\ell_{0,2}$-norm-based constraint into an equivalent 0–1 integer constraint and replace the 0–1 integer constraint with two continuous constraints. The obtained top-$k$ FS framework with two continuous constraints is theoretically equivalent to the $\ell_{0,2}$-norm constrained problem and can be optimized by the alternating direction method of multipliers (ADMM). Unsupervised and semisupervised FS methods are developed based on the proposed framework, and extensive experiments on real-world data sets are conducted to demonstrate the effectiveness of the proposed FS framework.
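
For intuition about the $\ell_{0,2}$-norm constraint: it caps the number of nonzero rows of the projection matrix, i.e., the number of selected features, and its Euclidean projection simply keeps the $k$ rows of largest $\ell_2$ norm. The sketch below illustrates only the constraint's meaning; the article instead replaces the equivalent 0–1 integer constraint with two continuous ones handled by ADMM:

```python
import numpy as np

def project_row_sparse(W, k):
    """Euclidean projection onto { W : at most k nonzero rows }:
    keep the k rows of W with the largest l2 norms, zero out the rest."""
    norms = np.linalg.norm(W, axis=1)
    keep = np.argsort(norms)[-k:]    # indices of the k largest rows
    out = np.zeros_like(W)
    out[keep] = W[keep]
    return out
```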

Journal ArticleDOI
TL;DR: A modified Levenberg–Marquardt algorithm is proposed for artificial neural network learning, covering both the training and testing stages; error stability and weight boundedness are assured based on the Lyapunov technique.
Abstract: The Levenberg–Marquardt and Newton are two algorithms that use the Hessian for the artificial neural network learning. In this article, we propose a modified Levenberg–Marquardt algorithm for the artificial neural network learning containing the training and testing stages. The modified Levenberg–Marquardt algorithm is based on the Levenberg–Marquardt and Newton algorithms but with the following two differences to assure the error stability and weights boundedness: 1) there is a singularity point in the learning rates of the Levenberg–Marquardt and Newton algorithms, while there is not a singularity point in the learning rate of the modified Levenberg–Marquardt algorithm and 2) the Levenberg–Marquardt and Newton algorithms have three different learning rates, while the modified Levenberg–Marquardt algorithm only has one learning rate. The error stability and weights boundedness of the modified Levenberg–Marquardt algorithm are assured based on the Lyapunov technique. We compare the artificial neural network learning with the modified Levenberg–Marquardt, Levenberg–Marquardt, Newton, and stable gradient algorithms for the learning of the electric and brain signals data set.
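
For reference, the damped Gauss–Newton step below is the baseline that the Levenberg–Marquardt and Newton variants share; the article's modification redesigns the learning rate so that no singular point arises and a single rate suffices, which this sketch of the standard step does not capture:

```python
import numpy as np

def lm_step(J, e, w, mu=1e-2):
    """One standard Levenberg-Marquardt step for least-squares training:
    w <- w - (J^T J + mu I)^(-1) J^T e,
    with J the Jacobian of the residuals e w.r.t. the weights w.
    The damping term mu I keeps the linear system solvable when J^T J is
    rank-deficient, interpolating between gradient descent (large mu)
    and Gauss-Newton (mu -> 0)."""
    H = J.T @ J + mu * np.eye(len(w))
    return w - np.linalg.solve(H, J.T @ e)
```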