scispace - formally typeset
Author

Yongbin Gao

Other affiliations: Chonbuk National University
Bio: Yongbin Gao is an academic researcher from Shanghai University of Engineering Sciences. The author has contributed to research in topics: Deep learning & Computer science. The author has an h-index of 12, has co-authored 47 publications receiving 420 citations. Previous affiliations of Yongbin Gao include Chonbuk National University.

Papers published on a yearly basis

Papers
Journal ArticleDOI
TL;DR: A novel deep learning architecture named R3D is proposed to extract effective and discriminative spatial-temporal features to be used for action recognition, which enables the capturing of long-range temporal information by aggregating the 3D convolutional network entries to serve as an input to the LSTM (Long Short-Term Memory) architecture.
Abstract: Human action monitoring can be advantageous for remotely monitoring the status of patients or elderly persons in intelligent healthcare. Human action recognition enables efficient and accurate monitoring of human behaviors, which can exhibit multifaceted complexity attributed to disparities in viewpoint, personality, resolution, motion speed of individuals, etc. Spatial-temporal information plays an important role in human action recognition. In this paper, we propose a novel deep learning architecture named recurrent 3D convolutional neural network (R3D) to extract effective and discriminative spatial-temporal features for action recognition, which enables the capture of long-range temporal information by aggregating 3D convolutional network outputs to serve as input to the LSTM (Long Short-Term Memory) architecture. The 3D convolutional network and the LSTM are two effective methods for extracting temporal information. The proposed R3D network integrates these two methods by applying a shared 3D convolutional network in sliding windows over the video stream, feeding short-term spatial-temporal features into the LSTM. The output features of the LSTM encapsulate long-range spatial-temporal information representing a high-level abstraction of the human actions. The proposed algorithm is compared with traditional and state-of-the-art deep learning algorithms. The experimental results demonstrate the effectiveness of the proposed system, which can be used for smart monitoring in remote healthcare.
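The sliding-window segmentation the abstract describes can be sketched in a few lines; this is a minimal illustration, not the paper's implementation, and the clip length and stride values below are illustrative assumptions rather than the paper's settings:

```python
def sliding_windows(frames, clip_len, stride):
    """Split a frame sequence into fixed-length clips with a sliding window.
    In an R3D-style pipeline, each clip would pass through a shared 3D CNN
    and the per-clip features would then be fed, in order, to an LSTM."""
    return [frames[i:i + clip_len]
            for i in range(0, len(frames) - clip_len + 1, stride)]

# 32 frames, 16-frame clips, stride 8 -> clips starting at frames 0, 8, 16
video = list(range(32))
clips = sliding_windows(video, clip_len=16, stride=8)
```

Overlapping windows (stride smaller than the clip length) let consecutive clips share context, so the LSTM sees a smooth progression of short-term features.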

77 citations

Proceedings ArticleDOI
30 Jul 2015
TL;DR: A deep learning algorithm is proposed to simultaneously learn the features and classify the emotions of EEG signals to achieve better recognition accuracy than conventional algorithms.
Abstract: Emotion recognition is an important task for computers to understand human status in brain-computer interface (BCI) systems. It is difficult to perceive the emotions of some disabled people through their facial expressions, for example patients with functional autism. The EEG signal provides a non-invasive way to recognize the emotions of these disabled people through EEG headset electrodes placed on the scalp. In this paper, we propose a deep learning algorithm to simultaneously learn the features and classify the emotions of EEG signals. It differs from conventional methods in that we apply deep learning to the raw signal without explicit hand-crafted feature extraction. Because the EEG signal is subject-dependent, it is better to train the emotion model subject-wise, yet few epochs are available for each subject. Deep learning provides a solution via pre-training with three layers of restricted Boltzmann machines (RBMs): the epochs of all subjects can be used to pre-train the deep network, and back-propagation is then used to fine-tune the network subject by subject. Experimental results show that our proposed framework achieves better recognition accuracy than conventional algorithms.
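The RBM pre-training step the abstract relies on can be illustrated with a toy binary RBM trained by one step of contrastive divergence (CD-1). This is a minimal sketch under assumed toy dimensions, not the paper's three-layer network:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class RBM:
    """Minimal binary RBM trained with CD-1. Sizes are toy-sized;
    a real EEG model would use far more visible and hidden units."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.n_visible, self.n_hidden, self.lr = n_visible, n_hidden, lr
        self.W = [[random.gauss(0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]

    def hidden_probs(self, v):
        return [sigmoid(sum(v[i] * self.W[i][j] for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(sum(h[j] * self.W[i][j] for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1(self, v0):
        # positive phase: drive hidden units from the data
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if random.random() < p else 0.0 for p in h0]
        # negative phase: one reconstruction step
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # update weights toward data statistics, away from reconstruction
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.W[i][j] += self.lr * (v0[i] * h0[j] - v1[i] * h1[j])

# pool epochs from all subjects for unsupervised pre-training (synthetic data)
pooled_epochs = [[random.choice([0.0, 1.0]) for _ in range(8)] for _ in range(20)]
rbm = RBM(n_visible=8, n_hidden=4)
for epoch in pooled_epochs:
    rbm.cd1(epoch)
features = rbm.hidden_probs(pooled_epochs[0])
```

In the paper's scheme, stacked RBMs pre-trained this way on pooled data would then be fine-tuned with back-propagation per subject.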

74 citations

Journal ArticleDOI
26 Feb 2019-Sensors
TL;DR: A novel deep learning approach for MMR using the SqueezeNet architecture with bypass connections between the Fire modules, a variant of the vanilla SqueezeNet, is employed for this study, which makes the MMR system more efficient.
Abstract: Make and model recognition (MMR) of vehicles plays an important role in automatic vision-based systems. This paper proposes a novel deep learning approach for MMR using the SqueezeNet architecture. The frontal views of vehicle images are first extracted and fed into a deep network for training and testing. The SqueezeNet architecture with bypass connections between the Fire modules, a variant of the vanilla SqueezeNet, is employed for this study, which makes our MMR system more efficient. The experimental results on our collected large-scale vehicle datasets indicate that the proposed model achieves 96.3% recognition rate at the rank-1 level with an economical time slice of 108.8 ms. For inference tasks, the deployed deep model requires less than 5 MB of space and thus has a great viability in real-time applications.
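The efficiency claim can be made concrete with a parameter count. The sketch below compares a plain 3x3 convolution against a Fire module (a 1x1 squeeze layer followed by parallel 1x1 and 3x3 expand branches); the channel sizes are illustrative assumptions, not the paper's exact configuration:

```python
def conv_params(in_ch, out_ch, k):
    """Weights in a k x k convolution (bias terms omitted)."""
    return in_ch * out_ch * k * k

def fire_params(in_ch, squeeze_ch, expand_ch):
    """Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expand branches
    whose outputs are concatenated to expand_ch channels."""
    squeeze = conv_params(in_ch, squeeze_ch, 1)
    expand1 = conv_params(squeeze_ch, expand_ch // 2, 1)
    expand3 = conv_params(squeeze_ch, expand_ch // 2, 3)
    return squeeze + expand1 + expand3

plain = conv_params(128, 128, 3)    # 147456 weights
fire = fire_params(128, 16, 128)    # 12288 weights, a 12x reduction
```

Note that the bypass (residual) connections mentioned in the abstract require a Fire module's output channel count to match its input, as in this 128-in/128-out example, so the identity shortcut can be added directly.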

65 citations

Journal ArticleDOI
11 Feb 2016-Sensors
TL;DR: A novel framework for MMR using local tiled deep networks, which provides translational, rotational, and scale invariance as well as locality; the histogram of oriented gradients (HOG) is applied to the frontal view of images prior to the LTCNN.
Abstract: Vehicle analysis involves license-plate recognition (LPR), vehicle-type classification (VTC), and vehicle make and model recognition (MMR). Among these tasks, MMR plays an important complementary role with respect to LPR. In this paper, we propose a novel framework for MMR using local tiled deep networks. The frontal views of vehicle images are first extracted and fed into the local tiled deep networks for training and testing. A local tiled convolutional neural network (LTCNN) is proposed to alter the weight-sharing scheme of the CNN with a local tiled structure. The LTCNN unties the weights of adjacent units and instead ties units k steps from each other within a local map. This architecture provides translational, rotational, and scale invariance as well as locality. In addition, to further deal with colour and illumination variation, we apply the histogram of oriented gradients (HOG) to the frontal-view images prior to the LTCNN. The experimental results show that our LTCNN framework achieved a 98% accuracy rate in terms of vehicle MMR.
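The tiled weight-sharing idea (untie adjacent units, tie units k steps apart) can be sketched in one dimension. This is a toy illustration of the tiling scheme only, not the paper's 2-D network:

```python
def tiled_conv1d(x, kernels, k):
    """1-D 'tiled' convolution: adjacent output units use different kernels,
    while units k steps apart share the same kernel (index = position % k).
    With k = 1 this reduces to ordinary weight sharing, i.e. a standard
    convolution."""
    ksize = len(kernels[0])
    out = []
    for pos in range(len(x) - ksize + 1):
        w = kernels[pos % k]  # cycle through k untied kernels
        out.append(sum(w[i] * x[pos + i] for i in range(ksize)))
    return out

# two untied kernels alternating across positions (tile size k = 2)
y_tiled = tiled_conv1d([1, 2, 3, 4, 5], [[1, 0], [0, 1]], k=2)
# a single shared kernel (k = 1) behaves like a plain convolution
y_plain = tiled_conv1d([1, 2, 3, 4, 5], [[1, 1]], k=1)
```

Untying nearby weights lets the network learn several local filters per map, which is what gives the tiled structure its extra invariance.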

53 citations

Journal ArticleDOI
TL;DR: This work proposes a scheme for haze removal based on Double-Discriminator Cycle-Consistent Generative Adversarial Network (DD-CycleGAN), which leverages CycleGAN to translate a hazy image to the corresponding haze-free image.

47 citations


Cited by
Journal ArticleDOI
TL;DR: Practical suggestions on the selection of many hyperparameters are provided in the hope that they will promote or guide the deployment of deep learning to EEG datasets in future research.
Abstract: Objective Electroencephalography (EEG) analysis has been an important tool in neuroscience, with applications in neural engineering (e.g., brain-computer interfaces, BCIs) and even commercial settings. Many of the analytical tools used in EEG studies have used machine learning to uncover relevant information for neural classification and neuroimaging. Recently, the availability of large EEG data sets and advances in machine learning have both led to the deployment of deep learning architectures, especially in the analysis of EEG signals and in understanding the information they may contain about brain functionality. The robust automatic classification of these signals is an important step towards making the use of EEG more practical in many applications and less reliant on trained professionals. Towards this goal, a systematic review of the literature on deep learning applications to EEG classification was performed to address the following critical questions: (1) Which EEG classification tasks have been explored with deep learning? (2) What input formulations have been used for training the deep networks? (3) Are there specific deep learning network structures suitable for specific types of tasks? Approach A systematic literature review of EEG classification using deep learning was performed on the Web of Science and PubMed databases, resulting in 90 identified studies. Those studies were analyzed based on type of task, EEG preprocessing methods, input type, and deep learning architecture. Main results For EEG classification tasks, convolutional neural networks, recurrent neural networks, and deep belief networks outperform stacked auto-encoders and multi-layer perceptron neural networks in classification accuracy. The tasks that used deep learning fell into six general groups: emotion recognition, motor imagery, mental workload, seizure detection, event-related potential detection, and sleep scoring.
For each type of task, we describe the specific input formulation, major characteristics, and end classifier recommendations found through this review. Significance This review summarizes the current practices and performance outcomes in the use of deep learning for EEG classification. Practical suggestions on the selection of many hyperparameters are provided in the hope that they will promote or guide the deployment of deep learning to EEG datasets in future research.

777 citations

Journal ArticleDOI
TL;DR: In this paper, the authors present a review of 154 studies that apply deep learning to EEG, published between 2010 and 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring.
Abstract: Context Electroencephalography (EEG) is a complex signal and can require several years of training, as well as advanced signal processing and feature extraction methodologies, to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. Objective In this work, we review 154 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches from this large body of literature in order to inform future research and formulate recommendations. Methods Major databases spanning the fields of science and engineering were queried to identify relevant studies published in scientific journals, conferences, and electronic preprint repositories. Various data items were extracted for each study pertaining to (1) the data, (2) the preprocessing methodology, (3) the DL design choices, (4) the results, and (5) the reproducibility of the experiments. These items were then analyzed one by one to uncover trends. Results Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours, while the number of samples seen during training by a network varies from a few dozen to several million, depending on how epochs are extracted. Interestingly, we saw that more than half the studies used publicly available data and that there has also been a clear shift from intra-subject to inter-subject approaches over the last few years.
About [Formula: see text] of the studies used convolutional neural networks (CNNs), while [Formula: see text] used recurrent neural networks (RNNs), most often with a total of 3-10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was [Formula: see text] across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. Significance To help the community progress and share work more effectively, we provide a list of recommendations for future studies and emphasize the need for more reproducible research. We also make our summary table of DL and EEG papers available and invite authors of published work to contribute to it directly. A planned follow-up to this work will be an online public benchmarking portal listing reproducible results.

699 citations

Journal ArticleDOI
TL;DR: A survey of the neurophysiological research performed from 2009 to 2016 is presented, providing a comprehensive overview of the existing works in emotion recognition using EEG signals, and a set of good practice recommendations that researchers must follow to achieve reproducible, replicable, well-validated and high-quality results.
Abstract: Emotions have an important role in daily life, not only in human interaction, but also in decision-making processes and in the perception of the world around us. Due to the recent interest shown by the research community in establishing emotional interactions between humans and computers, the identification of the emotional state of the former has become a necessity. This can be achieved through multiple measures, such as subjective self-reports and autonomic and neurophysiological measurements. In recent years, electroencephalography (EEG) has received considerable attention from researchers, since it can provide a simple, cheap, portable, and easy-to-use solution for identifying emotions. In this paper, we present a survey of the neurophysiological research performed from 2009 to 2016, providing a comprehensive overview of the existing works in emotion recognition using EEG signals. We focus our analysis on the main aspects involved in the recognition process (e.g., subjects, features extracted, classifiers), and compare the works along these aspects. From this analysis, we propose a set of good-practice recommendations that researchers should follow to achieve reproducible, replicable, well-validated, and high-quality results. We intend this survey to be useful for the research community working on emotion recognition through EEG signals, and in particular for those entering this field of research, since it offers a structured starting point.

640 citations

Journal ArticleDOI
30 Jan 2017-Sensors
TL;DR: A deep neural network structure named Convolutional Bi-directional Long Short-Term Memory networks (CBLSTM) has been designed here to address raw sensory data and is able to outperform several state-of-the-art baseline methods.
Abstract: In modern manufacturing systems and industries, more and more research effort has been put into developing effective machine health monitoring systems. Among various machine health monitoring approaches, data-driven methods are gaining in popularity due to the development of advanced sensing and data analytic techniques. However, considering the noise, varying length, and irregular sampling behind sensory data, this kind of sequential data cannot be fed into classification and regression models directly. Therefore, previous work focused on feature extraction/fusion methods requiring expensive human labor and high-quality expert knowledge. With the development of deep learning methods in the last few years, which redefine representation learning from raw data, a deep neural network structure named Convolutional Bi-directional Long Short-Term Memory networks (CBLSTM) has been designed here to address raw sensory data. CBLSTM first uses a CNN to extract local features that are robust and informative from the sequential input. Then, a bi-directional LSTM is introduced to encode temporal information. Long Short-Term Memory networks (LSTMs) are able to capture long-term dependencies and model sequential data, and the bi-directional structure enables the capture of past and future contexts. Stacked fully-connected layers and a linear regression layer are built on top of the bi-directional LSTMs to predict the target value. Here, a real-life tool wear test is introduced, and our proposed CBLSTM is able to predict the actual tool wear based on raw sensory data. The experimental results have shown that our model is able to outperform several state-of-the-art baseline methods.
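The bi-directional idea (each timestep sees both past and future context) can be shown with a toy stand-in that replaces the LSTM cells with running sums. This is purely illustrative of why the bi-directional structure helps, not the CBLSTM model itself:

```python
def bidirectional_context(features):
    """Toy stand-in for a bi-directional recurrent layer: pair each
    timestep's feature with a running sum over the past (forward pass)
    and a running sum over the future (backward pass), so every output
    sees both contexts. A real CBLSTM uses gated LSTM cells instead."""
    n = len(features)
    fwd, s = [], 0.0
    for x in features:            # left-to-right accumulation
        s += x
        fwd.append(s)
    bwd, s = [0.0] * n, 0.0
    for i in range(n - 1, -1, -1):  # right-to-left accumulation
        s += features[i]
        bwd[i] = s
    return list(zip(features, fwd, bwd))

ctx = bidirectional_context([1.0, 2.0, 3.0])
# each tuple: (feature, past-inclusive sum, future-inclusive sum)
```

In the actual architecture, the CNN would first produce the per-window features, and the concatenated forward/backward LSTM states would feed the fully-connected regression head.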

520 citations