scispace - formally typeset
Author

Haobin Shi

Other affiliations: National Sun Yat-sen University
Bio: Haobin Shi is an academic researcher from Northwestern Polytechnical University. The author has contributed to research in topics: Reinforcement learning & Computer science. The author has an h-index of 10 and has co-authored 44 publications receiving 325 citations. Previous affiliations of Haobin Shi include National Sun Yat-sen University.

Papers published on a yearly basis

Papers
Journal ArticleDOI
TL;DR: The results of simulations and experiments on control of UAVs demonstrate that the proposed IBVS method with Q-Learning achieves better stability and convergence than the competing methods.
Abstract: Visual servoing aims to control an object's motion using visual feedback and has recently become popular. Problems of complex modeling and instability persist in visual servoing methods. Moreover, there is little research on the selection of the servoing gain in image-based visual servoing (IBVS) methods. This paper proposes an IBVS method with Q-Learning, where the learning rate is adjusted by a fuzzy system. Meanwhile, a synthetic preprocess is introduced to perform feature extraction. The extraction method is a combination of a color-based recognition algorithm and an improved contour-based recognition algorithm. To deal with the underactuated dynamics of unmanned aerial vehicles (UAVs), a decoupled controller is designed: the velocity and attitude are decoupled by attenuating the effects of underactuation in roll and pitch, and two independent servoing gains, for linear and angular motion servoing, respectively, are designed in place of the single servoing gain in traditional methods. For further improvement in convergence and stability, a reinforcement learning method, Q-Learning, is used for adaptive servoing gain adjustment. The Q-Learning is composed of two independent learning agents that adjust the two servoing gains, respectively. To improve the performance of the Q-Learning, a fuzzy-based method is proposed for tuning the learning rate. The results of simulations and experiments on control of UAVs demonstrate that the proposed method has better stability and convergence than the competing methods.

90 citations
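The gain-adjustment idea above can be sketched as a tabular Q-learning update whose learning rate is scaled by a fuzzy membership on the current feature error. This is a minimal illustration, not the paper's implementation: the state discretisation, action set, and the triangular membership in `fuzzy_alpha` are all invented for the sketch.

```python
# Sketch (assumed details, not the authors' code): a tabular Q-learning agent
# that adjusts a servoing gain, with the learning rate alpha scaled by a
# simple fuzzy-style membership on the current feature error.

N_STATES = 5                  # discretised feature-error levels
ACTIONS = (-0.1, 0.0, 0.1)    # decrease / keep / increase the servoing gain

Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]

def fuzzy_alpha(error, base=0.5):
    """Larger errors -> larger learning rate (triangular membership)."""
    mu = min(abs(error) / 1.0, 1.0)   # membership of "error is large" in [0, 1]
    return base * (0.2 + 0.8 * mu)

def update(state, action_idx, reward, next_state, error, gamma=0.9):
    """Standard Q-learning TD update with the fuzzy-tuned learning rate."""
    alpha = fuzzy_alpha(error)
    best_next = max(Q[next_state])
    td = reward + gamma * best_next - Q[state][action_idx]
    Q[state][action_idx] += alpha * td
    return alpha

# One illustrative update: a large error yields a learning rate near the base.
a = update(state=4, action_idx=2, reward=1.0, next_state=3, error=0.9)
print(round(a, 3), round(Q[4][2], 3))   # -> 0.46 0.46
```

In the paper's design, two such agents run side by side, one per servoing gain; the sketch shows only a single agent for brevity.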

Journal ArticleDOI
TL;DR: An end-to-end navigation planner translates sparse laser ranging results into movement actions and achieves map-less navigation in complex environments; with a reward signal enhanced by intrinsic motivation, the agent explores more efficiently and the learned strategy is more reliable.
Abstract: In this article, we develop a navigation strategy based on deep reinforcement learning (DRL) for mobile robots. Because of the large difference between simulation and reality, most of the trained DRL models cannot be directly migrated into real robots. Moreover, how to explore in a sparsely rewarded environment is also a long-standing problem of DRL. This article proposes an end-to-end navigation planner that translates sparse laser ranging results into movement actions. Using this highly abstract data as input, agents trained by simulation can be extended to the real scene for practical application. For map-less navigation across obstacles and traps, it is difficult to reach the target via random exploration. Curiosity is used to encourage agents to explore the state of an environment that has not been visited and as an additional reward for exploring behavior. The agent relies on the self-supervised model to predict the next state, based on the current state and the executed action. The prediction error is used as a measure of curiosity. The experimental results demonstrate that without any manual design features and previous demonstrations, the proposed method accomplishes map-less navigation in complex environments. Through a reward signal that is enhanced by intrinsic motivation, the agent explores more efficiently, and the learned strategy is more reliable.

75 citations
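The curiosity mechanism described above (prediction error of a self-supervised forward model used as an intrinsic reward) can be sketched as follows. The toy `forward_model` and the scale factor `eta` are illustrative assumptions; in the paper the predictor is learned alongside the policy.

```python
# Sketch (assumed details, not the authors' code): curiosity as forward-model
# prediction error, added to a sparse extrinsic reward.

def forward_model(state, action):
    """Toy stand-in for the learned self-supervised next-state predictor
    (fixed and deliberately imperfect here)."""
    return [s + 0.5 * action for s in state]

def curiosity_reward(state, action, next_state, eta=1.0):
    """Squared prediction error of the forward model: the more surprising
    the transition, the larger the intrinsic reward."""
    pred = forward_model(state, action)
    err = sum((p - n) ** 2 for p, n in zip(pred, next_state))
    return eta * err

extrinsic = 0.0     # sparse reward: the goal was not reached this step
r_int = curiosity_reward([0.0, 0.0], 1.0, [0.7, 0.4])
total = extrinsic + r_int
print(round(r_int, 2), round(total, 2))   # -> 0.05 0.05
```

Because the bonus shrinks as the forward model improves on familiar transitions, the agent is steered toward states it has not yet visited.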

Journal ArticleDOI
TL;DR: To address the problem of missing feature points in current images during a visual navigation task, a homography method that uses a priori visual information is proposed to predict all of the missing feature points and to ensure the execution of IBVS.
Abstract: Image-based visual servoing (IBVS) can reach a desired position for a relatively stationary target using continuous visual feedback. Proper feature extraction and appropriate servoing control laws are essential to the performance of IBVS. IBVS control can be abruptly interrupted or disturbed if no features can be extracted when the observed object is occluded. To address the problem of missing feature points in current images during a visual navigation task, a homography method that uses a priori visual information is proposed to predict all of the missing feature points and to ensure the execution of IBVS. The mixture parameter for the image Jacobian matrix can also affect the control of IBVS. The mixture parameter is typically set heuristically, so there is no systematic approach for most IBVS applications. An adaptive control approach is proposed to determine the mixture parameter. The proposed method uses a reinforcement learning (RL) method to adaptively adjust the mixture parameter during the robot movement, which allows more efficient control than a constant parameter. A logarithmic interval state-space partition for RL is used to ensure efficient learning. The integrated visual servoing control system is validated by several experiments that involve wheeled mobile robots reaching a target with a desired configuration. The results for simulation and experiment demonstrate that the proposed method has a faster convergence rate than other methods.

49 citations
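The logarithmic interval state-space partition mentioned above can be sketched as a mapping from the feature-error norm to a discrete RL state index. The bin edges (`e_min`, `e_max`, `n_bins`) are illustrative assumptions, not the paper's values; the point is that log spacing gives fine resolution near convergence and coarse resolution far away.

```python
# Sketch: a logarithmic partition of the feature-error norm into RL states
# (bin parameters are invented for illustration).
import math

def log_state(error_norm, e_min=0.01, e_max=10.0, n_bins=8):
    """Map an error norm to a state index on a log scale: small errors
    fall into narrow bins, large errors into wide ones."""
    e = min(max(error_norm, e_min), e_max)      # clamp into the modelled range
    frac = math.log(e / e_min) / math.log(e_max / e_min)
    return min(int(frac * n_bins), n_bins - 1)

# Small, medium, and large errors land in increasingly high state indices.
print(log_state(0.02), log_state(0.5), log_state(9.0))   # -> 0 4 7
```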

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed method with fuzzy Bayesian reinforcement learning (RL) has better knowledge representation and strategy selection than other competing methods.
Abstract: A robot soccer system is a typical complex time-sequence decision-making system. Uncertain knowledge representation and model complexity are persistent problems in robot soccer games. To achieve an adaptive decision-making mechanism, a method with fuzzy Bayesian reinforcement learning (RL) is proposed in this paper. To extract the features utilized in the proposed learning method, a fuzzy comprehensive evaluation method (FCEM) is developed. This method classifies the situations in robot soccer games into a set of features. With the fuzzy analytical hierarchy process (FAHP), the FCEM can calculate the weights according to defined factors for these features, which comprise the dimensionality of the state space. The weight imposed on each feature determines the range of each dimension. Through a Bayesian network, the comprehensively evaluated features are transformed into decision bases. An RL method for strategy selection over time is implemented. The fuzzy mechanism effectively incorporates experience into the learning system and provides flexibility in state aggregation, thus improving learning efficiency. The experimental results demonstrate that the proposed method has better knowledge representation and strategy selection than other competing methods.

44 citations
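The fuzzy comprehensive evaluation step can be sketched as a weighted aggregation of fuzzy membership degrees. The feature names, membership values, and weights below are invented for illustration; in the paper the weights come from the FAHP, not from hand-picked numbers.

```python
# Sketch of a fuzzy comprehensive evaluation (illustrative values only;
# the paper derives the weights with the fuzzy analytical hierarchy process).

def comprehensive_eval(memberships, weights):
    """Weighted aggregation of fuzzy membership degrees for each feature."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(m * w for m, w in zip(memberships, weights))

# Hypothetical memberships for "attack is favourable": ball distance,
# opponent pressure, goal angle.
memberships = [0.8, 0.3, 0.6]
weights = [0.5, 0.2, 0.3]     # from a (hypothetical) FAHP weighting
print(round(comprehensive_eval(memberships, weights), 2))   # -> 0.64
```

The aggregated score then serves as one dimension of the state used by the Bayesian network and the RL strategy selector.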

Journal ArticleDOI
TL;DR: An adaptive decision-making method that uses reinforcement learning (RL) is proposed; the decision-making system for a robotic soccer game is composed of two subsystems, and simulations and experiments demonstrate that the proposed method allows satisfactory decision-making.
Abstract: Robotic soccer games, which have become popular, require timely and precise decision-making in a dynamic environment. To address the complexity of critical situations, the decision policy in robotic soccer games must be improved. This paper proposes an adaptive decision-making method that uses reinforcement learning (RL); the decision-making system for a robotic soccer game is composed of two subsystems. The first subsystem in the proposed architecture assesses the situation, and the second subsystem implements the decision-making policy. Inspired by the support vector machine (SVM), a situation classification method, called an improved SVM, embeds a decision tree structure and simultaneously addresses the problems of large scale and multiple classifications. Once the variety of situations collected in the field is classified and aggregated into the tree structure, the problem of local strategy selection for each individual class of situations over time is regarded as an RL problem and is solved using a Q-learning method. The results of simulations and experiments demonstrate that the proposed method allows satisfactory decision-making.

36 citations
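The two-subsystem structure above can be sketched as a classifier that routes each situation to a class, with a separate Q-table per class for local strategy selection. The thresholds, class names, and Q-values are hypothetical; the toy `classify` merely stands in for the improved-SVM tree.

```python
# Sketch (structure only, hypothetical values): a situation classifier routes
# a game state to a class, and each class keeps its own Q-table for
# local strategy selection.

STRATEGIES = ("attack", "defend", "pass")

# One (pre-trained, here hand-filled) Q-table per situation class.
q_tables = {"offense": [0.9, 0.1, 0.4], "defense": [0.2, 0.8, 0.3]}

def classify(ball_x, opponents_near):
    """Toy stand-in for the improved-SVM decision-tree classifier."""
    if ball_x > 0.5 and opponents_near < 2:
        return "offense"
    return "defense"

def select_strategy(ball_x, opponents_near):
    """Greedy strategy selection within the classified situation."""
    cls = classify(ball_x, opponents_near)
    q = q_tables[cls]
    return cls, STRATEGIES[q.index(max(q))]

print(select_strategy(0.8, 1))   # -> ('offense', 'attack')
print(select_strategy(0.3, 3))   # -> ('defense', 'defend')
```

Keeping one Q-table per class shrinks each local state space, which is the practical benefit of separating classification from strategy learning.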


Cited by
Journal ArticleDOI
Kai Zhu1, Tao Zhang1
TL;DR: This paper systematically compares and analyzes the relationship and differences between four typical application scenarios: local obstacle avoidance, indoor navigation, multi-robot navigation, and social navigation; and describes the development of DRL-based navigation.
Abstract: Navigation is a fundamental problem of mobile robots, for which Deep Reinforcement Learning (DRL) has received significant attention because of its strong representation and experience learning abilities. There is a growing trend of applying DRL to mobile robot navigation. In this paper, we review DRL methods and DRL-based navigation frameworks. Then we systematically compare and analyze the relationship and differences between four typical application scenarios: local obstacle avoidance, indoor navigation, multi-robot navigation, and social navigation. Next, we describe the development of DRL-based navigation. Last, we discuss the challenges and some possible solutions regarding DRL-based navigation.

117 citations

Journal ArticleDOI
TL;DR: The experiment indicates that deep learning algorithms are suitable for intrusion detection in an IoT network environment.
Abstract: With the popularity of Internet of Things (IoT) technology, the security of the IoT network has become an important issue. Traditional intrusion detection systems have limitations when applied to the IoT network due to resource constraints and network complexity. This research focuses on the design, implementation and testing of an intrusion detection system that uses a hybrid placement strategy based on a multi-agent system, blockchain and deep learning algorithms. The system consists of the following modules: data collection, data management, analysis, and response. The NSL-KDD (National Security Lab Knowledge Discovery and Data Mining) dataset is used to test the system. The results demonstrate the efficiency of deep learning algorithms when detecting attacks from the transport layer. The experiment indicates that deep learning algorithms are suitable for intrusion detection in an IoT network environment.

114 citations

Journal ArticleDOI
TL;DR: Experimental results show that the proposed $S^{2}$-VAE outperforms the state-of-the-art algorithms for anomaly detection from video data and achieves better performance in detecting both local and global abnormal events.
Abstract: Security surveillance is critical to social harmony and people’s peaceful life. It has a great impact on strengthening social stability and life safeguarding. Detecting anomalies timely, effectively and efficiently in video surveillance remains challenging. This paper proposes a new approach, called $S^{2}$ -VAE, for anomaly detection from video data. The $S^{2}$ -VAE consists of two proposed neural networks: a Stacked Fully Connected Variational AutoEncoder ( $S_{F}$ -VAE) and a Skip Convolutional VAE ( $S_{C}$ -VAE). The $S_{F}$ -VAE is a shallow generative network to obtain a model like Gaussian mixture to fit the distribution of the actual data. The $S_{C}$ -VAE, as a key component of $S^{2}$ -VAE, is a deep generative network to take advantage of CNN, VAE and skip connections. Both $S_{F}$ -VAE and $S_{C}$ -VAE are efficient and effective generative networks and they can achieve better performance for detecting both local abnormal events and global abnormal events. The proposed $S^{2}$ -VAE is evaluated using four public datasets. The experimental results show that the $S^{2}$ -VAE outperforms the state-of-the-art algorithms. The code is available publicly at https://github.com/tianwangbuaa/.

102 citations
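The general mechanism behind VAE-based detectors like the one above is scoring each frame by how poorly a model trained on normal data reconstructs it. This sketch substitutes a trivial stand-in for the trained networks to show only the scoring logic; the threshold and "model" are assumptions, not the paper's design.

```python
# Sketch: anomaly scoring by reconstruction error, the common mechanism
# behind VAE-based video anomaly detectors (toy "model", not the paper's
# S^2-VAE networks).

def reconstruct(frame):
    """Stand-in for a trained decoder: reproduces normal patterns
    (small values) well and fails on abnormal spikes."""
    return [min(v, 0.2) for v in frame]

def anomaly_score(frame):
    """Mean squared reconstruction error over the frame's features."""
    rec = reconstruct(frame)
    return sum((v - r) ** 2 for v, r in zip(frame, rec)) / len(frame)

normal = [0.1, 0.15, 0.05, 0.2]     # within the "learned" normal range
abnormal = [0.1, 0.9, 0.05, 0.8]    # contains abnormal spikes
print(anomaly_score(normal) < 0.01, anomaly_score(abnormal) > 0.1)   # -> True True
```

In practice the score is thresholded (or its likelihood under the VAE evaluated) per frame or per spatial patch to localise the abnormal event.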

Journal ArticleDOI
Haozhe Wang1, Yulei Wu1, Geyong Min1, Jie Xu2, Pengcheng Tang2 
TL;DR: Deep Reinforcement Learning is leveraged to extract knowledge from experience by interacting with the network and enable dynamic adjustment of the resources allocated to various slices in order to maximise the resource utilisation while guaranteeing the Quality-of-Service (QoS).
Abstract: Network slicing is designed to support a variety of emerging applications with diverse performance and flexibility requirements by dividing the physical network into multiple logical networks. These applications, along with a massive number of mobile phones, produce large amounts of data, bringing tremendous challenges for network slicing performance. From another perspective, this huge amount of data also offers a new opportunity for the management of network slicing resources. Leveraging the knowledge and insights retrieved from the data, we develop a novel Machine Learning-based scheme for dynamic resource scheduling for network slicing, aiming to achieve automatic and efficient resource optimisation and End-to-End (E2E) service reliability. However, user-related data, which is crucial for understanding user behaviour and requests, is difficult to obtain due to privacy issues. Therefore, Deep Reinforcement Learning (DRL) is leveraged to extract knowledge from experience by interacting with the network, enabling dynamic adjustment of the resources allocated to various slices in order to maximise resource utilisation while guaranteeing the Quality-of-Service (QoS). The experimental results demonstrate that the proposed resource scheduling scheme can dynamically allocate resources for multiple slices and meet the corresponding QoS requirements.

97 citations
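The trade-off the DRL scheduler optimises (utilisation versus QoS guarantees) can be sketched as a reward function over per-slice allocations. The penalty weight and the notion of a QoS violation below are illustrative assumptions, not the paper's formulation.

```python
# Sketch: a reward of the kind a DRL slice scheduler could optimise,
# trading off resource utilisation against QoS violations
# (weights and the violation rule are illustrative assumptions).

def slice_reward(allocated, demanded, capacity, penalty=2.0):
    """Utilisation reward minus a penalty for each slice whose demand
    exceeds its allocation (treated as a QoS violation)."""
    used = sum(min(a, d) for a, d in zip(allocated, demanded))
    utilisation = used / capacity
    violations = sum(1 for a, d in zip(allocated, demanded) if a < d)
    return utilisation - penalty * violations / len(allocated)

# Three slices sharing 10 units of capacity.
print(round(slice_reward([4, 3, 3], [4, 2, 3], capacity=10), 2))   # -> 0.9 (all demands met)
print(round(slice_reward([2, 5, 3], [4, 2, 3], capacity=10), 2))   # -> 0.03 (one slice starved)
```

A DRL agent receiving this signal is pushed toward allocations that keep capacity busy without starving any slice, which mirrors the stated goal of maximising utilisation while guaranteeing QoS.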