Reinforcement learning

About: Reinforcement learning is a research topic. Over its lifetime, 46,064 publications have been published within this topic, receiving 1,055,697 citations.


Papers
Proceedings ArticleDOI
01 Jul 2020
TL;DR: In this article, the authors propose an end-to-end reinforcement learning (RL) approach to UAV-enabled data collection from Internet of Things (IoT) devices in an urban environment.
Abstract: Autonomous deployment of unmanned aerial vehicles (UAVs) supporting next-generation communication networks requires efficient trajectory planning methods. We propose a new end-to-end reinforcement learning (RL) approach to UAV-enabled data collection from Internet of Things (IoT) devices in an urban environment. An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance. While previous approaches, learning and non-learning based, must perform expensive recomputations or relearn a behavior when important scenario parameters such as the number of sensors, sensor positions, or maximum flying time, change, we train a double deep Q-network (DDQN) with combined experience replay to learn a UAV control policy that generalizes over changing scenario parameters. By exploiting a multi-layer map of the environment fed through convolutional network layers to the agent, we show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters that balance the data collection goal with flight time efficiency and safety constraints. Considerable advantages in learning efficiency from using a map centered on the UAV’s position over a non-centered map are also illustrated.

24 citations
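The two training ingredients named in the abstract, double-Q targets and combined experience replay (the most recent transition is always included in each sampled batch), can be sketched as follows. Array shapes, names, and the buffer layout are illustrative, not the paper's actual map-based convolutional implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def ddqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Double DQN targets: the online network selects the next action,
    the target network evaluates it (reduces Q-value overestimation)."""
    best_actions = np.argmax(q_online_next, axis=1)
    q_eval = q_target_next[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * q_eval

def sample_combined(replay_buffer, batch_size):
    """Combined experience replay: every batch contains the latest
    transition plus (batch_size - 1) uniformly sampled older ones."""
    latest = replay_buffer[-1]
    idx = rng.integers(0, len(replay_buffer), size=batch_size - 1)
    return [replay_buffer[i] for i in idx] + [latest]
```

In a full agent, `ddqn_targets` would feed the regression loss for the online network, while `sample_combined` replaces uniform batch sampling.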

Journal ArticleDOI
TL;DR: Wang et al. propose a new ensemble deep graph reinforcement learning network to construct a high-precision traffic volume forecasting model; such forecasting can effectively improve freeway traffic efficiency and traveller comfort.
Abstract: Spatio-temporal traffic volume forecasting technologies can effectively improve freeway traffic efficiency and the travel comfort of humans. To construct a high-precision traffic volume forecasting model, this study proposed a new ensemble deep graph reinforcement learning network. The modeling process of the spatio-temporal prediction model mainly included three steps. In step I, raw spatiotemporal traffic network datasets (traffic volumes, traffic speeds, weather, and holidays) were preprocessed and the adjacency matrix was constructed. In step II, a graph attention network (GAT) and graph convolution network (GCN) were used as the main predictors to build the spatio-temporal traffic volume forecasting model and obtain the forecasting results, respectively. In step III, deep reinforcement learning was used to effectively analyze the correlations between the forecasting results from these two neural networks and the final results, so as to optimize the weight coefficient. The final result of the proposed model was obtained by combining the forecasting results from the GAT and GCN with the weight coefficient. Based on summarizing and analyzing the experimental results, it can be concluded that: (1) deep reinforcement learning can effectively integrate the two different graph neural networks and achieve better results than traditional ensemble methods; and (2) the presented ensemble model performs better than twenty-one models proposed by other researchers for all studied cases.

24 citations
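The final combination step described above, a weight coefficient blending the GAT and GCN forecasts, can be illustrated with a toy weight search. The paper optimises the weight with deep reinforcement learning; here that is replaced by a simple grid search over MSE purely to show what the coefficient does:

```python
import numpy as np

def ensemble_forecast(pred_gat, pred_gcn, w):
    """Final forecast: convex combination of the two graph networks."""
    return w * pred_gat + (1.0 - w) * pred_gcn

def best_weight(pred_gat, pred_gcn, y_true, grid=101):
    """Toy stand-in for the paper's RL weight search:
    pick the w in [0, 1] that minimises mean squared error."""
    ws = np.linspace(0.0, 1.0, grid)
    errors = [np.mean((ensemble_forecast(pred_gat, pred_gcn, w) - y_true) ** 2)
              for w in ws]
    return ws[int(np.argmin(errors))]
```

The point of using RL (or any learned search) instead of this grid is that the weight can then be conditioned on context rather than fixed globally.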

Journal ArticleDOI
TL;DR: A deep reinforcement learning (RL) method is proposed for continuous fine-grained drone control that enables acquiring high-quality frontal-view person shots, together with a reward-shaping approach that improves the stability of the employed continuous RL method.
Abstract: Drones, also known as unmanned aerial vehicles, can be used to aid various aerial cinematography tasks. However, using drones for aerial cinematography requires the coordination of several people, increasing the cost and reducing the shooting flexibility, while also increasing the cognitive load of the drone operators. To overcome these limitations, we propose a deep reinforcement learning (RL) method for continuous fine-grained drone control, that allows for acquiring high-quality frontal view person shots. To this end, a head pose image dataset is combined with 3D models and face alignment/warping techniques to develop an RL environment that realistically simulates the effects of the drone control commands. An appropriate reward-shaping approach is also proposed to improve the stability of the employed continuous RL method. Apart from performing continuous control, it was demonstrated that the proposed method can be also effectively combined with simulation environments that support only discrete control commands, improving the control accuracy, even in this case. The effectiveness of the proposed technique is experimentally demonstrated using several quantitative and qualitative experiments.

24 citations
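A hypothetical example of the kind of reward shaping the abstract describes, a dense pose-alignment term (peaking at a frontal view) minus a penalty on aggressive control commands, might look like this. All constants, names, and the functional form are invented for illustration:

```python
import math

def shaped_reward(yaw_error, pitch_error, action_magnitude,
                  k_pose=1.0, k_smooth=0.1):
    """Illustrative shaped reward for frontal-shot drone control:
    - pose_term is dense and maximal (1.0) when the subject's head
      pose errors are zero, i.e. a perfect frontal view;
    - the quadratic action penalty discourages large, jerky commands,
      which tends to stabilise continuous RL training."""
    pose_term = math.exp(-k_pose * (yaw_error ** 2 + pitch_error ** 2))
    return pose_term - k_smooth * action_magnitude ** 2
```

A dense, smooth signal like this avoids the sparse-reward problem of only rewarding an exactly frontal shot.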

Proceedings ArticleDOI
24 Oct 2020
TL;DR: In this article, a generative framework is proposed to generate safety-critical scenarios for evaluating specific task algorithms in the real world using a series of autoregressive building blocks and sampling from the joint distribution of these blocks.
Abstract: Long-tail and rare event problems become crucial when autonomous driving algorithms are applied in the real world. For the purpose of evaluating systems in challenging settings, we propose a generative framework to create safety-critical scenarios for evaluating specific task algorithms. We first represent the traffic scenarios with a series of autoregressive building blocks and generate diverse scenarios by sampling from the joint distribution of these blocks. We then train the generative model as an agent (or a generator) to search the risky scenario parameters for a given driving algorithm. We treat the driving algorithm as an environment that returns high reward to the agent when a risky scenario is generated. The whole process is optimized by the policy gradient reinforcement learning method. Through the experiments conducted on several scenarios in the simulation, we demonstrate that the proposed framework generates safety-critical scenarios more efficiently than grid search or human design methods. Another advantage of this method is its adaptiveness to the routes and parameters.

24 citations
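The search loop the abstract describes, where a generator samples scenario parameters, the driving algorithm under test returns high reward for risky ones, and a policy gradient shifts the generator toward them, can be sketched with plain REINFORCE on a Gaussian generator. The quadratic `risk_reward` is a stand-in for actually running a driving algorithm, and all hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def risk_reward(params, risky_point):
    """Stand-in for the driving-algorithm environment: reward grows
    as the sampled scenario approaches a risky configuration."""
    return -np.sum((params - risky_point) ** 2)

def reinforce_search(risky_point, dim=2, iters=300, lr=0.05,
                     sigma=0.3, batch=32):
    """REINFORCE on a Gaussian scenario generator: the mean of the
    scenario-parameter distribution drifts toward high-reward
    (i.e. safety-critical) regions."""
    mu = np.zeros(dim)
    for _ in range(iters):
        eps = rng.normal(size=(batch, dim))
        params = mu + sigma * eps
        rewards = np.array([risk_reward(p, risky_point) for p in params])
        adv = rewards - rewards.mean()            # baseline lowers variance
        grad = (adv[:, None] * eps).mean(axis=0) / sigma
        mu += lr * grad
    return mu
```

Unlike grid search, this spends samples only where the current policy believes risk is high, which is the efficiency advantage the paper reports.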

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work models the online key-frame decision in dynamic video segmentation as a deep reinforcement learning problem and learns an efficient and effective scheduling policy from expert information about decision history and from the process of maximising global return.
Abstract: For real-time semantic video segmentation, most recent works utilised a dynamic framework with a key scheduler to make online key/non-key decisions. Some works used a fixed key scheduling policy, while others proposed adaptive key scheduling methods based on heuristic strategies, both of which may lead to suboptimal global performance. To overcome this limitation, we model the online key decision process in dynamic video segmentation as a deep reinforcement learning problem and learn an efficient and effective scheduling policy from expert information about decision history and from the process of maximising global return. Moreover, we study the application of dynamic video segmentation on face videos, a field that has not been investigated before. By evaluating on the 300VW dataset, we show that the performance of our reinforcement key scheduler outperforms that of various baselines in terms of both effective key selections and running speed. Further results on the Cityscapes dataset demonstrate that our proposed method can also generalise to other scenarios. To the best of our knowledge, this is the first work to use reinforcement learning for online key-frame decision in dynamic video segmentation, and also the first work on its application on face videos.

24 citations
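The key/non-key trade-off the abstract describes can be illustrated with tabular Q-learning on a toy environment: the state is the number of frames since the last key frame, segmentation accuracy decays as the cached result goes stale, and choosing a key frame resets staleness but pays a compute cost. All dynamics and constants here are invented, and the paper uses a deep RL scheduler rather than a table:

```python
import numpy as np

rng = np.random.default_rng(0)

def step(staleness, action, key_cost=0.5):
    """Toy key-decision environment. action 1 = key frame (run the
    full segmentation network: accurate but costly, resets staleness);
    action 0 = non-key (propagate: cheap, accuracy decays)."""
    if action == 1:
        return 0, 1.0 - key_cost
    new_s = min(staleness + 1, 9)
    accuracy = max(0.0, 1.0 - 0.2 * new_s)
    return new_s, accuracy

def q_learning(episodes=2000, horizon=30, alpha=0.1, gamma=0.9, eps=0.1):
    """Epsilon-greedy tabular Q-learning over the 10 staleness states."""
    Q = np.zeros((10, 2))
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r = step(s, a)
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

The learned policy propagates while the cached segmentation is fresh and re-runs the network once it is stale, which is exactly the adaptive behaviour a fixed-interval scheduler cannot express.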


Network Information
Related Topics (5)
Robustness (computer science): 94.7K papers, 1.6M citations (88% related)
Artificial neural network: 207K papers, 4.5M citations (88% related)
Deep learning: 79.8K papers, 2.1M citations (88% related)
Optimization problem: 96.4K papers, 2.1M citations (86% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Performance Metrics
No. of papers in the topic in previous years:

Year | Papers
2025 | 1
2024 | 3
2023 | 7,164
2022 | 13,747
2021 | 8,484
2020 | 8,703