Reinforcement learning

About: Reinforcement learning is a research topic. Over its lifetime, 46,064 publications have been published within this topic, receiving 1,055,697 citations.


Papers
Proceedings ArticleDOI
01 Jul 2020
TL;DR: In this article, the authors propose an end-to-end reinforcement learning (RL) approach to UAV-enabled data collection from Internet of Things (IoT) devices in an urban environment.
Abstract: Autonomous deployment of unmanned aerial vehicles (UAVs) supporting next-generation communication networks requires efficient trajectory planning methods. We propose a new end-to-end reinforcement learning (RL) approach to UAV-enabled data collection from Internet of Things (IoT) devices in an urban environment. An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance. While previous approaches, learning and non-learning based, must perform expensive recomputations or relearn a behavior when important scenario parameters such as the number of sensors, sensor positions, or maximum flying time, change, we train a double deep Q-network (DDQN) with combined experience replay to learn a UAV control policy that generalizes over changing scenario parameters. By exploiting a multi-layer map of the environment fed through convolutional network layers to the agent, we show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters that balance the data collection goal with flight time efficiency and safety constraints. Considerable advantages in learning efficiency from using a map centered on the UAV’s position over a non-centered map are also illustrated.

24 citations
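The two training ingredients named in the abstract, double-Q targets and combined experience replay (the most recent transition is always included in each sampled batch), can be sketched as follows. Array shapes, names, and the buffer layout are illustrative, not the paper's actual map-based convolutional implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def ddqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Double DQN targets: the online network selects the next action,
    the target network evaluates it (reduces Q-value overestimation)."""
    best_actions = np.argmax(q_online_next, axis=1)
    q_eval = q_target_next[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * q_eval

def sample_combined(replay_buffer, batch_size):
    """Combined experience replay: every batch contains the latest
    transition plus (batch_size - 1) uniformly sampled older ones."""
    latest = replay_buffer[-1]
    idx = rng.integers(0, len(replay_buffer), size=batch_size - 1)
    return [replay_buffer[i] for i in idx] + [latest]
```

In a full agent, `ddqn_targets` would feed the regression loss for the online network, while `sample_combined` replaces uniform batch sampling.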

Journal ArticleDOI
TL;DR: Wang et al. propose a new ensemble deep graph reinforcement learning network to construct a high-precision traffic volume forecasting model; such forecasting can effectively improve freeway traffic efficiency and traveller comfort.
Abstract: Spatio-temporal traffic volume forecasting technologies can effectively improve freeway traffic efficiency and the travel comfort of humans. To construct a high-precision traffic volume forecasting model, this study proposed a new ensemble deep graph reinforcement learning network. The modeling process of the spatio-temporal prediction model mainly included three steps. In step I, raw spatiotemporal traffic network datasets (traffic volumes, traffic speeds, weather, and holidays) were preprocessed and the adjacency matrix was constructed. In step II, a graph attention network (GAT) and graph convolution network (GCN) were used as the main predictors to build the spatio-temporal traffic volume forecasting model and obtain the forecasting results, respectively. In step III, deep reinforcement learning was used to effectively analyze the correlations between the forecasting results from these two neural networks and the final results, so as to optimize the weight coefficient. The final result of the proposed model was obtained by combining the forecasting results from the GAT and GCN with the weight coefficient. Based on summarizing and analyzing the experimental results, it can be concluded that: (1) deep reinforcement learning can effectively integrate the two different graph neural networks and achieve better results than traditional ensemble methods; and (2) the presented ensemble model performs better than twenty-one models proposed by other researchers for all studied cases.

24 citations
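The final combination step described above, a weight coefficient blending the GAT and GCN forecasts, can be illustrated with a toy weight search. The paper optimises the weight with deep reinforcement learning; here that is replaced by a simple grid search over MSE purely to show what the coefficient does:

```python
import numpy as np

def ensemble_forecast(pred_gat, pred_gcn, w):
    """Final forecast: convex combination of the two graph networks."""
    return w * pred_gat + (1.0 - w) * pred_gcn

def best_weight(pred_gat, pred_gcn, y_true, grid=101):
    """Toy stand-in for the paper's RL weight search:
    pick the w in [0, 1] that minimises mean squared error."""
    ws = np.linspace(0.0, 1.0, grid)
    errors = [np.mean((ensemble_forecast(pred_gat, pred_gcn, w) - y_true) ** 2)
              for w in ws]
    return ws[int(np.argmin(errors))]
```

The point of using RL (or any learned search) instead of this grid is that the weight can then be conditioned on context rather than fixed globally.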

Journal ArticleDOI
TL;DR: A deep reinforcement learning (RL) method is proposed for continuous fine-grained drone control that enables acquiring high-quality frontal-view person shots, together with a reward-shaping approach that improves the stability of the employed continuous RL method.
Abstract: Drones, also known as unmanned aerial vehicles, can be used to aid various aerial cinematography tasks. However, using drones for aerial cinematography requires the coordination of several people, increasing the cost and reducing the shooting flexibility, while also increasing the cognitive load of the drone operators. To overcome these limitations, we propose a deep reinforcement learning (RL) method for continuous fine-grained drone control, that allows for acquiring high-quality frontal view person shots. To this end, a head pose image dataset is combined with 3D models and face alignment/warping techniques to develop an RL environment that realistically simulates the effects of the drone control commands. An appropriate reward-shaping approach is also proposed to improve the stability of the employed continuous RL method. Apart from performing continuous control, it was demonstrated that the proposed method can be also effectively combined with simulation environments that support only discrete control commands, improving the control accuracy, even in this case. The effectiveness of the proposed technique is experimentally demonstrated using several quantitative and qualitative experiments.

24 citations
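A hypothetical example of the kind of reward shaping the abstract describes, a dense pose-alignment term (peaking at a frontal view) minus a penalty on aggressive control commands, might look like this. All constants, names, and the functional form are invented for illustration:

```python
import math

def shaped_reward(yaw_error, pitch_error, action_magnitude,
                  k_pose=1.0, k_smooth=0.1):
    """Illustrative shaped reward for frontal-shot drone control:
    - pose_term is dense and maximal (1.0) when the subject's head
      pose errors are zero, i.e. a perfect frontal view;
    - the quadratic action penalty discourages large, jerky commands,
      which tends to stabilise continuous RL training."""
    pose_term = math.exp(-k_pose * (yaw_error ** 2 + pitch_error ** 2))
    return pose_term - k_smooth * action_magnitude ** 2
```

A dense, smooth signal like this avoids the sparse-reward problem of only rewarding an exactly frontal shot.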

Proceedings ArticleDOI
24 Oct 2020
TL;DR: In this article, a generative framework is proposed to generate safety-critical scenarios for evaluating specific task algorithms in the real world using a series of autoregressive building blocks and sampling from the joint distribution of these blocks.
Abstract: Long-tail and rare event problems become crucial when autonomous driving algorithms are applied in the real world. For the purpose of evaluating systems in challenging settings, we propose a generative framework to create safety-critical scenarios for evaluating specific task algorithms. We first represent the traffic scenarios with a series of autoregressive building blocks and generate diverse scenarios by sampling from the joint distribution of these blocks. We then train the generative model as an agent (or a generator) to search the risky scenario parameters for a given driving algorithm. We treat the driving algorithm as an environment that returns high reward to the agent when a risky scenario is generated. The whole process is optimized by the policy gradient reinforcement learning method. Through the experiments conducted on several scenarios in the simulation, we demonstrate that the proposed framework generates safety-critical scenarios more efficiently than grid search or human design methods. Another advantage of this method is its adaptiveness to the routes and parameters.

24 citations
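The search loop the abstract describes, where a generator samples scenario parameters, the driving algorithm under test returns high reward for risky ones, and a policy gradient shifts the generator toward them, can be sketched with plain REINFORCE on a Gaussian generator. The quadratic `risk_reward` is a stand-in for actually running a driving algorithm, and all hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def risk_reward(params, risky_point):
    """Stand-in for the driving-algorithm environment: reward grows
    as the sampled scenario approaches a risky configuration."""
    return -np.sum((params - risky_point) ** 2)

def reinforce_search(risky_point, dim=2, iters=300, lr=0.05,
                     sigma=0.3, batch=32):
    """REINFORCE on a Gaussian scenario generator: the mean of the
    scenario-parameter distribution drifts toward high-reward
    (i.e. safety-critical) regions."""
    mu = np.zeros(dim)
    for _ in range(iters):
        eps = rng.normal(size=(batch, dim))
        params = mu + sigma * eps
        rewards = np.array([risk_reward(p, risky_point) for p in params])
        adv = rewards - rewards.mean()            # baseline lowers variance
        grad = (adv[:, None] * eps).mean(axis=0) / sigma
        mu += lr * grad
    return mu
```

Unlike grid search, this spends samples only where the current policy believes risk is high, which is the efficiency advantage the paper reports.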

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work models the online key-frame decision in dynamic video segmentation as a deep reinforcement learning problem and learns an efficient and effective scheduling policy from expert information about decision history and from the process of maximising global return.
Abstract: For real-time semantic video segmentation, most recent works utilised a dynamic framework with a key scheduler to make online key/non-key decisions. Some works used a fixed key scheduling policy, while others proposed adaptive key scheduling methods based on heuristic strategies, both of which may lead to suboptimal global performance. To overcome this limitation, we model the online key decision process in dynamic video segmentation as a deep reinforcement learning problem and learn an efficient and effective scheduling policy from expert information about decision history and from the process of maximising global return. Moreover, we study the application of dynamic video segmentation on face videos, a field that has not been investigated before. By evaluating on the 300VW dataset, we show that the performance of our reinforcement key scheduler outperforms that of various baselines in terms of both effective key selections and running speed. Further results on the Cityscapes dataset demonstrate that our proposed method can also generalise to other scenarios. To the best of our knowledge, this is the first work to use reinforcement learning for online key-frame decision in dynamic video segmentation, and also the first work on its application on face videos.

24 citations
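The key/non-key trade-off the abstract describes can be illustrated with tabular Q-learning on a toy environment: the state is the number of frames since the last key frame, segmentation accuracy decays as the cached result goes stale, and choosing a key frame resets staleness but pays a compute cost. All dynamics and constants here are invented, and the paper uses a deep RL scheduler rather than a table:

```python
import numpy as np

rng = np.random.default_rng(0)

def step(staleness, action, key_cost=0.5):
    """Toy key-decision environment. action 1 = key frame (run the
    full segmentation network: accurate but costly, resets staleness);
    action 0 = non-key (propagate: cheap, accuracy decays)."""
    if action == 1:
        return 0, 1.0 - key_cost
    new_s = min(staleness + 1, 9)
    accuracy = max(0.0, 1.0 - 0.2 * new_s)
    return new_s, accuracy

def q_learning(episodes=2000, horizon=30, alpha=0.1, gamma=0.9, eps=0.1):
    """Epsilon-greedy tabular Q-learning over the 10 staleness states."""
    Q = np.zeros((10, 2))
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r = step(s, a)
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

The learned policy propagates while the cached segmentation is fresh and re-runs the network once it is stale, which is exactly the adaptive behaviour a fixed-interval scheduler cannot express.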


Network Information
Related Topics (5)
Robustness (computer science): 94.7K papers, 1.6M citations (88% related)
Artificial neural network: 207K papers, 4.5M citations (88% related)
Deep learning: 79.8K papers, 2.1M citations (88% related)
Optimization problem: 96.4K papers, 2.1M citations (86% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Performance Metrics
No. of papers in the topic in previous years:

Year | Papers
2025 | 1
2024 | 3
2023 | 7,164
2022 | 13,747
2021 | 8,484
2020 | 8,703