Horia Mania
Researcher at University of California, Berkeley
Publications - 28
Citations - 2229
Horia Mania is an academic researcher from the University of California, Berkeley. The author has contributed to research in topics: Linear-quadratic regulator & Regret. The author has an h-index of 17 and has co-authored 27 publications receiving 1668 citations.
Papers
Journal ArticleDOI
On the Sample Complexity of the Linear Quadratic Regulator
TL;DR: This paper proposes a multi-stage procedure that estimates a model from a few experimental trials, estimates the error in that model with respect to the truth, and then designs a controller using both the model and uncertainty estimate, and provides end-to-end bounds on the relative error in control cost.
Posted Content
Simple random search provides a competitive approach to reinforcement learning
TL;DR: This work introduces a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the benchmark MuJoCo locomotion tasks.
Proceedings Article
Simple random search of static linear policies is competitive for reinforcement learning
TL;DR: This work introduces a model-free random search algorithm for training static, linear policies for continuous control problems and evaluates the performance of this method over hundreds of random seeds and many different hyperparameter configurations for each benchmark task.
Proceedings Article
Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator
TL;DR: This paper proposes a provably polynomial-time algorithm that achieves sub-linear regret for adaptive control of the Linear Quadratic Regulator (LQR).
Proceedings Article
Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
TL;DR: In this paper, the authors show that the OLS estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory, using a generalization of Mendelson's small-ball method to dependent data, eschewing the use of standard mixing-time arguments.