Author

Eugene Vinitsky

Bio: Eugene Vinitsky is an academic researcher from University of California, Berkeley. The author has contributed to research in the topics of Reinforcement learning & Computer science. The author has an h-index of 13 and has co-authored 30 publications receiving 625 citations. Previous affiliations of Eugene Vinitsky include University of California & University of Delaware.

Papers
Posted Content
16 Oct 2017
TL;DR: This work uses Flow to develop reliable controllers for complex problems, such as controlling mixed-autonomy traffic (involving both autonomous and human-driven vehicles) in a ring road, and shows that even simple neural network policies can solve the stabilization task across density settings and generalize to out-of-distribution settings.
Abstract: Flow is a new computational framework, built to support a key need triggered by the rapid growth of autonomy in ground traffic: controllers for autonomous vehicles in the presence of complex nonlinear dynamics in traffic. Leveraging recent advances in deep Reinforcement Learning (RL), Flow enables the use of RL methods such as policy gradient for traffic control and enables benchmarking the performance of classical (including hand-designed) controllers with learned policies (control laws). Flow integrates the traffic microsimulator SUMO with the deep reinforcement learning library rllab and enables the easy design of traffic tasks, including different network configurations and vehicle dynamics. We use Flow to develop reliable controllers for complex problems, such as controlling mixed-autonomy traffic (involving both autonomous and human-driven vehicles) in a ring road. For this, we first show that state-of-the-art hand-designed controllers excel when in-distribution, but fail to generalize; then, we show that even simple neural network policies can solve the stabilization task across density settings and generalize to out-of-distribution settings.
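
The ring-road stabilization task is easy to picture as a small MDP. Below is a minimal sketch, not Flow's actual API: a gym-style ring road in which one RL-controlled vehicle drives among simplified-IDM human-driver models. The class name, dynamics simplifications, and all parameter values are illustrative assumptions.

```python
# Minimal sketch (not Flow's actual API) of a ring-road stabilization MDP:
# one RL-controlled vehicle among simplified-IDM human drivers.
import numpy as np

class RingRoadEnv:
    def __init__(self, n_vehicles=22, length=230.0, dt=0.1):
        self.n, self.length, self.dt = n_vehicles, length, dt
        self.reset()

    def reset(self):
        # Evenly spaced vehicles at rest; vehicle 0 is the autonomous vehicle.
        self.pos = np.linspace(0.0, self.length, self.n, endpoint=False)
        self.vel = np.zeros(self.n)
        return self._obs()

    def _headways(self):
        # Gap to the leader (vehicle i+1, wrapping around the ring).
        return (np.roll(self.pos, -1) - self.pos) % self.length

    def _obs(self):
        # The AV observes its own speed, its leader's speed, and its headway.
        return np.array([self.vel[0], self.vel[1], self._headways()[0]])

    def step(self, av_accel):
        h = np.maximum(self._headways(), 0.1)
        lead_vel = np.roll(self.vel, -1)
        # Simplified IDM accelerations for the human-driven vehicles.
        v0, T, a, b, s0 = 30.0, 1.0, 1.0, 1.5, 2.0
        s_star = s0 + self.vel * T + self.vel * (self.vel - lead_vel) / (2 * np.sqrt(a * b))
        accel = a * (1.0 - (self.vel / v0) ** 4 - (s_star / h) ** 2)
        accel[0] = float(av_accel)  # the RL action overrides the AV's acceleration
        self.vel = np.maximum(self.vel + accel * self.dt, 0.0)
        self.pos = (self.pos + self.vel * self.dt) % self.length
        # System-level reward: average network speed, as in the stabilization task.
        return self._obs(), self.vel.mean(), False, {}
```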

153 citations

23 Oct 2018
TL;DR: New benchmarks in the use of deep reinforcement learning to create controllers for mixed-autonomy traffic, where connected and autonomous vehicles (CAVs) interact with human drivers and infrastructure are released.
Abstract: We release new benchmarks in the use of deep reinforcement learning (RL) to create controllers for mixed-autonomy traffic, where connected and autonomous vehicles (CAVs) interact with human drivers and infrastructure. Benchmarks, such as MuJoCo or the Arcade Learning Environment, have spurred new research by enabling researchers to effectively compare their results so that they can focus on algorithmic improvements and control techniques rather than system design. To promote similar advances in traffic control via RL, we propose four benchmarks, based on three new traffic scenarios, illustrating distinct reinforcement learning problems with applications to mixed-autonomy traffic. We provide an introduction to each control problem, an overview of their MDP structures, and preliminary performance results from commonly used RL algorithms. For the purpose of reproducibility, the benchmarks, reference implementations, and tutorials are available at https://github.com/flow-project/flow.

96 citations

18 Oct 2017
TL;DR: The present article formulates and approaches the mixed-autonomy traffic control problem using the powerful framework of deep reinforcement learning (RL) to provide insight into the potential for automating traffic through mixed fleets of automated and manned vehicles.
Abstract: Traffic dynamics are often modeled by complex dynamical systems for which classical analysis tools can struggle to provide tractable policies used by transportation agencies and planners. In light of the introduction of automated vehicles into transportation systems, there is a new need for understanding the impacts of automation on transportation networks. The present article formulates and approaches the mixed-autonomy traffic control problem (where both automated and human-driven vehicles are present) using the powerful framework of deep reinforcement learning (RL). The resulting policies and emergent behaviors in mixed-autonomy traffic settings provide insight into the potential for automation of traffic through mixed fleets of automated and manned vehicles. Model-free learning methods are shown to naturally select policies and behaviors previously designed by model-driven approaches, such as stabilization and platooning, known to improve ring road efficiency and even to exceed a theoretical velocity limit. Remarkably, RL succeeds at maximizing velocity by effectively leveraging the structure of the human driving behavior to form an efficient vehicle spacing for an intersection network. We describe our results in the context of existing control theoretic results for stability analysis and mixed-autonomy analysis. This article additionally introduces state equivalence classes to improve the sample complexity of the learning methods.
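
The state equivalence classes mentioned in the abstract can be illustrated with a hedged sketch (an illustration of the idea, not the paper's exact construction): on a ring road the dynamics are invariant under rotations of the track, so all states that differ only by a rotation can be mapped to one canonical representative before being fed to the learner, shrinking the effective state space.

```python
# Hedged sketch of a state equivalence class on a ring road: states that
# differ only by a rigid rotation of the track are dynamically identical,
# so each raw state is mapped to a canonical representative. The specific
# canonicalization below is an illustrative assumption.
import numpy as np

def canonicalize(positions, velocities, av_index, ring_length):
    """Return the equivalence-class representative of a ring-road state."""
    # Rotate the ring so the autonomous vehicle sits at position 0 ...
    shifted = (positions - positions[av_index]) % ring_length
    # ... then reorder vehicles by their shifted position, AV first.
    order = np.argsort(shifted)
    return shifted[order], velocities[order]
```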

91 citations

Posted Content
TL;DR: This work proposes Unsupervised Environment Design (UED) as an alternative paradigm, where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments.
Abstract: A wide range of reinforcement learning (RL) problems - including robustness, transfer learning, unsupervised RL, and emergent complexity - require specifying a distribution of tasks or environments in which a policy will be trained. However, creating a useful distribution of environments is error prone, and takes a significant amount of developer time and effort. We propose Unsupervised Environment Design (UED) as an alternative paradigm, where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments. Existing approaches to automatically generating environments suffer from common failure modes: domain randomization cannot generate structure or adapt the difficulty of the environment to the agent's learning progress, and minimax adversarial training leads to worst-case environments that are often unsolvable. To generate structured, solvable environments for our protagonist agent, we introduce a second, antagonist agent that is allied with the environment-generating adversary. The adversary is motivated to generate environments which maximize regret, defined as the difference between the protagonist and antagonist agent's return. We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED). Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
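
The regret objective at the heart of PAIRED is compact enough to sketch. The following is an illustrative rendering, not the authors' code; `rollout` is an assumed helper that evaluates a policy in an environment and returns its episodic return.

```python
# Illustrative sketch of the PAIRED regret signal (not the authors' code).
# The adversary proposes environment parameters; both agents are evaluated
# in the resulting environment; the adversary is rewarded with the regret.
def paired_regret(env_params, protagonist, antagonist, rollout):
    """rollout(policy, env_params) -> episodic return (assumed helper)."""
    protagonist_return = rollout(protagonist, env_params)
    antagonist_return = rollout(antagonist, env_params)
    # Positive regret means the environment is solvable (the antagonist
    # solves it) yet still hard for the protagonist -- the frontier where
    # curriculum environments are most useful. The adversary and antagonist
    # maximize this quantity; the protagonist minimizes it by maximizing
    # its own return.
    return antagonist_return - protagonist_return
```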

87 citations

Proceedings ArticleDOI
01 Nov 2018
TL;DR: Using deep reinforcement learning, novel control policies for autonomous vehicles are derived to improve the throughput of a bottleneck modeled after the San Francisco-Oakland Bay Bridge and it is shown that the AV controller provides comparable performance to ramp metering without the need to build new Ramp metering infrastructure.
Abstract: Using deep reinforcement learning, we derive novel control policies for autonomous vehicles to improve the throughput of a bottleneck modeled after the San Francisco-Oakland Bay Bridge. Using Flow, a new library for applying deep reinforcement learning to traffic micro-simulators, we consider the problem of improving the throughput of a traffic benchmark: a two-stage bottleneck where four lanes reduce to two and then reduce to one. We first characterize the inflow-outflow curve of this bottleneck without any control. We introduce an inflow of autonomous vehicles with the intent of mitigating the congestion through Lagrangian control. To handle the varying number of autonomous vehicles in the system we derive a per-lane variable speed limit parametrization of the controller. We demonstrate that a 10% penetration rate of controlled autonomous vehicles can improve the throughput of the bottleneck by 200 vehicles per hour: a 25% improvement at high inflows. Finally, we compare the performance of our control policies to feedback ramp metering and show that the AV controller provides comparable performance to ramp metering without the need to build new ramp metering infrastructure. Illustrative videos of the results can be found at https://sites.google.com/view/itsc-lagrangian-avs/home and code and tutorials can be found at https://github.com/flow-project/flow.
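
A hedged sketch of the per-lane variable-speed-limit parametrization described above (the idea, not the paper's implementation): the policy emits one target speed per lane, and every AV currently in a lane tracks that lane's target, so the action dimension stays fixed regardless of how many AVs are in the system. Function and parameter names are illustrative.

```python
# Hedged sketch of a per-lane variable-speed-limit action (illustrative,
# not the paper's implementation): a fixed-size action vector of lane
# target speeds is shared by however many AVs occupy each lane.
import numpy as np

def apply_lane_speed_limits(lane_targets, av_lanes, av_speeds,
                            max_accel=1.0, dt=0.1):
    """lane_targets: target speed per lane; av_lanes: lane index of each AV."""
    targets = lane_targets[av_lanes]  # each AV's lane target speed
    # Simple proportional tracking of the lane's speed limit, with the
    # commanded acceleration clipped to the vehicle's limits.
    accel = np.clip((targets - av_speeds) / dt, -max_accel, max_accel)
    return av_speeds + accel * dt
```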

78 citations


Cited by
22 Jan 2013
TL;DR: The premises for creating an Internet portal that gives participants in the educational and scientific process access to the joint creation, consolidation, concentration, and rapid dissemination of educational and scientific information resources in its own repository are considered.

Abstract: The premises for creating an Internet portal designed to give participants in the educational and scientific process access to the joint creation, consolidation, concentration, and rapid dissemination of educational and scientific information resources in its own repository are considered. The potential of CMS-based portal content management systems is investigated. An architecture for an Internet portal for the information resources of the MES of Ukraine is proposed.

969 citations

Journal ArticleDOI
TL;DR: This review summarises deep reinforcement learning (DRL) algorithms, provides a taxonomy of automated driving tasks where (D)RL methods have been employed, highlights the key challenges, both algorithmic and in the deployment of real-world autonomous driving agents, discusses the role of simulators in training agents, and covers methods to evaluate, test, and robustify existing solutions in RL and imitation learning.
Abstract: With the development of deep representation learning, the domain of reinforcement learning (RL) has become a powerful learning framework, now capable of learning complex policies in high-dimensional environments. This review summarises deep reinforcement learning (DRL) algorithms and provides a taxonomy of automated driving tasks where (D)RL methods have been employed, while addressing key computational challenges in the real-world deployment of autonomous driving agents. It also delineates adjacent domains such as behavior cloning, imitation learning, and inverse reinforcement learning, which are related but are not classical RL algorithms. The role of simulators in training agents and methods to validate, test, and robustify existing solutions in RL are discussed.

740 citations

Journal Article
TL;DR: This work introduces benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL, and releases benchmark tasks and datasets with a comprehensive evaluation of existing algorithms and an evaluation protocol together with an open-source codebase.
Abstract: The offline reinforcement learning (RL) problem, also known as batch RL, refers to the setting where a policy must be learned from a static dataset, without additional online data collection. This setting is compelling as it potentially allows RL methods to take advantage of large, pre-collected datasets, much like how the rise of large datasets has fueled results in supervised learning in recent years. However, existing online RL benchmarks are not tailored towards the offline setting, making progress in offline RL difficult to measure. In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL. Examples of such properties include: datasets generated via hand-designed controllers and human demonstrators, multi-objective datasets where an agent can perform different tasks in the same environment, and datasets consisting of mixtures of policies. To facilitate research, we release our benchmark tasks and datasets with a comprehensive evaluation of existing algorithms and an evaluation protocol together with an open-source codebase. We hope that our benchmark will focus research effort on methods that drive improvements not just on simulated tasks, but ultimately on the kinds of real-world problems where offline RL will have the largest impact.
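
The offline setting the benchmark targets is easy to state in code. Below is a minimal sketch under stated assumptions (the dataset fields and the use of behavior cloning as the simplest baseline are illustrative, not the benchmark's API): learning uses only logged transitions, with no further environment interaction.

```python
# Minimal sketch of learning from a static dataset (illustrative; not the
# benchmark's API). Behavior cloning is the simplest offline baseline: fit
# the logged actions directly, with no environment rollouts at any point.
import numpy as np

def behavior_cloning_loss(policy, batch):
    """batch: dict with 'observations' and 'actions' arrays (assumed fields)."""
    predicted = policy(batch["observations"])
    # Mean-squared error to the actions recorded in the dataset.
    return np.mean((predicted - batch["actions"]) ** 2)
```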

563 citations

Posted Content
TL;DR: From smart grids to disaster management, this work identifies high-impact problems where machine learning, in collaboration with other fields, can fill existing gaps and join the global effort against climate change.
Abstract: Climate change is one of the greatest challenges facing humanity, and we, as machine learning experts, may wonder how we can help. Here we describe how machine learning can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by machine learning, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the machine learning community to join the global effort against climate change.

441 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide an up-to-date compendium of the available results on superconducting hydrides and explain how the synergy of different methodologies led to extraordinary discoveries in the field.

265 citations