Ziyu Wang
Researcher at Google
Publications - 122
Citations - 15809
Ziyu Wang is an academic researcher at Google. The author has contributed to research on topics including reinforcement learning and computer science, has an h-index of 35, and has co-authored 98 publications receiving 10745 citations. Previous affiliations of Ziyu Wang include the University of Oxford and Wuhan University.
Papers
Journal Article
Taking the Human Out of the Loop: A Review of Bayesian Optimization
TL;DR: This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.
Journal Article
Grandmaster level in StarCraft II using multi-agent reinforcement learning.
Oriol Vinyals, Igor Babuschkin, Wojciech Marian Czarnecki, Michael Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi, Richard E. Powell, Timo Ewalds, Petko Georgiev, Junhyuk Oh, Dan Horgan, Manuel Kroiss, Ivo Danihelka, Aja Huang, Laurent Sifre, Trevor Cai, John P. Agapiou, Max Jaderberg, Alexander Vezhnevets, Rémi Leblond, Tobias Pohlen, Valentin Dalibard, David Budden, Yury Sulsky, James Molloy, Tom Le Paine, Caglar Gulcehre, Ziyu Wang, Tobias Pfaff, Yuhuai Wu, Roman Ring, Dani Yogatama, Dario Wünsch, Katrina McKinney, Oliver Smith, Tom Schaul, Timothy P. Lillicrap, Koray Kavukcuoglu, Demis Hassabis, Chris Apps, David Silver +41 more
TL;DR: This paper evaluates AlphaStar, an agent trained with a multi-agent reinforcement learning algorithm that reached Grandmaster level, ranking among the top 0.2% of human players in the real-time strategy game StarCraft II.
Posted Content
Dueling Network Architectures for Deep Reinforcement Learning
TL;DR: This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state of the art on the Atari 2600 domain.
Proceedings Article
Dueling network architectures for deep reinforcement learning
TL;DR: This paper proposes a dueling network that represents two separate estimators, one for the state value function and one for the state-dependent action advantage function, which leads to better policy evaluation in the presence of many similar-valued actions.
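As a rough illustration of how two such estimators can be combined into action values, the sketch below adds a scalar state value to per-action advantages after subtracting the mean advantage. The mean-subtracted form is the aggregation commonly associated with this architecture and is not stated in the summary above; the function name `dueling_q_values` and the example numbers are hypothetical, and a real network would produce the two inputs from learned layers.

```python
from statistics import fmean


def dueling_q_values(state_value, advantages):
    """Combine a state value V(s) and advantages A(s, a) into Q(s, a).

    Assumed aggregation (not taken from the summary above):
        Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a'))
    Subtracting the mean makes Q unchanged when every advantage
    is shifted by the same constant.
    """
    mean_advantage = fmean(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


# Hypothetical outputs of the two estimators for one state with three actions.
q_values = dueling_q_values(2.0, [0.5, -0.5, 0.0])
print(q_values)
```

Because only the differences between advantages matter in this form, the value stream carries the overall level of the state while the advantage stream only has to rank the actions.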
Posted Content
Emergence of Locomotion Behaviours in Rich Environments
Nicolas Heess, Dhruva Tb, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin Riedmiller, David Silver +11 more
TL;DR: This paper explores how a rich environment can help to promote the learning of complex behaviour, and finds that this encourages the emergence of robust behaviours that perform well across a suite of tasks.