Hengyuan Hu
Researcher at Facebook
Publications - 29
Citations - 1228
Hengyuan Hu is an academic researcher at Facebook. The author has contributed to research on computer science and reinforcement learning. The author has an h-index of 7 and has co-authored 17 publications receiving 761 citations. Previous affiliations of Hengyuan Hu include Carnegie Mellon University.
Papers
Posted Content
Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures
TL;DR: This paper introduces network trimming which iteratively optimizes the network by pruning unimportant neurons based on analysis of their outputs on a large dataset, inspired by an observation that the outputs of a significant portion of neurons in a large network are mostly zero.
Posted Content
"Other-Play" for Zero-Shot Coordination
TL;DR: This work introduces a novel learning algorithm called other-play (OP) that enhances self-play by looking for more robust strategies, exploiting the presence of known symmetries in the underlying problem.
Journal ArticleDOI
Human-level play in the game of Diplomacy by combining language models with strategic reasoning
Anton Bakhtin,Nick Brown,Emily Dinan,Gabriele Farina,Colin Flaherty,D. S. Fried,Andrew Goff,Jonathan Gray,Hengyuan Hu,Athul Paul Jacob,Mojtaba Komeili,Karthik Konath,Minae Kwon,Adam Lerer,Mike Lewis,Alexander L. Miller,S. Mitts,Adithya Renduchintala,Stephen Roller,Dirk Rowe,Weiyan Shi,Joe Spisak,Alexander Wei,David J. Wu,Hugh Zhang,Markus Zijlstra +25 more
TL;DR: This paper introduces Cicero, the first AI agent to achieve human-level performance in Diplomacy, a strategy game involving both cooperation and competition that emphasizes natural language negotiation and tactical coordination between seven players.
Proceedings Article
Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning
Hengyuan Hu,Jakob Foerster +1 more
TL;DR: This paper presents a new deep multi-agent RL method, the Simplified Action Decoder (SAD), which exploits the centralized training phase to resolve the contradiction between exploring and taking actions that are informative to other agents, and establishes a new SOTA among learning methods for 2-5 players on the self-play part of the Hanabi challenge.
Posted Content
Hierarchical Decision Making by Generating and Following Natural Language Instructions
TL;DR: Experiments show that models using natural language as a latent variable significantly outperform models that directly imitate human actions, and that the compositional structure of language proves crucial for action representation.