Masa-aki Sato
Researcher at Honda
Publications - 22
Citations - 1414
Masa-aki Sato is an academic researcher at Honda. The author has contributed to research on reinforcement learning and Bayes' theorem, has an h-index of 9, and has co-authored 22 publications receiving 1293 citations. Previous affiliations of Masa-aki Sato include Hirosaki University.
Papers
Journal Article
A Bayesian missing value estimation method for gene expression profile data
TL;DR: While the estimation performance of existing methods depends on model parameters whose determination is difficult, the BPCA method is free from this difficulty, and provides accurate and convenient estimation for missing values.
Journal Article
Hierarchical Bayesian estimation for MEG inverse problem.
Masa-aki Sato, Taku Yoshioka, Shigeki Kajihara, Keisuke Toyama, Naokazu Goda, Kenji Doya, Mitsuo Kawato
TL;DR: Simulation results demonstrate that the proposed new hierarchical Bayesian method appropriately resolves the inverse problem even if fMRI data convey inaccurate information, while the Wiener filter method is seriously deteriorated by inaccurate fMRI information.
Journal Article
Reinforcement learning for a biped robot based on a CPG-actor-critic method
TL;DR: Computer simulations show that training of the CPG can be successfully performed by the proposed CPG-actor-critic method, thus allowing the biped robot to not only walk stably but also adapt to environmental changes.
Journal Article
Learning CPG-based biped locomotion with a policy gradient method
TL;DR: Numerical simulations demonstrate that appropriate sensory feedback in the CPG-based control architecture can be acquired using the proposed method within a thousand trials, and suggest that the acquired controllers are robust against environmental changes and variations in the mass properties of the robot.
Proceedings Article
Reinforcement learning for a CPG-driven biped robot
TL;DR: This study proposes a learning scheme for a CPG controller called a CPG-actor-critic model, whose learning algorithm is based on a policy gradient method, and applies this method to autonomous acquisition of biped locomotion by a biped robot simulator.