Yuyun Liao
Researcher at Intel
Publications - 8
Citations - 2537
Yuyun Liao is an academic researcher from Intel. The author has contributed to research in topics: Wallace tree & Voltage regulator. The author has an h-index of 7 and has co-authored 8 publications receiving 1490 citations.
Papers
Journal ArticleDOI
Loihi: A Neuromorphic Manycore Processor with On-Chip Learning
Michael Davies,Narayan Srinivasa,Tsung-Han Lin,Gautham N. Chinya,Cao Yongqiang,Sri Harsha Choday,Georgios D. Dimou,Prasad Joshi,Nabil Imam,Shweta Jain,Yuyun Liao,Chit-Kwan Lin,Andrew Lines,Ruokun Liu,Deepak A. Mathaikutty,Steven McCoy,Arnab Paul,Jonathan Tse,Guruguhanathan Venkataramanan,Yi-Hsin Weng,Andreas Wild,Yoon Seok Yang,Hong Wang +22 more
TL;DR: Loihi is a 60-mm² chip fabricated in Intel's 14-nm process that advances the state-of-the-art modeling of spiking neural networks in silicon, and can solve LASSO optimization problems with over three orders of magnitude superior energy-delay product compared to conventional solvers running on a CPU iso-process/voltage/area.
Patent
Processing multiply-accumulate operations in a single cycle
TL;DR: A multiply-accumulate unit (MAC) is described that can perform Wallace tree and carry look-ahead adder functions simultaneously for different operations.
Journal ArticleDOI
A high-performance and low-power 32-bit multiply-accumulate unit with single-instruction-multiple-data (SIMD) feature
Yuyun Liao,D.B. Roberts +1 more
TL;DR: A high-performance and low-power 32-bit multiply-accumulate unit (MAC) is described in this paper, which leverages the advantage of a 16-bit encoding scheme without adding extra delay to the faster four-stage Wallace tree of a 12-bit encoding scheme.
Patent
Fast 16-B early termination implementation for 32-B multiply-accumulate unit
Yuyun Liao,David B. Roberts +1 more
TL;DR: A mixed-length encoding unit with a 16-b Booth encoder coupled with a four-stage Wallace tree is described. The authors do not specify the number of vectors to be produced.
Proceedings ArticleDOI
An energy-efficient graphics processor featuring fine-grain DVFS with integrated voltage regulators, execution-unit turbo, and retentive sleep in 14nm tri-gate CMOS
Pascal Meinerzhagen,Carlos Tokunaga,Andres Malavasi,Vaibhav Vaidya,Ashwin Mendon,Deepak A. Mathaikutty,Jaydeep P. Kulkarni,Charles Augustine,Minki Cho,Stephen Kim,George E. Matthew,Rinkle Jain,Joseph F. Ryan,Chung-Ching Peng,Somnath Paul,Sriram R. Vangal,Brando Perez Esparza,Luis Cuellar,Michael Woodman,Bala Iyer,Subramaniam Maiyuran,Gautham N. Chinya,Chris Zou,Yuyun Liao,Krishnan Ravichandran,Hong Wang,Muhammad M. Khellah,James W. Tschanz,Vivek De +28 more
TL;DR: Coarse-grain DVFS, driven by a power-management IC (PMIC) that sets a shared rail voltage (VIN), does not allow a performance-critical unit to use a higher V/F on demand without an energy penalty for the rest of the GPU.