Mercedes Tan
Researcher at Google
Publications - 3
Citations - 5775
Mercedes Tan is an academic researcher from Google. The author has contributed to research in the topics of Central processing unit and Augmented reality, has an h-index of 3, and has co-authored 3 publications receiving 4334 citations.
Papers
Posted Content
In-Datacenter Performance Analysis of a Tensor Processing Unit
Norman P. Jouppi,Cliff Young,Nishant Patil,David A. Patterson,Gaurav Agrawal,Raminder Bajwa,Sarah Bates,Suresh Bhatia,Nan Boden,Albert T. Borchers,Rick Boyle,Pierre-luc Cantin,Clifford Chao,Christopher Aaron Clark,Jeremy Coriell,Michael J. Daley,Matt Dau,Jeffrey Dean,Ben Gelb,Tara Vazir Ghaemmaghami,Rajendra Gottipati,William John Gulland,Robert Hagmann,C. Richard Ho,Doug Hogberg,John Hu,Robert Hundt,D. Hurt,Julian Ibarz,Aaron Jaffey,Alek Jaworski,Alexander Kaplan,Khaitan Harshit,Andy Koch,Naveen Kumar,Steve Lacy,James Laudon,James Law,Diemthu Le,Chris Leary,Zhuyuan Liu,Kyle Lucke,Alan Lundin,Gordon MacKean,Adriana Maggiore,Maire Mahony,Kieran Miller,Rahul Nagarajan,Ravi Narayanaswami,Ray Ni,Kathy Nix,Thomas Norrie,Mark Omernick,Narayana Penukonda,Andrew Everett Phelps,Jonathan Ross,Matt Ross,Amir Salek,Emad Samadiani,Chris Severn,Gregory Sizikov,Matthew Snelham,Jed Souter,Dan Steinberg,Andy Swing,Mercedes Tan,Gregory Michael Thorson,Bo Tian,Horia Toma,Erick Tuttle,Vijay K. Vasudevan,Richard Walter,Walter Wang,Eric Wilcox,Doe Hyun Yoon +74 more
TL;DR: This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN), and compares it to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters.
Proceedings ArticleDOI
In-Datacenter Performance Analysis of a Tensor Processing Unit
Norman P. Jouppi,Cliff Young,Nishant Patil,David A. Patterson,Gaurav Agrawal,Raminder Bajwa,Sarah Bates,Suresh Bhatia,Nan Boden,Albert T. Borchers,Rick Boyle,Pierre-luc Cantin,Clifford Chao,Christopher Aaron Clark,Jeremy Coriell,Michael J. Daley,Matt Dau,Jeffrey Dean,Ben Gelb,Tara Vazir Ghaemmaghami,Rajendra Gottipati,William John Gulland,Robert Hagmann,C. Richard Ho,Doug Hogberg,John Hu,Robert Hundt,D. Hurt,Julian Ibarz,Aaron Jaffey,Alek Jaworski,Alexander Kaplan,Khaitan Harshit,Daniel Killebrew,Andy Koch,Naveen Kumar,Steve Lacy,James Laudon,James Law,Diemthu Le,Chris Leary,Zhuyuan Liu,Kyle Lucke,Alan Lundin,Gordon MacKean,Adriana Maggiore,Maire Mahony,Kieran Miller,Rahul Nagarajan,Ravi Narayanaswami,Ray Ni,Kathy Nix,Thomas Norrie,Mark Omernick,Narayana Penukonda,Andrew Everett Phelps,Jonathan Ross,Matt Ross,Amir Salek,Emad Samadiani,Chris Severn,Gregory Sizikov,Matthew Snelham,Jed Souter,Dan Steinberg,Andy Swing,Mercedes Tan,Gregory Michael Thorson,Bo Tian,Horia Toma,Erick Tuttle,Vijay K. Vasudevan,Richard Walter,Walter Wang,Eric Wilcox,Doe Hyun Yoon +75 more
TL;DR: The Tensor Processing Unit (TPU) as discussed by the authors is a custom ASIC deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN) using a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS).
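The quoted peak of 92 TOPS follows directly from the matrix unit's dimensions. A minimal sanity check, assuming the 700 MHz clock rate reported in the paper and counting each multiply-accumulate as two operations:

```python
# Back-of-the-envelope check of the TPU's peak throughput figure.
# Assumptions: a 256x256 systolic array of 8-bit MACs (65,536 units)
# clocked at 700 MHz, with each MAC counted as 2 ops (multiply + add).
macs = 256 * 256            # 65,536 multiply-accumulate units
clock_hz = 700e6            # 700 MHz clock (assumed from the paper)
ops_per_mac = 2             # one multiply and one add per cycle

peak_tops = macs * ops_per_mac * clock_hz / 1e12
print(f"{peak_tops:.2f} TOPS")  # about 92 TOPS
```

Rounding 91.75 up gives the 92 TeraOps/second peak cited in the abstract.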
Proceedings ArticleDOI
Warehouse-scale video acceleration: co-design and deployment in the wild
Parthasarathy Ranganathan,Daniel Stodolsky,Jeff Calow,Jeremy Dorfman,Marisabel Guevara,Clinton Wills Smullen,Aki Kuusela,Raghu Balasubramanian,Sandeep Bhatia,Prakash Chauhan,Anna Cheung,In Suk Chong,Niranjani Dasharathi,Jia Feng,Brian Fosco,Samuel Foss,Ben Gelb,Sara J. Gwin,Yoshiaki Hase,He Dake,C. Richard Ho,Roy W. Huffman,Elisha Indupalli,Indira Jayaram,Poonacha Kongetira,Cho Mon Kyaw,Aaron Laursen,Yuan Li,Fong Lou,Kyle Lucke,JP Maaninen,Ramon Macias,Maire Mahony,David Alexander Munday,Srikanth Muroor,Narayana Penukonda,Eric Perkins-Argueta,Devin Persaud,Alex Ramirez,Ville-Mikko Rautio,Yolanda Ripley,Amir Salek,Sathish Sekar,Sergey N. Sokolov,Robert Springer,Don Stark,Mercedes Tan,Mark S. Wachsler,Andrew C. Walton,David A. Wickeraad,Alvin Wijaya,Hon Kwan Wu +51 more
TL;DR: In this paper, the authors describe the design and deployment of a new accelerator targeted at warehouse-scale video transcoding, and discuss key design trade-offs for balanced systems at data center scale and co-designing accelerators with large-scale distributed software systems.