Stephen F. Heil
Researcher at Microsoft
Publications - 24
Citations - 3141
Stephen F. Heil is an academic researcher from Microsoft. The author has contributed to research in topics: Hardware acceleration & Component (UML). The author has an h-index of 13 and has co-authored 24 publications receiving 2679 citations. Previous affiliations of Stephen F. Heil include Unisys.
Papers
Journal Article
A reconfigurable fabric for accelerating large-scale datacenter services
Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Jan Gray, Michael Haselman, Scott Hauck, Stephen F. Heil, Amir Hormati, Joo-Young Kim, Sitaram Lanka, James R. Larus, Eric C. Peterson, Simon Pope, Aaron L. Smith, Jason Thong, Phillip Yi Xiao, Doug Burger
TL;DR: The authors deployed the reconfigurable fabric in a bed of 1,632 servers and FPGAs in a production datacenter and successfully used it to accelerate the ranking portion of the Bing Web search engine by nearly a factor of two.
Journal Article
A reconfigurable fabric for accelerating large-scale datacenter services
Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Jan Gray, Michael Haselman, Scott Hauck, Stephen F. Heil, Amir Hormati, Joo-Young Kim, Sitaram Lanka, James R. Larus, Eric C. Peterson, Simon Pope, Aaron L. Smith, Jason Thong, Phillip Yi Xiao, Doug Burger
TL;DR: The requirements and architecture of the fabric are described, the critical engineering challenges and solutions needed to make the system robust in the presence of failures are detailed, and the performance, power, and resilience of the system when ranking candidate documents are measured.
Proceedings Article
A cloud-scale acceleration architecture
Adrian M. Caulfield, Eric S. Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen F. Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael K. Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger
TL;DR: This paper presents a new cloud architecture that uses reconfigurable logic to accelerate both network plane functions and applications and is much more scalable than prior work, which used secondary rack-scale networks for inter-FPGA communication.
Proceedings Article
A configurable cloud-scale DNN processor for real-time AI
Jeremy Fowers, Kalin Ovtcharov, Michael K. Papamichael, Todd Massengill, Ming Liu, Daniel Lo, Shlomi Alkalay, Michael Haselman, Logan Adams, Mahdi Ghandi, Stephen F. Heil, Prerak Patel, Adam Sapek, Gabriel Weisz, Lisa Woods, Sitaram Lanka, Steven K. Reinhardt, Adrian M. Caulfield, Eric S. Chung, Doug Burger
TL;DR: This paper describes the NPU architecture for Project Brainwave, a production-scale system for real-time AI, which achieves more than an order of magnitude improvement in latency and throughput over state-of-the-art GPUs on large RNNs at a batch size of 1.
Journal Article
Serving DNNs in Real Time at Datacenter Scale with Project Brainwave
Eric S. Chung, Jeremy Fowers, Kalin Ovtcharov, Michael K. Papamichael, Adrian M. Caulfield, Todd Massengill, Ming Liu, Daniel Lo, Shlomi Alkalay, Michael Haselman, Maleen Abeydeera, Logan Adams, Hari Angepat, Christian Boehn, Derek Chiou, Oren Firestein, Alessandro Forin, Kang Su Gatlin, Mahdi Ghandi, Stephen F. Heil, Kyle Holohan, Ahmad M. El Husseini, Tamas Juhasz, Kara Kagi, Ratna Kumar Kovvuri, Sitaram Lanka, Friedel van Megen, Dima Mukhortov, Prerak Patel, Brandon Perez, Amanda Rapsang, Steven K. Reinhardt, Bita Darvish Rouhani, Adam Sapek, Raja Seera, Sangeetha Shekar, Balaji Sridharan, Gabriel Weisz, Lisa Woods, Phillip Yi Xiao, Dan Zhang, Ritchie Zhao, Doug Burger
TL;DR: Project Brainwave, Microsoft's principal infrastructure for AI serving in real time, accelerates deep neural network inferencing in major services such as Bing's intelligent search features and Azure by exploiting distributed model parallelism and pinning over low-latency hardware microservices.