Michael J. Klaiber
Researcher at Bosch
Publications - 27
Citations - 337
Michael J. Klaiber is an academic researcher from Bosch. The author has contributed to research in topics: Hardware architecture & Connected component. The author has an h-index of 8, co-authored 27 publications receiving 223 citations. Previous affiliations of Michael J. Klaiber include IBM & University of Stuttgart.
Papers
Proceedings Article
A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference
Bruce M. Fleischer,Sunil Shukla,Matthew M. Ziegler,Joel Abraham Silberman,Jinwook Oh,Vijayalakshmi Srinivasan,Jungwook Choi,Silvia Melitta Mueller,Ankur Agrawal,Tina Babinsky,Nianzheng Cao,Chia-Yu Chen,Pierce Chuang,Thomas W. Fox,George D. Gristede,Michael A. Guillorn,Howard M. Haynie,Michael J. Klaiber,Dongsoo Lee,Shih-Hsien Lo,Gary W. Maier,Michael R. Scheuermann,Swagath Venkataramani,Christos Vezyrtzis,Naigang Wang,Fanchieh Yee,Ching Zhou,Pong-Fei Lu,Brian W. Curran,Leland Chang,Kailash Gopalakrishnan
TL;DR: A multi-TOPS AI core is presented for acceleration of deep learning training and inference in systems from edge devices to data centers by employing a dataflow architecture and an on-chip scratchpad hierarchy.
Journal Article
A Resource-Efficient Hardware Architecture for Connected Component Analysis
TL;DR: A resource-efficient hardware architecture for connected component analysis (CCA) of streamed video data is presented. It reduces the required hardware resources, especially for larger image widths, making it possible to realize an architecture that processes video streams of larger image sizes.
Journal Article
Efficient AI System Design With Cross-Layer Approximate Computing
Swagath Venkataramani,Xiao Sun,Naigang Wang,Chia-Yu Chen,Jungwook Choi,Mingu Kang,Ankur Agrawal,Jinwook Oh,Shubham Jain,Tina Babinsky,Nianzheng Cao,Thomas W. Fox,Bruce M. Fleischer,George D. Gristede,Michael A. Guillorn,Howard M. Haynie,Hiroshi Inoue,Kazuaki Ishizaki,Michael J. Klaiber,Shih-Hsien Lo,Gary W. Maier,Silvia Melitta Mueller,Michael R. Scheuermann,Eri Ogawa,Marcel Schaal,Mauricio J. Serrano,Joel Abraham Silberman,Christos Vezyrtzis,Wei Wang,Fanchieh Yee,Jintao Zhang,Matthew M. Ziegler,Ching Zhou,Moriyoshi Ohara,Pong-Fei Lu,Brian W. Curran,Sunil Shukla,Vijayalakshmi Srinivasan,Leland Chang,Kailash Gopalakrishnan
TL;DR: RaPiD, a multi-tera operations per second (TOPS) AI hardware accelerator core built from the ground up using approximate computing (AxC) techniques across the stack, including algorithms, architecture, programmability, and hardware, is presented.
Journal Article
A Scalable Multi-TeraOPS Core for AI Training and Inference
Sunil Shukla,Bruce M. Fleischer,Matthew M. Ziegler,Joel Abraham Silberman,Jinwook Oh,Vijayalakshmi Srinivasan,Jungwook Choi,Silvia Melitta Mueller,Ankur Agrawal,Tina Babinsky,Nianzheng Cao,Chia-Yu Chen,Pierce Chuang,Thomas W. Fox,George D. Gristede,Michael A. Guillorn,Howard M. Haynie,Michael J. Klaiber,Dongsoo Lee,Shih-Hsien Lo,Gary W. Maier,Michael R. Scheuermann,Swagath Venkataramani,Christos Vezyrtzis,Naigang Wang,Fanchieh Yee,Ching Zhou,Pong-Fei Lu,Brian W. Curran,Leland Chang,Kailash Gopalakrishnan
TL;DR: This letter presents a multi-TOPS AI accelerator core for deep learning training and inference that achieves >90% sustained utilization across the range of neural network topologies by employing a dataflow architecture to provide high throughput and an on-chip scratchpad hierarchy to meet the bandwidth demands of the compute units.
Proceedings Article
A memory-efficient parallel single pass architecture for connected component labeling of streamed images
TL;DR: A scalable, parallel, memory-efficient single-pass algorithm for connected component labeling is proposed. For typical image sizes, it reduces the memory required by the hardware architecture by a factor of 100 or more compared to a recently proposed parallel connected component labeling algorithm.