Rangharajan Venkatesan
Researcher at Nvidia
Publications - 54
Citations - 3138
Rangharajan Venkatesan is an academic researcher at Nvidia. The author has contributed to research on topics including cache architecture and efficient energy use, has an h-index of 20, and has co-authored 50 publications receiving 2,127 citations. Previous affiliations of Rangharajan Venkatesan include Purdue University.
Papers
Proceedings ArticleDOI
SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks
Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, William J. Dally +8 more
TL;DR: The Sparse CNN (SCNN) accelerator employs a dataflow that maintains sparse weights and activations in a compressed encoding, eliminating unnecessary data transfers and reducing storage requirements.
Proceedings ArticleDOI
Timeloop: A Systematic Approach to DNN Accelerator Evaluation
Angshuman Parashar, Priyanka Raina, Yakun Sophia Shao, Yu-Hsin Chen, Victor A. Ying, Anurag Mukkara, Rangharajan Venkatesan, Brucek Khailany, Stephen W. Keckler, Joel Emer +9 more
TL;DR: Timeloop's underlying models and algorithms are described in detail and results from case studies enabled by Timeloop are shown, which reveal that dataflow and memory hierarchy co-design plays a critical role in optimizing energy efficiency.
Proceedings ArticleDOI
Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture
Yakun Sophia Shao, Jason Clemons, Rangharajan Venkatesan, Brian Zimmer, Matthew Fojtik, Nan Jiang, Ben Keller, Alicia Klinefelter, Nathaniel Pinckney, Priyanka Raina, Stephen G. Tell, Yanqing Zhang, William J. Dally, Joel Emer, C. Thomas Gray, Brucek Khailany, Stephen W. Keckler +16 more
TL;DR: This work investigates and quantifies the costs and benefits of using multi-chip modules (MCMs) with fine-grained chiplets for deep learning inference, an application area with large compute and on-chip storage requirements, and introduces three tiling optimizations that improve data locality.
Proceedings ArticleDOI
MACACO: modeling and analysis of circuits for approximate computing
TL;DR: The results show that MACACO can help a designer to systematically evaluate the impact of approximate circuits, and to choose between different approximate implementations, thereby facilitating the adoption of such circuits for approximate computing.
Posted Content
SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks
Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, William J. Dally +8 more
TL;DR: The Sparse CNN (SCNN) accelerator architecture is introduced, which improves performance and energy efficiency by exploiting the zero-valued weights that stem from network pruning during training and the zero-valued activations that arise from the common ReLU operator.