Proceedings Article
1.1 Computing's energy problem (and what we can do about it)
Mark Horowitz
- pp 10-14
TLDR
The drive for performance and the end of voltage scaling have made power, and not the number of transistors, the principal factor limiting further improvements in computing performance; continued scaling will require new specialized compute engines and the participation of application experts.
Abstract:
Our challenge is clear: The drive for performance and the end of voltage scaling have made power, and not the number of transistors, the principal factor limiting further improvements in computing performance. Continuing to scale compute performance will require the creation and effective use of new specialized compute engines, and will require the participation of application experts to be successful. If we play our cards right, and develop the tools that allow our customers to become part of the design process, we will create a new wave of innovative and efficient computing devices.
Citations
Journal Article
ConvAix: An Application-Specific Instruction-Set Processor for the Efficient Acceleration of CNNs
TL;DR: The proposed design offers sufficient processing power for the execution of state-of-the-art CNNs in real time by utilizing a combination of data-level parallelism (DLP), instruction-level parallelism (ILP), and subword parallelism, while consuming between 340 mW and 972 mW of power.
Journal Article
Spintronic In-Memory Pattern Matching
Zamshed I. Chowdhury, S. Karen Khatamifard, Zhengyang Zhao, Masoud Zabihi, Salonik Resch, Meisam Razaviyayn, Jian-Ping Wang, Sachin S. Sapatnekar, Ulya R. Karpuzcu
TL;DR: SpinPM is introduced, a novel high-density, reconfigurable spintronic in-memory pattern matching spin–orbit torque (SOT)—specifically spin Hall effect (SHE)—substrate, and the performance benefit SpinPM can achieve over conventional and near-memory processing systems is demonstrated.
Proceedings Article
Algorithm-Aware Neural Network Based Image Compression for High-Speed Imaging
TL;DR: In this article, a CNN-based compression scheme for wearable AR/VR systems is presented, which can be tuned for specific machine vision applications and enables increased compression for a given application performance target.
Journal Article
Quantitatively Evaluating the Effect of Read Noise in Memristive Hopfield Network on Solving Traveling Salesman Problem
TL;DR: The simulated results demonstrate that the intrinsic read noise in the resistive weight matrix is indeed helpful for the network to escape from local minima, serving as a useful computing resource.
Proceedings Article
Approximate computing in the nanoscale era
TL;DR: This paper focuses on approximate arithmetic circuits for computer vision and machine learning, applications that are highly resilient to computation errors and for which improving the efficiency of arithmetic circuits is a key concern.
References
Journal Article
Design of ion-implanted MOSFET's with very small physical dimensions
TL;DR: This paper considers the design, fabrication, and characterization of very small MOSFET switching devices suitable for digital integrated circuits, using dimensions on the order of 1 μm.
Book
Low Power Digital CMOS Design
TL;DR: The Hierarchy of Limits of Power (J.D. Stratakos, et al.) and Low Power Programmable Computation (coauthored with M.B. Srivastava) provide a review of the main voltage scaling approaches.
IEEE International Solid-State Circuits Conference
Jonathan Ephraim David Hurwitz, Stewart Smith, A. A. Murray, Peter B. Denyer, John Thomson, Scot D. Anderson, E. Duncan, B. Paisley, A. Kinsey, E. Christison, B. Laffoley, J. Vittu, R. Bechignac, Robert Henderson, M.J. Panaghiston, P.-F. Pugibet, H. Hendry, K. M. Findlater
Journal Article
Towards energy-proportional datacenter memory with mobile DRAM
Krishna T. Malladi, Benjamin C. Lee, Frank Austin Nothaft, Christos Kozyrakis, Karthika Periyathambi, Mark Horowitz
TL;DR: This work architects server memory systems using mobile DRAM devices, trading peak bandwidth for lower energy consumption per bit and more efficient idle modes, and demonstrates 3-5× lower memory power, better proportionality, and negligible performance penalties for data-center workloads.