Proceedings Article
1.1 Computing's energy problem (and what we can do about it)
Mark Horowitz
- pp 10-14
TL;DR: The drive for performance and the end of voltage scaling have made power, not the number of transistors, the principal factor limiting further improvements in computing performance. Sustaining progress will require new specialized compute engines and tools that bring application experts into the design process.
Abstract:
Our challenge is clear: The drive for performance and the end of voltage scaling have made power, and not the number of transistors, the principal factor limiting further improvements in computing performance. Continuing to scale compute performance will require the creation and effective use of new specialized compute engines, and will require the participation of application experts to be successful. If we play our cards right, and develop the tools that allow our customers to become part of the design process, we will create a new wave of innovative and efficient computing devices.
Citations
Proceedings Article
Dynamic Hyperdimensional Computing for Improving Accuracy-Energy Efficiency Trade-Offs
TL;DR: This paper proposes a threshold-based dynamic HD computing framework (TD-HDC) to improve the accuracy-energy efficiency trade-offs and demonstrates that the proposed framework is flexible and can reduce energy consumption and execution time under the same accuracy level.
Posted Content
Accelerating recommendation system training by leveraging popular choices
TL;DR: This paper examines the semantics of recommender-model training data in depth, derives insights about the models' feature access, transfer, and usage patterns, and proposes a hot-embedding-aware data layout for training that reduces data transfers from CPU to GPU.
Posted Content
E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings
TL;DR: This paper attempts to conduct more energy-efficient training of CNNs, so as to enable on-device training, by dropping unnecessary computations from three complementary levels: stochastic mini-batch dropping on the data level; selective layer update on the model level; and sign prediction for low-cost, low-precision back-propagation, on the algorithm level.
Proceedings Article
Simpler, more efficient design
TL;DR: This paper proposes several approaches for making the design process more efficient and enabling custom energy-efficient integrated circuits, and demonstrates them on the design of a processor, based on an open-source instruction set architecture, with integrated switched-capacitor DC-DC converters implemented in 28 nm FDSOI.
Proceedings Article
Cambricon-Q: a hybrid architecture for efficient training
Zhao Yongwei, Chang Liu, Zidong Du, Qi Guo, Xing Hu, Zhuang Yimin, Zhenxing Zhang, Song Xinkai, Wei Li, Xishan Zhang, Ling Li, Zhiwei Xu, Tianshi Chen
TL;DR: This paper proposes Cambricon-Q, a hybrid architecture consisting of an ASIC acceleration core and a near-data processing (NDP) engine, for efficient quantized training with negligible accuracy loss.
References
Journal Article
Design of ion-implanted MOSFET's with very small physical dimensions
TL;DR: This paper considers the design, fabrication, and characterization of very small MOSFET switching devices suitable for digital integrated circuits, using dimensions of the order of 1 μm.
Book
Low Power Digital CMOS Design
TL;DR: The Hierarchy of Limits of Power, J.D. Stratakos, et al., and Low Power Programmable Computation, coauthored with M.B. Srivastava, provide a review of the main voltage scaling approaches.
IEEE International Solid-State Circuits Conference
Hurwitz Jonathan Ephraim David, Stewart Smith, A. A. Murray, Peter B. Denyer, John Thomson, Scot D. Anderson, E. Duncan, B. Paisley, A. Kinsey, E. Christison, B. Laffoley, J. Vittu, R. Bechignac, Robert Henderson, M.J. Panaghiston, P.-F. Pugibet, H. Hendry, K. M. Findlater
Journal Article
Towards energy-proportional datacenter memory with mobile DRAM
Malladi Krishna T, Benjamin C. Lee, Frank Austin Nothaft, Christos Kozyrakis, Karthika Periyathambi, Mark Horowitz
TL;DR: This work architects server memory systems using mobile DRAM devices, trading peak bandwidth for lower energy consumption per bit and more efficient idle modes, and demonstrates 3-5× lower memory power, better proportionality, and negligible performance penalties for data-center workloads.