Topic

Performance per watt

About: Performance per watt is a research topic. Over its lifetime, 315 publications have been published on this topic, receiving 5,778 citations.


Papers
Book Chapter
21 May 2010
TL;DR: A finite difference scheme for solving the general convection-diffusion-reaction equations, adapted to Graphics Processing Units (GPUs) and multithreading for many-core computing.
Abstract: Many-core systems play a key role in High Performance Computing (HPC) today. Such platforms show great potential in performance per watt, performance per floor area, cost performance, and so on. This paper presents a finite difference scheme for solving the general convection-diffusion-reaction equations, adapted to Graphics Processing Units (GPUs) and multithreading. A two-dimensional nonlinear Burgers' equation was chosen as the test case. The best results measured are a speed-up ratio of 12 times at mesh size 1026×1026 using the GPU and 20 times at mesh size 514×514 using all 8 CPU cores, compared with an equivalent single-CPU code.
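As a rough illustration of the kind of stencil computation the abstract describes, the following is a minimal sketch of one explicit time step for the two-dimensional Burgers' equations. It is not the paper's scheme: forward Euler in time, central differences in space, fixed boundary values, and the function name `burgers_step` are all assumptions made for illustration. Each interior point depends only on its neighbours from the previous step, which is what makes this type of update a natural fit for GPU threads or CPU multithreading.

```python
def burgers_step(u, v, nu, dx, dy, dt):
    """Advance the 2D Burgers' equations by one explicit time step.

    u, v are lists of rows (index [j][i], j along y, i along x).
    Boundary values are held fixed; only interior points are updated.
    Illustrative scheme: forward Euler in time, central differences in space.
    """
    ny, nx = len(u), len(u[0])
    un = [row[:] for row in u]  # new u, starts as a copy so boundaries persist
    vn = [row[:] for row in v]  # new v
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            # The same update is applied to both velocity components.
            for f, fn in ((u, un), (v, vn)):
                fx = (f[j][i + 1] - f[j][i - 1]) / (2 * dx)
                fy = (f[j + 1][i] - f[j - 1][i]) / (2 * dy)
                lap = ((f[j][i + 1] - 2 * f[j][i] + f[j][i - 1]) / dx ** 2
                       + (f[j + 1][i] - 2 * f[j][i] + f[j - 1][i]) / dy ** 2)
                # Convection by (u, v) plus diffusion with viscosity nu.
                fn[j][i] = f[j][i] + dt * (-u[j][i] * fx - v[j][i] * fy + nu * lap)
    return un, vn
```

An explicit step like this is only stable for sufficiently small `dt` relative to `dx`, `dy`, and `nu`; a production solver would choose the time step and differencing scheme with that in mind.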

4 citations

Proceedings Article
24 Sep 2020
TL;DR: This paper presents the PROTEUS framework, which employs rule-based self-adaptation in PNoCs and can achieve up to 24.5% less laser power consumption, up to 31% less average packet latency, and up to 20% less energy-per-bit compared to another laser power management technique from prior work.
Abstract: The performance of on-chip communication in state-of-the-art multi-core processors that use traditional electronic NoCs has already become severely energy-constrained. To that end, emerging photonic NoCs (PNoCs) are seen as a potential solution to improve the energy-efficiency (performance per watt) of on-chip communication. However, existing PNoC designs cannot realize their full potential due to their excessive laser power consumption. Prior works that attempt to improve laser power efficiency in PNoCs do not consider all key factors that affect the laser power requirement of PNoCs. Therefore, they cannot yield the desired balance between the reduction in laser power, achieved performance, and energy-efficiency in PNoCs. In this paper, we present the PROTEUS framework, which employs rule-based self-adaptation in PNoCs. Our approach not only reduces the laser power consumption, but also minimizes the average packet latency by opportunistically increasing the communication data rate in PNoCs, and thus yields the desired balance between laser power reduction, performance, and energy-efficiency in PNoCs. Our evaluation with PARSEC benchmarks shows that the PROTEUS framework can achieve up to 24.5% less laser power consumption, up to 31% less average packet latency, and up to 20% less energy-per-bit, compared to another laser power management technique from prior work.

4 citations

Proceedings Article
10 May 2009
TL;DR: This presentation will first explain what is meant by green computing and how greenness of information processing may be quantified, and energy-efficient computing paradigms which utilize chip multi-processing, multiple-voltage domains, dynamic voltage/frequency scaling, and power/clock gating techniques will be reviewed.
Abstract: Digital information management is the key enabler of the unprecedented rise in productivity and efficiency gains experienced by world economies during the 21st century. Information processing systems have thus become essential to the functioning of business, service, academic, and governmental institutions. As institutions increase their offerings of digital information services, the demand for computation and storage capability also increases. Examples include online banking, e-filing of taxes, music and video downloads, online shipment tracking, real-time inventory/supply-chain management, electronic medical recording, insurance database management, surveillance and disaster recovery. It is estimated that, in some industries, the number of records that must be retained is growing at a CAGR of 50 percent or greater. This exponential increase in the digital intensity of human existence is driven by many factors, including ease of use and availability of a rich set of information technology (IT) devices and services. Indeed, it would be difficult to imagine how significant societal transformations that better our world could occur without the productivity and innovation enabled by IT. Unfortunately, the energy cost and carbon footprint of IT devices and services have become exorbitant. Moreover, current technological and digital service utilization trends result in a doubling of the energy cost of the IT infrastructure and its carbon footprint in less than five years. In an energy-constrained world, this consumption trend is unsustainable and comes at increasingly unacceptable societal and environmental costs. This presentation will first explain what is meant by green computing and how greenness of information processing may be quantified. Next, energy-efficient computing paradigms which utilize chip multi-processing, multiple-voltage domains, dynamic voltage/frequency scaling, and power/clock gating techniques will be reviewed. Finally, techniques for improving performance per Watt of large-scale information processing and storage systems (e.g., a data center), including hierarchical dynamic power management, task placement and scheduling, energy balancing, resource virtualization, and application optimizations that dynamically configure hardware for higher efficiency, will be discussed.
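To make the link between dynamic voltage/frequency scaling and performance per watt concrete, here is a minimal sketch based on the textbook CMOS dynamic power model, P = C·V²·f. It is not taken from the presentation: the model ignores static (leakage) power, performance is assumed to scale linearly with clock frequency, and the capacitance, voltage, and frequency values are hypothetical.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Textbook CMOS dynamic power in watts: P = C_eff * V^2 * f (leakage ignored)."""
    return c_eff * voltage ** 2 * freq_hz


def perf_per_watt(c_eff, voltage, freq_hz):
    """Performance per watt, assuming performance is proportional to frequency."""
    return freq_hz / dynamic_power(c_eff, voltage, freq_hz)


# Hypothetical operating points for illustration only.
C_EFF = 1e-9                                # effective switched capacitance, farads
high = perf_per_watt(C_EFF, 1.2, 3.0e9)     # 1.2 V at 3.0 GHz
low = perf_per_watt(C_EFF, 0.9, 2.0e9)      # 0.9 V at 2.0 GHz

# Under this model the frequency cancels, so the gain depends only on voltage:
# gain = (1.2 / 0.9) ** 2
gain = low / high
```

Under these assumptions, lowering the voltage improves performance per watt quadratically, while the lower frequency costs absolute performance; real systems also have to account for leakage and for workloads that do not scale with clock speed.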

4 citations

Proceedings Article
25 Apr 2007
TL;DR: In this paper, an infrared photon-emission (IREM) based technique is established to meet the needs of chip power characterization, debug, and validation for energy-efficient product performance.
Abstract: As the performance-per-Watt concept is adopted for CPU performance, a comprehensive post-silicon design methodology for chip power characterization, debug, and validation, developed for energy-efficient product performance, becomes ever more important. An infrared photon-emission (IREM) based technique has been established to meet these needs. With the developed tool capabilities, we can validate simulated full-chip power, determine the causes of excessive power leakage, generate die power and thermal maps, and eventually optimize follow-on designs for power performance. These newly developed techniques have been applied and proven reusable on multiple-core microprocessors fabricated in 90 nm and 65 nm CMOS technology. Examples of 5-8% power savings compared with the first silicon data are presented here to demonstrate the success of debug and design optimization on full-chip power.

4 citations

Journal Article
TL;DR: GPUs are found to be an order of magnitude ahead of Xeon Phis in performance per watt, and compared with typical low-power devices like FPGAs, GPUs keep similar GFLOPS/W ratios in 2017 with five times faster execution.
Abstract: We present a performance per watt analysis of CUDAlign 4.0, a parallel strategy to obtain the optimal pairwise alignment of huge DNA sequences in multi-GPU platforms using the exact Smith-Waterman method. Our study includes acceleration factors, performance, scalability, power efficiency and energy costs. We also quantify the influence of the contents of the compared sequences, identify potential scenarios for energy savings on speculative executions, and calculate performance and energy usage differences among distinct GPU generations and models. For a sequence alignment on chromosome-wide scale (around 2 Petacells), we are able to reduce execution times from 9.5 h on a Kepler GPU to just 2.5 h on a Pascal counterpart, with energy costs cut by 60%. We find GPUs to be an order of magnitude ahead of Xeon Phis in performance per watt. Finally, compared with typical low-power devices like FPGAs, GPUs keep similar GFLOPS/W ratios in 2017 with five times faster execution.
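The Kepler-to-Pascal figures in the abstract can be related through energy = average power × time. The short sketch below works through that arithmetic; it assumes "energy costs cut by 60%" means the Pascal run used 40% of the Kepler run's total energy for the same alignment, and the function and variable names are illustrative, not from the paper.

```python
def implied_power_ratio(time_old_h, time_new_h, energy_saving):
    """Ratio of average power (new / old) implied by E = P * t.

    energy_saving is the fractional reduction in total energy (0.60 for 60%).
    """
    energy_ratio = 1.0 - energy_saving           # E_new / E_old
    return energy_ratio * time_old_h / time_new_h


# Figures quoted in the abstract.
KEPLER_HOURS = 9.5
PASCAL_HOURS = 2.5
ENERGY_SAVING = 0.60  # assumed to apply to total energy for the same workload

speedup = KEPLER_HOURS / PASCAL_HOURS                                   # roughly 3.8x faster
power_ratio = implied_power_ratio(KEPLER_HOURS, PASCAL_HOURS, ENERGY_SAVING)  # roughly 1.5x
# Same work for less energy: work per joule (equivalently, performance per watt)
# improves by the inverse of the energy ratio.
perf_per_watt_gain = 1.0 / (1.0 - ENERGY_SAVING)                        # roughly 2.5x
```

Read this way, the newer GPU draws more average power during the run but finishes so much sooner that the work done per joule still improves, and `speedup / power_ratio` gives the same gain.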

4 citations

Network Information
Related Topics (5)
Cache
59.1K papers, 976.6K citations
81% related
Benchmark (computing)
19.6K papers, 419.1K citations
80% related
Programming paradigm
18.7K papers, 467.9K citations
77% related
Compiler
26.3K papers, 578.5K citations
77% related
Scalability
50.9K papers, 931.6K citations
76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    14
2020    15
2019    15
2018    36
2017    25
2016    31