Proceedings Article

Comparing the power and performance of Intel's SCC to state-of-the-art CPUs and GPUs

TLDR
The results show that the GPGPU has outstanding results in performance, power consumption and energy efficiency for many applications, but it requires significant programming effort and is not general enough to show the same level of efficiency for all the applications.
Abstract
Power dissipation and energy consumption are becoming increasingly important architectural design constraints in different types of computers, from embedded systems to large-scale supercomputers. To continue scaling performance, it is essential to build parallel processor chips that make the best use of exponentially increasing numbers of transistors within power and energy budgets. The Intel SCC is an appealing option for future many-core architectures. In this paper, we use various scalable applications to quantitatively compare and analyze the performance, power consumption and energy efficiency of several cutting-edge platforms with different architectures: the Intel Single-Chip Cloud Computer (SCC) many-core, the Intel Core i7 general-purpose multi-core, the Intel Atom low-power processor, and the Nvidia ION2 GPGPU. Our results show that the GPGPU achieves outstanding performance, power consumption and energy efficiency for many applications, but it requires significant programming effort and is not general enough to deliver the same level of efficiency for all applications. The “light-weight” many-core presents an opportunity for better performance per watt than the “heavy-weight” multi-core, although the multi-core is still very effective for some sophisticated applications. In addition, the low-power processor is not necessarily energy-efficient, since its longer runtime can outweigh its power savings.
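
The last point follows from energy being the product of average power and runtime. The sketch below illustrates it in Python, assuming made-up power and runtime figures (they are not measurements from the paper) and a hypothetical helper named `energy_joules`.

```python
def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed by a run: average power multiplied by runtime."""
    return avg_power_watts * runtime_seconds


# Hypothetical figures for illustration only, NOT results from the paper.
# The low-power chip draws 6x less power but takes 10x longer to finish.
low_power_energy = energy_joules(avg_power_watts=10.0, runtime_seconds=100.0)
high_perf_energy = energy_joules(avg_power_watts=60.0, runtime_seconds=10.0)

# The slower chip ends up consuming more energy for the same job.
low_power_is_costlier = low_power_energy > high_perf_energy
```

With these assumed numbers, the runtime penalty (10x) exceeds the power saving (6x), so the low-power processor uses more energy for the same job.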
