Comparing the power and performance of Intel's SCC to state-of-the-art CPUs and GPUs

doi:10.1109/ISPASS.2012.6189208

Proceedings Article

- pp 78-87

TLDR

The results show that the GPGPU has outstanding results in performance, power consumption and energy efficiency for many applications, but it requires significant programming effort and is not general enough to show the same level of efficiency for all the applications.

Abstract:

Power dissipation and energy consumption are becoming increasingly important architectural design constraints in different types of computers, from embedded systems to large-scale supercomputers. To continue the scaling of performance, it is essential that we build parallel processor chips that make the best use of exponentially increasing numbers of transistors within the power and energy budgets. Intel SCC is an appealing option for future many-core architectures. In this paper, we use various scalable applications to quantitatively compare and analyze the performance, power consumption and energy efficiency of different cutting-edge platforms that differ in architectural build. These platforms include the Intel Single-Chip Cloud Computer (SCC) many-core, the Intel Core i7 general-purpose multi-core, the Intel Atom low-power processor, and the Nvidia ION2 GPGPU. Our results show that the GPGPU has outstanding results in performance, power consumption and energy efficiency for many applications, but it requires significant programming effort and is not general enough to show the same level of efficiency for all the applications. The “light-weight” many-core presents an opportunity for better performance per watt over the “heavy-weight” multi-core, although the multi-core is still very effective for some sophisticated applications. In addition, the low-power processor is not necessarily energy-efficient, since the runtime delay effect can be greater than the power savings.
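The abstract's last point, that a low-power processor is not necessarily energy-efficient, follows from energy being the product of average power and runtime. The sketch below illustrates that relationship. The function names and all numbers are hypothetical, chosen only for illustration; they are not measurements from the paper.

```python
def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed by a run: average power multiplied by runtime."""
    return avg_power_watts * runtime_seconds


def perf_per_watt(work_units: float, avg_power_watts: float,
                  runtime_seconds: float) -> float:
    """Throughput (work per second) divided by average power.

    This equals work per joule, so it ranks platforms the same way
    as total energy for a fixed amount of work.
    """
    return (work_units / runtime_seconds) / avg_power_watts


# Hypothetical platforms running the same fixed workload (1000 work units).
# These figures are assumptions for illustration, not results from the paper.
low_power_energy = energy_joules(avg_power_watts=5.0, runtime_seconds=120.0)
fast_energy = energy_joules(avg_power_watts=40.0, runtime_seconds=10.0)

if __name__ == "__main__":
    print(f"low-power platform: {low_power_energy:.0f} J")
    print(f"fast platform:      {fast_energy:.0f} J")
```

With these assumed figures, the platform drawing one eighth of the power takes twelve times as long, so it ends up consuming more energy for the same work.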

Citations

Journal Article

A Survey of Mobile Device Virtualization: Taxonomy and State of the Art

Junaid Shuja, +6 more

- 05 Apr 2016 -

ACM Computing Surveys

TL;DR: Challenges and issues faced in virtualization of CPU, memory, I/O, interrupt, and network interfaces are highlighted and various performance parameters are presented in a detailed comparative analysis to quantify the efficiency of mobile virtualization techniques and solutions.

Journal Article

On the energy efficiency and performance of irregular application executions on multicore, NUMA and manycore platforms

Emilio Francesquini, +6 more

- 01 Feb 2015 -

Journal of Parallel and Distributed Comp...

TL;DR: This study evaluates the computing and energy performance of two well-known irregular NP-hard problems (the Traveling-Salesman Problem and K-Means clustering) and a numerical seismic wave propagation simulation kernel (Ondes3D) on multicore, NUMA, and manycore platforms.

Journal Article

An Energy-Efficient Coarse-Grained Reconfigurable Processing Unit for Multiple-Standard Video Decoding

Leibo Liu, +7 more

- 03 Aug 2015 -

IEEE Transactions on Multimedia

TL;DR: A coarse-grained reconfigurable processing unit (RPU) consisting of 16 × 16 multi-functional processing elements (PEs) interconnected by an area-efficient line-switched mesh connect (LSMC) routing is proposed to reduce the implementation overhead and the energy dissipation spent on fast reconfiguration.

Proceedings Article

Protozoa: adaptive granularity cache coherence

Hongzhou Zhao, +3 more

TL;DR: Protozoa, a family of coherence protocols that eliminate unnecessary coherence traffic and match data movement to an application's spatial locality, is presented and shown to consistently reduce miss rate and improve the fraction of transmitted data that is actually utilized.

Proceedings Article

Improving Energy Efficiency through Parallelization and Vectorization on Intel Core i5 and i7 Processors

Juan M. Cebrian, +2 more

TL;DR: Results show that software developers should prioritize vectorization over parallelization whenever possible, as it is much better in terms of energy efficiency; they also indicate the need for a more detailed model to predict system power based on on-chip power information.

References

Journal Article

Scalable molecular dynamics with NAMD

James C. Phillips, +9 more

- 01 Dec 2005 -

Journal of Computational Chemistry

TL;DR: NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. It scales to hundreds of processors on high-end parallel platforms and to tens of processors on low-cost commodity clusters, and also runs on individual desktop and laptop computers.

Book

Scalable Molecular Dynamics with NAMD

James C. Phillips, +5 more

Proceedings Article

The 48-core SCC Processor: the Programmer's View

Timothy G. Mattson, +10 more

TL;DR: The programmer's view of the SCC processor, an intermediate case sharing traits of message passing and shared memory architectures, is described, along with RCCE, the native message passing model created for the chip.

Proceedings Article

NAS parallel benchmark results

David H. Bailey, +3 more

TL;DR: The performance results of various systems using the NAS parallel benchmarks are presented and these results represent the best results that have been reported to the authors for the specific systems listed.

A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS
