Book Chapter

Power efficiency evaluation of block ciphers on GPU-integrated multicore processor

TLDR
This paper is the first to describe a study evaluating the per-watt performance of block ciphers on GPUs. It shows that the performance per watt of AES-128 on the APU, which has 80 cores, was 743.0 Mbps/W, a 44.0% increase over a system equipped with a discrete AMD Radeon HD 6770.
Abstract: 
Computer systems with discrete GPUs are expected to become the standard approach to high-speed encryption processing, but they consume large amounts of power and are unsuitable for embedded devices. We therefore examined a new heterogeneous multicore processor with a CPU-GPU integration architecture. We first implemented three 128-bit block ciphers (AES, Camellia, and SC2000), selected from the symmetric block ciphers in the e-government recommended ciphers list published by CRYPTREC in Japan, using OpenCL on an AMD E-350 APU with the CPU-GPU integration architecture and on two traditional systems with discrete GPUs. We then evaluated their respective power efficiencies. The results showed that the performance per watt of AES-128 on the APU, which has 80 cores, was 743.0 Mbps/W, a 44.0% increase over a system equipped with a discrete AMD Radeon HD 6770, which has 800 cores. This paper is the first to describe a study evaluating the per-watt performance of block ciphers on GPUs.
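
As a worked illustration of the comparison in the abstract, the following Python sketch shows how a per-watt figure and a relative increase relate to each other. It assumes that the "44.0% increase" is a relative increase over the discrete-GPU system's Mbps/W; under that reading, the discrete system's figure is implied by the arithmetic rather than stated in the abstract. The function names are hypothetical and are not taken from the paper, and the abstract gives no raw throughput or power values, so none are used here.

```python
def perf_per_watt(throughput_mbps: float, power_w: float) -> float:
    """Power efficiency: encryption throughput divided by power draw (Mbps/W)."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return throughput_mbps / power_w


def relative_increase_pct(new: float, baseline: float) -> float:
    """Percentage by which `new` exceeds `baseline`."""
    return (new - baseline) / baseline * 100.0


def implied_baseline(new: float, increase_pct: float) -> float:
    """Baseline value implied by `new` and a stated relative increase."""
    return new / (1.0 + increase_pct / 100.0)


# Figures stated in the abstract for AES-128.
apu_mbps_per_w = 743.0
stated_increase_pct = 44.0

# Assumption: the increase is relative to the Radeon HD 6770 system.
discrete_mbps_per_w = implied_baseline(apu_mbps_per_w, stated_increase_pct)
print(f"{discrete_mbps_per_w:.1f}")  # prints 516.0
```

Under this assumption, the discrete-GPU system would sit at roughly 516 Mbps/W, which is a derived estimate rather than a value reported in the abstract.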


Citations
Journal Article

Throughput and Power Efficiency Evaluation of Block Ciphers on Kepler and GCN GPUs Using Micro-Benchmark Analysis

TL;DR: The authors speculate that, for Kepler GPUs to serve better as co-processors for block ciphers, the arithmetic and logical instructions must be improved in both software and hardware.
Proceedings Article

Throughput and Power Efficiency Evaluations of Block Ciphers on Kepler and GCN GPUs

TL;DR: An evaluation of the throughput and power efficiency of three 128-bit block ciphers on GPUs with the recent Nvidia Kepler and AMD GCN architectures found that the arithmetic and logical instructions required by encryption processing are absent from some of the processing cores in the Kepler architecture, unlike in GCN.
Proceedings Article

ad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors

TL;DR: Ad-heap is proposed, an efficient heap data structure that introduces an implicit bridge structure and properly apportions workloads to the two types of cores and obtains up to 1.5x and 3.6x speedup over the optimal AMP scheduling method that executes the fastest d-heaps on the standalone CPUs and GPUs in parallel.
Proceedings Article

Design of a Parallel AES for Graphics Hardware using the CUDA framework

TL;DR: In this article, the authors propose an effective implementation of the AES-CTR symmetric cryptographic primitive using the CUDA framework and compare it with the common CPU-based OpenSSL implementation on a performance-cost basis.
References
Book

The Design of Rijndael: AES - The Advanced Encryption Standard

TL;DR: The underlying mathematics and the wide trail strategy as the basic design idea are explained in detail and the basics of differential and linear cryptanalysis are reworked.
Book

The Design of Rijndael

TL;DR: This volume is the authoritative guide to the Rijndael algorithm and AES; professionals, researchers, and students active or interested in data encryption will find it a valuable source of information and reference.
Book

Fast Software Encryption

TL;DR: Simplified variants that omit a quadratic function and a fixed rotation in RC6 are examined to clarify their essential contribution to the overall security of RC6.
Book Chapter

Polynomial reconstruction based cryptography

TL;DR: A short overview of recent works on the problem of Decoding Reed Solomon Codes (aka Polynomial Reconstruction) and the novel applications that were enabled due to this development.
Proceedings Article

An integrated GPU power and performance model

TL;DR: An integrated power and performance (IPP) prediction model for a GPU architecture predicts the optimal number of active processors for a given application; the outcome of IPP is used to control the number of running cores.