Proceedings Article
Enable OpenCL Compiler with Open64 Infrastructures
Yu-Te Lin, Shao-Chung Wang, Wen-Li Shih, Brian Kun-Yuan Hsieh, Jenq Kuen Lee +4 more
- pp 863-868
TLDR
This paper describes the flow for enabling an OpenCL compiler based on the Open64 infrastructure for ATI GPUs: extending the front-end parser for OpenCL, generating high-level intermediate representations with OpenCL linguistics, performing high-level optimization, and finally applying OpenCL-specific optimization for code generation.
Abstract:
As microprocessors evolve into heterogeneous architectures with multiple cores of MPUs and GPUs, programming model support becomes important for programming such architectures. OpenCL was proposed to address this issue. Currently, most OpenCL implementations use LLVM as their infrastructure. This presents an opportunity to demonstrate whether OpenCL can be effectively implemented on other compiler infrastructures. For example, Open64, another open-source compiler known to generate efficient code for microprocessors, can contribute further to performance improvements and to the adoption of heterogeneous computing based on OpenCL. In this paper, we describe the flow to enable an OpenCL compiler based on the Open64 infrastructure for ATI GPUs. Our work includes extending the front-end parser for OpenCL, generating high-level intermediate representations with OpenCL linguistics, performing high-level optimization, and finally applying OpenCL-specific optimization for code generation. Preliminary experimental results show that our compiler based on Open64 is able to generate efficient code for OpenCL programs.
Citations
Proceedings Article
Multimodal Biometrics for Enhanced IoT Security
TL;DR: This work uses discriminant correlation analysis (DCA) to fuse features from face and voice and the K-nearest neighbors (KNN) algorithm to classify the features, and shows that fusion increased recognition accuracy by 52.45% compared to using face alone and by 81.62% compared to using voice alone.
Proceedings Article
Design of vehicle detection methods with OpenCL programming on multi-core systems
TL;DR: This paper presents a case study on accelerating a sliding-window-based vehicle detection algorithm on a heterogeneous multicore system using OpenCL, and integrates a width model into the vehicle detection method to reduce the search space.
Journal Article
Vector data flow analysis for SIMD optimizations on OpenCL programs
Yu-Te Lin, Jenq Kuen Lee +1 more
TL;DR: This paper proposes a calculus framework to support the data flow analysis of vector constructs in OpenCL programs, which compilers can use to perform SIMD optimizations, and models OpenCL vector operations as data access functions in the style of mathematical functions.
Proceedings Article
A Compile-Time Cost Model for Automatic OpenMP Decoupled Software Pipelining Parallelization
TL;DR: A compile-time cost model for estimating the profit of automatic parallelization is proposed by extending the existing cost model in the Open64 loop nest optimizer (LNO) phase, and the OpenMP DSWP algorithm is improved based on this model, which increases the execution efficiency of automatic parallelization.
Proceedings Article
An Automatic Parallel-Stage Decoupled Software Pipelining Parallelization Algorithm Based on OpenMP
TL;DR: An improved PS-DSWP algorithm based on OpenMP is proposed and implemented independently of CPU architecture by using a high-level intermediate representation. The Program Dependence Graph (PDG) used in the algorithm is built on basic blocks, which exploits coarser-grained parallelism than the original PS-DSWP transformation with an instruction-based PDG.
References
Proceedings Article
LLVM: a compilation framework for lifelong program analysis & transformation
Chris Lattner, Vikram Adve +1 more
TL;DR: The design of the LLVM representation and compiler framework is evaluated in three ways: the size and effectiveness of the representation, including the type information it provides; compiler performance for several interprocedural problems; and illustrative examples of the benefits LLVM provides for several challenging compiler problems.
Journal Article
OpenMP: an industry standard API for shared-memory programming
TL;DR: At its most elemental level, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran (and separately, C and C++) to express shared-memory parallelism; it leaves the base language unspecified.
Journal Article
Brook for GPUs: stream computing on graphics hardware
Ian Buck, Tim Foley, Daniel Reiter Horn, Jeremy Sugerman, Kayvon Fatahalian, Mike Houston, Pat Hanrahan +6 more
TL;DR: This paper presents Brook for GPUs, a system for general-purpose computation on programmable graphics hardware that abstracts and virtualizes many aspects of graphics hardware, and presents an analysis of the effectiveness of the GPU as a compute engine compared to the CPU.
Journal Article
Introduction to the cell multiprocessor
TL;DR: This paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation of the Cell multiprocessor.
Proceedings Article
The design and implementation of a first-generation CELL processor
D. Pham, Shigehiro Asano, M. Bolliger, M. N. Day, Harm Peter Hofstee, Charles Ray Johns, J. Kahle, Atsushi Kameyama, J. Keaty, Y. Masubuchi, Mack W. Riley, David Shippy, Daniel Lawrence Stasiak, Masakazu Suzuoki, Michael Fan Wang, James D. Warnock, S. Weitzel, D. Wendel, Takeshi Yamazaki, Kazuaki Yazawa +19 more
TL;DR: A CELL processor is a multi-core chip consisting of a 64b Power Architecture processor, multiple streaming processors, a flexible IO interface, and a memory interface controller, implemented in 90nm SOI technology.