Proceedings Article

Enable OpenCL Compiler with Open64 Infrastructures

TLDR
This paper describes the flow for enabling an OpenCL compiler for ATI GPUs on the Open64 infrastructure: extending the front-end parser for OpenCL, generating a high-level intermediate representation that carries OpenCL language constructs, performing high-level optimization, and finally applying OpenCL-specific optimizations during code generation.
Abstract
As microprocessors evolve into heterogeneous architectures with multiple MPU and GPU cores, programming-model support becomes important for programming such architectures. OpenCL was proposed to address this need. Currently, most OpenCL implementations use LLVM as their compiler infrastructure, which presents an opportunity to demonstrate whether OpenCL can be implemented effectively on other compiler infrastructures. For example, Open64, another open-source compiler known to generate efficient code for microprocessors, could contribute further performance improvements and broaden the adoption of OpenCL-based heterogeneous computing. In this paper, we describe the flow for enabling an OpenCL compiler for ATI GPUs on the Open64 infrastructure. Our work includes extending the front-end parser for OpenCL, generating a high-level intermediate representation that carries OpenCL language constructs, performing high-level optimization, and finally applying OpenCL-specific optimizations during code generation. Preliminary experimental results show that our Open64-based compiler is able to generate efficient code for OpenCL programs.

