Tarek S. Abdelrahman

Researcher at University of Toronto

Publications - 88
Citations - 1680

Tarek S. Abdelrahman is an academic researcher at the University of Toronto. The author has contributed to research on topics including shared memory and compilers. The author has an h-index of 18 and has co-authored 86 publications receiving 1609 citations. Previous affiliations of Tarek S. Abdelrahman include the University of Iowa and the University of Michigan.

Papers
Journal Article

hiCUDA: High-Level GPGPU Programming

TL;DR: hiCUDA, a high-level directive-based language for CUDA programming, is designed. It allows programmers to perform tedious tasks in a simpler manner and to apply directives directly to the sequential code, thus speeding up the porting process.
Proceedings Article

Reducing branch divergence in GPU programs

TL;DR: This work proposes two novel software-based optimizations, called iteration delaying and branch distribution, that aim to reduce branch divergence, and shows that they improve the performance of the synthetic benchmarks and that of the real-world application by 12% and 16%, respectively.
Proceedings Article

hiCUDA: a high-level directive-based language for GPU programming

TL;DR: hiCUDA is a high-level directive-based language for CUDA programming that allows programmers to perform data transfer between the host memory and various components of the GPU memory.
Journal Article

Fusion of loops for parallelism and locality

TL;DR: In this article, the authors present new techniques that allow fusion of loop nests in the presence of fusion-preventing dependences, maintain parallelism, and allow the parallel execution of fused loops with minimal synchronization.
Proceedings Article

The NUMAchine multiprocessor

TL;DR: The design choices and the resulting performance of the NUMAchine multiprocessor system are documented using both simulation results and measurements on the prototype hardware.