SciSpace - formerly Typeset

Putt Sakdhnagool

Researcher at Purdue University

Publications: 16
Citations: 166

Putt Sakdhnagool is an academic researcher from Purdue University. The author has contributed to research on topics including compilers and speedup. The author has an h-index of 6 and has co-authored 16 publications receiving 137 citations. Previous affiliations of Putt Sakdhnagool include the Thailand National Science and Technology Development Agency.

Papers
Proceedings Article

Pagoda: Fine-Grained GPU Resource Virtualization for Narrow Tasks

TL;DR: Pagoda is presented, a runtime system that virtualizes GPU resources using an OS-like daemon kernel called MasterKernel. It achieves a geometric mean speedup of 5.70x over PThreads running on a 20-core CPU, 1.51x over CUDA-HyperQ, and 1.69x over GeMTC, the state-of-the-art runtime GPU task scheduling system.
Proceedings ArticleDOI

Massively parallel 3D image reconstruction

TL;DR: A new algorithm for MBIR is presented, the Non-Uniform Parallel Super-Voxel (NU-PSV) algorithm, that regularizes the data access pattern, enables massive parallelism, and ensures fast convergence.
Book Chapter

Evaluating Performance Portability of OpenACC

TL;DR: This paper evaluates the performance portability of twelve OpenACC programs on NVIDIA CUDA, AMD GCN, and Intel MIC architectures, and studies the effects of various compiler optimizations and OpenACC program settings on these architectures to provide insights into the achieved performance portability.
Proceedings Article

HeteroDoop: A MapReduce Programming System for Accelerator Clusters

TL;DR: Evaluation results of HeteroDoop on recent hardware indicate that using even a single GPU per node can improve performance by up to 2.6x over CPU-only Hadoop running on a cluster with 20-core CPUs.
Proceedings Article

Scaling large-data computations on multi-GPU accelerators

TL;DR: A mechanism and implementation are presented to automatically pipeline the CPU-GPU memory channel, overlapping GPU computation with memory copies to alleviate data transfer overhead; a novel adaptive runtime tuning mechanism is also proposed to automatically select the pipeline stage size.