Proceedings Article
Parallel Image Processing Based on CUDA
Zhiyi Yang, Yating Zhu, Yong Pu
- Vol. 3, pp 198-201
TL;DR: The distinct features of CUDA GPUs are analyzed, the general program mode of CUDA is summarized, and several classical image processing algorithms, such as histogram equalization, cloud removal, edge detection, and DCT encoding and decoding, are implemented in CUDA.
Abstract:
CUDA (Compute Unified Device Architecture) is a novel technology for general-purpose computing on the GPU (graphics processing unit) that lets users develop general GPU programs easily. This paper analyzes the distinct features of CUDA GPUs and summarizes the general programming model of CUDA. Furthermore, we implement several classical image processing algorithms in CUDA, such as histogram equalization, cloud removal, edge detection, and DCT encoding and decoding, and describe the first two in detail. If the data transfer time between host memory and device memory is not taken into account, then as the image size increases, histogram computation achieves a speedup of more than 40x, cloud removal about 79x, DCT around 8x, and edge detection more than 200x.
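For readers unfamiliar with the first algorithm named in the abstract, the following is a minimal sequential sketch of textbook histogram equalization in Python. It is not the authors' CUDA implementation, which this page does not reproduce. The function name `equalize_histogram`, the flat-list image representation, and the `levels` parameter are assumptions made for illustration only.

```python
def equalize_histogram(pixels, levels=256):
    """Equalize a flat list of integer gray levels in the range [0, levels).

    Hypothetical CPU reference sketch, not the paper's CUDA code.
    """
    if not pixels:
        return []

    # Step 1: count how many pixels fall on each gray level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Step 2: build the cumulative distribution (running sum of the histogram).
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Step 3: build a lookup table that stretches the CDF over the full range.
    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if cdf_min == total:
        # Constant image: there is nothing to stretch.
        return list(pixels)
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]

    # Step 4: remap every pixel through the lookup table.
    return [lut[p] for p in pixels]


if __name__ == "__main__":
    print(equalize_histogram([10, 10, 20, 30, 30]))
```

Steps 1 and 4 touch each pixel independently, which is the kind of per-pixel work one would expect a GPU version to spread across threads; how the paper actually partitions the computation is not stated here.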
Citations
Proceedings Article
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU
Victor W. Lee, Changkyu Kim, Jatin Chhugani, Michael E. Deisher, Daehyun Kim, Anthony D. Nguyen, Nadathur Satish, Mikhail Smelyanskiy, Srinivas Chennupaty, Per Hammarlund, Ronak Singhal, Pradeep Dubey
TL;DR: This paper discusses optimization techniques for both CPU and GPU, analyzes what architecture features contributed to performance differences between the two architectures, and recommends a set of architectural features which provide significant improvement in architectural efficiency for throughput kernels.
Journal Article
Debunking the 100X GPU vs. CPU myth
Victor W. Lee, Changkyu Kim, Jatin Chhugani, Michael Deisher, Daehyun Kim, Anthony D. Nguyen, Nadathur Satish, Mikhail Smelyanskiy, Srinivas Chennupaty, Per Hammarlund, Ronak Singhal, Pradeep Dubey
TL;DR: A performance analysis of throughput computing kernels finds that, after applying optimizations appropriate to both CPUs and GPUs, the performance gap between an Nvidia GTX280 and an Intel Core i7 960 narrows to about 2.5x on average.
Proceedings Article
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Jiayuan Meng, Kevin Skadron
TL;DR: A performance model is established using NVIDIA's Tesla architecture as a case study and a framework is proposed that uses the performance model to automatically select the ghost zone size that performs best and generate appropriate code to automate this process on shared memory systems.
Journal Article
A Multi-Resolution FPGA-Based Architecture for Real-Time Edge and Corner Detection
TL;DR: A performance analysis of the FPGA and the GPU implementations, and an extra CPU reference implementation, shows the competitive throughput of the proposed architecture even at a much lower clock frequency than those of the GPU and the CPU.
Journal Article
A survey of GPU-based medical image computing techniques
TL;DR: The continuous advancement of GPU computing is reviewed and the existing traditional applications in three areas of medical image processing, namely, segmentation, registration and visualization, are surveyed.
References
GPU Computing
TL;DR: The background, hardware, and programming model for GPU computing are described, the state of the art in tools and techniques is summarized, and four GPU computing successes in game physics and computational biophysics that deliver order-of-magnitude performance gains over optimized CPU applications are presented.
Proceedings Article
Scan primitives for GPU computing
TL;DR: Using the scan primitives, this work shows novel GPU implementations of quicksort and sparse matrix-vector multiply, and analyzes the performance of the scan primitives, several sort algorithms that use the scan primitives, and a graphical shallow-water fluid simulation using the scan framework for a tridiagonal matrix solver.
Proceedings Article
StoreGPU: exploiting graphics processing units to accelerate distributed storage systems
TL;DR: StoreGPU, a library that accelerates a number of hashing-based primitives popular in distributed storage system implementations, is presented; it enables up to eight-fold performance gains on synthetic benchmarks as well as on a high-level application: online similarity detection between large data files.
Journal Article
State of the Art and Future Challenge on General Purpose Computation by Graphics Processing Unit
TL;DR: The development of various applications on general purpose computation on GPU is introduced, and among those applications, fluid dynamics, algebraic computation, database operations, and spectrum analysis are introduced in detail.