Fumihiko Ino

Researcher at Osaka University

Publications - 104
Citations - 1052

Fumihiko Ino is an academic researcher at Osaka University. The author has contributed to research on topics including graphics processing units and CUDA. The author has an h-index of 17 and has co-authored 97 publications receiving 972 citations.

Papers
Proceedings Article

LogGPS: a parallel computational model for synchronization analysis

TL;DR: The results indicate that the LogGPS model is more accurate than the LogGP model, and analyzing synchronization costs is important when improving parallel program performance.
Journal Article

A data distributed parallel algorithm for nonrigid image registration

TL;DR: A data-distributed parallel algorithm capable of aligning large-scale three-dimensional images of deformable objects. Because it requires less memory, it can align datasets of up to 1024x1024x590 voxels while reducing the execution time from hours to minutes, a clinically compatible time.
Journal Article

High-performance cone beam reconstruction using CUDA compatible GPUs

TL;DR: An acceleration method for cone beam reconstruction on CUDA-compatible GPUs that speeds up the Feldkamp, Davis, and Kress (FDK) algorithm using three techniques: reducing off-chip memory accesses to save memory bandwidth; loop unrolling to hide memory latency; and multithreading to exploit multiple GPUs.
Proceedings Article

Design and implementation of the Smith-Waterman algorithm on the CUDA-compatible GPU

TL;DR: The proposed implementation of the Smith-Waterman algorithm uses on-chip shared memory efficiently to reduce the amount of data transferred between off-chip memory and the processing elements in the GPU, and reduces the number of data fetches by applying a data reuse technique to query and database sequences.
Journal Article

An improved binary-swap compositing for sort-last parallel rendering on distributed memory multiprocessors

TL;DR: An improvement on the binary-swap (BS) method, an efficient image compositing algorithm for sort-last parallel rendering, that adds three acceleration techniques to the original BS method: interleaved splitting, multiple bounding rectangles, and run-length encoding.