Tien-Fu Chen

Researcher at National Chiao Tung University

Publications - 90
Citations - 3036

Tien-Fu Chen is an academic researcher from National Chiao Tung University. He has contributed to research on the topics of Cache and CPU cache. He has an h-index of 22 and has co-authored 90 publications receiving 2878 citations. Previous affiliations of Tien-Fu Chen include National Chung Cheng University and the University of Washington.

Papers
Journal ArticleDOI

Effective hardware-based data prefetching for high-performance processors

TL;DR: The results show that all three hardware prefetching schemes yield significant reductions in the data access penalty compared with regular caches, that the benefits are greater when the hardware assist augments small on-chip caches, and that the lookahead scheme is the preferred one in terms of cost-performance.
Proceedings ArticleDOI

An effective on-chip preloading scheme to reduce data access penalty

TL;DR: In this article, a new hardware prefetching scheme is proposed, based on predicting the execution of the instruction stream and the associated operand references. The scheme requires a reference prediction table and its associated logic.
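
The following is a minimal sketch of how a reference prediction table (RPT) of this kind can behave: entries are indexed by the PC of a load/store, each entry tracks the last address and stride, and a steady-state entry triggers a prefetch of the predicted next address. The table size, state names, and the prefetch hook are illustrative assumptions, not the paper's exact design.

```c
/* Sketch of a reference prediction table (RPT) style stride prefetcher.
 * RPT_ENTRIES, the state machine labels, and prefetch() are assumptions
 * made for illustration. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define RPT_ENTRIES 64                     /* assumed table size */

typedef enum { INITIAL, TRANSIENT, STEADY, NO_PRED } rpt_state_t;

typedef struct {
    uint64_t tag;                          /* PC of the load/store instruction */
    uint64_t prev_addr;                    /* last data address it referenced  */
    int64_t  stride;                       /* last observed stride             */
    rpt_state_t state;
    int      valid;
} rpt_entry_t;

static rpt_entry_t rpt[RPT_ENTRIES];

/* Stand-in for issuing a prefetch to the cache hierarchy. */
static void prefetch(uint64_t addr) {
    printf("prefetch 0x%" PRIx64 "\n", addr);
}

/* Called for every executed load/store with its PC and data address. */
void rpt_access(uint64_t pc, uint64_t addr) {
    rpt_entry_t *e = &rpt[pc % RPT_ENTRIES];

    if (!e->valid || e->tag != pc) {       /* allocate a fresh entry */
        e->tag = pc; e->prev_addr = addr;
        e->stride = 0; e->state = INITIAL; e->valid = 1;
        return;
    }

    int64_t new_stride = (int64_t)(addr - e->prev_addr);
    int match = (new_stride == e->stride);

    switch (e->state) {                    /* simple confidence state machine */
    case INITIAL:   e->state = match ? STEADY    : TRANSIENT; break;
    case TRANSIENT: e->state = match ? STEADY    : NO_PRED;   break;
    case STEADY:    e->state = match ? STEADY    : INITIAL;   break;
    case NO_PRED:   e->state = match ? TRANSIENT : NO_PRED;   break;
    }
    if (!match) e->stride = new_stride;
    e->prev_addr = addr;

    if (e->state == STEADY)                /* predict the next reference */
        prefetch(addr + e->stride);
}

int main(void) {
    /* One load instruction (pc = 0x400) sweeping an array with a fixed
     * 8-byte stride quickly trains its entry into the STEADY state. */
    for (uint64_t a = 0x1000; a < 0x1100; a += 8)
        rpt_access(0x400, a);
    return 0;
}
```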
Proceedings ArticleDOI

A performance study of software and hardware data prefetching schemes

TL;DR: Qualitative comparisons indicate that both schemes are able to reduce cache misses in the domain of linear array references. An approach combining software and hardware schemes is proposed; it shows promise in reducing memory latency with the least overhead.
Proceedings ArticleDOI

Reducing memory latency via non-blocking and prefetching caches

TL;DR: A hybrid design based on the combination of non-blocking and prefetching caches is proposed, which is found to be very effective in reducing the memory latency penalty for many applications.
Book

Reducing memory latency via non-blocking and prefetching caches

TL;DR: In this paper, non-blocking and prefetching caches are used to hide memory latency by exploiting the overlap of processor computation with data accesses, and a hybrid design based on the combination of these two hardware-based schemes is proposed.
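
As a rough illustration of the non-blocking half of such a hybrid, the sketch below models a cache front end with a few miss status holding registers (MSHRs), so that hits and already-outstanding misses do not stall the processor. The MSHR count, line size, and the cache/memory stubs are assumptions for illustration, not the paper's design; a prefetcher such as the RPT sketch above could issue requests through the same path.

```c
/* Sketch of a non-blocking cache front end using a small array of
 * miss status holding registers (MSHRs).  Sizes and the cache/memory
 * stubs are illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define NUM_MSHRS  4                        /* assumed outstanding-miss limit */
#define LINE_BITS  6                        /* assumed 64-byte cache lines    */

typedef struct {
    int      valid;
    uint64_t line_addr;                     /* block address of the outstanding miss */
    int      pending_loads;                 /* loads waiting on this block           */
} mshr_t;

static mshr_t mshrs[NUM_MSHRS];

/* Stubs standing in for the cache array and the memory system. */
static int cache_lookup(uint64_t line_addr) {
    (void)line_addr;
    return 0;                               /* treat every access as a miss */
}
static void send_to_memory(uint64_t line_addr) {
    printf("fetch line 0x%llx\n", (unsigned long long)line_addr);
}

/* Returns 1 if the load is serviced or tracked without stalling,
 * 0 if the processor must stall (all MSHRs hold other lines). */
int nonblocking_load(uint64_t addr) {
    uint64_t line = addr >> LINE_BITS;

    if (cache_lookup(line))
        return 1;                           /* hit: served immediately */

    for (int i = 0; i < NUM_MSHRS; i++)     /* merge with an existing miss */
        if (mshrs[i].valid && mshrs[i].line_addr == line) {
            mshrs[i].pending_loads++;
            return 1;
        }

    for (int i = 0; i < NUM_MSHRS; i++)     /* allocate a new MSHR */
        if (!mshrs[i].valid) {
            mshrs[i] = (mshr_t){1, line, 1};
            send_to_memory(line);
            return 1;                       /* miss outstanding, no stall */
        }

    return 0;                               /* MSHRs exhausted: stall */
}

int main(void) {
    /* Two loads to the same line merge into one MSHR; a load to a
     * different line allocates a second MSHR and also avoids a stall. */
    nonblocking_load(0x1000);
    nonblocking_load(0x1008);               /* same 64-byte line: merged  */
    nonblocking_load(0x2000);               /* new line: second MSHR used */
    return 0;
}
```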