
Eiman Ebrahimi

Researcher at Nvidia

Publications: 35
Citations: 2387

Eiman Ebrahimi is an academic researcher at Nvidia. The author has contributed to research on the topics of shared memory and caches, has an h-index of 19, and has co-authored 35 publications receiving 1,953 citations. Previous affiliations of Eiman Ebrahimi include the University of Tehran and the University of Texas at Austin.

Papers
Journal Article

Fairness via Source Throttling: A Configurable and High-Performance Fairness Substrate for Multicore Memory Systems

TL;DR: The proposed technique, Fairness via Source Throttling (FST), estimates unfairness in the entire memory system, enforces thread priorities/weights, and enables system software to enforce different fairness objectives in the memory system.
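
A minimal sketch of the idea, assuming per-core slowdown estimates are already available (in hardware they would come from interference-tracking counters); the request-rate cap and the culprit-selection heuristic below are illustrative placeholders, not the paper's mechanism:

```python
# Hedged sketch of an FST-style decision interval, not the paper's hardware design.
from dataclasses import dataclass

@dataclass
class CoreState:
    core_id: int
    t_shared: float               # estimated execution time with interference
    t_alone: float                # estimated execution time without interference
    request_rate_cap: float = 1.0 # fraction of memory requests allowed through

def fst_interval(cores, unfairness_threshold=1.4, throttle_step=0.1):
    """One interval: estimate system unfairness, throttle the most interfering
    core if the threshold is exceeded, otherwise relax throttling."""
    slowdowns = {c.core_id: c.t_shared / c.t_alone for c in cores}
    unfairness = max(slowdowns.values()) / min(slowdowns.values())
    if unfairness > unfairness_threshold:
        # Simplifying assumption: the core with the smallest slowdown is the one
        # hurting the others most, so its request rate gets capped further.
        culprit = min(cores, key=lambda c: slowdowns[c.core_id])
        culprit.request_rate_cap = max(0.1, culprit.request_rate_cap - throttle_step)
    else:
        for c in cores:
            c.request_rate_cap = min(1.0, c.request_rate_cap + throttle_step)
    return unfairness

# Example: one core slowed down 2x, another barely affected.
cores = [CoreState(0, t_shared=2.0, t_alone=1.0), CoreState(1, t_shared=1.1, t_alone=1.0)]
print(fst_interval(cores), [c.request_rate_cap for c in cores])
```
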
Journal Article

EMOGI: efficient memory-access for out-of-memory graph-traversal in GPUs

TL;DR: This paper addresses the open question of whether a sufficiently large number of overlapping cacheline-sized accesses can be sustained to tolerate the long latency to host memory, fully utilize the available bandwidth, and achieve favorable execution performance. It proposes EMOGI, an alternative approach that traverses graphs that do not fit in GPU memory using direct cacheline-sized accesses to data stored in host memory.
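
A toy byte-counting model of the access pattern, under assumed CSR storage, 4-byte indices, and 128-byte cacheline granularity; this is not EMOGI's CUDA implementation, only an illustration of why fetching just the needed cachelines can beat copying the whole graph:

```python
# Hedged model: bytes moved by cacheline-granularity access vs. bulk copy.
CACHELINE = 128   # bytes fetched per fine-grained access in this model
IDX = 4           # bytes per vertex/edge index

def bfs_bytes_touched(row_ptr, col_idx, src):
    """Bytes pulled from host memory if BFS fetches only the cachelines holding
    the neighbor lists it actually visits (row_ptr accesses ignored for brevity)."""
    visited, frontier, touched_lines = {src}, [src], set()
    while frontier:
        nxt = []
        for u in frontier:
            lo, hi = row_ptr[u], row_ptr[u + 1]
            if hi > lo:
                first = (lo * IDX) // CACHELINE
                last = (hi * IDX - 1) // CACHELINE
                touched_lines.update(range(first, last + 1))
            for v in col_idx[lo:hi]:
                if v not in visited:
                    visited.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(touched_lines) * CACHELINE

# Two disconnected rings of 1,000 vertices; BFS from vertex 0 never needs the second.
N = 1000
row_ptr, col_idx = [0], []
for base in (0, N):
    for i in range(N):
        col_idx += [base + (i - 1) % N, base + (i + 1) % N]
        row_ptr.append(len(col_idx))

fine_grained = bfs_bytes_touched(row_ptr, col_idx, 0)
bulk_copy = (len(row_ptr) + len(col_idx)) * IDX   # shipping the whole CSR graph
print(fine_grained, "bytes touched vs", bulk_copy, "bytes copied")
```
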
Proceedings Article

Self-adaptive memetic algorithm: an adaptive conjugate gradient approach

TL;DR: The proposed method not only adds no extra computational load to the genetic algorithm but also eliminates the computational burden of tuning the parameters of the hill-climbing operator.
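
A generic memetic-algorithm skeleton for context; the gradient-descent local search below stands in for the paper's self-adaptive conjugate-gradient hill climbing, and the toy objective and parameters are assumptions:

```python
# Hedged sketch of a memetic algorithm: GA plus a local-search refinement step.
import random

def sphere(x):                 # toy objective to minimize
    return sum(v * v for v in x)

def grad_sphere(x):            # analytic gradient of the toy objective
    return [2 * v for v in x]

def hill_climb(x, steps=5, lr=0.1):
    """Simple gradient-descent refinement (stand-in for adaptive conjugate gradient)."""
    for _ in range(steps):
        g = grad_sphere(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

def memetic(dim=5, pop_size=20, generations=30):
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=sphere)
        elite = [hill_climb(ind) for ind in pop[: pop_size // 2]]  # the "memetic" part
        children = []
        while len(children) < pop_size - len(elite):
            a, b = random.sample(elite, 2)
            # Uniform crossover plus Gaussian mutation.
            children.append([random.choice(pair) + random.gauss(0, 0.1) for pair in zip(a, b)])
        pop = elite + children
    return min(pop, key=sphere)

print(round(sphere(memetic()), 6))
```
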
Proceedings Article

Energy Savings via Dead Sub-Block Prediction

TL;DR: The Dead Sub-Block Predictor (DSBP) is proposed to predict which sub-blocks of a cache line will actually be used, and how many times, so that only the necessary sub-blocks are brought into the cache and each is powered off after it has been touched the predicted number of times.
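
A hedged sketch of the prediction table, assuming it is indexed by the PC of the missing load and that lines are split into eight sub-blocks; the paper's exact organization, indexing, and counter widths are not reproduced:

```python
# Hedged sketch of a dead sub-block predictor table.
SUBBLOCKS = 8   # sub-blocks per cache line in this model

class DeadSubBlockPredictor:
    def __init__(self):
        # pc -> (usage bitmap, per-sub-block use counts) learned from evicted lines
        self.table = {}

    def predict(self, pc):
        """On a miss, return which sub-blocks to fetch and their expected use counts;
        fetch everything if there is no history for this pc."""
        return self.table.get(pc, ([1] * SUBBLOCKS, [None] * SUBBLOCKS))

    def train(self, pc, use_counts):
        """On eviction, record which sub-blocks were touched and how often."""
        bitmap = [1 if c > 0 else 0 for c in use_counts]
        self.table[pc] = (bitmap, use_counts)

# Example: a line filled by the load at pc=0x40 only ever touches sub-blocks 0 and 1.
dsbp = DeadSubBlockPredictor()
dsbp.train(0x40, [3, 1, 0, 0, 0, 0, 0, 0])
bitmap, counts = dsbp.predict(0x40)
print(bitmap, counts)   # the next miss at 0x40 fetches only sub-blocks 0 and 1,
                        # powering each off after its predicted number of uses
```
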
Journal Article

DUCATI: High-performance Address Translation by Extending TLB Reach of GPU-accelerated Systems

TL;DR: The combination of these two mechanisms, DUCATI, is an address translation architecture that improves GPU performance by 81% (up to 4.5×) while requiring minimal changes to the existing system design.
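
For context, TLB reach is the amount of memory a TLB can map at once (number of entries × page size); the arithmetic below is generic and not specific to DUCATI's two mechanisms, which the summary does not name:

```python
# Generic TLB-reach arithmetic (not DUCATI-specific): reach = entries x page size.
def tlb_reach(entries, page_size_bytes):
    return entries * page_size_bytes

KB, MB = 1024, 1024 ** 2
print(tlb_reach(512, 4 * KB) // MB, "MiB")   # 512 entries of 4 KiB pages -> 2 MiB of reach
print(tlb_reach(512, 2 * MB) // MB, "MiB")   # same entries with 2 MiB pages -> 1024 MiB
```
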