Daniel Oliveira

Researcher at Universidade Federal do Rio Grande do Sul

Publications -  30
Citations -  531

Daniel Oliveira is an academic researcher from Universidade Federal do Rio Grande do Sul. The author has contributed to research in topics: Fault injection & Energy consumption. The author has an h-index of 11 and has co-authored 30 publications receiving 403 citations. Previous affiliations of Daniel Oliveira include University of Rio Grande & Federal University of Paraná.

Papers
Proceedings Article

Understanding GPU errors on large-scale HPC systems and the implications for system design and operation

TL;DR: A detailed study is presented to provide a thorough understanding of GPU errors on a large-scale GPU-enabled system, and results from extensive neutron-beam tests are presented to measure the resilience of different generations of GPUs.
Journal Article

Evaluation and Mitigation of Radiation-Induced Soft Errors in Graphics Processing Units

TL;DR: Novel insights on GPU reliability are given by evaluating the neutron sensitivity of modern GPU memory structures, highlighting pattern dependence and multiple-error occurrences. Error-correcting code, algorithm-based fault tolerance, and comparison hardening strategies are presented and evaluated on GPUs through radiation experiments.
Proceedings Article

Experimental and analytical study of Xeon Phi reliability

TL;DR: An in-depth analysis of transient fault effects on HPC applications in Intel Xeon Phi processors, based on radiation experiments and high-level fault injection, is presented, and it is shown that portions of applications can be graded by different criticalities.
Journal Article

Modern GPUs Radiation Sensitivity Evaluation and Mitigation Through Duplication With Comparison

TL;DR: The neutron sensitivity of modern GPU caches and internal resources is experimentally evaluated, and various Duplication With Comparison strategies to reduce GPU radiation sensitivity are presented and validated through radiation experiments.
Proceedings Article

Radiation-Induced Error Criticality in Modern HPC Parallel Accelerators

TL;DR: It is shown that arithmetic operations are less critical for the K40, while Xeon Phi is more reliable when executing particle interactions solved through Finite Difference Methods. Iterative stencil operations seem the most reliable on both architectures.