Daniel Oliveira
Researcher at Universidade Federal do Rio Grande do Sul
Publications - 30
Citations - 531
Daniel Oliveira is an academic researcher from Universidade Federal do Rio Grande do Sul. The author has contributed to research in topics: Fault injection & Energy consumption. The author has an h-index of 11 and has co-authored 30 publications receiving 403 citations. Previous affiliations of Daniel Oliveira include University of Rio Grande & Federal University of Paraná.
Papers
Proceedings Article
Understanding GPU errors on large-scale HPC systems and the implications for system design and operation
Devesh Tiwari, Saurabh Gupta, James H. Rogers, Don Maxwell, Paolo Rech, Sudharshan S. Vazhkudai, Daniel Oliveira, Dave Londo, Nathan DeBardeleben, Philippe O. A. Navaux, Luigi Carro, Arthur S. Bland
TL;DR: A detailed study is presented to provide a thorough understanding of GPU errors on a large-scale GPU-enabled system, and results from extensive neutron-beam tests are presented to measure the resilience of different generations of GPUs.
Journal Article
Evaluation and Mitigation of Radiation-Induced Soft Errors in Graphics Processing Units
TL;DR: Novel insights on GPU reliability are given by evaluating the neutron sensitivity of modern GPU memory structures, highlighting pattern dependence and multiple-error occurrences. Error-correcting code, algorithm-based fault tolerance, and comparison hardening strategies are presented and evaluated on GPUs through radiation experiments.
Proceedings Article
Experimental and analytical study of Xeon Phi reliability
Daniel Oliveira, Laércio Lima Pilla, Nathan DeBardeleben, Sean Blanchard, Heather Quinn, Israel Koren, Philippe O. A. Navaux, Paolo Rech
TL;DR: An in-depth analysis of transient fault effects on HPC applications in Intel Xeon Phi processors, based on radiation experiments and high-level fault injection, is presented, and it is shown that portions of applications can be graded by different criticalities.
Journal Article
Modern GPUs Radiation Sensitivity Evaluation and Mitigation Through Duplication With Comparison
Daniel Oliveira, Paolo Rech, Heather Quinn, Thomas D. Fairbanks, Laura Monroe, Sarah E. Michalak, Christine M. Anderson-Cook, Philippe O. A. Navaux, Luigi Carro
TL;DR: The neutron sensitivity of modern GPU caches and internal resources is experimentally evaluated, and various Duplication With Comparison strategies to reduce GPU radiation sensitivity are presented and validated through radiation experiments.
Proceedings Article
Radiation-Induced Error Criticality in Modern HPC Parallel Accelerators
Daniel Oliveira, Laércio Lima Pilla, Mauricio Hanzich, Vinicius Fratin, Fernando Antonio Da Silva Fernandes, Caio Lunardi, José María Cela, Philippe O. A. Navaux, Luigi Carro, Paolo Rech
TL;DR: It is shown that arithmetic operations are less critical for the K40, while Xeon Phi is more reliable when executing particle interactions solved through Finite Difference Methods, and iterative stencil operations seem the most reliable on both architectures.