Hideki Saito
Researcher at Intel
Publications - 41
Citations - 851
Hideki Saito is an academic researcher at Intel. The author has contributed to research on compilers and SIMD, has an h-index of 18, and has co-authored 40 publications receiving 814 citations. Previous affiliations of Hideki Saito include the University of Illinois at Urbana–Champaign.
Papers
Journal Article
Can traditional programming bridge the Ninja performance gap for parallel computing applications
Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind B. Girkar, Pradeep Dubey
TL;DR: It is demonstrated that the otherwise uncontrolled growth of the Ninja gap can be contained, yielding more stable and predictable performance growth over future architectures and offering strong evidence that radical language changes are not required.
Journal Article
Can traditional programming bridge the ninja performance gap for parallel computing applications
Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind B. Girkar, Pradeep Dubey
TL;DR: It is demonstrated that one can contain the otherwise uncontrolled growth of the Ninja gap and obtain more stable and predictable performance growth over future architectures, offering strong evidence that radical language changes are not required.
Proceedings Article
SPEC MPI2007—an application benchmark suite for parallel systems using MPI
Matthias S. Müller, Matthijs van Waveren, Ron Lieberman, Brian Whitney, Hideki Saito, Kalyan Kumaran, John Baron, William C. Brantley, Chris Parrott, Tom Elken, Huiyu Feng, Carl Ponder
TL;DR: The benchmark suite SPEC MPI2007, which includes 13 technical computing applications from the fields of computational fluid dynamics, molecular dynamics, electromagnetism, geophysics, ray tracing, and hydrodynamics, is described and compared with other benchmark suites.
Proceedings Article
Practical SIMD Vectorization Techniques for Intel® Xeon Phi Coprocessors
Xinmin Tian, Hideki Saito, Serguei V. Preis, Eric N. Garcia, Sergey S. Kozhukhov, Matt Masten, Aleksei G. Cherkasov, Nikolay Panchenko
TL;DR: Several practical SIMD vectorization techniques are presented, including less-than-full-vector loop vectorization, Intel® MIC-specific alignment optimization, and 2-D vectorization of small matrix transpose/multiplication, as implemented in the Intel® C/C++ and Fortran production compilers for Intel® Xeon Phi coprocessors.
On the Performance Potential of Different Types of Speculative Thread-Level Parallelism
Arun Kejariwal, Xinmin Tian, Wei Li, Milind Girkar, Sergey S. Kozhukhov, Hideki Saito, Utpal Banerjee, Alexandru Nicolau, Alexander V. Veidenbaum, Constantine D. Polychronopoulos
TL;DR: This study shows that, at the loop level, the upper bounds on the arithmetic-mean and geometric-mean speedups achievable via TLS across SPEC CPU2000 are 39.16% (standard deviation = 31.23) and 18.18%, respectively.