Daniele Paolo Scarpazza
Researcher at IBM
Publications - 29
Citations - 1489
Daniele Paolo Scarpazza is an academic researcher at IBM. The author has contributed to research on topics including string searching algorithms and SIMD. The author has an h-index of 17 and has co-authored 29 publications receiving 1228 citations. Previous affiliations of Daniele Paolo Scarpazza include IMEC and Pacific Northwest National Laboratory.
Papers
Proceedings Article
Anton 2: raising the bar for performance and programmability in a special-purpose molecular dynamics supercomputer
David E. Shaw, J. P. Grossman, Joseph A. Bank, Brannon Batson, J. Adam Butts, Jack C. Chao, Martin M. Deneroff, Ron O. Dror, Amos Even, Christopher H. Fenton, Anthony Forte, Joseph Gagliardo, Gennette Gill, Brian Greskamp, C. Richard Ho, Douglas J. Ierardi, Lev Iserovich, Jeffrey S. Kuskin, Richard H. Larson, Timothy Layman, Li-Siang Lee, Adam Lerer, Chester Li, Daniel Killebrew, Kenneth M. Mackenzie, Shark Yeuk-Hai Mok, Mark A. Moraes, Rolf Mueller, Lawrence J. Nociolo, Jon L. Peticolas, Terry Quan, Daniel Ramot, John K. Salmon, Daniele Paolo Scarpazza, U. Ben Schafer, Naseer Siddique, Christopher W. Snyder, Jochen Spengler, Ping Tak Peter Tang, Michael Theobald, Horia Toma, Brian Towles, Benjamin Vitale, Stanley C. Wang, Cliff Young
TL;DR: The architecture of Anton 2 is tailored for fine-grained event-driven operation, which improves performance by increasing the overlap of computation with communication, and also allows a wider range of algorithms to run efficiently, enabling many new software-based optimizations.
Posted Content
Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking
TL;DR: This technical report presents the microarchitectural details of the NVIDIA Volta architecture, discovered through microbenchmarks and instruction set disassembly, and quantitatively compares the findings against its predecessors: Kepler, Maxwell, and Pascal.
Posted Content
Dissecting the Graphcore IPU Architecture via Microbenchmarking
TL;DR: This report focuses on the architecture and performance of the Intelligence Processing Unit (IPU), a novel, massively parallel platform recently introduced by Graphcore and aimed at Artificial Intelligence/Machine Learning (AI/ML) workloads.
Journal Article
Efficient Breadth-First Search on the Cell/BE Processor
TL;DR: The proposed methodology combines a high-level algorithmic design that captures the machine-independent aspects, so that performance remains portable to future processors, with an implementation that embeds processor-specific optimizations.
Posted Content
Dissecting the NVidia Turing T4 GPU via Microbenchmarking
TL;DR: This report examines Turing and compares it quantitatively against previous NVidia GPU generations. It reveals that Turing introduces new instructions that express matrix math more succinctly, and it maps Turing's instruction space, finding the same encoding as Volta plus additional instructions.