Shoaib Kamil

Researcher at Adobe Systems

Publications - 84
Citations - 4429

Shoaib Kamil is an academic researcher from Adobe Systems. The author has contributed to research in topics: Compiler & Code generation. The author has an h-index of 29 and has co-authored 80 publications receiving 3773 citations. Previous affiliations of Shoaib Kamil include Lawrence Berkeley National Laboratory & University of California, Berkeley.

Papers
Report

Science Driven Supercomputing Architectures: Analyzing Architectural Bottlenecks with Applications and Benchmark Probes

TL;DR: Improved understanding of the performance behavior of scientific applications will allow improved performance predictions, development of adequate benchmarks for identification of hardware and application features that work well or poorly together, and a more systematic performance evaluation in procurement situations.
Proceedings Article

Parallel High Performance Bootstrapping in Python

TL;DR: This work uses a combination of code-generation, code lowering, and just-in-time compilation techniques called SEJITS (Selective Embedded JIT Specialization) to generate highly performant parallel code for Bag of Little Bootstraps (BLB), a statistical sampling algorithm that solves the same class of problems as general bootstrapping, but which parallelizes better.
Journal Article

Reconfigurable Hybrid Interconnection for Static and Dynamic Scientific Applications

TL;DR: In this article, the authors present a hybrid switch architecture called HFAST that uses circuit switches to dynamically reconfigure a lower-degree interconnect to suit the topological requirements of each scientific application.
Posted Content

Automatic Generation of Sparse Tensor Kernels with Workspaces

TL;DR: This work describes a compiler optimization called operator splitting that breaks up tensor sub-computations by introducing workspaces and shows that it increases the performance of important generated tensor kernels to match hand-optimized code.
Posted Content

Technical Report about Tiramisu: a Three-Layered Abstraction for Hiding Hardware Complexity from DSL Compilers

TL;DR: Tiramisu is introduced, a common middle-end that can generate efficient code for modern processors and accelerators such as multicores, GPUs, FPGAs, and distributed clusters. It introduces a novel three-level IR whose separation of layers simplifies optimization and makes targeting multiple hardware architectures from the same algorithm easier.