Open Access · Journal Article · DOI

The tensor algebra compiler

TLDR
taco, introduced in this paper, is a C++ library that automatically generates kernels for compound tensor algebra operations on dense and sparse tensors; such operations arise in machine learning, data analytics, engineering, and the physical sciences.
Abstract
Tensor algebra is a powerful tool with applications in machine learning, data analytics, engineering and the physical sciences. Tensors are often sparse and compound operations must frequently be computed in a single kernel for performance and to save memory. Programmers are left to write kernels for every operation of interest, with different mixes of dense and sparse tensors in different formats. The combinations are infinite, which makes it impossible to manually implement and optimize them all. This paper introduces the first compiler technique to automatically generate kernels for any compound tensor algebra operation on dense and sparse tensors. The technique is implemented in a C++ library called taco. Its performance is competitive with best-in-class hand-optimized kernels in popular libraries, while supporting far more tensor operations.
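The abstract's key point is that compound operations should be computed in a single kernel rather than as separate passes with temporaries. As a minimal sketch (plain Python, not taco itself, with hypothetical names), here is the kind of fused kernel taco would generate for y = A*x + z with A stored in the CSR sparse format:

```python
# Sketch of a fused sparse kernel: y = A*x + z in one pass over A,
# instead of computing t = A*x into a temporary and then y = t + z.

def fused_spmv_add(n, pos, crd, vals, x, z):
    """Compute y = A*x + z for an n-row CSR matrix A.

    pos  -- row pointer array (length n+1)
    crd  -- column indices of the stored nonzeros
    vals -- values of the stored nonzeros
    """
    y = [0.0] * n
    for i in range(n):
        acc = z[i]                      # seeding with z[i] fuses the add
        for p in range(pos[i], pos[i + 1]):
            acc += vals[p] * x[crd[p]]  # accumulate row i of A*x
        y[i] = acc
    return y

# 2x3 matrix A = [[1, 0, 2], [0, 3, 0]] in CSR form
pos, crd, vals = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
x = [1.0, 1.0, 1.0]
z = [10.0, 20.0]
print(fused_spmv_add(2, pos, crd, vals, x, z))  # [13.0, 23.0]
```

The single pass avoids materializing the intermediate A*x in memory, which is exactly the performance and memory argument the abstract makes; taco generates such fused loops automatically for arbitrary expressions and format mixes.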


Citations
Proceedings ArticleDOI

Timeloop: A Systematic Approach to DNN Accelerator Evaluation

TL;DR: Timeloop's underlying models and algorithms are described in detail, and results from case studies enabled by Timeloop reveal that dataflow and memory-hierarchy co-design plays a critical role in optimizing energy efficiency.
Posted Content

Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

TL;DR: Contributes Tensor Comprehensions, a language close to the mathematics of deep learning that offers both imperative and declarative styles; a polyhedral just-in-time compiler that converts a mathematical description of a deep learning DAG into a CUDA kernel with delegated memory management and synchronization; and a compilation cache populated by an autotuner.
Proceedings ArticleDOI

ExTensor: An Accelerator for Sparse Tensor Algebra

TL;DR: Proposes the ExTensor accelerator, which builds novel ideas for handling sparsity into hardware to enable better bandwidth utilization and compute throughput, and evaluates it on several kernels against industry libraries and state-of-the-art tensor algebra compilers.
Posted Content

Learning to Optimize Tensor Programs

TL;DR: In this article, a learning-based framework is introduced to optimize tensor programs for deep learning workloads, such as matrix multiplication and high dimensional convolution, which are key enablers of effective deep learning systems.
Journal ArticleDOI

Efficient Processing of Deep Neural Networks

TL;DR: This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs).
References
Journal ArticleDOI

The NumPy Array: A Structure for Efficient Numerical Computation

TL;DR: In this article, the authors show how to improve the performance of numerical code built on NumPy arrays by vectorizing calculations, avoiding copies of data in memory, and minimizing operation counts.
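The vectorization idea this summary names can be illustrated with a small sketch (my own example, not taken from the paper): replacing an interpreted per-element Python loop with a single call into NumPy's optimized C routines.

```python
import numpy as np

def sum_of_squares_loop(a):
    # One interpreted Python iteration per element: slow for large arrays.
    total = 0.0
    for v in a:
        total += v * v
    return total

def sum_of_squares_vectorized(a):
    # A single call into optimized compiled code, no Python-level loop.
    return float(np.dot(a, a))

a = np.arange(1000, dtype=np.float64)
# Integer-valued inputs this small are exact in float64, so both agree exactly.
assert sum_of_squares_loop(a) == sum_of_squares_vectorized(a)
```

The vectorized form also avoids creating an intermediate `a * a` array that `np.sum(a * a)` would allocate, touching on the paper's other themes of avoiding copies and minimizing operation counts.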
Journal ArticleDOI

The university of Florida sparse matrix collection

TL;DR: The University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications, is described, and a new multilevel coarsening scheme is proposed to facilitate visualization of the matrices.