Khaitan Harshit

Researcher at Google

Publications - 16

Citations - 5840

Khaitan Harshit is an academic researcher at Google. The author has contributed to research on the topics Nested loop join and Loop (topology). The author has an h-index of 5 and has co-authored 16 publications receiving 4422 citations.

Papers

Posted Content

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, +74 more

16 Apr 2017

arXiv: Hardware Architecture

TL;DR: This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), that has been deployed in datacenters since 2015 and accelerates the inference phase of neural networks (NN). It compares the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters.

Proceedings ArticleDOI

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, +75 more

TL;DR: The Tensor Processing Unit (TPU) is a custom ASIC deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). Its matrix multiply unit contains 65,536 8-bit MACs and offers a peak throughput of 92 TeraOps/second (TOPS).
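
The 92 TOPS figure can be checked with a short calculation. The sketch below assumes a 700 MHz clock, which is not stated in this listing, and counts each multiply-accumulate as two operations (one multiply, one add). The function name `peak_tops` is hypothetical and used only for illustration.

```python
def peak_tops(num_macs: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Peak throughput in tera-operations per second.

    Each MAC is counted as ops_per_mac operations per clock cycle.
    """
    return num_macs * ops_per_mac * clock_hz / 1e12


MACS = 65_536        # 8-bit MACs in the matrix multiply unit (from the TL;DR)
CLOCK_HZ = 700e6     # assumption: 700 MHz clock, not given in this listing

print(f"{peak_tops(MACS, CLOCK_HZ):.2f} TOPS")
```

Under these assumptions the result rounds to the quoted 92 TOPS.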

Patent

Neural network compute tile

Olivier Temam, +3 more

TL;DR: This patent describes a computing unit comprising a first memory bank for storing input activations and a second memory bank for storing parameters used in performing computations. The computing unit performs one or more computations associated with at least one element of a data array. The computations are performed by a multiply-accumulate (MAC) operator and comprise, in part, a multiply operation of an input activation received from a data bus and a parameter received from the second memory bank.

Patent

Neural network instruction set architecture

Narayanaswami Ravi, +3 more

TL;DR: This patent describes performing a tensor computation by executing a loop nest comprising a plurality of loops, where the structure of the loop nest is defined by one or more data values of an instruction.
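
As a rough illustration of a loop nest whose structure comes from instruction data, the sketch below treats an instruction as a pair of tuples: per-loop bounds and per-loop strides. The field names `bounds` and `strides` and the function `loop_nest_offsets` are hypothetical, not taken from the patent; the code only shows how the number of loops and their trip counts can be driven by data rather than fixed in the program.

```python
from itertools import product
from typing import Iterator, Sequence


def loop_nest_offsets(bounds: Sequence[int], strides: Sequence[int]) -> Iterator[int]:
    """Yield the memory offset visited at each iteration of a loop nest.

    len(bounds) sets the nesting depth, bounds[k] the trip count of loop k
    (outermost first), and strides[k] the offset step of loop k.
    """
    if len(bounds) != len(strides):
        raise ValueError("bounds and strides must have the same length")
    # product() iterates the last range fastest, like an innermost loop.
    for indices in product(*(range(b) for b in bounds)):
        yield sum(i * s for i, s in zip(indices, strides))


# Hypothetical instruction: a 2x3 row-major traversal.
instruction = {"bounds": (2, 3), "strides": (3, 1)}
offsets = list(loop_nest_offsets(**instruction))
print(offsets)
```

Changing only the instruction data, for example to strides `(1, 2)` with bounds `(2, 2)`, yields a different traversal order without changing the code.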

Patent

Neural network accelerator with parameters resident on chip

Olivier Temam, +3 more

TL;DR: This patent describes an accelerator that includes a computing unit, a first memory bank for storing input activations, and a second memory bank configured to store enough of the neural network parameters on the computing unit to allow latency below a specified level with throughput above a specified level.
