Aki Kuusela

Researcher at Google

Publications - 20

Citations - 465

Aki Kuusela is an academic researcher from Google. The author has contributed to research on the topics Encoder and Deep learning. The author has an h-index of 8 and has co-authored 19 publications receiving 274 citations.

Papers

Proceedings Article

Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks

Amirali Boroumand, +10 more

TL;DR: This work comprehensively analyzes the energy and performance impact of data movement for several widely used Google consumer workloads, and finds that processing-in-memory (PIM) can significantly reduce data movement for all of these workloads by performing part of the computation close to memory.

Journal Article

Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors

Claudionor Coelho, +9 more

01 Aug 2021

Nature Machine Intelligence

TL;DR: In this paper, a method for designing optimally heterogeneously quantized versions of deep neural network models for minimum energy, high-accuracy, nanosecond inference and fully automated deployment on chip is introduced.

Posted Content

Automatic deep heterogeneous quantization of Deep Neural Networks for ultra low-area, low-latency inference on the edge at particle colliders

Claudionor Coelho, +9 more

TL;DR: A novel method for designing optimally heterogeneously quantized versions of deep neural network models for minimum-energy, high-accuracy, nanosecond inference and fully automated deployment on chip is introduced.

Proceedings Article

Warehouse-scale video acceleration: co-design and deployment in the wild

Parthasarathy Ranganathan, +51 more

TL;DR: In this paper, the authors describe the design and deployment of a new accelerator targeted at warehouse-scale video transcoding, and discuss key design trade-offs for balanced systems at data center scale and co-designing accelerators with large-scale distributed software systems.

Posted Content

Ultra Low-latency, Low-area Inference Accelerators using Heterogeneous Deep Quantization with QKeras and hls4ml

Claudionor Coelho, +7 more

TL;DR: The QKeras library is introduced, an extension of the Keras library allowing for the creation of heterogeneously quantized versions of deep neural network models, through drop-in replacement of Keras layers, which significantly reduces resource consumption while retaining high accuracy when implemented on FPGA hardware.
