Journal ArticleDOI

A 201.4 GOPS 496 mW Real-Time Multi-Object Recognition Processor With Bio-Inspired Neural Perception Engine

TL;DR
In the proposed hardware architecture, the three recognition tasks (visual perception, descriptor generation, and object decision) are directly mapped to the neural perception engine, 16 SIMD processors comprising 128 processing elements, and the decision processor, respectively, and executed in a pipeline to maximize object-recognition throughput.
Abstract
A 201.4 GOPS real-time multi-object recognition processor is presented with a three-stage pipelined architecture. A visual-perception-based multi-object recognition algorithm is applied to direct multiple attentions to multiple objects in the input image. For human-like multi-object perception, a neural perception engine is proposed with biologically inspired neural networks and fuzzy logic circuits. In the proposed hardware architecture, the three recognition tasks (visual perception, descriptor generation, and object decision) are directly mapped to the neural perception engine, 16 SIMD processors comprising 128 processing elements, and the decision processor, respectively, and executed in a pipeline to maximize object-recognition throughput. For efficient task pipelining, the proposed task/power manager balances the execution times of the three stages based on intelligent workload estimations. In addition, a 118.4 GB/s multi-casting network-on-chip is proposed as the communication architecture, interconnecting 21 IP blocks in total. For low-power object recognition, workload-aware dynamic power management is performed at the chip level. The 49 mm² chip is fabricated in a 0.13 µm 8-metal CMOS process and contains 3.7 M gates and 396 KB of on-chip SRAM. It achieves 60 frame/s multi-object recognition of up to 10 different objects for VGA (640 × 480) video input while dissipating 496 mW at 1.2 V. The resulting energy efficiency of 8.2 mJ/frame is 3.2 times better than that of the state-of-the-art recognition processor.
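The abstract describes a three-stage pipeline (visual perception, descriptor generation, object decision) whose stage execution times are balanced by a task/power manager using workload estimates. The Python sketch below illustrates that idea at a purely behavioral level; the names (Frame, estimate_workload, the per-stage functions) and the simple workload model are illustrative assumptions, not the paper's implementation.

# Behavioral sketch of a three-stage recognition pipeline whose per-stage
# time budget is scaled by an estimated workload (number of attended
# objects per frame). Names and the workload model are illustrative.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    frame_id: int
    attended_objects: int                     # regions selected by visual perception
    descriptors: List[str] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)


def visual_perception(frame: Frame) -> Frame:
    # Stand-in for the neural perception engine: select regions of interest.
    frame.attended_objects = min(frame.attended_objects, 10)   # chip handles up to 10 objects
    return frame


def descriptor_generation(frame: Frame) -> Frame:
    # Stand-in for the 16 SIMD processors: one descriptor set per attended object.
    frame.descriptors = [f"desc_{frame.frame_id}_{i}" for i in range(frame.attended_objects)]
    return frame


def object_decision(frame: Frame) -> Frame:
    # Stand-in for the decision processor: map descriptors to object labels.
    frame.labels = [d.replace("desc", "obj") for d in frame.descriptors]
    return frame


def estimate_workload(frame: Frame) -> float:
    # Crude workload estimate: proportional to the number of attended objects.
    return frame.attended_objects / 10.0


def run_pipeline(frames: List[Frame], frame_budget_ms: float = 16.7) -> None:
    # In hardware the three stages run concurrently on consecutive frames;
    # the sequential loop here only makes the stage ordering and the
    # workload-driven budgeting (a proxy for dynamic power management) explicit.
    for frame in frames:
        stage_budget = frame_budget_ms / 3 * max(estimate_workload(frame), 0.1)
        for stage in (visual_perception, descriptor_generation, object_decision):
            frame = stage(frame)
        print(f"frame {frame.frame_id}: {len(frame.labels)} objects, "
              f"budget/stage ~{stage_budget:.2f} ms")


if __name__ == "__main__":
    run_pipeline([Frame(0, 3), Frame(1, 12), Frame(2, 1)])

In the chip the three stages overlap on consecutive frames, so balancing their execution times is what keeps the pipeline fully utilized; the sequential loop above only exposes that budgeting logic.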



Citations
Proceedings ArticleDOI

DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning

TL;DR: This study designs an accelerator for large-scale CNNs and DNNs, with a special emphasis on the impact of memory on accelerator design, performance and energy, and shows that it is possible to design an accelerator with a high throughput, capable of performing 452 GOP/s in a small footprint.
Journal ArticleDOI

PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory

TL;DR: This work proposes a novel PIM architecture, called PRIME, to accelerate NN applications in ReRAM based main memory, and distinguishes itself from prior work on NN acceleration, with significant performance improvement and energy saving.
Proceedings ArticleDOI

Project Adam: building an efficient and scalable deep learning training system

TL;DR: This paper presents the design and implementation of Adam, a distributed system built from commodity server machines for training large deep neural network models; it exhibits world-class performance, scaling, and task accuracy on visual recognition tasks and shows that task accuracy improves with larger models.
Posted Content

A Survey of Neuromorphic Computing and Neural Networks in Hardware.

TL;DR: An exhaustive review of the research conducted in neuromorphic computing since the inception of the term is provided to motivate further work by illuminating gaps in the field where new research is needed.
Proceedings ArticleDOI

NeuFlow: A runtime reconfigurable dataflow processor for vision

TL;DR: A scalable dataflow hardware architecture optimized for the computation of general-purpose vision algorithms (neuFlow) and a dataflow compiler (luaFlow) that transforms high-level flow-graph representations of these algorithms into machine code for neuFlow are presented.
References
Journal ArticleDOI

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
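Both entries above summarize Lowe's SIFT, which extracts scale-invariant keypoints and descriptors that can be matched across differing views of an object. A minimal matching sketch using OpenCV's SIFT implementation follows; it assumes opencv-python 4.4 or later (where SIFT_create is in the main module), and the two image paths are placeholders.

# Sketch: match SIFT keypoints between two views of an object.
# Assumes opencv-python >= 4.4; the image paths are placeholders.
import cv2

img1 = cv2.imread("object_view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("object_view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Keep only matches that pass Lowe's ratio test (0.75 is a common setting).
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative correspondences after the ratio test")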
Journal ArticleDOI

A model of saliency-based visual attention for rapid scene analysis

TL;DR: In this article, a visual attention system inspired by the behavior and the neuronal architecture of the early primate visual system is presented, where multiscale image features are combined into a single topographical saliency map.
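The saliency model summarized above combines multiscale feature maps (intensity, color, orientation) into a single topographical saliency map through center-surround differencing and normalized summation. The NumPy/SciPy sketch below reduces this to an intensity-only illustration of that combination step; the chosen scales and the min-max normalization are simplified assumptions, not the published model.

# Reduced sketch of the saliency-map combination step: center-surround
# differences of an intensity image at two scales are normalized and
# summed into a single map. Intensity-only; not the full Itti-Koch model.
import numpy as np
from scipy.ndimage import gaussian_filter


def center_surround(img: np.ndarray, center_sigma: float, surround_sigma: float) -> np.ndarray:
    # A difference of Gaussians approximates a center-surround receptive field.
    return np.abs(gaussian_filter(img, center_sigma) - gaussian_filter(img, surround_sigma))


def normalize(feature_map: np.ndarray) -> np.ndarray:
    # Rescale each feature map to [0, 1] before summation so no scale dominates.
    span = feature_map.max() - feature_map.min()
    return (feature_map - feature_map.min()) / span if span > 0 else feature_map


def saliency(intensity: np.ndarray) -> np.ndarray:
    maps = [center_surround(intensity, c, s) for c, s in ((1, 4), (2, 8))]
    return sum(normalize(m) for m in maps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    img[20:28, 20:28] += 2.0          # a bright patch should pop out
    sal = saliency(img)
    y, x = np.unravel_index(np.argmax(sal), sal.shape)
    print(f"most salient location: ({y}, {x})")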
Book

The Organization of Behavior

D. O. Hebb

A model of saliency-based visual attention for rapid scene analysis

Laurent Itti
TL;DR: A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented, which breaks down the complex problem of scene understanding by rapidly selecting conspicuous locations to be analyzed in detail.