Open AccessProceedings Article
NoScope: Optimizing Deep CNN-Based Queries over Video Streams at Scale.
Daniel Kang,John Emmons,Firas Abuzaid,Peter Bailis,Matei Zaharia +4 more
- Vol. 10, pp 1586-1597
TLDR
NoScope is a system for querying videos that can reduce the cost of neural network video analysis by up to three orders of magnitude via inference-optimized model search and achieves two to three order of magnitude speed-ups on binary classification tasks over fixed-angle webcam and surveillance video while maintaining accuracy within 1--5% of state-of-the-art neural networks.Abstract:
Recent advances in computer vision---in the form of deep neural networks---have made it possible to query increasing volumes of video data with high accuracy. However, neural network inference is computationally expensive at scale: applying a state-of-the-art object detector in real time (i.e., 30+ frames per second) to a single video requires a $4000 GPU. In response, we present NoScope, a system for querying videos that can reduce the cost of neural network video analysis by up to three orders of magnitude via inference-optimized model search. Given a target video, object to detect, and reference neural network, NoScope automatically searches for and trains a sequence, or cascade, of models that preserves the accuracy of the reference network but is specialized to the target video and are therefore far less computationally expensive. NoScope cascades two types of models: specialized models that forego the full generality of the reference model but faithfully mimic its behavior for the target video and object; and difference detectors that highlight temporal differences across frames. We show that the optimal cascade architecture differs across videos and objects, so NoScope uses an efficient cost-based optimizer to search across models and cascades. With this approach, NoScope achieves two to three order of magnitude speed-ups (265-15,500x real-time) on binary classification tasks over fixed-angle webcam and surveillance video while maintaining accuracy within 1--5% of state-of-the-art neural networks.read more
Citations
More filters
Journal ArticleDOI
Compressed linear algebra for large-scale machine learning
TL;DR: This work initiates work on compressed linear algebra (CLA), in which lightweight database compression techniques are applied to matrices and then linear algebra operations such as matrix-vector multiplication are executed directly on the compressed representations.
Posted Content
Scaling Video Analytics on Constrained Edge Nodes
Christopher Canel,Thomas Kim,Giulio Zhou,Conglong Li,Hyeontaek Lim,David G. Andersen,Michael Kaminsky,Subramanya R. Dulloor +7 more
TL;DR: Evaluation on two real-world camera feed datasets shows that FilterForward reduces bandwidth use by an order of magnitude while improving computational efficiency and event detection accuracy for challenging video content.
Proceedings ArticleDOI
Couper: DNN model slicing for visual analytics containers at the edge
TL;DR: Couper is presented, a practical solution that provides for quick creation of slices of production DNNs for visual analytics, and enables their deployment in contemporary container-based edge software stacks.
Posted Content
TASM: A Tile-Based Storage Manager for Video Analytics
TL;DR: It is demonstrated that TASM can speed up subframe selection queries by an average of over 50% and up to 94% and improve the throughput of the full scan phase of object detection queries by up to 2×.
References
More filters
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Journal ArticleDOI
Deep learning
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Book
Deep Learning
TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Journal ArticleDOI
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky,Jia Deng,Hao Su,Jonathan Krause,Sanjeev Satheesh,Sean Ma,Zhiheng Huang,Andrej Karpathy,Aditya Khosla,Michael S. Bernstein,Alexander C. Berg,Li Fei-Fei +11 more
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Book ChapterDOI
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin,Michael Maire,Serge Belongie,James Hays,Pietro Perona,Deva Ramanan,Piotr Dollár,C. Lawrence Zitnick +7 more
TL;DR: A new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding by gathering images of complex everyday scenes containing common objects in their natural context.