Proceedings Article

Real-time bag of words, approximately

TLDR
This paper uses the Bag of Words pipeline as the basis for comparing fast alternatives for each of its components, proposes a fast algorithm to densely sample SIFT and SURF, and compares several variants of these descriptors.
Abstract
We start from the state-of-the-art Bag of Words pipeline that in the 2008 benchmarks of TRECvid and PASCAL yielded the best performance scores. We have contributed to that pipeline, which now forms the basis to compare various fast alternatives for all of its components: (i) For descriptor extraction we propose a fast algorithm to densely sample SIFT and SURF, and we compare several variants of these descriptors. (ii) For descriptor projection we compare a k-means visual vocabulary with a Random Forest. As a preprojection step we experiment with PCA on the descriptors to decrease projection time. (iii) For classification we use Support Vector Machines and compare the χ² kernel with the RBF kernel. Our results lead to a 10-fold speed increase without any loss of accuracy and to a 30-fold speed increase with 17% loss of accuracy, where the latter system does real-time classification at 26 images per second.
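
The projection and kernel steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the brute-force nearest-word search, the toy vocabulary, and the particular form of the χ² kernel used here, exp(-γ · ½ Σ (a - b)² / (a + b)), are all assumptions chosen for illustration. Descriptors are assumed to be already extracted (for example by dense SIFT or SURF sampling) and the visual vocabulary is assumed to be already learned (for example by k-means).

```python
import math


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor
    (squared Euclidean distance, brute-force search)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(vocabulary):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, vocabulary):
    """Project descriptors onto the vocabulary and return an
    L1-normalised bag-of-words histogram."""
    counts = [0.0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else counts


def chi2_distance(h1, h2):
    """Chi-squared distance 0.5 * sum((a - b)^2 / (a + b)); bins that are
    empty in both histograms are skipped to avoid division by zero."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-squared kernel; gamma is a hypothetical bandwidth
    parameter that would normally be tuned on held-out data."""
    return math.exp(-gamma * chi2_distance(h1, h2))


# Toy example: a 2-word vocabulary and four 2-D descriptors (made-up values).
vocabulary = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 1.1], [0.8, 0.9]]
histogram = bow_histogram(descriptors, vocabulary)
similarity = chi2_kernel(histogram, [0.5, 0.5])
```

A kernel value computed this way could be passed to a Support Vector Machine as a precomputed kernel matrix. The brute-force nearest-word search is the kind of projection cost that alternatives such as a Random Forest or a PCA preprojection step aim to reduce.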


Citations
Proceedings Article

New trends and ideas in visual concept detection: the MIR flickr retrieval evaluation initiative

TL;DR: This paper provides an overview of the various strategies that were devised for automatic visual concept detection using the MIR Flickr collection, and discusses results from various experiments in combining social data and low-level content-based descriptors to improve the accuracy of visual concept classifiers.
Book Chapter

An eye fixation database for saliency detection in images

TL;DR: A mechanism to automatically determine characteristic fixation seeds for segmentation is proposed and it is shown that the use of fixation seeds generated from multiple fixation clusters on the salient object can lead to a 10% improvement in segmentation performance over the state-of-the-art.
Journal Article

Real-Time Visual Concept Classification

TL;DR: This paper reviews techniques to accelerate concept classification and shows the trade-off between computational efficiency and accuracy; the results lead to a 7-fold speed increase without accuracy loss, and a 70-fold speed increase with 3% accuracy loss.
Journal Article

Multi-scale and real-time non-parametric approach for anomaly detection and localization

TL;DR: An approach for anomaly detection and localization, in video surveillance applications, based on spatio-temporal features that capture scene dynamic statistics together with appearance is proposed, and outperforms other state-of-the-art real-time approaches.
Journal Article

Temporal Video Segmentation to Scenes Using High-Level Audiovisual Features

TL;DR: Improved performance of the proposed approach in comparison to other unimodal and multimodal techniques of the relevant literature is demonstrated and the contribution of high-level audiovisual features toward improved video segmentation to scenes is highlighted.
References
Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
Journal Article

Speeded-Up Robust Features (SURF)

TL;DR: A novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
Proceedings Article

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.
Journal Article

A performance evaluation of local descriptors

TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best, while moments and steerable filters show the best performance among the low-dimensional descriptors.