Towards optimal bag-of-features for object categorization and semantic video retrieval

doi:10.1145/1282280.1282352

Proceedings Article

- pp 494-501

TLDR

This paper evaluates the factors that govern the performance of bag-of-features and proposes a novel soft-weighting method to assess the significance of a visual word to an image. Experiments show that the method consistently offers better performance than other popular weighting methods.

Abstract:

Bag-of-features (BoF) derived from local keypoints has recently appeared promising for object and scene classification. Whether BoF can naturally survive challenges such as the reliability and scalability of visual classification nevertheless remains uncertain because of the various implementation choices. In this paper, we evaluate various factors that govern the performance of BoF. The factors include the choice of detector, kernel, vocabulary size, and weighting scheme. We offer some practical insights on how to optimize performance by choosing a good keypoint detector and kernel. For the weighting scheme, we propose a novel soft-weighting method to assess the significance of a visual word to an image. We experimentally show that the proposed soft-weighting scheme can consistently offer better performance than other popular weighting methods. On both the PASCAL-2005 and TRECVID-2006 datasets, our BoF setting generates competitive performance compared to state-of-the-art techniques. We also show that BoF is highly complementary to global features. By incorporating BoF with color and texture features, an improvement of 50% is reported on the TRECVID-2006 dataset.
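The abstract does not spell out the soft-weighting formula, so the following is only a minimal sketch of the general idea, contrasted with hard assignment. It assumes that each keypoint contributes to its `top_n` nearest visual words, that the contribution is halved at each successive rank, and that similarity is `1 / (1 + distance)`. The function names, the default `top_n=4`, and the similarity function are illustrative assumptions, not details taken from this page.

```python
import numpy as np


def pairwise_distances(descriptors, vocabulary):
    """Euclidean distance from every descriptor to every visual word."""
    diff = descriptors[:, None, :] - vocabulary[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def hard_histogram(descriptors, vocabulary):
    """Classic term frequency: each keypoint votes only for its nearest word."""
    dist = pairwise_distances(descriptors, vocabulary)
    nearest = dist.argmin(axis=1)
    return np.bincount(nearest, minlength=len(vocabulary)).astype(float)


def soft_histogram(descriptors, vocabulary, top_n=4):
    """Soft weighting sketch: each keypoint votes for its top_n nearest words.

    Assumptions (hypothetical, for illustration): the vote for the word at
    rank i (0-based) is similarity / 2**i, with similarity = 1 / (1 + distance).
    """
    dist = pairwise_distances(descriptors, vocabulary)
    top_n = min(top_n, len(vocabulary))
    order = np.argsort(dist, axis=1)[:, :top_n]  # word indices, nearest first
    rows = np.arange(len(descriptors))
    hist = np.zeros(len(vocabulary))
    for rank in range(top_n):
        words = order[:, rank]
        similarity = 1.0 / (1.0 + dist[rows, words])
        # np.add.at accumulates correctly when several keypoints hit one word.
        np.add.at(hist, words, similarity / (2 ** rank))
    return hist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocabulary = rng.normal(size=(8, 16))     # 8 visual words, 16-D descriptors
    descriptors = rng.normal(size=(50, 16))   # 50 keypoints from one image
    print(hard_histogram(descriptors, vocabulary))
    print(soft_histogram(descriptors, vocabulary))
```

With hard assignment, a keypoint lying between two words counts fully toward one and not at all toward the other. The soft variant spreads that vote, which is the kind of ambiguity a soft-weighting scheme is meant to address.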

Citations

Journal Article

Evaluating Color Descriptors for Object and Scene Recognition

Koen E. A. van de Sande, +2 more

- 01 Sep 2010 -

IEEE Transactions on Pattern Analysis an...

TL;DR: From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition and the usefulness of invariance is category-specific.

Journal Article

Semantic 3D Object Maps for Everyday Manipulation in Human Living Environments

Radu Bogdan Rusu

- 17 Aug 2010 -

Künstliche Intelligenz

TL;DR: The dissertation presented in this article proposes Semantic 3D Object Models as a novel representation of the robot’s operating environment that satisfies these requirements and shows how these models can be automatically acquired from dense 3D range data.

Proceedings Article

Evaluating bag-of-visual-words representations in scene classification

Jun Yang, +3 more

TL;DR: This study provides an empirical basis for designing visual-word representations that are likely to produce superior classification performance and applies techniques used in text categorization to generate image representations that differ in the dimension, selection, and weighting of visual words.

Journal Article

Visual Word Ambiguity

Jan C. van Gemert, +3 more

- 01 Jul 2010 -

IEEE Transactions on Pattern Analysis an...

TL;DR: It is demonstrated that explicitly modeling visual word assignment ambiguity improves classification performance compared to the hard assignment of the traditional codebook model, and the proposed model performs consistently.

Journal Article

Semi-Supervised Hashing for Large-Scale Search

Jun Wang, +2 more

- 01 Dec 2012 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work proposes a semi-supervised hashing (SSH) framework that minimizes empirical error over the labeled set and an information-theoretic regularizer over both labeled and unlabeled sets, and presents three different semi-supervised hashing methods: orthogonal hashing, nonorthogonal hashing, and sequential hashing.

References

Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Journal Article

LIBSVM: A library for support vector machines

Chih-Chung Chang, +1 more

- 06 May 2011 -

ACM Transactions on Intelligent Systems ...

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

Book

The Nature of Statistical Learning Theory

Vladimir Vapnik

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

Proceedings Article

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Svetlana Lazebnik, +2 more

TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.

Proceedings Article

Video Google: a text retrieval approach to object matching in videos

Sivic, +1 more

TL;DR: An approach to object and scene retrieval that searches for and localizes all occurrences of a user-outlined object in a video. The object is represented by a set of viewpoint-invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination, and partial occlusion.

Related Papers (5)

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe

- 01 Nov 2004 -

International Journal of Computer Vision

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Video Google: a text retrieval approach to object matching in videos

Object recognition from local scale-invariant features

A performance evaluation of local descriptors