Open Access, Journal Article

A performance evaluation of gradient field HOG descriptor for sketch based image retrieval

Rui Hu, +1 more
- 01 Jul 2013 - 
- Vol. 117, Iss: 7, pp 790-806
TLDR
Gradient Field HOG (GF-HOG), an adapted form of the HOG descriptor suitable for Sketch Based Image Retrieval (SBIR), is described and incorporated into a Bag of Visual Words retrieval framework, where it is shown to consistently outperform SIFT, multi-resolution HOG, Self Similarity, Shape Context and Structure Tensor.
About
This article is published in Computer Vision and Image Understanding. The article was published on 2013-07-01 and is currently open access. It has received 363 citations to date. The article focuses on the topics: Visual Word & Image retrieval.

Citations
Journal Article

The sketchy database: learning to retrieve badly drawn bunnies

TL;DR: The Sketchy database is presented, the first large-scale collection of sketch-photo pairs and it is shown that the learned representation significantly outperforms both hand-crafted features as well as deep features trained for sketch or photo classification.
Proceedings Article

Sketch Me That Shoe

TL;DR: A deep triplet ranking model for instance-level SBIR is developed with a novel data augmentation and staged pre-training strategy to alleviate the issue of insufficient fine-grained training data.
Journal Article

Sketch-a-Net: A Deep Neural Network that Beats Humans

TL;DR: It is shown that state-of-the-art deep networks specifically engineered for photos of natural objects fail to perform well on sketch recognition, regardless of whether they are trained using photos or sketches.
Proceedings Article

Deep Sketch Hashing: Fast Free-Hand Sketch-Based Image Retrieval

TL;DR: This paper introduces a novel binary coding method, named Deep Sketch Hashing (DSH), in which a semi-heterogeneous deep architecture is proposed and incorporated into an end-to-end binary coding framework. It is the first hashing work specifically designed for category-level SBIR with an end-to-end deep architecture.
Proceedings Article

Deep Spatial-Semantic Attention for Fine-Grained Sketch-Based Image Retrieval

TL;DR: A novel deep FG-SBIR model is proposed which differs significantly from the existing models in that it is spatially aware, achieved by introducing an attention module that is sensitive to the spatial position of visual details and combines coarse and fine semantic information via a shortcut connection fusion block.
References
Journal Article

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Proceedings Article

Histograms of oriented gradients for human detection

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Proceedings Article

Mining association rules between sets of items in large databases

TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.
Journal Article

WordNet : an electronic lexical database

Christiane Fellbaum
- 01 Sep 2000 - 
TL;DR: The lexical database: nouns in WordNet, Katherine J. Miller a semantic network of English verbs, and applications of WordNet: building semantic concordances are presented.
Journal Article

A performance evaluation of local descriptors

TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Frequently Asked Questions (15)
Q1. What contributions have the authors mentioned in the paper "A performance evaluation of gradient field HOG descriptor for sketch based image retrieval"?

The authors present an image retrieval system for the interactive search of photo collections using free-hand sketches depicting shape. The authors describe Gradient Field HOG (GF-HOG), an adapted form of the HOG descriptor suitable for sketch based image retrieval (SBIR). The authors incorporate GF-HOG into a Bag of Visual Words (BoVW) retrieval framework, and demonstrate how this combination may be harnessed both for robust SBIR and for localizing sketched objects within an image. The authors compare GF-HOG against state-of-the-art descriptors with common distance measures and language models for image retrieval, and explore how affine deformation of the sketch impacts search performance. Further, the authors incorporate semantic keywords into their GF-HOG system to enable the use of annotated sketches for image search.

Future directions of this work will explore more sophisticated combination schemes, for example kernel canonical correlation analysis (KCCA) [68], which has been used to good effect combining photorealistic and textual constraints outside the domain of SBIR. It may be possible to draw upon work on grouping regions for structure invariant matching [69] to select an appropriate set of scales for edge detection and further improve retrieval accuracy. However, the authors believe such enhancements are not necessary to demonstrate the robustness and performance of GF-HOG for SBIR, and its potential for use in sketch based retrieval applications such as sketch-text search and photo montage.

The authors apply Random Sample Consensus (RANSAC) to fit the sketched shape to the image via a constrained affine transformation with four degrees of freedom (uniform scale, rotation and translation).
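As an illustration of how such a four-degree-of-freedom fit can be estimated, the following is a minimal sketch and not the authors' implementation. It assumes that point correspondences between sketch and image are already available, represents 2D points as complex numbers so that the transform is `q = a*p + b` (scale and rotation in `a`, translation in `b`), and uses hypothetical function names, iteration count and inlier tolerance.

```python
import random

def fit_similarity(p1, p2, q1, q2):
    """Solve q = a*p + b from two correspondences; points are complex (x + iy)."""
    if p1 == p2:
        return None  # degenerate sample, no unique transform
    a = (q1 - q2) / (p1 - p2)  # abs(a) is the uniform scale, the phase of a is the rotation
    b = q1 - a * p1            # translation
    return a, b

def ransac_similarity(src, dst, iters=200, tol=2.0, seed=0):
    """Return (a, b, inlier_indices) for the sampled model with the most inliers."""
    rng = random.Random(seed)
    best = (None, None, [])
    for _ in range(iters):
        i, j = rng.sample(range(len(src)), 2)  # minimal sample: two correspondences
        model = fit_similarity(src[i], src[j], dst[i], dst[j])
        if model is None:
            continue
        a, b = model
        inliers = [k for k in range(len(src))
                   if abs(a * src[k] + b - dst[k]) <= tol]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best
```

Two correspondences are the minimal sample because each supplies two equations and the model has four unknowns. A practical system would typically refit the transform on all inliers of the winning model.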

In their experiments using linear search with the Cityblock distance, retrieval time increases approximately linearly with database size.

Given a vocabulary $V = \{w_1, \ldots, w_K\}$ of $K$ keywords present within all image tags, the similarity of a pair of keyword tags is commonly defined using tag co-occurrence.

Digital image repositories are commonly indexed using manually annotated keyword tags that indicate the presence of salient objects or concepts. 

In order to encode the relative location and spatial orientation of sketches or Canny edges of images, the authors represent image structure using a dense gradient field interpolated from the sparse set of edge pixels. 
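One simple way to picture this interpolation is to hold the values at edge pixels fixed and repeatedly replace every other pixel with the mean of its neighbours, which relaxes towards a smooth (Laplace) solution. The sketch below is an assumption about the smoothness constraint rather than the paper's solver; the function name, the 4-neighbour stencil and the iteration count are hypothetical, and the wrap-around of angular values is ignored for brevity.

```python
def interpolate_field(known, height, width, iters=500):
    """Densify sparse values by Jacobi relaxation.

    known: dict {(row, col): value} holding one scalar per edge pixel
           (for example an edge orientation). These pixels stay fixed.
    Returns a height x width grid as a list of lists.
    """
    mean = sum(known.values()) / len(known)
    # Start unknown pixels at the mean of the known values.
    field = [[known.get((r, c), mean) for c in range(width)]
             for r in range(height)]
    for _ in range(iters):
        nxt = [row[:] for row in field]
        for r in range(height):
            for c in range(width):
                if (r, c) in known:
                    continue  # edge pixels act as fixed boundary values
                nbrs = [field[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < height and 0 <= cc < width]
                nxt[r][c] = sum(nbrs) / len(nbrs)
        field = nxt
    return field
```

A pure-Python loop like this is only practical for small grids; it is meant to show the idea of propagating sparse edge information into a dense field, not to be efficient.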

Chalechale et al. [34] employ the angular-spatial distribution of pixels in the abstract images to extract features using the Fourier transform.

Whilst a variety of local descriptors such as SIFT, SSIM and HOG have been successfully used in image retrieval and classification tasks [57], it is still unclear how various local descriptors perform in SBIR.

In all experiments, the photos and the sketch canvas were pre-scaled so that their largest dimension (width or height) was 200 pixels.
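The pre-scaling step amounts to a small computation, sketched below. Preserving the aspect ratio is an assumption (the text states only the size of the largest dimension), and the function name is hypothetical.

```python
def prescale_size(width, height, target=200):
    """Return (new_width, new_height) so that the largest dimension equals target.

    Assumes the aspect ratio is preserved; sizes are rounded to whole pixels.
    """
    scale = target / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, under this assumption a 640 x 480 photo would be resized to 200 x 150 before descriptors are extracted.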

In this paper, the authors explore a kd-tree index based on the Cityblock distance to reduce retrieval time, since the Cityblock distance achieves performance comparable to the best results obtained with the Histogram Intersection distance (shown in Fig. 7) and its linear geometry makes it easy to adapt to the kd-tree indexing technique.
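The reason the Cityblock distance fits a kd-tree well is that the difference along any single axis is a lower bound on the full distance, so a whole subtree can be skipped when the query is further from the splitting plane than the best match found so far. The following is a minimal sketch of that idea, not the authors' index; the function names and the tuple-based tree layout are hypothetical.

```python
def cityblock(u, v):
    """Cityblock (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def build_kdtree(items, depth=0):
    """items: list of (vector, image_id). Returns nested tuples, or None if empty."""
    if not items:
        return None
    axis = depth % len(items[0][0])          # cycle through the dimensions
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2                    # median item becomes this node
    return (items[mid], axis,
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    """Return (distance, image_id) of the Cityblock-nearest stored vector."""
    if node is None:
        return best
    (vec, image_id), axis, left, right = node
    d = cityblock(query, vec)
    if best is None or d < best[0]:
        best = (d, image_id)
    diff = query[axis] - vec[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    # The far side can only hold a closer vector if the splitting plane
    # is nearer than the current best distance.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best
```

The pruning benefit of a kd-tree generally shrinks as dimensionality grows, so the speed-up on real BoVW histograms depends on the vocabulary size.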

The authors compute the similarity between two sets of tags $C^1 = \{C^1_1, C^1_2, \ldots, C^1_N\}$ and $C^2 = \{C^2_1, C^2_2, \ldots, C^2_M\}$, corresponding to images $I_1$ and $I_2$, as:

$$\frac{\sum_{m=1}^{M} \max_{n}\{p(C^1_n \mid C^2_m)\}}{N} + \frac{\sum_{n=1}^{N} \max_{m}\{p(C^1_n \mid C^2_m)\}}{M} \qquad (7)$$

where $p(C^1_n \mid C^2_m)$ calculates the co-occurrence probability of two tags via the shortest-path techniques of Subsec. 5.1.
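A small sketch of Eq. (7) follows. It reads the extracted formula as two sums of best-match probabilities, the first divided by N and the second by M, which is an interpretation of the garbled layout rather than a confirmed transcription. The pairwise probability `p` is passed in as a function because the shortest-path computation of Subsec. 5.1 is not reproduced here; the function name is hypothetical.

```python
def tag_set_similarity(tags1, tags2, p):
    """Similarity of two tag sets under the assumed reading of Eq. (7).

    tags1: the N tags of image I1; tags2: the M tags of image I2.
    p(c1, c2): co-occurrence probability p(C1_n | C2_m), supplied by the caller.
    """
    n, m = len(tags1), len(tags2)
    # For each tag of I2, take its best-matching tag in I1.
    first = sum(max(p(c1, c2) for c1 in tags1) for c2 in tags2) / n
    # For each tag of I1, take its best-matching tag in I2.
    second = sum(max(p(c1, c2) for c2 in tags2) for c1 in tags1) / m
    return first + second
```

Taking the maximum in both directions means every tag contributes its best counterpart in the other set, regardless of how the two sets differ in size.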

The authors also experiment with eight commonly used distance measures, ranging from norms to metrics frequently used in text ("Bag of Words") retrieval.
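To make the comparison concrete, the sketch below implements two measures that are named elsewhere in these excerpts, the Cityblock distance and Histogram Intersection, and ranks a database of BoVW histograms with each. The full set of eight measures is not listed here, so no attempt is made to reproduce it; the function names and the toy histograms in the usage note are hypothetical.

```python
def cityblock(u, v):
    """Cityblock (L1) distance: smaller means more similar."""
    return sum(abs(a - b) for a, b in zip(u, v))

def histogram_intersection(u, v):
    """Histogram Intersection similarity: larger means more similar."""
    return sum(min(a, b) for a, b in zip(u, v))

def rank(query, database, measure, larger_is_better=False):
    """Return database indices ordered from best to worst match."""
    scores = [measure(query, h) for h in database]
    return sorted(range(len(database)), key=lambda i: scores[i],
                  reverse=larger_is_better)
```

A call such as `rank(query, database, histogram_intersection, larger_is_better=True)` returns image indices in retrieval order; swapping in `cityblock` with the default flag gives the ranking under the other measure.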

The early nineties delivered several SBIR algorithms capable of matching photographs with queries comprising blobs of color, or predefined texture. 

As expected, the greater the affine deviation of the sketch from the typical configuration of the target objects in each category, the greater the degradation in performance (MAP) under rotation and scaling.