Proceedings ArticleDOI

TC-Net for iSBIR: Triplet Classification Network for Instance-level Sketch Based Image Retrieval

TL;DR: A Triplet Classification Network (TC-Net) for iSBIR is presented, composed of two major components: a triplet Siamese network and an auxiliary classification loss, which overcomes the limitations of previous works.
Abstract
Sketch has been employed as an effective communication tool to express the abstract and intuitive meaning of objects. While content-based sketch recognition has been studied for several decades, the instance-level Sketch Based Image Retrieval (iSBIR) task has attracted significant research attention recently. In many previous iSBIR works, e.g., TripletSN and DSSA, edge maps were employed as intermediate representations in bridging the cross-domain discrepancy between photos and sketches. However, it is nontrivial to efficiently train and effectively use the edge maps in an iSBIR system. In particular, we find that such an edge map based iSBIR system has several major limitations. First, the system has to be pre-trained on a significant amount of edge maps, either from large-scale sketch datasets, e.g., TU-Berlin, or converted from other large-scale image datasets, e.g., the ImageNet-1K dataset. Second, the performance of such an iSBIR system is very sensitive to the quality of the edge maps. Third, empirically, the multi-cropping strategy is essential to the performance of previous iSBIR systems. To address these limitations, this paper advocates an end-to-end iSBIR system that does not use edge maps. Specifically, we present a Triplet Classification Network (TC-Net) for iSBIR composed of two major components: a triplet Siamese network and an auxiliary classification loss. Our TC-Net overcomes the limitations of previous works. Extensive experiments on several datasets validate the efficacy of the proposed network and system.
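The two components of the abstract's training objective can be sketched in plain Python. The snippet below is a minimal, framework-free illustration, not the authors' implementation: embeddings are plain lists standing in for CNN features, and the margin of 0.3 and the equal weighting of the two terms are assumptions for illustration only.

```python
import math

def l2(a, b):
    # Squared Euclidean distance between two embedding vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.3):
    # Hinge-style triplet loss: pull the sketch (anchor) toward its
    # matching photo (positive) and away from a non-matching photo.
    return max(0.0, l2(anchor, positive) - l2(anchor, negative) + margin)

def cross_entropy(logits, label):
    # Auxiliary classification loss: numerically stable softmax
    # cross-entropy over logits from a shared classifier head.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def tc_net_loss(anchor, positive, negative, logits, label,
                margin=0.3, weight=1.0):
    # Total objective: triplet ranking term plus a weighted auxiliary
    # classification term (the weighting scheme here is an assumption).
    return (triplet_loss(anchor, positive, negative, margin)
            + weight * cross_entropy(logits, label))
```

With a well-separated triplet the ranking term vanishes and only the classification term contributes, which is the mechanism by which the auxiliary loss keeps supplying gradient signal after the embedding margins are satisfied.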


Citations
Proceedings ArticleDOI

Sketch-BERT: Learning Sketch Bidirectional Encoder Representation From Transformers by Self-Supervised Learning of Sketch Gestalt

TL;DR: This work presents Sketch-BERT, a model that learns a Sketch Bidirectional Encoder Representation from Transformers by generalizing BERT to the sketch domain, with novel components and pre-training algorithms including newly designed sketch embedding networks and self-supervised learning of sketch gestalt.
Proceedings ArticleDOI

Deep Structural Contour Detection

TL;DR: This work proposes a novel yet very effective loss function for contour detection, capable of penalizing the contour-structure dissimilarity between each prediction and its ground truth, and introduces a novel convolutional encoder-decoder network.
Posted Content

Deep Learning for Free-Hand Sketch: A Survey and A Toolbox

TL;DR: A comprehensive survey of the deep learning techniques oriented at free-hand sketch data, and the applications that they enable.
Journal ArticleDOI

Deep Learning for Free-Hand Sketch: A Survey

TL;DR: This paper presents a comprehensive survey of the deep learning techniques oriented at free-hand sketch data and the applications they enable, highlighting the essential differences between sketch data and other data modalities, e.g., natural photos.
Journal ArticleDOI

AE-Net: Fine-grained sketch-based image retrieval via attention-enhanced network

TL;DR: Zhang et al. as discussed by the authors investigated the task of Fine-grained Sketch-based Image Retrieval (FG-SBIR), which uses hand-drawn sketches as input queries to retrieve the relevant images at the fine-grained instance level.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network achieving state-of-the-art performance is presented, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal ArticleDOI

A Computational Approach to Edge Detection

TL;DR: There is a natural uncertainty principle between detection and localization performance, which are the two main goals, and with this principle a single operator shape is derived which is optimal at any scale.
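Since edge maps are the intermediate representation that TC-Net sets out to avoid, it is worth seeing what producing one involves. The sketch below implements only the gradient-estimation step that Canny-style detectors build on, using 3x3 Sobel kernels; a full Canny pipeline additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding. Images are plain nested lists, an assumption made to keep the example dependency-free.

```python
def sobel_magnitude(img):
    # Approximate the image gradient magnitude with 3x3 Sobel kernels.
    # Border pixels are left at zero for simplicity.
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```

The sensitivity the abstract mentions is visible even here: the response at a step edge depends directly on pixel contrast, so thresholds tuned for one photo distribution transfer poorly to another.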
Proceedings ArticleDOI

Densely Connected Convolutional Networks

TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
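The dense connectivity pattern summarized above can be illustrated without any deep learning framework. In this hypothetical sketch, features are flat lists and each "layer" is a callable that maps the concatenation of all preceding features to its new output channels; the names and list-based representation are assumptions for illustration.

```python
def dense_block(x, layers):
    # x: the block's input feature list.
    # layers: callables mapping the concatenation of all features
    #         produced so far to a new list of output features.
    features = [x]
    for layer in layers:
        concatenated = [v for f in features for v in f]  # channel concat
        features.append(layer(concatenated))
    # The block's output concatenates the input and every layer's output,
    # which is how DenseNet encourages feature reuse.
    return [v for f in features for v in f]
```

Because every layer sees all earlier features directly, gradients also flow straight back to the input, which is the mechanism behind the alleviated vanishing-gradient problem the summary mentions.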