
Showing papers on "Sketch recognition published in 2018"


Journal ArticleDOI
TL;DR: Major constraints on vision-based gesture recognition occurring in detection and pre-processing, representation and feature extraction, and recognition are surveyed.
Abstract: The ability of computers to recognise hand gestures visually is essential for progress in human-computer interaction. Gesture recognition has applications ranging from sign language to medical assistance to virtual reality. However, gesture recognition is extremely challenging not only because of its diverse contexts, multiple interpretations, and spatio-temporal variations but also because of the complex non-rigid properties of the hand. This study surveys major constraints on vision-based gesture recognition occurring in detection and pre-processing, representation and feature extraction, and recognition. Current challenges are explored in detail.

138 citations


Proceedings ArticleDOI
19 Feb 2018
TL;DR: This work proposes a deep hashing framework for sketch retrieval that, for the first time, works on a multi-million scale human sketch dataset and shows that state-of-the-art hashing models specifically engineered for static images fail to perform well on temporal sketch data.
Abstract: We propose a deep hashing framework for sketch retrieval that, for the first time, works on a multi-million scale human sketch dataset. Leveraging on this large dataset, we explore a few sketch-specific traits that were otherwise under-studied in prior literature. Instead of following the conventional sketch recognition task, we introduce the novel problem of sketch hashing retrieval which is not only more challenging, but also offers a better testbed for large-scale sketch analysis, since: (i) more fine-grained sketch feature learning is required to accommodate the large variations in style and abstraction, and (ii) a compact binary code needs to be learned at the same time to enable efficient retrieval. Key to our network design is the embedding of unique characteristics of human sketch, where (i) a two-branch CNN-RNN architecture is adapted to explore the temporal ordering of strokes, and (ii) a novel hashing loss is specifically designed to accommodate both the temporal and abstract traits of sketches. By working with a 3.8M sketch dataset, we show that state-of-the-art hashing models specifically engineered for static images fail to perform well on temporal sketch data. Our network on the other hand not only offers the best retrieval performance on various code sizes, but also yields the best generalization performance under a zero-shot setting and when re-purposed for sketch recognition. Such superior performances effectively demonstrate the benefit of our sketch-specific design.
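
For readers who want a concrete picture of the two-branch CNN-RNN design described above, the following is a minimal PyTorch sketch; the layer sizes, the tanh relaxation of the binary code and the input formats are illustrative assumptions rather than the authors' exact architecture.

    # Minimal two-branch CNN-RNN hashing sketch (illustrative, not the paper's exact network).
    import torch
    import torch.nn as nn

    class TwoBranchSketchHasher(nn.Module):
        def __init__(self, code_bits=64, point_dim=3, rnn_hidden=256):
            super().__init__()
            # CNN branch: consumes the rasterized sketch image (1 x 64 x 64 here).
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())          # -> 64-d
            # RNN branch: consumes the stroke sequence (x, y, pen state per point).
            self.rnn = nn.GRU(point_dim, rnn_hidden, batch_first=True)
            # Fusion + hashing head: tanh gives a relaxed binary code for training.
            self.hash = nn.Sequential(nn.Linear(64 + rnn_hidden, code_bits), nn.Tanh())

        def forward(self, image, points):
            img_feat = self.cnn(image)                 # (B, 64)
            _, h = self.rnn(points)                    # h: (1, B, rnn_hidden)
            return self.hash(torch.cat([img_feat, h[-1]], dim=1))

    # Toy usage: batch of 2 sketches, 100 points each.
    model = TwoBranchSketchHasher()
    codes = model(torch.randn(2, 1, 64, 64), torch.randn(2, 100, 3))
    binary = codes.sign()                              # compact binary codes at retrieval time

During training, a hashing loss over the relaxed codes (with similarity and quantization terms) would take the place of the plain sign() binarization shown here.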

103 citations


Proceedings ArticleDOI
17 Dec 2018
TL;DR: This article proposed a stroke-level sketch abstraction model based on the insight of sketch abstraction as a process of trading off between the recognizability of a sketch and the number of strokes used to draw it.
Abstract: Human free-hand sketches have been studied in various contexts including sketch recognition, synthesis and fine-grained sketch-based image retrieval (FG-SBIR). A fundamental challenge for sketch analysis is to deal with drastically different human drawing styles, particularly in terms of abstraction level. In this work, we propose the first stroke-level sketch abstraction model based on the insight of sketch abstraction as a process of trading off between the recognizability of a sketch and the number of strokes used to draw it. Concretely, we train a model for abstract sketch generation through reinforcement learning of a stroke removal policy that learns to predict which strokes can be safely removed without affecting recognizability. We show that our abstraction model can be used for various sketch analysis tasks including: (1) modeling stroke saliency and understanding the decision of sketch recognition models, (2) synthesizing sketches of variable abstraction for a given category, or reference object instance in a photo, and (3) training a FG-SBIR model with photos only, bypassing the expensive photo-sketch pair collection step.
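
As a rough illustration of the recognizability-versus-stroke-count trade-off described above, the toy REINFORCE loop below learns a stroke-removal policy against a placeholder recognizability scorer; the stroke features, the scorer and the reward weighting are assumptions, not the paper's actual formulation.

    # Toy REINFORCE sketch of "remove strokes while keeping the sketch recognizable".
    import torch
    import torch.nn as nn

    def recognizability(kept_mask, stroke_feats):
        # Stand-in for a pretrained sketch classifier's confidence on the kept strokes.
        return torch.sigmoid((kept_mask.unsqueeze(-1) * stroke_feats).sum())

    policy = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    lam = 0.05                                    # cost per kept stroke

    for step in range(200):
        stroke_feats = torch.randn(20, 16)        # 20 strokes, 16-d features (dummy data)
        keep_prob = torch.sigmoid(policy(stroke_feats)).squeeze(-1)
        dist = torch.distributions.Bernoulli(keep_prob)
        mask = dist.sample()                      # 1 = keep stroke, 0 = remove it
        reward = recognizability(mask, stroke_feats) - lam * mask.sum()
        loss = -(dist.log_prob(mask).sum() * reward.detach())   # REINFORCE update
        opt.zero_grad(); loss.backward(); opt.step()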

69 citations


Proceedings ArticleDOI
01 Sep 2018
TL;DR: This paper treats the problem of stroke-level sketch segmentation as a sequence-to-sequence generation problem, and a recurrent neural network (RNN)-based model, SketchSegNet, is presented to translate sequences of strokes into their semantic part labels.
Abstract: We investigate the problem of stroke-level sketch segmentation, which is to train machines to assign strokes semantic part labels given an input sketch. Solving the problem of sketch segmentation opens the door to fine-grained sketch interpretation, which can benefit many novel sketch-based applications, including sketch recognition and sketch-based image retrieval. In this paper, we treat the problem of stroke-level sketch segmentation as a sequence-to-sequence generation problem, and a recurrent neural network (RNN)-based model, SketchSegNet, is presented to translate sequences of strokes into their semantic part labels. In addition, for the first time, a large-scale stroke-level sketch segmentation dataset is proposed, composed of 57K annotated free-hand human sketches selected from QuickDraw. Experimental results of stroke-level sketch segmentation on this novel dataset show that our approach offers an average accuracy of over 90% for stroke labeling.
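
As a concrete, simplified reading of an RNN-based stroke labeller in this spirit, the following PyTorch tagger assigns one part label per stroke with a bidirectional LSTM; the feature sizes and the per-stroke (rather than encoder-decoder) formulation are assumptions for illustration.

    # Minimal bidirectional-LSTM stroke tagger (a simplified stand-in for SketchSegNet).
    import torch
    import torch.nn as nn

    class StrokeTagger(nn.Module):
        def __init__(self, stroke_feat_dim=32, hidden=128, num_parts=8):
            super().__init__()
            self.rnn = nn.LSTM(stroke_feat_dim, hidden, batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden, num_parts)   # one part label per stroke

        def forward(self, strokes):                       # strokes: (B, num_strokes, feat_dim)
            h, _ = self.rnn(strokes)
            return self.out(h)                            # (B, num_strokes, num_parts) logits

    model = StrokeTagger()
    logits = model(torch.randn(4, 12, 32))                # 4 sketches, 12 strokes each
    labels = torch.randint(0, 8, (4, 12))
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 8), labels.reshape(-1))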

43 citations


Journal ArticleDOI
TL;DR: A new approach for static hand gesture recognition is proposed, tuned by a multi-objective evolutionary algorithm based on the Nondominated Sorting Genetic Algorithm II (NSGA-II), which shows a good recognition rate and low computational cost.
Abstract: Hand gestures are an intuitive way for humans to interact with computers. They are becoming increasingly popular in several applications, such as smart houses, games, vehicle infotainment systems, kitchens and operating theaters. An effective human–computer interaction system should aim at both good recognition accuracy and speed. This paper proposes a new approach for static hand gesture recognition. A benchmark database with 36 gestures is used, containing variations in scale, illumination and rotation. Several common image descriptors, such as Fourier, Zernike moments, pseudo-Zernike moments, Hu moments, complex moments and Gabor features are comprehensively compared in terms of their respective accuracy and speed. Gesture recognition is undertaken by a multilayer perceptron which has a flexible structure and fast recognition. In order to achieve improved accuracy and minimize computational cost, both the feature vector and the neural network are tuned by a multi-objective evolutionary algorithm based on the Nondominated Sorting Genetic Algorithm II (NSGA-II). The proposed method is compared with state-of-the-art methods. A real-time gesture recognition system based on the proposed descriptor is constructed and evaluated. Experimental results show a good recognition rate, using a descriptor with low computational cost and reduced size.
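
The bi-objective selection at the heart of NSGA-II-style tuning can be illustrated with a few lines of NumPy: candidate (feature subset, network size) configurations are scored on two objectives to be minimized, and only non-dominated ones survive. The objective values below are synthetic placeholders, not results from the paper.

    # Toy Pareto-dominance illustration of the bi-objective tuning (synthetic objectives).
    import numpy as np

    rng = np.random.default_rng(0)
    candidates = [(rng.integers(0, 2, 60), int(rng.integers(8, 64))) for _ in range(30)]

    def objectives(mask, hidden):
        # Placeholder model: error tends to fall with more features, plus noise.
        error = 0.4 - 0.003 * mask.sum() + rng.normal(0, 0.02)
        return np.array([error, mask.sum() + 0.1 * hidden])   # minimize both objectives

    scores = np.array([objectives(m, h) for m, h in candidates])

    def non_dominated(scores):
        keep = []
        for i, s in enumerate(scores):
            dominated = any((t <= s).all() and (t < s).any()
                            for j, t in enumerate(scores) if j != i)
            if not dominated:
                keep.append(i)
        return keep

    front = non_dominated(scores)   # indices of Pareto-optimal configurations
    print(front)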

34 citations


Proceedings ArticleDOI
15 Oct 2018
TL;DR: This paper addresses the Sketch Re-ID problem by proposing a cross-domain adversarial feature learning approach to jointly learn identity features and domain-invariant features, and shows that the proposed method outperforms the state of the art.
Abstract: Under person re-identification (Re-ID), a query photo of the target person is often required for retrieval. However, one is not always guaranteed to have such a photo readily available in a practical forensic setting. In this paper, we define the problem of Sketch Re-ID, which, instead of using a photo as input, initiates the query process using a professional sketch of the target person. This is akin to the traditional problem of forensic facial sketch recognition, yet with the major difference that our sketches are whole-body rather than just the face. This problem is challenging because sketches and photos lie in two distinct domains. Specifically, a sketch is an abstract description of a person. Besides, person appearance in photos varies with camera viewpoint, human pose and occlusion. We address the Sketch Re-ID problem by proposing a cross-domain adversarial feature learning approach to jointly learn identity features and domain-invariant features. We employ adversarial feature learning to filter out low-level interfering features while retaining high-level semantic information. We also contribute to the community the first Sketch Re-ID dataset with 200 persons, where each person is associated with one sketch and two photos from different cameras. Extensive experiments have been performed on the proposed dataset and other common sketch datasets, including CUFSF and QMUL-Shoe. Results show that the proposed method outperforms the state of the art.
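
A compact way to picture cross-domain adversarial feature learning is a DANN-style gradient-reversal setup, sketched below in PyTorch; the gradient-reversal trick, backbone dimensions and heads are generic stand-ins and not necessarily the authors' exact scheme.

    # DANN-style sketch: identity features that a photo-vs-sketch discriminator cannot separate.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, lamb):
            ctx.lamb = lamb
            return x.clone()
        @staticmethod
        def backward(ctx, grad):
            return -ctx.lamb * grad, None        # reverse the gradient into the encoder

    encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU())   # shared feature extractor
    id_head = nn.Linear(256, 200)                             # 200 identities (as in the dataset above)
    dom_head = nn.Linear(256, 2)                              # photo vs. sketch domain

    feats = encoder(torch.randn(8, 512))                      # dummy backbone features
    id_logits = id_head(feats)
    dom_logits = dom_head(GradReverse.apply(feats, 1.0))      # reversed gradient -> domain-invariant

    loss = (nn.CrossEntropyLoss()(id_logits, torch.randint(0, 200, (8,)))
            + nn.CrossEntropyLoss()(dom_logits, torch.randint(0, 2, (8,))))
    loss.backward()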

34 citations


Journal ArticleDOI
Decheng Liu1, Jie Li1, Nannan Wang1, Chunlei Peng1, Xinbo Gao1 
TL;DR: A novel composite sketch recognition method is presented that extracts scale-invariant feature transform (SIFT) and histogram of oriented gradients (HOG) features from facial components, fuses the different features at score level, and combines the facial components with a linear function.

32 citations


Posted Content
TL;DR: A novel single-branch attentive network architecture RNN-Rasterization-CNN (Sketch-R2CNN for short) is proposed to fully leverage the dynamics in sketches for recognition and achieves better performance than the state-of-the-art methods.
Abstract: Freehand sketching is a dynamic process where points are sequentially sampled and grouped as strokes for sketch acquisition on electronic devices. To recognize a sketched object, most existing methods discard such important temporal ordering and grouping information from human and simply rasterize sketches into binary images for classification. In this paper, we propose a novel single-branch attentive network architecture RNN-Rasterization-CNN (Sketch-R2CNN for short) to fully leverage the dynamics in sketches for recognition. Sketch-R2CNN takes as input only a vector sketch with grouped sequences of points, and uses an RNN for stroke attention estimation in the vector space and a CNN for 2D feature extraction in the pixel space respectively. To bridge the gap between these two spaces in neural networks, we propose a neural line rasterization module to convert the vector sketch along with the attention estimated by RNN into a bitmap image, which is subsequently consumed by CNN. The neural line rasterization module is designed in a differentiable way to yield a unified pipeline for end-to-end learning. We perform experiments on existing large-scale sketch recognition benchmarks and show that by exploiting the sketch dynamics with the attention mechanism, our method is more robust and achieves better performance than the state-of-the-art methods.
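
The following PyTorch fragment gives a crude, differentiable stand-in for the idea of attention-weighted rasterization: each point is splatted onto a small grid as a Gaussian scaled by an RNN-predicted attention value, so gradients can flow from a downstream CNN back to the attention. The Gaussian splatting is an assumption for illustration, not the paper's neural line rasterization module.

    # Differentiable attention-weighted splatting (a crude stand-in for neural line rasterization).
    import torch
    import torch.nn as nn

    def soft_rasterize(points, attention, size=32, sigma=0.02):
        # points: (N, 2) in [0, 1]; attention: (N,) from an RNN over the stroke sequence.
        ys, xs = torch.meshgrid(torch.linspace(0, 1, size), torch.linspace(0, 1, size), indexing="ij")
        grid = torch.stack([xs, ys], dim=-1).reshape(-1, 2)            # (size*size, 2)
        d2 = ((grid.unsqueeze(1) - points.unsqueeze(0)) ** 2).sum(-1)  # (size*size, N)
        img = (torch.exp(-d2 / (2 * sigma ** 2)) * attention).sum(-1)  # attention-weighted splats
        return img.reshape(1, 1, size, size)                           # gradients flow to attention

    rnn = nn.GRU(2, 32, batch_first=True)
    att_head = nn.Linear(32, 1)
    points = torch.rand(50, 2)
    h, _ = rnn(points.unsqueeze(0))
    attention = torch.sigmoid(att_head(h)).squeeze()                   # (50,) per-point attention
    image = soft_rasterize(points, attention)                          # consumed by a CNN downstream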

21 citations


Journal ArticleDOI
TL;DR: This paper proposes an approach to discover the most distinctive sketches for action recognition, and presents four kinds of sketch pooling methods to get a uniform representation for action videos.

17 citations


Proceedings ArticleDOI
01 Oct 2018
TL;DR: This paper proposes a novel point-based network with a compact architecture, named SketchPointNet, for robust sketch recognition, which achieves comparable performance on the challenging TU-Berlin dataset while it significantly reduces the network size.
Abstract: Sketch recognition is a challenging image processing task. In this paper, we propose a novel point-based network with a compact architecture, named SketchPointNet, for robust sketch recognition. Sketch features are hierarchically learned from three mini PointNets, by successively sampling and grouping 2D points in a bottom-up fashion. SketchPointNet exploits both temporal and spatial context in strokes during point sampling and grouping. By directly consuming the sparse points, SketchPointNet is very compact and efficient. Compared with state-of-the-art techniques, SketchPointNet achieves comparable performance on the challenging TU-Berlin dataset while it significantly reduces the network size.
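
A single "mini PointNet" stage of the kind the paper stacks can be sketched as: subsample points, group their nearest neighbours, apply a shared point-wise MLP and max-pool each group. The uniform subsampling and kNN grouping below are simplifications of SketchPointNet's temporal/spatial sampling, and the sizes are illustrative.

    # One simplified "mini PointNet" stage: sample, group, shared MLP, max pool.
    import torch
    import torch.nn as nn

    class MiniPointNet(nn.Module):
        def __init__(self, in_dim=2, out_dim=64, k=8, stride=4):
            super().__init__()
            self.k, self.stride = k, stride
            self.mlp = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, out_dim))

        def forward(self, pts):                         # pts: (N, in_dim) 2D sketch points
            centers = pts[::self.stride]                # simple subsampling (stand-in for FPS)
            idx = torch.cdist(centers, pts).topk(self.k, largest=False).indices
            groups = pts[idx]                           # (num_centers, k, in_dim) local groups
            feats = self.mlp(groups).max(dim=1).values  # shared MLP + max pool per group
            return centers, feats                       # ready for the next, coarser stage

    centers, feats = MiniPointNet()(torch.rand(128, 2))  # 128 points -> 32 local features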

16 citations


Posted Content
TL;DR: Li et al. as mentioned in this paper propose a deep hashing framework for sketch retrieval that, for the first time, works on a multi-million scale human sketch dataset, and explore a few sketch-specific traits that were otherwise understudied in prior literature.
Abstract: We propose a deep hashing framework for sketch retrieval that, for the first time, works on a multi-million scale human sketch dataset. Leveraging on this large dataset, we explore a few sketch-specific traits that were otherwise under-studied in prior literature. Instead of following the conventional sketch recognition task, we introduce the novel problem of sketch hashing retrieval which is not only more challenging, but also offers a better testbed for large-scale sketch analysis, since: (i) more fine-grained sketch feature learning is required to accommodate the large variations in style and abstraction, and (ii) a compact binary code needs to be learned at the same time to enable efficient retrieval. Key to our network design is the embedding of unique characteristics of human sketch, where (i) a two-branch CNN-RNN architecture is adapted to explore the temporal ordering of strokes, and (ii) a novel hashing loss is specifically designed to accommodate both the temporal and abstract traits of sketches. By working with a 3.8M sketch dataset, we show that state-of-the-art hashing models specifically engineered for static images fail to perform well on temporal sketch data. Our network on the other hand not only offers the best retrieval performance on various code sizes, but also yields the best generalization performance under a zero-shot setting and when re-purposed for sketch recognition. Such superior performances effectively demonstrate the benefit of our sketch-specific design.

Proceedings ArticleDOI
Jie Gui1, Ping Li1
27 Dec 2018
TL;DR: To the best of the authors' knowledge, MvFS is the first algorithm to address the problem of multi-view feature selection for HFR in which the dimensionalities of the different views are the same and the number of selected features is the same across views.
Abstract: While the task of feature selection has been studied for many years, the topic of multi-view feature selection for heterogeneous face recognition (HFR), such as visible (VIS) versus near-infrared (NIR) image recognition, photo versus sketch recognition, and face recognition across pose, is rarely studied. In this paper, we propose a multi-view feature selection method (MvFS) for HFR. To the best of our knowledge, MvFS is the first algorithm to address the problem of multi-view feature selection for HFR in which the dimensionalities of the different views are the same and the number of selected features is the same across views. The proposed algorithm is simple and computationally efficient. Our experiments confirm the effectiveness of MvFS.

Journal ArticleDOI
TL;DR: An improved deep CNN model based on Faster R-CNN is proposed as a validation network to measure the similarity between real sketches and the freehand sketches generated by FHS-GAN; the model achieves state-of-the-art results in comparison with other baseline models.


Proceedings ArticleDOI
Penghui Sun1, Yan Chen1, Xiaoqing Lyu1, Bei Wang1, Jingwei Qu1, Zhi Tang1 
01 Apr 2018
TL;DR: A dual-mode-based method is proposed to distinguish gestures for character inputs from those for non-character inputs, replacing ordinary segmentation approaches, and an attribute graph model is established to effectively describe all necessary information of a sketched CSF.
Abstract: Chemical Structural Formula (CSF) recognition plays an important role in molecular design and component retrieval. However, sketch-based CSF recognition remains an obstacle in current retrieval systems. This paper introduces a system for sketched CSF recognition on smart mobile devices. A dual-mode-based method is proposed to distinguish gestures for character inputs from those for non-character inputs, replacing ordinary segmentation approaches. An attribute graph model is established to effectively describe all necessary information of a sketched CSF. Chemical knowledge is adopted to refine the candidate structural relationships among elements. The experimental results demonstrate that the proposed method outperforms existing methods for free-sketch CSFs in effectiveness and flexibility.
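
An attribute graph for a sketched CSF can be represented very directly, for example with networkx: atoms become attributed nodes, bonds become attributed edges, and chemical knowledge such as valence limits can be checked over the graph. The attributes below are assumptions, since the paper does not list its exact schema.

    # Small attribute-graph sketch for a chemical structural formula (illustrative schema).
    import networkx as nx

    g = nx.Graph()
    g.add_node(0, symbol="C", stroke_ids=[3, 4])          # strokes recognized as a character
    g.add_node(1, symbol="O", stroke_ids=[7])
    g.add_edge(0, 1, bond="double", stroke_ids=[5, 6])    # non-character (bond) strokes

    MAX_BONDS = {"C": 4, "O": 2, "H": 1}                  # simple valence knowledge

    def valence_ok(graph):
        order = {"single": 1, "double": 2, "triple": 3}
        for n, data in graph.nodes(data=True):
            total = sum(order[graph.edges[e]["bond"]] for e in graph.edges(n))
            if total > MAX_BONDS.get(data["symbol"], 8):
                return False
        return True

    print(valence_ok(g))   # True for C=O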

Posted Content
TL;DR: A new Deep Triplet Classification Siamese Network (DeepTCNet) is proposed which employs DenseNet-169 as the basic feature extractor and is optimized by the triplet loss and classification loss, and overcomes the limitations of previous works.
Abstract: Sketch has been employed as an effective communicative tool to express the abstract and intuitive meanings of objects. Recognizing free-hand sketch drawings is extremely useful in many real-world applications. While content-based sketch recognition has been studied for several decades, instance-level Sketch-Based Image Retrieval (SBIR) tasks have attracted significant research attention recently. Existing datasets, such as QMUL-Chair and QMUL-Shoe, focus on the retrieval of chairs and shoes. However, there are several key limitations in previous instance-level SBIR works: the state-of-the-art methods rely heavily on pre-training, the quality of edge maps, multi-crop testing strategies, and sketch augmentation. To efficiently solve instance-level SBIR, we propose a new Deep Triplet Classification Siamese Network (DeepTCNet) which employs DenseNet-169 as the basic feature extractor and is optimized by the triplet loss and classification loss. Critically, our proposed DeepTCNet overcomes the limitations of previous works. Extensive experiments on five benchmark sketch datasets validate the effectiveness of the proposed model. Additionally, to study the task of sketch-based hairstyle retrieval, this paper contributes a new instance-level photo-sketch dataset, the Hairstyle Photo-Sketch dataset, which is composed of 3600 sketches and photos, and 2400 sketch-photo pairs.
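
A condensed sketch of a DenseNet-169 backbone trained with a triplet loss plus a classification loss, in the spirit of DeepTCNet, is given below; the pooling, embedding size and loss weighting are assumptions, and torchvision's densenet169 is used untrained here.

    # DenseNet-169 backbone with joint triplet + classification losses (illustrative setup).
    import torch
    import torch.nn as nn
    import torchvision

    class TripletClsNet(nn.Module):
        def __init__(self, num_classes=100, emb_dim=256):
            super().__init__()
            self.backbone = torchvision.models.densenet169().features   # conv feature maps
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.embed = nn.Linear(1664, emb_dim)                        # 1664 = DenseNet-169 feature dim
            self.cls = nn.Linear(emb_dim, num_classes)

        def forward(self, x):
            f = self.pool(self.backbone(x)).flatten(1)
            e = self.embed(f)
            return e, self.cls(e)

    model = TripletClsNet()
    triplet = nn.TripletMarginLoss(margin=0.3)
    anchor, pos, neg = (torch.randn(2, 3, 224, 224) for _ in range(3))
    emb_a, logits_a = model(anchor); emb_p, _ = model(pos); emb_n, _ = model(neg)
    loss = triplet(emb_a, emb_p, emb_n) + nn.CrossEntropyLoss()(logits_a, torch.randint(0, 100, (2,)))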

Journal ArticleDOI
TL;DR: MATRACK, the developed BSBL-based sketch recognition system, outperforms three high-accuracy classifiers, namely the Hidden Markov Model, Principal Component Analysis and K-Nearest Neighbour, in accuracy.
Abstract: Human-computer interaction has become increasingly easy and popular using widespread smart devices. Gestures and sketches, as the trajectory of hands in 3D space, are among the popular interaction media; therefore, their recognition is essential. However, the diversity of human gestures along with the lack of visual cues makes the sketch recognition process challenging. This paper aims to develop an accurate sketch recognition algorithm using Block Sparse Bayesian Learning (BSBL). Sketches are acquired from three datasets using a Wii-mote in a virtual-reality environment. We evaluate the performance of the proposed sketch recognition approach (MATRACK) on diverse sketch datasets. Comparisons are drawn with three high-accuracy classifiers, namely the Hidden Markov Model (HMM), Principal Component Analysis (PCA) and K-Nearest Neighbour (K-NN). MATRACK, the developed BSBL-based sketch recognition system, outperforms K-NN, HMM and PCA. Specifically, for the most diverse dataset, MATRACK reaches an accuracy of 93.5%, whereas the other three classifiers achieve approximately 80%.

Proceedings ArticleDOI
01 Jun 2018
TL;DR: Improved sketch recognition techniques are proposed to better support Chinese character educational interfaces' real-time assessment of novice CSL students' character writing while preserving students' natural writing input of Chinese characters.
Abstract: Students of Chinese as a Second Language (CSL) with primarily English fluency often struggle with the language's complex character set. Conventional classroom pedagogy and relevant educational applications have focused on providing valuable assessment feedback to address their challenges, but rely on direct instructor observation and provide constrained assessment, respectively. We propose improved sketch recognition techniques to better support Chinese character educational interfaces' real-time assessment of novice CSL students' character writing. Based on successful assessment feedback approaches from existing educational resources, we developed techniques for supporting richer automated assessment, so that students may be better informed of their writing performance outside the classroom. In our evaluations, our techniques achieved recognition rates of 91% and 85% on expert and novice Chinese character handwriting data, respectively, a recognition rate greater than 90% on written technique mistakes, and an 80.4% F-measure when distinguishing between expert and novice handwriting samples, without sacrificing students' natural writing input of Chinese characters.

Book ChapterDOI
02 Dec 2018
TL;DR: In this article, a deep metric learning loss was proposed to minimize the Bayesian risk of misclassification for each mini-batch during training. But the loss was not applied to the task of hand-drawn sketch recognition.
Abstract: In this paper, we address the problem of hand-drawn sketch recognition. Inspired by Bayesian decision theory, we present a deep metric learning loss with the objective of minimizing the Bayesian risk of misclassification. We estimate this risk for every mini-batch during training, and learn robust deep embeddings by backpropagating it to a deep neural network in an end-to-end trainable paradigm. Our learnt embeddings are discriminative and robust despite the intra-class variations and inter-class similarities naturally present in hand-drawn sketch images. Outperforming the state of the art on sketch recognition, our method achieves 82.2% and 88.7% accuracy on the TU-Berlin-250 and TU-Berlin-160 benchmarks respectively.
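
One generic way to turn "minimize the estimated risk of misclassification on a mini-batch" into a differentiable loss is to estimate class posteriors from distances to in-batch class prototypes and penalize the expected 0-1 loss, as sketched below; this is an illustrative construction, not necessarily the authors' exact formulation.

    # Generic mini-batch "misclassification risk" loss over an embedding (illustrative only).
    import torch

    def batch_risk_loss(embeddings, labels):
        classes = labels.unique()
        protos = torch.stack([embeddings[labels == c].mean(0) for c in classes])   # class prototypes
        d = torch.cdist(embeddings, protos)                                        # (B, C) distances
        posterior = torch.softmax(-d, dim=1)                                       # closer => more probable
        target_idx = (labels.unsqueeze(1) == classes.unsqueeze(0)).float().argmax(1)
        p_correct = posterior[torch.arange(len(labels)), target_idx]
        return (1.0 - p_correct).mean()                                            # estimated 0-1 risk

    emb = torch.randn(16, 64, requires_grad=True)        # embeddings from a deep network
    loss = batch_risk_loss(emb, torch.randint(0, 5, (16,)))
    loss.backward()                                      # backpropagated end-to-end in practice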

Book ChapterDOI
30 May 2018
TL;DR: HUNCH’s concept as AI-equipped ‘partner’ for the architect, designer or layman is presented.
Abstract: James Taggart developed the sketch recognition system HUNCH at MIT in 1972. Rather than the user trying to understand the software in order to progress a drawing, in the case of HUNCH the software observed the user sketching. It enabled a conversation between user and software through the sketch as a medium. HUNCH was one component of the Architecture Machine, created in the Architecture Machine Group (Arch Mac) run by Nicholas Negroponte at MIT between 1967 and 1985, a brainchild of cross-fertilization between architecture, computer science and cybernetics in the early 1970s. One of HUNCH's objectives was to enable even the layman to 'design' a dream home. The paper presents HUNCH's concept as an AI-equipped 'partner' for the architect, designer or layman.

Proceedings ArticleDOI
01 Aug 2018
TL;DR: A novel Directional Element Histogram of Oriented Gradient (DE-HOG) feature for the human free-hand sketch recognition task that achieves superior performance to the traditional HOG feature, which was originally designed for photographic objects.
Abstract: We propose a novel Directional Element Histogram of Oriented Gradient (DE-HOG) feature for the human free-hand sketch recognition task that achieves superior performance to the traditional HOG feature, originally designed for photographic objects. It models the unique characteristics of free-hand sketches, which consist only of a set of strokes, omit visual information such as color and brightness, and are highly iconic and abstract. Specifically, we encode sketching strokes as regularized directional vectors derived from the skeleton of a sketch, while still leveraging the HOG feature to meet local deformation-invariance demands. Such a representation combines the best of both features by encoding necessary and discriminative stroke-level information while robustly handling various levels of sketching variation. Extensive experiments conducted on two large benchmark sketch recognition datasets demonstrate the performance of our proposed method.
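
A rough scikit-image approximation of the idea is to reduce the sketch to its skeleton, so that only stroke structure remains, and compute HOG over it; the directional-element encoding of DE-HOG is only loosely approximated by the skeleton's gradient orientations, and the parameters below are illustrative.

    # Skeleton + HOG as a crude approximation of a stroke-structure descriptor.
    import numpy as np
    from skimage.draw import line
    from skimage.feature import hog
    from skimage.morphology import skeletonize

    sketch = np.zeros((128, 128), dtype=bool)          # toy sketch made of two strokes
    rr, cc = line(20, 20, 100, 60); sketch[rr, cc] = True
    rr, cc = line(100, 60, 30, 110); sketch[rr, cc] = True

    skeleton = skeletonize(sketch)                     # 1-pixel-wide stroke structure
    feature = hog(skeleton.astype(float), orientations=9,
                  pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    print(feature.shape)                               # descriptor fed to a classifier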

Book ChapterDOI
13 Dec 2018
TL;DR: This paper presents an interactive sketch recognition system that converts user’s sketch into structured geometric shapes in usable electronic format with minimal effort.
Abstract: With the recent advances in tablet devices industry, sketch recognition has become a potential replacement for existing systems’ traditional user interfaces. Structured diagrams (flow charts, Markov chains, module dependency diagrams, state diagrams, block diagrams, UML, graphs, etc.) are very common in many science fields. Usually, such diagrams are created using structured graphics editors like Microsoft Visio. Structured graphics editors are extremely powerful and expressive, but they can be cumbersome to use. This paper presents an interactive sketch recognition system that converts user’s sketch into structured geometric shapes in usable electronic format with minimal effort.

Posted Content
TL;DR: A deep metric learning loss is presented with the objective to minimize the Bayesian risk of misclassification for every mini-batch during training, and robust deep embeddings are learned by backpropagating it to a deep neural network in an end-to-end trainable paradigm.
Abstract: In this paper, we address the problem of hand-drawn sketch recognition. Inspired by Bayesian decision theory, we present a deep metric learning loss with the objective of minimizing the Bayesian risk of misclassification. We estimate this risk for every mini-batch during training, and learn robust deep embeddings by backpropagating it to a deep neural network in an end-to-end trainable paradigm. Our learnt embeddings are discriminative and robust despite the intra-class variations and inter-class similarities naturally present in hand-drawn sketch images. Outperforming the state of the art on sketch recognition, our method achieves 82.2% and 88.7% accuracy on the TU-Berlin-250 and TU-Berlin-160 benchmarks respectively.

Proceedings ArticleDOI
01 Oct 2018
TL;DR: This paper presents a novel method for matching face sketch images with face photo images by extracting perfect face ratios from facial landmark points and comparing them using the fuzzy Hamming distance.
Abstract: This paper presents a novel method for matching face sketch images with face photo images. The basic idea is to extract perfect face ratios from both the face photo and the sketch as features and to calculate the distance between them using the fuzzy Hamming distance. To extract the perfect face ratios, we use landmark points on the face, from which sixteen features are extracted. An experimental evaluation demonstrates the satisfactory performance of our approach on the CUHK dataset, and the method can be applied to any existing sketch set. The results indicate that the proposed algorithm is competitive with related approaches; in particular, the recognition rate reaches 100% on the CUHK dataset.
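
One simple reading of the pipeline is sketched below: facial landmarks are turned into scale-free ratios, and the photo and sketch ratio vectors are compared with a fuzzy Hamming-style distance in which each feature contributes a membership value in [0, 1] instead of a hard 0/1 mismatch. The landmark values, ratio choices and membership function are all illustrative assumptions.

    # Face ratios + a fuzzy Hamming-style distance (one simplified interpretation).
    import numpy as np

    def face_ratios(lm):
        # lm: dict of landmark points (x, y); a full system would extract ~16 ratios.
        width = np.linalg.norm(lm["jaw_left"] - lm["jaw_right"])
        return np.array([
            np.linalg.norm(lm["eye_left"] - lm["eye_right"]) / width,
            np.linalg.norm(lm["nose"] - lm["chin"]) / width,
            np.linalg.norm(lm["mouth_left"] - lm["mouth_right"]) / width,
        ])

    def fuzzy_hamming(a, b, tol=0.05):
        # Each feature's "difference" membership rises smoothly from 0 towards 1 around tol.
        membership = 1.0 - np.exp(-np.abs(a - b) / tol)
        return membership.sum()

    photo = {k: np.array(v) for k, v in {"jaw_left": (0, 50), "jaw_right": (100, 50),
             "eye_left": (30, 40), "eye_right": (70, 40), "nose": (50, 55),
             "chin": (50, 95), "mouth_left": (38, 75), "mouth_right": (62, 75)}.items()}
    sketch = {k: v + np.random.normal(0, 1.5, 2) for k, v in photo.items()}

    print(fuzzy_hamming(face_ratios(photo), face_ratios(sketch)))   # smaller => better match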

Proceedings ArticleDOI
17 Aug 2018
TL;DR: The major challenges in grouping ink into identifiable shapes are identified, and the common solutions to these challenges are discussed based on current research, and areas for future work are highlighted.
Abstract: An early step in bottom-up diagram recognition systems is grouping ink strokes into shapes. This paper gives an overview of the key literature on automatic grouping techniques in sketch recognition. In addition, we identify the major challenges in grouping ink into identifiable shapes, discuss the common solutions to these challenges based on current research, and highlight areas for future work.

14 Oct 2018
TL;DR: A preliminary approach is presented that automatically captures information from sketches using image processing and pattern recognition, in order to verify that simply capturing sketches can provide useful information for later use and thus improve modeling-process efficiency.
Abstract: The purpose of agile practices is to optimize engineering processes. At the same time, software documentation often suffers from the priority given to fast, successive deliveries of new functionality; as a consequence, incomplete documentation and graphical representations make the software difficult for a developer to maintain and evolve. Sketching is an integral part of software design and development. Indeed, among other stakeholders, developers use sketches to informally share knowledge about source code. Since sketches are often hand-drawn on paper or a whiteboard without any additional technology, they are not considered artifacts in agile methods or traditional engineering processes but rather disposable productions. In this work, we focus on sketches containing Unified Modeling Language (UML) diagrams. To produce documentation or to exploit information from such sketches, developers have to transcribe them into a UML CASE tool, which they see as a waste of time that hinders productivity. We argue that sketches are worth considering as non-code artifacts: developers and designers informally record a great deal of information about the software in them, yet that information currently goes unused. Our goal is to verify that simply capturing sketches can provide useful information for later use and thereby improve the efficiency of the modeling process. In this paper, we present a preliminary approach that automatically captures information from sketches using image processing and pattern recognition. We propose a fledgling prototype that demonstrates the proposal's viability. As future work, we plan to put a finalized version of the prototype in the hands of developers and study the added value of our proposal.
