
Showing papers on "Sketch recognition" published in 2021


Journal ArticleDOI
TL;DR: A multi-scale gradients self-attention residual learning framework for face photo-sketch transformation that embeds a self-attention mechanism in the residual block, making full use of the relationships between features to selectively enhance specific information through the self-attention distribution.
Abstract: Face sketch synthesis, as a key technique for solving face sketch recognition, has made considerable progress in recent years. Due to the modality difference between face photos and face sketches, traditional exemplar-based methods often produce missing texture details and deformation when synthesizing sketches. And limited by the local receptive field, Convolutional Neural Network-based methods cannot deal well with the interdependence between features, which makes the constraint on facial features insufficient; as such, they cannot retain some details in the synthesized image. Moreover, the deeper the network, the more pronounced the problems of vanishing and exploding gradients become, which leads to instability in the training process. Therefore, in this paper, we propose a multi-scale gradients self-attention residual learning framework for face photo-sketch transformation that embeds a self-attention mechanism in the residual block, making full use of the relationships between features to selectively enhance specific information through the self-attention distribution. Simultaneously, residual learning keeps the original features from being destroyed. In addition, the instability of GAN training is alleviated by allowing the discriminator to become a function of multi-scale outputs of the generator during training. Based on a cycle framework, the matching between the target-domain image and the source-domain image can be constrained while the mapping relationship between the two domains is established, so that the tasks of face photo-to-sketch synthesis (FP2S) and face sketch-to-photo synthesis (FS2P) can be achieved simultaneously. Both Image Quality Assessment (IQA) and face recognition experiments show that our method achieves state-of-the-art performance on public benchmarks, whether using FP2S or FS2P.
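To make the core building block concrete, here is a minimal PyTorch sketch of a residual block with embedded self-attention; the SAGAN-style attention formulation, channel sizes, and the learnable gate are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SelfAttentionResBlock(nn.Module):
    """Residual block with embedded self-attention (SAGAN-style).

    Channel sizes and the learnable gate `gamma` are illustrative
    assumptions, not the paper's exact configuration.
    """
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.q = nn.Conv2d(ch, ch // 8, 1)   # query projection
        self.k = nn.Conv2d(ch, ch // 8, 1)   # key projection
        self.v = nn.Conv2d(ch, ch, 1)        # value projection
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        h = self.conv(x)
        B, C, H, W = h.shape
        q = self.q(h).flatten(2).transpose(1, 2)      # (B, HW, C/8)
        k = self.k(h).flatten(2)                      # (B, C/8, HW)
        attn = torch.softmax(q @ k, dim=-1)           # (B, HW, HW) attention
        v = self.v(h).flatten(2)                      # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(B, C, H, W)
        # The skip connection keeps the original features from being destroyed.
        return x + self.gamma * out

y = SelfAttentionResBlock(64)(torch.randn(1, 64, 16, 16))  # smoke test
```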

20 citations


Journal ArticleDOI
TL;DR: A novel end-to-end single-branch network architecture, RNN-Rasterization-CNN (Sketch-R2CNN for short), is proposed to fully leverage the vector format of sketches for recognition; it substantially outperforms state-of-the-art methods.
Abstract: Sketches in existing large-scale datasets like the recent QuickDraw collection are often stored in a vector format, with strokes consisting of sequentially sampled points. However, most existing sketch recognition methods rasterize vector sketches as binary images and then adopt image classification techniques. In this article, we propose a novel end-to-end single-branch network architecture RNN-Rasterization-CNN ( Sketch-R2CNN for short) to fully leverage the vector format of sketches for recognition. Sketch-R2CNN takes a vector sketch as input and uses an RNN for extracting per-point features in the vector space. We then develop a neural line rasterization module to convert the vector sketch and the per-point features to multi-channel point feature maps, which are subsequently fed to a CNN for extracting convolutional features in the pixel space. Our neural line rasterization module is designed in a differentiable way for end-to-end learning. We perform experiments on existing large-scale sketch recognition datasets and show that the RNN-Rasterization design brings consistent improvement over CNN baselines and that Sketch-R2CNN substantially outperforms the state-of-the-art methods.
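The pipeline can be illustrated with a simplified PyTorch stand-in: an RNN extracts per-point features that are scattered into a multi-channel map and fed to a CNN. Note that the paper's neural line rasterization module is differentiable with respect to point positions, whereas the nearest-cell scatter below is only differentiable with respect to the per-point features; the grid size and layer widths are assumptions.

```python
import torch
import torch.nn as nn

class SimplifiedSketchR2CNN(nn.Module):
    """RNN -> rasterization -> CNN pipeline on vector sketches.

    Points are (x, y, pen-state) with x, y normalized to [0, 1]. The
    nearest-cell scatter is a simplified stand-in for the paper's
    neural line rasterization (later points overwrite earlier ones
    that land in the same cell).
    """
    def __init__(self, n_classes, feat_dim=8, grid=32):
        super().__init__()
        self.grid = grid
        self.rnn = nn.LSTM(3, feat_dim, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv2d(feat_dim, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes))

    def forward(self, pts):                    # pts: (B, N, 3)
        feats, _ = self.rnn(pts)               # per-point features (B, N, F)
        B, N, C = feats.shape
        fmap = feats.new_zeros(B, C, self.grid, self.grid)
        ix = (pts[..., 0] * (self.grid - 1)).long().reshape(-1)
        iy = (pts[..., 1] * (self.grid - 1)).long().reshape(-1)
        b = torch.arange(B, device=pts.device).repeat_interleave(N)
        fmap[b, :, iy, ix] = feats.reshape(B * N, C)   # multi-channel map
        return self.cnn(fmap)

logits = SimplifiedSketchR2CNN(n_classes=345)(torch.rand(2, 64, 3))
```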

17 citations


Journal ArticleDOI
TL;DR: A novel iterative local re-ranking with attribute-guided synthesis method is proposed for face sketch recognition, which does not require any extra manual annotation or human interaction and achieves superior performance compared with state-of-the-art methods.

17 citations


Journal ArticleDOI
TL;DR: Peng et al. as mentioned in this paper proposed a graph neural network (GNN) for learning representations of sketches from multiple graphs, which simultaneously capture global and local geometric stroke structures as well as temporal information.
Abstract: Learning meaningful representations of free-hand sketches remains a challenging task given the signal sparsity and the high-level abstraction of sketches. Existing techniques have focused on exploiting either the static nature of sketches with convolutional neural networks (CNNs) or the temporal sequential property with recurrent neural networks (RNNs). In this work, we propose a new representation of sketches as multiple sparsely connected graphs. We design a novel graph neural network (GNN), the multigraph transformer (MGT), for learning representations of sketches from multiple graphs, which simultaneously capture global and local geometric stroke structures as well as temporal information. We report extensive numerical experiments on a sketch recognition task to demonstrate the performance of the proposed approach. Particularly, MGT applied on 414k sketches from Google QuickDraw: 1) achieves a small recognition gap to the CNN-based performance upper bound (72.80% versus 74.22%) and infers faster than the CNN competitors and 2) outperforms all RNN-based models by a significant margin. To the best of our knowledge, this is the first work proposing to represent sketches as graphs and apply GNNs for sketch recognition. Code and trained models are available at https://github.com/PengBoXiangShang/multigraph_transformer.
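As an illustration of the graph representation, the following sketch connects points that are k steps apart on the same stroke, yielding one sparse adjacency matrix per hop distance; this is an illustrative simplification of the paper's graph design, not the released code at the GitHub link above.

```python
import numpy as np

def stroke_graphs(strokes, hops=(1, 2)):
    """Build sparse graphs over sketch points, one per hop distance.

    `strokes` is a list of (n_i, 2) point arrays. Points that are k
    steps apart on the same stroke are connected; intra-stroke edges
    only -- an assumed simplification for illustration.
    """
    pts = np.concatenate(strokes)
    n = len(pts)
    graphs = []
    for k in hops:
        A = np.eye(n)                       # self-loops
        offset = 0
        for s in strokes:
            for i in range(len(s) - k):
                a, b = offset + i, offset + i + k
                A[a, b] = A[b, a] = 1.0     # symmetric, unweighted edge
            offset += len(s)
        graphs.append(A)
    return pts, graphs

pts, (A1, A2) = stroke_graphs([np.random.rand(5, 2), np.random.rand(4, 2)])
```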

16 citations


Journal ArticleDOI
TL;DR: A novel sketch-specific data augmentation (SSDA) method that automatically improves the quantity and quality of training sketches; since it can be integrated with any convolutional neural network, it has a distinct advantage over existing methods.

9 citations


Book ChapterDOI
28 Jun 2021
TL;DR: In this paper, a neural network-based recognition technique is proposed to recognize and transform hand-drawn BPMN models into digital BPMN models, smoothing the modeling process.
Abstract: Despite the widespread availability of process modeling tools, the first version of a process model is often drawn by hand on a piece of paper or whiteboard, especially when several people are involved in its elicitation. Though this has been found to be beneficial for the modeling task itself, it also creates the need to manually convert hand-drawn models afterward so that they can be further used in a modeling tool. This manual transformation requires considerable time and effort and, furthermore, creates undesirable friction in the modeling workflow. In this paper, we alleviate this problem by presenting a technique that can automatically recognize and convert a sketched process model into a digital BPMN model. A key driver and contribution of our work is the creation of a publicly available dataset consisting of 502 manually annotated, hand-drawn BPMN models, covering 25 different BPMN elements. Based on this dataset, we have established a neural network-based recognition technique that can reliably recognize and transform hand-drawn BPMN models. Our evaluation shows that our technique considerably outperforms available baselines and, therefore, provides a valuable basis for smoothing the modeling process.

7 citations


Proceedings ArticleDOI
06 May 2021
TL;DR: The first large-scale dataset of 17,979 hand-drawn sketches of 21 UI element categories is presented, collected from 967 participants, including UI/UX designers, front-end developers, and HCI and CS grad students, from 10 different countries.
Abstract: This paper contributes the first large-scale dataset of 17,979 hand-drawn sketches of 21 UI element categories collected from 967 participants, including UI/UX designers, front-end developers, and HCI and CS grad students, from 10 different countries. We performed a perceptual study with this dataset and found that UI/UX designers can recognize the UI element sketches with ~96% accuracy. To compare human performance against computational recognition methods, we trained state-of-the-art DNN-based image classification models to recognize the UI element sketches. This study revealed that the ResNet-152 model outperforms the other classification networks and detects unknown UI element sketches with 91.77% accuracy (chance is 4.76%). We have open-sourced the entire dataset of UI element sketches to the community, intending to pave the way for further research in utilizing AI to assist the conversion of lo-fi UI sketches to higher fidelities.
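The paper does not reproduce its training setup here, but fine-tuning a pretrained ResNet-152 for a 21-way sketch classification task is standard with torchvision; everything in this sketch (hyperparameters, input handling, the newer weights API) is an assumption rather than the authors' configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 21  # the 21 UI element categories in the dataset

# Start from ImageNet weights and replace the classification head
# (torchvision >= 0.13 weights API assumed).
model = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# One illustrative step on a dummy batch: grayscale sketches are
# replicated to 3 channels to match the pretrained input format.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```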

7 citations


Journal ArticleDOI
TL;DR: An effective Regularized Particle Swarm Optimization based Deep Convolutional Neural Network (RPSO-DCNN) algorithm for retrieving free hand-drawn sketches, which demonstrates optimal accuracy compared with different state-of-the-art methods.
Abstract: One of the most popular and rising research areas in image processing is free hand-drawn sketch recognition and retrieval. A large number of methods have been introduced to retrieve sketch images, but they raise complexity issues and their performance is often degraded. So, in this paper, we propose an effective Regularized Particle Swarm Optimization based Deep Convolutional Neural Network (RPSO-DCNN) algorithm to improve the retrieval of free hand-drawn sketches. In feature extraction, the Regularized Particle Swarm Optimization (RPSO) model aims to produce an optimal evolutionary deep learning result. The free hand-drawn sketch image classification and retrieval are then performed by Support Vector Machine and Levenshtein distance-based fuzzy k-nearest neighbour (L-FkNN) algorithms. Hence, this work can support communication between human and computer. The simulation of the proposed RPSO-DCNN model is implemented in MATLAB. The sketch images are chosen from the TU-Berlin, Sketch, SHREC13, Flickr and Sketchy datasets. The aim is to compare the performance of the proposed RPSO-DCNN model with various state-of-the-art methods such as H-CNN, Fuzzy, CNN, MARQS and TCVD. The experimental results demonstrate that the proposed RPSO-DCNN accomplishes optimal accuracy against these state-of-the-art methods.
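For readers unfamiliar with the optimizer family, here is a generic particle swarm optimization loop in Python; it is not the paper's RPSO (the regularization scheme is not detailed in the abstract), and the objective is a toy stand-in for the DCNN's validation error.

```python
import numpy as np

def pso(objective, bounds, n_particles=10, iters=30, w=0.7, c1=1.5, c2=1.5):
    """Generic particle swarm optimization over a box-bounded space."""
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(bounds)
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, (n_particles, dim))       # particle positions
    v = np.zeros_like(x)                              # particle velocities
    pbest = x.copy()                                  # personal bests
    pbest_f = np.array([objective(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()                # global best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([objective(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

# Toy objective standing in for the DCNN's validation error.
best_params, best_err = pso(lambda p: float((p ** 2).sum()),
                            bounds=[(-5, 5), (-5, 5)])
```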

5 citations



Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed an angular-driven feedback restoration network (ADFRNet), which first detects the imperfect parts of a sketch and then refines them into high quality images, to boost the performance of sketch recognition.
Abstract: Automatic hand-drawn sketch recognition is an important task in computer vision. However, the vast majority of prior works focus on exploring the power of deep learning to achieve better accuracy on complete and clean sketch images, and thus fail to achieve satisfactory performance when applied to incomplete or destroyed sketch images. To address this problem, we first develop two datasets that contain different levels of scrawl and incomplete sketches. Then, we propose an angular-driven feedback restoration network (ADFRNet), which first detects the imperfect parts of a sketch and then refines them into high quality images, to boost the performance of sketch recognition. By introducing a novel “feedback restoration loop” to deliver information between the middle stages, the proposed model can improve the quality of generated sketch images while avoiding the extra memory cost associated with popular cascading generation schemes. In addition, we also employ a novel angular-based loss function to guide the refinement of sketch images and learn a powerful discriminator in the angular space. Extensive experiments conducted on the proposed imperfect sketch datasets demonstrate that the proposed model is able to efficiently improve the quality of sketch images and achieve superior performance over the current state-of-the-art methods.

4 citations


Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed a dual-channel convolutional neural network (DCNN) for hand-drawn sketch recognition, where the contour is extracted by a contour extraction algorithm and the sketch and contour are then used as input images of the CNN.
Abstract: In hand-drawn sketch recognition, traditional deep learning methods suffer from insufficient feature extraction and low recognition rates. To solve this problem, a new algorithm based on a dual-channel convolutional neural network is proposed. Firstly, the sketch is preprocessed to obtain a smooth sketch. The contour of the sketch is obtained by a contour extraction algorithm. Then, the sketch and contour are used as the input images of the CNN. Finally, feature fusion is carried out in the fully connected layer, and the classification results are obtained using a softmax classifier. Experimental results show that this method can effectively improve the recognition rate of hand-drawn sketches.
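A minimal sketch of this two-branch layout in PyTorch and OpenCV follows; the paper does not name its contour extractor, so Canny edges stand in for it, and the branch architecture is an illustrative assumption.

```python
import cv2
import torch
import torch.nn as nn

def preprocess(sketch_u8):
    """Smooth a uint8 grayscale sketch and extract a contour channel.
    Canny stands in for the paper's unspecified contour extractor."""
    smooth = cv2.GaussianBlur(sketch_u8, (5, 5), 0)
    contour = cv2.Canny(smooth, 50, 150)
    return smooth, contour

class DualChannelNet(nn.Module):
    """Two CNN branches (sketch / contour) fused at the fully
    connected layer; layer sizes are illustrative assumptions."""
    def __init__(self, n_classes):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.sketch_branch = branch()
        self.contour_branch = branch()
        self.fc = nn.Linear(2 * 32 * 4 * 4, n_classes)

    def forward(self, sketch, contour):
        fused = torch.cat([self.sketch_branch(sketch),
                           self.contour_branch(contour)], dim=1)
        return self.fc(fused)   # softmax is applied inside the loss

net = DualChannelNet(n_classes=250)
out = net(torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64))
```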

Proceedings ArticleDOI
03 Apr 2021
TL;DR: In this article, a one-shot learning method with a Siamese network is proposed, which requires only one training sample per class; the similarity score is computed using Euclidean distance.
Abstract: Deep Convolutional Neural Networks have been widely used in computer vision tasks like classifying an image and detecting an object within an image. To achieve state-of-the-art performance, they normally require a huge number of labeled samples. However, in facial sketch recognition tasks, collecting this number of samples is not feasible: each subject has only one sketch and one photo. To address this, a one-shot learning method with a Siamese network is proposed in this paper, since it requires only one training sample per class. The network comprises two identical model instances that share the same architecture and weights and are trained to learn the similarity between two images. The similarity score is computed using Euclidean distance. Four different activation functions are evaluated in this research to see how feasible they are for this recognition task. The results demonstrate that the most suitable activation function for this task is sigmoid, with an accuracy of 100% after about 300 learning iterations for 10-way one-shot learning. The evaluation is extended to the CUHK dataset and the results indicate the same accuracy pattern.
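The weight-sharing-plus-Euclidean-distance setup looks roughly like this in PyTorch; the backbone and the contrastive training loss are assumptions, as the abstract only specifies the shared weights, the distance measure, and the sigmoid activation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    """Two weight-sharing instances scored by Euclidean distance.
    The backbone is an illustrative stand-in."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Sequential(
            nn.Conv2d(1, 32, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(64 * 16, 128), nn.Sigmoid())

    def forward(self, sketch, photo):
        a, b = self.embed(sketch), self.embed(photo)  # shared weights
        return F.pairwise_distance(a, b)              # similarity score

def contrastive_loss(dist, same, margin=1.0):
    # Pull genuine sketch-photo pairs together, push impostors apart.
    return (same * dist.pow(2) +
            (1 - same) * F.relu(margin - dist).pow(2)).mean()

net = SiameseNet()
d = net(torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64))
loss = contrastive_loss(d, torch.tensor([1., 0., 1., 0.]))
```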

Posted Content
TL;DR: Zhang et al. as mentioned in this paper use an intermediate latent space between the two modalities to match a given face sketch image against a face photo database, and employ a bidirectional (photo -> sketch and sketch -> photo) collaborative synthesis network.
Abstract: This research features a deep-learning based framework to address the problem of matching a given face sketch image against a face photo database. The problem of photo-sketch matching is challenging because 1) there is a large modality gap between photo and sketch, and 2) the number of paired training samples is insufficient to train deep learning based networks. To circumvent the problem of the large modality gap, our approach is to use an intermediate latent space between the two modalities. We effectively align the distributions of the two modalities in this latent space by employing a bidirectional (photo -> sketch and sketch -> photo) collaborative synthesis network. A StyleGAN-like architecture is utilized to equip the intermediate latent space with rich representation power. To resolve the problem of insufficient training samples, we introduce a three-step training scheme. Extensive evaluation on a public composite face sketch database confirms the superior performance of our method compared to existing state-of-the-art methods. The proposed methodology can be employed in matching other modality pairs.

Journal ArticleDOI
22 Apr 2021
TL;DR: In this paper, a web-based sketch recognition algorithm based on Deep Neural Network (DNN), called Marcelle-Sketch, was developed, which end-users can train incrementally.
Abstract: Machine learning systems have become pervasive in modern interactive technology but provide users with little, if any, agency with respect to how their models are trained from data. In this paper, we are interested in the way novices handle learning algorithms, what they understand from the algorithms' behavior, and what strategies they may use to "make it work". We developed a web-based sketch recognition algorithm based on a Deep Neural Network (DNN), called Marcelle-Sketch, that end-users can train incrementally. We present an experimental study that investigates people's strategies and (mis)understandings in a realistic algorithm-teaching task. Our study involved 12 participants who performed individual teaching sessions using a think-aloud protocol. Our results show that participants adopted heterogeneous strategies, in which variability affected the model's performance. We highlight the importance of sketch sequencing, particularly at the early stage of the teaching task. We also found that users' understanding is facilitated by simple operations on drawings, while confusion is caused by certain inherent properties of DNNs. From these findings, we propose implications for the design of IML systems dedicated to novices and discuss the socio-cultural aspects of this research.

Proceedings ArticleDOI
10 Jan 2021
TL;DR: In this paper, a Siamese graph convolutional network (GCN) was proposed for face sketch recognition, which utilizes a deep learning method to detect the image edges, and then uses a superpixel method to segment the edge image.
Abstract: In this paper, we present a novel Siamese graph convolutional network (GCN) for face sketch recognition. To build a graph from an image, we utilize a deep learning method to detect the image edges, and then use a superpixel method to segment the edge image. Each segmented superpixel region is taken as a node, and each pair of adjacent regions forms an edge of the graph. Graphs from both a face sketch and a face photo are input into the Siamese GCN for recognition. A deep graph matching method is used to share messages between cross-modal graphs in this model. Experiments show that the GCN obtains high performance on several face photo-sketch datasets, including seen and unseen face photo-sketch datasets. It is also shown that the model based on the graph-structure representation of the data using the Siamese GCN is more stable than a Siamese CNN model.

Book ChapterDOI
03 Jul 2021
TL;DR: Through the learning and training process of sketch reconstruction, the features of the images are also mapped to the sketches, which strengthens the architectural relationships in the sketch, so that the original sketch can gradually approach the building images, making sketch-based modeling technology possible.
Abstract: Architects usually develop ideation and conception by hand-sketching. Sketching is a direct expression of the architect's creativity. But 2D sketches are often vague, intentional and even ambiguous. In research on sketch-based modeling, the most difficult part is making the computer recognize the sketches. With the development of artificial intelligence, especially deep learning technology, Convolutional Neural Networks (CNNs) have shown obvious advantages in feature extraction and matching, and Generative Adversarial Networks (GANs) have made great breakthroughs in architectural generation, which has made image-to-image translation increasingly popular. As building images gradually develop from original sketches, in this research we try to develop a system that goes from sketches to images of buildings using the CycleGAN algorithm. The experiment demonstrates that this method can achieve the mapping from sketches to images, and the results show that the sketches' features can be recognized in the process. Through the learning and training process of sketch reconstruction, the features of the images are also mapped to the sketches, which strengthens the architectural relationships in the sketch, so that the original sketch can gradually approach the building images, making sketch-based modeling technology possible.
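The cycle-consistency idea at the core of CycleGAN can be wired up in a few lines of PyTorch; the one-layer generators below are toy placeholders, and only the loss structure reflects the actual algorithm, which additionally trains two adversarial discriminators.

```python
import torch
import torch.nn as nn

# Cycle-consistency wiring: G maps sketches to building images,
# F maps images back to sketches.
G = nn.Conv2d(1, 3, 3, padding=1)    # sketch -> image (toy generator)
F_ = nn.Conv2d(3, 1, 3, padding=1)   # image  -> sketch (toy generator)
l1 = nn.L1Loss()

sketch = torch.rand(4, 1, 64, 64)
image = torch.rand(4, 3, 64, 64)
# Translating to the other domain and back should reproduce the input.
cycle_loss = l1(F_(G(sketch)), sketch) + l1(G(F_(image)), image)
cycle_loss.backward()
```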

DOI
01 Jan 2021
TL;DR: In this paper, an approach based on automated plan recognition is presented to capture sketches of systems engineering models and incrementally formalize them using specific representations, together with a first implementation of the approach using AI plan recognition.
Abstract: The transition to Computer-Aided Systems Engineering (CASE) changed engineers' day-to-day tasks in many disciplines, such as mechanical or electronic engineering. System engineers are still looking for the right set of tools to embrace this opportunity. Indeed, they deal with many kinds of data which evolve considerably during the development life cycle. Model-Based Systems Engineering (MBSE) should be an answer to that but has failed to convince system engineers and architects and to gain their acceptance. The complexity of creating, editing, and annotating systems engineering models stems from different sources: high abstraction levels, static representations, complex interfaces, and the time-consuming activities needed to keep a model and its associated diagrams consistent. As a result, system architects still rely heavily on traditional methods (whiteboards, paper, and pens) to outline a problem and its solution, and then rely on expert modelers to digitize the informal data in modeling tools. In this chapter, we present an approach based on automated plan recognition to capture sketches of systems engineering models and to incrementally formalize them using specific representations. We present a first implementation of our approach with AI plan recognition, and we detail an experiment on applying plan recognition to systems engineering.

Journal ArticleDOI
TL;DR: This paper proposes a Transformer-based network, dubbed AttentiveNet, for sketch recognition that incorporates ordinal information to perform the classification task in real time on vector images and performs favorably against state-of-the-art techniques.
Abstract: Sketches have been employed since the ancient era of cave paintings as simple illustrations to represent real-world entities and for communication. Their abstract nature and varied artistic styling make automatic recognition of these drawings more challenging than other areas of image classification. Moreover, representing sketches as a sequence of strokes instead of raster images introduces them at the right level of abstraction, although dealing with images as sequences of small pieces of information is challenging. In this paper, we propose a Transformer-based network, dubbed AttentiveNet, for sketch recognition. The architecture incorporates ordinal information to perform the classification task in real time on vector images. We employ the proposed model to isolate the discriminating strokes of each doodle using the attention mechanism of Transformers and perform an in-depth qualitative analysis of the isolated strokes for classification of the sketch. Experimental evaluation validates that the proposed network performs favorably against state-of-the-art techniques.
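A Transformer encoder over a stroke-point sequence can be sketched in a few lines of PyTorch; the (dx, dy, pen-state) input format, the learned positional (ordinal) embedding, and all layer sizes here are assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class AttentiveSketchClassifier(nn.Module):
    """Transformer encoder over a stroke-point sequence (assumed setup)."""
    def __init__(self, n_classes, d_model=128, n_layers=4, n_heads=8,
                 max_len=512):
        super().__init__()
        self.proj = nn.Linear(3, d_model)           # (dx, dy, pen-state)
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, seq):                    # seq: (B, N, 3), N <= max_len
        h = self.proj(seq) + self.pos[:, :seq.size(1)]  # add ordinal info
        h = self.encoder(h)                    # self-attention over strokes
        return self.head(h.mean(dim=1))        # average-pool, then classify

logits = AttentiveSketchClassifier(n_classes=345)(torch.rand(2, 100, 3))
```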

Posted ContentDOI
17 May 2021
TL;DR: A novel algorithm based on a double-channel convolutional neural network with a softmax classifier that can effectively improve the recognition rate of hand-drawn sketches.
Abstract: In the task of hand-drawn sketch recognition, traditional deep learning methods suffer from insufficient feature extraction and low recognition rates. To address this, a novel algorithm based on a double-channel convolutional neural network is proposed. First of all, the hand-drawn sketch is preprocessed to obtain a smooth sketch, and a contour extraction algorithm is adopted to obtain the contour of the sketch. The sketch and its contour are then used as the input images of the CNN. Finally, by performing feature fusion at the fully connected layer, the classification results are obtained using a softmax classifier. The experimental results show that the proposed method can effectively improve the recognition rate of hand-drawn sketches.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, the authors show that the ecological validity of the data collection protocol and the ability to accommodate small datasets are significant factors impacting recognizer accuracy in realistic scenarios, using sketch-based gaming as a use case, demonstrating that deep learning methods, as well as more traditional methods, suffer significantly from dataset shift.
Abstract: Sketch recognition algorithms are engineered and evaluated using publicly available datasets contributed by the sketch recognition community over the years. While existing datasets contain sketches of a limited set of generic objects, each new domain inevitably requires collecting new data for training domain specific recognizers. This gives rise to two fundamental concerns: First, will the data collection protocol yield ecologically valid data? Second, will the amount of collected data suffice to train sufficiently accurate classifiers? In this paper, we draw attention to these two concerns. We show that the ecological validity of the data collection protocol and the ability to accommodate small datasets are significant factors impacting recognizer accuracy in realistic scenarios. More specifically, using sketch-based gaming as a use case, we show that deep learning methods, as well as more traditional methods, suffer significantly from dataset shift. Furthermore, we demonstrate that in realistic scenarios where data is scarce and expensive, standard measures taken for adapting deep learners to small datasets fall short of comparing favorably with alternatives. Although transfer learning, and extensive data augmentation help deep learners, they still perform significantly worse compared to standard setups (e.g., SVMs and GBMs with standard feature representations). We pose learning from small datasets as a key problem for the deep sketch recognition field, one which has been ignored in the bulk of the existing literature.
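As a concrete example of the "standard setups" the paper found competitive on small datasets, here is an SVM on handcrafted features with scikit-learn; the HOG feature choice and the SVM parameters are assumptions, not the paper's exact configuration, and the data is a toy stand-in.

```python
import numpy as np
from skimage.feature import hog
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def featurize(imgs):  # imgs: (n, 64, 64) grayscale sketches in [0, 1]
    return np.array([hog(im, pixels_per_cell=(8, 8)) for im in imgs])

X = featurize(np.random.rand(40, 64, 64))     # placeholder sketch data
y = np.repeat(np.arange(4), 10)               # 4 classes, 10 sketches each
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10))
clf.fit(X, y)
print(clf.score(X, y))
```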

Posted Content
TL;DR: Wang et al. as mentioned in this paper proposed a hierarchical residual network for sketch recognition and evaluated it thoroughly on the TU-Berlin benchmark; the experimental results show that the proposed network outperforms most baseline methods and is excellent among non-sequential models.
Abstract: With the widespread use of touch-screen devices, it is more and more convenient for people to draw sketches on screen. This results in a demand for automatically understanding the sketches, making the sketch recognition task more significant than before. To accomplish this task, it is necessary to solve the critical issue of improving the distinctiveness of sketch features. To this end, we have made efforts in three aspects. First, a novel multi-scale residual block is designed. Compared with the conventional basic residual block, it can better perceive multi-scale information and reduces the number of parameters during training. Second, a hierarchical residual structure is built by stacking multi-scale residual blocks in a specific way. In contrast with a single-level residual structure, the features learned from this structure are more sufficient. Last but not least, the compact triplet-center loss is proposed specifically for the sketch recognition task. It addresses the problem that the triplet-center loss does not fully account for overly large intra-class distances and overly small inter-class distances in the sketch domain. By combining the above modules, a hierarchical residual network is proposed for sketch recognition and evaluated thoroughly on the TU-Berlin benchmark. The experimental results show that the proposed network outperforms most baseline methods and is excellent among current non-sequential models.
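For reference, the triplet-center loss that the proposed compact variant builds on (He et al., CVPR 2018) pulls each feature toward its class center while pushing it a margin away from the nearest other center; the compact modification itself is not spelled out in the abstract, so only the base form is sketched below.

```python
import torch
import torch.nn as nn

class TripletCenterLoss(nn.Module):
    """Base triplet-center loss: pull each feature toward its class
    center, push it at least `margin` away from the nearest other
    center. (The paper's compact variant is not detailed here.)"""
    def __init__(self, n_classes, feat_dim, margin=1.0):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_classes, feat_dim))
        self.margin = margin

    def forward(self, feats, labels):
        d = torch.cdist(feats, self.centers)               # (B, n_classes)
        d_pos = d.gather(1, labels.unsqueeze(1)).squeeze(1)
        d_neg = d.scatter(1, labels.unsqueeze(1),
                          float("inf")).min(dim=1).values  # nearest other center
        return torch.relu(d_pos + self.margin - d_neg).mean()

loss = TripletCenterLoss(250, 128)(torch.randn(8, 128),
                                   torch.randint(0, 250, (8,)))
```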

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a triplet network with spatial pyramid pooling to deal with different sizes of images, and an attention model on the image space is proposed to extract features from the same location in the photo and sketch.
Abstract: In this paper, a novel triplet network is proposed for face sketch recognition. A spatial pyramid pooling layer is introduced into the network to deal with images of different sizes, and an attention model over the image space is proposed to extract features from the same locations in the photo and sketch. Our attention mechanism improves recognition accuracy by searching for similar regions of the images, which contain abundant information for distinguishing different persons in photos and sketches, so that the cross-modality differences between photo and sketch images are reduced when they are mapped into a common feature space. Our proposed solution is tested on composite face photo-sketch datasets, including the UoM-SGFS and e-PRIP datasets, and achieves better performance than state-of-the-art results. In particular, for Set B of the UoM-SGFS dataset, the accuracy is higher than 81%.
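Spatial pyramid pooling is a standard layer for making a fixed-length output independent of input resolution; a minimal PyTorch version follows, with the pyramid levels assumed rather than taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialPyramidPooling(nn.Module):
    """SPP layer: pool the feature map at several grid sizes so any
    input resolution yields a fixed-length vector (levels assumed)."""
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x):                        # x: (B, C, H, W), any H, W
        pooled = [F.adaptive_max_pool2d(x, l).flatten(1) for l in self.levels]
        return torch.cat(pooled, dim=1)          # (B, C * sum(l * l))

spp = SpatialPyramidPooling()
a = spp(torch.rand(2, 64, 30, 40))
b = spp(torch.rand(2, 64, 17, 23))
assert a.shape == b.shape                        # size-independent output
```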

Journal ArticleDOI
TL;DR: The project describes the design of a system for face sketch recognition using computer vision approaches such as the Discrete Cosine Transform (DCT) and the Local Binary Pattern Histogram (LBPH) algorithm, together with a supervised machine learning model, the Support Vector Machine (SVM), for face recognition.
Abstract: Nowadays, the need for technologies for the identification, detection and recognition of suspects has increased. One of the most common biometric techniques is face recognition, since the face is a convenient way for people to identify each other. Understanding how humans recognize face sketches drawn by artists is of significant value to both criminal investigators and forensic researchers in computer vision. However, studies show that hand-drawn face sketches are still very limited in terms of artists and number of sketches, because after an incident a forensic artist prepares a suspect's sketch based on the description provided by an eyewitness. A suspect may use a mask to hide common facial features like the nose, eyes, lips or face color, but the outline features of the facial biometrics can never be hidden. Here we concentrate on specific facial geometric features that can be used to calculate similarity ratios between a template photograph database and the forensic sketches. The project describes the design of a system for face sketch recognition using computer vision approaches such as the Discrete Cosine Transform (DCT) and the Local Binary Pattern Histogram (LBPH) algorithm, together with a supervised machine learning model, the Support Vector Machine (SVM), for face recognition. The GUI is built with Tkinter, the standard GUI library for Python, which provides a fast and powerful object-oriented interface to the Tk toolkit.
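OpenCV's contrib module ships the LBPH recognizer named in the abstract; a minimal usage sketch follows, where the arrays are random placeholders for the template photo database and a probe sketch, and the photo/sketch pairing logic is omitted.

```python
import cv2
import numpy as np

# Requires the opencv-contrib-python package for the cv2.face module.
recognizer = cv2.face.LBPHFaceRecognizer_create()

train_imgs = [np.random.randint(0, 256, (100, 100), np.uint8)
              for _ in range(4)]                       # gallery photos
train_labels = np.array([0, 0, 1, 1], dtype=np.int32)  # person IDs
recognizer.train(train_imgs, train_labels)

probe = np.random.randint(0, 256, (100, 100), np.uint8)   # forensic sketch
label, distance = recognizer.predict(probe)  # lower distance = closer match
```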

10 May 2021
TL;DR: It is expected that this approach will make it possible not only to improve the recognition of children's drawings, but also to measure learning ability and child development through those drawings, and to use it in child psychotherapy through emotion recognition.
Abstract: Due to their unique characteristics, infant paintings have a significantly lower recognition rate than adult drawings. According to studies of infant art, infant paintings have many features that differ from adult drawings, such as frequent self-centered and exaggerated expressions. In this paper, we introduce a method to improve the recognition rate of such children's drawings by utilizing deep learning. We create a pre-processor that generalizes the unique characteristics of children's drawings to compensate for their low recognition rate, and first refine the data. High accuracy was obtained by securing 80 adult sketches for each of 250 classified items and running them through a CNN, which is widely used for image recognition. Through this research, it is expected that it will be possible not only to improve the recognition of children's drawings, but also to measure learning ability and child development through those drawings, and to use it in child psychotherapy through emotion recognition.

Patent
30 Mar 2021
TL;DR: Zhang et al. as discussed by the authors used channel attention and vertical-flip spatial attention to optimize the features of a convolutional neural network, so that the network focuses on learning the parts with higher discrimination; as a result, the recognition precision of freehand sketches can be effectively improved.
Abstract: The invention discloses a freehand sketch recognition method based on an attention mechanism. The method comprises the steps of: inputting an original freehand sketch into a deep convolutional neural network and obtaining the feature map output by the last convolution layer; inputting the feature map into a channel attention module to obtain a channel-attention-optimized feature map; training a classification network for predicting the vertical flip of a freehand sketch, and inputting the original freehand sketch into the trained classification network to obtain a vertical-flip spatial attention map; combining the channel-attention-optimized feature map and the vertical-flip spatial attention map to compute a feature map optimized by the vertical-flip spatial attention; and finally, outputting the recognition result through a fully connected layer. The method adopts channel attention and vertical-flip spatial attention to optimize the features of the convolutional neural network, so that the network focuses on learning the parts with higher discrimination; as a result, the recognition precision of freehand sketches can be effectively improved.
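The channel attention step can be illustrated with a common squeeze-and-excitation-style module in PyTorch; the patent does not disclose its exact module layout, so this is an assumed, typical form.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention (assumed form)."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())

    def forward(self, x):                    # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))      # global pool -> channel weights
        return x * w.unsqueeze(-1).unsqueeze(-1)

y = ChannelAttention(64)(torch.randn(1, 64, 16, 16))
```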

Proceedings ArticleDOI
10 Jan 2021
TL;DR: Wang et al. as mentioned in this paper adopted a novel method employing a Graph Convolutional Network (GCN) to extract an invariable structural feature under any order of strokes, and further split the temporal information of a sketch into two types of features, the invariable structural feature (ISF) and the drawing habits feature (DHF), with the aim of finer feature extraction from temporal information.
Abstract: Sketch recognition is essential in sketch-related research. Unlike natural images, the sparse pixel distribution of sketches discards visual texture, which encourages researchers to explore the temporal information of sketches. Using million-scale datasets, we explore the invariable structure and specific order of strokes in sketches. Prior works based on Recurrent Neural Networks (RNNs) output different features when stroke orders change. We instead adopt a novel method that employs a Graph Convolutional Network (GCN) to extract an invariable structural feature under any order of strokes. Compared with the traditional comprehension of sketches, we further split the temporal information of a sketch into two types of features, the invariable structural feature (ISF) and the drawing habits feature (DHF), with the aim of finer feature extraction from temporal information. We propose a two-branch GCN-RNN network, Sketch-SNet, to extract the two types of features respectively. The GCN branch extracts the ISF by receiving variously shuffled strokes of an input sketch. The RNN branch takes the original order to extract the DHF by learning the pattern of the strokes' order. Extensive experiments on the QuickDraw dataset demonstrate that our further subdivision of temporal information improves the performance of sketch recognition, surpassing the state of the art by a large margin.

Book ChapterDOI
02 Nov 2021
TL;DR: In this article, the authors presented a new game based on sketch recognition, where the player can contribute to the game by drawing and the sketches can provide various scenarios to the interfaces with their intuitive, illustrative and abstract nature.
Abstract: Human freehand sketches can provide various scenarios to the interfaces with their intuitive, illustrative, and abstract nature. Although freehand sketches have been powerful tools for communication and have been studied in different contexts, their capacity to create compelling interactions in games is still under-explored. In this study, we present a new game based on sketch recognition. Specifically, we train various neural networks (Recurrent Neural Networks and Convolutional Neural Networks) and use different classification algorithms (Support Vector Machines and k-Nearest Neighbors) on sketches to create an interactive game interface where the player can contribute to the game by drawing. To measure usability, technology acceptance, immersion, and playfulness aspects, 18 participants played the game and answered the questionnaires composed of four different scales. Technical results and user tests demonstrate the capability and potential of sketch integration as a communication tool to construct an effective and responsive visual medium for novel interactive game experiences.

Proceedings ArticleDOI
19 May 2021
TL;DR: In this paper, the authors compared the performance of four pre-trained CNN architectures, namely ResNet101, DenseNet, MobileNetV2, and ShuffleNetV2, in classifying clothing sketches that have intra-class variations.
Abstract: A sketch is a unique source of information. Unlike an image (photograph), a sketch contains mostly edges and fewer textures; thus, it has fewer visual cues. Besides, it contains inherent intra-class variations – the geometric and shape variations between sketches of the same object – which lead to a more challenging recognition/classification (R/C) task. Sketch R/C is required in many areas; however, explorations in this direction – particularly those using deep neural networks – are still limited. This work thus aims to present and compare the performance of four pre-trained convolutional neural network (CNN) architectures, namely ResNet101, DenseNet, MobileNetV2, and ShuffleNetV2, in classifying clothing sketches that have intra-class variations and fewer visual cues. We measured the training accuracy, training time, validation accuracy, and testing accuracy. The simulation results showed that the CNN models performed well in classifying clothing sketches, with overall testing accuracy above 94%. All architectures achieved their highest accuracy when trained for 20 epochs.

Proceedings ArticleDOI
14 Apr 2021
TL;DR: In this article, an intelligent user interface is described to provide automated real-time feedback on hand-drawn free body diagrams that is capable of analyzing the internal forces of a sketched truss to evaluate open-ended design problems.
Abstract: Engineering students need practical, open-ended problems to help them build their problem-solving skills and design abilities. However, large class sizes create a grading challenge for instructors, as there is simply not enough time or support to provide adequate feedback on many design problems. In this work, we describe an intelligent user interface that provides automated real-time feedback on hand-drawn free body diagrams and is capable of analyzing the internal forces of a sketched truss to evaluate open-ended design problems. The system is driven by sketch recognition algorithms developed for recognizing trusses and a robust linear-algebra approach for analyzing them. Students in an introductory statics course were assigned a truss design problem as a homework assignment using either paper or our software. We used conventional content analysis on four focus groups totaling 16 students to identify key aspects of their experiences with the design problem and our software. We found that the software correctly analyzed all student submissions, that students enjoyed the problem compared to typical homework assignments, and that students found the problem to be good practice. Additionally, students using our software reported less difficulty understanding the problem, and the majority of all students said they would prefer the software approach over pencil and paper. We also evaluated the recognition performance on a set of 3000 sketches, resulting in an f-score of 0.997. We manually reviewed the submitted student work, which showed that the handful of student complaints about recognition were largely due to user error.
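A linear-algebra approach to truss analysis typically reduces to solving the joint-equilibrium equations (method of joints); here is a minimal NumPy sketch on an assumed triangle truss, with geometry, supports, and loads chosen purely for illustration rather than taken from the paper.

```python
import numpy as np

# Method of joints for a minimal determinate triangle truss.
nodes = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (2.0, 3.0)}
members = [("A", "B"), ("A", "C"), ("B", "C")]
reactions = [("A", (1, 0)), ("A", (0, 1)), ("B", (0, 1))]  # pin at A, roller at B
loads = {"C": (0.0, -10.0)}                                # 10 kN down at apex

names = list(nodes)
A = np.zeros((2 * len(names), len(members) + len(reactions)))
b = np.zeros(2 * len(names))
row = lambda node, axis: 2 * names.index(node) + axis

# Member columns: tension-positive axial force along the member.
for j, (p, q) in enumerate(members):
    (xp, yp), (xq, yq) = nodes[p], nodes[q]
    L = np.hypot(xq - xp, yq - yp)
    ux, uy = (xq - xp) / L, (yq - yp) / L
    A[row(p, 0), j] += ux; A[row(p, 1), j] += uy
    A[row(q, 0), j] -= ux; A[row(q, 1), j] -= uy

# Reaction columns, then external loads moved to the right-hand side.
for k, (node, (dx, dy)) in enumerate(reactions):
    A[row(node, 0), len(members) + k] += dx
    A[row(node, 1), len(members) + k] += dy
for node, (fx, fy) in loads.items():
    b[row(node, 0)] -= fx
    b[row(node, 1)] -= fy

forces, *_ = np.linalg.lstsq(A, b, rcond=None)   # member forces + reactions
for (p, q), f in zip(members, forces):
    print(f"{p}{q}: {f:+.2f} kN ({'tension' if f > 0 else 'compression'})")
```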
