Topic

Sketch recognition

About: Sketch recognition is a research topic. Over the lifetime, 1611 publications have been published within this topic receiving 40284 citations.


Papers
Proceedings ArticleDOI
30 Jul 2006
TL;DR: In this article, a generic sketch recognition system was proposed to enable more natural interaction with design tools in various domains, such as mechanical engineering, military planning, logic design, etc.
Abstract: We are interested in enabling a generic sketch recognition system that would allow more natural interaction with design tools in various domains, such as mechanical engineering, military planning, logic design, etc. We would like to teach the system the symbols for a particular domain by simply drawing an example of each one -- as easy as it is to teach a person. Studies in cognitive science suggest that, when shown a symbol, people attend preferentially to certain geometric features. Relying on such biases, we built a system capable of learning descriptions of hand-drawn symbols from a single example. The generalization power is derived from a qualitative vocabulary reflecting human perceptual categories and a focus on perceptually relevant global properties of the symbol. Our user study shows that the system agrees with the subjects' majority classification about as often as any individual subject did.

7 citations

Posted Content
TL;DR: In this article, a model of learning Sketch Bidirectional Encoder Representation from Transformer (Sketch-BERT) was proposed to improve the performance of the downstream tasks of sketch recognition, sketch retrieval, and sketch gestalt.
Abstract: Previous researches of sketches often considered sketches in pixel format and leveraged CNN based models in the sketch understanding. Fundamentally, a sketch is stored as a sequence of data points, a vector format representation, rather than the photo-realistic image of pixels. SketchRNN studied a generative neural representation for sketches of vector format by Long Short Term Memory networks (LSTM). Unfortunately, the representation learned by SketchRNN is primarily for the generation tasks, rather than the other tasks of recognition and retrieval of sketches. To this end and inspired by the recent BERT model, we present a model of learning Sketch Bidirectional Encoder Representation from Transformer (Sketch-BERT). We generalize BERT to sketch domain, with the novel proposed components and pre-training algorithms, including the newly designed sketch embedding networks, and the self-supervised learning of sketch gestalt. Particularly, towards the pre-training task, we present a novel Sketch Gestalt Model (SGM) to help train the Sketch-BERT. Experimentally, we show that the learned representation of Sketch-BERT can help and improve the performance of the downstream tasks of sketch recognition, sketch retrieval, and sketch gestalt.
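As the abstract notes, a sketch is natively stored as a sequence of data points rather than a pixel grid. A minimal illustration of converting absolute stroke points into the offset-based vector format consumed by models like SketchRNN and Sketch-BERT (the function name and the exact three-element encoding here are illustrative assumptions, not the paper's code):

```python
# Convert strokes of absolute (x, y) points into a flat sequence of
# (dx, dy, pen_lifted) triples -- an offset-based vector sketch format.
def to_offset_format(strokes):
    seq = []
    prev = (0, 0)
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            last = (i == len(stroke) - 1)  # pen lifts at the end of each stroke
            seq.append((x - prev[0], y - prev[1], 1 if last else 0))
            prev = (x, y)
    return seq

# Two short strokes of a toy sketch
strokes = [[(0, 0), (10, 0)], [(10, 10), (20, 10)]]
print(to_offset_format(strokes))  # [(0, 0, 0), (10, 0, 1), (0, 10, 0), (10, 0, 1)]
```

Storing offsets rather than absolute coordinates makes the sequence translation-invariant, which is one reason vector-format models favor it.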

7 citations

Book ChapterDOI
03 Sep 2007
TL;DR: Using an emotional multimodal bilingual database for Spanish and Basque, emotion recognition rates in speech have significantly improved for both languages compared with previous studies.
Abstract: The study of emotions in human-computer interaction is a growing research area. Focusing on automatic emotion recognition, work is being performed in order to achieve good results, particularly in speech and facial gesture recognition. This paper presents a study where, using a wide range of speech parameters, improvement in emotion recognition rates is analyzed. Using an emotional multimodal bilingual database for Spanish and Basque, emotion recognition rates in speech have significantly improved for both languages compared with previous studies. In this particular case, as in previous studies, machine learning techniques based on evolutive algorithms (EDA) have proven to be the best emotion recognition rate optimizers.

7 citations

Proceedings ArticleDOI
01 Sep 2014
TL;DR: This research points out a possibility of integrating together raw ink, direct manipulation and indirect command in many gesture-based complex applications such as a sketch drawing application.
Abstract: In most applications of touch-based human-computer interaction, multi-touch gestures are used for directly manipulating the interface, such as scaling, panning, etc. In this paper, we propose using multi-touch gestures as indirect commands, such as redo, undo, erase, etc., for the operating system. The proposed recognition system is guided by temporal, spatial and shape information. This is achieved using a graph embedding approach where all previous information is used. We evaluated our multi-touch recognition system on a set of 18 different multi-touch gestures. With this graph embedding method and an SVM classifier, we achieve a 94.50% recognition rate. We believe that our research points out a possibility of integrating together raw ink, direct manipulation and indirect command in many gesture-based complex applications such as a sketch drawing application.
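The pipeline this abstract describes, mapping each gesture to a fixed-length feature vector and classifying it with an SVM, can be sketched as follows. The feature vectors and labels below are toy stand-ins, not the paper's graph-embedding features, and scikit-learn's `SVC` is used as a stand-in classifier:

```python
# Toy sketch of the classify-gestures-with-an-SVM step: each multi-touch
# gesture is reduced to a fixed-length feature vector, then fed to an SVM.
from sklearn.svm import SVC

# Stand-in features per gesture (e.g. normalized duration, extent, touch count)
X_train = [[0.1, 0.9, 2], [0.2, 0.8, 2], [0.9, 0.1, 1], [0.8, 0.2, 1]]
y_train = ["undo", "undo", "erase", "erase"]

clf = SVC(kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)

# A new gesture whose features sit near the "undo" training examples
print(clf.predict([[0.15, 0.85, 2]])[0])
```

In the paper itself the input to the classifier is a vector embedding of a gesture graph that encodes temporal, spatial and shape information; the SVM stage is otherwise as sketched.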

7 citations

01 Jan 2006
TL;DR: This thesis presents a novel multiview correspondence algorithm which automatically establishes correspondences between unordered views of a free-form object with O(N) complexity, and presents a novel algorithm for 3D free-form object recognition and segmentation in complex scenes containing clutter and occlusions.
Abstract: The aim of visual recognition is to identify objects in a scene and estimate their pose. Object recognition from 2D images is sensitive to illumination, pose, clutter and occlusions. Object recognition from range data on the other hand does not suffer from these limitations. An important paradigm of recognition is model-based whereby 3D models of objects are constructed offline and saved in a database, using a suitable representation. During online recognition, a similar representation of a scene is matched with the database for recognizing objects present in the scene. A 3D model of a free-form object is constructed offline from its multiple range images (views) acquired from different viewpoints. These views are registered in a common coordinate basis by establishing correspondences between them followed by their integration into a seamless 3D model. Automatic correspondences between overlapping views is the major problem in 3D modeling. This problem becomes more challenging when the views are unordered and hence there is no a priori knowledge about which view pairs overlap. The main challenges in the online recognition phase are the presence of clutter due to unwanted objects and noise, and the presence of occluding objects. This thesis addresses the above challenges and investigates novel representations and matching techniques for 3D free-form rigid object and non-rigid face recognition. A robust representation based on third order tensors is presented. The tensor representation quantizes local surface patches of an object into three-dimensional grids. Each grid is defined in an object centered local coordinate basis which makes the tensors invariant to rigid transformations. This thesis presents a novel multiview correspondence algorithm which automatically establishes correspondences between unordered views of a free-form object with O(N) complexity. 
It also presents a novel algorithm for 3D free-form object recognition and segmentation in complex scenes containing clutter and occlusions. The combination of the strengths of the tensor representation and the customized use of a 4D hash table for matching constitute the basic ingredients of these algorithms. This thesis demonstrates the superiority of the tensor representation in terms of descriptiveness compared to an existing competitor, i.e. the spin images. It also demonstrates that the proposed correspondence and recognition algorithms outperform the spin image recognition in terms of accuracy and efficiency. The tensor representation is extended to automatic and pose invariant 3D face recognition. As the face is a non-rigid object, expressions can significantly change its 3D shape. Therefore, the last part of this thesis investigates representations and matching techniques for automatic 3D face recognition which are robust to facial expressions. A number of novelties are proposed in this area along with their extensive experimental validation using the largest available 3D face database. These novelties include a region-based matching algorithm for 3D face recognition, a 2D and 3D multimodal hybrid face recognition algorithm, fully automatic 3D nose ridge detection, fully automatic normalization of 3D and 2D faces, a low cost rejection classifier based on a novel Spherical Face Representation, and finally, automatic segmentation of the expression insensitive regions of a face.
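The matching step described above relies on hashing local descriptors into a table so that candidate model matches are retrieved by lookup rather than by a linear scan over the database. A minimal sketch of that idea, with made-up four-component descriptors standing in for the thesis's third-order tensor features and its 4D hash table:

```python
# Toy descriptor hashing for matching: quantize each local descriptor into
# an integer key and bucket model names by key, so candidate matches are
# found by constant-time lookup instead of scanning every model.
from collections import defaultdict

def hash_key(desc, bin_size=0.25):
    # Quantize each descriptor component into a bin index
    return tuple(int(v // bin_size) for v in desc)

table = defaultdict(list)
model_descriptors = {
    "mug":   [(0.10, 0.90, 0.30, 0.55)],
    "phone": [(0.80, 0.20, 0.70, 0.10)],
}
for name, descs in model_descriptors.items():
    for d in descs:
        table[hash_key(d)].append(name)

# A scene descriptor close to the mug's lands in the same bucket
print(table[hash_key((0.12, 0.88, 0.28, 0.52))])  # ['mug']
```

Real systems additionally probe neighboring bins and verify candidates geometrically, since quantization can split near-identical descriptors across bucket boundaries.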

7 citations


Network Information
Related Topics (5)
Feature (computer vision)
128.2K papers, 1.7M citations
84% related
Object detection
46.1K papers, 1.3M citations
83% related
Feature extraction
111.8K papers, 2.1M citations
82% related
Image segmentation
79.6K papers, 1.8M citations
81% related
Convolutional neural network
74.7K papers, 2M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year	Papers
2023	26
2022	71
2021	30
2020	29
2019	46
2018	27