
Showing papers on "Sketch recognition published in 2009"


Journal ArticleDOI
TL;DR: A novel face photo-sketch synthesis and recognition method using a multiscale Markov Random Fields (MRF) model that allows effective matching between photos and sketches in face sketch recognition.
Abstract: In this paper, we propose a novel face photo-sketch synthesis and recognition method using a multiscale Markov Random Fields (MRF) model. Our system has three components: 1) given a face photo, synthesizing a sketch drawing; 2) given a face sketch drawing, synthesizing a photo; and 3) searching for face photos in the database based on a query sketch drawn by an artist. It has useful applications for both digital entertainment and law enforcement. We assume that faces to be studied are in a frontal pose, with normal lighting and neutral expression, and have no occlusions. To synthesize sketch/photo images, the face region is divided into overlapping patches for learning. The size of the patches decides the scale of local face structures to be learned. From a training set which contains photo-sketch pairs, the joint photo-sketch model is learned at multiple scales using a multiscale MRF model. By transforming a face photo to a sketch (or transforming a sketch to a photo), the difference between photos and sketches is significantly reduced, thus allowing effective matching between the two in face sketch recognition. After the photo-sketch transformation, in principle, most of the proposed face photo recognition approaches can be applied to face sketch recognition in a straightforward way. Extensive experiments are conducted on a face sketch database including 606 faces, which can be downloaded from our Web site (http://mmlab.ie.cuhk.edu.hk/facesketch.html).
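
A minimal Python illustration of the patch-based lookup at the heart of this approach is sketched below. It substitutes each photo patch with the sketch patch of its nearest training photo patch; the patch size, step, and the nearest-neighbour shortcut (in place of the paper's multiscale MRF with neighborhood compatibility terms) are assumptions for illustration only.

```python
import numpy as np

def synthesize_sketch_nn(photo, train_photos, train_sketches, patch=16, step=8):
    """Simplified patch-based photo-to-sketch synthesis.

    photo, train_photos[i], train_sketches[i] are 2D grayscale arrays.
    For each overlapping patch of the input photo, copy the sketch patch whose
    paired training photo patch is closest in L2 distance. The full method
    additionally enforces neighborhood compatibility with a multiscale MRF;
    this nearest-neighbour version only illustrates the patch-pair lookup step.
    """
    h, w = photo.shape
    out = np.zeros((h, w), dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)

    # Build a flat dictionary of training (photo patch, sketch patch) pairs.
    keys, values = [], []
    for tp, ts in zip(train_photos, train_sketches):
        for y in range(0, tp.shape[0] - patch + 1, step):
            for x in range(0, tp.shape[1] - patch + 1, step):
                keys.append(tp[y:y+patch, x:x+patch].ravel())
                values.append(ts[y:y+patch, x:x+patch])
    keys = np.stack(keys).astype(np.float64)

    # For each photo patch, paste the sketch patch of the nearest training patch.
    for y in range(0, h - patch + 1, step):
        for x in range(0, w - patch + 1, step):
            q = photo[y:y+patch, x:x+patch].ravel().astype(np.float64)
            idx = np.argmin(((keys - q) ** 2).sum(axis=1))
            out[y:y+patch, x:x+patch] += values[idx]
            weight[y:y+patch, x:x+patch] += 1.0
    return out / np.maximum(weight, 1.0)   # average overlapping contributions
```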

753 citations


Journal ArticleDOI
TL;DR: A categorization based on how an SBIM application chooses to interpret a sketch is presented, of which there are three primary methods: to create a 3D model, to add details to an existing model, or to deform and manipulate a model.

401 citations


Book ChapterDOI
Jiahui Wu1, Gang Pan1, Daqing Zhang2, Guande Qi1, Shijian Li1 
06 Jul 2009
TL;DR: An acceleration-based gesture recognition approach, called FDSVM (Frame-based Descriptor and multi-class SVM), which needs only a wearable 3-dimensional accelerometer and gives the best results for both user-dependent and user-independent cases.
Abstract: Gesture-based interaction, as a natural way for human-computer interaction, has a wide range of applications in ubiquitous computing environments. This paper presents an acceleration-based gesture recognition approach, called FDSVM (Frame-based Descriptor and multi-class SVM), which needs only a wearable 3-dimensional accelerometer. With FDSVM, the acceleration data of a gesture is first collected and represented by a frame-based descriptor to extract the discriminative information. Then an SVM-based multi-class gesture classifier is built for recognition in the nonlinear gesture feature space. Extensive experimental results on a data set with 3360 gesture samples of 12 gestures collected over several weeks demonstrate that the proposed FDSVM approach significantly outperforms four other methods: DTW, Naive Bayes, C4.5 and HMM. In the user-dependent case, FDSVM achieves a recognition rate of 99.38% for the 4 direction gestures and 95.21% for all 12 gestures. In the user-independent case, it obtains a recognition rate of 98.93% for 4 gestures and 89.29% for 12 gestures. Compared to other accelerometer-based gesture recognition approaches reported in the literature, FDSVM gives the best results for both the user-dependent and user-independent cases.
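
A minimal sketch of the frame-based-descriptor-plus-SVM pipeline, assuming simple per-frame statistics (mean, standard deviation, energy) and generic RBF-SVM parameters rather than the paper's exact choices:

```python
import numpy as np
from sklearn.svm import SVC

def frame_descriptor(accel, n_frames=8):
    """Frame-based descriptor for one gesture: split the 3-axis acceleration
    sequence (shape (T, 3)) into n_frames segments and concatenate simple
    per-frame statistics. The paper's descriptor may use different features;
    this only illustrates the idea."""
    accel = np.asarray(accel, dtype=np.float64)
    frames = np.array_split(accel, n_frames, axis=0)
    feats = []
    for f in frames:
        feats.extend(f.mean(axis=0))            # per-axis mean
        feats.extend(f.std(axis=0))             # per-axis standard deviation
        feats.extend((f ** 2).mean(axis=0))     # per-axis energy
    return np.array(feats)

def train_fdsvm(gestures, labels):
    """Train a multi-class SVM (one-vs-one, RBF kernel) on gesture descriptors."""
    X = np.stack([frame_descriptor(g) for g in gestures])
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    clf.fit(X, labels)
    return clf
```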

228 citations


Book
01 Jan 2009
TL;DR: In this paper, results are presented for different approaches to color-to-grayscale conversion, evaluated with several well-known metrics as well as recently proposed combined ones, leading to a meaningful increase in the accuracy of image quality prediction for color distortions.

194 citations


Journal ArticleDOI
TL;DR: This book is the most comprehensive study of this field and contains a collection of 69 carefully selected articles contributed by experts of pattern recognition with respect to both methodology and applications.
Abstract: Computer recognition systems are nowadays one of the most promising directions in artificial intelligence. This book is the most comprehensive study of this field. It contains a collection of 69 carefully selected articles contributed by experts in pattern recognition. It reports on current research with respect to both methodology and applications. In particular, it includes the following sections: Features, learning and classifiers; Image processing and computer vision; Speech and word recognition; Medical applications; Miscellaneous applications. This book is a great reference tool for scientists who deal with the problems of designing computer pattern recognition systems. Its target readers include researchers as well as students of computer science, artificial intelligence or robotics.

152 citations


Book
23 Feb 2009
TL;DR: The practical basis of sketching is described (why people sketch, what significance it has in design and problem solving, and the cognitive activities it supports), and challenges and opportunities for future advances are proposed.
Abstract: Computational support for sketching is an exciting research area at the intersection of design research, human–computer interaction, and artificial intelligence. Despite the prevalence of software tools, most designers begin their work with physical sketches. Modern computational tools largely treat design as a linear process beginning with a specific problem and ending with a specific solution. Sketch-based design tools offer another approach that may fit design practice better. This review surveys literature related to such tools. First, we describe the practical basis of sketching — why people sketch, what significance it has in design and problem solving, and the cognitive activities it supports. Second, we survey computational support for sketching, including methods for performing sketch recognition and managing ambiguity, techniques for modeling recognizable elements, and human–computer interaction techniques for working with sketches. Last, we propose challenges and opportunities for future advances in this field.

136 citations


Journal Article
TL;DR: A hand gesture recognition system that recognizes real-time gestures in unconstrained environments, achieves a 90% average recognition rate, and is suitable for real-time applications is introduced.
Abstract: This paper introduces a hand gesture recognition system to recognize real-time gestures in unconstrained environments. Efforts should be made to adapt computers to our natural means of communication: speech and body language. A simple and fast algorithm using orientation histograms will be developed. It will recognize a subset of MAL static hand gestures. The pattern recognition system will use a transform that converts an image into a feature vector, which will be compared with the feature vectors of a training set of gestures. The final system will be a Perceptron implementation in MATLAB. This paper includes experiments with 33 hand postures and discusses the results. Experiments show that the system can achieve a 90% average recognition rate and is suitable for real-time applications. Keywords—Hand gesture recognition, Orientation Histogram, Myanmar Alphabet Language, Perceptron network, MATLAB.
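
A minimal sketch of the orientation-histogram feature, with the bin count and gradient threshold as assumptions and a nearest-neighbour matcher standing in for the paper's Perceptron classifier:

```python
import numpy as np

def orientation_histogram(image, n_bins=36, mag_thresh=1e-3):
    """Orientation-histogram feature for a grayscale hand image.
    The bin count and magnitude threshold are assumptions for illustration."""
    img = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(img)                     # gradients along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                      # range -pi..pi
    bins = ((ang + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins[mag > mag_thresh], minlength=n_bins).astype(np.float64)
    return hist / (hist.sum() + 1e-12)            # normalize the histogram

def classify(feature, train_feats, train_labels):
    """Nearest-neighbour matching against training histograms; the paper
    instead trains a Perceptron in MATLAB, which this stand-in replaces."""
    d = np.linalg.norm(np.asarray(train_feats) - feature, axis=1)
    return train_labels[int(np.argmin(d))]
```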

103 citations


Journal ArticleDOI
TL;DR: This paper makes use of an embedded hidden Markov model (EHMM), which can learn the nonlinearity of sketch-photo pairs with fewer training samples, to produce pseudo-photos from sketches, which makes the pseudo-photos more recognizable.

78 citations


Journal ArticleDOI
TL;DR: This paper proposes a method that relaxes the constraint that people will not start to draw a new symbol before the current one has been finished, and relies on a two-dimensional dynamic programming technique, which can correctly segment and recognize interspersed symbols.

69 citations


Book ChapterDOI
14 Jul 2009
TL;DR: It is shown that gesture recognition based on the bio-mechanical characteristics of the hand provides an intuitive approach with more accuracy and less complexity.
Abstract: Nowadays, computer interaction is mostly done using dedicated devices. But gestures are an easy means of expression between humans that could be used to communicate with computers in a more natural manner. Most of the current research on hand gesture recognition for Human-Computer Interaction relies on either neural networks or Hidden Markov Models (HMMs). In this paper, we compare different approaches for gesture recognition and highlight the major advantages of each. We show that gesture recognition based on the bio-mechanical characteristics of the hand provides an intuitive approach with more accuracy and less complexity.

Proceedings Article
11 Jul 2009
TL;DR: It is found that the entropy rate is significantly higher for text strokes compared to shape strokes and can serve as a distinguishing factor between the two and be an accurate criterion of classification.
Abstract: Most sketch recognition systems are accurate in recognizing either text or shape (graphic) ink strokes, but not both. Distinguishing between shape and text strokes is, therefore, a critical task in recognizing hand-drawn digital ink diagrams that contain text labels and annotations. We have found the 'entropy rate' to be an accurate criterion of classification. We found that the entropy rate is significantly higher for text strokes compared to shape strokes and can serve as a distinguishing factor between the two. Using a single feature -- zero-order entropy rate -- our system produced a correct classification rate of 92.06% on test data belonging to the diagrammatic domain on which the threshold was trained. It also performed favorably on an unseen domain for which no training examples were supplied.
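
A minimal sketch of the entropy criterion, assuming the stroke "alphabet" is built from quantized segment directions and that the decision threshold is tuned per domain; the exact alphabet construction and threshold in the paper may differ:

```python
import numpy as np

def stroke_entropy(points, n_symbols=8):
    """Zero-order entropy of a stroke's quantized segment directions.
    points is a sequence of (x, y) pen samples; n_symbols is an assumption."""
    pts = np.asarray(points, dtype=np.float64)
    d = np.diff(pts, axis=0)
    if len(d) == 0:
        return 0.0
    angles = np.arctan2(d[:, 1], d[:, 0])
    symbols = ((angles + np.pi) / (2 * np.pi) * n_symbols).astype(int) % n_symbols
    counts = np.bincount(symbols, minlength=n_symbols).astype(np.float64)
    p = counts / counts.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def is_text_stroke(points, threshold=2.0):
    # Text strokes tend to have higher entropy than shape strokes;
    # the threshold here is an illustrative placeholder, trained per domain.
    return stroke_entropy(points) > threshold
```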

Journal ArticleDOI
TL;DR: This paper presents a framework for learning a compact representation of primitive actions that can be used for video obtained from a single camera for simultaneous action recognition and viewpoint estimation, and shows recognition rates on a publicly available dataset previously only achieved using multiple simultaneous views.
Abstract: Action recognition from video is a problem that has many important applications to human motion analysis. In real-world settings, the viewpoint of the camera cannot always be fixed relative to the subject, so view-invariant action recognition methods are needed. Previous view-invariant methods use multiple cameras in both the training and testing phases of action recognition or require storing many examples of a single action from multiple viewpoints. In this paper, we present a framework for learning a compact representation of primitive actions (e.g., walk, punch, kick, sit) that can be used for video obtained from a single camera for simultaneous action recognition and viewpoint estimation. Using our method, which models the low-dimensional structure of these actions relative to viewpoint, we show recognition rates on a publicly available dataset previously only achieved using multiple simultaneous views.

Proceedings Article
07 Dec 2009
TL;DR: In this paper, a sketch recognition framework was proposed that combines a rich representation of low level visual appearance with a graphical model for capturing high level relationships between symbols, which is less sensitive to noise and drawing variations, improving accuracy and robustness.
Abstract: We propose a new sketch recognition framework that combines a rich representation of low level visual appearance with a graphical model for capturing high level relationships between symbols. This joint model of appearance and context allows our framework to be less sensitive to noise and drawing variations, improving accuracy and robustness. The result is a recognizer that is better able to handle the wide range of drawing styles found in messy freehand sketches. We evaluate our work on two real-world domains, molecular diagrams and electrical circuit diagrams, and show that our combined approach significantly improves recognition performance.

Proceedings Article
09 Apr 2009
TL;DR: Hashigo, a kanji sketch interactive system which achieves human instructor-level critique and feedback on both the visual structure and written technique of students’ sketched kanji, is described.
Abstract: Language students can increase their effectiveness in learning written Japanese by mastering the visual structure and written technique of Japanese kanji. Yet, existing kanji handwriting recognition systems do not assess the written technique sufficiently enough to discourage students from developing bad learning habits. In this paper, we describe our work on Hashigo, a kanji sketch interactive system which achieves human instructor-level critique and feedback on both the visual structure and written technique of students’ sketched kanji. This type of automated critique and feedback allows students to target and correct specific deficiencies in their sketches that, if left untreated, are detrimental to effective long-term kanji learning.

Proceedings ArticleDOI
01 Nov 2009
TL;DR: Research work from recent years is analyzed and summarized with respect to the classification of human motion, human motion feature extraction, and recognition algorithms, introducing a number of human motion recognition methods.
Abstract: With the development of computer vision and image processing technology, human motion recognition, because of its wide range of applications, has been attracting extensive attention in the field of computer vision. Vision-based recognition of people's motion not only draws on image recognition and computer vision, but also involves recognition theory and artificial intelligence, so it is an extremely challenging interdisciplinary project. Research work from recent years is analyzed and summarized with respect to the classification of human motion, human motion feature extraction, and recognition algorithms, introducing a number of human motion recognition methods.

Proceedings ArticleDOI
01 Aug 2009
TL;DR: A variant of a popular and simple gesture recognition algorithm that recognizes freely-drawn shapes as well as a highly-accurate but more complex recognizer designed explicitly for free-sketch recognition is introduced.
Abstract: Generating, grouping, and labeling free-sketch data is a difficult and time-consuming task for both user study participants and researchers. To simplify this process for both parties, we would like to have users draw isolated shapes instead of complete sketches that must be hand-labeled and grouped, and then use this data to train our free-sketch symbol recognizer. However, it is an open question whether shapes drawn in isolation accurately reflect the way users draw shapes in a complete diagram. Furthermore, many of the simplest shape recognition algorithms were designed to recognize gestures, and it is not clear that they will generalize to freely-drawn shapes. To answer these questions, we perform experiments using three different recognizers to measure the effect of the data collection task on recognition accuracy. We find that recognizers trained only on isolated shapes can classify freely-sketched shapes as well as the same recognizers trained on free-sketches. We also show that user-specific training examples significantly improve recognition rates. Finally, we introduce a variant of a popular and simple gesture recognition algorithm that recognizes freely-drawn shapes as well as a highly-accurate but more complex recognizer designed explicitly for free-sketch recognition.
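
The popular algorithm being varied is not named in this abstract; the sketch below shows a generic template-matching recognizer of that family (resample, normalize, pick the nearest template by mean point distance), purely as an illustration rather than the paper's variant:

```python
import numpy as np

def resample(points, n=64):
    """Resample a stroke to n evenly spaced points along its arc length."""
    pts = np.asarray(points, dtype=np.float64)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, cum[-1], n)
    out = np.empty((n, 2))
    for i, t in enumerate(targets):
        j = min(np.searchsorted(cum, t, side="right") - 1, len(seg) - 1)
        alpha = 0.0 if seg[j] == 0 else (t - cum[j]) / seg[j]
        out[i] = pts[j] + alpha * (pts[j + 1] - pts[j])
    return out

def normalize(points):
    """Translate to the centroid and scale to a unit bounding box."""
    p = resample(points)
    p -= p.mean(axis=0)
    extent = p.max(axis=0) - p.min(axis=0)
    return p / np.maximum(extent.max(), 1e-12)

def recognize(stroke, templates):
    """Return the label of the template with the smallest mean point distance.
    templates is a list of (label, points) pairs of example shapes."""
    q = normalize(stroke)
    best, best_d = None, float("inf")
    for label, tmpl in templates:
        d = np.linalg.norm(q - normalize(tmpl), axis=1).mean()
        if d < best_d:
            best, best_d = label, d
    return best
```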

Dissertation
01 Jan 2009
TL;DR: This thesis examines techniques to improve the robustness of automatic speech recognition (ASR) systems against noise distortions, and proposes to normalize the temporal structure of both training and testing speech features to reduce the feature-model mismatch.
Abstract: This thesis examines techniques to improve the robustness of automatic speech recognition (ASR) systems against noise distortions. The study is important as the performance of ASR systems degrades dramatically in adverse environments, which greatly limits the deployment of speech recognition applications in realistic environments. Towards this end, we examine a feature compensation approach and a discriminative model training approach to improve the robustness of speech recognition systems. The degradation of recognition performance is mainly due to the statistical mismatch between the clean-trained acoustic model and noisy testing speech features. To reduce the feature-model mismatch, we propose to normalize the temporal structure of both training and testing speech features. Speech features' temporal structures are represented by the power spectral density (PSD) functions of feature trajectories. We propose to normalize the temporal structures by applying equalizing filters to the feature trajectories. The proposed filter is called the temporal structure normalization (TSN) filter. Compared to other temporal filters used in speech recognition, the advantage of the TSN filter is its adaptability to changing environments. The TSN filter can also be viewed as a feature normalization technique that normalizes the PSD function of features, while other normalization methods, such as histogram equalization (HEQ), normalize the probability density function (p.d.f.) of features. Experimental study shows that the TSN filter produces better performance than other state-of-the-art temporal filters on both the small-vocabulary Aurora-2 task and the large-vocabulary Aurora-4 task. In the second study, we improve the robustness of speech recognition by improving the generalization capability of the acoustic model rather than reducing the feature-model mismatch. In the log likelihood score domain, noise distortion causes the log likelihood score of noisy features to deviate from that of clean features. The deviation may move
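
A heavily simplified Python sketch of the PSD-equalization idea behind temporal structure normalization, assuming a precomputed reference PSD and applying the equalizer directly in the frequency domain; the published TSN filter design may differ, so this only illustrates normalizing a feature trajectory's PSD toward a reference:

```python
import numpy as np
from scipy.signal import welch

def tsn_equalize(traj, ref_psd, nfft=64):
    """Normalize one feature trajectory's temporal structure toward ref_psd.

    traj: 1D trajectory of a single feature dimension over frames.
    ref_psd: reference PSD (length nfft // 2 + 1), e.g. from clean training data.
    """
    traj = np.asarray(traj, dtype=np.float64)
    _, psd = welch(traj, nperseg=min(len(traj), nfft), nfft=nfft)
    eq = np.sqrt(ref_psd / np.maximum(psd, 1e-12))      # magnitude equalizer

    spec = np.fft.rfft(traj, n=len(traj))
    # Interpolate the equalizer onto the trajectory's FFT bins and apply it.
    gain = np.interp(np.linspace(0, 1, len(spec)),
                     np.linspace(0, 1, len(eq)), eq)
    return np.fft.irfft(spec * gain, n=len(traj))
```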

Proceedings ArticleDOI
01 Aug 2009
TL;DR: With initial evaluations of these recognizers, it is observed that the context from which training data is taken has an effect on recognition success rates, suggesting that an evaluation platform such as this is a powerful adjunct for sketch recognition research.
Abstract: We present our toolkit to automatically evaluate recognition algorithms. There are few published comparative evaluations of sketch recognition algorithms, and those that exist do not provide benchmarking or direct comparisons because standardised data and an evaluation platform are not available. By unifying data collection, labelling and evaluation in one tool, fair, flexible and comprehensive evaluations are possible. Currently we have 6 existing recognizers integrated into this tool. With our initial evaluations of these recognizers we have observed that the context from which training data is taken has an effect on recognition success rates. These results suggest that an evaluation platform such as this is a powerful adjunct for sketch recognition research.

Proceedings ArticleDOI
07 Mar 2009
TL;DR: Experimental results demonstrate the effectiveness of the proposed scheme for recognizing American Sign Language, which uses a nonlinear time alignment model with key frame selection and gesture trajectory features for hand gesture recognition.
Abstract: Sign language recognition is a popular research area involving computer vision, pattern recognition and image processing. It enhances the communication capabilities of mute persons. In this paper, we present object-based key frame selection. Hausdorff distance and Euclidean distance are used to measure shape similarity for hand gesture recognition. We propose the use of a nonlinear time alignment model with a key frame selection facility and gesture trajectory features for hand gesture recognition. Experimental results demonstrate the effectiveness of our proposed scheme for recognizing American Sign Language.
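
A minimal sketch of the Hausdorff-distance shape-similarity step, assuming hand shapes are represented as 2D point sets (e.g., contour samples from a key frame); the trajectory features and time alignment are not shown:

```python
import numpy as np

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two 2D point sets.
    a and b are arrays of shape (N, 2) and (M, 2)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise distances between every point of a and every point of b.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```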

Proceedings ArticleDOI
01 Aug 2009
TL;DR: Research on sketching with computers dates to the earliest days of modern computing, and advances in sketch-based interaction and modeling can help us understand and support visual thinking.
Abstract: Research on sketching with computers dates to the earliest days of modern computing. Recent work in this area, combined with other advances in hardware and software technologies, finally promises significant impact. The kinds, qualities, and purposes of sketch-based interaction, or visual languages, vary as widely as do other forms of language. In addition to practical applications in every domain, advances in sketch-based interaction and modeling can help us understand and support visual thinking.

Proceedings ArticleDOI
01 Aug 2009
TL;DR: Initial results and evaluation show that the methods produce good 3D results in a short amount of time and with little user effort, demonstrating the usefulness of an intelligent sketching interface for this application domain.
Abstract: We present sketch-based tools for single-view modeling which allow for quick 3D mark-up of a photograph. With our interface, detailed 3D models can be produced quickly and easily. After establishing the background geometry, foreground objects can be cut out using our novel sketch-based segmentation tools. These tools make use of the stroke speed and length to help determine the user's intentions. Depth detail is added to the scene by drawing occlusion edges. Such edges play an important part in human scene understanding, and thus provide an intuitive form of input to the modeling system. Initial results and evaluation show that our methods produce good 3D results in a short amount of time and with little user effort, demonstrating the usefulness of an intelligent sketching interface for this application domain.

Proceedings ArticleDOI
01 Aug 2009
TL;DR: This article describes sketching games made for the purpose of collecting data about how people make and describe hand-made drawings, which leverages human computation, whereby players provide information about drawings in exchange for entertainment.
Abstract: This article describes sketching games made for the purpose of collecting data about how people make and describe hand-made drawings. The approach leverages human computation, whereby players provide information about drawings in exchange for entertainment. The games facilitate the collection of raw sketch input and associate it with human-provided text descriptions. Researchers may browse and download this data for their own purposes, such as training sketch recognizers. Two systems with distinct game mechanics are described: Picture-phone and Stellasketch. The system architectures are briefly presented, followed by a discussion of our initial results using sketching games as a research platform for sketch recognition and interaction.

Journal ArticleDOI
TL;DR: This paper proposes a new WSS method based on robust tools from graph theory, solid modeling and Euclidean geometry that produces a minimal wireframe sketch that corresponds to a topologically correct polyhedron.

Proceedings Article
11 Jul 2009
TL;DR: This work reports here on techniques developed that use information from both sketch and speech to distinguish gesture strokes from non-gestures -- a critical first step in understanding a sketch of a device.
Abstract: Mechanical design tools would be considerably more useful if we could interact with them in the way that human designers communicate design ideas to one another, i.e., using crude sketches and informal speech. Those crude sketches frequently contain pen strokes of two different sorts, one type portraying device structure, the other denoting gestures, such as arrows used to indicate motion. We report here on techniques we developed that use information from both sketch and speech to distinguish gesture strokes from non-gestures -- a critical first step in understanding a sketch of a device. We collected and analyzed unconstrained device descriptions, which revealed six common types of gestures. Guided by this knowledge, we developed a classifier that uses both sketch and speech features to distinguish gesture strokes from non-gestures. Experiments with our techniques indicate that the sketch and speech modalities alone produce equivalent classification accuracy, but combining them produces higher accuracy.
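
A minimal sketch of fusing sketch and speech features in a single stroke classifier; the specific features (length, straightness, a motion-word indicator), the cue word list, and the logistic-regression model below are assumptions for illustration, not the paper's feature set or classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

MOTION_WORDS = {"moves", "rotates", "pushes", "spins", "turns"}  # assumed cue list

def stroke_features(points, cooccurring_words):
    """Combine simple sketch features with a simple speech cue for one stroke.
    points: (x, y) pen samples; cooccurring_words: words spoken near the stroke."""
    pts = np.asarray(points, dtype=np.float64)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    length = seg.sum()
    chord = np.linalg.norm(pts[-1] - pts[0])
    straightness = chord / (length + 1e-12)          # 1.0 for a straight line
    has_motion_word = float(any(w.lower() in MOTION_WORDS for w in cooccurring_words))
    return [length, straightness, has_motion_word]

def train_gesture_classifier(strokes, word_lists, labels):
    """labels: 1 for gesture strokes, 0 for structural (non-gesture) strokes."""
    X = np.array([stroke_features(s, w) for s, w in zip(strokes, word_lists)])
    return LogisticRegression(max_iter=1000).fit(X, labels)
```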

Proceedings ArticleDOI
01 Aug 2009
TL;DR: A novel technique for creating a large and varied ground-truthed corpus for hand drawn math recognition via random walks through a context-free grammar, and an algorithm automatically generates ground-truth data for individual symbols and inter-symbol relationships within the math expressions.
Abstract: In sketch recognition systems, ground-truth data sets serve to both train and test recognition algorithms. Unfortunately, generating data sets that are sufficiently large and varied is frequently a costly and time-consuming endeavour. In this paper, we present a novel technique for creating a large and varied ground-truthed corpus for hand drawn math recognition. Candidate math expressions for the corpus are generated via random walks through a context-free grammar, the expressions are transcribed by human writers, and an algorithm automatically generates ground-truth data for individual symbols and inter-symbol relationships within the math expressions. While the techniques we develop in this paper are illustrated through the creation of a ground-truthed corpus of mathematical expressions, they are applicable to any sketching domain that can be described by a formal grammar.
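
A minimal sketch of generating candidate expressions by random walks through a context-free grammar; the toy grammar and depth bound below are assumptions, not the corpus grammar used in the paper:

```python
import random

# Toy context-free grammar for math expressions (productions are assumptions).
GRAMMAR = {
    "EXPR": [["TERM"], ["TERM", "+", "EXPR"], ["TERM", "-", "EXPR"]],
    "TERM": [["FACTOR"], ["FACTOR", "*", "TERM"]],
    "FACTOR": [["NUM"], ["VAR"], ["(", "EXPR", ")"], ["VAR", "^", "NUM"]],
    "NUM": [[str(d)] for d in range(10)],
    "VAR": [["x"], ["y"], ["a"], ["b"]],
}

def random_walk(symbol="EXPR", depth=0, max_depth=6):
    """Expand a nonterminal by choosing productions at random, bounding the
    recursion depth so the walk always terminates."""
    if symbol not in GRAMMAR:
        return [symbol]                              # terminal token
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        productions = [productions[0]]               # force the shortest rule
    tokens = []
    for s in random.choice(productions):
        tokens.extend(random_walk(s, depth + 1, max_depth))
    return tokens

print(" ".join(random_walk()))   # e.g. "x ^ 3 + ( y - 2 )"
```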

Proceedings ArticleDOI
09 Jun 2009
TL;DR: A definitive framework by which the user, simply by using freehand drawing, can define every kind of sketch-based interface is described by using the developed Sketch Modeling Language (SketchML).
Abstract: Multimodal interfaces can be profitably used to support increasingly complex services in assistive environments. In particular, sketch-based interfaces offer users an effortless and powerful way to communicate concepts and commands on different devices. Unlike other modalities, sketch-based interaction can be easily fitted to heterogeneous services. Moreover, it can be quickly personalized according to the user's needs. Developing a sketch-based interface for a specific service is a time-consuming operation that requires the re-engineering and/or re-designing of the whole recognizer framework. This paper describes a definitive framework by which the user, simply by using freehand drawing, can define every kind of sketch-based interface. The definition of the interface and its recognition process are performed by using our Sketch Modeling Language (SketchML).

01 May 2009
TL;DR: In this article, a high-level sketch recognition algorithm is presented that allows complete interspersing freedom, running in near real-time through early effective sub-tree pruning.
Abstract: Sketch recognition is the automated recognition of hand-drawn diagrams. When allowed to sketch as they would naturally, users may draw shapes in an interspersed manner, starting a second shape before finishing the first. In order to provide freedom to draw interspersed shapes, an exponential number of subshape combinations must be considered. Because of this, most sketch recognition systems either choose not to handle interspersing, or handle only a limited pre-defined amount of interspersing. Our goal is to eliminate such interspersing drawing constraints on the sketcher. This paper presents a high-level recognition algorithm that, while still exponential, allows for complete interspersing freedom, running in near real-time through early, effective sub-tree pruning. At the core of the algorithm is an indexing technique that takes advantage of geometric sketch recognition techniques to index each shape for efficient access and fast pruning during recognition. We have stress-tested our algorithm to show that the system recognizes shapes in less than a second even with over a hundred candidate subshapes on screen.
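
A minimal sketch of one kind of geometric pruning that such an indexing scheme can provide: only stroke subsets whose bounding boxes are mutually close are considered as candidate subshapes. The margin and group-size limit are assumptions, and this illustrates the pruning idea rather than the paper's algorithm:

```python
import itertools

def bbox(stroke):
    """Axis-aligned bounding box of a stroke given as (x, y) points."""
    xs, ys = zip(*stroke)
    return min(xs), min(ys), max(xs), max(ys)

def near(b1, b2, margin=20.0):
    """True if two bounding boxes overlap after expanding each by `margin`."""
    return not (b1[2] + margin < b2[0] or b2[2] + margin < b1[0] or
                b1[3] + margin < b2[1] or b2[3] + margin < b1[1])

def candidate_groups(strokes, max_size=4, margin=20.0):
    """Enumerate stroke subsets whose members are pairwise spatially close,
    pruning away combinations that cannot form a single shape."""
    boxes = [bbox(s) for s in strokes]
    for size in range(1, max_size + 1):
        for combo in itertools.combinations(range(len(strokes)), size):
            if all(near(boxes[i], boxes[j], margin)
                   for i, j in itertools.combinations(combo, 2)):
                yield combo
```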

Book ChapterDOI
04 Jun 2009
TL;DR: A new gesture recognition method is proposed which can get a recognition result of human gestures before the gestures have finished, and it is realized by using sparse codes of Self-Organizing Map.
Abstract: We propose a new gesture recognition method which is called "early recognition". Early recognition is a method to recognize sequential patterns at their beginning parts. Therefore, in the case of gesture recognition, we can get a recognition result of human gestures before the gestures have finished. We realize early recognition by using sparse codes of Self-Organizing Map.