
Showing papers on "Sketch recognition published in 2006"


Proceedings ArticleDOI
30 Jul 2006
TL;DR: Tahuti is a dual-view sketch recognition environment for UML class diagrams, based on a multi-layer recognition framework which recognizes multi-stroke objects by their geometrical properties, allowing users the freedom to draw naturally as they would on paper rather than requiring them to draw objects in a pre-defined manner.
Abstract: We have created and tested Tahuti, a dual-view sketch recognition environment for class diagrams in UML. The system is based on a multi-layer recognition framework which recognizes multi-stroke objects by their geometrical properties, allowing users the freedom to draw naturally as they would on paper rather than requiring them to draw objects in a pre-defined manner. Users can draw and edit while viewing either their original strokes or the interpreted version of their strokes, engendering user autonomy in sketching. The experiments showed that users preferred Tahuti to a paint program and to Rational Rose™ because it combined the ease of drawing found in a paint program with the ease of editing available in a UML editor.
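A flavor of recognition by geometrical properties can be given in a few lines. The snippet below is a hypothetical sketch, not Tahuti's code (`is_line` and its tolerance are our own inventions): it classifies a stroke as a straight line when the distance between its endpoints is nearly equal to its arc length.

```python
import math

def path_length(points):
    """Total arc length of a polyline stroke."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def is_line(points, tol=0.95):
    """Call a stroke a straight line when the endpoint-to-endpoint
    distance is at least `tol` of the stroke's arc length."""
    arc = path_length(points)
    return arc > 0 and math.dist(points[0], points[-1]) / arc >= tol
```

Because the test depends only on global geometry, the stroke can be drawn in any direction and with any number of sample points.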

203 citations


Proceedings ArticleDOI
29 Jan 2006
TL;DR: This paper describes vision-based head gesture recognition techniques and their use for common user interface commands, and explores two prototype perceptual interface components which use detected head gestures for dialog box confirmation and document browsing, respectively.
Abstract: Acknowledging an interruption with a nod of the head is a natural and intuitive communication gesture which can be performed without significantly disturbing a primary interface activity. In this paper we describe vision-based head gesture recognition techniques and their use for common user interface commands. We explore two prototype perceptual interface components which use detected head gestures for dialog box confirmation and document browsing, respectively. Tracking is performed using stereo-based alignment, and recognition proceeds using a trained discriminative classifier. An additional context learning component is described, which exploits interface context to obtain robust performance. User studies with prototype recognition components indicate quantitative and qualitative benefits of gesture-based confirmation over conventional alternatives.

85 citations


Proceedings ArticleDOI
20 Aug 2006
TL;DR: The performance of the proposed early recognition and motion prediction algorithms for gestures was evaluated in experiments on real-time control of a humanoid by gestures.
Abstract: This paper is concerned with early recognition and prediction algorithms for gestures. Early recognition provides recognition results before input gestures are completed. Motion prediction uses early recognition to predict the subsequent posture of the performer. In addition, this paper considers a gesture network for improving the performance of these algorithms. The performance of the proposed algorithms was evaluated in experiments on real-time control of a humanoid by gestures.

78 citations


Proceedings ArticleDOI
07 Jun 2006
TL;DR: By studying face geometry, this work determines which type of facial expression has been performed, building an expression classifier in addition to a face classifier capable of recognizing faces under different expressions.
Abstract: Face recognition is one of the most intensively studied topics in computer vision and pattern recognition. Facial expression, which changes face geometry, usually has an adverse effect on the performance of a face recognition system. On the other hand, face geometry is a useful cue for recognition. Taking these into account, we utilize the idea of separating geometry and texture information in a face image and model the two types of information by projecting them into separate PCA spaces which are specially designed to capture the distinctive features among different individuals. Subsequently, the texture and geometry attributes are re-combined to form a classifier which is capable of recognizing faces with different expressions. Finally, by studying face geometry, we are able to determine which type of facial expression has been carried out, thus building an expression classifier. Numerical validations of the proposed method are given.
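The separate PCA spaces above are built from standard principal component analysis. As a hedged illustration (generic PCA via power iteration on the scatter matrix, not the authors' specially designed spaces), the first principal direction of a set of feature vectors can be computed as:

```python
def principal_component(data, iters=200):
    """First principal component (unit vector) of row-vector data,
    found by power iteration on the scatter matrix X^T X."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    v = [1.0] * d
    for _ in range(iters):
        # Apply the scatter matrix as X^T (X v) without forming it.
        proj = [sum(x * vi for x, vi in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v
```

Projecting geometry vectors and texture vectors onto the top few such directions, in two separate spaces, gives the kind of low-dimensional attributes the paper re-combines for classification.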

74 citations


Proceedings Article
01 Sep 2006
TL;DR: A novel view-independent approach to the recognition of human gestures of several people in low resolution sequences from multiple calibrated cameras using a simple ellipsoid body model and a Bayesian classifier to perform recognition over a small set of actions.
Abstract: This paper presents a novel view-independent approach to the recognition of human gestures by several people in low-resolution sequences from multiple calibrated cameras. In contrast to other multi-ocular gesture recognition systems, which classify a fusion of features computed in the individual views, our system first performs data fusion (a 3D representation of the scene) and then feature extraction and classification. Motion descriptors introduced by Bobick et al. for 2D data are extended to 3D, and a set of features based on 3D invariant statistical moments is computed. A simple ellipsoid body model is fit to the incoming 3D data to capture in which body part the gesture occurs, thus increasing the recognition ratio of the overall system and generating a more informative classification output. Finally, a Bayesian classifier is employed to perform recognition over a small set of actions. Results are provided showing the effectiveness of the proposed algorithm in a SmartRoom scenario.
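One of the simplest 3D invariant statistical moments of the kind mentioned above is the trace of the scatter matrix of the point cloud. The sketch below is our own illustration (the paper's actual feature set is richer): the value is unchanged by translating or rotating the 3D data, because central moments remove translation and the trace is rotation invariant.

```python
def moment_j1(points):
    """Second-order 3D moment invariant J1 = mu200 + mu020 + mu002
    (the trace of the scatter matrix), averaged over the points.
    Central moments give translation invariance; the trace is also
    unchanged by rotations of the point cloud."""
    n = len(points)
    cx = sum(x for x, _, _ in points) / n
    cy = sum(y for _, y, _ in points) / n
    cz = sum(z for _, _, z in points) / n
    return sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
               for x, y, z in points) / n
```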

50 citations


Proceedings ArticleDOI
14 May 2006
TL;DR: This work proposes to generate a realistic face image from the composite sketch using a hybrid subspace method and then build an illumination tolerant correlation filter which can recognize the person under different illumination variations from a surveillance video footage.
Abstract: The current state-of-the-art approach to face sketch recognition transforms all the test face images into sketches, and then performs recognition in the sketch domain using the sketch composite. In our approach we propose the opposite, which has advantages in a real-time system: we generate a realistic face image from the composite sketch using a hybrid subspace method, and then build an illumination-tolerant correlation filter which can recognize the person under different illumination variations in surveillance video footage. We show how effectively the proposed algorithm works on the CMU PIE (Pose, Illumination and Expression) database.

50 citations


Proceedings ArticleDOI
01 Sep 2006
TL;DR: Experimental results demonstrate that the proposed gesture recognition system can be used reliably in recognizing some signs of native Indian sign language.
Abstract: Sign language is the most natural and expressive way for the hearing impaired. Because of this, automatic sign language recognition has long attracted vision researchers. It offers enhancement of communication capabilities for the speech and hearing impaired, promising improved social opportunities and integration. This paper describes a gesture recognition system which can recognize wide classes of hand gestures in a vision-based setup. Experimental results demonstrate that our proposed recognition system can be used reliably in recognizing some signs of native Indian sign language.

39 citations


Proceedings ArticleDOI
30 Jul 2006
TL;DR: This paper presents a threshold-free feature point detection method for free-hand strokes, based on ideas from scale-space theory, which avoids the hand-tuned thresholds that existing methods require for filtering out false positives.
Abstract: Feature point detection is generally the first step in model-based approaches to sketch recognition. Feature point detection in free-hand strokes is a hard problem because the input has noise from digitization, from natural hand tremor, and from lack of perfect motor control during drawing. Existing feature point detection methods for free-hand strokes require hand-tuned thresholds for filtering out the false positives. In this paper, we present a threshold-free feature point detection method using ideas from the scale-space theory.
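The idea of scale-space persistence can be caricatured in a few lines: keep only those stroke points that remain curvature maxima as smoothing increases. The code below is a simplified stand-in, not the paper's method (the triangular smoothing kernel and the fixed list of window sizes are our assumptions):

```python
import math

def turning_angles(points):
    """Absolute turning angle at each interior vertex of a polyline
    stroke; endpoints get angle 0."""
    angles = [0.0]
    for p, q, r in zip(points, points[1:], points[2:]):
        a1 = math.atan2(q[1] - p[1], q[0] - p[0])
        a2 = math.atan2(r[1] - q[1], r[0] - q[0])
        d = abs(a2 - a1)
        angles.append(min(d, 2 * math.pi - d))
    angles.append(0.0)
    return angles

def smooth(values, w):
    """Triangular-kernel smoothing with half-width w (a cheap stand-in
    for Gaussian scale-space smoothing)."""
    n, out = len(values), []
    for i in range(n):
        num = den = 0.0
        for k in range(-w, w + 1):
            if 0 <= i + k < n:
                wt = w + 1 - abs(k)
                num += wt * values[i + k]
                den += wt
        out.append(num / den)
    return out

def feature_points(points, scales=(1, 2, 3)):
    """Indices that stay strict local maxima of smoothed curvature at
    every scale -- no magic threshold on the curvature values."""
    angles = turning_angles(points)
    survivors = None
    for w in scales:
        s = smooth(angles, w)
        maxima = {i for i in range(1, len(s) - 1)
                  if s[i] > s[i - 1] and s[i] > s[i + 1]}
        survivors = maxima if survivors is None else survivors & maxima
    return sorted(survivors)
```

On an L-shaped stroke this keeps exactly the corner, since digitization noise rarely survives as a maximum across all scales.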

33 citations


Proceedings ArticleDOI
20 Aug 2006
TL;DR: The proposed approach combines a number of norms to evaluate the distance of the current sign to the sign models stored in a database (a dictionary), which leads to a highly selective criterion.

Abstract: In this paper, an approach for deaf-people interfacing using computer vision is presented. The recognition of alphabetic static signs of the Spanish Sign Language is addressed. The proposed approach combines a number of norms to evaluate the distance of the current sign to the sign models stored in a database (a dictionary). This solution leads to a highly selective criterion. The method is simple enough to provide real-time recognition, and works suitably for most letters.
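A minimal sketch of such a multi-norm criterion (our own toy version, not the paper's implementation): each norm votes for its nearest dictionary entry, and the sign is labelled only when all norms agree, which makes the classifier deliberately selective.

```python
def norms(a, b):
    """L1, L2 and L-infinity distances between two feature vectors."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return sum(diffs), sum(d * d for d in diffs) ** 0.5, max(diffs)

def classify(sign, dictionary):
    """Label the sign only if every norm votes for the same dictionary
    entry; otherwise reject the sign as ambiguous."""
    votes = []
    for i in range(3):
        best = min(dictionary,
                   key=lambda lbl: norms(sign, dictionary[lbl])[i])
        votes.append(best)
    return votes[0] if len(set(votes)) == 1 else None
```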

33 citations


Journal ArticleDOI
TL;DR: The major principles presented are the mathematical description methods for rose features such as the shape, size and color of the flower, petal and leaf, and the object-oriented pattern recognition (OOPR) approach, which mathematically addresses how to use all the different rose features comprehensively and rationally in the recognition scheme.

30 citations


Proceedings ArticleDOI
30 Jul 2006
TL;DR: LADDER is the first language to describe how sketched diagrams in a domain are drawn, displayed, and edited, and includes the concept of "abstract shapes", analogous to abstract classes in an object oriented language.
Abstract: We have created LADDER, the first language to describe how sketched diagrams in a domain are drawn, displayed, and edited. The difficulty in creating such a language is choosing a set of predefined entities that is broad enough to support a wide range of domains, while remaining narrow enough to be comprehensible. The language consists of predefined shapes, constraints, editing behaviors, and display methods, as well as a syntax for specifying a domain description sketch grammar and extending the language, ensuring that shapes and shape groups from many domains can be described. The language allows shapes to be built hierarchically (e.g., an arrow is built out of three lines), and includes the concept of "abstract shapes", analogous to abstract classes in an object-oriented language. Shape groups describe how multiple domain shapes interact and can provide the sketch recognition system with information to be used in top-down recognition. Shape groups can also be used to describe "chain-reaction" editing commands that affect multiple shapes at once. To test that recognition is feasible using this language, we have built a simple domain-independent sketch recognition system that parses the domain descriptions and generates the code necessary to recognize the shapes.
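The hierarchical-description idea can be illustrated with a toy encoding (hypothetical Python data structures, not LADDER's actual syntax): an arrow is three lines constrained so that both head strokes start at the tip of the shaft.

```python
import math
from dataclasses import dataclass

@dataclass
class Line:
    x1: float
    y1: float
    x2: float
    y2: float

def coincident(a, b, tol=1e-6):
    """Constraint: the end of line `a` touches the start of line `b`."""
    return math.dist((a.x2, a.y2), (b.x1, b.y1)) <= tol

def is_arrow(shaft, head1, head2):
    """An 'arrow' is three lines whose two head strokes start at the
    tip of the shaft -- a toy hierarchical shape description."""
    return coincident(shaft, head1) and coincident(shaft, head2)
```

A real description language also needs orientation and length constraints, editing behaviors, and display methods; the point here is only that compound shapes reduce to primitives plus constraints.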

Proceedings ArticleDOI
03 Sep 2006
TL;DR: This work draws on constellation models first proposed in the computer vision literature to develop probabilistic models for object sketches, based on multiple example drawings, which are applied to estimate the most-likely labels for a new sketch.
Abstract: Sketch-based modeling shares many of the difficulties of the branch of computer vision that deals with single image interpretation. Most obviously, they must both identify the parts observed in a given 2D drawing or image. We draw on constellation models first proposed in the computer vision literature to develop probabilistic models for object sketches, based on multiple example drawings. These models are then applied to estimate the most likely labels for a new sketch. A multi-pass branch-and-bound algorithm allows well-formed sketches to be quickly labelled, while still supporting the recognition of more ambiguous sketches. Results are presented for five classes of objects.

Proceedings ArticleDOI
24 Apr 2006
TL;DR: AraPen, a trainable system developed to recognize Arabic online handwriting based on mathematical matching techniques, shows a high recognition rate for non-cursive character recognition and is promising for cursive recognition.
Abstract: In this paper we present AraPen, a trainable system we developed to recognize Arabic online handwriting. The system is based on mathematical matching techniques, and our testing results show a high recognition rate for non-cursive character recognition and are promising for cursive recognition. The low memory and CPU requirements enable the system to run interactively on low-end devices.

Proceedings ArticleDOI
29 Jan 2006
TL;DR: This work presents a technique to debug over- and under-constrained shapes using a novel form of active learning that generates its own suspected near-miss examples and implemented a graphical debugging tool for use by sketch interface developers.
Abstract: Sketch interfaces provide more natural interaction than the traditional mouse and palette tool, but can be time-consuming to build if they have to be built anew for each new domain. A shape description language, such as the LADDER language we created, can significantly reduce the time necessary to create a sketch interface by enabling automatic generation of the interface from a domain description. However, structural shape descriptions, whether written by users or created automatically by the computer, are frequently over- or under-constrained. We present a technique to debug over- and under-constrained shapes using a novel form of active learning that generates its own suspected near-miss examples. Using this technique we implemented a graphical debugging tool for use by sketch interface developers.

Proceedings ArticleDOI
04 Sep 2006
TL;DR: A parsing strategy for the recognition of hand-drawn diagrams that can be used in interactive sketch interfaces based on grammar formalism, namely sketch grammars (SkGs), for describing both the symbols' shape and the syntax of diagrammatic notations, and from which recognizers are automatically generated.
Abstract: The existing sketch recognizers perform only a limited drawing recognition since they process simple sketches, or rely on drawing style assumptions that reduce the recognition complexity, and in most cases they require a substantial amount of training data. In this paper we present a parsing strategy for the recognition of hand-drawn diagrams that can be used in interactive sketch interfaces. The approach is based on a grammar formalism, namely Sketch Grammars (SkGs), for describing both the symbols’ shape and the syntax of diagrammatic notations, and from which recognizers are automatically generated. The recognition system was evaluated in the domain of UML use case diagrams and the results highlight the recognition accuracy improvements produced by the use of context in the disambiguation process.

Proceedings ArticleDOI
17 Jun 2006
TL;DR: A number of algorithms are used to evaluate face recognition performance when various poses are used for training, and it is found that the 3/4 view is the best, justified by the discrimination power of different regions on the face, computed from both the appearance and the geometry of these regions.
Abstract: Researchers in psychology have well studied the impact of the pose of a face as perceived by humans, and concluded that the so-called 3/4 view, halfway between the front view and the profile view, is the easiest for face recognition by humans. For face recognition by machines, while much work has been done to create recognition algorithms that are robust to pose variation, little has been done in finding the most representative pose for recognition. In this paper, we use a number of algorithms to evaluate face recognition performance when various poses are used for training. The result, similar to findings in psychology that the 3/4 view is the best, is also justified by the discrimination power of different regions on the face, computed from both the appearance and the geometry of these regions. We believe our study is both scientifically interesting and practically beneficial for many applications.

Book ChapterDOI
01 Jan 2006
TL;DR: This paper introduces a constraint satisfaction framework suitable for the task of finding correspondences in computer vision, based on a robust modification of the graph-theoretic notion known as a digraph kernel.
Abstract: In this paper we introduce a constraint satisfaction framework suitable for the task of finding correspondences in computer vision. This task lies at the heart of many problems such as stereovision, 3D model reconstruction, image stitching, camera autocalibration, recognition, image retrieval, and a host of others. If the problem domain is general enough, the correspondence problem can seldom employ any well-structured prior knowledge. This leads to tasks that have to find maximum-cardinality solutions satisfying some weak optimality condition and a set of constraints. To avoid artifacts, robustness is required to cope with decisions under occlusion, uncertainty or insufficiency of data, and local violations of the prior model. The proposed framework is based on a robust modification of the graph-theoretic notion known as a digraph kernel.

Proceedings ArticleDOI
01 Oct 2006
TL;DR: A multimodal interface for sketch interpretation that relies on a multi-agent architecture and the design of the interpretation engine and the different agents are based on a user-centered approach where efficiency measure is defined as user satisfaction.
Abstract: We present a multimodal interface for sketch interpretation that relies on a multi-agent architecture. The design of the interpretation engine and the different agents is based on a user-centered approach in which the efficiency measure is defined as user satisfaction. So far, several graphical agents have been implemented for recognizing basic graphical objects (e.g., lines, circles) as well as more complex ones (e.g., hatches, stairs, captions) in architectural design. In addition, vocal agents have been developed for recognizing spoken annotations (e.g., dimensions) and interface commands. Realistic evaluations with professional users have demonstrated the potential interest of the proposed system.

Proceedings ArticleDOI
30 Jul 2006
TL;DR: A novel form of dynamically constructed Bayes net, developed for multi-domain sketch recognition, integrating the influence of stroke data and domain-specific context in recognition, enabling the recognition engine to handle noisy input.
Abstract: This paper presents a novel form of dynamically constructed Bayes net, developed for multi-domain sketch recognition. Our sketch recognition engine integrates shape information and domain knowledge to improve recognition accuracy across a variety of domains using an extendible, hierarchical approach. Our Bayes net framework integrates the influence of stroke data and domain-specific context in recognition, enabling our recognition engine to handle noisy input. We illustrate this behavior with qualitative and quantitative results in two domains: hand-drawn family trees and circuits.
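At its core, combining stroke data with domain context is an application of Bayes' rule. A single-node caricature (our own illustration, not the paper's dynamically constructed net) fuses a stroke-level likelihood with a context prior:

```python
def posterior(likelihood, context_prior):
    """Fuse stroke-level likelihoods with a domain-context prior:
    P(shape | stroke, context) is proportional to
    P(stroke | shape) * P(shape | context)."""
    unnorm = {s: likelihood[s] * context_prior[s] for s in likelihood}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}
```

With noisy strokes the likelihoods alone may be uninformative (e.g., a short stroke is equally likely a wire or a resistor lead); the context prior is what breaks the tie.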

Proceedings ArticleDOI
01 Jul 2006
TL;DR: Recognition of printed digit characters was performed within the framework of cognitive pattern recognition, and the framework is supported by experimental results.
Abstract: Cognitive pattern recognition has two basic research problems: one is to understand the principles of human pattern recognition, and the other is to develop computer recognition systems with learning and adaptive abilities based on those principles. Some achievements of pattern recognition research in cognitive science are presented, and the framework of traditional machine pattern recognition is described. How to apply achievements of cognitive science to traditional machine pattern recognition, in combination with its particular characteristics, is discussed. Recognition of printed digit characters was performed within the framework of cognitive pattern recognition, and the framework is supported by experimental results.

Proceedings ArticleDOI
20 Aug 2006
TL;DR: The use of sketch increases the robustness of recognition under varying lighting conditions and with high-level semantic understanding of the face, the method overcomes the significant drop in accuracy under expression changes suffered by other edge-based methods.
Abstract: We present a novel face recognition method using a sketch automatically extracted by a multi-layer grammatical face model. First, the observed face is parsed into a 3-layer (face, parts and sketch) graph. In the sketch layer, the nodes not only capture the local features (strength, orientation and profile of the edge), but also remember the global information inherited from the upper layers (i.e. the facial part they belong to and the status of the part). Next, a sketch graph matching is performed between the parsed graph and a pre-built reference graph database, in which each individual has a parsed sketch graph. Similar to the other successful edge-based methods in the literature, the use of sketch increases the robustness of recognition under varying lighting conditions. Furthermore, with high-level semantic understanding of the face, we are able to perform an intelligent recognition process driven by the status of the face, i.e. changes in expressions and poses. As shown in the experiment, our method overcomes the significant drop in accuracy under expression changes suffered by other edge-based methods.

Dissertation
01 Jan 2006
TL;DR: A statistical framework based on Dynamic Bayesian Networks that can learn temporal models of object-level and stroke-level patterns for recognition of sketches is described, showing that in certain domains, stroke orderings used in the course of drawing individual objects contain temporal patterns that can aid recognition.
Abstract: Sketching is a natural mode of interaction used in a variety of settings. For example, people sketch during early design and brainstorming sessions to guide the thought process; when we communicate certain ideas, we use sketching as an additional modality to convey ideas that cannot be put into words. The emergence of hardware such as PDAs and Tablet PCs has made it possible to capture freehand sketches, enabling the routine use of sketching as an additional human-computer interaction modality. But despite the availability of pen-based information capture hardware, relatively little effort has been put into developing software capable of understanding and reasoning about sketches. To date, most approaches to sketch recognition have treated sketches as images (i.e., static finished products) and have applied vision algorithms for recognition. However, unlike images, sketches are produced incrementally and interactively, one stroke at a time, and their processing should take advantage of this. This thesis explores ways of doing sketch recognition by extracting as much information as possible from temporal patterns that appear during sketching. We present a sketch recognition framework based on hierarchical statistical models of temporal patterns. We show that in certain domains, stroke orderings used in the course of drawing individual objects contain temporal patterns that can aid recognition. We build on this work to show how sketch recognition systems can use knowledge of both common stroke orderings and common object orderings. We describe a statistical framework based on Dynamic Bayesian Networks that can learn temporal models of object-level and stroke-level patterns for recognition. Our framework supports multi-object strokes, multi-stroke objects, and allows interspersed drawing of objects, relaxing the assumption that objects are drawn one at a time. Our system also supports real-valued feature representations using a numerically stable recognition algorithm.
We present recognition results for hand-drawn electronic circuit diagrams. The results show that modeling temporal patterns at multiple scales provides a significant increase in correct recognition rates, with no added computational penalties. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)

Journal Article
TL;DR: Wang et al. present an approach for adaptive online multi-stroke sketch recognition based on a Hidden Markov Model (HMM), which views the drawn sketch as the result of a stochastic process that is governed by a hidden model and is identified according to its probability of generating the output.
Abstract: This paper presents a novel approach for adaptive online multi-stroke sketch recognition based on a Hidden Markov Model (HMM). The method views the drawn sketch as the result of a stochastic process that is governed by a hidden stochastic model and identified according to its probability of generating the output. To capture a user's drawing habits, a composite feature combining both geometric and dynamic characteristics of sketching is defined for sketch representation. To implement the stochastic process of online multi-stroke sketch recognition, multi-stroke sketching is modeled as an HMM chain while the strokes are mapped to different HMM states. To meet the requirements of adaptive online sketch recognition, a method for determining a variable number of HMM states is also proposed. The experiments demonstrate both the effectiveness and the efficiency of the proposed method.
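The scoring step behind such HMM-based recognition is the forward algorithm, which computes the probability of an observation sequence by summing over all hidden state (here, stroke) paths. A minimal, generic version (not the paper's composite-feature model; the dictionary-based parameterization is our own):

```python
def forward(obs, states, start, trans, emit):
    """HMM forward algorithm: probability of an observation sequence,
    summing over all hidden state paths.  `start[s]` is the initial
    probability of state s, `trans[r][s]` the r->s transition
    probability, `emit[s][o]` the probability of observing o in s."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
                 for s in states}
    return sum(alpha.values())
```

Recognition then amounts to evaluating each candidate sketch model on the incoming stroke sequence and picking the model with the highest probability.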

Proceedings ArticleDOI
30 Jul 2006
TL;DR: In this paper, a generic sketch recognition system is proposed to enable more natural interaction with design tools in various domains, such as mechanical engineering, military planning, and logic design.
Abstract: We are interested in enabling a generic sketch recognition system that would allow more natural interaction with design tools in various domains, such as mechanical engineering, military planning, logic design, etc. We would like to teach the system the symbols for a particular domain by simply drawing an example of each one -- as easy as it is to teach a person. Studies in cognitive science suggest that, when shown a symbol, people attend preferentially to certain geometric features. Relying on such biases, we built a system capable of learning descriptions of hand-drawn symbols from a single example. The generalization power is derived from a qualitative vocabulary reflecting human perceptual categories and a focus on perceptually relevant global properties of the symbol. Our user study shows that the system agrees with the subjects' majority classification about as often as any individual subject did.

01 Jan 2006
TL;DR: This thesis presents a novel multiview correspondence algorithm which automatically establishes correspondences between unordered views of a free-form object with O(N) complexity, and a novel algorithm for 3D free-form object recognition and segmentation in complex scenes containing clutter and occlusions.
Abstract: The aim of visual recognition is to identify objects in a scene and estimate their pose. Object recognition from 2D images is sensitive to illumination, pose, clutter and occlusions. Object recognition from range data on the other hand does not suffer from these limitations. An important paradigm of recognition is model-based whereby 3D models of objects are constructed offline and saved in a database, using a suitable representation. During online recognition, a similar representation of a scene is matched with the database for recognizing objects present in the scene. A 3D model of a free-form object is constructed offline from its multiple range images (views) acquired from different viewpoints. These views are registered in a common coordinate basis by establishing correspondences between them followed by their integration into a seamless 3D model. Automatic correspondences between overlapping views is the major problem in 3D modeling. This problem becomes more challenging when the views are unordered and hence there is no a priori knowledge about which view pairs overlap. The main challenges in the online recognition phase are the presence of clutter due to unwanted objects and noise, and the presence of occluding objects. This thesis addresses the above challenges and investigates novel representations and matching techniques for 3D free-form rigid object and non-rigid face recognition. A robust representation based on third order tensors is presented. The tensor representation quantizes local surface patches of an object into three-dimensional grids. Each grid is defined in an object centered local coordinate basis which makes the tensors invariant to rigid transformations. This thesis presents a novel multiview correspondence algorithm which automatically establishes correspondences between unordered views of a free-form object with O(N) complexity. 
It also presents a novel algorithm for 3D free-form object recognition and segmentation in complex scenes containing clutter and occlusions. The combination of the strengths of the tensor representation and the customized use of a 4D hash table for matching constitute the basic ingredients of these algorithms. This thesis demonstrates the superiority of the tensor representation in terms of descriptiveness compared to an existing competitor, i.e. the spin images. It also demonstrates that the proposed correspondence and recognition algorithms outperform the spin image recognition in terms of accuracy and efficiency. The tensor representation is extended to automatic and pose invariant 3D face recognition. As the face is a non-rigid object, expressions can significantly change its 3D shape. Therefore, the last part of this thesis investigates representations and matching techniques for automatic 3D face recognition which are robust to facial expressions. A number of novelties are proposed in this area along with their extensive experimental validation using the largest available 3D face database. These novelties include a region-based matching algorithm for 3D face recognition, a 2D and 3D multimodal hybrid face recognition algorithm, fully automatic 3D nose ridge detection, fully automatic normalization of 3D and 2D faces, a low cost rejection classifier based on a novel Spherical Face Representation, and finally, automatic segmentation of the expression insensitive regions of a face.

Journal ArticleDOI
TL;DR: This research evaluates the use of the Back-propagation Neural Network (BPN) for on-line handwriting recognition, presents the feature extraction mechanism in full detail, and indicates that the application of multiple thresholds has a significant effect on the recognition mechanism.
Abstract: This study describes a simple approach to online handwriting recognition. Conventionally, the data obtained needs a lot of preprocessing, including filtering, smoothing, slant removal and size normalization, before the recognition process. Instead of such lengthy preprocessing, this study presents a simple approach to extracting the useful character information. The whole process requires no preprocessing or size normalization. This research evaluates the use of the Back-propagation Neural Network (BPN) and presents the feature extraction mechanism in full detail for on-line handwriting recognition. The obtained recognition rates were 51 to 83% using the BPN for different sets of character samples. This study also describes a performance study in which a recognition mechanism with multiple thresholds is evaluated for the back-propagation architecture. The results indicate that the application of multiple thresholds has a significant effect on the recognition mechanism. This is a writer-independent system and the method is applicable to off-line character recognition as well. The technique is tested on upper-case English alphabets in a number of different styles from different subjects.
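As a hedged illustration of the back-propagation machinery involved (a single sigmoid unit on a toy task, far smaller than the BPN in the study), gradient descent on squared error looks like:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, epochs=2000, lr=0.5, seed=0):
    """One sigmoid unit trained by gradient descent on squared error:
    the smallest possible back-propagation demo."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(len(samples[0][0]))]
    b = rng.uniform(-1, 1)
    for _ in range(epochs):
        for x, t in samples:
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = (y - t) * y * (1 - y)  # dE/dz for E = (y - t)^2 / 2
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

A full BPN stacks layers of such units and propagates the error gradient backwards through each layer; character recognition additionally needs a feature vector per character, as the study describes.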

Proceedings ArticleDOI
04 Sep 2006
TL;DR: This paper presents an agent-based framework for context-driven interpretation of symbols in diagrammatic sketches that heavily exploits contextual information for ambiguity resolution and management of low-level hand-drawn symbol recognizers.
Abstract: Parsing hand-drawn diagrams is a decidedly complex recognition problem. The input drawings are often intrinsically ambiguous and require context to be interpreted correctly. Many existing sketch recognition systems avoid this problem by recognizing single segments or simple geometric shapes in a stroke. However, for a recognition system to be effective and precise, context must be exploited, and both the simplifications applied to sketch features and the constraints under which recognition may take place must be reduced to a minimum. In this paper we present an agent-based framework for context-driven interpretation of symbols in diagrammatic sketches that heavily exploits contextual information for ambiguity resolution. Agents manage the activity of low-level hand-drawn symbol recognizers, which may be heterogeneous so as to better adapt to the characteristics of each symbol to be recognized, and coordinate with one another to exchange contextual information, leading to an efficient and precise interpretation of sketches.
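The core mechanism can be illustrated as re-ranking: low-level recognizers emit scored hypotheses for a stroke group, and contextual support from already-interpreted neighbours breaks the tie. The symbol names, compatibility scores, and weighting below are hypothetical, standing in for whatever coordination protocol the paper's agents actually use.

```python
# Hedged sketch of context-driven disambiguation: combine a recognizer's
# score with contextual compatibility to neighbouring interpreted symbols.
def rerank(hypotheses, neighbours, compatibility, context_weight=0.5):
    """Pick the hypothesis with the best combined shape + context score."""
    def contextual(symbol):
        if not neighbours:
            return 0.0
        return sum(compatibility.get((symbol, n), 0.0) for n in neighbours) / len(neighbours)

    scored = [(score + context_weight * contextual(sym), sym)
              for score, sym in hypotheses]
    return max(scored)[1]

# An ambiguous blob scores almost equally as 'rectangle' or 'resistor'...
hypotheses = [(0.51, "rectangle"), (0.49, "resistor")]
# ...but it sits between two recognized wires in a circuit sketch.
neighbours = ["wire", "wire"]
compatibility = {("resistor", "wire"): 0.9, ("rectangle", "wire"): 0.1}
print(rerank(hypotheses, neighbours, compatibility))  # prints 'resistor'
```

Without neighbours the raw recognizer score wins; with them, context overrides a marginal shape-only decision, which is exactly the ambiguity-resolution behaviour the abstract argues for.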

Proceedings ArticleDOI
01 Nov 2006
TL;DR: A model of an e-learning system based on affective computing is constructed using facial expression recognition, speech emotion recognition, and motion recognition techniques, and the key techniques for realizing the system are introduced.
Abstract: Addressing the emotional deficiency of present e-learning systems, we analyze its negative effects and propose corresponding countermeasures. Building on this analysis, we combine affective computing with the traditional e-learning system. A model of an e-learning system based on affective computing is constructed using facial expression recognition, speech emotion recognition, and motion recognition techniques. The key techniques for realizing the system are also introduced.

Proceedings ArticleDOI
01 Jan 2006
TL;DR: A 'personalized' system for facial expression recognition from facial features that uses case-based reasoning for personalized output; the degree of customization is a function of time, as the system adapts itself to the user.
Abstract: Facial expression recognition is of increasing importance in human-computer interaction, allowing a system to be customized to user needs and requirements. In this paper we present a 'personalized' system for facial expression recognition from facial features. Personalization refers to custom-tailoring the system for a specific user. The facial expression recognition module uses a case-based reasoning system for personalized output. Personalization helps customize the output of the system for a particular user. The degree of customization of our system is a function of time, as the system adapts itself to the user. This paper presents further enhancements to our previous work on embedding case-based reasoning in facial expression recognition.
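The case-based reasoning loop that makes the system "personalized over time" can be sketched as retrieve/reuse/retain: classify by the nearest stored case, then store user-corrected samples so later queries match the current user. The feature vectors and expression labels are hypothetical placeholders, not the paper's actual facial features.

```python
# Minimal case-based reasoning sketch: nearest-case classification plus
# retention of corrected cases, so accuracy for one user improves with time.
def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class CaseBase:
    def __init__(self, seed_cases):
        self.cases = list(seed_cases)  # (feature_vector, expression_label)

    def classify(self, features):
        """Retrieve + reuse: label of the most similar stored case."""
        return min(self.cases, key=lambda c: distance(c[0], features))[1]

    def retain(self, features, true_label):
        """Retain: store user feedback as a new, user-specific case."""
        self.cases.append((features, true_label))

cb = CaseBase([((0.9, 0.1), "smile"), ((0.1, 0.8), "frown")])
sample = (0.6, 0.5)
print(cb.classify(sample))         # generic answer before personalization
cb.retain(sample, "frown")         # the user corrects the system
print(cb.classify((0.62, 0.48)))   # personalized answer afterwards
```

Because retained cases accumulate, the decision boundary drifts toward the individual user, which is why the abstract describes customization as a function of time.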

Book ChapterDOI
29 Oct 2006
TL;DR: This paper presents a classification of meaningful ambiguities in sketch-based interaction and discusses methods to solve them, taking into account the spatial and temporal information that characterises the drawing process.
Abstract: The diffusion of mobile devices and the development of their services and applications are connected with the possibility to communicate anytime and anywhere in a natural way, combining different modalities (speech, sketch, etc.). A natural communication approach, such as sketch-based interaction, frequently produces ambiguities. Ambiguities can arise in the sketch recognition process from the gap between the user's intention and the system's interpretation. This paper presents a classification of meaningful ambiguities in sketch-based interaction and discusses methods to solve them, taking into account the spatial and temporal information that characterise the drawing process. The proposed solution methods use sketch-based approaches and/or approaches integrated with other modalities. They are classified into prevention, a-posteriori, and approximation methods.
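The paper's three families of solution methods can be contrasted as three strategies applied at different points in the recognition pipeline. The concrete heuristics and thresholds below are assumptions chosen to make the distinction concrete, not the authors' algorithms.

```python
# Hedged sketch of the three ambiguity-resolution families; heuristics are
# illustrative stand-ins for the methods classified in the paper.
def prevention(is_legal, stroke):
    """Prevention: constrain input up front so the ambiguity never arises."""
    return stroke if is_legal(stroke) else None

def a_posteriori(candidates):
    """A-posteriori: resolve after recognition, e.g. by letting the user
    choose from a ranked n-best list of interpretations."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)

def approximation(candidates, threshold=0.7):
    """Approximation: silently commit to the top interpretation only when
    its confidence clears a threshold; otherwise defer."""
    symbol, score = max(candidates, key=lambda c: c[1])
    return symbol if score >= threshold else None

candidates = [("line", 0.6), ("arrow", 0.8)]
print(a_posteriori(candidates))   # ranked choices shown to the user
print(approximation(candidates))  # confident enough: commit to 'arrow'
```

Prevention trades drawing freedom for certainty, a-posteriori trades user effort for accuracy, and approximation trades occasional errors for fluid interaction, which is the design space the classification maps out.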