
Showing papers on "Feature (computer vision)" published in 2006


Book ChapterDOI
07 May 2006
TL;DR: It is shown that machine learning can be used to derive a feature detector which can fully process live PAL video using less than 7% of the available processing time.
Abstract: Where feature points are used in real-time frame-rate applications, a high-speed feature detector is necessary. Feature detectors such as SIFT (DoG), Harris and SUSAN are good methods which yield high quality features; however, they are too computationally intensive for use in real-time applications of any complexity. Here we show that machine learning can be used to derive a feature detector which can fully process live PAL video using less than 7% of the available processing time. By comparison, neither the Harris detector (120%) nor the detection stage of SIFT (300%) can operate at full frame rate. Clearly a high-speed detector is of limited use if the features produced are unsuitable for downstream processing. In particular, the same scene viewed from two different positions should yield features which correspond to the same real-world 3D locations [1]. Hence the second contribution of this paper is a comparison of corner detectors based on this criterion applied to 3D scenes. This comparison supports a number of claims made elsewhere concerning existing corner detectors. Further, contrary to our initial expectations, we show that despite being principally constructed for speed, our detector significantly outperforms existing feature detectors according to this criterion.
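
The evaluation criterion in this abstract (features detected in one view should reappear at the same real-world location in another view) can be sketched as a repeatability score. This is a minimal illustration, not the paper's evaluation code: the `warp` function standing in for the known scene geometry, the tolerance `eps`, and the toy data are all assumptions.

```python
import math

def repeatability(features_a, features_b, warp, eps=2.0):
    """Fraction of features from view A that reappear in view B.

    features_a, features_b: lists of (x, y) detections in each view.
    warp: hypothetical function mapping a point in view A to view B.
    eps: assumed pixel tolerance for counting a feature as repeated.
    """
    if not features_a:
        return 0.0
    repeated = 0
    for point in features_a:
        wx, wy = warp(point)
        # A feature is repeated if some detection in B lies within eps.
        if any(math.hypot(wx - bx, wy - by) <= eps for bx, by in features_b):
            repeated += 1
    return repeated / len(features_a)

# Toy example: view B is view A shifted 10 pixels to the right.
shift = lambda p: (p[0] + 10.0, p[1])
view_a = [(5.0, 5.0), (20.0, 8.0), (30.0, 30.0)]
view_b = [(15.0, 5.5), (30.5, 8.0), (90.0, 90.0)]
score = repeatability(view_a, view_b, shift)
```

In the toy data two of the three features from view A have a counterpart in view B, so a detector that produced these points would score two thirds under this measure.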

3,828 citations


Journal Article
TL;DR: Argues that the same scene viewed from two different positions should yield features which correspond to the same real-world 3D locations, and compares corner detectors on 3D scenes against this criterion.
Abstract: Where feature points are used in real-time frame-rate applications, a high-speed feature detector is necessary. Feature detectors such as SIFT (DoG), Harris and SUSAN are good methods which yield high quality features; however, they are too computationally intensive for use in real-time applications of any complexity. Here we show that machine learning can be used to derive a feature detector which can fully process live PAL video using less than 7% of the available processing time. By comparison, neither the Harris detector (120%) nor the detection stage of SIFT (300%) can operate at full frame rate. Clearly a high-speed detector is of limited use if the features produced are unsuitable for downstream processing. In particular, the same scene viewed from two different positions should yield features which correspond to the same real-world 3D locations [1]. Hence the second contribution of this paper is a comparison of corner detectors based on this criterion applied to 3D scenes. This comparison supports a number of claims made elsewhere concerning existing corner detectors. Further, contrary to our initial expectations, we show that despite being principally constructed for speed, our detector significantly outperforms existing feature detectors according to this criterion. © Springer-Verlag Berlin Heidelberg 2006.

3,413 citations


Journal ArticleDOI
TL;DR: The location-independent property of feature-based attention makes it particularly well suited to modify selectively the neural representations of stimuli or parts within complex visual scenes that match the currently attended feature.

849 citations


Journal ArticleDOI
TL;DR: This paper proposes some new feature extractors based on maximum margin criterion (MMC) and establishes a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA.
Abstract: In pattern recognition, feature extraction techniques are widely employed to reduce the dimensionality of data and to enhance the discriminatory information. Principal component analysis (PCA) and linear discriminant analysis (LDA) are the two most popular linear dimensionality reduction methods. However, PCA is not very effective for the extraction of the most discriminant features, and LDA is not stable due to the small sample size problem. In this paper, we propose some new (linear and nonlinear) feature extractors based on maximum margin criterion (MMC). Geometrically, feature extractors based on MMC maximize the (average) margin between classes after dimensionality reduction. It is shown that MMC can represent class separability better than PCA. As a connection to LDA, we may also derive LDA from MMC by incorporating some constraints. By using some other constraints, we establish a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA. The kernelized (nonlinear) counterpart of this linear feature extractor is also established in the paper. Our extensive experiments demonstrate that the new feature extractors are effective, stable, and efficient.
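
A small sketch may help make the criterion concrete. It assumes the commonly cited formulation in which a linear MMC extractor picks the unit vector w that maximizes w^T (S_b - S_w) w, with S_b and S_w the prior-weighted between-class and within-class scatter matrices; the abstract itself states only that the average margin between classes is maximized. The example is restricted to two-dimensional data so that the top eigenvector of the symmetric 2x2 matrix has a closed form, and all function names are illustrative.

```python
import math

def scatter_difference(classes):
    """Return S_b - S_w as (a, b, d) for the symmetric matrix [[a, b], [b, d]].

    classes: list of classes, each a list of (x, y) samples.
    Scatter matrices are weighted by class prior (assumed formulation).
    """
    n_total = sum(len(c) for c in classes)
    mean_all = [sum(p[i] for c in classes for p in c) / n_total for i in (0, 1)]
    a = b = d = 0.0
    for samples in classes:
        prior = len(samples) / n_total
        mean_c = [sum(p[i] for p in samples) / len(samples) for i in (0, 1)]
        # Between-class scatter: spread of the class mean around the global mean.
        dx, dy = mean_c[0] - mean_all[0], mean_c[1] - mean_all[1]
        a += prior * dx * dx
        b += prior * dx * dy
        d += prior * dy * dy
        # Within-class scatter: subtract the class covariance.
        for x, y in samples:
            ex, ey = x - mean_c[0], y - mean_c[1]
            a -= prior * ex * ex / len(samples)
            b -= prior * ex * ey / len(samples)
            d -= prior * ey * ey / len(samples)
    return a, b, d

def mmc_direction(classes):
    """Unit vector w maximizing w^T (S_b - S_w) w for 2-D data."""
    a, b, d = scatter_difference(classes)
    # Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    return math.cos(theta), math.sin(theta)

class_0 = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
class_1 = [(4.0, 0.0), (4.0, 1.0), (4.0, 2.0)]
w = mmc_direction([class_0, class_1])
```

Because the criterion subtracts S_w rather than inverting it, nothing breaks when S_w is singular, which is the situation the small sample size problem refers to. In the toy data the classes differ only along x, so the selected direction is the x axis.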

838 citations


Proceedings ArticleDOI
01 Jan 2006
TL;DR: This work shows that when applied to human faces, the constrained local model (CLM) algorithm is more robust and more accurate than the original AAM search method, which relies on the image reconstruction error to update the model parameters.
Abstract: We present an efficient and robust model matching method which uses a joint shape and texture appearance model to generate a set of region template detectors. The model is fitted to an unseen image in an iterative manner by generating templates using the joint model and the current parameter estimates, correlating the templates with the target image to generate response images and optimising the shape parameters so as to maximise the sum of responses. The appearance model is similar to that used in the Active Appearance Model due to Cootes et al. However in our approach the appearance model is used to generate likely feature templates, instead of trying to approximate the image pixels directly. We show that when applied to human faces, our constrained local model (CLM) algorithm is more robust and more accurate than the original AAM search method, which relies on the image reconstruction error to update the model parameters. We demonstrate improved localisation accuracy on two publicly available face data sets and improved tracking on a challenging set of in-car face sequences.

802 citations


Journal ArticleDOI
TL;DR: A system that estimates the motion of a stereo head, or a single moving camera, from video input in real time and with low delay; the motion estimates are used for navigational purposes.
Abstract: We present a system that estimates the motion of a stereo head, or a single moving camera, based on video input. The system operates in real time with low delay, and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image trajectories at video rate. Robust estimates of the camera motion are then produced from the feature tracks using a geometric hypothesize-and-test architecture. This generates motion estimates from visual input alone. No prior knowledge of the scene or the motion is necessary. The visual estimates can also be used in conjunction with information from other sources, such as a global positioning system, inertia sensors, wheel encoders, etc. The pose estimation method has been applied successfully to video from aerial, automotive, and handheld platforms. We focus on results obtained with a stereo head mounted on an autonomous ground vehicle. We give examples of camera trajectories estimated in real time purely from images over previously unseen distances (600 m) and periods of time. © 2006 Wiley Periodicals, Inc.
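
The hypothesize-and-test idea can be illustrated on a far simpler model than camera pose. The sketch below estimates a 2-D image translation from feature matches that include outliers; the translation model, the inlier threshold, the iteration count, and the function name are all assumptions made for illustration and are not the paper's pose solver.

```python
import random

def ransac_translation(matches, threshold=1.0, iterations=50, seed=0):
    """Estimate a 2-D translation from matches [((x, y), (x2, y2)), ...]."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Hypothesize: one match is a minimal sample for a translation.
        (x, y), (x2, y2) = rng.choice(matches)
        shift = (x2 - x, y2 - y)
        # Test: count matches consistent with the hypothesized shift.
        inliers = [
            m for m in matches
            if abs(m[1][0] - m[0][0] - shift[0]) <= threshold
            and abs(m[1][1] - m[0][1] - shift[1]) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine: average the shift over the winning inlier set.
    n = len(best_inliers)
    dx = sum(m[1][0] - m[0][0] for m in best_inliers) / n
    dy = sum(m[1][1] - m[0][1] for m in best_inliers) / n
    return (dx, dy), n

# Eight matches agree on a shift of (3, -1); two are gross outliers.
good = [((float(i), float(2 * i)), (i + 3.0, 2 * i - 1.0)) for i in range(8)]
bad = [((0.0, 0.0), (40.0, 7.0)), ((5.0, 5.0), (-20.0, 60.0))]
shift, inlier_count = ransac_translation(good + bad)
```

The outliers never gather more than one supporting match, so the consensus set consists of the eight consistent matches and the refined estimate is unaffected by the bad ones.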

704 citations


Journal ArticleDOI
TL;DR: A hierarchical framework is presented to automate the task of mood detection from acoustic music data, by following some music psychological theories in western cultures, and has the advantage of emphasizing the most suitable features in different detection tasks.
Abstract: Music mood describes the inherent emotional expression of a music clip. It is helpful in music understanding, music retrieval, and some other music-related applications. In this paper, a hierarchical framework is presented to automate the task of mood detection from acoustic music data, by following some music psychological theories in western cultures. The hierarchical framework has the advantage of emphasizing the most suitable features in different detection tasks. Three feature sets, including intensity, timbre, and rhythm are extracted to represent the characteristics of a music clip. The intensity feature set is represented by the energy in each subband, the timbre feature set is composed of the spectral shape features and spectral contrast features, and the rhythm feature set indicates three aspects that are closely related with an individual's mood response, including rhythm strength, rhythm regularity, and tempo. Furthermore, since mood is usually changeable in an entire piece of classical music, the approach to mood detection is extended to mood tracking for a music piece, by dividing the music into several independent segments, each of which contains a homogeneous emotional expression. Preliminary evaluations indicate that the proposed algorithms produce satisfactory results. On our testing database composed of 800 representative music clips, the average accuracy of mood detection achieves up to 86.3%. We can also on average recall 84.1% of the mood boundaries from nine testing music pieces.
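
As an illustration of the intensity feature set, the sketch below computes the spectral energy of one audio frame in a few frequency subbands. The equal-width band layout, the number of bands, and the naive DFT are assumptions chosen to keep the example dependency-free; the abstract does not specify the subband division.

```python
import cmath
import math

def subband_energies(frame, n_bands=4):
    """Spectral energy of one audio frame in equal-width frequency bands."""
    n = len(frame)
    half = n // 2  # keep non-negative frequencies below Nyquist
    power = []
    for k in range(half):
        # Naive DFT bin k; an FFT would be used for real frame sizes.
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power.append(abs(coeff) ** 2)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]

# A pure tone at bin 3 of a 32-sample frame falls in the first of four bands.
frame = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
energies = subband_energies(frame)
```

All of the tone's energy lands in the first band, which is the kind of per-band profile an intensity feature vector would record for each frame.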

544 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: Applies a biologically inspired model of visual object recognition, modified from that of Serre, Wolf, and Poggio, to the multiclass object categorization problem, and demonstrates the value of retaining some position and scale information above the intermediate feature level.
Abstract: We apply a biologically inspired model of visual object recognition to the multiclass object categorization problem. Our model modifies that of Serre, Wolf, and Poggio. As in that work, we first apply Gabor filters at all positions and scales; feature complexity and position/scale invariance are then built up by alternating template matching and max pooling operations. We refine the approach in several biologically plausible ways, using simple versions of sparsification and lateral inhibition. We demonstrate the value of retaining some position and scale information above the intermediate feature level. Using feature selection we arrive at a model that performs better with fewer features. Our final model is tested on the Caltech 101 object categories and the UIUC car localization task, in both cases achieving state-of-the-art performance. The results strengthen the case for using this class of model in computer vision.

539 citations


Journal ArticleDOI
TL;DR: A method for partial matching of surfaces represented by triangular meshes that matches surface regions which are approximately similar even though they are numerically and topologically dissimilar, and introduces novel local surface descriptors which efficiently represent the geometry of local regions of the surface.
Abstract: This article introduces a method for partial matching of surfaces represented by triangular meshes. Our method matches surface regions that are approximately similar even though they are numerically and topologically dissimilar. We introduce novel local surface descriptors which efficiently represent the geometry of local regions of the surface. The descriptors are defined independently of the underlying triangulation, and form a compatible representation that allows matching of surfaces with different triangulations. To cope with the combinatorial complexity of partial matching of large meshes, we introduce the abstraction of salient geometric features and present a method to construct them. A salient geometric feature is a compound high-level feature of nontrivial local shapes. We show that a relatively small number of such salient geometric features characterizes the surface well for various similarity applications. Matching salient geometric features is based on indexing rotation-invariant features and a voting scheme accelerated by geometric hashing. We demonstrate the effectiveness of our method with a number of applications, such as computing self-similarity, alignments, and subparts similarity.

534 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: An unsupervised data driven Bayesian clustering algorithm which has detection of individual entities as its primary goal and can be augmented with subject-specific filtering, but is shown to already be effective at detecting individual entities in crowds of people, insects, and animals.
Abstract: While crowds of various subjects may offer application-specific cues to detect individuals, we demonstrate that for the general case, motion itself contains more information than previously exploited. This paper describes an unsupervised data-driven Bayesian clustering algorithm which has detection of individual entities as its primary goal. We track simple image features and probabilistically group them into clusters representing independently moving entities. The numbers of clusters and the grouping of constituent features are determined without supervised learning or any subject-specific model. Instead, the new approach uses space-time proximity and trajectory coherence through image space as the only probabilistic criteria for clustering. An important contribution of this work is how these criteria are used to perform a one-shot data association without iterating through combinatorial hypotheses of cluster assignments. Our proposed general detection algorithm can be augmented with subject-specific filtering, but is shown to already be effective at detecting individual entities in crowds of people, insects, and animals. This paper and the associated video examine the implementation and experiments of our motion clustering framework.

472 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work bases its approach on a highly parallelized version of the KLT tracker in order to process the video into a set of feature trajectories and proposes a simple means of spatially and temporally conditioning the trajectories.
Abstract: In its full generality, motion analysis of crowded objects necessitates recognition and segmentation of each moving entity. The difficulty of these tasks increases considerably with occlusions and therefore with crowding. When the objects are constrained to be of the same kind, however, partitioning of densely crowded semi-rigid objects can be accomplished by means of clustering tracked feature points. We base our approach on a highly parallelized version of the KLT tracker in order to process the video into a set of feature trajectories. While such a set of trajectories provides a substrate for motion analysis, their unequal lengths and fragmented nature present difficulties for subsequent processing. To address this, we propose a simple means of spatially and temporally conditioning the trajectories. Given this representation, we integrate it with a learned object descriptor to achieve a segmentation of the constituent motions. We present experimental results for the problem of estimating the number of moving objects in a dense crowd as a function of time.

Journal ArticleDOI
TL;DR: RFE outperforms SVM-RFE and KWS on the task of finding small subsets of features with high discrimination levels on PTR-MS data sets, and it is shown how selection probabilities and features co-occurrence can be used to highlight the most relevant features for discrimination.

Patent
15 Dec 2006
TL;DR: In this paper, a New To Me (NTM) feature is provided for an interactive media guidance system implemented as a home network having multiple user equipment devices, which identifies programs or advertisements that have been previously viewed by an individual user or a user equipment device within the home network, or even by a household.
Abstract: A “New To Me” feature is provided for an interactive media guidance system implemented as a home network having multiple user equipment devices. Functionally speaking, the “New To Me” feature of the interactive media guidance system identifies programs or advertisements that have been previously viewed by an individual user or a user equipment device within the home network, or even by a household. The interactive media guidance system may use the information gathered regarding the programs and/or advertisements that have already been seen by a user, device or household to, for example, remove the programs or advertisements from future displays of recommendations, search results or listings of available programming.

Journal ArticleDOI
TL;DR: This paper addresses two specific issues related to the implementation of the FSV method, namely "how well does it produce results that agree with visual assessment?" and "what benefit can it provide in a practical validation environment?"
Abstract: The feature selective validation (FSV) method has been proposed as a technique to allow the objective, quantified, comparison of data for inter alia validation of computational electromagnetics. In the companion paper "Feature selective validation for validation of computational electromagnetics. Part I-The FSV method," the method was outlined in some detail. This paper addresses two specific issues related to the implementation of the FSV method, namely "how well does it produce results that agree with visual assessment?" and "what benefit can it provide in a practical validation environment?" The first of these questions is addressed by comparing the FSV output to the results of an extensive survey of EMC engineers from several countries. The second is approached via a case study analysis.

Journal ArticleDOI
TL;DR: By regionalizing the detection area, false positives are eliminated and the speed of detection is increased due to the reduction of the area examined.
Abstract: Viola and Jones [9] introduced a method to accurately and rapidly detect faces within an image. This technique can be adapted to accurately detect facial features. However, the area of the image being analyzed for a facial feature needs to be regionalized to the location with the highest probability of containing the feature. By regionalizing the detection area, false positives are eliminated and the speed of detection is increased due to the reduction of the area examined.

22 Nov 2006
TL;DR: The derivation and implementation of convolutional neural networks are discussed, followed by an extension which allows one to learn sparse combinations of feature maps, and small snippets of MATLAB code are given to accompany the equations.
Abstract: We discuss the derivation and implementation of convolutional neural networks, followed by an extension which allows one to learn sparse combinations of feature maps. The derivation we present is specific to two-dimensional data and convolutions, but can be extended without much additional effort to an arbitrary number of dimensions. Throughout the discussion, we emphasize efficiency of the implementation, and give small snippets of MATLAB code to accompany the equations.
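
The core operation these notes derive around is a two-dimensional convolution producing a feature map. Below is a minimal sketch of a "valid" 2-D convolution on nested lists, written in Python rather than the MATLAB the notes use; the function name and the toy inputs are illustrative only.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution of nested lists (the kernel is flipped)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Flipping the kernel distinguishes convolution
                    # from cross-correlation.
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
            row.append(acc)
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d_valid(image, kernel)
```

A 3x3 input and a 2x2 kernel give a 2x2 output, since a valid convolution only evaluates positions where the kernel fits entirely inside the image.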

Dissertation
17 Jul 2006
TL;DR: This thesis introduces grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images and proposes descriptors based on oriented histograms of differential optical flow to detect moving humans in videos.
Abstract: This thesis targets the detection of humans and other object classes in images and videos. Our focus is on developing robust feature extraction algorithms that encode image regions as high-dimensional feature vectors that support high accuracy object/non-object decisions. To test our feature sets we adopt a relatively simple learning framework that uses linear Support Vector Machines to classify each possible image region as an object or as a non-object. The approach is data-driven and purely bottom-up using low-level appearance and motion vectors to detect objects. As a test case we focus on person detection as people are one of the most challenging object classes with many applications, for example in film and video analysis, pedestrian detection for smart cars and video surveillance. Nevertheless we do not make any strong class specific assumptions and the resulting object detection framework also gives state-of-the-art performance for many other classes including cars, motorbikes, cows and sheep. This thesis makes four main contributions. Firstly, we introduce grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images. The HOG descriptors are computed over dense and overlapping grids of spatial blocks, with image gradient orientation features extracted at fixed resolution and gathered into a high-dimensional feature vector. They are designed to be robust to small changes in image contour locations and directions, and significant changes in image illumination and colour, while remaining highly discriminative for overall visual form. We show that unsmoothed gradients, fine orientation voting, moderately coarse spatial binning, strong normalisation and overlapping blocks are all needed for good performance. Secondly, to detect moving humans in videos, we propose descriptors based on oriented histograms of differential optical flow.
These are similar to static HOG descriptors, but instead of image gradients, they are based on local differentials of dense optical flow. They encode the noisy optical flow estimates into robust feature vectors in a manner that is robust to the overall camera motion. Several variants are proposed, some capturing motion boundaries while others encode the relative motions of adjacent image regions. Thirdly, we propose a general method based on kernel density estimation for fusing multiple overlapping detections, that takes into account the number of detections, their confidence scores and the scales of the detections. Lastly, we present work in progress on a parts based approach to person detection that first detects local body parts like heads, torso, and legs and then fuses them to create a global overall person detector.
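
The orientation-voting step of a HOG descriptor can be sketched for a single cell. This simplified version uses unsmoothed centred differences, unsigned orientations, hard assignment to nine bins, and L2 normalisation of the one cell; the bin count, the lack of vote interpolation, and normalising a single cell instead of overlapping blocks are simplifying assumptions, not the thesis's exact configuration.

```python
import math

def cell_histogram(cell, n_bins=9, eps=1e-6):
    """L2-normalised histogram of unsigned gradient orientations for one cell."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Unsmoothed centred differences [-1, 0, 1].
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = min(int(angle / (180.0 / n_bins)), n_bins - 1)
            hist[bin_index] += magnitude  # vote weighted by magnitude
    norm = math.sqrt(sum(v * v for v in hist) + eps ** 2)
    return [v / norm for v in hist]

# A vertical edge: intensity changes only along x, so gradients point along x.
cell = [[0, 0, 10, 10]] * 4
hist = cell_histogram(cell)
```

A full descriptor would concatenate such histograms over a dense grid and normalise them within overlapping blocks, which is where the robustness to illumination described in the abstract comes from.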

Journal ArticleDOI
TL;DR: A feature-level fusion approach for improving the efficiency of palmprint identification using multiple elliptical Gabor filters with different orientations to extract the phase information on a palmprint image, which is then merged according to a fusion rule to produce a single feature called the Fusion Code.

Journal ArticleDOI
01 Jan 2006
TL;DR: This paper introduces a new information gain and divergence-based feature selection method for statistical machine learning-based text categorization without relying on more complex dependence models.
Abstract: Most previous works of feature selection emphasized only the reduction of high dimensionality of the feature space. But in cases where many features are highly redundant with each other, we must utilize other means, for example, more complex dependence models such as Bayesian network classifiers. In this paper, we introduce a new information gain and divergence-based feature selection method for statistical machine learning-based text categorization without relying on more complex dependence models. Our feature selection method strives to reduce redundancy between features while maintaining information gain in selecting appropriate features for text categorization. Empirical results are given on a number of datasets, showing that our feature selection method is more effective than Koller and Sahami's method [Koller, D., & Sahami, M. (1996). Toward optimal feature selection. In Proceedings of ICML-96, 13th international conference on machine learning], which is one of the greedy feature selection methods, and conventional information gain, which is commonly used in feature selection for text categorization. Moreover, our feature selection method sometimes produces more improvements of conventional machine learning algorithms over support vector machines which are known to give the best classification accuracy.
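
For reference, the conventional information gain that the abstract uses as a baseline can be sketched as follows. This shows only the baseline score for a binary term-presence feature, not the proposed divergence-based method; the toy documents, labels, and function names are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )

def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (with_term, without) if part
    )
    return entropy(labels) - remainder

docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "team"}]
labels = ["sport", "sport", "politics", "politics"]
gains = {t: information_gain(docs, labels, t) for t in ("goal", "team")}
```

Here "goal" separates the two classes perfectly and earns the full one bit, while "team" occurs equally in both classes and earns nothing. Note that "vote" would also score one bit, which is the kind of redundancy between high-scoring features that the abstract sets out to reduce.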

Journal ArticleDOI
TL;DR: The aims of this paper are to define Gaussian mixture models (GMMs) of colored texture on several feature spaces and to compare the performance of these models in various classification tasks, both with each other and with other models popular in the literature.

Journal ArticleDOI
TL;DR: The novel insight that the simultaneous localization and mapping (SLAM) information matrix is exactly sparse in a delayed-state framework is reported, which means it can produce equivalent results to the full-covariance solution.
Abstract: This paper reports the novel insight that the simultaneous localization and mapping (SLAM) information matrix is exactly sparse in a delayed-state framework. Such a framework is used in view-based representations of the environment that rely upon scan-matching raw sensor data to obtain virtual observations of robot motion with respect to a place it has previously been. The exact sparseness of the delayed-state information matrix is in contrast to other recent feature-based SLAM information algorithms, such as sparse extended information filter or thin junction-tree filter, since these methods have to make approximations in order to force the feature-based SLAM information matrix to be sparse. The benefit of the exact sparsity of the delayed-state framework is that it allows one to take advantage of the information space parameterization without incurring any sparse approximation error. Therefore, it can produce equivalent results to the full-covariance solution. The approach is validated experimentally using monocular imagery for two datasets: a test-tank experiment with ground truth, and a remotely operated vehicle survey of the RMS Titanic.

Journal ArticleDOI
01 Feb 2006
TL;DR: This paper presents an online feature selection algorithm using genetic programming (GP) that simultaneously selects a good subset of features and constructs a classifier using the selected features and produces a feature ranking scheme.
Abstract: This paper presents an online feature selection algorithm using genetic programming (GP). The proposed GP methodology simultaneously selects a good subset of features and constructs a classifier using the selected features. For a c-class problem, it provides a classifier having c trees. In this context, we introduce two new crossover operations to suit the feature selection process. As a byproduct, our algorithm produces a feature ranking scheme. We tested our method on several data sets having dimensions varying from 4 to 7129. We compared the performance of our method with results available in the literature and found that the proposed method produces consistently good results. To demonstrate the robustness of the scheme, we studied its effectiveness on data sets with known (synthetically added) redundant/bad features.

Journal ArticleDOI
TL;DR: Color distinctiveness is explicitly incorporated into the design of salient feature detection through an algorithm called color saliency boosting, which is based on an analysis of the statistics of color image derivatives.
Abstract: The aim of salient feature detection is to find distinctive local events in images. Salient features are generally determined from the local differential structure of images. They focus on the shape-saliency of the local neighborhood. The majority of these detectors are luminance-based, which has the disadvantage that the distinctiveness of the local color information is completely ignored in determining salient image features. To fully exploit the possibilities of salient point detection in color images, color distinctiveness should be taken into account in addition to shape distinctiveness. In this paper, color distinctiveness is explicitly incorporated into the design of saliency detection. The algorithm, called color saliency boosting, is based on an analysis of the statistics of color image derivatives. Color saliency boosting is designed as a generic method easily adaptable to existing feature detectors. Results show that substantial improvements in information content are acquired by targeting color salient features.

Journal ArticleDOI
TL;DR: An automated algorithm for tissue segmentation of noisy, low-contrast magnetic resonance (MR) images of the brain is presented and the applicability of the framework can be extended to diseased brains and neonatal brains.
Abstract: An automated algorithm for tissue segmentation of noisy, low-contrast magnetic resonance (MR) images of the brain is presented. A mixture model composed of a large number of Gaussians is used to represent the brain image. Each tissue is represented by a large number of Gaussian components to capture the complex tissue spatial layout. The intensity of a tissue is considered a global feature and is incorporated into the model through tying of all the related Gaussian parameters. The expectation-maximization (EM) algorithm is utilized to learn the parameter-tied, constrained Gaussian mixture model. An elaborate initialization scheme is suggested to link the set of Gaussians per tissue type, such that each Gaussian in the set has similar intensity characteristics with minimal overlapping spatial supports. Segmentation of the brain image is achieved by the affiliation of each voxel to the component of the model that maximized the a posteriori probability. The presented algorithm is used to segment three-dimensional, T1-weighted, simulated and real MR images of the brain into three different tissues, under varying noise conditions. Results are compared with state-of-the-art algorithms in the literature. The algorithm does not use an atlas for initialization or parameter learning. Registration processes are therefore not required and the applicability of the framework can be extended to diseased brains and neonatal brains.
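
The final segmentation step, assigning each voxel to the component with the highest posterior probability, can be sketched in a few lines. This simplified version uses one-dimensional intensity Gaussians only, with made-up weights, means, standard deviations, and generic tissue labels; the paper's model also carries spatial terms and parameter tying, which are omitted here.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def map_labels(intensities, components):
    """Assign each intensity to the tissue of its maximum-posterior component.

    components: list of (weight, mean, std, tissue) tuples; the values used
    below are illustrative, not taken from the paper.
    """
    labels = []
    for x in intensities:
        # The posterior is proportional to weight * likelihood, so the shared
        # normalising constant can be skipped when taking the argmax.
        best = max(components, key=lambda k: k[0] * gaussian_pdf(x, k[1], k[2]))
        labels.append(best[3])
    return labels

components = [
    (0.3, 30.0, 8.0, "tissue_1"),
    (0.4, 80.0, 10.0, "tissue_2"),
    (0.3, 120.0, 9.0, "tissue_3"),
]
labels = map_labels([25.0, 85.0, 118.0], components)
```

Because each component carries a tissue tag, several Gaussians can share one tag, which mirrors the abstract's idea of representing a tissue by many components.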

Proceedings ArticleDOI
18 Dec 2006
TL;DR: This paper proposes a new approach to construct high speed payload-based anomaly IDS intended to be accurate and hard to evade, and uses a feature clustering algorithm originally proposed for text classification problems to reduce the dimensionality of the feature space.
Abstract: Unsupervised or unlabeled learning approaches for network anomaly detection have been recently proposed. In particular, recent work on unlabeled anomaly detection focused on high speed classification based on simple payload statistics. For example, PAYL, an anomaly IDS, measures the occurrence frequency in the payload of n-grams. A simple model of normal traffic is then constructed according to this description of the packets' content. It has been demonstrated that anomaly detectors based on payload statistics can be "evaded" by mimicry attacks using byte substitution and padding techniques. In this paper we propose a new approach to construct high speed payload-based anomaly IDS intended to be accurate and hard to evade. We propose a new technique to extract the features from the payload. We use a feature clustering algorithm originally proposed for text classification problems to reduce the dimensionality of the feature space. Accuracy and hardness of evasion are obtained by constructing our anomaly-based IDS using an ensemble of one-class SVM classifiers that work on different feature spaces.
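
The payload statistic mentioned for PAYL, the occurrence frequency of n-grams in a packet payload, can be sketched as below. The function name, the choice of n = 2, and the sample payload are illustrative; this is neither PAYL's implementation nor the feature extraction technique proposed in the paper.

```python
from collections import Counter

def ngram_frequencies(payload, n=2):
    """Relative occurrence frequency of each byte n-gram in a payload."""
    total = len(payload) - n + 1
    if total <= 0:
        return {}
    counts = Counter(payload[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}

freqs = ngram_frequencies(b"GET /a/a", n=2)
```

The eight-byte payload yields seven bigrams, of which `b"/a"` occurs twice. With n = 2 the feature space already has 65,536 possible dimensions, which suggests why the abstract turns to feature clustering for dimensionality reduction.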

Proceedings ArticleDOI
22 Jul 2006
TL;DR: It is shown that reducing the feature set improves performance on three opinion classification tasks, especially when combined with traditional feature selection.
Abstract: Lexical features are key to many approaches to sentiment analysis and opinion detection. A variety of representations have been used, including single words, multi-word Ngrams, phrases, and lexico-syntactic patterns. In this paper, we use a subsumption hierarchy to formally define different types of lexical features and their relationship to one another, both in terms of representational coverage and performance. We use the subsumption hierarchy in two ways: (1) as an analytic tool to automatically identify complex features that outperform simpler features, and (2) to reduce a feature set by removing unnecessary features. We show that reducing the feature set improves performance on three opinion classification tasks, especially when combined with traditional feature selection.

Proceedings ArticleDOI
28 May 2006
TL;DR: A theory of FOR is developed that relates code refactoring to algebraic factoring, and explains relationships between features and their implementing modules, and why features in different programs of a product-line can have different implementations.
Abstract: Feature oriented refactoring (FOR) is the process of decomposing a program into features, where a feature is an increment in program functionality. We develop a theory of FOR that relates code refactoring to algebraic factoring. Our theory explains relationships between features and their implementing modules, and why features in different programs of a product-line can have different implementations. We describe a tool and refactoring methodology based on our theory, and present a validating case study.

Patent
07 Jun 2006
TL;DR: A batch processing method for enhancing an appearance of a face located in a digital image, where the image is one of a large number of images that are being processed through a batch process, comprises the steps of: (a) providing a script file that identifies one or more original digital images that have been selected for enhancement, wherein the script file includes an instruction for the location of each original digital image; (b) using the instructions in the script file, acquiring an original digital image containing one or more faces; (c) detecting a location of facial feature points in the one or more faces.
Abstract: A batch processing method for enhancing an appearance of a face located in a digital image, where the image is one of a large number of images that are being processed through a batch process, comprises the steps of: (a) providing a script file that identifies one or more original digital images that have been selected for enhancement, wherein the script file includes an instruction for the location of each original digital image; (b) using the instructions in the script file, acquiring an original digital image containing one or more faces; (c) detecting a location of facial feature points in the one or more faces, said facial feature points including points identifying salient features including one or more of skin, eyes, eyebrows, nose, mouth, and hair; (d) using the location of the facial feature points to segment the face into different regions, said different regions including one or more of skin, eyes, eyebrows, nose, mouth, neck and hair regions; (e) determining one or more facially relevant characteristics of the different regions; (f) based on the facially relevant characteristics of the different regions, selecting one or more enhancement filters each customized especially for a particular region and selecting the default parameters for the enhancement filters; (g) executing the enhancement filters on the particular regions, thereby producing an enhanced digital image from the original digital image; (h) storing the enhanced digital image; and (i) generating an output script file having instructions that indicate one or more operations in one or more of the steps (c)-(f) that have been performed on the enhanced digital image.

Journal ArticleDOI
TL;DR: A geometry-driven facial expression synthesis system is proposed that, given the feature point positions of a facial expression, automatically synthesizes a corresponding expression image with photorealistic and natural looking expression details such as wrinkles due to skin deformation.
Abstract: Expression mapping (also called performance driven animation) has been a popular method for generating facial animations. A shortcoming of this method is that it does not generate expression details such as the wrinkles due to skin deformations. In this paper, we provide a solution to this problem. We have developed a geometry-driven facial expression synthesis system. Given feature point positions (the geometry) of a facial expression, our system automatically synthesizes a corresponding expression image that includes photorealistic and natural looking expression details. Due to the difficulty of point tracking, the number of feature points required by the synthesis system is, in general, more than what is directly available from a performance sequence. We have developed a technique to infer the missing feature point motions from the tracked subset by using an example-based approach. Another application of our system is expression editing where the user drags feature points while the system interactively generates facial expressions with skin deformation details.

Book ChapterDOI
18 Sep 2006
TL;DR: The probabilistic interpretation of linear PCA is exploited together with recent results on latent variable models in Gaussian Processes in order to introduce an objective function for KPCA, and this new approach can be extended to reconstruct corrupted test data using fixed kernel feature extractors.
Abstract: Kernel Principal Component Analysis (KPCA) is a widely used technique for visualisation and feature extraction. Despite its success and flexibility, the lack of a probabilistic interpretation means that some problems, such as handling missing or corrupted data, are very hard to deal with. In this paper we exploit the probabilistic interpretation of linear PCA together with recent results on latent variable models in Gaussian Processes in order to introduce an objective function for KPCA. This in turn allows a principled approach to the missing data problem. Furthermore, this new approach can be extended to reconstruct corrupted test data using fixed kernel feature extractors. The experimental results show strong improvements over widely used heuristics.