
Showing papers on "Feature vector" published in 2004


Journal ArticleDOI
TL;DR: A method is presented for automated segmentation of vessels in two-dimensional color images of the retina, based on extraction of image ridges, which coincide approximately with vessel centerlines; the method is compared with two recently published rule-based methods.
Abstract: A method is presented for automated segmentation of vessels in two-dimensional color images of the retina. This method can be used in computer analyses of retinal images, e.g., in automated screening for diabetic retinopathy. The system is based on extraction of image ridges, which coincide approximately with vessel centerlines. The ridges are used to compose primitives in the form of line elements. With the line elements an image is partitioned into patches by assigning each image pixel to the closest line element. Every line element constitutes a local coordinate frame for its corresponding patch. For every pixel, feature vectors are computed that make use of properties of the patches and the line elements. The feature vectors are classified using a kNN-classifier and sequential forward feature selection. The algorithm was tested on a database consisting of 40 manually labeled images. The method achieves an area under the receiver operating characteristic curve of 0.952. The method is compared with two recently published rule-based methods of Hoover et al. and Jiang et al. The results show that our method is significantly better than the two rule-based methods (p<0.01). The accuracy of our method is 0.944 versus 0.947 for a second observer.
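A minimal sketch of the classification stage only, assuming the ridge/patch feature extraction described above has already produced a per-pixel feature matrix; the feature dimensions, neighbour count, and labels below are placeholders, and scikit-learn's SequentialFeatureSelector stands in for the paper's sequential forward selection.

```python
# Sketch of the classification stage: kNN with sequential forward feature
# selection over precomputed per-pixel feature vectors (placeholders here).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 27))          # placeholder per-pixel feature matrix
y = rng.integers(0, 2, size=5000)        # placeholder vessel / non-vessel labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=30)
# Greedy forward selection of a feature subset, scored by cross-validation.
selector = SequentialFeatureSelector(knn, n_features_to_select=10,
                                     direction="forward", cv=3)
selector.fit(X_train, y_train)

knn.fit(selector.transform(X_train), y_train)
# Soft vessel probability per pixel, as used for the ROC analysis.
p_vessel = knn.predict_proba(selector.transform(X_test))[:, 1]
```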

3,416 citations


Book ChapterDOI
11 May 2004
TL;DR: A novel approach to face recognition which considers both shape and texture information to represent face images and the simplicity of the proposed method allows for very fast feature extraction.
Abstract: In this work, we present a novel approach to face recognition which considers both shape and texture information to represent face images. The face area is first divided into small regions from which Local Binary Pattern (LBP) histograms are extracted and concatenated into a single, spatially enhanced feature histogram efficiently representing the face image. The recognition is performed using a nearest neighbour classifier in the computed feature space with Chi square as a dissimilarity measure. Extensive experiments clearly show the superiority of the proposed scheme over all considered methods (PCA, Bayesian Intra/extrapersonal Classifier and Elastic Bunch Graph Matching) on FERET tests which include testing the robustness of the method against different facial expressions, lighting and aging of the subjects. In addition to its efficiency, the simplicity of the proposed method allows for very fast feature extraction.
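An illustrative sketch of the spatially enhanced LBP histogram and the Chi-square nearest-neighbour matching; the grid size, LBP radius, and sampling points below are placeholders rather than the paper's exact configuration, and the region weighting used in the full method is omitted.

```python
# Concatenated per-region LBP histograms as a face descriptor, matched with a
# nearest-neighbour classifier under the Chi-square dissimilarity.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_face_descriptor(face, grid=(7, 7), P=8, R=2):
    """Concatenate LBP histograms computed over a grid of face regions."""
    lbp = local_binary_pattern(face, P, R, method="uniform")
    n_bins = P + 2                      # number of uniform LBP patterns
    h, w = face.shape
    hists = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = lbp[i * h // grid[0]:(i + 1) * h // grid[0],
                        j * w // grid[1]:(j + 1) * w // grid[1]]
            bh, _ = np.histogram(block, bins=n_bins, range=(0, n_bins))
            hists.append(bh / max(bh.sum(), 1))
    return np.concatenate(hists)

def chi_square(a, b, eps=1e-10):
    return np.sum((a - b) ** 2 / (a + b + eps))

def identify(probe, gallery):
    """gallery: list of (label, descriptor); returns the closest identity."""
    return min(gallery, key=lambda g: chi_square(probe, g[1]))[0]
```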

2,191 citations


Journal ArticleDOI
TL;DR: A framework to handle semantic scene classification, where a natural scene may contain multiple objects such that the scene can be described by multiple class labels, is presented and appears to generalize to other classification problems of the same nature.

2,161 citations


Proceedings ArticleDOI
19 Jul 2004
TL;DR: This paper proposes a novel method for solving single-image super-resolution problems, given a low-resolution image as input, and recovers its high-resolution counterpart using a set of training examples, inspired by recent manifold learning methods.
Abstract: In this paper, we propose a novel method for solving single-image super-resolution problems. Given a low-resolution image as input, we recover its high-resolution counterpart using a set of training examples. While this formulation resembles other learning-based methods for super-resolution, our method has been inspired by recent manifold learning methods, particularly locally linear embedding (LLE). Specifically, small image patches in the low- and high-resolution images form manifolds with similar local geometry in two distinct feature spaces. As in LLE, local geometry is characterized by how a feature vector corresponding to a patch can be reconstructed by its neighbors in the feature space. Besides using the training image pairs to estimate the high-resolution embedding, we also enforce local compatibility and smoothness constraints between patches in the target high-resolution image through overlapping. Experiments show that our method is very flexible and gives good empirical results.
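A sketch of the core neighbour-embedding step under simplifying assumptions: patch extraction, feature normalisation, and the overlap-averaging compatibility constraint are omitted, and patches are treated as plain vectors.

```python
# A low-resolution patch is reconstructed from its K nearest low-resolution
# training patches with LLE-style weights, and the same weights are applied
# to the paired high-resolution patches.
import numpy as np

def reconstruct_hr_patch(x_lr, train_lr, train_hr, K=5, reg=1e-4):
    # K nearest neighbours of the input low-resolution patch (as vectors).
    d = np.linalg.norm(train_lr - x_lr, axis=1)
    idx = np.argsort(d)[:K]
    N = train_lr[idx]                      # K x d_lr neighbour matrix

    # Reconstruction weights: minimise ||x - w^T N|| subject to sum(w) = 1.
    G = (N - x_lr) @ (N - x_lr).T          # local Gram matrix
    G += reg * np.trace(G) * np.eye(K)     # regularisation for stability
    w = np.linalg.solve(G, np.ones(K))
    w /= w.sum()

    # Apply the same weights in the high-resolution feature space.
    return w @ train_hr[idx]
```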

1,951 citations


Journal ArticleDOI
TL;DR: A nonlinear version of the recursive least squares (RLS) algorithm that uses a sequential sparsification process that admits into the kernel representation a new input sample only if its feature space image cannot be sufficiently well approximated by combining the images of previously admitted samples.
Abstract: We present a nonlinear version of the recursive least squares (RLS) algorithm. Our algorithm performs linear regression in a high-dimensional feature space induced by a Mercer kernel and can therefore be used to recursively construct minimum mean-squared-error solutions to nonlinear least-squares problems that are frequently encountered in signal processing applications. In order to regularize solutions and keep the complexity of the algorithm bounded, we use a sequential sparsification process that admits into the kernel representation a new input sample only if its feature space image cannot be sufficiently well approximated by combining the images of previously admitted samples. This sparsification procedure allows the algorithm to operate online, often in real time. We analyze the behavior of the algorithm, compare its scaling properties to those of support vector machines, and demonstrate its utility in solving two signal processing problems-time-series prediction and channel equalization.
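A sketch of the sparsification test only, under the assumption of an RBF kernel: a new input is admitted to the kernel dictionary when its feature-space image cannot be approximated by the stored samples within a threshold. The recursive least-squares weight update of the full algorithm, and the efficient recursive update of the inverse kernel matrix, are not shown.

```python
# Dictionary construction via an approximate-linear-dependence style test.
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def build_dictionary(samples, nu=0.1, kernel=rbf):
    dictionary = [samples[0]]
    K_inv = np.array([[1.0 / kernel(samples[0], samples[0])]])
    for x in samples[1:]:
        k = np.array([kernel(d, x) for d in dictionary])
        a = K_inv @ k                    # best approximation coefficients
        delta = kernel(x, x) - k @ a     # residual of the approximation
        if delta > nu:                   # not well represented: admit x
            dictionary.append(x)
            m = len(dictionary)
            K = np.array([[kernel(u, v) for v in dictionary]
                          for u in dictionary])
            K_inv = np.linalg.inv(K + 1e-10 * np.eye(m))  # sketch: recompute
        # else: x is absorbed without growing the representation
    return dictionary
```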

1,011 citations


Journal ArticleDOI
TL;DR: In this article, a new nonlinear process monitoring technique based on kernel principal component analysis (KPCA) is developed, which can efficiently compute principal components in high-dimensional feature spaces by means of integral operators and nonlinear kernel functions.

927 citations


Proceedings ArticleDOI
27 Jun 2004
TL;DR: This work shows how it can do both automatic image annotation and retrieval (using one word queries) from images and videos using a multiple Bernoulli relevance model, which significantly outperforms previously reported results on the task of image and video annotation.
Abstract: Retrieving images in response to textual queries requires some knowledge of the semantics of the picture. Here, we show how we can do both automatic image annotation and retrieval (using one word queries) from images and videos using a multiple Bernoulli relevance model. The model assumes that a training set of images or videos along with keyword annotations is provided. Multiple keywords are provided for an image and the specific correspondence between a keyword and an image is not provided. Each image is partitioned into a set of rectangular regions and a real-valued feature vector is computed over these regions. The relevance model is a joint probability distribution of the word annotations and the image feature vectors and is computed using the training set. The word probabilities are estimated using a multiple Bernoulli model and the image feature probabilities using a non-parametric kernel density estimate. The model is then used to annotate images in a test set. We show experiments on both images from a standard Corel data set and a set of video key frames from NIST's Video TREC. Comparative experiments show that the model performs better than a model based on estimating word probabilities using the popular multinomial distribution. The results also show that our model significantly outperforms previously reported results on the task of image and video annotation.
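A hedged sketch of the two per-training-image estimates the abstract mentions: a smoothed Bernoulli probability for each annotation word and a Gaussian kernel density for region feature vectors. The smoothing parameter `mu` and bandwidth `beta` are illustrative, not values from the paper, and the final annotation step that combines these into a joint relevance score is not shown.

```python
import numpy as np

def bernoulli_word_prob(word, image_words, all_train_words, n_train, mu=10.0):
    """P(word | training image J), smoothed towards the corpus frequency."""
    delta = 1.0 if word in image_words else 0.0
    n_w = sum(word in ws for ws in all_train_words)  # images annotated with word
    return (mu * delta + n_w) / (mu + n_train)

def kernel_density(feature, image_features, beta=1.0):
    """Non-parametric estimate of P(feature | training image J) from its regions."""
    d = feature.shape[0]
    diffs = image_features - feature
    norm = (2 * np.pi * beta) ** (d / 2)
    return np.mean(np.exp(-np.sum(diffs ** 2, axis=1) / (2 * beta)) / norm)
```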

815 citations


Journal ArticleDOI
TL;DR: This paper presents a new learning technique, which extends Multiple-Instance Learning (MIL), and its application to the problem of region-based image categorization, and provides experimental results on an image categorization problem and a drug activity prediction problem.
Abstract: Designing computer programs to automatically categorize images using low-level features is a challenging research topic in computer vision. In this paper, we present a new learning technique, which extends Multiple-Instance Learning (MIL), and its application to the problem of region-based image categorization. Images are viewed as bags, each of which contains a number of instances corresponding to regions obtained from image segmentation. The standard MIL problem assumes that a bag is labeled positive if at least one of its instances is positive; otherwise, the bag is negative. In the proposed MIL framework, DD-SVM, a bag label is determined by some number of instances satisfying various properties. DD-SVM first learns a collection of instance prototypes according to a Diverse Density (DD) function. Each instance prototype represents a class of instances that is more likely to appear in bags with the specific label than in the other bags. A nonlinear mapping is then defined using the instance prototypes and maps every bag to a point in a new feature space, named the bag feature space. Finally, standard support vector machines are trained in the bag feature space. We provide experimental results on an image categorization problem and a drug activity prediction problem.
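A sketch of the bag feature space only, assuming the instance prototypes and their feature weights have already been learned; the Diverse Density optimisation that produces them is not shown, and the SVM hyperparameters are defaults rather than tuned values.

```python
# Each bag of region instances is mapped to a vector of its minimum weighted
# distances to the learned instance prototypes; a standard SVM is then trained
# in this bag feature space.
import numpy as np
from sklearn.svm import SVC

def bag_to_feature(bag, prototypes, weights):
    """bag: (n_instances, d); prototypes/weights: lists of (d,) arrays."""
    feats = []
    for p, w in zip(prototypes, weights):
        d = np.sqrt(np.sum((w * (bag - p)) ** 2, axis=1))  # weighted distances
        feats.append(d.min())          # closest instance to this prototype
    return np.array(feats)

def train_dd_svm(bags, labels, prototypes, weights):
    X = np.vstack([bag_to_feature(b, prototypes, weights) for b in bags])
    return SVC(kernel="rbf", gamma="scale").fit(X, labels)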

698 citations


Journal ArticleDOI
TL;DR: Both the SVM and LS-SVM classifier with RBF kernel in combination with standard cross-validation procedures for hyperparameter selection achieve comparable test set performances, consistently very good when compared to a variety of methods described in the literature.
Abstract: In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LS-SVMs), a least squares cost function is proposed so as to obtain a linear set of equations in the dual space. While the SVM classifier has a large margin interpretation, the LS-SVM formulation is related in this paper to a ridge regression approach for classification with binary targets and to Fisher's linear discriminant analysis in the feature space. Multiclass categorization problems are represented by a set of binary classifiers using different output coding schemes. While regularization is used to control the effective number of parameters of the LS-SVM classifier, the sparseness property of SVMs is lost due to the choice of the 2-norm. Sparseness can be imposed in a second stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark datasets are used to evaluate the test set performance of LS-SVM classifiers with linear, polynomial and radial basis function (RBF) kernels. Both the SVM and LS-SVM classifier with RBF kernel in combination with standard cross-validation procedures for hyperparameter selection achieve comparable test set performances. These SVM and LS-SVM performances are consistently very good when compared to a variety of methods described in the literature including decision tree based algorithms, statistical algorithms and instance based learning methods. We show on ten UCI datasets that the LS-SVM sparse approximation procedure can be successfully applied.
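A minimal numpy sketch of the LS-SVM binary classifier described above: instead of a QP, training reduces to a single linear system in the bias and the support values. The regularisation constant `gam` and RBF width `sig2` are illustrative and would be chosen by the cross-validation procedure the abstract refers to.

```python
import numpy as np

def rbf_kernel(A, B, sig2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sig2))

def lssvm_train(X, y, gam=1.0, sig2=1.0):
    """Solve [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1]."""
    n = len(y)
    Omega = (y[:, None] * y[None, :]) * rbf_kernel(X, X, sig2)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gam
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]               # bias b, support values alpha

def lssvm_predict(Xtest, X, y, b, alpha, sig2=1.0):
    K = rbf_kernel(Xtest, X, sig2)
    return np.sign(K @ (alpha * y) + b)
```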

698 citations


Journal ArticleDOI
TL;DR: In this article, a new scheme for the diagnosis of localised defects in ball bearings, based on the wavelet transform and neuro-fuzzy classification, is proposed. Although the scheme was applied only to a single motor-driven experimental system, the results demonstrate that the method can reliably separate different fault conditions in the presence of load variations.

599 citations


Journal ArticleDOI
TL;DR: A view-based approach to recognize humans from their gait by employing a hidden Markov model (HMM) and the statistical nature of the HMM lends overall robustness to representation and recognition.
Abstract: We propose a view-based approach to recognize humans from their gait. Two different image features have been considered: the width of the outer contour of the binarized silhouette of the walking person and the entire binary silhouette itself. To obtain the observation vector from the image features, we employ two different methods. In the first method, referred to as the indirect approach, the high-dimensional image feature is transformed to a lower dimensional space by generating what we call the frame to exemplar (FED) distance. The FED vector captures both structural and dynamic traits of each individual. For compact and effective gait representation and recognition, the gait information in the FED vector sequences is captured in a hidden Markov model (HMM). In the second method, referred to as the direct approach, we work with the feature vector directly (as opposed to computing the FED) and train an HMM. We estimate the HMM parameters (specifically the observation probability B) based on the distance between the exemplars and the image features. In this way, we avoid learning high-dimensional probability density functions. The statistical nature of the HMM lends overall robustness to representation and recognition. The performance of the methods is illustrated using several databases.

Proceedings ArticleDOI
23 Aug 2004
TL;DR: A new method for extracting features from palmprints using the competitive coding scheme and angular matching and the execution time for the whole process of verification, including preprocessing, feature extraction and final matching is 1s.
Abstract: There is increasing interest in the development of reliable, rapid and non-intrusive security control systems. Among the many approaches, biometrics such as palmprints provide highly effective automatic mechanisms for use in personal identification. This paper presents a new method for extracting features from palmprints using the competitive coding scheme and angular matching. The competitive coding scheme uses multiple 2-D Gabor filters to extract orientation information from palm lines. This information is then stored in a feature vector called the competitive code. The angular matching with an effective implementation is then defined for comparing the proposed codes, which can make over 9,000 comparisons within 1s. In our testing database of 7,752 palmprint samples from 386 palms, we can achieve a high genuine acceptance rate of 98.4% and a low false acceptance rate of 3×10⁻⁶%. The execution time for the whole process of verification, including preprocessing, feature extraction and final matching, is 1s.
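An illustrative sketch of the competitive coding idea: the palm image is filtered with a small bank of oriented Gabor filters and, per pixel, the index of the winning orientation is kept as the code; two codes are compared with a circular angular distance. The Gabor parameters below are placeholders, and the paper's bitwise code representation and masking are omitted.

```python
import cv2
import numpy as np

N_ORIENT = 6

def competitive_code(palm, ksize=35, sigma=5.0, lambd=12.0):
    responses = []
    for k in range(N_ORIENT):
        theta = k * np.pi / N_ORIENT
        g = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, 0.5, 0)
        responses.append(cv2.filter2D(palm.astype(np.float32), cv2.CV_32F, g))
    # Winner-take-all over orientations (palm lines give strong negative responses).
    return np.argmin(np.stack(responses, axis=0), axis=0)

def angular_distance(code_a, code_b):
    diff = np.abs(code_a - code_b)
    ang = np.minimum(diff, N_ORIENT - diff)              # circular orientation gap
    return ang.sum() / (code_a.size * (N_ORIENT // 2))   # normalised to [0, 1]
```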

Journal ArticleDOI
TL;DR: A change detection approach based on an object-based classification of remote sensing data is introduced that classifies not single pixels but groups of pixels that represent already existing objects in a GIS database based on a supervised maximum likelihood classification.
Abstract: In this paper, a change detection approach based on an object-based classification of remote sensing data is introduced. The approach classifies not single pixels but groups of pixels that represent already existing objects in a GIS database. The approach is based on a supervised maximum likelihood classification. The multispectral bands grouped by objects and very different measures that can be derived from multispectral bands represent the n -dimensional feature space for the classification. The training areas are derived automatically from the geographical information system (GIS) database. After an introduction into the general approach, different input channels for the classification are defined and discussed. The results of a test on two test areas are presented. Afterwards, further measures, which can improve the result of the classification and enable the distinction between more land-use classes than with the introduced approach, are presented.

Journal ArticleDOI
TL;DR: The task of classifying the types of moving vehicles in a distributed, wireless sensor network is investigated and a data set that consists of 820 MByte raw time series data, 70 MByte of preprocessed, extracted spectral feature vectors, and baseline classification results using the maximum likelihood classifier is compiled.

Journal ArticleDOI
TL;DR: The results illustrate the potential to direct training data acquisition strategies to target the most useful training samples to allow efficient and accurate image classification.

Book ChapterDOI
23 May 2004
TL;DR: In this article, a feature-based steganalytic method for JPEG images is proposed, where the features are calculated as an L 1 norm of the difference between a specific macroscopic functional calculated from the stego image and the same functional obtained from a decompressed, cropped, and recompressed stegos image.
Abstract: In this paper, we introduce a new feature-based steganalytic method for JPEG images and use it as a benchmark for comparing JPEG steganographic algorithms and evaluating their embedding mechanisms. The detection method is a linear classifier trained on feature vectors corresponding to cover and stego images. In contrast to previous blind approaches, the features are calculated as an L1 norm of the difference between a specific macroscopic functional calculated from the stego image and the same functional obtained from a decompressed, cropped, and recompressed stego image. The functionals are built from marginal and joint statistics of DCT coefficients. Because the features are calculated directly from DCT coefficients, conclusions can be drawn about the impact of embedding modifications on detectability. Three different steganographic paradigms are tested and compared. Experimental results reveal new facts about current steganographic methods for JPEGs and new design principles for more secure JPEG steganography.
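A hedged sketch of the calibration idea only: compute a macroscopic functional on the stego image, then on a decompressed, cropped (by 4 pixels), and recompressed version, and take the L1 norm of the difference as a feature. As a stand-in functional this sketch uses a global histogram of 8×8 block DCT coefficients computed in the pixel domain; the paper works directly on the quantised JPEG DCT coefficients and uses a larger family of marginal and joint functionals.

```python
import io
import numpy as np
from PIL import Image
from scipy.fft import dctn

def block_dct_histogram(gray, bins=41, rng=(-20, 20)):
    h, w = (gray.shape[0] // 8) * 8, (gray.shape[1] // 8) * 8
    coeffs = []
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            coeffs.append(dctn(gray[i:i + 8, j:j + 8], norm="ortho").ravel())
    hist, _ = np.histogram(np.concatenate(coeffs), bins=bins, range=rng)
    return hist / max(hist.sum(), 1)

def calibration_feature(jpeg_path, quality=75):
    img = Image.open(jpeg_path).convert("L")
    stego = np.asarray(img, dtype=np.float64)
    # Calibration: crop 4 pixels and recompress with the same quality factor.
    cropped = img.crop((4, 4, img.width, img.height))
    buf = io.BytesIO()
    cropped.save(buf, format="JPEG", quality=quality)
    calib = np.asarray(Image.open(buf).convert("L"), dtype=np.float64)
    return np.abs(block_dct_histogram(stego) - block_dct_histogram(calib)).sum()
```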

Proceedings ArticleDOI
Michael Gamon1
23 Aug 2004
TL;DR: It is demonstrated that it is possible to perform automatic sentiment classification in the very noisy domain of customer feedback data by using large feature vectors in combination with feature reduction, and that the addition of deep linguistic analysis features to a set of surface-level word n-gram features contributes consistently to classification accuracy.
Abstract: We demonstrate that it is possible to perform automatic sentiment classification in the very noisy domain of customer feedback data. We show that by using large feature vectors in combination with feature reduction, we can train linear support vector machines that achieve high classification accuracy on data that present classification challenges even for a human annotator. We also show that, surprisingly, the addition of deep linguistic analysis features to a set of surface level word n-gram features contributes consistently to classification accuracy in this domain.
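A sketch in the spirit of the pipeline described above: a large word n-gram feature vector, aggressive feature reduction, and a linear SVM. The paper additionally uses deep linguistic analysis features and a log-likelihood-ratio reduction criterion; here chi-square selection and scikit-learn's LinearSVC stand in for those components, and the parameter values are illustrative.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import LinearSVC

sentiment_clf = Pipeline([
    ("ngrams", CountVectorizer(ngram_range=(1, 3), min_df=2, binary=True)),
    ("reduce", SelectKBest(chi2, k=2000)),   # feature reduction
    ("svm", LinearSVC(C=1.0)),
])

# docs: list of customer-feedback strings, labels: satisfaction classes
# sentiment_clf.fit(docs, labels); sentiment_clf.predict(new_docs)
```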

Journal ArticleDOI
TL;DR: A logic-driven clustering in which prototypes are formed and evaluated in a sequential manner that considers an inverse similarity problem and shows how the relevance of the prototypes translates into their granularity.
Abstract: We introduce a logic-driven clustering in which prototypes are formed and evaluated in a sequential manner. The way of revealing a structure in data is realized by maximizing a certain performance index (objective function) that takes into consideration an overall level of matching (to be maximized) and a similarity level between the prototypes (the component to be minimized). The prototypes identified in the process come with the optimal weight vector that serves to indicate the significance of the individual features (coordinates) in the data grouping represented by the prototype. Since the topologies of these groupings are in general quite diverse the optimal weight vectors are reflecting the anisotropy of the feature space, i.e., they show some local ranking of features in the data space. Having found the prototypes we consider an inverse similarity problem and show how the relevance of the prototypes translates into their granularity.

Journal ArticleDOI
TL;DR: In this paper, the authors implemented the matrix multiplication of a neural network to enhance the time performance of a text detection system using an ATI RADEON 9700 PRO board, which produced a 20-fold performance enhancement.

Proceedings ArticleDOI
10 Oct 2004
TL;DR: MRBIR first makes use of a manifold ranking algorithm to explore the relationship among all the data points in the feature space, and then measures relevance between the query and all the images in the database accordingly, which is different from traditional similarity metrics based on pair-wise distance.
Abstract: In this paper, we propose a novel transductive learning framework named manifold-ranking based image retrieval (MRBIR). Given a query image, MRBIR first makes use of a manifold ranking algorithm to explore the relationship among all the data points in the feature space, and then measures relevance between the query and all the images in the database accordingly, which is different from traditional similarity metrics based on pair-wise distance. In relevance feedback, if only positive examples are available, they are added to the query set to improve the retrieval result; if examples of both labels can be obtained, MRBIR discriminately spreads the ranking scores of positive and negative examples, considering the asymmetry between these two types of images. Furthermore, three active learning methods are incorporated into MRBIR, which select images in each round of relevance feedback according to different principles, aiming to maximally improve the ranking result. Experimental results on a general-purpose image database show that MRBIR attains a significant improvement over existing systems from all aspects.
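A sketch of the manifold ranking step at the core of such a system: ranking scores are propagated over a normalised affinity graph of all images, f ← αSf + (1 − α)y, until convergence. Graph construction details (kNN sparsification, bandwidth selection) and the relevance-feedback handling of negative examples are simplified away.

```python
import numpy as np

def manifold_rank(X, query_idx, sigma=1.0, alpha=0.99, iters=100):
    # Dense Gaussian affinity graph over all image feature vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                   # no self-loops
    Dinv = 1.0 / np.sqrt(W.sum(axis=1))
    S = Dinv[:, None] * W * Dinv[None, :]      # symmetric normalisation
    y = np.zeros(len(X))
    y[query_idx] = 1.0                         # query (and positive feedback)
    f = y.copy()
    for _ in range(iters):
        f = alpha * S @ f + (1 - alpha) * y    # iterative score spreading
    return f                                   # higher score = more relevant
```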

Proceedings ArticleDOI
17 May 2004
TL;DR: This paper uses a vision-based page segmentation algorithm to partition a web page into semantic blocks with a hierarchical structure, then spatial features and content features are extracted and used to construct a feature vector for each block.
Abstract: Previous work shows that a web page can be partitioned into multiple segments or blocks, and often the importance of those blocks in a page is not equivalent. Also, it has been proven that differentiating noisy or unimportant blocks from pages can facilitate web mining, search and accessibility. However, no uniform approach and model has been presented to measure the importance of different segments in web pages. Through a user study, we found that people do have a consistent view about the importance of blocks in web pages. In this paper, we investigate how to find a model to automatically assign importance values to blocks in a web page. We define the block importance estimation as a learning problem. First, we use a vision-based page segmentation algorithm to partition a web page into semantic blocks with a hierarchical structure. Then spatial features (such as position and size) and content features (such as the number of images and links) are extracted to construct a feature vector for each block. Based on these features, learning algorithms are used to train a model to assign importance to different segments in the web page. In our experiments, the best model can achieve the performance with Micro-F1 79% and Micro-Accuracy 85.9%, which is quite close to a person's view.

Journal ArticleDOI
TL;DR: This article compares 14 distance measures and their modifications between feature vectors with respect to the recognition performance of the principal component analysis (PCA)-based face recognition method, and proposes a modified sum square error (SSE)-based distance.

Journal ArticleDOI
TL;DR: A comparison of normalization functions shows that moment-based functions outperform the dimension-based ones and that the aspect ratio mapping is influential; a comparison of feature vectors shows that the improved feature extraction strategies outperform their baseline counterparts.

01 Jun 2004
TL;DR: The aim of this paper is to examine the effectiveness of the CamShift algorithm as a general-purpose object tracking approach in the case where no assumptions have been made about the target to be tracked.
Abstract: The Continuously Adaptive Mean Shift Algorithm (CamShift) is an adaptation of the Mean Shift algorithm for object tracking that is intended as a step towards head and face tracking for a perceptual user interface. In this paper, we review the CamShift Algorithm and extend a default implementation to allow tracking in an arbitrary number and type of feature spaces. In order to compute the new probability that a pixel value belongs to the target model, we weight the multidimensional histogram with a simple monotonically decreasing kernel profile prior to histogram back-projection. We evaluate the effectiveness of this approach by comparing the results with a generic implementation of the Mean Shift algorithm in a quantized feature space of equivalent dimension. The aim of this paper is to examine the effectiveness of the CamShift algorithm as a general-purpose object tracking approach in the case where no assumptions have been made about the target to be tracked.
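A baseline CamShift loop with OpenCV, tracking via back-projection of a hue histogram; the paper's extension to kernel-weighted multidimensional histograms over arbitrary feature spaces is not reproduced, and `video_path`/`init_window` are assumed inputs.

```python
import cv2
import numpy as np

def track(video_path, init_window):
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    x, y, w, h = init_window
    # Hue histogram of the initial target region as the target model.
    hsv = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [180], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    window = init_window
    crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
        rot_rect, window = cv2.CamShift(backproj, window, crit)
        yield rot_rect                  # oriented box around the tracked target
    cap.release()
```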

Book ChapterDOI
11 May 2004
TL;DR: In this article, a probabilistic mapping between continuous image feature vectors and the supplied word tokens is proposed to learn both word-to-region associations and object relations, which augments scene segmentation due to smoothing implicit in spatial consistency.
Abstract: We consider object recognition as the process of attaching meaningful labels to specific regions of an image, and propose a model that learns spatial relationships between objects. Given a set of images and their associated text (e.g. keywords, captions, descriptions), the objective is to segment an image, in either a crude or sophisticated fashion, then to find the proper associations between words and regions. Previous models are limited by the scope of the representation. In particular, they fail to exploit spatial context in the images and words. We develop a more expressive model that takes this into account. We formulate a spatially consistent probabilistic mapping between continuous image feature vectors and the supplied word tokens. By learning both word-to-region associations and object relations, the proposed model augments scene segmentations due to smoothing implicit in spatial consistency. Context introduces cycles to the undirected graph, so we cannot rely on a straightforward implementation of the EM algorithm for estimating the model parameters and densities of the unknown alignment variables. Instead, we develop an approximate EM algorithm that uses loopy belief propagation in the inference step and iterative scaling on the pseudo-likelihood approximation in the parameter update step. The experiments indicate that our approximate inference and learning algorithm converges to good local solutions. Experiments on a diverse array of images show that spatial context considerably improves the accuracy of object recognition. Most significantly, spatial context combined with a nonlinear discrete object representation allows our models to cope well with over-segmented scenes.

Proceedings ArticleDOI
27 Jun 2004
TL;DR: A novel discriminative feature space which is efficient not only for face detection but also for recognition is introduced, and the same facial representation can be efficiently used for both detection and recognition.
Abstract: We introduce a novel discriminative feature space which is efficient not only for face detection but also for recognition. The face representation is based on local binary patterns (LBP) and consists of encoding both local and global facial characteristics into a compact feature histogram. The proposed representation is invariant with respect to monotonic gray scale transformations and can be derived in a single scan through the image. Considering the derived feature space, a second-degree polynomial kernel SVM classifier was trained to detect frontal faces in gray scale images. Experimental results using several complex images show that the proposed approach performs favorably compared to the state-of-the-art methods. Additionally, experiments with detecting and recognizing low-resolution faces from video sequences were carried out, demonstrating that the same facial representation can be efficiently used for both detection and recognition.

Patent
03 Jun 2004
TL;DR: In this paper, the user selects a time interval in the video, which can be as short as one frame or consist of disjoint segments or shots, as a query definition of training images for training an image class statistical model.
Abstract: Methods for interactive selecting video queries consisting of training images from a video for a video similarity search and for displaying the results of the similarity search are disclosed. The user selects a time interval in the video as a query definition of training images for training an image class statistical model. Time intervals can be as short as one frame or consist of disjoint segments or shots. A statistical model of the image class defined by the training images is calculated on-the-fly from feature vectors extracted from transforms of the training images. For each frame in the video, a feature vector is extracted from the transform of the frame, and a similarity measure is calculated using the feature vector and the image class statistical model. The similarity measure is derived from the likelihood of a Gaussian model producing the frame. The similarity is then presented graphically, which allows the time structure of the video to be visualized and browsed. Similarity can be rapidly calculated for other video files as well, which enables content-based retrieval by example. A content-aware video browser featuring interactive similarity measurement is presented. A method for selecting training segments involves mouse click-and-drag operations over a time bar representing the duration of the video; similarity results are displayed as shades in the time bar. Another method involves selecting periodic frames of the video as endpoints for the training segment.
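A sketch of the similarity scoring described above: a single Gaussian is fitted on the fly to the feature vectors of the user-selected training frames, and every frame of the video is scored by its log-likelihood under that model. Extraction of feature vectors from frame transforms is abstracted into the `train_features`/`frame_features` arrays, which are assumed inputs.

```python
import numpy as np
from scipy.stats import multivariate_normal

def train_image_class(train_features):
    """Fit a Gaussian image-class model to the selected training frames."""
    mu = train_features.mean(axis=0)
    cov = np.cov(train_features, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])   # regularise for few training frames
    return mu, cov

def similarity_scores(frame_features, mu, cov):
    # One score per frame; displayed as shades along the video time bar.
    return multivariate_normal.logpdf(frame_features, mean=mu, cov=cov)
```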

Journal ArticleDOI
TL;DR: An attempt to capture shape information of the iris by analyzing local intensity variations of an iris image: a set of one-dimensional intensity signals is constructed that reflects, to a large extent, the image's various spatial modes, and these signals are used as distinguishing features.

Journal ArticleDOI
TL;DR: An efficient face recognition scheme which has two features: representation of face images by two-dimensional wavelet subband coefficients and recognition by a modular, personalised classification method based on kernel associative memory models.
Abstract: In this paper, we propose an efficient face recognition scheme which has two features: 1) representation of face images by two-dimensional (2D) wavelet subband coefficients and 2) recognition by a modular, personalised classification method based on kernel associative memory models. Compared to PCA projections and low resolution "thumb-nail" image representations, wavelet subband coefficients can efficiently capture substantial facial features while keeping computational complexity low. As there are usually very limited samples, we constructed an associative memory (AM) model for each person and proposed to improve the performance of AM models by kernel methods. Specifically, we first applied kernel transforms to each possible training pair of face samples and then mapped the high-dimensional feature space back to input space. Our scheme using modular autoassociative memory for face recognition is inspired by the same motivation as using autoencoders for optical character recognition (OCR), for which the advantages have been proven. By associative memory, all the prototypical faces of one particular person are used to reconstruct themselves and the reconstruction error for a probe face image is used to decide if the probe face is from the corresponding person. We carried out extensive experiments on three standard face recognition datasets, the FERET data, the XM2VTS data, and the ORL data. Detailed comparisons with earlier published results are provided and our proposed scheme offers better recognition accuracy on all of the face datasets.
