
Showing papers on "Feature extraction published in 2007"


Proceedings ArticleDOI
17 Jun 2007
TL;DR: A simple method for visual saliency detection is presented, independent of features, categories, or other forms of prior knowledge of the objects, and a fast method to construct the corresponding saliency map in the spatial domain is proposed.
Abstract: The ability of the human visual system to detect visual saliency is extraordinarily fast and reliable. However, computational modeling of this basic intelligent behavior still remains a challenge. This paper presents a simple method for visual saliency detection. Our model is independent of features, categories, or other forms of prior knowledge of the objects. By analyzing the log-spectrum of an input image, we extract the spectral residual of the image in the spectral domain, and propose a fast method to construct the corresponding saliency map in the spatial domain. We test this model on both natural pictures and artificial images such as psychological patterns. The results indicate fast and robust saliency detection by our method.
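The spectral-residual computation described above is compact enough to sketch directly. Below is a minimal NumPy/SciPy sketch of the idea: the log amplitude spectrum minus its local average gives the spectral residual, which is recombined with the original phase and transformed back to obtain the saliency map. The 3-pixel averaging window and the Gaussian smoothing width are illustrative choices, not necessarily the authors' exact settings (the paper also downsamples the input, e.g. to 64x64, which is left to the caller here).

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(gray):
    """Saliency map from the spectral residual of an (already downsampled) grayscale image."""
    f = np.fft.fft2(gray)
    log_amp = np.log(np.abs(f) + 1e-8)                      # log-spectrum
    phase = np.angle(f)
    residual = log_amp - uniform_filter(log_amp, size=3)    # spectral residual
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = gaussian_filter(sal, sigma=2.5)                   # smooth to form the spatial saliency map
    return sal / sal.max()
```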

3,464 citations


Journal ArticleDOI
TL;DR: The method is fast, does not require video alignment, and is applicable in many scenarios where the background is known; its robustness to partial occlusions, nonrigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action, and low-quality video is also demonstrated.
Abstract: Human action in video sequences can be seen as silhouettes of a moving torso and protruding limbs undergoing articulated motion. We regard human actions as three-dimensional shapes induced by the silhouettes in the space-time volume. We adopt a recent approach [14] for analyzing 2D shapes and generalize it to deal with volumetric space-time action shapes. Our method utilizes properties of the solution to the Poisson equation to extract space-time features such as local space-time saliency, action dynamics, shape structure, and orientation. We show that these features are useful for action recognition, detection, and clustering. The method is fast, does not require video alignment, and is applicable in (but not limited to) many scenarios where the background is known. Moreover, we demonstrate the robustness of our method to partial occlusions, nonrigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action, and low-quality video.
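As a rough illustration of the Poisson-equation idea (not the authors' implementation), the sketch below solves Laplacian(U) = -1 inside a binary space-time silhouette volume by Jacobi iteration; local properties of the solution U are what the paper turns into space-time saliency, dynamics, and orientation features. The iteration count is an arbitrary choice, and the silhouette is assumed not to touch the array border (np.roll wraps around otherwise).

```python
import numpy as np

def poisson_solution(mask, n_iter=300):
    """Solve Laplacian(U) = -1 inside a 3-D space-time silhouette (mask == True),
    with U = 0 outside the shape, by simple Jacobi iteration."""
    U = np.zeros(mask.shape, dtype=float)
    for _ in range(n_iter):
        nb = sum(np.roll(U, s, axis=a) for a in range(3) for s in (1, -1))
        U = np.where(mask, (nb + 1.0) / 6.0, 0.0)   # zero boundary condition outside the shape
    return U
```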

1,842 citations


Journal ArticleDOI
TL;DR: This paper attempts to provide a comprehensive survey of the recent technical achievements in high-level semantic-based image retrieval, identifying five major categories of the state-of-the-art techniques in narrowing down the 'semantic gap'.

1,713 citations


Proceedings ArticleDOI
17 Jun 2007
TL;DR: An unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions that alleviates the over-parameterization problems that plague purely supervised learning procedures, and yields good performance with very few labeled training samples.
Abstract: We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a feature-pooling layer that computes the max of each filter output within adjacent windows, and a point-wise sigmoid non-linearity. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level. Training a supervised classifier on these features yields 0.64% error on MNIST, and 54% average recognition rate on Caltech 101 with 30 training samples per category. While the resulting architecture is similar to convolutional networks, the layer-wise unsupervised training procedure alleviates the over-parameterization problems that plague purely supervised learning procedures, and yields good performance with very few labeled training samples.
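The forward pass of one stage as described (convolution filters, max pooling within adjacent windows, point-wise sigmoid) can be sketched in a few lines of NumPy/SciPy; the filter bank itself would come from the unsupervised sparse-coding step, which is not shown here, and a second level would be obtained by running the same stage on patches of these feature maps.

```python
import numpy as np
from scipy.signal import correlate2d

def feature_stage(image, filters, pool=2):
    """One feature-extraction stage: filter bank -> max pooling -> sigmoid."""
    maps = []
    for w in filters:                                   # each w is a small 2-D filter
        m = correlate2d(image, w, mode='valid')
        h, v = m.shape
        m = m[:h - h % pool, :v - v % pool]             # trim so the map tiles exactly
        m = m.reshape(m.shape[0] // pool, pool, m.shape[1] // pool, pool).max(axis=(1, 3))
        maps.append(1.0 / (1.0 + np.exp(-m)))           # point-wise sigmoid non-linearity
    return np.stack(maps)                               # stack of shift-tolerant feature maps
```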

1,232 citations


Journal ArticleDOI
TL;DR: A general tensor discriminant analysis (GTDA) is developed as a preprocessing step for LDA for face recognition and achieves good performance for gait recognition based on image sequences from the University of South Florida (USF) HumanID Database.
Abstract: Traditional image representations are not suited to conventional classification methods such as the linear discriminant analysis (LDA) because of the undersample problem (USP): the dimensionality of the feature space is much higher than the number of training samples. Motivated by the successes of the two-dimensional LDA (2DLDA) for face recognition, we develop a general tensor discriminant analysis (GTDA) as a preprocessing step for LDA. The benefits of GTDA, compared with existing preprocessing methods such as the principal components analysis (PCA) and 2DLDA, include the following: 1) the USP is reduced in subsequent classification by, for example, LDA, 2) the discriminative information in the training tensors is preserved, and 3) GTDA provides stable recognition rates because the alternating projection optimization algorithm to obtain a solution of GTDA converges, whereas that of 2DLDA does not. We use human gait recognition to validate the proposed GTDA. The averaged gait images are utilized for gait representation. Given the popularity of Gabor-function-based image decompositions for image understanding and object recognition, we develop three different Gabor-function-based image representations: 1) GaborD is the sum of Gabor filter responses over directions, 2) GaborS is the sum of Gabor filter responses over scales, and 3) GaborSD is the sum of Gabor filter responses over scales and directions. The GaborD, GaborS, and GaborSD representations are applied to the problem of recognizing people from their averaged gait images. A large number of experiments were carried out to evaluate the effectiveness (recognition rate) of gait recognition based on first obtaining a Gabor, GaborD, GaborS, or GaborSD image representation, then using GTDA to extract features and, finally, using LDA for classification. The proposed methods achieved good performance for gait recognition based on image sequences from the University of South Florida (USF) HumanID Database. Experimental comparisons are made with nine state-of-the-art classification methods in gait recognition.
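The GaborD/GaborS/GaborSD representations are simple reductions over a Gabor filter bank; a possible sketch using scikit-image's gabor filter is given below. The frequencies and number of orientations are placeholder values, and summing magnitude responses is one plausible reading of "sum of Gabor filter responses", not necessarily the paper's exact formulation.

```python
import numpy as np
from skimage.filters import gabor

def gabor_representations(image, frequencies=(0.1, 0.2, 0.3), n_orientations=8):
    """GaborD, GaborS, and GaborSD as described: sums of Gabor responses over
    directions, scales, or both (here using magnitude responses)."""
    thetas = [np.pi * k / n_orientations for k in range(n_orientations)]
    resp = np.zeros((len(frequencies), len(thetas)) + image.shape)
    for i, freq in enumerate(frequencies):
        for j, theta in enumerate(thetas):
            real, imag = gabor(image, frequency=freq, theta=theta)
            resp[i, j] = np.hypot(real, imag)
    gabor_d = resp.sum(axis=1)         # sum over directions: one map per scale
    gabor_s = resp.sum(axis=0)         # sum over scales: one map per direction
    gabor_sd = resp.sum(axis=(0, 1))   # sum over scales and directions: single map
    return gabor_d, gabor_s, gabor_sd
```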

1,160 citations


Journal ArticleDOI
TL;DR: This paper reviews recent research and development in pattern recognition- and non-pattern recognition-based myoelectric control, and presents state-of-the-art achievements in terms of their type, structure, and potential application.

1,111 citations


Reference BookDOI
29 Oct 2007
TL;DR: This book surveys supervised, unsupervised, and semi-supervised feature selection, outlines the key contributions and organization of the volume, and looks ahead to open problems in the field.
Abstract: Table of contents —
Preface

Introduction and Background
- Less Is More (Huan Liu and Hiroshi Motoda): Background and Basics; Supervised, Unsupervised, and Semi-Supervised Feature Selection; Key Contributions and Organization of the Book; Looking Ahead
- Unsupervised Feature Selection (Jennifer G. Dy): Introduction; Clustering; Feature Selection; Feature Selection for Unlabeled Data; Local Approaches; Summary
- Randomized Feature Selection (David J. Stracuzzi): Introduction; Types of Randomizations; Randomized Complexity Classes; Applying Randomization to Feature Selection; The Role of Heuristics; Examples of Randomized Selection Algorithms; Issues in Randomization; Summary
- Causal Feature Selection (Isabelle Guyon, Constantin Aliferis, and Andre Elisseeff): Introduction; Classical "Non-Causal" Feature Selection; The Concept of Causality; Feature Relevance in Bayesian Networks; Causal Discovery Algorithms; Examples of Applications; Summary, Conclusions, and Open Problems

Extending Feature Selection
- Active Learning of Feature Relevance (Emanuele Olivetti, Sriharsha Veeramachaneni, and Paolo Avesani): Introduction; Active Sampling for Feature Relevance Estimation; Derivation of the Sampling Benefit Function; Implementation of the Active Sampling Algorithm; Experiments; Conclusions and Future Work
- A Study of Feature Extraction Techniques Based on Decision Border Estimate (Claudia Diamantini and Domenico Potena): Introduction; Feature Extraction Based on Decision Boundary; Generalities about Labeled Vector Quantizers; Feature Extraction Based on Vector Quantizers; Experiments; Conclusions
- Ensemble-Based Variable Selection Using Independent Probes (Eugene Tuv, Alexander Borisov, and Kari Torkkola): Introduction; Tree Ensemble Methods in Feature Ranking; The Algorithm: Ensemble-Based Ranking against Independent Probes; Experiments; Discussion
- Efficient Incremental-Ranked Feature Selection in Massive Data (Roberto Ruiz, Jesus S. Aguilar-Ruiz, and Jose C. Riquelme): Introduction; Related Work; Preliminary Concepts; Incremental Performance over Ranking; Experimental Results; Conclusions

Weighting and Local Methods
- Non-Myopic Feature Quality Evaluation with (R)ReliefF (Igor Kononenko and Marko Robnik Sikonja): Introduction; From Impurity to Relief; ReliefF for Classification and RReliefF for Regression; Extensions; Interpretation; Implementation Issues; Applications; Conclusion
- Weighting Method for Feature Selection in k-Means (Joshua Zhexue Huang, Jun Xu, Michael Ng, and Yunming Ye): Introduction; Feature Weighting in k-Means; W-k-Means Clustering Algorithm; Feature Selection; Subspace Clustering with k-Means; Text Clustering; Related Work; Discussions
- Local Feature Selection for Classification (Carlotta Domeniconi and Dimitrios Gunopulos): Introduction; The Curse of Dimensionality; Adaptive Metric Techniques; Large Margin Nearest Neighbor Classifiers; Experimental Comparisons; Conclusions
- Feature Weighting through Local Learning (Yijun Sun): Introduction; Mathematical Interpretation of Relief; Iterative Relief Algorithm; Extension to Multiclass Problems; Online Learning; Computational Complexity; Experiments; Conclusion

Text Classification and Clustering
- Feature Selection for Text Classification (George Forman): Introduction; Text Feature Generators; Feature Filtering for Classification; Practical and Scalable Computation; A Case Study; Conclusion and Future Work
- A Bayesian Feature Selection Score Based on Naive Bayes Models (Susana Eyheramendy and David Madigan): Introduction; Feature Selection Scores; Classification Algorithms; Experimental Settings and Results; Conclusion
- Pairwise Constraints-Guided Dimensionality Reduction (Wei Tang and Shi Zhong): Introduction; Pairwise Constraints-Guided Feature Projection; Pairwise Constraints-Guided Co-Clustering; Experimental Studies; Conclusion and Future Work
- Aggressive Feature Selection by Feature Ranking (Masoud Makrehchi and Mohamed S. Kamel): Introduction; Feature Selection by Feature Ranking; Proposed Approach to Reducing Term Redundancy; Experimental Results; Summary

Feature Selection in Bioinformatics
- Feature Selection for Genomic Data Analysis (Lei Yu): Introduction; Redundancy-Based Feature Selection; Empirical Study; Summary
- A Feature Generation Algorithm with Applications to Biological Sequence Classification (Rezarta Islamaj Dogan, Lise Getoor, and W. John Wilbur): Introduction; Splice-Site Prediction; Feature Generation Algorithm; Experiments and Discussion; Conclusions
- An Ensemble Method for Identifying Robust Features for Biomarker Discovery (Diana Chan, Susan M. Bridges, and Shane C. Burgess): Introduction; Biomarker Discovery from Proteome Profiles; Challenges of Biomarker Identification; Ensemble Method for Feature Selection; Feature Selection Ensemble; Results and Discussion; Conclusion
- Model Building and Feature Selection with Genomic Data (Hui Zou and Trevor Hastie): Introduction; Ridge Regression, Lasso, and Bridge; Drawbacks of the Lasso; The Elastic Net; The Elastic-Net Penalized SVM; Sparse Eigen-Genes; Summary

Index

1,097 citations


Journal ArticleDOI
TL;DR: A novel method without the pure-pixel assumption is presented, referred to as the minimum volume constrained nonnegative matrix factorization (MVC-NMF), for unsupervised endmember extraction from highly mixed image data, which outperforms several other advanced endmember detection approaches.
Abstract: Endmember extraction is a process to identify the hidden pure source signals from the mixture. In the past decade, numerous algorithms have been proposed to perform this estimation. One commonly used assumption is the presence of pure pixels in the given image scene, which are detected to serve as endmembers. When such pixels are absent, the image is referred to as highly mixed data, for which these algorithms at best can only return certain data points that are close to the real endmembers. To overcome this problem, we present a novel method without the pure-pixel assumption, referred to as the minimum volume constrained nonnegative matrix factorization (MVC-NMF), for unsupervised endmember extraction from highly mixed image data. Two important facts are exploited: first, the spectral data are nonnegative; second, the simplex volume determined by the endmembers is the minimum among all possible simplexes that circumscribe the data scatter space. The proposed method takes advantage of the fast convergence of NMF schemes, and at the same time eliminates the pure-pixel assumption. The experimental results based on a set of synthetic mixtures and a real image scene demonstrate that the proposed method outperforms several other advanced endmember detection approaches.

870 citations


Proceedings ArticleDOI
01 Dec 2007
TL;DR: This paper employs a probabilistic neural network (PNN) with image and data processing techniques to implement general-purpose automated leaf recognition for plant classification with an accuracy greater than 90%.
Abstract: In this paper, we employ a probabilistic neural network (PNN) with image and data processing techniques to implement a general-purpose automated leaf recognition system for plant classification. Twelve leaf features are extracted and orthogonalized into five principal variables, which constitute the input vector of the PNN. The PNN is trained on 1800 leaves to classify 32 kinds of plants with an accuracy greater than 90%. Compared with other approaches, our algorithm is an accurate artificial intelligence approach that is fast in execution and easy to implement.
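A probabilistic neural network is essentially a Parzen-window classifier, so the pipeline in the abstract (12 extracted leaf features, orthogonalized to 5 principal components, then a PNN) can be approximated with the short sketch below; the smoothing parameter sigma and the use of scikit-learn's PCA are assumptions rather than the paper's exact settings, and the variable names in the usage comment are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

class PNN:
    """Minimal probabilistic neural network (Parzen-window classifier)."""
    def __init__(self, sigma=0.5):
        self.sigma = sigma

    def fit(self, X, y):
        self.X, self.y = np.asarray(X, float), np.asarray(y)
        self.classes_ = np.unique(self.y)
        return self

    def predict(self, X):
        X = np.asarray(X, float)
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(-1)  # squared distances to all patterns
        k = np.exp(-d2 / (2.0 * self.sigma ** 2))                 # Gaussian kernel activations
        scores = np.stack([k[:, self.y == c].mean(axis=1) for c in self.classes_], axis=1)
        return self.classes_[scores.argmax(axis=1)]

# Pipeline roughly mirroring the abstract (placeholder inputs to be supplied by the caller):
# X5 = PCA(n_components=5).fit_transform(leaf_features)   # leaf_features: (n_leaves, 12)
# model = PNN(sigma=0.5).fit(X5, species_labels)
```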

823 citations


Journal ArticleDOI
TL;DR: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed and two segmentation methods are considered.
Abstract: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed. A line detector, previously used in mammography, is applied to the green channel of the retinal image. It is based on the evaluation of the average grey level along lines of fixed length passing through the target pixel at different orientations. Two segmentation methods are considered. The first uses the basic line detector whose response is thresholded to obtain unsupervised pixel classification. As a further development, we employ two orthogonal line detectors along with the grey level of the target pixel to construct a feature vector for supervised classification using a support vector machine. The effectiveness of both methods is demonstrated through receiver operating characteristic analysis on two publicly available databases of color fundus images.
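A basic version of the line detector, evaluating the average grey level along fixed-length lines through each pixel at several orientations and subtracting the local window average, might look like the sketch below; the line length, number of orientations, and the fact that vessels appear dark in the green channel (so the image may need inverting before use) are details to adjust, and this omits the supervised variant with the SVM.

```python
import numpy as np
from scipy.ndimage import correlate, uniform_filter

def line_detector_response(green, length=15, n_orient=12):
    """Basic line detector: average grey level along a fixed-length line through each
    pixel, maximized over orientations, minus the local window average."""
    green = np.asarray(green, float)
    responses = []
    c = length // 2
    for k in range(n_orient):
        theta = np.pi * k / n_orient
        kernel = np.zeros((length, length))
        for t in np.linspace(-c, c, 4 * length):        # rasterize a line through the centre
            r = int(round(c + t * np.sin(theta)))
            col = int(round(c + t * np.cos(theta)))
            kernel[r, col] = 1.0
        kernel /= kernel.sum()
        responses.append(correlate(green, kernel))       # average grey level along the line
    return np.max(responses, axis=0) - uniform_filter(green, size=length)
```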

819 citations


Journal ArticleDOI
TL;DR: A new feature selection strategy based on rough sets and particle swarm optimization (PSO) is proposed; PSO does not need complex operators such as crossover and mutation, requires only primitive and simple mathematical operators, and is computationally inexpensive in terms of both memory and runtime.
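Since only the TL;DR is shown, the following is a generic binary-PSO feature-selection sketch consistent with that description, using only primitive arithmetic operators; the fitness callback (in the paper, a rough-set dependency measure on the selected attributes) is left as a user-supplied function, and the parameter values are conventional defaults, not the authors'.

```python
import numpy as np

def binary_pso_feature_selection(fitness, n_features, n_particles=20, n_iter=50,
                                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO over feature subsets. `fitness(mask)` scores a boolean mask, e.g.
    a rough-set dependency of the decision attribute on the selected features."""
    rng = np.random.default_rng(seed)
    X = (rng.random((n_particles, n_features)) < 0.5).astype(float)
    V = rng.normal(0.0, 1.0, (n_particles, n_features))
    pbest = X.copy()
    pbest_val = np.array([fitness(x.astype(bool)) for x in X])
    gbest = pbest[pbest_val.argmax()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)   # only +, -, * (no crossover/mutation)
        X = (rng.random(X.shape) < 1.0 / (1.0 + np.exp(-V))).astype(float)
        vals = np.array([fitness(x.astype(bool)) for x in X])
        better = vals > pbest_val
        pbest[better], pbest_val[better] = X[better], vals[better]
        gbest = pbest[pbest_val.argmax()].copy()
    return gbest.astype(bool)          # selected-feature mask
```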

Proceedings Article
01 Jan 2007
TL;DR: An overview of the MIRtoolbox, an integrated set of Matlab functions dedicated to extracting musical features (related, among others, to timbre, tonality, rhythm, and form) from audio files.
Abstract: We present the MIRtoolbox, an integrated set of functions written in Matlab, dedicated to the extraction of musical features from audio files. The design is based on a modular framework: the different algorithms are decomposed into stages, formalized using a minimal set of elementary mechanisms, and integrating different variants proposed by alternative approaches (including new strategies we have developed) that users can select and parametrize. This paper offers an overview of the set of features, related, among others, to timbre, tonality, rhythm, or form, that can be extracted with the MIRtoolbox. One particular analysis is provided as an example. The toolbox also includes functions for statistical analysis, segmentation, and clustering. Particular attention has been paid to the design of a syntax that offers both simplicity of use and transparent adaptiveness to a multiplicity of possible input types. Each feature extraction method can accept as argument an audio file, or any preliminary result from intermediary stages of the chain of operations. The same syntax can also be used for analyses of single audio files, batches of files, series of audio segments, multi-channel signals, etc. For that purpose, the data and methods of the toolbox are organised in an object-oriented architecture.

1 Motivation and Approach. MIRtoolbox is a Matlab toolbox dedicated to the extraction of musically related features from audio recordings. It has been designed in particular with the objective of enabling the computation of a large range of features from databases of audio files, which can be subjected to statistical analyses. Few software tools have been proposed in this area. One particularity of our approach lies in the use of the Matlab computing environment, which offers good visualisation capabilities and gives access to a large variety of other toolboxes. In particular, the MIRtoolbox makes use of functions available in public-domain toolboxes such as the Auditory Toolbox [6], NetLab [5], and SOMtoolbox [10]. Other toolboxes, such as the Statistics toolbox or the Neural Network toolbox from MathWorks, can be directly used for further analyses of the features extracted by MIRtoolbox without having to export the data from one software package to another. Such a computational framework, because of its general objectives, could be useful to the research community in Music Information Retrieval (MIR), but also for educational purposes. For that reason, particular attention has been paid to the ease of use of the toolbox. In particular, complex analytic processes can be designed using a very simple syntax, whose expressive power comes from the use of an object-oriented paradigm. The different musical features extracted from the audio files are highly interdependent: in particular, as can be seen in Figure 1, some features are based on the same initial computations. In order to improve computational efficiency, it is important to avoid redundant computations of these common components. Each of these intermediary components, and the final musical features, are therefore considered as building blocks that can be freely articulated with each other. Besides, in keeping with the objective of optimal ease of use of the toolbox, each building block has been conceived so that it can adapt to the type of input data. For instance, the computation of the MFCCs can be based on the waveform of the initial audio signal, or on intermediary representations such as the spectrum or the mel-scale spectrum (see Figure 1). Similarly, autocorrelation is computed for different ranges of delays depending on the type of input data (audio waveform, envelope, spectrum). This decomposition of all feature extraction algorithms into a common set of building blocks has the advantage of offering a synthetic overview of the different approaches studied in this domain of research.

2 Feature Extraction. 2.1 Feature overview. Figure 1 shows an overview of the main features implemented in the toolbox. All the different processes start from the audio signal (on the left) and form a chain of operations proceeding to the right. Each musical feature is related to one of the musical dimensions traditionally defined in music theory. Boldface characters highlight features related to pitch and tonality, bold italics indicate features related to rhythm, and simple italics highlight a large set of features that can be associated with timbre and dynamics. [The excerpt breaks off here; the remaining text is residue of Figure 1, whose feature chain includes the audio signal waveform, zero-crossing rate, RMS energy, envelope, low energy rate, attack slope, attack time, envelope autocorrelation, tempo, and onsets.]

Journal ArticleDOI
TL;DR: The reported results demonstrate the feasibility of designing a new intelligent diagnosis-assistance system based on the proposed method.

Journal ArticleDOI
TL;DR: The proposed methods are successfully applied to face recognition, and the experiment results on the large-scale FERET and CAS-PEAL databases show that the proposed algorithms significantly outperform other well-known systems in terms of recognition rate.
Abstract: A novel object descriptor, histogram of Gabor phase pattern (HGPP), is proposed for robust face recognition. In HGPP, the quadrant-bit codes are first extracted from faces based on the Gabor transformation. Global Gabor phase pattern (GGPP) and local Gabor phase pattern (LGPP) are then proposed to encode the phase variations. GGPP captures the variations derived from the orientation changing of Gabor wavelet at a given scale (frequency), while LGPP encodes the local neighborhood variations by using a novel local XOR pattern (LXP) operator. They are both divided into the nonoverlapping rectangular regions, from which spatial histograms are extracted and concatenated into an extended histogram feature to represent the original image. Finally, the recognition is performed by using the nearest-neighbor classifier with histogram intersection as the similarity measurement. The features of HGPP lie in two aspects: 1) HGPP can describe the general face images robustly without the training procedure; 2) HGPP encodes the Gabor phase information, while most previous face recognition methods exploit the Gabor magnitude information. In addition, Fisher separation criterion is further used to improve the performance of HGPP by weighing the subregions of the image according to their discriminative powers. The proposed methods are successfully applied to face recognition, and the experiment results on the large-scale FERET and CAS-PEAL databases show that the proposed algorithms significantly outperform other well-known systems in terms of recognition rate
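The local XOR pattern (LXP) step lends itself to a short sketch: given a 2-D map of quadrant bits (e.g. the sign of the real or imaginary Gabor response at one scale and orientation), each pixel is XOR-ed with its eight neighbours to form an 8-bit code, and spatial histograms are concatenated over non-overlapping regions. The 4x4 region grid and the reduction to a single bit map are illustrative assumptions; the full HGPP descriptor combines many such maps across scales and orientations.

```python
import numpy as np

def local_xor_pattern(bits):
    """bits: 2-D 0/1 array of quadrant-bit codes (sign bits of a Gabor response).
    Returns an 8-bit LXP code for each interior pixel."""
    b = np.asarray(bits, dtype=np.int32)
    c = b[1:-1, 1:-1]
    code = np.zeros(c.shape, dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for k, (dr, dc) in enumerate(offsets):
        nb = b[1 + dr:b.shape[0] - 1 + dr, 1 + dc:b.shape[1] - 1 + dc]
        code |= (c ^ nb) << k                     # XOR with each neighbour sets one bit
    return code

def regional_histograms(code, grid=(4, 4)):
    """Concatenate 256-bin histograms over non-overlapping rectangular regions."""
    h, w = code.shape
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = code[i * h // grid[0]:(i + 1) * h // grid[0],
                         j * w // grid[1]:(j + 1) * w // grid[1]]
            feats.append(np.bincount(block.ravel(), minlength=256))
    return np.concatenate(feats)                  # matched with histogram intersection
```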

Journal ArticleDOI
TL;DR: An active near infrared (NIR) imaging system is presented that is able to produce face images of good condition regardless of visible lights in the environment, and it is shown that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone.
Abstract: Most current face recognition systems are designed for indoor, cooperative-user applications. However, even in thus-constrained applications, most existing systems, academic and commercial, are compromised in accuracy by changes in environmental illumination. In this paper, we present a novel solution for illumination invariant face recognition for indoor, cooperative-user applications. First, we present an active near infrared (NIR) imaging system that is able to produce face images of good condition regardless of visible lights in the environment. Second, we show that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone; based on this, we use local binary pattern (LBP) features to compensate for the monotonic transform, thus deriving an illumination invariant face representation. Then, we present methods for face recognition using NIR images; statistical learning algorithms are used to extract most discriminative features from a large pool of invariant LBP features and construct a highly accurate face matching engine. Finally, we present a system that is able to achieve accurate and fast face recognition in practice, in which a method is provided to deal with specular reflections of active NIR lights on eyeglasses, a critical issue in active NIR image-based face recognition. Extensive, comparative results are provided to evaluate the imaging hardware, the face and eye detection algorithms, and the face recognition algorithms and systems, with respect to various factors, including illumination, eyeglasses, time lapse, and ethnic groups

Proceedings ArticleDOI
26 Dec 2007
TL;DR: An interactive framework for soft segmentation and matting of natural images and videos is presented, based on the optimal, linear time, computation of weighted geodesic distances to the user-provided scribbles, from which the whole data is automatically segmented.
Abstract: An interactive framework for soft segmentation and matting of natural images and videos is presented in this paper. The proposed technique is based on the optimal, linear time, computation of weighted geodesic distances to the user-provided scribbles, from which the whole data is automatically segmented. The weights are based on spatial and/or temporal gradients, without explicit optical flow or any advanced and often computationally expensive feature detectors. These could be naturally added to the proposed framework as well if desired, in the form of weights in the geodesic distances. A localized refinement step follows this fast segmentation in order to accurately compute the corresponding matte function. Additional constraints into the distance definition permit to efficiently handle occlusions such as people or objects crossing each other in a video sequence. The presentation of the framework is complemented with numerous and diverse examples, including extraction of moving foreground from dynamic background, and comparisons with the recent literature.
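The core of the method, computing weighted geodesic distances from the scribbles and assigning each pixel to the closest scribble label, can be sketched with Dijkstra's algorithm on the pixel grid; here the edge weight is simply the absolute intensity difference (a stand-in for the paper's spatial/temporal gradient weights), and the localized matting refinement and occlusion constraints are omitted.

```python
import heapq
import numpy as np

def geodesic_labels(image, scribbles):
    """Assign each pixel the label of the scribble with the smallest weighted geodesic
    distance (Dijkstra on the 4-connected pixel grid). image: 2-D array;
    scribbles: dict mapping label -> list of (row, col) seed pixels."""
    img = np.asarray(image, float)
    h, w = img.shape
    dist_maps = {}
    for lab, seeds in scribbles.items():
        dist = np.full((h, w), np.inf)
        heap = []
        for r, c in seeds:
            dist[r, c] = 0.0
            heapq.heappush(heap, (0.0, (r, c)))
        while heap:
            du, (r, c) = heapq.heappop(heap)
            if du > dist[r, c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w:
                    step = abs(img[rr, cc] - img[r, c])     # intensity-gradient edge weight
                    if du + step < dist[rr, cc]:
                        dist[rr, cc] = du + step
                        heapq.heappush(heap, (du + step, (rr, cc)))
        dist_maps[lab] = dist
    labels = list(dist_maps)
    stacked = np.stack([dist_maps[lab] for lab in labels])
    return np.array(labels)[stacked.argmin(axis=0)]         # hard labels; matting would refine these
```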


Journal ArticleDOI
TL;DR: A new worst-case metric is proposed for predicting practical system performance in the absence of matching failures, and the worst-case theoretical equal error rate (EER) is predicted to be as low as 2.59 × 10^-4 on the available data sets.

Journal ArticleDOI
Abstract: This paper presents a novel iris coding method based on differences of discrete cosine transform (DCT) coefficients of overlapped angular patches from normalized iris images. The feature extraction capabilities of the DCT are optimized on the two largest publicly available iris image data sets, 2,156 images of 308 eyes from the CASIA database and 2,955 images of 150 eyes from the Bath database. On this data, we achieve 100 percent correct recognition rate (CRR) and perfect receiver-operating characteristic (ROC) curves with no registered false accepts or rejects. Individual feature bit and patch position parameters are optimized for matching through a product-of-sum approach to Hamming distance calculation. For verification, a variable threshold is applied to the distance metric and the false acceptance rate (FAR) and false rejection rate (FRR) are recorded. A new worst-case metric is proposed for predicting practical system performance in the absence of matching failures, and the worst-case theoretical equal error rate (EER) is predicted to be as low as 2.59 × 10^-4 on the available data sets.
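A much simplified reading of the coding scheme (1-D DCTs of overlapped angular patches, binarized differences of the coefficient vectors, Hamming-distance matching) is sketched below; the patch sizes, the overlap step, and the reduction of each patch to a 1-D profile are illustrative assumptions rather than the paper's optimized parameters, and the product-of-sum matching is not reproduced.

```python
import numpy as np
from scipy.fft import dct

def iris_code(normalized_iris, patch_h=8, patch_w=12, step=6):
    """Binary code from the signs of differences between DCT coefficient vectors of
    overlapped angular patches (a simplified reading of the coding scheme)."""
    rows, cols = normalized_iris.shape
    bands = []
    for r in range(0, rows - patch_h + 1, patch_h):
        coeffs = []
        for c in range(0, cols - patch_w + 1, step):                # overlapped angular patches
            patch = normalized_iris[r:r + patch_h, c:c + patch_w]
            coeffs.append(dct(patch.mean(axis=1), norm='ortho'))     # 1-D DCT of the patch profile
        diffs = np.diff(np.array(coeffs), axis=0)                    # differences between neighbouring patches
        bands.append((diffs > 0).astype(np.uint8).ravel())
    return np.concatenate(bands)

def hamming_distance(code_a, code_b):
    return float(np.mean(code_a != code_b))                          # fraction of disagreeing bits
```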

Journal ArticleDOI
TL;DR: A fully automatic face recognition algorithm that is multimodal (2D and 3D) and performs hybrid (feature based and holistic) matching in order to achieve efficiency and robustness to facial expressions is presented.
Abstract: We present a fully automatic face recognition algorithm and demonstrate its performance on the FRGC v2.0 data. Our algorithm is multimodal (2D and 3D) and performs hybrid (feature based and holistic) matching in order to achieve efficiency and robustness to facial expressions. The pose of a 3D face along with its texture is automatically corrected using a novel approach based on a single automatically detected point and the Hotelling transform. A novel 3D spherical face representation (SFR) is used in conjunction with the scale-invariant feature transform (SIFT) descriptor to form a rejection classifier, which quickly eliminates a large number of candidate faces at an early stage for efficient recognition in case of large galleries. The remaining faces are then verified using a novel region-based matching approach, which is robust to facial expressions. This approach automatically segments the eyes-forehead and the nose regions, which are relatively less sensitive to expressions, and matches them separately using a modified iterative closest point (ICP) algorithm. The results of all the matching engines are fused at the metric level to achieve higher accuracy. We use the FRGC benchmark to compare our results to other algorithms that used the same database. Our multimodal hybrid algorithm performed better than others by achieving 99.74 percent and 98.31 percent verification rates at a 0.001 false acceptance rate (FAR) and identification rates of 99.02 percent and 95.37 percent for probes with a neutral and a nonneutral expression, respectively.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: A hierarchical model that can be characterized as a constellation of bags-of-features and that is able to combine both spatial and spatial-temporal features is proposed and shown to improve the classification performance over bag of feature models.
Abstract: We present a novel model for human action categorization. A video sequence is represented as a collection of spatial and spatial-temporal features by extracting static and dynamic interest points. We propose a hierarchical model that can be characterized as a constellation of bags-of-features and that is able to combine both spatial and spatial-temporal features. Given a novel video sequence, the model is able to categorize human actions in a frame-by-frame basis. We test the model on a publicly available human action dataset [2] and show that our new method performs well on the classification task. We also conducted control experiments to show that the use of the proposed mixture of hierarchical models improves the classification performance over bag of feature models. An additional experiment shows that using both dynamic and static features provides a richer representation of human actions when compared to the use of a single feature type, as demonstrated by our evaluation in the classification task.

Proceedings ArticleDOI
01 Apr 2007
TL;DR: It is found that the CDP-based detector and the HMM-based classifier can detect and classify incoming signals at a range of low SNRs.
Abstract: Spectrum awareness is currently one of the most challenging problems in cognitive radio (CR) design. Detection and classification of very low SNR signals with relaxed information on the signal parameters being detected is critical for proper CR functionality as it enables the CR to react and adapt to the changes in its radio environment. In this work, the cycle frequency domain profile (CDP) is used for signal detection and preprocessing for signal classification. Signal features are extracted from CDP using a threshold-test method. For classification, a Hidden Markov Model (HMM) has been used to process extracted signal features due to its robust pattern-matching capability. We also investigate the effects of varied observation length on signal detection and classification. It is found that the CDP-based detector and the HMM-based classifier can detect and classify incoming signals at a range of low SNRs.

Proceedings ArticleDOI
26 Dec 2007
TL;DR: This work proposes a solution to the problem of scene summarization by examining the distribution of images in the collection to select a set of canonical views to form the scene summary, using clustering techniques on visual features.
Abstract: We formulate the problem of scene summarization as selecting a set of images that efficiently represents the visual content of a given scene. The ideal summary presents the most interesting and important aspects of the scene with minimal redundancy. We propose a solution to this problem using multi-user image collections from the Internet. Our solution examines the distribution of images in the collection to select a set of canonical views to form the scene summary, using clustering techniques on visual features. The summaries we compute also lend themselves naturally to the browsing of image collections, and can be augmented by analyzing user-specified image tag data. We demonstrate the approach using a collection of images of the city of Rome, showing the ability to automatically decompose the images into separate scenes, and identify canonical views for each scene.
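As a toy illustration of the canonical-view selection step only (not the full pipeline, which works from matched local features across a multi-user photo collection and user tags), one can cluster per-image descriptors and keep the image nearest each cluster centre; the descriptor matrix and the number of views are assumptions supplied by the caller.

```python
import numpy as np
from sklearn.cluster import KMeans

def canonical_views(descriptors, n_views=6):
    """Cluster per-image visual descriptors (one row per image) and return, for each
    cluster, the index of the image closest to the cluster centre."""
    descriptors = np.asarray(descriptors, float)
    km = KMeans(n_clusters=n_views, n_init=10, random_state=0).fit(descriptors)
    summary = []
    for k in range(n_views):
        members = np.where(km.labels_ == k)[0]
        d = np.linalg.norm(descriptors[members] - km.cluster_centers_[k], axis=1)
        summary.append(int(members[d.argmin()]))
    return summary            # indices of the images forming the scene summary
```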

Journal ArticleDOI
TL;DR: A method for fault diagnosis based on empirical mode decomposition (EMD), an improved distance evaluation technique, and the combination of multiple adaptive neuro-fuzzy inference systems (ANFISs) is proposed; the results show that the multiple-ANFIS combination can reliably recognise different fault categories and severities.

Proceedings ArticleDOI
14 May 2007
TL;DR: It is found that the discriminatively trained CRF performs as well as or better than an HMM even when the model features do not violate the independence assumptions of the HMM, and that CRFs remain robust against degradation in performance when features depend on observations from many time steps.
Abstract: Activity recognition is a key component for creating intelligent, multi-agent systems. Intrinsically, activity recognition is a temporal classification problem. In this paper, we compare two models for temporal classification: hidden Markov models (HMMs), which have long been applied to the activity recognition problem, and conditional random fields (CRFs). CRFs are discriminative models for labeling sequences. They condition on the entire observation sequence, which avoids the need for independence assumptions between observations. Conditioning on the observations vastly expands the set of features that can be incorporated into the model without violating its assumptions. Using data from a simulated robot tag domain, chosen because it is multi-agent and produces complex interactions between observations, we explore the differences in performance between the discriminatively trained CRF and the generative HMM. Additionally, we examine the effect of incorporating features which violate independence assumptions between observations; such features are typically necessary for high classification accuracy. We find that the discriminatively trained CRF performs as well as or better than an HMM even when the model features do not violate the independence assumptions of the HMM. In cases where features depend on observations from many time steps, we confirm that CRFs are robust against any degradation in performance.

Journal ArticleDOI
TL;DR: This paper presents a method for classification of structural brain magnetic resonance (MR) images, by using a combination of deformation-based morphometry and machine learning methods, which demonstrates not only high classification accuracy but also good stability.
Abstract: This paper presents a method for classification of structural brain magnetic resonance (MR) images, by using a combination of deformation-based morphometry and machine learning methods. A morphological representation of the anatomy of interest is first obtained using a high-dimensional mass-preserving template warping method, which results in tissue density maps that constitute local tissue volumetric measurements. Regions that display strong correlations between tissue volume and classification (clinical) variables are extracted using a watershed segmentation algorithm, taking into account the regional smoothness of the correlation map which is estimated by a cross-validation strategy to achieve robustness to outliers. A volume increment algorithm is then applied to these regions to extract regional volumetric features, from which a feature selection technique using support vector machine (SVM)-based criteria is used to select the most discriminative features, according to their effect on the upper bound of the leave-one-out generalization error. Finally, SVM-based classification is applied using the best set of features, and it is tested using a leave-one-out cross-validation strategy. The results on MR brain images of healthy controls and schizophrenia patients demonstrate not only high classification accuracy (91.8% for female subjects and 90.8% for male subjects), but also good stability with respect to the number of features selected and the size of SVM kernel used

Book
26 Dec 2007
TL;DR: This first up-to-date textbook for machine vision software provides all the details on the theory and practical use of the relevant algorithms, and features real-world examples, example code with HALCON, and further exercises.
Abstract: This first up-to-date textbook for machine vision software provides all the details on the theory and practical use of the relevant algorithms. The first part covers image acquisition, including illumination, lenses, cameras, frame grabbers, and bus systems, while the second deals with the algorithms themselves. This includes data structures, image enhancement and transformations, segmentation, feature extraction, morphology, template matching, stereo reconstruction, and camera calibration. The final part concentrates on applications, and features real-world examples, example code with HALCON, and further exercises. Uniting the latest research results with an industrial approach, this textbook is ideal for students of electrical engineering, physics and informatics, electrical and mechanical engineers, as well as those working in the sensor, automation and optical industries. Free software available with registration code

Journal ArticleDOI
TL;DR: This paper presents a novel approach to solve the supervised dimensionality reduction problem by encoding an image object as a general tensor of second or even higher order, and proposes a discriminant tensor criterion, whereby multiple interrelated lower dimensional discriminative subspaces are derived for feature extraction.
Abstract: There is a growing interest in subspace learning techniques for face recognition; however, the excessive dimension of the data space often brings the algorithms into the curse of dimensionality dilemma. In this paper, we present a novel approach to solve the supervised dimensionality reduction problem by encoding an image object as a general tensor of second or even higher order. First, we propose a discriminant tensor criterion, whereby multiple interrelated lower dimensional discriminative subspaces are derived for feature extraction. Then, a novel approach, called k-mode optimization, is presented to iteratively learn these subspaces by unfolding the tensor along different tensor directions. We call this algorithm multilinear discriminant analysis (MDA), which has the following characteristics: 1) multiple interrelated subspaces can collaborate to discriminate different classes, 2) for classification problems involving higher order tensors, the MDA algorithm can avoid the curse of dimensionality dilemma and alleviate the small sample size problem, and 3) the computational cost in the learning stage is reduced to a large extent owing to the reduced data dimensions in k-mode optimization. We provide extensive experiments on ORL, CMU PIE, and FERET databases by encoding face images as second- or third-order tensors to demonstrate that the proposed MDA algorithm based on higher order tensors has the potential to outperform the traditional vector-based subspace learning algorithms, especially in the cases with small sample sizes

Journal ArticleDOI
TL;DR: An open source software tool, SuperHirn, that comprises a set of modules to process LC‐MS data acquired on a high resolution mass spectrometer, which automatically detects profiling trends in an unsupervised manner and is able to associate proteins to their correct theoretical dilution profile.
Abstract: Label-free quantification of high mass resolution LC-MS data has emerged as a promising technology for proteome analysis. Computational methods are required for the accurate extraction of peptide signals from LC-MS data and the tracking of these features across the measurements of different samples. We present here an open source software tool, SuperHirn, that comprises a set of modules to process LC-MS data acquired on a high resolution mass spectrometer. The program includes newly developed functionalities to analyze LC-MS data such as feature extraction and quantification, LC-MS similarity analysis, LC-MS alignment of multiple datasets, and intensity normalization. These program routines extract profiles of measured features and comprise tools for clustering and classification analysis of the profiles. SuperHirn was applied in an MS1-based profiling approach to a benchmark LC-MS dataset of complex protein mixtures with defined concentration changes. We show that the program automatically detects profiling trends in an unsupervised manner and is able to associate proteins to their correct theoretical dilution profile.

Journal ArticleDOI
Yijun Sun
TL;DR: This paper proposes an iterative RELIEF (I-RELIEF) algorithm to alleviate the deficiencies of RELIEF by exploring the framework of the expectation-maximization algorithm.
Abstract: RELIEF is considered one of the most successful algorithms for assessing the quality of features. In this paper, we propose a set of new feature weighting algorithms that perform significantly better than RELIEF, without introducing a large increase in computational complexity. Our work starts from a mathematical interpretation of the seemingly heuristic RELIEF algorithm as an online method solving a convex optimization problem with a margin-based objective function. This interpretation explains the success of RELIEF in real applications and enables us to identify and address two of its weaknesses: RELIEF makes an implicit assumption that the nearest neighbors found in the original feature space are also the nearest neighbors in the weighted feature space, and it lacks a mechanism to deal with outlier data. We propose an iterative RELIEF (I-RELIEF) algorithm to alleviate these deficiencies by exploring the framework of the expectation-maximization algorithm. We extend I-RELIEF to multiclass settings by using a new multiclass margin definition. To reduce computational costs, an online learning algorithm is also developed. Convergence analysis of the proposed algorithms is presented. The results of large-scale experiments on the UCI and microarray data sets are reported, which demonstrate the effectiveness of the proposed algorithms and verify the presented theoretical results.
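For context, the classical RELIEF update that the paper reinterprets as online margin-based optimization is only a few lines, as sketched below; I-RELIEF replaces the hard nearest hit/miss with probabilistic (expectation-maximization style) weightings, which is not shown here, and the Manhattan distance and sampling scheme are conventional choices rather than the paper's exact setup.

```python
import numpy as np

def relief_weights(X, y, n_samples=None, seed=0):
    """Classical RELIEF: for sampled instances, reward features that differ more
    from the nearest miss (other class) than from the nearest hit (same class)."""
    X = np.asarray(X, float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for i in rng.permutation(n)[: n_samples or n]:
        diff = np.abs(X - X[i])
        dist = diff.sum(axis=1)
        dist[i] = np.inf                                # exclude the instance itself
        hit_pool = np.where(y == y[i])[0]
        miss_pool = np.where(y != y[i])[0]
        hit = hit_pool[dist[hit_pool].argmin()]
        miss = miss_pool[dist[miss_pool].argmin()]
        w += diff[miss] - diff[hit]                     # margin-based weight update
    return w                                            # larger weight = more relevant feature
```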

Proceedings ArticleDOI
17 Jun 2007
TL;DR: A time-efficient action detection method based on dynamic learning of subspaces for tensor CCA for the case that actions are not aligned in the space-time domain is proposed.
Abstract: We introduce a new framework, namely tensor canonical correlation analysis (TCCA) which is an extension of classical canonical correlation analysis (CCA) to multidimensional data arrays (or tensors) and apply this for action/gesture classification in videos. By tensor CCA, joint space-time linear relationships of two video volumes are inspected to yield flexible and descriptive similarity features of the two videos. The TCCA features are combined with a discriminative feature selection scheme and a nearest neighbor classifier for action classification. In addition, we propose a time-efficient action detection method based on dynamic learning of subspaces for tensor CCA for the case that actions are not aligned in the space-time domain. The proposed method delivered significantly better accuracy and comparable detection speed over state-of-the-art methods on the KTH action data set as well as self-recorded hand gesture data sets.