Showing papers on "Mahalanobis distance" published in 2017


Journal ArticleDOI
TL;DR: The R package pcadapt performs genome scans to detect genes under selection based on population genomic data and is compared to other computer programs for genome scans, finding that pcadapt and hapflk are the most powerful in scenarios of population divergence and range expansion.
Abstract: The R package pcadapt performs genome scans to detect genes under selection based on population genomic data. It assumes that candidate markers are outliers with respect to how they are related to population structure. Because population structure is ascertained with principal component analysis, the package is fast and works with large-scale data. It can handle missing data and pooled sequencing data. In contrast to population-based approaches, the package handles admixed individuals and does not require grouping individuals into populations. Since its first release, pcadapt has evolved in terms of both statistical approach and software implementation. We present results obtained with robust Mahalanobis distance, which is a new statistic for genome scans available in the 2.0 and later versions of the package. When hierarchical population structure occurs, Mahalanobis distance is more powerful than the communality statistic that was implemented in the first version of the package. Using simulated data, we compare pcadapt to other computer programs for genome scans (BayeScan, hapflk, OutFLANK, sNMF). We find that the proportion of false discoveries is around a nominal false discovery rate set at 10% with the exception of BayeScan that generates 40% of false discoveries. We also find that the power of BayeScan is severely impacted by the presence of admixed individuals whereas pcadapt is not impacted. Last, we find that pcadapt and hapflk are the most powerful in scenarios of population divergence and range expansion. Because pcadapt handles next-generation sequencing data, it is a valuable tool for data analysis in molecular ecology.
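The outlier idea in this abstract can be illustrated with a minimal Python sketch. It is not the pcadapt implementation: it assumes only two principal components, uses the classical (non-robust) mean and covariance in place of the robust estimates the abstract mentions, and works on made-up per-marker z-scores. The names `mahalanobis_sq_2d`, `chi2_pvalue_2df`, and `z_scores` are hypothetical. With two components, the chi-square upper-tail probability has the closed form exp(-d²/2), so no statistics library is needed.

```python
import math

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Classical sample covariance entries (assumption: a robust estimator
    # would be used in a real genome scan).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^{-1} (dx, dy)^T using the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

def chi2_pvalue_2df(d2):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-d2 / 2.0)

# Hypothetical z-scores of nine markers on two principal components;
# the last marker is placed far from the others on purpose.
z_scores = [(0.1, -0.2), (-0.5, 0.4), (0.3, 0.6), (-0.7, -0.3), (0.8, -0.5),
            (-0.2, 0.9), (0.4, 0.1), (-0.6, -0.8), (4.0, 3.5)]
d2 = mahalanobis_sq_2d(z_scores)
p_values = [chi2_pvalue_2df(v) for v in d2]
candidate = min(range(len(p_values)), key=p_values.__getitem__)
```

Markers with the smallest p-values are the candidates. In practice the p-values would then go through a false discovery rate procedure, which this sketch leaves out.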

594 citations


Journal ArticleDOI
TL;DR: A discriminative deep multi-metric learning method to jointly learn multiple neural networks, under which the correlation of different features of each sample is maximized, and the distance of each positive pair is reduced and that of each negative pair is enlarged.
Abstract: This paper presents a new discriminative deep metric learning (DDML) method for face and kinship verification in wild conditions. While metric learning has achieved reasonably good performance in face and kinship verification, most existing metric learning methods aim to learn a single Mahalanobis distance metric to maximize the inter-class variations and minimize the intra-class variations, which cannot capture the nonlinear manifold on which face images usually lie. To address this, we propose a DDML method to train a deep neural network to learn a set of hierarchical nonlinear transformations to project face pairs into the same latent feature space, under which the distance of each positive pair is reduced and that of each negative pair is enlarged. To better use the commonality of multiple feature descriptors to make all the features more robust for face and kinship verification, we develop a discriminative deep multi-metric learning method to jointly learn multiple neural networks, under which the correlation of different features of each sample is maximized, and the distance of each positive pair is reduced and that of each negative pair is enlarged. Extensive experimental results show that our proposed methods achieve acceptable results in both face and kinship verification.
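As background for the "single Mahalanobis distance metric" mentioned in this abstract: a learned Mahalanobis metric with matrix M = LᵀL is the same as the Euclidean distance after the linear map x → Lx, which is why it cannot follow a nonlinear manifold. The Python sketch below checks that equivalence on a made-up 2×2 matrix. The function names and the values of `L`, `x`, and `y` are hypothetical and not taken from the paper.

```python
import math

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in A]

def gram(L):
    """Return M = L^T L, which is positive semidefinite by construction."""
    rows, cols = len(L), len(L[0])
    return [[sum(L[k][i] * L[k][j] for k in range(rows)) for j in range(cols)]
            for i in range(cols)]

def distance_via_projection(x, y, L):
    """Euclidean norm of L(x - y): distance measured after the linear map."""
    diff = [a - b for a, b in zip(x, y)]
    return math.sqrt(sum(c * c for c in matvec(L, diff)))

def distance_via_quadratic_form(x, y, M):
    """sqrt((x - y)^T M (x - y)): the usual Mahalanobis-metric form."""
    diff = [a - b for a, b in zip(x, y)]
    return math.sqrt(sum(d * m for d, m in zip(diff, matvec(M, diff))))

L = [[2.0, 0.0], [1.0, 1.0]]   # hypothetical learned transformation
M = gram(L)                    # [[5, 1], [1, 1]]
x, y = [1.0, 2.0], [0.0, 0.0]
d_proj = distance_via_projection(x, y, L)       # both equal sqrt(13)
d_quad = distance_via_quadratic_form(x, y, M)
```

Replacing the fixed matrix `L` with a stack of nonlinear layers is, loosely speaking, the step from linear metric learning to the deep variant described above.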

264 citations


Journal ArticleDOI
TL;DR: A metric transfer learning framework (MTLF) is proposed to encode metric learning in transfer learning to make knowledge transfer across domains more effective, and general solutions to both classification and regression problems are developed on top of MTLF.
Abstract: Transfer learning has been proven to be effective for problems where training data from a source domain and test data from a target domain are drawn from different distributions. To reduce the distribution divergence between the source domain and the target domain, many previous studies have focused on designing and optimizing objective functions with the Euclidean distance to measure dissimilarity between instances. However, in some real-world applications, the Euclidean distance may be inappropriate to capture the intrinsic similarity or dissimilarity between instances. To deal with this issue, in this paper, we propose a metric transfer learning framework (MTLF) to encode metric learning in transfer learning. In MTLF, instance weights are learned and exploited to bridge the distributions of different domains, while Mahalanobis distance is learned simultaneously to minimize the intra-class distances and maximize the inter-class distances for the target domain. Unlike previous work where instance weights and Mahalanobis distance are trained in a pipelined framework that potentially leads to error propagation across different components, MTLF attempts to learn instance weights and a Mahalanobis distance in a parallel framework to make knowledge transfer across domains more effective. Furthermore, we develop general solutions to both classification and regression problems on top of MTLF, respectively. We conduct extensive experiments on several real-world datasets on object recognition, handwriting recognition, and WiFi location to verify the effectiveness of MTLF compared with a number of state-of-the-art methods.

170 citations


Journal ArticleDOI
TL;DR: A novel pairwise similarity measure that advances existing models by i) expanding traditional linear projections into affine transformations and ii) fusing affine Mahalanobis distance and Cosine similarity by a data-driven combination is presented.
Abstract: Cross-domain visual data matching is one of the fundamental problems in many real-world vision tasks, e.g., matching persons across ID photos and surveillance videos. Conventional approaches to this problem usually involve two steps: i) projecting samples from different domains into a common space, and ii) computing (dis-)similarity in this space based on a certain distance. In this paper, we present a novel pairwise similarity measure that advances existing models by i) expanding traditional linear projections into affine transformations and ii) fusing affine Mahalanobis distance and Cosine similarity by a data-driven combination. Moreover, we unify our similarity measure with feature representation learning via deep convolutional neural networks. Specifically, we incorporate the similarity measure matrix into the deep architecture, enabling an end-to-end way of model optimization. We extensively evaluate our generalized similarity model in several challenging cross-domain matching tasks: person re-identification under different views and face verification over different modalities (i.e., faces from still images and videos, older and younger faces, and sketch and photo portraits). The experimental results demonstrate superior performance of our model over other state-of-the-art methods.

143 citations


Journal ArticleDOI
TL;DR: A novel semisupervised JITL framework is proposed for soft sensor modeling for nonlinear processes, which is based on semisupervised weighted probabilistic principal component regression (SWPPCR), and the effectiveness and flexibility of the proposed method are demonstrated.
Abstract: Just-in-time learning (JITL) is a commonly used technique for industrial soft sensing of nonlinear processes. However, traditional JITL approaches mainly focus on equal sample sizes between process (input) variables and quality (output) variables, which may not be practical in industrial processes since quality variables are usually much harder to obtain than other process variables. In order to handle unequal-length datasets with only a few labeled data, a novel semisupervised JITL framework is proposed for soft sensor modeling for nonlinear processes, which is based on semisupervised weighted probabilistic principal component regression (SWPPCR). In the new semisupervised JITL framework, the traditional Mahalanobis distance and a newly proposed scaled Mahalanobis distance are used for similarity measurement and weight assignment. By selecting the most relevant labeled and unlabeled samples and assigning them the corresponding weights, a local SWPPCR can be built to estimate the output variables of the query sample. Case studies are carried out to evaluate the prediction performance of the proposed semisupervised JITL framework on a numerical example and an industrial process. The effectiveness and flexibility of the proposed method are demonstrated by the prediction results.

134 citations


Posted Content
TL;DR: psmatch2 as discussed by the authors implements full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. This routine supersedes the previous 'psmatch' routine of B. Sianesi.
Abstract: psmatch2 implements full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. This routine supersedes the previous 'psmatch' routine of B. Sianesi. The April 2012 revision of pstest changes the syntax of that command.

109 citations


Journal ArticleDOI
TL;DR: The basis of tephrochronology as a chronostratigraphic correlational and dating tool for palaeoenvironmental, geological, and archaeological research is summarized and recent advances in analytical methods used to determine the major, minor, and trace elements of individual glass shards from tephra or cryptotephra deposits are documented.

92 citations


Journal ArticleDOI
TL;DR: A robust similarity measure for two attributed scattering center (ASC) sets is proposed and applied to synthetic aperture radar (SAR) automatic target recognition (ATR); experimental results on the moving and stationary target acquisition and recognition (MSTAR) dataset verify the validity and robustness of the proposed method.

91 citations


Journal ArticleDOI
TL;DR: Experimental results show that EED outperforms existing methods that estimate Euclidean distances in an indirect manner and the application of EED to the Minimal Learning Machine (MLM), a distance-based supervised learning method, provides promising results.

79 citations


Journal ArticleDOI
TL;DR: A new region growing algorithm for the automated segmentation of both planar and non-planar surfaces in point clouds is presented, capable of more accurately estimating point normals located in highly curved regions or near sharp features.

65 citations


Journal ArticleDOI
TL;DR: An improved electrocardiogram (ECG) beats classification system is proposed, which is based on Fuzzy C-Means (FCM) clustering algorithm, and Mahalanobis Distance (MD) is used in the proposed model in order to improve the distance measurement procedure.

Journal ArticleDOI
15 Apr 2017-Wear
TL;DR: In this paper, a new approach for detecting tool wear in milling process using multi-sensor signals and Mahalanobis-Taguchi system (MTS) is presented.

Posted Content
TL;DR: A Bayesian analysis is presented, where the posterior for the marginal likelihood is obtained, using $k$th nearest-neighbour distances in parameter space, using the Mahalanobis distance metric, under the assumption that the points in the chain (thinned if required) are independent.
Abstract: In this paper, we present a method for computing the marginal likelihood, also known as the model likelihood or Bayesian evidence, from Markov Chain Monte Carlo (MCMC), or other sampled posterior distributions. In order to do this, one needs to be able to estimate the density of points in parameter space, and this can be challenging in high numbers of dimensions. Here we present a Bayesian analysis, where we obtain the posterior for the marginal likelihood, using $k$th nearest-neighbour distances in parameter space, using the Mahalanobis distance metric, under the assumption that the points in the chain (thinned if required) are independent. We generalise the algorithm to apply to importance-sampled chains, where each point is assigned a weight. We illustrate this with an idealised posterior of known form with an analytic marginal likelihood, and show that for chains of length $\sim 10^5$ points, the technique is effective for parameter spaces with up to $\sim 20$ dimensions. We also argue that $k=1$ is the optimal choice, and discuss failure modes for the algorithm. In a companion paper (Heavens et al. 2017) we apply the technique to the main MCMC chains from the 2015 Planck analysis of cosmic background radiation data, to infer that quantitatively the simplest 6-parameter flat $\Lambda$CDM standard model of cosmology is preferred over all extensions considered.

Journal ArticleDOI
TL;DR: The resultant method, called the covariance matrix self-adaptation with repelling subpopulations (RS-CMSA), is assessed and compared to several state-of-the-art niching methods on a standard test suite for multimodal optimization.
Abstract: During the recent decades, many niching methods have been proposed and empirically verified on some available test problems. They often rely on some particular assumptions associated with the distribution, shape, and size of the basins, which can seldom be made in practical optimization problems. This study utilizes several existing concepts and techniques, such as taboo points, normalized Mahalanobis distance, and Ursem's hill-valley function in order to develop a new tool for multimodal optimization, which does not make any of these assumptions. In the proposed method, several subpopulations explore the search space in parallel. Offspring of a subpopulation are forced to maintain a sufficient distance to the center of fitter subpopulations and the previously identified basins, which are marked as taboo points. The taboo points repel the subpopulation to prevent convergence to the same basin. A strategy to update the repelling power of the taboo points is proposed to address the challenge of basins of dissimilar size. The local shape of a basin is also approximated by the distribution of the subpopulation members converging to that basin. The proposed niching strategy is incorporated into the covariance matrix self-adaptation evolution strategy (CMSA-ES), a potent global optimization method. The resultant method, called the covariance matrix self-adaptation with repelling subpopulations (RS-CMSA), is assessed and compared to several state-of-the-art niching methods on a standard test suite for multimodal optimization. An organized procedure for parameter setting is followed which assumes a rough estimation of the desired/expected number of minima available. Performance sensitivity to the accuracy of this estimation is also studied by introducing the concept of robust mean peak ratio. Based on the numerical results using the available and the introduced performance measures, RS-CMSA emerges as the most successful method when robustness and efficiency are considered at the same time.

Journal ArticleDOI
06 Jun 2017-Sensors
TL;DR: The results are promising and demonstrate that the system would be able to position users with these reasonable values of accuracy and precision.
Abstract: This paper presents a study of a positioning system that provides advanced information services based on Wi-Fi and Bluetooth Low Energy (BLE) technologies. It uses Wi-Fi for rough positioning and BLE for fine positioning. It is designed for use in public transportation system stations and terminals where the conditions are “hostile” or unfavourable due to signal noise produced by the continuous movement of passengers and buses, data collection conducted in the constant presence thereof, multipath fading, non-line of sight (NLOS) conditions, the fact that part of the wireless communication infrastructure has already been deployed and positioned in a way that may not be optimal for positioning purposes, variable humidity conditions, etc. The ultimate goal is to provide a service that may be used to assist people with special needs. We present experimental results based on scene analysis; the main distance metric used was the Euclidean distance but the Mahalanobis distance was also used in one case. The algorithm employed to compare fingerprints was the weighted k-nearest neighbor one. For Wi-Fi, with only three visible access points, accuracy ranged from 3.94 to 4.82 m, and precision from 5.21 to 7.0 m 90% of the time. With respect to BLE, with a low beacon density (1 beacon per 45.7 m²), accuracy ranged from 1.47 to 2.15 m, and precision from 1.81 to 3.58 m 90% of the time. Taking into account the fact that this system is designed to work in real situations in a scenario with high environmental fluctuations, and comparing the results with others obtained in laboratory scenarios, our results are promising and demonstrate that the system would be able to position users with these reasonable values of accuracy and precision.
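The fingerprint comparison described in this abstract can be sketched as a weighted k-nearest-neighbour estimate in Python. This is an illustration under stated assumptions, not the authors' system: the weights are taken as inverse Euclidean distance in signal space (the abstract does not specify the weighting), and the fingerprint database, signal strengths, and the name `wknn_position` are made up.

```python
import math

def wknn_position(query_rss, fingerprints, k=3, eps=1e-9):
    """Estimate (x, y) as the inverse-distance-weighted mean of the
    positions of the k fingerprints closest to the query in signal space."""
    scored = []
    for rss, pos in fingerprints:
        # Euclidean distance between received-signal-strength vectors.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(query_rss, rss)))
        scored.append((d, pos))
    scored.sort(key=lambda item: item[0])
    nearest = scored[:k]
    # eps avoids division by zero when the query matches a fingerprint exactly.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return x, y

# Hypothetical radio map: (signal strengths in dBm from three transmitters,
# surveyed position in metres).
radio_map = [
    ([-40.0, -70.0, -80.0], (0.0, 0.0)),
    ([-70.0, -40.0, -80.0], (10.0, 0.0)),
    ([-80.0, -70.0, -40.0], (0.0, 10.0)),
    ([-60.0, -60.0, -60.0], (5.0, 5.0)),
]
estimate = wknn_position([-55.0, -55.0, -90.0], radio_map, k=2)
```

Swapping the Euclidean distance for a Mahalanobis distance would only change the line that computes `d`. The covariance of the signal-strength readings would then have to be estimated from the survey data.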

Journal ArticleDOI
TL;DR: In this article, the authors identify a quantitative measure for a priori estimation of prediction confidence in data-driven turbulence modeling, which represents the distance in feature space between the training flows and the flow to be predicted.
Abstract: Although Reynolds-Averaged Navier–Stokes (RANS) equations are still the dominant tool for engineering design and analysis applications involving turbulent flows, standard RANS models are known to be unreliable in many flows of engineering relevance, including flows with separation, strong pressure gradients or mean flow curvature. With increasing amounts of 3-dimensional experimental data and high fidelity simulation data from Large Eddy Simulation (LES) and Direct Numerical Simulation (DNS), data-driven turbulence modeling has become a promising approach to increase the predictive capability of RANS simulations. However, the prediction performance of data-driven models inevitably depends on the choices of training flows. This work aims to identify a quantitative measure for a priori estimation of prediction confidence in data-driven turbulence modeling. This measure represents the distance in feature space between the training flows and the flow to be predicted. Specifically, the Mahalanobis distance and the kernel density estimation (KDE) technique are used as metrics to quantify the distance between flow data sets in feature space. To examine the relationship between these two extrapolation metrics and the machine learning model prediction performance, the flow over periodic hills at Re = 10595 is used as test set and seven flows with different configurations are individually used as training sets. The results show that the prediction error of the Reynolds stress anisotropy is positively correlated with Mahalanobis distance and KDE distance, demonstrating that both extrapolation metrics can be used to estimate the prediction confidence a priori. A quantitative comparison using correlation coefficients shows that the Mahalanobis distance is less accurate in estimating the prediction confidence than KDE distance. The extrapolation metrics introduced in this work and the corresponding analysis provide an approach to aid in the choice of data source and to assess the prediction performance for data-driven turbulence modeling.

Journal ArticleDOI
TL;DR: Three new metrics that can be used to identify outliers in multivariate space, while making no strong assumptions about the distribution of the data are developed and implemented in the R package minotaur.
Abstract: Genome scans are widely used to identify 'outliers' in genomic data: loci with different patterns compared with the rest of the genome due to the action of selection or other nonadaptive forces of evolution. These genomic data sets are often high dimensional, with complex correlation structures among variables, making it a challenge to identify outliers in a robust way. The Mahalanobis distance has been widely used, but has the major limitation of assuming that data follow a simple parametric distribution. Here, we develop three new metrics that can be used to identify outliers in multivariate space, while making no strong assumptions about the distribution of the data. These metrics are implemented in the R package minotaur, which also includes an interactive web-based application for visualizing outliers in high-dimensional data sets. We illustrate how these metrics can be used to identify outliers from simulated genetic data and discuss some of the limitations they may face in application.

Journal ArticleDOI
TL;DR: A new method for view-invariant action recognition that utilizes the temporal position of skeletal joints obtained by Kinect sensor and is capable of recognizing both the voluntary and involuntary actions, as well as pose-based and trajectory-based ones with a high accuracy rate.
Abstract: This paper proposes a new method for view-invariant action recognition that utilizes the temporal position of skeletal joints obtained by Kinect sensor. In this method, the actions are represented as sequences of several pre-defined poses. After pre-processing, which includes skeleton alignment and scaling, the appropriate feature vectors are obtained for recognizing and discriminating the pose of every frame by the proposed Fisherposes method. The proposed regularized Mahalanobis distance metric is used in order to recognize both the involuntary and highly made-up actions at the same time. Hidden Markov model (HMM) is then used to classify the action related to an input sequence of poses. For taking into account the motion in the actions which are not separable by solely their temporal poses, histograms of trajectories are also proposed. The proposed action recognition method is capable of recognizing both the voluntary and involuntary actions, as well as pose-based and trajectory-based ones with a high accuracy rate. The effectiveness of the proposed method is evaluated on three publicly available data sets: TST fall detection, UTKinect, and UCFKinect.

Journal ArticleDOI
28 Mar 2017-PLOS ONE
TL;DR: It was found that least-cost and resistance distance were not linearly related unless a transformation was applied and the metric used to infer movement or gene flow and the manipulations applied to the data used to calculate these metrics may govern findings.
Abstract: Least-cost modelling and circuit theory are common analogs used in ecology and evolution to model gene flow or animal movement across landscapes. Least-cost modelling estimates the least-cost distance, whereas circuit theory estimates resistance distance. The bias added in choosing one method over the other has not been well documented. We designed an experiment to test whether both methods were linearly related. We also tested the sensitivity of these metrics to variation in Euclidean distance, spatial autocorrelation, the number of pixels representing the landscape, and data aggregation. We found that least-cost and resistance distance were not linearly related unless a transformation was applied. Resistance distance was less sensitive to the number of pixels representing a landscape and was also less sensitive than least-cost distance to the Euclidean distance between nodes. Spatial autocorrelation did not affect either method or the relationship between methods. Resistance distance was more sensitive to aggregation in any form compared to least-cost distance. Therefore, the metric used to infer movement or gene flow and the manipulations applied to the data used to calculate these metrics may govern findings.

Journal ArticleDOI
TL;DR: In this paper, the authors examined the weighted Mahalanobis distance based similarity measure applied to CBR cost estimation, and carried out comparative research on the existing distance measurement methods of CBR.

Journal ArticleDOI
TL;DR: In this article, the Wasserstein distance is used to detect non-autonomous dynamics from a Lorenz system driven by seasonal cycles and a warming trend in a toy example.
Abstract: The climate system can be described by a dynamical system and its associated attractor. The dynamics of this attractor depends on the external forcings that influence the climate. Such forcings can affect the mean values or variances, but regions of the attractor that are seldom visited can also be affected. It is an important challenge to measure how the climate attractor responds to different forcings. Currently, the Euclidean distance or similar measures like the Mahalanobis distance have been favored to measure discrepancies between two climatic situations. Those distances do not have a natural building mechanism to take into account the attractor dynamics. In this paper, we argue that a Wasserstein distance, stemming from optimal transport theory, offers an efficient and practical way to discriminate between dynamical systems. After treating a toy example, we explore how the Wasserstein distance can be applied and interpreted to detect non-autonomous dynamics from a Lorenz system driven by seasonal cycles and a warming trend.
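To give a feel for why a Wasserstein distance sees differences that a comparison of mean values misses, here is a minimal Python sketch for the one-dimensional case. It relies on a standard fact: for two samples of equal size on the real line, the 1-Wasserstein distance is the mean absolute difference between the sorted values. The paper works with multivariate attractor states, so this is a simplification, and the sample values and the name `wasserstein_1d` are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size samples on the real line.

    Sorting both samples pairs the i-th smallest value of one with the
    i-th smallest value of the other, which is the optimal transport plan in 1-D.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles samples of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Two hypothetical samples with the same mean but different shapes:
# one concentrated at zero, one split between two regimes.
concentrated = [0.0, 0.0, 0.0, 0.0]
two_regimes = [-1.0, -1.0, 1.0, 1.0]

mean_gap = abs(sum(concentrated) / 4 - sum(two_regimes) / 4)  # 0.0
w1 = wasserstein_1d(concentrated, two_regimes)                # 1.0
```

The difference of means is zero here, while the Wasserstein distance is one. This mirrors the abstract's point that forcings may change how the attractor is visited without shifting its mean.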

Journal ArticleDOI
Han Gao, Yunwei Tang, Linhai Jing, Hui Li, Haifeng Ding
24 Oct 2017-Sensors
TL;DR: This study proposes a novel unsupervised evaluation method to quantitatively measure the quality of segmentation results and verified the effectiveness of the proposed method and demonstrated the reliability and improvements of this method with respect to other methods.
Abstract: The segmentation of a high spatial resolution remote sensing image is a critical step in geographic object-based image analysis (GEOBIA). Evaluating the performance of segmentation without ground truth data, i.e., unsupervised evaluation, is important for the comparison of segmentation algorithms and the automatic selection of optimal parameters. This unsupervised strategy currently faces several challenges in practice, such as difficulties in designing effective indicators and limitations of the spectral values in the feature representation. This study proposes a novel unsupervised evaluation method to quantitatively measure the quality of segmentation results to overcome these problems. In this method, multiple spectral and spatial features of images are first extracted simultaneously and then integrated into a feature set to improve the quality of the feature representation of ground objects. The indicators designed for spatial stratified heterogeneity and spatial autocorrelation are included to estimate the properties of the segments in this integrated feature set. These two indicators are then combined into a global assessment metric as the final quality score. The trade-offs of the combined indicators are accounted for using a strategy based on the Mahalanobis distance, which can be exhibited geometrically. The method is tested on two segmentation algorithms and three testing images. The proposed method is compared with two existing unsupervised methods and a supervised method to confirm its capabilities. Through comparison and visual analysis, the results verified the effectiveness of the proposed method and demonstrated the reliability and improvements of this method with respect to other methods.

Journal ArticleDOI
TL;DR: A composite measure of upper extremity proprioception is outlined to provide a single continuous outcome measure of proprioceptive function for use in clinical trials of rehabilitation.
Abstract: Proprioception is the sense of the position and movement of our limbs, and is vital for executing coordinated movements. Proprioceptive disorders are common following stroke, but clinical tests for measuring impairments in proprioception are simple ordinal scales that are unreliable and relatively crude. We developed and validated specific kinematic parameters to quantify proprioception and compared two common metrics, Euclidean and Mahalanobis distances, to combine these parameters into an overall summary score of proprioception. We used the KINARM robotic exoskeleton to assess proprioception of the upper limb in subjects with stroke (N = 285; mean days post-stroke = 12 ± 15). Two aspects of proprioception (position sense and kinesthetic sense) were tested using two mirror-matching tasks without vision. The tasks produced 12 parameters to quantify position sense and eight to quantify kinesthesia. The Euclidean and Mahalanobis distances of the z-scores for these parameters were computed each for position sense, kinesthetic sense, and overall proprioceptive function (average score of position and kinesthetic sense). A high proportion of stroke subjects were impaired on position matching (57%), kinesthetic matching (65%), and overall proprioception (62%). Robotic tasks were significantly correlated with clinical measures of upper extremity proprioception, motor impairment, and overall functional independence. Composite scores derived from the Euclidean distance and Mahalanobis distance showed strong content validity as they were highly correlated (r = 0.97–0.99). We have outlined a composite measure of upper extremity proprioception to provide a single continuous outcome measure of proprioceptive function for use in clinical trials of rehabilitation. Multiple aspects of proprioception including sense of position, direction, speed, and amplitude of movement were incorporated into this measure. Despite similarities in the scores obtained with these two distance metrics, the Mahalanobis distance was preferred.
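How the two composite scores can differ is easiest to see with two correlated parameters. The Python sketch below is a toy illustration and not the study's scoring procedure: it assumes just two z-scored parameters with a known correlation `rho` in the reference population, and all values and function names are hypothetical. For two dimensions the inverse correlation matrix has a closed form, so the squared Mahalanobis distance is (z1² - 2·rho·z1·z2 + z2²) / (1 - rho²).

```python
import math

def euclidean_score(z):
    """Euclidean distance of a z-score vector from the origin."""
    return math.sqrt(sum(v * v for v in z))

def mahalanobis_score_2d(z, rho):
    """Mahalanobis distance of a 2-D z-score vector when the two
    parameters have correlation rho in the reference population."""
    z1, z2 = z
    return math.sqrt((z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho))

rho = 0.8                         # hypothetical correlation between parameters
consistent = (2.0, 2.0)           # both parameters elevated together
inconsistent = (2.0, -2.0)        # pattern that contradicts the correlation

e_consistent = euclidean_score(consistent)
e_inconsistent = euclidean_score(inconsistent)
m_consistent = mahalanobis_score_2d(consistent, rho)
m_inconsistent = mahalanobis_score_2d(inconsistent, rho)
```

Both patterns get the same Euclidean score. The Mahalanobis score is larger for the pattern that runs against the correlation, because that combination is rarer in the reference population. With `rho = 0` the two scores coincide.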

Journal ArticleDOI
TL;DR: The experimental comparison results of the improved manifold learning algorithm and the traditional algorithm prove that the proposed method is more effective in rolling element bearing fault diagnosis.
Abstract: Fault features can be extracted by traditional manifold learning algorithms, which construct neighborhood graphs by Euclidean distance (ED). It is difficult to get an excellent dimensionality reduction result when the processed data has strong correlations. In order to improve the effect of dimensionality reduction and increase the accuracy of bearing fault diagnosis in mechanical systems, an improved manifold learning method based on Mahalanobis distance (MD) is proposed. In this paper, we use time-domain analysis and frequency-domain analysis to construct high-dimensional feature vectors in the first step. Then, MD is used to replace ED in neighborhood construction of manifold learning. After using the improved manifold learning method, low-dimensional feature vectors can be extracted. Finally, fault diagnosis of rolling element bearing can be made by applying the K-nearest neighbor classifier. In the experimental part, to verify the efficiency of the improved manifold learning methods, artificial data sets and rolling element bearing fault data are adopted. The experimental comparison results of the improved manifold learning algorithm and the traditional algorithm prove that the proposed method is more effective in rolling element bearing fault diagnosis.

Journal ArticleDOI
TL;DR: In this article, a Mahalanobis distance-based damage detection method is studied and compared to the well-known subspace-based approach in the context of two large case studies, in which the joint features of the methods are concluded in a control chart in an attempt to enhance the resolution of the damage detection.

Journal ArticleDOI
TL;DR: Results demonstrate that the time domain Correlation Coefficient is the most robust method while the Discrete Wavelet Transform is the elected one between the transform-based methods tested.

Journal ArticleDOI
Xiangjun Du, Fengjing Shao, Shunyao Wu, Hanlin Zhang, Si Xu
TL;DR: This work analyzes correlations between water quality variables and proposes an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance, which is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
Abstract: Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data-driven approaches for water quality assessment. Cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal waters of the Bohai Sea and North Yellow Sea of China, and apply the clustering results to evaluate their water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
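
To make the contrast between the two clustering variants concrete, the following is a minimal sketch of agglomerative clustering in which the pairwise distance is the Mahalanobis distance. It is not the paper's code: single linkage is an assumption (the abstract does not name a linkage), and the helper names, toy samples, and covariance values are hypothetical.

```python
def covariance(rows):
    """Return (mean vector, unbiased sample covariance matrix)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis(u, v, cov):
    """sqrt((u - v)^T cov^{-1} (u - v)); cov must be non-singular."""
    diff = [a - b for a, b in zip(u, v)]
    return sum(d * s for d, s in zip(diff, solve(cov, diff))) ** 0.5


def agglomerative(points, n_clusters, cov):
    """Single-linkage agglomerative clustering (linkage is an assumption)."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return min(mahalanobis(points[i], points[j], cov) for i in a for j in b)

    while len(clusters) > n_clusters:
        _, a, b = min((linkage(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical samples of two correlated variables (assumed values).
samples = [(0.0, 0.0), (1.0, 1.0), (0.6, -0.6), (1.6, 0.4)]
# Assumed covariance; in practice it would come from covariance(data)[1].
cov = [[1.0, 0.9], [0.9, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]   # identity covariance gives Euclidean

print("Mahalanobis:", agglomerative(samples, 2, cov))
print("Euclidean:  ", agglomerative(samples, 2, identity))
```

Under the assumed covariance the two metrics group the same four samples differently, which mirrors the abstract's point that correlations between variables can change the clustering result.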

Journal ArticleDOI
TL;DR: In this article, the spectral information potential of images captured with an unmanned aerial vehicle (UAV) is assessed in the context of crop-weed discrimination, and the spectral mixings in the pixels are modeled, based on an image with a 60 mm spatial resolution, to estimate the impact of the resolution on the ability to discriminate small plants.
Abstract: This study aimed to assess the spectral information potential of images captured with an unmanned aerial vehicle, in the context of crop–weed discrimination. A model is proposed in which the entire image acquisition chain is simulated in order to compute the digital values of image pixels according to several parameters (light, plant characteristics, optical filters, sensors…) to reproduce in-field acquisition conditions. The spectral mixings in the pixels are modeled, based on an image with a 60 mm spatial resolution, to estimate the impact of the resolution on the ability to discriminate small plants. The classification potential (i.e. the ability to separate two classes) in soil and vegetation and in monocotyledon and dicotyledon classes is studied using simulations for different vegetation rates (defined as the proportion of vegetation covering the surface projected in the considered pixel). The classification is unsupervised and based on the Mahalanobis distance computation. The results of soil-vegetation discrimination show that pixels with low vegetation rates can be classified as vegetation: pixels with vegetation rate greater than 0.5 had a probability to be correctly classified between 80 and 100%. Classification between monocotyledonous and dicotyledonous plants requires pixels with a high vegetation rate: to obtain a probability to be correctly classified better than 80%, vegetation rates in the pixels have to be over 0.9. To compare the results with data from real images, the same classification was tested on multispectral images of a weed infested field. The comparison confirmed the ability of the model to assess vegetation–soil and crop–weed discrimination potential for specific sensors (such as the multiSPEC 4C sensor, AIRINOV, Paris, France), where the acquisition chain parameters can be tested.

Journal ArticleDOI
TL;DR: An improved tracking algorithm that combines a mean shift tracker with an online learning-based detector and a Kalman filter; it can reinitialize the target when the tracker converges to a local minimum, and it can cope with scale changes, occlusions and appearance changes.
Abstract: A new tracking method combining a mean shift tracker with an online learning-based detector and a Kalman filter. A Mahalanobis distance-based validation region for reduction of calculation time. Target model update scheme for long-term tracking. Experiments on eight challenging video sequences to compare against state-of-the-art methods. Demonstration of superiority in terms of accuracy and speed. Color-based visual object tracking is one of the most commonly used tracking methods. Among many tracking methods, the mean shift tracker is used most often because it is simple to implement and consumes less computational time. However, mean shift trackers exhibit several limitations when used for long-term tracking. In challenging conditions that include occlusions, pose variations, scale changes, and illumination changes, the mean shift tracker does not work well. In this paper, an improved tracking algorithm based on a mean shift tracker is proposed to overcome the weaknesses of existing methods based on the mean shift tracker. The main contributions of this paper are to integrate the mean shift tracker with an online learning-based detector and to newly define the Kalman filter-based validation region for reducing the computational burden of the detector. We combine the mean shift tracker with the online learning-based detector, and integrate the Kalman filter to develop a novel tracking algorithm. The proposed algorithm can reinitialize the target when it converges to a local minimum, and it can cope with scale changes, occlusions and appearance changes by using the online learning-based detector. It updates the target model for the tracker in order to ensure long-term tracking. Moreover, the validation region obtained by using the Kalman filter and the Mahalanobis distance is used in order to operate the detector in real time. Through a comparison against various mean shift tracker-based methods and other state-of-the-art methods on eight challenging video sequences, we demonstrate that the proposed algorithm is efficient and superior in terms of accuracy and speed. Hence, it is expected that the proposed method can be applied to various applications which need to detect and track an object in real time.
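
A Kalman filter-based validation region is commonly expressed as a gate on the squared Mahalanobis distance between a candidate measurement and the predicted measurement. The sketch below shows that textbook form for 2D positions; whether the paper uses exactly this formulation is an assumption, and the function names, predicted position, innovation covariance, and candidate positions are hypothetical.

```python
import math


def gate_threshold_2d(prob):
    """Chi-square quantile for 2 degrees of freedom: -2 * ln(1 - prob)."""
    return -2.0 * math.log(1.0 - prob)


def squared_mahalanobis_2d(z, z_pred, s):
    """(z - z_pred)^T S^{-1} (z - z_pred) using the closed-form 2x2 inverse."""
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    (a, b), (c, d) = s
    det = a * d - b * c
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def candidates_in_region(candidates, z_pred, s, prob=0.99):
    """Keep only candidate positions that fall inside the validation region."""
    gate = gate_threshold_2d(prob)
    return [z for z in candidates
            if squared_mahalanobis_2d(z, z_pred, s) <= gate]


# Hypothetical values: predicted target position (pixels) and innovation
# covariance S as a Kalman filter would provide them.
z_pred = (100.0, 50.0)
s = [[25.0, 0.0], [0.0, 4.0]]
candidates = [(104.0, 51.0), (100.0, 58.0), (112.0, 50.0)]

kept = candidates_in_region(candidates, z_pred, s)
print(kept)  # candidates the detector would still need to examine
```

Restricting the detector to positions inside the gate is what reduces its workload: with the assumed values, the candidate that is 8 pixels off along the low-variance axis is rejected, while one that is 12 pixels off along the high-variance axis is kept.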

Journal ArticleDOI
TL;DR: In this paper, an extension of the PCMD index proposed for sensor fault diagnosis in linear systems to the nonlinear case is proposed, which is based on Mahalanobis distance and moving window kernel principal component analysis (MWKPCA) technique.
Abstract: This paper suggests an extension of the PCMD index, proposed for sensor fault diagnosis in linear systems, to the nonlinear case. The proposed index is entitled KPCMD, and it is based on Mahalanobis distance and the moving window kernel principal component analysis (MWKPCA) technique. The principle of this index is to detect dissimilarity between a reference KPCA model that represents normal operation of the system and a current KPCA model that represents current system behavior. The proposed KPCMD index computes the Mahalanobis distance between the principal components corresponding to the reference KPCA model and those corresponding to the current KPCA model, which are obtained online using MWKPCA. The proposed KPCMD indices have been applied successfully to the monitoring of a numerical example as well as a continuous stirred tank reactor (CSTR).