
Showing papers on "Mahalanobis distance published in 2003"


Posted Content
TL;DR: psmatch2 as discussed by the authors implements full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. This routine supersedes the previous 'psmatch' routine of B. Sianesi.
Abstract: psmatch2 implements full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. This routine supersedes the previous 'psmatch' routine of B. Sianesi. The April 2012 revision of pstest changes the syntax of that command.

1,887 citations


Journal ArticleDOI
TL;DR: This work considers a binary classification problem where the mean and covariance matrix of each class are assumed to be known, and addresses the issue of robustness with respect to estimation errors via a simple modification of the input data.
Abstract: When constructing a classifier, the probability of correct classification of future data points should be maximized. We consider a binary classification problem where the mean and covariance matrix of each class are assumed to be known. No further assumptions are made with respect to the class-conditional distributions. Misclassification probabilities are then controlled in a worst-case setting: that is, under all possible choices of class-conditional densities with given mean and covariance matrix, we minimize the worst-case (maximum) probability of misclassification of future data points. For a linear decision boundary, this desideratum is translated in a very direct way into a (convex) second order cone optimization problem, with complexity similar to a support vector machine problem. The minimax problem can be interpreted geometrically as minimizing the maximum of the Mahalanobis distances to the two classes. We address the issue of robustness with respect to estimation errors (in the means and covariances of the classes) via a simple modification of the input data. We also show how to exploit Mercer kernels in this setting to obtain nonlinear decision boundaries, yielding a classifier which proves to be competitive with current methods, including support vector machines. An important feature of this method is that a worst-case bound on the probability of misclassification of future data is always obtained explicitly.

508 citations
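The explicit worst-case guarantee mentioned at the end of the abstract can be sketched numerically. The following is a minimal numpy illustration (not the authors' code): for a linear boundary w @ x = b and a class with mean mu and covariance cov, a Chebyshev-type bound says the worst-case probability of landing on the wrong side is 1/(1 + d^2), where d is the Mahalanobis distance from the mean to the hyperplane.

```python
import numpy as np

def worst_case_error(w, b, mu, cov):
    """Worst-case probability that a point drawn from *any* distribution
    with mean `mu` and covariance `cov` falls on the wrong side of the
    hyperplane w @ x = b, assuming the mean lies on the correct side.
    This is the Chebyshev-type bound 1 / (1 + d**2), with d the
    Mahalanobis distance from the mean to the hyperplane."""
    d = max(0.0, (w @ mu - b) / np.sqrt(w @ cov @ w))
    return 1.0 / (1.0 + d ** 2)

# Class mean two Mahalanobis units from the boundary x[0] = 0.
bound = worst_case_error(np.array([1.0, 0.0]), 0.0,
                         np.array([2.0, 0.0]), np.eye(2))
print(bound)  # 0.2
```

Here the mean sits two Mahalanobis units from the boundary, so at most 20% of that class's mass can fall on the wrong side, whatever the class-conditional density.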


Proceedings Article
21 Aug 2003
TL;DR: It is empirically demonstrated that learning a distance metric using the RCA algorithm significantly improves clustering performance, comparably to the alternative algorithm.
Abstract: We address the problem of learning distance metrics using side-information in the form of groups of "similar" points. We propose to use the RCA algorithm, which is a simple and efficient algorithm for learning a full-rank Mahalanobis metric (Shental et al., 2002). We first show that RCA obtains the solution to an interesting optimization problem, founded on an information theoretic basis. If the Mahalanobis matrix is allowed to be singular, we show that Fisher's linear discriminant followed by RCA is the optimal dimensionality reduction algorithm under the same criterion. We then show how this optimization problem is related to the criterion optimized by another recent algorithm for metric learning (Xing et al., 2002), which uses the same kind of side information. We empirically demonstrate that learning a distance metric using the RCA algorithm significantly improves clustering performance, comparably to the alternative algorithm. Since the RCA algorithm is much more efficient and cost effective than the alternative, as it only uses closed-form expressions of the data, it seems like a preferable choice for the learning of full-rank Mahalanobis distances.

481 citations
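The closed-form character of RCA can be sketched as follows, reading the learned transform as the inverse square root of the pooled within-chunklet covariance (chunklet sizes and data below are illustrative, not from the paper):

```python
import numpy as np

def rca_transform(chunklets):
    """Relevant Component Analysis, minimal sketch: center each chunklet,
    pool the within-chunklet covariance C, and return C^{-1/2}, which
    whitens the directions of within-chunklet variability."""
    centered = [x - x.mean(axis=0) for x in chunklets]
    pooled = np.concatenate(centered, axis=0)
    c = pooled.T @ pooled / len(pooled)           # within-chunklet covariance
    vals, vecs = np.linalg.eigh(c)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T  # C^{-1/2}

rng = np.random.default_rng(0)
chunklets = [rng.normal(size=(50, 3)) * [3.0, 1.0, 0.2] + mu
             for mu in ([0, 0, 0], [5, 5, 5])]
W = rca_transform(chunklets)
# After the map x -> W @ x, the pooled within-chunklet covariance is the
# identity, i.e. Euclidean distance there equals the learned Mahalanobis
# distance in the original space.
```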


Proceedings ArticleDOI
TL;DR: In this article, it is shown that these embedding methods are equivalent to a lowpass filtering of histograms that is quantified by a decrease in the HCF center of mass (COM), which is exploited in known scheme detection to classify unaltered and spread spectrum images using a bivariate classifier.
Abstract: The process of information hiding is modeled in the context of additive noise. Under an independence assumption, the histogram of the stegomessage is a convolution of the noise probability mass function (PMF) and the original histogram. In the frequency domain this convolution is viewed as a multiplication of the histogram characteristic function (HCF) and the noise characteristic function. Least significant bit, spread spectrum, and DCT hiding methods for images are analyzed in this framework. It is shown that these embedding methods are equivalent to a lowpass filtering of histograms that is quantified by a decrease in the HCF center of mass (COM). These decreases are exploited in a known-scheme detection setting to classify unaltered and spread spectrum images using a bivariate classifier. Finally, a blind detection scheme is built that uses only statistics from unaltered images. By calculating the Mahalanobis distance from a test COM to the training distribution, a threshold is used to identify steganographic images. At an embedding rate of 1 b.p.p., greater than 95% of the stegoimages are detected with a false-alarm rate of 5%.

444 citations
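The HCF center of mass and its decrease under embedding can be sketched as follows (an illustrative numpy reconstruction of the statistic, not the authors' implementation; the histogram and noise PMF are made up):

```python
import numpy as np

def hcf_com(hist):
    """Center of mass of the histogram characteristic function: the
    magnitude of the histogram's DFT, restricted to the lower half of
    the frequency axis (the HCF is conjugate-symmetric)."""
    h = np.abs(np.fft.fft(hist))[1 : len(hist) // 2]
    k = np.arange(1, len(hist) // 2)
    return (k * h).sum() / h.sum()

# Embedding modeled as additive noise: the stego histogram is the clean
# histogram convolved with the noise PMF, which low-pass filters the HCF.
hist = np.exp(-0.5 * ((np.arange(256) - 128) / 20.0) ** 2)
noise_pmf = np.array([0.25, 0.5, 0.25])
stego = np.convolve(hist, noise_pmf, mode="same")
print(hcf_com(hist) > hcf_com(stego))  # True
```

Because the noise characteristic function's magnitude decays with frequency, the convolution reweights the HCF toward low frequencies, so the COM of the stego histogram is pulled toward zero.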


Journal ArticleDOI
TL;DR: In this article, the authors introduce the concept of Mahalanobis distance to bioclimatic modeling, and argue that the envelopes defined by this distance can better reflect the principle of central tendency as expressed by niche theory.

354 citations


01 Jan 2003
TL;DR: Combinations of traditional distance measures in Eigenspace are used to improve performance in the matching stage of face recognition, and variations in performance due to different distance measures and numbers of Eigenvectors are compared.
Abstract: This study examines the role of Eigenvector selection and Eigenspace distance measures on PCA-based face recognition systems. In particular, it builds on earlier results from the FERET face recognition evaluation studies, which created a large face database (1,196 subjects) and a baseline face recognition system for comparative evaluations. This study looks at using combinations of traditional distance measures (City-block, Euclidean, Angle, Mahalanobis) in Eigenspace to improve performance in the matching stage of face recognition. A statistically significant improvement is observed for the Mahalanobis distance alone when compared to the other three alone. However, no combination of these measures appears to perform better than Mahalanobis alone. This study also examines questions of how many Eigenvectors to select and according to what ordering criterion. It compares variations in performance due to different distance measures and numbers of Eigenvectors. Ordering Eigenvectors according to a like-image difference value rather than their Eigenvalues is also considered.

263 citations
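The four distance measures compared in the study can be sketched on PCA coefficient vectors as follows (a hypothetical illustration; dividing each axis by its eigenvalue is one common reading of "Mahalanobis distance in Eigenspace", since the training coefficients are decorrelated with variance equal to the eigenvalue):

```python
import numpy as np

def eigenspace_distances(a, b, eigvals):
    """Four distances between two PCA coefficient vectors; `eigvals`
    are the variances of the training coefficients along each axis."""
    diff = a - b
    return {
        "city_block": np.abs(diff).sum(),
        "euclidean": np.sqrt(diff @ diff),
        "angle": 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)),
        "mahalanobis": np.sqrt((diff ** 2 / eigvals).sum()),
    }

a = np.array([1.0, 2.0])
b = np.array([2.0, 2.0])
d = eigenspace_distances(a, b, eigvals=np.array([4.0, 1.0]))
print(d["mahalanobis"])  # sqrt(1/4) = 0.5
```

Note how the Mahalanobis variant discounts differences along high-variance axes, which is one intuition for why it behaved differently from the other three in the study.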


Journal ArticleDOI
TL;DR: In this article, a K-nearest neighbor (K-nn) resampling scheme is presented that simulates daily weather variables, and consequently seasonal climate and spatial and temporal dependencies, at multiple stations in a given region.
Abstract: A K-nearest neighbor (K-nn) resampling scheme is presented that simulates daily weather variables, and consequently seasonal climate and spatial and temporal dependencies, at multiple stations in a given region. A strategy is introduced that uses the K-nn algorithm to produce alternative climate data sets conditioned upon hypothetical climate scenarios, e.g., warmer-drier springs, warmer-wetter winters, and so on. This technique allows for the creation of ensembles of climate scenarios that can be used in integrated assessment and water resource management models for addressing the potential impacts of climate change and climate variability. This K-nn algorithm makes use of the Mahalanobis distance as the metric for neighbor selection, as opposed to a Euclidean distance. The advantage of the Mahalanobis distance is that the variables do not have to be standardized, nor is there a requirement to preassign weights to variables. The model is applied to two sets of station data in climatologically diverse areas of the United States, including the Rocky Mountains and the north central United States, and is shown to reproduce synthetic series that largely preserve important cross correlations and autocorrelations. Likewise, the adapted K-nn algorithm is used to generate alternative climate scenarios based upon prescribed conditioning criteria.

223 citations
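The neighbor-selection step can be sketched as follows (names and data are illustrative, not the authors' code); the point is that the Mahalanobis distance absorbs differing variable scales, so no standardization or per-variable weights are needed:

```python
import numpy as np

def knn_mahalanobis(query, candidates, k):
    """Return indices of the k candidate days nearest to the query
    feature vector, using the Mahalanobis distance with the covariance
    estimated from the candidate pool."""
    cov = np.cov(candidates, rowvar=False)
    icov = np.linalg.inv(cov)
    diffs = candidates - query
    d2 = np.einsum("ij,jk,ik->i", diffs, icov, diffs)  # squared distances
    return np.argsort(d2)[:k]

rng = np.random.default_rng(1)
days = rng.normal(size=(200, 2)) * [10.0, 1.0]  # very different scales
idx = knn_mahalanobis(days[0], days[1:], k=5)
```

Because the metric is affine-invariant, rescaling one variable (say, temperature in different units) leaves the selected neighbors unchanged, which is exactly the advantage the abstract describes.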


Proceedings ArticleDOI
15 Oct 2003
TL;DR: Experimental results on the Brodatz texture database indicate that the retrieval performance can be improved significantly by using the Canberra and Bray-Curtis distance metrics as compared to traditional Euclidean and Mahalanobis distance based approaches.
Abstract: Similarity metrics play an important role in content-based image retrieval. The paper compares nine image similarity measures - Manhattan (L1), weighted-mean-variance (WMV), Euclidean (L2), Chebychev (L∞), Mahalanobis, Canberra, Bray-Curtis, squared chord and squared chi-squared distances - for texture image retrieval. A large texture database of 1856 images, derived from the Brodatz album, is used to check the retrieval performance. Features of all the database images were extracted using the Gabor wavelet. Experimental results on the Brodatz texture database indicate that the retrieval performance can be improved significantly by using the Canberra and Bray-Curtis distance metrics as compared to traditional Euclidean and Mahalanobis distance based approaches.

187 citations
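The two best-performing metrics, Canberra and Bray-Curtis, have simple standard definitions; the following is a sketch of those definitions, not the paper's code:

```python
import numpy as np

def canberra(x, y):
    """Canberra distance: sum of per-dimension relative differences."""
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    mask = den > 0            # 0/0 terms contribute nothing by convention
    return (num[mask] / den[mask]).sum()

def bray_curtis(x, y):
    """Bray-Curtis distance: total difference over total magnitude
    (assumes nonnegative feature vectors, as with texture energies)."""
    return np.abs(x - y).sum() / np.abs(x + y).sum()

x = np.array([1.0, 2.0])
y = np.array([2.0, 2.0])
print(canberra(x, y))     # |1-2|/(1+2) = 1/3
print(bray_curtis(x, y))  # 1/7
```

Both normalize differences by magnitude, which tends to equalize the influence of feature dimensions with very different ranges; that is one plausible reading of why they outperformed unnormalized Euclidean distance on Gabor features.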


Journal ArticleDOI
TL;DR: Closest distance to center (CDC) is proposed in this paper as an alternative for outlier detection, and better performance was obtained when CDC was incorporated with MVT than when either CDC or MVT was used alone.

167 citations


Journal ArticleDOI
TL;DR: The Mahalanobis-Taguchi system (MTS) as mentioned in this paper is a relatively new collection of methods proposed for diagnosis and forecasting using multivariate data, which is used to measure the level of abnormality of abnormal items compared to a group of normal items.
Abstract: The Mahalanobis–Taguchi system (MTS) is a relatively new collection of methods proposed for diagnosis and forecasting using multivariate data. The primary proponent of the MTS is Genichi Taguchi, who is very well known for his controversial ideas and methods for using designed experiments. The MTS results in a Mahalanobis distance scale used to measure the level of abnormality of “abnormal” items compared to a group of “normal” items. First, it must be demonstrated that a Mahalanobis distance measure based on all available variables on the items is able to separate the abnormal items from the normal items. If this is the case, then orthogonal arrays and signal-to-noise ratios are used to select an “optimal” combination of variables for calculating the Mahalanobis distances. Optimality is defined in terms of the ability of the Mahalanobis distance scale to match a prespecified or estimated scale that measures the severity of the abnormalities. In this expository article, we review the methods of the MTS an...

152 citations


Journal ArticleDOI
TL;DR: An improved freeway incident‐detection model is presented based on speed, volume, and occupancy data from a single detector station using a combination of wavelet‐based signal processing, statistical cluster analysis, and neural network pattern recognition.
Abstract: This paper presents an improved freeway incident detection (FID) model that is based on speed, volume, and occupancy data from a single detector station using a combination of wavelet-based signal processing, statistical cluster analysis, and neural network pattern recognition. A comparative study of different wavelets and filtering schemes was conducted in terms of efficacy and accuracy of smoothing. It was concluded that the 4th-order Coifman wavelet is more effective than other types of wavelets for the FID problem. A statistical multivariate analysis based on the Mahalanobis distance is employed to perform data clustering and parameter reduction to reduce the size of the input space for the subsequent step of classification by the Levenberg-Marquardt backpropagation neural network. For a straight 2-lane freeway using real data, the model yields an FID rate of 100%, false alarm rate of 0.3%, and detection time of 35.6 seconds.

Journal ArticleDOI
TL;DR: It is shown that for closed convex curves both distance measures are the same and are within a constant factor of each other for so-called κ-straight curves, i.e., curves where the arc length between any two points on the curve is at most a constant κ times their Euclidean distance.
Abstract: The Hausdorff distance is a very natural and straightforward distance measure for comparing geometric shapes like curves or other compact sets. Unfortunately, it is not an appropriate distance measure in some cases. For this reason, the Frechet distance has been investigated for measuring the resemblance of geometric shapes which avoids the drawbacks of the Hausdorff distance. Unfortunately, it is much harder to compute. Here we investigate under which conditions the two distance measures approximately coincide, i.e., the pathological cases for the Hausdorff distance cannot occur. We show that for closed convex curves both distance measures are the same. Furthermore, they are within a constant factor of each other for so-called κ-straight curves, i.e., curves where the arc length between any two points on the curve is at most a constant κ times their Euclidean distance. Therefore, algorithms for computing the Hausdorff distance can be used in these cases to get exact or approximate computations of the Frechet distance, as well.

Proceedings ArticleDOI
24 Nov 2003
TL;DR: A piecewise Gaussian model is proposed to describe the intensity distribution of vessel profile and the characteristic of central reflex is specially considered in the proposed model.
Abstract: Accurate measurement and identification of blood vessels could provide useful information for clinical diagnosis. A piecewise Gaussian model is proposed in this paper to describe the intensity distribution of the vessel profile. The characteristic of central reflex is specially considered in the proposed model. A comparison with the single Gaussian model is performed, which shows that the piecewise Gaussian model is a more appropriate model for the vessel profile. The obtained model parameters could be utilized in the identification of vessel type. The minimum Mahalanobis distance classifier is employed in the classification. 505 vessel segments were tested. The success rate is 82.46% for arteries and 89.03% for veins.
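A minimum Mahalanobis distance classifier of the kind used here can be sketched as follows (class labels, means, and covariances are illustrative, not fitted vessel statistics):

```python
import numpy as np

def min_mahalanobis_classify(x, class_stats):
    """Assign x to the class whose fitted Gaussian is nearest in
    Mahalanobis distance. `class_stats` maps label -> (mean, cov)."""
    def md2(v, mu, cov):
        d = v - mu
        return d @ np.linalg.inv(cov) @ d
    return min(class_stats, key=lambda c: md2(x, *class_stats[c]))

# Hypothetical per-class statistics of two model-parameter vectors.
stats = {
    "artery": (np.array([0.0, 0.0]), np.eye(2)),
    "vein":   (np.array([4.0, 0.0]), np.eye(2)),
}
print(min_mahalanobis_classify(np.array([0.5, 0.2]), stats))  # artery
```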

Journal ArticleDOI
TL;DR: In this article, a new discriminant analysis employing second derivative spectra was proposed for wood classification based on Mahalanobis' generalized distance between softwoods and hardwoods, and its accuracy and reasonability were examined for wood samples with various moisture contents ranging from oven-dried to a fully saturated free water state.
Abstract: This study deals with a new nondestructive discriminant analysis by which wood can be classified on the basis of a combination of near-infrared (NIR) spectroscopy and Mahalanobis' generalized distance. Its accuracy and validity were examined for wood samples with various moisture contents ranging from oven-dried to a fully saturated free-water state. In a discriminant analysis employing second-derivative spectra, each wood group was well distinguished. Mahalanobis' generalized distances between softwoods are relatively independent of the analytical pattern, whereas the distances between hardwoods are large, allowing easy classification. There may be two reasons for selecting a wavelength: (1) the chemical components of the wood substance relate to the discriminant analysis; or (2) the difference in moisture content among wood species relates to it. When the database of NIR spectra is constructed correctly, in keeping with the purpose of the analysis, suitable wood discrimination should be possible.

Proceedings ArticleDOI
23 Sep 2003
TL;DR: This paper will focus on techniques used to segment HSI data into homogeneous clusters, and the definition of the multivariate Elliptically Contoured Distribution mixture model will be developed.
Abstract: Developing proper models for hyperspectral imaging (HSI) data allows for useful and reliable algorithms for data exploitation. These models provide the foundation for development and evaluation of detection, classification, clustering, and estimation algorithms. To date, real-world HSI data has been modeled as a single multivariate Gaussian; however, it is well known that real data often exhibit non-Gaussian behavior with multi-modal distributions. Instead of the single multivariate Gaussian distribution, HSI data can be modeled as a finite mixture model, where each of the mixture components need not be Gaussian. This paper will focus on techniques used to segment HSI data into homogeneous clusters. Once the data has been segmented, each individual cluster can be modeled, and the benefits provided by homogeneous clustering of the data versus no clustering can be explored. One of the promising techniques uses the Expectation-Maximization (EM) algorithm to cluster the data into Elliptically Contoured Distributions (ECDs). A larger family of distributions, the family of ECDs includes the multivariate Gaussian distribution and exhibits most of its properties. ECDs are uniquely defined by their multivariate mean, covariance, and the distribution of their Mahalanobis (or quadratic) distance metric. This metric lets multivariate data be identified using a univariate statistic and can be adjusted to more closely match the longer-tailed distributions of real data. This paper will focus on three issues. First, the definition of the multivariate Elliptically Contoured Distribution mixture model will be developed. Second, various techniques will be described that segment the mixed data into homogeneous clusters. Most of this work will focus on the EM algorithm and the multivariate t-distribution, which is a member of the family of ECDs and provides longer-tailed distributions than the Gaussian.
Lastly, results using HSI data from the AVIRIS sensor will be shown, and the benefits of clustered data will be presented.

Journal ArticleDOI
TL;DR: A procedure that generates large sets of prior probability estimates from class frequencies modelled with ancillary data and a Mahalanobis Distance selection of previously classified pixels is presented, which improves classification accuracy in large and complex landscapes with spectrally mixed land-cover categories.
Abstract: The use of modified prior probabilities to exploit ancillary data and increase classification accuracy has been proposed before. However, this method has not been widely applied because it has heavy computing requirements and because obtaining prior probability estimates has presented practical problems. This article presents a procedure that generates large sets of prior probability estimates from class frequencies modelled with ancillary data and a Mahalanobis Distance selection of previously classified pixels. The method produces a pixel sample size that is large enough to estimate class frequencies in numerous strata, which is particularly desirable for the study of large and complex landscapes. A case study is presented in which the procedure made it possible to estimate 537 sets of prior probabilities for an entire Landsat Thematic Mapper (TM) scene of central Costa Rica. After modifying the class prior probabilities, the overall classification consistency of the training sites improved from 74.6% t...

Proceedings ArticleDOI
05 Mar 2003
TL;DR: This work presents an adaptive multilevel Mahalanobis-based dimensionality reduction (MMDR) technique for high-dimensional indexing that not only achieves higher precision but also enables queries to be processed efficiently.
Abstract: The notorious "dimensionality curse" is a well-known phenomenon for any multidimensional index attempting to scale up to high dimensions. One well-known approach to overcoming the degradation in performance with respect to increasing dimensions is to reduce the dimensionality of the original dataset before constructing the index. However, identifying the correlation among the dimensions and effectively reducing them is a challenging task. We present an adaptive multilevel Mahalanobis-based dimensionality reduction (MMDR) technique for high-dimensional indexing. Our MMDR technique has three notable features compared to existing methods. First, it discovers elliptical clusters using only the low-dimensional subspaces. Second, data points in the different axis systems are indexed using a single B+-tree. Third, our technique is highly scalable in terms of data size and dimensionality. An extensive performance study using both real and synthetic datasets was conducted, and the results show that our technique not only achieves higher precision, but also enables queries to be processed efficiently.

Journal ArticleDOI
TL;DR: In this article, Mahalanobis' generalized distance, K nearest neighbors (KNN), and soft independent modeling of class analogy (SIMCA) were evaluated to determine the best analytical procedure.
Abstract: This study deals with suitable discriminant techniques for wood-based materials by means of near-infrared spectroscopy (NIRS) and several chemometric analyses. Mahalanobis' generalized distance, K nearest neighbors (KNN), and soft independent modeling of class analogy (SIMCA) were evaluated to determine the best analytical procedure. The differences in classification accuracy with the spectrophotometer, the wavelength range used as explanatory variables, and the light-exposure condition of the sample were examined in detail. It was difficult to apply Mahalanobis' generalized distances to the classification of wood-based materials where NIR spectra varied widely within the sample category. The performance of KNN in the NIR region (800–2500 nm), for which the device used in the laboratory was employed, exhibited a high rate of correct validation classifications (>98%), independent of the light-exposure conditions of the sample. When employing the device used in the field, both KNN and SIMCA achieved correct validation rates of >88% at wavelengths of 550–1010 nm. These results suggest the applicability of NIRS to a reasonable classification of used wood at the factory and at job sites.

Patent
14 Nov 2003
TL;DR: A management method capable of making an accurate decision about a malfunction of the semiconductor manufacturing equipment includes sampling a plurality of data of at least one parameter under normal operating conditions of the manufacturing equipment.
Abstract: A management method capable of making an accurate decision about a malfunction of the semiconductor manufacturing equipment includes sampling a plurality of data of at least one parameter under normal operating conditions of the semiconductor manufacturing equipment; generating a Mahalanobis space A from a group of sampled data; calculating a Mahalanobis distance from measured values of the parameter under ordinary operating conditions of the semiconductor manufacturing equipment; and deciding that a malfunction occurred in the semiconductor manufacturing equipment when the value of the Mahalanobis distance exceeds a predetermined value.
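The patent's monitoring loop can be sketched as follows (a hypothetical illustration: parameter values, threshold, and data are made up; the "Mahalanobis space" is read here as the mean and covariance of the normal-operation sample):

```python
import numpy as np

class MahalanobisMonitor:
    """Learn a 'Mahalanobis space' (mean and covariance) from data
    sampled under normal operating conditions, then flag a malfunction
    when a measurement's Mahalanobis distance exceeds a preset value."""

    def __init__(self, normal_data, threshold):
        self.mu = normal_data.mean(axis=0)
        self.icov = np.linalg.inv(np.cov(normal_data, rowvar=False))
        self.threshold = threshold

    def distance(self, x):
        d = x - self.mu
        return float(np.sqrt(d @ self.icov @ d))

    def is_malfunction(self, x):
        return self.distance(x) > self.threshold

rng = np.random.default_rng(2)
normal = rng.normal(loc=[10.0, 5.0], scale=[0.5, 0.1], size=(500, 2))
monitor = MahalanobisMonitor(normal, threshold=4.0)
print(monitor.is_malfunction(np.array([10.1, 5.0])))  # typical reading
print(monitor.is_malfunction(np.array([14.0, 6.0])))  # far outside
```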

Proceedings Article
01 Jan 2003
TL;DR: The present work uses automatic skin detection after an initial camera calibration, using the TSL color space, where undesired effects are reduced and the skin distribution fits a Gaussian model better than in other color spaces.
Abstract: Mahalanobis distance has already proved its strength in human skin detection using a set of skin values. We present this work, which uses automatic skin detection after an initial camera calibration. The calibration is done by human sampling from test individuals. A scaling is performed on the work data before applying the Mahalanobis distance, which ensures better results than previous works. We use the TSL color space, also used successfully by other authors, where undesired effects are reduced and the skin distribution fits a Gaussian model better than in other color spaces. Also, using an initial filter, normally large areas of easily distinguished non-skin pixels are eliminated from further processing. Analyzing and grouping the resulting elements from the discriminator improves the ratio of correct detection and reduces the small non-skin areas present in a common complex image background, including persons of Asian, Caucasian, African, and mixed descent. Also, this method is not restricted to orientation, size, or grouping of candidates. The present work is a first step in an approach for human face detection in color images, but is not limited in any way to this goal.

Journal ArticleDOI
TL;DR: In this study, similarity measures are analyzed in the context of ordered histogram-type data, such as gray-level histograms of digital images or color spectra, and the performance of the studied similarity measures can be improved using a smoothing projection, called neighbor-bank projection.

Journal ArticleDOI
TL;DR: This paper attempts to rigorously justify previous experimental findings on suitable values for this fuzzy exponent, using the criterion that fuzzy set memberships reflect class proportions in the mixed pixels of a remotely sensed image.

Proceedings ArticleDOI
18 Jun 2003
TL;DR: This work presents a factor analysis-based network anomaly detection algorithm and applies it to DARPA intrusion detection evaluation data and results show that the proposed algorithm is able to detect network intrusions with relatively low false alarms.
Abstract: We propose a novel anomaly detection algorithm based on factor analysis and Mahalanobis distance. Factor analysis is used to uncover the latent structure (dimensions) of a set of variables. It reduces attribute space from a larger number of variables to a smaller number of factors. The Mahalanobis distance is used to determine the "similarity" of a set of values from an "unknown" sample to a set of values measured from a collection of "known" samples. Combined with factor analysis, Mahalanobis distance is extended to examine whether a given vector is an outlier from a model identified by "factors" based on factor analysis. We present a factor analysis-based network anomaly detection algorithm and apply it to DARPA intrusion detection evaluation data. The experimental results show that the proposed algorithm is able to detect network intrusions with relatively low false alarms.

Journal ArticleDOI
TL;DR: First, issues in supervised classification are discussed; then anomaly detection is incorporated to enhance the modeling and prediction of cyber-attacks, and the joint performance of classification and anomaly detection is investigated.
Abstract: This paper describes experiences and results applying Support Vector Machines (SVM) to a Computer Intrusion Detection (CID) dataset. First, issues in supervised classification are discussed; then anomaly detection is incorporated to enhance the modeling and prediction of cyber-attacks. SVM methods are seen as competitive with benchmark methods and other studies, and are used as a standard for the anomaly detection investigation. The anomaly detection approaches compare one-class SVMs with a thresholded Mahalanobis distance to define support regions. Results compare the performance of the methods and investigate the joint performance of classification and anomaly detection. The dataset used is the DARPA/KDD-99 publicly available dataset of features from network packets, classified into nonattack and four attack categories.

Journal ArticleDOI
TL;DR: An algorithm (with a minimum of mathematical and computational requirements) based on the discriminant analysis of spectral features has been developed to classify atherosclerotic lesions with high sensitivities and specificities.
Abstract: Objective: In this study, near-infrared Raman spectroscopy (NIRS) was used for evaluation of human atherosclerotic lesions using a simple algorithm based on discriminant analysis. The Mahalanobis distance was used to classify the clustered spectral features extracted from NIRS of a total of 111 arterial fragments of human coronary arteries. Background Data: Raman spectroscopy has been used for diagnosis of a variety of diseases. For real-time applications, it is important to have a simple algorithm that could perform fast data acquisition and analysis. The ultimate goal is to obtain a feasible diagnosis, which discriminates various atherosclerotic lesions with high sensitivities and specificities. Materials and Methods: Non-atherosclerotic (NA) arteries, atherosclerotic plaques without calcification (NC), and atherosclerotic plaques with calcification (C) were obtained and scanned with an NIR Raman spectrometer with 830-nm laser excitation. An algorithm based on the discriminant analysis using the Mahala...

J. B. Mena
01 Jan 2003
TL;DR: In this article, the Dempster-Shafer theory of evidence is applied to fuse the information from three different sources for the same image, and the results prove the potential of the method for real images starting from the three RGB bands only.
Abstract: We present a new method for the segmentation of color images for extracting information from terrestrial, aerial, or satellite images. It is a supervised method for solving a part of the automatic extraction problem. The basic technique consists in fusing information coming from three different sources for the same image. The first source uses the information stored in each pixel, by means of the Mahalanobis distance. The second uses the multidimensional distribution of the three bands in a window centred on each pixel, using the Bhattacharyya distance. The third source also uses the Bhattacharyya distance; in this case, co-occurrence matrices are compared over the texture cube built around each pixel. Each source represents a different order of statistics. The Dempster-Shafer theory of evidence is applied in order to fuse the information from these three sources. This method shows the importance of applying context and textural properties for the extraction process. The results prove the potential of the method for real images starting from the three RGB bands only. Finally, some examples of the extraction of linear cartographic features, especially roads, are shown.
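The Bhattacharyya distance used by the second and third sources has a standard closed form for Gaussian models; a small sketch of that formula follows (inputs are illustrative, not the paper's data):

```python
import numpy as np

def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two multivariate Gaussians:
    a Mahalanobis-like mean term plus a covariance-mismatch term."""
    cov = 0.5 * (cov1 + cov2)
    d = mu1 - mu2
    term1 = 0.125 * d @ np.linalg.inv(cov) @ d
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

mu = np.zeros(2)
print(bhattacharyya_gaussian(mu, np.eye(2), mu + 2.0, np.eye(2)))  # 0.125 * 8 = 1.0
```

With equal covariances the second term vanishes and the distance reduces to one eighth of the squared Mahalanobis distance between the means, which is why the two measures complement each other in this kind of fusion.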

Patent
18 Apr 2003
TL;DR: In this paper, the abnormality of the electroencephalogram is judged on the basis of the Mahalanobis distance, and a result of the judgment is outputted.
Abstract: Measuring electrodes are disposed in positions T5 and T6 according to the international 10-20 system. Electroencephalographic data obtained from these measuring electrodes is received in an input portion, and converted into phase analysis data on a phase plane V-dV/dt by a phase analysis portion. By use of a set of feature parameters selected from an aspect ratio, a V-axis maximum value, a sub/total revolution number ratio and an RL/UB distribution ratio in a feature parameter calculating portion, a Mahalanobis distance is calculated in a Mahalanobis distance calculating portion. The abnormality of the electroencephalogram is judged on the basis of the Mahalanobis distance, and a result of the judgment is outputted.

Journal ArticleDOI
TL;DR: In this article, simultaneous individual control charts for the mean and the autocovariances of a stationary process are introduced, where the control statistic is obtained by exponentially smoothing these variables.
Abstract: In this article, simultaneous individual control charts for the mean and the autocovariances of a stationary process are introduced. All control schemes are EWMA (exponentially weighted moving average) charts. A multivariate quality characteristic is considered; it describes the behavior of the mean and the autocovariances. This quantity is transformed to a one-dimensional variable by using the Mahalanobis distance, and the control statistic is obtained by exponentially smoothing these variables. Another control procedure is based on a multivariate EWMA recursion applied directly to our multivariate quality characteristic; the resulting statistic is then transformed to a univariate random variable. Besides modified control charts, we consider residual charts. For the residual charts the same procedure is used, but the original observations are replaced by the residuals. In an extensive simulation study, all control schemes are compared with each other. The target process is assumed to be an ARMA(1,...
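The first chart described above, Mahalanobis transformation followed by exponential smoothing, can be sketched as below. The function name, the smoothing constant, and the choice to start the recursion at the in-control mean of the squared distance are assumptions for illustration, not the article's exact scheme.

```python
import numpy as np

def ewma_mahalanobis_chart(obs, mu0, cov0, lam=0.2):
    """Reduce each multivariate observation to its squared Mahalanobis
    distance from the in-control target (mu0, cov0), then apply the EWMA
    recursion  z_t = (1 - lam) * z_{t-1} + lam * d2_t."""
    mu0 = np.asarray(mu0, float)
    inv = np.linalg.inv(np.asarray(cov0, float))
    z = float(len(mu0))  # in-control mean of d2 is the dimension p
    stats = []
    for x in np.asarray(obs, float):
        diff = x - mu0
        z = (1 - lam) * z + lam * (diff @ inv @ diff)
        stats.append(z)
    return np.array(stats)
```

For a process sitting exactly on target, the statistic decays geometrically toward zero; a signal is raised when it crosses a control limit calibrated to the desired in-control average run length.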

Proceedings ArticleDOI
15 May 2003
TL;DR: An automated method is presented that combines a three-dimensional shape model with a one-dimensional boundary appearance model, effectively deals with a highly varying background, and proposes a way of generalizing models of curvilinear structures from small training sets.
Abstract: Segmentation of thrombus in abdominal aortic aneurysms is complicated by regions of low boundary contrast and by the presence of many neighboring structures in close proximity to the aneurysm wall. This paper presents an automated method that is similar to the well known Active Shape Models (ASM), which combine a three-dimensional shape model with a one-dimensional boundary appearance model. Our contribution is twofold: First, we show how the generalizability of a shape model of curvilinear objects can be improved by modeling the object's axis deformation independently of its cross-sectional deformation. Second, a non-parametric appearance modeling scheme that effectively deals with a highly varying background is presented. In contrast with the conventional ASM approach, the new appearance model trains on both true and false examples of boundary profiles. The probability that a given image profile belongs to the boundary is obtained using k nearest neighbor (kNN) probability density estimation. The performance of this scheme is compared to that of original ASMs, which minimize the Mahalanobis distance to the average true profile in the training set. A set of leave-one-out experiments is performed on 23 datasets. Modeling the axis and cross-section separately reduces the shape reconstruction error in all cases. The average reconstruction error was reduced from 2.2 to 1.6 mm. Segmentation using the kNN appearance model significantly outperforms the original ASM scheme; average volume errors are 5.9% and 46%, respectively.
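The kNN appearance model described above can be sketched as a simple nearest-neighbor vote over true and false boundary profiles. The function name and the plain Euclidean metric are assumptions for illustration; the paper's scheme uses kNN probability density estimation on real intensity profiles.

```python
import numpy as np

def knn_boundary_probability(profile, true_profiles, false_profiles, k=5):
    """Estimate P(boundary | profile) as the fraction of true examples
    among the k nearest training profiles (Euclidean distance)."""
    X = np.vstack([true_profiles, false_profiles])
    labels = np.r_[np.ones(len(true_profiles)), np.zeros(len(false_profiles))]
    d = np.linalg.norm(X - np.asarray(profile, float), axis=1)
    nearest = np.argsort(d)[:k]
    return labels[nearest].mean()
```

Unlike the conventional ASM criterion, which only measures the Mahalanobis distance to the average true profile, this estimate also exploits the false examples: a profile close to known non-boundary patterns receives a low probability even if it is not far from the true mean.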

Journal Article
TL;DR: In this paper, the authors used a linear mixture model, fuzzy membership functions and neural networks to map the sub-pixel proportions of the spectral end members, sand and vegetation.
Abstract: Bare sand and semi-fixed dunes represent ideal conditions for successionally young slack habitats that support rare species of coastal dune flora such as fen orchid (Liparis loeselii) and liverworts (e.g., Petalophyllum ralfsii). In ecologically significant and large dune systems, such as the Kenfig National Nature Reserve, UK, the identification and mapping of habitats and the provision of information on the relative proportions of sand and vegetation form the key to conservation management. To map this habitat, mapping algorithms are applied to Compact Airborne Spectrographic Imager (CASI) data. Per-pixel mapping was performed using the minimum distance, maximum likelihood and Mahalanobis distance classification algorithms, with training data extracted for habitats at various levels of the National Vegetation Classification (NVC) scheme. Sub-pixel mapping was performed using a linear mixture model, fuzzy membership functions and neural networks, and the sub-pixel proportions of the spectral end members, viz. sand, vegetation and shade/moisture, were defined. Results indicate that per-pixel mapping can only be achieved for broad habitat categories that correspond to level I of the NVC. Of the algorithms used, the minimum distance classifier, with an overall mapping accuracy of 92%, outperforms both maximum likelihood and Mahalanobis distance. Results from the sub-pixel algorithms indicate that all three techniques can be used to map the relative proportions of sand and vegetation. It is argued that both approaches provide baseline maps that are required to implement an effective dune conservation programme successfully.
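Two of the per-pixel classifiers compared above differ only in the distance they minimize. A minimal sketch under assumed class statistics (the function names and toy numbers are illustrative, not the study's trained classes):

```python
import numpy as np

def min_distance_classify(x, class_means):
    """Minimum-distance rule: assign x to the nearest class mean (Euclidean)."""
    x = np.asarray(x, float)
    return int(np.argmin([np.linalg.norm(x - m) for m in class_means]))

def mahalanobis_classify(x, class_means, class_covs):
    """Mahalanobis rule: each class distance is scaled by that class's covariance."""
    x = np.asarray(x, float)
    d = [np.sqrt((x - m) @ np.linalg.solve(c, x - m))
         for m, c in zip(class_means, class_covs)]
    return int(np.argmin(d))

means = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
covs = [np.diag([0.25, 0.25]), np.diag([9.0, 9.0])]  # tight vs. spread-out class
x = [1.5, 0.0]
print(min_distance_classify(x, means))       # -> 0 (nearer mean)
print(mahalanobis_classify(x, means, covs))  # -> 1 (covariance-aware rule disagrees)
```

The disagreement in the example shows why the two classifiers can rank habitats differently: the Mahalanobis rule favors spectrally variable classes, which, with limited training data, can explain the minimum-distance classifier's better accuracy in this study.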