
Showing papers on "Mahalanobis distance published in 1986"


Journal ArticleDOI
TL;DR: In this article, the term standard distance is proposed for the standardized mean difference of univariate analysis and is generalized to the multivariate situation, where it coincides with the square root of the Mahalanobis distance between two samples.
Abstract: We propose to use the term standard distance for the standardized difference between two means in univariate analysis and show that it can be easily generalized to the multivariate situation, where it coincides with the square root of the Mahalanobis distance between two samples.

245 citations
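Spelling the relation out (the notation here is mine, not the authors'): with sample means $\bar x_1, \bar x_2$ and pooled standard deviation $s$, the univariate standard distance is

\[ D = \frac{|\bar x_1 - \bar x_2|}{s}, \]

and the multivariate version replaces the variance $s^2$ by the pooled covariance matrix $S$:

\[ D = \sqrt{(\bar x_1 - \bar x_2)^\top S^{-1} (\bar x_1 - \bar x_2)}, \]

the square root of the squared Mahalanobis distance $D^2$ between the two samples.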


Journal ArticleDOI
TL;DR: An adaptive rule which selects k by iteratively maximizing the local Mahalanobis distance is shown to be efficient, obviating the need to know the underlying population variance-covariance structure.
Abstract: A simulation study was performed to investigate the sensitivity of the k-nearest neighbor (NNk) rule of classification to the choice of k. The optimal choice of k was found to be a function of the dimension of the sample space, the size of the space, the covariance structure and the sample proportions. The nearest neighbor rules chosen using the k suggested by the simulations had correct classification rates at least as high as those for the linear discriminant function and the logistic regression method. In particular, the rule became more efficient as the difference in the covariance matrices increased, and also when the difference in sample proportions was large. An adaptive rule which selects k by iteratively maximizing the local Mahalanobis distance is shown to be efficient, obviating the need to know the underlying population variance-covariance structure.

83 citations
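The adaptive choice of k is the paper's contribution; as a point of reference, here is a minimal NumPy sketch of the fixed-k Mahalanobis nearest-neighbor vote such a rule builds on (the function name, the covariance estimate pooled over all training data, and the toy data are my illustration, not the authors' code):

```python
import numpy as np

def mahalanobis_knn_predict(X_train, y_train, x, k, VI):
    """Majority vote among the k training points nearest to x,
    with nearness measured by Mahalanobis distance (VI = inverse covariance)."""
    diffs = X_train - x
    d2 = np.einsum('ij,jk,ik->i', diffs, VI, diffs)  # squared Mahalanobis distances
    nearest = np.argsort(d2)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(1.5, 1.0, (50, 2))])
y = np.repeat([0, 1], 50)
VI = np.linalg.inv(np.cov(X, rowvar=False))  # here: one covariance pooled over all data
print(mahalanobis_knn_predict(X, y, np.array([1.2, 1.4]), k=5, VI=VI))
```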


Journal ArticleDOI
TL;DR: In this article, the univariate weak convergence theorem of Murota and Takeuchi (1981) is extended for the Mahalanobis transform of the empirical characteristic function, and a maximal deviation statistic is proposed for testing the composite hypothesis of $d$-variate normality.
Abstract: The univariate weak convergence theorem of Murota and Takeuchi (1981) is extended for the Mahalanobis transform of the $d$-variate empirical characteristic function, $d \geq 1$. Then a maximal deviation statistic is proposed for testing the composite hypothesis of $d$-variate normality. Fernique's inequality is used in conjunction with a combination of analytic, numerical analytic, and computer techniques to derive exact upper bounds for the asymptotic percentage points of the statistic. The resulting conservative large sample test is shown to be consistent against every alternative with components having a finite variance. (If $d = 1$ it is consistent against every alternative.) Monte Carlo experiments and the performance of the test on some well-known data sets are also discussed.

70 citations
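For orientation, the objects involved can be written down as follows; the norming and the truncation radius $T$ below are illustrative, not necessarily the paper's exact choices. With the Mahalanobis transform $Y_j = S_n^{-1/2}(X_j - \bar X_n)$, the empirical characteristic function is

\[ \hat\varphi_n(t) = \frac{1}{n} \sum_{j=1}^{n} e^{\,i\,t^\top Y_j}, \qquad t \in \mathbb{R}^d, \]

and a maximal deviation statistic compares it with the standard normal characteristic function,

\[ M_n = \sqrt{n}\, \sup_{\|t\| \le T} \bigl| \hat\varphi_n(t) - e^{-\|t\|^2/2} \bigr|, \]

rejecting $d$-variate normality when $M_n$ exceeds the conservative asymptotic percentage point.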


Journal ArticleDOI
TL;DR: In this article, furthest-neighbor cluster analysis is used to select experimental design points from existing series of candidates when the design variables are too interrelated to be manipulated independently.
Abstract: This article concerns the selection of experimental design points from existing series of candidates when the design variables are too interrelated to be manipulated independently. Designs with an even spread of points are shown to estimate the parameters of an assumed linear or polynomial model reasonably efficiently while providing good tests of lack of fit. Furthest-neighbor cluster analysis can be used to select the points of such a design under either the Euclidean or the Mahalanobis measure of distance. The technique is used to select the base fuels in actual series of experiments to measure the effect of blending a particular alcohol into gasolines. A new blending model parameterization is proposed, which relates the blending octane number of this alcohol to both its concentration and the properties of the base fuel. An analogous generalized least squares model is discussed, which gives a simple expression for the expected mean squares in different error strata.

32 citations
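A rough Python rendering of the selection step (SciPy's complete linkage is the furthest-neighbor criterion; the one-representative-per-cluster rule and all names are my simplification, not the authors' procedure):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

def spread_design(candidates, n_points, use_mahalanobis=True):
    """Pick n_points candidates with an even spread: cut a furthest-neighbor
    (complete-linkage) cluster tree into n_points clusters, keep one member of each."""
    if use_mahalanobis:
        VI = np.linalg.inv(np.cov(candidates, rowvar=False))
        d = pdist(candidates, metric='mahalanobis', VI=VI)
    else:
        d = pdist(candidates)  # Euclidean
    labels = fcluster(linkage(d, method='complete'), t=n_points, criterion='maxclust')
    # representative per cluster: here simply its first member
    return np.array([np.where(labels == c)[0][0] for c in np.unique(labels)])

rng = np.random.default_rng(1)
candidates = rng.normal(size=(40, 3))
print(spread_design(candidates, 8))
```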


Journal ArticleDOI
TL;DR: It is shown that neither Z nor W is uniformly superior to the other; their relative performance depends on the extent of correlation among the training observations and on the size of the separation between the two populations, as measured by the Mahalanobis distance.

12 citations


Journal ArticleDOI
TL;DR: In this article, Monte Carlo estimates have been obtained for the unconditional probability of misclassification incurred by the "estimative" optimum allocation rule in discriminant analysis involving mixtures of binary and continuous variables.
Abstract: Monte Carlo estimates have been obtained for the unconditional probability of misclassification incurred by the “estimative” optimum allocation rule in discriminant analysis involving mixtures of binary and continuous variables. The rule is based on the location model and leads effectively to a different linear discriminant function for each of the multinomial locations defined by the binary variables. A comparison is made between the Monte Carlo estimates and an approximation based on an asymptotic expansion of the distribution of the location “estimative” linear discriminant function derived by Vlachonikolis. Results are presented for various combinations involving equal sample sizes of 50, 100 and 200; two and three binary variables; one, three and five continuous variables; three different settings of location Mahalanobis distances and several choices of location probabilities.

7 citations
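In outline, the location-model rule fits a separate linear discriminant for each binary pattern. The sketch below is a heavily simplified reconstruction (it pools the covariance only within each cell and assumes every cell contains observations from both groups, whereas the location model proper pools across cells; all names are mine):

```python
import numpy as np

def fit_location_lda(B, X, y):
    """One linear discriminant per 'location' (binary pattern in B).
    B: (n, q) 0/1 array; X: (n, p) continuous; y: (n,) labels in {0, 1}.
    Returns {pattern: (w, c)}; at that location, assign x to group 1 if w @ x > c."""
    rules = {}
    for pattern in {tuple(row) for row in B}:
        m = np.all(B == np.array(pattern), axis=1)
        X0, X1 = X[m & (y == 0)], X[m & (y == 1)]  # assumes both are non-empty
        S = np.cov(np.vstack([X0 - X0.mean(0), X1 - X1.mean(0)]), rowvar=False)
        w = np.linalg.solve(np.atleast_2d(S), X1.mean(0) - X0.mean(0))
        c = float(w @ (X0.mean(0) + X1.mean(0)) / 2)
        rules[pattern] = (w, c)
    return rules
```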


Book ChapterDOI
01 Jan 1986
TL;DR: In this article, the authors discuss three multivariate techniques, namely discriminant analysis, cluster analysis and canonical correlation analysis; for each of these three techniques, examples are given in the literature which use PCA as a dimension-reducing technique.
Abstract: Principal component analysis is often used as a dimension-reducing technique within some other type of analysis. For example, Chapter 8 described the use of PCs as regressor variables in a multiple regression analysis. The present chapter discusses three multivariate techniques, namely discriminant analysis, cluster analysis and canonical correlation analysis; for each of these three techniques, examples are given in the literature which use PCA as a dimension-reducing technique.

4 citations
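A compact modern rendering of the PCA-then-discriminant-analysis pattern the chapter discusses, using scikit-learn (the dataset and the choice of two components are arbitrary illustrations):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
# Reduce to two principal components, then discriminate in the reduced space.
model = make_pipeline(PCA(n_components=2), LinearDiscriminantAnalysis())
print(cross_val_score(model, X, y, cv=5).mean())
```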


Journal ArticleDOI
TL;DR: In this paper, the classification performance of forward subset selection procedures designed for the two-group, P-variate normal classification problem was examined in Monte Carlo studies of 54 cases in which the P measurements were statistically independent and provided an optimal probability of correct classification of 90%.
Abstract: The classification performance of forward subset selection procedures designed for use in the two-group, P-variate normal classification problem was examined in Monte Carlo studies of 54 cases where the P measurements were statistically independent and provided an optimal probability of correct classification of 90%. The cases were characterized by differing reference sample sizes, sample size ratios and different rates at which the Mahalanobis distances would increase if the forward selection algorithm were applied to the population parameters. Classification performance appears to depend on these underlying rates, which would be unknown in practice. Therefore, a uniform specification of optimal "significance levels" for the standard F tests cannot be made. A two-stage subset selection procedure which involves determining this rate before applying the F tests is suggested.

4 citations
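The core of such a procedure is a greedy loop that scores each candidate variable by the increase in the sample Mahalanobis distance $D^2$ it buys. The NumPy sketch below shows only that loop; the F-to-enter tests and the paper's two-stage rate estimation are omitted, and all names are mine:

```python
import numpy as np

def mahalanobis_d2(X1, X2, cols):
    """Sample D^2 between two groups, using only the variables in cols."""
    A, B = X1[:, cols], X2[:, cols]
    diff = A.mean(0) - B.mean(0)
    Sp = ((len(A) - 1) * np.cov(A, rowvar=False)
          + (len(B) - 1) * np.cov(B, rowvar=False)) / (len(A) + len(B) - 2)
    return float(diff @ np.linalg.solve(np.atleast_2d(Sp), diff))

def forward_select(X1, X2, n_vars):
    """Greedily add the variable that most increases D^2 at each step."""
    chosen, remaining = [], list(range(X1.shape[1]))
    for _ in range(n_vars):
        best = max(remaining, key=lambda j: mahalanobis_d2(X1, X2, chosen + [j]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```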


Journal ArticleDOI
TL;DR: In this paper, the authors consider the multiple decision problem of subset selection, restricting attention to procedures which control the probability that the best population is selected, and derive necessary and sufficient conditions for both pointwise and uniform (on compact sets) consistency.

2 citations


Proceedings ArticleDOI
D. Friedman
01 Apr 1986
TL;DR: Vowel classification is considered from the viewpoint of cluster separation in a vector space, with Mahalanobis distance as the criterion and the number of significant axes of variation needed to characterize each speaker is found to be on the order of four.
Abstract: Vowel classification is considered from the viewpoint of cluster separation in a vector space, with Mahalanobis distance as the criterion. The number of significant axes of variation needed to characterize each speaker, weighted with respect to cluster separation, is found from actual formant data to be on the order of four, and the potential improvement in separation attributable to structure in the data is estimated at about 3 dB by comparison with results for the same procedure applied to random data.

2 citations
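One plausible reconstruction of the computation, reading "axes of variation weighted with respect to cluster separation" as an eigen-analysis of the between-cluster scatter after whitening by the pooled within-cluster covariance (this reading, and the code, are mine, not the paper's):

```python
import numpy as np

def separation_axes(X, labels):
    """Eigenvalues of the between-cluster scatter in the space whitened by the
    pooled within-cluster covariance; large values mark axes that contribute
    to Mahalanobis cluster separation. Assumes each cluster has >= 2 points
    and the pooled covariance is positive definite."""
    classes = np.unique(labels)
    means = np.array([X[labels == c].mean(0) for c in classes])
    Sw = sum((np.sum(labels == c) - 1) * np.cov(X[labels == c], rowvar=False)
             for c in classes) / (len(X) - len(classes))
    W = np.linalg.inv(np.linalg.cholesky(Sw))      # whitening transform
    Sb = np.cov(means @ W.T, rowvar=False)         # between-cluster scatter, whitened
    return np.sort(np.linalg.eigvalsh(np.atleast_2d(Sb)))[::-1]
```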


Journal ArticleDOI
TL;DR: In this paper, a random sample of eleventh-grade students were asked about certain work values, as were the fathers of a subsample of these students; the question raised was whether the students were closer to their fathers or to their age group with regard to those work values.
Abstract: A random sample of eleventh-grade students were asked about certain work values. For a subsample of these students, their fathers were also asked about the same work values. A question raised in the study was whether eleventh-grade students were closer to their fathers or to their age group with regard to those work values. To answer this question, we suggest a method based on Mahalanobis distances; two aspects of the problem are discussed. First, how the distances are constructed; second, which statistical procedures apply for assessing significant differences among the distances. Formal statistical tests are employed, as well as graphical methods for summarizing the results. Unlike classical methods, the procedure suggested does not require distributional assumptions.

Journal ArticleDOI
TL;DR: The conclusions are that the combination of the principal component analysis method and the method by Mahalanobis's distance is useful in saving manpower for routine judgment of the same data produced in the same line for several measured items.
Abstract: Research and development of LSIs require rapid evaluation of the characteristics of fabricated devices. For this purpose, automatic measurement systems have been developed in various laboratories. Some outlying data unavoidably exist in the data collected by the automatic data acquisition system, and it is necessary to judge the outliers in data processing. In this paper two algorithms to judge outliers are examined and one new algorithm is presented. They are applied to Si wafer inspection data collected automatically. The three algorithms are the outlier judgment method proposed by Grubbs, the judgment method using Mahalanobis's distance, and the method combining the principal component analysis method with the latter. The conclusions are as follows: (1) Grubbs's method is easy to use for one kind of measured item. (2) Mahalanobis's method is useful for several kinds of measured items and is especially effective when they are correlated. (3) The combination of the principal component analysis method and the method by Mahalanobis's distance is useful in saving manpower for routine judgment of the same data produced in the same line for several measured items.
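For concreteness, a small sketch of the second and third methods, i.e. Mahalanobis distance alone and PCA followed by Mahalanobis distance (the chi-square cutoff and all names are my choices, not necessarily the paper's):

```python
import numpy as np
from scipy import stats

def mahalanobis_outliers(X, alpha=0.01, n_components=None):
    """Flag rows whose squared Mahalanobis distance from the mean exceeds the
    chi-square(1 - alpha) quantile; optionally project onto the leading
    n_components principal components first (the combined method)."""
    Z = X - X.mean(0)
    if n_components is not None:
        _, _, Vt = np.linalg.svd(Z, full_matrices=False)
        Z = Z @ Vt[:n_components].T                # scores on leading components
    VI = np.linalg.inv(np.atleast_2d(np.cov(Z, rowvar=False)))
    d2 = np.einsum('ij,jk,ik->i', Z, VI, Z)        # squared Mahalanobis distances
    return d2 > stats.chi2.ppf(1 - alpha, df=Z.shape[1])
```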

Journal ArticleDOI
TL;DR: In this paper, the authors generalize the notion of classification of an observation (sample) into one of the given n populations to the case where some or all of the populations into which the new observation is to be classified may be new but related in a simple way to the given n populations.
Abstract: In this paper, we generalize the notion of classification of an observation (sample) into one of the given n populations to the case where some or all of the populations into which the new observation is to be classified may be new but related in a simple way to the given n populations. The discussion is in the framework of the given set of observations obeying the usual multivariate general linear hypothesis model. The set of populations into which the new observation may be classified could be linear manifolds of the parameter space, their closed subsets, closed convex subsets, a combination of these, or simply t subsets of the parameter space each of which has a finite number of elements. In the last case a likelihood ratio procedure can be obtained easily. Classification procedures given here are based on Mahalanobis distance. A Bonferroni lower bound estimate of the probability of correctly classifying an observation is given for the case when the covariance matrix is known or is estimated from a l...
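The paper treats far more general target sets (linear manifolds, closed convex subsets); the elementary building block, assigning an observation to the nearest of n known population means in Mahalanobis distance, is simply (an illustrative sketch with my own names):

```python
import numpy as np

def classify(x, means, VI):
    """Index of the population whose mean is nearest to x in Mahalanobis
    distance; VI is the inverse of the known common covariance matrix."""
    d2 = [(x - m) @ VI @ (x - m) for m in means]
    return int(np.argmin(d2))

means = [np.zeros(2), np.array([3.0, 1.0])]
print(classify(np.array([2.5, 0.5]), means, np.eye(2)))  # -> 1
```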

Journal ArticleDOI
TL;DR: In this article, the authors describe two BASIC computer programs that calculate Hotelling's T2 statistic either for one sample or for two samples, and output of the program includes the Mahalanobis distance D2, the F ratio associated with T2, and its probability level.
Abstract: This paper describes two BASIC computer programs that calculate Hotelling's T2 statistic either for one sample or for two samples. Output of the program includes the Mahalanobis distance D2, the F ratio associated with T2, and its probability level.
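The reported quantities are related by standard formulas; a Python equivalent of the two-sample case might look as follows (a sketch, not a transcription of the BASIC programs):

```python
import numpy as np
from scipy import stats

def hotelling_two_sample(X1, X2):
    """Two-sample Hotelling T^2: returns Mahalanobis D^2, T^2, F ratio, p-value."""
    n1, n2, p = len(X1), len(X2), X1.shape[1]
    d = X1.mean(0) - X2.mean(0)
    Sp = ((n1 - 1) * np.cov(X1, rowvar=False)
          + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)  # pooled covariance
    D2 = float(d @ np.linalg.solve(np.atleast_2d(Sp), d))
    T2 = n1 * n2 / (n1 + n2) * D2
    F = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * T2
    return D2, T2, F, stats.f.sf(F, p, n1 + n2 - p - 1)
```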