Showing papers on "Principal component analysis" published in 1969


Book ChapterDOI
Joseph B. Kruskal
01 Jan 1969
TL;DR: A major problem in data analysis is to find any structure in a set of multivariate observations; if each observation is represented as a point in multidimensional space, this means finding the structure of a configuration of points in high-dimensional space.
Abstract: A major problem in data analysis is to find any structure in a set of multivariate observations. If each observation is represented as a point in multidimensional space, this means finding the structure of a configuration of points in high-dimensional space. To find linear relationships among the variables, linear regression, principal components, and factor analysis are often used. One very simple and important kind of structure is clustering. Whenever the points cluster together, the knowledge of this is almost sure to be useful to the man who is interested in the data. Some structure-seeking methods depend on the distances between the points. For example, many cluster-seeking techniques look for collections of points whose interpoint distances are small in some sense. Similarly, the method of parametric mapping starts by calculating the matrix of interpoint distances of the original configuration and subsequently works only with that.
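
The matrix of interpoint distances that the abstract above refers to can be illustrated with a minimal sketch. It assumes Euclidean distance, which the abstract does not specify, and the four sample points and the function name `interpoint_distances` are hypothetical.

```python
import math

def interpoint_distances(points):
    """Return the symmetric matrix of Euclidean distances between all points."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])  # Euclidean distance (an assumption)
            dist[i][j] = dist[j][i] = d
    return dist

# Hypothetical data: two tight groups of points in three dimensions.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.0, 5.1, 5.0)]
D = interpoint_distances(points)
```

In this toy configuration the within-group entries of `D` are small and the between-group entries are large, which is the kind of pattern a cluster-seeking technique would look for.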

143 citations


Journal ArticleDOI
TL;DR: Binary-encoded descriptions of 85 named cultures (mostly of soil origin) and of 15 unnamed soil isolates were analyzed by a two-stage principal component procedure.
Abstract: Binary-encoded descriptions of 85 named cultures (mostly of soil origin) and of 15 unnamed soil isolates were analyzed by a two-stage principal component procedure including the condensation of the...

24 citations


Journal ArticleDOI
TL;DR: The specific technique of principal component analysis is developed into a more general component analysis approach that can lead to a useful condensation of a mass of data, a better understanding of the observed individuals as entities rather than collections of isolated measurements, and to the formulation of new hypotheses for subsequent examination.
Abstract: Principal component analysis is a mathematical technique for summarizing a set of related measurements as a set of derived variates, frequently fewer in number, which are definable as independent linear functions of the original measurements. Consideration of their mathematical nature shows that they do not, themselves, necessarily correspond to sensible biological concepts, though they are more amenable to statistical study than the original measurements. Further, by assessing the extent to which they are in accordance with biological hypotheses, or with the results of other, similar, analyses, they can be transformed into other linear functions which are meaningful in the biological sense, or consistent with other results. Thus the specific technique of principal component analysis is developed into a more general component analysis approach. With proper regard for the purpose the analysis is intended to serve and for the mathematical restrictions involved, this approach can lead to a useful condensation of a mass of data, a better understanding of the observed individuals as entities rather than collections of isolated measurements, and to the formulation of new hypotheses for subsequent examination.
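
The idea of a derived variate as a linear function of the original measurements can be sketched as follows. This is only an illustration, not the paper's procedure: it finds the first principal component of a sample covariance matrix by power iteration, and the measurements and function names are hypothetical.

```python
import math

def covariance_matrix(rows):
    """Return column means and the sample covariance matrix of the rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return means, cov

def first_component(cov, iterations=500):
    """Approximate the leading eigenvector of cov by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance of the derived variate: v' * cov * v.
    variance = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    return v, variance

# Hypothetical data: two strongly related measurements on five individuals.
rows = [(2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8), (6.0, 12.1)]
means, cov = covariance_matrix(rows)
weights, variance = first_component(cov)

# Each individual's score is a linear function of its centered measurements.
scores = [sum(w * (x - m) for w, x, m in zip(weights, r, means)) for r in rows]
total_variance = sum(cov[j][j] for j in range(len(cov)))
```

Here `weights` holds the coefficients of the linear function, and `variance / total_variance` gives the share of the total variance that this single derived variate summarizes.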

20 citations


Journal ArticleDOI
TL;DR: This method is applied to data on F3 lines from two rice varieties that were previously used for analysing 'genetic plant types' and the results show that the 'genetic vectors' are comparable to the component vectors obtained from genetic correlations.
Abstract: A method is described for estimating the genetic contributions, to individual variates, of principal components extracted from phenotypic correlations. The contributions are given in terms of regression coefficients of the genetic values of individual variates on the principal components. The vector of regression coefficients is called the 'genetic vector.' The environmental contributions can also be estimated similarly. This method is applied to data on F3 lines from two rice varieties that were previously used for analysing 'genetic plant types.' The results show that the 'genetic vectors' are comparable to the component vectors obtained from genetic correlations.

18 citations


Journal ArticleDOI
TL;DR: Multivariate statistical methods, which investigate the responses of organisms considered as a whole rather than measured characteristics considered one at a time, are introduced through elementary matrix algebra, and two techniques, principal component analysis and canonical analysis, are described in greater detail.
Abstract: Multivariate statistical methods are used increasingly in biological research to investigate the responses of organisms considered as a whole, whereas established statistical methods are usually concerned with measured characteristics considered one at a time. Multivariate techniques are mostly explained in terms of matrix algebra, which is a way of dealing with groups of numbers rather than individual ones. A brief description is given of some elementary results of matrix algebra and a method is presented whereby hypotheses can be generated about interrelations within an organism. Two techniques, principal component analysis and canonical analysis, are described in greater detail. It is emphasized that hypotheses need to be tested even though they have been generated by objective statistical means.

14 citations


Journal ArticleDOI
TL;DR: In this paper, an algorithm is described that serves to fit a linear latent structure model to multicategory data, which has the formal properties of a (generalized) common factor model, in contrast to principal component analysis.
Abstract: An algorithm is described that serves to fit a linear latent structure model to multicategory data. The model has the formal properties of a (generalized) common factor model, in contrast to principal component analysis. An empirical example is given.

14 citations


Journal ArticleDOI
James M. Parks
TL;DR: A FORTRAN IV program performs a distance function principal components analysis to compute orthogonal (uncorrelated) factor measurements for a distance function cluster analysis of locations.
Abstract: Facies maps are constructed from paleontologic and lithologic data to depict major and subtle depositional environment differences across a region during a specified time span. Ratio maps and three-component maps exhibit a lack of discrimination because they cannot incorporate all available data. Factor analysis and cluster analysis techniques can be used to construct truly multivariate facies maps. Earlier attempts at factor or cluster analysis multivariate facies maps had one or more deficiencies: (1) inability to handle a sufficient number of variables and locations; (2) inability to handle mixed-mode data (presence-absence, coded states, integer counts, and continuously variable measurements); (3) inability to take into account redundant or highly correlated variables; and (4) inability to accommodate missing data. A new cluster analysis classification computer program has been written to overcome these deficiencies. The FORTRAN IV program can utilize up to 200 variables on as many as 1,000 stations. It performs a distance function principal components analysis to compute orthogonal (uncorrelated) factor measurements for a distance function cluster analysis of locations. This combination will handle mixed-mode data and will adjust to missing data. From the resulting multivariate classification of paleontologic and lithologic data, a facies map showing the distribution of the various classes was constructed and compared with previously published facies maps. An example using multivariate lithologic data from coded AmStrat sample-description logs from central Montana demonstrates the potentialities of this method.
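
The abstract above does not say which distance function the program uses. As one illustrative possibility only, the sketch below averages per-variable dissimilarities over mixed-mode variables and skips any variable that is missing at either station. The function name, the variable kinds, the ranges, and the two stations are all hypothetical.

```python
def mixed_distance(a, b, kinds, ranges):
    """Average per-variable dissimilarity between two records.

    Variables missing (None) in either record are skipped, so the result
    is based only on the variables observed at both stations.
    """
    total, used = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # adjust to missing data by ignoring this variable
        if kind == "categorical":
            total += 0.0 if x == y else 1.0  # presence-absence or coded state
        else:
            total += abs(x - y) / rng if rng else 0.0  # count or measurement, range-scaled
        used += 1
    return total / used if used else None

# Hypothetical variables: presence-absence, coded state, integer count, measurement.
kinds = ["categorical", "categorical", "numeric", "numeric"]
ranges = [None, None, 10, 2.0]  # assumed observed ranges of the numeric variables

station_a = (1, "sandstone", 4, 0.5)
station_b = (1, "shale", None, 1.5)  # the integer count is missing here

d = mixed_distance(station_a, station_b, kinds, ranges)
```

Scaling each numeric difference by its range keeps every variable's contribution between 0 and 1, so no single variable dominates the average.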

8 citations