scispace - formally typeset
Search or ask a question
Book ChapterDOI

Relations Between Two Sets of Variates

01 Dec 1936-Biometrika (Springer, New York, NY)-Vol. 28, pp 321-377
TL;DR: The concept of correlation and regression may be applied not only to ordinary one-dimensional variates but also to variates of two or more dimensions as discussed by the authors, where the correlation of the horizontal components is ordinarily discussed, whereas the complex consisting of horizontal and vertical deviations may be even more interesting.
Abstract: Concepts of correlation and regression may be applied not only to ordinary one-dimensional variates but also to variates of two or more dimensions. Marksmen side by side firing simultaneous shots at targets, so that the deviations are in part due to independent individual errors and in part to common causes such as wind, provide a familiar introduction to the theory of correlation; but only the correlation of the horizontal components is ordinarily discussed, whereas the complex consisting of horizontal and vertical deviations may be even more interesting. The wind at two places may be compared, using both components of the velocity in each place. A fluctuating vector is thus matched at each moment with another fluctuating vector. The study of individual differences in mental and physical traits calls for a detailed study of the relations between sets of correlated variates. For example the scores on a number of mental tests may be compared with physical measurements on the same persons. The questions then arise of determining the number and nature of the independent relations of mind and body shown by these data to exist, and of extracting from the multiplicity of correlations in the system suitable characterizations of these independent relations. As another example, the inheritance of intelligence in rats might be studied by applying not one but s different mental tests to N mothers and to a daughter of each
Citations
More filters
Journal ArticleDOI
TL;DR: In this paper, the statistical tests used in the analysis of structural equation models with unobservable variables and measurement error are examined, and a drawback of the commonly applied chi square test, in additit...
Abstract: The statistical tests used in the analysis of structural equation models with unobservable variables and measurement error are examined. A drawback of the commonly applied chi square test, in addit...

56,555 citations

Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: An analytical strategy for integrating scRNA-seq data sets based on common sources of variation is introduced, enabling the identification of shared populations across data sets and downstream comparative analysis.
Abstract: Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.

7,741 citations

Journal ArticleDOI
TL;DR: FactoMineR an R package dedicated to multivariate data analysis with the possibility to take into account different types of variables (quantitative or categorical), different kinds of structure on the data, and finally supplementary information (supplementary individuals and variables).
Abstract: In this article, we present FactoMineR an R package dedicated to multivariate data analysis. The main features of this package is the possibility to take into account different types of variables (quantitative or categorical), different types of structure on the data (a partition on the variables, a hierarchy on the variables, a partition on the individuals) and finally supplementary information (supplementary individuals and variables). Moreover, the dimensions issued from the different exploratory data analyses can be automatically described by quantitative and/or categorical variables. Numerous graphics are also available with various options. Finally, a graphical user interface is implemented within the Rcmdr environment in order to propose an user friendly package.

6,472 citations

Book
01 Jan 2004
TL;DR: This book provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.
Abstract: Kernel methods provide a powerful and unified framework for pattern discovery, motivating algorithms that can act on general types of data (e.g. strings, vectors or text) and look for general types of relations (e.g. rankings, classifications, regressions, clusters). The application areas range from neural networks and pattern recognition to machine learning and data mining. This book, developed from lectures and tutorials, fulfils two major roles: firstly it provides practitioners with a large toolkit of algorithms, kernels and solutions ready to use for standard pattern discovery problems in fields such as bioinformatics, text analysis, image analysis. Secondly it provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.

6,050 citations


Cites background from "Relations Between Two Sets of Varia..."

  • ...A few years later in 1936, Hotelling [62] further introduced canonical correlation analysis (CCA), with the article ‘Relations between two sets of variables’....

    [...]