Topic

Independence (probability theory)

About: Independence (probability theory) is a research topic. Over its lifetime, 1,887 publications have appeared on this topic, receiving 55,533 citations. The topic is also known as: independent & statistically independent.
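For reference, the defining property behind the topic can be stated compactly (standard textbook material, added here for context, not taken from this page):

```latex
% Events A and B are independent if and only if
\[ P(A \cap B) = P(A)\,P(B). \]
% Random variables X and Y are independent if and only if, for all x and y,
\[ F_{X,Y}(x,y) = F_X(x)\,F_Y(y). \]
```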


Papers
Journal Article
TL;DR: An efficient algorithm is proposed that allows the ICA of a data matrix to be computed in polynomial time; the method may be seen as an extension of principal component analysis (PCA).

8,522 citations
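The contrast the TL;DR draws between ICA and PCA is easy to see numerically. A minimal sketch, assuming scikit-learn's FastICA as the ICA implementation (a later fixed-point algorithm, not necessarily the method proposed in this paper), with synthetic sources chosen for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)

# Two independent non-Gaussian sources, mixed linearly.
t = np.linspace(0, 8, 2000)
S = np.c_[np.sign(np.sin(3 * t)), rng.laplace(size=t.size)]
A = np.array([[1.0, 0.5], [0.5, 1.0]])  # mixing matrix
X = S @ A.T                             # observed mixtures

# PCA only decorrelates; ICA additionally seeks statistical independence.
pca_est = PCA(n_components=2).fit_transform(X)
ica_est = FastICA(n_components=2, random_state=0).fit_transform(X)

# |correlation| with the true sources: ICA should recover them
# (up to sign, scale, and permutation); PCA generally will not.
for name, est in (("PCA", pca_est), ("ICA", ica_est)):
    corr = np.corrcoef(S.T, est.T)[:2, 2:]
    print(name, np.round(np.abs(corr), 2))
```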

Proceedings Article
07 Aug 2005
TL;DR: The relationship between the predictions made by different learning algorithms and true posterior probabilities is examined, showing that maximum margin methods such as boosted trees and boosted stumps push probability mass away from 0 and 1, yielding a characteristic sigmoid-shaped distortion in the predicted probabilities.
Abstract: We examine the relationship between the predictions made by different learning algorithms and true posterior probabilities. We show that maximum margin methods such as boosted trees and boosted stumps push probability mass away from 0 and 1, yielding a characteristic sigmoid-shaped distortion in the predicted probabilities. Models such as Naive Bayes, which make unrealistic independence assumptions, push probabilities toward 0 and 1. Other models, such as neural nets and bagged trees, do not have these biases and predict well-calibrated probabilities. We experiment with two ways of correcting the biased probabilities predicted by some learning methods: Platt Scaling and Isotonic Regression. We qualitatively examine what kinds of distortions these calibration methods are suitable for and quantitatively examine how much data they need to be effective. The empirical results show that, after calibration, boosted trees, random forests, and SVMs predict the best probabilities.

1,320 citations
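Both calibration methods named in the abstract are simple to sketch. A minimal illustration, assuming scikit-learn, a synthetic dataset, and a boosted-tree model as a stand-in for the paper's learners (all illustrative choices, not the paper's experimental setup):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the paper uses real benchmark datasets.
X, y = make_classification(n_samples=4000, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)

# Boosted trees tend to push scores away from 0 and 1 (sigmoid distortion).
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
scores = model.decision_function(X_cal)

# Platt scaling: a one-dimensional logistic fit from held-out scores to labels.
platt = LogisticRegression().fit(scores.reshape(-1, 1), y_cal)

# Isotonic regression: a monotone step-function fit from scores to labels.
iso = IsotonicRegression(out_of_bounds="clip").fit(scores, y_cal)

new_scores = model.decision_function(X_cal[:5])
print("Platt   :", platt.predict_proba(new_scores.reshape(-1, 1))[:, 1])
print("isotonic:", iso.predict(new_scores))
```

As the abstract suggests, the logistic (Platt) map suits sigmoid-shaped distortions such as those from boosting, while isotonic regression fits any monotone distortion but needs more calibration data to be effective.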

Journal Article
TL;DR: This article considers high-order measures of independence for the independent component analysis problem, discusses the class of Jacobi algorithms for their optimization, and compares the proposed approaches with gradient-based techniques both from the algorithmic point of view and on a set of biomedical data.
Abstract: This article considers high-order measures of independence for the independent component analysis problem and discusses the class of Jacobi algorithms for their optimization. Several implementations are discussed. We compare the proposed approaches with gradient-based techniques from the algorithmic point of view and also on a set of biomedical data.

1,271 citations
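As a toy version of the higher-order independence measures such contrasts are built from, one can estimate a fourth-order cross-cumulant, which vanishes for independent zero-mean signals. The numpy sketch below is illustrative only; it is not the paper's Jacobi algorithm:

```python
import numpy as np

def cross_cumulant(x, y):
    """Fourth-order cross-cumulant cum(x, x, y, y) of zero-mean signals.

    For independent signals it is (in expectation) zero, so its magnitude
    can serve as a simple higher-order dependence measure.
    """
    x = x - x.mean()
    y = y - y.mean()
    return (np.mean(x**2 * y**2)
            - np.mean(x**2) * np.mean(y**2)
            - 2 * np.mean(x * y) ** 2)

rng = np.random.default_rng(0)
a = rng.laplace(size=100_000)
b = rng.laplace(size=100_000)            # independent of a
c = a + 0.5 * rng.laplace(size=100_000)  # dependent on a

print(cross_cumulant(a, b))  # approximately 0
print(cross_cumulant(a, c))  # clearly nonzero
```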

Book
05 Apr 1976
TL;DR: This book derives bounds on the error of the normal approximation and asymptotic expansions for sums of independent random vectors, treating both nonlattice and lattice distributions, with supporting chapters on weak convergence and uniformity classes, Fourier transforms and expansions of characteristic functions, and an application of Stein's method.
Abstract (table of contents): Preface to the Classics Edition. Preface.
1. Weak convergence of probability measures and uniformity classes
2. Fourier transforms and expansions of characteristic functions
3. Bounds for errors of normal approximation
4. Asymptotic expansions: nonlattice distributions
5. Asymptotic expansions: lattice distributions
6. Two recent improvements
7. An application of Stein's method
Appendix A.1. Random vectors and independence
Appendix A.2. Functions of bounded variation and distribution functions
Appendix A.3. Absolutely continuous, singular, and discrete probability measures
Appendix A.4. The Euler-MacLaurin sum formula for functions of several variables
References. Index.

1,125 citations
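A quick Monte Carlo makes the book's subject concrete: the error of the normal approximation to a standardized sum shrinks at roughly the n^(-1/2) Berry-Esseen rate. The sketch below (an illustration using numpy and scipy, not material from the book) estimates the maximal CDF gap for sums of exponential variables:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def sup_gap(n, reps=50_000):
    """Approximate Kolmogorov distance between the law of a standardized
    sum of n exponential(1) variables and the standard normal."""
    sums = rng.exponential(size=(reps, n)).sum(axis=1)
    z = np.sort((sums - n) / np.sqrt(n))   # the sum has mean n, variance n
    ecdf = np.arange(1, reps + 1) / reps   # empirical CDF at sorted points
    return np.abs(ecdf - norm.cdf(z)).max()

for n in (4, 16, 64, 256):
    print(n, round(sup_gap(n), 4))  # gap shrinks roughly like 1/sqrt(n)
```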

Journal Article
TL;DR: In this article, the authors show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and obtain explicit rates.
Abstract: This paper considers regularizing a covariance matrix of p variables estimated from n observations by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n → 0, and we obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data. 1. Introduction. Estimation of covariance matrices is important in a number of areas of statistical analysis, including dimension reduction by principal component analysis (PCA), classification by linear or quadratic discriminant analysis (LDA and QDA), establishing independence and conditional independence relations in the context of graphical models, and setting confidence intervals on linear functions of the means of the components. In recent years, many application areas where these tools are used have been dealing with very high-dimensional datasets, and sample sizes can be very small relative to dimension. Examples include genetic data, brain imaging, spectroscopic imaging, climate data and many others. It is well known by now that the empirical covariance matrix for samples of size n from a p-variate Gaussian distribution, N_p(μ, Σ_p), is not a good estimator of the population covariance if p is large. Many results in random matrix theory illustrate this, from the classical Marčenko–Pastur law [29] to the more recent work of Johnstone and his students on the theory of the largest eigenvalues [12, 23, 30] and associated eigenvectors [24]. However, with the exception of a method for estimating the covariance spectrum [11], these probabilistic results do not offer alternatives to the sample covariance matrix. Alternative estimators for large covariance matrices have therefore attracted a lot of attention recently. Two broad classes of covariance estimators have emerged: those that rely on a natural ordering among variables and assume that variables far apart in the ordering are only weakly correlated, and those invariant under permutations of the variables.

1,052 citations
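The hard-thresholding estimator studied in this paper is short to sketch in numpy. A minimal version, with the threshold set at the paper's rate sqrt(log p / n) times a tuning constant c (the paper selects the threshold by cross-validation; the constant, the tridiagonal ground truth, and the sizes below are illustrative assumptions):

```python
import numpy as np

def threshold_cov(X, c=1.0):
    """Hard-threshold the sample covariance of an n-by-p data matrix X.

    Entries smaller in magnitude than c * sqrt(log(p) / n) are zeroed,
    matching the rate in the paper; c is a tuning constant.
    """
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    t = c * np.sqrt(np.log(p) / n)
    return np.where(np.abs(S) >= t, S, 0.0)

rng = np.random.default_rng(0)
p, n = 50, 100
# A sparse (tridiagonal) population covariance as ground truth.
Sigma = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

err_raw = np.linalg.norm(np.cov(X, rowvar=False) - Sigma, 2)
err_thr = np.linalg.norm(threshold_cov(X) - Sigma, 2)
print(f"operator-norm error: sample {err_raw:.2f}, thresholded {err_thr:.2f}")
```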


Network Information
Related Topics (5)
Matrix (mathematics): 105.5K papers, 1.9M citations (86% related)
Estimator: 97.3K papers, 2.6M citations (86% related)
Monte Carlo method: 95.9K papers, 2.1M citations (84% related)
Nonlinear system: 208.1K papers, 4M citations (84% related)
Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    4
2023    2,340
2022    5,496
2021    141
2020    103
2019    110