Topic

# Dimensionality reduction

About: Dimensionality reduction is a(n) research topic. Over the lifetime, 21987 publication(s) have been published within this topic receiving 579272 citation(s). The topic is also known as: dimension reduction & dimensional reduction.

##### Papers published on a yearly basis

##### Papers

More filters

•

TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

Abstract: We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large datasets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of datasets and compare it with many other non-parametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the datasets.

22,120 citations

•

01 May 1986Abstract: Introduction * Properties of Population Principal Components * Properties of Sample Principal Components * Interpreting Principal Components: Examples * Graphical Representation of Data Using Principal Components * Choosing a Subset of Principal Components or Variables * Principal Component Analysis and Factor Analysis * Principal Components in Regression Analysis * Principal Components Used with Other Multivariate Techniques * Outlier Detection, Influential Observations and Robust Estimation * Rotation and Interpretation of Principal Components * Principal Component Analysis for Time Series and Other Non-Independent Data * Principal Component Analysis for Special Types of Data * Generalizations and Adaptations of Principal Component Analysis

17,426 citations

••

15 Oct 2005Abstract: When large multivariate datasets are analyzed, it is often desirable to reduce their dimensionality. Principal component analysis is one technique for doing this. It replaces the p original variables by a smaller number, q, of derived variables, the principal components, which are linear combinations of the original variables. Often, it is possible to retain most of the variability in the original variables with q very much smaller than p. Despite its apparent simplicity, principal component analysis has a number of subtleties, and it has many uses and extensions. A number of choices associated with the technique are briefly discussed, namely, covariance or correlation, how many components, and different normalization constraints, as well as confusion with factor analysis. Various uses and extensions are outlined.
Keywords:
dimension reduction;
factor analysis;
multivariate analysis;
variance maximization

14,631 citations

••

Abstract: High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.

14,206 citations

••

TL;DR: Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs that learns the global structure of nonlinear manifolds.

Abstract: Many areas of science depend on exploratory data analysis and visualization. The need to analyze large amounts of multivariate data raises the fundamental problem of dimensionality reduction: how to discover compact representations of high-dimensional data. Here, we introduce locally linear embedding (LLE), an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs. Unlike clustering methods for local dimensionality reduction, LLE maps its inputs into a single global coordinate system of lower dimensionality, and its optimizations do not involve local minima. By exploiting the local symmetries of linear reconstructions, LLE is able to learn the global structure of nonlinear manifolds, such as those generated by images of faces or documents of text.

13,822 citations