Laplacian Eigenmaps for dimensionality reduction and data representation
Summary
1 Introduction
- In many areas of artificial intelligence, information retrieval, and data mining, one is often confronted with intrinsically low-dimensional data lying in a very high-dimensional space.
- The general problem of dimensionality reduction has a long history.
- Classical approaches include principal components analysis (PCA) and multidimensional scaling.
- Most of these methods do not explicitly consider the structure of the manifold on which the data may possibly reside.
- Thus, the embedding maps for the data approximate the eigenmaps of the Laplace Beltrami operator, which are maps intrinsically defined on the entire manifold.
2 The Algorithm
- The embedding map is now provided by computing the eigenvectors of the graph Laplacian.
- Step 1 (constructing the adjacency graph).
- Advantages of the n-nearest-neighbors variation: the parameter is easier to choose, and it does not tend to lead to disconnected graphs.
- Here as well, the authors have two variations for weighting the edges: (a) the heat kernel (parameter t ∈ R), with Wij = exp(−‖xi − xj‖²/t) if nodes i and j are connected and Wij = 0 otherwise, and (b) the simple-minded choice Wij = 1 for connected nodes, which avoids the need to choose t.
- Step 3 (eigenmaps): compute eigenvalues and eigenvectors for the generalized eigenvector problem Lf = λDf, where D is the diagonal weight matrix with Dii = ∑j Wji and L = D − W is the Laplacian matrix (see the sketch below).
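The three steps can be condensed into a short sketch. The following Python code is illustrative only: the function name, the symmetrized n-nearest-neighbors graph, and parameters such as n_neighbors, t, and n_components are choices made here for concreteness (and the graph is assumed to be connected), not prescriptions from the paper.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def laplacian_eigenmaps(X, n_neighbors=10, t=1.0, n_components=2):
    """Embed the rows of X (n_samples x n_features) into n_components dimensions."""
    n = X.shape[0]
    sq_dists = cdist(X, X, metric="sqeuclidean")

    # Step 1: adjacency graph from n nearest neighbors, symmetrized.
    order = np.argsort(sq_dists, axis=1)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        adj[i, order[i, 1:n_neighbors + 1]] = True   # column 0 is the point itself
    adj = adj | adj.T

    # Step 2: heat-kernel weights W_ij = exp(-||x_i - x_j||^2 / t) on graph edges.
    W = np.where(adj, np.exp(-sq_dists / t), 0.0)

    # Step 3: generalized eigenproblem L f = lambda D f with L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W
    eigvals, eigvecs = eigh(L, D)

    # Discard the constant eigenvector (eigenvalue 0); the next ones give the embedding.
    return eigvecs[:, 1:n_components + 1]
```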
3 Justification
- Let us first show that the embedding provided by the Laplacian eigenmap algorithm preserves local information optimally in a certain sense.
- The following section is based on standard spectral graph theory.
- It follows from equation 3.1 that L is a positive semidefinite matrix, and the vector y that minimizes the objective function is given by the minimum eigenvalue solution to the generalized eigenvalue problem Ly = λDy (the identity behind this claim is spelled out after this list).
- For the one-dimensional embedding problem, the constraint y⊤Dy = 1 prevents collapse onto a point.
- This observation leads to several possible approximation schemes for the manifold Laplacian.
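The optimality and positive-semidefiniteness claims rest on the identity behind equation 3.1, which can be written out using the symmetry of W and Dii = ∑j Wij:

```latex
\frac{1}{2}\sum_{i,j}\bigl(y_i - y_j\bigr)^{2} W_{ij}
  = \sum_{i} y_i^{2} D_{ii} - \sum_{i,j} y_i y_j W_{ij}
  = \mathbf{y}^{\top}(D - W)\,\mathbf{y}
  = \mathbf{y}^{\top} L\, \mathbf{y}.
```

Since the left-hand side is a sum of nonnegative terms, y⊤Ly ≥ 0 for every y, which is exactly the positive semidefiniteness of L used above.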
4 Connections to Spectral Clustering
- The approach to dimensionality reduction considered in this letter uses maps provided by the eigenvectors of the graph Laplacian and eigenfunctions of Laplace Beltrami operator on the manifold.
- The approach considered there uses a graph that is globally connected with exponentially decaying weights.
- The weight Wij associated with the edge eij is the similarity between vi and vj.
- The authors assume that the matrix of pairwise similarities is symmetric and the corresponding undirected graph is connected.
- The central observation to be made here is that the process of dimensionality reduction that preserves locality yields the same solution as clustering.
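As an illustration of that observation, the same generalized eigenvector that gives the one-dimensional Laplacian eigenmap can be thresholded to bipartition the graph. The sign-threshold rule below is the standard spectral-clustering heuristic, shown here only as a hedged sketch rather than a procedure taken verbatim from the paper.

```python
import numpy as np
from scipy.linalg import eigh

def spectral_bipartition(W):
    """Split a connected weighted graph into two clusters using the first
    nontrivial generalized eigenvector of L y = lambda D y."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    _, eigvecs = eigh(L, D)
    fiedler = eigvecs[:, 1]        # this vector is also the 1-D Laplacian eigenmap
    return fiedler >= 0            # thresholding the embedding yields the two clusters
```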
5 Analysis of Locally Linear Embedding Algorithm
- The authors provide a brief analysis of the LLE algorithm recently proposed by Roweis and Saul (2000) and show its connection to the Laplacian.
- Step 1 (discovering the adjacency information).
- Let Wij be such that ∑j Wij xij equals the orthogonal projection of xi onto the affine linear span of the xij's (a sketch of computing such weights follows this list).
- The authors develop this argument over several steps: Step 1: Let us fix a data point xi.
- Since the difference of two points can be regarded as a vector with the origin at the second point, the authors see that vj’s are vectors in the tangent plane originating at o.
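A minimal sketch of how such reconstruction weights are typically computed for a single point, following the standard Roweis-Saul construction; the regularization term reg is an implementation convenience added here, not part of the analysis.

```python
import numpy as np

def lle_weights(xi, neighbors, reg=1e-3):
    """Weights w with sum(w) = 1 so that w @ neighbors best reconstructs xi,
    i.e., the projection of xi onto the affine span of its neighbors."""
    Z = neighbors - xi                            # center the neighbors at x_i
    G = Z @ Z.T                                   # local Gram matrix
    k = len(neighbors)
    G = G + reg * np.trace(G) / k * np.eye(k)     # regularize in case G is singular
    w = np.linalg.solve(G, np.ones(k))
    return w / w.sum()                            # enforce the affine (sum-to-one) constraint
```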
6 Examples
- The authors now briefly consider several possible applications of the algorithmic framework developed in this letter.
- The authors choose 1000 images (500 containing vertical bars and 500 containing horizontal bars) at random.
- Each word is represented as a vector in a 600-dimensional space using information about the frequency of its left and right neighbors (computed from the corpus); a construction sketch follows this list.
- Points mapped to the same region in the representation space share similar phonetic features, though points with the same label may originate from different occurrences of the same phoneme.
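A hedged sketch of how such a representation could be built from a token stream. The choice of the 300 most frequent words as left and right context (giving 600 dimensions) and the function name context_vectors are assumptions made here for illustration, not details taken from the paper.

```python
from collections import Counter
import numpy as np

def context_vectors(tokens, vocab, n_context=300):
    """Represent each word in vocab by frequency counts of its left and right
    neighbors, restricted to the n_context most frequent context words."""
    context = [w for w, _ in Counter(tokens).most_common(n_context)]
    idx = {w: i for i, w in enumerate(context)}
    vecs = {w: np.zeros(2 * n_context) for w in vocab}
    for left, right in zip(tokens, tokens[1:]):
        if right in vecs and left in idx:
            vecs[right][idx[left]] += 1.0                # count of left neighbor
        if left in vecs and right in idx:
            vecs[left][n_context + idx[right]] += 1.0    # count of right neighbor
    return vecs
```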
7 Conclusions
- The authors introduced a coherent framework for dimensionality reduction for the case where data reside on a low-dimensional manifold embedded in a higher-dimensional space.
- They do not in general provide an isometric embedding.
- It is unclear how to estimate reliably even such a simple invariant as the intrinsic dimensionality of the manifold.
- There are further issues pertaining to their framework that need to be sorted out.
- First, the authors have implicitly assumed a uniform probability distribution on the manifold according to which the data points have been sampled.
Frequently Asked Questions (13)
Q2. What future works have the authors mentioned in the paper "Laplacian eigenmaps for dimensionality reduction and data representation" ?
There are further issues pertaining to their framework that need to be sorted out.
Q3. What is the Riemannian structure on the manifold?
If the manifold is embedded in Rl, the Riemannian structure (metric tensor) on the manifold is induced by the standard Riemannian structure on Rl.
Q4. What is the dimensionality of the space of all images of the same object?
The intrinsic dimensionality of the space of all images of the same object is the number of degrees of freedom of the camera.
Q5. What is the simplest way to approximate the distances on the manifold?
One approach to nonlinear dimensionality reduction as exemplified by Tenenbaum et al. attempts to approximate all geodesic distances on the manifold faithfully.
Q6. What is the eigenfunction of the Laplace Beltrami operator?
Let the eigenvalues (in increasing order) be 0 = λ0 ≤ λ1 ≤ λ2 ≤ …, and let fi be the eigenfunction corresponding to eigenvalue λi.
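In the continuous setting, the paper embeds the manifold using these eigenfunctions; in the paper's notation, the optimal m-dimensional embedding is

```latex
x \;\longmapsto\; \bigl(f_1(x),\, f_2(x),\, \ldots,\, f_m(x)\bigr),
```

where f0, the constant eigenfunction with eigenvalue 0, is omitted.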
Q7. What is the simplest way to construct a weighted graph?
Given k points x1, . . . , xk in Rl, the authors construct a weighted graph with k nodes, one for each point, and a set of edges connecting neighboring points.
Q8. What is the representation map generated by the algorithm?
The representation map generated by the algorithm may be viewed as a discrete approximation to a continuous map that naturally arises from the geometry of the manifold.
Q9. What is the implication of the previous remark?
Since much of the discussion of Seung and Lee (2000), Roweis and Saul (2000), and Tenenbaum et al. (2000) is motivated by the role that nonlinear dimensionality reduction may play in human perception and learning, it is worthwhile to consider the implication of the previous remark in this context.
Q10. What constraint prevents collapse onto a subspace of dimension less than m − 1?
For the m-dimensional embedding problem, the constraint Y⊤DY = I prevents collapse onto a subspace of dimension less than m − 1 (m if, as in the one-dimensional case, the authors require orthogonality to the constant vector).
Q11. What is the way to estimate the isometric embedding of a manifold?
The celebrated Nash embedding theorem (Nash, 1954) guarantees that an n-dimensional manifold admits an isometric C1 embedding into a (2n + 1)-dimensional Euclidean space.
Q12. How is the word represented in the Brown corpus?
Each word is represented as a vector in a 600-dimensional space using information about the frequency of its left and right neighbors (computed from the corpus).
Q13. What is the eigenvalue solution to the generalized eigenvalue problem?
It follows from equation 3.1 that L is a positive semidefinite matrix, and the vector y that minimizes the objective function is given by the minimum eigenvalue solution to the generalized eigenvalue problem Ly = λDy.