4 Contributions

The objective of the LE algorithm is to find a transform from the original n-dimensional space to an m-dimensional space, with m << n, in which local distances are preserved as much as possible.
Given k points x1,...,xk in \(R^l\), we construct a weighted graph G = (V, E) with k nodes, one for each point, and a set of edges connecting neighboring points. The embedding map is then provided by computing the eigenvectors of the graph Laplacian. The algorithmic procedure is formally stated below; a short code sketch of the whole procedure follows the steps.
- Build the adjacency graph.
Construct the adjacency graph G by connecting neighboring nodes i and j. Neighbors can be selected in one of two ways:
– ε-neighborhoods: nodes i and j are connected if
  \(\left\|x_i-x_j\right\|^2\ <\ ε\)
Advantages: geometrically motivated. Disadvantages: can lead to a disconnected graph.
– n nearest neighbors: nodes i and j are connected if i is among the n nearest neighbors of j or j is among the n nearest neighbors of i.
Advantages: easier to choose, tends not to produce a disconnected graph. Disadvantages: less geometrically intuitive.
- Choose the weights for edges in the graph.
Here are two options:
- Simple-minded: \(W_{ij}=1\) if nodes i and j are connected, 0 otherwise
- Heat kernel: \(W_{ij}=e^{-\frac{\left\|x_i-x_j\right\|^2}{t}}\) if i and j are connected, 0 otherwise
With \(t=\infty\) we recover the simple-minded approach.
- Eigen‐decomposition of the graph Laplacian.
Assume the graph G constructed above is connected. Compute the eigenvalues and eigenvectors of the generalized eigenvector problem
                                \(Lf=λDf\)
where D is the diagonal weight matrix whose entries are the row (equivalently, column) sums of W:
                            \(D_{ii}\ =\ \Sigma_j\ w_{ij}\)
                         \(L\ =\ D-W\)
L is the Laplacian matrix. The Laplacian is a symmetric, positive semidefinite matrix that can be thought of as an operator on functions defined on the vertices of G. Let f0, f1, ..., fk-1 be the solutions of the generalized eigenvector problem, ordered according to their eigenvalues:
                       \(Lf_0\ =\ λ_0Df_0\)
                       \(Lf_1\ =\ λ_1Df_1\)
                             \(\vdots\)
                       \(Lf_{k-1}\ =\ λ_{k-1}Df_{k-1}\)
                       \(0\ =\ λ_0\ \le\ λ_1\ \le\ ...\ \le\ λ_{k-1}\)
f0 is the eigenvector corresponding to eigenvalue 0 and is left out of the embedding; the next m eigenvectors are used to represent the data point xi in the m-dimensional space:
                    \(x_i\to\left(f_1(i),...,f_m(i)\right)\)
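To make the steps above concrete, the following is a minimal NumPy/SciPy sketch of the procedure (a sketch under stated assumptions, not the authors' implementation). It assumes an n-nearest-neighbor graph, heat-kernel weights with parameter t, distinct data points, and uses scipy.linalg.eigh to solve the generalized eigenproblem \(Lf=λDf\).

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, n_neighbors=10, t=1.0, m=2):
    """Sketch of Laplacian Eigenmaps: X is (k, l) data, returns a (k, m) embedding."""
    k = X.shape[0]

    # Step 1: adjacency graph from n nearest neighbors (symmetrized).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    adj = np.zeros((k, k), dtype=bool)
    for i in range(k):
        nn = np.argsort(sq_dists[i])[1:n_neighbors + 1]  # skip the point itself
        adj[i, nn] = True
    adj = adj | adj.T  # connect i and j if either is a neighbor of the other

    # Step 2: heat-kernel weights W_ij = exp(-||x_i - x_j||^2 / t) on edges.
    W = np.where(adj, np.exp(-sq_dists / t), 0.0)

    # Step 3: generalized eigenproblem L f = lambda D f, with L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W
    eigvals, eigvecs = eigh(L, D)  # eigenvalues in ascending order

    # Leave out f_0 (eigenvalue 0) and use the next m eigenvectors as coordinates.
    return eigvecs[:, 1:m + 1]
```

Passing t = np.inf reproduces the simple-minded 0/1 weights on the edges, as noted above.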
To see why this yields a good representation, consider first the problem of mapping the graph to a line: let \(y_i\) be the image of \(x_i\) and let \(y=(y_1,y_2,...,y_k)^T\). For the map to be good, connected points should stay as close together as possible, so y should minimize the following objective function:
                    \(\Sigma_{ij}\ \left(y_i-y_j\right)^2\ w_{ij}\)
We choose \(w_{ij}\) so that if \(x_i\) and \(x_j\) are close then \(y_i\) and \(y_j\) are close as well. One can show that:
                    \(\frac{1}{2}\Sigma_{ij}\ \left(y_i-y_j\right)^2\ w_{ij}\ =\ y^T\ Ly\)
where, as before, \(L\ =\ D-W\). Indeed,
                \(\Sigma_{ij}\ \left(y_i-y_j\right)^2\ w_{ij}\ =\ \Sigma_{ij}\ \left(y_i^2+y_j^2-2y_iy_j\right)\ w_{ij}\ =\ \Sigma_iD_{ii}y_i^2+\Sigma_jD_{jj}y_j^2-2\Sigma_{ij}y_iy_jw_{ij}\ =\ 2y^TLy\)
Where \(D_{ii}\ =\ \Sigma_j\ w_{ij}\)
Note that this calculation also shows that L is positive semidefinite. Therefore, the minimization problem reduces to finding
                                                \(\min_y\ y^TLy\ \ \ \text{with}\ \ y^TDy=1\)
The constraint \(y^TDy=1\) removes an arbitrary scaling factor in the embedding. The matrix D provides a natural measure on the vertices of the graph: the bigger the value \(D_{ii}\) (corresponding to the ith vertex), the more “important” that vertex is. Since L is positive semidefinite, the vector y that minimizes the objective function is given by the minimum eigenvalue solution of the generalized eigenvalue problem (excluding the trivial constant eigenvector f0 with eigenvalue 0):
                                \(Lf=λDf\)
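As a quick numerical sanity check of the identity \(\Sigma_{ij}(y_i-y_j)^2 w_{ij} = 2y^TLy\) used above (a sketch with an arbitrary symmetric weight matrix, not data from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary symmetric weight matrix on 5 vertices (hypothetical example data).
W = rng.random((5, 5))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)

D = np.diag(W.sum(axis=1))   # D_ii = sum_j W_ij
L = D - W                    # graph Laplacian
y = rng.random(5)            # arbitrary map of the 5 vertices to a line

lhs = sum((y[i] - y[j]) ** 2 * W[i, j] for i in range(5) for j in range(5))
rhs = 2 * y @ L @ y
print(np.isclose(lhs, rhs))  # True: sum_ij (y_i - y_j)^2 W_ij = 2 y^T L y
```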
Now consider the more general problem of embedding the graph into m-dimensional Euclidean space. The embedding is given by the k×m matrix \(Y\ =\ \left[y_1,y_2,...,y_m\right]\), where the ith row provides the embedding coordinates of the ith vertex. Similarly, we need to minimize:
            \(\frac{1}{2}\Sigma_{ij}\ \left\|y^{\left(i\right)}-y^{\left(j\right)}\right\|^2\ W_{ij}\ =\ tr\left(Y^T\ LY\right)\)
Where " tr"  is the trace of the matrix and \(y^{\left(i\right)}\) is the m-dimensional representation of the ith vertex. This reduces to finding
                            \(\min_{Y}\ tr\left(Y^T\ LY\right)\ \ \text{with}\ \ Y^TDY=I\)
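Continuing the same sketch (with an arbitrary weight matrix standing in for real data), the constrained problem is solved by the bottom generalized eigenvectors of \(Lf=λDf\); scipy.linalg.eigh returns them D-orthonormal, so the constraint \(Y^TDY=I\) holds and \(tr(Y^TLY)\) equals the sum of the retained eigenvalues:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)

# Hypothetical connected weighted graph on 6 vertices.
W = rng.random((6, 6))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)

D = np.diag(W.sum(axis=1))
L = D - W
m = 2

eigvals, F = eigh(L, D)   # generalized eigenvectors, normalized so F^T D F = I
Y = F[:, 1:m + 1]         # drop f_0 (eigenvalue 0), keep the next m columns

print(np.allclose(Y.T @ D @ Y, np.eye(m)))        # constraint Y^T D Y = I
print(np.isclose(np.trace(Y.T @ L @ Y),
                 eigvals[1:m + 1].sum()))         # tr(Y^T L Y) = lambda_1 + ... + lambda_m
```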
We mention here that the graph Laplacian used above is a discrete operator. In the limit where the graph samples the manifold more and more densely, with edge lengths tending to zero, a continuous counterpart is needed; the authors use the Laplace-Beltrami operator to deal with continuous manifolds.

5 Experiments

The data set of 2000 points chosen at random from the swiss roll is shown in Figure 3. The swiss roll is a flat two-dimensional submanifold of \(R^3\). Two-dimensional representations of the swiss roll for different values of the parameters N and t are shown in Figure 4. Note that t = ∞ corresponds to the case when the weights are set to 1. Unlike Isomap, the algorithm does not attempt to isometrically embed the swiss roll into \(R^2\). However, it manages to unroll the swiss roll, thereby preserving the locality, although not the distances, on the manifold. We observe that for small values of N, we obtain virtually identical representations for different t's. However, when N becomes bigger, smaller values of t seemingly lead to better representations.
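A comparable experiment can be sketched with scikit-learn (an assumed, off-the-shelf substitute; the figures in the paper were not produced this way). sklearn.datasets.make_swiss_roll generates the data and sklearn.manifold.SpectralEmbedding computes a Laplacian-eigenmaps-style embedding from an N-nearest-neighbor graph; its nearest-neighbor affinity plays the role of the simple-minded 0/1 weights, so the heat-kernel parameter t is not exposed here.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import SpectralEmbedding

# 2000 points chosen at random from the swiss roll, as in the experiment above.
X, _ = make_swiss_roll(n_samples=2000, random_state=0)

# Two-dimensional Laplacian-eigenmaps-style embedding with N = 10 nearest neighbors.
embedding = SpectralEmbedding(n_components=2, n_neighbors=10,
                              affinity="nearest_neighbors", random_state=0)
Y = embedding.fit_transform(X)
print(Y.shape)  # (2000, 2)
```

Varying n_neighbors changes the graph connectivity and hence the quality of the unrolling, mirroring the role of N in Figure 4.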