
Showing papers on "Singular value decomposition" published in 2009


Posted Content
TL;DR: In this article, a modular framework for constructing randomized algorithms that compute partial matrix decompositions is presented. These methods use random sampling to identify a subspace that captures most of the action of a matrix; the input matrix is then compressed to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization.
Abstract: Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed---either explicitly or implicitly---to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis.
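The two-stage structure described above maps almost directly onto a few lines of NumPy. The sketch below is a minimal illustration of the prototype scheme (Gaussian test matrix, orthonormalization of the sample matrix, exact SVD of the compressed matrix); the oversampling amount p=10 and the plain Gaussian sampling are illustrative choices, not the only options the framework covers.

```python
import numpy as np

def randomized_svd(A, k, p=10, seed=None):
    """Rank-k approximation via random range finding, then a small exact SVD."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Stage A: identify a subspace capturing most of the action of A.
    Omega = rng.standard_normal((n, k + p))        # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ Omega)                 # orthonormal basis for range(A @ Omega)
    # Stage B: compress A to that subspace and factor deterministically.
    B = Q.T @ A                                    # (k+p) x n reduced matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

# Usage: approximate a matrix that is numerically low rank.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 20)) @ rng.standard_normal((20, 300))
U, s, Vt = randomized_svd(A, k=20, seed=1)
print(np.linalg.norm(A - U @ np.diag(s) @ Vt) / np.linalg.norm(A))
```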

2,356 citations


Journal ArticleDOI
TL;DR: A penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix, and establishes connections between the SCoTLASS method for sparse principal component analysis and the method of Zou and others (2006).
Abstract: We present a penalized matrix decomposition (PMD), a new framework for computing a rank-$K$ approximation for a matrix. We approximate the matrix $X$ as $\hat{X} = \sum_{k=1}^{K} d_k u_k v_k^T$, where the $d_k$, $u_k$, and $v_k$ are chosen subject to penalties on $u_k$ and $v_k$.
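As a rough illustration of what a penalized rank-1 fit looks like in practice, here is a minimal NumPy sketch that alternates sparse (soft-thresholded) updates of u and v. The fixed threshold levels lam_u and lam_v are a simplification: the published algorithm instead tunes the threshold so that explicit l1 constraints on u and v are met, and subsequent factors are obtained by deflating X.

```python
import numpy as np

def soft(a, t):
    """Elementwise soft-thresholding."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def pmd_rank1(X, lam_u=0.1, lam_v=0.1, n_iter=100, seed=0):
    """Sketch of a rank-1 penalized fit: alternate sparse updates of u and v."""
    u = np.random.default_rng(seed).standard_normal(X.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        v = soft(X.T @ u, lam_v)
        v /= max(np.linalg.norm(v), 1e-12)
        u = soft(X @ v, lam_u)
        u /= max(np.linalg.norm(u), 1e-12)
    d = u @ X @ v                      # scale of the rank-1 term d * u * v^T
    return d, u, v
```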

1,540 citations


Journal ArticleDOI
TL;DR: An algorithm is presented that preferentially chooses columns and rows that exhibit high “statistical leverage” and exert a disproportionately large “influence” on the best low-rank fit of the data matrix, obtaining improved relative-error and constant-factor approximation guarantees in worst-case analysis, as opposed to the much coarser additive-error guarantees of prior work.
Abstract: Principal components analysis and, more generally, the Singular Value Decomposition are fundamental data analysis tools that express a data matrix in terms of a sequence of orthogonal or uncorrelated vectors of decreasing importance. Unfortunately, being linear combinations of up to all the data points, these vectors are notoriously difficult to interpret in terms of the data and processes generating the data. In this article, we develop CUR matrix decompositions for improved data analysis. CUR decompositions are low-rank matrix decompositions that are explicitly expressed in terms of a small number of actual columns and/or actual rows of the data matrix. Because they are constructed from actual data elements, CUR decompositions are interpretable by practitioners of the field from which the data are drawn (to the extent that the original data are). We present an algorithm that preferentially chooses columns and rows that exhibit high “statistical leverage” and, thus, in a very precise statistical sense, exert a disproportionately large “influence” on the best low-rank fit of the data matrix. By selecting columns and rows in this manner, we obtain improved relative-error and constant-factor approximation guarantees in worst-case analysis, as opposed to the much coarser additive-error guarantees of prior work. In addition, since the construction involves computing quantities with a natural and widely studied statistical interpretation, we can leverage ideas from diagnostic regression analysis to employ these matrix decompositions for exploratory data analysis.
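A hedged sketch of the leverage-score idea, using NumPy: compute column (and row) leverage scores from the top-k right singular vectors, sample columns and rows with probability proportional to those scores, and set U = C^+ A R^+. The exact sampling scheme, rescaling, and error guarantees in the paper are more careful than this simplified version; c and r here are user-chosen sample sizes.

```python
import numpy as np

def leverage_scores(A, k):
    """Normalized column leverage scores relative to the best rank-k subspace."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return np.sum(Vt[:k, :] ** 2, axis=0) / k      # nonnegative, sums to 1

def cur(A, k, c, r, seed=None):
    """Simplified leverage-score CUR: pick c columns and r rows, then U = C^+ A R^+."""
    rng = np.random.default_rng(seed)
    cols = rng.choice(A.shape[1], size=c, replace=False, p=leverage_scores(A, k))
    rows = rng.choice(A.shape[0], size=r, replace=False, p=leverage_scores(A.T, k))
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R, cols, rows
```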

815 citations


Posted Content
TL;DR: In this article, the authors provided the best bounds on the number of randomly sampled entries required to reconstruct an unknown low-rank matrix by minimizing the nuclear norm of the hidden matrix subject to agreement with the provided entries.
Abstract: This paper provides the best bounds to date on the number of randomly sampled entries required to reconstruct an unknown low-rank matrix. These results improve on prior work by Candes and Recht, Candes and Tao, and Keshavan, Montanari, and Oh. The reconstruction is accomplished by minimizing the nuclear norm, or sum of the singular values, of the hidden matrix subject to agreement with the provided entries. If the underlying matrix satisfies a certain incoherence condition, then the number of entries required is equal to a quadratic logarithmic factor times the number of parameters in the singular value decomposition. The proof of this assertion is short, self-contained, and uses very elementary analysis. The novel techniques herein are based on recent work in quantum information theory.
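The paper's contribution is the sample-complexity analysis rather than an algorithm, but the nuclear-norm program it studies is easy to approximate numerically. The sketch below is a soft-impute / singular-value-thresholding style heuristic for the related regularized problem, not the method analyzed in the paper; the threshold tau and the iteration count are arbitrary illustrative choices.

```python
import numpy as np

def svt(Z, tau):
    """Singular value soft-thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def complete(M_obs, mask, tau=1.0, n_iter=200):
    """Fill unobserved entries with the current estimate, then shrink singular values."""
    Z = np.zeros_like(M_obs, dtype=float)
    for _ in range(n_iter):
        Z = svt(mask * M_obs + (1 - mask) * Z, tau)
    return Z
```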

710 citations


Journal ArticleDOI
TL;DR: It is shown that problem structure in the semidefinite programming formulation can be exploited to develop more efficient implementations of interior-point methods, and a variant of a simple subspace algorithm is presented in which low-rank matrix approximations are computed via nuclear norm minimization instead of the singular value decomposition.
Abstract: The nuclear norm (sum of singular values) of a matrix is often used in convex heuristics for rank minimization problems in control, signal processing, and statistics. Such heuristics can be viewed as extensions of $\ell_1$-norm minimization techniques for cardinality minimization and sparse signal estimation. In this paper we consider the problem of minimizing the nuclear norm of an affine matrix-valued function. This problem can be formulated as a semidefinite program, but the reformulation requires large auxiliary matrix variables, and is expensive to solve by general-purpose interior-point solvers. We show that problem structure in the semidefinite programming formulation can be exploited to develop more efficient implementations of interior-point methods. In the fast implementation, the cost per iteration is reduced to a quartic function of the problem dimensions and is comparable to the cost of solving the approximation problem in the Frobenius norm. In the second part of the paper, the nuclear norm approximation algorithm is applied to system identification. A variant of a simple subspace algorithm is presented in which low-rank matrix approximations are computed via nuclear norm minimization instead of the singular value decomposition. This has the important advantage of preserving linear matrix structure in the low-rank approximation. The method is shown to perform well on publicly available benchmark data.
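For reference, the semidefinite characterization of the nuclear norm behind that formulation is the standard identity below; the auxiliary blocks W_1 and W_2 are exactly the large matrix variables mentioned in the abstract.

\[
\|X\|_* \;=\; \min_{W_1,\,W_2}\ \tfrac{1}{2}\bigl(\operatorname{tr} W_1 + \operatorname{tr} W_2\bigr)
\quad\text{subject to}\quad
\begin{bmatrix} W_1 & X \\ X^{T} & W_2 \end{bmatrix} \succeq 0 .
\]

Replacing X with an affine matrix-valued function turns the approximation problem into a semidefinite program with these large auxiliary variables, which is why exploiting the problem structure matters for efficiency.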

534 citations


Journal ArticleDOI
TL;DR: It is shown how the Tree-Tucker format, described by a tree with the leaves corresponding to the Tucker decompositions of three-dimensional tensors, can be applied to the problem of multidimensional convolution.
Abstract: For $d$-dimensional tensors with possibly large $d>3$, a hierarchical data structure, called the Tree-Tucker format, is presented as an alternative to the canonical decomposition. It has asymptotically the same (and often even smaller) number of representation parameters and viable stability properties. The approach involves a recursive construction described by a tree with the leaves corresponding to the Tucker decompositions of three-dimensional tensors, and is based on a sequence of SVDs for the recursively obtained unfolding matrices and on the auxiliary dimensions added to the initial “spatial” dimensions. It is shown how this format can be applied to the problem of multidimensional convolution. Convincing numerical examples are given.
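A minimal NumPy sketch of the sequential-SVD construction this family of formats is built on, written as a linear chain of splits (in the style of the closely related tensor-train decomposition) rather than the paper's tree; the relative truncation tolerance eps is an illustrative choice.

```python
import numpy as np

def sequential_svd_cores(T, eps=1e-10):
    """Split off one dimension at a time with a truncated SVD of the current unfolding."""
    dims = T.shape
    cores, r_prev = [], 1
    C = T.reshape(dims[0], -1)
    for n_k in dims[:-1]:
        C = C.reshape(r_prev * n_k, -1)            # unfolding of the remaining tensor
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = max(1, int(np.sum(s > eps * s[0])))    # truncation rank for this split
        cores.append(U[:, :r].reshape(r_prev, n_k, r))
        C = np.diag(s[:r]) @ Vt[:r, :]             # carry the remainder to the next split
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores
```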

480 citations


Journal ArticleDOI
TL;DR: In this article, the authors describe an efficient algorithm for low-rank approximation of matrices that produces accuracy that is very close to the best possible accuracy, for matrices of arbitrary sizes.
Abstract: Principal component analysis (PCA) requires the computation of a low-rank approximation to a matrix containing the data being analyzed. In many applications of PCA, the best possible accuracy of any rank-deficient approximation is at most a few digits (measured in the spectral norm, relative to the spectral norm of the matrix being approximated). In such circumstances, efficient algorithms have not come with guarantees of good accuracy, unless one or both dimensions of the matrix being approximated are small. We describe an efficient algorithm for the low-rank approximation of matrices that produces accuracy that is very close to the best possible accuracy, for matrices of arbitrary sizes. We illustrate our theoretical results via several numerical examples.

389 citations


Proceedings ArticleDOI
04 Jan 2009
TL;DR: In this paper, a two-stage algorithm that runs in O(min{mn², m²n}) time and returns as output an m x k matrix C consisting of exactly k columns of A is presented.
Abstract: We consider the problem of selecting the "best" subset of exactly k columns from an m x n matrix A. In particular, we present and analyze a novel two-stage algorithm that runs in O(min{mn², m²n}) time and returns as output an m x k matrix C consisting of exactly k columns of A. In the first stage (the randomized stage), the algorithm randomly selects O(k log k) columns according to a judiciously-chosen probability distribution that depends on information in the top-k right singular subspace of A. In the second stage (the deterministic stage), the algorithm applies a deterministic column-selection procedure to select and return exactly k columns from the set of columns selected in the first stage. Let C be the m x k matrix containing those k columns, let PC denote the projection matrix onto the span of those columns, and let Ak denote the "best" rank-k approximation to the matrix A as computed with the singular value decomposition. Then, we prove that [EQUATION] with probability at least 0.7. This spectral norm bound improves upon the best previously-existing result (of Gu and Eisenstat [21]) for the spectral norm version of this Column Subset Selection Problem. We also prove that [EQUATION] with the same probability. This Frobenius norm bound is only a factor of √(k log k) worse than the best previously existing existential result and is roughly O(√k!) better than the best previous algorithmic result (both of Deshpande et al. [11]) for the Frobenius norm version of this Column Subset Selection Problem.
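A hedged sketch of the two-stage recipe in NumPy/SciPy: sample roughly k log k columns with probabilities built from the top-k right singular subspace, then keep exactly k of them with pivoted QR applied to the corresponding columns of V_k^T. The sampling probabilities, the constant in the sample size, and the deterministic step are simplified stand-ins for the scheme actually analyzed.

```python
import numpy as np
from scipy.linalg import qr

def two_stage_css(A, k, seed=None):
    """Randomized stage: sample O(k log k) columns; deterministic stage: pivoted QR."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    p = np.sum(Vt[:k, :] ** 2, axis=0)
    p /= p.sum()                                           # leverage-style distribution
    c = min(n, int(np.ceil(4 * k * np.log(max(k, 2)))))    # illustrative sample size
    sampled = rng.choice(n, size=c, replace=False, p=p)
    _, _, piv = qr(Vt[:k, sampled], pivoting=True)         # rank-revealing pivoting
    keep = sampled[piv[:k]]
    return A[:, keep], keep
```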

362 citations


Journal ArticleDOI
TL;DR: This paper obtains a closed-form expression for the joint probability density function of k consecutive ordered eigenvalues and, as a special case, the PDF of the ℓth ordered eigenvalue of Wishart matrices, and proposes a general methodology to evaluate some multiple nested integrals of interest.
Abstract: Random matrices play a crucial role in the design and analysis of multiple-input multiple-output (MIMO) systems. In particular, performance of MIMO systems depends on the statistical properties of a subclass of random matrices known as Wishart when the propagation environment is characterized by Rayleigh or Rician fading. This paper focuses on the stochastic analysis of this class of matrices and proposes a general methodology to evaluate some multiple nested integrals of interest. With this methodology we obtain a closed-form expression for the joint probability density function of k consecutive ordered eigenvalues and, as a special case, the PDF of the ℓth ordered eigenvalue of Wishart matrices. The distribution of the largest eigenvalue can be used to analyze the performance of MIMO maximal ratio combining systems. The PDF of the smallest eigenvalue can be used for MIMO antenna selection techniques. Finally, the PDF of the kth largest eigenvalue finds applications in the performance analysis of MIMO singular value decomposition systems.
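For orientation, the classical starting point for such derivations, in the simplest case the paper covers (uncorrelated central complex Wishart, i.e. i.i.d. Rayleigh fading), is the joint density of the ordered eigenvalues of an m×m Wishart matrix with n ≥ m degrees of freedom; K_{m,n} below is a normalizing constant, and the paper's results come from marginalizing densities of this type.

\[
f(\lambda_1,\ldots,\lambda_m) \;=\; K_{m,n}
\prod_{i=1}^{m} \lambda_i^{\,n-m}\, e^{-\lambda_i}
\prod_{1 \le i < j \le m} (\lambda_i - \lambda_j)^2 ,
\qquad \lambda_1 \ge \cdots \ge \lambda_m \ge 0 .
\]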

185 citations


Journal ArticleDOI
TL;DR: Two adaptive algorithms are proposed to update the PARAFAC decomposition of a tensor at instant t+1, the new tensor being obtained from the old one after appending a new slice in the 'time' dimension.
Abstract: The PARAFAC decomposition of a higher-order tensor is a powerful multilinear algebra tool that becomes more and more popular in a number of disciplines. Existing PARAFAC algorithms are computationally demanding and operate in batch mode - both serious drawbacks for on-line applications. When the data are serially acquired, or the underlying model changes with time, adaptive PARAFAC algorithms that can track the sought decomposition at low complexity would be highly desirable. This is a challenging task that has not been addressed in the literature, and the topic of this paper. Given an estimate of the PARAFAC decomposition of a tensor at instant t, we propose two adaptive algorithms to update the decomposition at instant t+1, the new tensor being obtained from the old one after appending a new slice in the 'time' dimension. The proposed algorithms can yield estimation performance that is very close to that obtained via repeated application of state-of-art batch algorithms, at orders of magnitude lower complexity. The effectiveness of the proposed algorithms is illustrated using a MIMO radar application (tracking of directions of arrival and directions of departure) as an example.

182 citations


Journal ArticleDOI
TL;DR: In this paper, a form of bi-cross-validation for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF), is presented.
Abstract: This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left out entries by low rank operations on the retained data. We prove a self-consistency result expressing the prediction error as a residual from a low rank approximation. Random matrix theory and some empirical results suggest that smaller hold-out sets lead to more over-fitting, while larger ones are more prone to under-fitting. In simulated examples we find that a method leaving out half the rows and half the columns performs well.
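A sketch of a single BCV fold for the SVD case, assuming the held-out block is predicted from the retained blocks as B D_k^+ C (with D_k the rank-k truncation of the fully retained block); the NMF variant uses a different low-rank fit. Scanning k and averaging this error over folds gives the rank-selection curve.

```python
import numpy as np

def bcv_error(X, hold_rows, hold_cols, k):
    """Prediction error on the held-out block A for one bi-cross-validation fold."""
    keep_rows = np.setdiff1d(np.arange(X.shape[0]), hold_rows)
    keep_cols = np.setdiff1d(np.arange(X.shape[1]), hold_cols)
    A = X[np.ix_(hold_rows, hold_cols)]        # held-out block
    B = X[np.ix_(hold_rows, keep_cols)]        # held-out rows, retained columns
    C = X[np.ix_(keep_rows, hold_cols)]        # retained rows, held-out columns
    D = X[np.ix_(keep_rows, keep_cols)]        # fully retained block
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    Dk_pinv = Vt[:k].T @ np.diag(1.0 / s[:k]) @ U[:, :k].T   # pseudo-inverse of rank-k D
    return np.sum((A - B @ Dk_pinv @ C) ** 2)
```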

Proceedings ArticleDOI
19 Jul 2009
TL;DR: These experiments on EachMovie and Netflix, the two largest public benchmarks to date, demonstrate that the nonparametric models make more accurate predictions of user ratings, and are computationally comparable or sometimes even faster in training, in comparison with previous state-of-the-art parametric matrix factorization models.
Abstract: With the sheer growth of online user data, it becomes challenging to develop preference learning algorithms that are sufficiently flexible in modeling but also affordable in computation. In this paper we develop nonparametric matrix factorization methods by allowing the latent factors of two low-rank matrix factorization methods, the singular value decomposition (SVD) and probabilistic principal component analysis (pPCA), to be data-driven, with the dimensionality increasing with data size. We show that the formulations of the two nonparametric models are very similar, and their optimizations share similar procedures. Compared to traditional parametric low-rank methods, nonparametric models are appealing for their flexibility in modeling complex data dependencies. However, this modeling advantage comes at a computational price: it is highly challenging to scale them to large-scale problems, hampering their use in applications such as collaborative filtering. In this paper we introduce novel optimization algorithms that are simple to implement and allow both nonparametric matrix factorization models to be learned efficiently on large-scale problems. Our experiments on EachMovie and Netflix, the two largest public benchmarks to date, demonstrate that the nonparametric models make more accurate predictions of user ratings and are computationally comparable or sometimes even faster in training than previous state-of-the-art parametric matrix factorization models.

Proceedings ArticleDOI
06 Dec 2009
TL;DR: This paper looks at bipartite graphs changing over time and considers matrix- and tensor-based methods for predicting links, and presents a weight-based method for collapsing multi-year data into a single matrix.
Abstract: The data in many disciplines such as social networks, web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this paper, we consider the problem of temporal link prediction: Given link data for time periods 1 through T, can we predict the links in time period T +1? Specifically, we look at bipartite graphs changing over time and consider matrix- and tensor-based methods for predicting links. We present a weight-based method for collapsing multi-year data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural three-dimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrix and tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem.
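A minimal sketch of the matrix side of this pipeline: collapse the T adjacency slices with exponentially decaying weights so that recent periods count more, then use a rank-k truncated SVD of the collapsed matrix as the score matrix for candidate links. The weight form and the plain truncated-SVD scores are simplifications; the paper additionally extends Katz scores to bipartite graphs and considers a CP tensor model.

```python
import numpy as np

def collapse_and_score(slices, theta=0.2, k=10):
    """Weighted collapse of T bipartite adjacency matrices, then rank-k SVD scores."""
    T = len(slices)
    X = sum((1 - theta) ** (T - 1 - t) * Z for t, Z in enumerate(slices))
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # higher score = more likely link at T+1
```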

Journal ArticleDOI
TL;DR: In this paper, two statistics are presented that can be used to rank input parameters utilized by a model in terms of their relative identifiability based on a given or possible future calibration dataset.

01 Jan 2009
TL;DR: The implementation of an efficient algorithm for the low-rank matrix completion problem, based on singular value decomposition followed by local manifold optimization, is described, and the robustness of the algorithm with respect to noise and its performance on actual collaborative filtering datasets are studied.
Abstract: We consider the problem of reconstructing a low rank matrix from a small subset of its entries. In this paper, we describe the implementation of an efficient algorithm proposed in [19], based on singular value decomposition followed by local manifold optimization, for solving the low-rank matrix completion problem. It has been shown that if the number of revealed entries is large enough, the output of singular value decomposition gives a good estimate for the original matrix, so that local optimization reconstructs the correct matrix with high probability. We present numerical results which show that this algorithm can reconstruct the low rank matrix exactly from a very small subset of its entries. We further study the robustness of the algorithm with respect to noise, and its performance on actual collaborative filtering datasets.

Proceedings ArticleDOI
23 May 2009
TL;DR: This paper presents the implementation of singular value decomposition (SVD) of a dense matrix on the GPU using the CUDA programming model and shows a speedup of up to 60 over the MATLAB implementation and up to 8 over the Intel MKL implementation on an Intel Dual Core 2.66 GHz PC for large matrices.
Abstract: Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high performance co-processors due to their tremendous computing power. In this paper, we present the implementation of singular value decomposition (SVD) of a dense matrix on the GPU using the CUDA programming model. SVD is implemented using the twin steps of bidiagonalization followed by diagonalization. It has not been implemented on the GPU before. Bidiagonalization is implemented using a series of Householder transformations which map well to BLAS operations. Diagonalization is performed by applying the implicitly shifted QR algorithm. Our complete SVD implementation significantly outperforms the MATLAB and Intel® Math Kernel Library (MKL) LAPACK implementations on the CPU. We show a speedup of up to 60 over the MATLAB implementation and up to 8 over the Intel MKL implementation on an Intel Dual Core 2.66 GHz PC with an NVIDIA GTX 280 for large matrices. We also give results for very large matrices on the NVIDIA Tesla S1070.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed approach can effectively improve the quality of the watermarked image and the robustness of the embedded watermark against various attacks.

Journal ArticleDOI
TL;DR: A new ICP algorithm, named the Scale-ICP algorithm, is established for registration of data sets with isotropic stretches, and a way to select the initial registrations is proposed in order to achieve global convergence for the proposed algorithm.
Abstract: In this paper, we are concerned with the registration of two 3D data sets with large scale stretches and noise. First, by incorporating a scale factor into the standard iterative closest point (ICP) algorithm, we formulate the registration as a constrained optimization problem over a 7D nonlinear space. Then, we apply the singular value decomposition (SVD) approach to iteratively solve this optimization problem. Finally, we establish a new ICP algorithm, named the Scale-ICP algorithm, for registration of data sets with isotropic stretches. In order to achieve global convergence for the proposed algorithm, we propose a way to select the initial registrations. To demonstrate the performance and efficiency of the proposed algorithm, we give several comparative experiments between the Scale-ICP algorithm and the standard ICP algorithm.
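The SVD step inside such registration loops has a well-known closed form. The sketch below is the standard Procrustes/Umeyama-style similarity alignment for already-matched 3-D point sets (rows of P map to rows of Q); it illustrates where the SVD enters, not the authors' full Scale-ICP iteration, which also re-estimates correspondences and constrains the scale.

```python
import numpy as np

def similarity_align(P, Q):
    """Closed-form s, R, t minimizing sum ||s*R*p_i + t - q_i||^2 for matched 3-D points."""
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - mu_p, Q - mu_q
    H = Pc.T @ Qc                                   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    s = np.sum(Qc * (Pc @ R.T)) / np.sum(Pc ** 2)   # least-squares isotropic scale
    t = mu_q - s * R @ mu_p
    return s, R, t
```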

Journal ArticleDOI
TL;DR: In this article, it is shown that a signal can be decomposed into a linear sum of component signals by Hankel-matrix-based SVD, and that these component signals are essentially projections of the original signal onto the orthonormal bases of m-dimensional and n-dimensional vector spaces.

Book
14 Dec 2009
TL;DR: In this article, the authors introduce different mathematical methods of scientific computation for solving minimization problems, using examples that range from locating an aircraft and finding the best time to replace a computer to analyzing developments on the stock market and constructing phylogenetic trees.
Abstract: Using real-life applications, this graduate-level textbook introduces different mathematical methods of scientific computation to solve minimization problems, with examples that range from locating an aircraft and finding the best time to replace a computer to analyzing developments on the stock market and constructing phylogenetic trees. The textbook focuses on several methods, including nonlinear least squares with confidence analysis, singular value decomposition, best basis, dynamic programming, linear programming, and various optimization procedures. Each chapter solves several realistic problems, introducing the modeling, optimization techniques, and simulation as required. This allows readers to see how the methods are put to use, making it easier to grasp the basic ideas. There are also worked examples, practical notes, and background materials to help the reader understand the topics covered. Interactive exercises are available at www.cambridge.org/9780521849890.

Proceedings ArticleDOI
14 Jun 2009
TL;DR: This paper addresses the problem of approximate singular value decomposition of large dense matrices that arises naturally in many machine learning applications and proposes an efficient adaptive sampling technique to select informative columns from the original matrix.
Abstract: This paper addresses the problem of approximate singular value decomposition of large dense matrices that arises naturally in many machine learning applications. We discuss two recently introduced sampling-based spectral decomposition techniques: the Nystrom and the Column-sampling methods. We present a theoretical comparison between the two methods and provide novel insights regarding their suitability for various applications. We then provide experimental results motivated by this theory. Finally, we propose an efficient adaptive sampling technique to select informative columns from the original matrix. This novel technique outperforms standard sampling methods on a variety of datasets.
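For concreteness, a minimal sketch of the plain Nystrom approximation of a symmetric positive semidefinite matrix K with uniform column sampling; the adaptive technique proposed in the paper replaces the uniform choice of indices with an informativeness-driven one.

```python
import numpy as np

def nystrom(K, l, seed=None):
    """Approximate a symmetric PSD matrix from l sampled columns: K ~ C W^+ C^T."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(K.shape[0], size=l, replace=False)   # uniform column sample
    C = K[:, idx]
    W = C[idx, :]                                         # intersection block
    return C @ np.linalg.pinv(W) @ C.T
```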

01 Jan 2009
TL;DR: The intention of this abstract is to provide a simple explanation of the basic assumptions made in SSA and its application to the modeling of plane waves.
Abstract: Singular spectrum analysis (SSA) is a method used for the analysis of time series arising from dynamical systems. The method is used to capture oscillations from a given time series via the analysis of the eigenspectra of the so-called trajectory matrix. The trajectory matrix is composed of multiple data views. The singular value decomposition (SVD) of the trajectory matrix can be used for rank reduction and noise elimination. We apply SSA in the FX domain and present a comparison with classical FX deconvolution. The algorithm arising from SSA analysis is equivalent to Cadzow FX noise attenuation, a method recently proposed by Trickett (2008). It is important to stress, however, that Cadzow filtering (Cadzow, 1988) is a general framework for noise reduction of signals and images. Cadzow filtering is equivalent to SSA when considering sinusoidal waveforms immersed in additive random noise. The intention of this abstract is to provide a simple explanation of the basic assumptions made in SSA and its application to the modeling of plane waves.
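A minimal NumPy sketch of the SSA recipe for a single real-valued series: Hankel (trajectory-matrix) embedding, rank-k truncation via the SVD, and anti-diagonal averaging back to a series. The FX-domain variant applies the same steps to constant-frequency slices; the window length L and rank k are user choices.

```python
import numpy as np

def ssa_denoise(x, L, k):
    """Rank-reduce the L x (N-L+1) trajectory matrix of x and Hankelize back."""
    N = len(x)
    K = N - L + 1
    X = np.column_stack([x[i:i + L] for i in range(K)])    # trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]             # keep k leading components
    y, counts = np.zeros(N), np.zeros(N)
    for j in range(K):                                     # anti-diagonal averaging
        y[j:j + L] += Xk[:, j]
        counts[j:j + L] += 1
    return y / counts
```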

Journal ArticleDOI
TL;DR: In this article, an empirical mode decomposition (EMD) method was used to decompose the vibration signal into a number of intrinsic mode functions (IMFs) by which the initial feature vector matrices could be formed automatically.
Abstract: Because periodic impulses usually occur when rotating machinery exhibits local faults, and to address the limitations of singular value decomposition (SVD) techniques, an SVD technique based on empirical mode decomposition (EMD) is applied to fault feature extraction from rotating machinery vibration signals. The EMD method is used to decompose the vibration signal into a number of intrinsic mode functions (IMFs), from which the initial feature vector matrices can be formed automatically. By applying the SVD technique to the initial feature vector matrices, the singular values of the matrices are obtained and used as fault feature vectors for a support vector machine (SVM) classifier. The analysis results from gear and roller bearing vibration signals show that the fault diagnosis method based on EMD, SVD and SVM can extract fault features effectively and classify working conditions and fault patterns of gears and roller bearings accurately, even when the number of samples is small.

Journal ArticleDOI
TL;DR: A generalization of the Gröbner basis method for polynomial equation solving is derived that improves overall numerical stability, and it is shown how the action matrix can be computed in the general setting of an arbitrary linear basis for ℂ[x]/I.
Abstract: This paper presents several new results on techniques for solving systems of polynomial equations in computer vision. Gröbner basis techniques for equation solving have been applied successfully to several geometric computer vision problems. However, in many cases these methods are plagued by numerical problems. In this paper we derive a generalization of the Gröbner basis method for polynomial equation solving, which improves overall numerical stability. We show how the action matrix can be computed in the general setting of an arbitrary linear basis for ℂ[x]/I. In particular, two improvements on the stability of the computations are made by studying how the linear basis for ℂ[x]/I should be selected. The first of these strategies utilizes QR factorization with column pivoting and the second is based on singular value decomposition (SVD). Moreover, it is shown how to improve stability further by an adaptive scheme for truncation of the Gröbner basis. These new techniques are studied on some of the latest reported uses of Gröbner basis methods in computer vision and we demonstrate dramatically improved numerical stability, making it possible to solve a larger class of problems than previously possible.

Journal ArticleDOI
TL;DR: In this paper, an under-determined inverse approach is proposed to quantify an elementary source distribution on a source surface from acoustic pressure measurements; the under-determined system is then reduced to a square system using singular value decomposition.
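A hedged sketch of the generic truncated-SVD regularized inverse such acoustic source identification problems rely on; the transfer matrix G, the measured pressures p, and the truncation level k are illustrative placeholders, and the paper's specific reduction of the under-determined system to a square one is not reproduced here.

```python
import numpy as np

def tsvd_solve(G, p, k):
    """Estimate source strengths q from G q = p using only the k largest singular values."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ p) / s[:k])
```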

Journal ArticleDOI
TL;DR: The source reconstruction method (SRM) is a technique developed for antenna diagnostics and for carrying out near-field to far-field transformation based on the application of the electromagnetic equivalence principle that can resolve equivalent currents that are smaller than half a wavelength in size, thus providing super-resolution.
Abstract: The source reconstruction method (SRM) is a technique developed for antenna diagnostics and for carrying out near-field (NF) to far-field (FF) transformation. The SRM is based on the application of the electromagnetic equivalence principle, in which one establishes an equivalent current distribution that radiates the same fields as the actual currents induced in the antenna under test (AUT). The knowledge of the equivalent currents allows the determination of the antenna radiating elements, as well as the prediction of the AUT-radiated fields outside the equivalent currents domain. The unique feature of the novel methodology presented in this paper is that it can resolve equivalent currents that are smaller than half a wavelength in size, thus providing super-resolution. Furthermore, the measurement field samples can be taken at field spacings greater than half a wavelength, thus going beyond the classical sampling criteria. These two distinctive features are possible due to the choice of a model-based parameter estimation methodology where the unknowns are approximated by a continuous basis and, secondly, through the use of the analytic Green's function. The latter condition also guarantees the invertibility of the electric field operator and provides a stable solution for the currents even when evanescent waves are present in the measurements. In addition, the use of the singular value decomposition in the solution of the matrix equations provides the user with a quantitative tool to assess the quality and the quantity of the measured data. Alternatively, the use of the iterative conjugate gradient (CG) method in solving the ill-conditioned matrix equations can also be implemented. Two examples of an antenna diagnostics method are presented to illustrate the applicability and accuracy of the proposed methodology.

Journal ArticleDOI
TL;DR: In this article, the authors extend one-way functional principal component analysis (PCA) to two way functional data by introducing regularization of both left and right singular vectors in the singular value decomposition (SVD) of the data matrix.
Abstract: Two-way functional data consist of a data matrix whose row and column domains are both structured, for example, temporally or spatially, as when the data are time series collected at different locations in space. We extend one-way functional principal component analysis (PCA) to two-way functional data by introducing regularization of both left and right singular vectors in the singular value decomposition (SVD) of the data matrix. We focus on a penalization approach and solve the nontrivial problem of constructing proper two-way penalties from one-way regression penalties. We introduce conditional cross-validated smoothing parameter selection whereby left-singular vectors are cross-validated conditional on right-singular vectors, and vice versa. The concept can be realized as part of an alternating optimization algorithm. In addition to the penalization approach, we briefly consider two-way regularization with basis expansion. The proposed methods are illustrated with one simulated and two real data examples.

Proceedings Article
01 Jan 2009
TL;DR: In this article, the Tucker decomposition is used to analyze multi-aspect data and extract latent factors, which capture the multilinear data structure and are powerful mining tools for extracting patterns from large data volumes.
Abstract: Tensors naturally model many real world processes which generate multi-aspect data. Such processes appear in many different research disciplines, e.g., chemometrics, computer vision, psychometrics and neuroimaging analysis. Tensor decompositions such as the Tucker decomposition are used to analyze multi-aspect data and extract latent factors, which capture the multilinear data structure. Such decompositions are powerful mining tools for extracting patterns from large data volumes. However, most frequently used algorithms for such decompositions involve the computationally expensive Singular Value Decomposition. In this paper we propose MACH, a new sampling algorithm to compute such decompositions. Our method is of significant practical value for tensor streams, such as environmental monitoring systems and IP traffic matrices over time, where large amounts of data are accumulated and the analysis is computationally intensive, but also in “post-mortem” data analysis cases where the tensor does not fit in the available memory. We provide the theoretical analysis of our proposed method and verify its efficacy on synthetic data and two real world monitoring system applications.

Journal ArticleDOI
TL;DR: Orthogonal rotation of the spatial-domain vectors arising from singular value decomposition (SVD) of the spectral data matrix is shown to be an effective method for making physically acceptable and easily interpretable estimates of the pure-component spectra and abundances.
Abstract: Full-spectrum imaging is fast becoming a tool of choice for characterizing heterogeneous materials. Spectral images, which consist of a complete spectrum at each point in a spatial array, can be acquired from a wide variety of surface and microanalytical spectroscopic techniques. It is not uncommon that such spectral image data sets comprise tens of thousands of individual spectra, or more. Given the vast quantities of raw spectral data, factor analysis methods have proved indispensable for extracting the chemical information from these high-dimensional data sets into a limited number of factors that represent the spectral and spatial characteristics of the sample's composition. It is well known that factor models suffer a ‘rotational ambiguity’, that is, there are an infinite number of factor models that will fit the data equally well. Thus, physically inspired constraints are often employed to derive relatively unique models that make the individual factors more easily interpreted by the practicing analyst. In the present work, we note that many samples undergoing spectral image analysis are ‘simple’ in the sense that only one or a few of the sample's constituents are present at any particular location. When this situation prevails, simplicity in the spatial domain can be exploited to make the resulting factor models more realistic. In particular, orthogonal rotation of the spatial-domain vectors arising from singular value decomposition (SVD) of the spectral data matrix will be shown to be an effective method for making physically acceptable and easily interpretable estimates of the pure-component spectra and abundances. Copyright © 2009 John Wiley & Sons, Ltd.