
Showing papers on "Matrix (mathematics) published in 2016"


Book
02 Jan 2016
TL;DR: This book develops real, complex, and machine interval arithmetic and applies it to the inclusion of zeros of functions and polynomials, interval matrix operations, linear systems with interval coefficients, and Newton-like methods for nonlinear systems, with ALGOL 60 realizations given in the appendices.
Abstract: Preface to the English Edition. Preface to the German Edition. Real Interval Arithmetic. Further Concepts and Properties. Interval Evaluation and Range of Real Functions. Machine Interval Arithmetic. Complex Interval Arithmetic. Metric, Absolute Value, and Width. Inclusion of Zeros of a Function of One Real Variable. Methods for the Simultaneous Inclusion of Real Zeros of Polynomials. Methods for the Simultaneous Inclusion of Complex Zeros of Polynomials. Interval Matrix Operations. Fixed Point Iteration for Nonlinear Systems of Equations. Systems of Linear Equations Amenable to Iteration. Optimality of the Symmetric Single Step Method with Taking Intersection after Every Component. On the Feasibility of the Gaussian Algorithm for Systems of Equations with Intervals as Coefficients. Hansen's Method. The Procedure of Kuperman and Hansen. Iteration Methods for the Inclusion of the Inverse Matrix and for Triangular Decompositions. Newton-like Methods for Nonlinear Systems of Equations. Newton-like Methods without Matrix Inversions. Newton-like Methods for Particular Systems of Nonlinear Equations. Newton-like Total Step and Single Step Methods. Appendix A. The Order of Convergence of Iteration Methods in Vn(IC) and Mmn(IC). Appendix B. Realizations of Machine Interval Arithmetic in ALGOL 60. Appendix C. ALGOL Procedures. Bibliography. Index of Notation. Subject Index.
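
As a minimal illustration of the real interval arithmetic the book starts from, the sketch below propagates guaranteed enclosures through elementary operations. It is only an illustration: a serious implementation must also control outward rounding, which this sketch ignores.

```python
from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def width(self):
        return self.hi - self.lo

# Enclosing the range of f(x) = x*x - x over x in [0, 1] by naive interval evaluation.
x = Interval(0.0, 1.0)
print(x * x - x)             # Interval(lo=-1.0, hi=1.0): a valid but pessimistic enclosure
print((x * x - x).width())   # 2.0; the true range is [-0.25, 0], overestimated due to dependency
```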

2,054 citations


Proceedings Article
19 Jun 2016
TL;DR: This work constructs an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned, and demonstrates the potential of this architecture by achieving state of the art results in several hard tasks involving very long-term dependencies.
Abstract: Recurrent neural networks (RNNs) are notoriously difficult to train. When the eigenvalues of the hidden to hidden weight matrix deviate from absolute value 1, optimization becomes difficult due to the well studied issue of vanishing and exploding gradients, especially when trying to learn long-term dependencies. To circumvent this problem, we propose a new architecture that learns a unitary weight matrix, with eigenvalues of absolute value exactly 1. The challenge we address is that of parametrizing unitary matrices in a way that does not require expensive computations (such as eigendecomposition) after each weight update. We construct an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned. Optimization with this parameterization becomes feasible only when considering hidden states in the complex domain. We demonstrate the potential of this architecture by achieving state of the art results in several hard tasks involving very long-term dependencies.
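
The composition of structured matrices described in the abstract can be sketched with standard building blocks: diagonal phase matrices, unitary Fourier transforms, Householder reflections, and a permutation. The ordering and parameter names below are illustrative assumptions, not the authors' code; the point is only that a product of such factors is cheap to parameterize yet exactly unitary.

```python
import numpy as np

def diag_phase(theta):
    """Diagonal unitary matrix with entries exp(i * theta_j)."""
    return np.diag(np.exp(1j * theta))

def householder(v):
    """Complex Householder reflection I - 2 v v^H / ||v||^2 (unitary)."""
    v = v.reshape(-1, 1)
    return np.eye(len(v)) - 2.0 * (v @ v.conj().T) / (v.conj().T @ v)

def structured_unitary(n, rng):
    """Compose structured unitary building blocks into one unitary matrix
    (uRNN-style product D3 R2 F^-1 D2 P R1 F D1; ordering is an assumption)."""
    F = np.fft.fft(np.eye(n)) / np.sqrt(n)            # unitary DFT matrix
    Finv = F.conj().T
    P = np.eye(n)[rng.permutation(n)]                 # permutation matrix
    D1, D2, D3 = (diag_phase(rng.uniform(0, 2 * np.pi, n)) for _ in range(3))
    R1 = householder(rng.standard_normal(n) + 1j * rng.standard_normal(n))
    R2 = householder(rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return D3 @ R2 @ Finv @ D2 @ P @ R1 @ F @ D1

rng = np.random.default_rng(0)
W = structured_unitary(8, rng)
print(np.allclose(W.conj().T @ W, np.eye(8)))         # True: all eigenvalues have absolute value 1
```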

630 citations


Journal ArticleDOI
TL;DR: The aim is to provide an overview of the major algorithmic developments that have taken place over the past few decades in the numerical solution of this and related problems, which are producing reliable numerical tools in the formulation and solution of advanced mathematical models in engineering and scientific computing.
Abstract: Given the square matrices $A, B, D, E$ and the matrix $C$ of conforming dimensions, we consider the linear matrix equation $A{\mathbf X} E+D{\mathbf X} B = C$ in the unknown matrix ${\mathbf X}$. Our aim is to provide an overview of the major algorithmic developments that have taken place over the past few decades in the numerical solution of this and related problems, which are producing reliable numerical tools in the formulation and solution of advanced mathematical models in engineering and scientific computing.
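
For small problems the equation can be handled directly through the Kronecker-product vectorization $(E^T \otimes A + B^T \otimes D)\,\mathrm{vec}(X) = \mathrm{vec}(C)$; the survey is about methods that avoid forming this large system, but a dense sketch (illustrative sizes) makes the problem concrete.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A, B, D, E = (rng.standard_normal((n, n)) for _ in range(4))
X_true = rng.standard_normal((n, n))
C = A @ X_true @ E + D @ X_true @ B

# vec(AXE + DXB) = (E^T kron A + B^T kron D) vec(X); vec stacks columns,
# i.e. flattening in Fortran (column-major) order.
K = np.kron(E.T, A) + np.kron(B.T, D)
x = np.linalg.solve(K, C.flatten(order="F"))
X = x.reshape(n, n, order="F")
print(np.allclose(X, X_true))   # True (when K is nonsingular)
```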

451 citations


Journal ArticleDOI
Junho Lee, Gye-Tae Gil, Yong Hoon Lee
TL;DR: An efficient open-loop channel estimator for a millimeter-wave (mm-wave) hybrid multiple-input multiple-output (MIMO) system consisting of radio-frequency beamformers with large antenna arrays followed by a baseband MIMO processor is proposed.
Abstract: We propose an efficient open-loop channel estimator for a millimeter-wave (mm-wave) hybrid multiple-input multiple-output (MIMO) system consisting of radio-frequency (RF) beamformers with large antenna arrays followed by a baseband MIMO processor. A sparse signal recovery problem exploiting the sparse nature of mm-wave channels is formulated for channel estimation based on the parametric channel model with quantized angles of departures/arrivals (AoDs/AoAs), called the angle grids. The problem is solved by the orthogonal matching pursuit (OMP) algorithm employing a redundant dictionary consisting of array response vectors with finely quantized angle grids. We suggest the use of non-uniformly quantized angle grids and show that such grids reduce the coherence of the redundant dictionary. The lower and upper bounds of the sum-of-squared errors of the proposed OMP-based estimator are derived analytically: the lower bound is derived by considering the oracle estimator that assumes the knowledge of AoDs/AoAs, and the upper bound is derived based on the results of the OMP performance guarantees. The design of training vectors (or sensing matrix) is particularly important in hybrid MIMO systems, because the RF beamformer prevents the use of independent and identically distributed random training vectors, which are popular in compressed sensing. We design training vectors so that the total coherence of the equivalent sensing matrix is minimized for a given RF beamforming matrix, which is assumed to be unitary. It is observed that the estimation accuracy can be improved significantly by randomly permuting the columns of the RF beamforming matrix. The simulation results demonstrate the advantage of the proposed OMP with a redundant dictionary over the existing methods such as the least squares method and the OMP based on the virtual channel model.
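
The core recovery step described in the abstract is orthogonal matching pursuit over a redundant dictionary. A minimal OMP sketch on a generic complex dictionary (illustrative sizes, not the paper's training design or array-response dictionary) is:

```python
import numpy as np

def omp(Phi, y, n_atoms):
    """Orthogonal matching pursuit: greedily pick the dictionary column most
    correlated with the residual, then re-fit all selected columns by least squares."""
    residual, support = y.copy(), []
    for _ in range(n_atoms):
        corr = np.abs(Phi.conj().T @ residual)
        corr[support] = 0                      # do not re-select atoms
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(Phi.shape[1], dtype=complex)
    x[support] = coef
    return x

# Toy sparse-recovery problem standing in for channel estimation:
# a redundant dictionary (more columns than rows) and a 3-sparse coefficient vector.
rng = np.random.default_rng(1)
m, n, k = 32, 128, 3
Phi = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * m)
x_true = np.zeros(n, dtype=complex)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k) + 1j * rng.standard_normal(k)
y = Phi @ x_true
x_hat = omp(Phi, y, k)
print(np.linalg.norm(x_hat - x_true))   # ~0 when the support is recovered
```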

447 citations


Journal ArticleDOI
TL;DR: This paper provides an overview of modern techniques for exploiting low-rank structure to perform matrix recovery in these settings, providing a survey of recent advances in this rapidly-developing field.
Abstract: Low-rank matrices play a fundamental role in modeling and computational methods for signal processing and machine learning. In many applications where low-rank matrices arise, these matrices cannot be fully sampled or directly observed, and one encounters the problem of recovering the matrix given only incomplete and indirect observations. This paper provides an overview of modern techniques for exploiting low-rank structure to perform matrix recovery in these settings, providing a survey of recent advances in this rapidly-developing field. Specific attention is paid to the algorithms most commonly used in practice, the existing theoretical guarantees for these algorithms, and representative practical applications of these techniques.
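
One of the most widely used algorithms in this family is matrix completion by iterative singular value thresholding (soft-impute style). The sketch below uses an illustrative, hand-picked threshold and problem size; it is meant only to show the structure of such methods, not any particular algorithm from the survey.

```python
import numpy as np

def svt_complete(M_obs, mask, tau=5.0, n_iters=200):
    """Fill the unobserved entries of a low-rank matrix by alternating
    (i) imputing missing entries from the current estimate and
    (ii) shrinking singular values by tau (the proximal step for the nuclear norm)."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(np.where(mask, M_obs, X), full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt
    return X

rng = np.random.default_rng(0)
n, r = 60, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-3 ground truth
mask = rng.random((n, n)) < 0.5                                  # observe about half the entries
X_hat = svt_complete(M, mask)
print(np.linalg.norm(X_hat - M) / np.linalg.norm(M))             # small relative error
```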

424 citations


Proceedings Article
12 Feb 2016
TL;DR: Experimental results show that TranSparse outperforms Trans(E, H, R, and D) significantly, and achieves state-of-the-art performance on triplet classification and link prediction tasks.
Abstract: We model knowledge graphs for their completion by encoding each entity and relation into a numerical space. All previous work including Trans(E, H, R, and D) ignore the heterogeneity (some relations link many entity pairs and others do not) and the imbalance (the number of head entities and that of tail entities in a relation could be different) of knowledge graphs. In this paper, we propose a novel approach TranSparse to deal with the two issues. In TranSparse, transfer matrices are replaced by adaptive sparse matrices, whose sparse degrees are determined by the number of entities (or entity pairs) linked by relations. In experiments, we design structured and unstructured sparse patterns for transfer matrices and analyze their advantages and disadvantages. We evaluate our approach on triplet classification and link prediction tasks. Experimental results show that TranSparse outperforms Trans(E, H, R, and D) significantly, and achieves state-of-the-art performance.

376 citations


Journal ArticleDOI
TL;DR: In this article, the authors established a theoretical guarantee for the matrix factorization-based formulation to correctly recover the underlying low-rank matrix and showed that under similar conditions to those in previous works, many standard optimization algorithms converge to the global optima of a factorization-based formulation and recover the true low-rank matrix.
Abstract: Matrix factorization is a popular approach for large-scale matrix completion. The optimization formulation based on matrix factorization, even with huge size, can be solved very efficiently through the standard optimization algorithms in practice. However, due to the non-convexity caused by the factorization model, there is a limited theoretical understanding of whether these algorithms will generate a good solution. In this paper, we establish a theoretical guarantee for the factorization-based formulation to correctly recover the underlying low-rank matrix. In particular, we show that under similar conditions to those in previous works, many standard optimization algorithms converge to the global optima of a factorization-based formulation and recover the true low-rank matrix. We study the local geometry of a properly regularized objective and prove that any stationary point in a certain local region is globally optimal. A major difference of this paper from the existing results is that we do not need resampling (i.e., using independent samples at each iteration) in either the algorithm or its analysis.
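
The factorization-based formulation being analyzed is, up to regularization, the squared error on observed entries of UVᵀ, minimized over the factors U and V. A minimal gradient-descent sketch (illustrative step size and sizes, regularizer omitted) is:

```python
import numpy as np

def factorized_completion(M, mask, r, lr=2e-3, n_iters=2000, seed=0):
    """Gradient descent on f(U, V) = 0.5 * || mask * (U V^T - M) ||_F^2,
    the non-convex factorization objective for matrix completion."""
    rng = np.random.default_rng(seed)
    n1, n2 = M.shape
    U = rng.standard_normal((n1, r)) * 0.1
    V = rng.standard_normal((n2, r)) * 0.1
    for _ in range(n_iters):
        R = mask * (U @ V.T - M)                 # residual on observed entries only
        U, V = U - lr * (R @ V), V - lr * (R.T @ U)
    return U, V

rng = np.random.default_rng(1)
n, r = 50, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = (rng.random((n, n)) < 0.4).astype(float)
U, V = factorized_completion(M, mask, r)
print(np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))   # small when recovery succeeds
```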

299 citations


Journal ArticleDOI
TL;DR: An alternating direction method (ADM)-based nonnegative latent factor (ANLF) model is proposed, which ensures fast convergence and high prediction accuracy, as well as the maintenance of nonnegativity constraints.
Abstract: Nonnegative matrix factorization (NMF)-based models possess fine representativeness of a target matrix, which is critically important in collaborative filtering (CF)-based recommender systems. However, current NMF-based CF recommenders suffer from the problem of high computational and storage complexity, as well as slow convergence rate, which prevents their industrial use in the context of big data. To address these issues, this paper proposes an alternating direction method (ADM)-based nonnegative latent factor (ANLF) model. The main idea is to implement the ADM-based optimization with regard to each single feature, to obtain high convergence rate as well as low complexity. Both computational and storage costs of ANLF are linear with the size of given data in the target matrix, which ensures high efficiency when dealing with extremely sparse matrices usually seen in CF problems. As demonstrated by the experiments on large, real data sets, ANLF also ensures fast convergence and high prediction accuracy, as well as the maintenance of nonnegativity constraints. Moreover, it is simple and easy to implement for real applications of learning systems.

295 citations


Journal ArticleDOI
01 Aug 2016-Nature
TL;DR: This corrects the Letter: the parameter h should equal 2 rather than 1, and the text after equation (4) should state Aij > 0 and "positive interactions".
Abstract: Nature 530, 307–312 (2016); doi:10.1038/nature16948 In the last sentence of page 310 of this Letter, the parameter h should equal 2, rather than 1. In addition, after equation (4), the text should have stated ‘Aij > 0’ and ‘positive interactions’, to read “...the weighted connectivity matrix Aij > 0 captures the positive interactions between the nodes.”

295 citations


Posted Content
TL;DR: It is shown that there are no spurious local minima in the non-convex factorized parametrization of low-rank matrix recovery from incoherent linear measurements, which yields a polynomial time global convergence guarantee for stochastic gradient descent.
Abstract: We show that there are no spurious local minima in the non-convex factorized parametrization of low-rank matrix recovery from incoherent linear measurements. With noisy measurements we show all local minima are very close to a global optimum. Together with a curvature bound at saddle points, this yields a polynomial time global convergence guarantee for stochastic gradient descent from random initialization.

260 citations


Journal ArticleDOI
TL;DR: This paper addresses unsupervised domain transfer learning, in which no labels are available in the target domain, by solving a constrained low-rankness and sparsity minimization problem with the inexact augmented Lagrange multiplier method; modeling the noise with a sparse matrix helps avoid negative transfer and makes the method more robust to different types of noise.
Abstract: In this paper, we address the problem of unsupervised domain transfer learning in which no labels are available in the target domain. We use a transformation matrix to transfer both the source and target data to a common subspace, where each target sample can be represented by a combination of source samples such that the samples from different domains can be well interlaced. In this way, the discrepancy of the source and target domains is reduced. By imposing joint low-rank and sparse constraints on the reconstruction coefficient matrix, the global and local structures of data can be preserved. To enlarge the margins between different classes as much as possible and provide more freedom to diminish the discrepancy, a flexible linear classifier (projection) is obtained by learning a non-negative label relaxation matrix that allows the strict binary label matrix to relax into a slack variable matrix. Our method can avoid a potentially negative transfer by using a sparse matrix to model the noise and, thus, is more robust to different types of noise. We formulate our problem as a constrained low-rankness and sparsity minimization problem and solve it by the inexact augmented Lagrange multiplier method. Extensive experiments on various visual domain adaptation tasks show the superiority of the proposed method over the state-of-the art methods. The MATLAB code of our method will be publicly available at http://www.yongxu.org/lunwen.html .
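
The solver named in the abstract, the inexact augmented Lagrange multiplier (ALM) method, is easiest to see on the generic low-rank-plus-sparse decomposition D = Z + E (robust PCA). The sketch below shows that simpler problem only, not the paper's full model with the transformation matrix and label relaxation; parameter choices are illustrative.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def inexact_alm_rpca(D, lam=None, n_iters=200, rho=1.2):
    """Inexact ALM for min ||Z||_* + lam * ||E||_1  subject to  D = Z + E.
    Each pass: singular-value thresholding for the low-rank part Z,
    soft-thresholding for the sparse part E, then a multiplier update."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(D, 2)        # penalty parameter
    mu_max = mu * 1e7
    Y = np.zeros_like(D)                    # Lagrange multiplier
    E = np.zeros_like(D)
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        Z = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        E = shrink(D - Z + Y / mu, lam / mu)
        Y = Y + mu * (D - Z - E)
        mu = min(rho * mu, mu_max)
    return Z, E

rng = np.random.default_rng(0)
L = rng.standard_normal((80, 5)) @ rng.standard_normal((5, 80))           # low-rank part
S = (rng.random((80, 80)) < 0.05) * rng.standard_normal((80, 80)) * 10.0  # sparse corruptions
Z, E = inexact_alm_rpca(L + S)
print(np.linalg.norm(Z - L) / np.linalg.norm(L))   # small relative error
```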

Proceedings ArticleDOI
07 Sep 2016
TL;DR: A co-factorization model, CoFactor, which jointly decomposes the user-item interaction matrix and the item-item co-occurrence matrix with shared item latent factors and provides qualitative results that explain how CoFactor improves the quality of the inferred factors.
Abstract: Matrix factorization (MF) models and their extensions are standard in modern recommender systems. MF models decompose the observed user-item interaction matrix into user and item latent factors. In this paper, we propose a co-factorization model, CoFactor, which jointly decomposes the user-item interaction matrix and the item-item co-occurrence matrix with shared item latent factors. For each pair of items, the co-occurrence matrix encodes the number of users that have consumed both items. CoFactor is inspired by the recent success of word embedding models (e.g., word2vec) which can be interpreted as factorizing the word co-occurrence matrix. We show that this model significantly improves the performance over MF models on several datasets with little additional computational overhead. We provide qualitative results that explain how CoFactor improves the quality of the inferred factors and characterize the circumstances where it provides the most significant improvements.

Posted Content
TL;DR: In this paper, a Riemannian network architecture is proposed for symmetric positive definite (SPD) matrix learning, in which bilinear mapping layers transform the input SPD matrices into more desirable SPD matrices, eigenvalue rectification layers apply a non-linear activation function to the new SPD matrices, and an eigenvalue logarithm layer performs Riemannian computing on the resulting SPD matrices for the regular output layers.
Abstract: Symmetric Positive Definite (SPD) matrix learning methods have become popular in many image and video processing tasks, thanks to their ability to learn appropriate statistical representations while respecting Riemannian geometry of underlying SPD manifolds. In this paper we build a Riemannian network architecture to open up a new direction of SPD matrix non-linear learning in a deep model. In particular, we devise bilinear mapping layers to transform input SPD matrices to more desirable SPD matrices, exploit eigenvalue rectification layers to apply a non-linear activation function to the new SPD matrices, and design an eigenvalue logarithm layer to perform Riemannian computing on the resulting SPD matrices for regular output layers. For training the proposed deep network, we exploit a new backpropagation with a variant of stochastic gradient descent on Stiefel manifolds to update the structured connection weights and the involved SPD matrix data. We show through experiments that the proposed SPD matrix network can be simply trained and outperform existing SPD matrix learning and state-of-the-art methods in three typical visual classification tasks.
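
The three layer types can be sketched directly from their descriptions in the abstract: a bilinear map WᵀXW with W having orthonormal columns, eigenvalue rectification, and an eigenvalue logarithm. The sketch below is a forward pass only, with illustrative shapes and without the manifold-aware backpropagation the paper develops.

```python
import numpy as np

def bimap(X, W):
    """BiMap layer: X -> W^T X W, mapping an SPD matrix to a (smaller) SPD matrix.
    W is assumed to have orthonormal columns (a point on a Stiefel manifold)."""
    return W.T @ X @ W

def reeig(X, eps=1e-4):
    """ReEig layer: rectify eigenvalues below eps (a non-linearity that preserves SPD-ness)."""
    lam, Q = np.linalg.eigh(X)
    return (Q * np.maximum(lam, eps)) @ Q.T

def logeig(X):
    """LogEig layer: matrix logarithm, flattening the SPD manifold for Euclidean output layers."""
    lam, Q = np.linalg.eigh(X)
    return (Q * np.log(lam)) @ Q.T

# Forward pass on a random SPD input (e.g., a covariance descriptor).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
X = A @ A.T + 1e-3 * np.eye(20)                       # SPD input
W, _ = np.linalg.qr(rng.standard_normal((20, 10)))    # orthonormal columns
features = logeig(reeig(bimap(X, W))).ravel()         # vectorized features for a classifier
print(features.shape)                                 # (100,)
```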

Journal ArticleDOI
TL;DR: The first theoretical accuracy guarantee for 1-b compressed sensing with unknown covariance matrix of the measurement vectors is given, and the single-index model of non-linearity is considered, allowing the non- linearity to be discontinuous, not one-to-one and even unknown.
Abstract: We study the problem of signal estimation from non-linear observations when the signal belongs to a low-dimensional set buried in a high-dimensional space. A rough heuristic often used in practice postulates that the non-linear observations may be treated as noisy linear observations, and thus, the signal may be estimated using the generalized Lasso. This is appealing because of the abundance of efficient, specialized solvers for this program. Just as noise may be diminished by projecting onto the lower dimensional space, the error from modeling non-linear observations with linear observations will be greatly reduced when using the signal structure in the reconstruction. We allow general signal structure, only assuming that the signal belongs to some set $K \subset \mathbb {R} ^{n}$ . We consider the single-index model of non-linearity. Our theory allows the non-linearity to be discontinuous, not one-to-one and even unknown. We assume a random Gaussian model for the measurement matrix, but allow the rows to have an unknown covariance matrix. As special cases of our results, we recover near-optimal theory for noisy linear observations, and also give the first theoretical accuracy guarantee for 1-b compressed sensing with unknown covariance matrix of the measurement vectors.

Proceedings Article
05 Dec 2016
TL;DR: This work provides a theoretical argument to determine if a unitary parameterization has restricted capacity, and shows how a complete, full-capacity unitary recurrence matrix can be optimized over the differentiable manifold of unitary matrices.
Abstract: Recurrent neural networks are powerful models for processing sequential data, but they are generally plagued by vanishing and exploding gradient problems. Unitary recurrent neural networks (uRNNs), which use unitary recurrence matrices, have recently been proposed as a means to avoid these issues. However, in previous experiments, the recurrence matrices were restricted to be a product of parameterized unitary matrices, and an open question remains: when does such a parameterization fail to represent all unitary matrices, and how does this restricted representational capacity limit what can be learned? To address this question, we propose full-capacity uRNNs that optimize their recurrence matrix over all unitary matrices, leading to significantly improved performance over uRNNs that use a restricted-capacity recurrence matrix. Our contribution consists of two main components. First, we provide a theoretical argument to determine if a unitary parameterization has restricted capacity. Using this argument, we show that a recently proposed unitary parameterization has restricted capacity for hidden state dimension greater than 7. Second, we show how a complete, full-capacity unitary recurrence matrix can be optimized over the differentiable manifold of unitary matrices. The resulting multiplicative gradient step is very simple and does not require gradient clipping or learning rate adaptation. We confirm the utility of our claims by empirically evaluating our new full-capacity uRNNs on both synthetic and natural data, achieving superior performance compared to both LSTMs and the original restricted-capacity uRNNs.
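
The multiplicative gradient step mentioned in the abstract can be written as a Cayley-transform update that keeps the recurrence matrix exactly unitary: combine the Euclidean gradient into a skew-Hermitian matrix and apply a Cayley retraction. The sketch below follows that general recipe with illustrative names, step size, and a toy objective; it is not the paper's training code.

```python
import numpy as np

def cayley_unitary_step(W, G, lr=0.01):
    """One multiplicative gradient step on the unitary manifold.
    W: current unitary matrix, G: Euclidean gradient of the loss at W.
    A is skew-Hermitian, so the Cayley transform below is exactly unitary."""
    n = W.shape[0]
    A = G @ W.conj().T - W @ G.conj().T
    I = np.eye(n)
    return np.linalg.solve(I + (lr / 2) * A, (I - (lr / 2) * A) @ W)

# Toy check: minimize ||W - T||_F^2 over unitary W for a unitary target T.
rng = np.random.default_rng(0)
n = 6
T, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
W, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
for _ in range(500):
    G = 2 * (W - T)                                # Euclidean gradient of the loss
    W = cayley_unitary_step(W, G, lr=0.05)
print(np.allclose(W.conj().T @ W, np.eye(n)))      # True: W stays exactly unitary
print(np.linalg.norm(W - T))                       # should decrease toward 0
```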

Journal ArticleDOI
TL;DR: The strength of chaos in large N quantum systems can be quantified using the rate of growth of certain out-of-time-order four point functions in weakly coupled matrix Φ4 theory as mentioned in this paper.
Abstract: The strength of chaos in large N quantum systems can be quantified using λ_L, the rate of growth of certain out-of-time-order four point functions. We calculate λ_L to leading order in a weakly coupled matrix Φ4 theory by numerically diagonalizing a ladder kernel. The computation reduces to an essentially classical problem.

Proceedings Article
19 Jun 2016
TL;DR: A variational Bayesian neural network where the parameters are governed via a probability distribution on random matrices is introduced and "pseudo-data" (Snelson & Ghahramani, 2005) is incorporated in this model, which allows for more efficient posterior sampling while maintaining the properties of the original model.
Abstract: We introduce a variational Bayesian neural network where the parameters are governed via a probability distribution on random matrices. Specifically, we employ a matrix variate Gaussian (Gupta & Nagar, 1999) parameter posterior distribution where we explicitly model the covariance among the input and output dimensions of each layer. Furthermore, with approximate covariance matrices we can achieve a more efficient way to represent those correlations that is also cheaper than fully factorized parameter posteriors. We further show that with the "local reparametrization trick" (Kingma et al., 2015) on this posterior distribution we arrive at a Gaussian Process (Rasmussen, 2006) interpretation of the hidden units in each layer and we, similarly with (Gal & Ghahramani, 2015), provide connections with deep Gaussian processes. We continue in taking advantage of this duality and incorporate "pseudo-data" (Snelson & Ghahramani, 2005) in our model, which in turn allows for more efficient posterior sampling while maintaining the properties of the original model. The validity of the proposed approach is verified through extensive experiments.
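
The key object is the matrix variate Gaussian posterior over a layer's weight matrix, with separate row (input) and column (output) covariances. Sampling from it is a simple reparameterization; the sketch below uses illustrative shapes and is not the paper's variational updates.

```python
import numpy as np

def sample_matrix_normal(M, A, B, rng):
    """Draw W ~ MN(M, U, V) with row covariance U = A A^T and column covariance V = B B^T,
    via the reparameterization W = M + A E B^T with E ~ N(0, I);
    equivalently, vec(W) ~ N(vec(M), V kron U)."""
    E = rng.standard_normal(M.shape)
    return M + A @ E @ B.T

rng = np.random.default_rng(0)
n_in, n_out = 5, 3
M = np.zeros((n_in, n_out))                        # posterior mean of the layer's weights
A = np.tril(rng.standard_normal((n_in, n_in)))     # factor of the input (row) covariance
B = np.tril(rng.standard_normal((n_out, n_out)))   # factor of the output (column) covariance

# Empirical check: the covariance of vec(W) approaches kron(B B^T, A A^T).
samples = np.stack([sample_matrix_normal(M, A, B, rng).ravel(order="F")
                    for _ in range(50000)])
emp_cov = np.cov(samples, rowvar=False)
print(np.abs(emp_cov - np.kron(B @ B.T, A @ A.T)).max())   # shrinks as the sample count grows
```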

Journal ArticleDOI
TL;DR: In this paper, the evolution of spectral statistics across the many-body localization transition (MBLT) between ergodic and manybody localized phases in disordered interacting systems is investigated.
Abstract: The many-body localization transition (MBLT) between ergodic and many-body localized phases in disordered interacting systems is a subject of much recent interest. The statistics of eigenenergies is known to be a powerful probe of crossovers between ergodic and integrable systems in simpler examples of quantum chaos. We consider the evolution of the spectral statistics across the MBLT, starting with mapping to a Brownian motion process that analytically relates the spectral properties to the statistics of matrix elements. We demonstrate that the flow from Wigner-Dyson to Poisson statistics is a two-stage process. First, a fractal enhancement of matrix elements upon approaching the MBLT from the delocalized side produces an effective power-law interaction between energy levels, and leads to a plasma model for level statistics. At the second stage, the gas of eigenvalues has local interactions and the level statistics belongs to a semi-Poisson universality class. We verify our findings numerically on the XXZ spin chain. We provide a microscopic understanding of the level statistics across the MBLT and discuss implications for the transition that are strong constraints on possible theories.

Proceedings Article
05 Dec 2016
TL;DR: In this paper, it was shown that all local minima are very close to a global optimum and a curvature bound at saddle points yields a polynomial time global convergence guarantee for stochastic gradient descent.
Abstract: We show that there are no spurious local minima in the non-convex factorized parametrization of low-rank matrix recovery from incoherent linear measurements. With noisy measurements we show all local minima are very close to a global optimum. Together with a curvature bound at saddle points, this yields a polynomial time global convergence guarantee for stochastic gradient descent from random initialization.

Journal ArticleDOI
TL;DR: The goal of this survey article is to impart a working knowledge of the underlying theory and practice of sparse direct methods for solving linear systems and least-squares problems, and to provide an overview of the algorithms, data structures, and software available to solve these problems.
Abstract: Wilkinson defined a sparse matrix as one with enough zeros that it pays to take advantage of them. This informal yet practical definition captures the essence of the goal of direct methods for solving sparse matrix problems. They exploit the sparsity of a matrix to solve problems economically: much faster and using far less memory than if all the entries of a matrix were stored and took part in explicit computations. These methods form the backbone of a wide range of problems in computational science. A glimpse of the breadth of applications relying on sparse solvers can be seen in the origins of matrices in published matrix benchmark collections (Duff and Reid 1979a, Duff, Grimes and Lewis 1989a, Davis and Hu 2011). The goal of this survey article is to impart a working knowledge of the underlying theory and practice of sparse direct methods for solving linear systems and least-squares problems, and to provide an overview of the algorithms, data structures, and software available to solve these problems, so that the reader can both understand the methods and know how best to use them.
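
In practice these methods are usually reached through library routines. A minimal sketch with SciPy's sparse LU factorization (scipy.sparse.linalg.splu, which wraps SuperLU) on an illustrative sparse system is:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Build a large sparse matrix: a 1-D Poisson (tridiagonal) operator.
n = 100000
main = 2.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sp.diags([off, main, off], offsets=[-1, 0, 1], format="csc")

b = np.ones(n)

# Sparse LU factorization (a fill-reducing ordering is chosen internally by SuperLU);
# the factorization can be reused for multiple right-hand sides.
lu = spla.splu(A)
x = lu.solve(b)
print(np.linalg.norm(A @ x - b))   # ~0: the system is solved to machine precision
```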

Proceedings ArticleDOI
18 Jun 2016
TL;DR: The GraphBLAS standard as discussed by the authors defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments.
Abstract: The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. Mathematically, the GraphBLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the mathematics of the GraphBLAS. Graphs represent connections between vertices with edges. Matrices can represent a wide range of graphs using adjacency matrices or incidence matrices. Adjacency matrices are often easier to analyze while incidence matrices are often better for representing data. Fortunately, the two are easily connected by matrix multiplication. A key feature of matrix mathematics is that a very small number of matrix operations can be used to manipulate a very wide range of graphs. This composability of a small number of operations is the foundation of the GraphBLAS. A standard such as the GraphBLAS can only be effective if it has low performance overhead. Performance measurements of prototype GraphBLAS implementations indicate that the overhead is low.
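
The canonical example of a matrix-based graph operation is breadth-first search as repeated sparse matrix-vector multiplication with the adjacency matrix. The small sketch below emulates the Boolean-semiring matrix-vector product with NumPy/SciPy; it is an illustration of the idea, not a GraphBLAS library call.

```python
import numpy as np
import scipy.sparse as sp

# A small directed graph as an adjacency matrix: A[i, j] = 1 for an edge i -> j.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
rows, cols = zip(*edges)
A = sp.csr_matrix((np.ones(len(edges)), (rows, cols)), shape=(6, 6))

# Breadth-first search from vertex 0: each level is one sparse matrix-vector product
# (playing the role of the GraphBLAS matrix-vector multiply over a Boolean semiring).
frontier = np.zeros(6, dtype=bool)
frontier[0] = True
visited = frontier.copy()
level, levels = 0, {0: 0}
while frontier.any():
    frontier = (A.T @ frontier).astype(bool) & ~visited   # vertices newly reachable in one hop
    visited |= frontier
    level += 1
    levels.update({int(v): level for v in np.flatnonzero(frontier)})
print(levels)   # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 4}
```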

Journal ArticleDOI
TL;DR: In this article, a theoretical framework is developed to estimate the optimal binning of X-ray spectra, which takes into account both the number of photons in a given spectral model bin and their average energy over the bin size.
Abstract: Aims. A theoretical framework is developed to estimate the optimal binning of X-ray spectra. Methods. We derived expressions for the optimal bin size for model spectra as well as for observed data using different levels of sophistication. Results. It is shown that by taking into account both the number of photons in a given spectral model bin and their average energy over the bin size, the number of model energy bins and the size of the response matrix can be reduced by a factor of 10–100. The response matrix should then contain the response at the bin centre as well as its derivative with respect to the incoming photon energy. We provide practical guidelines for how to construct optimal energy grids as well as how to structure the response matrix. A few examples are presented to illustrate the present methods.

Journal ArticleDOI
TL;DR: In this paper, the time-dependent treatment of a three-level atom in the V configuration confined in an optical cavity as well as the dynamics of the cavity field in two different regimes, in the presence and absence of the nonlinear mirror, is discussed.
Abstract: The time-dependent treatment of a three-level atom in the V configuration confined in an optical cavity as well as the dynamics of the cavity field in two different regimes, in the presence and absence of the nonlinear mirror, has been discussed. The time-dependent infinite coupled differential equations of motion are solved by the matrix continued fraction method. In deriving the numerical inverse Laplace transform, the quotient difference algorithm and the fast Fourier transform have been applied. The nonlinear effects of the system over the time evolution of the population inversion, the mean photon number and the second-order correlation function for some dimensionless parameters are delineated. Ultimately the reduction of the three-level atom to an effective two-level one under specific conditions and in a weak-driving limit has been performed.

Proceedings Article
19 Jun 2016
TL;DR: Procrustes Flow as discussed by the authors uses a thresholding scheme followed by gradient descent on a non-convex objective to recover a low-rank matrix from linear measurements and shows that as long as the measurements obey a standard restricted isometry property, their algorithm converges to the unknown matrix at a geometric rate.
Abstract: In this paper we study the problem of recovering a low-rank matrix from linear measurements. Our algorithm, which we call Procrustes Flow, starts from an initial estimate obtained by a thresholding scheme followed by gradient descent on a non-convex objective. We show that as long as the measurements obey a standard restricted isometry property, our algorithm converges to the unknown matrix at a geometric rate. In the case of Gaussian measurements, such convergence occurs for an n1 × n2 matrix of rank r when the number of measurements exceeds a constant times (n1 + n2)r.

Journal ArticleDOI
TL;DR: In this article, a fully data driven estimator based on adaptive constrained $\ell_{1}$ minimization is proposed and its rate of convergence is obtained over a collection of parameter spaces.
Abstract: Precision matrix is of significant importance in a wide range of applications in multivariate analysis. This paper considers adaptive minimax estimation of sparse precision matrices in the high dimensional setting. Optimal rates of convergence are established for a range of matrix norm losses. A fully data driven estimator based on adaptive constrained $\ell_{1}$ minimization is proposed and its rate of convergence is obtained over a collection of parameter spaces. The estimator, called ACLIME, is easy to implement and performs well numerically. A major step in establishing the minimax rate of convergence is the derivation of a rate-sharp lower bound. A “two-directional” lower bound technique is applied to obtain the minimax lower bound. The upper and lower bounds together yield the optimal rates of convergence for sparse precision matrix estimation and show that the ACLIME estimator is adaptively minimax rate optimal for a collection of parameter spaces and a range of matrix norm losses simultaneously.

Journal ArticleDOI
TL;DR: This paper is concerned with the global exponential stability problem for a class of neural networks with time-varying delays; using a newly proposed free-matrix-based integral inequality, stability criteria are derived and expressed in terms of linear matrix inequalities.

Posted Content
TL;DR: In this article, the authors presented a quantum algorithm for recommendation systems that runs in time O(poly(k) polylog(mn)) for a preference matrix with a good rank-k approximation, where k is a small constant.
Abstract: A recommendation system uses the past purchases or ratings of $n$ products by a group of $m$ users, in order to provide personalized recommendations to individual users. The information is modeled as an $m \times n$ preference matrix which is assumed to have a good rank-$k$ approximation, for a small constant $k$. In this work, we present a quantum algorithm for recommendation systems that has running time $O(\text{poly}(k)\text{polylog}(mn))$. All known classical algorithms for recommendation systems that work through reconstructing an approximation of the preference matrix run in time polynomial in the matrix dimension. Our algorithm provides good recommendations by sampling efficiently from an approximation of the preference matrix, without reconstructing the entire matrix. For this, we design an efficient quantum procedure to project a given vector onto the row space of a given matrix. This is the first algorithm for recommendation systems that runs in time polylogarithmic in the dimensions of the matrix and provides an example of a quantum machine learning algorithm for a real world application.

Journal ArticleDOI
TL;DR: A hybrid between alternating optimization and the alternating direction method of multipliers, each matrix factor is updated in turn, using ADMM, hence the name AO-ADMM, which can naturally accommodate a great variety of constraints on the factor matrices, and almost all possible loss measures for the fitting.
Abstract: We propose a general algorithmic framework for constrained matrix and tensor factorization, which is widely used in signal processing and machine learning. The new framework is a hybrid between alternating optimization (AO) and the alternating direction method of multipliers (ADMM): each matrix factor is updated in turn, using ADMM, hence the name AO-ADMM. This combination can naturally accommodate a great variety of constraints on the factor matrices, and almost all possible loss measures for the fitting. Computation caching and warm start strategies are used to ensure that each update is evaluated efficiently, while the outer AO framework exploits recent developments in block coordinate descent (BCD)-type methods which help ensure that every limit point is a stationary point, as well as faster and more robust convergence in practice. Three special cases are studied in detail: non-negative matrix/tensor factorization, constrained matrix/tensor completion, and dictionary learning. Extensive simulations and experiments with real data are used to showcase the effectiveness and broad applicability of the proposed framework.
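
The building block of AO-ADMM is the ADMM solve of each factor's constrained least-squares subproblem while the other factor is fixed, with the Gram-matrix factorization cached across inner iterations and the previous factor used as a warm start. The sketch below shows the nonnegativity-constrained case with illustrative names and sizes; it follows that general recipe rather than the paper's exact implementation.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def admm_nnls(X, H, W0, rho=1.0, n_inner=50):
    """ADMM solve of one AO subproblem: min_{W >= 0} 0.5 * ||X - W H||_F^2.
    The Cholesky factor of (H H^T + rho I) is computed once and reused for every
    inner iteration; W0 provides a warm start from the previous outer sweep."""
    k = H.shape[0]
    G = cho_factor(H @ H.T + rho * np.eye(k))     # cached Gram factorization
    XHt = X @ H.T
    W, U = W0.copy(), np.zeros_like(W0)           # W: nonnegative copy, U: scaled dual variable
    for _ in range(n_inner):
        W_tilde = cho_solve(G, (XHt + rho * (W - U)).T).T   # unconstrained least-squares step
        W = np.maximum(W_tilde + U, 0.0)                    # projection onto the nonnegative orthant
        U = U + W_tilde - W                                 # dual update
    return W

# Outer alternating-optimization loop for nonnegative matrix factorization X ~ W H.
rng = np.random.default_rng(0)
m, n, k = 60, 40, 4
X = rng.random((m, k)) @ rng.random((k, n))
W, H = rng.random((m, k)), rng.random((k, n))
for _ in range(30):
    W = admm_nnls(X, H, W)
    H = admm_nnls(X.T, W.T, H.T).T
print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))   # small relative fitting error
```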

Journal ArticleDOI
TL;DR: The problem of estimating the spectral density is defined carefully, and how to measure the accuracy of an approximate spectral density is discussed; computing all eigenvalues directly is generally costly and wasteful, especially for matrices of large dimension.
Abstract: In physics, it is sometimes desirable to compute the so-called density of states (DOS), also known as the spectral density, of a real symmetric matrix $A$. The spectral density can be viewed as a probability density distribution that measures the likelihood of finding eigenvalues near some point on the real line. The most straightforward way to obtain this density is to compute all eigenvalues of $A$, but this approach is generally costly and wasteful, especially for matrices of large dimension. There exist alternative methods that allow us to estimate the spectral density function at much lower cost. The major computational cost of these methods is in multiplying $A$ with a number of vectors, which makes them appealing for large-scale problems where products of the matrix $A$ with arbitrary vectors are relatively inexpensive. This article defines the problem of estimating the spectral density carefully and discusses how to measure the accuracy of an approximate spectral density. It then surveys a few known methods for estimating the spectral density.
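
A standard low-cost estimator of this kind is the kernel polynomial method: expand the spectral density in Chebyshev polynomials and estimate the moments with random probe vectors, so that only matrix-vector products with A are needed. The compact sketch below uses plain Chebyshev moments without damping kernels and illustrative sizes; it is one representative of the family surveyed, not a specific algorithm from the article.

```python
import numpy as np

def chebyshev_dos(A, n_moments=80, n_probes=20, n_grid=400, seed=0):
    """Estimate the spectral density of a symmetric matrix A from Chebyshev moments
    mu_k = (1/n) tr(T_k(A~)), computed with Hutchinson's stochastic trace estimator,
    where A~ is A rescaled so that its spectrum lies in [-1, 1]."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Rescale the spectrum into [-1, 1] (here via the exact extreme eigenvalues for simplicity;
    # in practice these bounds would come from a few Lanczos steps).
    lmin, lmax = np.linalg.eigvalsh(A)[[0, -1]]
    c, d = (lmax + lmin) / 2, (lmax - lmin) / 2 * 1.01
    As = (A - c * np.eye(n)) / d

    mu = np.zeros(n_moments)
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=n)          # Rademacher probe vector
        t_prev, t_curr = v, As @ v                    # T_0(A~) v and T_1(A~) v
        mu[0] += v @ t_prev
        mu[1] += v @ t_curr
        for k in range(2, n_moments):                 # Chebyshev three-term recurrence
            t_prev, t_curr = t_curr, 2 * As @ t_curr - t_prev
            mu[k] += v @ t_curr
    mu /= n_probes * n

    # Reconstruct the density on a grid from the moments.
    x = np.linspace(-0.99, 0.99, n_grid)
    T = np.cos(np.arange(n_moments)[:, None] * np.arccos(x)[None, :])   # T_k(x)
    weights = np.full(n_moments, 2.0)
    weights[0] = 1.0
    dos = (weights[:, None] * mu[:, None] * T).sum(0) / (np.pi * np.sqrt(1 - x ** 2))
    return c + d * x, dos / d                          # map back to the original spectrum

# Example: density of states of a random symmetric matrix (semicircle-like spectrum).
rng = np.random.default_rng(1)
n = 500
B = rng.standard_normal((n, n))
A = (B + B.T) / np.sqrt(2 * n)
grid, dos = chebyshev_dos(A)
print(grid[np.argmax(dos)])   # peak of the estimated density, near 0 for this ensemble
```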

Posted Content
TL;DR: This work addresses the rectangular matrix completion problem by lifting the unknown matrix to a positive semidefinite matrix in higher dimension, and optimizing a nonconvex objective over the semideFinite factor using a simple gradient descent scheme.
Abstract: We address the rectangular matrix completion problem by lifting the unknown matrix to a positive semidefinite matrix in higher dimension, and optimizing a nonconvex objective over the semidefinite factor using a simple gradient descent scheme. With $O( \mu r^2 \kappa^2 n \max(\mu, \log n))$ random observations of an $n_1 \times n_2$ $\mu$-incoherent matrix of rank $r$ and condition number $\kappa$, where $n = \max(n_1, n_2)$, the algorithm linearly converges to the global optimum with high probability.