Nonnegative Matrix Factorization: A Comprehensive Review
References
Regression Shrinkage and Selection via the Lasso
Learning the parts of objects by non-negative matrix factorization
Algorithms for Non-negative Matrix Factorization
Frequently Asked Questions (19)
Q2. What are the contributions mentioned in the paper "Nonnegative matrix factorization: a comprehensive review" ?
This survey paper mainly focuses on the theoretical research into NMF over the last 5 years, where the principles, basic models, properties, and algorithms of NMF, along with its various modifications, extensions, and generalizations, are summarized systematically. Moreover, some open issues that remain to be solved are discussed. This survey aims to construct an integrated, state-of-the-art framework for the NMF concept, from which follow-up research may benefit.
Q3. What are the exemplars of low-rank approximations?
The canonical methods, such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), Vector Quantization (VQ), etc., are the exemplars of such low-rank approximations.
Q4. What is the inspiring benefit of NMF?
It was hoped that NMF would produce an intrinsically parts-based and sparse representation in unsupervised mode [3], which is the most inspiring benefit of NMF.
Q5. Why is NMF an imperative tool in multivariate data analysis?
Because of the enhanced semantic interpretability under the nonnegativity and the ensuing sparsity, NMF has become an imperative tool in multivariate data analysis and has been widely used in the fields of mathematics, optimization, neural computing, pattern recognition and machine learning [9], data mining [10], signal processing [11], image engineering and computer vision [11], spectral data analysis [12], bioinformatics [13], chemometrics [1], geophysics [14], and finance and economics [15].
Q6. What is the smallest L that makes the decomposition possible?
The smallest L making the decomposition possible is called the nonnegative rank of the nonnegative matrix X, denoted as rank+(X).
Q7. What is the way to solve the rank-one NMF problem?
Given that computing a globally optimal rank-one approximation can be done in polynomial time while the general NMF problem is NP-hard, Gillis and Glineur [72] introduced Nonnegative Matrix Underapproximation (NMU) to solve the higher rank NMF problem in a recursive way.
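A minimal sketch of the recursive idea: peel off one nonnegative rank-one term at a time from the residual. This is only an illustration of the recursive structure, not Gillis and Glineur's actual NMU algorithm, which additionally enforces the underapproximation constraint UV <= X; here the residual is simply clipped at zero, and all names are illustrative.

```python
import numpy as np

def recursive_rank_one(X, L):
    """Simplified recursive rank-one scheme in the spirit of NMU.

    The real NMU enforces UV <= X via Lagrangian duality; this sketch
    just clips the residual at zero after removing each rank-one term.
    """
    R = X.astype(float).copy()
    U = np.zeros((X.shape[0], L))
    V = np.zeros((L, X.shape[1]))
    for k in range(L):
        # Leading singular triplet; for a nonnegative matrix it can be
        # chosen nonnegative (Perron-Frobenius), but numerical sign
        # flips are possible, so normalize the sign and clip.
        u, s, vt = np.linalg.svd(R, full_matrices=False)
        u1, v1 = u[:, 0], vt[0, :]
        if u1.sum() < 0:
            u1, v1 = -u1, -v1
        u1, v1 = np.maximum(u1, 0), np.maximum(v1, 0)
        U[:, k] = np.sqrt(s[0]) * u1
        V[k, :] = np.sqrt(s[0]) * v1
        R = np.maximum(R - np.outer(U[:, k], V[k, :]), 0)  # nonnegative residual
    return U, V
```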
Q8. What is the approach to determine the number of factor matrices?
In practice, the trial and error approach is often adopted, where L is set in advance and then adjusted according to the feedback of the factorization results, such as the approximation errors.
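The trial-and-error procedure can be sketched as a loop over candidate ranks that records the approximation error for each. A plain multiplicative-update solver (Lee and Seung [3]) is used as the inner factorization here purely for illustration; function and parameter names are assumptions, not from the paper.

```python
import numpy as np

def rank_by_trial(X, candidate_ranks, n_iter=300, seed=0):
    """Trial-and-error choice of L: factorize X for each candidate rank
    and record the relative approximation error ||X - UV||_F / ||X||_F."""
    rng = np.random.default_rng(seed)
    eps = 1e-9
    errors = {}
    for L in candidate_ranks:
        U = rng.random((X.shape[0], L)) + eps
        V = rng.random((L, X.shape[1])) + eps
        for _ in range(n_iter):
            # Multiplicative updates preserve nonnegativity automatically.
            V *= (U.T @ X) / (U.T @ U @ V + eps)
            U *= (X @ V.T) / (U @ V @ V.T + eps)
        errors[L] = np.linalg.norm(X - U @ V) / np.linalg.norm(X)
    return errors
```

One would then pick the L at which the error curve flattens out (an "elbow"), balancing approximation quality against model size.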
Q9. What is the crest of the previous work on Basic NMF?
The “ANLS using Projected Gradient (PG) methods” proposed by Lin [56] is the crest of the previous work on Basic NMF, which makes headway in the bound-constrained optimization.
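One half of an ANLS sweep with projected gradients can be sketched as follows. Note the simplification: a fixed step size 1 / ||U^T U||_2 (the inverse Lipschitz constant of the quadratic) replaces Lin's adaptive step-size search, so this is an illustrative sketch rather than Lin's algorithm.

```python
import numpy as np

def pg_nnls(U, X, n_iter=200):
    """Projected-gradient solve of the bound-constrained subproblem
    min_{V >= 0} 0.5 * ||X - U V||_F^2  (one half of an ANLS sweep)."""
    G = U.T @ U
    step = 1.0 / np.linalg.norm(G, 2)          # 1 / Lipschitz constant
    UtX = U.T @ X
    V = np.zeros((U.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = G @ V - UtX                     # gradient of the quadratic
        V = np.maximum(V - step * grad, 0.0)   # project onto V >= 0
    return V
```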
Q10. What is the purpose of the constrained gradient distance minimization problem?
Given that the norm of the gradient of a mapping H from the low-dimensional manifold to the original high-dimensional space provides the measure of how far apart H maps nearby points, a constrained gradient distance minimization problem is formulated, whose goal is to find the map that best preserves local topology.
Q11. Why is it important to use the local rather than global minimization characteristic?
Because of the local rather than global minimization characteristic, it is obvious that the initialization of U and V will directly influence the convergence rate and the solution quality.
Q12. What is the main consideration to reduce the computational consumption of the basic NMF algorithms?
Another consideration to decrease the computational consumption is the parallel implementation of the existing Basic NMF algorithms, which tries to divide and distribute the factorization task block-wisely among several CPUs or GPUs [74].
Q13. How did Berry et al. solve the LS subproblem?
Berry et al. [10] recommended the ALS NMF algorithm, which computes the solutions to the subproblems as unconstrained LS problems with multiple right-hand sides and maintains nonnegativity by setting negative values to zero at each iteration.
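A minimal NumPy sketch of this scheme, assuming the squared Frobenius objective: each subproblem is solved as an unconstrained least-squares problem with all columns as right-hand sides at once, and negatives are zeroed afterwards. The function name and defaults are illustrative.

```python
import numpy as np

def als_nmf(X, L, n_iter=100, seed=0):
    """ALS-style NMF: unconstrained LS solves plus clipping to zero."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], L))
    V = np.zeros((L, X.shape[1]))
    for _ in range(n_iter):
        # Solve min_V ||X - U V||_F for all columns of X at once, then clip.
        V = np.maximum(np.linalg.lstsq(U, X, rcond=None)[0], 0.0)
        # Solve min_U ||X^T - V^T U^T||_F, then clip.
        U = np.maximum(np.linalg.lstsq(V.T, X.T, rcond=None)[0].T, 0.0)
    return U, V
```

Clipping makes each sweep cheap, but unlike exact nonnegative least squares it can cycle or stall, which is one reason later work moved to properly constrained subproblem solvers.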
Q14. How did Cai and his colleagues model the manifold structure?
Graph regularized NMF (GRNMF) proposed by Cai et al. [99], [98] modeled the manifold structure by constructing a nearest neighborhood graph on a scatter of data points.
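The graph-construction step of this manifold modeling can be sketched directly: connect each data point (a column of X) to its k nearest neighbors with 0/1 weights, symmetrize, and form the graph Laplacian that enters the GRNMF regularizer. The 0/1 weighting is one of several schemes Cai et al. consider; names here are illustrative.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Build a k-nearest-neighbor graph over the columns of X and return
    its graph Laplacian Lap = D - W (D = degree matrix, W = 0/1 weights)."""
    n = X.shape[1]
    # Pairwise squared Euclidean distances between columns.
    sq = (X ** 2).sum(axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dist[i])
        neighbors = [j for j in order if j != i][:k]
        W[i, neighbors] = 1.0
    W = np.maximum(W, W.T)        # symmetrize the adjacency
    D = np.diag(W.sum(axis=1))    # degree matrix
    return D - W                  # graph Laplacian
```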
Q15. What is the way to mitigate the problem of local minima?
To mitigate the problem of local minima, Cichocki and Zdunek [85], [60] recommended a simple yet effective approach named multilayer NMF by replacing the basis matrix U with a set of cascaded factor matrices.
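The cascade can be sketched by repeatedly re-factorizing the current coefficient matrix, so that X is approximated by (U1 U2 ... Ud) V. The inner solver below is a plain multiplicative-update NMF, and the layer count and settings are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

def _mu_nmf(X, L, n_iter=200, seed=0):
    """Inner solver: plain multiplicative updates (Lee & Seung)."""
    rng = np.random.default_rng(seed)
    eps = 1e-9
    U = rng.random((X.shape[0], L)) + eps
    V = rng.random((L, X.shape[1])) + eps
    for _ in range(n_iter):
        V *= (U.T @ X) / (U.T @ U @ V + eps)
        U *= (X @ V.T) / (U @ V @ V.T + eps)
    return U, V

def multilayer_nmf(X, L, depth=3):
    """Multilayer NMF sketch: replace the basis matrix with a cascade
    U = U1 U2 ... Ud obtained by re-factorizing the coefficients."""
    factors = []
    V = X
    for _ in range(depth):
        Ui, V = _mu_nmf(V, L)
        factors.append(Ui)
    U = factors[0]
    for Ui in factors[1:]:
        U = U @ Ui
    return U, V   # X is approximated by (U1 U2 ... Ud) V
```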
Q16. What is the possible direction for the optimization problem?
Given that the optimization problem is not jointly convex in both U and V but is separately convex in either U or V, alternating minimization is seemingly the feasible direction.
Q17. What is the way to represent the high-dimensional stochastic pattern?
In other words, this approach tries to represent the high-dimensional stochastic pattern with far fewer bases, so a perfect approximation can be achieved only if the intrinsic features are identified in U.
Q18. What is the way to select a rotational matrix D?
There are many ways to select a rotational matrix D, which need not be a generalized permutation matrix or even a nonnegative matrix, such that the transformed factor matrices UD and D^-1 V are still nonnegative.
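The simplest concrete instance of this non-uniqueness is a positive diagonal rescaling: with D diagonal and positive, UD and D^-1 V remain nonnegative while the product, and hence the approximation, is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.random((6, 3))            # nonnegative factors
V = rng.random((3, 5))

# The simplest valid choice of D: a positive diagonal rescaling.
d = np.array([2.0, 0.5, 3.0])
D = np.diag(d)
D_inv = np.diag(1.0 / d)

U2, V2 = U @ D, D_inv @ V         # transformed factors
assert (U2 >= 0).all() and (V2 >= 0).all()   # still nonnegative
assert np.allclose(U @ V, U2 @ V2)           # same approximation
```

This scaling/permutation ambiguity is why NMF solutions are often normalized (e.g., unit-norm columns of U) before the factors are interpreted.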
Q19. What are the penalty terms for imposing certain application dependent constraints?
The various Constrained NMF models can be unified under the similar extended objective function D_C(X || UV) = D(X || UV) + alpha * J1(U) + beta * J2(V), (13) where J1(U) and J2(V) are the penalty terms enforcing certain application-dependent constraints, and alpha and beta are small regularization parameters balancing the tradeoff between the fitting goodness and the constraints.
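Evaluating this unified objective is straightforward once D, J1, and J2 are chosen. A sketch assuming the squared Frobenius divergence for D and, as one illustrative choice, L1 penalties J1(U) = ||U||_1 and J2(V) = ||V||_1 that promote sparsity (the paper's J1 and J2 vary by model):

```python
import numpy as np

def constrained_objective(X, U, V, alpha=0.1, beta=0.1):
    """Unified Constrained NMF objective of Eq. (13), with squared
    Frobenius fit and L1 penalties as an illustrative instantiation."""
    fit = 0.5 * np.linalg.norm(X - U @ V) ** 2    # D(X || UV)
    return fit + alpha * np.abs(U).sum() + beta * np.abs(V).sum()
```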