
Showing papers by "Michael K. Ng published in 2022"


Journal ArticleDOI
TL;DR: In this paper, a deep learning model (i.e. superpixel-based Residual Networks 50, SP-ResNet50) was used to automatically differentiate leaves from non-leaves in phenocam images and to derive leaf fraction at the tree-crown scale.
Abstract: Tropical leaf phenology—particularly its variability at the tree-crown scale—dominates the seasonality of carbon and water fluxes. However, given enormous species diversity, an accurate means of monitoring leaf phenology in tropical forests is still lacking. Time series of the Green Chromatic Coordinate (GCC) metric derived from tower-based red–green–blue (RGB) phenocams have been widely used to monitor leaf phenology in temperate forests, but their application in the tropics remains problematic. To improve monitoring of tropical phenology, we explored the use of a deep learning model (i.e. superpixel-based Residual Networks 50, SP-ResNet50) to automatically differentiate leaves from non-leaves in phenocam images and to derive leaf fraction at the tree-crown scale. To evaluate our model, we used a year of data from six phenocams in two contrasting forests in Panama. We first built a comprehensive library of leaf and non-leaf pixels across various acquisition times, exposure conditions and specific phenocams. We then divided this library into training and testing components. We evaluated the model at three levels: 1) superpixel level with a testing set, 2) crown level by comparing the model-derived leaf fractions with those derived using image-specific supervised classification, and 3) temporally using all daily images to assess the diurnal stability of the model-derived leaf fraction. Finally, we compared the model-derived leaf fraction phenology with leaf phenology derived from GCC. Our results show that: 1) the SP-ResNet50 model accurately differentiates leaves from non-leaves (overall accuracy of 93%) and is robust across all three levels of evaluation; 2) the model accurately quantifies leaf fraction phenology across tree-crowns and forest ecosystems; and 3) the combined use of leaf fraction and GCC helps infer the timing of leaf emergence, maturation and senescence, critical information for modeling photosynthetic seasonality of tropical forests. Collectively, this study offers an improved means for automated tropical phenology monitoring using phenocams.

11 citations
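
The pipeline is straightforward to prototype: segment each image into superpixels, classify each superpixel with a ResNet-50 head, and report the pixel-weighted leaf fraction. A minimal sketch follows, assuming a SLIC segmenter and an untrained binary head; the segment count, patch handling, and the `leaf_fraction` helper are illustrative assumptions, not the authors' released code.

```python
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import resnet50
from skimage.segmentation import slic

# Backbone with a binary leaf / non-leaf head (illustrative, untrained here).
model = resnet50(num_classes=2).eval()

def leaf_fraction(rgb, n_segments=600):
    """Classify each SLIC superpixel and return the fraction labeled 'leaf'."""
    segments = slic(rgb, n_segments=n_segments, compactness=10.0)
    votes = []
    for sp in np.unique(segments):
        ys, xs = np.nonzero(segments == sp)
        # Crop a box around the superpixel and resize it to the network input.
        patch = rgb[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        t = torch.from_numpy(patch).permute(2, 0, 1).float().unsqueeze(0)
        t = nn.functional.interpolate(t, size=(224, 224), mode="bilinear")
        with torch.no_grad():
            is_leaf = model(t).argmax(dim=1).item() == 1
        votes.append((is_leaf, len(ys)))            # weight votes by pixel count
    leaf_px = sum(n for leaf, n in votes if leaf)
    return leaf_px / sum(n for _, n in votes)
```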


Journal ArticleDOI
TL;DR: A new minimization problem is proposed whose objective combines the nuclear norm with a quadratic loss weighted among three channels; error bounds in both the clean and corrupted regimes are established, relying on some new results for quaternion matrices.
Abstract: In this paper, we study color image inpainting as a pure quaternion matrix completion problem. In the literature, the theoretical guarantee for quaternion matrix completion is not well-established. Our main aim is to propose a new minimization problem with an objective combining nuclear norm and a quadratic loss weighted among three channels. To fill the theoretical vacancy, we obtain the error bound in both clean and corrupted regimes, which relies on some new results of quaternion matrices. A general Gaussian noise is considered in robust completion where all observations are corrupted. Motivated by the error bound, we propose to handle unbalanced or correlated noise via a cross-channel weight in the quadratic loss, with the main purpose of rebalancing noise level, or removing noise correlation. Extensive experimental results on synthetic and color image data are presented to confirm and demonstrate our theoretical findings.

11 citations
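
As a rough illustration of this model class, the following is a real-valued sketch: nuclear-norm regularization plus a channel-weighted quadratic loss, minimized by proximal gradient with singular-value thresholding. The quaternion algebra, the paper's cross-channel weighting rule, and its theory are not reproduced; `weighted_completion` and the channel unfolding are assumptions made for illustration.

```python
import numpy as np

def svt(X, tau):
    """Singular-value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def weighted_completion(Y, mask, w, lam=1.0, step=0.5, iters=200):
    """min_X lam*||X||_* + 0.5 * sum_c w[c]*||mask*(X_c - Y_c)||_F^2.
    Channels are unfolded side by side: a real stand-in for the
    quaternion (RGB) completion model in the abstract."""
    m, n, c = Y.shape
    X = np.zeros_like(Y)
    for _ in range(iters):
        grad = w[None, None, :] * mask * (X - Y)      # channel-weighted residual
        Z = X - step * grad
        Xmat = svt(Z.reshape(m, n * c), step * lam)   # SVT on the unfolding
        X = Xmat.reshape(m, n, c)
    return X

# usage: w rebalances noise levels across channels, e.g. w = np.array([1., 1., .5])
```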



Journal ArticleDOI
TL;DR: A novel method named Multiple Graphs and Low-rank Embedding (MGLE) is proposed, which models the local structure information of multiple domains using multiple graphs and learns a low-rank embedding of the target domain; an iterative optimization algorithm is developed to solve the resulting problem.
Abstract: Multi-source domain adaptation is a challenging topic in transfer learning, especially when the data of each domain are represented by different kinds of features, i.e., Multi-source Heterogeneous Domain Adaptation (MHDA). It is important to take advantage of the knowledge extracted from multiple sources as well as bridge the heterogeneous spaces for handling the MHDA paradigm. This article proposes a novel method named Multiple Graphs and Low-rank Embedding (MGLE), which models the local structure information of multiple domains using multiple graphs and learns the low-rank embedding of the target domain. Then, MGLE augments the learned embedding with the original target data. Specifically, we introduce the modules of both domain discrepancy and domain relevance into the multiple graphs and low-rank embedding learning procedure. Subsequently, we develop an iterative optimization algorithm to solve the resulting problem. We evaluate the effectiveness of the proposed method on several real-world datasets. Promising results show that the performance of MGLE is better than that of the baseline methods in terms of several metrics, such as AUC, MAE, accuracy, precision, F1 score, and MCC, demonstrating the effectiveness of the proposed method.

9 citations


Journal ArticleDOI
TL;DR: This article presents a simple yet effective semi-supervised node classification method named Hypergraph Convolution on Nodes-Hyperedges network, which performs filtering on both nodes and hyperedges as well as recovers the original hypergraph with the least information loss.
Abstract: Hypergraphs have shown great power in representing high-order relations among entities, and lots of hypergraph-based deep learning methods have been proposed to learn informative data representations for the node classification problem. However, most of these deep learning approaches do not take full consideration of either the hyperedge information or the original relationships among nodes and hyperedges. In this article, we present a simple yet effective semi-supervised node classification method named Hypergraph Convolution on Nodes-Hyperedges network, which performs filtering on both nodes and hyperedges as well as recovers the original hypergraph with the least information loss. Instead of only reducing the cross-entropy loss over the labeled samples as most previous approaches do, we additionally consider the hypergraph reconstruction loss as prior information to improve prediction accuracy. As a result, by taking both the cross-entropy loss on the labeled samples and the hypergraph reconstruction loss into consideration, we are able to achieve discriminative latent data representations for training a classifier. We perform extensive experiments on the semi-supervised node classification problem and compare the proposed method with state-of-the-art algorithms. The promising results demonstrate the effectiveness of the proposed method.

9 citations
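
For context, here is a minimal sketch of the standard hypergraph convolution that this family of networks builds on, written with an incidence matrix H: filtering takes the form X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Θ. The paper's co-filtering of hyperedges and its reconstruction loss are not shown; the toy sizes and names below are illustrative.

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One HGNN-style filtering step with unit hyperedge weights (W = I):
    X' = relu(Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta)."""
    De = np.diag(1.0 / H.sum(axis=0))              # hyperedge degrees
    Dv = np.diag(1.0 / np.sqrt(H.sum(axis=1)))     # node degrees
    A = Dv @ H @ De @ H.T @ Dv                     # normalized hypergraph adjacency
    return np.maximum(A @ X @ Theta, 0.0)          # ReLU activation

# toy: 4 nodes, 2 hyperedges, 3-d features -> 2-d output
H = np.array([[1, 0], [1, 1], [0, 1], [1, 1]], float)  # incidence matrix
X = np.random.randn(4, 3)
out = hypergraph_conv(X, H, np.random.randn(3, 2))
```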


Journal ArticleDOI
TL;DR: In this paper, a compression method was proposed that considers both the nonlinear response and a multilinear low-rank constraint on the kernel tensor; a convex relaxation of the resulting nonconvex problem allows it to be solved directly by the alternating direction method of multipliers.
Abstract: Deep neural networks have shown impressive performance in many areas, including computer vision and natural language processing. The millions of parameters in a deep neural network limit its deployment on low-end devices due to intensive memory use and expensive computational cost. In the literature, several network compression techniques based on tensor decompositions have been proposed to compress deep neural networks. Existing techniques are designed for each network unit by approximating the linear response or kernel tensor using various tensor decomposition methods. Moreover, research has shown that there exists significant redundancy between different filters and feature channels of the kernel tensor in each convolution layer. In this paper, we propose a new algorithm to compress deep neural networks by considering both the nonlinear response and a multilinear low-rank constraint on the kernel tensor. To overcome the resulting difficulty of nonconvex optimization, we propose a convex relaxation scheme such that the problem can be solved by the alternating direction method of multipliers (ADMM) directly. Thus, the Tucker-2 rank and the feature matrix of the Tucker decomposition can be determined simultaneously. The effectiveness of the proposed method is evaluated on the CIFAR-10 and large-scale ILSVRC12 datasets for CNNs including ResNet-18, AlexNet and GoogleNet. According to our numerical computation, the proposed method is able to obtain a large reduction in model size with a small loss in accuracy. The compression performance of the proposed method is better than that of existing methods.

8 citations
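
The Tucker-2 structure can be sketched with plain HOSVD on the two channel modes of a convolution kernel; a rank-(r_out, r_in) factorization then corresponds to a 1x1 conv, a small spatial conv, and another 1x1 conv. The paper's ADMM-based convex relaxation selects the ranks automatically, whereas the sketch below fixes them by hand.

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def tucker2(K, r_out, r_in):
    """Tucker-2 of a conv kernel K (c_out, c_in, kh, kw) via HOSVD:
    K ~ core x_0 U x_1 V, compressing only the two channel modes."""
    U = np.linalg.svd(unfold(K, 0), full_matrices=False)[0][:, :r_out]
    V = np.linalg.svd(unfold(K, 1), full_matrices=False)[0][:, :r_in]
    core = np.einsum('oihw,or,is->rshw', K, U, V)
    return core, U, V   # = 1x1 conv (V), small spatial conv (core), 1x1 conv (U)

K = np.random.randn(64, 32, 3, 3)
core, U, V = tucker2(K, r_out=16, r_in=8)
K_hat = np.einsum('rshw,or,is->oihw', core, U, V)   # low-rank reconstruction
```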


Journal ArticleDOI
TL;DR: Hospitals with higher levels of infrastructure and resources have better outcomes after cancer surgery, independent of country income.

6 citations


Journal ArticleDOI
TL;DR: In this paper, a transformed total variation (TTV) minimization model was proposed to investigate robust image recovery from a limited number of noisy measurements.
Abstract: Transformed $L_1$ (TL1) regularization has been shown to have signal recovery capability comparable to $L_1-L_2$ and $L_1/L_2$ regularization, regardless of whether the measurement matrix satisfies the restricted isometry property (RIP). In the spirit of the TL1 method, we introduce a transformed total variation (TTV) minimization model to investigate robust image recovery from a certain number of noisy measurements. An optimal error bound, up to a logarithmic factor, for robust image recovery from compressed measurements via the TTV minimization model is established, and the RIP-based condition is improved compared with total variation (TV) minimization. Numerical results for image reconstruction corroborate our theoretical results and illustrate the efficiency of the TTV minimization model among state-of-the-art methods. Empirically, the error between the reconstructed image and the original image is shown to be smaller than that produced by TV minimization.

4 citations
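
The penalty itself is easy to state: apply the TL1 function rho_a(t) = (a+1)t/(a+t) to image gradient magnitudes, which interpolates between an l0-like penalty (small a) and ordinary TV (large a). A small sketch, with forward differences as an assumed discretization:

```python
import numpy as np

def ttv(img, a=1.0):
    """Transformed TV: sum of rho_a(|grad|) with rho_a(t) = (a+1)*t/(a+t).
    As a -> infinity this tends to TV; as a -> 0+ it behaves like an
    l0 count on the gradients."""
    gx = np.diff(img, axis=1, append=img[:, -1:])    # forward differences
    gy = np.diff(img, axis=0, append=img[-1:, :])
    mag = np.sqrt(gx**2 + gy**2)                     # isotropic gradient magnitude
    return np.sum((a + 1.0) * mag / (a + mag))
```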


Journal ArticleDOI
TL;DR: In this article , the authors proposed to truncate and properly dither the data prior to a uniform quantization, and showed that the quantization only slightly worsens the multiplicative factor.
Abstract: Modern datasets often exhibit heavy-tailed behaviour, while quantization is inevitable in digital signal processing and many machine learning problems. This paper studies the quantization of heavy-tailed data in some fundamental statistical estimation problems, where the underlying distributions have bounded moments of some order. We propose to truncate and properly dither the data prior to a uniform quantization. Our major standpoint is that (near) minimax rates of estimation error are achievable merely from the quantized data produced by the proposed scheme. In particular, concrete results are worked out for covariance estimation, compressed sensing, and matrix completion, all agreeing that the quantization only slightly worsens the multiplicative factor. In addition, we study compressed sensing where both the covariate (i.e., sensing vector) and the response are quantized. Under covariate quantization, although our recovery program is nonconvex because the covariance matrix estimator lacks positive semi-definiteness, all local minimizers are proved to enjoy a near optimal error bound. Moreover, by the concentration inequality of the product process and a covering argument, we establish a near minimax uniform recovery guarantee for quantized compressed sensing with heavy-tailed noise. Finally, numerical simulations are provided to corroborate our theoretical results.

4 citations
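
The scheme the abstract describes (truncate, dither, uniformly quantize) is a few lines of code; with a uniform dither matched to the bin width, the quantizer is unbiased for the truncated value. The truncation level and bin width below are illustrative choices, not the paper's tuned parameters.

```python
import numpy as np

def quantize(x, tau=3.0, delta=0.5, rng=np.random.default_rng(0)):
    """Truncate -> dither -> uniform quantization.
    With dither u ~ Uniform(-delta/2, delta/2), E[q] = clip(x, -tau, tau),
    so the quantizer adds no bias beyond the truncation step."""
    xt = np.clip(x, -tau, tau)                       # tame heavy tails
    u = rng.uniform(-delta / 2, delta / 2, x.shape)  # uniform dither
    return delta * np.floor((xt + u) / delta) + delta / 2

# heavy-tailed sample: the quantized mean still tracks the true mean
x = np.random.default_rng(1).standard_t(df=3, size=100_000)
print(x.mean(), quantize(x).mean())
```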


Journal Article
TL;DR: Under both sub-Gaussian and heavy-tailed regimes, new estimators that handle high-dimensional scaling are proposed; in the sub-Gaussian case they achieve minimax rates up to logarithmic factors, so the quantization costs nearly nothing from the perspective of the statistical learning rate.
Abstract: Compared with data with high precision, one-bit (binary) data are preferable in many applications because of the efficiency in signal storage, processing, transmission, and enhancement of privacy. In this paper, we study three fundamental statistical estimation problems, i.e., sparse covariance matrix estimation, sparse linear regression, and low-rank matrix completion via binary data arising from an easy-to-implement one-bit quantization process that contains truncation, dithering and quantization as typical steps. Under both sub-Gaussian and heavy-tailed regimes, new estimators that handle high-dimensional scaling are proposed. In the sub-Gaussian case, we show that our estimators achieve minimax rates up to logarithmic factors; hence the quantization costs nearly nothing from the perspective of the statistical learning rate. In the heavy-tailed case, we truncate the data before dithering to achieve a bias-variance trade-off, which results in estimators whose convergence rates are the square root of the corresponding minimax rates. Experimental results on synthetic data are reported to support and demonstrate the statistical properties of our estimators under one-bit quantization.

4 citations
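
The one-bit variant can be sketched the same way: after truncation, a uniform dither on [-tau, tau] followed by a sign gives tau*sign(x+u) as an unbiased surrogate for any x with |x| <= tau, which is the mechanism plug-in estimators build on. Parameters below are illustrative.

```python
import numpy as np

def one_bit(x, tau=3.0, rng=np.random.default_rng(0)):
    """Truncate, dither with Uniform(-tau, tau), keep only the sign.
    For |x| <= tau, E[tau * sign(x + u)] = x, so the single stored bit
    is an unbiased one-bit surrogate for the truncated value."""
    xt = np.clip(x, -tau, tau)
    u = rng.uniform(-tau, tau, x.shape)
    return np.sign(xt + u).astype(np.int8)           # one bit per entry

x = np.random.default_rng(1).standard_t(df=4, size=100_000)
bits = one_bit(x)
print(x.mean(), 3.0 * bits.mean())   # debiased one-bit mean estimate (tau = 3)
```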


Journal ArticleDOI
TL;DR: This work systematically compares four types of Generative Adversarial Networks (GANs), namely the combinations of GAN/Wasserstein GAN and their multi-scale versions, and finds that the Multi-scale Wasserstein GAN regularization, in general, performs the best among all the methods.
Abstract: Precipitation nowcasting is an important task, which can be used in numerous applications. The key challenge of the task lies in radar echo map prediction. Previous studies leverage Convolutional Recurrent Neural Networks (ConvRNN) to address the problem. However, these approaches are built upon mean square losses, and the results tend to have inaccurate appearances, shapes and positions for predictions. To alleviate this problem, we explore the idea of adversarial regularization, and systematically compare four types of Generative Adversarial Networks (GANs), which are the combinations of GAN/Wasserstein GAN and their multi-scale versions. Extensive experiments on a real-world radar data set and four typical meteorological examples are conducted. The results validate the effectiveness of adversarial regularization. The developed models show superior performance over the existing prediction approaches in the majority of circumstances. Moreover, we find that the Wasserstein GAN regularization often delivers better results than the GAN regularization due to its robustness, and the Multi-scale Wasserstein GAN, in general, performs the best among all the methods. To reproduce the results, we release the source code at: https://github.com/luochuyao/MultiScaleGAN and the test system at: http://39.97.217.145:80/.

Journal ArticleDOI
TL;DR: This paper proposes a novel model called Hypergraph Collaborative Network (HCoN), which takes the information from both previous vertices and hyperedges into consideration to achieve informative latent representations and further introduces the hypergraph reconstruction error as a regularizer to learn an effective classifier.
Abstract: In many practical datasets, such as co-citation and co-authorship, relationships across the samples are more complex than pair-wise. Hypergraphs provide a flexible and natural representation for such complex correlations and thus obtain increasing attention in the machine learning and data mining communities. Existing deep learning-based hypergraph approaches seek to learn the latent vertex representations based on either vertices or hyperedges from previous layers and focus on reducing the cross-entropy error over labeled vertices to obtain a classifier. In this paper, we propose a novel model called Hypergraph Collaborative Network (HCoN), which takes the information from both previous vertices and hyperedges into consideration to achieve informative latent representations and further introduces the hypergraph reconstruction error as a regularizer to learn an effective classifier. We evaluate the proposed method on two cases, i.e., semi-supervised vertex and hyperedge classifications. We carry out the experiments on several benchmark datasets and compare our method with several state-of-the-art approaches. Experimental results demonstrate that the performance of the proposed method is better than that of the baseline methods.

Journal ArticleDOI
TL;DR: A uniformly dithered one-bit quantization scheme for high-dimensional statistical estimation is proposed that achieves optimal minimax rates up to logarithmic factors; a novel setting where both measurement and covariate are quantized is also proposed and studied for the first time.
Abstract: In this paper, we propose a uniformly dithered 1-bit quantization scheme for high-dimensional statistical estimation. The scheme contains truncation, dithering, and quantization as typical steps. As canonical examples, the quantization scheme is applied to the estimation problems of sparse covariance matrix estimation, sparse linear regression (i.e., compressed sensing), and matrix completion. We study both sub-Gaussian and heavy-tailed regimes, where the underlying distribution of heavy-tailed data is assumed to have bounded moments of some order. We propose new estimators based on 1-bit quantized data. In the sub-Gaussian regime, our estimators achieve near minimax rates, indicating that our quantization scheme costs very little. In the heavy-tailed regime, while the rates of our estimators become essentially slower, these results are either the first ones in a 1-bit quantized and heavy-tailed setting, or already improve on existing comparable results in some respects. Under the observations in our setting, the rates are almost tight in compressed sensing and matrix completion. Our 1-bit compressed sensing results feature general sensing vectors that are sub-Gaussian or even heavy-tailed. We are also the first to investigate a novel setting where both the covariate and the response are quantized. In addition, our approach to 1-bit matrix completion does not rely on likelihood and represents the first method robust to pre-quantization noise with unknown distribution. Experimental results on synthetic data are presented to support our theoretical analysis.

Proceedings ArticleDOI
16 Nov 2022
TL;DR: In this paper, the authors proposed a transfer learning method for PINNs via keeping singular vectors and optimizing singular values (namely SVD-PINNs), which works for solving a class of PDEs with different but close right-hand-side functions.
Abstract: Physics-informed neural networks (PINNs) have attracted significant attention for solving partial differential equations (PDEs) in recent years because they alleviate the curse of dimensionality that appears in traditional methods. However, the biggest disadvantage of PINNs is that one neural network corresponds to one PDE. In practice, we usually need to solve a class of PDEs, not just one. With the explosive growth of deep learning, many useful techniques in general deep learning tasks are also suitable for PINNs. Transfer learning methods may reduce the cost of PINNs in solving a class of PDEs. In this paper, we propose a transfer learning method for PINNs via keeping singular vectors and optimizing singular values (namely SVD-PINNs). Numerical experiments on high-dimensional PDEs (10-d linear parabolic equations and 10-d Allen-Cahn equations) show that SVD-PINNs work for solving a class of PDEs with different but close right-hand-side functions.
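
The core trick can be sketched in a few lines: factor each pretrained weight matrix as W = U diag(s) V^T, freeze the singular vectors U and V, and retrain only the singular values s (and biases) on the new but nearby PDE. The layer shape and the stand-in pretrained weights below are placeholders, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SVDLinear(nn.Module):
    """Linear layer (y = x @ W + b) with frozen singular vectors and
    trainable singular values, as in the SVD-PINNs transfer idea."""
    def __init__(self, W, b):
        super().__init__()
        U, s, Vh = torch.linalg.svd(W, full_matrices=False)
        self.U = nn.Parameter(U, requires_grad=False)    # frozen
        self.Vh = nn.Parameter(Vh, requires_grad=False)  # frozen
        self.s = nn.Parameter(s.clone())                 # the only trained weights
        self.b = nn.Parameter(b.clone())

    def forward(self, x):
        return x @ (self.U * self.s) @ self.Vh + self.b  # U diag(s) Vh

# wrap a "pretrained" PINN layer (random stand-in weights here)
W0, b0 = torch.randn(64, 64), torch.zeros(64)
layer = SVDLinear(W0, b0)   # optimizing layer.s adapts to a close PDE
```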

Journal ArticleDOI
TL;DR: Although NGPR involving measurement matrices $A_k$ is more computationally demanding than NPR involving measurement vectors $\alpha_k$, the results reveal that NGPR exhibits stronger robustness than NPR under biased and deterministic noise.
Abstract: A noisy generalized phase retrieval (NGPR) problem refers to the problem of estimating $x_0 \in \mathbb{C}^d$ from noisy quadratic samples $\{x_0^* A_k x_0 + \eta_k\}_{k=1}^n$, where $A_k$ is a Hermitian matrix and $\eta_k$ is a noise scalar. When $A_k = \alpha_k \alpha_k^*$ for some $\alpha_k \in \mathbb{C}^d$, it reduces to a standard noisy phase retrieval (NPR) problem. The main aim of this paper is to study the estimation performance of empirical $\ell_2$ risk minimization in both problems when $A_k$ in NGPR, or $\alpha_k$ in NPR, is drawn from a sub-Gaussian distribution. Under different kinds of noise patterns, we establish error bounds that can imply approximate reconstruction; these results are new in the literature. In NGPR, we show the bounds are of $O(\|\eta\|/\sqrt{n})$ and $O(\|\eta\|_\infty \sqrt{d/n})$ for general noise, and of $O(\sqrt{d \log n / n})$ and $O(\sqrt{d (\log n)^2 / n})$ for random noise with sub-Gaussian and sub-exponential tails respectively, where $\|\eta\|$ and $\|\eta\|_\infty$ are the 2-norm and sup-norm of the noise vector $\eta = (\eta_k)$. Under heavy-tailed noise, by truncating response outliers we propose a robust estimator that possesses an error bound with a slower convergence rate. On the other hand, we obtain in NPR bounds of $O(\sqrt{d \log n / n})$ and $O(\sqrt{d (\log n)^2 / n})$ for sub-Gaussian and sub-exponential noise respectively, which are essentially tighter than the existing bound $O(\|\eta\|_2/\sqrt{n})$. Although NGPR involving measurement matrices $A_k$ is more computationally demanding than NPR involving measurement vectors $\alpha_k$, our results reveal that NGPR exhibits stronger robustness than NPR under biased and deterministic noise. Experimental results are presented to confirm and demonstrate our theoretical findings.
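
The estimator analyzed here is plain empirical $\ell_2$ risk minimization; below is a real symmetric toy sketch using gradient descent from a warm start. The step size, sample sizes, and initialization are illustrative, and the complex Hermitian case is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 2000
x0 = rng.standard_normal(d)
A = rng.standard_normal((n, d, d))
A = (A + A.transpose(0, 2, 1)) / 2                 # symmetric measurement matrices
y = np.einsum('i,kij,j->k', x0, A, x0) + 0.01 * rng.standard_normal(n)

def grad(x):
    """Gradient of (1/n) * sum_k (x' A_k x - y_k)^2 for symmetric A_k."""
    r = np.einsum('i,kij,j->k', x, A, x) - y       # residuals x' A_k x - y_k
    return 4.0 / n * np.einsum('k,kij,j->i', r, A, x)

x = x0 + 0.3 * rng.standard_normal(d)              # warm start near the truth
for _ in range(1000):
    x -= 0.02 * grad(x)
print(min(np.linalg.norm(x - x0), np.linalg.norm(x + x0)))  # global sign ambiguity
```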


Journal ArticleDOI
TL;DR: A novel model named Adversarial Auto-encoder Domain Adaptation (AADA) is presented to handle the recommendation problem under cold-start settings; promising results in terms of precision, recall, NDCG, and hit rate verify the effectiveness of the proposed method.
Abstract: This article presents a novel model named Adversarial Auto-encoder Domain Adaptation to handle the recommendation problem under cold-start settings. Specifically, we divide the hypergraph into two hypergraphs, i.e., a positive hypergraph and a negative one. Below, we adopt the cold-start user recommendation for illustration. After achieving positive and negative hypergraphs, we apply hypergraph auto-encoders to them to obtain positive and negative embeddings of warm users and items. Additionally, we employ a multi-layer perceptron to get warm and cold-start user embeddings called regular embeddings. Subsequently, for warm users, we assign positive and negative pseudo-labels to their positive and negative embeddings, respectively, and treat their positive and regular embeddings as the source and target domain data, respectively. Then, we develop a matching discriminator to jointly minimize the classification loss of the positive and negative warm user embeddings and the distribution gap between the positive and regular warm user embeddings. In this way, warm users’ positive and regular embeddings are connected. Since the positive hypergraph maintains the relations between positive warm user and item embeddings, and the regular warm and cold-start user embeddings follow a similar distribution, the regular cold-start user embedding and positive item embedding are bridged to discover their relationship. The proposed model can be easily extended to handle the cold-start item recommendation by changing inputs. We perform extensive experiments on real-world datasets for both cold-start user and cold-start item recommendations. Promising results in terms of precision, recall, normalized discounted cumulative gain, and hit rate verify the effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: All complex sparse signals or low-rank matrices can be uniformly and exactly recovered from a near optimal number of complex Gaussian measurement phases: by recasting PO-CS as a linear compressive sensing problem, exact recovery follows from the restricted isometry property (RIP).
Abstract: In phase-only compressive sensing (PO-CS), our goal is to recover low-complexity signals (e.g., sparse signals, low-rank matrices) from the phase of complex linear measurements. While perfect recovery of signal direction in PO-CS was observed quite early, the exact reconstruction guarantee for a fixed, real signal was recently established by Jacques and Feuillen [IEEE Trans. Inf. Theory, 67 (2021), pp. 4150-4161]. However, two questions remain open: the uniform recovery guarantee and exact recovery of complex signals. In this paper, we almost completely address these two open questions. We prove that all complex sparse signals or low-rank matrices can be uniformly and exactly recovered from a near optimal number of complex Gaussian measurement phases. By recasting PO-CS as a linear compressive sensing problem, the exact recovery follows from the restricted isometry property (RIP). Our approach to the uniform recovery guarantee is based on covering arguments that involve a delicate control of the (original linear) measurements with overly small magnitude. To work with complex signals, a different sign-product embedding property and a careful rescaling of the sensing matrix are employed. In addition, we show, as an extension, that the uniform recovery is stable under moderate bounded noise. We also propose to add Gaussian dither before capturing the phases to achieve full reconstruction with norm information. Experimental results are reported to corroborate and demonstrate our theoretical results.

Journal ArticleDOI
TL;DR: In this paper, a Riemannian optimization method was proposed for low-rank third-order tensor completion under the tensor-tensor product induced by Discrete Cosine Transform-related transforms.

Journal ArticleDOI
TL;DR: The proposed Time Series Attention Transformer (TSAT) represents both temporal information and inter-dependencies of multivariate time series in terms of edge-enhanced dynamic graphs, and applies the embedded dynamic graphs to time series forecasting problems, including two real-world datasets and two benchmark datasets.
Abstract: A reliable and efficient representation of multivariate time series is crucial in various downstream machine learning tasks. In multivariate time series forecasting, each variable depends on its historical values, and there are inter-dependencies among variables as well. Models have to be designed to capture both intra- and inter-relationships among the time series. To move towards this goal, we propose the Time Series Attention Transformer (TSAT) for multivariate time series representation learning. Using TSAT, we represent both temporal information and inter-dependencies of multivariate time series in terms of edge-enhanced dynamic graphs. The intra-series correlations are represented by nodes in a dynamic graph; a self-attention mechanism is modified to capture the inter-series correlations by using the super-empirical mode decomposition (SMD) module. We applied the embedded dynamic graphs to time series forecasting problems, including two real-world datasets and two benchmark datasets. Extensive experiments show that TSAT clearly outperforms six state-of-the-art baseline methods in various forecasting horizons. We further visualize the embedded dynamic graphs to illustrate the graph representation power of TSAT. We share our code at https://github.com/RadiantResearch/TSAT.

Journal ArticleDOI
TL;DR: In this paper, a geometric inexact Newton-conjugate gradient (Newton-CG) method is proposed for solving the resulting matrix optimization problems, and, under mild assumptions, the global and quadratic convergence of the proposed method is established for the complex case.
Abstract: In this paper, we first give new model formulations for computing an arbitrary generalized singular value of a Grassmann matrix pair or a real matrix pair. In these new formulations, we need to solve matrix optimization problems with unitary constraints or orthogonal constraints. We propose a geometric inexact Newton-conjugate gradient (Newton-CG) method for solving the resulting matrix optimization problems. Under some mild assumptions, we establish the global and quadratic convergence of the proposed method for the complex case. Some numerical examples are given to illustrate the effectiveness and high accuracy of the proposed method.

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed HessianFR outperforms baselines in terms of convergence and image generation quality; the convergence of the stochastic algorithm and an approximation of the Hessian inverse are exploited to improve algorithm efficiency.
Abstract: Wide applications of differentiable two-player sequential games (e.g., image generation by GANs) have raised much interest among researchers in studying efficient and fast algorithms. Most existing algorithms are developed based on nice properties of simultaneous games, i.e., convex-concave payoff functions, but are not applicable to sequential games with different settings. Some conventional gradient descent ascent algorithms theoretically and numerically fail to find the local Nash equilibrium of the simultaneous game or the local minimax (i.e., local Stackelberg equilibrium) of the sequential game. In this paper, we propose HessianFR, an efficient Hessian-based Follow-the-Ridge algorithm with theoretical guarantees. Furthermore, the convergence of the stochastic algorithm and an approximation of the Hessian inverse are exploited to improve algorithm efficiency. A series of experiments on training generative adversarial networks (GANs) have been conducted on both synthetic and real-world large-scale image datasets (e.g., MNIST, CIFAR-10 and CelebA). The experimental results demonstrate that the proposed HessianFR outperforms baselines (e.g., FR, GDN, GDA and EG) in terms of convergence and image generation quality.
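
HessianFR extends the Follow-the-Ridge (FR) update of Wang et al. for min-max problems; as background, here is a scalar-toy sketch of that FR predecessor with analytic Hessian blocks. The paper's stochastic convergence analysis and efficient Hessian-inverse approximation are not reproduced.

```python
import numpy as np

def fr_step(x, y, grad, hess, eta_x=0.05, eta_y=0.05):
    """One Follow-the-Ridge update for min_x max_y f(x, y).
    The extra y-term cancels the drift off the ridge y*(x)
    caused by the x-update."""
    gx, gy = grad(x, y)
    Hyy, Hyx = hess(x, y)
    x_new = x - eta_x * gx
    y_new = y + eta_y * gy + eta_x * np.linalg.solve(Hyy, Hyx @ gx)
    return x_new, y_new

# toy sequential game f(x, y) = x*y - y^2/2 (local minimax at the origin)
grad = lambda x, y: (y, x - y)                                # (df/dx, df/dy)
hess = lambda x, y: (np.array([[-1.0]]), np.array([[1.0]]))   # (Hyy, Hyx)

x, y = np.array([1.0]), np.array([-0.5])
for _ in range(2000):
    x, y = fr_step(x, y, grad, hess)
print(x, y)   # both converge toward 0
```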

Journal ArticleDOI
TL;DR: This paper proposes a separable low-rank quaternion linear mixing model for polarized signals, uses a block coordinate descent algorithm to compute the nonnegative factor matrix $H$ in real number space, and tests its effectiveness on the applications of polarization image representation and spectro-polarimetric imaging unmixing.
Abstract: Polarization is a unique characteristic of transverse waves and is represented by Stokes parameters. Analysis of polarization states can reveal valuable information about the sources. In this paper, we propose a separable low-rank quaternion linear mixing model for polarized signals: we assume each column of the source factor matrix $\breve{W}$ equals a column of the polarized data matrix $\breve{M}$ and refer to the corresponding problem as separable quaternion matrix factorization (SQMF). We discuss some properties of the matrices that can be decomposed by SQMF. To determine the source factor matrix $\breve{W}$ in quaternion space, we propose a heuristic algorithm called the quaternion successive projection algorithm (QSPA), inspired by the successive projection algorithm. To guarantee the effectiveness of QSPA, a new normalization operator is proposed for the quaternion matrix. We use a block coordinate descent algorithm to compute the nonnegative factor matrix $H$ in real number space. We test our method on the applications of polarization image representation and spectro-polarimetric imaging unmixing to verify its effectiveness.
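
QSPA adapts the classical successive projection algorithm (SPA) to quaternion matrices. As a reference point, here is a real-valued SPA sketch of the core recursion (pick the most energetic column, project it out); the quaternion normalization operator introduced in the paper is omitted.

```python
import numpy as np

def spa(M, r):
    """Successive projection: under separability (M = W @ [I, H']),
    greedily select the r 'pure' columns of M."""
    R, idx = M.astype(float).copy(), []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))  # most energetic column
        idx.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)                     # project onto u-orthogonal
    return idx

# separable toy: the data contain the pure source columns themselves
rng = np.random.default_rng(0)
W = rng.random((8, 3))
H = np.hstack([np.eye(3), rng.dirichlet(np.ones(3), size=7).T])
print(sorted(spa(W @ H, 3)))   # recovers the 3 pure columns: [0, 1, 2]
```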

Journal ArticleDOI
TL;DR: Several numerical examples arising from partial differential equations, queueing problems and probabilistic Boolean networks are presented to demonstrate that solutions of linear systems with sizes ranging from septillion to nonillion can be learned quite accurately.
Abstract: In this paper, we study deep neural networks for solving extremely large linear systems arising from physically relevant problems. Because of the curse of dimensionality, it is expensive to store both the solution and right-hand side vectors of such extremely large linear systems. Our idea is to employ a neural network to characterize the solution with many fewer parameters than the size of the solution. We present an error analysis of the proposed method provided that the solution vector can be approximated by a continuous quantity in the Barron space. Several numerical examples arising from partial differential equations, queueing problems and probabilistic Boolean networks are presented to demonstrate that solutions of linear systems with sizes ranging from septillion ($10^{24}$) to nonillion ($10^{30}$) can be learned quite accurately.
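
The idea can be sketched concretely: parameterize the solution vector u in R^N by a small network over binary encodings of the index, so the parameter count scales with log N rather than N, and fit randomly sampled rows of a system whose entries are generated on the fly. The circulant, diagonally dominant toy system below is an assumed stand-in, far smaller than the paper's examples.

```python
import math
import torch
import torch.nn as nn

bits = 20                                      # N = 2^20 unknowns here; the same
N = 2 ** bits                                  # code shape scales to far larger N

def enc(i):
    """Binary index encoding: the network input size is `bits`, not N."""
    return ((i.unsqueeze(-1) >> torch.arange(bits)) & 1).float()

u = nn.Sequential(nn.Linear(bits, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(u.parameters(), lr=1e-3)

for step in range(3000):
    i = torch.randint(0, N, (256,))
    # sampled rows of a diagonally dominant circulant system:
    # u_i - 0.45*(u_{i-1} + u_{i+1}) = sin(2*pi*i/N)
    rhs = torch.sin(2 * math.pi * i.float() / N).unsqueeze(-1)
    res = u(enc(i)) - 0.45 * (u(enc((i - 1) % N)) + u(enc((i + 1) % N))) - rhs
    loss = res.pow(2).mean()                   # sampled least-squares residual
    opt.zero_grad(); loss.backward(); opt.step()
```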

Proceedings ArticleDOI
27 Sep 2022
TL;DR: A novel CRS solver based on an approximate secular equation is proposed, which requires only some of the Hessian eigenvalues and is therefore much more efficient; it needs only matrix-vector multiplication, making it particularly suitable for high-dimensional applications of unconstrained non-convex optimization, such as low-rank recovery and deep learning.
Abstract: The cubic regularization method (CR) is a popular algorithm for unconstrained non-convex optimization. At each iteration, CR solves a cubically regularized quadratic problem, called the cubic regularization subproblem (CRS). One way to solve the CRS relies on solving the secular equation, whose computational bottleneck lies in the computation of all eigenvalues of the Hessian matrix. In this paper, we propose and analyze a novel CRS solver based on an approximate secular equation, which requires only some of the Hessian eigenvalues and is therefore much more efficient. Two approximate secular equations (ASEs) are developed. For both ASEs, we first study the existence and uniqueness of their roots and then establish an upper bound on the gap between the root and that of the standard secular equation. Such an upper bound can in turn be used to bound the distance from the approximate CRS solution based on ASEs to the true CRS solution, thus offering a theoretical guarantee for our CRS solver. A desirable feature of our CRS solver is that it requires only matrix-vector multiplication but not matrix inversion, which makes it particularly suitable for high-dimensional applications of unconstrained non-convex optimization, such as low-rank recovery and deep learning. Numerical experiments with synthetic and real datasets are conducted to investigate the practical performance of the proposed CRS solver. Experimental results show that the proposed solver outperforms two state-of-the-art methods.
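
For reference, here is the standard secular-equation route that the paper accelerates: with an eigendecomposition H = Q diag(lam) Q^T, the CRS solution is s = -(H + lam*I)^{-1} g, where lam >= max(0, -lam_min) solves ||(H + lam*I)^{-1} g|| = lam/sigma. The sketch below uses the full eigendecomposition and bisection (the so-called hard case is ignored); the paper's contribution is an approximate secular equation needing only a few extreme eigenvalues.

```python
import numpy as np

def crs_solve(H, g, sigma, tol=1e-10):
    """min_s g's + 0.5 s'Hs + (sigma/3)||s||^3 via the secular equation.
    Uses all eigenvalues (the expensive step the paper avoids);
    the degenerate 'hard case' is not handled."""
    lam_all, Q = np.linalg.eigh(H)
    gq = Q.T @ g
    norm_s = lambda lam: np.linalg.norm(gq / (lam_all + lam))
    lo = max(0.0, -lam_all[0]) + 1e-12
    hi = max(lo, sigma * np.linalg.norm(g)) + 1.0   # heuristic initial bracket
    while norm_s(hi) > hi / sigma:                  # grow until root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):             # bisect the monotone gap
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if norm_s(mid) > mid / sigma else (lo, mid)
    lam = 0.5 * (lo + hi)
    return Q @ (-gq / (lam_all + lam))

H = np.diag([-1.0, 0.5, 2.0]); g = np.array([1.0, -2.0, 0.5])
s = crs_solve(H, g, sigma=1.0)
print(np.allclose(H @ s + np.linalg.norm(s) * s, -g))   # first-order optimality
```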

Journal ArticleDOI
TL;DR: Numerical examples on both synthetic and real-world data sets demonstrate the superiority of the proposed method for nonnegative tensor data completion over existing tensor-based or matrix-based methods.
Abstract: Tensor decomposition is a powerful tool for extracting physically meaningful latent factors from multi-dimensional nonnegative data, and has attracted increasing interest in a variety of fields such as image processing, machine learning, and computer vision. In this paper, we propose a sparse nonnegative Tucker decomposition and completion method for the recovery of underlying nonnegative data under noisy observations. Here the underlying nonnegative data tensor is decomposed into a core tensor and several factor matrices, with all entries being nonnegative and the factor matrices being sparse. The loss function is derived from the maximum likelihood estimation of the noisy observations, and the $\ell_0$ norm is employed to enhance the sparsity of the factor matrices. We establish the error bound of the estimator of the proposed model under generic noise scenarios, which is then specified for observations with additive Gaussian noise, additive Laplace noise, and Poisson observations, respectively. Our theoretical results are better than those of existing tensor-based or matrix-based methods. Moreover, the minimax lower bounds are shown to match the derived upper bounds up to logarithmic factors. Numerical examples on both synthetic and real-world data sets demonstrate the superiority of the proposed method for nonnegative tensor data completion.


Journal ArticleDOI
TL;DR: Numerical experiments are conducted and show that the proposed algorithms can effectively recover damaged spherical images, outperforming approaches that purely use a deep learning denoiser or a plug-and-play model.
Abstract: Spherical image processing has been widely applied in many important fields, such as omnidirectional vision for autonomous cars, global climate modelling, and medical imaging. It is non-trivial to extend an algorithm developed for flat images to spherical ones. In this work, we focus on the challenging task of spherical image inpainting with a deep learning-based regularizer. Instead of naively applying existing models for planar images, we employ a fast directional spherical Haar framelet transform and develop a novel optimization framework based on a sparsity assumption of the framelet transform. Furthermore, by employing a progressive encoder-decoder architecture, a new and better-performing deep CNN denoiser is carefully designed and works as an implicit regularizer. Finally, we use a plug-and-play method to handle the proposed optimization model, which can be implemented efficiently by training the CNN denoiser prior. Numerical experiments are conducted and show that the proposed algorithms can effectively recover damaged spherical images and achieve the best performance over purely using a deep learning denoiser or a plug-and-play model.
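
The plug-and-play loop itself is compact: alternate a gradient step on the data-fidelity term with an application of the denoiser, which acts as the implicit regularizer. In the sketch below a Gaussian filter stands in for the paper's trained spherical CNN denoiser, and the spherical Haar framelet machinery is omitted entirely.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pnp_inpaint(y, mask, denoise, step=1.0, iters=100):
    """Plug-and-play proximal-gradient iteration for inpainting:
    data term 0.5*||mask*(x - y)||^2, prior enforced by the denoiser."""
    x = y.copy()
    for _ in range(iters):
        grad = mask * (x - y)          # gradient of the data-fidelity term
        x = denoise(x - step * grad)   # denoiser as an implicit regularizer
    return x

# stand-in denoiser (the paper trains a progressive encoder-decoder CNN)
denoise = lambda z: gaussian_filter(z, sigma=0.8)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
mask = (rng.random((64, 64)) > 0.3).astype(float)   # 30% missing pixels
restored = pnp_inpaint(img * mask, mask, denoise)
```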

Journal ArticleDOI
TL;DR: Experimental results on two datasets of human cancer diseases are presented, which demonstrate that the proposed multiple graph spectral clustering method can not only identify gene functional modules but also accurately calculate the degree of association among different cancer diseases.

Journal ArticleDOI
TL;DR: In this article, a momentum-accelerated adaptive cubic regularization method (ARCm) is proposed to improve convergence performance, demonstrated on non-convex logistic regression and robust linear regression models.
Abstract: The cubic regularization method (CR) and its adaptive ver-sion (ARC) are popular Newton-type methods in solving unconstrained non-convex optimization problems, due to its global convergence to local minima under mild conditions. The main aim of this paper is to develop a momentum accelerated adaptive cubic regularization method (ARCm) to improve the convergent performance. With the proper choice of momentum step size, we show the global convergence of ARCm and the local convergence can also be guaranteed under the K(cid:32)L property. Such global and local convergence can also be established when inexact solvers with low computational costs are employed in the iteration procedure. Numerical results for non-convex logistic regression and robust linear regression models are reported to demonstrate that the proposed ARCm significantly outperforms state-of-the-art cubic regularization methods (e.g., CR, momentum-based CR, ARC) and the trust region method. In particular, the number of iterations required by ARCm is less than 10% to 50% required by the most competitive method (ARC) in the experiments.