Journal ArticleDOI

Expectation-Maximization Gaussian-Mixture Approximate Message Passing

01 Oct 2013-IEEE Transactions on Signal Processing (IEEE)-Vol. 61, Iss: 19, pp 4658-4672
TL;DR: An empirical-Bayesian technique is proposed that simultaneously learns the signal distribution while MMSE-recovering the signal (according to the learned distribution) using AMP; the non-zero distribution is modeled as a Gaussian mixture whose parameters are learned through expectation maximization, with AMP implementing the expectation step.
Abstract: When recovering a sparse signal from noisy compressive linear measurements, the distribution of the signal's non-zero coefficients can have a profound effect on recovery mean-squared error (MSE). If this distribution were known a priori, one could use computationally efficient approximate message passing (AMP) techniques for nearly minimum-MSE (MMSE) recovery. In practice, however, the distribution is unknown, motivating the use of robust algorithms like LASSO (which is nearly minimax optimal) at the cost of significantly larger MSE for non-least-favorable distributions. As an alternative, we propose an empirical-Bayesian technique that simultaneously learns the signal distribution while MMSE-recovering the signal, according to the learned distribution, using AMP. In particular, we model the non-zero distribution as a Gaussian mixture and learn its parameters through expectation maximization, using AMP to implement the expectation step. Numerical experiments on a wide range of signal classes confirm the state-of-the-art performance of our approach, in both reconstruction error and runtime, in the high-dimensional regime, for most (but not all) sensing operators.
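
To make the EM/AMP interplay in the abstract concrete, here is a minimal NumPy sketch of one EM M-step for a Bernoulli-Gaussian-mixture prior, assuming AMP's expectation step has already reduced the problem to per-coefficient pseudo-measurements r with effective AWGN variance mu. The function and variable names (lam, omega, theta, phi) are our illustration, not the paper's code.

```python
import numpy as np

def gauss(x, mean, var):
    """Scalar Gaussian pdf N(x; mean, var), evaluated elementwise."""
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_gm_update(r, mu, lam, omega, theta, phi):
    """One EM M-step for a Bernoulli-Gaussian-mixture prior (illustrative).

    Prior: p(x) = (1-lam)*delta(x) + lam * sum_l omega[l]*N(x; theta[l], phi[l]).
    AMP's E-step supplies r (length-N pseudo-measurements) and mu, their
    effective AWGN variance, i.e. r[n] ~ x[n] + N(0, mu).
    """
    # Responsibilities of the zero component vs. each Gaussian component.
    p0 = (1 - lam) * gauss(r, 0.0, mu)                      # shape (N,)
    pl = lam * omega * gauss(r[:, None], theta, phi + mu)   # shape (N, L)
    beta = pl / (p0 + pl.sum(axis=1))[:, None]              # P(component l | r[n])

    # Per-component posterior means/variances under the scalar AWGN model.
    gamma = (r[:, None] * phi + theta * mu) / (phi + mu)    # shape (N, L)
    nu = phi * mu / (phi + mu)                              # shape (L,)

    # M-step: re-estimate sparsity rate, weights, means, and variances.
    lam_new = beta.sum() / len(r)
    omega_new = beta.sum(axis=0) / beta.sum()
    theta_new = (beta * gamma).sum(axis=0) / beta.sum(axis=0)
    phi_new = (beta * (nu + (gamma - theta_new) ** 2)).sum(axis=0) / beta.sum(axis=0)
    return lam_new, omega_new, theta_new, phi_new
```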
Citations
Journal ArticleDOI
TL;DR: This paper proposes two novel neural-network architectures that decouple prediction errors across layers in the same way that the approximate message passing (AMP) algorithms decouple them across iterations: through Onsager correction.
Abstract: Deep learning has gained great popularity due to its widespread success on many inference problems. We consider the application of deep learning to the sparse linear inverse problem, where one seeks to recover a sparse signal from a few noisy linear measurements. In this paper, we propose two novel neural-network architectures that decouple prediction errors across layers in the same way that the approximate message passing (AMP) algorithms decouple them across iterations: through Onsager correction. First, we propose a “learned AMP” network that significantly improves upon Gregor and LeCun's “learned ISTA.” Second, inspired by the recently proposed “vector AMP” (VAMP) algorithm, we propose a “learned VAMP” network that offers increased robustness to deviations in the measurement matrix from i.i.d. Gaussian. In both cases, we jointly learn the linear transforms and scalar nonlinearities of the network. Interestingly, with i.i.d. signals, the linear transforms and scalar nonlinearities prescribed by the VAMP algorithm coincide with the values learned through back-propagation, leading to an intuitive interpretation of learned VAMP. Finally, we apply our methods to two problems from 5G wireless communications: compressive random access and massive-MIMO channel estimation.
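
As a rough illustration of the Onsager-corrected layer structure described above, here is a minimal NumPy sketch of one soft-thresholding AMP layer written in "learned AMP" form; the matrix B and threshold lam are the quantities a network would learn by back-propagation (in plain AMP, B = A.T and lam follows a fixed rule). This is our reconstruction under stated assumptions, not the authors' released code.

```python
import numpy as np

def soft(x, lam):
    """Soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def lamp_layer(y, A, B, lam, x_prev, v_prev):
    """One learned-AMP-style layer (illustrative sketch).

    The b * v_prev term is the Onsager correction that decouples
    prediction errors across layers, exactly as it decouples them
    across iterations in the original AMP algorithm.
    """
    M = len(y)
    b = np.count_nonzero(x_prev) / M    # Onsager coefficient for soft-thresholding
    v = y - A @ x_prev + b * v_prev     # corrected residual
    x = soft(x_prev + B @ v, lam)       # shrink the pseudo-measurement
    return x, v
```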

395 citations

Journal ArticleDOI
TL;DR: Simulation results show that the proposed method, which exploits the inherent sparsity of mmWave channels, can provide an accurate channel estimate and achieve a substantial reduction in training overhead.
Abstract: In this letter, we consider channel estimation for intelligent reflecting surface (IRS)-assisted millimeter wave (mmWave) systems, where an IRS is deployed to assist the data transmission from the base station (BS) to a user. It is shown that for the purpose of joint active and passive beamforming, the knowledge of a large-size cascade channel matrix needs to be acquired. To reduce the training overhead, the inherent sparsity in mmWave channels is exploited. By utilizing properties of Khatri-Rao and Kronecker products, we find a sparse representation of the cascade channel and convert cascade channel estimation into a sparse signal recovery problem. Simulation results show that our proposed method can provide an accurate channel estimate and achieve a substantial training overhead reduction.
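
The linear-algebra step behind the sparse representation can be illustrated with the identity vec(A diag(x) B) = (Bᵀ ⊙ A) x, where ⊙ denotes the column-wise Khatri-Rao product. Below is a minimal NumPy/SciPy check of that identity; it is our illustration (matrix sizes are arbitrary), and the letter's actual cascade-channel construction is more involved.

```python
import numpy as np
from scipy.linalg import khatri_rao

# Check: vec(A @ diag(x) @ B) == khatri_rao(B.T, A) @ x  (column-major vec)
m, k, n = 4, 3, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((m, k))
B = rng.standard_normal((k, n))
x = rng.standard_normal(k)

lhs = (A @ np.diag(x) @ B).flatten(order="F")  # column-major vectorization
rhs = khatri_rao(B.T, A) @ x                   # Khatri-Rao sensing matrix times x
assert np.allclose(lhs, rhs)
```

When x is sparse (few mmWave paths), the vectorized cascade channel is a sparse combination of the Khatri-Rao columns, so standard compressed-sensing solvers apply.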

327 citations

Journal ArticleDOI
TL;DR: Experimental results show that the block sparse Bayesian learning framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce CPU code execution in the data compression stage.
Abstract: Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Especially, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with much fewer nonzero entries to compress recordings. Particularly, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution in CPU in the data compression stage.
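
The practical payoff of the sparse binary sensing matrix is easy to see in code: with only two nonzero entries per column, compressing a signal block costs just a handful of additions per sample. The following sketch is our illustration (function name and dimensions are arbitrary), not the paper's implementation.

```python
import numpy as np

def sparse_binary_matrix(m, n, nnz_per_col=2, seed=0):
    """m x n binary sensing matrix with nnz_per_col ones per column.

    Computing y = phi @ x then touches only nnz_per_col entries per
    column, which is what makes the encoder cheap enough for
    low-energy on-body telemonitoring hardware.
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=nnz_per_col, replace=False)
        phi[rows, j] = 1.0
    return phi

phi = sparse_binary_matrix(m=128, n=512)
x = np.random.default_rng(1).standard_normal(512)  # stand-in for an FECG block
y = phi @ x                                        # cheap compression step
```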

320 citations

Journal ArticleDOI
TL;DR: In this article, a broadband channel estimation algorithm for mmWave multiple input multiple output (MIMO) systems with few-bit analog-to-digital converters (ADCs) is proposed.
Abstract: We develop a broadband channel estimation algorithm for millimeter wave (mmWave) multiple input multiple output (MIMO) systems with few-bit analog-to-digital converters (ADCs). Our methodology exploits the joint sparsity of the mmWave MIMO channel in the angle and delay domains. We formulate the estimation problem as a noisy quantized compressed-sensing problem and solve it using efficient approximate message passing (AMP) algorithms. In particular, we model the angle-delay coefficients using a Bernoulli–Gaussian-mixture distribution with unknown parameters and use the expectation-maximization forms of the generalized AMP and vector AMP algorithms to simultaneously learn the distributional parameters and compute approximately minimum mean-squared error (MSE) estimates of the channel coefficients. We design a training sequence that allows fast, fast-Fourier-transform-based implementation of these algorithms while minimizing peak-to-average power ratio at the transmitter, making our methods scale efficiently to large numbers of antenna elements and delays. We present the results of a detailed simulation study that compares our algorithms to several benchmarks. Our study investigates the effect of SNR, training length, training type, ADC resolution, and runtime on channel estimation MSE, mutual information, and achievable rate. It shows that, in a mmWave MIMO system, the methods we propose to exploit joint angle-delay sparsity allow 1-bit ADCs to perform comparably to infinite-bit ADCs at low SNR, and 4-bit ADCs to perform comparably to infinite-bit ADCs at medium SNR.
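
The "noisy quantized compressed sensing" measurement model can be sketched as a uniform few-bit quantizer applied separately to the real and imaginary parts of the unquantized receive signal. The snippet below is a simplified illustration under our own assumptions (arbitrary clipping level vmax, no training-sequence design); with bits=1 it degenerates to sign-based quantization, the regime where the paper shows 1-bit ADCs approach infinite-bit performance at low SNR.

```python
import numpy as np

def adc(z, bits, vmax):
    """Uniform midrise quantizer applied to Re and Im parts separately."""
    levels = 2 ** bits
    step = 2 * vmax / levels
    def q(u):
        idx = np.clip(np.floor(u / step) + levels // 2, 0, levels - 1)
        return (idx - levels // 2 + 0.5) * step  # reconstruct at bin midpoints
    return q(z.real) + 1j * q(z.imag)

rng = np.random.default_rng(0)
A = (rng.standard_normal((64, 256)) + 1j * rng.standard_normal((64, 256))) / np.sqrt(2)
x = np.zeros(256, dtype=complex)                     # sparse angle-delay coefficients
x[rng.choice(256, 8, replace=False)] = rng.standard_normal(8)
noise = 0.05 * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
y = adc(A @ x + noise, bits=4, vmax=3.0)             # what the estimator observes
```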

319 citations

Proceedings ArticleDOI
01 Nov 2014
TL;DR: A modified EM algorithm that exploits sparsity and outperforms the conventional EM algorithm is proposed as a solution to the channel estimation problem for millimeter wave MIMO systems with one-bit analog-to-digital converters.
Abstract: We develop channel estimation algorithms for millimeter wave (mmWave) multiple input multiple output (MIMO) systems with one-bit analog-to-digital converters (ADCs). Since the mmWave MIMO channel is sparse due to the propagation characteristics, the estimation problem is formulated as a one-bit compressed sensing problem. We propose a modified EM algorithm that exploits sparsity and has better performance than the conventional EM algorithm. We also present a second solution using the generalized approximate message passing (GAMP) algorithm to solve this optimization problem. The simulation results show that GAMP can reduce mean squared error in the important low and medium SNR regions.

303 citations


Cites methods from "Expectation-Maximization Gaussian-M..."

  • ...The parameters $\eta$ and $\sigma_L^2$ can be learned by the EM-GAMP algorithm [22] if they are unknown....


References
Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
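
In the penalized form used by the EM-GM-AMP paper (its Eq. (1)), the lasso is readily solved with standard tooling. A minimal scikit-learn sketch follows; this is our illustration, and note that sklearn scales the quadratic term by 1/(2M), so its alpha corresponds to the usual λ only up to that factor.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 400))
x_true = np.zeros(400)
x_true[rng.choice(400, 10, replace=False)] = rng.standard_normal(10)
y = A @ x_true + 0.01 * rng.standard_normal(100)

# sklearn minimizes ||y - A x||^2 / (2*M) + alpha * ||x||_1.
x_hat = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(A, y).coef_
print(np.count_nonzero(x_hat))  # many coefficients driven exactly to zero
```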

40,785 citations


"Expectation-Maximization Gaussian-M..." refers background in this paper

  • ...olynomial-complexity algorithms when x is sufficiently sparse and when A satisfies certain restricted isometry properties [4], or when A is large with i.i.d. random entries [5], as discussed below. LASSO [6] (or, equivalently, Basis Pursuit Denoising [7]) is a well-known approach to the sparse-signal recovery problem that solves the convex problem $\hat{x}_{\text{lasso}} = \arg\min_{\hat{x}} \|y - A\hat{x}\|_2^2 + \lambda_{\text{lasso}} \|\hat{x}\|_1$ (1), with λ...


Journal ArticleDOI
TL;DR: A new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically.
Abstract: We consider the class of iterative shrinkage-thresholding algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods, which can be viewed as an extension of the classical gradient algorithm, is attractive due to its simplicity and thus is adequate for solving large-scale problems even with dense matrix data. However, such methods are also known to converge quite slowly. In this paper we present a new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically. Initial promising numerical results for wavelet-based image deblurring demonstrate the capabilities of FISTA which is shown to be faster than ISTA by several orders of magnitude.
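
The algorithm described above fits in a few lines; below is a minimal NumPy implementation of FISTA for the l1-penalized least-squares problem, written from the standard description of the method. The extrapolated point z and the t-sequence are FISTA's only changes relative to ISTA, lifting the convergence rate from O(1/k) to O(1/k²).

```python
import numpy as np

def fista(A, y, lam, n_iter=200):
    """Minimal FISTA for min_x 0.5*||y - A x||_2^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z = x.copy()
    t = 1.0
    for _ in range(n_iter):
        g = A.T @ (A @ z - y)              # gradient of the smooth part at z
        u = z - g / L                      # gradient step from the extrapolated point
        x_new = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)  # soft-threshold
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum / extrapolation
        x, t = x_new, t_new
    return x
```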

11,413 citations


"Expectation-Maximization Gaussian-M..." refers methods or result in this paper

  • ...tried SPGL1 but found performance degradations at small M. For FISTA, we used the regularization parameter $\lambda_{\text{FISTA}} = 10^{-5}$, which is consistent with the values used for the noiseless experiments in [26]. [Plot residue removed; Fig. 4: empirical PTCs and the theoretical LASSO PTC over the (M/N, K/M) plane, with curves for EM-GM-AMP-MOS, EM-GM-AMP, EM-BG-AMP, genie GM-AMP, and DMM-AMP.] ...


  • ...tions of a K-sparse BG signal and an i.i.d. $\mathcal{N}(0, M^{-1})$ matrix A. We then recovered x from the noiseless measurements using EM-GM-AMP-MOS, EM-GM-AMP, EM-BG-AMP, genie-GM-AMP, and the LASSO solver FISTA [26]. Figure 6 shows that the PTCs of EM-GM-AMP-MOS and EM-GM-AMP are nearly identical, slightly better than those of EM-BG-AMP and genie-GM-AMP (especially at very small M), and much better than FISTA's. ...


Journal ArticleDOI
TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.
Abstract: The time-frequency and time-scale communities have recently developed a large number of overcomplete waveform dictionaries --- stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for decomposition have been proposed, including the method of frames (MOF), matching pursuit (MP), and, for special dictionaries, the best orthogonal basis (BOB). Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions. We give examples exhibiting several advantages over MOF, MP, and BOB, including better sparsity and superresolution. BP has interesting relations to ideas in areas as diverse as ill-posed problems, abstract harmonic analysis, total variation denoising, and multiscale edge denoising. BP in highly overcomplete dictionaries leads to large-scale optimization problems. With signals of length 8192 and a wavelet packet dictionary, one gets an equivalent linear program of size 8192 by 212,992. Such problems can be attacked successfully only because of recent advances in linear programming by interior-point methods. We obtain reasonable success with a primal-dual logarithmic barrier method and conjugate-gradient solver.
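
The reduction of BP to an equivalent linear program mentioned above (splitting the coefficients into nonnegative positive and negative parts) can be sketched directly with scipy.optimize.linprog. This is our illustration of the standard reformulation, not the primal-dual logarithmic barrier solver the authors used.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Basis pursuit min ||c||_1 s.t. Phi @ c = y, posed as a linear program.

    Split c = u - v with u, v >= 0, so ||c||_1 = 1'(u + v) and the
    equality constraint becomes [Phi, -Phi] @ [u; v] = y.
    """
    m, n = Phi.shape
    res = linprog(c=np.ones(2 * n),
                  A_eq=np.hstack([Phi, -Phi]), b_eq=y,
                  bounds=[(0, None)] * (2 * n),
                  method="highs")
    return res.x[:n] - res.x[n:]
```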

9,950 citations


"Expectation-Maximization Gaussian-M..." refers background in this paper

  • ...nd when A satisfies certain restricted isometry properties [4], or when A is large with i.i.d. zero-mean sub-Gaussian entries [5], as discussed below. LASSO [6] (or, equivalently, Basis Pursuit Denoising [7]) is a well-known approach to the sparse-signal recovery problem that solves the convex problem $\hat{x}_{\text{lasso}} = \arg\min_{\hat{x}} \|y - A\hat{x}\|_2^2 + \lambda_{\text{lasso}} \|\hat{x}\|_1$ (1), with $\lambda_{\text{lasso}}$ a tuning parameter that trades between th...


Journal ArticleDOI
Michael E. Tipping
TL;DR: It is demonstrated that by exploiting a probabilistic Bayesian learning framework, the 'relevance vector machine' (RVM) can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages.
Abstract: This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the 'relevance vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages. These include the benefits of probabilistic predictions, automatic estimation of 'nuisance' parameters, and the facility to utilise arbitrary basis functions (e.g. non-'Mercer' kernels). We detail the Bayesian framework and associated learning algorithm for the RVM, and give some illustrative examples of its application along with some comparative benchmarks. We offer some explanation for the exceptional degree of sparsity obtained, and discuss and demonstrate some of the advantageous features, and potential extensions, of Bayesian relevance learning.
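
The learning algorithm amounts to iterating Tipping's type-II maximum-likelihood re-estimation equations. A minimal NumPy sketch for the regression case follows; it is our illustration (the initializations, pruning cap, and small regularizers are arbitrary choices), and a practical implementation would actually delete pruned basis functions rather than keep them.

```python
import numpy as np

def rvm_regression(Phi, t, n_iter=100, alpha0=1.0, beta0=1.0, prune=1e6):
    """Minimal RVM regression via type-II ML re-estimation (illustrative).

    alpha[i] is the prior precision on weight i; as alpha[i] grows, the
    corresponding basis function is effectively pruned, which is the
    mechanism behind using dramatically fewer basis functions.
    """
    N, M = Phi.shape
    alpha = np.full(M, alpha0)
    beta = beta0                                 # noise precision
    for _ in range(n_iter):
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
        mu = beta * Sigma @ Phi.T @ t            # posterior mean of the weights
        gamma = 1 - alpha * np.diag(Sigma)       # how well-determined each weight is
        alpha = np.minimum(gamma / (mu ** 2 + 1e-12), prune)
        beta = (N - gamma.sum()) / (np.sum((t - Phi @ mu) ** 2) + 1e-12)
    return mu, alpha, beta
```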

5,116 citations


"Expectation-Maximization Gaussian-M..." refers methods in this paper

  • ...tic unknowns, our proposed EM-GM-AMP algorithm can be classified as an "empirical-Bayesian" approach [16]. Compared with previously proposed empirical-Bayesian approaches to compressive sensing (e.g., [17]–[19]), ours has a more flexible signal model, and thus is able to better match a wide range of signal pdfs $p_X(\cdot)$, as we demonstrate through a detailed numerical study. In addition, the complexity scal...
