Journal ArticleDOI

Expectation-Maximization Gaussian-Mixture Approximate Message Passing

01 Oct 2013-IEEE Transactions on Signal Processing (IEEE)-Vol. 61, Iss: 19, pp 4658-4672
TL;DR: An empirical-Bayesian technique is proposed that simultaneously learns the signal distribution while MMSE-recovering the signal (according to the learned distribution) using AMP; the non-zero distribution is modeled as a Gaussian mixture whose parameters are learned through expectation maximization, with AMP implementing the expectation step.
Abstract: When recovering a sparse signal from noisy compressive linear measurements, the distribution of the signal's non-zero coefficients can have a profound effect on recovery mean-squared error (MSE). If this distribution were known a priori, one could use computationally efficient approximate message passing (AMP) techniques for nearly minimum-MSE (MMSE) recovery. In practice, however, the distribution is unknown, motivating the use of robust algorithms like LASSO (which is nearly minimax optimal) at the cost of significantly larger MSE for non-least-favorable distributions. As an alternative, we propose an empirical-Bayesian technique that simultaneously learns the signal distribution while MMSE-recovering the signal, according to the learned distribution, using AMP. In particular, we model the non-zero distribution as a Gaussian mixture and learn its parameters through expectation maximization, using AMP to implement the expectation step. Numerical experiments on a wide range of signal classes confirm the state-of-the-art performance of our approach, in both reconstruction error and runtime, in the high-dimensional regime, for most (but not all) sensing operators.
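
To make the EM/AMP interplay in the abstract concrete, here is a minimal NumPy sketch of one EM M-step for a Bernoulli-Gaussian-mixture prior, assuming AMP's expectation step has already reduced the problem to per-coefficient pseudo-measurements r with effective AWGN variance mu. The function and variable names (lam, omega, theta, phi) are our illustration, not the paper's code.

```python
import numpy as np

def gauss(x, mean, var):
    """Scalar Gaussian pdf N(x; mean, var), evaluated elementwise."""
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_gm_update(r, mu, lam, omega, theta, phi):
    """One EM M-step for a Bernoulli-Gaussian-mixture prior (illustrative).

    Prior: p(x) = (1-lam)*delta(x) + lam * sum_l omega[l]*N(x; theta[l], phi[l]).
    AMP's E-step supplies r (length-N pseudo-measurements) and mu, their
    effective AWGN variance, i.e. r[n] ~ x[n] + N(0, mu).
    """
    # Responsibilities of the zero component vs. each Gaussian component.
    p0 = (1 - lam) * gauss(r, 0.0, mu)                      # shape (N,)
    pl = lam * omega * gauss(r[:, None], theta, phi + mu)   # shape (N, L)
    beta = pl / (p0 + pl.sum(axis=1))[:, None]              # P(component l | r[n])

    # Per-component posterior means/variances under the scalar AWGN model.
    gamma = (r[:, None] * phi + theta * mu) / (phi + mu)    # shape (N, L)
    nu = phi * mu / (phi + mu)                              # shape (L,)

    # M-step: re-estimate sparsity rate, weights, means, and variances.
    lam_new = beta.sum() / len(r)
    omega_new = beta.sum(axis=0) / beta.sum()
    theta_new = (beta * gamma).sum(axis=0) / beta.sum(axis=0)
    phi_new = (beta * (nu + (gamma - theta_new) ** 2)).sum(axis=0) / beta.sum(axis=0)
    return lam_new, omega_new, theta_new, phi_new
```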
Citations
Journal ArticleDOI
TL;DR: This paper proposes two novel neural-network architectures that decouple prediction errors across layers in the same way that the approximate message passing (AMP) algorithms decouple them across iterations: through Onsager correction.
Abstract: Deep learning has gained great popularity due to its widespread success on many inference problems. We consider the application of deep learning to the sparse linear inverse problem, where one seeks to recover a sparse signal from a few noisy linear measurements. In this paper, we propose two novel neural-network architectures that decouple prediction errors across layers in the same way that the approximate message passing (AMP) algorithms decouple them across iterations: through Onsager correction. First, we propose a “learned AMP” network that significantly improves upon Gregor and LeCun's “learned ISTA.” Second, inspired by the recently proposed “vector AMP” (VAMP) algorithm, we propose a “learned VAMP” network that offers increased robustness to deviations in the measurement matrix from i.i.d. Gaussian. In both cases, we jointly learn the linear transforms and scalar nonlinearities of the network. Interestingly, with i.i.d. signals, the linear transforms and scalar nonlinearities prescribed by the VAMP algorithm coincide with the values learned through back-propagation, leading to an intuitive interpretation of learned VAMP. Finally, we apply our methods to two problems from 5G wireless communications: compressive random access and massive-MIMO channel estimation.
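
As a rough illustration of the Onsager-corrected layer structure described above, here is a minimal NumPy sketch of one soft-thresholding AMP layer written in "learned AMP" form; the matrix B and threshold lam are the quantities a network would learn by back-propagation (in plain AMP, B = A.T and lam follows a fixed rule). This is our reconstruction under stated assumptions, not the authors' released code.

```python
import numpy as np

def soft(x, lam):
    """Soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def lamp_layer(y, A, B, lam, x_prev, v_prev):
    """One learned-AMP-style layer (illustrative sketch).

    The b * v_prev term is the Onsager correction that decouples
    prediction errors across layers, exactly as it decouples them
    across iterations in the original AMP algorithm.
    """
    M = len(y)
    b = np.count_nonzero(x_prev) / M    # Onsager coefficient for soft-thresholding
    v = y - A @ x_prev + b * v_prev     # corrected residual
    x = soft(x_prev + B @ v, lam)       # shrink the pseudo-measurement
    return x, v
```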

395 citations

Journal ArticleDOI
TL;DR: Simulation results show that the proposed method, which exploits the inherent sparsity of mmWave channels, can provide an accurate channel estimate and achieve a substantial reduction in training overhead.
Abstract: In this letter, we consider channel estimation for intelligent reflecting surface (IRS)-assisted millimeter wave (mmWave) systems, where an IRS is deployed to assist the data transmission from the base station (BS) to a user. It is shown that for the purpose of joint active and passive beamforming, the knowledge of a large-size cascade channel matrix needs to be acquired. To reduce the training overhead, the inherent sparsity in mmWave channels is exploited. By utilizing properties of Khatri-Rao and Kronecker products, we find a sparse representation of the cascade channel and convert cascade channel estimation into a sparse signal recovery problem. Simulation results show that our proposed method can provide an accurate channel estimate and achieve a substantial training overhead reduction.
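
The linear-algebra step behind the sparse representation can be illustrated with the identity vec(A diag(x) B) = (Bᵀ ⊙ A) x, where ⊙ denotes the column-wise Khatri-Rao product. Below is a minimal NumPy/SciPy check of that identity; it is our illustration (matrix sizes are arbitrary), and the letter's actual cascade-channel construction is more involved.

```python
import numpy as np
from scipy.linalg import khatri_rao

# Check: vec(A @ diag(x) @ B) == khatri_rao(B.T, A) @ x  (column-major vec)
m, k, n = 4, 3, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((m, k))
B = rng.standard_normal((k, n))
x = rng.standard_normal(k)

lhs = (A @ np.diag(x) @ B).flatten(order="F")  # column-major vectorization
rhs = khatri_rao(B.T, A) @ x                   # Khatri-Rao sensing matrix times x
assert np.allclose(lhs, rhs)
```

When x is sparse (few mmWave paths), the vectorized cascade channel is a sparse combination of the Khatri-Rao columns, so standard compressed-sensing solvers apply.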

327 citations

Journal ArticleDOI
TL;DR: Experimental results show that the block sparse Bayesian learning framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce CPU code execution in the data compression stage.
Abstract: Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Especially, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with much fewer nonzero entries to compress recordings. Particularly, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution in CPU in the data compression stage.
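
The practical payoff of the sparse binary sensing matrix is easy to see in code: with only two nonzero entries per column, compressing a signal block costs just a handful of additions per sample. The following sketch is our illustration (function name and dimensions are arbitrary), not the paper's implementation.

```python
import numpy as np

def sparse_binary_matrix(m, n, nnz_per_col=2, seed=0):
    """m x n binary sensing matrix with nnz_per_col ones per column.

    Computing y = phi @ x then touches only nnz_per_col entries per
    column, which is what makes the encoder cheap enough for
    low-energy on-body telemonitoring hardware.
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=nnz_per_col, replace=False)
        phi[rows, j] = 1.0
    return phi

phi = sparse_binary_matrix(m=128, n=512)
x = np.random.default_rng(1).standard_normal(512)  # stand-in for an FECG block
y = phi @ x                                        # cheap compression step
```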

320 citations

Journal ArticleDOI
TL;DR: In this article, a broadband channel estimation algorithm for mmWave multiple input multiple output (MIMO) systems with few-bit analog-to-digital converters (ADCs) is proposed.
Abstract: We develop a broadband channel estimation algorithm for millimeter wave (mmWave) multiple input multiple output (MIMO) systems with few-bit analog-to-digital converters (ADCs). Our methodology exploits the joint sparsity of the mmWave MIMO channel in the angle and delay domains. We formulate the estimation problem as a noisy quantized compressed-sensing problem and solve it using efficient approximate message passing (AMP) algorithms. In particular, we model the angle-delay coefficients using a Bernoulli–Gaussian-mixture distribution with unknown parameters and use the expectation-maximization forms of the generalized AMP and vector AMP algorithms to simultaneously learn the distributional parameters and compute approximately minimum mean-squared error (MSE) estimates of the channel coefficients. We design a training sequence that allows fast, fast-Fourier-transform-based implementation of these algorithms while minimizing peak-to-average power ratio at the transmitter, making our methods scale efficiently to large numbers of antenna elements and delays. We present the results of a detailed simulation study that compares our algorithms to several benchmarks. Our study investigates the effect of SNR, training length, training type, ADC resolution, and runtime on channel estimation MSE, mutual information, and achievable rate. It shows that, in a mmWave MIMO system, the methods we propose to exploit joint angle-delay sparsity allow 1-bit ADCs to perform comparably to infinite-bit ADCs at low SNR, and 4-bit ADCs to perform comparably to infinite-bit ADCs at medium SNR.
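
The "noisy quantized compressed sensing" measurement model can be sketched as a uniform few-bit quantizer applied separately to the real and imaginary parts of the unquantized receive signal. The snippet below is a simplified illustration under our own assumptions (arbitrary clipping level vmax, no training-sequence design); with bits=1 it degenerates to sign-based quantization, the regime where the paper shows 1-bit ADCs approach infinite-bit performance at low SNR.

```python
import numpy as np

def adc(z, bits, vmax):
    """Uniform midrise quantizer applied to Re and Im parts separately."""
    levels = 2 ** bits
    step = 2 * vmax / levels
    def q(u):
        idx = np.clip(np.floor(u / step) + levels // 2, 0, levels - 1)
        return (idx - levels // 2 + 0.5) * step  # reconstruct at bin midpoints
    return q(z.real) + 1j * q(z.imag)

rng = np.random.default_rng(0)
A = (rng.standard_normal((64, 256)) + 1j * rng.standard_normal((64, 256))) / np.sqrt(2)
x = np.zeros(256, dtype=complex)                     # sparse angle-delay coefficients
x[rng.choice(256, 8, replace=False)] = rng.standard_normal(8)
noise = 0.05 * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
y = adc(A @ x + noise, bits=4, vmax=3.0)             # what the estimator observes
```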

319 citations

Proceedings ArticleDOI
01 Nov 2014
TL;DR: A modified EM algorithm that exploits sparsity and outperforms the conventional EM algorithm is proposed as a solution to the channel estimation problem for millimeter wave MIMO systems with one-bit analog-to-digital converters.
Abstract: We develop channel estimation algorithms for millimeter wave (mmWave) multiple input multiple output (MIMO) systems with one-bit analog-to-digital converters (ADCs). Since the mmWave MIMO channel is sparse due to the propagation characteristics, the estimation problem is formulated as a one-bit compressed sensing problem. We propose a modified EM algorithm that exploits sparsity and has better performance than the conventional EM algorithm. We also present a second solution using the generalized approximate message passing (GAMP) algorithm to solve this optimization problem. The simulation results show that GAMP can reduce mean squared error in the important low and medium SNR regions.

303 citations


Cites methods from "Expectation-Maximization Gaussian-M..."

  • ...The parameters $\eta$ and $\sigma_L^2$ can be learned by the EM-GAMP algorithm [22] if they are unknown....


References
Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
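
In the penalized form used by the EM-GM-AMP paper (its Eq. (1)), the lasso is readily solved with standard tooling. A minimal scikit-learn sketch follows; this is our illustration, and note that sklearn scales the quadratic term by 1/(2M), so its alpha corresponds to the usual λ only up to that factor.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 400))
x_true = np.zeros(400)
x_true[rng.choice(400, 10, replace=False)] = rng.standard_normal(10)
y = A @ x_true + 0.01 * rng.standard_normal(100)

# sklearn minimizes ||y - A x||^2 / (2*M) + alpha * ||x||_1.
x_hat = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(A, y).coef_
print(np.count_nonzero(x_hat))  # many coefficients driven exactly to zero
```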

40,785 citations


"Expectation-Maximization Gaussian-M..." refers background in this paper

  • ...olynomial-complexity algorithms when x is sufficiently sparse and when A satisfies certain restricted isometry properties [4], or when A is large with i.i.d. random entries [5], as discussed below. LASSO [6] (or, equivalently, Basis Pursuit Denoising [7]) is a well-known approach to the sparse-signal recovery problem that solves the convex problem $\hat{x}_{\text{lasso}} = \arg\min_{\hat{x}} \|y - A\hat{x}\|_2^2 + \lambda_{\text{lasso}} \|\hat{x}\|_1$ (1), with λ...


Journal ArticleDOI
TL;DR: A new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically.
Abstract: We consider the class of iterative shrinkage-thresholding algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods, which can be viewed as an extension of the classical gradient algorithm, is attractive due to its simplicity and thus is adequate for solving large-scale problems even with dense matrix data. However, such methods are also known to converge quite slowly. In this paper we present a new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically. Initial promising numerical results for wavelet-based image deblurring demonstrate the capabilities of FISTA which is shown to be faster than ISTA by several orders of magnitude.
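
The algorithm described above fits in a few lines; below is a minimal NumPy implementation of FISTA for the l1-penalized least-squares problem, written from the standard description of the method. The extrapolated point z and the t-sequence are FISTA's only changes relative to ISTA, lifting the convergence rate from O(1/k) to O(1/k²).

```python
import numpy as np

def fista(A, y, lam, n_iter=200):
    """Minimal FISTA for min_x 0.5*||y - A x||_2^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z = x.copy()
    t = 1.0
    for _ in range(n_iter):
        g = A.T @ (A @ z - y)              # gradient of the smooth part at z
        u = z - g / L                      # gradient step from the extrapolated point
        x_new = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)  # soft-threshold
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)  # momentum / extrapolation
        x, t = x_new, t_new
    return x
```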

11,413 citations


"Expectation-Maximization Gaussian-M..." refers methods or result in this paper

  • ...tried SPGL1 but found performance degradations at small M. For FISTA, we used the regularization parameter $\lambda_{\text{FISTA}} = 10^{-5}$, which is consistent with the values used for the noiseless experiments in [26]. [Plot residue removed; Fig. 4: empirical PTCs and the theoretical LASSO PTC over the (M/N, K/M) plane, with curves for EM-GM-AMP-MOS, EM-GM-AMP, EM-BG-AMP, genie GM-AMP, and DMM-AMP.] ...


  • ...tions of a K-sparse BG signal and an i.i.d. $\mathcal{N}(0, M^{-1})$ matrix A. We then recovered x from the noiseless measurements using EM-GM-AMP-MOS, EM-GM-AMP, EM-BG-AMP, genie-GM-AMP, and the LASSO solver FISTA [26]. Figure 6 shows that the PTCs of EM-GM-AMP-MOS and EM-GM-AMP are nearly identical, slightly better than those of EM-BG-AMP and genie-GM-AMP (especially at very small M), and much better than FISTA's. ...


Journal ArticleDOI
TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.
Abstract: The time-frequency and time-scale communities have recently developed a large number of overcomplete waveform dictionaries --- stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for decomposition have been proposed, including the method of frames (MOF), matching pursuit (MP), and, for special dictionaries, the best orthogonal basis (BOB). Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions. We give examples exhibiting several advantages over MOF, MP, and BOB, including better sparsity and superresolution. BP has interesting relations to ideas in areas as diverse as ill-posed problems, abstract harmonic analysis, total variation denoising, and multiscale edge denoising. BP in highly overcomplete dictionaries leads to large-scale optimization problems. With signals of length 8192 and a wavelet packet dictionary, one gets an equivalent linear program of size 8192 by 212,992. Such problems can be attacked successfully only because of recent advances in linear programming by interior-point methods. We obtain reasonable success with a primal-dual logarithmic barrier method and conjugate-gradient solver.
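
The reduction of BP to an equivalent linear program mentioned above (splitting the coefficients into nonnegative positive and negative parts) can be sketched directly with scipy.optimize.linprog. This is our illustration of the standard reformulation, not the primal-dual logarithmic barrier solver the authors used.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Basis pursuit min ||c||_1 s.t. Phi @ c = y, posed as a linear program.

    Split c = u - v with u, v >= 0, so ||c||_1 = 1'(u + v) and the
    equality constraint becomes [Phi, -Phi] @ [u; v] = y.
    """
    m, n = Phi.shape
    res = linprog(c=np.ones(2 * n),
                  A_eq=np.hstack([Phi, -Phi]), b_eq=y,
                  bounds=[(0, None)] * (2 * n),
                  method="highs")
    return res.x[:n] - res.x[n:]
```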

9,950 citations


"Expectation-Maximization Gaussian-M..." refers background in this paper

  • ...nd when A satisfies certain restricted isometry properties [4], or when A is large with i.i.d. zero-mean sub-Gaussian entries [5], as discussed below. LASSO [6] (or, equivalently, Basis Pursuit Denoising [7]) is a well-known approach to the sparse-signal recovery problem that solves the convex problem $\hat{x}_{\text{lasso}} = \arg\min_{\hat{x}} \|y - A\hat{x}\|_2^2 + \lambda_{\text{lasso}} \|\hat{x}\|_1$ (1), with $\lambda_{\text{lasso}}$ a tuning parameter that trades between th...


Journal ArticleDOI
Michael E. Tipping
TL;DR: It is demonstrated that by exploiting a probabilistic Bayesian learning framework, the 'relevance vector machine' (RVM) can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages.
Abstract: This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the 'relevance vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages. These include the benefits of probabilistic predictions, automatic estimation of 'nuisance' parameters, and the facility to utilise arbitrary basis functions (e.g. non-'Mercer' kernels). We detail the Bayesian framework and associated learning algorithm for the RVM, and give some illustrative examples of its application along with some comparative benchmarks. We offer some explanation for the exceptional degree of sparsity obtained, and discuss and demonstrate some of the advantageous features, and potential extensions, of Bayesian relevance learning.
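
The learning algorithm amounts to iterating Tipping's type-II maximum-likelihood re-estimation equations. A minimal NumPy sketch for the regression case follows; it is our illustration (the initializations, pruning cap, and small regularizers are arbitrary choices), and a practical implementation would actually delete pruned basis functions rather than keep them.

```python
import numpy as np

def rvm_regression(Phi, t, n_iter=100, alpha0=1.0, beta0=1.0, prune=1e6):
    """Minimal RVM regression via type-II ML re-estimation (illustrative).

    alpha[i] is the prior precision on weight i; as alpha[i] grows, the
    corresponding basis function is effectively pruned, which is the
    mechanism behind using dramatically fewer basis functions.
    """
    N, M = Phi.shape
    alpha = np.full(M, alpha0)
    beta = beta0                                 # noise precision
    for _ in range(n_iter):
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
        mu = beta * Sigma @ Phi.T @ t            # posterior mean of the weights
        gamma = 1 - alpha * np.diag(Sigma)       # how well-determined each weight is
        alpha = np.minimum(gamma / (mu ** 2 + 1e-12), prune)
        beta = (N - gamma.sum()) / (np.sum((t - Phi @ mu) ** 2) + 1e-12)
    return mu, alpha, beta
```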

5,116 citations


"Expectation-Maximization Gaussian-M..." refers methods in this paper

  • ...tic unknowns, our proposed EM-GM-AMP algorithm can be classified as an "empirical-Bayesian" approach [16]. Compared with previously proposed empirical-Bayesian approaches to compressive sensing (e.g., [17]–[19]), ours has a more flexible signal model, and thus is able to better match a wide range of signal pdfs $p_X(\cdot)$, as we demonstrate through a detailed numerical study. In addition, the complexity scal...
