
Showing papers on "Expectation–maximization algorithm published in 2019"


Journal ArticleDOI
TL;DR: An expectation maximization-based sparse Bayesian learning framework is developed, in which the Kalman filter and the Rauch–Tung–Striebel smoother are used to track the model parameters of the uplink spatial sparse channel in the expectation step.
Abstract: The low-rank property of the channel covariances can be adopted to reduce the overhead of the channel training in massive MIMO systems. In this paper, with the help of the virtual channel representation, we apply such property to both time-division duplex and frequency-division duplex systems, where time-varying channel scenarios are considered. First, we formulate the dynamic massive MIMO channel as one sparse signal model. Then, an expectation maximization-based sparse Bayesian learning framework is developed to learn the model parameters of the sparse virtual channel. Specifically, the Kalman filter (KF) and the Rauch–Tung–Striebel smoother are utilized to track the model parameters of the uplink (UL) spatial sparse channel in the expectation step. During the maximization step, a fixed-point theorem-based algorithm and a low-complexity search method are constructed to recover the temporal varying characteristics and the spatial signatures, respectively. Using angle reciprocity, we recover the downlink (DL) model parameters from the UL ones. After that, the reduced-dimension KF is adopted to fully exploit the channel temporal correlations to enhance the DL/UL virtual channel tracking accuracy. A monitoring scheme is also designed to detect changes in the model parameters and trigger the relearning process. Finally, we demonstrate the efficacy of the proposed schemes through numerical simulations.
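
The E-step machinery described above, a Kalman filter followed by a Rauch–Tung–Striebel smoother, is standard for linear-Gaussian state-space models. Below is a minimal numpy sketch of that machinery on an assumed scalar AR(1) model; the dynamics, noise levels, and dimensions are illustrative placeholders, not the paper's channel model.

```python
import numpy as np

def kalman_filter(y, A, C, Q, R, x0, P0):
    """Forward pass: filtered means/covariances for x_t = A x_{t-1} + w, y_t = C x_t + v."""
    n, d = len(y), A.shape[0]
    xf = np.zeros((n, d)); Pf = np.zeros((n, d, d))
    xp, Pp = x0, P0
    for t in range(n):
        if t > 0:
            xp = A @ xf[t - 1]
            Pp = A @ Pf[t - 1] @ A.T + Q
        S = C @ Pp @ C.T + R                      # innovation covariance
        K = Pp @ C.T @ np.linalg.inv(S)           # Kalman gain
        xf[t] = xp + K @ (y[t] - C @ xp)
        Pf[t] = Pp - K @ C @ Pp
    return xf, Pf

def rts_smoother(xf, Pf, A, Q):
    """Backward pass: smoothed estimates E[x_t | y_1..n], as used in the EM E-step."""
    n, d = xf.shape
    xs, Ps = xf.copy(), Pf.copy()
    for t in range(n - 2, -1, -1):
        Pp = A @ Pf[t] @ A.T + Q
        G = Pf[t] @ A.T @ np.linalg.inv(Pp)       # smoother gain
        xs[t] = xf[t] + G @ (xs[t + 1] - A @ xf[t])
        Ps[t] = Pf[t] + G @ (Ps[t + 1] - Pp) @ G.T
    return xs, Ps

# Toy demo with assumed AR(1) dynamics (not the paper's channel model).
rng = np.random.default_rng(0)
A = np.array([[0.95]]); C = np.array([[1.0]])
Q = np.array([[0.1]]); R = np.array([[0.5]])
x = np.zeros(200); y = np.zeros((200, 1))
for t in range(1, 200):
    x[t] = 0.95 * x[t - 1] + rng.normal(0, np.sqrt(0.1))
y[:, 0] = x + rng.normal(0, np.sqrt(0.5), size=200)
xf, Pf = kalman_filter(y, A, C, Q, R, np.zeros(1), np.eye(1))
xs, Ps = rts_smoother(xf, Pf, A, Q)
```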

153 citations


Journal ArticleDOI
TL;DR: An improved Wiener process model is proposed for RUL prediction, in which both the drift and diffusion parameters adapt as monitoring data are updated, and the quantitative relationship between degradation rate and degradation variation is considered.

95 citations


Proceedings ArticleDOI
01 Jun 2019
TL;DR: A novel probabilistic registration method that achieves state-of-the-art robustness as well as substantially faster computational performance than modern ICP implementations is contributed.
Abstract: Probabilistic point-set registration methods have been gaining more attention for their robustness to noise, outliers and occlusions. However, these methods tend to be much slower than the popular iterative closest point (ICP) algorithms, which severely limits their usability. In this paper, we contribute a novel probabilistic registration method that achieves state-of-the-art robustness as well as substantially faster computational performance than modern ICP implementations. This is achieved using a rigorous yet computationally efficient probabilistic formulation. Point-set registration is cast as a maximum likelihood estimation and solved using the EM algorithm. We show that with a simple augmentation, the E step can be formulated as a filtering problem, allowing us to leverage advances in efficient Gaussian filtering methods. We also propose a customized permutohedral filter to improve its performance while retaining sufficient accuracy for our task. Additionally, we present a simple and efficient twist parameterization that generalizes our method to the registration of articulated and deformable objects. For articulated objects, the complexity of our method is almost independent of the Degrees Of Freedom (DOFs), which makes it highly efficient even for high DOF systems. The results demonstrate that the proposed method consistently outperforms many competitive baselines on a variety of registration tasks.
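
As a concrete illustration of casting registration as maximum likelihood and solving it with EM, here is a minimal CPD-style rigid registration in numpy: the E-step computes soft correspondences under an isotropic Gaussian model, and the M-step solves a weighted Procrustes problem in closed form. The filtering acceleration, permutohedral filter, and twist parameterization from the paper are not reproduced; σ² is held fixed and all data are illustrative.

```python
import numpy as np

def em_rigid_register(X, Y, n_iter=50, sigma2=0.5):
    """Align source points Y to target points X with EM (CPD-style, rigid, 2-D)."""
    R, t = np.eye(2), np.zeros(2)
    for _ in range(n_iter):
        TY = Y @ R.T + t                               # current transform of the source
        # E-step: responsibility of mixture component j (source point) for target x_i.
        d2 = ((X[:, None, :] - TY[None, :, :]) ** 2).sum(-1)
        P = np.exp(-0.5 * d2 / sigma2)
        P /= P.sum(axis=1, keepdims=True) + 1e-12
        w = P.sum(axis=0)                              # total weight of each source point
        M = (P.T @ X) / w[:, None]                     # soft-matched target per source point
        # M-step: weighted Procrustes in closed form via SVD (Kabsch algorithm).
        ybar, mbar = (w @ Y) / w.sum(), (w @ M) / w.sum()
        H = (Y - ybar).T @ ((M - mbar) * w[:, None])   # weighted cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T                             # proper rotation (det = +1)
        t = mbar - R @ ybar
    return R, t

# Toy check: recover a known rotation and translation from a noisy copy.
rng = np.random.default_rng(1)
Y = rng.normal(size=(100, 2))
c, s = np.cos(0.4), np.sin(0.4)
X = Y @ np.array([[c, -s], [s, c]]).T + np.array([0.5, -0.2]) + 0.01 * rng.normal(size=(100, 2))
R_est, t_est = em_rigid_register(X, Y)
```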

93 citations


Journal ArticleDOI
TL;DR: By introducing two hidden variables, a new expectation maximization algorithm is derived to estimate the unknown model parameters and the time delays simultaneously.
Abstract: This paper addresses the problem of state-space model identification for multirate processes with unknown time delays. The aim is to identify a multirate state-space model to approximate the parameter-varying time-delay system. The identification problem is formulated under the framework of the expectation maximization algorithm. By introducing two hidden variables, a new expectation maximization algorithm is derived to estimate the unknown model parameters and the time delays simultaneously. The effectiveness of the proposed algorithm is validated by a simulation example.

88 citations


Journal ArticleDOI
TL;DR: A review of applications of the EM algorithm to missing-variable and ill-conditioned problems is given, along with future applications of the algorithm and some open problems.

85 citations


Journal ArticleDOI
TL;DR: This work improves the static framework that only works when measurements are from one single state, by further treating state changes in historical measurements as an unobserved latent variable, and incorporates the expectation-maximization (EM) algorithm to recover different hidden states in measurements.
Abstract: Grid topology and line parameters are essential for grid operation and planning, which may be missing or inaccurate in distribution grids. Existing data-driven approaches for recovering such information usually suffer from ignoring 1) input measurement errors and 2) possible state changes among historical measurements. While using the errors-in-variables model and letting the parameter and topology estimation interact with each other (PaToPa) can address input and output measurement error modeling, it only works when all measurements are from a single system state. To solve the two challenges simultaneously, we propose the “PaToPaEM” framework for joint line parameter and topology estimation with historical measurements from different unknown states. We improve the static framework that only works when measurements are from one single state, by further treating state changes in historical measurements as an unobserved latent variable. We then systematically analyze the new mathematical modeling, decouple the optimization problem, and incorporate the expectation-maximization (EM) algorithm to recover different hidden states in measurements. Combining these, the “PaToPaEM” framework enables joint topology and line parameter estimation using noisy measurements from multiple system states. It lays a solid foundation for data-driven system identification in distribution grids. Superior numerical results validate the practicability of the PaToPaEM framework.

68 citations


Journal ArticleDOI
TL;DR: This paper studies clustering of high-dimensional Gaussian mixtures and proposes a procedure, called CHIME, that is based on the EM algorithm and a direct estimation method for the sparse discriminant vector; CHIME outperforms existing methods under a variety of settings.
Abstract: Unsupervised learning is an important problem in statistics and machine learning with a wide range of applications. In this paper, we study clustering of high-dimensional Gaussian mixtures and propose a procedure, called CHIME, that is based on the EM algorithm and a direct estimation method for the sparse discriminant vector. Both theoretical and numerical properties of CHIME are investigated. We establish the optimal rate of convergence for the excess misclustering error and show that CHIME is minimax rate optimal. In addition, the optimality of the proposed estimator of the discriminant vector is also established. Simulation studies show that CHIME outperforms the existing methods under a variety of settings. The proposed CHIME procedure is also illustrated in an analysis of a glioblastoma gene expression data set and shown to have superior performance. Clustering of Gaussian mixtures in the conventional low-dimensional setting is also considered. The technical tools developed for the high-dimensional setting are used to establish the optimality of the clustering procedure that is based on the classical EM algorithm.
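
For context, the classical EM baseline that CHIME builds on can be written in a few lines. The sketch below fits a two-component Gaussian mixture in one dimension with plain numpy; CHIME's sparse discriminant estimation and high-dimensional analysis are not reproduced, and the data are synthetic.

```python
import numpy as np

def em_gmm_1d(x, n_iter=100):
    """Classical EM for a two-component 1-D Gaussian mixture."""
    rng = np.random.default_rng(0)
    w = np.array([0.5, 0.5])
    mu = rng.choice(x, size=2, replace=False)     # random initial means
    var = np.array([x.var(), x.var()])
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point.
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: closed-form updates for weights, means, variances.
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# Toy demo on a synthetic mixture (illustrative only).
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
print(em_gmm_1d(x))
```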

66 citations


Posted Content
TL;DR: This work analyzes a general SA scheme to minimize a non-convex, smooth objective function, and illustrates these settings with the online EM algorithm and the policy-gradient method for average reward maximization in reinforcement learning.
Abstract: Stochastic approximation (SA) is a key method used in statistical learning. Recently, its non-asymptotic convergence analysis has been considered in many papers. However, most of the prior analyses are made under restrictive assumptions such as unbiased gradient estimates and convex objective function, which significantly limit their applications to sophisticated tasks such as online and reinforcement learning. These restrictions are all essentially relaxed in this work. In particular, we analyze a general SA scheme to minimize a non-convex, smooth objective function. We consider update procedures whose drift term depends on a state-dependent Markov chain and whose mean field is not necessarily of gradient type, covering approximate second-order methods and allowing asymptotic bias in the one-step updates. We illustrate these settings with the online EM algorithm and the policy-gradient method for average reward maximization in reinforcement learning.
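
To make the online EM setting concrete, here is a minimal Cappé–Moulines-style online EM for a two-component, unit-variance Gaussian mixture, written as a stochastic approximation on the sufficient statistics; the step-size schedule, burn-in, and toy stream are assumptions for illustration.

```python
import numpy as np

def online_em_gmm(stream, n_burnin=50):
    """Online EM (stochastic approximation on sufficient statistics) for a
    two-component 1-D Gaussian mixture with known unit variances."""
    w, mu = np.array([0.5, 0.5]), np.array([-1.0, 1.0])
    s0, s1 = w.copy(), w * mu                      # running sufficient statistics
    for t, x in enumerate(stream, start=1):
        # E-step for one sample: responsibilities under current parameters.
        dens = np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)
        g = w * dens
        g /= g.sum()
        # SA update of the sufficient statistics with a decreasing step size.
        rho = 1.0 / (t + 10) ** 0.6
        s0 = s0 + rho * (g - s0)
        s1 = s1 + rho * (g * x - s1)
        if t > n_burnin:                           # M-step: parameters from statistics
            w = s0 / s0.sum()
            mu = s1 / s0
    return w, mu

# Toy data stream (illustrative only).
rng = np.random.default_rng(3)
stream = np.where(rng.random(20000) < 0.4,
                  rng.normal(-2, 1, 20000), rng.normal(2, 1, 20000))
print(online_em_gmm(stream))
```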

52 citations


Journal ArticleDOI
TL;DR: Numerical simulations show that five leaks can be accurately identified with super-resolution and that the number of leaks can be correctly estimated using the methods proposed in this paper.

47 citations


Journal ArticleDOI
TL;DR: This work proposes to employ a mixture of two Gaussian distributions as the noise model to capture both regular noise and irregular noise, thereby enhancing the robustness of the regression model.

44 citations


Journal ArticleDOI
TL;DR: A probabilistic model-based approach for machinery condition prognosis using a particle filter is presented, integrating physical knowledge with in-process measurements in a state-space framework to account for uncertainty and nonlinearity in the machinery degradation process.
Abstract: This paper presents a probabilistic model-based approach for machinery condition prognosis based on the particle filter, integrating physical knowledge with in-process measurements into a state-space framework to account for uncertainty and nonlinearity in the machinery degradation process. One limitation of the conventional particle filter is that condition prognosis is performed with a model whose parameters are predetermined from simulation studies or lab-controlled tests. Due to the stochastic nature of machinery defect propagation under varying operating conditions, model parameters may vary in practice, which causes prediction errors. To address this, an integrated state prediction and parameter estimation framework based on the particle filter and the expectation-maximization algorithm is formulated and investigated. The model parameters are adaptively estimated with the expectation-maximization algorithm, utilizing the hidden degradation state and available in-process measurements. Particle filtering is then performed on the identified model with the estimated parameters, following a Bayesian inference scheme, to improve the robustness and accuracy of machinery condition prognosis. The effectiveness of the developed method is demonstrated through a simulation study and an experimental run-to-failure bearing test in a wind turbine.
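
The core prediction engine here, a bootstrap particle filter for a nonlinear/non-Gaussian state-space model, can be sketched in a few lines of numpy. The scalar AR(1) dynamics below stand in for the paper's degradation model, and the joint EM parameter estimation is omitted; treat it as a minimal illustration.

```python
import numpy as np

def bootstrap_particle_filter(y, n_particles=1000, a=0.98, q=0.05, r=0.2):
    """Bootstrap particle filter for x_t = a x_{t-1} + N(0, q), y_t = x_t + N(0, r)."""
    rng = np.random.default_rng(0)
    particles = rng.normal(0, 1, n_particles)
    estimates = []
    for obs in y:
        # Propagate particles through the (assumed) state transition model.
        particles = a * particles + rng.normal(0, np.sqrt(q), n_particles)
        # Weight by the observation likelihood.
        w = np.exp(-0.5 * (obs - particles) ** 2 / r)
        w /= w.sum()
        estimates.append(np.dot(w, particles))     # posterior-mean estimate
        # Multinomial resampling to avoid weight degeneracy.
        particles = particles[rng.choice(n_particles, n_particles, p=w)]
    return np.array(estimates)

# Toy demo (illustrative dynamics, not the paper's bearing model).
rng = np.random.default_rng(4)
x = np.zeros(100)
for t in range(1, 100):
    x[t] = 0.98 * x[t - 1] + rng.normal(0, np.sqrt(0.05))
y = x + rng.normal(0, np.sqrt(0.2), 100)
xhat = bootstrap_particle_filter(y)
```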

Proceedings ArticleDOI
20 May 2019
TL;DR: In this article, a multi-modal Gaussian mixture model is adapted to the error distribution of a sensor fusion problem to provide a computationally efficient solution with well-behaved convergence properties.
Abstract: Non-Gaussian and multimodal distributions are an important part of many recent robust sensor fusion algorithms. In contrast to robust cost functions, they are probabilistically founded and have good convergence properties. Since their robustness depends on a close approximation of the real error distribution, their parametrization is crucial. We propose a novel approach that adapts a multi-modal Gaussian mixture model to the error distribution of a sensor fusion problem. By combining expectation-maximization and non-linear least squares optimization, we are able to provide a computationally efficient solution with well-behaved convergence properties. We demonstrate the performance of these algorithms on several real-world GNSS and indoor localization datasets. The proposed adaptive mixture algorithm outperforms state-of-the-art approaches with static parametrization. Source code and datasets are available under https://mytuc.org/libRSF.

Journal ArticleDOI
TL;DR: Simulations demonstrate that the GMM REMLP prior yields better performance than the EM algorithm for small data sets, and is applied to phenotype classification when the prior knowledge consists of colon cancer pathways.
Abstract: Gene-expression-based classification and regression are major concerns in translational genomics. If the feature-label distribution is known, then an optimal classifier can be derived. If the predictor-target distribution is known, then an optimal regression function can be derived. In practice, neither is known, data must be employed, and, for small samples, prior knowledge concerning the feature-label or predictor-target distribution can be used in the learning process. Optimal Bayesian classification and optimal Bayesian regression provide optimality under uncertainty. With optimal Bayesian classification (or regression), uncertainty is treated directly on the feature-label (or predictor-target) distribution. The fundamental engineering problem is prior construction. The Regularized Expected Mean Log-Likelihood Prior (REMLP) utilizes pathway information and provides viable priors for the feature-label distribution, assuming that the training data contain labels. In practice, the labels may not be observed. This paper extends the REMLP methodology to a Gaussian mixture model (GMM) when the labels are unknown. Prior construction bundled with prior update via Bayesian sampling results in Monte Carlo approximations to the optimal Bayesian regression function and optimal Bayesian classifier. Simulations demonstrate that the GMM REMLP prior yields better performance than the EM algorithm for small data sets. We apply it to phenotype classification when the prior knowledge consists of colon cancer pathways.

Posted Content
TL;DR: In this article, the autocorrelation of the factors is explicitly accounted for and therefore the model has a state-space form, which is implemented by means of the Expectation Maximization (EM) algorithm, jointly with the Kalman smoother.
Abstract: This paper studies Quasi Maximum Likelihood estimation of dynamic factor models for large panels of time series. Specifically, we consider the case in which the autocorrelation of the factors is explicitly accounted for and therefore the model has a state-space form. Estimation of the factors and of their loadings is implemented by means of the Expectation Maximization (EM) algorithm, jointly with the Kalman smoother. We prove that, as both the dimension of the panel $n$ and the sample size $T$ diverge to infinity: (i) the estimated loadings are $\sqrt T$-consistent and asymptotically normal if $\sqrt T/n\to 0$; (ii) the estimated factors are $\sqrt n$-consistent and asymptotically normal if $\sqrt n/T\to 0$; (iii) the estimated common component is $\min(\sqrt T,\sqrt n)$-consistent and asymptotically normal regardless of the relative rate of divergence of $n$ and $T$. Although the model is estimated as if the idiosyncratic terms were cross-sectionally and serially uncorrelated, we show that these mis-specifications do not affect consistency. Moreover, the estimated loadings are asymptotically as efficient as those obtained with the Principal Components estimator, whereas numerical results show that the loss in efficiency of the estimated factors becomes negligible as $n$ and $T$ increase. We then propose robust estimators of the asymptotic covariances, which can be used to conduct inference on the loadings and to compute confidence intervals for the factors and common components. In a Monte Carlo simulation exercise and an analysis of US macroeconomic data, we study the performance of our estimators and we compare them with the traditional Principal Components approach.

Journal ArticleDOI
TL;DR: An Improved Density Peaks Clustering (IDPC)-based Expectation Maximization (EM) algorithm is proposed to improve the construction of the Gaussian Mixture Model (GMM) and enhance the performance of the GMM-based damage monitoring method.

Journal ArticleDOI
TL;DR: A threshold-free filtering algorithm based on expectation–maximization (EM) is developed on the assumption that point clouds can be modeled as a mixture of Gaussians; it performed best against the classic progressive triangulated irregular network densification (PTD) methods in terms of omission error.
Abstract: Filtering of ground points is a key step for most applications of airborne LiDAR point clouds. Although many filtering algorithms have been proposed in recent years, most of them suffer from parameter setting or threshold fine-tuning. This is most often time-consuming and reduces the degree of automation of the applied algorithm. To overcome such problems, this paper proposes a threshold-free filtering algorithm based on expectation–maximization (EM). The filter is developed based on the assumption that point clouds can be modeled as a mixture of Gaussians. Thus, separating ground from non-ground points amounts to partitioning the point cloud with a Gaussian mixture model that is used to screen ground points. EM is applied to realize the separation by calculating the maximum likelihood estimates of the mixture parameters. Using the estimated parameters, the likelihoods of each point belonging to ground or non-ground are computed, and each point is labeled with the component of larger likelihood. The proposed method has been tested using the standard filtering datasets provided by the ISPRS. Experimental results showed that the proposed method performed the best in comparison with the classic progressive triangulated irregular network densification (PTD) and segment-based PTD methods in terms of omission error. The average omission error of the proposed method was 52.81% and 16.78% lower than the classic PTD method and the segment-based PTD method, respectively. Moreover, the proposed method was able to reduce its average total error by 31.95% compared to the classic PTD method.
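
The labeling rule, fitting a two-component Gaussian mixture and assigning each point to the more likely component, is easy to demonstrate with scikit-learn on point elevations alone. This is a deliberately reduced sketch: synthetic one-dimensional elevations rather than the ISPRS point clouds, with the lower-mean component assumed to be ground.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic elevations: a low 'ground' mode plus a higher 'non-ground' mode (illustrative).
rng = np.random.default_rng(5)
z = np.concatenate([rng.normal(10.0, 0.3, 5000),     # ground returns
                    rng.normal(14.0, 1.5, 2000)])    # vegetation / buildings
gmm = GaussianMixture(n_components=2, random_state=0).fit(z.reshape(-1, 1))
labels = gmm.predict(z.reshape(-1, 1))
# The component with the lower mean is taken to be ground (threshold-free labeling).
ground_component = np.argmin(gmm.means_.ravel())
is_ground = labels == ground_component
```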

Journal ArticleDOI
TL;DR: A Bayesian inference method for the generalized Gamma mixture model (GΓMM) based on a variational expectation-maximization algorithm, in which the shape parameters, the inverse scale parameters, and the mixing coefficients are treated as random variables, while the power parameters are left as parameters without assigned prior distributions.


Journal ArticleDOI
TL;DR: A new grid map is proposed that considers the model selection criterion (AIC or BIC) and risk measures at the same time, using the entire space of models under consideration, with models fitted by the EM algorithm along with the emEM initialization strategy.
Abstract: A new statistical methodology is developed for fitting left-truncated loss data by using the G-component finite mixture model with any combination of Gamma, Lognormal, and Weibull distributions. The EM algorithm, along with the emEM initialization strategy, is employed for model fitting. We propose a new grid map which considers the model selection criterion (AIC or BIC) and risk measures at the same time, by using the entire space of models under consideration. A simulation study validates our proposed approach. The application of the proposed methodology and use of new grid maps are illustrated through analyzing a real data set that includes left-truncated insurance losses.
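
The emEM initialization strategy itself is simple to sketch: run many short EM trials from random starts, keep the best by log-likelihood, then polish with one long run. The version below uses scikit-learn Gaussian mixtures on log-transformed losses (a lognormal mixture becomes a Gaussian mixture after a log), since the Gamma/Lognormal/Weibull mixtures, truncation handling, and grid maps of the paper are beyond a short example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_em_fit(x, n_components=2, n_starts=20, short_iter=5, long_iter=500):
    """emEM: many short EM runs from random starts, then one long run from the best."""
    X = x.reshape(-1, 1)
    best = None
    for seed in range(n_starts):
        gm = GaussianMixture(n_components=n_components, max_iter=short_iter,
                             random_state=seed, init_params='random').fit(X)
        if best is None or gm.score(X) > best.score(X):
            best = gm
    # Long run warm-started from the best short run's parameters.
    return GaussianMixture(n_components=n_components, max_iter=long_iter,
                           weights_init=best.weights_,
                           means_init=best.means_,
                           precisions_init=best.precisions_).fit(X)

# Mixture-of-lognormals losses become a Gaussian mixture after a log transform.
rng = np.random.default_rng(6)
losses = np.concatenate([rng.lognormal(0.0, 0.5, 800), rng.lognormal(2.0, 0.3, 200)])
model = em_em_fit(np.log(losses))
```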

Journal ArticleDOI
TL;DR: An expectation-maximization algorithm is developed by viewing the modal response as a latent variable, which allows mode-shape norm constraints to be incorporated and accelerates convergence; it also opens a way to compute the MPV in the Bayesian FFT method for other unexplored cases, e.g., the multi-mode, multi-setup problem.

Proceedings ArticleDOI
01 Jun 2019
TL;DR: This paper presents an interpretation of edge selection/reweighting in terms of variational Bayes inference and develops a novel variational expectation maximization (VEM) algorithm with built-in adaptive edge selection for blind deblurring.
Abstract: Blind motion deblurring is an important problem that has received enduring attention in the last decade. Based on the observation that a good intermediate estimate of the latent image for estimating the motion-blur kernel is not necessarily the one closest to the latent image, edge selection has proven itself a very powerful technique for achieving state-of-the-art performance in blind deblurring. This paper presents an interpretation of edge selection/reweighting in terms of variational Bayes inference and accordingly develops a novel variational expectation maximization (VEM) algorithm with built-in adaptive edge selection for blind deblurring. Together with a restart strategy for avoiding undesired local convergence, the proposed VEM method not only has a solid mathematical foundation but also noticeably outperforms state-of-the-art methods on benchmark datasets.

Journal ArticleDOI
TL;DR: An algorithm for simplifying a finite mixture model into a reduced mixture model with fewer mixture components; it can be widely used for probabilistic data analysis and is more accurate than other mixture simplification methods.
Abstract: We propose an algorithm for simplifying a finite mixture model into a reduced mixture model with fewer mixture components. The reduced model is obtained by maximizing a variational lower bound of the expected log-likelihood of a set of virtual samples. We develop three applications for our mixture simplification algorithm: recursive Bayesian filtering using Gaussian mixture model posteriors, KDE mixture reduction, and belief propagation without sampling. For recursive Bayesian filtering, we propose an efficient algorithm for approximating an arbitrary likelihood function as a sum of scaled Gaussians. Experiments on synthetic data, human location modeling, visual tracking, and vehicle self-localization show that our algorithm can be widely used for probabilistic data analysis, and is more accurate than other mixture simplification methods.
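
For contrast with the variational approach, the most common primitive in mixture simplification is a moment-matching merge: replace two components by the single Gaussian with the same total weight, mean, and covariance. A minimal numpy version, with illustrative values, is below; the paper's algorithm instead optimizes a variational lower bound over virtual samples.

```python
import numpy as np

def merge_gaussians(w1, m1, P1, w2, m2, P2):
    """Moment-matching merge: the single Gaussian with the same total weight,
    mean, and covariance as the weighted pair of components."""
    w = w1 + w2
    a1, a2 = w1 / w, w2 / w
    m = a1 * m1 + a2 * m2
    d1, d2 = m1 - m, m2 - m
    P = a1 * (P1 + np.outer(d1, d1)) + a2 * (P2 + np.outer(d2, d2))
    return w, m, P

# Example: merge two nearby 2-D components (illustrative values).
w, m, P = merge_gaussians(0.3, np.array([0.0, 0.0]), np.eye(2),
                          0.2, np.array([1.0, 0.5]), 0.5 * np.eye(2))
```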

Journal ArticleDOI
TL;DR: This work formulates the network traffic estimation problem as a noise-immune temporal matrix completion (NiTMC) model, in which the complex noise is fitted by a mixture of Gaussians (MoG) and the network anomaly is smoothed by $L_{2,1}$-norm regularization.
Abstract: Accurately estimating origin-destination (OD) network traffic is crucial for network management and capacity planning. However, potential network anomalies and complex noise make this goal difficult to achieve. Existing network traffic estimation methods usually impute network traffic independently of anomaly detection, which ignores the potential for the two tasks to help each other achieve better performance. Moreover, these approaches are only suitable for simple Gaussian or outlier noise assumptions and cannot be applied to the more complex noise distributions of practical applications. To address these issues, we propose a novel anomaly-tolerant network traffic estimation approach that simultaneously estimates network traffic and detects network anomalies. Specifically, by utilizing the inherent low-rank property and temporal characteristics of the traffic matrix, we formulate the network traffic estimation problem as a noise-immune temporal matrix completion (NiTMC) model, where the complex noise is fitted by a mixture of Gaussians (MoG) and the network anomaly is smoothed by $L_{2,1}$-norm regularization. In addition, we design a convergence-guaranteed optimization algorithm based on the expectation maximization (EM) and block coordinate update (BCU) methods to solve the proposed model. Furthermore, to deal with large-scale network problems, we develop a scalable and memory-efficient algorithm employing the stochastic proximal gradient descent (SPGD) method. Finally, extensive experiments performed on real datasets demonstrate that our proposed NiTMC model outperforms previously widely used network traffic estimation methods.


Journal ArticleDOI
TL;DR: The noise boost further applies to regular and Wasserstein bidirectionally trained adversarial networks, speeding convergence and improving the accuracy of bidirectional backpropagation on both the MNIST test set of hand-written digits and the CIFAR-10 test set of images.

Proceedings ArticleDOI
01 Aug 2019
TL;DR: An efficient nonparametric Bayesian estimation of the kernel function of Hawkes processes is developed, and it is shown that on diffusions related to online videos, the learned kernels reflect the perceived longevity of different content types such as music or pet videos.
Abstract: In this paper, we develop an efficient nonparametric Bayesian estimation of the kernel function of Hawkes processes. The nonparametric Bayesian approach is important because it provides flexible Hawkes kernels and quantifies their uncertainty. Our method is based on the cluster representation of Hawkes processes. Utilizing the stationarity of the Hawkes process, we efficiently sample random branching structures and thus split the Hawkes process into clusters of Poisson processes. We derive two algorithms, a block Gibbs sampler and a maximum a posteriori estimator based on expectation maximization, and we show that our methods have a linear time complexity, both theoretically and empirically. On synthetic data, we show our methods to be able to infer flexible Hawkes triggering kernels. On two large-scale Twitter diffusion datasets, we show that our methods outperform the current state-of-the-art in goodness-of-fit and that the time complexity is linear in the size of the dataset. We also observe that on diffusions related to online videos, the learned kernels reflect the perceived longevity of different content types such as music or pet videos.
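
As background for the latent branching structure these methods sample over, here is the classical EM for a univariate Hawkes process with a parametric exponential kernel φ(τ) = αβe^(−βτ): the E-step assigns each event a probability of being background or triggered by each earlier event, and the M-step updates μ and α in closed form. This is the textbook parametric alternative, not the paper's nonparametric Bayesian estimator; β is held fixed and the demo timestamps are arbitrary.

```python
import numpy as np

def hawkes_em(t, T, beta=1.0, n_iter=100):
    """EM for a univariate Hawkes process with kernel φ(τ) = α β exp(-β τ),
    estimating the background rate μ and branching ratio α (β held fixed)."""
    n = len(t)
    mu, alpha = n / (2 * T), 0.5
    dt = t[:, None] - t[None, :]                  # t_i - t_j
    valid = dt > 0                                # only earlier events can trigger
    for _ in range(n_iter):
        # E-step: probability that event i is background vs. triggered by event j.
        trig = np.where(valid, alpha * beta * np.exp(-beta * np.clip(dt, 0, None)), 0.0)
        total = mu + trig.sum(axis=1)
        p_bg = mu / total
        p_trig = trig / total[:, None]
        # M-step: closed-form updates given the branching responsibilities.
        mu = p_bg.sum() / T
        alpha = p_trig.sum() / (1 - np.exp(-beta * (T - t))).sum()
    return mu, alpha

# Toy demo on an arbitrary event stream; real use would pass actual timestamps.
rng = np.random.default_rng(7)
t = np.sort(rng.uniform(0, 100, 300))
print(hawkes_em(t, T=100.0))
```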

Journal ArticleDOI
TL;DR: A five-parameter model of the dispersive wave packet was developed, and the parameter vector of each wave packet was obtained via the expectation-maximization (EM) algorithm; the results can be further applied to locate and evaluate structural damage.

Journal ArticleDOI
TL;DR: A robust 3-D segmentation technique incorporating the level-set concept and based on both shape and intensity constraints is introduced; it contains no weighting parameters that need tuning and overcomes the drawbacks of other PDE approaches.

Journal ArticleDOI
TL;DR: This work recasts the well-known Expectation Maximization method in a distributed setting, exploiting a recently proposed algorithmic framework for in-network non-convex optimization to solve the problem of distributed unsupervised clustering.

Journal ArticleDOI
TL;DR: The JK, WLBS and PB approaches to variance estimation are shown to be robust and provide good coverage across a range of real and simulated data sets when performing model-based clustering; but care is advised when using the BS in such settings.
Abstract: Mixture models with (multivariate) Gaussian components are a popular tool in model-based clustering. Such models are often fitted by a procedure that maximizes the likelihood, such as the EM algorithm. At convergence, the maximum likelihood parameter estimates are typically reported, but in most cases little emphasis is placed on the variability associated with these estimates. In part this may be due to the fact that standard errors are not directly calculated in the model-fitting algorithm, either because they are not required to fit the model, or because they are difficult to compute. The examination of standard errors in model-based clustering is therefore typically neglected. Sampling-based methods, such as the jackknife (JK), bootstrap (BS) and parametric bootstrap (PB), are intuitive, generalizable approaches to assessing parameter uncertainty in model-based clustering using a Gaussian mixture model. This paper provides a review and empirical comparison of the jackknife, bootstrap and parametric bootstrap methods for producing standard errors and confidence intervals for mixture parameters. The performance of such sampling methods in the presence of small and/or overlapping clusters requires consideration, however; here the weighted likelihood bootstrap (WLBS) approach is demonstrated to be effective in addressing this concern in a model-based clustering framework. The JK, BS, PB and WLBS methods are illustrated and contrasted through simulation studies and through the traditional Old Faithful and Thyroid data sets. The MclustBootstrap function, available in the most recent release of the popular R package mclust, facilitates the implementation of the JK, BS, PB and WLBS approaches to estimating parameter uncertainty in the context of model-based clustering. The JK, WLBS and PB approaches to variance estimation are shown to be robust and to provide good coverage across a range of real and simulated data sets when performing model-based clustering, but care is advised when using the BS in such settings. In the case of poor model fit (for example for data with small and/or overlapping clusters), JK and BS are found to suffer from not being able to fit the specified model in many of the sub-samples formed. The PB also suffers when model fit is poor, since it relies on data sets simulated from the model upon which to base the variance estimation calculations. However, the WLBS will generally provide a robust solution, driven by the fact that all observations are represented with some weight in each of the sub-samples formed under this approach.
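
The bootstrap (BS) variant is the easiest to sketch. The Python example below (standing in for the R/mclust MclustBootstrap workflow the paper uses) refits a two-component Gaussian mixture on resampled data and reads standard errors off the replicate means; sorting the means is a crude guard against label switching, and the data are synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(8)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)]).reshape(-1, 1)

boot_means = []
for b in range(200):
    xb = x[rng.integers(0, len(x), len(x))]        # nonparametric bootstrap resample
    gm = GaussianMixture(n_components=2, random_state=0).fit(xb)
    boot_means.append(np.sort(gm.means_.ravel()))  # sort to crudely handle label switching
boot_means = np.array(boot_means)
se = boot_means.std(axis=0)                        # bootstrap standard errors of the means
print(se)
```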