
Showing papers on "Maximum a posteriori estimation" published in 2020


Proceedings ArticleDOI
14 Jun 2020
TL;DR: Wang et al. propose an end-to-end trainable unfolding network that leverages both learning-based and model-based methods to handle the SISR problem with different scale factors.
Abstract: Learning-based single image super-resolution (SISR) methods are continuously showing superior effectiveness and efficiency over traditional model-based methods, largely due to the end-to-end training. However, different from model-based methods that can handle the SISR problem with different scale factors, blur kernels and noise levels under a unified MAP (maximum a posteriori) framework, learning-based methods generally lack such flexibility. To address this issue, this paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods. Specifically, by unfolding the MAP inference via a half-quadratic splitting algorithm, a fixed number of iterations consisting of alternately solving a data subproblem and a prior subproblem can be obtained. The two subproblems can then be solved with neural modules, resulting in an end-to-end trainable, iterative network. As a result, the proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model, while maintaining the advantages of learning-based methods. Extensive experiments demonstrate the superiority of the proposed deep unfolding network in terms of flexibility, effectiveness, and generalizability.
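The alternating scheme described above is easy to illustrate. Below is a minimal Python sketch of MAP super-resolution unfolded via half-quadratic splitting (HQS), under simplifying assumptions: a 1-D toy signal, blur-only degradation (no downsampling), and a fixed smoothing filter standing in for the paper's learned prior module; the names `blur`, `denoise`, and `hqs_sr` are illustrative, not the authors' code.

```python
import numpy as np

def blur(x, k):
    return np.convolve(x, k, mode="same")        # forward operator H

def blur_adj(x, k):
    return np.convolve(x, k[::-1], mode="same")  # adjoint H^T

def denoise(x, strength):
    # Placeholder prior module: a tiny smoothing filter. In the paper this
    # role is played by a learned denoiser conditioned on the noise level.
    k = np.array([0.25, 0.5, 0.25])
    return (1 - strength) * x + strength * np.convolve(x, k, mode="same")

def hqs_sr(y, k, lam=0.01, n_iters=8):
    """Half-quadratic splitting for the MAP objective
    min_x 0.5*||y - Hx||^2 + lam*Phi(x),
    alternating a data subproblem and a prior subproblem."""
    x = y.copy()
    for t in range(n_iters):
        mu = 0.1 * (t + 1)                       # increasing penalty schedule
        z = x
        # Data subproblem: gradient steps on ||y - Hx||^2 + mu*||x - z||^2
        # (the paper uses a closed-form FFT solution instead).
        for _ in range(10):
            grad = blur_adj(blur(x, k) - y, k) + mu * (x - z)
            x = x - 0.2 * grad
        # Prior subproblem: denoise x, with strength tied to lam/mu.
        x = denoise(x, min(1.0, lam / mu))
    return x

# Toy usage: recover a 1-D signal from a blurred, noisy observation.
rng = np.random.default_rng(0)
truth = np.zeros(64); truth[20:40] = 1.0
kernel = np.ones(5) / 5.0
y = blur(truth, kernel) + 0.01 * rng.standard_normal(64)
x_hat = hqs_sr(y, kernel)
```

Unfolding fixes `n_iters` and trains the prior module end to end, which is what turns this classical iteration into a network.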

259 citations


Journal ArticleDOI
TL;DR: This paper proposes a variational Bayes (VB) approach as an approximation of the optimal MAP detection and proves that the proposed iterative algorithm is guaranteed to converge to the global optimum of the approximated MAP detector regardless of whether the resulting factor graph is loopy.
Abstract: The emerging orthogonal time frequency space (OTFS) modulation technique has shown its superiority to the current orthogonal frequency division multiplexing (OFDM) scheme, in terms of its capabilities of exploiting full time-frequency diversity and coping with channel dynamics. The optimal maximum a posteriori (MAP) detection is capable of eliminating the negative impacts of the inter-symbol interference in the delay-Doppler (DD) domain at the expense of a prohibitively high complexity. To reduce the receiver complexity for the OTFS scheme, this paper proposes a variational Bayes (VB) approach as an approximation of the optimal MAP detection. Compared to the widely used message passing algorithm, we prove that the proposed iterative algorithm is guaranteed to converge to the global optimum of the approximated MAP detector regardless of whether the resulting factor graph is loopy. Simulation results validate the fast convergence of the proposed VB receiver and also show its promising performance gain compared to the conventional message passing algorithm.
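For orientation, the variational Bayes idea invoked here follows the standard mean-field recipe (generic notation, not the paper's exact derivation): the intractable symbol posterior is replaced by the closest fully factorized distribution,

```latex
q^{\star}(\mathbf{x}) = \arg\min_{q \in \mathcal{Q}}
  \mathrm{KL}\big( q(\mathbf{x}) \,\|\, p(\mathbf{x} \mid \mathbf{y}) \big),
\qquad
\mathcal{Q} = \Big\{ q : q(\mathbf{x}) = \prod\nolimits_i q_i(x_i) \Big\},
```

with each factor updated in turn via $\log q_i(x_i) = \mathbb{E}_{q_{\setminus i}}[\log p(\mathbf{y}, \mathbf{x})] + \mathrm{const}$. Since each coordinate update cannot increase the KL divergence, the iteration converges without any assumption on the factor graph's loop structure, which is the intuition behind the convergence guarantee claimed above.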

150 citations


Proceedings Article
12 Jul 2020
TL;DR: It is shown that a sufficient condition for calibrated uncertainty on a ReLU network is "to be a bit Bayesian", which validates the usage of last-layer Bayesian approximations and motivates a range of fidelity-cost trade-offs.
Abstract: The point estimates of ReLU classification networks---arguably the most widely used neural network architecture---have been shown to yield arbitrarily high confidence far away from the training data. This architecture, in conjunction with a maximum a posteriori estimation scheme, is thus neither calibrated nor robust. Approximate Bayesian inference has been empirically demonstrated to improve predictive uncertainty in neural networks, although the theoretical analysis of such Bayesian approximations is limited. We theoretically analyze approximate Gaussian distributions on the weights of ReLU networks and show that they fix the overconfidence problem. Furthermore, we show that even a simplistic, and thus cheap, Bayesian approximation also fixes these issues. This indicates that a sufficient condition for calibrated uncertainty on a ReLU network is "to be a bit Bayesian". These theoretical results validate the usage of last-layer Bayesian approximations and motivate a range of fidelity-cost trade-offs. We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
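The last-layer treatment endorsed here is compact enough to sketch. The following numpy sketch (illustrative, not the authors' code) applies a Laplace approximation to a binary logistic last layer: fit the MAP weights, take a Gaussian posterior whose covariance is the inverse Hessian of the negative log posterior at the MAP, and Monte Carlo average the predictive; `Phi` stands for fixed last-layer features.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_map(Phi, y, prec=1.0, n_steps=500, lr=0.5):
    """MAP weights of a logistic last layer on features Phi, with a
    Gaussian prior of precision `prec` (i.e., L2 regularization)."""
    w = np.zeros(Phi.shape[1])
    for _ in range(n_steps):
        p = sigmoid(Phi @ w)
        grad = Phi.T @ (p - y) + prec * w
        w -= lr * grad / len(y)
    return w

def laplace_posterior(Phi, w_map, prec=1.0):
    """Gaussian N(w_map, H^{-1}), H = Hessian of the neg. log posterior."""
    p = sigmoid(Phi @ w_map)
    H = (Phi * (p * (1 - p))[:, None]).T @ Phi + prec * np.eye(len(w_map))
    return w_map, np.linalg.inv(H)

def predictive(phi, w_map, cov, n_samples=500, seed=0):
    """Average the sigmoid over weight samples; far from the data this
    shrinks confidence toward 1/2, countering the overconfidence problem."""
    rng = np.random.default_rng(seed)
    ws = rng.multivariate_normal(w_map, cov, size=n_samples)
    return sigmoid(ws @ phi).mean()
```

The point of the paper is that even such a cheap, last-layer-only Gaussian suffices to repair asymptotic overconfidence.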

126 citations


Journal ArticleDOI
TL;DR: A Bayesian statistical approach to the multimodal change detection (CD) problem in remote sensing imagery in the unsupervised Markovian framework, which relies on an observation field built from pixel pairwise modeling of the bitemporal heterogeneous satellite image pair.
Abstract: This work presents a Bayesian statistical approach to the multimodal change detection (CD) problem in remote sensing imagery. More precisely, we formulate the multimodal CD problem in the unsupervised Markovian framework. The main novelty of the proposed Markovian model lies in the use of an observation field built from pixel pairwise modeling of the bitemporal heterogeneous satellite image pair. Such modeling allows us to rely on a robust visual cue, with the appealing property of being quasi-invariant to the imaging (multi-)modality. To use this observation cue as part of a stochastic likelihood model, we first rely on a preliminary iterative estimation technique that takes into account the variety of the laws in the distribution mixture and estimates the parameters of the Markovian mixture model. Once this estimation step is completed, the maximum a posteriori (MAP) solution of the change detection map, based on the previously estimated parameters, is then computed with a stochastic optimization process. Experimental results and comparisons involving a mixture of different types of imaging modalities confirm the robustness of the proposed approach.
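In generic notation (assumed here, not copied from the paper), the Markovian MAP step over a binary change map $\mathbf{x}$ given the pairwise observation field $\mathbf{y}$ minimizes an energy of the form

```latex
\hat{\mathbf{x}}_{\mathrm{MAP}}
  = \arg\max_{\mathbf{x}}\; p(\mathbf{y}\mid\mathbf{x})\,p(\mathbf{x})
  = \arg\min_{\mathbf{x}}\; \sum_{s} -\log p(y_s \mid x_s)
    \;+\; \beta \sum_{\langle s,t \rangle} \mathbb{1}[x_s \neq x_t],
```

where the second sum ranges over neighboring pixel pairs and $\beta > 0$ sets the spatial smoothness of the Potts-type prior; a stochastic optimizer (e.g., simulated-annealing-style sampling) is one standard way to minimize such an energy, consistent with the process described above.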

80 citations


Journal ArticleDOI
TL;DR: A general model of deep neural network (DNN)-based modulation classifiers for single-input single-output (SISO) systems is introduced; its feasibility is analyzed using the maximum a posteriori (MAP) criterion, and its robustness to uncertain noise conditions is compared to that of conventional maximum likelihood (ML)-based classifiers.
Abstract: Recently, classifying the modulation schemes of signals using deep neural networks has received much attention. In this paper, we introduce a general model of deep neural network (DNN)-based modulation classifiers for single-input single-output (SISO) systems. Its feasibility is analyzed using the maximum a posteriori probability (MAP) criterion and its robustness to uncertain noise conditions is compared to that of the conventional maximum likelihood (ML)-based classifiers. To reduce the design and training cost of DNN classifiers, a simple but effective pre-processing method is introduced and adopted. Furthermore, featuring multiple recurrent neural network (RNN) layers, the DNN modulation classifier is realized. Simulation results show that the proposed RNN-based classifier is robust to uncertain noise conditions, and its performance approaches that of the ideal ML classifier with perfect channel and noise information. Moreover, with a much lower complexity, it outperforms the existing ML-based classifiers, specifically the expectation maximization (EM) and expectation conditional maximization (ECM) classifiers, which iteratively estimate channel and noise parameters. In addition, the proposed classifier is shown to be invariant to signal distortions such as frequency offset. Furthermore, the adopted pre-processing method is shown to accelerate the training process of our proposed classifier, thus reducing the training cost. Lastly, the computational complexity of our proposed classifier is analyzed and compared to other traditional ones, which further demonstrates its overall advantage.
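The MAP criterion used in the feasibility analysis is the standard one; for a received signal $\mathbf{r}$ and candidate modulations $m \in \mathcal{M}$ (generic notation),

```latex
\hat{m} = \arg\max_{m \in \mathcal{M}} P(m \mid \mathbf{r})
        = \arg\max_{m \in \mathcal{M}} p(\mathbf{r} \mid m)\, P(m),
```

which reduces to the ML classifier when the priors $P(m)$ are uniform; a well-trained DNN classifier can be viewed as approximating these posteriors directly from data, without explicit channel and noise estimation.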

78 citations


Proceedings Article
30 Apr 2020
TL;DR: In this paper, V-MPO, an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) that performs policy iteration based on a learned state-value function, is proposed as an alternative to policy gradient methods, which suffer from large variance that may limit performance and in practice require carefully tuned entropy regularization to prevent policy collapse.
Abstract: Some of the most successful applications of deep reinforcement learning to challenging domains in discrete and continuous control have used policy gradient methods in the on-policy setting. However, policy gradients can suffer from large variance that may limit performance, and in practice require carefully tuned entropy regularization to prevent policy collapse. As an alternative to policy gradient algorithms, we introduce V-MPO, an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) that performs policy iteration based on a learned state-value function. We show that V-MPO surpasses previously reported scores for both the Atari-57 and DMLab-30 benchmark suites in the multi-task setting, and does so reliably without importance weighting, entropy regularization, or population-based tuning of hyperparameters. On individual DMLab and Atari levels, the proposed algorithm can achieve scores that are substantially higher than has previously been reported. V-MPO is also applicable to problems with high-dimensional, continuous action spaces, which we demonstrate in the context of learning to control simulated humanoids with 22 degrees of freedom from full state observations and 56 degrees of freedom from pixel observations, as well as example OpenAI Gym tasks where V-MPO achieves substantially higher asymptotic scores than previously reported.

72 citations


Proceedings ArticleDOI
20 May 2020
TL;DR: It is shown that the most likely translations under the model accumulate so little probability mass that the mode can be considered essentially arbitrary; the authors therefore advocate decision rules that take the translation distribution into account holistically.
Abstract: Recent studies have revealed a number of pathologies of neural machine translation (NMT) systems. Hypotheses explaining these mostly suggest there is something fundamentally wrong with NMT as a model or its training algorithm, maximum likelihood estimation (MLE). Most of this evidence was gathered using maximum a posteriori (MAP) decoding, a decision rule aimed at identifying the highest-scoring translation, i.e. the mode. We argue that the evidence corroborates the inadequacy of MAP decoding more than it casts doubt on the model and its training algorithm. In this work, we show that translation distributions do reproduce various statistics of the data well, but that beam search strays from such statistics. We show that some of the known pathologies and biases of NMT are due to MAP decoding and not to NMT's statistical assumptions or MLE. In particular, we show that the most likely translations under the model accumulate so little probability mass that the mode can be considered essentially arbitrary. We therefore advocate for the use of decision rules that take into account the translation distribution holistically. We show that an approximation to minimum Bayes risk decoding gives competitive results, confirming that NMT models do capture important aspects of translation well in expectation.
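Sampling-based minimum Bayes risk (MBR) decoding, the decision rule advocated above, is easy to sketch. The following Python sketch is illustrative (not the authors' code): `sample_translation` and `utility` are hypothetical stand-ins for an NMT sampler and a similarity metric such as BLEU or METEOR.

```python
import numpy as np

def mbr_decode(sample_translation, utility, n_samples=32, seed=0):
    """Pick the candidate with the highest expected utility under the
    model's own translation distribution, instead of the mode (MAP)."""
    rng = np.random.default_rng(seed)
    samples = [sample_translation(rng) for _ in range(n_samples)]
    # Expected utility of each candidate against the empirical sample.
    scores = [np.mean([utility(c, s) for s in samples]) for c in samples]
    return samples[int(np.argmax(scores))]

# Toy usage with a dummy 'model' over three strings and word-overlap utility.
def toy_sampler(rng):
    return rng.choice(["the cat sat", "a cat sat", "cat the sat"],
                      p=[0.5, 0.3, 0.2])

def toy_utility(hyp, ref):
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / len(h | r)

print(mbr_decode(toy_sampler, toy_utility))
```

Unlike beam search, the chosen output is rewarded for sitting in a high-probability region of similar translations, not for being the single highest-scoring string.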

70 citations


Journal ArticleDOI
TL;DR: A thorough review of MC methods for the estimation of static parameters in signal processing applications is performed, describing many of the most relevant MCMC and IS algorithms, and their combined use.
Abstract: Statistical signal processing applications usually require the estimation of some parameters of interest given a set of observed data. These estimates are typically obtained either by solving a multi-variate optimization problem, as in the maximum likelihood (ML) or maximum a posteriori (MAP) estimators, or by performing a multi-dimensional integration, as in the minimum mean squared error (MMSE) estimators. Unfortunately, analytical expressions for these estimators cannot be found in most real-world applications, and the Monte Carlo (MC) methodology is one feasible approach. MC methods proceed by drawing random samples, either from the desired distribution or from a simpler one, and using them to compute consistent estimators. The most important families of MC algorithms are Markov chain MC (MCMC) and importance sampling (IS). On the one hand, MCMC methods draw samples from a proposal density and then build an ergodic Markov chain whose stationary distribution is the desired distribution by accepting or rejecting the candidate samples as the new state of the chain. On the other hand, IS techniques draw samples from a simple proposal density and then assign them suitable weights that measure their quality in some appropriate way. In this paper, we perform a thorough review of MC methods for the estimation of static parameters in signal processing applications. A historical note on the development of MC schemes is also provided, followed by the basic MC method and a brief description of the rejection sampling (RS) algorithm, as well as three sections describing many of the most relevant MCMC and IS algorithms, and their combined use. Finally, five numerical examples (including the estimation of the parameters of a chaotic system, a localization problem in wireless sensor networks and a spectral analysis application) are provided in order to demonstrate the performance of the described approaches.
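The two families the review covers can each be illustrated in a few lines. Below is a minimal Python sketch (generic, not tied to the paper's examples) of random-walk Metropolis, an MCMC method, and self-normalized importance sampling, both targeting a one-dimensional posterior known only up to a constant.

```python
import numpy as np

def metropolis(log_post, x0, n, step=0.5, seed=0):
    """Random-walk Metropolis: propose, then accept/reject so that the
    chain's stationary distribution is the target posterior."""
    rng = np.random.default_rng(seed)
    x, lp, chain = x0, log_post(x0), []
    for _ in range(n):
        prop = x + step * rng.standard_normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # MH acceptance test
            x, lp = prop, lp_prop
        chain.append(x)
    return np.array(chain)

def importance_mmse(log_post, n=5000, scale=3.0, seed=0):
    """Self-normalized IS estimate of the posterior mean (MMSE estimate)
    using a wide zero-mean Gaussian proposal."""
    rng = np.random.default_rng(seed)
    xs = scale * rng.standard_normal(n)
    log_prop = -0.5 * (xs / scale) ** 2 - np.log(scale * np.sqrt(2 * np.pi))
    logw = np.array([log_post(x) for x in xs]) - log_prop
    w = np.exp(logw - logw.max())                  # stable normalization
    return np.sum(w * xs) / np.sum(w)

# Toy non-Gaussian target: posterior proportional to exp(-(x - 1)^4).
log_post = lambda x: -(x - 1.0) ** 4
print(metropolis(log_post, 0.0, 10000).mean(), importance_mmse(log_post))
```

Both estimates should agree near 1.0; the review's point is that such consistent estimators remain available when the ML/MAP optimization or the MMSE integral has no analytical solution.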

69 citations


Posted Content
TL;DR: Simulation results show that the proposed DFRC-based beamforming scheme is superior to the feedback-based approach in terms of both estimation and communication performance, and the proposed message passing algorithm achieves performance similar to that of high-complexity particle-filtering-based methods.
Abstract: The development of dual-functional radar-communication (DFRC) systems, where vehicle localization and tracking can be combined with vehicular communication, will lead to more efficient future vehicular networks. In this paper, we develop a predictive beamforming scheme in the context of DFRC systems. We consider a system model where the road-side unit estimates and predicts the motion parameters of vehicles based on the echoes of the DFRC signal. Compared to conventional feedback-based beam tracking approaches, the proposed method can reduce the signaling overhead and improve the accuracy. To accurately estimate the motion parameters of vehicles in real-time, we propose a novel message passing algorithm based on a factor graph, which yields a near-optimal solution to the maximum a posteriori estimation. The beamformers are then designed based on the predicted angles for establishing the communication links. With the employment of appropriate approximations, all messages on the factor graph can be derived in closed form, thus reducing the complexity. Simulation results show that the proposed DFRC-based beamforming scheme is superior to the feedback-based approach in terms of both estimation and communication performance. Moreover, the proposed message passing algorithm achieves performance similar to that of high-complexity particle-based methods.

61 citations


Journal ArticleDOI
TL;DR: In this article, a Bayesian approach is proposed for ship wake detection using a generalized minimax concave (GMC) penalty which, despite being a nonconvex regularizer, keeps the overall cost function convex.
Abstract: In order to analyze synthetic aperture radar (SAR) images of the sea surface, ship wake detection is essential for extracting information on the wake-generating vessels. One possibility is to assume a linear model for wakes, in which case detection approaches are based on transforms such as Radon and Hough. These express the bright (dark) lines as peak (trough) points in the transform domain. In this article, ship wake detection is posed as an inverse problem whose associated cost function includes a sparsity-enforcing penalty, i.e., the generalized minimax concave (GMC) function. Despite being a nonconvex regularizer, the GMC penalty enforces the overall cost function to be convex. The proposed solution is based on a Bayesian formulation, whereby the point estimates are recovered using maximum a posteriori (MAP) estimation. To quantify the performance of the proposed method, various types of SAR images are used, corresponding to TerraSAR-X, COSMO-SkyMed, Sentinel-1, and Advanced Land Observing Satellite 2 (ALOS2). The performance of various priors in solving the proposed inverse problem is first studied by investigating the GMC along with the $L_{1}$, $L_{p}$, nuclear, and total variation (TV) norms. We show that the GMC achieves the best results, and we subsequently study the merits of the corresponding method in comparison to two state-of-the-art approaches for ship wake detection. The results show that our proposed technique offers the best performance by achieving an 80% success rate.
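Schematically (notation assumed from the abstract, not copied from the article), the MAP estimate solves

```latex
\hat{\mathbf{x}}_{\mathrm{MAP}}
  = \arg\min_{\mathbf{x}} \;
    \tfrac{1}{2}\,\|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2
    + \lambda\, \psi_{\mathbf{B}}(\mathbf{x}),
```

where $\psi_{\mathbf{B}}$ is the GMC penalty; in the GMC construction the overall cost stays convex provided $\mathbf{A}^{\top}\mathbf{A} - \lambda\,\mathbf{B}^{\top}\mathbf{B} \succeq 0$, so the estimate can be computed with convex optimization while still enjoying the stronger sparsity of a nonconvex penalty.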

Journal ArticleDOI
TL;DR: This novel framework integrates state-of-the-art Bayesian formulations into a hierarchical setting, aiming to capture both the identification precision and the variability caused by modeling errors, and thereby addresses concerns about how well mainstream Bayesian methods capture the stochastic variability across dissimilar data sets.

Proceedings ArticleDOI
14 Jun 2020
TL;DR: An effective method named Probability Weighted Compact Feature Learning (PWCF), which provides inter-domain correlation guidance to promote cross-domain retrieval accuracy and learns a series of compact binary codes to improve the retrieval speed, is proposed.
Abstract: Domain adaptive image retrieval includes single-domain retrieval and cross-domain retrieval. Most existing image retrieval methods focus only on single-domain retrieval, which assumes that the distributions of retrieval databases and queries are similar. However, in practical applications, the discrepancies between retrieval databases, often taken under ideal illumination/pose/background/camera conditions, and queries, usually obtained under uncontrolled conditions, are very large. In this paper, considering the practical application, we focus on challenging cross-domain retrieval. To address the problem, we propose an effective method named Probability Weighted Compact Feature Learning (PWCF), which provides inter-domain correlation guidance to promote cross-domain retrieval accuracy and learns a series of compact binary codes to improve the retrieval speed. First, we derive our loss function through Maximum A Posteriori Estimation (MAP): a Bayesian Perspective (BP) induced focal-triplet loss, a BP-induced quantization loss, and a BP-induced classification loss. Second, we propose a common manifold structure between domains to explore the potential correlation across domains. Considering that the original feature representation is biased due to the inter-domain discrepancy, the manifold structure is difficult to construct. Therefore, we propose a new feature named Histogram Feature of Neighbors (HFON) from the sample statistics perspective. Extensive experiments on various benchmark databases validate that our method outperforms many state-of-the-art image retrieval methods for domain adaptive image retrieval. The source code is available at https://github.com/fuxianghuang1/PWCF.

Journal ArticleDOI
TL;DR: The PCVM is extended and a multiclass PCVM (mPCVM) is proposed, which combines the advantages of both the support vector machine and the relevance vector machine, delivering a sparse Bayesian solution to classification problems.
Abstract: The probabilistic classification vector machine (PCVM) synthesizes the advantages of both the support vector machine and the relevance vector machine, delivering a sparse Bayesian solution to classification problems. However, the PCVM is currently only applicable to binary cases. Extending the PCVM to multiclass cases via heuristic voting strategies such as one-vs-rest or one-vs-one often results in a dilemma where classifiers make contradictory predictions, and those strategies might lose the benefits of probabilistic outputs. To overcome this problem, we extend the PCVM and propose a multiclass PCVM (mPCVM). Two learning algorithms, i.e., one top-down algorithm and one bottom-up algorithm, have been implemented in the mPCVM. The top-down algorithm obtains the maximum a posteriori (MAP) point estimates of the parameters based on an expectation–maximization algorithm, and the bottom-up algorithm is an incremental paradigm that maximizes the marginal likelihood. The superior performance of the mPCVMs, especially when the investigated problem has a large number of classes, is extensively evaluated on synthetic and benchmark data sets.

Journal ArticleDOI
TL;DR: This paper reviews the classical problem of free-form curve registration and applies it to an efficient RGB-D visual odometry system called Canny-VO, as it efficiently tracks all Canny edge features extracted from the images.
Abstract: The present paper reviews the classical problem of free-form curve registration and applies it to an efficient RGB-D visual odometry system called Canny-VO, as it efficiently tracks all Canny edge features extracted from the images. Two replacements for the distance transformation commonly used in edge registration are proposed: Approximate Nearest Neighbour Fields and Oriented Nearest Neighbour Fields. 3D-2D edge alignment benefits from these alternative formulations in terms of both efficiency and accuracy. It removes the need for the more computationally demanding paradigms of data-to-model registration, bilinear interpolation, and sub-gradient computation. To ensure robustness of the system in the presence of outliers and sensor noise, the registration is formulated as a maximum a posteriori problem, and the resulting weighted least squares objective is solved by the iteratively re-weighted least squares method. A variety of robust weight functions are investigated and the optimal choice is made based on the statistics of the residual errors. Efficiency is furthermore boosted by an adaptively sampled definition of the nearest neighbour fields. Extensive evaluations on public SLAM benchmark sequences demonstrate state-of-the-art performance and an advantage over classical Euclidean distance fields.
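The robust MAP-to-IRLS step mentioned above generalizes beyond registration; a tiny Python sketch (illustrative, not the Canny-VO code) shows the pattern on robust line fitting with Huber weights:

```python
import numpy as np

def irls(A, b, n_iters=20, delta=1.0):
    """Iteratively re-weighted least squares: each iteration solves a
    weighted LS problem whose weights come from the previous residuals
    (Huber weights here), downweighting outliers as in a robust MAP."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(n_iters):
        r = A @ x - b
        w = np.where(np.abs(r) <= delta, 1.0,
                     delta / np.maximum(np.abs(r), 1e-12))   # Huber weights
        sw = np.sqrt(w)
        x = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)[0]
    return x

# Toy usage: fit a line to data with one gross outlier.
t = np.linspace(0.0, 1.0, 20)
b = 2.0 * t + 1.0
b[5] += 10.0                                  # outlier
A = np.stack([t, np.ones_like(t)], axis=1)
print(irls(A, b))                             # close to [2, 1] despite it
```

In Canny-VO the residuals are edge-alignment errors and, per the abstract, the weight function itself is selected from several robust candidates based on residual statistics.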

Journal ArticleDOI
TL;DR: An effective satellite video SR framework based on locally spatiotemporal neighbors and nonlocal similarity modeling is proposed, and it demonstrates better SR performance in preserving edges and texture details compared with state-of-the-art video SR methods.
Abstract: Recently, super-resolution (SR) of satellite videos has received increasing attention, as it can overcome the limitation of spatial resolution in applications of satellite videos to dynamic analysis. The low quality of satellite videos presents big challenges to the development of spatial SR techniques, e.g., accurate motion estimation and motion compensation for multiframe SR. Therefore, reasonable image priors in the maximum a posteriori (MAP) framework, where motion information among adjacent frames is involved, are needed to regularize the solution space and generate the corresponding high-resolution frames. In this article, an effective satellite video SR framework based on locally spatiotemporal neighbors and nonlocal similarity modeling is proposed. Firstly, local prior knowledge is represented by means of adaptively exploiting spatiotemporal neighbors. In this way, local motion information can be captured implicitly without explicit motion estimation. Secondly, the nonlocal spatial similarity is integrated into the proposed SR framework to enhance texture details. Finally, the locally spatiotemporal regularization and nonlocal similarity modeling lead to a complex optimization problem, which is solved via iteratively reweighted least squares in the proposed SR framework. The videos from the Jilin-1 satellite and the OVS-1A satellite are used for evaluating the proposed method. Experimental results show that the proposed method demonstrates better SR performance in preserving edges and texture details compared with state-of-the-art video SR methods.

Journal ArticleDOI
TL;DR: The SSGL is extended to sparse generalized additive models (GAMs), thereby introducing the first nonparametric variant of the spike-and-slab lasso methodology; theory is developed to uniquely characterize the global posterior mode under the SSGL, and a highly efficient block coordinate ascent algorithm for maximum a posteriori estimation is introduced.
Abstract: We introduce the spike-and-slab group lasso (SSGL) for Bayesian estimation and variable selection in linear regression with grouped variables. We further extend the SSGL to sparse generalized additive models (GAMs), thereby introducing the first nonparametric variant of the spike-and-slab lasso methodology.
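For orientation (a hedged reconstruction from the spike-and-slab lasso literature, since the abstract is truncated here), the SSGL places on each group of coefficients $\boldsymbol{\beta}_g$ a two-component mixture of group-lasso-type densities,

```latex
\pi(\boldsymbol{\beta}_g \mid \theta)
  = (1-\theta)\,\Psi(\boldsymbol{\beta}_g \mid \lambda_0)
  + \theta\,\Psi(\boldsymbol{\beta}_g \mid \lambda_1),
\qquad
\Psi(\boldsymbol{\beta}_g \mid \lambda) \propto \lambda^{m_g}
  e^{-\lambda \|\boldsymbol{\beta}_g\|_2},
```

where $m_g$ is the group size: a large $\lambda_0$ (spike) shrinks irrelevant groups toward zero while a small $\lambda_1$ (slab) leaves relevant groups nearly unpenalized, and the MAP estimate (global posterior mode) is found by block coordinate ascent over the groups.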

Proceedings ArticleDOI
06 Jul 2020
TL;DR: Graph Learning Neural Networks (GLNNs) exploit the optimization of graphs (the adjacency matrix in particular) from both data and tasks; leveraging spectral graph theory, the objective of graph learning is derived from a sparsity constraint, properties of a valid adjacency matrix, and a graph Laplacian regularizer via maximum a posteriori estimation.
Abstract: Graph Convolutional Neural Networks (GCNNs) are generalizations of CNNs to graph-structured data, in which convolution is guided by the graph topology. In many cases where graphs are unavailable, existing methods manually construct graphs or learn task-driven adaptive graphs. In this paper, we propose Graph Learning Neural Networks (GLNNs), which exploit the optimization of graphs (the adjacency matrix in particular) from both data and tasks. Leveraging spectral graph theory, we propose the objective of graph learning from a sparsity constraint, properties of a valid adjacency matrix, as well as a graph Laplacian regularizer via maximum a posteriori estimation. The optimization objective is then integrated into the loss function of the GCNN, which adapts the graph topology not only to the labels of a specific task but also to the input data. Experimental results show that our proposed GLNN significantly outperforms state-of-the-art approaches over widely adopted social network datasets and citation network datasets for semi-supervised classification.
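Schematically (notation assumed, not copied from the paper), the graph-learning objective combines the three ingredients named above:

```latex
\min_{\mathbf{A}}\;
  \operatorname{tr}\!\big(\mathbf{X}^{\top}\mathbf{L}\mathbf{X}\big)
  + \alpha\,\|\mathbf{A}\|_{1}
\quad \text{s.t.} \quad
\mathbf{A} = \mathbf{A}^{\top},\;\; \mathbf{A} \ge 0,\;\;
\operatorname{diag}(\mathbf{A}) = \mathbf{0},
```

where $\mathbf{L} = \mathbf{D} - \mathbf{A}$ is the Laplacian of the learned adjacency $\mathbf{A}$ and $\mathbf{X}$ are the node signals. Under a MAP reading, the trace term acts as a Gaussian-Markov-random-field log-prior enforcing smoothness of signals over the graph, the $\ell_1$ term is a sparsity prior on edges, and the constraints encode a valid adjacency matrix; adding this objective to the GCNN loss is what lets the topology adapt to both data and labels.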

Book ChapterDOI
23 Aug 2020
TL;DR: Chen et al. reformulate multi-person 3D pose estimation as crowd pose estimation, with two key components: a graph model for fast cross-view matching and a maximum a posteriori (MAP) estimator for the reconstruction of the 3D human poses.
Abstract: Epipolar constraints are at the core of feature matching and depth estimation in current multi-person multi-camera 3D human pose estimation methods. Despite the satisfactory performance of this formulation in sparser crowd scenes, its effectiveness is frequently challenged under denser crowd circumstances mainly due to two sources of ambiguity. The first is the mismatch of human joints resulting from the simple cues provided by the Euclidean distances between joints and epipolar lines. The second is the lack of robustness from the naive formulation of the problem as a least squares minimization. In this paper, we depart from the multi-person 3D pose estimation formulation, and instead reformulate it as crowd pose estimation. Our method consists of two key components: a graph model for fast cross-view matching, and a maximum a posteriori (MAP) estimator for the reconstruction of the 3D human poses. We demonstrate the effectiveness and superiority of our proposed method on four benchmark datasets. Our code is available at: https://github.com/HeCraneChen/3D-Crowd-Pose-Estimation-Based-on-MVG.

Journal ArticleDOI
TL;DR: Under the scenario of time-series observation by remote sensing imagery, a joint spectral–spatial–temporal MAP-based (SST_MAP) model for SPM is proposed, in which a newly developed temporal regularization model is added to the MAP model, based on the availability of a temporally close fine image covering the same study region.
Abstract: The maximum a posteriori (MAP) estimation model-based sub-pixel mapping (SPM) method is an alternative way to solve the ill-posed SPM problem. The MAP estimation model has been proven to be an effective SPM approach and has been extensively developed over the past few years, as a result of its effective regularization capability that comes from the spatial regularization model. However, various spatial regularization models do not always truly reflect the detailed spatial distribution in a real situation, and the over-smoothing effect of the spatial regularization model always tends to efface the detailed structural information. In this article, under the scenario of time-series observation by remote sensing imagery, the joint spectral–spatial–temporal MAP-based (SST_MAP) model for SPM is proposed. In SST_MAP, a newly developed temporal regularization model is added to the MAP model, based on the prerequisite for a temporally close fine image covering the same study region. This available fine image can provide the specific spatial structures most closely conforming to the ground truth for a more precise constraint, thereby reducing the over-smoothing effect. Furthermore, the three dimensions are mutually balanced and mutually constrained, to reach an equilibrium point and achieve restoration of both smooth areas for the homogeneous land-cover classes and a detailed structure for the heterogeneous land-cover classes. Four experiments were designed to validate the proposed SST_MAP: three synthetic-image experiments and one real-image experiment. The restoration results confirm the superiority of the proposed SST_MAP model. Notably, under the background of time-series observation, SST_MAP provides an alternative way of land-cover change detection (LCCD), achieving both detailed spatial-scale and high-frequency temporal LCCD observation for the study case of urbanization analysis within the city of Wuhan in China.

Journal ArticleDOI
TL;DR: This work proposes a maximum a posteriori (MAP) estimator to detect the information source with other methods as the prior, and designs two approximate MAP estimators, namely brute force search approximation (BFSA) and greedy search bound approximation (GSBA), from the perspective of likelihood approximation.
Abstract: Information source detection is to identify nodes initiating the diffusion process in a network, which has a wide range of applications including epidemic outbreak prevention, Internet virus source identification, and rumor source tracing in social networks. Although it has attracted ever-increasing attention from the research community in recent years, existing solutions still suffer from high time complexity and inadequate effectiveness, due to the high dynamics of information diffusion and observing just a snapshot of the whole process. To this end, we present a comprehensive study for single information source detection in weighted graphs. Specifically, we first propose a maximum a posteriori (MAP) estimator to detect the information source with other methods as the prior, which ensures our method can be integrated with others naturally. Different from many related works, we exploit both infected nodes and their uninfected neighbors to calculate the effective propagation probability, and then derive the exact formation of likelihood for general weighted graphs. To further improve the efficiency, we design two approximate MAP estimators, namely brute force search approximation (BFSA) and greedy search bound approximation (GSBA), from the perspective of likelihood approximation. BFSA tries to traverse the permitted permutations to directly compute the likelihood, but GSBA exploits a strategy of greedy search to find a surrogate upper bound of the likelihood, and thus avoids the enumeration of permitted permutations. Therefore, detecting with partial nodes and likelihood approximation reduces the computational complexity drastically for large graphs. Extensive experiments on several data sets also clearly demonstrate the effectiveness of our methods on detecting the single information source with different settings in weighted graphs.

Journal ArticleDOI
TL;DR: In this paper, a Bayesian hierarchical setting is introduced, which breaks time-history vibration responses into several segments so as to capture and identify the variability of inferred parameters over the segments, and the proposed method delivers robust parametric uncertainties with respect to unknown phenomena such as ambient conditions, input characteristics, and environmental factors.

Posted Content
TL;DR: Non-asymptotic computational guarantees for Langevin-type MCMC algorithms which scale polynomially in key quantities such as the dimension of the model, the desired precision level, and the number of available statistical measurements are presented.
Abstract: The problem of generating random samples of high-dimensional posterior distributions is considered. The main results consist of non-asymptotic computational guarantees for Langevin-type MCMC algorithms which scale polynomially in key quantities such as the dimension of the model, the desired precision level, and the number of available statistical measurements. As a direct consequence, it is shown that posterior mean vectors as well as optimisation-based maximum a posteriori (MAP) estimates are computable in polynomial time, with high probability under the distribution of the data. These results are complemented by statistical guarantees for recovery of the ground truth parameter generating the data. Our results are derived in a general high-dimensional non-linear regression setting (with Gaussian process priors) where posterior measures are not necessarily log-concave, employing a set of local 'geometric' assumptions on the parameter space, and assuming that a good initialiser of the algorithm is available. The theory is applied to a representative non-linear example from PDEs involving a steady-state Schrödinger equation.
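The Langevin-type sampler analyzed here can be illustrated with the unadjusted Langevin algorithm (ULA), a generic sketch rather than the paper's exact scheme: each step is a gradient step on the log-posterior plus Gaussian noise.

```python
import numpy as np

def ula(grad_log_post, theta0, n_steps, gamma, seed=0):
    """Unadjusted Langevin algorithm: an Euler discretization of the
    Langevin diffusion, whose stationary law approximates the posterior
    for small step size gamma."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    out = np.empty((n_steps, theta.size))
    for k in range(n_steps):
        theta = (theta + gamma * grad_log_post(theta)
                 + np.sqrt(2.0 * gamma) * rng.standard_normal(theta.size))
        out[k] = theta
    return out

# Toy posterior N(mu, I): samples concentrate near mu, and their mean
# approximates the posterior-mean estimator discussed in the abstract.
mu = np.array([1.0, -2.0])
samples = ula(lambda th: -(th - mu), np.zeros(2), n_steps=20000, gamma=0.05)
print(samples[5000:].mean(axis=0))   # roughly [1, -2]
```

The paper's contribution is showing that, under its local geometric assumptions and a good initializer, such iterations mix in polynomially many steps even for non-log-concave posteriors arising in nonlinear PDE-constrained regression.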

Journal ArticleDOI
TL;DR: A computationally efficient model updating approach based on maximum a posteriori (MAP) estimation is formulated, in which all the mass, stiffness, and damping parameters are updated in a sequential iterative manner; the approach is observed to be successful for updating the FE model as well as for detecting changes/damages in a probabilistic manner, while remaining computationally efficient.

Proceedings Article
17 Jul 2020
TL;DR: The proposed framework, termed Convolutional Neural Networks with Feedback (CNN-F), introduces a generative feedback with latent variables into existing CNN architectures, making consistent predictions via alternating MAP inference under a Bayesian framework.
Abstract: Neural networks are vulnerable to input perturbations such as additive noise and adversarial attacks. In contrast, human perception is much more robust to such perturbations. The Bayesian brain hypothesis states that human brains use an internal generative model to update the posterior beliefs of the sensory input. This mechanism can be interpreted as a form of self-consistency between the maximum a posteriori (MAP) estimation of an internal generative model and the external environment. Inspired by such hypothesis, we enforce self-consistency in neural networks by incorporating generative recurrent feedback. We instantiate this design on convolutional neural networks (CNNs). The proposed framework, termed Convolutional Neural Networks with Feedback (CNN-F), introduces a generative feedback with latent variables to existing CNN architectures, where consistent predictions are made through alternating MAP inference under a Bayesian framework. In the experiments, CNN-F shows considerably improved adversarial robustness over conventional feedforward CNNs on standard benchmarks.

Journal ArticleDOI
TL;DR: This paper develops a novel high-dimensional non-coherent detection scheme for molecular signals using the Parzen window technique based probabilistic neural network (Parzen-PNN), given its ability to approximate the multivariate posterior densities by taking the previous detection results into a channel-independent Gaussian Parzen window, thereby avoiding complex channel estimation.
Abstract: In the emerging Internet of Nano-Things (IoNT), information will be embedded and conveyed in the form of molecules through complex and diffusive media. One main challenge lies in the long-tail nature of the channel response causing inter-symbol interference (ISI), which deteriorates the detection performance. If the channel is unknown, existing coherent schemes (e.g., the state-of-the-art maximum a posteriori (MAP) detector) have to pursue complex channel estimation and ISI mitigation techniques, which will result in either high computational complexity or poor estimation accuracy that will hinder the detection performance. In this paper, we develop a novel high-dimensional non-coherent detection scheme for molecular signals. We achieve this in a higher-dimensional metric space by combining different non-coherent metrics that exploit the transient features of the signals. By deducing the theoretical bit error rate (BER) for any constructed high-dimensional non-coherent metric, we prove that higher dimensionality always achieves a lower BER in the same sample space, at the expense of higher complexity in computing the multivariate posterior densities. The realization of this high-dimensional non-coherent scheme resorts to the Parzen window technique based probabilistic neural network (Parzen-PNN), given its ability to approximate the multivariate posterior densities by taking the previous detection results into a channel-independent Gaussian Parzen window, thereby avoiding complex channel estimation. The complexity of the posterior computation is shared by the parallel implementation of the Parzen-PNN. Numerical simulations demonstrate that our proposed scheme can gain 10 dB in SNR at a fixed BER of $10^{-4}$, in comparison with other state-of-the-art methods.
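The Parzen-window density estimate at the heart of a PNN is simple to sketch; the following Python sketch is generic (a toy classifier, not the paper's detector):

```python
import numpy as np

def parzen_log_density(x, samples, h):
    """Gaussian Parzen-window estimate of log p(x) from stored samples:
    an average of Gaussian kernels of bandwidth h centered on samples."""
    d = samples.shape[1]
    sq = np.sum((samples - x) ** 2, axis=1) / (2.0 * h ** 2)
    log_kernels = -sq - 0.5 * d * np.log(2.0 * np.pi * h ** 2)
    return np.logaddexp.reduce(log_kernels) - np.log(len(samples))

def pnn_classify(x, class_samples, h=0.5):
    """PNN-style decision: pick the class with the largest Parzen
    density at x (uniform class priors assumed)."""
    scores = [parzen_log_density(x, s, h) for s in class_samples]
    return int(np.argmax(scores))

# Toy usage with two 2-D classes.
rng = np.random.default_rng(0)
c0 = rng.standard_normal((100, 2))          # class 0 around the origin
c1 = rng.standard_normal((100, 2)) + 3.0    # class 1 shifted away
print(pnn_classify(np.array([2.8, 3.1]), [c0, c1]))   # -> 1
```

In the paper's setting, the stored samples are previous detection results mapped through channel-independent non-coherent metrics, which is what lets the detector approximate the posterior densities without estimating the channel.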

Journal ArticleDOI
TL;DR: This proposal seeks to estimate how gases released in the environment are distributed, from a set of sparse and uncertain gas-concentration and wind-flow measurements, such that, by exploiting the high correlation between these two magnitudes, their values can be extrapolated to unexplored areas.

Journal ArticleDOI
TL;DR: A variational approximation of the likelihood function is developed, which allows us to find variationally optimal approximations of the maximum-likelihood and maximum a posteriori estimates.
Abstract: In this article, we consider the identification of linear models from quantized output data. We develop a variational approximation of the likelihood function, which allows us to find variationally optimal approximations of the maximum-likelihood and maximum a posteriori estimates. We show that these estimates are obtained by projecting the midpoint in the quantization interval of each output measurement onto the column space of the input regression matrix. Interpreting the quantized output as a random variable, we derive its moments for generic noise distributions. For the case of Gaussian noise and Gaussian independent identically distributed input, we give an analytical characterization of the bias, which we use to build a bias-compensation scheme that leads to consistent estimates.
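The projection step the abstract describes, regressing the quantization-interval midpoints onto the column space of the input regression matrix, is easy to verify numerically; the following Python sketch is a toy illustration under the stated model (uniform quantizer, Gaussian noise), not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, step = 200, 3, 0.5
X = rng.standard_normal((n, d))                  # input regression matrix
theta_true = np.array([1.0, -0.5, 2.0])
y = X @ theta_true + 0.1 * rng.standard_normal(n)

yq = step * np.floor(y / step)                   # quantized outputs (bin edges)
mid = yq + step / 2.0                            # midpoints of the bins

# Per the abstract, the variationally optimal estimate projects the
# midpoints onto col(X), i.e., ordinary least squares on the midpoints.
theta_hat = np.linalg.lstsq(X, mid, rcond=None)[0]
print(theta_hat)                                 # close to theta_true
```

The article's bias characterization then quantifies how far such midpoint-based estimates drift from consistency and how to compensate for it.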

Journal ArticleDOI
TL;DR: In this paper, a Bayesian framework is adopted to infer the strength of the heat sources from thermochromic liquid crystal (TLC) temperature measurements, and the estimated heat source values are input to the forward model to determine the hot spot temperatures on the flat plate.