
Showing papers on "Monte Carlo method published in 2004"


Journal ArticleDOI
TL;DR: GENECLASS2 is a software package that computes various genetic assignment criteria to assign or exclude reference populations as the origin of diploid or haploid individuals, as well as of groups of individuals, on the basis of multilocus genotype data, including the specific task of first-generation migrant detection.
Abstract: GENECLASS2 is a software package that computes various genetic assignment criteria to assign or exclude reference populations as the origin of diploid or haploid individuals, as well as of groups of individuals, on the basis of multilocus genotype data. In addition to traditional assignment aims, the program allows the specific task of first-generation migrant detection. It includes several Monte Carlo resampling algorithms that compute, for each individual, its probability of belonging to each reference population or of being a resident (i.e., not a first-generation migrant) in the population where it was sampled. A user-friendly interface facilitates the treatment of large datasets.
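The assignment logic the abstract describes can be illustrated in a few lines. Below is a minimal sketch, not GENECLASS2 code: the population names, allele frequencies, genotype, and number of resampling draws are all invented, and Hardy-Weinberg proportions are assumed. It computes each population's multilocus genotype likelihood and a Monte Carlo exclusion p-value by simulating genotypes from that population's own frequencies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented allele frequencies: 2 reference populations, 3 loci, 2 alleles each.
freqs = {
    "pop_A": np.array([[0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]),
    "pop_B": np.array([[0.3, 0.7], [0.5, 0.5], [0.9, 0.1]]),
}

def log_lik(genotype, f):
    """Log-likelihood of a diploid multilocus genotype under Hardy-Weinberg."""
    ll = 0.0
    for locus, (a1, a2) in enumerate(genotype):
        p, q = f[locus, a1], f[locus, a2]
        ll += np.log(2.0 * p * q) if a1 != a2 else np.log(p * p)
    return ll

def simulate_genotype(f):
    """Draw one genotype from a population's allele frequencies."""
    return np.array([rng.choice(len(fl), size=2, p=fl) for fl in f])

individual = np.array([[0, 1], [0, 0], [1, 1]])      # observed genotype

for pop, f in freqs.items():
    obs = log_lik(individual, f)
    # Monte Carlo resampling: likelihoods of genotypes simulated from this
    # population form the null distribution; the observed value's rank gives
    # an exclusion p-value.
    null = np.array([log_lik(simulate_genotype(f), f) for _ in range(10_000)])
    print(f"{pop}: log L = {obs:.2f}, exclusion p-value = {np.mean(null <= obs):.3f}")
```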

2,406 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that with simple extensions of the shower algorithm in Monte Carlo programs, one can implement NLO corrections to the hardest emission that overcome the problems of negative weighted events found in previous implementations.
Abstract: I show that with simple extensions of the shower algorithms in Monte Carlo programs, one can implement NLO corrections to the hardest emission that overcome the problems of negative weighted events found in previous implementations. Simple variants of the same method can be used for an improved treatment of matrix element corrections in Shower Monte Carlo programs.

1,766 citations



01 Jan 2004
TL;DR: The Monte Carlo method is not compelling for one-dimensional integration, but for a $d$-dimensional integral evaluated with $M$ points its error remains $O(M^{-1/2})$ while a uniform-mesh rule scales as $M^{-2/d}$, so Monte Carlo wins for $d > 4$; the error is reduced further when the variance $\sigma_f^2$ of $f$ is lowered by importance sampling.
Abstract: ... so that the error in $I$ goes down as $1/\sqrt{M}$ and is smaller if the variance $\sigma_f^2$ of $f$ is smaller. For a one-dimensional integration the Monte Carlo method is not compelling. However, consider a $d$-dimensional integral evaluated with $M$ points. For a uniform mesh, each dimension of the integral gets $M^{1/d}$ points, so that the separation is $h = M^{-1/d}$. The error in the integration over one $h$-cube is of order $h^{d+2}$, since we are approximating the surface by a linear interpolation (a plane) with an $O(h^2)$ error. The total error in the integral is $M h^{d+2} = M^{-2/d}$. The error in the Monte Carlo method remains $M^{-1/2}$, so that this method wins for $d > 4$. We can reduce the error in $I$ by reducing the effective $\sigma_f$. This is done by concentrating the sampling where $f(x)$ is large, using a weight function $w(x)$ (i.e., $w(x) > 0$, $\int_0^1 w(x)\,dx = 1$): $I = \int_0^1 \frac{f(x)}{w(x)}\, w(x)\,dx$.
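A sketch of the variance-reduction step under assumed choices (integrand $f(x) = e^x$ and weight $w(x) = \tfrac{2}{3}(1+x)$, neither of which comes from the notes): the weighted estimator averages $f/w$ over samples drawn from $w$.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 100_000
f = np.exp                            # integrand on [0, 1]; exact integral e - 1

# Plain Monte Carlo: average f over uniform samples; error ~ sigma_f / sqrt(M).
plain = f(rng.random(M))

# Importance sampling with w(x) = (2/3)(1 + x), which roughly tracks the
# growth of e^x and satisfies w > 0 with unit integral on [0, 1].
# Inverse-CDF sampling: F(x) = (2/3)(x + x^2 / 2)  =>  x = sqrt(1 + 3u) - 1.
x = np.sqrt(1.0 + 3.0 * rng.random(M)) - 1.0
weighted = f(x) / ((2.0 / 3.0) * (1.0 + x))

for name, smp in (("plain", plain), ("importance", weighted)):
    print(f"{name:10s} I = {smp.mean():.5f} +/- {smp.std(ddof=1) / np.sqrt(M):.5f}")
print(f"exact      I = {np.e - 1.0:.5f}")
```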

1,642 citations


Journal ArticleDOI
TL;DR: Simulated data sets were used to test the power and accuracy of Monte Carlo resampling methods in generating statistical thresholds for identifying F0 immigrants in populations with ongoing gene flow, and hence for providing direct, real‐time estimates of migration rates.
Abstract: Genetic assignment methods use genotype likelihoods to draw inference about where individuals were or were not born, potentially allowing direct, real-time estimates of dispersal. We used simulated data sets to test the power and accuracy of Monte Carlo resampling methods in generating statistical thresholds for identifying F0 immigrants in populations with ongoing gene flow, and hence for providing direct, real-time estimates of migration rates. The identification of accurate critical values required that resampling methods preserved the linkage disequilibrium deriving from recent generations of immigrants and reflected the sampling variance present in the data set being analysed. A novel Monte Carlo resampling method taking into account these aspects was proposed and its efficiency was evaluated. Power and error were relatively insensitive to the frequency assumed for missing alleles. Power to identify F0 immigrants was improved by using large sample size (up to about 50 individuals) and by sampling all populations from which migrants may have originated. A combination of plotting genotype likelihoods and calculating mean genotype likelihood ratios (DLR) appeared to be an effective way to predict whether F0 immigrants could be identified for a particular pair of populations using a given set of markers.

1,481 citations



Journal ArticleDOI
TL;DR: A priori error estimates for the computation of the expected value of the solution are given and a comparison of the computational work required by each numerical approximation is included to suggest intuitive conditions for an optimal selection of the numerical approximation.
Abstract: We describe and analyze two numerical methods for a linear elliptic problem with stochastic coefficients and homogeneous Dirichlet boundary conditions. Here the aim of the computations is to approximate statistical moments of the solution, and, in particular, we give a priori error estimates for the computation of the expected value of the solution. The first method generates independent identically distributed approximations of the solution by sampling the coefficients of the equation and using a standard Galerkin finite element variational formulation. The Monte Carlo method then uses these approximations to compute corresponding sample averages. The second method is based on a finite dimensional approximation of the stochastic coefficients, turning the original stochastic problem into a deterministic parametric elliptic problem. A Galerkin finite element method, of either the h- or p-version, then approximates the corresponding deterministic solution, yielding approximations of the desired statistics. We present a priori error estimates and include a comparison of the computational work required by each numerical approximation to achieve a given accuracy. This comparison suggests intuitive conditions for an optimal selection of the numerical approximation.
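To illustrate the first (Monte Carlo) method, here is a sketch using a 1-D finite-difference solve as a stand-in for the paper's Galerkin finite element discretization; the lognormal, spatially constant coefficient is an assumed simplification of a random field.

```python
import numpy as np

rng = np.random.default_rng(2)
n, samples = 100, 2000
h = 1.0 / n

def solve(a):
    """Finite-difference solve of -(a(x) u')' = 1, u(0) = u(1) = 0, with the
    coefficient given at the n cell midpoints (stand-in for the FEM solve)."""
    A = np.zeros((n - 1, n - 1))
    for i in range(n - 1):
        A[i, i] = (a[i] + a[i + 1]) / h**2
        if i > 0:
            A[i, i - 1] = -a[i] / h**2
        if i < n - 2:
            A[i, i + 1] = -a[i + 1] / h**2
    return np.linalg.solve(A, np.ones(n - 1))      # u at interior nodes

# Monte Carlo over i.i.d. coefficient realizations (assumed lognormal and
# spatially constant per realization, a simplification of a random field).
sols = np.array([solve(np.exp(0.3 * rng.standard_normal()) * np.ones(n))
                 for _ in range(samples)])
mean_u = sols.mean(axis=0)                         # sample average ~ E[u]
se_u = sols.std(axis=0, ddof=1) / np.sqrt(samples)
print(f"E[u(0.5)] ~ {mean_u[n // 2 - 1]:.5f} +/- {se_u[n // 2 - 1]:.5f}")
```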

899 citations


Journal ArticleDOI
TL;DR: In this paper, the effective sample size (ESS) is used to modify the non-parametric Mann-Kendall (MK) statistical test so that it correctly assesses the significance of trend in serially correlated hydrological time series.
Abstract: The non-parametric Mann-Kendall (MK) statistical test has been popularly used to assess the significance of trend in hydrological time series. The test requires sample data to be serially independent. When sample data are serially correlated, the presence of serial correlation in time series will affect the ability of the test to correctly assess the significance of trend. To eliminate the effect of serial correlation on the MK test, effective sample size (ESS) has been proposed to modify the MK statistic. This study investigates the ability of ESS to eliminate the influence of serial correlation on the MK test by Monte Carlo simulation. Simulation demonstrates that when no trend exists within time series, ESS can effectively limit the effect of serial correlation on the MK test. When trend exists within time series, the existence of trend will contaminate the estimate of the magnitude of sample serial correlation, and ESS computed from the contaminated serial correlation cannot properly eliminate the effect of serial correlation on the MK test. However, if ESS is computed from the sample serial correlation that is estimated from the detrended series, ESS can still effectively reduce the influence of serial correlation on the MK test.
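A sketch of the modified test as described in the abstract, with assumed details: lag-1 serial correlation estimated from a linearly detrended series (the abstract's key recommendation) and an AR(1) decay assumed for higher-order lags in the ESS correction factor.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(3)

def mk_test_ess(x):
    """Mann-Kendall test with an effective-sample-size corrected variance."""
    n = len(x)
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0          # no-ties MK variance
    # Lag-1 correlation from the detrended series, then correction factor
    # n/n* = 1 + 2 * sum_k (1 - k/n) rho_k with rho_k = r1**k assumed (AR(1)).
    t = np.arange(n)
    resid = x - np.polyval(np.polyfit(t, x, 1), t)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    k = np.arange(1, n)
    var_s *= max(1.0 + 2.0 * np.sum((1.0 - k / n) * r1**k), 1e-6)
    z = (s - np.sign(s)) / sqrt(var_s)                # continuity correction
    return erfc(abs(z) / sqrt(2.0))                   # two-sided p-value

# Trend-free AR(1) series: the corrected test should not over-reject.
x = np.zeros(100)
for t in range(1, 100):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
print("p-value:", mk_test_ess(x))
```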

878 citations


Journal ArticleDOI
TL;DR: The results challenge the recently proposed notion that a set of six icosahedrally-arranged orientations is optimal for DT-MRI and show that at least 20 unique sampling orientations are necessary for a robust estimation of anisotropy, whereas at least 30 unique sampling orientations are required for a robust estimation of tensor-orientation and mean diffusivity.
Abstract: There are conflicting opinions in the literature as to whether it is more beneficial to use a large number of gradient sampling orientations in diffusion tensor MRI (DT-MRI) experiments than to use a smaller number of carefully chosen orientations. In this study, Monte Carlo simulations were used to study the effect of using different gradient sampling schemes on estimates of tensor-derived quantities assuming a b-value of 1000 s mm⁻². The study focused in particular on the effect that the number of unique gradient orientations has on uncertainty in estimates of tensor-orientation, and on estimates of the trace and anisotropy of the diffusion tensor. The results challenge the recently proposed notion that a set of six icosahedrally-arranged orientations is optimal for DT-MRI. It is shown that at least 20 unique sampling orientations are necessary for a robust estimation of anisotropy, whereas at least 30 unique sampling orientations are required for a robust estimation of tensor-orientation and mean diffusivity. Finally, the performance of sampling schemes that use low numbers of sampling orientations, but make efficient use of available gradient power, are compared to less efficient schemes with larger numbers of sampling orientations, and the relevant scenarios in which each type of scheme should be used are discussed. Magn Reson Med 51:807–815, 2004.
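The simulation loop can be sketched as follows; the prolate test tensor, SNR, random (rather than optimized) gradient schemes, and known $S_0$ are all assumptions for illustration, not the paper's protocol. The spread of the fractional anisotropy (FA) estimates should shrink as the number of orientations grows.

```python
import numpy as np

rng = np.random.default_rng(4)
b, snr, trials = 1000.0, 25.0, 500          # b in s/mm^2; SNR and trials assumed
D_true = np.diag([1.7e-3, 0.3e-3, 0.3e-3])  # prolate test tensor, mm^2/s

def fa(evals):
    """Fractional anisotropy from tensor eigenvalues."""
    md = evals.mean()
    return np.sqrt(1.5 * np.sum((evals - md) ** 2) / np.sum(evals**2))

for n_dirs in (6, 20, 30):
    g = rng.standard_normal((n_dirs, 3))
    g /= np.linalg.norm(g, axis=1, keepdims=True)     # random unit gradients
    # Design matrix for the six unique tensor elements.
    B = b * np.column_stack([g[:, 0]**2, g[:, 1]**2, g[:, 2]**2,
                             2 * g[:, 0] * g[:, 1],
                             2 * g[:, 0] * g[:, 2],
                             2 * g[:, 1] * g[:, 2]])
    clean = np.exp(-b * np.einsum("ij,jk,ik->i", g, D_true, g))  # S/S0, S0 = 1
    fas = []
    for _ in range(trials):
        s = np.abs(clean + rng.standard_normal(n_dirs) / snr)    # noisy signal
        d = np.linalg.lstsq(B, -np.log(s), rcond=None)[0]        # linear fit
        T = np.array([[d[0], d[3], d[4]],
                      [d[3], d[1], d[5]],
                      [d[4], d[5], d[2]]])
        fas.append(fa(np.linalg.eigvalsh(T)))
    print(f"{n_dirs:2d} dirs: FA = {np.mean(fas):.3f} +/- {np.std(fas):.3f}")
```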

824 citations


Journal ArticleDOI
TL;DR: In this paper, a discrete-time approximation for decoupled forward-backward stochastic differential equations is proposed, and the $L^p$ norm of the error is shown to be of the order of the time step.
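A hedged sketch of such a discrete-time backward scheme on a toy problem (driver $f(y) = -ry$ and terminal condition $g(x) = x^2$, chosen here so the answer is checkable in closed form); the conditional expectations are approximated by polynomial regression, a common Monte Carlo stand-in for the scheme's projections and not necessarily the paper's choice.

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_steps, T, sigma, r = 50_000, 50, 1.0, 1.0, 0.1
dt = T / n_steps

# Forward (decoupled) SDE: dX = sigma dW, X_0 = 0.
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
X = np.hstack([np.zeros((n_paths, 1)), np.cumsum(sigma * dW, axis=1)])

# Backward scheme for the BSDE with driver f(y) = -r*y and terminal
# condition g(x) = x^2, so the exact value is Y_0 = exp(-r*T) * sigma^2 * T.
Y = X[:, -1] ** 2
for i in range(n_steps - 1, 0, -1):
    target = Y - r * Y * dt                    # Y_{i+1} + f(Y_{i+1}) * dt
    coef = np.polyfit(X[:, i], target, 3)      # regression-based E[ . | X_i]
    Y = np.polyval(coef, X[:, i])
Y0 = np.mean(Y - r * Y * dt)                   # X_0 is deterministic: plain mean
print(f"Y_0 ~ {Y0:.4f}   exact {np.exp(-r * T) * sigma**2 * T:.4f}")
```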

615 citations


Journal ArticleDOI
TL;DR: In this article, methods for performing smoothing computations in general state-space models are developed, relying on a particle representation of the filtering distributions and their evolution through time using sequential importance sampling and resampling ideas.
Abstract: We develop methods for performing smoothing computations in general state-space models. The methods rely on a particle representation of the filtering distributions, and their evolution through time using sequential importance sampling and resampling ideas. In particular, novel techniques are presented for generation of sample realizations of historical state sequences. This is carried out in a forward-filtering backward-smoothing procedure that can be viewed as the nonlinear, non-Gaussian counterpart of standard Kalman filter-based simulation smoothers in the linear Gaussian case. Convergence in the mean squared error sense of the smoothed trajectories is proved, showing the validity of our proposed method. The methods are tested in a substantial application for the processing of speech signals represented by a time-varying autoregression and parameterized in terms of time-varying partial correlation coefficients, comparing the results of our algorithm with those from a simple smoother based on the filte...
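The forward-filtering backward-sampling idea can be sketched on a toy linear-Gaussian model (chosen only so the output is easy to sanity-check; the method's point is that it applies to nonlinear, non-Gaussian models). All model parameters below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
T_len, N = 50, 1000
a, q, r = 0.9, 1.0, 1.0        # transition coeff., process and obs. noise vars

# Simulate the toy model x_t = a x_{t-1} + w_t, y_t = x_t + v_t.
x = np.zeros(T_len)
for t in range(1, T_len):
    x[t] = a * x[t - 1] + rng.normal(0.0, np.sqrt(q))
y = x + rng.normal(0.0, np.sqrt(r), T_len)

# Forward pass: bootstrap particle filter, storing particles and weights.
parts = np.zeros((T_len, N))
W = np.zeros((T_len, N))
parts[0] = rng.normal(0.0, 1.0, N)
for t in range(T_len):
    if t > 0:
        idx = rng.choice(N, size=N, p=W[t - 1])          # resample ancestors
        parts[t] = a * parts[t - 1, idx] + rng.normal(0.0, np.sqrt(q), N)
    lw = -0.5 * (y[t] - parts[t]) ** 2 / r               # observation log-weights
    w = np.exp(lw - lw.max())
    W[t] = w / w.sum()

# Backward pass: draw one historical state sequence; filter weights are
# reweighted by the transition density to the already-sampled next state.
traj = np.zeros(T_len)
j = rng.choice(N, p=W[-1])
traj[-1] = parts[-1, j]
for t in range(T_len - 2, -1, -1):
    lw = np.log(W[t] + 1e-300) - 0.5 * (traj[t + 1] - a * parts[t]) ** 2 / q
    w = np.exp(lw - lw.max())
    j = rng.choice(N, p=w / w.sum())
    traj[t] = parts[t, j]
print("RMSE of smoothed draw vs. true states:", np.sqrt(np.mean((traj - x) ** 2)))
```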

Journal ArticleDOI
TL;DR: Using Monte Carlo simulations it is shown that estimation algorithms can come close to attaining the limit given by the derived expression, and explicit quantitative results are provided to show how the limit of the localization accuracy is reduced by factors such as pixelation of the detector and noise sources in the detection system.
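A minimal Monte Carlo check of the noise-free, pixelation-free part of such a limit: N photons from a Gaussian PSF of width s localize to about s/√N. The PSF width, photon count, and the use of a plain centroid estimator are assumptions of this sketch, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(16)

# N photons drawn from a Gaussian PSF of width s; the centroid (here also
# the maximum-likelihood estimator) should localize to about s / sqrt(N).
s, photons, trials = 1.3, 500, 2000
est = np.array([rng.normal(0.0, s, photons).mean() for _ in range(trials)])
print(f"MC std of estimates: {est.std():.4f}   s/sqrt(N): {s / np.sqrt(photons):.4f}")
```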

Journal ArticleDOI
TL;DR: A central limit theorem for the Monte Carlo estimates produced by these computational methods is established in this paper, and applies in a general framework which encompasses most of the sequential Monte Carlo methods that have been considered in the literature, including the resample-move algorithm of Gilks and Berzuini [J. R. Stat. Soc. Ser. B Stat. Methodol. 63 (2001) 127–146] and the residual resampling scheme.
Abstract: The term “sequential Monte Carlo methods” or, equivalently, “particle filters,” refers to a general class of iterative algorithms that performs Monte Carlo approximations of a given sequence of distributions of interest (πt). We establish in this paper a central limit theorem for the Monte Carlo estimates produced by these computational methods. This result holds under minimal assumptions on the distributions πt, and applies in a general framework which encompasses most of the sequential Monte Carlo methods that have been considered in the literature, including the resample-move algorithm of Gilks and Berzuini [J. R. Stat. Soc. Ser. B Stat. Methodol. 63 (2001) 127–146] and the residual resampling scheme. The corresponding asymptotic variances provide a convenient measurement of the precision of a given particle filter. We study, in particular, in some typical examples of Bayesian applications, whether and at which rate these asymptotic variances diverge in time, in order to assess the long term reliability of the considered algorithm.
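Among the schemes covered by the theorem, residual resampling is easy to state in code. A minimal sketch: each particle gets its deterministic share of copies, and only the fractional remainders are sampled, which lowers the resampling variance relative to pure multinomial resampling.

```python
import numpy as np

rng = np.random.default_rng(7)

def residual_resample(weights):
    """Residual resampling: copy particle i floor(N * w_i) times, then fill
    the remaining slots by multinomial sampling from the residual weights."""
    n = len(weights)
    counts = np.floor(n * weights).astype(int)       # deterministic copies
    residual = n * weights - counts
    n_left = n - counts.sum()
    if n_left > 0:
        extra = rng.choice(n, size=n_left, p=residual / residual.sum())
        counts += np.bincount(extra, minlength=n)
    return np.repeat(np.arange(n), counts)           # resampled ancestor indices

w = rng.dirichlet(np.ones(10))
print("weights:  ", np.round(w, 3))
print("ancestors:", residual_resample(w))
```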

Journal ArticleDOI
TL;DR: A Markov chain Monte Carlo algorithm for characterizing genetically divergent groups based on molecular markers and geographical sampling design of the dataset is modified to support multiple parallel MCMC chains, with enhanced features that enable considerably faster and more reliable estimation compared to the earlier version of the algorithm.
Abstract: Summary: Bayesian statistical methods based on simulation techniques have recently been shown to provide powerful tools for the analysis of genetic population structure. We have previously developed a Markov chain Monte Carlo (MCMC) algorithm for characterizing genetically divergent groups based on molecular markers and geographical sampling design of the dataset. However, for large-scale datasets such algorithms may get stuck in local maxima in the parameter space. Therefore, we have modified our earlier algorithm to support multiple parallel MCMC chains, with enhanced features that enable considerably faster and more reliable estimation compared to the earlier version of the algorithm. We consider also a hierarchical tree representation, from which a Bayesian model-averaged structure estimate can be extracted. The algorithm is implemented in a computer program that features a user-friendly interface and built-in graphics. The enhanced features are illustrated by analyses of simulated data and an extensive human molecular dataset. Availability: Freely available at http://www.rni.helsinki.fi/~jic/bapspage.html
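The stuck-chain problem and the benefit of parallel chains can be shown on a toy bimodal target (this is an illustration, not BAPS code; the target and proposal scale are assumed): disagreement between the per-chain means exposes multimodality a single chain can miss.

```python
import numpy as np

rng = np.random.default_rng(18)

# Bimodal target: equal mixture of N(-4, 1) and N(4, 1), log-density up to a constant.
log_post = lambda x: np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def run_chain(x0, steps=20_000, scale=1.0):
    """Plain random-walk Metropolis chain started at x0."""
    xs = np.empty(steps)
    x, lp = x0, log_post(x0)
    for t in range(steps):
        prop = x + rng.normal(0.0, scale)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        xs[t] = x
    return xs

# Parallel chains from over-dispersed starting points.
for start in (-6.0, -2.0, 2.0, 6.0):
    c = run_chain(start)
    print(f"start {start:+.0f}: mean {c.mean():+.2f}, fraction x > 0: {np.mean(c > 0):.2f}")
```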

Journal ArticleDOI
TL;DR: In this article, the authors examined the universality of interstellar turbulence from observed structure functions of 27 giant molecular clouds and Monte Carlo modeling, and quantified the degree of turbulence universality by Monte Carlo simulations that reproduce the mean squared velocity residuals of the observed cloud-to-cloud relationship.
Abstract: The universality of interstellar turbulence is examined from observed structure functions of 27 giant molecular clouds and Monte Carlo modeling. We show that the structure functions, $\delta v = v_0 \ell^{\gamma}$, derived from wide-field imaging of 12CO J=1-0 emission from individual clouds are described by a narrow range in the scaling exponent, $\gamma$, and the scaling coefficient, $v_0$. The similarity of turbulent structure functions emphasizes the universality of turbulence in the molecular interstellar medium and accounts for the cloud-to-cloud size/line width relationship initially identified by Larson. The degree of turbulence universality is quantified by Monte Carlo simulations that reproduce the mean squared velocity residuals of the observed cloud-to-cloud relationship. Upper limits to the variation of the scaling amplitudes and exponents for molecular clouds are ~10%-20%. The measured invariance of turbulence for molecular clouds with vastly different sizes, environments, and star formation activity suggests a common formation mechanism such as converging turbulent flows within the diffuse interstellar medium and a limited contribution of energy from sources within the cloud with respect to large-scale driving mechanisms.
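The structure-function measurement can be mimicked on synthetic data. In this sketch a Brownian trace stands in for a cloud's velocity field, since its first-order structure function has a known exponent of 0.5; nothing here uses the paper's data.

```python
import numpy as np

rng = np.random.default_rng(8)

# Brownian trace as a stand-in velocity field: <|v(x+l) - v(x)|> ~ l^0.5,
# so the fitted exponent gamma should come out near 0.5.
v = np.cumsum(rng.standard_normal(2**16))

lags = np.unique(np.logspace(0, 3, 20).astype(int))
s1 = np.array([np.mean(np.abs(v[l:] - v[:-l])) for l in lags])

# Least-squares fit of log S1(l) = log v0 + gamma * log l.
gamma, log_v0 = np.polyfit(np.log(lags), np.log(s1), 1)
print(f"fitted gamma = {gamma:.3f} (expected 0.5), v0 = {np.exp(log_v0):.3f}")
```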

Journal Article
TL;DR: This work presents an algorithm that computes the exact posterior probability of a subnetwork, e.g., a directed edge, and shows that also in domains with a large number of variables, exact computation is feasible, given suitable a priori restrictions on the structures.
Abstract: Learning a Bayesian network structure from data is a well-motivated but computationally hard task. We present an algorithm that computes the exact posterior probability of a subnetwork, e.g., a directed edge; a modified version of the algorithm finds one of the most probable network structures. This algorithm runs in time $O(n 2^n + n^{k+1} C(m))$, where n is the number of network variables, k is a constant maximum in-degree, and C(m) is the cost of computing a single local marginal conditional likelihood for m data instances. This is the first algorithm with less than super-exponential complexity with respect to n. Exact computation allows us to tackle complex cases where existing Monte Carlo methods and local search procedures potentially fail. We show that also in domains with a large number of variables, exact computation is feasible, given suitable a priori restrictions on the structures; combining exact and inexact methods is also possible. We demonstrate the applicability of the presented algorithm on four synthetic data sets with 17, 22, 37, and 100 variables.

Journal ArticleDOI
TL;DR: The proposed algorithm can handle virtually any type of process dynamics, factor structure, and payout specification, and gives valid confidence intervals for the true value of the Bermudan option price.
Abstract: This paper describes a practical algorithm based on Monte Carlo simulation for the pricing of multidimensional American (i.e., continuously exercisable) and Bermudan (i.e., discretely exercisable) ...
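The paper's primal-dual algorithm (which yields the confidence interval) is more involved; the sketch below shows only a regression-based lower bound for a Bermudan put in the Longstaff-Schwartz style, with all market parameters assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(9)

# Assumed toy market: Bermudan put on GBM, 10 exercise dates over one year.
s0, strike, r, sigma, T, n_ex, n_paths = 100.0, 100.0, 0.05, 0.2, 1.0, 10, 100_000
dt = T / n_ex
disc = np.exp(-r * dt)

z = rng.standard_normal((n_paths, n_ex))
S = s0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1))

payoff = lambda s: np.maximum(strike - s, 0.0)
cash = payoff(S[:, -1])                       # value if held to the final date
for t in range(n_ex - 2, -1, -1):
    cash *= disc                              # discount one period back
    itm = payoff(S[:, t]) > 0.0               # regress on in-the-money paths only
    cont = np.polyval(np.polyfit(S[itm, t], cash[itm], 2), S[:, t])
    ex = itm & (payoff(S[:, t]) > cont)       # exercise where intrinsic value wins
    cash[ex] = payoff(S[ex, t])
price = disc * cash.mean()                    # discount first date back to t = 0
se = disc * cash.std(ddof=1) / np.sqrt(n_paths)
print(f"Bermudan put lower bound: {price:.3f} +/- {se:.3f}")
```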

Journal ArticleDOI
TL;DR: A critical appraisal of reliability procedures for high dimensions is presented and it is observed that some types of Monte Carlo based simulation procedures in fact are capable of treating high dimensional problems.
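One reason simulation scales well: the standard error of a crude Monte Carlo failure-probability estimate does not depend on the dimension. A sketch with an assumed linear limit-state function whose exact failure probability is Φ(−β) in any dimension:

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(17)

# Failure when g(X) = beta - sum(X_i)/sqrt(d) < 0 for standard normal X;
# the exact failure probability is Phi(-beta) in ANY dimension d, so the
# MC estimate can be checked while the nominal dimension is large.
d, beta, n = 100, 3.0, 100_000
z = rng.standard_normal((n, d)).sum(axis=1) / np.sqrt(d)
pf = np.mean(beta - z < 0.0)
se = np.sqrt(pf * (1.0 - pf) / n)            # dimension-free standard error
print(f"MC pf = {pf:.2e} +/- {se:.1e}   exact = {0.5 * erfc(beta / sqrt(2)):.2e}")
```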

Journal ArticleDOI
TL;DR: In this paper, the authors investigated consumers' use of screening rules as part of a discrete-choice model, which accommodates conjunctive, disjunctive, and compensatory screening rules.
Abstract: Many theories of consumer behavior involve thresholds and discontinuities. In this paper, we investigate consumers' use of screening rules as part of a discrete-choice model. Alternatives that pass the screen are evaluated in a manner consistent with random utility theory; alternatives that do not pass the screen have a zero probability of being chosen. The proposed model accommodates conjunctive, disjunctive, and compensatory screening rules. We estimate a model that reflects a discontinuous decision process by employing the Bayesian technique of data augmentation and using Markov-chain Monte Carlo methods to integrate over the parameter space. The approach has minimal information requirements and can handle a large number of choice alternatives. The method is illustrated using a conjoint study of cameras. The results indicate that 92% of respondents screen alternatives on one or more attributes.
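The screening idea is easy to state in code. A sketch with invented camera attributes, an assumed conjunctive screen, and assumed part-worths (none of these come from the paper's conjoint study): alternatives failing the screen get choice probability exactly zero, while survivors compete under a random-utility (Gumbel-error) model.

```python
import numpy as np

rng = np.random.default_rng(10)

# Invented cameras with (price, zoom); the conjunctive screen is
# "price <= 300 AND zoom >= 3".
alts = np.array([[250.0, 4.0],   # passes screen
                 [350.0, 6.0],   # fails on price
                 [200.0, 2.0],   # fails on zoom
                 [280.0, 5.0]])  # passes screen
beta = np.array([-0.01, 0.5])    # assumed part-worths for price and zoom

def choice_probs(alts, beta, draws=100_000):
    """Monte Carlo choice shares: screened-out alternatives get probability
    exactly zero; survivors compete via max utility with Gumbel errors."""
    passes = (alts[:, 0] <= 300.0) & (alts[:, 1] >= 3.0)
    u = alts @ beta + rng.gumbel(size=(draws, len(alts)))
    u[:, ~passes] = -np.inf                 # screened alternatives never chosen
    return np.bincount(np.argmax(u, axis=1), minlength=len(alts)) / draws

print(choice_probs(alts, beta))
```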

Journal ArticleDOI
TL;DR: In this article, the authors consider forecasting using a combination, when no model coincides with a non-constant data generation process (DGP), and show that combining forecasts adds value, and can even dominate the best individual device.
Abstract: Summary: We consider forecasting using a combination, when no model coincides with a non-constant data generation process (DGP). Practical experience suggests that combining forecasts adds value, and can even dominate the best individual device. We show why this can occur when forecasting models are differentially mis-specified, and is likely to occur when the DGP is subject to location shifts. Moreover, averaging may then dominate over estimated weights in the combination. Finally, it cannot be proved that only non-encompassed devices should be retained in the combination. Empirical and Monte Carlo illustrations confirm the analysis.
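A Monte Carlo illustration (an assumed toy DGP, not the paper's designs): two differentially mis-specified forecasters straddle a location shift, and their equal-weight average can beat both.

```python
import numpy as np

rng = np.random.default_rng(11)
n, reps = 200, 2000
mse = np.zeros(3)          # device A, device B, equal-weight combination

for _ in range(reps):
    # DGP with a location shift halfway through the sample.
    y = rng.standard_normal(n) + np.where(np.arange(n) < n // 2, 0.0, 1.0)
    f_a = y[:-1].mean()    # device A: full-sample mean (slow to adapt to the shift)
    f_b = y[-6:-1].mean()  # device B: mean of the last 5 observations (noisy)
    f_c = 0.5 * (f_a + f_b)
    mse += (np.array([f_a, f_b, f_c]) - y[-1]) ** 2

print("MSFE  A: %.3f   B: %.3f   average: %.3f" % tuple(mse / reps))
```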

Journal ArticleDOI
TL;DR: The equation of state of a two-component Fermi gas with attractive short-range interspecies interactions using the fixed-node diffusion Monte Carlo method and results show a molecular regime with repulsive interactions well described by the dimer-dimer scattering length.
Abstract: We calculate the equation of state of a two-component Fermi gas with attractive short-range interspecies interactions using the fixed-node diffusion Monte Carlo method. The interaction strength is varied over a wide range by tuning the value $a$ of the $s$-wave scattering length of the two-body potential. For $a>0$ and $a$ smaller than the inverse Fermi wave vector our results show a molecular regime with repulsive interactions well described by the dimer-dimer scattering length ${a}_{m}=0.6a$. The pair correlation functions of parallel and opposite spins are also discussed as a function of the interaction strength.

Journal ArticleDOI
TL;DR: In this article, the authors propose a method to estimate and sum the relevant autocorrelation functions, which is argued to produce more reliable error estimates than binning techniques and hence to help make better use of expensive simulations.
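A sketch of the sum-the-autocorrelations idea using Sokal-style automatic windowing; the paper's Γ-method differs in detail (notably in its error analysis), so treat this as a generic illustration.

```python
import numpy as np

rng = np.random.default_rng(12)

def tau_int(x, c=6.0, max_lag=2000):
    """Integrated autocorrelation time by summing the normalized
    autocorrelation function up to an automatic window: the smallest W
    with W >= c * tau_int(W) (Sokal's recipe)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    var = x.var()
    tau = 0.5
    for w in range(1, max_lag):
        rho = np.dot(x[:-w], x[w:]) / ((len(x) - w) * var)
        tau += rho
        if w >= c * tau:
            return tau, w
    return tau, max_lag

# AR(1) chain with known tau_int = (1 + a) / (2 (1 - a)) = 9.5 for a = 0.9.
a = 0.9
x = np.zeros(100_000)
for t in range(1, len(x)):
    x[t] = a * x[t - 1] + rng.standard_normal()

tau, w = tau_int(x)
err = x.std() * np.sqrt(2.0 * tau / len(x))   # autocorrelation-aware error of the mean
print(f"tau_int = {tau:.2f} (exact 9.5), window = {w}, error of mean = {err:.4f}")
```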


Journal ArticleDOI
TL;DR: In this article, a higher-order solution for the mean and variance of hydraulic head for saturated flow in randomly heterogeneous porous media was obtained by the combination of Karhunen-Loeve decomposition, polynomial expansion, and perturbation methods.
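The Karhunen-Loeve ingredient can be sketched independently of the flow problem: eigendecompose an assumed exponential covariance on a grid and expand the random field in the leading modes (grid size, correlation length, and truncation level below are all assumptions).

```python
import numpy as np

rng = np.random.default_rng(13)

# Assumed exponential covariance C(s, t) = exp(-|s - t| / L) on a uniform grid
# of [0, 1]; its eigenpairs give the discrete Karhunen-Loeve expansion.
n, L, n_terms = 200, 0.3, 10
xg = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(xg[:, None] - xg[None, :]) / L)

evals, evecs = np.linalg.eigh(C)                 # ascending order
evals, evecs = evals[::-1], evecs[:, ::-1]       # sort descending

# One realization from the leading modes: sum_k sqrt(lambda_k) * xi_k * phi_k(x).
xi = rng.standard_normal(n_terms)
field = evecs[:, :n_terms] @ (np.sqrt(evals[:n_terms]) * xi)

print(f"{n_terms} KL terms capture {100.0 * evals[:n_terms].sum() / evals.sum():.1f}% "
      "of the total variance; field sample at x=0.5:", field[n // 2])
```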

Journal ArticleDOI
TL;DR: In this article, a series of first principles molecular dynamics and Monte Carlo simulations were carried out for liquid water to investigate the reproducibility of different sampling approaches, including Car−Parrinello and Born−Oppenheimer simulations.
Abstract: A series of first principles molecular dynamics and Monte Carlo simulations were carried out for liquid water to investigate the reproducibility of different sampling approaches. These simulations include Car−Parrinello molecular dynamics simulations using the program cpmd with different values of the fictitious electron mass in the microcanonical and canonical ensembles, Born−Oppenheimer molecular dynamics using the programs cpmd and cp2k in the microcanonical ensemble, and Metropolis Monte Carlo using cp2k in the canonical ensemble. With the exception of one simulation for 128 water molecules, all other simulations were carried out for systems consisting of 64 molecules. Although the simulations yield somewhat fortuitous agreement in structural properties, analysis of other properties demonstrates that one should exercise caution when assuming the reproducibility of Car−Parrinello and Born−Oppenheimer molecular dynamics simulations for small system sizes in the microcanonical ensemble. In contrast, the m...

Journal ArticleDOI
TL;DR: It is demonstrated that random-coil statistics are not a unique signature of featureless polymers, and a contrived counterexample in which largely native protein ensembles nevertheless exhibit random-coil characteristics is introduced.
Abstract: The Gaussian-distributed random coil has been the dominant model for denatured proteins since the 1950s, and it has long been interpreted to mean that proteins are featureless, statistical coils in 6 M guanidinium chloride. Here, we demonstrate that random-coil statistics are not a unique signature of featureless polymers. The random-coil model does predict the experimentally determined coil dimensions of denatured proteins successfully. Yet, other equally convincing experiments have shown that denatured proteins are biased toward specific conformations, in apparent conflict with the random-coil model. We seek to resolve this paradox by introducing a contrived counterexample in which largely native protein ensembles nevertheless exhibit random-coil characteristics. Specifically, proteins of known structure were used to generate disordered conformers by varying backbone torsion angles at random for ≈8% of the residues; the remaining ≈92% of the residues remained fixed in their native conformation. Ensembles of these disordered structures were generated for 33 proteins by using a torsion-angle Monte Carlo algorithm with hard-sphere sterics; bulk statistics were then calculated for each ensemble. Despite this extreme degree of imposed internal structure, these ensembles have end-to-end distances and mean radii of gyration that agree well with random-coil expectations in all but two cases.

Journal ArticleDOI
TL;DR: In this article, a review of generalized-ensemble algorithms for complex systems with many degrees of freedom, such as spin glass and biomolecular systems, is presented, along with five new generalized-ensemble algorithms that extend these methods.
Abstract: In complex systems with many degrees of freedom such as spin glass and biomolecular systems, conventional simulations in canonical ensemble suffer from the quasi-ergodicity problem. A simulation in generalized ensemble performs a random walk in potential energy space and overcomes this difficulty. From only one simulation run, one can obtain canonical ensemble averages of physical quantities as functions of temperature by the single-histogram and/or multiple-histogram reweighting techniques. In this article we review the generalized ensemble algorithms. Three well-known methods, namely, multicanonical algorithm (MUCA), simulated tempering (ST), and replica-exchange method (REM), are described first. Both Monte Carlo (MC) and molecular dynamics (MD) versions of the algorithms are given. We then present five new generalized-ensemble algorithms which are extensions of the above methods.
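Of the three methods reviewed, the replica-exchange method is the quickest to sketch. Below is a toy REM on a 1-D double well (potential, temperature ladder, and step sizes all assumed), showing how exchanges let the cold replica visit both wells and escape quasi-ergodicity.

```python
import numpy as np

rng = np.random.default_rng(14)

U = lambda x: 8.0 * (x**2 - 1.0) ** 2        # double well, barrier height 8
betas = np.array([8.0, 4.0, 2.0, 1.0, 0.5])  # inverse-temperature ladder
x = np.full(len(betas), -1.0)                # every replica starts in left well
cold = []

for sweep in range(20_000):
    # Metropolis update within each replica at its own temperature.
    prop = x + rng.normal(0.0, 0.3, len(x))
    accept = np.log(rng.random(len(x))) < -betas * (U(prop) - U(x))
    x = np.where(accept, prop, x)
    # Exchange attempt between a random neighboring pair of replicas:
    # accept with probability min(1, exp[(beta_i - beta_j)(U_i - U_j)]).
    i = rng.integers(len(betas) - 1)
    delta = (betas[i] - betas[i + 1]) * (U(x[i]) - U(x[i + 1]))
    if np.log(rng.random()) < delta:
        x[i], x[i + 1] = x[i + 1], x[i]
    cold.append(x[0])                        # track the coldest replica

cold = np.array(cold[5_000:])
# Near 0.5 if the cold replica visits both wells; a plain beta = 8 Metropolis
# chain started at x = -1 would essentially never cross the barrier.
print("fraction of cold samples in right well:", np.mean(cold > 0.0))
```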

Journal ArticleDOI
TL;DR: A new high-resolution reduced model, its force field, and its applications in structural proteomics are described, and it is shown that the new approach goes beyond the range of applicability of the traditional methods of protein comparative modeling.
Abstract: Protein modeling can be done at various levels of structural detail, from simplified lattice or continuous representations, through high-resolution reduced models employing the united-atom representation, to all-atom models of molecular mechanics. Here I describe a new high-resolution reduced model, its force field and its applications in structural proteomics. The model uses a lattice representation with 800 possible orientations of the virtual alpha carbon-alpha carbon bonds. The sampling scheme of the conformational space employs the Replica Exchange Monte Carlo method. Knowledge-based potentials of the force field include: generic protein-like conformational biases, statistical potentials for the short-range conformational propensities, a model of the main-chain hydrogen bonds and context-dependent statistical potentials describing the side-group interactions. The model is more accurate than previously designed lattice models, and in many applications it is complementary and competitive with respect to all-atom techniques. The test applications include: ab initio structure prediction, multitemplate comparative modeling and structure prediction based on sparse experimental data. In particular, the new approach to comparative modeling could be a valuable tool for structural proteomics. It is shown that the new approach goes beyond the range of applicability of the traditional methods of protein comparative modeling.


Proceedings ArticleDOI
01 Jan 2004
TL;DR: An efficient real-time algorithm that solves the data association problem and is capable of initiating and terminating a varying number of tracks, which shows remarkable performance compared to the greedy algorithm and the multiple hypothesis tracker under extreme conditions.
Abstract: In this paper, we consider the general multiple-target tracking problem in which an unknown number of targets appears and disappears at random times and the goal is to find the tracks of targets from noisy observations. We propose an efficient real-time algorithm that solves the data association problem and is capable of initiating and terminating a varying number of tracks. We take the data-oriented, combinatorial optimization approach to the data association problem but avoid the enumeration of tracks by applying a sampling method called Markov chain Monte Carlo (MCMC). The MCMC data association algorithm can be viewed as a "deferred logic" method since its decision about forming a track is based on both current and past observations. At the same time, it can be viewed as an approximation to the optimal Bayesian filter. The algorithm shows remarkable performance compared to the greedy algorithm and the multiple hypothesis tracker (MHT) under extreme conditions, such as a large number of targets in a dense environment, low detection probabilities, and high false alarm rates.
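A single-scan simplification of the idea (the paper's MCMCDA additionally handles multiple scans, track birth and death, and false alarms): Metropolis moves over one-to-one assignments between predicted targets and observations, under an assumed Gaussian measurement model.

```python
import numpy as np

rng = np.random.default_rng(15)
n, noise = 8, 0.7

# Predicted target positions and observations generated by an unknown
# permutation plus measurement noise.
pred = rng.uniform(0.0, 10.0, (n, 2))
true_perm = rng.permutation(n)
obs = pred[true_perm] + rng.normal(0.0, noise, (n, 2))

def log_lik(perm):
    """Log-likelihood of the hypothesis 'observation i comes from target perm[i]'."""
    return -0.5 * np.sum((obs - pred[perm]) ** 2) / noise**2

perm = np.arange(n)                       # initial assignment hypothesis
ll = log_lik(perm)
counts = np.zeros((n, n))                 # marginal association frequencies

for _ in range(20_000):
    i, j = rng.integers(n, size=2)        # propose swapping two matches
    prop = perm.copy()
    prop[i], prop[j] = prop[j], prop[i]
    ll_prop = log_lik(prop)
    if np.log(rng.random()) < ll_prop - ll:   # Metropolis accept/reject
        perm, ll = prop, ll_prop
    counts[np.arange(n), perm] += 1.0

map_match = counts.argmax(axis=1)         # most frequent target per observation
print(f"observations matched correctly: {np.mean(map_match == true_perm):.2f}")
```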