
Showing papers on "Markov chain published in 1992"


Journal ArticleDOI
TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action values are represented discretely.
Abstract: Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q-values can be changed each iteration, rather than just one.
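The theorem concerns the standard tabular update rule. A minimal sketch follows; the `env` interface (`reset`, `step`, `actions`) is an illustrative assumption, not from the paper, and the theorem additionally requires suitably decaying step sizes, whereas a constant `alpha` is used here for brevity:

```python
import random
from collections import defaultdict

def q_learning(env, n_episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning: Q(s,a) += alpha * (r + gamma * max_b Q(s',b) - Q(s,a))."""
    Q = defaultdict(float)  # discretely represented action-values
    for _ in range(n_episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration keeps every action repeatedly
            # sampled in every state -- the premise of the convergence theorem
            if random.random() < eps:
                a = random.choice(env.actions(s))
            else:
                a = max(env.actions(s), key=lambda b: Q[(s, b)])
            s2, r, done = env.step(s, a)
            target = r if done else r + gamma * max(Q[(s2, b)] for b in env.actions(s2))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q
```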

3,294 citations


Journal ArticleDOI
TL;DR: The case is made for basing all inference on one long run of the Markov chain and estimating the Monte Carlo error by standard nonparametric methods well-known in the time-series and operations research literature.
Abstract: Markov chain Monte Carlo using the Metropolis-Hastings algorithm is a general method for the simulation of stochastic processes having probability densities known up to a constant of proportionality. Despite recent advances in its theory, the practice has remained controversial. This article makes the case for basing all inference on one long run of the Markov chain and estimating the Monte Carlo error by standard nonparametric methods well-known in the time-series and operations research literature. In passing it touches on the Kipnis-Varadhan central limit theorem for reversible Markov chains, on some new variance estimators, on judging the relative efficiency of competing Monte Carlo schemes, on methods for constructing more rapidly mixing Markov chains and on diagnostics for Markov chain Monte Carlo.
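The advocated workflow, one long run plus a nonparametric error estimate, is easy to illustrate. A sketch (the target density and tuning constants are mine; batch means is one of the standard time-series estimators the article discusses):

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis(logf, x0, n, step=1.0):
    """Random-walk Metropolis for a density known only up to a constant (logf)."""
    xs, x, lf = np.empty(n), x0, logf(x0)
    for i in range(n):
        y = x + step * rng.normal()
        ly = logf(y)
        if np.log(rng.uniform()) < ly - lf:  # accept with prob min(1, f(y)/f(x))
            x, lf = y, ly
        xs[i] = x
    return xs

def batch_means_se(xs, n_batches=30):
    """Monte Carlo standard error of the mean from one long run (batch means)."""
    m = len(xs) // n_batches
    means = xs[: m * n_batches].reshape(n_batches, m).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(n_batches)

xs = metropolis(lambda x: -0.5 * x**2, x0=0.0, n=100_000)  # unnormalised N(0,1)
print(xs.mean(), "+/-", batch_means_se(xs))
```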

1,912 citations


Journal ArticleDOI
TL;DR: In this article, a stochastic differential formulation of recursive utility is given, with sufficient conditions for existence, uniqueness, time consistency, monotonicity, continuity, risk aversion, concavity, and other properties.
Abstract: A stochastic differential formulation of recursive utility is given, with sufficient conditions for existence, uniqueness, time consistency, monotonicity, continuity, risk aversion, concavity, and other properties. In the setting of Brownian information, recursive and intertemporal expected utility functions are observationally distinguishable. However, one cannot distinguish between a number of non-expected-utility theories of one-shot choice under uncertainty after they are suitably integrated into an intertemporal framework. In a "smooth" Markov setting, the stochastic differential utility model produces a generalization of the Hamilton-Jacobi-Bellman characterization of optimality. A companion paper explores the implications for asset prices. Copyright 1992 by The Econometric Society.
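In its simplest normalized form (notation mine, following the standard presentation of stochastic differential utility), the utility process is defined recursively from an aggregator f:

```latex
% Stochastic differential utility with aggregator f (normalized form):
U_t \;=\; \mathbb{E}_t\!\left[\int_t^T f\bigl(c_s,\,U_s\bigr)\,ds\right],
\qquad 0 \le t \le T.
% Intertemporal expected utility is the special case f(c,u) = u(c) - \beta u.
```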

1,040 citations


Journal ArticleDOI
TL;DR: In this paper, a Markov chain Monte Carlo method is used to approximate the whole likelihood function in autologistic models and other exponential family models for dependent data, and the parameter value (if any) maximizing this function approximates the MLE.
Abstract: Maximum likelihood estimates (MLEs) in autologistic models and other exponential family models for dependent data can be calculated with Markov chain Monte Carlo methods (the Metropolis algorithm or the Gibbs sampler), which simulate ergodic Markov chains having equilibrium distributions in the model. From one realization of such a Markov chain, a Monte Carlo approximant to the whole likelihood function can be constructed. The parameter value (if any) maximizing this function approximates the MLE.
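For a one-parameter exponential family p_θ(x) ∝ exp(θ t(x)) h(x), the construction reduces to an importance-sampling estimate of the normalising-constant ratio from a single run at a reference point ψ. A sketch (function and variable names are mine):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def mc_mle(t_obs, t_samples, psi):
    """Monte Carlo MLE sketch: t_obs is the observed sufficient statistic;
    t_samples are sufficient statistics of (correlated) draws from one
    Markov chain run at the reference parameter psi."""
    t = np.asarray(t_samples, dtype=float)

    def neg_log_lik(theta):
        # log c(theta)/c(psi) ~= log (1/n) sum_i exp((theta - psi) * t_i)
        a = (theta - psi) * t
        log_ratio = np.logaddexp.reduce(a) - np.log(len(a))
        return -((theta - psi) * t_obs - log_ratio)

    return minimize_scalar(neg_log_lik).x  # approximates the MLE (if it exists)
```

The approximation degrades as θ moves far from ψ, which is why the method is run from a reference point near the answer or iterated.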

869 citations


Journal ArticleDOI
TL;DR: A new upper bound on the mixing rate is presented, based on the solution to a multicommodity flow problem in the Markov chain viewed as a graph, and improved bounds are obtained for the runtimes of randomised approximation algorithms for various problems, including computing the permanent of a 0–1 matrix, counting matchings in graphs, and computing the partition function of a ferromagnetic Ising system.
Abstract: The paper is concerned with tools for the quantitative analysis of finite Markov chains whose states are combinatorial structures. Chains of this kind have algorithmic applications in many areas, including random sampling, approximate counting, statistical physics and combinatorial optimisation. The efficiency of the resulting algorithms depends crucially on the mixing rate of the chain, i.e., the time taken for it to reach its stationary or equilibrium distribution.The paper presents a new upper bound on the mixing rate, based on the solution to a multicommodity flow problem in the Markov chain viewed as a graph. The bound gives sharper estimates for the mixing rate of several important complex Markov chains. As a result, improved bounds are obtained for the runtimes of randomised approximation algorithms for various problems, including computing the permanent of a 0–1 matrix, counting matchings in graphs, and computing the partition function of a ferromagnetic Ising system. Moreover, solutions to the multicommodity flow problem are shown to capture the mixing rate quite closely: thus, under fairly general conditions, a Markov chain is rapidly mixing if and only if it supports a flow of low cost.
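In outline (my notation; consult the paper for the exact constants): given a flow f that routes π(x)π(y) units between every ordered pair of states along paths in the transition graph, the mixing time is controlled by the congestion ρ(f) of the most loaded edge and the length ℓ(f) of a longest flow-carrying path:

```latex
% Sketch of the multicommodity-flow bound:
Q(e) = \pi(u)\,P(u,v) \ \text{ for } e=(u,v), \qquad
\rho(f) = \max_{e}\ \frac{1}{Q(e)} \sum_{p\,\ni\,e} f(p),
\qquad
\tau_x(\epsilon) \;\le\; \rho(f)\,\ell(f)\,\bigl(\ln \pi(x)^{-1} + \ln \epsilon^{-1}\bigr).
```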

534 citations


Book
01 Mar 1992
TL;DR: In this paper, a representative work of Chinese probabilists on probability theory and its applications in physics is presented, including results on jump Markov processes as well as Markov interacting processes with noncompact state spaces.
Abstract: This volume presents a representative work of Chinese probabilists on probability theory and its applications in physics. Interesting results on jump Markov processes are discussed, as well as Markov interacting processes with noncompact state spaces, including the Schlögl model taken from statistical physics. The main body of this book is self-contained and can be used in a course in stochastic processes for graduate students. The book consists of four parts. In Parts 1 and 2, the author introduces the general theory for jump processes. New contributions to the classical problems of uniqueness, recurrence and positive recurrence are presented. Then probability metrics and coupling methods, stochastic monotonicity, reversibility, large deviations and estimates of the L²-spectral gap are discussed. Part 3 begins with the study of equilibrium particle systems. This part contains criteria for reversibility, the construction of Gibbs states and particle systems on lattice fractals. The final part emphasizes the reaction-diffusion processes which come from non-equilibrium statistical physics. Topics include constructions, existence of stationary distributions, ergodicity, phase transitions and hydrodynamic limits for the processes.

526 citations


Journal ArticleDOI
TL;DR: It is found that traffic periodicity can cause different sources with identical statistical characteristics to experience differing cell-loss rates, and a multistate Markov chain model that can be derived from three traffic parameters is sufficiently accurate for use in traffic studies.
Abstract: Source modeling and performance issues are studied using a long (30 min) sequence of real video teleconference data. It is found that traffic periodicity can cause different sources with identical statistical characteristics to experience differing cell-loss rates. For a single-stage multiplexer model, some of this source-periodicity effect can be mitigated by appropriate buffer scheduling and one effective scheduling policy is presented. For the sequence analyzed, the number of cells per frame follows a gamma (or negative binomial) distribution. The number of cells per frame is a stationary stochastic process. For traffic studies, neither an autoregressive model of order two nor a two-state Markov chain model is good because they do not model correctly the occurrence of frames with a large number of cells, which are a primary factor in determining cell-loss rates. The order two autoregressive model, however, fits the data well in a statistical sense. A multistate Markov chain model that can be derived from three traffic parameters is sufficiently accurate for use in traffic studies.
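One well-known way to build a multistate chain from exactly three traffic parameters, the mean and variance of cells per frame (via a fitted negative binomial) plus the lag-1 autocorrelation, is a DAR(1)-type construction. The sketch below is in that spirit and is not necessarily the authors' exact recipe:

```python
import numpy as np
from scipy.stats import nbinom

def dar1_transition_matrix(mean, var, rho, n_states=20):
    """Chain whose stationary law is a negative binomial fitted to (mean, var)
    and whose lag-1 autocorrelation is rho:  P = rho*I + (1 - rho) * 1 pi^T."""
    assert var > mean, "negative binomial fit needs overdispersed counts"
    p = mean / var                    # scipy parametrisation: mean = n(1-p)/p
    n = mean * p / (1 - p)
    pi = nbinom.pmf(np.arange(n_states), n, p)
    pi /= pi.sum()                    # truncate the support and renormalise
    return rho * np.eye(n_states) + (1 - rho) * np.tile(pi, (n_states, 1))
```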

469 citations


Journal ArticleDOI
TL;DR: In this paper, the consistency of a sequence of maximum likelihood estimators is proved and the conclusion of the Shannon-McMillan-Breiman theorem on entropy convergence is established for hidden Markov models.

455 citations


Journal ArticleDOI
TL;DR: A simple genetic algorithm is modeled as a Markov chain; the model is both complete and exact, and the asymptotics of the steady-state distributions as population size increases are also considered.
Abstract: We model a simple genetic algorithm as a Markov chain. Our method is both complete (selection, mutation, and crossover are incorporated into an explicitly given transition matrix) and exact; no special assumptions are made which restrict populations or population trajectories. We also consider the asymptotics of the steady state distributions as population size increases.
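The construction is concrete enough to reproduce at toy scale. A sketch of the exact transition matrix over populations (crossover is omitted for brevity, whereas the paper's matrix incorporates it, and the fitness function is an arbitrary placeholder):

```python
import itertools, math
import numpy as np

L, N, mu = 2, 2, 0.05                  # string length, population size, mutation rate
genos = list(itertools.product([0, 1], repeat=L))
fitness = lambda g: 1 + sum(g)         # placeholder fitness

def mutate_prob(g, h):
    """P(g becomes h) under independent per-bit flips."""
    d = sum(a != b for a, b in zip(g, h))
    return mu**d * (1 - mu)**(L - d)

# a state is a population = multiset of genotype indices, stored sorted
pops = sorted({tuple(sorted(p)) for p in itertools.product(range(len(genos)), repeat=N)})

def offspring_dist(pop):
    """Distribution of one child: proportional selection, then mutation."""
    w = np.array([fitness(genos[i]) for i in pop], float)
    w /= w.sum()
    q = np.zeros(len(genos))
    for wi, i in zip(w, pop):
        for j in range(len(genos)):
            q[j] += wi * mutate_prob(genos[i], genos[j])
    return q

P = np.zeros((len(pops), len(pops)))
for a, pop in enumerate(pops):         # next generation = N i.i.d. offspring,
    q = offspring_dist(pop)            # so each row is a multinomial law
    for b, nxt in enumerate(pops):
        counts = [nxt.count(j) for j in range(len(genos))]
        coef = math.factorial(N)
        for c in counts:
            coef //= math.factorial(c)
        P[a, b] = coef * np.prod([qj**c for qj, c in zip(q, counts)])

assert np.allclose(P.sum(axis=1), 1.0)  # rows of the exact matrix sum to 1
```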

447 citations


Journal ArticleDOI
TL;DR: A stochastic approach to the estimation of 2D motion vector fields from time-varying images is presented and the maximum a posteriori probability (MAP) estimation is incorporated into a hierarchical environment to deal efficiently with large displacements.
Abstract: A stochastic approach to the estimation of 2D motion vector fields from time-varying images is presented. The formulation involves the specification of a deterministic structural model along with stochastic observation and motion field models. Two motion models are proposed: a globally smooth model based on vector Markov random fields and a piecewise smooth model derived from coupled vector-binary Markov random fields. Two estimation criteria are studied. In the maximum a posteriori probability (MAP) estimation, the a posteriori probability of motion given data is maximized, whereas in the minimum expected cost (MEC) estimation, the expectation of a certain cost function is minimized. Both algorithms generate sample fields by means of stochastic relaxation implemented via the Gibbs sampler. Two versions are developed: one for a discrete state space and the other for a continuous state space. The MAP estimation is incorporated into a hierarchical environment to deal efficiently with large displacements.
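The estimation machinery is easiest to see on a scalar analogue. The sketch below does MAP estimation of a discrete label field by Gibbs sampling with a falling temperature (stochastic relaxation with annealing); it is a much-simplified stand-in, not the authors' coupled vector-binary motion estimator:

```python
import numpy as np

rng = np.random.default_rng(1)

def gibbs_map(data, labels, beta=1.0, sweeps=50, T0=4.0):
    """Anneal a Gibbs sampler toward the MAP labelling of a 2-D field.
    Energy: (data - label)^2 + beta * (# disagreeing 4-neighbours)."""
    H, W = data.shape
    x = rng.choice(labels, size=(H, W))
    for s in range(sweeps):
        T = T0 / np.log(2 + s)                     # slow cooling schedule
        for i in range(H):
            for j in range(W):
                nbrs = [x[i2, j2] for i2, j2 in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                        if 0 <= i2 < H and 0 <= j2 < W]
                E = np.array([(data[i, j] - l)**2 + beta * sum(l != v for v in nbrs)
                              for l in labels])
                p = np.exp(-(E - E.min()) / T)     # Gibbs conditional at temperature T
                x[i, j] = labels[rng.choice(len(labels), p=p / p.sum())]
    return x
```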

345 citations


Journal ArticleDOI
TL;DR: The efficacy of the mean field theory approach is demonstrated on parameter estimation for one-dimensional mixture data and on two-dimensional unsupervised stochastic model-based image segmentation, yielding good parameter estimates and segmentations for both synthetic and real-world images.
Abstract: In many signal processing and pattern recognition applications, the hidden data are modeled as Markov processes, and the main difficulty of using the expectation-maximisation (EM) algorithm for these applications is the calculation of the conditional expectations of the hidden Markov processes. It is shown how the mean field theory from statistical mechanics can be used to calculate the conditional expectations for these problems efficiently. The efficacy of the mean field theory approach is demonstrated on parameter estimation for one-dimensional mixture data and two-dimensional unsupervised stochastic model-based image segmentation. Experimental results indicate that in the 1-D case, the mean field theory approach provides results comparable to those obtained by Baum's (1987) algorithm, which is known to be optimal. In the 2-D case, where Baum's algorithm can no longer be used, the mean field theory provides good parameter estimates and image segmentation for both synthetic and real-world images.
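The core trick is replacing an intractable conditional expectation with a deterministic fixed-point iteration. A minimal sketch for a ±1 hidden field with external field h (periodic boundaries and all constants are illustrative assumptions):

```python
import numpy as np

def mean_field(h, beta=0.7, iters=50):
    """Mean-field approximation for a +/-1 Markov random field: each site sees
    its neighbours only through their current means, giving the fixed point
        m_i = tanh(beta * sum_{j ~ i} m_j + h_i)."""
    m = np.tanh(h)                           # initialise from the data term
    for _ in range(iters):
        s = (np.roll(m, 1, 0) + np.roll(m, -1, 0) +
             np.roll(m, 1, 1) + np.roll(m, -1, 1))   # 4-neighbour sums (periodic)
        m = np.tanh(beta * s + h)
    return m                                 # approximate E[X_i | data]
```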

Journal ArticleDOI
TL;DR: In this paper, the authors connect various topological and probabilistic forms of stability for discrete-time Markov chains, including tightness and Harris recurrence, and show that these concepts are largely equivalent for a major class of chains (chains with continuous components), or if the state space has a sufficiently rich class of appropriate sets.
Abstract: In this paper we connect various topological and probabilistic forms of stability for discrete-time Markov chains. These include tightness on the one hand and Harris recurrence and ergodicity on the other. We show that these concepts of stability are largely equivalent for a major class of chains (chains with continuous components), or if the state space has a sufficiently rich class of appropriate sets ('petite sets'). We use a discrete formulation of Dynkin's formula to establish unified criteria for these stability concepts, through bounding of moments of first entrance times to petite sets. This gives a generalization of Lyapunov-Foster criteria for the various stability conditions to hold. Under these criteria, ergodic theorems are shown to be valid even in the non-irreducible case. These results allow a more general test function approach for determining rates of convergence of the underlying distributions of a Markov chain, and provide strong mixing results and new versions of the central limit theorem and the law of the iterated logarithm.
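The unified criteria referred to are drift (Foster-Lyapunov) conditions; schematically, for a test function V ≥ 0, a petite set C, a constant b < ∞ and a function f ≥ 1 (the choice of f distinguishes the stability concepts; exact hypotheses vary):

```latex
\Delta V(x) \;:=\; \mathbb{E}\bigl[V(X_{k+1}) \mid X_k = x\bigr] - V(x)
\;\le\; -f(x) + b\,\mathbf{1}_C(x) \qquad \text{for all } x.
```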

Journal ArticleDOI
TL;DR: The elementary MOESP algorithm presented in the first part of this series of papers is analysed and the asymptotic properties of the estimated state-space model when only considering zero-mean white noise perturbations on the output sequence are studied.
Abstract: The elementary MOESP algorithm presented in the first part of this series of papers is analysed in this paper. This is done in three different ways. First, we study the asymptotic properties of the estimated state-space model when only considering zero-mean white noise perturbations on the output sequence. It is shown that, in this case, the MOESP1 implementation yields asymptotically unbiased estimates. An important constraint to this result is that the underlying system must have a finite impulse response and subsequently the size of the Hankel matrices, constructed from the input and output data at the beginning of the computations, depends on the number of non-zero Markov parameters. This analysis, however, leads to a second implementation of the elementary MOESP scheme, namely MOESP2. The latter implementation has the same asymptotic properties without the finite impulse response constraint. Secondly, we compare the MOESP2 algorithm with a classical state space model identification scheme. The latter...

01 Jan 1992
TL;DR: In this article, the authors present performance models for automated manufacturing systems.
Abstract: (1993). Performance Modeling of Automated Manufacturing Systems. Technometrics: Vol. 35, No. 4, pp. 456-456.

Journal ArticleDOI
TL;DR: This paper concerns the use and implementation of maximum-penalized-likelihood procedures for choosing the number of mixing components and estimating the parameters in independent and Markov-dependent mixture models.
Abstract: This paper concerns the use and implementation of maximum-penalized-likelihood procedures for choosing the number of mixing components and estimating the parameters in independent and Markov-dependent mixture models. Computation of the estimates is achieved via algorithms for the automatic generation of starting values for the EM algorithm. Computation of the information matrix is also discussed. Poisson mixture models are applied to a sequence of counts of movements by a fetal lamb in utero obtained by ultrasound. The resulting estimates are seen to provide plausible mechanisms for the physiological process. The analysis of count data that are overdispersed relative to the Poisson distribution (i.e., variance > mean) has received considerable recent attention. Such data might arise in a clinical study in which overdispersion is caused by unexplained or random subject effects. Alternatively, we might observe a time series of counts in which temporal patterns in the data suggest that a Poisson model and its implied randomness are inappropriate. This paper is motivated by analysis of a time series of overdispersed count data generated in a study of central nervous system development in fetal lambs. Our data set consists of observed movement counts in 240 consecutive 5-second intervals obtained from a single animal. In analysing these data, we focus on the use of Poisson mixture models assuming independent observations and also Markov-dependent mixture models (or hidden Markov models). These models assume that the counts follow independent Poisson distributions conditional on the rates, which are generated from a mixing distribution either independently or with Markov dependence. We believe finite mixture models are particularly attractive because they provide plausible explanations for variation in the data. This paper will emphasize the following issues concerning estimation, inference, and application of mixture models: (i) choosing the number of model components; (ii) applying the EM algorithm to obtain parameter estimates; (iii) generating sufficiently many starting values to identify a global maximum of the likelihood; (iv) avoiding numerical instability
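Items (ii) and (iii), EM restarted from many starting values, look roughly as follows for an independent k-component Poisson mixture (a sketch; the paper additionally treats penalised likelihoods and Markov-dependent mixtures):

```python
import numpy as np
from scipy.special import gammaln, logsumexp

rng = np.random.default_rng(2)

def em_poisson_mixture(y, k=2, n_starts=20, iters=200):
    """EM for a k-component Poisson mixture, restarted from random starting
    values to seek a global maximum of the likelihood."""
    y = np.asarray(y, float)
    best_ll, best = -np.inf, None
    for _ in range(n_starts):
        lam = rng.uniform(0.5, 1.5, k) * (y.mean() + 0.1)  # random start
        w = np.full(k, 1.0 / k)
        for _ in range(iters):
            # E-step: log responsibilities  log w_j + y log lam_j - lam_j - log y!
            logp = np.log(w) + y[:, None] * np.log(lam) - lam - gammaln(y + 1)[:, None]
            ll = logsumexp(logp, axis=1)
            r = np.exp(logp - ll[:, None])
            # M-step: mixing proportions and weighted means
            w = r.mean(axis=0)
            lam = (r * y[:, None]).sum(axis=0) / r.sum(axis=0)
        if ll.sum() > best_ll:                 # log-likelihood from the last E-step
            best_ll, best = ll.sum(), (w, lam)
    return best_ll, best
```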

Journal ArticleDOI
TL;DR: In this paper, a stochastic search method is proposed for finding a global solution to a discrete optimization problem in which the objective function must be estimated by Monte Carlo simulation; the generated sequence of solution estimates forms a nonstationary Markov chain that is shown, under mild conditions, to be strongly ergodic.
Abstract: In this paper a stochastic search method is proposed for finding a global solution to the stochastic discrete optimization problem in which the objective function must be estimated by Monte Carlo simulation. Although there are many practical problems of this type in the fields of manufacturing engineering, operations research, and management science, there have not been any nonheuristic methods proposed for such discrete problems with stochastic infrastructure. The proposed method is very simple, yet it finds a global optimum solution. The method exploits the randomness of Monte Carlo simulation and generates a sequence of solution estimates. This generated sequence turns out to be a nonstationary Markov chain, and it is shown under mild conditions that the Markov chain is strongly ergodic and that the probability that the current solution estimate is global optimum converges to one. Furthermore, the speed of convergence is also analyzed.
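A caricature of such a method (not the paper's exact algorithm): move between neighbouring solutions by comparing fresh Monte Carlo estimates whose sample sizes grow over time, so that wrong moves become increasingly unlikely. `neighbors` and `simulate` are user-supplied stand-ins:

```python
import random

def stochastic_search(neighbors, simulate, x0, n_iters=1000):
    """Random search over a discrete set whose objective can only be estimated
    by simulation. The iterates form a nonstationary Markov chain; growing
    sample sizes make incorrect comparisons increasingly rare (sketch only)."""
    x = x0
    for k in range(1, n_iters + 1):
        y = random.choice(neighbors(x))
        n = 10 + k // 10                       # sample size grows with iteration
        fx = sum(simulate(x) for _ in range(n)) / n
        fy = sum(simulate(y) for _ in range(n)) / n
        if fy < fx:                            # minimise the estimated objective
            x = y
    return x
```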

Journal ArticleDOI
TL;DR: A new class of optimization heuristics is considered which combines local searches with stochastic sampling methods, allowing one to iterate local optimization heuristics; this improves 3-opt by over 1.6% and Lin-Kernighan by 1.3%.

Journal ArticleDOI
TL;DR: Special forms of the general unsupervised segmentation algorithm are developed for the segmentation of noisy and textured images, which yield good segmentations, accurate estimates for the parameters, and the correct number of regions.

Journal ArticleDOI
TL;DR: In this paper, the authors consider a class of dynamic models in which both the conditional mean and the conditional variance are endogenous stepwise functions, and derive statistical properties of these models: pseudo-maximum-likelihood estimators, conditional homoscedasticity tests, tests of weak or strong white noise, a CAPM test, factor determination, and ARCH-M effects.

Book
01 Oct 1992
TL;DR: Analysis of time-inhomogeneity, tests for exponentiality and of sequential dependency properties, and examples of analyses based on continuous-time Markov chain modelling are presented.
Abstract: Introduction; Preliminary inspection of the observations; Analysis of time-inhomogeneity; Tests for exponentiality; Tests of sequential dependency properties; Simultaneous tests; Analysis based on a (semi-)Markov description; Examples of analyses based on continuous-time Markov chain modelling; Appendices; References; Author index; Subject index.

Journal ArticleDOI
TL;DR: An analog model describing signal amplitude and phase variations on shadowed satellite mobile channels is proposed, with an M-state Markov chain applied to represent environment parameter variations; results show close agreement with measurements.
Abstract: An analog model describing signal amplitude and phase variations on shadowed satellite mobile channels is proposed. A linear combination of log-normal, Rayleigh, and Rice models is used to describe signal variations over an area with constant environment attributes while an M-state Markov chain is applied to represent environment parameter variations. Channel parameters are evaluated from the experimental data and utilized to verify a simulation model. Results, presented in the form of signal waveforms, probability density functions, fade durations, and average bit and block error rates, show close agreement with measurements.
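Structurally, the model alternates environment states, each holding fading statistics constant. A toy simulation in that shape (the per-state Rayleigh scales stand in for the paper's log-normal/Rayleigh/Rice combination; all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_channel(P, sigmas, n, samples_per_state=100):
    """Amplitude samples from an M-state Markov channel model: the chain moves
    between environments; each visit emits fading samples for that state."""
    M = len(sigmas)
    state, out = 0, []
    for _ in range(n):
        state = rng.choice(M, p=P[state])          # environment transition
        out.append(rng.rayleigh(sigmas[state], samples_per_state))
    return np.concatenate(out)

P = np.array([[0.95, 0.05], [0.10, 0.90]])   # e.g. two states: clear vs shadowed
amps = simulate_channel(P, sigmas=[1.0, 0.3], n=500)
```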

Journal ArticleDOI
TL;DR: In this paper, it is shown that a version of Maurey's extension theorem holds for Lipschitz maps between metric spaces satisfying certain geometric conditions, analogous to type and cotype.
Abstract: It is shown that a version of Maurey's extension theorem holds for Lipschitz maps between metric spaces satisfying certain geometric conditions, analogous to type and cotype. As a consequence, a classical theorem of Kirszbraun can be generalised to include maps into $L_p$, 1...
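The geometric conditions analogous to type are themselves formulated through Markov chains: a metric space X has Markov type 2 (sketch of the definition introduced in this line of work; notation mine) if there is a constant K such that for every stationary reversible finite-state chain (Z_t) and every map f into X,

```latex
\mathbb{E}\, d\bigl(f(Z_t), f(Z_0)\bigr)^2 \;\le\; K^2\, t\,
\mathbb{E}\, d\bigl(f(Z_1), f(Z_0)\bigr)^2 \qquad \text{for all } t \ge 1.
```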

Journal ArticleDOI
TL;DR: An introduction to Markov reward models including solution techniques and application examples is presented, and a brief discussion of how task completion time models and models of queues with breakdowns and repairs relate to Markov reward models is given.
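A Markov reward model attaches a reward rate ρ_i to each state i of a continuous-time Markov chain X; the accumulated reward and its expectation (standard definitions, not specific to this survey) are

```latex
Y(t) = \int_0^t \rho_{X(s)}\, ds, \qquad
\mathbb{E}\,[Y(t)] = \sum_i \rho_i \int_0^t \Pr\{X(s) = i\}\, ds .
```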

Journal ArticleDOI
TL;DR: In this paper, Markovian transition matrices (MTM) are first found for the overall bridge condition, and transition matrices are then developed for the condition ratings of individual bridge components (e.g., superstructures, decks, and piers).
Abstract: This paper describes methods for determining and utilizing Markov chains in the evaluation of highway bridge deterioration. Using a data base of 850 bridges in New York State, Markovian transition matrices (MTM) are first found for the overall bridge condition. Then, transition matrices are developed for the condition rating of individual bridge components (e.g., superstructures, decks, and piers). In each case, chains are determined for various types of construction. Also discussed is the modeling of correlated elements such as the primary structure and joint condition and the ability to determine the correlation for a set of data. The consequence of small data bases is discussed, and an explanation is offered for unexpected values of the transition probabilities. Finally examined is the use of Markovian analysis for predicting the evolution of the average condition rating of a set of bridges, and expected value of condition rating for a single bridge. Markov transition matrices are introduced to model t...
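Prediction then reduces to powering the transition matrix: the condition distribution evolves as p_{t+1} = p_t M, and the expected rating is its dot product with the rating values. A sketch with invented numbers (not the New York State estimates):

```python
import numpy as np

# 4 condition states, rating 4 = best; an upper-triangular deterioration chain
M = np.array([[0.90, 0.10, 0.00, 0.00],
              [0.00, 0.85, 0.15, 0.00],
              [0.00, 0.00, 0.80, 0.20],
              [0.00, 0.00, 0.00, 1.00]])
ratings = np.array([4, 3, 2, 1])

p = np.array([1.0, 0.0, 0.0, 0.0])          # a new bridge starts in the top state
for year in (0, 10, 20, 30):
    print(year, "expected rating:", p @ ratings)
    p = p @ np.linalg.matrix_power(M, 10)   # advance the distribution 10 years
```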

Journal ArticleDOI
TL;DR: The identifiability problem is completely solved by linear algebra, where a block structure of a Markov transition matrix plays a fundamental role, and from which the minimum degree of freedom for a source is revealed.
Abstract: If only a function of the state in a finite-state Markov chain is observed, then the stochastic process is no longer Markovian in general. This type of information source is found widely and the basic problem of its identifiability remains open, that is, the problem of showing when two different Markov chains generate the same stochastic process. The identifiability problem is completely solved by linear algebra, where a block structure of a Markov transition matrix plays a fundamental role, and from which the minimum degree of freedom for a source is revealed.
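A classical special case of this block-structure viewpoint is the Kemeny-Snell strong-lumpability test: observing only the block of the state preserves the Markov property iff every state in a block sends the same total mass into each block. A sketch (the paper's identifiability analysis is finer, but this is the flavour):

```python
import numpy as np

def is_strongly_lumpable(P, blocks):
    """Kemeny-Snell criterion: for every pair of blocks (A, B), the mass
    sum_{j in B} P[i, j] must be the same for all states i in A."""
    for B in blocks:
        col = P[:, B].sum(axis=1)           # mass each state sends into block B
        for A in blocks:
            if not np.allclose(col[A], col[A][0]):
                return False
    return True

P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
print(is_strongly_lumpable(P, [[0], [1, 2]]))   # True for this symmetric chain
```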

Journal ArticleDOI
TL;DR: The explicit limit distribution of the maximal segmental sum is derived for partial sums of bounded i.i.d. variables associated with the transitions of an r-state irreducible aperiodic Markov chain, in the regime of negative stationary mean.
Abstract: Let $s_1, \ldots, s_n$ be generated governed by an $r$-state irreducible aperiodic Markov chain. The partial sum process $S_m = \sum_{i=0}^{m-1} X_{s_i s_{i+1}}$, $m = 1, 2, \ldots$, is determined by a realization $\{s_i\}_{i=0}^{n}$ of states with $s_0 = \alpha_0$ and the real-valued i.i.d. bounded variables $X_{\alpha\beta}$ associated with the transitions $s_i = \alpha$, $s_{i+1} = \beta$. Assume $X_{\alpha\beta}$ has negative stationary mean. The explicit limit distribution of the maximal segmental sum...
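The quantity under study is computable in one pass. A sketch that simulates transition-indexed increments with negative stationary drift and evaluates the maximal segmental sum $\max_{0 \le k < l \le n}(S_l - S_k)$ (chain and increment values are invented):

```python
import numpy as np

rng = np.random.default_rng(4)

def max_segmental_sum(increments):
    """max over 0 <= k < l <= n of S_l - S_k, in one pass (Kadane's algorithm)."""
    best = cur = float("-inf")
    for x in increments:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best

# toy 2-state chain; increment for transition (a, b) is X[a, b] plus bounded
# noise, chosen so the stationary mean drift is negative (the paper's regime)
P = np.array([[0.7, 0.3], [0.4, 0.6]])
X = np.array([[-1.0, 0.5], [0.8, -1.2]])
s, incs = 0, []
for _ in range(10_000):
    s2 = rng.choice(2, p=P[s])
    incs.append(X[s, s2] + 0.1 * rng.uniform(-1, 1))
    s = s2
print(max_segmental_sum(incs))
```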

Journal ArticleDOI
TL;DR: Positive and negative results indicate that the tail of the distribution of the cycle length τ plays a critical role.
Abstract: Let X = {X(t)}_{t ≥ 0} be a stochastic process with a stationary version X*. It is investigated when it is possible to generate by simulation a version X̃ of X with lower initial bias than X itself, in the sense that either X̃ is strictly stationary (has the same distribution as X*) or the distribution of X̃ is close to the distribution of X*. Particular attention is given to regenerative processes and Markov processes with a finite, countable, or general state space. The results are both positive and negative, and indicate that the tail of the distribution of the cycle length τ plays a critical role. The negative results essentially state that without some information on this tail, no a priori computable bias reduction is possible; in particular, this is the case for the class of all Markov processes with a countably infinite state space. On the contrary, the positive results give algorithms for simulating X̃ for various classes of processes with some special structure on τ. In particular, one can generate X̃ as strictly stationary for finite state Markov chains, Markov chains satisfying a Doeblin-type minorization, and regenerative processes with the cycle length τ bounded or having a stationary age distribution that can be generated by simulation.
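The easy positive case mentioned, finite-state chains, can be made concrete: solve πP = π, start the chain from π, and the simulated path is strictly stationary, with no initial bias to remove. A sketch:

```python
import numpy as np

rng = np.random.default_rng(5)

def stationary_version(P, n):
    """Strictly stationary path of a finite-state chain: draw X(0) from the
    stationary distribution pi (pi P = pi, sum pi = 1), then iterate."""
    M = len(P)
    A = np.vstack([P.T - np.eye(M), np.ones(M)])
    b = np.zeros(M + 1); b[-1] = 1.0
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    pi = np.clip(pi, 0, None); pi /= pi.sum()      # guard against round-off
    x = rng.choice(M, p=pi)                        # stationary start: no bias
    path = [x]
    for _ in range(n - 1):
        x = rng.choice(M, p=P[x])
        path.append(x)
    return np.array(path)
```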

Journal ArticleDOI
TL;DR: Analysis of the structure of some small complete genomes and a human genome segment using a hidden Markov chain model finds a variety of discrete compositional domains, and their correlations with genome function are explored.

Journal ArticleDOI
TL;DR: In this article, the authors present a course on random walks and Brownian motion, discrete-parameter Markov chains, continuous-parameter Markov chains, Brownian motion and diffusions, and dynamic programming and stochastic optimization.
Abstract: Preface to the Classics Edition; Preface; Sample course outline; 1. Random walk and Brownian motion; 2. Discrete-parameter Markov chains; 3. Birth-death Markov chains; 4. Continuous-parameter Markov chains; 5. Brownian motion and diffusions; 6. Dynamic programming and stochastic optimization; 7. An introduction to stochastic differential equations; 8. A probability and measure theory overview; Author index; Subject index; Errata.

Journal ArticleDOI
01 May 1992
TL;DR: Interrelationships between the structure and size of the generating Markov matrices and the string editing distance shed light on the relative roles of deterministic and probabilistic processes in producing human visual scanpaths.
Abstract: Sequences of visual fixations, while looking at an object, are modeled as Markov processes, and statistical properties of such processes are derived by means of simulations. The sequences are also abstracted as character strings, and a quantitative method of measuring their similarity, based on minimum string editing cost (actually dissimilarity distance), is introduced. Interrelationships between the structure and size of the generating Markov matrices and the string editing distance shed light on the relative roles of deterministic and probabilistic processes in producing human visual scanpaths.
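The string comparison referred to is a minimum-edit-cost (Levenshtein-type) distance over fixation sequences coded as characters, one character per fixated region. A sketch with unit costs (the paper's cost weighting may differ):

```python
def edit_distance(a, b):
    """Minimum string-editing cost between two fixation strings, with
    unit-cost insertions, deletions and substitutions."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i-1][j] + 1,                       # deletion
                          d[i][j-1] + 1,                       # insertion
                          d[i-1][j-1] + (a[i-1] != b[j-1]))    # substitution
    return d[m][n]

print(edit_distance("ABCAB", "ABAC"))   # 2
```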