Topic

Markov chain

About: Markov chain is a research topic. Over its lifetime, 51,900 publications have been published within this topic, receiving 1,375,044 citations. The topic is also known as: Markov process & Markov chains.


Papers
Journal ArticleDOI
TL;DR: Several Markov chain methods are available for sampling from a posterior distribution, as discussed by the authors, including the Gibbs sampler and the Metropolis algorithm, along with strategies for constructing hybrid algorithms; theoretical convergence results can be used to guide the construction of more efficient algorithms.
Abstract: Several Markov chain methods are available for sampling from a posterior distribution. Two important examples are the Gibbs sampler and the Metropolis algorithm. In addition, several strategies are available for constructing hybrid algorithms. This paper outlines some of the basic methods and strategies and discusses some related theoretical and practical issues. On the theoretical side, results from the theory of general state space Markov chains can be used to obtain convergence rates, laws of large numbers and central limit theorems for estimates obtained from Markov chain methods. These theoretical results can be used to guide the construction of more efficient algorithms. For the practical use of Markov chain methods, standard simulation methodology provides several variance reduction techniques and also gives guidance on the choice of sample size and allocation.
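As a rough illustration of the Metropolis algorithm mentioned in this abstract (a sketch, not the paper's own code), the snippet below samples from a one-dimensional target with a Gaussian random-walk proposal; the target log-density, proposal scale, and chain length are illustrative assumptions.

```python
import numpy as np

def metropolis_sample(log_density, x0, n_iter=10_000, proposal_scale=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_density: function returning the log of the (unnormalized) target density.
    x0: starting state of the chain.
    """
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iter)
    x, logp_x = x0, log_density(x0)
    for i in range(n_iter):
        # Propose a move from a symmetric Gaussian random walk.
        x_new = x + proposal_scale * rng.standard_normal()
        logp_new = log_density(x_new)
        # Accept with probability min(1, p(x_new) / p(x)).
        if np.log(rng.uniform()) < logp_new - logp_x:
            x, logp_x = x_new, logp_new
        chain[i] = x
    return chain

# Example: sample from a standard normal "posterior".
draws = metropolis_sample(lambda x: -0.5 * x**2, x0=0.0)
print(draws.mean(), draws.std())
```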

3,780 citations

Journal ArticleDOI
TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
Abstract: Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
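A minimal sketch of the tabular Q-learning update the theorem concerns (not the authors' code); the toy chain-walk environment, learning rate, discount factor, and exploration rate are assumptions chosen so every state-action pair keeps being sampled.

```python
import numpy as np

def q_learning(n_states=5, n_actions=2, episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy chain: action 1 moves right, action 0 moves left;
    reaching the right end gives reward 1 and ends the episode."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            # Epsilon-greedy action selection keeps all (state, action) pairs visited.
            a = rng.integers(n_actions) if rng.uniform() < eps else int(Q[s].argmax())
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Incremental update toward r + gamma * max_a' Q(s', a').
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
    return Q

Q = q_learning()
print(Q)  # action 1 (move right) should dominate in every state
```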

3,294 citations

Book
01 Jul 1976
TL;DR: This lecture reviews the theory of Markov chains and introduces some of the high-quality routines for working with Markov chains available in QuantEcon.jl.
Abstract: Markov chains are one of the most useful classes of stochastic processes, being
• simple, flexible and supported by many elegant theoretical results
• valuable for building intuition about random dynamic models
• central to quantitative modeling in their own right
You will find them in many of the workhorse models of economics and finance. In this lecture we review some of the theory of Markov chains. We will also introduce some of the high-quality routines for working with Markov chains available in QuantEcon.jl. Prerequisite knowledge is basic probability and linear algebra.
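A small numpy sketch of the two basic operations such routines provide, simulating a chain and computing its stationary distribution; this is not the QuantEcon.jl code the lecture refers to, and the three-state transition matrix is made up for illustration.

```python
import numpy as np

# Made-up 3-state transition matrix P, where P[i, j] = Prob(X_{t+1} = j | X_t = i).
P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])

def stationary_distribution(P, tol=1e-12):
    """Stationary distribution pi solving pi = pi @ P, found by iterating the chain."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    while True:
        pi_next = pi @ P
        if np.abs(pi_next - pi).max() < tol:
            return pi_next
        pi = pi_next

def simulate(P, x0=0, n=10, seed=0):
    """Simulate a sample path of length n starting from state x0."""
    rng = np.random.default_rng(seed)
    path = [x0]
    for _ in range(n - 1):
        path.append(int(rng.choice(P.shape[0], p=P[path[-1]])))
    return path

print(stationary_distribution(P))
print(simulate(P))
```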

3,255 citations

01 Mar 2006
TL;DR: Bayesian inference with Markov Chain Monte Carlo appears straightforward, but it gives no clear indication of convergence; the coda package for R contains a set of functions designed to help the user decide how long the burn-in period should be and how many samples are required to accurately estimate posterior quantities of interest.
Abstract: [1st paragraph] At first sight, Bayesian inference with Markov Chain Monte Carlo (MCMC) appears to be straightforward. The user defines a full probability model, perhaps using one of the programs discussed in this issue; an underlying sampling engine takes the model definition and returns a sequence of dependent samples from the posterior distribution of the model parameters, given the supplied data. The user can derive any summary of the posterior distribution from this sample. For example, to calculate a 95% credible interval for a parameter α, it suffices to take 1000 MCMC iterations of α and sort them so that α(1) < α(2) < ... < α(1000). The credible interval estimate is then (α(25), α(975)). However, there is a price to be paid for this simplicity. Unlike most numerical methods used in statistical inference, MCMC does not give a clear indication of whether it has converged. The underlying Markov chain theory only guarantees that the distribution of the output will converge to the posterior in the limit as the number of iterations increases to infinity. The user is generally ignorant about how quickly convergence occurs, and therefore has to fall back on post hoc testing of the sampled output. By convention, the sample is divided into two parts: a "burn-in" period during which all samples are discarded, and the remainder of the run in which the chain is considered to have converged sufficiently close to the limiting distribution to be used. Two questions then arise: (1) How long should the burn-in period be? (2) How many samples are required to accurately estimate posterior quantities of interest? The coda package for R contains a set of functions designed to help the user answer these questions. Some of these convergence diagnostics are simple graphical ways of summarizing the data. Others are formal statistical tests.
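A sketch of the order-statistic calculation the abstract describes, using synthetic draws in place of real MCMC output and plain numpy rather than coda's R functions; the burn-in length and the synthetic distribution are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for 2000 MCMC draws of a parameter alpha.
alpha_draws = rng.normal(loc=2.0, scale=0.5, size=2000)

burn_in = 1000                        # assumed burn-in; diagnostics like those in coda help choose this
post = np.sort(alpha_draws[burn_in:])  # keep the last 1000 iterations and sort them

# 95% credible interval from the order statistics, as in the text: (alpha_(25), alpha_(975)).
lower, upper = post[24], post[974]     # 0-based indexing for the 25th and 975th sorted draws
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```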

3,098 citations

Book
15 Jun 1960

3,046 citations


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 88% related
Probabilistic logic: 56K papers, 1.3M citations, 87% related
Bounded function: 77.2K papers, 1.3M citations, 87% related
Optimization problem: 96.4K papers, 2.1M citations, 86% related
Robustness (computer science): 94.7K papers, 1.6M citations, 85% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    3
2023    1,336
2022    3,183
2021    2,007
2020    2,222
2019    2,294