Book

Measures, Integrals and Martingales

TL;DR: This book is a self-contained introduction to measure and integration theory, progressing from sigma-algebras, measures and measurable functions through Lebesgue integration, convergence theorems, product measures and Fubini's theorem to martingales, the Radon-Nikodym theorem, conditional expectations and Hilbert-space methods, with appendices on point-set topology, non-measurable sets and the Riemann integral.
Abstract: Prelude Dependence chart Prologue 1. The pleasures of counting 2. sigma-algebras 3. Measures 4. Uniqueness of measures 5. Existence of measures 6. Measurable mappings 7. Measurable functions 8. Integration of positive functions 9. Integrals of measurable functions and null sets 10. Convergence theorems and their applications 11. The function spaces 12. Product measures and Fubini's theorem 13. Integrals with respect to image measures 14. Integrals of images and Jacobi's transformation rule 15. Uniform integrability and Vitali's convergence theorem 16. Martingales 17. Martingale convergence theorems 18. The Radon-Nikodym theorem and other applications of martingales 19. Inner product spaces 20. Hilbert space 21. Conditional expectations in L2 22. Conditional expectations in Lp 23. Orthonormal systems and their convergence behaviour Appendix A. Lim inf and lim sup Appendix B. Some facts from point-set topology Appendix C. The volume of a parallelepiped Appendix D. Non-measurable sets Appendix E. A summary of the Riemann integral Further reading Bibliography Notation index Name and subject index.
Citations
Journal Article
TL;DR: A canonical way to turn any smooth parametric family of probability distributions on an arbitrary search space X into a continuous-time black-box optimization method on X, the information-geometric optimization (IGO) method, which achieves maximal invariance properties.
Abstract: We present a canonical way to turn any smooth parametric family of probability distributions on an arbitrary search space X into a continuous-time black-box optimization method on X, the information-geometric optimization (IGO) method. Invariance as a major design principle keeps the number of arbitrary choices to a minimum. The resulting IGO flow is the flow of an ordinary differential equation conducting the natural gradient ascent of an adaptive, time-dependent transformation of the objective function. It makes no particular assumptions on the objective function to be optimized. The IGO method produces explicit IGO algorithms through time discretization. It naturally recovers versions of known algorithms and offers a systematic way to derive new ones. In continuous search spaces, IGO algorithms take a form related to natural evolution strategies (NES). The cross-entropy method is recovered in a particular case with a large time step, and can be extended into a smoothed, parametrization-independent maximum likelihood update (IGO-ML). When applied to the family of Gaussian distributions on Rd, the IGO framework recovers a version of the well-known CMA-ES algorithm and of xNES. For the family of Bernoulli distributions on {0, 1}d, we recover the seminal PBIL algorithm and cGA. For the distributions of restricted Boltzmann machines, we naturally obtain a novel algorithm for discrete optimization on {0, 1}d. All these algorithms are natural instances of, and unified under, the single information-geometric optimization framework. The IGO method achieves, thanks to its intrinsic formulation, maximal invariance properties: invariance under reparametrization of the search space X, under a change of parameters of the probability distribution, and under increasing transformation of the function to be optimized. The latter is achieved through an adaptive, quantile-based formulation of the objective. Theoretical considerations strongly suggest that IGO algorithms are essentially characterized by a minimal change of the distribution over time. Therefore they have minimal loss in diversity through the course of optimization, provided the initial diversity is high. First experiments using restricted Boltzmann machines confirm this insight. As a simple consequence, IGO seems to provide, from information theory, an elegant way to simultaneously explore several valleys of a fitness landscape in a single run.
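The update rule described above can be made concrete with a small sketch. The code below is a minimal, hypothetical Python implementation of the time-discretized IGO flow for the Bernoulli family on {0, 1}^d, the case the abstract says recovers a PBIL-like algorithm; the function name igo_bernoulli, the truncation-based selection weights standing in for the paper's quantile weighting w(q), and all hyperparameters are illustrative assumptions rather than the authors' reference code.

    import numpy as np

    def igo_bernoulli(f, d, n_pop=20, n_iter=200, lr=0.1, mu_frac=0.25, seed=0):
        # Hypothetical sketch of IGO for Bernoulli distributions on {0,1}^d (PBIL-like).
        # Minimizes f; truncation selection approximates the quantile-based weights.
        rng = np.random.default_rng(seed)
        theta = np.full(d, 0.5)                    # theta_i = P(x_i = 1)
        mu = max(1, int(mu_frac * n_pop))          # number of selected samples
        for _ in range(n_iter):
            x = (rng.random((n_pop, d)) < theta).astype(float)   # sample a population
            order = np.argsort([f(xi) for xi in x])              # rank by objective value
            w = np.zeros(n_pop)
            w[order[:mu]] = 1.0 / mu               # rank-based selection weights, sum to 1
            # For Bernoulli(theta) the natural gradient of the log-likelihood is (x - theta),
            # so the IGO step pulls theta towards the weighted mean of good samples.
            theta += lr * np.sum(w[:, None] * (x - theta), axis=0)
            theta = np.clip(theta, 0.05, 0.95)     # keep some diversity in the distribution
        return theta

    if __name__ == "__main__":
        onemax = lambda x: -x.sum()                # maximize the number of ones, as minimization
        print(np.round(igo_bernoulli(onemax, d=10), 2))

On this toy OneMax problem the probabilities drift towards the upper clip value, mirroring how PBIL shifts its sampling distribution toward better bit strings.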

175 citations

Book
01 Jan 2013
TL;DR: A Primer on Feller Semigroups and Feller Processes as discussed by the authors, including Feller Generators and Symbols, Construction of Feller Processes, Transformations of Feller Processes, Sample Path Properties, Global Properties, Approximation, and Open Problems.
Abstract: A Primer on Feller Semigroups and Feller Processes - Feller Generators and Symbols - Construction of Feller Processes - Transformations of Feller Processes - Sample Path Properties - Global Properties - Approximation - Open Problems - References - Index

149 citations

Posted Content
TL;DR: In this paper, the authors present the information-geometric optimization (IGO) method, which turns any smooth parametric family of probability distributions into a continuous-time black-box optimization algorithm.
Abstract: We present a canonical way to turn any smooth parametric family of probability distributions on an arbitrary search space $X$ into a continuous-time black-box optimization method on $X$, the \emph{information-geometric optimization} (IGO) method. Invariance as a design principle minimizes the number of arbitrary choices. The resulting \emph{IGO flow} conducts the natural gradient ascent of an adaptive, time-dependent, quantile-based transformation of the objective function. It makes no assumptions on the objective function to be optimized. The IGO method produces explicit IGO algorithms through time discretization. It naturally recovers versions of known algorithms and offers a systematic way to derive new ones. The cross-entropy method is recovered in a particular case, and can be extended into a smoothed, parametrization-independent maximum likelihood update (IGO-ML). For Gaussian distributions on $\mathbb{R}^d$, IGO is related to natural evolution strategies (NES) and recovers a version of the CMA-ES algorithm. For Bernoulli distributions on $\{0,1\}^d$, we recover the PBIL algorithm. From restricted Boltzmann machines, we obtain a novel algorithm for optimization on $\{0,1\}^d$. All these algorithms are unified under a single information-geometric optimization framework. Thanks to its intrinsic formulation, the IGO method achieves invariance under reparametrization of the search space $X$, under a change of parameters of the probability distributions, and under increasing transformations of the objective function. Theory strongly suggests that IGO algorithms have minimal loss in diversity during optimization, provided the initial diversity is high. First experiments using restricted Boltzmann machines confirm this insight. Thus IGO seems to provide, from information theory, an elegant way to spontaneously explore several valleys of a fitness landscape in a single run.
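For the Gaussian case mentioned in the abstract, a similarly stripped-down sketch (below) shows the natural-gradient updates of the mean and of an isotropic step size; it is an xNES-flavoured simplification under assumed settings, not the full covariance-adapting CMA-ES variant the paper recovers, and the helper name igo_gaussian and the learning rates are assumptions.

    import numpy as np

    def igo_gaussian(f, m0, sigma0=1.0, n_pop=20, n_iter=300,
                     lr_m=1.0, lr_s=0.1, mu_frac=0.25, seed=0):
        # Hypothetical sketch of time-discretized IGO for isotropic Gaussians N(m, sigma^2 I).
        # Minimizes f; no covariance matrix or evolution paths, unlike full CMA-ES.
        rng = np.random.default_rng(seed)
        m, sigma = np.asarray(m0, dtype=float), float(sigma0)
        d, mu = m.size, max(1, int(mu_frac * n_pop))
        for _ in range(n_iter):
            z = rng.standard_normal((n_pop, d))    # standard-normal search directions
            x = m + sigma * z                      # candidate solutions
            order = np.argsort([f(xi) for xi in x])
            w = np.zeros(n_pop)
            w[order[:mu]] = 1.0 / mu               # truncation weights in place of w(q)
            # Natural-gradient steps in the (m, log sigma) parametrization:
            m = m + lr_m * sigma * np.sum(w[:, None] * z, axis=0)
            sigma *= np.exp(0.5 * lr_s * np.sum(w * (np.sum(z**2, axis=1) / d - 1.0)))
        return m, sigma

    if __name__ == "__main__":
        sphere = lambda x: np.sum((x - 3.0) ** 2)  # minimum at (3, ..., 3)
        m, s = igo_gaussian(sphere, m0=np.zeros(5))
        print(np.round(m, 2), round(s, 4))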

143 citations

Journal ArticleDOI
TL;DR: To analyze and compare solar forecasts, the well-established Murphy–Winkler framework for distribution-oriented forecast verification is recommended as a standard practice and the use of the root mean square error (RMSE) skill score based on the optimal convex combination of climatology and persistence methods is highly recommended.
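The recommended score can be sketched in a few lines: the forecast's RMSE is compared with that of the best convex combination of a climatology reference and a persistence reference. In the hypothetical helper below, climatology is taken as the sample mean of the observations, persistence as the previous observation, and the combination weight is found by a simple grid search; names, data and tuning choices are illustrative assumptions, not the paper's code.

    import numpy as np

    def rmse(pred, obs):
        return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2)))

    def rmse_skill_score(forecast, obs, n_alpha=101):
        # Skill = 1 - RMSE(forecast) / RMSE(best reference), where the reference is
        # alpha * persistence + (1 - alpha) * climatology with alpha chosen on a grid.
        obs, forecast = np.asarray(obs, float), np.asarray(forecast, float)
        climatology = np.full(obs.size - 1, obs.mean())   # constant mean forecast
        persistence = obs[:-1]                            # "same as the last observation"
        o, fc = obs[1:], forecast[1:]                     # align targets with persistence
        alphas = np.linspace(0.0, 1.0, n_alpha)
        best_ref = min(rmse(a * persistence + (1 - a) * climatology, o) for a in alphas)
        return 1.0 - rmse(fc, o) / best_ref               # 1 = perfect, 0 = no skill over reference

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        obs = 0.7 + 0.1 * np.sin(np.arange(200) / 10) + 0.05 * rng.standard_normal(200)
        forecast = obs + 0.02 * rng.standard_normal(200)  # a toy, fairly accurate forecast
        print(round(rmse_skill_score(forecast, obs), 3))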

129 citations


Cites background from "Measures, Integrals and Martingales..."

  • ...non-negativity, null empty set, and $\sigma$-additivity, that is $\mu(A) \ge 0$, $\mu(\emptyset) = 0$, and $\mu\bigl(\biguplus_{j} A_j\bigr) = \sum_{j} \mu(A_j)$, where the symbol $\biguplus$ denotes disjoint union (Schilling, 2017)....

    [...]

Journal ArticleDOI
TL;DR: This work shows how estimation of copula models with discrete margins can be achieved by augmenting the likelihood with continuous latent variables and computing inference using the resulting augmented posterior, and establishes the effectiveness of the estimation method by modeling consumer behavior in online retail using Archimedean and Gaussian copulas.
Abstract: Estimation of copula models with discrete margins can be difficult beyond the bivariate case. We show how this can be achieved by augmenting the likelihood with continuous latent variables, and computing inference using the resulting augmented posterior. To evaluate this, we propose two efficient Markov chain Monte Carlo sampling schemes. One generates the latent variables as a block using a Metropolis–Hastings step with a proposal that is close to its target distribution, the other generates them one at a time. Our method applies to all parametric copulas where the conditional copula functions can be evaluated, not just elliptical copulas as in much previous work. Moreover, the copula parameters can be estimated jointly with any marginal parameters, and Bayesian selection ideas can be employed. We establish the effectiveness of the estimation method by modeling consumer behavior in online retail using Archimedean and Gaussian copulas. The example shows that elliptical copulas can be poor at modeling depend...
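To make the data-augmentation idea tangible, here is a toy sketch of the "one at a time" scheme for a bivariate Gaussian copula with known Poisson margins: each latent normal variable is drawn from its conditional distribution truncated to the interval implied by the observed count, and the copula correlation is updated by random-walk Metropolis–Hastings. The names augmented_gibbs and truncnorm_sample, the known-margins simplification and all tuning constants are assumptions for illustration, not the authors' samplers.

    import numpy as np
    from scipy.stats import norm, poisson

    def truncnorm_sample(rng, mean, sd, lo, hi):
        # Draw from N(mean, sd^2) truncated to (lo, hi) by inverse-CDF sampling.
        a, b = norm.cdf((lo - mean) / sd), norm.cdf((hi - mean) / sd)
        u = rng.uniform(a, b)
        return mean + sd * norm.ppf(np.clip(u, 1e-12, 1 - 1e-12))

    def augmented_gibbs(y, lam, n_iter=300, seed=0):
        # Toy augmented sampler: bivariate Gaussian copula, known Poisson(lam) margins.
        # Latent normals z_ij live in the interval implied by the observed count y_ij.
        rng = np.random.default_rng(seed)
        n = len(y)
        lo = norm.ppf(np.clip(poisson.cdf(y - 1, lam), 1e-12, 1 - 1e-12))
        hi = norm.ppf(np.clip(poisson.cdf(y, lam), 1e-12, 1 - 1e-12))
        z, rho, draws = (lo + hi) / 2.0, 0.0, []
        for _ in range(n_iter):
            for j in (0, 1):                       # latent variables, one at a time
                cond_mean, cond_sd = rho * z[:, 1 - j], np.sqrt(1.0 - rho**2)
                for i in range(n):
                    z[i, j] = truncnorm_sample(rng, cond_mean[i], cond_sd, lo[i, j], hi[i, j])
            def loglik(r):                         # bivariate normal log-likelihood in rho
                q = (z[:, 0]**2 - 2*r*z[:, 0]*z[:, 1] + z[:, 1]**2) / (1 - r**2)
                return -0.5 * (n * np.log(1 - r**2) + q.sum())
            prop = rho + 0.1 * rng.standard_normal()   # random-walk MH step for rho
            if abs(prop) < 0.99 and np.log(rng.uniform()) < loglik(prop) - loglik(rho):
                rho = prop
            draws.append(rho)
        return np.array(draws)

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        zsim = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=200)
        y = poisson.ppf(norm.cdf(zsim), 3.0).astype(int)   # simulate counts via the copula
        draws = augmented_gibbs(y, lam=3.0)
        print(round(draws[100:].mean(), 2))        # posterior mean of rho, roughly 0.6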

121 citations


Cites methods from "Measures, Integrals and Martingales..."

  • ...To prove Propositions 3 and 4, we use the following identity that can be derived using standard measure theory; see Stein & Shakarchi (2005) or Schilling (2005). Let H1, ....

    [...]