
Showing papers on "Variable-order Bayesian network published in 2007"


Reference EntryDOI
TL;DR: In this paper, the concept of hidden Markov models in computational biology is introduced and described using simple biological examples, requiring as little mathematical knowledge as possible, and an overview of their current applications is presented.
Abstract: This unit introduces the concept of hidden Markov models in computational biology. It describes them using simple biological examples, requiring as little mathematical knowledge as possible. The unit also presents a brief history of hidden Markov models and an overview of their current applications before concluding with a discussion of their limitations.

1,305 citations
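The forward algorithm is the basic computation behind hidden Markov models such as those introduced above. A minimal sketch, with toy CpG-island-style parameters assumed purely for illustration (they are not taken from the unit):

```python
def forward(obs, start, trans, emit):
    """Return P(obs): likelihood summed over all hidden state paths."""
    states = list(start)
    # alpha[s] = P(obs[0..t], state_t = s), updated left to right
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[r] * trans[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())

# Hypothetical two-state model: a GC-rich "island" state vs. background.
start = {"island": 0.5, "background": 0.5}
trans = {"island": {"island": 0.9, "background": 0.1},
         "background": {"island": 0.1, "background": 0.9}}
emit = {"island": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
        "background": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

p = forward("CGCG", start, trans, emit)   # sequence likelihood
```

The same recursion underlies most HMM applications in computational biology; in practice the transition and emission tables are estimated from data rather than fixed by hand.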


Book
30 Jul 2007
TL;DR: The introduction to Bayesian statistics as mentioned in this paper presents Bayes theorem, the estimation of unknown parameters, the determination of confidence regions and the derivation of tests of hypotheses for the unknown parameters in a manner that is simple, intuitive and easy to comprehend.
Abstract: The Introduction to Bayesian Statistics (2nd Edition) presents Bayes theorem, the estimation of unknown parameters, the determination of confidence regions and the derivation of tests of hypotheses for the unknown parameters, in a manner that is simple, intuitive and easy to comprehend. The methods are applied to linear models, in models for a robust estimation, for prediction and filtering and in models for estimating variance components and covariance components. Regularization of inverse problems and pattern recognition are also covered while Bayesian networks serve for reaching decisions in systems with uncertainties. If analytical solutions cannot be derived, numerical algorithms are presented such as the Monte Carlo integration and Markov Chain Monte Carlo methods.

488 citations
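Bayes theorem and Monte Carlo integration, both covered by the book, can be illustrated together with a conjugate Beta-Binomial example. This is a hypothetical sketch, not the book's code:

```python
import random

# Prior Beta(1, 1) (uniform) on a success probability; observe 7 of 10.
a0, b0, k, n = 1.0, 1.0, 7, 10
a, b = a0 + k, b0 + (n - k)          # Bayes theorem: posterior is Beta(a, b)
post_mean = a / (a + b)              # analytic posterior mean

# The same posterior mean via simple Monte Carlo integration: draw from the
# uniform prior and weight each draw by its likelihood.
random.seed(0)
draws = [random.random() for _ in range(200_000)]
lik = [p ** k * (1 - p) ** (n - k) for p in draws]
mc_mean = sum(p * l for p, l in zip(draws, lik)) / sum(lik)
```

The analytic and Monte Carlo answers agree closely; MCMC methods become necessary only when, as the abstract notes, no such closed form exists.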


Journal ArticleDOI
TL;DR: In this article, the authors review linkages to optimal interpolation, kriging, Kalman filtering, smoothing, and variational analysis for data assimilation in Bayesian statistics.

375 citations


Book
01 Jan 2007
TL;DR: In this article, the authors present a self-contained entry to computational Bayesian statistics, focusing on standard statistical models and backed up by discussed real datasets available from the book website.
Abstract: This Bayesian modeling book is intended for practitioners and applied statisticians looking for a self-contained entry to computational Bayesian statistics. Focusing on standard statistical models and backed up by discussed real datasets available from the book website, it provides an operational methodology for conducting Bayesian inference, rather than focusing on its theoretical justifications. Special attention is paid to the derivation of prior distributions in each case and specific reference solutions are given for each of the models. Similarly, computational details are worked out to lead the reader towards an effective programming of the methods given in the book.

348 citations


Journal ArticleDOI
TL;DR: A hybrid model called Finite State Linear Model is described and some simple network dynamics can be simulated in this model, and the topology of gene regulatory networks in yeast is studied in more detail.
Abstract: Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these categories. We will study the topology of gene regulatory networks in yeast in more detail, comparing a direct network derived from transcription factor binding data and an indirect network derived from genome-wide expression data in mutants. Regarding the network dynamics we briefly describe discrete and continuous approaches to network modelling, then describe a hybrid model called Finite State Linear Model and demonstrate that some simple network dynamics can be simulated in this model.

334 citations


Proceedings ArticleDOI
20 Jun 2007
TL;DR: This work considers the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes chosen randomly from a fixed but unknown distribution, using a hierarchical Bayesian infinite mixture model.
Abstract: We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknown distribution. We model the distribution over MDPs using a hierarchical Bayesian infinite mixture model. For each novel MDP, we use the previously learned distribution as an informed prior for model-based Bayesian reinforcement learning. The hierarchical Bayesian framework provides a strong prior that allows us to rapidly infer the characteristics of new environments based on previous environments, while the use of a nonparametric model allows us to quickly adapt to environments we have not encountered before. In addition, the use of infinite mixtures allows for the model to automatically learn the number of underlying MDP components. We evaluate our approach and show that it leads to significant speedups in convergence to an optimal policy after observing only a small number of tasks.

311 citations
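The "infinite mixture" ingredient above is commonly built from a Dirichlet process. A rough sketch (illustrative only, not the paper's implementation) of its Chinese restaurant process view, which is what lets the number of components grow with the data:

```python
import random

def crp_assignments(n, alpha, rng):
    """Sequentially assign n items to clusters under a CRP(alpha) prior."""
    counts = []                        # counts[c] = items already in cluster c
    labels = []
    for i in range(n):
        # Join existing cluster c with prob counts[c]/(i+alpha),
        # or open a new cluster with prob alpha/(i+alpha).
        r = rng.random() * (i + alpha)
        acc = 0.0
        for c, m in enumerate(counts):
            acc += m
            if r < acc:
                counts[c] += 1
                labels.append(c)
                break
        else:
            counts.append(1)
            labels.append(len(counts) - 1)
    return labels

rng = random.Random(0)
labels = crp_assignments(50, alpha=2.0, rng=rng)
```

In the paper's setting each "cluster" would carry its own MDP parameters, so a novel task can either reuse a previously learned component or start a fresh one.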


Journal ArticleDOI
TL;DR: This work introduces a new statistical computing method, called data cloning, to calculate maximum likelihood estimates and their standard errors for complex ecological models, which is particularly useful for analysing ecological situations in which hierarchical statistical models are appropriate.
Abstract: We introduce a new statistical computing method, called data cloning, to calculate maximum likelihood estimates and their standard errors for complex ecological models. Although the method uses the Bayesian framework and exploits the computational simplicity of the Markov chain Monte Carlo (MCMC) algorithms, it provides valid frequentist inferences such as the maximum likelihood estimates and their standard errors. The inferences are completely invariant to the choice of the prior distributions and therefore avoid the inherent subjectivity of the Bayesian approach. The data cloning method is easily implemented using standard MCMC software. Data cloning is particularly useful for analysing ecological situations in which hierarchical statistical models, such as state-space models and mixed effects models, are appropriate. We illustrate the method by fitting two nonlinear population dynamics models to data in the presence of process and observation noise.

255 citations
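The cloning trick can be seen without MCMC in a conjugate case. A sketch with a hypothetical normal-mean example (not from the paper): replicating the data k times makes the posterior mean converge to the MLE regardless of the prior, and k times the posterior variance converge to the MLE's sampling variance.

```python
def cloned_posterior(data, k, prior_mean=0.0, prior_var=100.0, sigma2=1.0):
    """Posterior mean/variance of mu after replicating the data k times."""
    n = len(data) * k                      # cloned sample size
    s = sum(data) * k                      # cloned sufficient statistic
    post_var = 1.0 / (1.0 / prior_var + n / sigma2)
    post_mean = post_var * (prior_mean / prior_var + s / sigma2)
    return post_mean, post_var

data = [1.2, 0.7, 1.9, 1.1, 0.6]           # toy observations, known sigma2=1
mle = sum(data) / len(data)                # sample mean is the MLE here

mean_k, var_k = cloned_posterior(data, k=100)
# mean_k ~ mle, and 100 * var_k ~ sigma2/len(data), the MLE's variance.
```

In realistic hierarchical models there is no closed form, which is why the paper runs standard MCMC on the cloned data instead.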


Proceedings ArticleDOI
17 Jun 2007
TL;DR: Combining spatial and aspect models significantly improves the region-level classification accuracy, and models trained with image-level labels outperform PLSA trained with pixel-level ones.
Abstract: Considerable advances have been made in learning to recognize and localize visual object classes. Simple bag-of-feature approaches label each pixel or patch independently. More advanced models attempt to improve the coherence of the labellings by introducing some form of inter-patch coupling: traditional spatial models such as MRF's provide crisper local labellings by exploiting neighbourhood-level couplings, while aspect models such as PLSA and LDA use global relevance estimates (global mixing proportions for the classes appearing in the image) to shape the local choices. We point out that the two approaches are complementary, combining them to produce aspect-based spatial field models that outperform both approaches. We study two spatial models: one based on averaging over forests of minimal spanning trees linking neighboring image regions, the other on an efficient chain-based Expectation Propagation method for regular 8-neighbor Markov random fields. The models can be trained using either patch-level labels or image-level keywords. As input features they use factored observation models combining texture, color and position cues. Experimental results on the MSR Cambridge data sets show that combining spatial and aspect models significantly improves the region-level classification accuracy. In fact our models trained with image-level labels outperform PLSA trained with pixel-level ones.

237 citations


Book ChapterDOI
01 Jan 2007
TL;DR: A simple and concise description of an alternative Bayesian approach to structural equation models with latent variables is developed, and an industrialization and democratization case study from the literature is illustrated.
Abstract: Structural equation models (SEMs) with latent variables are routinely used in social science research, and are of increasing importance in biomedical applications. Standard practice in implementing SEMs relies on frequentist methods. A simple and concise description of an alternative Bayesian approach is developed. Furthermore, a brief overview of the literature, a description of Bayesian specification of SEMs, and an outline of a Gibbs sampling strategy for model fitting is provided. Bayesian inferences are illustrated through an industrialization and democratization case study from the literature. The Bayesian approach has some distinct advantages, due to the availability of samples from the joint posterior distribution of the model parameters and latent variables, that are highlighted. These posterior samples provide important information not contained in the measurement and structural parameters. As is illustrated using the case study, this information can often provide valuable insight into structural relationships.

230 citations


Journal ArticleDOI
TL;DR: This work describes a more general approach that allows causal models to be applied to any lifecycle and enables decision-makers to reason in a way that is not possible with regression-based models.
Abstract: An important decision in software projects is when to stop testing. Decision support tools for this have been built using causal models represented by Bayesian Networks (BNs), incorporating empirical data and expert judgement. Previously, this required a custom BN for each development lifecycle. We describe a more general approach that allows causal models to be applied to any lifecycle. The approach evolved through collaborative projects and captures significant commercial input. For projects within the range of the models, defect predictions are very accurate. This approach enables decision-makers to reason in a way that is not possible with regression-based models.

211 citations


Book
01 Jan 2007
TL;DR: Bayesian networks and influence diagrams are presented.
Abstract: Bayesian networks and influence diagrams.

Book
02 Oct 2007
TL;DR: This thoroughly revised and expanded new edition now includes a more detailed treatment of the EM algorithm, a description of an efficient approximate Viterbi-training procedure, a theoretical derivation of the perplexity measure and coverage of multi-pass decoding based on n-best search.
Abstract: This thoroughly revised and expanded new edition now includes a more detailed treatment of the EM algorithm, a description of an efficient approximate Viterbi-training procedure, a theoretical derivation of the perplexity measure and coverage of multi-pass decoding based on n-best search. Supporting the discussion of the theoretical foundations of Markov modeling, special emphasis is also placed on practical algorithmic solutions. Features: introduces the formal framework for Markov models; covers the robust handling of probability quantities; presents methods for the configuration of hidden Markov models for specific application areas; describes important methods for efficient processing of Markov models, and the adaptation of the models to different tasks; examines algorithms for searching within the complex solution spaces that result from the joint application of Markov chain and hidden Markov models; reviews key applications of Markov models.
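Viterbi decoding, the dynamic-programming search underlying the Viterbi-training procedure discussed above, can be sketched as follows (the toy weather/activity parameters are assumptions for illustration, not from the book):

```python
import math

def viterbi(obs, init, trans_p, emit_p):
    """Most probable hidden state path, computed in log space."""
    states = list(init)
    delta = {s: math.log(init[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    backs = []                                   # backpointers per time step
    for o in obs[1:]:
        step, back = {}, {}
        for s in states:
            r = max(states, key=lambda r: delta[r] + math.log(trans_p[r][s]))
            back[s] = r
            step[s] = delta[r] + math.log(trans_p[r][s]) + math.log(emit_p[s][o])
        delta = step
        backs.append(back)
    last = max(states, key=delta.get)
    path = [last]
    for back in reversed(backs):                 # trace the best path backwards
        path.append(back[path[-1]])
    return path[::-1]

init = {"rain": 0.6, "sun": 0.4}
trans_p = {"rain": {"rain": 0.7, "sun": 0.3}, "sun": {"rain": 0.4, "sun": 0.6}}
emit_p = {"rain": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "sun": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

best_path = viterbi(["walk", "shop", "clean"], init, trans_p, emit_p)
```

Approximate Viterbi training, as covered in the book, re-estimates model parameters from such best paths instead of from the full forward-backward posteriors.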

Journal ArticleDOI
TL;DR: A novel class of Bayesian models for multivariate time series analysis based on a synthesis of dynamic linear models and graphical models is introduced as a key step towards scaling multivariate dynamic Bayesian modelling methodology to time series of increasing dimension and complexity.
Abstract: This paper introduces a novel class of Bayesian models for multivariate time series analysis based on a synthesis of dynamic linear models and graphical models. The synthesis uses sparse graphical modelling ideas to introduce structured, conditional independence relationships in the time-varying, cross-sectional covariance matrices of multiple time series. We define this new class of models and their theoretical structure involving novel matrix-normal/hyper-inverse Wishart distributions. We then describe the resulting Bayesian methodology and computational strategies for model fitting and prediction. This includes novel stochastic evolution theory for time-varying, structured variance matrices, and the full sequential and conjugate updating, filtering and forecasting analysis. The models are then applied in the context of financial time series for predictive portfolio analysis. The improvements defined in optimal Bayesian decision analysis in this example context vividly illustrate the practical benefits of the parsimony induced via appropriate graphical model structuring in multivariate dynamic modelling. We discuss theoretical and empirical aspects of the conditional independence structures in such models, issues of model uncertainty and search, and the relevance of this new framework as a key step towards scaling multivariate dynamic Bayesian modelling methodology to time series of increasing dimension and complexity.

Journal ArticleDOI
TL;DR: Although WinBUGS may not be that efficient for more complicated models, it does make Bayesian inference with stochastic frontier models easily accessible for applied researchers and its generic structure allows for a lot of flexibility in model specification.
Abstract: Markov chain Monte Carlo (MCMC) methods have become a ubiquitous tool in Bayesian analysis. This paper implements MCMC methods for Bayesian analysis of stochastic frontier models using the WinBUGS package, a freely available software. General code for cross-sectional and panel data are presented and various ways of summarizing posterior inference are discussed. Several examples illustrate that analyses with models of genuine practical interest can be performed straightforwardly and model changes are easily implemented. Although WinBUGS may not be that efficient for more complicated models, it does make Bayesian inference with stochastic frontier models easily accessible for applied researchers and its generic structure allows for a lot of flexibility in model specification.

Book ChapterDOI
06 Aug 2007
TL;DR: Four main improved approaches to naive Bayes are reviewed and some main directions for future research on Bayesian network classifiers are discussed, including feature selection, structure extension, local learning, and data expansion.
Abstract: The attribute conditional independence assumption of naive Bayes essentially ignores attribute dependencies and is often violated. On the other hand, although a Bayesian network can represent arbitrary attribute dependencies, learning an optimal Bayesian network classifier from data is intractable. Thus, learning improved naive Bayes classifiers has attracted much attention from researchers, who have presented many effective and efficient improved algorithms. In this paper, we review some of these improved algorithms and single out four main improved approaches: 1) feature selection; 2) structure extension; 3) local learning; 4) data expansion. We experimentally tested these approaches using all 36 UCI data sets selected by Weka, and compared them to naive Bayes. The experimental results show that all these approaches are effective. In the end, we discuss some main directions for future research on Bayesian network classifiers.
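For reference, the naive Bayes baseline that these four approaches improve on can be sketched in a few lines. The toy data and the simplified Laplace smoothing (smoothing only over values seen for each class/attribute pair) are assumptions, not the paper's experimental setup:

```python
from collections import Counter, defaultdict

def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes classifier with Laplace smoothing."""
    classes = Counter(labels)
    counts = defaultdict(Counter)          # (class, attr index) -> value counts
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            counts[(y, j)][v] += 1

    def predict(row):
        def score(y):
            p = classes[y] / len(labels)   # class prior
            for j, v in enumerate(row):    # independence: multiply per attribute
                c = counts[(y, j)]
                p *= (c[v] + alpha) / (sum(c.values()) + alpha * len(c))
            return p
        return max(classes, key=score)

    return predict

# Toy "play tennis"-style data: (outlook, windy) -> play?
rows = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"), ("rain", "yes"),
        ("overcast", "no"), ("overcast", "yes")]
labels = ["yes", "no", "yes", "no", "yes", "yes"]
pred = train_nb(rows, labels)(("overcast", "no"))
```

Structure extension (e.g. TAN) replaces the per-attribute product with factors conditioned on a parent attribute as well as the class.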

Book
30 Mar 2007
TL;DR: This book describes the underlying concepts of Bayesian Networks in an interesting manner with the help of diverse applications, and theories that prove Bayesian networks valid.
Abstract: Bayesian networks are now being used in a variety of artificial intelligence applications. These networks are high-level representations of probability distributions over a set of variables that are used for building a model of the problem domain. Bayesian Network Technologies: Applications and Graphical Models provides an excellent and well-balanced collection of areas where Bayesian networks have been successfully applied. This book describes the underlying concepts of Bayesian Networks in an interesting manner with the help of diverse applications, and theories that prove Bayesian networks valid. Bayesian Network Technologies: Applications and Graphical Models provides specific examples of how Bayesian networks are powerful machine learning tools critical in solving real-life problems.

Journal ArticleDOI
TL;DR: Bayesian methods for analyzing longitudinal data in social and behavioral research are recommended for their ability to incorporate prior information in estimating simple and complex models, and are a more plausible way to analyze small-sample data than maximum likelihood estimation.
Abstract: Bayesian methods for analyzing longitudinal data in social and behavioral research are recommended for their ability to incorporate prior information in estimating simple and complex models. We first summarize the basics of Bayesian methods before presenting an empirical example in which we fit a latent basis growth curve model to achievement data from the National Longitudinal Survey of Youth. This step-by-step example illustrates how to analyze data using both noninformative and informative priors. The results show that in addition to being an alternative to the maximum likelihood estimation (MLE) method, Bayesian methods also have unique strengths, such as the systematic incorporation of prior information from previous studies. These methods are more plausible ways to analyze small sample data compared with the MLE method.

Journal ArticleDOI
TL;DR: There appears to be a vast gap between the kinds of knowledge that children learn and the mechanisms that could allow them to learn that knowledge; the authors ask whether there is a more precise computational way to bridge this gap.
Abstract: Over the past 30 years we have discovered an enormous amount about what children know and when they know it. But the real question for developmental cognitive science is not so much what children know, when they know it or even whether they learn it. The real question is how they learn it and why they get it right. Developmental ‘theory theorists’ (e.g. Carey, 1985; Gopnik & Meltzoff, 1997; Wellman & Gelman, 1998) have suggested that children’s learning mechanisms are analogous to scientific theory-formation. However, what we really need is a more precise computational specification of the mechanisms that underlie both types of learning, in cognitive development and scientific discovery. The most familiar candidates for learning mechanisms in developmental psychology have been variants of associationism, either the mechanisms of classical and operant conditioning in behaviorist theories (e.g. Rescorla & Wagner, 1972) or more recently, connectionist models (e.g. Rumelhart & McClelland, 1986; Elman, Bates, Johnson & Karmiloff-Smith, 1996; Shultz, 2003; Rogers & McClelland, 2004). Such theories have had difficulty explaining how apparently rich, complex, abstract, rulegoverned representations, such as we see in everyday theories, could be derived from evidence. Typically, associationists have argued that such abstract representations do not really exist, and that children’s behavior can be just as well explained in terms of more specific learned associations between task inputs and outputs. Connectionists often qualify this denial by appealing to the notion of distributed representations in hidden layers of units that relate inputs to outputs (Rogers & McClelland, 2004; Colunga & Smith, 2005). On this view, however, the representations are not explicit, task-independent models of the world’s structure that are responsible for the input–output relations. 
Instead, they are implicit summaries of the input–output relations for a specific set of tasks that the connectionist network has been trained to perform. Conversely, more nativist accounts of cognitive development endorse the existence of abstract rule-governed representations but deny that their basic structure is learned. Modularity or ‘core knowledge’ theorists, for example, suggest that there are a small number of innate causal schemas designed to fit particular domains of knowledge, such as a belief-desire schema for intuitive psychology or a generic object schema for intuitive physics. Development is either a matter of enriching those innate schemas, or else involves quite sophisticated and culturespecific kinds of learning like those of the social institutions of science (e.g. Spelke, Breinlinger, Macomber & Jacobson, 1992). This has left empirically minded developmentalists, who seem to see both abstract representation and learning in even the youngest children, in an unfortunate theoretical bind. There appears to be a vast gap between the kinds of knowledge that children learn and the mechanisms that could allow them to learn that knowledge. The attempt to bridge this gap dates back to Piagetian ideas about constructivism, of course, but simply saying that there are constructivist learning mechanisms is a way of restating the problem rather than providing a solution. Is there a more precise computational way to bridge this gap? Recent developments in machine learning and artificial intelligence suggest that the answer may be yes. These new approaches to inductive learning are based on sophisticated and rational mechanisms of statistical

Proceedings Article
11 Mar 2007
TL;DR: In this article, the authors study the multi-task Bayesian network structure learning problem, where given data for multiple related problems, learn a Bayesian Network structure for each of them, sharing information among the problems to boost performance.
Abstract: We study the multi-task Bayesian Network structure learning problem: given data for multiple related problems, learn a Bayesian Network structure for each of them, sharing information among the problems to boost performance. We learn the structures for all the problems simultaneously using a score and search approach that encourages the learned Bayes Net structures to be similar. Encouraging similarity promotes information sharing and prioritizes learning structural features that explain the data from all problems over features that only seem relevant to a single one. This leads to a significant increase in the accuracy of the learned structures, especially when training data is scarce.

Journal ArticleDOI
TL;DR: A diagnostic test based on measuring the conflict between two independent sources of evidence regarding a parameter is investigated, giving rise to a p-value that exactly matches or closely approximates a cross-validatory predictive comparison, and yet is more widely applicable.
Abstract: A variety of simulation-based techniques have been proposed for detection of divergent behaviour at each level of a hierarchical model. We investigate a diagnostic test based on measuring the conflict between two independent sources of evidence regarding a parameter: that arising from its predictive prior given the remainder of the data, and that arising from its likelihood. This test gives rise to a p-value that exactly matches or closely approximates a cross-validatory predictive comparison, and yet is more widely applicable. Its properties are explored for normal hierarchical models and in an application in which divergent surgical mortality was suspected. Since full cross-validation is so computationally demanding, we examine full-data approximations which are shown to have only moderate conservatism in normal models. A second example concerns criticism of a complex growth curve model at both observation and parameter levels, and illustrates the issue of dealing with multiple p-values within a Bayesian framework. We conclude with the proposal of an overall strategy for detecting divergent behaviour in hierarchical models.

Journal ArticleDOI
01 Jul 2007-Genetics
TL;DR: The Bayesian model selection framework for mapping epistatic QTL in experimental crosses is extended to include environmental effects and gene–environment interactions and a new, fast Markov chain Monte Carlo algorithm is proposed to explore the posterior distribution of unknowns.
Abstract: We extend our Bayesian model selection framework for mapping epistatic QTL in experimental crosses to include environmental effects and gene–environment interactions. We propose a new, fast Markov chain Monte Carlo algorithm to explore the posterior distribution of unknowns. In addition, we take advantage of any prior knowledge about genetic architecture to increase posterior probability on more probable models. These enhancements have significant computational advantages in models with many effects. We illustrate the proposed method by detecting new epistatic and gene–sex interactions for obesity-related traits in two real data sets of mice. Our method has been implemented in the freely available package R/qtlbim (http://www.qtlbim.org) to facilitate the general usage of the Bayesian methodology for genomewide interacting QTL analysis.

Journal ArticleDOI
TL;DR: This paper discusses how Bayesian network models are set up with expert information, improved and calibrated from data, and deployed as evidence-based inference engines, and illustrates the flexibility and capabilities of Bayesian networks through a series of concrete examples, without extensive technical detail.
Abstract: This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models are reviewed, as they affect applications to diagnostic assessment. The paper discusses how Bayesian network models are set up with expert information, improved and calibrated from data, and deployed as evidence-based inference engines. Aimed at a general educational measurement audience, the paper illustrates the flexibility and capabilities of Bayesian networks through a series of concrete examples, and without extensive technical detail. Examples are provided of proficiency spaces with direct dependencies among proficiency nodes, and of customized evidence models for complex tasks. This paper is intended to motivate educational measurement practitioners to learn more about Bayesian networks from the research literature, to acquire readily available Bayesian network software, to perform studies with real and simulated data sets, and to look for opportunities in educational settings that may benefit from diagnostic assessment fueled by Bayesian network modeling.

Journal ArticleDOI
TL;DR: The Bayesian grey model was the most accurate among these models and can overcome the difficulties of a small sample set and ambiguous available information.

Journal ArticleDOI
TL;DR: A Bayesian approach to analyze a general structural equation model that accommodates the general nonlinear terms of latent variables and covariates is introduced and produces a Bayesian estimate that has the same statistical optimal properties as a maximum likelihood estimate.
Abstract: The analysis of interaction among latent variables has received much attention. This article introduces a Bayesian approach to analyze a general structural equation model that accommodates the general nonlinear terms of latent variables and covariates. This approach produces a Bayesian estimate that has the same statistical optimal properties as a maximum likelihood estimate. Other advantages over the traditional approaches are discussed. More important, we demonstrate through examples how to use the freely available software WinBUGS to obtain Bayesian results for estimation and model comparison. Simulation studies are conducted to assess the empirical performances of the approach for situations with various sample sizes and prior inputs.

Proceedings ArticleDOI
12 Aug 2007
TL;DR: A technique for reducing the wireless cost of tracking mobile users with uncertain parameters is developed in this paper and a novel hybrid Bayesian neural network model for predicting locations on Cellular Networks is presented.
Abstract: Nowadays, path prediction is being extensively examined for use in the context of mobile and wireless computing towards more efficient network resource management schemes. Path prediction allows the network and services to further enhance the quality of service levels that the user enjoys. In this paper, we present a path prediction algorithm that exploits human habits: a novel hybrid Bayesian neural network model for predicting locations on cellular networks (it can also be extended to other wireless networks such as Wi-Fi and WiMAX). We investigate different parallel implementation techniques on mobile devices for the proposed approach and compare it to many standard neural network techniques, such as back-propagation, Elman, resilient, Levenberg-Marquardt, and one-step secant models. In our experiments, we compare results of the proposed Bayesian neural network with five standard neural network techniques in predicting both the next location and the next service to request. Bayesian learning for neural networks predicts both location and service better than standard neural network techniques since it uses a well-founded probability model to represent uncertainty about the relationships being learned. The result of Bayesian training is a posterior distribution over network weights. We use Markov chain Monte Carlo (MCMC) methods to sample N values from the posterior weight distribution; these N samples vote for the best prediction. Simulations of the algorithm, performed using realistic mobility patterns, show increased prediction accuracy.
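The MCMC "voting" idea can be sketched on a toy model: a one-weight logistic regression with a random-walk Metropolis sampler, where each sampled weight votes on the class of a new input. All data and settings here are assumptions for illustration, not the paper's model:

```python
import math
import random

def log_post(w, xs, ys):
    """Gaussian N(0, 10^2) prior plus Bernoulli likelihood (logistic link)."""
    lp = -w * w / (2 * 10.0 ** 2)
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-w * x))
        lp += math.log(p if y else 1.0 - p)
    return lp

rng = random.Random(1)
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]                   # positive x -> class 1

w, samples = 0.0, []
for _ in range(5000):
    prop = w + rng.gauss(0.0, 0.5)        # random-walk proposal
    if math.log(rng.random()) < log_post(prop, xs, ys) - log_post(w, xs, ys):
        w = prop                          # Metropolis accept
    samples.append(w)

# Each post-burn-in weight sample votes on the class of a new point x = 1.5.
votes = sum(1 for w in samples[1000:] if 1.0 / (1.0 + math.exp(-w * 1.5)) > 0.5)
frac = votes / len(samples[1000:])
```

Averaging votes over posterior samples, rather than using a single trained weight vector, is what lets the Bayesian approach express uncertainty about the learned relationship.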

Journal ArticleDOI
TL;DR: This paper considers the Bayesian analysis of dyadic data with particular emphasis on applications in social psychology, and proposes a class of models where a single value is elicited to complete the prior specification.
Abstract: This paper considers the Bayesian analysis of dyadic data with particular emphasis on applications in social psychology. Various existing models are extended and unified under a class of models where a single value is elicited to complete the prior specification. Certain situations which have sometimes been problematic (e.g. incomplete data, non-standard covariates, missing data, unbalanced data) are easily handled under the proposed class of Bayesian models. Inference is straightforward using software that is based on Markov chain Monte Carlo methods. Examples are provided which highlight the variety of data sets that can be entertained and the ease with which they can now be analyzed.

Book ChapterDOI
11 Jul 2007
TL;DR: Dynamic Bayesian Networks are used to build a stochastic reliability model that relies on standard models of software architecture, and does not require implementation-level artifacts.
Abstract: Modern society relies heavily on complex software systems for everyday activities. Dependability of these systems thus has become a critical feature that determines which products are going to be successfully and widely adopted. In this paper, we present an approach to modeling reliability of software systems at the architectural level. Dynamic Bayesian Networks are used to build a stochastic reliability model that relies on standard models of software architecture, and does not require implementation-level artifacts. Reliability values obtained via this approach can aid the architect in evaluating design alternatives. The approach is evaluated using sensitivity and uncertainty analysis.

01 Jan 2007
TL;DR: This chapter discusses Bayesian networks and logic programs, Bayesian logic programs and extensions of the basic framework, learning Bayesian logic programs, and Balios, the engine for Bayesian logic programs.
Abstract: This chapter contains sections titled: Introduction, On Bayesian Networks and Logic Programs, Bayesian Logic Programs, Extensions of the Basic Framework, Learning Bayesian Logic Programs, Balios - The Engine for Bayesian Logic Programs, Related Work, Conclusions, Acknowledgments, References

Journal ArticleDOI
TL;DR: This work shows how to infer kth order Markov chains, for arbitrary k, from finite data by applying Bayesian methods to both parameter estimation and model-order selection and establishes a direct relationship between Bayesian evidence and the partition function.
Abstract: Markov chains are a natural and well understood tool for describing one-dimensional patterns in time or space. We show how to infer kth order Markov chains, for arbitrary k, from finite data by applying Bayesian methods to both parameter estimation and model-order selection. Extending existing results for multinomial models of discrete data, we connect inference to statistical mechanics through information-theoretic (type theory) techniques. We establish a direct relationship between Bayesian evidence and the partition function which allows for straightforward calculation of the expectation and variance of the conditional relative entropy and the source entropy rate. Finally, we introduce a method that uses finite data-size scaling with model-order comparison to infer the structure of out-of-class processes.
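The closed-form evidence used for model-order selection follows from placing a Dirichlet prior on each context's transition probabilities. A sketch assuming uniform Dirichlet(1) priors (illustrative, not the authors' code):

```python
import math
from collections import Counter

def log_evidence(seq, k, alphabet):
    """Log marginal likelihood of seq under a k-th order Markov model."""
    counts, totals = Counter(), Counter()
    for i in range(k, len(seq)):
        ctx = seq[i - k:i]                 # length-k context
        counts[(ctx, seq[i])] += 1
        totals[ctx] += 1
    s = len(alphabet)
    logp = 0.0
    for ctx, n in totals.items():
        # Dirichlet(1,...,1)-multinomial marginal likelihood per context
        logp += math.lgamma(s) - math.lgamma(s + n)
        for a in alphabet:
            logp += math.lgamma(1 + counts[(ctx, a)]) - math.lgamma(1)
    return logp

# A strongly period-2 sequence should favour order 1 over order 0.
seq = "ababababababababababababab"
e0 = log_evidence(seq, 0, "ab")
e1 = log_evidence(seq, 1, "ab")
```

Comparing `log_evidence` across k implements Bayesian model-order selection directly: the marginal likelihood automatically penalizes orders whose extra contexts are not supported by the data.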

Journal ArticleDOI
TL;DR: In this paper, the authors consider models based on multivariate counting processes, including multi-state models, and show that the likelihood-based cross-validation (LCV) is a nearly unbiased estimator of it.
Abstract: We consider models based on multivariate counting processes, including multi-state models. These models are specified semi-parametrically by a set of functions and real parameters. We consider inference for these models based on coarsened observations, focusing on families of smooth estimators such as produced by penalized likelihood. An important issue is the choice of model structure, for instance, the choice between a Markov and some non-Markov models. We define in a general context the expected Kullback–Leibler criterion and we show that the likelihood-based cross-validation (LCV) is a nearly unbiased estimator of it. We give a general form of an approximation of the leave-one-out LCV. The approach is studied by simulations, and it is illustrated by estimating a Markov and two semi-Markov illness–death models, with an application to dementia using data from a large cohort study.