
Showing papers in "Journal of the American Statistical Association in 2008"


Journal ArticleDOI
TL;DR: The Lasso estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters have independent Laplace (i.e., double-exponential) priors.
Abstract: The Lasso estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters have independent Laplace (i.e., double-exponential) priors. Gibbs sampling from this posterior is possible using an expanded hierarchy with conjugate normal priors for the regression parameters and independent exponential priors on their variances. A connection with the inverse-Gaussian distribution provides tractable full conditional distributions. The Bayesian Lasso provides interval estimates (Bayesian credible intervals) that can guide variable selection. Moreover, the structure of the hierarchical model provides both Bayesian and likelihood methods for selecting the Lasso parameter. Slight modifications lead to Bayesian versions of other Lasso-related estimation methods, including bridge regression and a robust variant.

2,897 citations
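
The expanded hierarchy described in the abstract leads to a short Gibbs sampler. The sketch below is a minimal implementation assuming the standard Park–Casella full conditionals, a fixed Lasso parameter `lam`, centered `y`, and standardized `X`; the function name and defaults are illustrative, not taken from the paper.

```python
import numpy as np

def bayesian_lasso_gibbs(X, y, lam=1.0, n_iter=2000, seed=0):
    """Minimal Gibbs sampler sketch for the Bayesian Lasso hierarchy:
    beta ~ N(0, sigma^2 * diag(tau2)), tau2_j ~ Exp(lam^2 / 2)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    tau2 = np.ones(p)
    sigma2 = 1.0
    draws = []
    for _ in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma^2 A^{-1}), with A = X'X + diag(1/tau2)
        A_inv = np.linalg.inv(XtX + np.diag(1.0 / tau2))
        beta = rng.multivariate_normal(A_inv @ Xty, sigma2 * A_inv)
        # 1/tau2_j | rest ~ Inverse-Gaussian(mean = sqrt(lam^2 sigma^2 / beta_j^2), shape = lam^2)
        inv_tau2 = rng.wald(np.sqrt(lam**2 * sigma2 / beta**2), lam**2)
        tau2 = 1.0 / inv_tau2
        # sigma^2 | rest ~ Inverse-Gamma((n - 1 + p)/2, ||y - X beta||^2/2 + beta' D^{-1} beta / 2)
        resid = y - X @ beta
        scale = resid @ resid / 2.0 + (beta**2 / tau2).sum() / 2.0
        sigma2 = scale / rng.gamma((n - 1 + p) / 2.0)
        draws.append(beta.copy())
    return np.array(draws)
```

Posterior medians of the saved draws give point estimates, and empirical quantiles give the Bayesian credible intervals mentioned in the abstract.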


Journal ArticleDOI
TL;DR: This article presents a de novo analysis of these experiments, focusing on two core issues that have received limited attention in previous analyses: treatment effect heterogeneity by gender and overrejection of the null hypothesis due to multiple inference.
Abstract: The view that the returns to educational investments are highest for early childhood interventions is widely held and stems primarily from several influential randomized trials—Abecedarian, Perry, and the Early Training Project—that point to super-normal returns to early interventions. This article presents a de novo analysis of these experiments, focusing on two core issues that have received limited attention in previous analyses: treatment effect heterogeneity by gender and overrejection of the null hypothesis due to multiple inference. To address the latter issue, a statistical framework that combines summary index tests with familywise error rate and false discovery rate corrections is implemented. The first technique reduces the number of tests conducted; the latter two techniques adjust the p values for multiple inference. The primary finding of the reanalysis is that girls garnered substantial short- and long-term benefits from the interventions, but there were no significant long-term benefits fo...

1,450 citations
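
The multiple-inference corrections mentioned above (familywise error rate and false discovery rate adjustments) can be illustrated with a few lines of NumPy; the Holm and Benjamini–Hochberg procedures below are standard textbook versions, not code from the article.

```python
import numpy as np

def holm_adjust(p):
    """Holm step-down adjusted p values (controls the familywise error rate)."""
    p = np.asarray(p, dtype=float)
    m = len(p)
    order = np.argsort(p)
    adj = np.empty(m)
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * p[idx])
        adj[idx] = min(1.0, running_max)
    return adj

def bh_adjust(p):
    """Benjamini-Hochberg step-up adjusted p values (controls the false discovery rate)."""
    p = np.asarray(p, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity from the largest p value downward
    adj_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    adj = np.empty(m)
    adj[order] = np.minimum(adj_sorted, 1.0)
    return adj

# toy example: raw p values from several outcome tests within one family
pvals = [0.003, 0.020, 0.045, 0.300, 0.800]
print(holm_adjust(pvals))
print(bh_adjust(pvals))
```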


Journal ArticleDOI
TL;DR: In this paper, a mixture of g priors is proposed as an alternative to the default g prior, which resolves many of the problems with the original formulation while maintaining the computational tractability of the g prior.
Abstract: Zellner's g prior remains a popular conventional prior for use in Bayesian variable selection, despite several undesirable consistency issues. In this article we study mixtures of g priors as an alternative to default g priors that resolve many of the problems with the original formulation while maintaining the computational tractability that has made the g prior so popular. We present theoretical properties of the mixture g priors and provide real and simulated examples to compare the mixture formulation with fixed g priors, empirical Bayes approaches, and other default procedures. Please see Arnold Zellner's letter and the author's response.

1,115 citations
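
For reference, the conventional g prior and one of the mixtures studied by the authors (the hyper-g prior) can be written as follows; the notation is a standard rendering and may differ slightly from the article's.

```latex
% Zellner's g prior on the coefficients of model M_gamma
\beta_\gamma \mid g, \sigma^2, M_\gamma \sim
  \mathrm{N}\!\left(0,\; g\,\sigma^2 \left(X_\gamma^\top X_\gamma\right)^{-1}\right)

% hyper-g mixture: a proper prior on g in place of a fixed value
\pi(g) = \frac{a-2}{2}\,(1+g)^{-a/2}, \qquad g > 0,\; a > 2
```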


Journal ArticleDOI
TL;DR: This work focuses on combining observations from field experiments with detailed computer simulations of a physical process to carry out statistical inference, and makes use of basis representations to reduce the dimensionality of the problem and speed up the computations required for exploring the posterior distribution.
Abstract: This work focuses on combining observations from field experiments with detailed computer simulations of a physical process to carry out statistical inference. Of particular interest here is determining uncertainty in resulting predictions. This typically involves calibration of parameters in the computer simulator as well as accounting for inadequate physics in the simulator. The problem is complicated by the fact that simulation code is sufficiently demanding that only a limited number of simulations can be carried out. We consider applications in characterizing material properties for which the field data and the simulator output are highly multivariate. For example, the experimental data and simulation output may be an image or may describe the shape of a physical object. We make use of the basic framework of Kennedy and O'Hagan. However, the size and multivariate nature of the data lead to computational challenges in implementing the framework. To overcome these challenges, we make use of basis repre...

838 citations
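
The basic Kennedy–O'Hagan framework referred to above, together with the basis reduction used to handle highly multivariate output, can be summarized schematically; the symbols and truncation levels below are generic placeholders rather than the article's exact notation.

```latex
% field observation = simulator at the calibration value + discrepancy + noise
y(x) = \eta(x, \theta) + \delta(x) + \varepsilon,
  \qquad \varepsilon \sim \mathrm{N}(0, \sigma_y^2)

% multivariate simulator output and discrepancy reduced to low-dimensional bases
% (e.g., principal components of the simulation ensemble)
\eta(x, t) \approx \sum_{i=1}^{p_\eta} k_i\, w_i(x, t),
  \qquad
\delta(x) \approx \sum_{j=1}^{p_\delta} d_j\, v_j(x)
% with Gaussian-process priors on the basis weights w_i and v_j
```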


Journal ArticleDOI
TL;DR: This article considers a population of groups of individuals where interference is possible between individuals within the same group, and proposes estimands for direct, indirect, total, and overall causal effects of treatment strategies in this setting.
Abstract: A fundamental assumption usually made in causal inference is that of no interference between individuals (or units); that is, the potential outcomes of one individual are assumed to be unaffected by the treatment assignment of other individuals. However, in many settings, this assumption obviously does not hold. For example, in the dependent happenings of infectious diseases, whether one person becomes infected depends on who else in the population is vaccinated. In this article, we consider a population of groups of individuals where interference is possible between individuals within the same group. We propose estimands for direct, indirect, total, and overall causal effects of treatment strategies in this setting. Relations among the estimands are established; for example, the total causal effect is shown to equal the sum of direct and indirect causal effects. Using an experimental design with a two-stage randomization procedure (first at the group level, then at the individual level within groups), unbiased estimators of the proposed estimands are presented. Variances of the estimators are also developed. The methodology is illustrated in two different settings where interference is likely: assessing causal effects of housing vouchers and of vaccines.

684 citations
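
The relation among the estimands mentioned in the abstract can be written compactly. The rendering below indexes group-level allocation strategies by psi and phi and uses one common sign convention; it is a schematic summary, not the article's exact notation.

```latex
% direct effect of treatment under allocation strategy phi
\mathrm{CE}^{D}(\phi) = \overline{Y}(0;\phi) - \overline{Y}(1;\phi)

% indirect (spillover) effect on untreated individuals of strategy psi versus phi
\mathrm{CE}^{I}(\psi,\phi) = \overline{Y}(0;\psi) - \overline{Y}(0;\phi)

% total effect equals the sum of the indirect and direct effects
\mathrm{CE}^{T}(\psi,\phi) = \overline{Y}(0;\psi) - \overline{Y}(1;\phi)
                           = \mathrm{CE}^{I}(\psi,\phi) + \mathrm{CE}^{D}(\phi)
```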


Journal ArticleDOI
TL;DR: A systematic examination of a real network data set using maximum likelihood estimation for exponential random graph models as well as new procedures to evaluate how well the models fit the observed networks concludes that these models capture aspects of the social structure of adolescent friendship relations not represented by previous models.
Abstract: We present a systematic examination of a real network data set using maximum likelihood estimation for exponential random graph models as well as new procedures to evaluate how well the models fit the observed networks. These procedures compare structural statistics of the observed network with the corresponding statistics on networks simulated from the fitted model. We apply this approach to the study of friendship relations among high school students from the National Longitudinal Study of Adolescent Health (AddHealth). We focus primarily on one particular network of 205 nodes, although we also demonstrate that this method may be applied to the largest network in the AddHealth study, with 2,209 nodes. We argue that several well-studied models in the networks literature do not fit these data well and demonstrate that the fit improves dramatically when the models include the recently developed geometrically weighted edgewise shared partner, geometrically weighted dyadic shared partner, and geometrically w...

657 citations



Journal ArticleDOI
TL;DR: A review of the book Introduction to Variance Estimation.
Abstract: (2008). Introduction to Variance Estimation. Journal of the American Statistical Association: Vol. 103, No. 483, pp. 1324-1325.

562 citations


Journal ArticleDOI
TL;DR: These case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers.
Abstract: We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived “factors” as representing biological “subpathway” structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple co...

559 citations


Journal ArticleDOI
TL;DR: Motivated by a computer experiment for the design of a rocket booster, this article presents nonstationary modeling methodology that couples stationary Gaussian processes with treed partitioning.
Abstract: Motivated by a computer experiment for the design of a rocket booster, this article explores nonstationary modeling methodologies that couple stationary Gaussian processes with treed partitioning. Partitioning is a simple but effective method for dealing with nonstationarity. The methodological developments and statistical computing details that make this approach efficient are described in detail. In addition to providing an analysis of the rocket booster simulator, we show that our approach is effective in other arenas as well.

540 citations


Journal ArticleDOI
TL;DR: Bilder and Tebbs review An Introduction to Categorical Data Analysis (2nd ed.) by Alan Agresti.
Abstract: Bilder and Tebbs review An Introduction to Categorical Data Analysis (2nd ed.) by Alan Agresti.

Journal ArticleDOI
TL;DR: Focusing on the particular case of the Matérn class of covariance functions, this article gives conditions under which estimators maximizing the tapering approximations are, like the maximum likelihood estimator, strongly consistent.
Abstract: Maximum likelihood is an attractive method of estimating covariance parameters in spatial models based on Gaussian processes. But calculating the likelihood can be computationally infeasible for large data sets, requiring O(n³) calculations for a data set with n observations. This article proposes the method of covariance tapering to approximate the likelihood in this setting. In this approach, covariance matrices are “tapered,” or multiplied elementwise by a sparse correlation matrix. The resulting matrices can then be manipulated using efficient sparse matrix algorithms. We propose two approximations to the Gaussian likelihood using tapering. One of these approximations simply replaces the model covariance with a tapered version, whereas the other is motivated by the theory of unbiased estimating equations. Focusing on the particular case of the Matérn class of covariance functions, we give conditions under which estimators maximizing the tapering approximations are, like the maximum likelihood estimat...
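
The "one-taper" approximation described above (replace the model covariance by its elementwise product with a compactly supported correlation) is easy to prototype with sparse matrices. The sketch below uses an exponential covariance and a Wendland-type taper as stand-ins; it is not the authors' code, and the second, estimating-equation approximation is omitted.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy import sparse
from scipy.sparse.linalg import splu

def taper_wendland(d, gamma):
    """Compactly supported taper: exactly zero beyond the taper range gamma."""
    r = d / gamma
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def tapered_neg_loglik(params, coords, z, gamma):
    """Negative log-likelihood with the covariance replaced by its tapered version."""
    sigma2, phi = params                        # marginal variance and range
    d = cdist(coords, coords)
    cov = sigma2 * np.exp(-d / phi)             # placeholder: exponential covariance
    ct = sparse.csc_matrix(cov * taper_wendland(d, gamma))   # elementwise taper -> sparse
    lu = splu(ct)
    # log-determinant from the sparse LU factors (valid covariance assumed positive definite)
    logdet = np.sum(np.log(np.abs(lu.U.diagonal()))) + np.sum(np.log(np.abs(lu.L.diagonal())))
    quad = z @ lu.solve(z)
    return 0.5 * (logdet + quad + len(z) * np.log(2.0 * np.pi))

# usage: minimize tapered_neg_loglik over (sigma2, phi) with scipy.optimize.minimize
```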

Journal ArticleDOI
TL;DR: In this study, participants were randomly assigned to a randomized experiment or an otherwise identical nonrandomized experiment, to test whether covariate adjustment of the nonrandomized results can approximate the randomized benchmark.
Abstract: A key justification for using nonrandomized experiments is that, with proper adjustment, their results can well approximate results from randomized experiments. This hypothesis has not been consistently supported by empirical studies; however, previous methods used to study this hypothesis have confounded assignment method with other study features. To avoid these confounding factors, this study randomly assigned participants to be in a randomized experiment or a nonrandomized experiment. In the randomized experiment, participants were randomly assigned to mathematics or vocabulary training; in the nonrandomized experiment, participants chose their training. The study held all other features of the experiment constant; it carefully measured pretest variables that might predict the condition that participants chose, and all participants were measured on vocabulary and mathematics outcomes. Ordinary linear regression reduced bias in the nonrandomized experiment by 84–94% using covariate-adjusted randomized ...

Journal ArticleDOI
TL;DR: In this article, the problem of nonparametric modeling of these distributions, borrowing information across centers while also allowing centers to be clustered is addressed, and an efficient Markov chain Monte Carlo algorithm is developed for computation.
Abstract: In multicenter studies, subjects in different centers may have different outcome distributions. This article is motivated by the problem of nonparametric modeling of these distributions, borrowing information across centers while also allowing centers to be clustered. Starting with a stick-breaking representation of the Dirichlet process (DP), we replace the random atoms with random probability measures drawn from a DP. This results in a nested DP prior, which can be placed on the collection of distributions for the different centers, with centers drawn from the same DP component automatically clustered together. Theoretical properties are discussed, and an efficient Markov chain Monte Carlo algorithm is developed for computation. The methods are illustrated using a simulation study and an application to quality of care in U.S. hospitals.
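
In stick-breaking form, the nested Dirichlet process prior sketched above replaces the atoms of an outer DP with random measures drawn from an inner DP. The rendering below uses generic concentration parameters alpha and beta and a base measure H, and may differ in notation from the article.

```latex
% outer DP over distributions: each center j draws its distribution G_j from Q
G_j \sim Q, \qquad
Q = \sum_{k=1}^{\infty} \pi_k\, \delta_{G_k^{*}}, \qquad
\pi_k = u_k \prod_{l<k} (1-u_l), \; u_k \sim \mathrm{Beta}(1,\alpha)

% inner DP: each atom G_k^* is itself a discrete random probability measure
G_k^{*} = \sum_{l=1}^{\infty} w_{lk}\, \delta_{\theta_{lk}}, \qquad
w_{lk} = v_{lk} \prod_{m<l} (1-v_{mk}), \; v_{lk} \sim \mathrm{Beta}(1,\beta), \;
\theta_{lk} \sim H
```

Centers that draw the same atom G_k* share an outcome distribution, which is what produces the automatic clustering of centers described in the abstract.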

Journal ArticleDOI
TL;DR: In this article, a new quantile regression approach for survival data subject to conditionally independent censoring is proposed, which leads to a simple algorithm that involves minimizations only of L1-type convex functions.
Abstract: Quantile regression offers great flexibility in assessing covariate effects on event times, thereby attracting considerable interests in its applications in survival analysis. But currently available methods often require stringent assumptions or complex algorithms. In this article we develop a new quantile regression approach for survival data subject to conditionally independent censoring. The proposed martingale-based estimating equations naturally lead to a simple algorithm that involves minimizations only of L1-type convex functions. We establish uniform consistency and weak convergence of the resultant estimators. We develop inferences accordingly, including hypothesis testing, second-stage inference, and model diagnostics. We evaluate the finite-sample performance of the proposed methods through extensive simulation studies. An analysis of a recent dialysis study illustrates the practical utility of our proposals.

Journal ArticleDOI
TL;DR: A regularized estimation procedure for variable selection that combines basis function approximations and the smoothly clipped absolute deviation penalty and establishes the theoretical properties of the procedure, including consistency in variable selection and the oracle property in estimation.
Abstract: Nonparametric varying-coefficient models are commonly used for analyzing data measured repeatedly over time, including longitudinal and functional response data. Although many procedures have been developed for estimating varying coefficients, the problem of variable selection for such models has not been addressed to date. In this article we present a regularized estimation procedure for variable selection that combines basis function approximations and the smoothly clipped absolute deviation penalty. The proposed procedure simultaneously selects significant variables with time-varying effects and estimates the nonzero smooth coefficient functions. Under suitable conditions, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. Here the oracle property means that the asymptotic distribution of an estimated coefficient function is the same as that when it is known a priori which variables are in the model. The method is...

Journal ArticleDOI
TL;DR: Three basic tools (marginalization, permutation, and trimming) are introduced that allow us to transform a Gibbs sampler into a partially collapsed Gibbs sampler with known stationary distribution and faster convergence.
Abstract: Ever-increasing computational power, along with ever–more sophisticated statistical computing techniques, is making it possible to fit ever–more complex statistical models. Among the more computationally intensive methods, the Gibbs sampler is popular because of its simplicity and power to effectively generate samples from a high-dimensional probability distribution. Despite its simple implementation and description, however, the Gibbs sampler is criticized for its sometimes slow convergence, especially when it is used to fit highly structured complex models. Here we present partially collapsed Gibbs sampling strategies that improve the convergence by capitalizing on a set of functionally incompatible conditional distributions. Such incompatibility generally is avoided in the construction of a Gibbs sampler, because the resulting convergence properties are not well understood. We introduce three basic tools (marginalization, permutation, and trimming) that allow us to transform a Gibbs sampler into a part...

Journal ArticleDOI
TL;DR: Simulated examples, as well as an application to a real cancer microarray data set, show that the proposed SigClust method works remarkably well for assessing significance of clustering.
Abstract: Clustering methods provide a powerful tool for the exploratory analysis of high-dimension, low–sample size (HDLSS) data sets, such as gene expression microarray data. A fundamental statistical issue in clustering is which clusters are “really there,” as opposed to being artifacts of the natural sampling variation. We propose SigClust as a simple and natural approach to this fundamental statistical problem. In particular, we define a cluster as data coming from a single Gaussian distribution and formulate the problem of assessing statistical significance of clustering as a testing procedure. This Gaussian null assumption allows direct formulation of p values that effectively quantify the significance of a given clustering. HDLSS covariance estimation for SigClust is achieved by a combination of invariance principles, together with a factor analysis model. The properties of SigClust are studied. Simulated examples, as well as an application to a real cancer microarray data set, show that the proposed method...
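
The testing idea behind SigClust (compare a cluster index computed on the data with its null distribution under a single Gaussian) can be prototyped as below. For simplicity the sketch uses the sample eigenvalues directly as the null covariance spectrum rather than the invariance-plus-factor-analysis estimator described in the abstract, so it only illustrates the logic.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_index(x):
    """2-means cluster index: within-cluster sum of squares / total sum of squares."""
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(x)
    total_ss = np.sum((x - x.mean(axis=0)) ** 2)
    return km.inertia_ / total_ss

def sigclust_pvalue(x, n_sim=100, seed=0):
    """Monte Carlo p value for the null that the data come from one Gaussian."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    ci_obs = cluster_index(x)
    # null covariance spectrum: sample eigenvalues (simplified; the paper uses an
    # HDLSS factor-analysis estimator instead)
    eigvals = np.clip(np.linalg.eigvalsh(np.cov(x, rowvar=False)), 1e-12, None)
    ci_null = np.empty(n_sim)
    for b in range(n_sim):
        sim = rng.normal(size=(n, d)) * np.sqrt(eigvals)  # diagonal null suffices:
        ci_null[b] = cluster_index(sim)                   # the index is rotation invariant
    return np.mean(ci_null <= ci_obs)
```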

Journal ArticleDOI
TL;DR: An efficient optimization algorithm is developed that is fast and always converges to a local minimum and it is proved that the SCAD estimator still has the oracle property on high-dimensional problems.
Abstract: The smoothly clipped absolute deviation (SCAD) estimator, proposed by Fan and Li, has many desirable properties, including continuity, sparsity, and unbiasedness. The SCAD estimator also has the (asymptotically) oracle property when the dimension of covariates is fixed or diverges more slowly than the sample size. In this article we study the SCAD estimator in high-dimensional settings where the dimension of covariates can be much larger than the sample size. First, we develop an efficient optimization algorithm that is fast and always converges to a local minimum. Second, we prove that the SCAD estimator still has the oracle property on high-dimensional problems. We perform numerical studies to compare the SCAD estimator with the LASSO and SIS–SCAD estimators in terms of prediction accuracy and variable selectivity when the true model is sparse. Through the simulation, we show that the variance estimator of Fan and Li still works well for some limited high-dimensional cases where the true nonzero coeffic...
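
For reference, the SCAD penalty of Fan and Li discussed in the abstract is usually defined through its derivative, with a = 3.7 as the conventional choice of tuning constant.

```latex
% SCAD penalty, defined through its derivative for beta >= 0
p_\lambda'(\beta) = \lambda \left\{ I(\beta \le \lambda)
  + \frac{(a\lambda - \beta)_+}{(a-1)\lambda}\, I(\beta > \lambda) \right\},
  \qquad a > 2 \ (\text{commonly } a = 3.7)
```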

Journal ArticleDOI
TL;DR: In this paper, an alternative estimation method based on the expectation-maximization algorithm was proposed to estimate declustered background seismicity rates of geologically distinct regions in Southern California.
Abstract: Maximum likelihood estimation of branching point process models via numerical optimization procedures can be unstable and computationally intensive. We explore an alternative estimation method based on the expectation-maximization algorithm. The method involves viewing the estimation of such branching processes as analogous to incomplete data problems. Using an application from seismology, we show how the epidemic-type aftershock sequence (ETAS) model can, in fact, be estimated this way, and we propose a computationally efficient procedure to maximize the expected complete data log-likelihood function. Using a space–time ETAS model, we demonstrate that this method is extremely robust and accurate and use it to estimate declustered background seismicity rates of geologically distinct regions in Southern California. All regions show similar declustered background intensity estimates except for the one covering the southern section of the San Andreas fault system to the east of San Diego in which a substanti...
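
The EM view described above treats the unknown branching structure (which earlier event, if any, triggered each aftershock) as missing data. One common parameterization of the space-time ETAS conditional intensity and the resulting E-step probabilities is sketched below; the exact kernels used in the article may differ.

```latex
% triggering contribution of an earlier event i with magnitude m_i
\nu_i(t, x, y) = K_0\, e^{\alpha (m_i - M_0)}\, (t - t_i + c)^{-p}\, g(x - x_i,\, y - y_i)

% space-time ETAS conditional intensity: background rate plus triggered events
\lambda(t, x, y) = \mu(x, y) + \sum_{i:\, t_i < t} \nu_i(t, x, y)

% E-step: probability that event j was triggered by event i, or is a background event
\rho_{ij} = \frac{\nu_i(t_j, x_j, y_j)}{\lambda(t_j, x_j, y_j)},
\qquad
\rho_{j0} = \frac{\mu(x_j, y_j)}{\lambda(t_j, x_j, y_j)}
```

Summing the background probabilities rho_{j0} over events in a region is the basis for the declustered background rate estimates mentioned in the abstract.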

Journal ArticleDOI
TL;DR: In this article, the authors propose to replace the linearity assumption by an additive structure, leading to a more widely applicable and much more flexible framework for functional regression models, which is suitable for both scalar and functional responses.
Abstract: In commonly used functional regression models, the regression of a scalar or functional response on the functional predictor is assumed to be linear. This means that the response is a linear function of the functional principal component scores of the predictor process. We relax the linearity assumption and propose to replace it by an additive structure, leading to a more widely applicable and much more flexible framework for functional regression models. The proposed functional additive regression models are suitable for both scalar and functional responses. The regularization needed for effective estimation of the regression parameter function is implemented through a projection on the eigenbasis of the covariance operator of the functional components in the model. The use of functional principal components in an additive rather than linear way leads to substantial broadening of the scope of functional regression models and emerges as a natural approach, because the uncorrelatedness of the functional pr...
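
Schematically, for a scalar response, the additive relaxation described above replaces the linear functional of the principal component scores by a sum of smooth univariate functions; the notation below is a generic rendering.

```latex
% functional linear model: response is linear in the FPC scores xi_k of the predictor X
E[Y \mid X] = \mu_Y + \sum_{k=1}^{\infty} \beta_k\, \xi_k

% functional additive model: each score enters through a smooth function f_k
E[Y \mid X] = \mu_Y + \sum_{k=1}^{\infty} f_k(\xi_k), \qquad E[f_k(\xi_k)] = 0
```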

Journal ArticleDOI
TL;DR: A review of the book Stochastic Switching Systems: Analysis and Design.
Abstract: (2008). Stochastic Switching Systems: Analysis and Design. Journal of the American Statistical Association: Vol. 103, No. 481, pp. 430-430.

Posted Content
TL;DR: Modern Experimental Design is a must-have reference for anyone who will be designing experiments or for statisticians interested in remaining on the leading edge of this important area.
Abstract: describes some of the aspects of analysis for designs where multiple responses are collected. Because most experiments have this feature, understanding the opportunities and challenges for this situation is essential reading for practitioners. Chapter 13 is a collection of short sections on a number of other specialized designs, including screening, equileverage, optimal, space-filling, trend-free, and mixture designs. Chapter 14, “Tying It All Together,” briefly discusses the difficult task of choosing between different designs when planning an experiment. This chapter reinforces my belief that learning this aspect of design of experiments is most challenging, both because of how we teach design (cleanly compartmentalized into tidy chunks with questions posed to fit precisely into a category) and because of the ever-increasing breadth of tools available. The exercises at the conclusion of this chapter help the reader gain experience with design selection and pose many thought-provoking questions that will challenge even the most seasoned statistician. Already an extensive volume on the subject, this book contains a wealth of information. However, on my wish list for additional topics would be more discussion about the different roles of experimentation, from exploration, to screening for important factors, to response surface methods for characterizing and optimizing the relationship between the key factors and the response, to confirmatory experiments near the chosen optimum. Matching types of designs to their intended purposes is another area that is difficult for those studying design of experiment, and direct discussion can greatly help accelerate this understanding. In addition, it would be beneficial to present some of the criteria and graphical tools that are available to compare different potential designs. This could help formalize the numerous trade-offs between the many aspects of a good design. Overall, Modern Experimental Design is a must-have reference for anyone who will be designing experiments or for statisticians interested in remaining on the leading edge of this important area. I thoroughly enjoyed reading this well-written and comprehensive book, both for the careful and clear synthesis of the new research in this area and for the many insightful comments that help connect the details of the methods to the big picture.

Journal ArticleDOI
TL;DR: In this paper, the authors present a statistical methodology to quantify whether climate models are indeed unbiased and whether and where model biases are correlated across models, and they consider the simulated mean state and the simulated trend over the period 1970-1999 for Northern Hemisphere summer and winter temperature.
Abstract: A limited number of complex numerical models that simulate the Earth's atmosphere, ocean, and land processes are the primary tool to study how climate may change over the next century due to anthropogenic emissions of greenhouse gases. A standard assumption is that these climate models are random samples from a distribution of possible models centered around the true climate. This implies that agreement with observations and the predictive skill of climate models will improve as more models are added to an average of the models. In this article we present a statistical methodology to quantify whether climate models are indeed unbiased and whether and where model biases are correlated across models. We consider the simulated mean state and the simulated trend over the period 1970–1999 for Northern Hemisphere summer and winter temperature. The key to the statistical analysis is a spatial model for the bias of each climate model and the use of kernel smoothing to estimate the correlations of biases across di...

Journal ArticleDOI
TL;DR: Order-space Markov chain Monte Carlo, equi-energy sampling, importance weighting, and stream-based computation are combined to create a fast algorithm for learning causal Bayesian network structures.
Abstract: We propose a method for the computational inference of directed acyclic graphical structures given data from experimental interventions. Order-space Markov chain Monte Carlo, equi-energy sampling, importance weighting, and stream-based computation are combined to create a fast algorithm for learning causal Bayesian network structures.

Journal ArticleDOI
TL;DR: This article proposes parameter estimation methods for ordinary differential equation (ODE) models based on the local smoothing approach and a pseudo–least squares (PsLS) principle under a framework of measurement error in regression models and compares their finite-sample performances via simulation studies.
Abstract: Differential equation (DE) models are widely used in many scientific fields, including engineering, physics, and biomedical sciences. The so-called “forward problem,” the problem of simulations and predictions of state variables for given parameter values in the DE models, has been extensively studied by mathematicians, physicists, engineers, and other scientists. However, the “inverse problem,” the problem of parameter estimation based on the measurements of output variables, has not been well explored using modern statistical methods, although some least squares–based approaches have been proposed and studied. In this article we propose parameter estimation methods for ordinary differential equation (ODE) models based on the local smoothing approach and a pseudo–least squares (PsLS) principle under a framework of measurement error in regression models. The asymptotic properties of the proposed PsLS estimator are established. We also compare the PsLS method to the corresponding simulation-extrapolation (...
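
The local smoothing plus pseudo-least squares idea can be prototyped in a few lines: smooth the noisy trajectories, differentiate the smoother, and minimize the discrepancy between the estimated derivatives and the ODE right-hand side. The spline smoother and the logistic-growth example below are stand-ins chosen for illustration, not the settings used in the article.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.optimize import minimize

def pseudo_least_squares(t, y_noisy, rhs, theta0, smooth):
    """Estimate ODE parameters theta by matching d/dt of a smoother to rhs(x, theta)."""
    spline = UnivariateSpline(t, y_noisy, s=smooth)    # local smoothing step
    x_hat = spline(t)                                  # smoothed state estimate
    dx_hat = spline.derivative()(t)                    # estimated derivative

    def objective(theta):
        return np.sum((dx_hat - rhs(x_hat, theta)) ** 2)

    return minimize(objective, theta0, method="Nelder-Mead").x

# toy example: logistic growth dx/dt = r * x * (1 - x / K), theta = (r, K)
def logistic_rhs(x, theta):
    r, K = theta
    return r * x * (1.0 - x / K)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 60)
x_true = 10.0 / (1.0 + 9.0 * np.exp(-0.8 * t))         # closed-form solution, r = 0.8, K = 10
y = x_true + rng.normal(scale=0.2, size=t.size)
print(pseudo_least_squares(t, y, logistic_rhs, theta0=[0.5, 8.0], smooth=t.size * 0.2**2))
```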

Journal ArticleDOI
TL;DR: This article developed a measure of relative risk tolerance using responses to hypothetical income gambles in the Health and Retirement Study and showed how to construct a cardinal proxy for the risk tolerance of each survey respondent.
Abstract: Economic theory assigns a central role to risk preferences. This article develops a measure of relative risk tolerance using responses to hypothetical income gambles in the Health and Retirement Study. In contrast to most survey measures that produce an ordinal metric, this article shows how to construct a cardinal proxy for the risk tolerance of each survey respondent. The article also shows how to account for measurement error in estimating this proxy and how to obtain consistent regression estimates despite the measurement error. The risk tolerance proxy is shown to explain differences in asset allocation across households.

Journal ArticleDOI
TL;DR: In this paper, a hierarchical model for the frequency, type, and severity of claims is proposed, using microlevel data from a major insurance company in Singapore; claim frequency is modeled with a negative binomial regression and claim type with a multinomial logit model.
Abstract: This work describes statistical modeling of detailed, microlevel automobile insurance records. We consider 1993–2001 data from a major insurance company in Singapore. By detailed microlevel records, we mean experience at the individual vehicle level, including vehicle and driver characteristics, insurance coverage, and claims experience, by year. The claims experience consists of detailed information on the type of insurance claim, such as whether the claim is due to injury to a third party, property damage to a third party, or claims for damage to the insured, as well as the corresponding claim amount. We propose a hierarchical model for three components, corresponding to the frequency, type, and severity of claims. The first model is a negative binomial regression model for assessing claim frequency. The driver’s gender, age, and no claims discount, as well as vehicle age and type, turn out to be important variables for predicting the event of a claim. The second is a multinomial logit model to predict ...

Journal ArticleDOI
TL;DR: In this paper, the authors focus on the question of whether the inclusion of indicators of real economic activity lowers the prediction mean squared error of forecasting models of U.S. consumer price inflation and propose three variants of the bagging algorithm specifically designed for this type of forecasting problem.
Abstract: This article focuses on the widely studied question of whether the inclusion of indicators of real economic activity lowers the prediction mean squared error of forecasting models of U.S. consumer price inflation. We propose three variants of the bagging algorithm specifically designed for this type of forecasting problem and evaluate their empirical performance. Although bagging predictors in our application are clearly more accurate than equally weighted forecasts, median forecasts, ARM forecasts, AFTER forecasts, or Bayesian forecast averages based on one extra predictor at a time, they are generally about as accurate as the Bayesian shrinkage predictor, the ridge regression predictor, the iterated LASSO predictor, or the Bayesian model average predictor based on random subsets of extra predictors. Our results show that bagging can achieve large reductions in prediction mean-squared errors even in such challenging applications as inflation forecasting; however, bagging is not the only method capable of...
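
One family of bagging predictors in this literature applies a pre-test to a large set of extra predictors on each bootstrap sample and averages the resulting forecasts. The sketch below is a generic version of that idea (i.i.d. resampling, OLS with a |t| > c rule), not necessarily one of the three variants proposed in the article, which differ in details such as how the resampling respects serial dependence.

```python
import numpy as np

def bagged_pretest_forecast(X, y, x_new, c=1.96, n_boot=200, seed=0):
    """Average forecasts from bootstrap pre-test regressions of y on the columns of X."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    forecasts = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)               # bootstrap resample (i.i.d. for simplicity)
        Xb, yb = X[idx], y[idx]
        # full OLS fit and conventional t statistics
        beta, *_ = np.linalg.lstsq(Xb, yb, rcond=None)
        resid = yb - Xb @ beta
        sigma2 = resid @ resid / max(n - p, 1)
        se = np.sqrt(sigma2 * np.diag(np.linalg.pinv(Xb.T @ Xb)))
        keep = np.abs(beta / se) > c                    # pre-test: keep significant predictors
        if not keep.any():
            forecasts[b] = yb.mean()                    # fallback: unconditional mean forecast
            continue
        beta_k, *_ = np.linalg.lstsq(Xb[:, keep], yb, rcond=None)
        forecasts[b] = x_new[keep] @ beta_k             # forecast from the restricted model
    return forecasts.mean()
```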

Journal ArticleDOI
TL;DR: A model for estimating flight departure delay distributions required by air traffic congestion prediction models is developed, using a global optimization version of the Expectation Maximization algorithm, borrowing ideas from Genetic Algorithms.
Abstract: In this article we develop a model for estimating flight departure delay distributions required by air traffic congestion prediction models. We identify and study major factors that influence flight departure delays, and develop a strategic departure delay prediction model. This model employs nonparametric methods for daily and seasonal trends. In addition, the model uses a mixture distribution to estimate the residual errors. To overcome problems with local optima in the mixture distribution, we develop a global optimization version of the expectation–maximization algorithm, borrowing ideas from genetic algorithms. The model demonstrates reasonable goodness of fit, robustness to the choice of the model parameters, and good predictive capabilities. We use flight data from United Airlines and Denver International Airport from the years 2000/2001 to train and validate our model.