
Showing papers on "Linear model published in 2003"


Journal ArticleDOI
TL;DR: The review presents how the basic Bayesian framework must be constrained, the use of the step function in computing the probability that a team would rank best or worst in a league, and the implementation of a Dirichlet process prior.
Abstract: (2003). Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. Journal of the American Statistical Association: Vol. 98, No. 461, pp. 257-258.

4,086 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of estimating the break dates and the number of breaks in a linear model with multiple structural changes, and propose an efficient algorithm based on the principle of dynamic programming.
Abstract: In a recent paper, Bai and Perron (1998) considered theoretical issues related to the limiting distribution of estimators and test statistics in the linear model with multiple structural changes. In this companion paper, we consider practical issues for the empirical applications of the procedures. We first address the problem of estimation of the break dates and present an efficient algorithm to obtain global minimizers of the sum of squared residuals. This algorithm is based on the principle of dynamic programming and requires at most least-squares operations of order O(T²) for any number of breaks. Our method can be applied to both pure and partial structural change models. Second, we consider the problem of forming confidence intervals for the break dates under various hypotheses about the structure of the data and the errors across segments. Third, we address the issue of testing for structural changes under very general conditions on the data and the errors. Fourth, we address the issue of estimating the number of breaks. Finally, a few empirical applications are presented to illustrate the usefulness of the procedures. All methods discussed are implemented in a GAUSS program. Copyright © 2002 John Wiley & Sons, Ltd.

4,026 citations
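
The dynamic programming recursion behind the algorithm is compact enough to sketch: precompute the sum of squared residuals of every admissible segment (this is the O(T²) least-squares step), then build up the optimal partition for any number of breaks. A minimal Python sketch, using a pure mean-shift model for simplicity where the paper handles general regressions; all names and the minimum segment length are illustrative.

```python
import numpy as np

def segment_ssr(y, i, j):
    """SSR from fitting a constant mean on y[i..j] (a mean-shift model;
    a full least-squares regression per segment works the same way)."""
    seg = y[i:j + 1]
    return float(np.sum((seg - seg.mean()) ** 2))

def optimal_breaks(y, m, min_len=2):
    """Global minimizer of total SSR with m breaks, by dynamic programming."""
    T = len(y)
    # Precompute SSR of every admissible segment: the O(T^2) step.
    ssr = {(i, j): segment_ssr(y, i, j)
           for i in range(T) for j in range(i + min_len - 1, T)}
    # cost[k][t]: minimal SSR splitting y[0..t] into k+1 segments.
    cost = [[np.inf] * T for _ in range(m + 1)]
    back = [[-1] * T for _ in range(m + 1)]
    for t in range(min_len - 1, T):
        cost[0][t] = ssr[(0, t)]
    for k in range(1, m + 1):
        for t in range(T):
            for s in range(t):  # s is the end of the previous segment
                if cost[k - 1][s] < np.inf and (s + 1, t) in ssr:
                    c = cost[k - 1][s] + ssr[(s + 1, t)]
                    if c < cost[k][t]:
                        cost[k][t], back[k][t] = c, s
    breaks, t = [], T - 1          # recover break dates by backtracking
    for k in range(m, 0, -1):
        t = back[k][t]
        breaks.append(t)
    return sorted(breaks), cost[m][T - 1]

y = np.concatenate([np.zeros(50), 2 + np.zeros(50), np.zeros(50)])
y += np.random.default_rng(0).normal(0, 0.3, 150)
print(optimal_breaks(y, 2))        # break dates near 49 and 99
```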


Journal ArticleDOI
TL;DR: Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio.
Abstract: Cross-sectional studies with binary outcomes analyzed by logistic regression are frequent in the epidemiological literature. However, the odds ratio can importantly overestimate the prevalence ratio, the measure of choice in these studies. Also, controlling for confounding is not equivalent for the two measures. In this paper we explore alternatives for modeling data of such studies with techniques that directly estimate the prevalence ratio. We compared Cox regression with constant time at risk, Poisson regression and log-binomial regression against the standard Mantel-Haenszel estimators. Models with robust variance estimators in Cox and Poisson regressions and variance corrected by the scale parameter in Poisson regression were also evaluated. Three outcomes, from a cross-sectional study carried out in Pelotas, Brazil, with different levels of prevalence were explored: weight-for-age deficit (4%), asthma (31%) and mother in a paid job (52%). Unadjusted Cox/Poisson regression and Poisson regression with scale parameter adjusted by deviance performed worst in terms of interval estimates. Poisson regression with scale parameter adjusted by χ2 showed variable performance depending on the outcome prevalence. Cox/Poisson regression with robust variance, and log-binomial regression performed equally well when the model was correctly specified. Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio. However, precautions are needed to avoid estimation problems in specific situations.

3,455 citations
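
As a rough illustration of the recommended analysis, the following sketch fits a Poisson model to a binary outcome with a robust (sandwich) variance using statsmodels; exponentiated coefficients are then prevalence ratios rather than odds ratios. The simulated data and all names are illustrative, not from the Pelotas study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
exposure = rng.integers(0, 2, n)                  # binary exposure
age = rng.normal(40, 10, n)
p = np.clip(0.15 * np.exp(0.5 * exposure + 0.01 * (age - 40)), 0, 1)
y = rng.binomial(1, p)                            # common binary outcome

X = sm.add_constant(np.column_stack([exposure, age]))
# Poisson regression for a binary outcome: the robust variance repairs
# the misspecified Poisson variance, and exp(beta) is a prevalence ratio.
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC1")
print(np.exp(fit.params))                         # prevalence ratios
print(np.exp(fit.conf_int()))                     # 95% CIs on the PR scale
```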


Journal ArticleDOI
TL;DR: Experimental results with real data sets indicate that the combined model can be an effective way to improve forecasting accuracy achieved by either of the models used separately.

3,155 citations


Journal ArticleDOI
TL;DR: Clarify is a program that uses Monte Carlo simulation to convert the raw output of statistical procedures into results that are of direct interest to researchers, without changing statistical assumptions or requiring new statistical models.
Abstract: Clarify is a program that uses Monte Carlo simulation to convert the raw output of statistical procedures into results that are of direct interest to researchers, without changing statistical assumptions or requiring new statistical models. The program, designed for use with the Stata statistics package, offers a convenient way to implement the techniques described in: Gary King, Michael Tomz, and Jason Wittenberg (2000). "Making the Most of Statistical Analyses: Improving Interpretation and Presentation." American Journal of Political Science 44, no. 2 (April 2000): 347-61. We recommend that you read this article before using the software. Clarify simulates quantities of interest for the most commonly used statistical models, including linear regression, binary logit, binary probit, ordered logit, ordered probit, multinomial logit, Poisson regression, negative binomial regression, Weibull regression, seemingly unrelated regression equations, and the additive logistic normal model for compositional data. Clarify Version 2.1 is forthcoming (2003) in Journal of Statistical Software.

2,417 citations
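
Clarify itself is a Stata program, but the simulation recipe it implements is straightforward to sketch: draw parameter vectors from the asymptotic normal distribution of the estimates, then push each draw through the quantity of interest. A minimal Python sketch for a logit model on simulated data; everything here is illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
z = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * x - 0.8 * z))))

X = sm.add_constant(np.column_stack([x, z]))
fit = sm.Logit(y, X).fit(disp=0)

# Simulate from the estimated sampling distribution of the coefficients.
draws = rng.multivariate_normal(fit.params, fit.cov_params(), size=5000)

# Quantity of interest: Pr(y = 1) at x = 1 with z held at its mean.
x_star = np.array([1.0, 1.0, z.mean()])
p_sim = 1 / (1 + np.exp(-draws @ x_star))
print(p_sim.mean(), np.percentile(p_sim, [2.5, 97.5]))
```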


Journal ArticleDOI
TL;DR: The production of low rank smoothers for d ≥ 1 dimensional data, which can be fitted by regression or penalized regression methods, is discussed; the smoothers allow the use of approximate thin plate spline models with large data sets and provide a sensible way of modelling interaction terms in generalized additive models.
Abstract: We discuss the production of low rank smoothers for d greater than or equal to 1 dimensional data, which can be fitted by regression or penalized regression methods. The smoothers are constructed by a simple transformation and truncation of the basis that arises from the solution of the thin plate spline smoothing problem and are optimal in the sense that the truncation is designed to result in the minimum possible perturbation of the thin plate spline smoothing problem given the dimension of the basis used to construct the smoother. By making use of Lanczos iteration the basis change and truncation are computationally efficient. The smoothers allow the use of approximate thin plate spline models with large data sets, avoid the problems that are associated with 'knot placement' that usually complicate modelling with regression splines or penalized regression splines, provide a sensible way of modelling interaction terms in generalized additive models, provide low rank approximations to generalized smoothing spline models, appropriate for use with large data sets, provide a means for incorporating smooth functions of more than one variable into non-linear models and improve the computational efficiency of penalized likelihood models incorporating thin plate splines. Given that the approach produces spline-like models with a sparse basis, it also provides a natural way of incorporating unpenalized spline-like terms in linear and generalized linear models, and these can be treated just like any other model terms from the point of view of model selection, inference and diagnostics.

1,948 citations
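
A heavily simplified sketch of the rank reduction: form the thin plate spline radial-basis matrix, keep its k dominant eigenvectors, and fit a penalized regression in the truncated basis. This omits the polynomial null space and the constraint handling of the full construction, and uses a dense eigendecomposition where the paper uses Lanczos iteration, so it should be read as an illustrative approximation only.

```python
import numpy as np

def tps_eta(r):
    """Thin plate spline radial basis in 2-D: eta(r) = r^2 log(r)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        out = r ** 2 * np.log(r)
    return np.where(r == 0, 0.0, out)

rng = np.random.default_rng(2)
n, k, lam = 400, 30, 1e-2              # data size, basis rank, penalty
xy = rng.uniform(size=(n, 2))
y = np.sin(6 * xy[:, 0]) * np.cos(6 * xy[:, 1]) + rng.normal(0, 0.1, n)

# Full radial basis matrix, then a rank-k eigen-truncation (the paper
# obtains the same leading eigenvectors cheaply via Lanczos iteration).
E = tps_eta(np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1))
vals, vecs = np.linalg.eigh(E)
idx = np.argsort(np.abs(vals))[::-1][:k]
B, d = vecs[:, idx], np.abs(vals[idx])

# Penalized regression in the truncated basis.
c = np.linalg.solve(B.T @ B + lam * np.diag(d), B.T @ y)
fitted = B @ c
```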


Book
01 Jan 2003
TL;DR: In this book, the authors present statistical and econometric methods for transportation data analysis, spanning linear regression and its diagnostics, simultaneous-equation and panel data models, time series, duration, count, and discrete outcome models, and Bayesian and random-parameter extensions.
Abstract: FUNDAMENTALS. Statistical Inference I: Descriptive Statistics. Measures of Relative Standing. Measures of Central Tendency. Measures of Variability. Skewness and Kurtosis. Measures of Association. Properties of Estimators. Methods of Displaying Data. Statistical Inference II: Interval Estimation, Hypothesis Testing, and Population Comparisons. Confidence Intervals. Hypothesis Testing. Inferences Regarding a Single Population. Comparing Two Populations. Nonparametric Methods. CONTINUOUS DEPENDENT VARIABLE MODELS. Linear Regression. Assumptions of the Linear Regression Model. Regression Fundamentals. Manipulating Variables in Regression: Estimate a Single Beta Parameter; Estimate Beta Parameter for Ranges of a Variable; Estimate a Single Beta Parameter for m - 1 of the m Levels of a Variable. Checking Regression Assumptions. Regression Outliers. Regression Model GOF Measures. Multicollinearity in the Regression. Regression Model-Building Strategies. Estimating Elasticities. Censored Dependent Variables: Tobit Model. Box-Cox Regression. Violations of Regression Assumptions: Zero Mean of the Disturbances; Normality of the Disturbances; Uncorrelatedness of Regressors and Disturbances; Homoscedasticity of the Disturbances; No Serial Correlation in the Disturbances. Model Specification Errors. Simultaneous-Equation Models. Overview of the Simultaneous-Equations Problem. Reduced Form and the Identification Problem. Simultaneous-Equation Estimation. Seemingly Unrelated Equations. Applications of Simultaneous Equations to Transportation Data. Panel Data Analysis. Issues in Panel Data Analysis. One-Way Error Component Models. Two-Way Error Component Models. Variable-Parameter Models. Additional Topics and Extensions. Background and Exploration in Time Series. Exploring a Time Series. Basic Concepts: Stationarity and Dependence. Time Series in Regression. Forecasting in Time Series: Autoregressive Integrated Moving Average (ARIMA) Models and Extensions. Autoregressive Integrated Moving Average Models. The Box-Jenkins Approach. Autoregressive Integrated Moving Average Model Extensions. Multivariate Models. Nonlinear Models. Latent Variable Models. Principal Components Analysis. Factor Analysis. Structural Equation Modeling. Duration Models. Hazard-Based Duration Models. Characteristics of Duration Data. Nonparametric Models. Semiparametric Models. Fully Parametric Models. Comparisons of Nonparametric, Semiparametric, and Fully Parametric Models. Heterogeneity. State Dependence. Time-Varying Covariates. Discrete-Time Hazard Models. Competing Risk Models. COUNT AND DISCRETE DEPENDENT VARIABLE MODELS. Count Data Models. Poisson Regression Model. Interpretation of Variables in the Poisson Regression Model. Poisson Regression Model Goodness-of-Fit Measures. Truncated Poisson Regression Model. Negative Binomial Regression Model. Zero-Inflated Poisson and Negative Binomial Regression Models. Random-Effects Count Models. Logistic Regression. Principles of Logistic Regression. The Logistic Regression Model. Discrete Outcome Models. Models of Discrete Data. Binary and Multinomial Probit Models. Multinomial Logit Model. Discrete Data and Utility Theory. Properties and Estimation of MNL Models. The Nested Logit Model (Generalized Extreme Value Models). Special Properties of Logit Models. Ordered Probability Models. Models for Ordered Discrete Data. Ordered Probability Models with Random Effects. Limitations of Ordered Probability Models. Discrete/Continuous Models. Overview of the Discrete/Continuous Modeling Problem. Econometric Corrections: Instrumental Variables and Expected Value Method. Econometric Corrections: Selectivity-Bias Correction Term. Discrete/Continuous Model Structures. Transportation Application of Discrete/Continuous Model Structures. OTHER STATISTICAL METHODS. Random-Parameter Models. Random-Parameters Multinomial Logit Model (Mixed Logit Model). Random-Parameter Count Models. Random-Parameter Duration Models. Bayesian Models. Bayes' Theorem. MCMC Sampling-Based Estimation. Flexibility of Bayesian Statistical Models via MCMC Sampling-Based Estimation. Convergence and Identifiability Issues with MCMC Bayesian Models. Goodness-of-Fit, Sensitivity Analysis, and Model Selection Criterion using MCMC Bayesian Models. Appendix A: Statistical Fundamentals. Appendix B: Glossary of Terms. Appendix C: Statistical Tables. Appendix D: Variable Transformations. References. Index.

1,843 citations


Journal ArticleDOI
TL;DR: This paper deals with fitting piecewise terms in regression models where one or more break-points are true parameters of the model and a simple linearization technique is called for, taking advantage of the linear formulation of the problem.
Abstract: This paper deals with fitting piecewise terms in regression models where one or more break-points are true parameters of the model. For estimation, a simple linearization technique is called for, taking advantage of the linear formulation of the problem. As a result, the method is suitable for any regression model with a linear predictor and so current software can be used; threshold modelling as a function of explanatory variables is also allowed. Differences from the other procedures available are shown and relative merits discussed. Simulations and two examples are presented to illustrate the method.

1,607 citations
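
The linearization can be sketched directly: at a working break-point psi, augment the linear predictor with U = (x - psi)+ and V = -1(x > psi), fit by ordinary least squares, and update psi by gamma-hat / beta-hat until the correction vanishes. A minimal sketch on simulated data; the starting value and tolerances are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 200)
y = 1 + 0.5 * x + 1.5 * np.clip(x - 6, 0, None) + rng.normal(0, 0.3, 200)

psi = 4.0                                   # starting guess
for _ in range(30):
    U = np.clip(x - psi, 0, None)           # (x - psi)+ : slope-change term
    V = -(x > psi).astype(float)            # its derivative with respect to psi
    X = np.column_stack([np.ones_like(x), x, U, V])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    step = b[3] / b[2]                      # gamma-hat / beta-hat
    psi += step
    if abs(step) < 1e-8:
        break
print("estimated break-point:", psi)        # close to the true value 6
```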


Journal ArticleDOI
TL;DR: It is demonstrated that, by taking into account lower-level covariances and heterogeneity, a substantial increase in higher-level Z scores is possible, a result with significant implications for group studies in FMRI.

1,458 citations


Journal ArticleDOI
TL;DR: This paper describes the implementation in R of a method for tabular or graphical display of terms in a complex generalised linear model that contains terms related by marginality or hierarchy, such as polynomial terms, or main effects and interactions.
Abstract: This paper describes the implementation in R of a method for tabular or graphical display of terms in a complex generalised linear model. By complex, I mean a model that contains terms related by marginality or hierarchy, such as polynomial terms, or main effects and interactions. I call these tables or graphs effect displays. Effect displays are constructed by identifying high-order terms in a generalised linear model. Fitted values under the model are computed for each such term. The lower-order "relatives" of a high-order term (e.g., main effects marginal to an interaction) are absorbed into the term, allowing the predictors appearing in the high-order term to range over their values. The values of other predictors are fixed at typical values: for example, a covariate could be fixed at its mean or median, a factor at its proportional distribution in the data, or to equal proportions in its several levels. Variations of effect displays are also described, including representation of terms higher-order to any appearing in the model.

1,373 citations
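
The method described is implemented in R, but the core computation is easy to sketch in Python: fit the model, then compute fitted values while the predictors in a high-order term range over a grid and everything else is fixed at typical values. The logit model and all data below are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)                 # focal predictor
x2 = rng.normal(size=n)                 # moderator in the interaction
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 1.0 * x1 + 0.5 * x1 * x2))))

X = sm.add_constant(np.column_stack([x1, x2, x1 * x2]))
fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()

# Effect display for the x1:x2 term: both predictors in the term range
# over their values (absorbing the marginal effects); other predictors,
# if any, would be fixed at typical values such as their means.
g1 = np.linspace(x1.min(), x1.max(), 25)
for x2_val in np.quantile(x2, [0.1, 0.5, 0.9]):
    grid = sm.add_constant(np.column_stack(
        [g1, np.full_like(g1, x2_val), g1 * x2_val]))
    print(round(x2_val, 2), fit.predict(grid)[:3])  # fitted probabilities
```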


Book
J. N. K. Rao
23 Jan 2003
TL;DR: In this book, the author presents model-based approaches to small area estimation, complementing traditional direct and indirect estimators of domain totals.
Abstract: List of Figures. List of Tables. Foreword. Preface. 1. Introduction. What is a Small Area? Demand for Small Area Statistics. Traditional Indirect Estimators. Small Area Models. Model-Based Estimation. Some Examples. 2. Direct Domain Estimation. Introduction. Design-based Approach. Estimation of Totals. Domain Estimation. Modified Direct Estimators. Design Issues. Proofs. 3. Traditional Demographic Methods. Introduction. Symptomatic Accounting Techniques. Regression Symptomatic Procedures. Dual-system Estimation of Total Population. Derivation of Average MSEs. 4. Indirect Domain Estimation. Introduction. Synthetic Estimation. Composite Estimation. James-Stein Method. Proofs. 5. Small Area Models. Introduction. Basic Area Level (Type A) Model. Basic Unit Level (Type B) Model. Extensions: Type A Models. Extensions: Type B Models. Generalized Linear Mixed Models. 6. Empirical Best Linear Unbiased Prediction: Theory. Introduction. General Linear Mixed Model. Block Diagonal Covariance Structure. Proofs. 7. EBLUP: Basic Models. Basic Area Level Model. Basic Unit Level Model. 8. EBLUP: Extensions. Multivariate Fay-Herriot Model. Correlated Sampling Errors. Time Series and Cross-sectional Models. Spatial Models. Multivariate Nested Error Regression Model. Random Error Variances Linear Model. Two-fold Nested Error Regression Model. Two-level Model. 9. Empirical Bayes (EB) Method. Introduction. Basic Area Level Model. Linear Mixed Models. Binary Data. Disease Mapping. Triple-goal Estimation. Empirical Linear Bayes. Constrained LB. Proofs. 10. Hierarchical Bayes (HB) Method. Introduction. MCMC Methods. Basic Area Level Model. Unmatched Sampling and Linking Area Level Models. Basic Unit Level Model. General ANOVA Model. Two-level Models. Time Series and Cross-sectional Models. Multivariate Models. Disease Mapping Models. Binary Data. Exponential Family Models. Constrained HB. Proofs. References. Author Index. Subject Index.


Journal ArticleDOI
TL;DR: In this paper, a new class of instrumental variable (IV) estimators for linear and nonlinear treatment response models with covariates is introduced, which allows the researcher to construct estimators that can be interpreted as the parameters of a well defined approximation to a treatment response function under functional form misspecification.

Journal Article
TL;DR: In this article, the authors explore the use of the zero-norm of the parameters of linear models in learning and derive a simple but practical method for variable or feature selection, minimizing training error and ensuring sparsity in solutions.
Abstract: We explore the use of the so-called zero-norm of the parameters of linear models in learning. Minimization of such a quantity has many uses in a machine learning context: for variable or feature selection, minimizing training error and ensuring sparsity in solutions. We derive a simple but practical method for achieving these goals and discuss its relationship to existing techniques of minimizing the zero-norm. The method boils down to implementing a simple modification of vanilla SVM, namely via an iterative multiplicative rescaling of the training data. Applications we investigate which aid our discussion include variable and feature selection on biological microarray data, and multicategory classification.
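
A minimal sketch of the iterative multiplicative rescaling, assuming scikit-learn's LinearSVC plays the role of the vanilla SVM; the constants, iteration count, and pruning threshold are illustrative rather than taken from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=5, random_state=0)

z = np.ones(X.shape[1])                 # per-feature scaling factors
for _ in range(10):
    clf = LinearSVC(C=1.0, dual=False, max_iter=5000).fit(X * z, y)
    z *= np.abs(clf.coef_.ravel())      # multiplicative rescaling step
    z[z < 1e-8] = 0.0                   # features driven to zero drop out

print("surviving features:", np.flatnonzero(z))
```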

Journal ArticleDOI
TL;DR: The theory of Robust Subspace Learning (RSL) for linear models within a continuous optimization framework based on robust M-estimation is developed and applies to a variety of linear learning problems in computer vision including eigen-analysis and structure from motion.
Abstract: Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications. Methods for learning linear models can be seen as a special case of subspace fitting. One drawback of previous learning methods is that they are based on least squares estimation techniques and hence fail to account for “outliers” which are common in realistic training sets. We review previous approaches for making linear learning methods robust to outliers and present a new method that uses an intra-sample outlier process to account for pixel outliers. We develop the theory of Robust Subspace Learning (RSL) for linear models within a continuous optimization framework based on robust M-estimation. The framework applies to a variety of linear learning problems in computer vision including eigen-analysis and structure from motion. Several synthetic and natural examples are used to develop and illustrate the theory and applications of robust subspace learning in computer vision.

Journal ArticleDOI
TL;DR: In this article, the authors focus on the local case and show how such modeling can be formalized in the context of Gaussian responses providing attractive interpretation in terms of both random effects and explaining residuals.
Abstract: In many applications, the objective is to build regression models to explain a response variable over a region of interest under the assumption that the responses are spatially correlated. In nearly all of this work, the regression coefficients are assumed to be constant over the region. However, in some applications, coefficients are expected to vary at the local or subregional level. Here we focus on the local case. Although parametric modeling of the spatial surface for the coefficient is possible, here we argue that it is more natural and flexible to view the surface as a realization from a spatial process. We show how such modeling can be formalized in the context of Gaussian responses providing attractive interpretation in terms of both random effects and explaining residuals. We also offer extensions to generalized linear models and to spatio-temporal setting. We illustrate both static and dynamic modeling with a dataset that attempts to explain (log) selling price of single-family houses.

01 Jan 2003
TL;DR: This work considers a regression setting where the response is a scalar and the predictor is a random function defined on a compact set of R, and studies an estimator based on a B-splines expansion of the functional coefficient which generalizes ridge regression.
Abstract: We consider a regression setting where the response is a scalar and the predictor is a random function defined on a compact set of R. Many fields of applications are concerned with this kind of data, for instance chemometrics when the predictor is a signal digitized in many points. Then, people have mainly considered the multivariate linear model and have adapted the least squares procedure to take care of highly correlated predictors. Another point of view is to introduce a continuous version of this model, i.e., the functional linear model with scalar response. We are then faced with the estimation of a functional coefficient or, equivalently, of a linear functional. We first study an estimator based on a B-splines expansion of the functional coefficient which in some way generalizes ridge regression. We derive an upper bound for the L2 rate of convergence of this estimator. As an alternative we also introduce a smooth version of functional principal components regression for which L2 convergence is achieved. Finally both methods are compared by means of a simulation study.
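
A rough sketch of the B-splines estimator on simulated curves: expand the functional coefficient in a B-spline basis, approximate the integral over the predictor curve by an average over the sampling grid, and solve a ridge-type penalized least squares problem. The second-difference penalty, basis size, and data generation below are illustrative choices, not the paper's exact construction.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(grid, n_basis, degree=3):
    """Evaluate a clamped B-spline basis of size n_basis on grid."""
    inner = np.linspace(0, 1, n_basis - degree + 1)
    knots = np.concatenate([[0] * degree, inner, [1] * degree])
    return np.column_stack([
        BSpline(knots, np.eye(n_basis)[j], degree)(grid)
        for j in range(n_basis)])

rng = np.random.default_rng(5)
n, p, K = 150, 100, 12
t = np.linspace(0, 1, p)
Xc = np.cumsum(rng.normal(size=(n, p)), axis=1) / np.sqrt(p)  # random curves
beta_true = np.sin(2 * np.pi * t)
y = Xc @ beta_true / p + rng.normal(0, 0.05, n)   # scalar responses

B = bspline_basis(t, K)
Z = Xc @ B / p                        # integral approximated on the grid
D = np.diff(np.eye(K), 2, axis=0)     # second-difference roughness penalty
lam = 1e-4
c = np.linalg.solve(Z.T @ Z + lam * D.T @ D, Z.T @ y)
beta_hat = B @ c                      # estimated functional coefficient
```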

Journal ArticleDOI
TL;DR: Optimal fusion rules based on the best linear unbiased estimation (BLUE), the weighted least squares (WLS), and their generalized versions are presented for cases with complete, incomplete, or no prior information.
Abstract: This paper deals with data (or information) fusion for the purpose of estimation. Three estimation fusion architectures are considered: centralized, distributed, and hybrid. A unified linear model and a general framework for these three architectures are established. Optimal fusion rules based on the best linear unbiased estimation (BLUE), the weighted least squares (WLS), and their generalized versions are presented for cases with complete, incomplete, or no prior information. These rules are more general and flexible, and have wider applicability than previous results. For example, they are in a unified form that is optimal for all of the three fusion architectures with arbitrary correlation of local estimates or observation errors across sensors or across time. They are also in explicit forms convenient for implementation. The optimal fusion rules presented are not limited to linear data models. Illustrative numerical results are provided to verify the fusion rules and demonstrate how these fusion rules can be used in cases with complete, incomplete, or no prior information.
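
In the simplest special case (local estimates of a common state with uncorrelated errors and no prior information) the optimal rule reduces to information-weighted averaging, which is easy to verify numerically. The estimates and covariances below are made up for illustration.

```python
import numpy as np

# Two local estimates x_i of the same state, with covariances P_i and
# estimation errors assumed uncorrelated across sensors.
x1, P1 = np.array([1.02, 0.48]), np.diag([0.04, 0.09])
x2, P2 = np.array([0.97, 0.55]), np.diag([0.09, 0.04])

# WLS/BLUE fusion: P = (sum_i P_i^-1)^-1,  x = P * sum_i P_i^-1 x_i
info1, info2 = np.linalg.inv(P1), np.linalg.inv(P2)
P = np.linalg.inv(info1 + info2)
x = P @ (info1 @ x1 + info2 @ x2)
print(x)   # fused estimate, with covariance P no larger than either P_i
```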

Journal Article
TL;DR: The method constructs a series of sparse linear SVMs to generate linear models that can generalize well, and uses a subset of nonzero weighted variables found by the linear models to produce a final nonlinear model.
Abstract: We describe a methodology for performing variable ranking and selection using support vector machines (SVMs). The method constructs a series of sparse linear SVMs to generate linear models that can generalize well, and uses a subset of nonzero weighted variables found by the linear models to produce a final nonlinear model. The method exploits the fact that a linear SVM (no kernels) with l1-norm regularization inherently performs variable selection as a side-effect of minimizing capacity of the SVM model. The distribution of the linear model weights provides a mechanism for ranking and interpreting the effects of variables. Starplots are used to visualize the magnitude and variance of the weights for each variable. We illustrate the effectiveness of the methodology on synthetic data, benchmark problems, and challenging regression problems in drug design. This method can dramatically reduce the number of variables and outperforms SVMs trained using all attributes and using the attributes selected according to correlation coefficients. The visualization of the resulting models is useful for understanding the role of underlying variables.
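
A compact sketch of the sparsity mechanism, assuming scikit-learn's LinearSVC with an l1 penalty: minimizing the l1-regularized objective zeroes out irrelevant weights, and the magnitudes of the surviving weights give a variable ranking. The paper builds a series of such models and visualizes weight distributions with starplots; this shows a single fit only, with illustrative settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=40,
                           n_informative=6, random_state=1)

# l1-regularized linear SVM: variable selection falls out of the fit.
svm = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000).fit(X, y)
w = svm.coef_.ravel()

print("nonzero weights:", np.count_nonzero(w))
print("top variables by |weight|:", np.argsort(-np.abs(w))[:6])
```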

Book ChapterDOI
22 Sep 2003
TL;DR: This paper uses a stagewise fitting process to construct logistic regression models that can select relevant attributes in the data in a natural way, and shows how this approach can be used to build the logistic regression models at the leaves by incrementally refining those constructed at higher levels in the tree.
Abstract: Tree induction methods and linear models are popular techniques for supervised learning tasks, both for the prediction of nominal classes and continuous numeric values. For predicting numeric quantities, there has been work on combining these two schemes into 'model trees', i.e. trees that contain linear regression functions at the leaves. In this paper, we present an algorithm that adapts this idea for classification problems, using logistic regression instead of linear regression. We use a stagewise fitting process to construct the logistic regression models that can select relevant attributes in the data in a natural way, and show how this approach can be used to build the logistic regression models at the leaves by incrementally refining those constructed at higher levels in the tree. We compare the performance of our algorithm against that of decision trees and logistic regression on 32 benchmark UCI datasets, and show that it achieves a higher classification accuracy on average than the other two methods.

Book
07 Apr 2003
TL;DR: This book explains the development of linear models, covering regression models, analysis of variance models, and the mathematical theory of linear models, together with their applications in the qualitative and quantitative sciences.
Abstract: Preface to the Second Edition. Preface to the First Edition. PART I: REGRESSION MODELS. Introduction to Linear Models. Regression on Functions of One Variable. Transforming the Data. Regression of Functions of Several Variables. Collinearity in Multiple Linear Regression. Influential Observations in Multiple Linear Regression. Polynomial Models and Qualitative Predictors. Additional Topics. PART II: ANALYSIS OF VARIANCE MODELS. Introduction to Analysis of Variance Models. Fixed Effects Models I: One-Way Classification of Means. Fixed Effects Models II: Two-Way Classification of Means. Fixed Effects Models III: Multiple Crossed and Nested Factors. Mixed Models I: The AOV Method with Balanced Data. Mixed Models II: The AVE Method with Balanced Data. Mixed Models III: Unbalanced Data. PART III: MATHEMATICAL THEORY OF LINEAR MODELS. Distribution of Linear and Quadratic Forms. Estimation and Inference for Linear Models. Simultaneous Inference: Tests and Confidence Intervals. Appendix A. Mathematics. Appendix B. Statistics. Appendix C. Statistical Tables. Appendix D. Data Tables. References. Index.

Journal ArticleDOI
TL;DR: This work proposes the use of multivariate autoregressive models of functional magnetic resonance imaging time series to make inferences about functional integration within the human brain and extends linear MAR models to accommodate nonlinear interactions to model top-down modulatory processes with bilinear terms.

Journal ArticleDOI
TL;DR: In this article, five neural network (NN) models, a linear statistical model and a deterministic modelling system (DET) were evaluated for the prediction of urban NO2 and PM10 concentrations.

Book ChapterDOI
01 Jan 2003
TL;DR: The notion of optimal rate of aggregation is defined in an abstract context and lower bounds valid for any method of aggregation are proved, thus establishing optimal rates of linear, convex and model selection type aggregation.
Abstract: We study the problem of aggregation of M arbitrary estimators of a regression function with respect to the mean squared risk. Three main types of aggregation are considered: model selection, convex and linear aggregation. We define the notion of optimal rate of aggregation in an abstract context and prove lower bounds valid for any method of aggregation. We then construct procedures that attain these bounds, thus establishing optimal rates of linear, convex and model selection type aggregation.
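
The paper is about optimal rates rather than algorithms, but a convex aggregate is easy to illustrate: combine M fitted estimators with exponential weights on the simplex. In the toy sketch below the candidate estimators, the temperature, and the use of in-sample risk are all illustrative; in practice the risks would be computed on a held-out sample.

```python
import numpy as np

rng = np.random.default_rng(6)
n, M = 200, 5
x = rng.uniform(size=n)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)

# M arbitrary "estimators": polynomial fits of increasing degree.
preds = np.column_stack([
    np.polyval(np.polyfit(x, y, d), x) for d in range(1, M + 1)])

# Convex aggregation via exponential weights based on empirical risk.
risks = ((preds - y[:, None]) ** 2).mean(axis=0)
w = np.exp(-risks / 0.5)
w /= w.sum()                          # weights lie on the simplex
aggregate = preds @ w
print("weights:", np.round(w, 3))
```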

Book
25 Sep 2003
TL;DR: In this book, the authors discuss the impact of mismeasured continuous and categorical variables, including the effect on inferences about odds ratios, and Bayesian adjustment for mismeasurement.
Abstract: INTRODUCTION. Examples of Mismeasurement. The Mismeasurement Phenomenon. What is Ahead? THE IMPACT OF MISMEASURED CONTINUOUS VARIABLES. The Archetypical Scenario. More General Impact. Multiplicative Measurement Error. Multiple Mismeasured Predictors. What about Variability and Small Samples? Logistic Regression. Beyond Nondifferential and Unbiased Measurement Error. Summary. Mathematical Details. THE IMPACT OF MISMEASURED CATEGORICAL VARIABLES. The Linear Model Case. More General Impact. Inferences on Odds-Ratios. Logistic Regression. Differential Misclassification. Polychotomous Variables. Summary. Mathematical Details. ADJUSTMENT FOR MISMEASURED CONTINUOUS VARIABLES. Posterior Distributions. A Simple Scenario. Nonlinear Mixed Effects Model: Viral Dynamics. Logistic Regression I: Smoking and Bladder Cancer. Logistic Regression II: Framingham Heart Study. Issues in Specifying the Exposure Model. More Flexible Exposure Models. Retrospective Analysis. Comparison with Non-Bayesian Approaches. Summary. Mathematical Details. ADJUSTMENT FOR MISMEASURED CATEGORICAL VARIABLES. A Simple Scenario. Partial Knowledge of Misclassification Probabilities. Dual Exposure Assessment. Models with Additional Explanatory Variables. Summary. Mathematical Details. FURTHER TOPICS. Dichotomization of Mismeasured Continuous Variables. Mismeasurement Bias and Model Misspecification Bias. Identifiability in Mismeasurement Models. Further Remarks. APPENDIX: BAYES-MCMC INFERENCE. Bayes Theorem. Point and Interval Estimates. Markov Chain Monte Carlo. Prior Selection. MCMC and Unobserved Structure. REFERENCES.

Reference EntryDOI
15 Apr 2003
TL;DR: As discussed by the authors, the term "growth curve" describes data where the same entities are repeatedly observed, the same procedures of measurement and scaling of observations are used, and the timing of the observations is known.
Abstract: The term “growth curve” is used to describe data where: (1) the same entities are repeatedly observed, (2) the same procedures of measurement and scaling of observations are used, and (3) the timing of the observations is known. Growth curves are now common in many areas of psychological research, and some of these are presented here. The term “growth curve analysis” denotes the processes of describing, testing hypotheses, and making scientific inferences about the growth and change patterns in a wide range of time-related phenomena. In this sense, growth curve analyses are a specific form of the larger set of developmental and longitudinal research methods, but the unique features of growth data permit unique kinds of analyses. Formal models for the analysis of growth curves which have been developed in many different substantive domains are described here in five sections: (1) An introduction to growth curves, (2) linear models of growth, (3) multiple groups in growth curve models, (4) aspects of dynamic theory for growth models, and (5) multiple variables in growth curve analyses. We conclude with a discussion of future issues raised by the current growth models. Keywords: growth curves; latent growth models; longitudinal data analysis; mixed models; multilevel models; nonlinear models
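
A linear growth curve is commonly fitted as a mixed (multilevel) model with person-specific intercepts and slopes. A minimal sketch on simulated longitudinal data, assuming statsmodels' MixedLM; the sample sizes and parameter values are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_subj, n_occ = 100, 5
subj = np.repeat(np.arange(n_subj), n_occ)
time = np.tile(np.arange(n_occ), n_subj)
b0 = rng.normal(10, 2, n_subj)[subj]      # person-specific intercepts
b1 = rng.normal(1.5, 0.5, n_subj)[subj]   # person-specific slopes
y = b0 + b1 * time + rng.normal(0, 1, n_subj * n_occ)
df = pd.DataFrame({"y": y, "time": time, "subj": subj})

# Fixed intercept and slope plus random intercept and slope per person.
fit = smf.mixedlm("y ~ time", df, groups=df["subj"], re_formula="~time").fit()
print(fit.summary())
```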

Journal ArticleDOI
TL;DR: In this article, the authors present a technique based on the pseudo-values from a jackknife statistic constructed from simple summary statistic estimates of the state probabilities, which are then used in a generalised estimating equation to obtain estimates of model parameters.
Abstract: In multi-state models, the probability that a process will be in a given state at some time is a complex nonlinear function of the intensity regression coefficients. We present a technique which models the state probabilities directly. This method is based on the pseudo-values from a jackknife statistic constructed from simple summary statistic estimates of the state probabilities. These pseudo-values are then used in a generalised estimating equation to obtain estimates of the model parameters. We illustrate how this technique works by studying examples of common regression problems. We apply the technique to model acute graft-versus-host disease in bone marrow transplants.
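
The pseudo-value construction itself is a one-liner: theta_i = n * theta_hat - (n - 1) * theta_hat_(-i), where theta_hat_(-i) is the estimate with subject i left out. The sketch below uses a plain proportion for the state probability and ignores censoring for clarity; with censoring one would jackknife a Kaplan-Meier-type estimator, and the pseudo-values would then be regressed on covariates with a generalised estimating equation.

```python
import numpy as np

def pseudo_values(times, horizon):
    """Jackknife pseudo-values for the state probability P(T > horizon)."""
    n = len(times)
    theta = np.mean(times > horizon)
    theta_minus = np.array([
        np.mean(np.delete(times, i) > horizon) for i in range(n)])
    return n * theta - (n - 1) * theta_minus

rng = np.random.default_rng(8)
times = rng.exponential(5.0, size=200)
pv = pseudo_values(times, horizon=4.0)

# Pseudo-values behave like a complete-data response for regression.
print(pv.mean(), np.mean(times > 4.0))    # equal by construction
```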

Journal ArticleDOI
TL;DR: In this paper, a stochastic mode reduction strategy was applied to three prototype models with nonlinear behavior mimicking several features of low-frequency variability in the extratropical atmosphere.
Abstract: A systematic strategy for stochastic mode reduction is applied here to three prototype "toy" models with nonlinear behavior mimicking several features of low-frequency variability in the extratropical atmosphere. Two of the models involve explicit stable periodic orbits and multiple equilibria in the projected nonlinear climate dynamics. The systematic strategy has two steps: stochastic consistency and stochastic mode elimination. Both aspects of the mode reduction strategy are tested in an a priori fashion in the paper. In all three models the stochastic mode elimination procedure applies in a quantitative fashion for moderately large values of ε ≈ 0.5 or even ε ≈ 1, where the parameter ε roughly measures the ratio of correlation times of unresolved variables to resolved climate variables, even though the procedure is only justified mathematically for ε ≪ 1. The results developed here provide some new perspectives on both the role of stable nonlinear structures in projected nonlinear climate dynamics and the regression fitting strategies for stochastic climate modeling. In one example, a deterministic system with 102 degrees of freedom has an explicit stable periodic orbit for the projected climate dynamics in two variables; however, the complete deterministic system has instead a probability density function with two large isolated peaks on the "ghost" of this periodic orbit, and correlation functions that only weakly "shadow" this periodic orbit. Furthermore, all of these features are predicted in a quantitative fashion by the reduced stochastic model in two variables derived from the systematic theory; this reduced model has multiplicative noise and augmented nonlinearity. In a second deterministic model with 101 degrees of freedom, it is established that stable multiple equilibria in the projected climate dynamics can be either relevant or completely irrelevant in the actual dynamics for the climate variable depending on the strength of nonlinearity and the coupling to the unresolved variables. Furthermore, all this behavior is predicted in a quantitative fashion by a reduced nonlinear stochastic model for a single climate variable with additive noise, which is derived from the systematic mode reduction procedure. Finally, the systematic mode reduction strategy is applied in an idealized context to the stochastic modeling of the effect of mountain torque on the angular momentum budget. Surprisingly, the strategy yields a nonlinear stochastic equation for the large-scale fluctuations, and numerical simulations confirm significantly improved predicted correlation functions from this model compared with a standard linear model with damping and white noise forcing.

Journal ArticleDOI
TL;DR: It is demonstrated that a broad class of MLMs may be estimated as structural equation models (SEMs), and within the SEM approach it is possible to include measurement models for predictors or outcomes, and to estimate the mediational pathways among predictors explicitly, tasks which are currently difficult with the conventional approach to multilevel modeling.
Abstract: Multilevel linear models (MLMs) provide a powerful framework for analyzing data collected at nested or non-nested levels, such as students within classrooms. The current article draws on recent analytical and software advances to demonstrate that a broad class ofMLMs may be estimated as structural equation models (SEMs). Moreover, within the SEM approach it is possible to include measurement models for predictors or outcomes, and to estimate the mediational pathways among predictors explicitly, tasks which are currently difficult with the conventional approach to multilevel modeling. The equivalency of the SEM approach with conventional methods for estimating MLMs is illustrated using empirical examples, including an example involving both multiple indicator latent factorsfor the outcomes and a causal chain for the predictors. The limitations of this approach for estimating MLMs are discussed and alternative approaches are considered.