
Showing papers in "Journal of the American Statistical Association in 2014"


Journal ArticleDOI
TL;DR: The divisive method is shown to provide consistent estimates of both the number and the location of change points under standard regularity assumptions, and methods from cluster analysis are applied to assess performance and to allow simple comparisons of location estimates, even when the estimated number differs.
Abstract: Change point analysis has applications in a wide variety of fields. The general problem concerns the inference of a change in distribution for a set of time-ordered observations. Sequential detection is an online version in which new data are continually arriving and are analyzed adaptively. We are concerned with the related, but distinct, offline version, in which retrospective analysis of an entire sequence is performed. For a set of multivariate observations of arbitrary dimension, we consider nonparametric estimation of both the number of change points and the positions at which they occur. We do not make any assumptions regarding the nature of the change in distribution or any distribution assumptions beyond the existence of the αth absolute moment, for some α ∈ (0, 2). Estimation is based on hierarchical clustering and we propose both divisive and agglomerative algorithms. The divisive method is shown to provide consistent estimates of both the number and the location of change points under standard regularity assumptions.
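
As an illustration of the kind of moment-based divergence such a divisive procedure can work with, the sketch below computes an energy-style two-sample statistic with exponent α ∈ (0, 2) and greedily searches for a single best split; the function names, the weighting, and the single-split search are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np
from scipy.spatial.distance import cdist

def alpha_divergence(X, Y, alpha=1.0):
    """Energy-style divergence between samples X (n x d) and Y (m x d),
    built from Euclidean distances raised to a power alpha in (0, 2)."""
    within_x = cdist(X, X) ** alpha
    within_y = cdist(Y, Y) ** alpha
    between = cdist(X, Y) ** alpha
    return 2.0 * between.mean() - within_x.mean() - within_y.mean()

def best_single_split(Z, alpha=1.0, min_size=5):
    """Search for the single split that maximizes a sample-size-weighted
    divergence; a divisive procedure would recurse on the two segments."""
    n = len(Z)
    best_tau, best_stat = None, -np.inf
    for tau in range(min_size, n - min_size):
        stat = (tau * (n - tau) / n) * alpha_divergence(Z[:tau], Z[tau:], alpha)
        if stat > best_stat:
            best_tau, best_stat = tau, stat
    return best_tau, best_stat

# Example: a mean shift two-thirds of the way through a bivariate series.
rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0, 1, size=(100, 2)), rng.normal(1.5, 1, size=(50, 2))])
print(best_single_split(Z))
```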

454 citations


Journal ArticleDOI
TL;DR: This work considers bootstrap methods for computing standard errors and confidence intervals that take model selection into account, also known as bootstrap smoothing, to tame the erratic discontinuities of selection-based estimators.
Abstract: Classical statistical theory ignores model selection in assessing estimation accuracy. Here we consider bootstrap methods for computing standard errors and confidence intervals that take model selection into account. The methodology involves bagging, also known as bootstrap smoothing, to tame the erratic discontinuities of selection-based estimators. A useful new formula for the accuracy of bagging then provides standard errors for the smoothed estimators. Two examples, nonparametric and parametric, are carried through in detail: a regression model where the choice of degree (linear, quadratic, cubic, …) is determined by the Cp criterion and a Lasso-based estimation problem.
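
A rough sketch of bootstrap smoothing is given below: a toy selection-based estimator (polynomial degree chosen by an AIC-style criterion, standing in for Cp) is bagged over bootstrap replications, and a smoothed standard error is computed from the bootstrap resampling counts. The estimator, the criterion, and the constants are illustrative assumptions rather than the article's worked examples.

```python
import numpy as np

rng = np.random.default_rng(1)

def select_and_estimate(x, y, x0=0.5):
    """Toy selection-based estimator: pick polynomial degree 1, 2, or 3 by a
    simple AIC-style criterion (standing in for Cp), then predict at x0."""
    n = len(y)
    best_aic, best_pred = np.inf, None
    for deg in (1, 2, 3):
        coef = np.polyfit(x, y, deg)
        resid = y - np.polyval(coef, x)
        aic = n * np.log(resid @ resid / n) + 2 * (deg + 1)
        if aic < best_aic:
            best_aic, best_pred = aic, np.polyval(coef, x0)
    return best_pred

def bagged_estimate_with_se(x, y, B=500):
    """Bootstrap-smoothed (bagged) estimate plus a count-based smoothed
    standard error of the kind the article derives for bagged estimators."""
    n = len(y)
    t_star = np.empty(B)
    counts = np.zeros((B, n))
    for b in range(B):
        idx = rng.integers(0, n, size=n)
        np.add.at(counts[b], idx, 1)          # how often each case is resampled
        t_star[b] = select_and_estimate(x[idx], y[idx])
    smooth = t_star.mean()
    cov_j = ((counts - counts.mean(axis=0)) * (t_star - smooth)[:, None]).mean(axis=0)
    se_smooth = np.sqrt(np.sum(cov_j ** 2))
    return smooth, se_smooth

x = np.sort(rng.uniform(0, 1, 60))
y = 1 + 2 * x - 1.5 * x ** 2 + rng.normal(0, 0.2, 60)
print(bagged_estimate_with_se(x, y))
```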

329 citations


Journal ArticleDOI
TL;DR: In this article, a simple method for modeling interactions between the treatment and covariates is proposed, where the idea is to modify the covariate in a simple way, and then fit a standard model using the modified covariates and no main effects.
Abstract: We consider a setting in which we have a treatment and a potentially large number of covariates for a set of observations, and wish to model their relationship with an outcome of interest. We propose a simple method for modeling interactions between the treatment and covariates. The idea is to modify the covariate in a simple way, and then fit a standard model using the modified covariates and no main effects. We show that coupled with an efficiency augmentation procedure, this method produces clinically meaningful estimators in a variety of settings. It can be useful for practicing personalized medicine: determining, from a large set of biomarkers, the subset of patients that can potentially benefit from a treatment. We apply the method to both simulated datasets and real trial data. The modified covariates idea can be used for other purposes, for example, large-scale hypothesis testing for determining which of a set of covariates interact with a treatment variable. Supplementary materials for this article are available online.
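
A minimal sketch of the modified-covariates idea for a continuous outcome under 1:1 randomization is shown below: each covariate is multiplied by the ±1 treatment indicator (scaled by one half), and a plain least-squares fit with no main effects recovers a score aligned with the treatment-by-covariate interaction. The simulated data, the T/2 coding, and the least-squares fit are one simple instantiation, not the article's full procedure (which also covers efficiency augmentation and other outcome types).

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 500, 10

# Simulated randomized trial: T is +1/-1 with probability 1/2, and the outcome
# has main effects plus a treatment-by-covariate interaction.
X = rng.normal(size=(n, p))
T = rng.choice([-1.0, 1.0], size=n)
interaction = 1.0 * X[:, 0] - 0.8 * X[:, 1]          # true benefit score
y = 2.0 + X[:, 2] + 0.5 * T * interaction + rng.normal(0, 1, n)

# Modified covariates: multiply each covariate by T/2 and fit a standard
# linear model with NO main effects (and no intercept).
W = X * (T / 2.0)[:, None]
gamma, *_ = np.linalg.lstsq(W, y, rcond=None)

# The fitted linear score estimates (up to scale) the treatment-by-covariate
# interaction; its sign suggests which patients may benefit from treatment.
score = X @ gamma
print("correlation with true benefit score:", round(np.corrcoef(score, interaction)[0, 1], 2))
```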

327 citations


Journal ArticleDOI
TL;DR: EMVS is proposed, a deterministic alternative to stochastic search based on an EM algorithm which exploits a conjugate mixture prior formulation to quickly find posterior modes in high-dimensional linear regression contexts.
Abstract: Despite rapid developments in stochastic search algorithms, the practicality of Bayesian variable selection methods has continued to pose challenges. High-dimensional data are now routinely analyzed, typically with many more covariates than observations. To broaden the applicability of Bayesian variable selection for such high-dimensional linear regression contexts, we propose EMVS, a deterministic alternative to stochastic search based on an EM algorithm which exploits a conjugate mixture prior formulation to quickly find posterior modes. Combining a spike-and-slab regularization diagram for the discovery of active predictor sets with subsequent rigorous evaluation of posterior model probabilities, EMVS rapidly identifies promising sparse high posterior probability submodels. External structural information such as likely covariate groupings or network topologies is easily incorporated into the EMVS framework. Deterministic annealing variants are seen to improve the effectiveness of our algorithms by mit...

237 citations


Journal ArticleDOI
TL;DR: A new feature screening procedure based on the conditional correlation coefficient is proposed for varying coefficient models with ultrahigh-dimensional covariates; its theoretical properties are studied systematically, establishing the sure screening property and ranking consistency.
Abstract: This article is concerned with feature screening and variable selection for varying coefficient models with ultrahigh-dimensional covariates. We propose a new feature screening procedure for these models based on conditional correlation coefficient. We systematically study the theoretical properties of the proposed procedure, and establish their sure screening property and the ranking consistency. To enhance the finite sample performance of the proposed procedure, we further develop an iterative feature screening procedure. Monte Carlo simulation studies were conducted to examine the performance of the proposed procedures. In practice, we advocate a two-stage approach for varying coefficient models. The two-stage approach consists of (a) reducing the ultrahigh dimensionality by using the proposed procedure and (b) applying regularization methods for dimension-reduced varying coefficient models to make statistical inferences on the coefficient functions. We illustrate the proposed two-stage approach by a real data example. Supplementary materials for this article are available online.

180 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a nonparametric independence screening (NIS) method to select variables by ranking a measure of the non-parametric marginal contributions of each covariate given the exposure variable.
Abstract: The varying coefficient model is an important class of nonparametric statistical model, which allows us to examine how the effects of covariates vary with exposure variables. When the number of covariates is large, the issue of variable selection arises. In this article, we propose and investigate marginal nonparametric screening methods to screen variables in sparse ultra-high-dimensional varying coefficient models. The proposed nonparametric independence screening (NIS) selects variables by ranking a measure of the nonparametric marginal contributions of each covariate given the exposure variable. The sure independent screening property is established under some mild technical conditions when the dimensionality is of nonpolynomial order, and the dimensionality reduction of NIS is quantified. To enhance the practical utility and finite sample performance, two data-driven iterative NIS (INIS) methods are proposed for selecting thresholding parameters and variables: conditional permutation and greedy methods.

164 citations


Journal ArticleDOI
TL;DR: A model-averaging procedure for high-dimensional regression problems in which the number of predictors p exceeds the sample size n is developed, and a theorem is proved, showing that delete-one cross-validation achieves the lowest possible prediction loss asymptotically.
Abstract: This article considers high-dimensional regression problems in which the number of predictors p exceeds the sample size n. We develop a model-averaging procedure for high-dimensional regression problems. Unlike most variable selection studies featuring the identification of true predictors, our focus here is on the prediction accuracy for the true conditional mean of y given the p predictors. Our method consists of two steps. The first step is to construct a class of regression models, each with a smaller number of regressors, to avoid the degeneracy of the information matrix. The second step is to find suitable model weights for averaging. To minimize the prediction error, we estimate the model weights using a delete-one cross-validation procedure. Departing from the model-averaging literature, which requires that the weights always sum to one, an important improvement we introduce is to remove this constraint. We derive some theoretical results to justify our procedure. A theorem is proved, showing that delete-one cross-validation achieves the lowest possible prediction loss asymptotically.
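
The two-step recipe can be sketched in a few lines: build a class of small candidate models, compute their delete-one (leave-one-out) predictions, and choose nonnegative weights minimizing the cross-validated error without forcing them to sum to one. The nested candidate models and the NNLS solver below are illustrative assumptions, not the article's specific construction.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
n, p = 80, 120                        # more predictors than observations
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = [2, -1.5, 1, 0.8, -0.6]
y = X @ beta + rng.normal(0, 1, n)

# Candidate models: nested groups of regressors, each small enough to keep the
# information matrix well conditioned (an illustrative way to build the class).
groups = [np.arange(k) for k in (5, 10, 20, 40)]

def loo_predictions(Xs, y):
    """Leave-one-out (delete-one) OLS predictions via the hat-matrix shortcut."""
    H = Xs @ np.linalg.solve(Xs.T @ Xs, Xs.T)
    fitted = H @ y
    return y - (y - fitted) / (1.0 - np.diag(H))

# Column m of P holds the delete-one predictions from candidate model m.
P = np.column_stack([loo_predictions(X[:, g], y) for g in groups])

# Weights minimize the delete-one prediction error subject only to w >= 0;
# the usual sum-to-one constraint is deliberately not imposed.
w, _ = nnls(P, y)
print("model weights:", np.round(w, 3))

# Averaged estimate of the conditional mean and its error against the truth.
fits = np.column_stack([X[:, g] @ np.linalg.lstsq(X[:, g], y, rcond=None)[0]
                        for g in groups])
print("rmse of averaged fit:", round(float(np.sqrt(np.mean((fits @ w - X @ beta) ** 2))), 3))
```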

154 citations


Journal ArticleDOI
TL;DR: In this paper, the authors define and study a depth function for multivariate functional data, which is based on a weight function and includes important characteristics of functional data such as differences in the amount of local amplitude, shape and phase variation.
Abstract: This article defines and studies a depth for multivariate functional data. By the multivariate nature and by including a weight function, it acknowledges important characteristics of functional data, namely differences in the amount of local amplitude, shape, and phase variation. We study both population and finite sample versions. The multivariate sample of curves may include warping functions, derivatives, and integrals of the original curves for a better overall representation of the functional data via the depth. We present a simulation study and data examples that confirm the good performance of this depth function. Supplementary materials for this article are available online.

144 citations


Journal ArticleDOI
TL;DR: This article investigates how BIC can be adapted to high-dimensional linear quantile regression and shows that a modified BIC is consistent in model selection when the number of variables diverges as the sample size increases and extends the results to structured nonparametric quantile models with a diverging number of covariates.
Abstract: Bayesian information criterion (BIC) is known to identify the true model consistently as long as the predictor dimension is finite. Recently, its moderate modifications have been shown to be consistent in model selection even when the number of variables diverges. Those works have been done mostly in mean regression, but rarely in quantile regression. The best-known results about BIC for quantile regression are for linear models with a fixed number of variables. In this article, we investigate how BIC can be adapted to high-dimensional linear quantile regression and show that a modified BIC is consistent in model selection when the number of variables diverges as the sample size increases. We also discuss how it can be used for choosing the regularization parameters of penalized approaches that are designed to conduct variable selection and shrinkage estimation simultaneously. Moreover, we extend the results to structured nonparametric quantile models with a diverging number of covariates. We illustrate o...

141 citations


Journal ArticleDOI
TL;DR: This article considers inference about effects when the population consists of groups of individuals where interference is possible within groups but not between groups, and considers the effects of cholera vaccination and an intervention to encourage voting.
Abstract: Recently, there has been increasing interest in making causal inference when interference is possible. In the presence of interference, treatment may have several types of effects. In this article, we consider inference about such effects when the population consists of groups of individuals where interference is possible within groups but not between groups. A two-stage randomization design is assumed where in the first stage groups are randomized to different treatment allocation strategies and in the second stage individuals are randomized to treatment or control conditional on the strategy assigned to their group in the first stage. For this design, the asymptotic distributions of estimators of the causal effects are derived when either the number of individuals per group or the number of groups grows large. Under certain homogeneity assumptions, the asymptotic distributions provide justification for Wald-type confidence intervals (CIs) and tests. Empirical results demonstrate that the Wald CIs have g...

135 citations


Journal ArticleDOI
TL;DR: The reformulation of the Kiefer–Wolfowitz estimator as a convex optimization problem reduces the computational effort by several orders of magnitude for typical problems, by comparison to prior EM-algorithm based methods, and thus greatly expands the practical applicability of the resulting methods.
Abstract: Estimation of mixture densities for the classical Gaussian compound decision problem and their associated (empirical) Bayes rules is considered from two new perspectives. The first, motivated by Brown and Greenshtein, introduces a nonparametric maximum likelihood estimator of the mixture density subject to a monotonicity constraint on the resulting Bayes rule. The second, motivated by Jiang and Zhang, proposes a new approach to computing the Kiefer–Wolfowitz nonparametric maximum likelihood estimator for mixtures. In contrast to prior methods for these problems, our new approaches are cast as convex optimization problems that can be efficiently solved by modern interior point methods. In particular, we show that the reformulation of the Kiefer–Wolfowitz estimator as a convex optimization problem reduces the computational effort by several orders of magnitude for typical problems, by comparison to prior EM-algorithm based methods, and thus greatly expands the practical applicability of the resulting methods.
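
On a fixed grid of support points, the Kiefer–Wolfowitz NPMLE is the convex program "minimize −Σ_i log(Σ_j φ(x_i − u_j) w_j) over nonnegative weights w summing to one." The toy sketch below solves it with scipy's general-purpose SLSQP rather than the interior point solvers the article relies on; the grid size, the simulated data, and the posterior-mean rule at the end are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(4)

# Observations: normal means drawn from a two-point mixture, unit noise.
theta = rng.choice([0.0, 3.0], size=200, p=[0.7, 0.3])
x = theta + rng.normal(size=200)

# Fixed grid of candidate support points for the mixing distribution.
grid = np.linspace(x.min(), x.max(), 60)
A = norm.pdf(x[:, None] - grid[None, :])     # A[i, j] = phi(x_i - u_j)

def neg_loglik(w):
    return -np.sum(np.log(A @ w + 1e-12))

# Convex program: minimize -sum_i log(A w)_i  s.t.  w >= 0, sum(w) = 1.
# (The article uses interior point solvers; SLSQP suffices at this toy size.)
w0 = np.full(len(grid), 1.0 / len(grid))
res = minimize(neg_loglik, w0, method="SLSQP",
               bounds=[(0.0, 1.0)] * len(grid),
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
w_hat = res.x

# Empirical Bayes (posterior mean) rule implied by the fitted mixing weights.
post_mean = (A * grid[None, :]) @ w_hat / (A @ w_hat)
print("mass near 0 and near 3:",
      round(w_hat[np.abs(grid) < 0.5].sum(), 2),
      round(w_hat[np.abs(grid - 3) < 0.5].sum(), 2))
print("first few posterior means:", post_mean[:3].round(2))
```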

Journal ArticleDOI
TL;DR: By finding the best low-rank approximation of the data with respect to a transposable quadratic norm, the generalized least-square matrix decomposition (GMD) directly accounts for structural relationships and is demonstrated for dimension reduction, signal recovery, and feature selection with high-dimensional structured data.
Abstract: Variables in many big-data settings are structured, arising, for example, from measurements on a regular grid as in imaging and time series or from spatial-temporal measurements as in climate studies. Classical multivariate techniques ignore these structural relationships often resulting in poor performance. We propose a generalization of principal components analysis (PCA) that is appropriate for massive datasets with structured variables or known two-way dependencies. By finding the best low-rank approximation of the data with respect to a transposable quadratic norm, our decomposition, entitled the generalized least-square matrix decomposition (GMD), directly accounts for structural relationships. As many variables in high-dimensional settings are often irrelevant, we also regularize our matrix decomposition by adding two-way penalties to encourage sparsity or smoothness. We develop fast computational algorithms using our methods to perform generalized PCA (GPCA), sparse GPCA, and functional GPCA on ma...

Journal ArticleDOI
TL;DR: In this paper, an observation-driven model, based on a conditional Student's t-distribution, was proposed for rail travel in the United Kingdom, which is tractable and retains some of the desirable features of the linear Gaussian model.
Abstract: An unobserved components model in which the signal is buried in noise that is non-Gaussian may throw up observations that, when judged by the Gaussian yardstick, are outliers. We describe an observation-driven model, based on a conditional Student’s t-distribution, which is tractable and retains some of the desirable features of the linear Gaussian model. Letting the dynamics be driven by the score of the conditional distribution leads to a specification that is not only easy to implement, but which also facilitates the development of a comprehensive and relatively straightforward theory for the asymptotic distribution of the maximum likelihood estimator. The methods are illustrated with an application to rail travel in the United Kingdom. The final part of the article shows how the model may be extended to include explanatory variables.
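
A minimal score-driven local-level filter with a conditional Student-t density is sketched below: the level is updated by a scaled score, so large residuals are automatically downweighted instead of dragging the signal. The fixed parameter values and the random-walk signal are illustrative assumptions; in practice they would be estimated by maximum likelihood as in the article.

```python
import numpy as np

rng = np.random.default_rng(5)

def t_score_filter(y, kappa=0.3, sigma=1.0, nu=5.0):
    """Local-level filter driven by the (scaled) score of a conditional
    Student-t density: the weight shrinks toward zero for large residuals,
    so outliers barely move the filtered level."""
    mu = np.empty(len(y))
    mu[0] = y[0]
    for t in range(1, len(y)):
        e = y[t - 1] - mu[t - 1]
        u = e / (1.0 + e ** 2 / (nu * sigma ** 2))   # scaled score in e
        mu[t] = mu[t - 1] + kappa * u                # random-walk signal update
    return mu

# Slowly moving level observed with heavy-tailed noise and one gross outlier.
level = np.cumsum(rng.normal(0, 0.05, 300))
y = level + rng.standard_t(df=5, size=300)
y[150] += 25.0
mu_hat = t_score_filter(y)
print("filter error just after the outlier:", abs(mu_hat[151] - level[151]).round(2))
```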

Journal ArticleDOI
TL;DR: The martingale difference correlation is proposed as a natural extension of the distance correlation of Szekely, Rizzo, and Bakirov; it measures the departure from conditional mean independence between a scalar response variable V and a vector predictor variable U.
Abstract: In this article, we propose a new metric, the so-called martingale difference correlation, to measure the departure of conditional mean independence between a scalar response variable V and a vector predictor variable U. Our metric is a natural extension of distance correlation proposed by Szekely, Rizzo, and Bakirov, which is used to measure the dependence between V and U. The martingale difference correlation and its empirical counterpart inherit a number of desirable features of distance correlation and sample distance correlation, such as algebraic simplicity and elegant theoretical properties. We further use martingale difference correlation as a marginal utility to do high-dimensional variable screening to screen out variables that do not contribute to conditional mean of the response given the covariates. Further extension to conditional quantile screening is also described in detail and sure screening properties are rigorously justified. Both simulation results and real data illustrations demonstr...
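
The population quantity can be written as MDD(V|U)² = −E[(V − EV)(V′ − EV′)|U − U′|], which is zero exactly when E[V|U] = E[V] (under moment conditions); the martingale difference correlation normalizes this in the spirit of distance correlation. The sketch below computes the plug-in sample version and uses it as a marginal utility for screening; the simulated example and the use of the unnormalized statistic are illustrative simplifications.

```python
import numpy as np
from scipy.spatial.distance import cdist

def mdd_sq(v, u):
    """Sample martingale difference divergence (squared): a plug-in estimate
    of -E[(V - EV)(V' - EV')|U - U'|], which vanishes under conditional mean
    independence of V given U."""
    u = u.reshape(len(u), -1)           # accept scalar or vector predictors
    d = cdist(u, u)                     # pairwise |U_k - U_l|
    vc = v - v.mean()
    return -(vc @ d @ vc) / len(v) ** 2

# Screening use: rank covariates by how much they contribute to E[y | x_j].
rng = np.random.default_rng(6)
n, p = 300, 50
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.5, n)

scores = np.array([mdd_sq(y, X[:, j]) for j in range(p)])
print("top-ranked covariates:", np.argsort(scores)[::-1][:5])
```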

Journal ArticleDOI
TL;DR: Three Bayesian methods for PS variable selection and model averaging are proposed that select relevant variables from a set of candidate variables to include in the PS model and estimate causal treatment effects as weighted averages of estimates under different PS models.
Abstract: Causal inference with observational data frequently relies on the notion of the propensity score (PS) to adjust treatment comparisons for observed confounding factors. As decisions in the era of “big data” are increasingly reliant on large and complex collections of digital data, researchers are frequently confronted with decisions regarding which of a high-dimensional covariate set to include in the PS model to satisfy the assumptions necessary for estimating average causal effects. Typically, simple or ad hoc methods are employed to arrive at a single PS model, without acknowledging the uncertainty associated with the model selection. We propose three Bayesian methods for PS variable selection and model averaging that (a) select relevant variables from a set of candidate variables to include in the PS model and (b) estimate causal treatment effects as weighted averages of estimates under different PS models. The associated weight for each PS model reflects the data-driven support for that model’s abilit...

Journal ArticleDOI
TL;DR: Forward-selection-based procedures called iFOR are proposed to identify interaction effects in a greedy forward fashion while maintaining the natural hierarchical model structure; theoretically, the iFOR algorithms are shown to possess the sure screening property in ultrahigh-dimensional settings.
Abstract: In ultrahigh-dimensional data analysis, it is extremely challenging to identify important interaction effects, and a top concern in practice is computational feasibility. For a dataset with n observations and p predictors, the augmented design matrix including all linear and order-2 terms is of size n × (p² + 3p)/2. When p is large, say more than tens of hundreds, the number of interactions is enormous and beyond the capacity of standard machines and software tools for storage and analysis. In theory, the interaction-selection consistency is hard to achieve in high-dimensional settings. Interaction effects have heavier tails and more complex covariance structures than main effects in a random design, making theoretical analysis difficult. In this article, we propose to tackle these issues by forward-selection-based procedures called iFOR, which identify interaction effects in a greedy forward fashion while maintaining the natural hierarchical model structure. Two algorithms, iFORT and iFORM, are studied. ...

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a doubly robust estimator that allows multiple models for both the missingness mechanism and the data distribution, and the resulting estimator is consistent if any one of those multiple models is correctly specified.
Abstract: Doubly robust estimators are widely used in missing-data analysis. They provide double protection on estimation consistency against model misspecifications. However, they allow only a single model for the missingness mechanism and a single model for the data distribution, and the assumption that one of these two models is correctly specified is restrictive in practice. For regression analysis with possibly missing outcome, we propose an estimation method that allows multiple models for both the missingness mechanism and the data distribution. The resulting estimator is consistent if any one of those multiple models is correctly specified, and thus provides multiple protection on consistency. This estimator is also robust against extreme values of the fitted missingness probability, which, for most doubly robust estimators, can lead to erroneously large inverse probability weights that may jeopardize the numerical performance. The numerical implementation of the proposed method through a modified Newton–Ra...

Journal ArticleDOI
TL;DR: In this article, the group least absolute shrinkage and selection operator (LASSO) is proposed to estimate the structural break autoregressive (SBAR) model when the number of change points m is unknown.
Abstract: Consider a structural break autoregressive (SBAR) process in which the autoregressive parameters are piecewise constant over segments j = 1, …, m + 1 delimited by change points {t_1, …, t_m}, with 1 = t_0 < t_1 < ⋅⋅⋅ < t_{m+1} = n + 1, where σ(·) is a measurable function and {ϵ_t} is white noise with unit variance. In practice, the number of change points m is usually assumed to be known and small, because a large m would involve a huge computational burden for parameter estimation. By reformulating the problem in a variable selection context, the group least absolute shrinkage and selection operator (LASSO) is proposed to estimate an SBAR model when m is unknown. It is shown that both m and the locations of the change points {t_1, …, t_m} can be consistently estimated from the data, and the computation can be performed efficiently. An improved practical version that incorporates group LASSO and a stepwise regression variable selection technique is also discussed. Simulation studies are conducted to assess the finite sample performance. Supplementary materials for this article are available online.
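
The reformulation can be sketched directly: write the AR coefficient at time t as a cumulative sum of jumps over candidate break times, so each candidate time contributes a column (more generally, a group of columns) that is active from that time onward, and a sparsity-penalized fit selects the nonzero jumps as change points. In the single-lag toy below the group penalty reduces to an ordinary LASSO solved by cyclic coordinate descent; the simulated series, the penalty level, and the solver are illustrative assumptions, not the article's implementation, which also combines group LASSO with stepwise selection.

```python
import numpy as np

rng = np.random.default_rng(7)

# AR(1) series whose coefficient jumps from 0.2 to 0.7 at t = 120 (one break).
n = 240
y = np.zeros(n)
for t in range(1, n):
    phi = 0.2 if t < 120 else 0.7
    y[t] = phi * y[t - 1] + rng.normal()

# Cumulative-jump design: phi_t = sum of jumps theta_s over s <= t, so column s
# is y_{t-1} * 1{t >= s}. With a single lag and no intercept each candidate
# time carries one parameter, so the group penalty reduces to an ordinary
# LASSO (with more lags, each s would contribute a whole group of columns).
resp = y[1:]
T = len(resp)
Z = np.tril(np.ones((T, T))) * y[:-1][:, None]    # Z[t, s] = y_{t-1} * 1{t >= s}
lam = 20.0                                        # penalty level, chosen by eye here

theta = np.zeros(T)
r = resp - Z @ theta
col_sq = (Z ** 2).sum(axis=0)
for _ in range(500):                              # cyclic coordinate descent
    for j in range(T):
        rho = Z[:, j] @ r + col_sq[j] * theta[j]
        if j == 0:                                # baseline coefficient: unpenalized
            new = rho / col_sq[j]
        else:                                     # soft-threshold the jump at time j
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
        r -= Z[:, j] * (new - theta[j])
        theta[j] = new

jumps = np.flatnonzero(np.abs(theta[1:]) > 1e-8) + 1
print("baseline AR coefficient:", round(theta[0], 2))
print("nonzero jump times (true break at index 119):", jumps)
```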

Journal ArticleDOI
TL;DR: In this article, the authors highlight the importance of modeling the association structure between the longitudinal and event time responses that can greatly influence the derived predictions, and illustrate how to improve the accuracy of derived predictions by suitably combining joint models with different association structures.
Abstract: The joint modeling of longitudinal and time-to-event data is an active area of statistics research that has received a lot of attention in recent years. More recently, a new and attractive application of this type of model has been to obtain individualized predictions of survival probabilities and/or of future longitudinal responses. The advantageous feature of these predictions is that they are dynamically updated as extra longitudinal responses are collected for the subjects of interest, providing real time risk assessment using all recorded information. The aim of this article is two-fold. First, to highlight the importance of modeling the association structure between the longitudinal and event time responses that can greatly influence the derived predictions, and second, to illustrate how we can improve the accuracy of the derived predictions by suitably combining joint models with different association structures. The second goal is achieved using Bayesian model averaging, which, in this setting, ha...

Journal ArticleDOI
TL;DR: A general approach is proposed that treats late-onset outcomes as missing data, uses data augmentation to impute missing outcomes from posterior predictive distributions computed from partial follow-up times and complete outcome data, and applies the design’s decision rules using the completed data.
Abstract: A practical impediment in adaptive clinical trials is that outcomes must be observed soon enough to apply decision rules to choose treatments for new patients. For example, if outcomes take up to six weeks to evaluate and the accrual rate is one patient per week, on average three new patients will be accrued while waiting to evaluate the outcomes of the previous three patients. The question is how to treat the new patients. This logistical problem persists throughout the trial. Various ad hoc practical solutions are used, none entirely satisfactory. We focus on this problem in phase I-II clinical trials that use binary toxicity and efficacy, defined in terms of event times, to choose doses adaptively for successive cohorts. We propose a general approach to this problem that treats late-onset outcomes as missing data, uses data augmentation to impute missing outcomes from posterior predictive distributions computed from partial follow-up times and complete outcome data, and applies the design's decision rules using the completed data. We illustrate the method with two cancer trials conducted using a phase I-II design based on efficacy-toxicity trade-offs, including a computer simulation study.

Journal ArticleDOI
TL;DR: The EP-ABC algorithm as discussed by the authors is an adaptation to the likelihood-free context of the variational approximation algorithm known as expectation propagation, which is faster by a few orders of magnitude than standard algorithms.
Abstract: Many models of interest in the natural and social sciences have no closed-form likelihood function, which means that they cannot be treated using the usual techniques of statistical inference. In the case where such models can be efficiently simulated, Bayesian inference is still possible thanks to the approximate Bayesian computation (ABC) algorithm. Although many refinements have been suggested, ABC inference is still far from routine. ABC is often excruciatingly slow due to very low acceptance rates. In addition, ABC requires introducing a vector of “summary statistics” s(y), the choice of which is relatively arbitrary, and often requires some trial and error, making the whole process laborious for the user. We introduce in this work the EP-ABC algorithm, which is an adaptation to the likelihood-free context of the variational approximation algorithm known as expectation propagation. The main advantage of EP-ABC is that it is faster by a few orders of magnitude than standard algorithms, while producing ...

Journal ArticleDOI
TL;DR: A sparse additive ODE (SA-ODE) model is proposed, coupled with ODE estimation methods and adaptive group least absolute shrinkage and selection operator (LASSO) techniques, to model dynamic GRNs that could flexibly deal with nonlinear regulation effects.
Abstract: The gene regulation network (GRN) is a high-dimensional complex system, which can be represented by various mathematical or statistical models. The ordinary differential equation (ODE) model is one of the popular dynamic GRN models. High-dimensional linear ODE models have been proposed to identify GRNs, but with a limitation of the linear regulation effect assumption. In this article, we propose a sparse additive ODE (SA-ODE) model, coupled with ODE estimation methods and adaptive group least absolute shrinkage and selection operator (LASSO) techniques, to model dynamic GRNs that could flexibly deal with nonlinear regulation effects. The asymptotic properties of the proposed method are established and simulation studies are performed to validate the proposed approach. An application example for identifying the nonlinear dynamic GRN of T-cell activation is used to illustrate the usefulness of the proposed method.

Journal ArticleDOI
TL;DR: An efficient method is developed on the basis of difference convex programming, the augmented Lagrangian method and the blockwise coordinate descent method, which is scalable to hundreds of graphs of thousands of nodes through a simple necessary and sufficient partition rule.
Abstract: Gaussian graphical models are useful to analyze and visualize conditional dependence relationships between interacting units. Motivated by network analysis under different experimental conditions, such as gene networks for disparate cancer subtypes, we model structural changes over multiple networks with possible heterogeneities. In particular, we estimate multiple precision matrices describing dependencies among interacting units through maximum penalized likelihood. Of particular interest are homogeneous groups of similar entries across and zero-entries of these matrices, referred to as clustering and sparseness structures, respectively. A nonconvex method is proposed to seek a sparse representation for each matrix and identify clusters of the entries across the matrices. Computationally, we develop an efficient method on the basis of difference convex programming, the augmented Lagrangian method and the blockwise coordinate descent method, which is scalable to hundreds of graphs of thousands of nodes through a simple necessary and sufficient partition rule.

Journal ArticleDOI
TL;DR: It is proved that the proposed method is able to estimate the directions that achieve sufficient dimension reduction; the method has wide applicability without strong assumptions on the distributions or the types of variables and needs only an eigendecomposition to estimate the projection matrix.
Abstract: This article proposes a novel approach to linear dimension reduction for regression using nonparametric estimation with positive-definite kernels or reproducing kernel Hilbert spaces (RKHSs). The purpose of the dimension reduction is to find such directions in the explanatory variables that explain the response sufficiently: this is called sufficient dimension reduction. The proposed method is based on an estimator for the gradient of the regression function considered for the feature vectors mapped into RKHSs. It is proved that the method is able to estimate the directions that achieve sufficient dimension reduction. In comparison with other existing methods, the proposed one has wide applicability without strong assumptions on the distributions or the type of variables, and needs only eigendecomposition for estimating the projection matrix. The theoretical analysis shows that the estimator is consistent with certain rate under some conditions. The experimental results demonstrate that the proposed metho...

Journal ArticleDOI
TL;DR: The contribution is to develop a fully model-based approach to capture structured spatial dependence for modeling directional data at different spatial locations, building a projected Gaussian spatial process induced from an inline bivariate Gaussian spatial process.
Abstract: Directional data naturally arise in many scientific fields, such as oceanography (wave direction), meteorology (wind direction), and biology (animal movement direction). Our contribution is to develop a fully model-based approach to capture structured spatial dependence for modeling directional data at different spatial locations. We build a projected Gaussian spatial process, induced from an inline bivariate Gaussian spatial process. We discuss the properties of the projected Gaussian process and show how to fit this process as a model for data, using suitable latent variables, with Markov chain Monte Carlo methods. We also show how to implement spatial interpolation and conduct model comparison in this setting. Simulated examples are provided as proof of concept. A data application arises for modeling wave direction data in the Adriatic sea, off the coast of Italy. In fact, this directional data is available across time, requiring a spatio-temporal model for its analysis. We discuss and illustrate this ...
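
As proof of concept for the projection idea, the sketch below simulates a bivariate Gaussian process over random spatial locations and records the angle of the two components at each site. The exponential covariance, the independent components, and the mean values are simplifying assumptions; the article's projected Gaussian process allows a general bivariate structure and is fitted with latent variables and MCMC.

```python
import numpy as np

rng = np.random.default_rng(8)

# Random spatial locations and an exponential covariance for each component.
s = rng.uniform(0, 10, size=(60, 2))
d = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=-1)
K = np.exp(-d / 2.0) + 1e-8 * np.eye(len(s))

# Two components with a common covariance and independent of each other is a
# simplification of the bivariate process described in the abstract.
L = np.linalg.cholesky(K)
z1 = L @ rng.normal(size=len(s)) + 1.0   # nonzero means control how
z2 = L @ rng.normal(size=len(s)) + 0.5   # concentrated the directions are

# Projection to the circle gives the directional datum at each location.
theta = np.arctan2(z2, z1)               # angles in (-pi, pi]
print(np.degrees(theta[:5]).round(1))
```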

Journal ArticleDOI
TL;DR: Through numerical studies in rare events settings, the proposed exact approach is shown to be efficient and, generally, to outperform commonly used meta-analysis methods, including the Mantel-Haenszel and Peto methods.
Abstract: This article proposes a general exact meta-analysis approach for synthesizing inferences from multiple studies of discrete data. The approach combines the p-value functions (also known as significance functions) associated with the exact tests from individual studies. It encompasses a broad class of exact meta-analysis methods, as it permits broad choices for the combining elements, such as tests used in individual studies, and any parameter of interest. The approach yields statements that explicitly account for the impact of individual studies on the overall inference, in terms of efficiency/power and the Type I error rate. Those statements also give rise to empirical methods for further enhancing the combined inference. Although the proposed approach is for general discrete settings, for convenience, it is illustrated throughout using the setting of meta-analysis of multiple 2 × 2 tables. In the context of rare events data, such as observing few, zero, or zero total (i.e., zero events in both arms) out...

Journal ArticleDOI
TL;DR: A Bayesian generalized low-rank regression model (GLRR) is proposed for the analysis of both high-dimensional responses and covariates, and is applied to investigate the impact of 1071 SNPs from the top 40 genes reported by the AlzGene database on the volumes of 93 regions of interest (ROIs) obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI).
Abstract: We propose a Bayesian generalized low-rank regression model (GLRR) for the analysis of both high-dimensional responses and covariates. This development is motivated by performing searches for associations between genetic variants and brain imaging phenotypes. GLRR integrates a low rank matrix to approximate the high-dimensional regression coefficient matrix of GLRR and a dynamic factor model to model the high-dimensional covariance matrix of brain imaging phenotypes. Local hypothesis testing is developed to identify significant covariates on high-dimensional responses. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of GLRR and to compare it with several competing approaches. We apply GLRR to investigate the impact of 1071 SNPs from the top 40 genes reported by the AlzGene database on the volumes of 93 regions of interest (ROIs) obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Supplementary materials for this article are available online.

Journal ArticleDOI
TL;DR: In this article, an observation-driven model for time series of counts is proposed, where the observations follow a Poisson distribution conditioned on an accompanying intensity process, which is equipped with a two-regime structure according to the magnitude of the lagged observations.
Abstract: This article studies theory and inference of an observation-driven model for time series of counts. It is assumed that the observations follow a Poisson distribution conditioned on an accompanying intensity process, which is equipped with a two-regime structure according to the magnitude of the lagged observations. Generalized from the Poisson autoregression, it allows more flexible, and even negative, correlation in the observations, which cannot be produced by the single-regime model. Classical Markov chain theory and Lyapunov’s method are used to derive the conditions under which the process has a unique invariant probability measure and to show a strong law of large numbers of the intensity process. Moreover, the asymptotic theory of the maximum likelihood estimates of the parameters is established. A simulation study and a real-data application are considered, where the model is applied to the number of major earthquakes in the world. Supplementary materials for this article are available online.
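
A small simulation makes the two-regime mechanism concrete: the Poisson intensity follows one linear recursion when the lagged count is at or below a threshold and another when it exceeds it. The threshold and parameter values below are illustrative, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(9)

def simulate_two_regime_poisson(n=500, r=3,
                                low=(0.5, 0.5, 0.3),   # (omega, a, b) if y_{t-1} <= r
                                high=(1.0, 0.3, 0.2)): # (omega, a, b) if y_{t-1} > r
    """Two-regime Poisson autoregression: the intensity recursion switches
    according to the magnitude of the lagged count."""
    y = np.zeros(n, dtype=int)
    lam = np.zeros(n)
    lam[0] = 1.0
    y[0] = rng.poisson(lam[0])
    for t in range(1, n):
        omega, a, b = low if y[t - 1] <= r else high
        lam[t] = omega + a * lam[t - 1] + b * y[t - 1]
        y[t] = rng.poisson(lam[t])
    return y, lam

y, lam = simulate_two_regime_poisson()
print("sample mean and lag-1 autocorrelation:",
      y.mean().round(2), np.corrcoef(y[:-1], y[1:])[0, 1].round(2))
```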

Journal ArticleDOI
TL;DR: In this paper, a spatially varying coefficient model (SVCM) is proposed to capture the varying association between imaging measures in a three-dimensional volume (or two-dimensional surface) with a set of covariates.
Abstract: Motivated by recent work on studying massive imaging data in various neuroimaging studies, we propose a novel spatially varying coefficient model (SVCM) to capture the varying association between imaging measures in a three-dimensional volume (or two-dimensional surface) with a set of covariates. Two stylized features of neuroimaging data are the presence of multiple piecewise smooth regions with unknown edges and jumps and substantial spatial correlations. To specifically account for these two features, SVCM includes a measurement model with multiple varying coefficient functions, a jumping surface model for each varying coefficient function, and a functional principal component model. We develop a three-stage estimation procedure to simultaneously estimate the varying coefficient functions and the spatial correlations. The estimation procedure includes a fast multiscale adaptive estimation and testing procedure to independently estimate each varying coefficient function, while preserving its edges among...

Journal ArticleDOI
TL;DR: In this article, a nonparametric estimation for a class of stochastic measures of leverage effect is provided, and the estimators and their statistical properties are provided in cases both with and without microstructure noise.
Abstract: The leverage effect has become an extensively studied phenomenon that describes the (usually) negative relation between stock returns and their volatility. Although this characteristic of stock returns is well acknowledged, most studies of the phenomenon are based on cross-sectional calibration with parametric models. On the statistical side, most previous works are conducted over daily or longer return horizons, and few of them have carefully studied its estimation, especially with high-frequency data. However, estimation of the leverage effect is important because sensible inference is possible only when the leverage effect is estimated reliably. In this article, we provide nonparametric estimation for a class of stochastic measures of leverage effect. To construct estimators with good statistical properties, we introduce a new stochastic leverage effect parameter. The estimators and their statistical properties are provided in cases both with and without microstructure noise, under the stochastic volat...