
Showing papers in "Journal of the American Statistical Association in 2020"


Journal ArticleDOI
TL;DR: Introduces an intuitive and easy-to-implement nonparametric density estimator based on local polynomial techniques that is fully boundary adaptive and automatic and does not require prebinning or any other transformation of the data.
Abstract: This article introduces an intuitive and easy-to-implement nonparametric density estimator based on local polynomial techniques. The estimator is fully boundary adaptive and automatic, but does not...

235 citations


Journal ArticleDOI
TL;DR: Combining individual p-values to aggregate multiple small effects has long-standing interest in statistics, dating back to the classic Fisher's combination test, and is revisited by the authors in the setting of modern large-scale data sets.
Abstract: Combining individual p-values to aggregate multiple small effects has a long-standing interest in statistics, dating back to the classic Fisher’s combination test. In modern large-scale da...

196 citations
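The classical Fisher combination test that this line of work builds on can be sketched in a few lines. A minimal sketch (the function name is ours); it uses the closed form of the chi-squared survival function for even degrees of freedom, so only the standard library is needed:

```python
import math

def fisher_combination(pvals):
    """Fisher's method: under H0, T = -2 * sum(log p_i) ~ chi^2 with 2k df."""
    k = len(pvals)
    t = -2.0 * sum(math.log(p) for p in pvals)
    # The chi^2 survival function with an even df = 2k has the closed form
    # P(T > t) = exp(-t/2) * sum_{i=0}^{k-1} (t/2)^i / i!
    half = t / 2.0
    combined_p = math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))
    return t, combined_p

# Five individually modest p-values aggregate into strong joint evidence.
t, p = fisher_combination([0.04, 0.06, 0.05, 0.08, 0.03])
```

The example illustrates the aggregation effect the abstract refers to: none of the five p-values is individually striking, but the combined p-value falls well below 0.01.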


Journal ArticleDOI
TL;DR: Judea Pearl is a giant in the field of causal inference whose many contributions, including the discovery of the d-separation criterion, have been immeasurably valuable as discussed by the authors.
Abstract: Judea Pearl is a giant in the field of causal inference, whose many contributions, including the discovery of the d-separation criterion, have been immeasurably valuable. He, along with science wri...

146 citations


Journal ArticleDOI
TL;DR: A sharp phase transition is established for robust estimation of regression parameters in both low and high dimensions: when δ ≥ 1, the estimator admits a sub-Gaussian-type deviation bound without sub-Gaussian assumptions on the data, while only a slower rate is available in the regime 0 < δ < 1, and the transition is smooth and optimal.
Abstract: Big data can easily be contaminated by outliers or contain variables with heavy-tailed distributions, which makes many conventional methods inadequate. To address this challenge, we propose the adaptive Huber regression for robust estimation and inference. The key observation is that the robustification parameter should adapt to the sample size, dimension and moments for optimal tradeoff between bias and robustness. Our theoretical framework deals with heavy-tailed distributions with bounded (1 + δ)-th moment for any δ > 0. We establish a sharp phase transition for robust estimation of regression parameters in both low and high dimensions: when δ ≥ 1, the estimator admits a sub-Gaussian-type deviation bound without sub-Gaussian assumptions on the data, while only a slower rate is available in the regime 0 < δ < 1 and the transition is smooth and optimal. In addition, we extend the methodology to allow both heavy-tailed predictors and observation noise. Simulation studies lend further support to the theory. In a genetic study of cancer cell lines that exhibit heavy-tailedness, the proposed methods are shown to be more robust and predictive.

122 citations
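The core idea, a robustification parameter that grows with the sample size, can be sketched with a simple gradient-descent fit of the Huber loss. This is a minimal illustration only: the MAD-based scale estimate and the tuning rule for tau below are simple stand-ins, not the paper's data-driven calibration.

```python
import numpy as np

def adaptive_huber_regression(X, y, n_iter=1000, lr=0.5):
    """Sketch of adaptive Huber regression: tau adapts to the sample size
    and dimension, trading a little robustness for less bias as n grows.
    Scale estimate and tuning rule are illustrative stand-ins."""
    n, d = X.shape
    # crude robust scale via the median absolute deviation (assumption)
    sigma = 1.4826 * np.median(np.abs(y - np.median(y)))
    tau = sigma * np.sqrt(n / (d + np.log(n)))  # tau grows with n
    beta = np.zeros(d)
    for _ in range(n_iter):
        r = y - X @ beta
        psi = np.clip(r, -tau, tau)       # Huber score: linear, then clipped
        beta += lr * (X.T @ psi) / n      # gradient step on the Huber loss
    return beta

# heavy-tailed (Student-t) noise, where plain least squares degrades
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.standard_t(df=3, size=500)
beta_hat = adaptive_huber_regression(X, y)
```

Clipping the residuals bounds each observation's influence, which is what yields the sub-Gaussian-type deviation bounds the abstract describes.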


Journal ArticleDOI
TL;DR: A new reinforcement learning method is proposed for estimating an optimal treatment regime that is applicable to data collected using mobile technologies in an outpatient setting and accommodates an indefinite time horizon and minute-by-minute decision making that are common in mobile health applications.
Abstract: The vision for precision medicine is to use individual patient characteristics to inform a personalized treatment plan that leads to the best possible healthcare for each patient. Mobile technologi...

97 citations


Journal ArticleDOI
TL;DR: In this article, a lack of methodological results to design efficient Markov chain Monte Carlo (MCMC) algorithms for statistical models with discrete-valued high-dimensional parameters is discussed.
Abstract: There is a lack of methodological results to design efficient Markov chain Monte Carlo (MCMC) algorithms for statistical models with discrete-valued high-dimensional parameters. Motivated by this ...

93 citations


Journal ArticleDOI
TL;DR: Several key discrepancies are examined, centering on the differences between prediction and estimation and between prediction and attribution (significance testing); most of the discussion is carried out through small numerical examples.
Abstract: The scientific needs and computational limitations of the twentieth century fashioned classical statistical methodology. Both the needs and limitations have changed in the twenty-first, and so has ...

93 citations


Journal ArticleDOI
TL;DR: A new method for efficiently computing the expected SFS and linear functionals of it, for demographies described by general directed acyclic graphs is presented, which can scale to more populations than previously possible for complex demographic histories including admixture.
Abstract: The sample frequency spectrum (SFS), or histogram of allele counts, is an important summary statistic in evolutionary biology, and is often used to infer the history of population size changes, mig...

79 citations


Journal ArticleDOI
TL;DR: This article provides uniform rates of convergence for the inferred community membership vector of each node in a network generated from the mixed membership stochastic blockmodel (MMSB) by establishing sharp row-wise eigenvector deviation bounds for MMSB, the first work to establish per-node rates for overlapping community detection in networks.
Abstract: We consider the problem of estimating community memberships of nodes in a network, where every node is associated with a vector determining its degree of membership in each community. Existing prov...

73 citations


Journal ArticleDOI
TL;DR: This work proposes a framework that allows flexible models for the observed data and a clean separation of the identified and unidentified parts of the sensitivity model, and provides heuristics for calibrating these parameters against observable quantities.
Abstract: A fundamental challenge in observational causal inference is that assumptions about unconfoundedness are not testable from data. Assessing sensitivity to such assumptions is therefore important in ...

70 citations


Journal ArticleDOI
TL;DR: In this article, methods and theory are introduced for functional or curve time series with long-range dependence, and the temporal sum of the curve process is shown to be asymptotically normally distributed.
Abstract: We introduce methods and theory for functional or curve time series with long-range dependence. The temporal sum of the curve process is shown to be asymptotically normally distributed, the conditi...

Journal ArticleDOI
TL;DR: In this article, sensitivity analyses are proposed to quantify the potential bias from unmeasured confounding in random-effects meta-analyses of observational studies.
Abstract: Random-effects meta-analyses of observational studies can produce biased estimates if the synthesized studies are subject to unmeasured confounding. We propose sensitivity analyses quantifying the ...

Journal ArticleDOI
TL;DR: The synthetic control (SC) method, a powerful tool for estimating average treatment effects (ATE), is increasingly popular in fields such as statistics, economics, political science, and mathematics as mentioned in this paper.
Abstract: The synthetic control (SC) method, a powerful tool for estimating average treatment effects (ATE), is increasingly popular in fields such as statistics, economics, political science, and ma...

Journal ArticleDOI
TL;DR: In this paper, the authors define a coefficient of correlation which is as simple as classical coefficients like Pearson’s correlation or Spearman’s correlation, and yet consistently estimates a simple and interpretable measure of the degree of dependence between the variables.
Abstract: Is it possible to define a coefficient of correlation which is (a) as simple as the classical coefficients like Pearson’s correlation or Spearman’s correlation, and yet (b) consistently es...
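The coefficient this paper proposes has a strikingly simple rank-based form. A minimal no-ties sketch (the function name is ours): sort the pairs by x, rank the y values, and measure how much consecutive ranks jump.

```python
def xi_correlation(x, y):
    """Chatterjee's rank correlation (no-ties version):
    xi = 1 - 3 * sum|r_{i+1} - r_i| / (n^2 - 1), where r_i is the rank of
    the y paired with the i-th smallest x. Near 0 under independence,
    approaches 1 when y is a (noiseless) function of x."""
    n = len(x)
    y_by_x = [yi for _, yi in sorted(zip(x, y))]       # reorder y by x
    rank_of = {v: k + 1 for k, v in enumerate(sorted(y))}
    r = [rank_of[v] for v in y_by_x]
    jumps = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - 3.0 * jumps / (n * n - 1)
```

Unlike Pearson's correlation, which is near zero for a symmetric parabola, this coefficient stays clearly positive for y ≈ x², because y is a function of x, which is exactly the kind of dependence the abstract asks the coefficient to detect.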

Journal ArticleDOI
TL;DR: This article derives optimal Poisson subsampling probabilities in the context of quasi-likelihood estimation under the A- and L-optimality criteria, and establishes the consistency and asymptotic normality of the resultant estimators.
Abstract: Nonuniform subsampling methods are effective to reduce computational burden and maintain estimation efficiency for massive data. Existing methods mostly focus on subsampling with replacement due to...
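The subsampling-without-replacement scheme the abstract describes can be sketched generically: each row is kept independently with its own probability, and kept rows are reweighted by the inverse probability so the estimating equations stay unbiased. The probabilities below are an illustrative nonuniform choice, not the paper's A-/L-optimal ones, and a weighted least-squares fit stands in for the general quasi-likelihood estimator.

```python
import numpy as np

def poisson_subsample_wls(X, y, probs, rng):
    """Poisson subsampling: row i is kept independently with probability
    probs[i]; kept rows get weight 1/probs[i], which keeps the weighted
    normal equations unbiased for the full-data least-squares fit."""
    keep = rng.uniform(size=len(probs)) < probs
    Xs, ys, w = X[keep], y[keep], 1.0 / probs[keep]
    Xw = Xs * w[:, None]                       # apply weights to the design
    return np.linalg.solve(Xw.T @ Xs, Xw.T @ ys)

rng = np.random.default_rng(1)
n = 20000
X = rng.normal(size=(n, 2))
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + rng.normal(size=n)
# illustrative nonuniform probabilities: high-leverage rows kept more often
probs = np.minimum(1.0, 0.02 + 0.03 * np.linalg.norm(X, axis=1))
beta_hat = poisson_subsample_wls(X, y, probs, rng)
```

Only a few percent of the 20,000 rows survive the subsample, yet the inverse-probability weighting keeps the coefficient estimates close to the full-data target.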

Journal ArticleDOI
TL;DR: Glioblastoma multiforme (GBM) is an aggressive form of human brain cancer that is under active study in the field of cancer biology as discussed by the authors, and its rapid progression and the relative time cost of obtaining mo...
Abstract: Glioblastoma multiforme (GBM) is an aggressive form of human brain cancer that is under active study in the field of cancer biology. Its rapid progression and the relative time cost of obtaining mo...

Journal ArticleDOI
TL;DR: In this paper, an unbiased estimator of smoothing expectations in state-space models is proposed, where smoothing refers to the task of estimating a latent stochastic process given noisy measurements related to the process.
Abstract: In state–space models, smoothing refers to the task of estimating a latent stochastic process given noisy measurements related to the process. We propose an unbiased estimator of smoothing expectat...

Journal ArticleDOI
TL;DR: This paper establishes a general framework for statistical inference with nonprobability survey samples when relevant auxiliary information is available from a probability survey sample.
Abstract: We establish a general framework for statistical inferences with nonprobability survey samples when relevant auxiliary information is available from a probability survey sample. We develop a rigoro...

Journal ArticleDOI
TL;DR: In this article, the authors provide theory on power and reproducibility in contemporary big data applications with general high-dimensional nonlinear models, both of which are key to enabling refined scientific discoveries.
Abstract: Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonlinear models. In this article, we provide theore...

Journal ArticleDOI
TL;DR: It is shown that the IPW estimator can have different (Gaussian or non-Gaussian) asymptotic distributions, depending on how “close to zero” the probability weights are and on how large the trimming threshold is, and an inference procedure is proposed that remains valid with the use of a data-driven trimming threshold.
Abstract: Inverse probability weighting (IPW) is widely used in empirical work in economics and other disciplines. As Gaussian approximations perform poorly in the presence of “small denominators,” trimming ...
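The trimmed IPW estimator under discussion can be sketched as follows. For illustration the trimming threshold is a fixed constant and the propensity scores are known; the article's contribution concerns the data-driven choice of this threshold and valid inference after trimming.

```python
import numpy as np

def ipw_ate_trimmed(y, t, pscore, b_n=0.05):
    """Trimmed IPW estimate of E[Y(1)] - E[Y(0)]: observations whose
    propensity score lies within b_n of 0 or 1 (the 'small denominators'
    that break Gaussian approximations) are discarded before weighting."""
    keep = (pscore > b_n) & (pscore < 1 - b_n)
    y, t, e = y[keep], t[keep], pscore[keep]
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

rng = np.random.default_rng(2)
n = 20000
e = 0.02 + 0.96 * rng.uniform(size=n)        # known propensity scores
t = (rng.uniform(size=n) < e).astype(float)  # treatment assignment
y = t * 1.0 + rng.normal(size=n)             # constant treatment effect of 1
ate_hat = ipw_ate_trimmed(y, t, e, b_n=0.05)
```

With a constant treatment effect, trimming does not change the estimand, so the trimmed estimate stays near the true effect of 1 while the variance-inflating extreme weights are removed.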

Journal ArticleDOI
TL;DR: In this paper, a general framework for incorporating domain and prior knowledge in the matrix factor model through linear constraints is proposed, which is useful in achieving parsimonious parameterization, facilitating interpretation of the latent matrix factor, and identifying specific factors of interest.
Abstract: High-dimensional matrix-variate time series data are becoming widely available in many scientific fields, such as economics, biology, and meteorology. To achieve significant dimension reduction while preserving the intrinsic matrix structure and temporal dynamics in such data, Wang, Liu, and Chen proposed a matrix factor model, that is, shown to be able to provide effective analysis. In this article, we establish a general framework for incorporating domain and prior knowledge in the matrix factor model through linear constraints. The proposed framework is shown to be useful in achieving parsimonious parameterization, facilitating interpretation of the latent matrix factor, and identifying specific factors of interest. Fully utilizing the prior-knowledge-induced constraints results in more efficient and accurate modeling, inference, dimension reduction as well as a clear and better interpretation of the results. Constrained, multi-term, and partially constrained factor models for matrix-variate ti...

Journal ArticleDOI
TL;DR: In this paper, the authors consider estimation of causal effects combining big main data with unmeasured confounders and smaller validation data with supplementary information on these confoundsers, and propose appropriate bootstrap procedures to implement using software routines for existing estimators.
Abstract: The era of big data has witnessed an increasing availability of multiple data sources for statistical analyses. We consider estimation of causal effects combining big main data with unmeasured confounders and smaller validation data with supplementary information on these confounders. Under the unconfoundedness assumption with completely observed confounders, the smaller validation data allow for constructing consistent estimators for causal effects, but the big main data can only give error-prone estimators in general. However, by leveraging the information in the big main data in a principled way, we can improve the estimation efficiencies yet preserve the consistencies of the initial estimators based solely on the validation data. Our framework applies to asymptotically normal estimators, including the commonly used regression imputation, weighting, and matching estimators, and does not require a correct specification of the model relating the unmeasured confounders to the observed variables. We also propose appropriate bootstrap procedures, which make our method straightforward to implement using software routines for existing estimators. Supplementary materials for this article are available online.

Journal ArticleDOI
TL;DR: In this article, the authors propose separable effects to study the causal effect of a treatment on an e ect in time-to-event settings, where the presence of competing events complicates the definition of causal effects.
Abstract: In time-to-event settings, the presence of competing events complicates the definition of causal effects. Here we propose the new separable effects to study the causal effect of a treatment on an e...

Journal ArticleDOI
TL;DR: In this paper, a new approach for sequential monitoring of a general class of parameters of a d-dimensional time series, which can be estimated by approximately linear functionals of t.
Abstract: In this article, we propose a new approach for sequential monitoring of a general class of parameters of a d-dimensional time series, which can be estimated by approximately linear functionals of t...

Journal ArticleDOI
TL;DR: In statistical prediction, classical approaches for model selection and model evaluation based on covariance penalties are still widely used as discussed by the authors, and most of the literature on this topic is based on what w...
Abstract: In statistical prediction, classical approaches for model selection and model evaluation based on covariance penalties are still widely used. Most of the literature on this topic is based on what w...

Journal ArticleDOI
TL;DR: A fast and efficient screening procedure based on the spectral norm of each coefficient matrix is proposed to deal with the case when the number of covariates is extremely large, and a theoretical guarantee is established for the overall solution of the two-step screening and estimation procedure.
Abstract: The aim of this article is to develop a low-rank linear regression model to correlate a high-dimensional response matrix with a high-dimensional vector of covariates when coefficient matrices have ...

Journal ArticleDOI
TL;DR: The authors propose angle-based direct learning (AD-learning) to efficiently estimate optimal ITRs with multiple treatments; it has an interesting geometric interpretation of the effect of different treatments for each individual patient, which can help doctors and patients make better decisions.
Abstract: Estimating an optimal individualized treatment rule (ITR) based on patients’ information is an important problem in precision medicine. An optimal ITR is a decision function that optimizes patients...

Journal ArticleDOI
TL;DR: In this paper, the authors argue that understanding and developing a correlation measure that can detect general dependencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery.
Abstract: Understanding and developing a correlation measure that can detect general dependencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery i...

Journal ArticleDOI
TL;DR: In this paper, the authors study how to determine the number of common factors in high-dimensional factor models, a setting in which the existing literature is mainly based on the eigenvalues of the covariance matrix.
Abstract: Determining the number of common factors is an important and practical topic in high-dimensional factor models. The existing literature is mainly based on the eigenvalues of the covariance matrix. ...

Journal ArticleDOI
TL;DR: The Ball Covariance is proposed as a generic measure of dependence between two random objects in two possibly different Banach spaces; it is nonparametric and model-free, which makes the proposed measure robust to model misspecification.
Abstract: Technological advances in science and engineering have led to the routine collection of large and complex data objects, where the dependence structure among those objects is often of great interest...