scispace - formally typeset
Topic

Coverage probability

About: Coverage probability is a research topic. Over its lifetime, 2,479 publications have appeared within this topic, receiving 53,259 citations.


Papers
Journal Article
TL;DR: In this article, the authors proposed a method to construct confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model by turning the regression data into an approximate Gaussian sequence of point estimators of individual regression coefficients.
Abstract: The purpose of this paper is to propose methodologies for statistical inference of low dimensional parameters with high dimensional data. We focus on constructing confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model, although our ideas are applicable in a much broader context. The theoretical results that are presented provide sufficient conditions for the asymptotic normality of the proposed estimators along with a consistent estimator for their finite dimensional covariance matrices. These sufficient conditions allow the number of variables to exceed the sample size and the presence of many small non-zero coefficients. Our methods and theory apply to interval estimation of a preconceived regression coefficient or contrast as well as simultaneous interval estimation of many regression coefficients. Moreover, the method proposed turns the regression data into an approximate Gaussian sequence of point estimators of individual regression coefficients, which can be used to select variables after proper thresholding. The simulation results that are presented demonstrate the accuracy of the coverage probability of the confidence intervals proposed as well as other desirable properties, strongly supporting the theoretical results.

892 citations
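
Simulation checks of coverage probability, like the one the abstract above reports, follow a simple pattern: draw many datasets from a known model, build an interval from each, and count how often the interval contains the true parameter. The sketch below is a minimal illustration of that pattern only. It assumes a known-variance z-interval for a normal mean, which is far simpler than the high-dimensional regression setting of the paper, and the function name and default values are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def simulate_coverage(n=30, mu=1.0, sigma=2.0, level=0.95, reps=4000, seed=0):
    """Monte Carlo estimate of the coverage of a known-variance z-interval.

    Illustrative assumption: data are i.i.d. Normal(mu, sigma^2) and sigma
    is known, so the interval is xbar +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval covers mu exactly when |xbar - mu| <= half_width.
        if abs(xbar - mu) <= half_width:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(f"estimated coverage: {simulate_coverage():.3f}")
```

With a nominal level of 0.95 the estimate should land close to 0.95, up to Monte Carlo error of roughly sqrt(0.95 * 0.05 / reps).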

Journal Article
TL;DR: This review identifies known methods for estimating the between-study variance and its corresponding uncertainty, summarises the simulation and empirical evidence that compares them, and recommends the Q-profile method and the alternative approach based on a 'generalised Cochran between-study variance statistic' for computing confidence intervals.
Abstract: Meta-analyses are typically used to estimate the overall mean of an outcome of interest. However, inference about between-study variability, which is typically modelled using a between-study variance parameter, is usually an additional aim. The DerSimonian and Laird method, currently widely used by default to estimate the between-study variance, has long been challenged. Our aim is to identify known methods for estimation of the between-study variance and its corresponding uncertainty, and to summarise the simulation and empirical evidence that compares them. We identified 16 estimators for the between-study variance, seven methods to calculate confidence intervals, and several comparative studies. Simulation studies suggest that for both dichotomous and continuous data the estimator proposed by Paule and Mandel and for continuous data the restricted maximum likelihood estimator are better alternatives to estimate the between-study variance. Based on the scenarios and results presented in the published studies, we recommend the Q-profile method and the alternative approach based on a 'generalised Cochran between-study variance statistic' to compute corresponding confidence intervals around the resulting estimates. Our recommendations are based on a qualitative evaluation of the existing literature and expert consensus. Evidence-based recommendations require an extensive simulation study where all methods would be compared under the same scenarios.

828 citations
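
For readers unfamiliar with the DerSimonian and Laird method named in the abstract above, the following sketch shows the standard moment-based form of that estimator. It is an illustration under stated assumptions, not code from the review: the function name and example inputs are hypothetical, and the inputs are assumed to be per-study effect estimates with known within-study variances.

```python
def dersimonian_laird_tau2(effects, variances):
    """Moment-based DerSimonian-Laird estimate of the between-study variance.

    effects   : per-study effect estimates y_i
    variances : per-study within-study variances v_i (assumed known, > 0)
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("within-study variances must be positive")

    weights = [1.0 / v for v in variances]           # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    # Scaling constant from the expectation of Q under the random-effects model.
    c = sum_w - sum(w * w for w in weights) / sum_w

    # Truncate at zero, since a variance cannot be negative.
    return max(0.0, (q - (k - 1)) / c)


if __name__ == "__main__":
    print(dersimonian_laird_tau2([0.0, 2.0], [1.0, 1.0]))  # 1.0
```

The truncation at zero is one reason this estimator behaves poorly with few studies, which is consistent with the abstract's remark that the default method has been challenged.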

Journal Article
TL;DR: The limitations and usefulness of each method are addressed in order to give researchers guidance in constructing appropriate estimates of biomarkers' true discriminating capabilities.
Abstract: The receiver operating characteristic (ROC) curve is used to evaluate a biomarker's ability for classifying disease status. The Youden Index (J), the maximum potential effectiveness of a biomarker, is a common summary measure of the ROC curve. In biomarker development, levels may be unquantifiable below a limit of detection (LOD) and missing from the overall dataset. Disregarding these observations may negatively bias the ROC curve and thus J. Several correction methods have been suggested for mean estimation and testing; however, little has been written about the ROC curve or its summary measures. We adapt non-parametric (empirical) and semi-parametric (ROC-GLM [generalized linear model]) methods and propose parametric methods (maximum likelihood (ML)) to estimate J and the optimal cut-point (c*) for a biomarker affected by a LOD. We develop unbiased estimators of J and c* via ML for normally and gamma distributed biomarkers. Alpha level confidence intervals are proposed using delta and bootstrap methods for the ML, semi-parametric, and non-parametric approaches respectively. Simulation studies are conducted over a range of distributional scenarios and sample sizes evaluating estimators' bias, root-mean square error, and coverage probability; the average bias was less than one percent for ML and GLM methods across scenarios and decreases with increased sample size. An example using polychlorinated biphenyl levels to classify women with and without endometriosis illustrates the potential benefits of these methods. We address the limitations and usefulness of each method in order to give researchers guidance in constructing appropriate estimates of biomarkers' true discriminating capabilities.

801 citations
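
The Youden Index described in the abstract above is the maximum over cut-points of sensitivity + specificity - 1. A minimal empirical (non-parametric) sketch follows. It assumes that higher biomarker values indicate disease and that every value is fully observed, so it does not implement the limit-of-detection corrections the paper proposes. The function name and sample values are hypothetical.

```python
def empirical_youden(diseased, healthy):
    """Return (J, cut_point) maximising sensitivity + specificity - 1.

    Classification rule assumed here: a subject is called positive when
    the biomarker value is >= cut_point.
    """
    if not diseased or not healthy:
        raise ValueError("both groups must be non-empty")

    best_j, best_cut = float("-inf"), None
    # Only observed values need to be tried as candidate cut-points.
    for cut in sorted(set(diseased) | set(healthy)):
        sensitivity = sum(x >= cut for x in diseased) / len(diseased)
        specificity = sum(x < cut for x in healthy) / len(healthy)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # ties keep the smallest cut-point
            best_j, best_cut = j, cut
    return best_j, best_cut


if __name__ == "__main__":
    # Perfectly separated groups give J = 1 at the smallest diseased value.
    print(empirical_youden([5, 6, 7], [1, 2, 3]))  # (1.0, 5)
```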

Journal Article
TL;DR: In this article, a conceptually different type of confidence interval is proposed for partially identified parameters: rather than covering the entire identification region with fixed probability, it asymptotically covers the true value of the parameter with that probability. The exact coverage probabilities of the simplest version of the new CI do not converge to their nominal values uniformly across different widths of the identification region, so a modified CI is proposed that does.
Abstract: Recently a growing body of research has studied inference in settings where parameters of interest are partially identified. In many cases the parameter is real-valued and the identification region is an interval whose lower and upper bounds may be estimated from sample data. For this case confidence intervals (CIs) have been proposed that cover the entire identification region with fixed probability. Here, we introduce a conceptually different type of confidence interval. Rather than cover the entire identification region with fixed probability, we propose CIs that asymptotically cover the true value of the parameter with this probability. However, the exact coverage probabilities of the simplest version of our new CIs do not converge to their nominal values uniformly across different values for the width of the identification region. To avoid the problems associated with this, we modify the proposed CI to ensure that its exact coverage probabilities do converge uniformly to their nominal values. We motivate this modified CI through exact results for the Gaussian case.

662 citations

Journal Article
TL;DR: Two methods are provided to correct relative risk estimates obtained from logistic regression models for measurement errors in continuous exposures within cohort studies that may be due to either random (unbiased) within-person variation or to systematic errors for individual subjects.
Abstract: Errors in the measurement of exposure that are independent of disease status tend to bias relative risk estimates and other measures of effect in epidemiologic studies toward the null value. Two methods are provided to correct relative risk estimates obtained from logistic regression models for measurement errors in continuous exposures within cohort studies that may be due to either random (unbiased) within-person variation or to systematic errors for individual subjects. These methods require a separate validation study to estimate the regression coefficient lambda relating the surrogate measure to true exposure. In the linear approximation method, the true logistic regression coefficient beta* is estimated by beta/lambda, where beta is the observed logistic regression coefficient based on the surrogate measure. In the likelihood approximation method, a second-order Taylor series expansion is used to approximate the logistic function, enabling closed-form likelihood estimation of beta*. Confidence intervals for the corrected relative risks are provided that include a component representing error in the estimation of lambda. Based on simulation studies, both methods perform well for true odds ratios up to 3.0; for higher odds ratios the likelihood approximation method was superior with respect to both bias and coverage probability. An example is provided based on data from a prospective study of dietary fat intake and risk of breast cancer and a validation study of the questionnaire used to assess dietary fat intake.

649 citations
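
The linear approximation method described in the abstract above estimates the true coefficient as beta* = beta / lambda. The sketch below applies that ratio and attaches a delta-method confidence interval whose variance includes a term for the uncertainty in lambda. The variance formula is an assumption made here for illustration (it treats the estimates of beta and lambda as independent, as they would be if they came from separate studies) and may differ in detail from the paper's own expression. All names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def corrected_log_odds(beta, se_beta, lam, se_lam, level=0.95):
    """Measurement-error-corrected logistic coefficient with a Wald interval.

    beta, se_beta : observed coefficient (surrogate exposure) and its SE
    lam, se_lam   : validation-study slope lambda and its SE
    """
    if lam == 0:
        raise ValueError("lambda must be non-zero")

    beta_star = beta / lam  # linear approximation from the abstract

    # Delta-method variance of a ratio of independent estimates (assumption):
    # the first term reflects error in beta, the second error in lambda.
    variance = se_beta**2 / lam**2 + beta**2 * se_lam**2 / lam**4

    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(variance)
    return beta_star, (beta_star - half_width, beta_star + half_width)


if __name__ == "__main__":
    est, (lo, hi) = corrected_log_odds(beta=0.5, se_beta=0.1, lam=0.5, se_lam=0.05)
    # Exponentiate to report a corrected odds ratio and its interval.
    print(math.exp(est), math.exp(lo), math.exp(hi))
```

Because lambda is typically below 1 for a noisy surrogate, dividing by it moves the estimate away from the null, which matches the abstract's point that measurement error biases effects toward the null value.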


Network Information
Related Topics (5)
Estimator: 97.3K papers, 2.6M citations, 86% related
Statistical hypothesis testing: 19.5K papers, 1M citations, 80% related
Linear model: 19K papers, 1M citations, 79% related
Markov chain: 51.9K papers, 1.3M citations, 79% related
Multivariate statistics: 18.4K papers, 1M citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    63
2022    153
2021    142
2020    151
2019    142