
Showing papers on "Coverage probability published in 2011"


Journal ArticleDOI
TL;DR: A new, fast, yet reliable method for the construction of PIs for NN predictions is proposed, and a quantitative comparison with three traditional techniques for prediction interval construction reveals that the LUBE method is simpler, faster, and more reliable.
Abstract: Prediction intervals (PIs) have been proposed in the literature to provide more information by quantifying the level of uncertainty associated with point forecasts. Traditional methods for the construction of neural network (NN)-based PIs suffer from restrictive assumptions about data distribution and massive computational loads. In this paper, we propose a new, fast, yet reliable method for the construction of PIs for NN predictions. The proposed lower upper bound estimation (LUBE) method constructs an NN with two outputs for estimating the prediction interval bounds. NN training is achieved through the minimization of a proposed PI-based objective function, which covers both interval width and coverage probability. The method does not require any information about the upper and lower bounds of PIs for training the NN. The simulated annealing method is applied for minimization of the cost function and adjustment of NN parameters. The results for 10 benchmark regression case studies clearly show the LUBE method to be capable of generating high-quality PIs in a short time. Also, the quantitative comparison with three traditional techniques for prediction interval construction reveals that the LUBE method is simpler, faster, and more reliable.
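
To make the training objective concrete, here is a minimal sketch of a PI-based cost of the kind the abstract describes, combining PI coverage probability (PICP) with normalized average width; the function name and the penalty weight `eta` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def pi_cost(y, lower, upper, nominal=0.95, eta=50.0):
    """Toy PI-based cost: reward narrow intervals, penalize coverage
    below the nominal level. `eta` is an illustrative penalty weight."""
    picp = np.mean((y >= lower) & (y <= upper))   # PI coverage probability
    pinaw = np.mean(upper - lower) / np.ptp(y)    # normalized average width
    return pinaw * (1.0 + eta * max(0.0, nominal - picp))

# usage on hypothetical targets and candidate interval bounds
y = np.random.randn(200)
print(pi_cost(y, y - 2.0, y + 2.0))
```

A derivative-free optimizer such as simulated annealing suits a cost like this because the coverage indicator makes it non-differentiable.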

533 citations


Posted Content
TL;DR: The proposed method turns the regression data into an approximately Gaussian sequence of point estimators of individual regression coefficients, which can be used to select variables after proper thresholding; simulations demonstrate the accuracy of the coverage probability and other desirable properties of the proposed confidence intervals.
Abstract: The purpose of this paper is to propose methodologies for statistical inference of low-dimensional parameters with high-dimensional data. We focus on constructing confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model, although our ideas are applicable in a much broader context. The theoretical results presented here provide sufficient conditions for the asymptotic normality of the proposed estimators along with a consistent estimator for their finite-dimensional covariance matrices. These sufficient conditions allow the number of variables to far exceed the sample size. The simulation results presented here demonstrate the accuracy of the coverage probability of the proposed confidence intervals, strongly supporting the theoretical results.
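
The final step of such a construction is a routine Wald-type interval built from an asymptotically normal estimator and a consistent standard-error estimate; a minimal sketch, with a hypothetical coefficient estimate and SE in the usage line:

```python
from scipy.stats import norm

def wald_ci(estimate, se, level=0.95):
    """Wald-type interval: estimate +/- z * SE, valid when the estimator
    is asymptotically normal with consistently estimated variance."""
    z = norm.ppf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se

print(wald_ci(1.8, 0.4))  # hypothetical low-dimensional coefficient
```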

188 citations


Journal ArticleDOI
TL;DR: The obtained results indicate that the delta technique outperforms the Bayesian technique in terms of narrowness of PIs with satisfactory coverage probability, and PIs constructed using the Bayesian technique are more robust against the NN structure and exhibit excellent coverage probability.
Abstract: The accurate prediction of travel times is desirable but frequently prone to error. This is mainly attributable to both the underlying traffic processes and the data that are used to infer travel time. A more meaningful and pragmatic approach is to view travel time prediction as a probabilistic inference and to construct prediction intervals (PIs), which cover the range of probable travel times travelers may encounter. This paper introduces the delta and Bayesian techniques for the construction of PIs. Quantitative measures are developed and applied for a comprehensive assessment of the constructed PIs. These measures simultaneously address two important aspects of PIs: 1) coverage probability and 2) length. The Bayesian and delta methods are used to construct PIs for the neural network (NN) point forecasts of bus and freeway travel time data sets. The obtained results indicate that the delta technique outperforms the Bayesian technique in terms of narrowness of PIs with satisfactory coverage probability. In contrast, PIs constructed using the Bayesian technique are more robust against the NN structure and exhibit excellent coverage probability.

152 citations


Journal ArticleDOI
TL;DR: A genetic algorithm–based method is developed that automates the neural network model selection and adjustment of the hyperparameter and demonstrates the suitability of the proposed method for improving the quality of constructed prediction intervals in terms of their length and coverage probability.
Abstract: The transportation literature is rich in the application of neural networks for travel time prediction. The uncertainty prevailing in operation of transportation systems, however, highly degrades prediction performance of neural networks. Prediction intervals for neural network outcomes can properly represent the uncertainty associated with the predictions. This paper studies an application of the delta technique for the construction of prediction intervals for bus and freeway travel times. The quality of these intervals strongly depends on the neural network structure and a training hyperparameter. A genetic algorithm–based method is developed that automates the neural network model selection and adjustment of the hyperparameter. Model selection and parameter adjustment are carried out through minimization of a prediction interval-based cost function, which depends on the width and coverage probability of constructed prediction intervals. Experiments conducted using the bus and freeway travel time datasets demonstrate the suitability of the proposed method for improving the quality of constructed prediction intervals in terms of their length and coverage probability.
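
As a rough illustration of the search component, here is a stripped-down evolutionary loop over real-valued hyperparameters, standing in for the GA the paper describes (truncation selection plus Gaussian mutation, no crossover); all names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve(cost, dim, pop_size=20, gens=50, sigma=0.1):
    """Minimize `cost` over R^dim by keeping the best quarter of each
    generation and mutating copies of the survivors."""
    pop = rng.normal(size=(pop_size, dim))
    for _ in range(gens):
        scores = np.array([cost(ind) for ind in pop])
        elite = pop[np.argsort(scores)[: pop_size // 4]]
        parents = elite[rng.integers(len(elite), size=pop_size)]
        pop = parents + rng.normal(scale=sigma, size=parents.shape)
    scores = np.array([cost(ind) for ind in pop])
    return pop[scores.argmin()]

print(evolve(lambda w: np.sum((w - 1.0) ** 2), dim=3))  # toy cost
```

In the paper's setting the cost would be the prediction interval-based function, evaluated by training and scoring a network for each candidate.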

88 citations


Journal ArticleDOI
TL;DR: The model properties and reliability measures are derived and studied in detail; maximum likelihood and Bayes approaches are used for estimation, and the coverage probability for the parameter is evaluated.

86 citations


Journal ArticleDOI
Hisashi Noma
TL;DR: Three confidence intervals for improving coverage properties are developed, based on (i) the Bartlett corrected likelihood ratio statistic, (ii) the efficient score statistic, and (iii) the Bartlett-type adjusted efficient score statistic, which demonstrate better coverage properties than the existing methods.
Abstract: In medical meta-analysis, the DerSimonian-Laird confidence interval for the average treatment effect has been widely adopted in practice. However, it is well known that its coverage probability (the probability that the interval actually includes the true value) can be substantially below the target level. One particular reason is that the validity of the confidence interval depends on the assumption that the number of synthesized studies is sufficiently large. In typical medical meta-analyses, the number of studies is fewer than 20. In this article, we developed three confidence intervals for improving coverage properties, based on (i) the Bartlett corrected likelihood ratio statistic, (ii) the efficient score statistic, and (iii) the Bartlett-type adjusted efficient score statistic. The Bartlett and Bartlett-type corrections improve the large sample approximations for the likelihood ratio and efficient score statistics. Through numerical evaluations by simulations, these confidence intervals demonstrated better coverage properties than the existing methods. In particular, with a moderate number of synthesized studies, the Bartlett and Bartlett-type corrected confidence intervals performed well. An application to a meta-analysis of the treatment for myocardial infarction with intravenous magnesium is presented.
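
For reference, the interval whose undercoverage motivates these corrections can be computed in a few lines; a sketch of the standard DerSimonian-Laird random-effects interval, with hypothetical study effects and variances:

```python
import numpy as np
from scipy.stats import norm

def dersimonian_laird_ci(y, v, level=0.95):
    """DerSimonian-Laird CI: moment estimate of the between-study
    variance tau^2, then an inverse-variance weighted Wald interval."""
    w = 1.0 / v
    mu_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fixed) ** 2)            # Cochran's Q
    k = len(y)
    tau2 = max(0.0, (q - (k - 1)) / (w.sum() - (w**2).sum() / w.sum()))
    w_star = 1.0 / (v + tau2)
    mu = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    z = norm.ppf(0.5 + level / 2.0)
    return mu - z * se, mu + z * se

y = np.array([-0.3, -0.1, -0.5, 0.1, -0.4, -0.2])  # hypothetical log odds ratios
v = np.array([0.04, 0.09, 0.05, 0.12, 0.06, 0.08])
print(dersimonian_laird_ci(y, v))
```

With only k = 6 studies, as here, the normal quantile and the plug-in tau^2 are exactly the large-sample approximations the Bartlett-type corrections target.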

79 citations


Journal ArticleDOI
TL;DR: In this paper, the maximum likelihood estimators (MLEs) of the Weibull parameters were derived and the asymptotic distributions of the MLEs were used to construct approximate confidence intervals.

44 citations


Journal ArticleDOI
TL;DR: An attempt has been made to review some existing estimators along with some proposed methods, compare them under the same simulation conditions, and recommend some possible good interval estimators for practitioners.
Abstract: Several researchers have considered various interval estimators for estimating the population coefficient of variation (CV) of symmetric and skewed distributions. Since these were studied at different times and under different simulation conditions, their performances are not directly comparable. In this article, an attempt has been made to review some existing estimators along with some proposed methods and compare them under the same simulation conditions. In particular, we have considered the Hendricks and Robey, McKay, Miller, Sharma and Krishna, and Curto and Pinto interval estimators, along with some proposed bootstrap interval estimators for estimating the population CV. A simulation study has been conducted to compare the performance of the estimators. Both average widths and coverage probabilities are considered as criteria of good estimators. Two real-life health-related data sets are analyzed to illustrate the findings of the article. Based on the simulation study, some possible good interval estimators have been recommended for practitioners.
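
Of the families compared, the percentile bootstrap is the simplest to sketch; a minimal version for the CV, with illustrative toy data and settings:

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_cv_ci(x, level=0.95, n_boot=5000):
    """Percentile-bootstrap interval for the coefficient of variation:
    resample with replacement, recompute s / x-bar, take quantiles."""
    idx = rng.integers(len(x), size=(n_boot, len(x)))
    samples = x[idx]
    cvs = samples.std(axis=1, ddof=1) / samples.mean(axis=1)
    alpha = 1.0 - level
    return np.quantile(cvs, [alpha / 2, 1 - alpha / 2])

x = rng.gamma(shape=4.0, scale=2.0, size=50)  # skewed toy sample
print(bootstrap_cv_ci(x))
```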

38 citations


Journal ArticleDOI
TL;DR: The difference between the means is an appropriate effect measure for comparing two independent discrete numerical variables that have both lower and upper bounds.
Abstract: The number of events per individual is a widely reported variable in medical research papers. Such variables are the most common representation of the general variable type called discrete numerical. There is currently no consensus on how to compare and present such variables, and recommendations are lacking. The objective of this paper is to present recommendations for analysis and presentation of results for discrete numerical variables. Two simulation studies were used to investigate the performance of hypothesis tests and confidence interval methods for variables with outcomes {0, 1, 2}, {0, 1, 2, 3}, {0, 1, 2, 3, 4}, and {0, 1, 2, 3, 4, 5}, using the difference between the means as an effect measure. The Welch U test (the T test with adjustment for unequal variances) and its associated confidence interval performed well for almost all situations considered. The Brunner-Munzel test also performed well, except for small sample sizes (10 in each group). The ordinary T test, the Wilcoxon-Mann-Whitney test, the percentile bootstrap interval, and the bootstrap-t interval did not perform satisfactorily. The difference between the means is an appropriate effect measure for comparing two independent discrete numerical variables that have both lower and upper bounds. To analyze this problem, we encourage more frequent use of parametric hypothesis tests and confidence intervals.
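
A sketch of the recommended method, the Welch T interval for a difference of means with Welch-Satterthwaite degrees of freedom; the two samples below are hypothetical event counts:

```python
import numpy as np
from scipy import stats

def welch_ci(x, y, level=0.95):
    """Welch T interval for mean(x) - mean(y) without assuming equal
    variances (Welch-Satterthwaite degrees of freedom)."""
    vx, vy = x.var(ddof=1) / len(x), y.var(ddof=1) / len(y)
    se = np.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx**2 / (len(x) - 1) + vy**2 / (len(y) - 1))
    t = stats.t.ppf(0.5 + level / 2.0, df)
    d = x.mean() - y.mean()
    return d - t * se, d + t * se

x = np.array([0, 1, 1, 2, 0, 3, 1, 2, 0, 1])  # events per individual, group A
y = np.array([1, 2, 2, 3, 1, 4, 2, 2, 1, 3])  # events per individual, group B
print(welch_ci(x, y))
```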

38 citations


Journal ArticleDOI
TL;DR: In this article, prediction intervals for the binomial and Poisson distributions are reviewed and compared, and simple approximate prediction intervals based on the joint distribution of the past samples and the future sample are proposed.

Journal ArticleDOI
TL;DR: Skart is an automated batch-means procedure for constructing a skewness- and autoregression-adjusted confidence interval for the steady-state mean of a simulation output process that satisfies user-specified requirements concerning not only coverage probability but also the absolute or relative precision provided by the half-length.
Abstract: An analysis is given for an extensive experimental performance evaluation of Skart, an automated sequential batch means procedure for constructing an asymptotically valid confidence interval (CI) on the steady-state mean of a simulation output process. Skart is designed to deliver a CI satisfying user-specified requirements on absolute or relative precision as well as coverage probability. Skart exploits separate adjustments to the half-length of the classical batch means CI so as to account for the effects on the distribution of the underlying Student's t-statistic that arise from skewness (nonnormality) and autocorrelation of the batch means. Skart also delivers a point estimator for the steady-state mean that is approximately free of initialization bias. In an experimental performance evaluation involving a wide range of test processes, Skart compared favorably with other steady-state simulation analysis methods---namely, its predecessors ASAP3, WASSP, and SBatch, as well as ABATCH, LBATCH, the Heidelberger--Welch procedure, and the Law--Carson procedure. Specifically, Skart exhibited competitive sampling efficiency and closer conformance to the given CI coverage probabilities than the other procedures, especially in the most difficult test processes.
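
Skart's starting point is the classical batch-means interval; a minimal sketch of that unadjusted baseline (Skart's skewness and autocorrelation adjustments to the half-length are omitted), run on a toy AR(1) stream standing in for simulation output:

```python
import numpy as np
from scipy import stats

def batch_means_ci(output, n_batches=20, level=0.95):
    """Classical batch-means CI: split the run into contiguous batches,
    treat batch means as approximately i.i.d. normal, apply Student's t."""
    m = len(output) // n_batches
    means = np.asarray(output[: m * n_batches]).reshape(n_batches, m).mean(axis=1)
    half = stats.t.ppf(0.5 + level / 2.0, n_batches - 1) \
        * means.std(ddof=1) / np.sqrt(n_batches)
    return means.mean() - half, means.mean() + half

rng = np.random.default_rng(2)
x = np.zeros(10_000)
for t in range(1, len(x)):          # autocorrelated toy output process
    x[t] = 0.8 * x[t - 1] + rng.normal()
print(batch_means_ci(x))
```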

Journal ArticleDOI
TL;DR: In this paper, the authors show that the block bootstrap outperforms subsampling methods in size accuracy and propose resampling methods to implement robust backtests of value-at-risk (VaR) models.
Abstract: Backtesting methods are statistical tests designed to uncover value-at-risk (VaR) models not capable of reporting the correct unconditional coverage probability or of filtering the serial dependence in the data. We show in this paper that these methods are subject to the presence of model risk produced by the incorrect specification of the conditional VaR model and derive its effect on the asymptotic distribution of the relevant out-of-sample tests. We also show that in the absence of estimation risk, the unconditional backtest is affected by model misspecification but the independence test is not. We propose using resampling methods to implement robust backtests. Our experiments suggest that the block bootstrap outperforms subsampling methods in size accuracy. We carry out a Monte Carlo study to assess the importance of model risk in finite samples for location-scale models that are incorrectly specified but correct "on average". An application to the Dow Jones Index shows the impact of correcting for model risk on backtesting procedures for different dynamic VaR models measuring risk exposure.
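
The unconditional-coverage backtest in this literature is typically Kupiec's proportion-of-failures likelihood-ratio test; a minimal sketch, with a hypothetical violation count in the usage line:

```python
import numpy as np
from scipy.stats import chi2

def kupiec_pof(violations, n, p):
    """Kupiec LR test of unconditional coverage: compare the nominal
    violation rate p with the observed rate. Asymptotically chi2(1).
    Assumes 0 < violations < n."""
    pi_hat = violations / n
    lr = -2 * (violations * np.log(p) + (n - violations) * np.log(1 - p)
               - violations * np.log(pi_hat)
               - (n - violations) * np.log(1 - pi_hat))
    return lr, chi2.sf(lr, df=1)

# hypothetical backtest: 19 violations of a 5% VaR over 250 days
print(kupiec_pof(violations=19, n=250, p=0.05))
```

The paper's point is that the asymptotic chi-square reference can be unreliable under model and estimation risk, which is what the resampled versions correct.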

Journal ArticleDOI
TL;DR: In this article, the authors considered interval estimation for two Weibull populations when joint Type-II progressive censoring is implemented and obtained the conditional maximum likelihood estimators of the two Weibull parameters under this scheme.
Abstract: Comparative lifetime experiments are important when the object of a study is to determine the relative merits of two competing products with respect to the duration of their service life. This study considers interval estimation for two Weibull populations when joint Type-II progressive censoring is implemented. We obtain the conditional maximum likelihood estimators of the two Weibull parameters under this scheme. Moreover, simultaneous approximate confidence regions based on the asymptotic normality of the maximum likelihood estimators are also discussed and compared with two bootstrap confidence regions. We also examine the behavior of the probability-of-failure structure under different censoring schemes. A simulation study is performed and an illustrative example is also given.

Journal ArticleDOI
TL;DR: Extensions to construct simultaneous confidence bands for the mean profile over the covariate region of interest and to assess equivalence between two models in biosimilarity applications are presented.
Abstract: Many applications in biostatistics rely on nonlinear regression models, such as, for example, population pharmacokinetic and pharmacodynamic modeling, or modeling approaches for dose-response characterization and dose selection. Such models are often expressed as nonlinear mixed-effects models, which are implemented in all major statistical software packages. Inference on the model curve can be based on the estimated parameters, from which pointwise confidence intervals for the mean profile at any single point in the covariate region (time, dose, etc.) can be derived. These pointwise confidence intervals, however, should not be used for simultaneous inferences beyond that single covariate value. If assessment over the entire covariate region is required, the joint coverage probability of the combined pointwise confidence intervals is likely to be less than the nominal coverage probability. In this paper we consider simultaneous confidence bands for the mean profile over the covariate region of interest, with extensions to assess equivalence between two models in biosimilarity applications.

Journal ArticleDOI
TL;DR: N-Skart is a nonsequential procedure designed to deliver a confidence interval (CI) for the steady-state mean of a simulation output process when the user supplies a single simulation-generated time series of arbitrary size and specifies the required coverage probability for a CI based on that data set.
Abstract: We discuss N-Skart, a nonsequential procedure designed to deliver a confidence interval (CI) for the steady-state mean of a simulation output process when the user supplies a single simulation-generated time series of arbitrary size and specifies the required coverage probability for a CI based on that data set. N-Skart is a variant of the method of batch means that exploits separate adjustments to the half-length of the CI so as to account for the effects on the distribution of the underlying Student's t-statistic that arise from skewness (nonnormality) and autocorrelation of the batch means. If the sample size is sufficiently large, then N-Skart delivers not only a CI but also a point estimator for the steady-state mean that is approximately free of initialization bias. In an experimental performance evaluation involving a wide range of test processes and sample sizes, N-Skart exhibited close conformance to the user-specified CI coverage probabilities.

Journal ArticleDOI
TL;DR: The authors argue that the poor coverage properties claimed by Santner et al. (2007) actually relate to an inferior version of the score interval (Mee, 1984) and that it is appropriate to align mean rather than minimum coverage with 1 − α, based on a moving average representation of the coverage probability.
Abstract: A recent article (Santner et al., 2007) asserted that a score interval for a difference of independent binomial proportions (Miettinen and Nurminen, 1985) may have inadequate coverage. We re-visit the properties of score intervals for binomial proportions and their differences. Published data indicate these methods produce mean coverage slightly above the nominal confidence level 1 − α. We argue it is appropriate to align mean rather than minimum coverage with 1 − α, based on a moving average representation of the coverage probability. The poor coverage properties claimed by Santner et al. (2007) actually relate to an inferior version of the score interval (Mee, 1984).
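
For a single proportion, the score interval in question is the Wilson interval; a minimal sketch, with hypothetical counts in the usage line:

```python
import numpy as np
from scipy.stats import norm

def wilson_interval(x, n, level=0.95):
    """Wilson score interval: invert the score test for a binomial
    proportion (center pulled toward 1/2, never outside [0, 1])."""
    z = norm.ppf(0.5 + level / 2.0)
    p = x / n
    center = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return center - half, center + half

print(wilson_interval(8, 20))  # hypothetical 8 successes in 20 trials
```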

Journal ArticleDOI
TL;DR: In this article, the problem of constructing tolerance intervals for the binomial and Poisson distributions is considered, and closed-form approximate equal-tailed tolerance intervals (that control percentages in both tails) are proposed for both distributions.
Abstract: The problems of constructing tolerance intervals for the binomial and Poisson distributions are considered. Closed-form approximate equal-tailed tolerance intervals (that control percentages in both tails) are proposed for both distributions. Exact coverage probabilities and expected widths are evaluated for the proposed equal-tailed tolerance intervals and the existing intervals. Furthermore, an adjustment to the nominal confidence level is suggested so that an equal-tailed tolerance interval can be used as a tolerance interval which includes a specified proportion of the population, but does not necessarily control percentages in both tails. Comparison of such coverage-adjusted tolerance intervals with respect to coverage probabilities and expected widths indicates that the closed-form approximate tolerance intervals are comparable with others, and less conservative, with minimum coverage probabilities close to the nominal level in most cases. The approximate tolerance intervals are simple and easy to compute using a calculator, and they can be recommended for practical applications. The methods are illustrated using two practical examples.
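
Exact coverage evaluations of the kind the abstract reports can be reproduced for any binomial interval procedure by summing binomial probabilities over the outcomes whose interval covers p; a sketch, using a plain Wald interval purely as the procedure under test:

```python
import numpy as np
from scipy.stats import binom, norm

def wald_interval(x, n, level=0.95):
    # simple Wald interval, used here only as the procedure under test
    z = norm.ppf(0.5 + level / 2.0)
    p = x / n
    half = z * np.sqrt(p * (1 - p) / n)
    return p - half, p + half

def exact_coverage(interval_fn, n, p_grid):
    """For each p, sum P(X = x) over all x whose interval covers p."""
    xs = np.arange(n + 1)
    intervals = [interval_fn(x, n) for x in xs]
    cover = []
    for p in p_grid:
        mask = np.array([lo <= p <= hi for lo, hi in intervals])
        cover.append(binom.pmf(xs, n, p)[mask].sum())
    return np.array(cover)

print(exact_coverage(wald_interval, 20, np.linspace(0.1, 0.9, 9)).round(3))
```

Expected widths can be tabulated the same way by replacing the coverage indicator with the interval length.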

Journal ArticleDOI
TL;DR: Empirical results show that Wald-type, score-type and bootstrap confidence intervals based on the dependence model perform satisfactorily for small to large sample sizes in the sense that their empirical coverage probabilities are close to the pre-specified nominal confidence level and are hence recommended.
Abstract: Bilateral dichotomous data are very common in modern medical comparative studies (e.g. comparison of two treatments in ophthalmologic, orthopaedic and otolaryngologic studies) in which information involving paired organs (e.g. eyes, ears and hips) is available from each subject. In this article, we study various confidence interval estimators for the proportion difference based on Wald-type statistics, the Fieller theorem, the likelihood ratio statistic, score statistics and the bootstrap resampling method under the dependence and/or independence models for bilateral binary data. Performance is evaluated with respect to coverage probability and expected width via simulation studies. Our empirical results show that (1) ignoring the dependence feature of bilateral data could lead to severely incorrect coverage probabilities; and (2) Wald-type, score-type and bootstrap confidence intervals based on the dependence model perform satisfactorily for small to large sample sizes in the sense that their empirical coverage probabilities are close to the pre-specified nominal confidence level; they are hence recommended.

Journal ArticleDOI
TL;DR: Under certain conditions, it is shown that the ECP of the bootstrap and the ECP of the asymptotic approximation converge to zero at the same rate, which is a faster rate than the rate of the ECP of subsampling methods.
Abstract: This paper considers the problem of coverage of the elements of the identified set in a class of partially identified econometric models with a prespecified probability. In order to conduct inference in partially identified econometric models defined by moment (in)equalities, the literature has proposed three methods: the bootstrap, subsampling, and an asymptotic approximation. The objective of this paper is to compare these methods in terms of the rate at which they achieve the desired coverage level, i.e., in terms of the rate at which the error in the coverage probability (ECP) converges to zero. Under certain conditions, we show that the ECP of the bootstrap and the ECP of the asymptotic approximation converge to zero at the same rate, which is a faster rate than the rate of the ECP of subsampling methods. As a consequence, under these conditions, the bootstrap and the asymptotic approximation produce inference that is more precise than subsampling. A Monte Carlo simulation study confirms that these results are relevant in finite samples.

Journal ArticleDOI
TL;DR: An approach based on the concepts of generalized inference can generally provide confidence intervals with reasonable coverage probabilities even at small sample sizes; it is illustrated via an application to a data set of blood test results of anemia patients.

Journal ArticleDOI
TL;DR: In this article, the authors apply the concept of generalized confidence intervals (GCI) to measure process capability based on the most widely used index C pk in the presence of measurement errors.
Abstract: In recent years, process capability indices have been widely used to provide numerical measures of process performance. A substantial majority of the capability research works that have appeared in the literature do not take into account gauge measurement errors. However, such assumptions do not adequately reflect real situations, since measurement errors unfortunately cannot be avoided in most manufacturing processes. Estimating and testing process capability without considering gauge measurement errors may often lead to unreliable decisions. Therefore, this paper applies the concept of generalized confidence intervals (GCI) to measure process capability based on the most widely used index, C pk, in the presence of measurement errors. An exhaustive simulation was conducted to assess the performance of the GCI method in terms of the coverage probability (CP) and the expected value of the generalized lower confidence limit. The results indicate that the GCI method can provide more accurate lower confidence limits, and CPs...
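
A generalized confidence bound for C pk is obtained by Monte Carlo over generalized pivotal quantities for the process mean and standard deviation; a minimal sketch without the paper's gauge measurement-error term (the sample summary and specification limits below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

def gci_cpk_lower(xbar, s, n, lsl, usl, level=0.95, draws=100_000):
    """Generalized lower confidence limit for Cpk: draw generalized
    pivotal quantities for sigma and mu, form the induced Cpk values,
    and take the lower quantile."""
    g_sigma = s * np.sqrt((n - 1) / rng.chisquare(n - 1, size=draws))
    g_mu = xbar - rng.standard_normal(draws) * g_sigma / np.sqrt(n)
    g_cpk = np.minimum(usl - g_mu, g_mu - lsl) / (3 * g_sigma)
    return np.quantile(g_cpk, 1 - level)

print(gci_cpk_lower(xbar=10.2, s=0.9, n=40, lsl=7.0, usl=13.0))
```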

Journal ArticleDOI
TL;DR: Simulation studies are conducted to evaluate the performance of the normal-based and the REML-based confidence intervals for the intraclass correlation coefficient under non-normal distribution assumptions.

Proceedings ArticleDOI
01 Jan 2011
TL;DR: Through experiments and simulations it is shown that the proposed method can be used to construct better quality bootstrap-based prediction intervals and the optimized prediction intervals have narrower widths with a greater coverage probability compared to traditional bootstrapped prediction intervals.
Abstract: The bootstrap method is one of the most widely used methods in the literature for the construction of confidence and prediction intervals. This paper proposes a new method for improving the quality of bootstrap-based prediction intervals. The core of the proposed method is a prediction interval-based cost function, which is used for training neural networks. A simulated annealing method is applied for minimization of the cost function and neural network parameter adjustment. The developed neural networks are then used for estimation of the target variance. Through experiments and simulations it is shown that the proposed method can be used to construct better quality bootstrap-based prediction intervals. The optimized prediction intervals have narrower widths with a greater coverage probability compared to traditional bootstrap-based prediction intervals.

Journal ArticleDOI
TL;DR: Asymptotically unbiased estimators are developed, via the maximum likelihood technique, of the area under the ROC curve of BLC of two bivariate normally distributed biomarkers affected by LODs.
Abstract: The receiver operating characteristic (ROC) curve is a tool commonly used to evaluate biomarker utility in the clinical diagnosis of disease. Often, multiple biomarkers are developed to evaluate the discrimination for the same outcome. Levels of multiple biomarkers can be combined via a best linear combination (BLC) such that their overall discriminatory ability is greater than that of any single biomarker. Biomarker measurements frequently have undetectable levels below a detection limit, sometimes denoted as the limit of detection (LOD). Ignoring observations below the LOD, or substituting some replacement value as a method of correction, has been shown to lead to negatively biased estimates of the area under the ROC curve for some distributions of single biomarkers. In this paper, we develop asymptotically unbiased estimators, via the maximum likelihood technique, of the area under the ROC curve of the BLC of two bivariate normally distributed biomarkers affected by LODs. We also propose confidence intervals for this area under the curve. Point and confidence interval estimates are scrutinized by simulation study, recording bias and root mean square error for the point estimates and coverage probability for the intervals. An example using polychlorinated biphenyl (PCB) levels to classify women with and without endometriosis illustrates the potential benefits of our methods.
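
Under normality, the best linear combination and its AUC have a closed form (the classical Su-Liu result): weights proportional to the inverse of the summed covariance matrices times the mean difference. A sketch that ignores the LOD complication the paper actually addresses; the means and covariances below are hypothetical:

```python
import numpy as np
from scipy.stats import norm

def blc_auc(mu_case, mu_ctrl, cov_case, cov_ctrl):
    """Su-Liu BLC of normal biomarkers: weights (S0 + S1)^{-1} delta,
    maximal AUC = Phi(sqrt(delta' (S0 + S1)^{-1} delta))."""
    delta = np.asarray(mu_case, float) - np.asarray(mu_ctrl, float)
    pooled = np.asarray(cov_case, float) + np.asarray(cov_ctrl, float)
    weights = np.linalg.solve(pooled, delta)
    return weights, norm.cdf(np.sqrt(delta @ weights))

w, auc = blc_auc([1.0, 0.8], [0.0, 0.0],
                 [[1.0, 0.3], [0.3, 1.0]], [[1.0, 0.3], [0.3, 1.0]])
print(w, auc)
```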

Journal ArticleDOI
TL;DR: In this paper, a new approach for computing the minimum sample size for the estimation of a binomial parameter with prescribed margin of error and confidence level is presented. Although computing the exact solution is nontrivial, the problem is not insurmountable.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the prediction of future order statistics based on the largest and smallest observations at the times when a new record of either kind (upper or lower) occurs.
Abstract: In this paper, based on the largest and smallest observations at the times when a new record of either kind (upper or lower) occurs, we discuss the prediction of future order statistics. The proposed prediction intervals are distribution-free in that the corresponding coverage probabilities are known exactly without any assumption about the parent distribution other than that it is continuous. An exact expression for the prediction coefficient of these intervals is derived. Similarly, prediction intervals for future records based on observed order statistics are also obtained. Finally, two real-life data sets, one involving the average July temperatures in Neuenburg, Switzerland, and the other involving the amount of annual rainfall at the Los Angeles Civic Center, are used to illustrate the procedures developed here.
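
The distribution-free character rests on the classical fact that, for an i.i.d. sample of size n from any continuous distribution, P(X_(i) < Y < X_(j)) = (j − i)/(n + 1) for a future observation Y; the paper's intervals for future order statistics and records generalize this identity. A minimal illustration:

```python
from fractions import Fraction

def order_stat_pi_coverage(i, j, n):
    """Exact coverage of (X_(i), X_(j)) as a prediction interval for one
    future observation from the same continuous distribution."""
    return Fraction(j - i, n + 1)

print(order_stat_pi_coverage(1, 20, 20))  # (min, max) of n = 20: 19/21
```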

Journal ArticleDOI
TL;DR: Numerical studies show that the coverage probabilities of the proposed interval estimation method are very accurate and type I errors are close to the nominal level even for very small samples.

Journal ArticleDOI
01 Aug 2011 - Test
TL;DR: In this paper, an empirical likelihood method is proposed to construct a confidence interval for the endpoint of a distribution function; the interval has better coverage accuracy than the normal approximation method, and bootstrap calibration further improves the accuracy.
Abstract: Estimating the endpoint of a distribution function is of interest in product analysis and predicting the maximum lifetime of an item. In this paper, we propose an empirical likelihood method to construct a confidence interval for the endpoint. A simulation study shows the proposed confidence interval has better coverage accuracy than the normal approximation method, and bootstrap calibration improves the accuracy.

Journal ArticleDOI
TL;DR: In this article, the authors considered the standard two-sample framework with right censoring and constructed useful confidence intervals for the ratio or difference of two hazard functions using smoothed empirical likelihood (EL) methods.