Posted Content

Confidence intervals in regression utilizing prior information

TL;DR: This paper proposes a frequentist 1-alpha confidence interval for theta that utilizes uncertain prior information that tau = 0. The interval is optimal in the sense that it minimizes a weighted average expected length, where the largest weight is given to the expected length when tau = 0.
Abstract: We consider a linear regression model with regression parameter beta=(beta_1,...,beta_p) and independent and identically N(0,sigma^2) distributed errors. Suppose that the parameter of interest is theta = a^T beta where a is a specified vector. Define the parameter tau=c^T beta-t where the vector c and the number t are specified and a and c are linearly independent. Also suppose that we have uncertain prior information that tau = 0. We present a new frequentist 1-alpha confidence interval for theta that utilizes this prior information. We require this confidence interval to (a) have endpoints that are continuous functions of the data and (b) coincide with the standard 1-alpha confidence interval when the data strongly contradict this prior information. This interval is optimal in the sense that it has minimum weighted average expected length where the largest weight is given to this expected length when tau=0. This minimization leads to an interval that has the following desirable properties. Its expected length (a) is relatively small when the prior information about tau is correct and (b) has a maximum value that is not too large. The following problem will be used to illustrate the application of this new confidence interval. Consider a 2-by-2 factorial experiment with 20 replicates. Suppose that the parameter of interest theta is a specified simple effect and that we have uncertain prior information that the two-factor interaction is zero. Our aim is to find a frequentist 0.95 confidence interval for theta that utilizes this prior information.
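For orientation, the standard 1-alpha confidence interval that the abstract uses as its benchmark can be sketched for the 2-by-2 factorial example. This is only the standard interval, not the new interval proposed in the paper. The ±1 coding, the simulated data, the coefficients in `beta_true`, and the choice of simple effect (factor 1 at the high level of factor 2, so that a = (0, 2, 0, 2)) are all illustrative assumptions that do not come from the source. The small Student t helpers are written out so that the sketch needs only the Python standard library.

```python
import math
import random

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus Simpson's rule applied to the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(prob, df):
    """Quantile for prob > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

random.seed(0)
alpha = 0.05
reps = 20  # replicates per treatment combination, as in the abstract

# Design rows: intercept, x1, x2, x1*x2 with +/-1 coding (an assumed coding).
X = [(1, x1, x2, x1 * x2)
     for x1 in (-1, 1) for x2 in (-1, 1) for _ in range(reps)]
n, p = len(X), 4

# Hypothetical true coefficients; the interaction (last entry) is set to 0.
beta_true = (10.0, 1.5, -0.5, 0.0)
y = [sum(b * x for b, x in zip(beta_true, row)) + random.gauss(0, 1)
     for row in X]

# The design is orthogonal (X^T X = n I), so least squares reduces to averages.
beta_hat = [sum(row[j] * yi for row, yi in zip(X, y)) / n for j in range(p)]
resid = [yi - sum(b * x for b, x in zip(beta_hat, row))
         for row, yi in zip(X, y)]
sigma_hat = math.sqrt(sum(r * r for r in resid) / (n - p))

# Assumed simple effect: theta = a^T beta = 2*beta_1 + 2*beta_12.
a = (0, 2, 0, 2)
theta_hat = sum(ai * bi for ai, bi in zip(a, beta_hat))
# For this design, a^T (X^T X)^{-1} a = a^T a / n.
se = sigma_hat * math.sqrt(sum(ai * ai for ai in a) / n)

q = t_quantile(1 - alpha / 2, n - p)
lower, upper = theta_hat - q * se, theta_hat + q * se
print(f"theta_hat = {theta_hat:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

With 80 observations and 4 regression parameters, the interval uses the t quantile with 76 degrees of freedom. Under the same assumed coding, the prior information that the two-factor interaction is zero corresponds to c = (0, 0, 0, 1) and t = 0.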
Citations
Journal ArticleDOI
Paul Kabaila
TL;DR: This review considers the important case in which the inference of interest is a confidence region, and surveys the literature in which the aim is to utilize uncertain prior information directly in the construction of confidence regions, without requiring the intermediate step of a preliminary statistical model selection.
Abstract: It is very common in applied frequentist (“classical”) statistics to carry out a preliminary statistical (i.e. data-based) model selection by, for example, using preliminary hypothesis tests or minimizing AIC. This is usually followed by the inference of interest, using the same data, based on the assumption that the selected model had been given to us a priori. This assumption is false and it can lead to an inaccurate and misleading inference. We consider the important case that the inference of interest is a confidence region. We review the literature that shows that the resulting confidence regions typically have very poor coverage properties. We also briefly review the closely related literature that describes the coverage properties of prediction intervals after preliminary statistical model selection. A possible motivation for preliminary statistical model selection is a wish to utilize uncertain prior information in the inference of interest. We review the literature in which the aim is to utilize uncertain prior information directly in the construction of confidence regions, without requiring the intermediate step of a preliminary statistical model selection. We also point out this aim as a future direction for research.

47 citations

Journal ArticleDOI
TL;DR: In this article, the authors develop an approach to evaluating frequentist model averaging procedures by considering them in a simple situation in which the averaging is over two nested linear regression models.
Abstract: We develop an approach to evaluating frequentist model averaging procedures by considering them in a simple situation in which there are two nested linear regression models over which we average. We introduce a general class of model averaged confidence intervals, obtain exact expressions for the coverage and the scaled expected length of the intervals, and use these to compute these quantities for the model averaged profile likelihood (MPI) and model-averaged tail area confidence intervals proposed by D. Fletcher and D. Turek. We show that the MPI confidence intervals can perform more poorly than the standard confidence interval used after model selection but ignoring the model selection process. The model-averaged tail area confidence intervals perform better than the MPI and post-model-selection confidence intervals but, for the examples that we consider, offer little over simply using the standard confidence interval for θ under the full model, with the same nominal coverage.

32 citations


Additional excerpts

  • ...5 (Kabaila & Giri, 2009a, 2009b)....


Posted Content
TL;DR: This article answers the question: to what extent can a frequentist 1-alpha confidence interval for mu utilize uncertain prior information that mu = 0?
Abstract: Consider X_1,X_2,...,X_n that are independent and identically N(mu,sigma^2) distributed. Suppose that we have uncertain prior information that mu = 0. We answer the question: to what extent can a frequentist 1-alpha confidence interval for mu utilize this prior information?

21 citations

Journal ArticleDOI
TL;DR: The authors consider a broad class of confidence intervals for μ_1 − μ_2 with minimum coverage probability 1 − α and focus on whether condition (a) can be satisfied. For 1 − α = 0.95, they conclude that no confidence interval in this class utilizes the prior information substantially better than J_0.
Abstract: Consider two independent random samples of size f + 1, one from an N(μ_1, σ_1^2) distribution and the other from an N(μ_2, σ_2^2) distribution, where σ_1^2/σ_2^2 ∈ (0, ∞). The Welch ‘approximate degrees of freedom’ (‘approximate t-solution’) confidence interval for μ_1 − μ_2 is commonly used when it cannot be guaranteed that σ_1^2/σ_2^2 = 1. Kabaila (2005, Comm. Statist. Theory and Methods 34, 291–302) multiplied the half-width of this interval by a positive constant so that the resulting interval, denoted by J_0, has minimum coverage probability 1 − α. Now suppose that we have uncertain prior information that σ_1^2/σ_2^2 = 1. We consider a broad class of confidence intervals for μ_1 − μ_2 with minimum coverage probability 1 − α. This class includes the interval J_0, which we use as the standard against which other members of the class will be judged. A confidence interval J in this class utilizes the prior information substantially better than J_0 if (expected length of J)/(expected length of J_0) is (a) substantially less than 1 (less than 0.96, say) for σ_1^2/σ_2^2 = 1, and (b) not too much larger than 1 for all other values of σ_1^2/σ_2^2. For a given f, does there exist a confidence interval in this class that satisfies these conditions? We focus on the question of whether condition (a) can be satisfied. For each given f, we compute a lower bound to the minimum over the class of (expected length of J)/(expected length of J_0) when σ_1^2/σ_2^2 = 1. For 1 − α = 0.95, this lower bound is not substantially less than 1. Thus, there does not exist any confidence interval belonging to this class that utilizes the prior information substantially better than J_0.

14 citations


Cites methods from "Confidence intervals in regression ..."

  • ...…Hodges & Lehmann (1952) and Bickel (1983, 1984), our aim is to utilize the uncertain prior information in the frequentist inference of interest, while providing a safeguard in case this prior information happens to be incorrect (cf. Kabaila, 1998; Farchione & Kabaila, 2008; Kabaila & Giri, 2007)....


Journal ArticleDOI
TL;DR: In this paper, the authors show that the naive 0.95 confidence interval has minimum coverage probability 0.0846, so it is completely inadequate, and they propose a new method for computing the minimum coverage probability of this naive confidence interval, regardless of how large s is.

10 citations

References
Book
24 Apr 1990

6,235 citations

Book
01 Jan 1954
TL;DR: This paper is based on a lecture on the “Design and Analysis of Industrial Experiments” given by Dr O. L. Davies on the 8th of May 1954; the recent designs developed by Box for the exploration of response surfaces are briefly considered.
Abstract: Design and analysis of industrial experiments. This paper is based on a lecture on the “Design and Analysis of Industrial Experiments” given by Dr O. L. Davies on the 8th of May 1954 to the Industrial Section of the “Vereniging voor Statistiek”. Production, formulation, and testing are distinguished as three separate fields of chemical activity where experimental designs can be applied, and various numerical examples of such experiments are discussed in detail. They consist of a 2^3 factorial design, a 2^4 half replicate and a 2^5 quarter replicate fractional factorial design, and a three-way classification. In a final section the recent designs developed by Box for the exploration of response surfaces are briefly considered.

1,035 citations

Book
01 Jan 1988

238 citations

Book ChapterDOI
TL;DR: In this article, the authors propose to restrict attention to decision procedures whose maximum risk does not exceed the minimax risk by more than a given amount, and, subject to this restriction, to minimize the average risk with respect to some guessed a priori distribution suggested by previous experience.
Abstract: Instead of minimizing the maximum risk it is proposed to restrict attention to decision procedures whose maximum risk does not exceed the minimax risk by more than a given amount. Subject to this restriction one may wish to minimize the average risk with respect to some guessed a priori distribution suggested by previous experience. It is shown how Wald’s minimax theory can be modified to yield analogous results concerning such restricted Bayes solutions. A number of examples are discussed, and some extensions of the above criterion are briefly considered.

207 citations


"Confidence intervals in regression ..." refers background in this paper

  • ...Similarly to Hodges and Lehmann (1952), Bickel (1983, 1984), Kabaila (1998), Kabaila (2005b), Farchione and Kabaila (2008), Kabaila and Tuck (2008) and Kabaila and Giri (2009b), our aim is to utilize the uncertain prior information in the frequentist inference of interest, whilst providing a…...


Journal ArticleDOI
John W. Pratt
TL;DR: In this article, the expected length of a confidence interval is shown to equal the integral over false values of the probability each false value is included, and two desiderata for choosing among confidence procedures lead to the same measure of desirability.
Abstract: The expected length of a confidence interval is shown to equal the integral over false values of the probability each false value is included. Thus two desiderata for choosing among confidence procedures lead to the same measure of desirability. Furthermore, by common definitions of “optimum,” a procedure is optimum as regards including false values if and only if it is optimum as regards expected length. However, the procedure with minimum expected length ordinarily depends on the true value of the parameter. The possibility is explored of minimizing the average expected length, averaging according to some weighting on the possible parameter values. (This is not the same as assuming a prior distribution and using Bayes' Theorem.) The ideas are applied to the mean and variance of a normal distribution and the probability of success in binomial trials.
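The identity stated in this abstract can be checked numerically in the simplest case, a normal mean with known standard deviation. The sketch below is an illustration under assumed values (`sigma`, `n`, and the integration grid are hypothetical choices), not a reproduction of Pratt's derivation. It integrates the probability that the interval xbar ± z·se covers each candidate value, and compares the result with the interval's length, which is constant here and therefore equal to its expected length.

```python
import math
from statistics import NormalDist

std = NormalDist()           # standard normal distribution
alpha = 0.05
sigma, n = 2.0, 25           # hypothetical known sd and sample size
se = sigma / math.sqrt(n)    # standard error of the sample mean
z = std.inv_cdf(1 - alpha / 2)

# Length of the interval xbar +/- z*se; it does not depend on the data.
length = 2 * z * se

def inclusion_prob(d):
    """Probability that the interval covers the value mu + d*se.

    With Z = (xbar - mu)/se standard normal, the value is covered
    exactly when |Z - d| <= z.
    """
    return std.cdf(d + z) - std.cdf(d - z)

# Trapezoid rule over offsets d in [-12, 12]; the integrand is negligible
# outside this range. The true value (d = 0) is a single point and does not
# affect the integral, so integrating over all values matches the integral
# over false values.
steps, d_max = 4800, 12.0
h = 2 * d_max / steps
total = 0.5 * (inclusion_prob(-d_max) + inclusion_prob(d_max))
total += sum(inclusion_prob(-d_max + i * h) for i in range(1, steps))
integral = se * h * total    # change of variable: theta' = mu + d*se

print(f"interval length      = {length:.6f}")
print(f"integrated inclusion = {integral:.6f}")
```

The two printed quantities should agree up to numerical integration error. In this known-variance case the identity holds for every true mean, because the inclusion probability depends only on the offset d.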

182 citations


"Confidence intervals in regression ..." refers methods in this paper

  • ...The idea of minimizing a weighted average expected length of a confidence interval, subject to a coverage probability inequality constraint, appears to have been first used by Pratt (1961)....
