Simultaneous inference in general parametric models.
Summary (4 min read)
1 Introduction
- Common multiple comparison procedures adjust for multiplicity and thus ensure that the overall type I error remains below the pre-specified significance level α.
- Section˜2 defines the general model and obtains the asymptotic or exact distribution of linear functions of elemental model parameters under rather weak conditions.
- In Section˜3 the authors describe the framework for simultaneous inference procedures in general parametric models.
- Most interesting from a practical point of view is Section˜6 where the authors analyze four rather challenging problems with the tools developed in this paper.
2 Model and Parameters
- In this section the authors introduce the underlying model assumptions and derive some asymptotic results necessary in the subsequent sections.
- The results from this section form the basis for the simultaneous inference procedures described in Section˜3.
- The model contains fixed but unknown elemental parameters θ ∈ Rp and other (random or nuisance) parameters η.
- In what follows the authors describe the underlying model assumptions, the limiting distribution of estimates of their parameters of interest ϑ, as well as the corresponding test statistics for hypotheses about ϑ and their limiting joint distribution.
3 Global and Simultaneous Inference
- Based on the results from Section˜2, the authors now focus on the derivation of suitable inference procedures.
- The authors start considering the general linear hypothesis (Searle, 1971) formulated in terms of their parameters of interest ϑ.
- This approximating distribution will now be used as the reference distribution when constructing the inference procedures.
- Note that a small global p-value (obtained from one of these procedures) leading to a rejection of H0 does not give further indication about the nature of the significant result.
- A stronger assumption than asymptotic normality of θ̂n in (2) is exact normality, i.e., θ̂n ∼ Np(θ,Σ).
3.1 Global Inference
- The F - and the χ2-test are classical approaches to assess the global null hypothesis H0.
- Standard results (such as Theorem 3.5, Serfling, 1980) ensure that X2 = T⊤nR + Furthermore, Rank(Rn) + denotes the Moore-Penrose inverse of the correlation matrix Rank(R).
3.2 Simultaneous Inference
- In what follows the authors use adjusted p-values to describe the decision rules.
- The adjusted p-values are calculated from expression˜(5).
- Similar results also hold for one-sided testing problems.
- Single-step procedures can always be improved by stepwise extensions based on the closed test procedure.
4 Applications
- The methodological framework described in Sections˜2 and 3 is very general and thus applicable to a wide range of statistical models.
- Many estimation techniques, such as maximum likelihood and M-estimation, provide at least asymptotically normal estimates of the elemental parameters together with consistent estimates of their covariance matrix.
- Detailed numerical examples are discussed in Section˜6.
- Many further multiple comparison procedures have been investigated in the past, which all fit into this framework.
- Thus, related simultaneous tests and confidence intervals do not rely on asymptotics and can be computed analytically instead, as shown in Section˜3.
5 Implementation
- The multcomp package (Hothorn et˜al., 2008) in R (R Development Core Team, 2008) provides a general implementation of the framework for simultaneous inference in semiparametric models described in Sections˜2 and˜3.
- The two remaining arguments alternative and rhs define the direction of the alternative (see Section˜3) and m, respectively.
- One should do this whenever there is doubt about what the default contrasts measure, which typically happens in models with higher order interaction terms.
- The numerical accuracy of adjusted p-values and simultaneous confidence intervals implemented in multcomp is continuously checked against results reported by Westfall et˜al.
6.1 Genetic Components of Alcoholism
- Various studies have linked alcohol dependence phenotypes to chromosome 4.
- One candidate gene is NACP (non-amyloid component of plaques), coding for alpha synuclein.
- Bönsch et˜al. (2005) found longer alleles of NACP -REP1 in alcohol-dependent patients compared with healthy controls and report that the allele lengths show some association with levels of expressed alpha synuclein mRNA in alcohol-dependent subjects .
- The data are available from package coin.
- Thus, the authors fit a simple one-way ANOVA model to the data and define K such that Kθ contains all three group differences (Tukey’s all-pairwise comparisons):.
R> summary(amod_glht)
- Because of the variance heterogeneity that can be observed in Figure˜1, one might be concerned with the validity of the above results stating that there is no difference between any combination of the three allele lengths.
- Sn might be more appropriate in this situation, and the vcov argument can be used to specify a function to compute some alternative covariance estimator.
R> summary(amod_glht_sw)
- The authors used the sandwich function from package sandwich (Zeileis, 2004, 2006) which provides us with a heteroscedasticity-consistent estimator of the covariance matrix.
- This result is more in line with previously published findings for this study obtained from nonparametric test procedures such as the Kruskal-Wallis test.
- A comparison of the simultaneous confidence intervals calculated based on the ordinary and sandwich estimator is given in Figure˜2.
6.2 Prediction of Total Body Fat
- Garcia et˜al. (2005) report on the development of predictive regression equations for body fat content by means of p = 9 common anthropometric measurements which were obtained for n = 71 healthy German women.
- In addition, the women’s body composition was measured by Dual Energy X-Ray Absorptiometry (DXA).
- This reference method is very accurate in measuring body fat but finds little applicability in practical environments, mainly because of high costs and the methodological efforts needed.
- Backward-elimination was applied to select important variables from the available anthropometrical measurements and Garcia et˜al.
- Here, the authors fit the saturated model to the data and use the max-t test over all t-statistics to select important variables based on adjusted p-values.
R> summary(lmod <- lm(DEXfat ~ ., data = bodyfat))
- The marix of linear functions K is basically the identity matrix, except for the intercept which is omitted.
- Once the matrix K has been defined, it can be used to set up the general linear hypotheses:.
R> summary(lmod_glht)
- Only two covariates, waist and hip circumference, seem to be important and caused the rejection of H0.
- Alternatively, an MM-estimator (Yohai, 1987) as implemented by lmrob from package lmrob (Todorov et˜al., 2007) can be used to fit a robust version of the above linear model, the results coincide rather nicely (note that the control arguments to lmrob were changed in multcomp version 1.2-6 and thus the results have slightly changed):.
6.3 Smoking and Alzheimer’s Disease
- Salib and Hillier (1997) report results of a case-control study on Alzheimer’s disease and smoking behavior of 198 female and male Alzheimer patients and 164 controls.
- The alzheimer data have been re-constructed from Table˜4 in Salib and Hillier (1997).
- The authors conclude that ‘cigarette smoking is less frequent in men with Alzheimer’s disease.’.
- Here, the authors focus on how a potential association can be described (see Hothorn et˜al., 2006, for a non-parametric approach).
- First, the authors fit a logistic regression model including both main effects and an interaction effect of smoking and gender.
R> summary(gmod)
- The negative regression coefficient for heavy smoking males indicates that Alzheimer’s disease might be less frequent in this group, but the model is still difficult to interpret based on the coefficients and corresponding p-values only.
- Therefore, confidence intervals on the probability scale for the different ‘risk groups’ are interesting and can be computed as follows.
- For each combination of gender and smoking behavior, the probability of suffering from Alzheimer’s disease can be estimated by computing the logit function of the linear predictor from model gmod.
- Using the predict method for generalized linear models is a convenient way to compute these probability estimates.
R> plot(gmod_ci, xlab = "Probability of Developing Alzheimer",
- + xlim = c(0, 1)) The simultaneous confidence intervals are depicted in Figure˜3.
- Using this representation of the results, it is obvious that Alzheimer’s disease is less frequent in heavy smoking men compared to all other configurations of the two covariates.
6.4 Acute Myeloid Leukemia Survival
- The treatment of patients suffering from acute myeloid leukemia (AML) is determined by a tumor classification scheme taking the status of various cytogenetic aberrations into account.
- Bullinger et˜al. (2004) investigate an extended tumor classification scheme incorporating molecular subgroups of the disease obtained by gene expression profiling.
- The overall survival time and censoring indicator as well as the clinical variables age, sex, lactic dehydrogenase level (LDH), white blood cell count (WBC), and treatment group are taken from Supplementary Table 1 in Bullinger et˜al.
- One interesting question might be the usefulness of this risk score.
- Tukey’s allpairwise comparisons highlight that there seems to be a difference between ‘high’ scores and both ‘low’ and ‘intermediate’ ones but the latter two aren’t distinguishable:.
R> summary(glht(smod, linfct = mcp(risk = "Tukey")))
- Simultaneous Tests for General Linear Hypotheses Multiple Comparisons of Means: Tukey Contrasts Fit: survreg(formula = Surv(time, event) ~ Sex +.
- Again, a sandwich estimator of the covariance matrix.
- Sn can be plugged-in but the results stay very much the same in this case.
6.5 Forest Regeneration
- Young trees suffer from browsing damage, mostly by roe and red deer.
- The survey takes place in all 756 game management districts (‘Hegegemeinschaften’) in Bavaria.
- The data of 2700 trees include the species and a binary variable indicating whether or not the tree suffers from damage caused by deer browsing.
- The authors fit a mixed logistic regression model (using package lme4, Bates, 2005, 2007) without intercept and with random effects accounting for the spatial variation of the trees.
- For each plot nested within a set of five plots orientated on a 100m transect (the location of the transect is determined by a pre-defined equally spaced lattice of the area under test), a random intercept is included in the model.
R> ci$confint[,2:3] <- ci$confint[,3:2]
- Browsing is less frequent in hardwood but especially small oak trees are severely at risk.
- The local authorities increased the number of roe deers to be harvested in the following years.
- The large confidence interval for ash, maple, elm and lime trees is caused by the small sample size.
7 Conclusion
- In essence, all that is required is a parameter estimate θ̂n following an asymptotic multivariate normal distribution, and a consistent estimate of its covariance matrix.
- Standard software packages can be used to compute these quantities.
- The examples given in Section˜6 illustrate two facts.
- At first, the presented approach helps to formulate simultaneous inference procedures in situations that were previously hard to deal with and, at second, a flexible open-source implementation offers tools to actually perform such procedures rather easily.
Did you find this useful? Give us your feedback
Citations
1,827 citations
Cites methods from "Simultaneous inference in general p..."
...However, adjusted p-values controlling the family-wise error rate may be obtained using the function glht() in the packagemultcomp, assuming that test statistics jointly follow a multivariate normal or t distribution [43]....
[...]
1,042 citations
Cites methods from "Simultaneous inference in general p..."
...P values resulting from differential abundance testing (via R multcomp and lsmeans packages) were adjusted using the Benjamini-Hochberg procedure and an FDR-cutoff of 5% across all clusters/ subsets and between-group comparisons (Hothorn et al., 2008; Lenth, 2016)....
[...]
923 citations
767 citations
Cites methods from "Simultaneous inference in general p..."
...…of the aberrant DNA methylation signature for each cluster was performed using an ANOVA test, with correction for multiple testing according to the Benjamini-Hochberg method, followed by Dunnett’s post hoc test using the normal CD34+ samples as the reference group (Hothorn et al., 2008)....
[...]
...Identification of the aberrant DNA methylation signature for each cluster was performed using an ANOVA test, with correction for multiple testing according to the Benjamini-Hochberg method, followed by Dunnett’s post hoc test using the normal CD34+ samples as the reference group (Hothorn et al., 2008)....
[...]
References
6,955 citations
4,827 citations
[...]
3,429 citations
"Simultaneous inference in general p..." refers methods in this paper
...We refer to Hsu (1996), Chapter 7, and Searle (1971), Chapter 7.3, for further discussions and examples on this issue....
[...]
...We start considering the general linear hypothesis (Searle, 1971) formulated in terms of our parameters of interest ϑ H0 : ϑ := Kθ = m....
[...]
...We start considering the general linear hypothesis (Searle, 1971) formulated in terms of our parameters of interest θ...
[...]
2,401 citations
1,695 citations
"Simultaneous inference in general p..." refers methods in this paper
...For a general reading on multiple comparison procedures we refer to Hochberg and Tamhane (1987) and Hsu (1996)....
[...]
...Multiple comparisons in linear models have been in use for a long time, see Hochberg and Tamhane (1987), Hsu (1996), and Bretz et ̃al....
[...]
Related Papers (5)
Frequently Asked Questions (14)
Q2. What is the default re-parametrization used as elemental parameters in the R?
The so-called ”treatment contrast” vector θ = (µ, γ2− γ1, γ3− γ1, . . . , γq−γ1) is, for example, the default re-parametrization used as elemental parameters in the R-system for statistical computing (R Development Core Team, 2008).
Q3. What is the purpose of this paper?
In this paper the authors aim at a unified description of simultaneous inference procedures in parametric models with generally correlated parameter estimates.
Q4. What is the advantage of single-step procedures?
Single-step procedures have the advantage that corresponding simultaneous confidence intervals are easily available, as previously noted.
Q5. What is the p-value for a given family of null hypotheses?
That is, for a given family of null hypotheses H10 , . . . , H k 0 , an individual hypothesis H j 0 is rejected only if all intersection hypotheses HJ = ⋂ i∈J H i 0 with j ∈ J ⊆ {1, . . . , k} are rejected (Marcus et˜al., 1976).
Q6. What is the scalar test statistic for testing the global null hypothesis?
By construction, the authors can reject an individual null hypothesis Hj0 , j = 1, . . . , k, whenever the associated adjusted p-value is less than or equal to the pre-specified significance level α, i.e., pj ≤ α.
Q7. What is the simplest way to model the response?
The response is modelled by a linear combination of the covariates with normal error εi and constant variance σ 2,Yi = β0 +q ∑j=1βjXij +
Q8. What is the general framework for simultaneous inference?
The general framework described here extends the current canonical theory with respect to the following aspects: (i) model assumptions such as normality and homoscedasticity are relaxed, thus allowing for simultaneous inference in generalized linear models, mixed effects models, survival models, etc.; (ii) arbitrary linear functions of the elemental parameters are allowed, not just contrasts of means in AN(C)OVA models; (iii) computing the reference distribution is feasible for arbitrary designs, especially for unbalanced designs; and (iv) a unified implementation is provided which allows for a fast transition of the theoretical results to the desks of data analysts interested in simultaneous inferences for multiple hypotheses.
Q9. What is the p-value for the jth individual two-sided hypothesis?
In the present context of single-step tests, the (at least asymptotic) adjusted p-value for the jth individual two-sided hypothesis Hj0 : ϑj = mj, j = 1, . . . , k, is given bypj = 1− gν(Rn, |tj|),where t1, . . . , tk denote the observed test statistics.
Q10. What is the p-value for the global null hypothesis?
The resulting global p-value (exact or approximate, depending on context) for H0 is 1 − gν(Rn,max |t|) when T = t has been observed.
Q11. What is the way to test the global null hypothesis?
Another suitable scalar test statistic for testing the global hypothesis H0 is to consider the maximum of the individual test statistics T1,n, . . . , Tk,n of the multivariate statistic Tn = (T1,n, . . . , Tk,n), leading to a max-t type test statistic max(|Tn|).
Q12. What are examples of multiple comparison procedures?
Examples of such multiple comparison procedures include Dunnett’s many-to-one comparisons, Tukey’s all-pairwise comparisons, sequential pairwise contrasts, comparisons with the average, changepoint analyses, dose-response contrasts, etc.
Q13. Why is mcp() not available in multcomp?
Because it is impossible to determine the parameters of interest automatically in this case, mcp() in multcomp will by default generate comparisons for the main effects γj only, ignoring covariates and interactions.
Q14. What is the sequence of n needed to establish -convergence in (4)?
Then the authors haveãnRn = D −1/2 n S ⋆ nD −1/2 n= (anDn) −1/2(anS ⋆ n)(anDn) −1/2P −→ diag(Σ⋆)−1/2 Σ⋆ diag(Σ⋆)−1/2 =: R ∈ Rk,kwhere the convergence in probability to a constant follows from Slutzky’s Theorem (Theorem 1.5.4, Serfling, 1980) and therefore (4) holds.