
Simultaneous inference in general parametric models.

Torsten Hothorn¹, Frank Bretz², Peter H. Westfall³

Ludwig Maximilian University of Munich¹, Novartis², Texas Tech University³

01 Jun 2008-Biometrical Journal (Biom J)-Vol. 50, Iss: 3, pp 346-363

TL;DR: This paper describes simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters, and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc.


Abstract: Simultaneous inference is a common problem in many areas of application. If multiple null hypotheses are tested simultaneously, the probability of rejecting erroneously at least one of them increases beyond the pre-specified significance level. Simultaneous inference procedures have to be used which adjust for multiplicity and thus control the overall type I error rate. In this paper we describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. The framework described here is quite general and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc. Several examples using a variety of different statistical models illustrate the breadth of the results. For the analyses we use the R add-on package multcomp, which provides a convenient interface to the general approach adopted here.


Summary

1 Introduction

Common multiple comparison procedures adjust for multiplicity and thus ensure that the overall type I error remains below the pre-specified significance level α.
Section 2 defines the general model and obtains the asymptotic or exact distribution of linear functions of elemental model parameters under rather weak conditions.
In Section 3 the authors describe the framework for simultaneous inference procedures in general parametric models.
Most interesting from a practical point of view is Section 6 where the authors analyze four rather challenging problems with the tools developed in this paper.

2 Model and Parameters

In this section the authors introduce the underlying model assumptions and derive some asymptotic results necessary in the subsequent sections.
The results from this section form the basis for the simultaneous inference procedures described in Section 3.
The model contains fixed but unknown elemental parameters θ ∈ R^p and other (random or nuisance) parameters η.
In what follows the authors describe the underlying model assumptions, the limiting distribution of estimates of their parameters of interest ϑ, as well as the corresponding test statistics for hypotheses about ϑ and their limiting joint distribution.

3 Global and Simultaneous Inference

Based on the results from Section 2, the authors now focus on the derivation of suitable inference procedures.
The authors start by considering the general linear hypothesis (Searle, 1971) formulated in terms of their parameters of interest ϑ.
This approximating distribution will now be used as the reference distribution when constructing the inference procedures.
Note that a small global p-value (obtained from one of these procedures) leading to a rejection of H0 does not give further indication about the nature of the significant result.
A stronger assumption than asymptotic normality of θ̂n in (2) is exact normality, i.e., θ̂n ∼ Np(θ,Σ).

3.1 Global Inference

The F - and the χ2-test are classical approaches to assess the global null hypothesis H0.
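Standard results (such as Theorem 3.5, Serfling, 1980) ensure that $X^2 = T_n^\top R_n^+ T_n$ converges in distribution to $\chi^2(\mathrm{Rank}(R))$ under asymptotic normality, and that $F = T_n^\top R^+ T_n / \mathrm{Rank}(R) \sim F(\mathrm{Rank}(R), \nu)$ under exact normality. Furthermore, $R_n^+$ denotes the Moore-Penrose inverse of the correlation matrix $R_n$.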

3.2 Simultaneous Inference

In what follows the authors use adjusted p-values to describe the decision rules.
The adjusted p-values are calculated from expression (5).
Similar results also hold for one-sided testing problems.
Single-step procedures can always be improved by stepwise extensions based on the closed test procedure.
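As a rough illustration of how a stepwise procedure can improve on a single-step one, the sketch below compares Bonferroni (single-step) with Holm (its step-down extension, which follows from the closed test principle) using base R's `p.adjust`. This is not the max-t based procedure of the paper, and the unadjusted p-values are hypothetical numbers chosen only for illustration.

```r
# Hypothetical unadjusted p-values for k = 4 hypotheses (illustrative only).
p_raw <- c(0.001, 0.012, 0.020, 0.300)

# Single-step adjustment: every p-value is multiplied by k (capped at 1).
p_single <- p.adjust(p_raw, method = "bonferroni")

# Step-down adjustment (Holm): the multiplier shrinks from k to 1
# as one moves from the smallest to the largest p-value.
p_step <- p.adjust(p_raw, method = "holm")

# The step-down adjusted p-values are never larger than the single-step ones.
cbind(p_raw, p_single, p_step)
```

The same ordering (stepwise adjusted p-values no larger than single-step ones) is what one would expect from step-down versions of the max-t test, although those need the joint distribution of the test statistics.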

4 Applications

The methodological framework described in Sections 2 and 3 is very general and thus applicable to a wide range of statistical models.
Many estimation techniques, such as maximum likelihood and M-estimation, provide at least asymptotically normal estimates of the elemental parameters together with consistent estimates of their covariance matrix.
Detailed numerical examples are discussed in Section 6.
Many further multiple comparison procedures have been investigated in the past, which all fit into this framework.
Thus, related simultaneous tests and confidence intervals do not rely on asymptotics and can be computed analytically instead, as shown in Section 3.

5 Implementation

The multcomp package (Hothorn et al., 2008) in R (R Development Core Team, 2008) provides a general implementation of the framework for simultaneous inference in semiparametric models described in Sections 2 and 3.
The two remaining arguments alternative and rhs define the direction of the alternative (see Section 3) and m, respectively.
One should do this whenever there is doubt about what the default contrasts measure, which typically happens in models with higher order interaction terms.
The numerical accuracy of adjusted p-values and simultaneous confidence intervals implemented in multcomp is continuously checked against results reported by Westfall et al.

6.1 Genetic Components of Alcoholism

Various studies have linked alcohol dependence phenotypes to chromosome 4.
One candidate gene is NACP (non-amyloid component of plaques), coding for alpha synuclein.
Bönsch et al. (2005) found longer alleles of NACP-REP1 in alcohol-dependent patients compared with healthy controls and report that the allele lengths show some association with levels of expressed alpha synuclein mRNA in alcohol-dependent subjects.
The data are available from package coin.
Thus, the authors fit a simple one-way ANOVA model to the data and define K such that Kθ contains all three group differences (Tukey’s all-pairwise comparisons):

R> summary(amod_glht)

Because of the variance heterogeneity that can be observed in Figure 1, one might be concerned with the validity of the above results stating that there is no difference between any combination of the three allele lengths.
A sandwich estimator Sn might be more appropriate in this situation, and the vcov argument can be used to specify a function to compute some alternative covariance estimator.

R> summary(amod_glht_sw)

The authors used the sandwich function from package sandwich (Zeileis, 2004, 2006) which provides us with a heteroscedasticity-consistent estimator of the covariance matrix.
This result is more in line with previously published findings for this study obtained from nonparametric test procedures such as the Kruskal-Wallis test.
A comparison of the simultaneous confidence intervals calculated based on the ordinary and sandwich estimator is given in Figure 2.
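To make the idea of a heteroscedasticity-consistent covariance estimate concrete, the following base R sketch computes a basic sandwich (HC0-type) estimator by hand for a one-way layout. The data are simulated with group-specific error variances, and all object names are hypothetical; the sandwich package used in the text offers this and refined variants, so this is only a sketch of the underlying formula.

```r
set.seed(11)
n <- 90
group <- factor(rep(c("a", "b", "c"), each = n / 3))
# Error standard deviation differs by group (heteroscedastic by construction).
y <- rnorm(n, mean = 0, sd = rep(c(0.5, 1, 3), each = n / 3))
fit <- lm(y ~ group)

X <- model.matrix(fit)                  # design matrix
e <- residuals(fit)                     # ordinary residuals

bread <- solve(crossprod(X))            # (X'X)^{-1}
meat  <- crossprod(X * e)               # X' diag(e^2) X
S_sandwich <- bread %*% meat %*% bread  # HC0-type covariance estimate
S_ordinary <- vcov(fit)                 # assumes constant error variance

# Standard errors under both estimators, side by side.
cbind(ordinary = sqrt(diag(S_ordinary)), sandwich = sqrt(diag(S_sandwich)))
```

In this balanced design the ordinary estimator assigns the same variance to both group contrasts, whereas the sandwich estimator reflects the larger spread in group "c".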

6.2 Prediction of Total Body Fat

Garcia et al. (2005) report on the development of predictive regression equations for body fat content by means of p = 9 common anthropometric measurements which were obtained for n = 71 healthy German women.
In addition, the women’s body composition was measured by Dual Energy X-Ray Absorptiometry (DXA).
This reference method is very accurate in measuring body fat but finds little applicability in practical environments, mainly because of high costs and the methodological efforts needed.
Backward-elimination was applied to select important variables from the available anthropometrical measurements and Garcia et al.
Here, the authors fit the saturated model to the data and use the max-t test over all t-statistics to select important variables based on adjusted p-values.

R> summary(lmod <- lm(DEXfat ~ ., data = bodyfat))

The matrix of linear functions K is basically the identity matrix, except for the intercept which is omitted.
Once the matrix K has been defined, it can be used to set up the general linear hypotheses:

R> summary(lmod_glht)
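A minimal base R sketch of such a matrix K is shown here on simulated data (the bodyfat data are not used, and the variable names x1, x2, x3 are hypothetical). It only verifies that K picks out every coefficient except the intercept and that the resulting standardized statistics agree with the usual t-statistics; the multiplicity adjustment itself is not carried out.

```r
set.seed(9)
# Simulated stand-in for a regression data set (not the bodyfat data).
X <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
X$y <- 1 + 2 * X$x1 + rnorm(50)
fit <- lm(y ~ ., data = X)

p <- length(coef(fit)) - 1            # number of covariates
K <- cbind(0, diag(p))                # leading zero column omits the intercept
rownames(K) <- names(coef(fit))[-1]

# Standardized statistics for the k = p linear functions K theta.
t_stats <- drop(K %*% coef(fit)) / sqrt(diag(K %*% vcov(fit) %*% t(K)))
t_stats
```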

Only two covariates, waist and hip circumference, seem to be important and caused the rejection of H0.
Alternatively, an MM-estimator (Yohai, 1987) as implemented by lmrob from package robustbase (Todorov et al., 2007) can be used to fit a robust version of the above linear model; the results coincide rather nicely (note that the control arguments to lmrob were changed in multcomp version 1.2-6 and thus the results have slightly changed):

6.3 Smoking and Alzheimer’s Disease

Salib and Hillier (1997) report results of a case-control study on Alzheimer’s disease and smoking behavior of 198 female and male Alzheimer patients and 164 controls.
The alzheimer data have been re-constructed from Table 4 in Salib and Hillier (1997).
The authors conclude that ‘cigarette smoking is less frequent in men with Alzheimer’s disease.’
Here, the authors focus on how a potential association can be described (see Hothorn et al., 2006, for a non-parametric approach).
First, the authors fit a logistic regression model including both main effects and an interaction effect of smoking and gender.

R> summary(gmod)

The negative regression coefficient for heavy smoking males indicates that Alzheimer’s disease might be less frequent in this group, but the model is still difficult to interpret based on the coefficients and corresponding p-values only.
Therefore, confidence intervals on the probability scale for the different ‘risk groups’ are interesting and can be computed as follows.
For each combination of gender and smoking behavior, the probability of suffering from Alzheimer’s disease can be estimated by applying the inverse logit (logistic) function to the linear predictor from model gmod.
Using the predict method for generalized linear models is a convenient way to compute these probability estimates.

R> plot(gmod_ci, xlab = "Probability of Developing Alzheimer",
+      xlim = c(0, 1))

The simultaneous confidence intervals are depicted in Figure 3.
Using this representation of the results, it is obvious that Alzheimer’s disease is less frequent in heavy smoking men compared to all other configurations of the two covariates.
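The transformation from the logit scale to the probability scale can be sketched in base R as follows. The data are simulated (not the alzheimer data), the variable names are hypothetical, and the intervals use the ordinary normal quantile, so they are pointwise rather than simultaneous; a simultaneous version would replace that quantile by one taken from the joint distribution of the estimates.

```r
set.seed(5)
n <- 400
smoker <- factor(sample(c("no", "yes"), n, replace = TRUE))
# Hypothetical true probabilities, used only to simulate a binary response.
y <- rbinom(n, 1, ifelse(smoker == "yes", 0.3, 0.5))
gfit <- glm(y ~ smoker, family = binomial())

new <- data.frame(smoker = factor(c("no", "yes"), levels = levels(smoker)))
pr <- predict(gfit, newdata = new, type = "link", se.fit = TRUE)

# Pointwise 95% limits on the logit scale (NOT multiplicity-adjusted).
z <- qnorm(0.975)
lwr <- pr$fit - z * pr$se.fit
upr <- pr$fit + z * pr$se.fit

# The inverse logit (plogis) maps estimates and limits to probabilities.
prob_ci <- cbind(est = plogis(pr$fit), lwr = plogis(lwr), upr = plogis(upr))
prob_ci
```

Because the inverse logit is monotone, transforming the interval limits preserves coverage and keeps all limits inside (0, 1).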

6.4 Acute Myeloid Leukemia Survival

The treatment of patients suffering from acute myeloid leukemia (AML) is determined by a tumor classification scheme taking the status of various cytogenetic aberrations into account.
Bullinger et al. (2004) investigate an extended tumor classification scheme incorporating molecular subgroups of the disease obtained by gene expression profiling.
The overall survival time and censoring indicator as well as the clinical variables age, sex, lactic dehydrogenase level (LDH), white blood cell count (WBC), and treatment group are taken from Supplementary Table 1 in Bullinger et al.
One interesting question might be the usefulness of this risk score.
Tukey’s all-pairwise comparisons highlight that there seems to be a difference between ‘high’ scores and both ‘low’ and ‘intermediate’ ones but the latter two aren’t distinguishable:

R> summary(glht(smod, linfct = mcp(risk = "Tukey")))

Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: survreg(formula = Surv(time, event) ~ Sex + ...

Again, a sandwich estimator Sn of the covariance matrix can be plugged in, but the results stay very much the same in this case.

6.5 Forest Regeneration

Young trees suffer from browsing damage, mostly by roe and red deer.
The survey takes place in all 756 game management districts (‘Hegegemeinschaften’) in Bavaria.
The data of 2700 trees include the species and a binary variable indicating whether or not the tree suffers from damage caused by deer browsing.
The authors fit a mixed logistic regression model (using package lme4, Bates, 2005, 2007) without intercept and with random effects accounting for the spatial variation of the trees.
For each plot nested within a set of five plots orientated on a 100m transect (the location of the transect is determined by a pre-defined equally spaced lattice of the area under test), a random intercept is included in the model.

R> ci$confint[,2:3] <- ci$confint[,3:2]

Browsing is less frequent in hardwood but especially small oak trees are severely at risk.
The local authorities increased the number of roe deer to be harvested in the following years.
The large confidence interval for ash, maple, elm and lime trees is caused by the small sample size.

7 Conclusion

In essence, all that is required is a parameter estimate θ̂n following an asymptotic multivariate normal distribution, and a consistent estimate of its covariance matrix.
Standard software packages can be used to compute these quantities.
The examples given in Section 6 illustrate two facts.
First, the presented approach helps to formulate simultaneous inference procedures in situations that were previously hard to deal with and, second, a flexible open-source implementation offers tools to actually perform such procedures rather easily.


Figures (4)

Figure 1: alpha data: Distribution of levels of expressed alpha synuclein mRNA in three groups defined by the NACP-REP1 allele lengths.

Figure 2: alpha data: Simultaneous confidence intervals based on the ordinary covariance matrix (left) and a sandwich estimator (right).

Figure 3: alzheimer data: Simultaneous confidence intervals for the probability to suffer from Alzheimer’s disease.

Figure 4: trees513 data: Probability of damage caused by roe deer browsing for six tree species. Sample sizes are given in brackets.


Simultaneous Inference in General Parametric Models∗

Torsten Hothorn
Institut für Statistik
Ludwig-Maximilians-Universität München
Ludwigstraße 33, D–80539 München, Germany

Frank Bretz
Statistical Methodology, Clinical Information Sciences
Novartis Pharma AG
CH-4002 Basel, Switzerland

Peter Westfall
Texas Tech University
Lubbock, TX 79409, U.S.A.

March 15, 2013

Abstract

Simultaneous inference is a common problem in many areas of application. If multiple null hypotheses are tested simultaneously, the probability of rejecting erroneously at least one of them increases beyond the pre-specified significance level. Simultaneous inference procedures have to be used which adjust for multiplicity and thus control the overall type I error rate. In this paper we describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. The framework described here is quite general and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc. Several examples using a variety of different statistical models illustrate the breadth of the results. For the analyses we use the R add-on package multcomp, which provides a convenient interface to the general approach adopted here.

Key words: multiple tests, multiple comparisons, simultaneous confidence intervals, adjusted p-values, multivariate normal distribution, robust statistics.

∗ This is a preprint of an article published in Biometrical Journal, Volume 50, Number 3, 346–363. © 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim; available online http://www.biometrical-journal.com

1 Introduction

Multiplicity is an intrinsic problem of any simultaneous inference. If each of k, say, null hypotheses is tested at nominal level α, the overall type I error rate can be substantially larger than α. That is, the probability of at least one erroneous rejection is larger than α for k ≥ 2. Common multiple comparison procedures adjust for multiplicity and thus ensure that the overall type I error remains below the pre-specified significance level α. Examples of such multiple comparison procedures include Dunnett’s many-to-one comparisons, Tukey’s all-pairwise comparisons, sequential pairwise contrasts, comparisons with the average, changepoint analyses, dose-response contrasts, etc. These procedures are all well established for classical regression and ANOVA models allowing for covariates and/or factorial treatment structures with i.i.d. normal errors and constant variance, see Bretz et al. (2008) and the references therein. For a general reading on multiple comparison procedures we refer to Hochberg and Tamhane (1987) and Hsu (1996).
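The inflation of the overall type I error rate can be made tangible with a short base R calculation. The sketch assumes k independent tests, each at level α = 0.05; this independence is an assumption made here for simplicity, whereas the paper is concerned with generally correlated statistics.

```r
# Familywise error rate (FWER) for k independent tests, each at level alpha.
alpha <- 0.05
k <- c(1, 2, 5, 10, 20)

# P(at least one false rejection) = 1 - P(no false rejection)
fwer <- 1 - (1 - alpha)^k
round(data.frame(k = k, fwer = fwer), 3)

# Small simulation check for k = 10: under the null hypotheses the
# p-values are uniform, and a false rejection occurs if the smallest
# of the 10 p-values falls below alpha.
set.seed(1)
pvals <- matrix(runif(10000 * 10), ncol = 10)
fwer_sim <- mean(apply(pvals, 1, min) <= alpha)
fwer_sim
```

Under positive correlation between the test statistics the inflation is typically smaller than in the independent case, which is why procedures that use the joint distribution can be less conservative than Bonferroni-type corrections.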

In this paper we aim at a unified description of simultaneous inference procedures in parametric models with generally correlated parameter estimates. Each individual null hypothesis is specified through a linear combination of elemental model parameters and we allow for k of such null hypotheses to be tested simultaneously, regardless of the number of elemental model parameters p. The general framework described here extends the current canonical theory with respect to the following aspects: (i) model assumptions such as normality and homoscedasticity are relaxed, thus allowing for simultaneous inference in generalized linear models, mixed effects models, survival models, etc.; (ii) arbitrary linear functions of the elemental parameters are allowed, not just contrasts of means in AN(C)OVA models; (iii) computing the reference distribution is feasible for arbitrary designs, especially for unbalanced designs; and (iv) a unified implementation is provided which allows for a fast transition of the theoretical results to the desks of data analysts interested in simultaneous inferences for multiple hypotheses.

Accordingly, the paper is organized as follows. Section 2 defines the general model and obtains the asymptotic or exact distribution of linear functions of elemental model parameters under rather weak conditions. In Section 3 we describe the framework for simultaneous inference procedures in general parametric models. An overview about important applications of the methodology is given in Section 4 followed by a short discussion of the software implementation in Section 5. Most interesting from a practical point of view is Section 6 where we analyze four rather challenging problems with the tools developed in this paper.

2 Model and Parameters

In this section we introduce the underlying model assumptions and derive some asymptotic results necessary in the subsequent sections. The results from this section form the basis for the simultaneous inference procedures described in Section 3.

Let $M((Z_1, \dots, Z_n), \theta, \eta)$ denote a semi-parametric statistical model. The set of $n$ observations is described by $(Z_1, \dots, Z_n)$. The model contains fixed but unknown elemental parameters $\theta \in \mathbb{R}^p$ and other (random or nuisance) parameters $\eta$. We are primarily interested in the linear functions $\vartheta := K\theta$ of the parameter vector $\theta$ as specified through the constant matrix $K \in \mathbb{R}^{k,p}$. In what follows we describe the underlying model assumptions, the limiting distribution of estimates of our parameters of interest $\vartheta$, as well as the corresponding test statistics for hypotheses about $\vartheta$ and their limiting joint distribution.

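Suppose $\hat\theta_n \in \mathbb{R}^p$ is an estimate of $\theta$ and $S_n \in \mathbb{R}^{p,p}$ is an estimate of $\mathrm{cov}(\hat\theta_n)$ with

$$a_n S_n \xrightarrow{P} \Sigma \in \mathbb{R}^{p,p} \tag{1}$$

for some positive, nondecreasing sequence $a_n$. Furthermore, we assume that a multivariate central limit theorem holds, i.e.,

$$a_n^{1/2} (\hat\theta_n - \theta) \xrightarrow{d} N_p(0, \Sigma). \tag{2}$$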

If both (1) and (2) are fulfilled we write $\hat\theta_n \overset{a}{\sim} N_p(\theta, S_n)$. Then, by Theorem 3.3.A in Serfling (1980), the linear function $\hat\vartheta_n = K\hat\theta_n$, i.e., an estimate of our parameters of interest, also follows an approximate multivariate normal distribution

$$\hat\vartheta_n = K\hat\theta_n \overset{a}{\sim} N_k(\vartheta, S_n^\star)$$

with covariance matrix $S_n^\star := K S_n K^\top$ for any fixed matrix $K \in \mathbb{R}^{k,p}$. Thus we need not distinguish between elemental parameters $\theta$ and derived parameters $\vartheta = K\theta$ that are of interest to the researcher. Instead we simply assume for the moment that we have (in analogy to (1) and (2))

$$\hat\vartheta_n \overset{a}{\sim} N_k(\vartheta, S_n^\star) \quad \text{with} \quad a_n S_n^\star \xrightarrow{P} \Sigma^\star := K \Sigma K^\top \in \mathbb{R}^{k,k} \tag{3}$$

and that the $k$ parameters in $\vartheta$ are themselves the parameters of interest to the researcher. It is assumed that the diagonal elements of the covariance matrix are positive, i.e., $\Sigma^\star_{jj} > 0$ for $j = 1, \dots, k$.
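The two quantities defined above, the estimated linear functions Kθ̂ and their covariance K S K⊤, can be computed directly from a fitted model. The following base R sketch uses simulated data from a hypothetical one-way layout with three groups; the group labels and the contrast matrix K are illustrative choices, not taken from the paper's examples.

```r
set.seed(42)
# Hypothetical one-way layout with three groups (names are illustrative only).
dat <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 20)),
  y     = rnorm(60, mean = rep(c(0, 0.5, 1), each = 20))
)
fit <- lm(y ~ group, data = dat)

theta_hat <- coef(fit)   # estimated elemental parameters
S_n       <- vcov(fit)   # estimated covariance matrix of theta_hat

# All pairwise differences, written in R's default parametrization
# theta = (mean_a, mean_b - mean_a, mean_c - mean_a).
K <- rbind("b - a" = c(0,  1, 0),
           "c - a" = c(0,  0, 1),
           "c - b" = c(0, -1, 1))

vartheta_hat <- drop(K %*% theta_hat)   # estimates of the linear functions
S_star       <- K %*% S_n %*% t(K)      # their estimated covariance matrix
vartheta_hat
```

Only `coef()` and `vcov()` of the fitted model are needed, which is the sense in which the framework applies to any model class that provides these two pieces.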

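Then, the standardized estimator $\hat\vartheta_n$ is again asymptotically normally distributed

$$T_n := D_n^{-1/2} (\hat\vartheta_n - \vartheta) \overset{a}{\sim} N_k(0, R_n) \tag{4}$$

where $D_n = \mathrm{diag}(S_n^\star)$ is the diagonal matrix given by the diagonal elements of $S_n^\star$ and

$$R_n = D_n^{-1/2} S_n^\star D_n^{-1/2} \in \mathbb{R}^{k,k}$$

is the correlation matrix of the $k$-dimensional statistic $T_n$. To demonstrate (4), note that with (3) we have $a_n S_n^\star \xrightarrow{P} \Sigma^\star$ and $a_n D_n \xrightarrow{P} \mathrm{diag}(\Sigma^\star)$. Define the sequence $\tilde a_n$ needed to establish $\tilde a$-convergence in (4) by $\tilde a_n \equiv 1$. Then we have

$$\tilde a_n R_n = D_n^{-1/2} S_n^\star D_n^{-1/2} = (a_n D_n)^{-1/2} (a_n S_n^\star) (a_n D_n)^{-1/2} \xrightarrow{P} \mathrm{diag}(\Sigma^\star)^{-1/2}\, \Sigma^\star\, \mathrm{diag}(\Sigma^\star)^{-1/2} =: R \in \mathbb{R}^{k,k}$$

where the convergence in probability to a constant follows from Slutzky’s Theorem (Theorem 1.5.4, Serfling, 1980) and therefore (4) holds. To finish note that

$$T_n = D_n^{-1/2} (\hat\vartheta_n - \vartheta) = (a_n D_n)^{-1/2} a_n^{1/2} (\hat\vartheta_n - \vartheta) \xrightarrow{d} N_k(0, R).$$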
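The standardization in (4) amounts to a few matrix operations. The sketch below, in base R with simulated data, computes the diagonal scaling, the standardized statistics and their correlation matrix; the vector `m` of hypothesized values (all zero here) is an assumption made so that the statistics can actually be evaluated, since the true ϑ is unknown in practice.

```r
set.seed(7)
# Simulated balanced one-way layout (illustrative only).
dat <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 15)),
  y     = rnorm(45)
)
fit <- lm(y ~ group, data = dat)
K <- rbind(c(0, 1, 0), c(0, 0, 1), c(0, -1, 1))   # all pairwise differences

vartheta_hat <- drop(K %*% coef(fit))
S_star <- K %*% vcov(fit) %*% t(K)

D_inv_sqrt <- diag(1 / sqrt(diag(S_star)))       # D_n^{-1/2}
m <- rep(0, nrow(K))                             # hypothesized values (assumed)
T_n <- drop(D_inv_sqrt %*% (vartheta_hat - m))   # standardized statistics
R_n <- D_inv_sqrt %*% S_star %*% D_inv_sqrt      # correlation matrix of T_n
round(R_n, 3)
```

The same matrix is returned by `cov2cor(S_star)`; writing it out mirrors the definition of R_n in the text.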

For the purposes of multiple comparisons, we need convergence of multivariate probabilities calculated for the vector $T_n$ when $T_n$ is assumed normally distributed with $R_n$ treated as if it were the true correlation matrix. However, such probabilities $P(\max(|T_n|) \le t)$ are continuous functions of $R_n$ (and a critical value $t$) which converge by $R_n \xrightarrow{P} R$ as a consequence of Theorem 1.7 in Serfling (1980). In cases where $T_n$ is assumed multivariate $t$ distributed with $R_n$ treated as the estimated correlation matrix, we have similar convergence as the degrees of freedom approach infinity.

Since we only assume that the parameter estimates are asymptotically normally distributed with a consistent estimate of the associated covariance matrix being available, our framework covers a large class of statistical models, including linear regression and ANOVA models, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc. Standard software packages can be used to fit such models and obtain the estimates $\hat\theta_n$ and $S_n$ which are essentially the only two quantities that are needed for what follows in Section 3. It should be noted that the elemental parameters $\theta$ are not necessarily means or differences of means in AN(C)OVA models. Also, we do not restrict our attention to contrasts of such means, but allow for any set of constants leading to the linear functions $\vartheta = K\theta$ of interest. Specific examples for $K$ and $\theta$ will be given later in Sections 4 and 6.

3 Global and Simultaneous Inference

Based on the results from Section 2, we now focus on the derivation of suitable inference procedures. We start considering the general linear hypothesis (Searle, 1971) formulated in terms of our parameters of interest $\vartheta$

$$H_0 : \vartheta := K\theta = m.$$

Under the conditions of $H_0$ it follows from Section 2 that

$$T_n = D_n^{-1/2} (\hat\vartheta_n - m) \overset{a}{\sim} N_k(0, R_n).$$

This approximating distribution will now be used as the reference distribution when constructing the inference procedures. The global hypothesis $H_0$ can be tested using standard global tests, such as the $F$- or the $\chi^2$-test. An alternative approach is to use maximum tests, as explained in Subsection 3.1. Note that a small global p-value (obtained from one of these procedures) leading to a rejection of $H_0$ does not give further indication about the nature of the significant result. Therefore, one is often interested in the individual null hypotheses

$$H_0^j : \vartheta_j = m_j.$$

Testing the hypotheses set $\{H_0^1, \dots, H_0^k\}$ simultaneously thus requires the individual assessments while maintaining the familywise error rate, as discussed in Subsection 3.2.

At this point it is worth considering two special cases. A stronger assumption than asymptotic normality of $\hat\theta_n$ in (2) is exact normality, i.e., $\hat\theta_n \sim N_p(\theta, \Sigma)$. If the covariance matrix $\Sigma$ is known, it follows by standard arguments that $T_n \sim N_k(0, R)$, when $T_n$ is normalized using fixed, known variances. Otherwise, in the typical situation of linear models with normal i.i.d. errors, $\Sigma = \sigma^2 A$, where $\sigma^2$ is unknown but $A$ is fixed and known, the exact distribution of $T_n$ is a $k$-dimensional multivariate $t_k(\nu, R)$ distribution with $\nu$ degrees of freedom ($\nu = n - p - 1$ for linear models), see Tong (1990).

3.1 Global Inference

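The $F$- and the $\chi^2$-test are classical approaches to assess the global null hypothesis $H_0$. Standard results (such as Theorem 3.5, Serfling, 1980) ensure that

$$X^2 = T_n^\top R_n^+ T_n \xrightarrow{d} \chi^2(\mathrm{Rank}(R)) \quad \text{when } \hat\theta_n \overset{a}{\sim} N_p(\theta, S_n)$$

$$F = \frac{T_n^\top R^+ T_n}{\mathrm{Rank}(R)} \sim F(\mathrm{Rank}(R), \nu) \quad \text{when } \hat\theta_n \sim N_p(\theta, \sigma^2 A),$$

where $\mathrm{Rank}(R)$ and $\nu$ are the corresponding degrees of freedom of the $\chi^2$ and $F$ distribution, respectively. Furthermore, $R_n^+$ denotes the Moore-Penrose inverse of the correlation matrix $R_n$.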
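The role of the Moore-Penrose inverse becomes clear when the correlation matrix is singular, as happens for all-pairwise comparisons of three groups (the third difference is a linear combination of the first two). The base R sketch below computes the χ² statistic for a hypothetical vector of observed statistics; the helper `pinv`, the correlation matrix and the numbers in `t_obs` are all illustrative assumptions.

```r
# Moore-Penrose inverse via the singular value decomposition (base R only).
pinv <- function(A, tol = 1e-8) {
  s <- svd(A)
  keep <- s$d > tol * max(s$d)   # drop numerically zero singular values
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Correlation matrix of all pairwise differences (b - a, c - a, c - b)
# in a balanced one-way layout; it has rank 2, not 3.
R_n <- matrix(c( 1.0, 0.5, -0.5,
                 0.5, 1.0,  0.5,
                -0.5, 0.5,  1.0), nrow = 3, byrow = TRUE)

# Hypothetical observed statistics, consistent with t3 = t2 - t1.
t_obs <- c(1.2, 2.9, 1.7)

rank_R   <- sum(svd(R_n)$d > 1e-8)
X2       <- drop(t(t_obs) %*% pinv(R_n) %*% t_obs)
p_global <- pchisq(X2, df = rank_R, lower.tail = FALSE)
c(rank = rank_R, X2 = X2, p = p_global)
```

An ordinary inverse via `solve(R_n)` would fail here because the matrix is singular, which is why the generalized inverse and the rank (rather than k) appear in the test.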

Another suitable scalar test statistic for testing the global hypothesis $H_0$ is to consider the maximum of the individual test statistics $T_{1,n}, \dots, T_{k,n}$ of the multivariate statistic $T_n = (T_{1,n}, \dots, T_{k,n})$, leading to a max-$t$ type test statistic $\max(|T_n|)$. The distribution of this statistic under the conditions of $H_0$ can be handled through the $k$-dimensional distribution

$$P(\max(|T_n|) \le t) \cong \int_{-t}^{t} \cdots \int_{-t}^{t} \varphi_k(x_1, \dots, x_k; R, \nu)\, dx_1 \cdots dx_k =: g_\nu(R, t) \tag{5}$$
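The probability in (5) can be approximated by simulation when no numerical integration routine is at hand. The base R sketch below treats the asymptotic normal case only and uses a hypothetical 2 × 2 correlation matrix and hypothetical observed statistics; it is a Monte Carlo stand-in, not the numerical integration used by dedicated software.

```r
set.seed(123)
# Hypothetical correlation matrix and observed statistics (k = 2).
R <- matrix(c(1.0, 0.5,
              0.5, 1.0), nrow = 2)
t_obs <- c(2.4, 1.1)

# Draw from N_k(0, R) via the Cholesky factor (R must be positive definite).
B <- 100000
Z <- matrix(rnorm(B * nrow(R)), ncol = nrow(R)) %*% chol(R)
max_abs <- apply(abs(Z), 1, max)

# g(R, t) = P(max |T| <= t); single-step adjusted p-values are 1 - g(R, |t_j|).
g <- function(t) mean(max_abs <= t)
p_adj <- 1 - sapply(abs(t_obs), g)
p_raw <- 2 * pnorm(-abs(t_obs))   # unadjusted two-sided p-values

cbind(t_obs, p_raw, p_adj)
```

The adjusted p-values lie between the unadjusted ones and the Bonferroni bound k · p, with the gain over Bonferroni growing as the correlation between the statistics increases.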


Frequently Asked Questions (14)

Q1. What are the contributions mentioned in the paper "Simultaneous inference in general parametric models" ?

In this paper the authors describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. Several examples using a variety of different statistical models illustrate the breadth of the results.

Q2. What is the default re-parametrization used as elemental parameters in the R?

The so-called “treatment contrast” vector θ = (µ, γ2 − γ1, γ3 − γ1, …, γq − γ1) is, for example, the default re-parametrization used as elemental parameters in the R system for statistical computing (R Development Core Team, 2008).
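This parametrization can be inspected directly in base R. The sketch below uses simulated data for a hypothetical three-level factor and checks that, under the default treatment coding, the fitted coefficients are the first group mean followed by differences to that first group.

```r
set.seed(3)
# Simulated one-way layout with q = 3 groups (illustrative only).
g <- factor(rep(c("g1", "g2", "g3"), each = 10))
y <- rnorm(30, mean = rep(c(1, 2, 4), each = 10))

contr.treatment(3)       # coding matrix R uses by default for unordered factors

fit <- lm(y ~ g)
theta_hat <- coef(fit)   # intercept, then differences to the first level
means <- tapply(y, g, mean)

rbind(coefficients = theta_hat,
      from_means   = c(means[1], means[2] - means[1], means[3] - means[1]))
```

Because the elemental parameters are already differences to the first group, a matrix K for other comparisons (for example g3 versus g2) has to be written in terms of these differences rather than in terms of the group means.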

Q3. What is the purpose of this paper?

In this paper the authors aim at a unified description of simultaneous inference procedures in parametric models with generally correlated parameter estimates.

Q4. What is the advantage of single-step procedures?

Single-step procedures have the advantage that corresponding simultaneous confidence intervals are easily available, as previously noted.

Q5. What is the p-value for a given family of null hypotheses?

That is, for a given family of null hypotheses $H_0^1, \dots, H_0^k$, an individual hypothesis $H_0^j$ is rejected only if all intersection hypotheses $H_J = \bigcap_{i \in J} H_0^i$ with $j \in J \subseteq \{1, \dots, k\}$ are rejected (Marcus et al., 1976).

Q6. What is the scalar test statistic for testing the global null hypothesis?

By construction, the authors can reject an individual null hypothesis $H_0^j$, $j = 1, \dots, k$, whenever the associated adjusted p-value is less than or equal to the pre-specified significance level α, i.e., $p_j \le \alpha$.

Q7. What is the simplest way to model the response?

The response is modelled by a linear combination of the covariates with normal error $\varepsilon_i$ and constant variance $\sigma^2$, $Y_i = \beta_0 + \sum_{j=1}^{q} \beta_j X_{ij} + \dots$

Q8. What is the general framework for simultaneous inference?

The general framework described here extends the current canonical theory with respect to the following aspects: (i) model assumptions such as normality and homoscedasticity are relaxed, thus allowing for simultaneous inference in generalized linear models, mixed effects models, survival models, etc.; (ii) arbitrary linear functions of the elemental parameters are allowed, not just contrasts of means in AN(C)OVA models; (iii) computing the reference distribution is feasible for arbitrary designs, especially for unbalanced designs; and (iv) a unified implementation is provided which allows for a fast transition of the theoretical results to the desks of data analysts interested in simultaneous inferences for multiple hypotheses.

Q9. What is the p-value for the jth individual two-sided hypothesis?

In the present context of single-step tests, the (at least asymptotic) adjusted p-value for the jth individual two-sided hypothesis $H_0^j : \vartheta_j = m_j$, $j = 1, \dots, k$, is given by $p_j = 1 - g_\nu(R_n, |t_j|)$, where $t_1, \dots, t_k$ denote the observed test statistics.

Q10. What is the p-value for the global null hypothesis?

The resulting global p-value (exact or approximate, depending on context) for $H_0$ is $1 - g_\nu(R_n, \max|t|)$ when $T = t$ has been observed.

Q11. What is the way to test the global null hypothesis?

Another suitable scalar test statistic for testing the global hypothesis H0 is to consider the maximum of the individual test statistics T1,n, . . . , Tk,n of the multivariate statistic Tn = (T1,n, . . . , Tk,n), leading to a max-t type test statistic max(|Tn|).

Q12. What are examples of multiple comparison procedures?

Examples of such multiple comparison procedures include Dunnett’s many-to-one comparisons, Tukey’s all-pairwise comparisons, sequential pairwise contrasts, comparisons with the average, changepoint analyses, dose-response contrasts, etc.

Q13. Why is mcp() not available in multcomp?

Because it is impossible to determine the parameters of interest automatically in this case, mcp() in multcomp will by default generate comparisons for the main effects γj only, ignoring covariates and interactions.

Q14. What is the sequence of n needed to establish -convergence in (4)?

Then the authors have $\tilde a_n R_n = D_n^{-1/2} S_n^\star D_n^{-1/2} = (a_n D_n)^{-1/2} (a_n S_n^\star) (a_n D_n)^{-1/2} \xrightarrow{P} \mathrm{diag}(\Sigma^\star)^{-1/2}\, \Sigma^\star\, \mathrm{diag}(\Sigma^\star)^{-1/2} =: R \in \mathbb{R}^{k,k}$, where the convergence in probability to a constant follows from Slutzky’s Theorem (Theorem 1.5.4, Serfling, 1980) and therefore (4) holds.
