Interaction Effects in Econometrics
Summary (3 min read)
1 Introduction
- A country may consider a reform that would strengthen the financial sector.
- This question involves interactions between financial development and dependency on external finance.
- In Section 2, the authors discuss some practical issues related to the specification of regressions with interaction effects and make recommendations for practitioners.
2 Linear Regression with Interaction Effects
- Many econometric issues related to models with interaction effects are very simple and the authors illustrate their discussion using simple Ordinary Least Squares (OLS) and instrumental variable (IV) estimation.
- Let Y be the dependent variable, such as the growth of an industrial sector, and let X1 and X2 be independent variables that may affect growth, such as dependency on external finance and financial development.
- Applied econometricians have typically allowed for interaction effects between two independent variables, X1 and X2, by estimating a simple multiple regression model of the form: Y = β0 + β1X1 + β2X2 + β3X1X2 + ε, (1) where X1X2 refers to a variable calculated as the simple observation-by-observation product of X1 and X2.
- Smith and Sasaki (1979) also argue that the inclusion of the interaction term might cause a multicollinearity problem.
- The point is simply that researchers sometimes do not notice the change in the interpretation of the coefficient estimate for the main terms when the interaction term is added.
2.1 Robustness to misspecification
- Often a researcher wants to test whether Y = f(X1, X2) and chooses a linear specification such as (2) for convenience.
- The relevance of this observation is as follows.
- If quadratic terms are not otherwise ruled out, the authors recommend also estimating the specification (4) in order to verify that a purported interaction term is not spuriously capturing left-out squared terms.
- The potential bias from leaving out second order terms is easily understood.
- If X1 and X2 are correlated, the authors can write X2 = αX1 + w (where α is positive), so the interaction term (they suppress the mean for simplicity) becomes αX1² + X1w, where the latter term has mean zero and will be part of the error in the regression.
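The substitution in the last bullet can be written out as a short worked step. This is only a sketch under the stated simplification (means suppressed), and it additionally assumes that w is mean-zero and uncorrelated with X1, as it would be for a linear projection residual.

```latex
\begin{align*}
X_2 &= \alpha X_1 + w, \qquad E[w] = 0,\quad \operatorname{Cov}(X_1, w) = 0, \\
X_1 X_2 &= X_1(\alpha X_1 + w) = \alpha X_1^2 + X_1 w, \\
E[X_1 w] &= \operatorname{Cov}(X_1, w) + E[X_1]\,E[w] = 0 .
\end{align*}
```

The interaction regressor therefore contains a multiple of X1². If X1² belongs in the true model but is left out, the interaction term is correlated with the omitted variable and its coefficient can pick up that effect.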
2.2 Panel data
- Consider a panel data regression with left-hand side variable Yit, where i is typically a cross-sectional index, such as an individual or a country (the authors will use the term country, for brevity), and t is a time index.
- As in the simple OLS case, the regression (5) is not robust to squared terms; in the panel data situation, it is also not robust to slopes that vary across, say, countries.
- (Of course, if the time-series dimension of the data is large, one may directly allow for country-varying slopes.)
3.2 Using the Frisch-Waugh theorem to hedge against a spuri-
- If the authors want to find the effect of X1 on ∂Y/∂X2 and they want to ascertain that they are not picking up any other interaction or square term, they can interact X2 with the Frisch-Waugh residual.
- Notice that this generalizes the subtraction of the average (equivalent to a regression on a constant) and the subtraction of “country-specific” averages.
- This procedure may not result in an unbiased coefficient to the interaction if it is truly the interaction of the non-orthogonalized X1 and X2 that affects Y; however, if the interaction involving orthogonalized terms is significant, it is less likely that the interaction is spurious.
- In applications, however, interaction effects are often intuitively motivated, and the authors will illustrate in the Monte Carlo section how different generating processes will affect inference.
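The orthogonalized interaction described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the data-generating process (a true model with a squared X1 term and no interaction), the parameter values, and the helper names `design`, `ols`, and `residualize` are all assumptions made for the example. Both variables are residualized on a constant and the other variable before being multiplied.

```python
import numpy as np

def design(*cols):
    """Stack a constant and the given columns into a regressor matrix."""
    return np.column_stack([np.ones(len(cols[0])), *cols])

def ols(y, *cols):
    """Least-squares coefficients of y on a constant plus the given columns."""
    coef, *_ = np.linalg.lstsq(design(*cols), y, rcond=None)
    return coef

def residualize(target, *others):
    """Frisch-Waugh residual: target minus its projection on a constant and others."""
    X = design(*others)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

# Hypothetical data: X2 is correlated with X1, and the true model contains
# X1 squared but NO interaction term.
rng = np.random.default_rng(2)
n = 50_000
x1 = rng.normal(0.0, 1.0, n)
x2 = 0.5 * x1 + rng.normal(0.0, 1.0, n)
y = 1.0 + x1 + x2 + x1**2 + rng.normal(0.0, 1.0, n)

x1_psi = residualize(x1, x2)  # X1 purged of a constant and X2
x2_psi = residualize(x2, x1)  # X2 purged of a constant and X1

raw = ols(y, x1, x2, x1 * x2)         # plain product as interaction
fw = ols(y, x1, x2, x1_psi * x2_psi)  # product of Frisch-Waugh residuals

print("interaction coefficient, raw product:      ", raw[3])
print("interaction coefficient, residual product: ", fw[3])
```

In this particular setup the raw product picks up the left-out squared term, while the product of residuals does not. As the text notes, the residual-based coefficient need not be unbiased when the true effect runs through the non-orthogonalized product.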
4.1 Interpretation of the main terms
- The authors first illustrate how the specification of the interaction term affects the interpretation of the main terms although they are not the first to make this point.
- Next, the authors allow for an interaction term that is either demeaned or not.
- The latter specifications are both correctly specified.
- In column (1) of Table 1, the results for the model without an interaction term are presented and, in columns (2) and (3), the correctly specified model is estimated.
- In column (2), the authors see how the coefficient to X1 changes from about 11 to about 3 when the regressors are not demeaned before they are interacted, a change that is close to the predicted size of β3E{X2}.
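The shift in the main-term coefficient can be reproduced with a small simulation. The sketch below is not the authors' Monte Carlo design: the parameter values (β1 = 3, β3 = 2, E{X2} = 4) are assumptions chosen only so that the numbers resemble the "about 11 to about 3" pattern, and `ols` is a hypothetical helper.

```python
import numpy as np

def ols(y, *cols):
    """Least-squares coefficients of y on a constant plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical data-generating process with a true interaction effect.
rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(1.0, 1.0, n)
x2 = rng.normal(4.0, 1.0, n)
b0, b1, b2, b3 = 1.0, 3.0, 1.0, 2.0
y = b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2 + rng.normal(0.0, 1.0, n)

# Same model, two parameterizations of the interaction regressor.
raw = ols(y, x1, x2, x1 * x2)
centered = ols(y, x1, x2, (x1 - x1.mean()) * (x2 - x2.mean()))

shift = centered[1] - raw[1]          # change in the coefficient to X1
predicted_shift = raw[3] * x2.mean()  # beta3 times the sample mean of X2

print("coefficient to X1, raw interaction:     ", raw[1])
print("coefficient to X1, demeaned interaction:", centered[1])
print("shift vs. beta3 * mean(X2):", shift, predicted_shift)
```

Both regressions span the same column space, so the fitted values and the interaction coefficient are identical. Only the meaning of the coefficient to X1 changes: the marginal effect at X2 = 0 in the raw version, and at the mean of X2 in the demeaned version.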
4.2 IV estimation
- Next the authors consider a model with an interaction effect where one of the independent variables is endogenous.
- In Table 2, the authors show OLS and IV regressions, starting with OLS-estimates of model (1).
- The coefficients are, as expected, severely biased.
- The regression delivers point estimates similar to those of column (2), but this regression uses the exogenous X1 less directly in the interaction so this estimator is less efficient.
- In general, an IV-estimator is more efficient the higher the correlation of the instrument with the endogenous variable and in most applications one can expect X1 X̂2 to have the highest correlation with X1X2.
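A minimal two-stage least squares sketch of the case where X2 is endogenous and X1·X̂2 serves as the instrument for the interaction follows. Everything in it is an assumption made for illustration: the instrument `z`, the error structure, the parameter values, and the helper names `design` and `lstsq_coef`. It is not the design behind Table 2.

```python
import numpy as np

def design(*cols):
    """Stack a constant and the given columns into a matrix."""
    return np.column_stack([np.ones(len(cols[0])), *cols])

def lstsq_coef(X, y):
    """Least-squares coefficients of y (vector or matrix) on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical data: x1 is exogenous, x2 is endogenous because its shock u
# also enters the structural error e; z is an assumed valid instrument.
rng = np.random.default_rng(4)
n = 100_000
x1 = rng.normal(1.0, 1.0, n)
z = rng.normal(0.0, 1.0, n)
u = rng.normal(0.0, 1.0, n)
e = 0.8 * u + 0.6 * rng.normal(0.0, 1.0, n)
x2 = 1.0 + z + u
y = 1.0 + 1.0 * x1 + 1.0 * x2 + 1.0 * x1 * x2 + e

X = design(x1, x2, x1 * x2)
ols = lstsq_coef(X, y)  # inconsistent: x2 is correlated with e

# First stage: fitted values of x2 from the exogenous variables.
W = design(x1, z)
x2_hat = W @ lstsq_coef(W, x2)

# Instrument set: constant, x1, x2_hat, and x1 * x2_hat for the interaction.
Z = design(x1, x2_hat, x1 * x2_hat)
X_hat = Z @ lstsq_coef(Z, X)  # project every regressor on the instruments
iv = lstsq_coef(X_hat, y)     # second stage gives the 2SLS coefficients

print("OLS  (const, X1, X2, X1*X2):", ols)
print("2SLS (const, X1, X2, X1*X2):", iv)
```

In this setup the OLS coefficient to X2 is pushed away from its true value of 1, while the 2SLS estimates are close to the true values. The relative efficiency of alternative instrument sets is not examined here.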
4.3 Non-linear terms in the regression
- In Table 3, the true model does not include an interaction term; instead, it is nonlinear in one of the main terms.
- When corr(X1, X2) ≠ 0, as in this example, the interaction term might pick up a left-out variable effect.
- In column (1), the authors show the correct specification.
- The authors' suggestion, to hedge against such spurious inference, is to include the squares of both main terms together with the interaction term.
- This model is correctly specified, although some regressors have true coefficients of zero, and the authors obtain the correct result.
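The hedge of adding both squared terms can be illustrated with simulated data. This is a sketch under assumed values, not a reproduction of Table 3: the correlation between X1 and X2, the coefficient on X1², and the helper `ols` are all hypothetical.

```python
import numpy as np

def ols(y, *cols):
    """Least-squares coefficients of y on a constant plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical true model: nonlinear in X1, with NO interaction term.
rng = np.random.default_rng(1)
n = 50_000
x1 = rng.normal(0.0, 1.0, n)
x2 = 0.5 * x1 + rng.normal(0.0, 1.0, n)  # corr(X1, X2) != 0
y = 1.0 + x1 + x2 + x1**2 + rng.normal(0.0, 1.0, n)

# Misspecified: interaction included, squared terms left out.
short = ols(y, x1, x2, x1 * x2)

# Hedged specification: interaction plus the squares of both main terms.
full = ols(y, x1, x2, x1 * x2, x1**2, x2**2)

print("interaction coefficient without squares:", short[3])
print("interaction coefficient with squares:   ", full[3])
print("coefficient to X1 squared:              ", full[4])
```

Without the squared terms, the interaction coefficient is clearly nonzero even though no interaction exists in the simulated model. Once both squares are included, it is close to zero and the X1² coefficient is close to its true value.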
4.4 Panel data with varying slopes
- The true model has a slope for X2 that varies across countries.
- The authors find a spuriously significant coefficient to the interaction term and a coefficient to X2 which is similar to the average of the true country-varying slopes.
- The variable X1 has a lower mean for country 2 and since the slope of X2 is higher for country 2 the least squares algorithm can minimize the squared errors by assigning a negative coefficient to the interaction term.
- In effect, the estimated model allows for different slopes to X2 since ∂Yit/∂X2 = β2+β3X1it.
- This is not the true model, but since the model estimated does not allow the slope to vary in any other way, this outcome occurs.
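The mechanism can be sketched with two simulated countries. The setup below is an assumption made for illustration (two countries, country fixed effects, specific means and slopes) and is not the design behind the authors' table; `within` and `fe_ols` are hypothetical helpers. Country 2 has a lower mean of X1 and a higher slope on X2, and there is no true interaction.

```python
import numpy as np

def within(v, g):
    """Subtract the country-specific mean of v (g holds integer country ids)."""
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]

def fe_ols(y, g, *cols):
    """OLS with country dummies; returns only the coefficients on cols."""
    dummies = (g[:, None] == np.unique(g)[None, :]).astype(float)
    X = np.column_stack([dummies, *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[dummies.shape[1]:]

# Hypothetical panel: index 0 is country 1, index 1 is country 2.
rng = np.random.default_rng(3)
n_per = 20_000
country = np.repeat([0, 1], n_per)
x1_mean = np.array([2.0, 0.0])   # X1 has a lower mean in country 2
slope_x2 = np.array([1.0, 2.0])  # slope of X2 is higher in country 2
intercept = np.array([0.0, 1.0])

x1 = x1_mean[country] + rng.normal(0.0, 1.0, 2 * n_per)
x2 = rng.normal(0.0, 1.0, 2 * n_per)
y = (intercept[country] + 1.0 * x1 + slope_x2[country] * x2
     + rng.normal(0.0, 1.0, 2 * n_per))

# Interaction of raw variables versus interaction of country-demeaned variables.
raw = fe_ols(y, country, x1, x2, x1 * x2)
demeaned = fe_ols(y, country, x1, x2,
                  within(x1, country) * within(x2, country))

print("interaction coefficient, raw:             ", raw[2])
print("interaction coefficient, country-demeaned:", demeaned[2])
```

With the raw product, the interaction coefficient comes out negative because X1 proxies for country membership. After subtracting country-specific means before interacting, it is close to zero.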
5 Replications
- The authors replicate five important papers and examine whether their implementations of interaction effects are robust.
- The authors' robustness exercise makes the original claims of Rajan and Zingales (1998) empirically more convincing.
- The non-centered implementation of Caprio, Laeven, and Levine (2007), in the authors' opinion, gives a misleading impression of the effect of the main terms; for example, the t-statistic of “rights” in column 1 implies that there is a large, significant effect of ownership rights on valuation when the owners' cash-flow share is nil.
- Easterly, Levine, and Roodman (2004) examine whether foreign aid (Aid) is more effective in countries with good policy.
- Including quadratic terms strengthens the significance of the interaction of interest while the interaction becomes insignificant—with a point estimate equal to that of column (1)—when the Frisch-Waugh residual is used for aid.
6 Conclusions
- The authors provide practical advice regarding interpretation and robustness of models with interaction terms for econometric practitioners—in particular, they suggest some simple rules-of-thumb intended to minimize the risk of estimated interaction terms spuriously capturing other features of the data.
- The dependent variable is the annual compounded growth rate in real value added for each ISIC industry in each country for the period 1980–1990.
- The authors collected the data using the sources given in Castro, Clementi, and MacDonald (2004).
- To compute the controlling shareholder's total cash-flow rights, the authors sum direct and all indirect cash-flow rights.
- The dependent variable, Polity2, is a democracy index.
Frequently Asked Questions (11)
Q2. What is the reason for the large change in the coefficient to the main term?
The large change in the coefficient to the main term is not due to misspecification but it reflects that the coefficient to X1 is to be interpreted as the marginal effect of X1 when X2 is zero.
Q3. What do the authors find to be the strongest result of negative interactions?
Including quadratic terms in the property rights measures seems to strengthen the authors’ main result of negative interactions (although the inclusion of a quadratic term in GDP weakens it).
Q4. What does the study show about the effects of the interaction terms?
The authors find that using Frisch-Waugh residuals strengthens the size and significance of the interactions; in fact, the interaction of external dependence and equity market capitalization and credit turns from insignificant to clearly significant at the 5-percent level with the expected sign.
Q5. What is the coefficient of the interaction term when estimating equation (1)?
If X1² is part of the correctly specified regression with coefficient δ, the estimated coefficient to the interaction term when estimating equation (1) will be αδ.
Q6. What does the author find to be the strongest evidence of the effect of interaction terms?
Castro, Clementi, and MacDonald (2004) hypothesize that strengthening of property rights, as measured by laws mandating “one share-one vote,” “anti-director rights” (which limit the power of directors to extract surplus), “creditor rights,” and “rule of law,” is beneficial for growth, and more so when restrictions on capital transactions (capital flows) are weaker, where the latter effect is captured by interaction terms.
Q7. What is the way to estimate the coefficient of a quadratic term?
If quadratic terms are not otherwise ruled out, the authors recommend also estimating the specification (4) in order to verify that a purported interaction term is not spuriously capturing left-out squared terms.
Q8. What is the partial derivative of Y with respect to X1?
In this regression, λ1 = ∂Y/∂X1 is the partial derivative of Y with respect to X1, implicitly evaluated at X2 = X̄2 (the mean value of X2).
Q9. What is the main message of the Castro, Clementi, and MacDonald (2004)?
The point estimates in the Castro, Clementi, and MacDonald (2004) study are not all robust, as one might conjecture from the size of the t-statistics, but the overall message of their regressions appears very robust to the kind of robustness checks that the authors recommend.
Q10. What is the way to determine if a regression with interactions really captures only interactions?
Case 2: if one wants to ascertain that the interaction of X1 and X2 captures no other regressors, the safest strategy is to run the following regression model: Y = β0 + β1X1 + β2X2 + β3 X1^ψ X2^ψ + ε, (9) where X1^ψ = M2X1 and X2^ψ = M1X2, with M1 = [I − P_{β0,X1}] and M2 = [I − P_{β0,X2}] (M1 is the residual maker from regressing X2 on a constant and X1, and M2 is the residual maker from regressing X1 on a constant and X2).
Q11. What is the way to explain the difference in the slope of the interaction term?
In the second column, the authors illustrate how the simple suggestion of subtracting the country-specific means from each variable prevents the interaction term from becoming spuriously significant due to country-varying slopes.