$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1x_2 + \epsilon$$

where $\beta_3$ is the coefficient on the "interaction term" $x_1x_2$. However, interaction terms are often tricky to work with. Bear Braumoeller's 2004 article in International Organization illustrated how published quantitative papers often made basic mistakes in interpreting interaction models. Scholars frequently would mis-interpret the lower-order coefficients, $\beta_1$ and $\beta_2$. Published research articles would argue that a significant coefficient on $x_1$ suggests a meaningful relationship between $x_1$ and $y$. In a model with an interaction term, this is not necessarily the case. The marginal effect of $x_1$ on $y$ in a linear model is not equal to $\beta_1$, it is actually $\beta_1 + \beta_2x_2$. That is, $\beta_1$, is the relationship between $x_1$ and $y$

*when $x_2$ is zero.*Often this is a meaningless quantity since some variables (for example, the age of a registered voter) cannot possibly equal zero.
The correct way to interpret an interaction model is to plot out the relationship between $x_1$ and $y$ for the possible values of $x_2$. It's not a matter of simply looking at a single coefficient and declaring a positive or negative effect. Even if the interaction coefficient $\beta_3$ is significant, the actual meaning of the interaction can differ. One interpretation may be that $x_1$ is always positively related with y, but the effect is greater for some values of $x_2$. Another is that $x_1$ is

*sometimes*positively associated with $y$ and*sometimes*negatively associated with y, depending on the value of $x_2$. Looking only at the coefficients does not capture these two different types of relationships.
Luckily, figuring out the marginal effect of $x_1$ on $y$ is rather easy. In a linear model, the point estimate for how much $y$ increases when $x_1$ is increased by 1, $\hat{\delta_1}$, is equal to

$$\hat{\delta_1} = \hat{\beta_1} + \hat{\beta_3}x_2$$

The variance of the estimator $\hat{\delta_1}$ is

$$Var(\delta_1) = Var(\hat{\beta_1} + \hat{\beta_3}x_2)$$

$$Var(\delta_1) = Var(\hat{\beta_1}) + Var(\hat{\beta_3}x_2) + 2Cov(\hat{\beta_1}, \hat{\beta_3}x_2)$$

$$Var(\delta_1) = Var(\hat{\beta_1}) + x_2^2Var(\hat{\beta_3}) + 2x_2Cov(\hat{\beta_1}, \hat{\beta_3})$$

Note that when $x_2 = 0$, $\hat{\delta_1} = \hat{\beta_1}$ and $Var(\hat{\delta_1}) = Var(\hat{\beta_1})$. The standard deviation or standard error of $\hat{\delta_1}$ is equal to the square root of this variance. Extending these formulae to the non-linear case is easy - the coefficient estimates and variances are computed the same way, and from there one can simulate relevant quantities of interest (probabilities, predicted counts).

An even simpler way to calculate the marginal effect of $x_1$ for an arbitrary value of $x_2$ is to re-center $x_2$ by subtracting from it some value $k$

*and re-estimating the regression model. The coefficient and standard error of $x_1$ will be the marginal effect of x_1 on y when $x_2 = k$. A handy trick is to mean-center $x_2$ (subtract the mean of $x_2$ from each value of $x_2$). Then, the coefficient on $x_1$ (in a linear model) is equal to the**average*effect of $x_1$ on $y$ over all of the values of $x_2$.*
Braumoeller's article came with Stata code to make interaction plots (though I can't seem to find it online anymore). In 2011, Stata 12 added the marginsplot command, making these sorts of figures even easier to create. Quantitative political scientists appear to have taken notice. I could not find a single article in the 2012/2013 issues of the American Political Science Review, American Journal of Political Science, and International Organization that used an interaction model without including a corresponding marginal effects plot. Correctly interpreting interaction effects is now about as easy as running the regression itself.

This is all well and good for Stata users, but what about R? Coding up these sorts of plots from scratch can get a little tedious, and no canned function (to my knowledge) exists on CRAN. Moreover, the availability of easy-to-use functions for statistical methods seems to encourage wider use among applied quantitative researchers.

So here's my code for quickly making decent-looking two-variable interaction plots in R. The first function,

Below is an example of the output. For the sake of demonstration, I took the built-in R dataset

Note that just interpreting the main effect of wind speed at zero (the regression coefficient) gives a misleading picture of the actual relationship. At 0 parts per billion of ozone, wind speed is negatively associated with temperature. But for higher values of ozone content wind speed becomes positively associated with temperature (I have no idea why this is the case, or why there would even be an interaction - my guess is there's some omitted variable). For the average value of ozone concentration (the red-dashed line), wind speed is not significantly associated with temperature.

Sometimes the moderating variable is a binary indicator. In these cases, a continuous interaction plot like the one above is probably less useful - we just want the effect when the moderator is "0" and when it's "1". The second function in the file,

So hopefully these two functions will save R users some time. Note that these functions also work perfectly fine with non-linear models, but the quantity plotted will be the regression coefficient and not necessarily something with substantive meaning. Unlike simple OLS, the coefficients of most non-linear models do not have a clear interpretation. You'll have to do a little bit of work to convert the coefficient estimates into something actually meaningful.

Feel free to copy and share this code, and let me know if there are any bugs. If there's enough demand, I might clean it up more and put together an R package (time permitting of course).

*interaction_plot_continuous(),*plots the estimated marginal effect for one variable that is interacted with a continuous "moderator." In simple terms, it plots $\delta_1$ for the range of values of $x_2$.Below is an example of the output. For the sake of demonstration, I took the built-in R dataset

*airquality,*which contains air quality measurements in New York taken during the 70s, and regressed maximum daily temperature on ozone content, wind speed and an interaction of ozone and wind. The plot below shows the marginal effect of wind speed moderated by ozone content:Note that just interpreting the main effect of wind speed at zero (the regression coefficient) gives a misleading picture of the actual relationship. At 0 parts per billion of ozone, wind speed is negatively associated with temperature. But for higher values of ozone content wind speed becomes positively associated with temperature (I have no idea why this is the case, or why there would even be an interaction - my guess is there's some omitted variable). For the average value of ozone concentration (the red-dashed line), wind speed is not significantly associated with temperature.

Sometimes the moderating variable is a binary indicator. In these cases, a continuous interaction plot like the one above is probably less useful - we just want the effect when the moderator is "0" and when it's "1". The second function in the file,

*interaction_plot_binary(),*handles this case. Again to demonstrate, I took the classic LaLonde job training experiment dataset and fitted a simple (and very much wrong) regression model. The model predicted real 1978 wages using assignment to a job training program (treatment), marital status and an interaction of the two. I then estimated the marginal "effect" of treatment assignment on wages for each of the two marital status levels. In this case, the interaction was not statistically significant.So hopefully these two functions will save R users some time. Note that these functions also work perfectly fine with non-linear models, but the quantity plotted will be the regression coefficient and not necessarily something with substantive meaning. Unlike simple OLS, the coefficients of most non-linear models do not have a clear interpretation. You'll have to do a little bit of work to convert the coefficient estimates into something actually meaningful.

Feel free to copy and share this code, and let me know if there are any bugs. If there's enough demand, I might clean it up more and put together an R package (time permitting of course).

** I'm using the word "effect" here loosely and as shorthand for "relationship between." Assigning a causal relationship between two variables requires further conditional independence assumptions that may or may not hold.*

*Edit: Thanks to Patrick Lam for pointing out a typo in the variance formula (missed the 2) - fixed above and in the code.*

*Edit 6/18: Forgot to also include a link to Brambor, Clark, and Golder's excellent 2006 paper in Political Analysis which discussed similar issues regarding interpretation of interaction terms.*

It doesn't seem to work for my negative binomial model for some reason. How can the code access the minimum and maximum values of a variable, if it doesn't have access to the data itself?

ReplyDeleteIt says "Error in model$coefficients[[interaction]] : subscript out of bounds"

ReplyDeleteGreat work, thanks!

ReplyDeleteI searched quite a while for a function that generates interaction plots like these - works great for me, even with plm models. Thank you!

ReplyDelete