
Clarifying vague interactions

[This article was first published on Ecology in silico, and kindly contributed to R-bloggers.]

For some reason, authors occasionally present linear model results with vague or unintelligible interaction effects. One way to be vague when presenting interaction effects is to provide only a table of model coefficients, including no information on the range of covariate values observed, and no plots to aid in interpretation. Here’s an example:

Suppose you have discovered a statistically significant interaction effect between two continuous covariates in the context of a linear model.
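
Written out, and assuming the usual form with two main effects plus their product (which is what the code further down computes), the expected response is as follows, where $x_1$ and $x_2$ are my own labels for the two covariates:

$$
\mu = E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2
$$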

Suppose also that you have decided to present the model results with the following table, and the reviewers requested no additional information:

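| Parameter | Estimate | SE | P-value |
|-----------|----------|-------|---------|
| $\beta_0$ | -0.004 | 0.037 | 0.921 |
| $\beta_1$ | 1.055 | 0.038 | <0.05 |
| $\beta_2$ | -0.496 | 0.037 | <0.05 |
| $\beta_3$ | 2.002 | 0.040 | <0.05 |
| RSE | 0.517 | | |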

Without knowing the range of covariate values observed, this table tells an incomplete story about the relationship between the covariates and the response variable. Assuming the reader has a decent guess about the range of possible values for the covariates, this is what they can piece together:

# parameter estimates
beta0 <- -.004
beta1 <- 1.055
beta2 <- -.496
beta3 <- 2.002

# reader's guess: range of possible covariate values
x1 <- seq(-5, 5, .1)
x2 <- seq(-5, 5, .1)
X <- expand.grid(x1=x1, x2=x2)

# reader's attempt to know how the covariates relate to E(y)
mu <- with(X, beta0 + beta1*x1 + beta2*x2 + beta3*x1*x2)

library(ggplot2)
d <- data.frame(mu=mu, x1=X$x1, x2=X$x2)
p1 <- ggplot(d, aes(x1, x2, z=mu)) + theme_bw() +
  geom_tile(aes(fill=mu)) +
  stat_contour(binwidth=1.5) +
  scale_fill_gradient2(low="blue", mid="white", high="orange") +
  xlab("Covariate 1") + ylab("Covariate 2") +
  ggtitle("Contour plot of E(y)")
p1

If the reader does not know where the observations fell in this plot, it is difficult to know whether the response variable was increasing or decreasing with each covariate across the range of observed values.
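
One way to make this concrete is to look at the slope of E(y) with respect to each covariate, which under the interaction model depends on the value of the other covariate. The following is a minimal sketch using the estimates from the table; the function names `slope_x1` and `slope_x2` are mine, not part of the original analysis.

```r
# parameter estimates from the table
beta1 <- 1.055
beta2 <- -0.496
beta3 <- 2.002

# slope of E(y) with respect to one covariate, as a function of the other
slope_x1 <- function(x2) beta1 + beta3 * x2
slope_x2 <- function(x1) beta2 + beta3 * x1

# covariate values at which each slope changes sign
x2_flip <- -beta1 / beta3  # slope in x1 is negative below this x2 value
x1_flip <- -beta2 / beta3  # slope in x2 is negative below this x1 value

round(c(x2_flip = x2_flip, x1_flip = x1_flip), 3)

# slopes at a few illustrative values of the other covariate
slope_x1(c(-2, 0, 2))
slope_x2(c(-2, 0, 2))
```

The sign of each slope flips within the guessed range of -5 to 5, so whether the response rises or falls with a covariate depends entirely on which part of that range was actually observed.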

Consider the following two cases, where the observed covariate combinations are included as points.

These two plots tell somewhat different stories despite identical model parameters. On the left, across the range of observed covariates, the expected value of $y$ increases as either covariate increases, and the interaction term affects the magnitude of this increase. On the right, increases in covariate 1 or 2 could increase or decrease $\mu$, depending on the value of the other covariate.
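
A rough sketch of how two such cases can arise is given here. The covariate values are invented for illustration (uniform on 1 to 4 in one case, standard normal in the other) and are not the values behind the plots; `share_increasing` is a hypothetical helper that reports the fraction of observations at which E(y) increases with each covariate.

```r
set.seed(1)

# parameter estimates from the table
beta1 <- 1.055
beta2 <- -0.496
beta3 <- 2.002
n <- 200

# hypothetical case A: both covariates observed only between 1 and 4
case_a <- data.frame(x1 = runif(n, 1, 4), x2 = runif(n, 1, 4))

# hypothetical case B: both covariates centred on zero
case_b <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

# fraction of observations where the slope of E(y) in each covariate is positive
share_increasing <- function(d) {
  c(x1 = mean(beta1 + beta3 * d$x2 > 0),
    x2 = mean(beta2 + beta3 * d$x1 > 0))
}

share_a <- share_increasing(case_a)
share_b <- share_increasing(case_b)
share_a
share_b
```

In case A both slopes are positive at every observation, whereas in case B each slope is positive for some observations and negative for others. To overlay either set of points on the contour plot above, something like `p1 + geom_point(aes(x1, x2), data = case_a, inherit.aes = FALSE)` should work.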

I won’t get into the nitty gritty of how to present interaction effects (but if you’re interested, there are articles out there, e.g. Lamina et al. 2012). My main goal here is to point out the ambiguity associated with presenting only a table of parameter estimates. My preference would be that authors at least report the observed covariate ranges (or, better yet, the values themselves), and provide a plot that illustrates the interaction.
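
As one possible way to do both, the sketch below reports the observed covariate ranges and plots predicted E(y) against covariate 1 at the quartiles of covariate 2. The data are simulated stand-ins generated from the tabled estimates, not the original data, and base R graphics are used only to keep the example free of dependencies.

```r
set.seed(42)
n <- 300

# simulated stand-in data (hypothetical), generated from the tabled estimates
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- with(d, -0.004 + 1.055 * x1 - 0.496 * x2 + 2.002 * x1 * x2) +
  rnorm(n, sd = 0.517)

fit <- lm(y ~ x1 * x2, data = d)

# 1. observed covariate ranges, to report alongside the coefficient table
obs_range <- sapply(d[c("x1", "x2")], range)
rownames(obs_range) <- c("min", "max")
print(round(obs_range, 2))

# 2. predicted E(y) along observed x1, at the quartiles of x2
x2_levels <- quantile(d$x2, c(0.25, 0.5, 0.75))
newd <- expand.grid(x1 = seq(min(d$x1), max(d$x1), length.out = 50),
                    x2 = x2_levels)
newd$fit <- predict(fit, newdata = newd)

# one column of predictions per x2 level (x1 varies fastest in expand.grid)
pred <- matrix(newd$fit, ncol = length(x2_levels))
matplot(unique(newd$x1), pred, type = "l", lty = 1, col = 1:3,
        xlab = "Covariate 1", ylab = "Predicted E(y)")
rug(d$x1)  # marks where covariate 1 was actually observed
legend("topleft", lty = 1, col = 1:3, title = "Covariate 2",
       legend = sprintf("%.2f", x2_levels))
```

Restricting the prediction grid to the observed range of covariate 1, and marking the observations with a rug, keeps the plot from implying anything about regions where there are no data.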
