Interaction in Linear Regression

This tutorial focuses on the interaction between a categorical variable and a continuous variable in linear regression. Note that, in this tutorial, we limit the categorical variable to 2 levels. (For a categorical variable with 3 levels, please refer to my other tutorial on interaction and coding in linear regression.)

Coding Note

In this tutorial, the dummy coding uses group 2 as the reference group. Thus, the only comparison is between group 1 and group 2. For the detailed reasoning, please refer to my other tutorial on Dummy and Contrast Codings in Linear Regression.
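To make the two codings concrete, the short sketch below prints the coding matrices that R assigns to a 2-level factor. The variable names dummy_codes and sum_codes are only illustrative; the calls are the same ones used in the sections that follow.

```r
# Dummy (treatment) coding with level 2 as the reference group:
# level 1 is coded 1, level 2 is coded 0
dummy_codes <- contr.treatment(2, base = 2)
print(dummy_codes)

# Contrast (sum-to-zero) coding:
# level 1 is coded 1, level 2 is coded -1
sum_codes <- contr.sum(2)
print(sum_codes)
```

Under dummy coding the coefficient compares group 1 with the reference group, while under contrast coding it compares group 1 with the average of the two group means.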

Simulated Data

# set seed
set.seed(123)

# Repeat a sequence of numbers:
Xa<-rep(c(1, 2), times=5)
Xa<-as.factor(Xa)
Y<-rnorm(10)
Xb<-rnorm(10)

# combine it into a data frame
df<-data.frame(Xa,Xb,Y)
print(df)

   Xa         Xb           Y
1   1  1.2240818 -0.56047565
2   2  0.3598138 -0.23017749
3   1  0.4007715  1.55870831
4   2  0.1106827  0.07050839
5   1 -0.5558411  0.12928774
6   2  1.7869131  1.71506499
7   1  0.4978505  0.46091621
8   2 -1.9666172 -1.26506123
9   1  0.7013559 -0.68685285
10  2 -0.4727914 -0.44566197

We can also calculate the means for each level of the categorical variable in R.

# calculate means by group
aggregate(df$Y, list(df$Xa), FUN=mean)

  Group.1           x
1       1  0.18031675
2       2 -0.03106546

Situation 1: Categorical IV only

In situation 1, we will focus on a regression model with single IV, namely the categorical variable.

Xa	Means		Dummy Coding	Contrast Coding
Group 1	0.1803	Intercept	-0.0311	(0.1803+(-0.0311))/2 = 0.0746
Group 2	-0.0311	Coded Variable	0.1803-(-0.0311) = 0.2114	0.1803-0.0746 = 0.1057
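The arithmetic in the table can be reproduced directly from the group means. The following is a minimal sketch that regenerates Y with the same seed as above; the names m1, m2, dummy_intercept, and so on are introduced here only for illustration.

```r
# Recreate the outcome and the grouping factor from the Simulated Data section
set.seed(123)
Xa <- as.factor(rep(c(1, 2), times = 5))
Y <- rnorm(10)

# Group means of Y
group_means <- tapply(Y, Xa, mean)
m1 <- group_means[["1"]]
m2 <- group_means[["2"]]

# Dummy coding with group 2 as reference:
# intercept = mean of group 2, coefficient = difference between the means
dummy_intercept <- m2
dummy_coef <- m1 - m2

# Contrast (sum) coding:
# intercept = average of the two group means,
# coefficient = distance of group 1 from that average
contrast_intercept <- (m1 + m2) / 2
contrast_coef <- m1 - contrast_intercept

round(c(dummy_intercept, dummy_coef, contrast_intercept, contrast_coef), 4)
```

These four numbers should match the intercepts and Xa1 estimates reported in Situations 1a and 1b.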

Situation 1a: Dummy Coding:

# dummy coding
contrasts(df$Xa) =contr.treatment(2, base = 2)

# linear regression with dummy coding
result<-lm(Y~Xa,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.2340 -0.6592 -0.1251  0.2358  1.7461 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.03107    0.44932  -0.069    0.947
Xa1          0.21138    0.63544   0.333    0.748

Residual standard error: 1.005 on 8 degrees of freedom
Multiple R-squared:  0.01364,	Adjusted R-squared:  -0.1097 
F-statistic: 0.1107 on 1 and 8 DF,  p-value: 0.7479

Situation 1b: Contrast Coding:

# contrast coding
contrasts(df$Xa) =contr.sum(2)

# linear regression with contrast coding
result<-lm(Y~Xa,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.2340 -0.6592 -0.1251  0.2358  1.7461 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.07463    0.31772   0.235    0.820
Xa1          0.10569    0.31772   0.333    0.748

Residual standard error: 1.005 on 8 degrees of freedom
Multiple R-squared:  0.01364,	Adjusted R-squared:  -0.1097 
F-statistic: 0.1107 on 1 and 8 DF,  p-value: 0.7479

Situation 2: Categorical IV + Continuous IV

Situation 2a: Dummy Coding

# dummy coding
contrasts(df$Xa) =contr.treatment(2, base=2)

# linear regression with dummy coding
result<-lm(Y~Xa+Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.15473 -0.35823 -0.07879  0.43272  1.40680 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.01151    0.39473  -0.029    0.978
Xa1         -0.05190    0.57614  -0.090    0.931
Xb           0.53727    0.29252   1.837    0.109

Residual standard error: 0.8823 on 7 degrees of freedom
Multiple R-squared:  0.3344,	Adjusted R-squared:  0.1442 
F-statistic: 1.758 on 2 and 7 DF,  p-value: 0.2406

Situation 2b: Contrast Coding:

# contrast coding
contrasts(df$Xa) =contr.sum(2)

# linear regression with contrast coding
result<-lm(Y~Xa+Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.15473 -0.35823 -0.07879  0.43272  1.40680 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.03746    0.28561  -0.131    0.899
Xa1         -0.02595    0.28807  -0.090    0.931
Xb           0.53727    0.29252   1.837    0.109

Residual standard error: 0.8823 on 7 degrees of freedom
Multiple R-squared:  0.3344,	Adjusted R-squared:  0.1442 
F-statistic: 1.758 on 2 and 7 DF,  p-value: 0.2406

Situation 2c: Dummy Coding + Centering

Here, we continue focusing on a model with the dummy coding for the categorical variable. In addition, we center the continuous variable.

# dummy coding
contrasts(df$Xa) =contr.treatment(2, base=2)

# centering continuous variable
center_scale <- function(x) { scale(x, scale = FALSE)}
df$Xb<-center_scale(df$Xb)

# linear regression with dummy coding
result<-lm(Y~Xa+Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.15473 -0.35823 -0.07879  0.43272  1.40680 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.1006     0.4010   0.251    0.809
Xa1          -0.0519     0.5761  -0.090    0.931
Xb            0.5373     0.2925   1.837    0.109

Residual standard error: 0.8823 on 7 degrees of freedom
Multiple R-squared:  0.3344,	Adjusted R-squared:  0.1442 
F-statistic: 1.758 on 2 and 7 DF,  p-value: 0.2406
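Why only the intercept moves when Xb is centered can be checked numerically: the centered intercept should equal the uncentered intercept plus the Xb slope times the mean of Xb. The sketch below assumes a separate column named Xb_c (a name introduced here) so that the original Xb is left untouched.

```r
# Recreate the simulated data
set.seed(123)
Xa <- as.factor(rep(c(1, 2), times = 5))
Y <- rnorm(10)
Xb <- rnorm(10)
df <- data.frame(Xa, Xb, Y)
contrasts(df$Xa) <- contr.treatment(2, base = 2)

# Model with the raw continuous IV
fit_raw <- lm(Y ~ Xa + Xb, data = df)

# Model with the centered continuous IV (stored in a new column)
df$Xb_c <- df$Xb - mean(df$Xb)
fit_centered <- lm(Y ~ Xa + Xb_c, data = df)

# Intercept predicted from the raw model: b0 + b_Xb * mean(Xb)
shifted <- coef(fit_raw)[["(Intercept)"]] +
  coef(fit_raw)[["Xb"]] * mean(df$Xb)

# Compare with the intercept of the centered model
all.equal(shifted, coef(fit_centered)[["(Intercept)"]])
```

The slopes for Xa1 and the continuous IV are identical in both fits; only the point at which the intercept is evaluated changes.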

Situation 2d: Contrast Coding + Centering

# contrast coding
contrasts(df$Xa) =contr.sum(2)

# centering continuous variable
center_scale <- function(x) { scale(x, scale = FALSE)}
df$Xb<-center_scale(df$Xb)

# linear regression with contrast coding
result<-lm(Y~Xa+Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.15473 -0.35823 -0.07879  0.43272  1.40680 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.07463    0.27901   0.267    0.797
Xa1         -0.02595    0.28807  -0.090    0.931
Xb           0.53727    0.29252   1.837    0.109

Residual standard error: 0.8823 on 7 degrees of freedom
Multiple R-squared:  0.3344,	Adjusted R-squared:  0.1442 
F-statistic: 1.758 on 2 and 7 DF,  p-value: 0.2406

Situation 3: Interaction

Situation 3a: Dummy Coding

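# note: Situations 3a and 3b use the uncentered Xb; if Xb was centered above,
# rebuild df from the Simulated Data section first
# dummy coding
contrasts(df$Xa) =contr.treatment(2, base = 2)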

# linear regression with dummy coding
result<-lm(Y~Xa+Xb+Xa*Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb + Xa * Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.74993 -0.47098 -0.04572  0.28724  1.35337 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.003186   0.334174  -0.010   0.9927  
Xa1          0.398200   0.540029   0.737   0.4887  
Xb           0.765925   0.274210   2.793   0.0314 *
Xa1:Xb      -1.239197   0.638357  -1.941   0.1003  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7469 on 6 degrees of freedom
Multiple R-squared:  0.5912,	Adjusted R-squared:  0.3868 
F-statistic: 2.892 on 3 and 6 DF,  p-value: 0.1243

Situation 3b: Contrast Coding

# contrast coding
contrasts(df$Xa) =contr.sum(2)

# linear regression with contrast coding
result<-lm(Y~Xa+Xb+Xa*Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb + Xa * Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.74993 -0.47098 -0.04572  0.28724  1.35337 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.1959     0.2700   0.726    0.495
Xa1           0.1991     0.2700   0.737    0.489
Xb            0.1463     0.3192   0.458    0.663
Xa1:Xb       -0.6196     0.3192  -1.941    0.100

Residual standard error: 0.7469 on 6 degrees of freedom
Multiple R-squared:  0.5912,	Adjusted R-squared:  0.3868 
F-statistic: 2.892 on 3 and 6 DF,  p-value: 0.1243

Situation 3c: Dummy Coding + Centering

# dummy coding (group 2 as the reference group, as in the other situations)
contrasts(df$Xa) =contr.treatment(2, base = 2)

# centering the continuous IV
center_scale <- function(x) { scale(x, scale = FALSE)}
df$Xb<-center_scale(df$Xb)

# linear regression with dummy coding
result<-lm(Y~Xa+Xb+Xa*Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb + Xa * Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.74993 -0.47098 -0.04572  0.28724  1.35337 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   0.1566     0.3407   0.460   0.6620  
Xa1           0.1397     0.4976   0.281   0.7884  
Xb            0.7659     0.2742   2.793   0.0314 *
Xa1:Xb       -1.2392     0.6384  -1.941   0.1003  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7469 on 6 degrees of freedom
Multiple R-squared:  0.5912,	Adjusted R-squared:  0.3868 
F-statistic: 2.892 on 3 and 6 DF,  p-value: 0.1243

Situation 3d: Contrast Coding + Centering

# contrast coding
contrasts(df$Xa) =contr.sum(2)

# centering the continuous IV
center_scale <- function(x) { scale(x, scale = FALSE)}
df$Xb<-center_scale(df$Xb)

# linear regression with contrast coding
result<-lm(Y~Xa+Xb+Xa*Xb,data=df)

# summarize the result
summary(result)

Call:
lm(formula = Y ~ Xa + Xb + Xa * Xb, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.74993 -0.47098 -0.04572  0.28724  1.35337 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.22644    0.24880   0.910    0.398
Xa1          0.06984    0.24880   0.281    0.788
Xb           0.14633    0.31918   0.458    0.663
Xa1:Xb      -0.61960    0.31918  -1.941    0.100

Residual standard error: 0.7469 on 6 degrees of freedom
Multiple R-squared:  0.5912,	Adjusted R-squared:  0.3868 
F-statistic: 2.892 on 3 and 6 DF,  p-value: 0.1243

Summary

	S1 (D)	S1 (C)	S2a (D)	S2b (C)	S2c (D)	S2d (C)
Centering			No	No	Yes	Yes
Intercept	-0.03107	0.0746	-0.01151	-0.0375	0.1006	0.0746
p-value for intercept	0.947	0.820	0.978	0.899	0.809	0.797

coefficient for categorical IV	0.2114	0.1057	-0.05190	-0.0260	-0.0519	-0.0260
p-value for categorical IV	0.748	0.748	0.931	0.931	0.931	0.931

coefficient for continuous IV			0.5373	0.5373	0.5373	0.5373
p-value for continuous IV			0.109	0.109	0.109	0.109

Situation 2: Categorical IV + Centered Continuous IV

1. S1 vs. S2: Adding a continuous IV to the regression changes both the coefficient and the p-value for the categorical variable.
2. S2a vs. S2b: Coding (contrast vs. dummy) changes neither the coefficient nor the p-value of the continuous IV.
3. S2c vs. S2d: same as point 2.
4. S2a vs. S2b: Coding (contrast vs. dummy) does not change the p-value of the categorical variable, but it does change its regression coefficient (the contrast-coded coefficient is half the dummy-coded one).
5. S2c vs. S2d: same as point 4.
6. S2a + S2b vs. S2c + S2d: Centering changes only the intercept (both its estimate and its p-value); nothing else changes.
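The points about coding can be verified side by side. The sketch below fits the same additive model under both codings using a small helper, fit_with, which is a hypothetical convenience function introduced here and not part of the original code.

```r
# Recreate the simulated data
set.seed(123)
Xa <- as.factor(rep(c(1, 2), times = 5))
Y <- rnorm(10)
Xb <- rnorm(10)
df <- data.frame(Xa, Xb, Y)

# Hypothetical helper: fit Y ~ Xa + Xb under a given coding of Xa
# and return the coefficient table
fit_with <- function(codes) {
  d <- df
  contrasts(d$Xa) <- codes
  coef(summary(lm(Y ~ Xa + Xb, data = d)))
}

tab_dummy <- fit_with(contr.treatment(2, base = 2))
tab_sum <- fit_with(contr.sum(2))

# Continuous IV: the whole row (estimate, SE, t, p) is the same
all.equal(tab_dummy["Xb", ], tab_sum["Xb", ])

# Categorical IV: the dummy estimate is twice the contrast estimate,
# while the p-value is the same
all.equal(tab_dummy["Xa1", "Estimate"], 2 * tab_sum["Xa1", "Estimate"])
all.equal(tab_dummy["Xa1", "Pr(>|t|)"], tab_sum["Xa1", "Pr(>|t|)"])
```

The factor of two arises because the two levels are one unit apart under dummy coding (1 vs. 0) and two units apart under contrast coding (1 vs. -1).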

	S1	S1	S2a	S2b	S2c	S2d	S3a	S3b	S3c	S3d
Coding	D	C	D	C	D	C	D	C	D	C
Centering Continuous IV			No	No	Yes	Yes	No	No	Yes	Yes

Intercept	-0.03107	0.0746	-0.01151	-0.0375	0.1006	0.0746	-0.003186	0.1959	0.1566	0.2264
p-value for intercept	0.947	0.820	0.978	0.899	0.809	0.797	0.9927	0.495	0.6620	0.398

coefficient for categorical IV	0.2114	0.1057	-0.05190	-0.0260	-0.0519	-0.0260	0.398200	0.1991	0.1397	0.0698
p-value for categorical IV	0.748	0.748	0.931	0.931	0.931	0.931	0.4887	0.489	0.7884	0.788

coefficient for continuous IV			0.5373	0.5373	0.5373	0.5373	0.765925	0.1463	0.7659	0.1463
p-value for continuous IV			0.109	0.109	0.109	0.109	0.0314 *	0.663	0.0314 *	0.663

Interaction coefficient							-1.2392	-0.6196	-1.2392	-0.6196
p-value for interaction							0.100	0.100	0.100	0.100

Situation 3: Categorical IV + Centered Continuous IV + Categorical Variable * Continuous IV

1. S2 vs. S3: Adding an interaction term changes all other coefficients.
2. S3a vs. S3b: Coding changes both the coefficient and the p-value for the continuous IV.
3. S3c vs. S3d: same as point 2.
4. S3a vs. S3b: Coding changes the regression coefficients for the categorical IV and the interaction term, but not their p-values.
5. S3c vs. S3d: same as point 4.
6. S3a + S3b vs. S3c + S3d: Centering changes both the coefficient and the p-value for the intercept and the categorical IV.
7. S3a + S3b vs. S3c + S3d: Centering changes neither the coefficient nor the p-value for the continuous IV or the interaction term.
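The reason coding changes the Xb coefficient once the interaction is in the model is that Xb then means different things: under dummy coding it is the slope within the reference group (group 2), whereas under contrast coding it is the average of the two group slopes. A minimal sketch of that relationship, using the uncentered data and names such as slope_g1 and slope_g2 that are introduced here:

```r
# Recreate the simulated data (uncentered Xb)
set.seed(123)
Xa <- as.factor(rep(c(1, 2), times = 5))
Y <- rnorm(10)
Xb <- rnorm(10)
df <- data.frame(Xa, Xb, Y)

# Dummy coding, group 2 as reference
contrasts(df$Xa) <- contr.treatment(2, base = 2)
b_dummy <- coef(lm(Y ~ Xa * Xb, data = df))

# Slope of Xb within each group, derived from the dummy-coded model
slope_g2 <- b_dummy[["Xb"]]                        # reference group
slope_g1 <- b_dummy[["Xb"]] + b_dummy[["Xa1:Xb"]]  # group 1

# Contrast (sum) coding
contrasts(df$Xa) <- contr.sum(2)
b_sum <- coef(lm(Y ~ Xa * Xb, data = df))

# Under contrast coding, the Xb coefficient is the average of the two slopes
all.equal(b_sum[["Xb"]], (slope_g1 + slope_g2) / 2)

# and the interaction coefficient is half the difference between them
all.equal(b_sum[["Xa1:Xb"]], (slope_g1 - slope_g2) / 2)
```

Both codings describe the same two fitted lines; they only parameterize them differently, which is why the p-value of the interaction term is unaffected.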

Take-home message

1. For a categorical variable with 2 levels, the choice of coding does not affect the p-value of the interaction effect.
2. For a categorical variable with 2 levels, when the interaction effect is significant and you want to look at main effects, it is better to look at the simple main effects (e.g., using the Johnson-Neyman technique or testing the slope within each level of the categorical variable).
3. If the interaction effect is significant, it is difficult to interpret any main effect other than the simple main effects mentioned in point 2.
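One simple way to test the slope within each level, sketched below, is to refit the interaction model with each group in turn as the dummy-coded reference group, so that the Xb row of the coefficient table is the simple slope for that group. The helper simple_slope is a hypothetical name introduced for this illustration; it is one possible approach, not the only one.

```r
# Recreate the simulated data (uncentered Xb)
set.seed(123)
Xa <- as.factor(rep(c(1, 2), times = 5))
Y <- rnorm(10)
Xb <- rnorm(10)
df <- data.frame(Xa, Xb, Y)

# Hypothetical helper: with dummy coding, the Xb coefficient is the slope
# in the reference group, so switching the reference tests each simple slope
simple_slope <- function(ref) {
  d <- df
  contrasts(d$Xa) <- contr.treatment(2, base = ref)
  coef(summary(lm(Y ~ Xa * Xb, data = d)))["Xb", ]
}

slope_group1 <- simple_slope(1)
slope_group2 <- simple_slope(2)

# Estimate, standard error, t value, and p-value of Xb within each group
rbind(group1 = slope_group1, group2 = slope_group2)
```

The group 2 row reproduces the Xb line from Situation 3a, and the group 1 estimate equals the group 2 slope plus the interaction coefficient.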