Expanding the mixed effects model framework#

Learning Objectives#

  • Learn how regression models can handle different types of variables, multiple variables, and various relationships between variables

  • Gain knowledge on incorporating more predictor variables as fixed effects in mixed effects models

  • Learn to fit logistic mixed effects regression models for binary outcome variables using the glmer() function in R

  • Understand how to compare models with and without random effects to determine the necessity of including random intercepts

  • Develop skills to interpret the fixed and random effects in mixed effects models, including the estimation of odds ratios for logistic regression models

So far we have considered a fairly simple regression model: one continuous predictor variable, one continuous outcome variable, and a straight line fitted between them. This has enabled us to get to grips with the core concepts, but regression can do so much more than this. It is an incredibly flexible framework that can handle

  • different types of variables

  • multiple variables

  • different relationships between variables

We will now look at how we extend the methodology to allow more complex analysis designs. You should think of regression as a modular approach in which you select the necessary components depending on the

  • properties of the outcome variable(s)

  • properties of the predictor variable(s)

  • the relationship that you want to model between each predictor variable and outcome variable.

Basically, anything you can do with a standard regression model you can also do with a mixed effects model.
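As an illustrative sketch of this modularity (using the variable names from this tutorial's dataset), the pieces of a model call map directly onto those choices:

%%R
# Illustrative sketch of the modular pieces (not a new analysis):
#   outcome    ~  fixed effects   +  random effects
#   CognitionA ~  VisitNum + Sex  +  (1 | ID)
# lmer() handles a continuous outcome; glmer(..., family = "binomial")
# handles a binary outcome.
NULL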

Including more predictor variables#

We can easily incorporate more predictor variables as fixed effects into our model. As with linear regression, we need to specify these in the formula we provide to R. If, in addition to VisitNum, we want to control for differences between males and females, we can include a fixed effect for sex.

%%R
library(lme4)
model.sex <- lmer(CognitionA ~ VisitNum + Sex + (1 | ID), data = cogDat)
summary(model.sex)
Linear mixed model fit by REML ['lmerMod']
Formula: CognitionA ~ VisitNum + Sex + (1 | ID)
   Data: cogDat

REML criterion at convergence: 808.5

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.15684 -0.64686 -0.03676  0.60659  2.02730 

Random effects:
 Groups   Name        Variance Std.Dev.
 ID       (Intercept) 23.672   4.865   
 Residual              3.762   1.940   
Number of obs: 164, groups:  ID, 41

Fixed effects:
            Estimate Std. Error t value
(Intercept)  20.8406     1.2363  16.857
VisitNum      0.4433     0.1071   4.140
SexM          0.8121     1.5806   0.514

Correlation of Fixed Effects:
         (Intr) VistNm
VisitNum -0.203       
SexM     -0.746 -0.021

We can see that the fixed effect we have estimated for sex is small relative to its standard error (t = 0.514), so it is not significant: there is no evidence of a difference in CognitionA between males and females in this model.
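If we want a formal test of whether adding Sex improves the model, one option is a likelihood-ratio test between nested models; this is a sketch (the model name `model.nosex` is introduced here for illustration). Note that anova() will automatically refit REML models with maximum likelihood before comparing models with different fixed effects:

%%R
# Sketch: likelihood-ratio test for the Sex fixed effect.
# anova() refits REML fits with ML before comparing fixed effects.
model.nosex <- lmer(CognitionA ~ VisitNum + (1 | ID), data = cogDat)
anova(model.nosex, model.sex)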

Logistic mixed effects regression models#

If our outcome is a binary variable we instead need to fit a logistic regression model. For an explanation of why, please see the Introduction to Regression Analysis tutorial. As logistic regression requires the generalized linear model framework, we need to use the function glmer() rather than lmer().

Let’s look at whether, in general, the participants’ mental wellbeing improves as the study progresses. The variable that captures change over the course of the study is VisitNum, so this is our predictor variable; we keep our random effect for ID, and our outcome variable is the factor MentalWellbeing.

%%R
model.log <- glmer(MentalWellbeing ~ VisitNum + (1 | ID), data = cogDat, family="binomial")
summary(model.log)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: MentalWellbeing ~ VisitNum + (1 | ID)
   Data: cogDat

     AIC      BIC   logLik deviance df.resid 
   367.2    380.9   -180.6    361.2      706 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.5324 -0.0070 -0.0031  0.0717  1.1377 

Random effects:
 Groups Name        Variance Std.Dev.
 ID     (Intercept) 347      18.63   
Number of obs: 709, groups:  ID, 175

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -8.2663     0.8698  -9.504  < 2e-16 ***
VisitNum     -0.8023     0.1762  -4.552 5.31e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
         (Intr)
VisitNum -0.187

Let’s see if we need the random intercept, which is essentially asking whether an individual’s wellbeing at one point in the study predicts their wellbeing at another stage. It is important to do this for each model: just because individual-level variation affects one outcome variable doesn’t automatically mean it affects all variables in a dataset.

As before, we do this by comparing it to a standard regression model with just fixed effects and no random effects. As the standard model also needs to be a logistic regression model, we use the glm() function to fit it. We then use anova() to compare the models with and without the random effect.

%%R
null.log <- glm(MentalWellbeing ~ VisitNum, data = cogDat, family="binomial")
anova(model.log, null.log)
Data: cogDat
Models:
null.log: MentalWellbeing ~ VisitNum
model.log: MentalWellbeing ~ VisitNum + (1 | ID)
          npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
null.log     2 853.82 862.95 -424.91   849.82                         
model.log    3 367.21 380.90 -180.60   361.21 488.61  1  < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

These results show that including the random intercept significantly improves the fit of the model (P < 0.05). Therefore we can conclude that an individual’s mental wellbeing is correlated across the course of the study.

We interpret the fixed effects as we would for any other logistic regression model: they relate to the log odds ratio of the outcome per one-unit increase in the predictor variable. As a one-unit increase in the predictor variable equates to one extra visit, we can summarise from this model that each extra visit is associated with a log odds ratio of -0.8. We can convert this to an odds ratio by exponentiating it.

%%R
exp(coef(summary(model.log))[,"Estimate"])
 (Intercept)     VisitNum 
0.0002570408 0.4482874766 

So the odds of having low mental wellbeing relative to high mental wellbeing decrease by a factor of 0.45 for each extra visit. We can flip this round and say that each visit increases the odds of having high mental wellbeing by a factor of 2.2. Note that the individual-level random intercepts represent each individual’s baseline log odds of the outcome.
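We can inspect these individual-level intercepts directly. As a sketch, ranef() extracts each individual’s deviation from the overall intercept (on the log-odds scale), while coef() combines the fixed and random parts per individual:

%%R
# Sketch: each individual's random intercept, on the log-odds scale.
head(ranef(model.log)$ID)   # deviations from the fixed intercept
head(coef(model.log)$ID)    # fixed + random intercept per individual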

Exercise 3#

Let’s practise fitting more complex mixed effects models.

Write the R code required to test the following using a mixed effects regression model. For each model include a random intercept for individual.

  1. Is cognitive performance measured by any of the three tests influenced by smoking or years of education?

%%R
#model.coga<-lmer(CognitionA ~ VisitNum + ... + (1|ID), data = cogDat)
#model.cogb<-
#model.cogc<-
NULL

  2. Does cognitive performance in any of the three tests influence the mental well being of the participants? Include co-variates for sex and years of education.

%%R
#model.mw.coga<-glmer(MentalWellbeing ~ CognitionA + VisitNum + ... + (1|ID), data = cogDat, family="binomial")
#model.mw.cogb<-
#model.mw.cogc<-
NULL

  3. Does physical well being improve over the course of the study?