# Introduction: When to Use Linear Mixed Effects Models
This wiki page details how to use linear mixed effects models. The goal is to keep things simple and focus on the different ways that random effects can be specified. We will not dive deeply into post hoc tests of fixed effects, and only continuous outcomes will be used on this page. For non-continuous outcomes, a generalized linear mixed effects model is needed, which is covered elsewhere in the main wiki hub. We will rely on the `lmer()` function from the 'lme4' package, and also load the 'lmerTest' package to produce significance tests of the fixed effects.
There are many reasons to run these types of models. One is that we suspect there is an effect of the cluster (k) that the data are nested in. We will cover this topic in detail, including unmeasured confounders. It is also important to mention that mixed models separate the variance of the outcome into level-1 and level-2 components, each of which can be explained by its own predictors. Thus, there are both level-1 and level-2 factors to be mindful of.
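To make that decomposition concrete: in an empty (intercept-only) model, the total variance of the outcome splits into a level-2 (between-cluster) component $\tau_{00}^2$ and a level-1 (within-cluster) component $\sigma^2$, and the intraclass correlation coefficient (ICC) is the proportion of variance sitting at level 2:

$$Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}, \qquad \operatorname{Var}(Y_{ij}) = \tau_{00}^2 + \sigma^2$$

$$\mathrm{ICC} = \frac{\tau_{00}^2}{\tau_{00}^2 + \sigma^2}$$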
Here we will generate dummy data to test anxiety regressed on depression and sex. To give the data some context: this is information obtained from 5000 patients within one clinic over the past 10 years. Across this time period, patients were screened for anxiety and depression on their first visit to the clinic- this is the data we will analyze. The screening was done by a trained therapist, and each therapist (there are 100 of them) screened 50 patients.
Therefore, we technically have nested data, where patients' scores are nested within the therapist who did their screening. However, this is a case where I would argue that nesting should not really matter. Why not? Because mixed models are typically used to assess whether the cluster or group that subjects or observations are nested within has some type of effect on them. In this context, that would mean patients who had their anxiety measured (the outcome variable) by the same therapist would have anxiety scores more correlated with each other than with those of patients outside that cluster. This is extremely unlikely, especially since the therapists follow a standardized procedure. Regardless, we are going to run a mixed-effects model to showcase this.
```r
# Load required packages
library(tidyverse)
library(psych)
library(lme4)
library(lmerTest) # For p-values
library(performance)
library(ggplot2)
library(sjPlot) # plot_model
library(parameters) # standardize_parameters
# Set seed for reproducibility
set.seed(123)
# Set population parameters
b0 = 45
b1 = 3
b2 = 10
tau = 1
sig = 8
# Set sample size
k = 100 # cluster group (therapists)
n = 50 # sample within cluster (clients)
N = k * n # total row length
# Generate IDs
k_id <- rep(1:k, each = n)
N_id <- 1:N
# Fixed effects predictors
x1 <- rnorm(N, mean = 5, sd = 2) # Depression level
x2 <- rbinom(N, 1, 0.5) # Sex
# Random intercept for cluster group
k_effect <- rep(rnorm(k, 0, sd = tau), each = n)
# Generate the outcome
y <- b0 + b1 * x1 + b2 * x2 + k_effect + rnorm(N, 0, sd = sig)
# Create a dataframe
dat <- data.frame(
k_id,
N_id,
x1,
x2,
k_effect,
y
)
# descriptives
describe(dat)
round(cov(dplyr::select(dat, x1, y)), 2)
round(cor(dplyr::select(dat, x1, y)), 2)
# Correlations within each sex
by(dat, dat$x2, function(x) cor(x$x1, x$y, method = "pearson"))
# Group means and SDs by sex
aggregate(y ~ x2, data = dat, FUN = function(x) c(mean = mean(x), sd = sd(x)))
aggregate(x1 ~ x2, data = dat, FUN = function(x) c(mean = mean(x), sd = sd(x)))
# Fitting an empty and regular model
mod_empty <- lmer(y ~ 1 + (1 | k_id), data = dat)
mod_mixed <- lmer(y ~ x1 + x2 + (1 | k_id), data = dat)
# Print out model summary
summary(mod_mixed)
# Extract parameter information
performance::icc(mod_empty) # ICC
fixef(mod_mixed) # fixed effects
VarCorr(mod_mixed) # random effects
r2(mod_mixed)
ranef(mod_mixed)$k_id[1:20,] # Estimated random intercept for each k cluster
# plot the random effects
plot_model(mod_mixed, type = "re") + theme_classic()
# Run a model with standardized betas
data_std <- dat %>% mutate(across(c(x1, x2, y), scale))
mod_mixed_std <- update(mod_mixed, data = data_std)
round(fixef(mod_mixed_std),2)
standardize_parameters(mod_mixed) # this also works, though it may fail with multiple outcome variables
```
Below we have the summary output of a model for the data we simulated. We generated data for 100 clusters (k = therapists), each containing 50 observations (n = patients with anxiety), so the dataset contains 5000 rows (N = total number of observations). We are testing whether anxiety is related to depression and to the subject's sex, after controlling for each other's effects. Additionally, we are controlling for the effect of cluster (k), if such an effect exists. In this context, k would be an effect attributed to the therapist who administered the screener. We can write out the formula for this model below:

$$Y_{ij} = \gamma_{00} + \gamma_{10}x_{1ij} + \gamma_{20}x_{2ij} + u_{0j} + r_{ij}$$

Where:

- $Y_{ij}$ = the observed anxiety score for patient $i$ screened by therapist $j$
- $\gamma_{00}$ = the y-intercept, or the expected anxiety score for a male who scored 0 on depression
- $\gamma_{10}$ = the slope representing the expected change in anxiety for a 1-unit increase in depression
- $\gamma_{20}$ = the effect of being female on anxiety score
- $u_{0j}$ = the effect of the therapist on anxiety score (the level-2 random effect; a large effect means anxiety tends to correlate among patients screened by the same therapist)
- $r_{ij}$ = the level-1 residual: the difference between the observed anxiety score and the value predicted by the fixed effects plus the therapist effect
So we see that we have 5000 observations in our data- this does not mean every variable has that many unique values. For k_id there are only 100 unique values. For x1, our depression predictor, we see a mean of 5 with a min and max of -1.27 and 11.89 respectively. Negative depression scores do not make sense, but that aspect is not important here. The second predictor is binary, so its min and max are 0 and 1 as expected. What is interesting is the effect of the k cluster that we introduced into the dataset: it is small (mean -0.15) and ranges from -3.06 to 1.80, which is minimal. The outcome (y) is our measure of anxiety and ranges from 26.79 to 105.92, so a k effect ranging from -3.06 to 1.80 will do very little to the outcome.
For variability, we see that the outcome is much more variable than the predictor x1. This is not very meaningful on its own, since the scores are on different units. What does matter is their correlation of 0.54, which is pretty good. This means that depression alone shares roughly 29% of its variance with anxiety (0.54 squared).
It is also good to look for group differences between the outcome and the other predictors. Here we can see that the correlation between anxiety and depression is the same across the sexes. Additionally, we see that there is likely a sex effect on the outcome (anxiety) but not on the predictor (depression)- this is consistent with how the data were simulated.
*(Figures: Data Description | Covariance | Correlation | Group Differences)*
Okay, now that we have the descriptives out of the way, we can closely inspect our model output. We start by verifying that the model recognized 100 therapists, which it did based on the number of groups reported alongside 'Number of obs:'. We then see the 'Random effects' section, with the standard deviation for the therapist intercepts and for the residual error (both are random effects). These standard deviations match what we programmed- so that's good- and they show that most of the unexplained variance in anxiety is attributed to the individual level, not the therapist level. For the fixed effects, the y-intercept is 44.6, x1 is 3, and x2 is 10, which is also what we programmed. Both predictors are significant, with very small standard errors, which was to be expected given the sample size.
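If you prefer to confirm the grouping structure programmatically rather than reading it off the summary, a quick sketch using lme4's own accessors:

```r
# Sanity checks on the grouping structure
ngrps(mod_mixed) # number of levels per grouping factor (k_id = 100)
nobs(mod_mixed)  # number of observations the model used (5000)
```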
In the extracted information, we can obtain the intraclass correlation coefficient (ICC)- this value tells us, on average, how correlated anxiety is within the therapist clusters- and here it is very small, indicating that therapist had almost no effect in this regard. We can also extract the fixed effects and random effects using `fixef()` and `VarCorr()` respectively. We can obtain how much of the outcome is explained by the model using the `r2()` function. Interestingly, we can extract the k effect for each cluster with the `ranef()` function. These estimates correlate with the effects we programmed, though with some deviation.

Lastly, we can produce standardized beta coefficients either by (recommended, but a bit more work) scaling the continuous predictors and outcome in the data and rerunning the model (use the `update()` function), or by using the `standardize_parameters()` function from the 'parameters' package. However, this function can fail when there is more than one outcome variable in the model.
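As a sanity check on `performance::icc()`, the ICC can also be computed by hand from the empty model's variance components (a minimal sketch):

```r
# ICC by hand from the empty model's variance components
vc <- as.data.frame(VarCorr(mod_empty))
tau2 <- vc$vcov[vc$grp == "k_id"]     # level-2 (therapist) intercept variance
sig2 <- vc$vcov[vc$grp == "Residual"] # level-1 (residual) variance
tau2 / (tau2 + sig2)                  # should match performance::icc(mod_empty)
```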
*(Figures: Model Summary | Extracted Information)*
For this example, we are going to introduce a new idea that was not present in the previous dataset: therapists differed from each other in price. What does this mean? Therapists in this clinic actually charged different prices to screen patients, so some charged a lot more for their services and some a lot less. Assuming that patients believed a more expensive screening led to better results, patients with more money to spend (higher SES) were willing to be screened by a more costly therapist. Now something interesting occurs: if anxiety is in any way related to the SES of the patient (which we are going to say it is), then there is an unexpected confounder embedded within the level-2 cluster that needs to be accounted for. The similarity in SES among patients assessed by the same therapist makes their scores more correlated with each other than with those of patients who saw a different therapist. This effectively creates a cluster (k) effect that, if not accounted for, will inflate the level-2 variability- and we can see this below.
```r
# Load required packages
library(tidyverse)
library(psych)
library(lme4)
library(lmerTest) # For p-values
library(performance)
library(ggplot2)
library(sjPlot) # plot_model
library(parameters) # standardize_parameters
# Set seed for reproducibility
set.seed(123)
# Set population parameters
b0 = 45
b1 = 3
b2 = 10
b3 = -3 # effect of price (the confounder)
tau = 1
sig = 8
# Set sample size
k = 100 # cluster group (therapists)
n = 50 # sample within cluster (clients)
N = k * n # total row length
# Generate IDs
k_id <- rep(1:k, each = n)
N_id <- 1:N
# Fixed effects predictors
x1 <- rnorm(N, mean = 5, sd = 2) # Depression level
x2 <- rbinom(N, 1, 0.5) # Sex
# Random intercept for cluster group
k_effect <- rep(rnorm(k, 0, sd = tau), each = n)
price <- rep(round(runif(k, min = 0, max = 5)), each = n)
# Generate the outcome
y <- b0 + b1 * x1 + b2 * x2 + b3 * price + k_effect + rnorm(N, 0, sd = sig)
# Create a dataframe
dat <- data.frame(
k_id,
N_id,
x1,
x2,
k_effect,
price,
y
)
# Fitting an empty and regular model
mod_empty <- lmer(y ~ 1 + (1 | k_id), data = dat)
conf_mod_mixed <- lmer(y ~ x1 + x2 + (1 | k_id), data = dat)
mod_mixed <- update(conf_mod_mixed, . ~ . + price) # Update model to include the confounder
# Print out model summary
summary(conf_mod_mixed)
summary(mod_mixed)
# Extract parameter information
performance::icc(mod_empty) # ICC
fixef(mod_mixed) # fixed effects
VarCorr(mod_mixed) # random effects
r2(mod_mixed)
as.vector(unlist(ranef(mod_mixed)))[1:20] # Estimated random intercept for each k cluster
# plot the random effects
plot_model(conf_mod_mixed, type = "re") + theme_classic()
plot_model(mod_mixed, type = "re") + theme_classic()
# Run a model with standardized betas
data_std <- dat %>% mutate(across(c(x1, x2, y), scale))
mod_mixed_std <- update(mod_mixed, data = data_std)
round(fixef(mod_mixed_std),2)
standardize_parameters(mod_mixed) # this also works, though it may fail with multiple outcome variables
```
Below we have two models produced from the same dataset. These models are essentially the same; the biggest difference is in the random effects. We simulated the data so that the random effect for the level-2 cluster (the random intercepts) would have a standard deviation of 1. However, for the model with the confounder unaddressed, we see that this value is a lot larger, closer to 5. This tells us there are sizable deviations in mean anxiety across clusters. In other words, if we averaged the anxiety of the patients within each therapist, those averages would vary considerably from one another. That would suggest the therapist is somehow influencing the anxiety of their patients. This is kind of true, but what is really going on is the confound of price: some therapists only have patients from high-SES backgrounds, others only have patients from low-SES backgrounds, and SES has a direct effect on anxiety, so that is what gets captured as unexplained variance at the level-2 cluster. Once this is identified and included in the model as a predictor, as in the second model, the standard deviation of the level-2 cluster drops back to a value close to 1.
*(Figures: Model with Confounder | Model with Confounder Addressed)*
Using the `plot_model()` function from the 'sjPlot' package on both models helps us see just how large the k-cluster effect is in the model with the unaddressed confounder. Once it is addressed through the fixed effect of price, this influence gets almost completely explained away. Additionally, the ICC for the model with the unaddressed confounder unsurprisingly increased to 0.146, which is a big jump.
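To quantify this side by side, one can pull the ICC and the intercept standard deviation from each model directly (a quick sketch using the functions already loaded above):

```r
# Side-by-side look at the cluster-level variance with and without price
performance::icc(conf_mod_mixed) # confounder unaddressed: ICC jumps
performance::icc(mod_mixed)      # price modeled: ICC near zero again
VarCorr(conf_mod_mixed)          # intercept SD inflated (~5)
VarCorr(mod_mixed)               # intercept SD back near the simulated 1
```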
*(Figures: Random Effects (Confounder) | Random Effects (Confounder Addressed))*
Now we have another interesting example. We find out later that the screenings were done by therapists in group settings! That is, every patient we have data for was screened together with the other patients of that therapist. This creates an interesting dynamic, because now the outcome, anxiety, could be influenced not only by the level-1 predictor depression, but also by the depression level of the group seen by each therapist. For example, imagine someone with very little depression; we would expect them to also have very little anxiety. But if they are in a group where everyone else is highly depressed, the group could drag this person's mood down, shifting their anxiety despite their own low depression. We can test for this by creating a level-2 predictor from the level-1 predictor- a predictor that corresponds to the group effect of depression on anxiety. Again, introducing this predictor makes sense given the context of how the data were collected.
```r
# Load required packages
library(tidyverse)
library(psych)
library(lme4)
library(lmerTest) # For p-values
library(performance)
library(ggplot2)
library(sjPlot) # plot_model
library(parameters) # standardize_parameters
library(kableExtra)
# Set seed for reproducibility
set.seed(123)
# Set population parameters
b0 = 45
b1 = 3
b2 = 10
b3 = 12 # level-2 effect of x1_mean (between-cluster)
tau = 1
sig = 8
# Set sample size
k = 100 # cluster group (therapists)
n = 50 # sample within cluster (clients)
N = k * n # total row length
# Generate IDs
k_id <- rep(1:k, each = n)
N_id <- 1:N
# Fixed effects predictors
x1 <- rnorm(N, mean = 5, sd = 2) # Depression level
x2 <- rbinom(N, 1, 0.5) # Sex
# Create a dataframe
dat <- data.frame(
k_id,
N_id,
x1,
x2,
x1_mean = NA,
k_effect = NA,
y = NA
)
# Generate a level-2 predictor from mean of x1
x1_mean_df <- dat %>% group_by(k_id) %>% summarise(x1_mean = mean(x1))
dat$x1_mean <- rep(x1_mean_df$x1_mean, each = n)
# Random intercept for cluster group
dat$k_effect <- rep(rnorm(k, 0, sd = tau), each = n)
# Generate the outcome
dat <- dat %>%
mutate(y = b0 + b1 * x1 + b2 * x2 + b3 * x1_mean + k_effect + rnorm(N, 0, sd = sig))
# Fitting an empty and regular model
mod_empty <- lmer(y ~ 1 + (1 | k_id), data = dat)
miss_mod_mixed <- lmer(y ~ x1 + x2 + (1 | k_id), data = dat)
mod_mixed <- update(miss_mod_mixed, . ~ . + x1_mean)
# Print out model summary
summary(miss_mod_mixed)
summary(mod_mixed)
# View standardized beta coefficients
data_std <- dat %>% mutate(across(c(x1, x2, x1_mean, y), scale))
mod_mixed_std <- update(mod_mixed, data = data_std)
round(fixef(mod_mixed_std),2)
standardize_parameters(mod_mixed)
```
The descriptives showcase the addition of the new predictor, which, again, was created from a predictor already present in the data. This is a level-2 predictor, meaning it only varies between clusters; we can confirm this because its range is very small, as expected when averaging within clusters. It also has some predictive value, since it correlates with the outcome 'y'. The question now is what happens when we fail to include this predictor in the model.
*(Figures: Descriptives of the data | Correlations of Predictors and Outcome)*
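A related point worth knowing (not part of the original simulation): because x1 now carries both within-cluster and between-cluster information, a common practice is to group-mean center the level-1 predictor so the two effects are estimated separately. A minimal sketch, assuming the `dat` from this example; `dat_c` and `mod_wb` are hypothetical names:

```r
# Group-mean center depression to separate within- and between-cluster effects
dat_c <- dat %>%
  group_by(k_id) %>%
  mutate(x1_cwc = x1 - mean(x1)) %>% # deviation from the therapist's mean (level-1 part)
  ungroup()

# x1_cwc estimates the within-therapist slope; x1_mean the between-therapist slope
mod_wb <- lmer(y ~ x1_cwc + x1_mean + x2 + (1 | k_id), data = dat_c)
fixef(mod_wb)
```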
Just like in the previous example, the fixed effects of the two models are essentially identical. What differs is the variance of the group-level (k) random effects. The model that failed to include the level-2 predictor has a much larger standard deviation, 3.437, while the model that did include it has a standard deviation back around 1. This shows that in our data there was a clear group-level effect of the predictor influencing our outcome.
*(Figures: Model Without Mean Predictor | Model With Mean Predictor)*
Since we introduced a new predictor, it is not immediately clear how impactful it is compared to the other predictors in the model. Although its unstandardized beta coefficient of 12 is much larger than the 3 and 10 for x1 and x2, this does not necessarily mean it is stronger- mainly because the spread of this level-2 predictor is much smaller than that of x1. So we can calculate standardized beta coefficients, either by scaling all the continuous variables in the dataset and updating the model with it, or by using the `standardize_parameters()` function.
Thus, according to the table below, x1_mean did have a substantial impact on the outcome, but it is still weaker than x1 and x2 (assuming this comparison is appropriate, which is debatable).
*(Figure: Standardized Beta Estimates)*
Now, this dataset was pretty tricky to generate, but we finally figured it out. Random slopes: this is a concept where we take a continuous predictor or covariate and allow the model to estimate the slope between that predictor/covariate and the outcome separately for each cluster. The model then produces an estimate of the slope variance/standard deviation, which tells you whether the relationship between the predictor and the outcome is affected by cluster. This is useful because, if that is the case, we can look for predictors that influence the slope- similar to a moderator analysis in a meta-analysis.
Anyway, for our purposes, we will introduce random slopes into the data by adding to the slope of a simulated predictor a random effect we labeled `tau_slope`, which varies by cluster. Additionally, we have also introduced non-randomly varying slopes, meaning slopes that change due to fixed effects- i.e., the slope is modified by the levels of a specific factor.
Okay, now to contextualize this information. Let's say we have data from patients who went to the clinic because they were having a panic attack right before arriving. The therapists were told this, and most of them provided some form of therapy to help the anxiety, while some did not (because they were inexperienced or could not be bothered). The therapy targets the patient's anxiety, with CBT having the strongest effect, mindfulness a decent effect, and no intervention no effect. After the session, the subject's anxiety and depression scores are collected. Here is the catch: the intervention reduces the patient's anxiety (temporarily, enough to show in the measure) but not their depression. Essentially, the relationship between the two has weakened, because an anxious person who is temporarily calmed still has their baseline depressive symptoms; this produces a weakened slope compared to someone without any treatment, who is still both anxious and depressed. We will see all of this unfold below.
To give a continuous predictor a random slope in a mixed model, introduce it inside the `( | )` term, on the left side of the bar- it's that easy.
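For reference, here is a sketch of the common random-effects specifications, using a toy dataset with made-up names (`d`, `g`, `x`):

```r
library(lme4)
set.seed(1)
# Toy data: 10 clusters of 20 observations with true random intercepts and slopes
d <- data.frame(g = factor(rep(1:10, each = 20)), x = rnorm(200))
d$y <- 2 + (0.5 + rep(rnorm(10, 0, 0.3), each = 20)) * d$x +
  rep(rnorm(10, 0, 1), each = 20) + rnorm(200)

m1 <- lmer(y ~ x + (1 | g), data = d)      # random intercept only
m2 <- lmer(y ~ x + (1 + x | g), data = d)  # correlated random intercept and slope for x
m3 <- lmer(y ~ x + (1 + x || g), data = d) # uncorrelated intercept and slope (double bar)
```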
```r
# Load required packages
library(tidyverse)
library(psych)
library(lme4)
library(lmerTest) # For p-values
library(performance)
library(ggplot2)
library(sjPlot) # plot_model
library(parameters) # standardize_parameters
library(kableExtra)
# Set seed for reproducibility
set.seed(123)
# Set population parameters
b0 = 45
b1 = 3
b2 = 10
b3_int = 4 # Interaction effect
tau = 2.5
tau_slope = 2
sig = 8
# Set sample size
k = 100 # cluster group (therapists)
n = 50 # sample within cluster (clients)
N = k * n # total row length
# Generate IDs
k_id <- rep(1:k, each = n)
N_id <- 1:N
# Fixed effects predictors
x1_c <- rnorm(N, mean = 0, sd = 2) # Depression level
x2 <- rbinom(N, 1, 0.5) # Sex
# An enhancer for random slopes (intervention variable)
interv <- rep(sample(c(0, 1, 2), k, replace = TRUE, prob = c(1/3, 1/3, 1/3)), each = n)
interv_label <- case_when(interv == 0 ~ "No Intv", interv == 1 ~ "Mindfulness", interv == 2 ~ "CBT")
interv_label <- factor(interv_label)
contrasts(interv_label) <- matrix(c(1,0,0,0,1,0), ncol = 2) # dummy codes: col 1 = CBT, col 2 = Mindfulness, 'No Intv' = reference
# Random effects for cluster group
k_effect <- rep(rnorm(k, 0, sd = tau), each = n)
k_slope <- rep(rnorm(k, 0, sd = tau_slope), each = n) + b3_int * interv
# Generate the outcome
y <- b0 + (b1 + k_slope) * x1_c + b2 * x2 + k_effect + rnorm(N, 0, sd = sig)
# Create a dataframe
dat <- data.frame(
k_id,
N_id,
x1_c,
x2,
k_effect,
interv,
interv_label,
y
)
# Fitting an empty and regular model
mod_empty <- lmer(y ~ 1 + (1 | k_id), data = dat)
mod_pred <- lmer(y ~ x1_c + x2 + (1 | k_id), data = dat)
mod_pred_ran <- lmer(y ~ x1_c + x2 + (1 + x1_c| k_id), data = dat,
control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)))
mod_pred_ran2 <- update(mod_pred_ran, . ~ . + interv_label)
mod_pred_ran3 <- update(mod_pred_ran2, . ~ . + x1_c:interv_label)
# model comparison
anova(mod_empty, mod_pred, mod_pred_ran, mod_pred_ran2, mod_pred_ran3)
# Extract information
performance::icc(mod_empty)
# Extracting random effects
VarCorr(mod_pred)
VarCorr(mod_pred_ran)
VarCorr(mod_pred_ran3)
# Extracting fixed effects
round(fixef(mod_pred),2)
round(fixef(mod_pred_ran),2)
round(fixef(mod_pred_ran3),2)
# Obtain an R^2 for each test
r2(mod_empty)
r2(mod_pred)
r2(mod_pred_ran)
r2(mod_pred_ran2)
r2(mod_pred_ran3)
# Omnibus test
car::Anova(mod_pred_ran3, type = "III") # type-III omnibus tests (requires the 'car' package)
# Standardized betas
standardize_parameters(mod_pred_ran3)
```
Okay, several different models were created; here is what they do (they build on each other):

- Model 1: empty model
- Model 2: adds the two main fixed effects
- Model 3: lets one of those fixed effects ('x1_c') have a random slope
- Model 4: adds a third fixed effect (the intervention; not meaningful on its own)
- Model 5: lets the slope of 'x1_c' differ across the levels of the third fixed effect (factor)
We can see from the model comparison that Model 5 explains the data best- as it should, since we simulated the data to be best explained by this model.
Next we can look at the random effects and fixed effects from the most notable models. Let's first review that we programmed the following: b0 = 45, b1 = 3, b2 = 10, b3_int = 4, tau = 2.5, tau_slope = 2, sig = 8.
All three models recaptured the fixed effects without issue- which is a good thing. For the random effects, however, the model without a random slope dumps the slope variability into the level-1 residual standard deviation, which is wrong. In the second model we now get a standard deviation for the slopes, but it is overestimated, since the systematic effect of the intervention on the slope is not yet modeled. Only in model 3, once the interaction between the levels of the factor and the slope is introduced, do the random effects of the model match the simulated data. As a bonus, increasing the number of clusters from 100 to 10000 reproduces essentially perfect fixed and random effects; any minor deviations are just sampling error.
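If you want a direct significance test of the random slope itself, 'lmerTest' provides `ranova()`, which performs likelihood-ratio tests on each random-effect term (a quick sketch using the models above):

```r
# Likelihood-ratio tests of the random-effect terms (lmerTest)
ranova(mod_pred_ran3) # drops the random slope term and compares model fits
```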
*(Figures: Model Comparison | Model Random and Fixed Effects)*