Linear Mixed Effects Models (ERPs Example)
Overview
To explain some concepts and code in R that are used for linear mixed effects model analysis, using ERP data from a behavioral task as the example.
Disclaimer
The following document assumes that the assumptions for running a linear mixed effects model have been met.
Packages
- The `lme4` package provides the `lmer()` function. This is the function that runs the linear mixed effects model.
- The `lmerTest` package provides p-values for the model created using `lmer()`.
- The `car` package allows you to use the `Anova()` function, which is a substitute for the `summary()` function that is otherwise called on the object that saves the model.
- The `emmeans` package provides the `emmeans()` function, which lets you run pairwise comparisons.
```r
library(lme4)      # lmer() for fitting the model
library(lmerTest)  # p-values for lmer() models
library(car)       # Anova() for omnibus tests
library(emmeans)   # emmeans() for pairwise comparisons
```
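If any of these packages fail to load, they can be installed from CRAN first:

```r
# One-time install of the packages used on this page
install.packages(c("lme4", "lmerTest", "car", "emmeans"))
```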
Example data
We will be using the GnG behavioral data for the Pilot Analysis as an example. For this dataset, four variables matter:
- ID: The ID of the participant
- Trial.Type: Whether the trial was a Go or a No-Go trial
- Day.Type: Which testing day it occurred on
- Power: The dependent variable, a grand average of power between 300 and 600 milliseconds (ms)
The goal of this analysis is to:
1. Identify whether there is a difference in brain activity between Go and No-Go trials.
2. Identify whether the ERP difference between Go and No-Go trials (if present) is different across the testing days.
To be more specific, we want to know if there is a P300 effect, which I understand to mean that the average amplitude of electrical activity between 300-600 ms in No-Go trials is different from the average amplitude in Go trials within that same time frame. Specifically, we need No-Go brain activity to be more positive than Go brain activity. This will establish an effect of the P300.
The other thing we need to consider is whether this P300 effect is present across all stress conditions, present in only some of them, or present in all of them but with a much larger difference (i.e., a larger P300 effect) on some days than others.
The purpose of multilevel modeling
This is explained in detail in Linear Mixed Effects Models (Behavior Example). To summarize, we are using linear mixed effects models/multilevel modeling to control for the repeated-measures aspects of this data. Participants who score well in one day condition are likely to score well in the others, and vice versa. Having ID in the model as the random effect `(1|ID)` will control for this.
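As a sketch of what `(1|ID)` does (the notation here is mine, not from the original analysis), each participant $i$ receives their own intercept shift $u_{0i}$ on top of the fixed effects:

$$\text{Power}_{ij} = \beta_0 + u_{0i} + (\text{fixed effects})_{ij} + \varepsilon_{ij}, \qquad u_{0i} \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$$

The $u_{0i}$ term absorbs stable participant-to-participant differences, so a participant who tends to score high contributes that tendency once instead of biasing every condition comparison.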
Data set up
Our data have been correctly set up for analysis. Since we are interested in understanding the P300 effect across testing days, we need to include trial type (Go, No-Go) and study day (Baseline, Exercise, Sleep Deprivation) in the model, which results in 6 rows per participant (3 days × 2 trial types), as illustrated below.
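As a purely hypothetical illustration (the variable names match this page, but the values are made up), one participant's six rows in long format might look like this:

```r
# Hypothetical long-format rows for one participant:
# 3 testing days x 2 trial types = 6 rows
example.long <- data.frame(
  ID         = rep("P01", 6),
  Day.Type   = rep(c("Baseline", "Exercise", "Sleep Deprivation"), each = 2),
  Trial.Type = rep(c("Go", "NoGo"), times = 3),
  Power      = c(-0.8, 1.2, -0.5, 1.0, -0.6, 1.1)  # made-up power values
)
example.long
```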
Running a linear mixed-effects model
Because our dataset is in the correct format to run the model, we can go ahead and do so with the following code. We run the model using the `lmer()` function and set `ID` as the variable that has random effects, `(1|ID)`. The `Day.Type * Trial.Type` portion tells the model we are looking for main effects of both `Day.Type` and `Trial.Type` and also for their interaction. In other words: are any of these three predictors significant at predicting the outcome variable?
```r
# select() comes from the dplyr package
library(dplyr)

# Select variables of interest
Pilot.Sample.GnG.ERPs4.long <- select(Pilot.Sample.GnG.ERPs4.long, ID, Day.Type, Trial.Type, Power)

# Run a linear mixed effects model predicting power from testing day,
# trial type, and their interaction, with a random intercept per participant
P3.differences2.model <- lmer(Power ~ Day.Type * Trial.Type + (1|ID), Pilot.Sample.GnG.ERPs4.long)
```
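Since `lmerTest` is loaded, the fitted object also supports coefficient-level p-values through `summary()` (computed with Satterthwaite's degrees-of-freedom method by default):

```r
# Coefficient-level estimates with lmerTest p-values
summary(P3.differences2.model)
```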
Checking the omnibus output
In stats class, we learned that an omnibus test is a powerful test that tells you whether any of a predictor's levels lead to outcomes that are significantly different from each other. The test does not tell you where the difference lies; if one is present, it is followed up with post-hoc tests. To run the omnibus test for a linear mixed effects model, we can use the `Anova()` function from the `car` package. Notice we are using the argument `type = "III"`; this is because we want to obtain significance from type III sums of squares. According to ChatGPT, type III sums of squares are used when we are more interested in an interaction, and type II sums of squares are used when we are interested in main effects.
```r
# Run the type III omnibus test on the model
Anova(P3.differences2.model, type = "III")
```
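One caveat worth flagging (general advice for `car::Anova()`, not something verified against this dataset): type III tests are only meaningful when factors use sum-to-zero coding, so it is common to set the contrasts before fitting:

```r
# Sum-to-zero contrasts are commonly recommended before requesting type III tests
options(contrasts = c("contr.sum", "contr.poly"))
# Refit the model and rerun the omnibus test under this coding
P3.differences2.model <- lmer(Power ~ Day.Type * Trial.Type + (1|ID), Pilot.Sample.GnG.ERPs4.long)
Anova(P3.differences2.model, type = "III")
```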
The outcome shows we have:
- No main effect of `Day.Type`
- A main effect of `Trial.Type`
- No interaction effect
Interpretation: These results show that
1. average power across day types does not differ significantly, which by itself is not meaningful information;
2. average power is significantly different when compared across trial types; and
3. the average power difference between Go and No-Go trials does not change across day types. Combined with point 2, this means the Go/No-Go difference is consistent: either all days show a correct P300 or all days show an inverted one.
Running follow-up tests (emmeans)
The `emmeans()` function from the `emmeans` package computes the estimated marginal means (EMMs) for each specified factor. EMMs are also known as least-squares means. According to ChatGPT, these calculated EMMs are basically group means (in contrast to the grand mean) that are compared to each other; however, they are adjusted prior to comparison to isolate the effect of the factor(s) of interest. Therefore, if the outcome group means created by the levels of a factor are statistically different from each other, this function will capture it.
From the results of the omnibus test, we know that we can appropriately run the `emmeans()` function on the group means created by `Trial.Type`, but not on `Day.Type` or on its interaction with `Trial.Type`. We use the argument `pairwise` to make pairwise comparisons. This means exactly what it sounds like: we take all of the levels of a factor and pair them up (a comparison between one level and another) across all possible pairs. Since we are only running `emmeans()` on `Trial.Type`, which has just two levels, this is similar to running a t-test.
When we run this function with an interaction, we are basically comparing group means from all of the different group combinations that can be created by the two factors (3 × 2 = 6 subgroups). Each of these subgroups is paired with every other to look for differences across all possible pairs; with 6 subgroups this produces 6 choose 2 = 15 pairwise comparisons. We can use this function here to better understand exactly what is going on in our ERPs. Essentially, we expect that the pairwise comparisons between Go and No-Go trials will all be significant, but nothing more.
```r
# Get further insight with pairwise comparisons of estimated marginal means
emmeans(P3.differences2.model, pairwise ~ Day.Type)
emmeans(P3.differences2.model, pairwise ~ Trial.Type)
emmeans(P3.differences2.model, pairwise ~ Day.Type * Trial.Type)
```
First Output: As we would expect from the omnibus test, this output shows no difference in power across testing days. Alone, this information is not meaningful because it averages the power from Go and No-Go trials together into one value; whether or not that combined value changes across testing days, this main effect would be hard to interpret on its own.
Second Output: This is the most meaningful of the three outputs. Looking at the values in the `emmean` column, we see that No-Go trials lead to a positive value on average and Go trials to a negative one. The middle portion of the output shows these means were calculated after controlling for the effects of `Day.Type`. Lastly, the `p.value` column shows that the difference between Go and No-Go power is statistically significant.
Third Output: Initially I would say that the results from this table are meaningless because the interaction effect from the omnibus test was not significant; however, I think there is something valuable to gain from these pairwise comparisons. For starters, notice that we do have a lot of significant comparisons. The catch is that all of these comparisons are between Go and No-Go trials. I think the reason the omnibus test was not significant is that this Go vs. No-Go pattern is consistent across all of the testing days. If, for example, the P300 effect were absent on one of the days, meaning that Go and No-Go brain electrical activity were similar on that day, then that would be apparent here as an interaction.
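One detail worth knowing when reading that table of 15 comparisons: by default, `emmeans` adjusts the pairwise p-values with Tukey's method. If a different correction is wanted, the `adjust` argument can change it, for example:

```r
# The same pairwise comparisons with a Bonferroni correction instead of the default Tukey
emmeans(P3.differences2.model, pairwise ~ Day.Type * Trial.Type, adjust = "bonferroni")
```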
When looking at the `emmean` column, we can get an idea of what the difference between the ERPs would be on their respective testing days. Doing the math mentally, they seem pretty similar. However, to be more confident, we could run another linear mixed effects model on the difference between trial-type ERPs by day type, as sketched below.
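A minimal sketch of that follow-up, assuming the trial-type levels are coded `Go` and `NoGo` (the reshape step and the `P3.Diff` and `Diff.model` names are hypothetical):

```r
library(tidyr)  # for pivot_wider()
library(dplyr)  # for mutate() and the pipe

# Put Go and NoGo power side by side for each participant-day,
# then compute the per-day P300 difference (NoGo minus Go)
P3.diff.wide <- Pilot.Sample.GnG.ERPs4.long %>%
  pivot_wider(names_from = Trial.Type, values_from = Power) %>%
  mutate(P3.Diff = NoGo - Go)

# Model the difference across testing days with a random intercept per participant
Diff.model <- lmer(P3.Diff ~ Day.Type + (1|ID), data = P3.diff.wide)
Anova(Diff.model, type = "III")
```

If the interaction truly is absent, `Day.Type` should not come out as a significant predictor of `P3.Diff` here.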