Intro to Generalized Linear Mixed Effects Models (GLMM)
Overview
This page is going to get into the meat and bones of generalized linear mixed effects models (GLMMs). We will first review some information from Chapter 5: Introduction to Generalized Linear Mixed Models, but we will not be using its examples because they are not that good. Instead, the examples will come from Faraway, J. J. (2016). Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models.
Background
GLMMs are an extension of linear mixed models and are used to examine data that are not normally distributed and are clustered! Examples of this include binary and count data.
To 'generalize' the linear model into the generalized linear model (GLM), we introduce a link function for the mean of the distribution. It looks like this:
$$g(E(Y|X))= \eta = X\beta$$
Where:
- $E(Y|X)$: This represents the expected value (mean) of the response variable Y, conditional on the predictor variables X. Essentially, it's the model's prediction for Y given the values of X. When we say mean, think of the true mean of Y when X is a specific value!
- $g()$: The function g is the link function. It transforms the expected value $E(Y|X)$ so that the relationship with the linear predictor $X\beta$ becomes linear.
- $\eta$: This is called the linear predictor, which is the linear combination of the predictors X and their corresponding coefficients $\beta$. It represents the transformed mean of Y after applying the link function. Linear combination is jargon for matrix multiplication, or just math involving matrices.
(The original table illustrating an X matrix, a beta matrix, and their linear combination via matrix multiplication is not reproduced here; see the sketch below.)
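To make "linear combination" concrete, here is a small made-up example (the numbers are illustrative only) of multiplying an X matrix by a beta vector and then inverting a logit link to get back to the mean of Y:

```r
# Made-up design matrix X (intercept + one predictor) and coefficient vector beta
X    <- cbind(intercept = 1, x = c(0.5, 1.2, -0.3))  # 3 observations, 2 columns
beta <- c(-1, 2)                                      # beta_0 = -1, beta_1 = 2

eta <- X %*% beta          # linear predictor: matrix multiplication of X and beta
eta

# With a logit link, the mean E(Y|X) is recovered by applying the inverse link:
p <- 1 / (1 + exp(-eta))   # inverse-logit maps eta to probabilities in (0, 1)
p
```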
To write out the equation for a GLMM, all we need to do is introduce a random intercept (this is the simplest case; the equation varies with the complexity of the model).
$$g(E(Y|X,\alpha))= \eta + \alpha = X\beta + \alpha$$ $$\alpha \sim N(0, \sigma_{\alpha}^2)$$
Where:
- $E(Y|X, \alpha)$: The expected value of Y given the predictors X and an additional term $\alpha$, which represents the random intercepts.
- $g()$: The same as above, the link function.
- $\eta + \alpha$: This adds the random component to the linear combination (multiplication) of X and $\beta$. Their sum combines the fixed effects ($X\beta$) and the random effects ($\alpha$) into the linear predictor.
The random effect distribution is:
- $N()$: normally distributed
- has a mean of 0
- has a variance of $\sigma_{\alpha}^2$
Lastly, we can estimate the parameters of interest, like $\beta$ and $\sigma_{\alpha}^2$, using maximum-likelihood-based methods for GLMMs. This is complicated!! A minimal sketch of fitting such a model in R is shown below.
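The sketch below assumes the lme4 package and a hypothetical data frame `dat` with a binary outcome `y`, a predictor `x`, and a grouping factor `group` (these names are placeholders, not from the text):

```r
library(lme4)

# Random-intercept logistic GLMM: fixed effect for x, random intercept per group.
# glmer() estimates beta and the random-intercept variance by (approximate)
# maximum likelihood (the Laplace approximation by default).
m <- glmer(y ~ x + (1 | group), data = dat, family = binomial(link = "logit"))

summary(m)  # fixed effects (beta) and the random-intercept variance (sigma_alpha^2)
```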
Marginalized Log-Likelihood Function
To demonstrate the expression for this, we will start by using logistic regression as an example. Here is the notation for logistic regression:
$$\log\frac{p_{ij}}{1-p_{ij}} = X_{ij}\beta + \alpha_{j} \implies p_{ij} = \frac{e^{X_{ij}\beta + \alpha_{j}}}{1 + e^{X_{ij}\beta + \alpha_{j}}}$$
Where:
- $p_{ij}$: the probability that the outcome $Y_{ij} = 1$ for the i-th individual in the j-th group.
- $\log\frac{p_{ij}}{1-p_{ij}}$: the log-odds or logit of $p_{ij}$. This transforms the probability $p_{ij}$ (which ranges from 0 to 1) into a scale that ranges from $-\infty$ to $+\infty$.
- $X_{ij}$: A matrix representing the scores of the predictors for the i-th individual in the j-th group.
- $\beta$: A column vector of regression coefficients corresponding to the predictors in $X_{ij}$.
- $\alpha_{j}$: The group-specific random intercept for the j-th group.
Takeaway: The log-odds of $p_{ij}$ is modeled as a linear combination of the predictors ($X_{ij}$) weighted by their coefficients ($\beta$), plus a group-specific effect ($\alpha_{j}$) that captures unobserved heterogeneity between groups. The probability $p_{ij}$ is derived by applying the logistic function to this linear combination. This is not done automatically in R, but can be done as a follow-up using predict() with type = "response". The linear relationship in a model's output is described on the log-odds scale, because that is the scale on which the math is linear!
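Here is a minimal sketch of both routes from the log-odds scale to the probability scale, assuming a fitted random-intercept logistic model `m` (e.g., from glmer()) and a data frame `newdat` of predictor values; both object names are placeholders, not from the text above.

```r
# 'm' and 'newdat' are hypothetical; re.form = NA requests population-level
# (fixed-effects-only) predictions from a merMod fit.
eta <- predict(m, newdata = newdat, re.form = NA)                      # log-odds scale
p   <- predict(m, newdata = newdat, re.form = NA, type = "response")   # probability scale

# Equivalent by hand: apply the inverse-logit (logistic) function to the log-odds
p_manual <- exp(eta) / (1 + exp(eta))
all.equal(unname(p), unname(p_manual))
```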
Now we will write out the likelihood function for a hierarchical logistic regression model (or model with random effects).
$$L(\beta, \sigma_{\alpha}^2 \mid Y, X) = \prod_{j=1}^{J} \int \left[ \prod_{i=1}^{n_{j}} p_{ij}^{Y_{ij}} (1-p_{ij})^{1-Y_{ij}} \right] f(\alpha_{j}) \, d\alpha_{j}$$
Where:
- $L(\beta, \sigma_{\alpha}^2|Y,X)$: The likelihood function, which represents the likelihood of the model parameters $\beta$ and $\sigma_{\alpha}^2$ given the data Y and X. Essentially, the likelihood function is what we maximize to estimate the fixed effects and the random-effect variance from the observed data.
- The inner product over $i$ is the usual Bernoulli (logistic) likelihood for the observations in group j, conditional on that group's random intercept; the integral then averages over the $N(0, \sigma_{\alpha}^2)$ distribution of the random intercept, since $\alpha_{j}$ is not observed. Beyond that, the rest does not matter too much to know right now; the key takeaway is understanding the structure of this likelihood function.
Binary Response
Now we will switch to the Faraway (2016) textbook. This dataset comes from OzDASL, and it investigates balance (scored 1-4) as a function of surface (normal, foam) and vision (eyes open, eyes closed, dome over the head). Each participant completed every combination of experimental conditions twice, so there are 12 measures per subject. We will transform balance into a dichotomous outcome variable for this demonstration.
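A sketch of setting this up, assuming the data are available as `ctsib` in the faraway package with a 1-4 `CTSIB` score, `Surface`, `Vision`, and `Subject` columns (check the package documentation for the exact names):

```r
library(faraway)  # assumed source of the balance data
library(lme4)

data(ctsib)
# Dichotomize the 1-4 balance score: 1 = fully stable, 0 = otherwise
ctsib$stable <- ifelse(ctsib$CTSIB == 1, 1, 0)

# Random-intercept logistic GLMM: each subject contributes 12 repeated measures
mod <- glmer(stable ~ Surface + Vision + (1 | Subject),
             family = binomial, data = ctsib)
summary(mod)
```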