Demo 2 - slzhang-fd/mvreprobit GitHub Wiki

Four-process 3-level (3 random effects) probit model, with additional time-varying level 3 random effect

In this simulation, we fit a four-variate, three-level model, which is an extension of the standard 3-level hierarchical model to include a time-varying level 3 random effect, giving three random effects in total. The mvreprobit_3re_Gibbs() function in mvreprobit package is used.

Similar to demo 1, we index the multivariate responses by $r$, time points by $t$, individuals (level 2 units) by $i$, and couples (level 3 units) by $j$. We specify an extended multivariate three random effects probit model as follows

$$ \boldsymbol y_{tij}^* = \mathbf B \boldsymbol x_{tij} + \boldsymbol u_{ij} + \boldsymbol v_j + \boldsymbol w_{tj} + \boldsymbol e_{tij}, $$

where underlying response variables $\boldsymbol y_{tij}^* = (y_{1tij}^* , \ldots , y_{Jtij}^*)^\top $, $\mathbf B = (\boldsymbol \beta_1,\ldots,\boldsymbol \beta_J)^\top$ are coefficient parameters. $\boldsymbol u_{ij}\sim N(0,\Sigma_u)$ is a vector of level 2 (individual) random effects, $\boldsymbol v_j\sim N(0,\Sigma_v)$ is a vector of level 3 (couple) random effects, $\boldsymbol w_{jt}\sim N(0, \Sigma_w)$ is a vector of time-varying couple-level random effects, and $\boldsymbol e_{ti}\sim N(0,\Sigma_e)$ is a vector of time-varying error term with the diagonal elements of $\Sigma_e$ are set to one for identification. All random effect vectors are $J$ dimensional.

The model above is the same as the model (2) in the paper

Steele, F., Zhang, S., Grundy, E., and Burchardt, T. (2022). Longitudinal analysis of exchanges of support between parents and children in the UK.

#####################################################
## 2. Four-variate, 3-level (3 multilevel group
##    random effects) multilevel random intercept model
##
## Simulate data from the model
library(truncnorm)
library(mvtnorm)
library(mvreprobit)

rm(list = ls())
# 4 response variables
K <- 4
# J: Number of level 3 units (couples)
J <- 1000
# N: Number of level 2 units (individuals)
# 2 per level 3 unit (individuals per couple)
N <- 2000
# NN: Number of observations
# 4 time points for each individual
NN <- 8000
# p: number of explanatory variables (including constant)
p <- 5

## set true parameters
set.seed(123)
coeffs <- matrix(rnorm(K*p),p,K)
x_covs <- cbind(rep(1,NN),matrix(rnorm(NN*(p-1)),NN,(p-1)))

# Residual covariance matrix
Sigma_e <- matrix(0.1, K, K)
diag(Sigma_e) <- 1
B_e <- t(chol(Sigma_e))

# Level 2 random effects covariance matrix
Sigma_u <- matrix(0.2, K,K)
diag(Sigma_u) <- runif(K, 1,2)
B_u <- t(chol(Sigma_u))

# Level 3 random effects covariance matrix
Sigma_v <- matrix(0.15, K,K)
diag(Sigma_v) <- runif(K, 1,2)
B_v <- t(chol(Sigma_v))

# Time-varying level 3 random effects covariance matrix
Sigma_w <- matrix(0.15, K, K)
diag(Sigma_w) <- runif(K, 1,2)
B_w <- t(chol(Sigma_w))

# Level 3 identifier (couple)
j_ind <- rep(1:J, each=8)
# Level 2 identifier (individual)
i_ind <- rep(1:N, each=4)
# Level 1 identifier (time)
t_ind <- rep(1:4, N)
# time-varying level 3 identifier (j X t)
jt_ind <- as.numeric(as.factor(paste(j_ind,t_ind)))

## generate random variables
U_all <- rmvnorm(N, sigma = Sigma_u)
V_all <- rmvnorm(J, sigma = Sigma_v)
W_all <- rmvnorm(max(jt_ind), sigma = Sigma_w)
Y_star <- x_covs %*% coeffs +
  U_all[i_ind,] + V_all[j_ind,] + W_all[jt_ind,] +
  rmvnorm(NN, sigma = Sigma_e)
Y <- 1 * (Y_star>0)

## Fit the 4-process 3 random effect probit model
## Run for 10,000 iterations (including burn-in)
##
## Note: for running K-process model (K>1), the MH step
## size cor_step_size needs to be specified.
## Please choose a large step size and adjust the rejection rate
## output between 0.7~0.9 when running the program (output every 100 steps)
res_3re_4process <-
  mvreprobit_3re_Gibbs(Y, x_covs, j_ind=j_ind, i_ind=i_ind, jt_ind = jt_ind,
                       max_steps = 10000, cor_step_size = 0.06)

#####################################################
## Plot Gibbs posterior samples credible interval
## versus simulation true
##
## Note that for this more complex model, the parameter
## estimates are further from the true values than for
## the simple 2-level case.
## The difference is smaller if the sample size is increased.

res_compare_plot(res_3re_4process$coeffs_all, coeffs, burn_in = 1000)

## Obtain estimated posterior means and credible intervals
## (based on sample size of 4,000 as burn-in is 1,000)
## Compare each with true values used in the simulation
## (in coeffs, Sigma_e, Sigma_u, Sigma_v, and Sigma_w)
##
## SDs and 95% credible intervals are also stored in res_summary_3re
res_summary_3re <- res_summary_3re_Gibbs(res_3re_4process, x_covs, burnin = 1000)

round(coeffs, 3)
round(res_summary_3re$coeffs, 3)

round(Sigma_e, 3)
round(res_summary_3re$Sigma_e, 3)

round(Sigma_u, 3)
round(res_summary_3re$Sigma_u, 3)

round(Sigma_v, 3)
round(res_summary_3re$Sigma_v, 3)

round(Sigma_w, 3)
round(res_summary_3re$Sigma_w, 3)

Results after 5,000 iterations of Gibbs sampling (first 1,000 as burnin)

Time spent: 2 min 31 secs

## Posterior means are used as estimated values.

coefficients:
True:
       [,1]   [,2]   [,3]   [,4]
[1,] -0.560  1.715  1.224  1.787
[2,] -0.230  0.461  0.360  0.498
[3,]  1.559 -1.265  0.401 -1.967
[4,]  0.071 -0.687  0.111  0.701
[5,]  0.129 -0.446 -0.556 -0.473
Estimated:
         Y1     Y2     Y3     Y4
[1,] -0.462  1.621  1.185  1.580
[2,] -0.246  0.397  0.294  0.424
[3,]  1.482 -1.122  0.350 -1.704
[4,]  0.039 -0.585  0.113  0.597
[5,]  0.166 -0.391 -0.564 -0.373

Sigma.e (covariance matrix for e_ti, diagonals are set to one for identifiability)
True:
     [,1] [,2] [,3] [,4]
[1,]  1.0  0.1  0.1  0.1
[2,]  0.1  1.0  0.1  0.1
[3,]  0.1  0.1  1.0  0.1
[4,]  0.1  0.1  0.1  1.0
Estimated:
      Y1     Y2    Y3     Y4
Y1 1.000  0.023 0.012  0.078
Y2 0.023  1.000 0.106 -0.148
Y3 0.012  0.106 1.000  0.043
Y4 0.078 -0.148 0.043  1.000

Sigma.u (covariance matrix for u_ij, individual-specific random effect)
True:
      [,1]  [,2]  [,3]  [,4]
[1,] 1.329 0.200 0.200 0.200
[2,] 0.200 1.087 0.200 0.200
[3,] 0.200 0.200 1.857 0.200
[4,] 0.200 0.200 0.200 1.122
Estimated:
      Y1    Y2    Y3    Y4
Y1 1.161 0.211 0.345 0.129
Y2 0.211 0.820 0.187 0.248
Y3 0.345 0.187 1.702 0.268
Y4 0.129 0.248 0.268 0.813

Sigma.v (covariance matrix for v_j, time-invariant couple-level random effect)
True:
      [,1]  [,2]  [,3]  [,4]
[1,] 1.236 0.150 0.150 0.150
[2,] 0.150 1.662 0.150 0.150
[3,] 0.150 0.150 1.781 0.150
[4,] 0.150 0.150 0.150 1.679
Estimated:
      Y1    Y2     Y3     Y4
Y1 0.932 0.057  0.012  0.012
Y2 0.057 1.334  0.057  0.169
Y3 0.012 0.057  1.750 -0.020
Y4 0.012 0.169 -0.020  1.456

Sigma.w (covariance matrix for w_jt, time-varying couple-level random effect)
True:
     [,1]  [,2]  [,3]  [,4]
[1,] 1.80 0.150 0.150 0.150
[2,] 0.15 1.829 0.150 0.150
[3,] 0.15 0.150 1.293 0.150
[4,] 0.15 0.150 0.150 1.138
Estimated:
      Y1    Y2    Y3    Y4
Y1 1.610 0.152 0.312 0.185
Y2 0.152 1.508 0.007 0.331
Y3 0.312 0.007 1.306 0.138
Y4 0.185 0.331 0.138 0.732

Plot Gibbs posterior samples credible intervals against simulation true values

coefficients