1. Four processes one factor probit model - slzhang-fd/mvreprobit Wiki

Demo 1

In this simulation, we fit a four-variate, two levels (1 group random effect factor) probit model to the simulated data using the mvreprobit_1factor_Gibbs() function in mvreprobit package.

For each binary response $y_{rti}$, there exists an underlying continuous variable $y_{rti}^*$ that

$$ y_{rti}=1 \Leftrightarrow y_{rti}^* > 0, $$

where $r=1,\ldots,J,t=1,\ldots,T_i,i=1,\ldots,n$. The underlying variables $\boldsymbol y_{ti}^* = (y_{1ti},\ldots,y_{Jti})^\top$ satisfy

$$ \boldsymbol y_{it}^* =\mathbf B \boldsymbol{x_{ti}} + \boldsymbol u_i + \boldsymbol e_{ti}, $$

where the $\mathbf{B}=(\beta_1,\ldots,\beta_J)$ are coefficient parameters, $\boldsymbol u_i\sim N(0,\Sigma_u)$ is individual random effect, $\boldsymbol e_{ti}\sim N(0,\Sigma_e)$ is time-varying error term. For identifiability, the diagonal elements of $\Sigma_e$ are set to one.

The model above is the same as the model (1) in the paper

Steele, F., Zhang, S., Grundy, E., and Burchardt, T. (2022). Longitudinal analysis of exchanges of support between parents and children in the UK.

#####################################################
## 1. Four-variate, two levels (1 group random 
##    effect factor) probit model
##
## load required packages
library(truncnorm)
library(mvtnorm)
library(mvreprobit)

## function for comparing estimated parameters versus the true ones
compare_func <- function(refit_res, true_val, burn_in) {
  refit_res <- refit_res[(burn_in + 1):nrow(refit_res), ]
  yl <- apply(refit_res, 2, quantile, 0.025)
  yu <- apply(refit_res, 2, quantile, 0.975)
  x <- seq_len(ncol(refit_res))
  plot(x, true_val, ylim = range(c(yl, yu)), pch = 19, cex = .3, xaxt = "n")
  axis(1, at = x)
  arrows(x, yl, x, yu, angle = 90, code = 3, length = .02)
}


## Simulate data
rm(list = ls())
# 4 variates
K <- 4
# N: number of individuals
N <- 1000
# 4 time points for each individual
NN <- 4000
# p: number of coefficients
p <- 5

## set true parameters
# mu <- runif(K, -1, 1)
set.seed(123)
coeffs <- matrix(rnorm(K*p),p,K)
x_covs <- cbind(rep(1,NN),matrix(rnorm(NN*(p-1)),NN,(p-1)))

## Correlation matrix for the error
Sigma_e <- matrix(0.2, K, K)
diag(Sigma_e) <- 1
B_e <- t(chol(Sigma_e))

## Covariance matrix for the random effect u
Sigma_u <- matrix(0.3, K,K)
diag(Sigma_u) <- runif(K, 1,2)
B_u <- t(chol(Sigma_u))

## indices for individuals/groups
i_ind <- rep(1:N, each=4)

## generate random variables
U_all <- rmvnorm(N, sigma = B_u %*% t(B_u))
Y_star <- x_covs %*% coeffs + U_all[i_ind,] + rmvnorm(NN, sigma = Sigma_e)
Y <- 1 * (Y_star>0)

## Fit 4-process 1-factor model
res_1fac_4process <-
  mvreprobit_1factor_Gibbs(Y, x_covs=x_covs, i_ind=i_ind,
                           max_steps = 5000, cor_step_size = 0.06)

#####################################################
## Plot Gibbs posterior samples credible interval
## versus simulation true
compare_func(res_1fac_4process$coeffs_all, coeffs, burn_in = 1000)
compare_func(res_1fac_4process$Sigma_e_all, Sigma_e, burn_in = 1000)
compare_func(res_1fac_4process$Sigma_u_all, Sigma_u, burn_in = 1000)

## Check estimated posterior means and credible intervals
res_summary_1fac_Gibbs(res_1fac_4process, x_covs, burnin = 1000)

Results after 5,000 iterations of Gibbs sampling (first 1,000 as burnin)

Time spent: 1 min 8 sec

## Posterior means are used as estimated values.

coefficients:
True:
       [,1]   [,2]   [,3]   [,4]
[1,] -0.560  1.715  1.224  1.787
[2,] -0.230  0.461  0.360  0.498
[3,]  1.559 -1.265  0.401 -1.967
[4,]  0.071 -0.687  0.111  0.701
[5,]  0.129 -0.446 -0.556 -0.473
Estimated:
       [,1]   [,2]   [,3]   [,4]
[1,] -0.607  1.678  1.119  1.782
[2,] -0.202  0.387  0.333  0.534
[3,]  1.506 -1.316  0.418 -2.099
[4,]  0.015 -0.697  0.141  0.739
[5,]  0.176 -0.470 -0.521 -0.492

Sigma.e (covariance matrix for e_ti, diagnal one for identifiability)
True:
     [,1] [,2] [,3] [,4]
[1,]  1.0  0.2  0.2  0.2
[2,]  0.2  1.0  0.2  0.2
[3,]  0.2  0.2  1.0  0.2
[4,]  0.2  0.2  0.2  1.0
Estimated
     [,1] [,2] [,3] [,4]
[1,] 1.00 0.31 0.23 0.19
[2,] 0.31 1.00 0.24 0.11
[3,] 0.23 0.24 1.00 0.19
[4,] 0.19 0.11 0.19 1.00

Sigma.u (covariance matrix for u_ij, individual level random effect)
True:
     [,1] [,2] [,3] [,4]
[1,] 1.72 0.30 0.30 0.30
[2,] 0.30 1.24 0.30 0.30
[3,] 0.30 0.30 1.17 0.30
[4,] 0.30 0.30 0.30 1.78
Estimated
     [,1] [,2] [,3] [,4]
[1,] 1.63 0.17 0.33 0.34
[2,] 0.17 1.18 0.29 0.34
[3,] 0.33 0.29 1.07 0.47
[4,] 0.34 0.34 0.47 2.26

Plot Gibbs posterior credible intervals against simulation true values

coeffcients