Example 19: Simulation with Varying Sample Size and Percent Missing

Model Description

This example shows how to set up a simulation study in which both the sample size and the percent missing completely at random (MCAR) vary across replications. The sample size takes every integer value from 50 to 500 (n=50:500 in the syntax below), and the percent missing completely at random is 0, 0.1, 0.2, 0.3, or 0.4. We will then find the combinations of sample size and percent missing completely at random for which the power of a given parameter equals 0.80, as well as the fit index cutoffs at given values of sample size and percent missing. The model in this example is a conditional growth curve model: the growth curve model from Example 3, with the intercept and slope factors predicted by a grouping variable. The grouping variable has two conditions with equal probability, so the mean and the variance of the binary variable are 0.5 and 0.25. The effects of the grouping variable on the intercept and slope are 0.5 and 0.1, respectively. We will find the power to detect these two effects.

Example 19 Model

Syntax

The factor loading object can be specified:

loading <- matrix(0, 5, 3)
loading[1,1] <- 1
loading[2:5,2] <- 1
loading[2:5,3] <- 0:3
loading.trivial <- matrix(0, 5, 3)
loading.trivial[3:4, 3] <- "runif(1, -0.1, 0.1)"
LY <- bind(loading, misspec=loading.trivial)

The factor mean object can be specified:

facMean <- rep(NA, 3)
facMeanVal <- c(0.5, 5, 2)
AL <- bind(facMean, facMeanVal)

The factor variance object can be specified:

facVar <- rep(NA, 3)
facVarVal <- c(0.25, 1, 0.25)
VPS <- bind(facVar, facVarVal)

The factor correlation object can be specified:

facCor <- diag(3)
facCor[2,3] <- NA
facCor[3,2] <- NA
RPS <- binds(facCor, 0.5)
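
The variance and correlation objects together determine the residual covariance between the intercept and slope factors. The base R sketch below works that value out by hand from the numbers given above; it does not call simsem, and the names resid_var, resid_cor, and resid_cov are illustrative only.

```r
# Factor (residual) variances from facVarVal: group, intercept, slope
resid_var <- c(0.25, 1, 0.25)

# Correlation matrix: only the intercept-slope correlation (0.5) is nonzero
resid_cor <- diag(3)
resid_cor[2, 3] <- resid_cor[3, 2] <- 0.5

# Covariance = D %*% R %*% D, where D is the diagonal matrix of standard deviations
d <- diag(sqrt(resid_var))
resid_cov <- d %*% resid_cor %*% d
dimnames(resid_cov) <- list(c("group", "intercept", "slope"),
                            c("group", "intercept", "slope"))
print(resid_cov)
```

The intercept-slope entry is 0.5 * sqrt(1) * sqrt(0.25) = 0.25.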

The measurement error variance object can be specified:

VTE <- bind(c(0, rep(NA, 4)), 1.2)

The measurement error correlation object can be specified:

RTE <- binds(diag(5))

The measurement intercept object can be specified:

TY <- bind(rep(0, 5))

The regression coefficient matrix object can be specified:

path <- matrix(0, 3, 3)
path[2,1] <- NA
path[3,1] <- NA
pathVal <- matrix(0, 3, 3)
pathVal[2,1] <- 0.5
pathVal[3,1] <- 0.1
BE <- bind(path, pathVal)
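
As a check on the values above, the means implied by the model can be worked out in base R. This is a minimal sketch, not simsem output. It assumes that the AL values act as factor intercepts once the regression paths are present, so that the implied factor means are solve(I - B) %*% alpha, and it ignores the trivial misspecification in the loadings. The names beta, alpha, and lambda are illustrative.

```r
# Regression coefficients among factors (rows = outcome, columns = predictor)
beta <- matrix(0, 3, 3)
beta[2, 1] <- 0.5   # group -> intercept
beta[3, 1] <- 0.1   # group -> slope

# Values from facMeanVal, treated here as factor intercepts (assumption)
alpha <- c(0.5, 5, 2)

# Implied factor means: (I - B)^{-1} %*% alpha
factor_mean <- drop(solve(diag(3) - beta) %*% alpha)
print(factor_mean)      # 0.50 5.25 2.05

# Loadings as specified in the text; measurement intercepts are all zero
lambda <- matrix(0, 5, 3)
lambda[1, 1] <- 1
lambda[2:5, 2] <- 1
lambda[2:5, 3] <- 0:3

# Implied indicator means: grouping variable, then the four time points
indicator_mean <- drop(lambda %*% factor_mean)
print(indicator_mean)   # 0.50 5.25 7.30 9.35 11.40
```

Under these assumptions, the average trajectory starts at 5.25 and rises by 2.05 per time point.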

The SEM object that represents the conditional growth curve model is specified:

LCA.Model <- model(LY=LY, RPS=RPS, VPS=VPS, AL=AL, VTE=VTE, RTE=RTE, TY=TY, BE=BE, modelType="SEM")

The data distribution object representing the factor distribution can be specified:

group <- list(size=1, prob=0.5)
n01 <- list(mean=0, sd=1)
facDist <- bindDist(c("binom", "norm", "norm"), group, n01, n01, keepScale=c(FALSE, TRUE, TRUE))

The data distribution object combines a binomial distribution and two normal distributions. In the argument list for the binomial distribution, the first argument (size) is the number of trials and the second (prob) is the probability of success (here, membership in the treatment group). With one trial, the binomial distribution reduces to a Bernoulli distribution, which yields only 0 or 1 (a dummy variable). In the argument list for the normal distribution, the mean and standard deviation are specified. In the bindDist function, the vector of distribution names comes first, followed by the list of arguments for each distribution. The keepScale argument determines whether the mean and standard deviation come from the model or from the distribution: if TRUE, the model-implied mean and standard deviation are used; if FALSE, those of the distribution are used.
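
The behavior of a binomial distribution with a single trial can be seen directly in base R. The sketch below uses rbinom with the same size and prob values as the group list; the sample size of 1000 and the seed are arbitrary choices for illustration.

```r
set.seed(123)  # arbitrary seed so the draw is reproducible

# One trial per draw: every value is either 0 or 1 (a dummy variable)
group_draw <- rbinom(1000, size = 1, prob = 0.5)
print(table(group_draw))

# Theoretical moments of a binomial variable: size * prob and size * prob * (1 - prob)
theoretical_mean <- 1 * 0.5
theoretical_var <- 1 * 0.5 * (1 - 0.5)
print(c(mean = theoretical_mean, variance = theoretical_var))

# Sample moments should be close to the theoretical values
print(c(mean = mean(group_draw), variance = var(group_draw)))
```

The theoretical mean of 0.5 and variance of 0.25 match the values used for the grouping variable in the model.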

The result object can be specified:

Output <- sim(NULL, n=50:500, LCA.Model, pmMCAR=seq(0, 0.4, 0.1), sequential=TRUE, 
    facDist=facDist)
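
The call above passes vectors for both n and pmMCAR. The base R sketch below counts the conditions this implies, assuming the two vectors are fully crossed; it uses expand.grid only to illustrate the design and is not taken from the simsem source.

```r
n_values <- 50:500               # sample sizes passed to n
pm_values <- seq(0, 0.4, 0.1)    # proportions passed to pmMCAR

# Every combination of sample size and percent missing
conditions <- expand.grid(n = n_values, pmMCAR = pm_values)

# 451 sample sizes x 5 missing rates = 2255 combinations
print(nrow(conditions))
print(head(conditions))
```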

The facDist argument specifies the factor distribution. The pmMCAR argument gives the values of percent missing completely at random. Notice that both the sample size and the percent missing completely at random are vectors. The total number of replications is the product of the lengths of the two vectors; that is, the function runs one replication for every factorial combination of sample size and percent missing completely at random. The result object can be summarized:

summary(Output)

The figure below shows the screen provided by the summary function:

Example 19 summary

The cutoffs as a function of sample size and percent missing completely at random can be plotted with the plotCutoff function:

plotCutoff(Output, 0.05)

The figure below shows the graph provided by the plotCutoff function:

Example 19 SSD

The cutoffs can be plotted in a three-dimensional graph by setting the useContour argument to FALSE:

plotCutoff(Output, 0.05, useContour = FALSE)

The figure below shows the graph provided by the plotCutoff function by specifying useContour as FALSE:

Example 19 SSD2

We can also use the getCutoff function to find the cutoffs at specific values of sample size and percent missing completely at random:

getCutoff(Output, 0.05, nVal = 200, pmMCARval = 0)
getCutoff(Output, 0.05, nVal = 300, pmMCARval = 0.33)

The power of each parameter given each combination of sample size and percent missing completely at random can be obtained by the getPower function:

Cpow <- getPower(Output)

The figure below shows the first six rows of the Cpow object:

Example 19 Cpow

Cpow2 <- getPower(Output, nVal = 200, pmMCARval = 0.35)

The figure below shows the Cpow2 object:

Example 19 Cpow2

The nVal and pmMCARval arguments request the power of each parameter at specific values of sample size and percent missing completely at random.
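
Note that 0.35 is not one of the simulated percent missing values. One common way to estimate power at values that were not simulated directly is to regress a significance indicator on the varying design factors with logistic regression and then predict at the requested values. The base R sketch below illustrates that idea only. It is an assumption about the general technique, not code from simsem, and the significance flags are toy data generated from made-up coefficients.

```r
set.seed(2024)  # arbitrary seed for the toy data

# Hypothetical design grid resembling this example (coarser n steps for brevity)
toy <- expand.grid(n = seq(50, 500, by = 10), pmMCAR = seq(0, 0.4, 0.1))

# Made-up power curve, used only to generate toy significance flags (0/1)
p_sig <- plogis(-1.5 + 0.012 * toy$n - 2 * toy$pmMCAR)
toy$sig <- rbinom(nrow(toy), size = 1, prob = p_sig)

# Logistic regression of significance on sample size and percent missing
fit <- glm(sig ~ n + pmMCAR, family = binomial, data = toy)

# Predicted probability of a significant result at n = 200, pmMCAR = 0.35
power_hat <- predict(fit, newdata = data.frame(n = 200, pmMCAR = 0.35),
                     type = "response")
print(power_hat)
```

Because the prediction comes from a fitted curve, it can be evaluated at any sample size and percent missing within the simulated range.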

The power table obtained from the getPower function can be passed to the findPower function to find the sample size that provides a power of 0.80 at each value of percent missing completely at random:

findPower(Cpow, "N", 0.80)

The figure below shows the screen provided by the findPower function for sample size:

Example 19 findpower
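
The search for a target power can be pictured as a lookup in a power table: for each percent missing value, take the smallest sample size whose power reaches 0.80. The base R sketch below shows that lookup on a hypothetical table. The power values come from a made-up logistic curve, not from this simulation, and the column names are illustrative rather than those returned by getPower.

```r
# Hypothetical power table over the same design grid as this example
power_table <- expand.grid(n = 50:500, pmMCAR = seq(0, 0.4, 0.1))

# Made-up power curve: power rises with n and falls with percent missing
power_table$power <- plogis(-1.5 + 0.012 * power_table$n - 2 * power_table$pmMCAR)

# Keep the rows that reach the target, then take the smallest n per missing rate
target <- 0.80
reached <- power_table[power_table$power >= target, ]
min_n <- tapply(reached$n, reached$pmMCAR, min)
print(min_n)
```

In this toy table the required sample size grows as the percent missing increases, which is the pattern one would expect when missing data reduce the available information.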

The percent missing completely at random that provides a power of 0.80 at each sample size can be found in the same way:

findPower(Cpow, "MCAR", 0.80)

The figure below shows the screen provided by the findPower function for percent missing completely at random:

Example 19 findpower2

The power of the regression coefficients of the grouping variable can be plotted against sample size and percent missing completely at random with the plotPower function:

plotPower(Output, powerParam = c("f2~f1", "f3~f1"))

The figure below shows the graph provided by the plotPower function:

Example 19 plotpower

The power can be plotted in a three-dimensional graph by specifying the useContour argument as FALSE:

plotPower(Output, powerParam = c("f2~f1", "f3~f1"), useContour = FALSE)

The figure below shows the graph provided by the plotPower function by specifying useContour as FALSE:

Example 19 plotpower2

Here is the summary of the whole script in this example.