01.Association04.Models with grouped outcomes - sporedata/researchdesigneR GitHub Wiki

1. Use cases: in which situations should I use this method?

Multilevel models are used when observations are not independent, that is, when the usual assumption of one independent observation per patient does not hold. For example:
- the same patient has multiple observations over time (observations are grouped within the patient),
- the same patient has more than one observation at a given time (e.g., measuring body weight twice to increase measurement reliability),
- the intervention varies over time (e.g., a trial investigating the role of the placebo effect where the blinding changed from involving patients and researchers, to patients only, or to an open-label format),
- the intervention assigned to each group changes (e.g., in a cross-over trial).
Generalized estimating equations (GEE) are one type of marginal model. GEE models are often preferred when population-level estimates are required, especially when ecological measures are involved [1]. Predicted means from a GEE model are available only for the population as a whole (the entire study sample), not for specific subgroups [15].
Mixed models combine fixed effects and random effects. Predictions can be made for individual subgroups while averaging over the remaining parameters, and population means can be obtained as well [15].
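As a sketch of how fixed and random effects fit together, a two-level random-intercept model for observation i in group j can be written as below. The notation is illustrative and is not taken from [15].

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \tau^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here \beta_0 and \beta_1 are fixed effects shared by all groups, u_j is the random effect of group j, and \varepsilon_{ij} is the residual. A subgroup-specific prediction uses \beta_0 + \beta_1 x + u_j, whereas a population-level prediction sets u_j to its mean of zero.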
Bayesian multilevel models allow us to estimate the association between two variables (a risk factor and an outcome) while accounting for potential confounding, missing data, and groups in the dataset. Their results are often easier to interpret than those of frequentist analyses, since a credible interval for a parameter (for example, the difference in quality-of-life outcomes between two interventions) gives the range in which the parameter lies with a stated probability, given the data and our prior knowledge.
Multilevel models can also be used to evaluate the effects of provider practices on the delivery of care and patient outcomes.
In a multicenter trial, patients are frequently treated in different clinics, and experimental tasks with multiple trials produce data that are nested within participants. Data points become dependent on the contexts in which they are collected. Patients treated by the same clinician, for example, are expected to produce data that are more similar to each other than to data from patients treated by different clinicians. Multilevel models can be used to model this dependency. They not only relax some of the assumptions of other tests (for example, homogeneity of regression slopes) and remove the need for complete data sets, but also allow us to investigate whether effects are context dependent (i.e., whether slopes or intercepts vary across contexts) [16].
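One common way to quantify how strongly observations depend on their group is the intraclass correlation (ICC). The following is a minimal Python sketch using the one-way ANOVA estimator for equally sized groups; the clinician labels and scores are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean


def intraclass_correlation(groups):
    """One-way ANOVA estimate of the ICC, assuming equally sized groups."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equally sized groups")
    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    # Mean square between groups and mean square within groups.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    ) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)


# Invented outcome scores for patients treated by three clinicians.
scores_by_clinician = {
    "clinician_A": [4.0, 5.0, 6.0],
    "clinician_B": [9.0, 10.0, 11.0],
    "clinician_C": [14.0, 15.0, 16.0],
}
icc = intraclass_correlation(list(scores_by_clinician.values()))
print(f"ICC = {icc:.3f}")
```

An ICC close to 1 means that most of the variation lies between groups, which is the situation in which ignoring the grouping is most misleading.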

2. Input: what kind of data does the method require?

A dataset with outcomes, predictors, confounders, and one or more variables identifying the groups (e.g., patient or clinic).
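Grouped data are usually analyzed in long format, with one row per observation and a column identifying the group. The sketch below reshapes a small wide table into long format in plain Python; the column names (`patient_id`, `clinic`, `treated`, `weight_t0`, `weight_t1`) and the values are assumptions made up for this example.

```python
# Wide format: one row per patient, one column per measurement occasion.
wide_rows = [
    {"patient_id": "p01", "clinic": "c1", "treated": 1,
     "weight_t0": 82.0, "weight_t1": 80.5},
    {"patient_id": "p02", "clinic": "c1", "treated": 0,
     "weight_t0": 75.0, "weight_t1": 75.4},
]


def to_long(rows, prefix="weight_t"):
    """Return one row per (patient, time) from rows with prefix-numbered columns."""
    long_rows = []
    for row in rows:
        # Columns that do not carry a measurement are repeated on every long row.
        fixed = {k: v for k, v in row.items() if not k.startswith(prefix)}
        for key, value in row.items():
            if key.startswith(prefix):
                time = int(key[len(prefix):])
                long_rows.append({**fixed, "time": time, "weight": value})
    return long_rows


long_rows = to_long(wide_rows)
for row in long_rows:
    print(row)
```

In the long table, `patient_id` and `clinic` are the grouping variables, `weight` is the outcome, and `treated` and `time` are predictors.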

3. Algorithm: how does the method work?

Describing in words

Some authors have suggested that multilevel models should be the default rather than the exception - see Multilevel Regression as Default

Describing in images

Multilevel models can be used in situations where Simpson's paradox might be introducing bias into the association between outcome and risk factors.
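To make the paradox concrete, the minimal Python sketch below uses invented data for two clinics. Within each clinic the outcome decreases as the risk factor increases, yet a pooled regression that ignores the clinics suggests the opposite. The clinic names and numbers are hypothetical.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# (risk factor values, outcome values) per clinic; invented numbers.
data_by_clinic = {
    "clinic_A": ([1.0, 2.0, 3.0], [10.0, 9.0, 8.0]),
    "clinic_B": ([6.0, 7.0, 8.0], [15.0, 14.0, 13.0]),
}

within_slopes = {name: ols_slope(xs, ys) for name, (xs, ys) in data_by_clinic.items()}

# Pool all observations and ignore the grouping.
pooled_x = [x for xs, _ in data_by_clinic.values() for x in xs]
pooled_y = [y for _, ys in data_by_clinic.values() for y in ys]
pooled_slope = ols_slope(pooled_x, pooled_y)

print("within-clinic slopes:", within_slopes)
print("pooled slope:", round(pooled_slope, 3))
```

A model that includes the clinic as a grouping level estimates the within-clinic association instead of the misleading pooled one.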

These models allow analysts to consider multiple layers of grouping (for example, observations within patients and patients within clinics).

Describing with code

Alternative algorithms can be used to speed up the computation in the early stages - see Run Stan's variational algorithm for approximate posterior sampling

rstan [10].
brms [11].
rstanarm [12].

Suggested companion methods

Causal models could be used to provide better control of confounding [13].
Machine learning predictive models
Latent Curve Models (LCM) are a technique based on structural equation modeling in which change is modeled as a function of time and represented through latent variables, also referred to as growth factors. Even though LCM are an alternative, we usually do not recommend them, for the following reasons:
- both the slope and the intercept have to be modeled as latent variables, which makes the model difficult to interpret;
- a series of assumptions must be made about the values of the latent intercept and slope, and these are usually difficult to justify;
- several questions regarding the individuals' trajectories can arise [17].

Learning materials

4. Output: how do I interpret this method's results?

Typical tables and plots and corresponding text description

See Table 3 from [14] for an example.

OR: Compared to [referent], [intervention] presented [OR (95% CI)] times the odds [p value] of [outcome].

Odds Ratio

The odds ratio (OR) compares the odds of a specific event (outcome) in the presence of an exposure (intervention) with the odds of the same event in the absence of that exposure. The larger the OR, the higher the odds that the event will occur with the exposure. An OR < 1 implies that the event has lower odds of happening with the exposure, whereas an OR = 1 implies that the exposure does not affect the odds of the event. The OR can be reported as follows:

Compared to [referent], [intervention] presented [OR, (95% CI)] times the odds [p value] of [outcome].

The 95% confidence interval (CI) gives an expected range for the population odds ratio. It can be used to judge the precision of the OR: a wide CI indicates low precision, whereas a narrow CI indicates higher precision. The CI is also used as an indicator of statistical significance: the OR is considered statistically significant if its CI does not include the null value (OR = 1). Of note, negative lower bounds are an artifact of intervals based on the normal approximation when the estimate is close to zero, as happens for binomial proportions [18]; they can be kept as is or replaced with zero. An interval for the OR computed on the log scale cannot be negative.
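As a minimal sketch of how an unadjusted OR and its 95% CI can be obtained from a 2x2 table, the Python code below uses the Wald interval on the log-odds scale. This is a single-level illustration, not the multilevel estimate itself; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.959963984540054):
    """Odds ratio with a Wald CI computed on the log scale.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    z: standard normal quantile (default gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this sketch")
    estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Invented counts: 20/100 events with the intervention, 10/100 with the referent.
estimate, lower, upper = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval is built on the log scale and then exponentiated, both bounds are always positive, and an interval that includes 1 is read as not statistically significant.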

The p-value is the probability of observing an effect at least as extreme as the one observed in the sample data, assuming that the null hypothesis is true. A p-value below 0.05 means that such an extreme result would be unlikely under the null hypothesis (occurring less than 5% of the time), which is conventionally taken as grounds to reject the null hypothesis (OR = 1).

Metaphors

Subgroups within the data violate the assumption of independence between individuals.
In Bayesian multilevel models, multiple comparisons can be performed with fewer concerns, and subgroups that are smaller than others can still be inspected and tested.

Reporting guidelines

The analysis should follow a standard sequence, including:
1. Extensive data management and checking of univariate and bivariate analyses.
2. A detailed model specification (see references at the end), including the handling of missing data, the choice of groups, varying coefficients, and prior distributions (preferably conjugate, as that simplifies posterior sampling) [2], [3].
3. Parallelization, if needed [4].
4. Checking of the model fit [5].
5. Model comparison to arrive at the most parsimonious model.
6. Outcome predictions.
7. Careful writing of the Results section, since most researchers are not used to concepts such as credible intervals and Highest Density Intervals (HDI) [6].
Bayesian models are often preferred over their frequentist counterparts because small group sizes are less of an issue. This is not the case for frequentist models, where group sample sizes below 30 tend to be a source of problems [7]. A few other advantages are described in [8] and [9].
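One reason small groups are less problematic is partial pooling: the estimate for each group is pulled toward the overall mean, and more strongly so when the group is small. The Python sketch below shows this for group means under the simplifying assumption that the within-group variance (`sigma2`) and between-group variance (`tau2`) are known; a full Bayesian multilevel model would estimate them. All names and numbers are hypothetical.

```python
from statistics import mean


def partially_pooled_mean(values, grand_mean, sigma2, tau2):
    """Shrink a group's sample mean toward the grand mean.

    sigma2: assumed within-group variance
    tau2:   assumed between-group variance
    """
    n = len(values)
    # The weight on the group's own mean grows with the group size.
    weight = tau2 / (tau2 + sigma2 / n)
    return weight * mean(values) + (1 - weight) * grand_mean


grand_mean = 10.0
small_group = [20.0, 24.0]   # 2 observations, sample mean 22
large_group = [22.0] * 32    # 32 observations, sample mean 22

small_estimate = partially_pooled_mean(small_group, grand_mean, sigma2=16.0, tau2=4.0)
large_estimate = partially_pooled_mean(large_group, grand_mean, sigma2=16.0, tau2=4.0)
print(f"small group: {small_estimate:.2f}, large group: {large_estimate:.2f}")
```

Both groups have the same sample mean, but the small group's estimate moves much further toward the grand mean, which keeps estimates for sparse groups from being driven by a handful of observations.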

Mock conclusions or most frequent format for conclusions reached at the end of a typical analysis.

Given the observed data, the effect has a 95% probability of falling within the 95% Highest Density Interval (HDI).
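For readers unfamiliar with the HDI, the following minimal Python sketch computes it from posterior draws as the narrowest interval that contains the requested share of the draws. The draws here are simulated stand-ins; in practice they would come from the fitted model. The function name is hypothetical.

```python
import math
import random


def highest_density_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = math.ceil(mass * n)  # number of draws the interval must contain
    if window < 2:
        raise ValueError("need more samples for the requested mass")
    # Slide a fixed-size window over the sorted draws and keep the narrowest.
    start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[start], ordered[start + window - 1]


random.seed(2024)
# Stand-in for posterior draws of an effect (assumption for illustration).
draws = [random.gauss(1.5, 0.4) for _ in range(4000)]
lower, upper = highest_density_interval(draws)
print(f"95% HDI: [{lower:.2f}, {upper:.2f}]")
```

For a symmetric posterior the HDI is close to the equal-tailed credible interval; for a skewed posterior it is narrower.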

5. SporeData-specific

Templates

Data science functions

sdatools::boxPlot
sdatools::scatterPlot
sdatools::barPlot
sdatools::stackedBarPlot
sdatools::piratePlot
sdatools::ExplanatoryAnalysis - add the groups and parameters

References

[1] Hoffmann JA, Farrell CA, Monuteaux MC, Fleegler EW, Lee LK. Association of pediatric suicide with county-level poverty in the United States, 2007-2016. JAMA Pediatrics. 2020 Mar 1;174(3):287-94.
[2] Gelman A. Prior choice recommendations. Retrieved July. 2019 Sep;24:2019.
[3] Betancourt M. How the shape of a weakly informative prior affects inferences. March. 2017;17:2017.
[4] How the Shape of a Weakly Informative Prior Affects Inferences.
[5] Robust Statistical Workflow with RStan
[6] Highest Density Interval
[7] Bryan ML, Jenkins SP. Regression analysis of country effects using multilevel data: A cautionary tale.
[8] Nalborczyk L, Batailler C, Lœvenbruck H, Vilain A, Bürkner PC. An introduction to Bayesian multilevel models using brms: A case study of gender effects on vowel variability in Standard Indonesian. Journal of Speech, Language, and Hearing Research. 2019 May 21;62(5):1225-42.
[9] Bølstad, Jørgen. 2019. "The Benefits of Bayesian Hierarchical Modeling: Comparing Partially Pooled and Unpooled Models in R". Playing with Numbers: Notes on Bayesian Statistics. www.boelstad.net/post/bayesian_hierarchical_modeling/.
[10] Team SD. RStan: the R interface to Stan. R package version. 2016;2(1).
[11] Bürkner PC, Buerkner MP. Package brms.
[12] Goodrich B, Gabry J, Ali I, Brilleman S. rstanarm: Bayesian applied regression modeling via Stan. R package version. 2018;2(4):1758.
[13] McCandless LC, Gustafson P, Austin PC. Bayesian propensity score analysis for observational data. Statistics in medicine. 2009 Jan 15;28(1):94-112.
[14] Laptook AR, Shankaran S, Tyson JE, Munoz B, Bell EF, Goldberg RN, Parikh NA, Ambalavanan N, Pedroza C, Pappas A, Das A. Effect of therapeutic hypothermia initiated after 6 hours of age on death or disability among newborns with hypoxic-ischemic encephalopathy: a randomized clinical trial. JAMA. 2017 Oct 24;318(16):1550-60.
[15] Fitzmaurice, Garrett M.; Laird, Nan M.; Ware, James H. Applied longitudinal analysis; Chapter 16. Retrieved 2019-03.
[16] Field & Wright. A Primer on Using Multilevel Models in Clinical and Experimental Psychopathology Research. Journal of Experimental Psychopathology; 2(2), 271–293.
[17] Bollen, Kenneth A., and Patrick J. Curran. Latent Curve Models: A Structural Equation Perspective. Wiley Series in Probability and Statistics. Hoboken, N.J: Wiley-Interscience, 2006.
[18] Brown LD, Cai TT, DasGupta A. Interval Estimation for a Binomial Proportion. Statistical Science. 2001;16(2):101-133.