8. Tutorials - GenomicSEM/GenomicSEM GitHub Wiki

Tutorials

Simulating GWAS Summary Statistics Based on the LDSC Model (simLDSC)

Direct generation of summary statistics using the simLDSC function allows us to consider an expansive set of replications and conditions that would be computationally prohibitive to simulate using a framework in which raw phenotype data were first generated for individual genomes and then submitted to GWAS. The user can provide the population genetic covariance matrix (Σ) directly or alternatively provide a fully parameterized model specified in lavaan syntax.

The tutorial can be found [here](https://rpubs.com/JaFuente/simLDSC).

The simLDSC approach was introduced in the following paper:

de la Fuente, J., Grotzinger, A. D., Marioni, R. E., Nivard, M. G., & Tucker-Drob, E. M. (2022). Integrated analysis of direct and proxy genome wide association studies highlights polygenicity of Alzheimer’s disease outside of the APOE region. PLoS Genetics, 18, e1010208.

Parallel Analysis (paLDSC) for Determining the Number of Factors to Extract from a Genetic Correlation Matrix

The paLDSC function identifies the number of non-spurious dimensions in exploratory genomic factor analysis. The method adapts a classic technique known as Parallel Analysis (Horn, 1965) to the genomic space. paLDSC compares the eigenvalues from the eigendecomposition of the LDSC genetic correlation matrix to the eigenvalues of Monte Carlo simulated null correlation matrices, in which random noise is drawn from the multivariate LDSC sampling distribution. The suggested number of factors to extract is the number of observed eigenvalues that exceed a pre-specified percentile of the corresponding distribution of eigenvalues generated under the null.
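The comparison described above can be illustrated with a minimal Python sketch. This is not the paLDSC implementation: the function name `parallel_analysis`, its arguments, the toy matrices, and the assumption that the null matrix is an identity matrix plus noise drawn from a sampling covariance matrix `V` (ordered as the column-wise lower triangle of `S`) are all illustrative choices made here.

```python
import numpy as np


def parallel_analysis(S, V, n_sims=500, percentile=95, seed=0):
    """Illustrative parallel analysis on a k x k correlation matrix S.

    V is assumed to be the sampling covariance matrix of the lower-triangular
    elements of S (diagonal included), ordered column by column.
    """
    rng = np.random.default_rng(seed)
    k = S.shape[0]
    # Row-major upper-triangle indices, swapped, give the column-major lower triangle.
    upper_rows, upper_cols = np.triu_indices(k)
    low_rows, low_cols = upper_cols, upper_rows

    # Observed eigenvalues, largest first.
    observed = np.sort(np.linalg.eigvalsh(S))[::-1]

    null_eigs = np.empty((n_sims, k))
    for s in range(n_sims):
        # Draw sampling noise and place it symmetrically around an identity matrix.
        noise = rng.multivariate_normal(np.zeros(len(low_rows)), V)
        E = np.zeros((k, k))
        E[low_rows, low_cols] = noise
        E = E + E.T - np.diag(np.diag(E))
        null_eigs[s] = np.sort(np.linalg.eigvalsh(np.eye(k) + E))[::-1]

    # Percentile of the null distribution for each eigenvalue position.
    threshold = np.percentile(null_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Count leading observed eigenvalues that exceed their null threshold.
    n_factors = k if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold


# Toy input (hypothetical values): four traits that all correlate 0.6,
# with independent sampling errors of SD 0.05 on each matrix element.
S = np.full((4, 4), 0.6)
np.fill_diagonal(S, 1.0)
V = np.eye(10) * 0.05 ** 2
n_factors, observed, threshold = parallel_analysis(S, V, n_sims=200, seed=1)
print(n_factors)
```

In practice the genetic correlation matrix and its sampling covariance matrix would come from LDSC output rather than from hand-built toy matrices, and the linked tutorial remains the reference for the actual paLDSC arguments.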

The tutorial can be found here.

This method was introduced in the supplement to the following paper:

Fürtjes, A. E., Arathimos, R., Coleman, J. R. I., Cole, J. H., Cox, S. R., Deary, I. J., de la Fuente, J., Madole, J. W., Tucker-Drob, E. M., & Ritchie, S. J. (2023). General dimensions of human brain morphometry inferred from genome-wide association data. Human Brain Mapping.

Genomic Structural Invariance (GSI) and Local Standardized Root Mean Square Difference (LocalSRMD)

Here is a tutorial on calculating LocalSRMD (local Standardized Root Mean-square Difference) in the context of testing Genomic Structural Invariance (GSI). LocalSRMD and GSI are both introduced and detailed in the following preprint:

Schwaba, T., Mallard, T. T., Maihofer, A. X., Rhemtulla, M., Lee, P. H., Smoller, J. W., Davis, L. K., Nivard, M. G., Grotzinger, A. D., & Tucker-Drob, E. M. (2023). Comparison of the Multivariate Genetic Architecture of Eight Major Psychiatric Disorders Across Sex. medRxiv.

Multivariate analysis of direct and proxy GWAS

Here is a tutorial on multivariate analysis of direct and proxy GWAS. More about the method can be found in the following paper:

de la Fuente, J., Grotzinger, A. D., Marioni, R. E., Nivard, M. G., & Tucker-Drob, E. M. (2022). Integrated analysis of direct and proxy genome wide association studies highlights polygenicity of Alzheimer’s disease outside of the APOE region. PLoS Genetics, 18, e1010208.

GWAS-by-Subtraction

Here is a tutorial on GWAS-by-Subtraction. More about the method can be found in the following paper:

Demange, P., Malanchini, M., Biroli, P., Cox, S., Grotzinger, A. D., Mallard, T., Tucker-Drob, E. M., Abdellaoui, A., Arseneault, L., Caspi, A., Corcoran, D., Domingue, B., Mitchell, C., van Bergen, E., Boomsma, D. I., Harris, K. M., Ip, H. F., Moffitt, T. E., Poulton, R., Prinz, J., Sugden, K., Wertz, J., Williams, B., de Zeeuw, E. L., Belsky, D. W., Harden, K. P., & Nivard, M. G. (2021). Investigating the genetic architecture of non-cognitive skills using GWAS-by-subtraction. Nature Genetics, 53, 35-44.

Using hdl() to estimate genetic (co)variance (and possibly gain some power)

Here you'll find a tutorial on the use of HDL instead of ldsc to estimate genetic covariance in GenomicSEM.

For details on HDL see: Ning, Z., Pawitan, Y., & Shen, X. (2020). High-definition likelihood inference of genetic correlations across human complex traits (pp. 1-6). Nature Genetics.

Mediation

This tutorial on mediation shows how to disentangle whether trait A influences trait C directly, or whether the effect is (partly) mediated by trait B.
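As a rough illustration of the direct versus mediated decomposition, the sketch below computes standardized path coefficients for the model A -> B -> C with an additional direct A -> C path from three correlations. The function name `mediation_paths` and the input values are hypothetical, and the code only shows the arithmetic: it is not taken from the tutorial, which fits the model to genetic covariance estimates and also provides standard errors.

```python
def mediation_paths(r_ab, r_ac, r_bc):
    """Standardized paths for A -> B -> C plus a direct A -> C path."""
    denom = 1.0 - r_ab ** 2
    a = r_ab                                # effect of A on B
    b = (r_bc - r_ac * r_ab) / denom        # effect of B on C, holding A constant
    direct = (r_ac - r_ab * r_bc) / denom   # effect of A on C, holding B constant
    indirect = a * b                        # part of the A -> C effect that runs through B
    return {
        "a": a,
        "b": b,
        "direct": direct,
        "indirect": indirect,
        "total": direct + indirect,         # equals r_ac by construction
    }


# Hypothetical correlations between traits A, B, and C.
paths = mediation_paths(r_ab=0.5, r_ac=0.4, r_bc=0.6)
print(paths)
```

A direct effect close to zero alongside a sizeable indirect effect would be consistent with mediation through B, while a direct effect close to the total would suggest little mediation.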

Bayesian Network Learning

Infer (plausible, not definite!) causal networks from observational data. View the tutorial here.