Example: Height & BMI (basic) - ajaynadig/bhr GitHub Wiki

BHR Example: Burden heritability and genetic correlation for height and BMI

Description

This tutorial contains code for running basic univariate and bivariate analysis for height and BMI, using ultra-rare pLoF variants from Genebass. Before proceeding, clone this repository onto your machine: devtools::install_github("ajaynadig/bhr")

Loading files

In the reference_files directory, locate the following files:

  1. github_ms_variant_ss_400k_pLoF_group1_21001NA_BMI_munged.txt.zip: variant-level summary statistics for BMI
  2. github_ms_variant_ss_400k_pLoF_group1_50NA_Height_munged.txt.zip: variant-level summary statistics for height
  3. ms_baseline_oe5.txt: baseline-BHR file

Load the files into R

height <- read.table("~/github_ms_variant_ss_400k_pLoF_group1_50NA_Height_munged.txt")
bmi <- read.table("~/github_ms_variant_ss_400k_pLoF_group1_21001NA_BMI_munged.txt")
baseline <- read.table("~/ms_baseline_oe5.txt")

The first lines of the height file:

             gene      N      beta         AF variant_variance phenotype_key chromosome gene_position
1 ENSG00000000419 375630  0.014129 5.1037e-06       2.4663e-05          50NA         20      49563248
2 ENSG00000000419 375630  0.445650 1.2695e-06       6.1661e-06          50NA         20      49563248
3 ENSG00000000419 375630 -0.640410 1.2691e-06       6.1663e-06          50NA         20      49563248
4 ENSG00000000419 375630  0.250800 2.5382e-06       1.2332e-05          50NA         20      49563248
5 ENSG00000000419 375630 -0.382830 1.2746e-06       6.1662e-06          50NA         20      49563248
6 ENSG00000000419 375630  0.288030 2.5496e-06       1.2332e-05          50NA         20      49563248

The first lines of the baseline file:

             gene baseline_oe1_total5 baseline_oe2_total5 baseline_oe3_total5 baseline_oe4_total5
1 ENSG00000000419                   0                   0                   1                   0
2 ENSG00000000457                   0                   1                   0                   0
3 ENSG00000000460                   0                   0                   1                   0
4 ENSG00000000938                   0                   1                   0                   0
5 ENSG00000000971                   0                   1                   0                   0
6 ENSG00000001036                   0                   0                   1                   0

Running univariate analysis for height

The code below will estimate the burden heritability for height using these summary statistics. See Running BHR for details about the custom_variant_variances flag

height_univariate <- BHR(mode = "univariate",
              trait1_sumstats = height,
              custom_variant_variances = TRUE,
              annotations = list(baseline))

produces console output:

Burden Heritability Regression
Daniel Weiner, Ajay Nadig, and Luke O'Connor, 2023
Running BHR at 2023-01-01 09:36:59
For trait 1, MAF ranges from 3.1e-06 to 4e-05
...seems reasonable
Lambda GC:  1.414
...seems reasonable
16612 genes in the BHR regression, 47 of which are significant fixed effects
...seems reasonable
BHR finished at 2023-01-01 09:37:10

To extract total heritability, enter height_univariate$mixed_model$heritabilities[,"total"]:

          h2        h2_se 
0.0171046211 0.0008884702 

Running bivariate (genetic correlation) analysis for height and BMI

This code will estimate the burden genetic correlation between height and BMI

height_bmi_bivariate <- BHR(mode = "bivariate",
              trait1_sumstats = height,
              trait2_sumstats = bmi,
              custom_variant_variances = TRUE,
              annotations = list(baseline))

produces console output:

Burden Heritability Regression
Daniel Weiner, Ajay Nadig, and Luke O'Connor, 2023
Running BHR at 2023-01-01 09:43:45
For trait 1, MAF ranges from 3.1e-06 to 4e-05
...seems reasonable
For trait 2, MAF ranges from 1.4e-06 to 1.9e-05
...seems reasonable
Lambda GC:  1.414
...seems reasonable
16607 genes in the BHR regression, 56 of which are significant fixed effects
...seems reasonable
Lambda GC:  1.211
...seems reasonable
16607 genes in the BHR regression, 56 of which are significant fixed effects
...seems reasonable
BHR finished at 2023-01-01 09:44:00

To extract genetic correlation, run height_bmi_bivariate$rg$rg_mixed and height_bmi_bivariate$rg$rg_mixed_se for output:

height_bmi_bivariate$rg$rg_mixed
-0.2004711

height_bmi_bivariate$rg$rg_mixed_se
0.07669187