Complex traits, GWAS, Human genetics and the future - AndersenLab/Genetic-Analysis GitHub Wiki

Lecture 15

Association mapping: Correlating genotype with phenotype

Walk-through of slides 30-37 from lecture 14:

  1. Find a disease/trait of interest that has variation in the population and phenotype individuals.
  2. Genotype individuals with the trait and individuals without the trait (here we are looking at human height so we want to genotype short, medium, and tall individuals).
  3. At each marker across the genome (accounting for LD blocks, etc.), split the population by genotype. If there are only two alleles at this marker (e.g. A or G), there will be two groups.
  4. Now, we want to perform a statistical test to see whether genotype correlates with phenotype.
    • Expectation if there is no correlation: individuals with the A allele will be a mixture of short/tall and individuals with the G allele will also be a mixture of short/tall.
    • Significant correlation: Most of the individuals with the A allele are tall and most of the individuals with the G allele are short. We call this a quantitative trait locus, which is fancy for "region of the genome that correlates with our phenotype". We can now say this marker (or another marker close by) is linked to the causal allele.
  5. We do this for all the markers across the genome and can generate a Manhattan plot of our genome-wide analysis. We plot genomic position on the x-axis and significance on the y-axis (typically -log10 of the p-value, so higher = more significant). Markers above a set threshold are significant.
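The per-marker test in steps 3 to 5 can be sketched as follows. This is a minimal illustration, not the method used in the lecture: the heights, marker names, and the helper `neg_log10_p` are all made up, and the p-value comes from a normal approximation to Welch's t statistic, which is only reasonable for large samples. The returned value is what would go on the y-axis of a Manhattan plot.

```python
import math
from statistics import mean, variance


def neg_log10_p(alleles, phenotypes):
    """Score one marker: split individuals by allele, then compare group means."""
    groups = {}
    for allele, value in zip(alleles, phenotypes):
        groups.setdefault(allele, []).append(value)
    if len(groups) != 2:
        raise ValueError("this sketch expects exactly two alleles per marker")
    first, second = groups.values()
    # Welch's t statistic for the difference in mean phenotype between groups.
    std_err = math.sqrt(variance(first) / len(first) + variance(second) / len(second))
    t_stat = (mean(first) - mean(second)) / std_err
    # ASSUMPTION: two-sided p-value from a normal approximation to the t statistic.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    # Guard against log10(0) if the p-value underflows.
    return -math.log10(max(p_value, 1e-300))


# Hypothetical data: heights (cm) of eight individuals and their alleles at two markers.
heights_cm = [150, 152, 155, 158, 172, 175, 178, 181]
genotypes = {
    "marker_1": list("AGAGAGAG"),  # alleles spread evenly across short and tall
    "marker_2": list("AAAAGGGG"),  # A carriers are short, G carriers are tall
}

scores = {name: neg_log10_p(alleles, heights_cm) for name, alleles in genotypes.items()}
for name, score in scores.items():
    print(f"{name}: -log10(p) = {score:.2f}")
```

In this toy data, `marker_2` separates short from tall individuals and so scores far higher than `marker_1`, where both allele groups are a mixture of heights.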

GWAS Calculation

  • If we know the number of cases (people with a disease/trait) and controls (people without the disease/trait) that have each SNV, we can use a chi-square test to calculate the p-value.
    • In other words: do we see more (or less) of one allele in the cases (observed) than we expect (what we see in the controls)?
    • If there is no correlation between genotype at this marker and phenotype, we should see the same allele ratio at SNV1 in the cases as in the controls (e.g. 40% G, 60% A in both).
    • If there IS a correlation between genotype at this marker and phenotype, we should see a different allele ratio at SNV1 in the cases (e.g. 70% G, 30% A) than in the controls (40% G, 60% A).
  • Remember, we need to correct for multiple testing. The Bonferroni correction does this by dividing the significance level (e.g. 0.05) by the total number of tests performed (the number of markers tested).
  • Note: GWA mapping works best within a related population. Why??
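As a rough sketch of the calculation above, the snippet below runs a standard Pearson chi-square test on a 2x2 table of allele counts and compares the p-value to a Bonferroni-corrected threshold. The counts (100 alleles sampled from cases and 100 from controls, matching the 70/30 and 40/60 percentages) and the numbers of markers are hypothetical. The test pools cases and controls to get the expected counts, which may differ in detail from the calculation shown in the lecture slides.

```python
import math


def allele_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Each argument is a pair: (count of allele 1, count of allele 2).
    Assumes every row and column total is non-zero.
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical counts of (G, A) alleles: cases 70% G, controls 40% G.
chi2, p_value = allele_chi_square((70, 30), (40, 60))
print(f"chi-square = {chi2:.2f}, p = {p_value:.2e}")

# Hypothetical study sizes: the same p-value can pass or fail the threshold.
for n_markers in (1_000, 1_000_000):
    threshold = bonferroni_threshold(0.05, n_markers)
    print(f"{n_markers} markers: threshold = {threshold:.1e}, significant = {p_value < threshold}")
```

The same association is significant when 1,000 markers are tested but not when 1,000,000 are, which is why the number of tests matters as much as the raw p-value.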

Lessons from the GWAS era

  • Many traits are polygenic
  • Effect sizes of common variants are very small
  • Many associated SNVs are near genes, but not always in genes
  • Most functional variants might affect gene expression as opposed to protein function

Genotype Relative Risk

Now we know that SNV1 is associated with our trait of interest. How do we calculate the probability of expressing that trait given our genotype at that SNV?

For a typical diallelic locus (alleles A and a), there are three possible genotypes: AA, Aa, or aa.

Let's say the majority of the population has the aa genotype, but you have the AA genotype. What is your risk?

GRR_AA = (risk for AA genotype) / (risk for aa genotype)

  • Relative risks are estimated as ratios of case:control ratios: GRR_AA = (cases AA / controls AA) / (cases aa / controls aa)
  • A GRR_AA of 1.2 means that individuals with the AA genotype are 1.2x as likely to have the disease as individuals with the aa genotype.
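To make the ratio-of-ratios formula concrete, here is a small worked example. The function name `genotype_relative_risk` and all counts are hypothetical, chosen so that the genotype relative risk for AA comes out at the 1.2 used above.

```python
def genotype_relative_risk(cases_risk, controls_risk, cases_ref, controls_ref):
    """Ratio of case:control ratios for a risk genotype vs. a reference genotype."""
    if min(controls_risk, cases_ref, controls_ref) <= 0:
        raise ValueError("counts used as denominators must be positive")
    return (cases_risk / controls_risk) / (cases_ref / controls_ref)


# Hypothetical counts: AA is the genotype of interest, aa is the reference.
grr_aa = genotype_relative_risk(
    cases_risk=30,     # cases with AA
    controls_risk=25,  # controls with AA
    cases_ref=100,     # cases with aa
    controls_ref=100,  # controls with aa
)
print(f"Genotype relative risk for AA: {grr_aa:.2f}")
```

The same function gives the relative risk for heterozygotes if the Aa counts are passed in place of the AA counts.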