Frequently Used Variant Pathogenicity and Constraint Scores - core-unit-bioinformatics/knowledge-base GitHub Wiki

This page explains some pathogenicity and constraint scores that are frequently used to prioritize variants of interest.


Scores available as VEP plugins (also within the Variant Interpretation Pipeline):

REVEL - A variant-level missense pathogenicity score (range 0 to 1). Higher values indicate a higher likelihood of pathogenicity.

CADD - A variant-level deleteriousness score (PHRED-like scaled). Higher values indicate greater predicted deleteriousness.

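SpliceAI - A variant-level splice disruption probability score (range 0 to 1). SpliceAI predicts four delta scores (acceptor gain, acceptor loss, donor gain, donor loss); the strongest predicted splice effect is the maximum of these. Higher values indicate a more likely splice-altering variant.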

LOEUF - A gene-level loss-of-function (LoF) intolerance score (loss-of-function observed/expected upper bound fraction). Lower values indicate stronger intolerance to LoF variants.
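
Two of the variant-level scores above involve a small calculation when read in practice. The sketch below is illustrative only: it converts a PHRED-like scaled CADD score into the fraction of variants ranked at least as deleterious (a scaled score of 10 corresponds to the top 10%, 20 to the top 1%, 30 to the top 0.1%), and it picks the strongest of the four SpliceAI delta scores. The function names and the `DS_AG`/`DS_AL`/`DS_DG`/`DS_DL` labels are assumptions chosen for this example, not part of the pipeline.

```python
def cadd_phred_to_top_fraction(phred: float) -> float:
    """Fraction of ranked variants scoring at least as high as `phred`.

    Inverts the PHRED-like scaling: phred = -10 * log10(rank / total).
    """
    return 10 ** (-phred / 10)


def max_spliceai_delta(ds_ag: float, ds_al: float, ds_dg: float, ds_dl: float):
    """Return (label, value) of the strongest of the four SpliceAI delta scores.

    The labels are hypothetical names for acceptor gain/loss and donor gain/loss.
    """
    scores = {"DS_AG": ds_ag, "DS_AL": ds_al, "DS_DG": ds_dg, "DS_DL": ds_dl}
    label = max(scores, key=scores.get)  # on ties, the first label wins
    return label, scores[label]


# Made-up values for illustration
top_fraction = cadd_phred_to_top_fraction(20)
strongest = max_spliceai_delta(0.01, 0.85, 0.0, 0.12)
```

With these made-up inputs, `strongest` names the acceptor-loss score as the dominant predicted effect. Any cutoff applied to either score afterwards is a project-specific choice and is deliberately left out here.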


Scores without VEP plugin:

MIS_Z - A gene-level missense constraint score (Z-score). Higher values indicate stronger depletion of missense variants relative to expectation.

  • Data download: https://www.nature.com/articles/s41586-020-2308-7, supplementary data 11
  • Comment: The supplementary data file mentioned above contains both MIS_Z and LOEUF scores, so the two can be processed together instead of using the VEP plugin for LOEUF.
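
Processing both gene-level scores from one table could look like the following minimal sketch. It assumes a tab-separated file with one row per gene; the column names `gene`, `mis_z` and `oe_lof_upper` are assumptions and should be checked against the header of the downloaded file. The demo rows are made up.

```python
import csv
import io
import math


def load_gene_constraint(handle, gene_col="gene", misz_col="mis_z",
                         loeuf_col="oe_lof_upper"):
    """Read a tab-separated constraint table into {gene: {"MIS_Z": ..., "LOEUF": ...}}.

    Column names are assumptions; missing or non-numeric values become None.
    If a gene appears in several rows, the last row wins.
    """
    def to_float(value):
        try:
            number = float(value)
        except (TypeError, ValueError):
            return None  # e.g. "NA" or an empty cell
        return None if math.isnan(number) else number

    table = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        table[row[gene_col]] = {
            "MIS_Z": to_float(row.get(misz_col)),
            "LOEUF": to_float(row.get(loeuf_col)),
        }
    return table


# Made-up example rows; real files contain many more columns.
demo = io.StringIO(
    "gene\ttranscript\tmis_z\toe_lof_upper\n"
    "GENE_A\tTX_A\t3.1\t0.25\n"
    "GENE_B\tTX_B\tNA\t1.2\n"
)
constraint = load_gene_constraint(demo)
```

The resulting dictionary can then be joined to annotated variants by gene symbol. For a real file, pass an open file handle (for a gzip-compressed file, one opened in text mode) instead of the `io.StringIO` demo.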

VIP integration

  • These plugins can be added to the Variant Interpretation Pipeline via the "run.config" file, for example as follows (see the --plugin options in the line starting with '--format vcf'):
process {
    errorStrategy = { task.exitStatus in (1..200) ? 'retry' : 'finish' }
    maxRetries    = 3

    withName: "ENSEMBLVEP_VEP" {
        cpus   = 32
        memory = 125.GB
        time   = 72.h
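        // Each "--plugin NAME,key=file" option enables one score; the /path/to/... values are placeholders.
        ext.args = {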
            '--format vcf --offline --refseq --check_existing --everything --no_escape --flag_pick_allele_gene --terms SO --clin_sig_allele 1 --var_synonyms --vcf --assembly GRCh38 --plugin REVEL,file=/path/to/new_tabbed_revel_grch38.tsv.gz --plugin SpliceAI,snv=/path/to/spliceai_scores.raw.snv.hg38.vcf.gz,indel=/path/to/spliceai_scores.raw.indel.hg38.vcf.gz --plugin CADD,snv=/path/to/cadd_whole_genome_SNVs.tsv.gz,indels=/path/to/cadd_gnomad.genomes.r4.0.indel.tsv.gz'
        }
    }
}
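
Because the configuration above passes `--vcf`, VEP writes its annotations into the `CSQ` INFO field of the output VCF, with the field order given in the `##INFO=<ID=CSQ,...>` header line. The following is a minimal sketch for pulling plugin scores back out of such a file. The helper names are hypothetical, the header is shortened, and the values are made up; the exact plugin field names (shown here as `REVEL` and `CADD_PHRED`) should be confirmed against the header of the actual output.

```python
def csq_fields_from_header(header_line: str) -> list:
    """Extract the ordered CSQ field names from the VEP CSQ header line."""
    marker = "Format: "
    start = header_line.index(marker) + len(marker)
    end = header_line.index('"', start)  # closing quote of the Description
    return header_line[start:end].split("|")


def parse_csq(info_field: str, fields: list) -> list:
    """Return one dict per CSQ entry (one entry per allele/transcript pair)."""
    for item in info_field.split(";"):
        if item.startswith("CSQ="):
            return [dict(zip(fields, entry.split("|")))
                    for entry in item[len("CSQ="):].split(",")]
    return []  # record carries no CSQ annotation


# Shortened, made-up header and INFO column for illustration
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
          'annotations from Ensembl VEP. Format: '
          'Allele|Consequence|SYMBOL|REVEL|CADD_PHRED">')
info = "AC=1;CSQ=T|missense_variant|GENE_A|0.812|27.3,T|intron_variant|GENE_B||27.3"

fields = csq_fields_from_header(header)
records = parse_csq(info, fields)
```

Values come back as strings, and an empty string means the plugin reported nothing for that entry (for example, no REVEL score for a non-missense consequence), so convert to `float` only after checking for empty values.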