VC Overview - iffatAGheyas/bioinformatics-tutorial-wiki GitHub Wiki

  • Overview
    Variant calling is the computational process of detecting genomic differences, including single nucleotide variants (SNVs), insertions/deletions (indels), and larger structural variants (SVs), by comparing aligned sequencing reads (typically stored as a BAM file) to a reference genome. Variant callers inspect the evidence at each position (base mismatches, read depth, alignment context) to decide whether a variant is present.
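
    To make the idea of "inspecting evidence at each position" concrete, here is a minimal sketch of a threshold-based call at a single site. It is a toy illustration only: the function name `call_site` and the thresholds `min_depth` and `min_alt_fraction` are assumptions made for this example, and production variant callers use statistical models that also weigh base and mapping qualities rather than fixed cut-offs.

```python
from collections import Counter


def call_site(ref_base, pileup_bases, min_depth=8, min_alt_fraction=0.2):
    """Toy per-position call.

    Returns the most frequent non-reference base if it passes the
    (hypothetical) depth and allele-fraction thresholds, else None.
    """
    depth = len(pileup_bases)
    if depth < min_depth:
        return None  # too little coverage to decide either way

    ref_base = ref_base.upper()
    counts = Counter(base.upper() for base in pileup_bases)

    # Keep only the bases that disagree with the reference.
    alt_counts = {base: n for base, n in counts.items() if base != ref_base}
    if not alt_counts:
        return None  # every read supports the reference

    alt_base, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
    if alt_n / depth >= min_alt_fraction:
        return alt_base
    return None  # mismatches are too rare; likely sequencing error


# Ten reads cover the site; five of them show G where the reference has A.
print(call_site("A", list("AAAAGGGGGA")))
# One mismatch in ten reads falls below the allele-fraction threshold.
print(call_site("A", list("AAAAAAAAAG")))
```

    The two thresholds mirror the evidence types named above: read depth decides whether there is enough data, and the fraction of mismatching bases decides whether the mismatches look like a real variant or like noise.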

    Types of variants

    • SNVs (Single Nucleotide Variants): substitution of one base for another.
    • Indels (Insertions/Deletions): insertion or deletion of a small number of bases (typically ≤50).
    • SVs (Structural Variants): larger events (typically >50 bases), such as insertions, deletions, inversions, duplications, or translocations.
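
    The three categories above can be told apart from the REF and ALT alleles alone. The sketch below is one hedged way to do so: `classify_variant` and its `max_indel_len` parameter are hypothetical names chosen for this example, the 50-base cut-off follows the convention stated above, and the check for symbolic alleles (such as `<DEL>`) or breakend notation is a simplification of how structural variants are often written in VCF.

```python
def classify_variant(ref, alt, max_indel_len=50):
    """Classify one REF/ALT pair as 'SNV', 'indel', 'SV', or 'other'.

    Assumes a single ALT allele; max_indel_len is an illustrative cut-off.
    """
    # Symbolic alleles (<DEL>, <INV>, ...) and breakends (containing [ or ])
    # are treated here as structural variants.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "SV"

    if len(ref) == 1 and len(alt) == 1:
        return "SNV"

    size = abs(len(ref) - len(alt))
    if size == 0:
        return "other"  # equal-length multi-base substitution
    if size <= max_indel_len:
        return "indel"
    return "SV"


for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A"), ("A", "<DEL>")]:
    print(ref, alt, classify_variant(ref, alt))
```

    Length difference is used as the size of the event, which works for simple insertions and deletions but not for inversions or translocations; those are only recognized here when they are written as symbolic or breakend alleles.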

    Why VCF?
    The Variant Call Format (VCF) is the community standard for recording variant calls. For each variant it captures:

    • Location: CHROM, POS
    • Alleles: REF, ALT
    • Quality & Filters: QUAL, FILTER
    • Site-level annotations: INFO key=value fields (e.g. allele frequency, depth)
    • Genotypes: the FORMAT column and per-sample columns (e.g. GT for genotype, DP for read depth, GQ for genotype quality)

    VCF’s structured header plus tabular layout make it easy to exchange, filter, annotate, merge, and visualize variants across different tools and workflows.
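
    As an illustration of how the header and the tabular columns fit together, the following is a minimal sketch of a plain-text VCF reader. The function `parse_vcf`, the dictionary keys it returns, and the example record (sample name `sample1`, position, and values) are all made up for this example. It assumes uncompressed, tab-separated input and skips most of what the specification allows, so a dedicated library such as pysam is the better choice for real data.

```python
def parse_vcf(text):
    """Split VCF text into meta lines, sample names, and record dicts."""
    meta, samples, records = [], [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("##"):
            meta.append(line)  # structured header lines
        elif line.startswith("#CHROM"):
            samples = line.split("\t")[9:]  # sample names follow FORMAT
        else:
            f = line.split("\t")
            # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
            info = {}
            if f[7] != ".":
                for item in f[7].split(";"):
                    key, _, value = item.partition("=")
                    info[key] = value if value else True
            # FORMAT names the ':'-separated fields in each sample column.
            fmt_keys = f[8].split(":") if len(f) > 8 else []
            genotypes = {
                name: dict(zip(fmt_keys, col.split(":")))
                for name, col in zip(samples, f[9:])
            }
            records.append({
                "CHROM": f[0],
                "POS": int(f[1]),
                "REF": f[3],
                "ALT": f[4].split(","),
                "QUAL": None if f[5] == "." else float(f[5]),
                "FILTER": f[6],
                "INFO": info,
                "GENOTYPES": genotypes,
            })
    return meta, samples, records


# Hypothetical one-record VCF, built with explicit tabs for clarity.
example = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "sample1"]),
    "\t".join(["chr1", "12345", ".", "A", "G", "50", "PASS",
               "DP=30;AF=0.5", "GT:DP:GQ", "0/1:30:99"]),
])

meta, samples, records = parse_vcf(example)
print(records[0]["CHROM"], records[0]["POS"], records[0]["ALT"])
print(records[0]["GENOTYPES"]["sample1"]["GT"])
```

    INFO and FORMAT values are kept as strings here; their real types are declared in the `##INFO` and `##FORMAT` header lines, which this sketch collects but does not interpret.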