Welcome to the SNPs_pipeline wiki!

Phylogeny Construction with SNPs


We received a VCF file from SNPsaurus, who converted genomic DNA into nextRAD genotyping-by-sequencing libraries (SNPsaurus, LLC). The nextRAD libraries were sequenced on a HiSeq 4000 using two lanes of 150 bp reads.

This pipeline, modified from http://grunwaldlab.github.io/Population_Genetics_in_R/qc.html, lets you do the following:

  • Using R 3.4.4 (in RStudio) and the packages:

genepop, parallel, poppr, dartR, devtools, phytools, seqinr, phylotools, adegenet, pegas, hierfstat

  1. Remove indels (insertions-deletions)

  2. Filter out SNPs in the upper and lower 20% of the depth distribution

  3. Delete:

    3.1 Samples with a high degree of missing data (missingness >70%)

    3.2 SNPs with a high degree of missing data (missingness >90%)

  4. Rewrite the VCF file
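The filtering steps above can be sketched in a few lines of plain Python. This is only an illustration of the logic, not the R code the pipeline uses. It assumes each variant is a dictionary with hypothetical keys `ref`, `alt`, `gt` (one genotype per sample, `None` when missing) and `dp` (one read depth per sample), and it assumes the depth filter is applied to the total depth of each SNP.

```python
from statistics import quantiles


def is_snp(variant):
    """True when REF and every ALT allele are single bases (not an indel)."""
    alleles = [variant["ref"]] + variant["alt"].split(",")
    return all(len(allele) == 1 for allele in alleles)


def filter_by_depth(variants):
    """Keep SNPs whose total depth lies between the 20th and 80th percentiles.

    Assumption: depth is summed over samples per SNP; the pipeline may
    instead apply the cutoffs per sample.
    """
    totals = [sum(d for d in v["dp"] if d is not None) for v in variants]
    cuts = quantiles(totals, n=5, method="inclusive")  # 20%, 40%, 60%, 80%
    low, high = cuts[0], cuts[-1]
    return [v for v, total in zip(variants, totals) if low <= total <= high]


def missing_fraction(calls):
    """Fraction of genotype calls that are missing (None)."""
    return sum(call is None for call in calls) / len(calls)


def drop_samples(variants, samples, max_missing=0.70):
    """Remove samples whose missingness across SNPs exceeds max_missing."""
    keep = [
        i for i in range(len(samples))
        if missing_fraction([v["gt"][i] for v in variants]) <= max_missing
    ]
    trimmed = [
        dict(v, gt=[v["gt"][i] for i in keep], dp=[v["dp"][i] for i in keep])
        for v in variants
    ]
    return trimmed, [samples[i] for i in keep]


def drop_snps(variants, max_missing=0.90):
    """Remove SNPs whose missingness across samples exceeds max_missing."""
    return [v for v in variants if missing_fraction(v["gt"]) <= max_missing]
```

Applied in the order of the list, the calls would be `is_snp` on every record, then `filter_by_depth`, `drop_samples` and `drop_snps`, before the remaining records are written back to a VCF file.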

  • Using RStudio or VCFtools 0.1.17

    1. Filter by minor allele frequency (MAF)

  • Convert the VCF file to PHYLIP format

The PHYLIP file is suitable for RAxML.

Alternatively, you can use another tool from the CIPRES phylogenetic collection (BEAST2, MrBayes, RAxML) or CyVerse.
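As a rough illustration of the MAF filter and the VCF-to-PHYLIP conversion, here is a minimal Python sketch. It is not the code used in this pipeline. It assumes biallelic SNPs stored as dictionaries with hypothetical keys `ref`, `alt` and `gt` (genotype strings such as `0/1`, or `None` when missing). The `min_maf` default of 0.05 is a placeholder, since no threshold is stated here. Heterozygous calls are written as IUPAC ambiguity codes and missing calls as `N`, which is one common convention rather than the only one.

```python
# IUPAC ambiguity codes for heterozygous genotypes
IUPAC = {
    frozenset("AC"): "M", frozenset("AG"): "R", frozenset("AT"): "W",
    frozenset("CG"): "S", frozenset("CT"): "Y", frozenset("GT"): "K",
}


def split_alleles(gt):
    """Split '0/1' or '0|1' into its allele indices."""
    return gt.replace("|", "/").split("/")


def minor_allele_frequency(genotypes):
    """MAF of a biallelic SNP, ignoring missing genotypes."""
    alleles = [a for gt in genotypes if gt is not None for a in split_alleles(gt)]
    if not alleles:
        return 0.0
    alt_freq = alleles.count("1") / len(alleles)
    return min(alt_freq, 1 - alt_freq)


def filter_by_maf(variants, min_maf=0.05):
    """Keep SNPs whose MAF is at least min_maf (placeholder threshold)."""
    return [v for v in variants if minor_allele_frequency(v["gt"]) >= min_maf]


def genotype_to_base(gt, ref, alt):
    """Turn one genotype into a single nucleotide character."""
    if gt is None:
        return "N"
    bases = {ref if a == "0" else alt for a in split_alleles(gt)}
    if len(bases) == 1:
        return next(iter(bases))
    return IUPAC[frozenset(bases)]


def to_phylip(variants, samples):
    """Build a relaxed PHYLIP alignment: a header line, then one line per sample."""
    lines = [f"{len(samples)} {len(variants)}"]
    for i, name in enumerate(samples):
        seq = "".join(
            genotype_to_base(v["gt"][i], v["ref"], v["alt"]) for v in variants
        )
        lines.append(f"{name} {seq}")
    return "\n".join(lines) + "\n"
```

The string returned by `to_phylip` can be saved to a file and passed to the tree-building program as its alignment input.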


