SNP_Phylo - BGIGPD/BestPractices4Pathogenomics GitHub Wiki
Workshop: Using Malaria VCF Files to Construct an SNP Phylogenetic Tree
Objectives
This workshop aims to guide participants through the process of constructing a single nucleotide polymorphism (SNP) phylogenetic tree using VCF (Variant Call Format) files derived from malaria pathogen data. By the end of this session, participants will:
- Generate an SNP phylogenetic tree to study genetic relationships among different malaria samples.
Steps
1. Prepare Your Environment
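Install the required software with conda. The command below adds IQ-TREE to the existing WGS_analysis environment; step 3 also needs vcf-to-tab, which is part of VCFtools, and step 4 needs perl.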
conda install -c conda-forge -c bioconda -n WGS_analysis iqtree
conda activate WGS_analysis
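Before moving on, it can help to confirm that the tools used in the later steps are actually on your PATH. The snippet below is a minimal sketch; `check_tools` is a hypothetical helper name, and the tool list is simply taken from the commands that appear in this workshop.

```shell
# Report which of the given commands can be found on PATH.
# Returns 0 if all are present, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tool names taken from the commands used in steps 3 to 5.
check_tools iqtree vcf-to-tab perl || echo "Install the missing tools before continuing."
```

If a tool is reported as missing, check that the right environment is active before reinstalling anything.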
2. Create Your Working Directory and Link the Filtered VCF
Create a new directory for building the SNP phylogeny
mkdir workshop_SNP_PhyloConstruction
cd workshop_SNP_PhyloConstruction
Create a symlink in this directory that points to the VCF you filtered earlier
ln -s ~/workshop_WGS_upstream/VCF_filtering/Pf7_practice_chr1.filtered.snps.vcf ./
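A symlink is created even when its target does not exist, so it is worth checking that the link resolves and that the file contains variant records. The following is a small sketch under the assumption that the VCF is uncompressed; `count_variants` is a hypothetical helper name.

```shell
# Count variant records (non-header lines) in an uncompressed VCF.
count_variants() {
    grep -vc '^#' "$1"
}

vcf=Pf7_practice_chr1.filtered.snps.vcf
if [ -e "$vcf" ]; then
    # -e follows symlinks, so a dangling link ends up in the else branch.
    echo "variants: $(count_variants "$vcf")"
else
    echo "cannot read $vcf: check the symlink target"
fi
```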
3. Convert VCF to TAB Format
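Activate the environment that provides vcf-to-tab. The absolute path below is specific to the machine used in the workshop; if it does not exist on your system, run conda activate WGS_analysis instead.
conda activate /home/renzirui/micromamba/envs/WGS_analysis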
cat Pf7_practice_chr1.filtered.snps.vcf | vcf-to-tab > Pf7_practice_chr1.filtered.snps.vcf.tab
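To sanity-check the converted table, you can count the samples and sites it contains. This sketch assumes the usual vcf-to-tab layout: one header line starting with `#CHROM`, then `POS` and `REF`, followed by one tab-separated column per sample. `tab_sample_names` is a hypothetical helper name.

```shell
# Print one sample name per line from the header of a vcf-to-tab table (read from stdin).
# Assumes columns 1-3 are #CHROM, POS and REF, and the rest are samples.
tab_sample_names() {
    head -n 1 | cut -f 4- | tr '\t' '\n'
}

tab=Pf7_practice_chr1.filtered.snps.vcf.tab
if [ -s "$tab" ]; then
    echo "samples: $(tab_sample_names < "$tab" | wc -l)"
    echo "sites:   $(grep -vc '^#' "$tab")"
fi
```

The site count should match the number of variant records in the input VCF.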
4. Convert the TAB File to an SNP Alignment FASTA Using a Custom Perl Script
perl /home/renzirui/workshop_SNP_PhyloConstruction/vcf_tab_to_fasta_alignment.pl -i Pf7_practice_chr1.filtered.snps.vcf.tab > Pf7_practice_chr1.filtered.snps.afa
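The Perl script itself is not shown here, so the following is only an illustrative sketch of what a TAB-to-FASTA conversion involves, not the script's actual logic. It assumes the vcf-to-tab layout (columns `#CHROM`, `POS`, `REF`, then one genotype column per sample) and makes a simplifying choice that the real script may not share: homozygous or haploid single-base calls are kept, and everything else (heterozygous calls, missing data, indels) is written as `N`. `tab_to_fasta` is a hypothetical name.

```shell
# Turn a vcf-to-tab table (stdin) into a FASTA alignment with one sequence per sample.
# Simplification: only unambiguous single-base calls are kept; all other calls become N.
tab_to_fasta() {
    awk -F'\t' '
        NR == 1 {
            # Header line: sample names start at column 4.
            for (i = 4; i <= NF; i++) name[i] = $i
            n = NF
            next
        }
        {
            for (i = 4; i <= n; i++) {
                # Genotypes look like A/A, A|T, ./. or a single allele for haploid calls.
                split($i, allele, "[/|]")
                a = allele[1]
                b = (allele[2] == "" ? a : allele[2])
                if (a == b && a ~ /^[ACGT]$/) base = a; else base = "N"
                seq[i] = seq[i] base
            }
        }
        END {
            for (i = 4; i <= n; i++) printf ">%s\n%s\n", name[i], seq[i]
        }
    '
}

# Tiny made-up table to show the shape of the output.
printf '#CHROM\tPOS\tREF\ts1\ts2\nchr1\t10\tA\tA/A\tT/T\nchr1\t20\tC\tC/G\t./.\n' | tab_to_fasta
```

Each row of the table becomes one alignment column, so every sequence in the output has the same length as the number of SNP sites.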
5. Construct the Phylogenetic Tree
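We will use IQ-TREE to build the SNP phylogenetic tree. The options are: -s, the input alignment; -m MFP, which runs ModelFinder to choose a substitution model before inferring the tree; -bb 1000, which performs 1000 ultrafast bootstrap replicates; -nt 4, which uses four CPU threads; and -pre, the prefix for all output files. Create the output directory first, because the prefix points inside it.
mkdir -p Pf7_SNP_Phylo
iqtree -s Pf7_practice_chr1.filtered.snps.afa -m MFP -bb 1000 -nt 4 -pre Pf7_SNP_Phylo/tree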
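When the run finishes, IQ-TREE writes its results next to the given prefix: a report (`.iqtree`), the maximum-likelihood tree in Newick format (`.treefile`), a run log (`.log`) and, because ultrafast bootstrap was requested, a consensus tree (`.contree`). The sketch below checks that these files exist and are not empty; `check_iqtree_outputs` is a hypothetical helper name.

```shell
# Check that the main IQ-TREE output files exist and are non-empty for a given prefix.
# Returns 0 if all are present, 1 otherwise.
check_iqtree_outputs() {
    prefix=$1
    rc=0
    for ext in iqtree treefile contree log; do
        if [ -s "$prefix.$ext" ]; then
            echo "ok: $prefix.$ext"
        else
            echo "missing or empty: $prefix.$ext"
            rc=1
        fi
    done
    return "$rc"
}

check_iqtree_outputs Pf7_SNP_Phylo/tree || echo "Some outputs are missing; inspect Pf7_SNP_Phylo/tree.log if it exists."
```

The `.treefile` or `.contree` file can then be opened in any Newick-compatible tree viewer.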