4. Generation of ST117 snippy phylogenomic Tree - maxlcummins/APEC-MGEN-2018 GitHub Wiki
Clonal complex 117 SNP trees were generated using snippy, gubbins and FastTree2. The reference used was an ST117 strain named 2009C-3133 (Accession Number: CP013025.1).
Note: Thanks to tseemann for providing the basis of this workflow in his snippy README.
The workflow consisted of the following steps:
1. Snippy
Snippy was run on default settings with 2009C-3133 used as a reference. A qsub script is available here.
qsub -v READ_PATH=BigChook_Reads/reads/BigChook_CC117,PREFIX=2009C_snippy,REF=2009C-3133_ST117_Ref.fasta ~/qsub/snippy.qsub
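The qsub script itself is not reproduced on this page. As a rough sketch of the per-sample work such a job performs, the function below prints (without running) one snippy command per paired-end sample. The function name `print_snippy_cmds`, the `_R1.fastq.gz`/`_R2.fastq.gz` naming scheme and the default of 8 CPUs are assumptions for illustration, not taken from the author's script. The `.out` suffix on each output directory is chosen so that the `*.out` glob passed to snippy-core in the next step would match.

```shell
#!/usr/bin/env bash
# Dry run: print one snippy command per paired-end sample found in a directory.
# Assumed (hypothetical) read naming: <sample>_R1.fastq.gz and <sample>_R2.fastq.gz
print_snippy_cmds() {
    local read_dir="$1" ref="$2" cpus="${3:-8}"
    local r1 r2 sample
    for r1 in "$read_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue              # glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: no matching R2 file" >&2
            continue
        fi
        sample="$(basename "$r1" _R1.fastq.gz)"
        # One output directory per sample, named <sample>.out
        echo "snippy --cpus $cpus --outdir $read_dir/$sample.out --ref $ref --R1 $r1 --R2 $r2"
    done
}
```

Calling `print_snippy_cmds BigChook_Reads/reads/BigChook_CC117 2009C-3133_ST117_Ref.fasta` lists the commands for review; piping that output to `bash` would execute them.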
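2. Snippy-core
Snippy-core and snippy-clean_full_aln were used to create a cleaned full-genome alignment, again with 2009C-3133 used as a reference. Core SNP sites are extracted from this alignment after recombination filtering (steps 3 and 4).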
cd BigChook_Reads/reads/BigChook_CC117
snippy-core --ref ~/2009C-3133_ST117_Ref.fasta *.out
snippy-clean_full_aln core.full.aln > clean.full.aln
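Before passing clean.full.aln to the next step, it can be worth confirming that the file is a well-formed alignment, meaning every sequence has the same length. The sketch below uses only awk. The function name `check_alignment` is hypothetical and is not part of snippy.

```shell
#!/usr/bin/env bash
# Report sequence count and alignment width; fail if sequence lengths differ.
check_alignment() {
    awk '
        # Record the length of the sequence that has just ended.
        function flush() { if (n > 0) lens[seq_len]++ }
        /^>/ { flush(); n++; seq_len = 0; next }
        { gsub(/[ \t\r]/, ""); seq_len += length($0) }
        END {
            flush()
            k = 0
            for (l in lens) { k++; last = l }
            if (n == 0 || k != 1) {
                print "alignment check failed: " n + 0 " sequences, " k " distinct lengths" > "/dev/stderr"
                exit 1
            }
            print n " sequences, " last " columns"
        }
    ' "$1"
}
```

Running `check_alignment clean.full.aln` prints the number of sequences and columns, and returns a non-zero exit status if the sequences are not all the same length.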
3. Recombination filtering
Gubbins was used for recombination filtering. Apart from enabling verbose output (--verbose), disabling clean-up of intermediate files (--no_cleanup) and setting an output prefix (-p), gubbins was run on default settings. Note that gubbins is memory intensive and, depending on your computing power, may need to be submitted as a job in order to complete.
python ~/miniconda3/envs/gubbins/bin/run_gubbins.py -p gubbins --verbose --no_cleanup ~/BigChook_Reads/reads/BigChook_CC117/clean.full.aln
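Depending on its settings, gubbins may exclude some sequences from its output, for example taxa with a high proportion of missing data. A simple way to see whether any samples were dropped is to compare the sequence names in the input alignment with those in the filtered output. The helper names `fasta_names` and `missing_taxa` below are hypothetical, and the sketch assumes sequence names contain no whitespace.

```shell
#!/usr/bin/env bash
# Print the sorted, unique sequence names in a FASTA file.
fasta_names() {
    # Header text up to the first whitespace, without the leading ">".
    awk '/^>/ { sub(/^>/, ""); print $1 }' "$1" | sort -u
}

# Print names present in the first FASTA file but absent from the second.
missing_taxa() {
    local before after
    before="$(mktemp)"
    after="$(mktemp)"
    fasta_names "$1" > "$before"
    fasta_names "$2" > "$after"
    comm -23 "$before" "$after"    # lines unique to the first list
    rm -f "$before" "$after"
}
```

For this workflow the call would be `missing_taxa clean.full.aln gubbins.filtered_polymorphic_sites.fasta`; empty output means every input sample is still present.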
4. Tree building
The SNP sites were identified from the output of gubbins, as below, before using FastTree to generate the tree:
snp-sites -c gubbins.filtered_polymorphic_sites.fasta > clean.core.aln
FastTree -nt -gtr < clean.core.aln > CC117_clean_noout.tree
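As a quick sanity check on the result, the number of tips in the tree should equal the number of sequences in the alignment it was built from. The sketch below relies on the fact that a Newick tree has one more tip than it has commas, which holds as long as no label contains a comma. The function names `count_tips` and `count_seqs` are hypothetical.

```shell
#!/usr/bin/env bash
# Count tips in a Newick file: tips = commas + 1 (assumes no commas in labels).
# Note that an empty file would be reported as 1 tip.
count_tips() {
    local commas
    commas="$(tr -cd ',' < "$1" | wc -c | tr -d ' ')"
    echo $((commas + 1))
}

# Count sequences in a FASTA alignment by counting header lines.
count_seqs() {
    grep -c '^>' "$1"
}
```

Comparing `count_tips CC117_clean_noout.tree` with `count_seqs clean.core.aln` should give the same number for both.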
5. Tree visualisation
This SNP tree was visualised and annotated using ggtree, as described in Part 3 of this wiki, although a different script, available here, was used. All supporting data is available in the data folder of this GitHub repository.