Genome Assessment - Lavadav/EPP531_AGA GitHub Wiki

6. BUSCO Analysis

BUSCO Slides

mkdir Busco
cd Busco

Symbolically link the FASTA file into the current directory

ln -s <path to file.fasta> .

Check that BUSCO is available through Singularity

singularity exec -B $PWD /sphinx_local/images/ezlabgva-busco-v5.6.1_cv1.img busco --help

Run BUSCO

singularity exec -B $PWD /sphinx_local/images/ezlabgva-busco-v5.6.1_cv1.img busco -i Genome.fasta -m genome -l embryophyta -c 5 -o busco_results
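When the run finishes, BUSCO reports its scores as a one-line string (complete, single-copy, duplicated, fragmented, missing, and the number of BUSCOs searched). To compare runs, for example before and after organelle removal, it can help to pull those numbers out programmatically. The following is a minimal sketch, assuming the score line has the form shown in the comment; the function name `parse_busco_summary` and the example line are made up for illustration and are not real results.

```python
import re

# Pattern for the one-line BUSCO score string, assumed to look like:
# C:95.0%[S:90.0%,D:5.0%],F:2.0%,M:3.0%,n:1614
BUSCO_LINE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(text):
    """Return BUSCO percentages and the total BUSCO count, or None if absent."""
    match = BUSCO_LINE.search(text)
    if match is None:
        return None
    # Percentages become floats; the total number of BUSCOs (n) is an integer
    scores = {
        key: float(value)
        for key, value in match.groupdict().items()
        if key != "n"
    }
    scores["n"] = int(match.group("n"))
    return scores


# Made-up example line for illustration only (hypothetical values)
example = "\tC:95.0%[S:90.0%,D:5.0%],F:2.0%,M:3.0%,n:1614"
print(parse_busco_summary(example))
```

In practice the text would come from the summary file that BUSCO writes inside the `busco_results` directory; the exact file name depends on the lineage and output name, so it is left out here.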

7. Remove mitochondrial and chloroplast genomes from the assembly

mkdir MT_CP_removal
cd MT_CP_removal

spack load minimap2

Map your assembly to the chloroplast genome

minimap2 -t 5 -x asm5 Chloroplast_genome.fasta assembly.fasta > Alignment.paf

Copy the following Python scripts to the current directory

cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Busco/find_scaffolds_by_paf_coverage.py .
cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Busco/remove_contigs_by_name.py .

Find the list of mapping scaffolds

python3 find_scaffolds_by_paf_coverage.py Alignment.paf > Alignment_list.txt
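The contents of `find_scaffolds_by_paf_coverage.py` are not shown on this page, so the following is only a sketch of how such a filter could work, not the course script itself. In the minimap2 command above the assembly is the query, so PAF columns 1 to 4 (query name, query length, query start, query end) describe the assembly scaffolds. The sketch merges overlapping alignments per scaffold and reports scaffolds whose aligned fraction reaches a threshold. The function names and the `min_fraction=0.5` cutoff are assumptions chosen for illustration.

```python
from collections import defaultdict


def covered_fraction(intervals, length):
    """Fraction of a sequence covered by the union of half-open intervals."""
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip the part already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered / length if length else 0.0


def scaffolds_by_paf_coverage(paf_lines, min_fraction=0.5):
    """Return sorted query names whose aligned fraction is >= min_fraction.

    min_fraction is a hypothetical cutoff, not taken from the course script.
    """
    lengths = {}
    intervals = defaultdict(list)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # a PAF record has at least 12 mandatory columns
        name = fields[0]
        lengths[name] = int(fields[1])
        intervals[name].append((int(fields[2]), int(fields[3])))
    return sorted(
        name
        for name in lengths
        if covered_fraction(intervals[name], lengths[name]) >= min_fraction
    )


# Made-up PAF records for illustration (scaffold and target names are hypothetical)
paf = [
    "scaffold_1\t1000\t0\t600\t+\tcp\t150000\t0\t600\t600\t600\t60",
    "scaffold_1\t1000\t500\t900\t+\tcp\t150000\t700\t1100\t400\t400\t60",
    "scaffold_2\t10000\t0\t500\t-\tcp\t150000\t0\t500\t500\t500\t60",
]
print(scaffolds_by_paf_coverage(paf))
```

Filtering on the covered fraction rather than on the mere presence of a hit is one way to avoid discarding long nuclear scaffolds that carry only a short organelle-derived insertion.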

Remove the mapping contigs

python3 remove_contigs_by_name.py Alignment_list.txt assembly.fasta
mv assembly.fasta assembly-CP_filtered.fasta
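For readers who want to see what removing contigs by name involves, here is a minimal sketch. It is not the `remove_contigs_by_name.py` script used above, whose behavior is not documented on this page. It assumes the list file holds one sequence name per line and that a FASTA record ID is the header text up to the first whitespace; the function names `read_names` and `filter_fasta` are hypothetical.

```python
def read_names(lines):
    """Collect non-empty names, one per line, into a set."""
    return {line.strip() for line in lines if line.strip()}


def filter_fasta(fasta_lines, names_to_remove):
    """Yield FASTA lines, skipping records whose ID is in names_to_remove."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            # The record ID is the header text up to the first whitespace
            header = line[1:].split()
            keep = not header or header[0] not in names_to_remove
        if keep:
            yield line


# Made-up in-memory example (sequence names are hypothetical)
fasta = [">scaffold_1 len=4\n", "ACGT\n", ">scaffold_2\n", "GGCC\n"]
names = read_names(["scaffold_1\n", "\n"])
print("".join(filter_fasta(fasta, names)), end="")

# With real files, the same functions could be used like this:
# with open("Alignment_list.txt") as listed, open("assembly.fasta") as src, \
#         open("assembly-CP_filtered.fasta", "w") as out:
#     out.writelines(filter_fasta(src, read_names(listed)))
```

Writing the kept records to a new file, as in the commented usage, leaves the original assembly untouched so the filtering can be repeated with a different list.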

8. Use the filtered FASTA file to remove mitochondrial sequences in the same way, then rerun BUSCO.