M3 Lab1 - areed37/Andrew_Reed_EPP_531 GitHub Wiki

Genome Annotation

Needs dependencies. It took Lav about 2 days to install everything, so be prepared. Repbase is provided by the university.

The database uses NCBI, Repbase, and your own file. RepeatMasker combines these databases into one.

#linking input data
ln -s /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/Redbud_Genome_Hap2.fasta .
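
A quick sanity check on the linked input can save a wasted run. This is a minimal sketch, not part of the original lab steps; `count_fasta_records` is a hypothetical helper name, and it only assumes the genome is a plain (uncompressed) FASTA file.

```shell
# Hypothetical helper (not from the lab): count FASTA records in a file.
count_fasta_records() {
    # Every record starts with a ">" header line.
    grep -c '^>' "$1"
}

genome=Redbud_Genome_Hap2.fasta
# -s follows the symlink and is true only if the target exists and is non-empty.
if [ -s "$genome" ]; then
    echo "$genome: $(count_fasta_records "$genome") sequences"
else
    echo "$genome is missing, empty, or a broken symlink" >&2
fi
```

If the link is broken (for example a typo in the source path), this reports it before BuildDatabase is run.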

# Load the right Perl
spack load /ajwoixl
#building the database
/pickett_shared/software/RepeatModeler-2.0.3/BuildDatabase -name Redbud -engine ncbi Redbud_Genome_Hap2.fasta

#running RepeatModeler
# -pa sets how many jobs run in parallel (this number is multiplied by 4, so -pa 3 uses about 12 cores)
/pickett_shared/software/RepeatModeler-2.0.3/RepeatModeler \
-pa 3 \
-engine ncbi \
-database Redbud 2>&1 | tee 00_Redbud_repeatmodeler.log
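
Once RepeatModeler finishes, it can help to see what kinds of families it found. The sketch below is an illustration, not part of the original lab. It assumes the output is named `Redbud-families.fa` (after the `-name` given to BuildDatabase) and that headers look like `>rnd-1_family-1#LTR/Gypsy ...`, with the class after the `#`. `tally_repeat_classes` is a hypothetical helper name.

```shell
# Hypothetical helper: tally repeat families by the class after "#" in each header.
# Assumes headers of the form ">name#Class/Subclass optional description".
tally_repeat_classes() {
    grep '^>' "$1" \
        | sed -e 's/^[^#]*#//' -e 's/[[:space:]].*$//' \
        | sort | uniq -c | sort -k1,1nr -k2
}

families=Redbud-families.fa
if [ -s "$families" ]; then
    tally_repeat_classes "$families"
fi
```

Each output line is a count followed by a class, largest first. A header with no `#` in it would show up under its own name rather than a class.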

#merge all of our repeat libraries into one
cat /pickett_shared/software/RepeatMasker/Libraries/eudicotyledons-rm.fa /pickett_shared/software/RepeatMasker/Libraries/RMRB.fasta Redbud-families.fa > Redbud_totalRepeatLib.fa
#check output
grep -c ">" Redbud_totalRepeatLib.fa
#50962
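
The total alone does not show whether every input library made it in intact. As a hedged sketch (not part of the original lab), the merged count can be compared against the sum of the inputs. `count_headers` and `check_merged_count` are hypothetical helper names.

```shell
# Hypothetical helpers: compare the merged header count to the sum of the inputs.
count_headers() {
    grep -c '^>' "$1"
}

check_merged_count() {
    merged=$1
    shift
    expected=0
    for lib in "$@"; do
        expected=$((expected + $(count_headers "$lib")))
    done
    actual=$(count_headers "$merged")
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual sequences"
    else
        echo "MISMATCH: merged has $actual, inputs sum to $expected" >&2
        return 1
    fi
}

# Example call with the files from these notes:
# check_merged_count Redbud_totalRepeatLib.fa \
#     /pickett_shared/software/RepeatMasker/Libraries/eudicotyledons-rm.fa \
#     /pickett_shared/software/RepeatMasker/Libraries/RMRB.fasta \
#     Redbud-families.fa
```

One case this catches: if an input file does not end with a newline, `cat` glues its last sequence line onto the next file's first header, and that header no longer starts a line.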

#Mask our genome
/pickett_shared/software/RepeatMasker/RepeatMasker \
-lib Redbud_totalRepeatLib.fa \
-e rmblast \
-pa 3 \
-nolow \
-xsmall \
-gff \
Redbud_Genome_Hap2.fasta \
> Redbud_1.0.0_RMasker.out 2>&1
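
With `-xsmall`, repeats are soft-masked (written in lowercase) rather than replaced with N. A rough way to check how much of the genome was masked is to count lowercase bases. This is a minimal sketch, not part of the original lab. It assumes RepeatMasker wrote the masked sequence to `Redbud_Genome_Hap2.fasta.masked` next to the input, and `masked_fraction` is a hypothetical helper name.

```shell
# Hypothetical helper: percent of bases that are lowercase (soft-masked) in a FASTA file.
masked_fraction() {
    LC_ALL=C awk '
        !/^>/ {
            total += length($0)         # all bases on this sequence line
            lower += gsub(/[a-z]/, "")  # gsub returns how many lowercase letters it removed
        }
        END {
            if (total > 0) printf "%.2f\n", 100 * lower / total
            else print "0.00"
        }' "$1"
}

masked=Redbud_Genome_Hap2.fasta.masked
if [ -s "$masked" ]; then
    echo "soft-masked: $(masked_fraction "$masked")%"
fi
```

RepeatMasker's own summary table should report a similar number, so this is mostly useful as a cross-check. The result will be inflated if the input genome already contained lowercase bases.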