GATK - erinvanberkel/EPP622-Test-2 GitHub Wiki
Making a new directory for GATK
mkdir 4_gatk
cd 4_gatk
Linking the RG BAM and the BAM index from the previous page
ln -s $(readlink -e ../3_bwa/*_sorted.RG.*) ./
Haplotype Caller
/pickett_shared/software/gatk-4.2.6.1/gatk \
--java-options "-Xmx4G" \
HaplotypeCaller \
-R solenopsis_invicta_genome.fa.gz \
-I SRR6922236_1_sorted.RG.bam \
-O SRR6922236_1_NC_052664.1.vcf \
-bamout SRR6922236_1_sorted_NC_052664.1.RG.realigned.bam \
-L NC_052664.1
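This calls variants for one sample at a time, restricted by -L to chromosome NC_052664.1. Run the same command for each sample, changing the sample prefix in the -I, -O, and -bamout file names (the checks below use SRR6922141_1).
To check the output, print the column header line and the first few variant records that follow the ## header lines.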
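Since the same command is repeated per sample, it can be wrapped in a loop. The following is a minimal sketch, assuming every input BAM in the current directory is named `<sample>_sorted.RG.bam` as above. The `sample_prefix` helper and the `RUN` switch are hypothetical names introduced here for illustration: `RUN` defaults to `echo`, so the loop only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: run HaplotypeCaller once per sample on a single chromosome.
GATK=/pickett_shared/software/gatk-4.2.6.1/gatk
REF=solenopsis_invicta_genome.fa.gz
CHROM=NC_052664.1

# Hypothetical helper: derive the sample prefix from a BAM name,
# e.g. SRR6922236_1_sorted.RG.bam -> SRR6922236_1
sample_prefix() {
    basename "$1" _sorted.RG.bam
}

# Hypothetical switch: defaults to "echo" (preview only).
# Set RUN to an empty string to actually execute the commands.
RUN=${RUN-echo}

for bam in *_sorted.RG.bam; do
    [ -e "$bam" ] || continue   # skip if the glob matched nothing
    s=$(sample_prefix "$bam")
    $RUN "$GATK" --java-options "-Xmx4G" HaplotypeCaller \
        -R "$REF" \
        -I "$bam" \
        -O "${s}_${CHROM}.vcf" \
        -bamout "${s}_sorted_${CHROM}.RG.realigned.bam" \
        -L "$CHROM"
done
```

Saved as a script (for example under the hypothetical name `run_hc.sh`), `bash run_hc.sh` previews the commands and `RUN= bash run_hc.sh` executes them.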
grep -v '^##' SRR6922141_1_NC_052664.1.vcf | head -n 5
Count the number of SNPs and indels by selecting each variant type with bcftools and counting the non-header lines that remain.
spack load bcftools
bcftools view -v snps SRR6922141_1_NC_052664.1.vcf | grep -v "^#" | wc -l
bcftools view -v indels SRR6922141_1_NC_052664.1.vcf | grep -v "^#" | wc -l
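As a rough cross-check that does not depend on bcftools, the two classes can be approximated by comparing REF and ALT allele lengths with awk. This is only a sketch under a simplified definition (a SNP is a one-base REF with a one-base ALT, an indel is any ALT whose length differs from REF), so its counts may differ from those of bcftools at complex sites. The function name `count_variants` is hypothetical.

```shell
# count_variants FILE: print "snps=N indels=M" for an uncompressed VCF.
# A site with both a SNP allele and an indel allele is counted in both totals.
count_variants() {
    awk -F'\t' '
        /^#/ { next }                       # skip all header lines
        {
            n = split($5, alt, ",")         # ALT may list several alleles
            is_snp = 0; is_indel = 0
            for (i = 1; i <= n; i++) {
                # ignore spanning deletions (*) and symbolic alleles (<...>)
                if (alt[i] == "*" || alt[i] ~ /^</) continue
                if (length(alt[i]) != length($4)) is_indel = 1
                else if (length($4) == 1)         is_snp = 1
            }
            snps += is_snp; indels += is_indel
        }
        END { printf "snps=%d indels=%d\n", snps, indels }
    ' "$1"
}
```

For example, `count_variants SRR6922141_1_NC_052664.1.vcf` prints both totals on one line.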
Download files to your computer for IGV visualization. Run these from your local machine, replacing user@host with your own username and server.
scp user@host:/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk/solenopsis_invicta_genome.fa .
scp user@host:/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk/solenopsis_invicta_genome.fa.fai .
scp user@host:/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk/SRR6922141_1_sorted.RG.bam .
scp user@host:/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk/SRR6922141_1_sorted.RG.bam.bai .
scp user@host:/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk/SRR6922141_1_sorted_NC_052664.1.RG.realigned.bam .
scp user@host:/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk/SRR6922141_1_sorted_NC_052664.1.RG.realigned.bai .
scp user@host:/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk/SRR6922141_1_NC_052664.1.vcf .
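The seven downloads differ only in the file name, so they can be generated from the sample and chromosome names. The following is a sketch to run on your local machine. `REMOTE` is a hypothetical placeholder for your own login and server, and `igv_files` and `RUN` are illustrative names rather than part of any tool. `RUN` defaults to `echo`, so the loop only prints the scp commands.

```shell
# Hypothetical placeholder: set REMOTE to your own login, e.g. REMOTE=user@host
REMOTE=${REMOTE:-user@host}
REMOTE_DIR=/pickett_sphinx/teaching/EPP622_2024/test2/analysis/evanberk/4_gatk
SAMPLE=SRR6922141_1
CHROM=NC_052664.1

# igv_files SAMPLE CHROM: list the files IGV needs, one per line
igv_files() {
    local s=$1 c=$2
    printf '%s\n' \
        solenopsis_invicta_genome.fa \
        solenopsis_invicta_genome.fa.fai \
        "${s}_sorted.RG.bam" \
        "${s}_sorted.RG.bam.bai" \
        "${s}_sorted_${c}.RG.realigned.bam" \
        "${s}_sorted_${c}.RG.realigned.bai" \
        "${s}_${c}.vcf"
}

# Hypothetical switch: defaults to "echo" (preview only).
# Set RUN to an empty string to actually copy the files.
RUN=${RUN-echo}

for f in $(igv_files "$SAMPLE" "$CHROM"); do
    $RUN scp "${REMOTE}:${REMOTE_DIR}/${f}" .
done
```

Changing `SAMPLE` downloads the matching files for another sample without editing each line.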