Assembly Validation - fbemm/onefc-oneasm GitHub Wiki

Software Packages

bwa - short read mapper

samtools - short read mapping manipulation

freebayes - variant caller

vcflib - parsing and manipulating VCF files

Variant calling (bwa, samtools, freebayes)

Genome Indexing (bwa)

bwa index ONTmin_IT3.fasta

Read Mapping (bwa, samtools)

bwa mem -t 4 ONTmin_IT3.fasta il_trimmed_1.fastq il_trimmed_2.fastq | samtools view -bS | samtools sort -T tmp_ONTmin_IT3.fasta -@ 4 > ONTmin_IT3.fasta.bam

Index Mapping (samtools)

samtools index ONTmin_IT3.fasta.bam

Variant Calling (freebayes)

freebayes -C 1 -0 -O -q 20 -z 0.10 -E 0 -X -u -p 2 -b ONTmin_IT3.fasta.bam -v ONTmin_IT3.fasta.vcf -f ONTmin_IT3.fasta

Quality Calculation

SNP Extraction & Counting

vcfstats ONTmin_IT3.fasta.vcf > ONTmin_IT3.fasta.vcf.snps

CNT=$(grep -v "^#" ONTmin_IT3.fasta.vcf.snps | wc -l)
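As a cross-check on the count above, SNP records can also be tallied directly from the VCF. The sketch below is an alternative under stated assumptions, not the method used on this page: it treats a record as a SNP when REF and every comma-separated ALT allele are a single base, and the function name `count_snps` is hypothetical.

```shell
# count_snps (hypothetical helper): count VCF records whose REF and all
# ALT alleles are single bases, i.e. simple SNP sites.
count_snps() {
  awk -F'\t' '
    /^#/ { next }                      # skip header lines
    {
      if (length($4) != 1) next        # REF must be exactly one base
      n = split($5, alt, ",")          # a site may list several ALT alleles
      ok = 1
      for (i = 1; i <= n; i++)
        if (length(alt[i]) != 1 || alt[i] == "*" || alt[i] == ".") ok = 0
      if (ok) c++
    }
    END { print c + 0 }                # print 0 when nothing matched
  ' "$1"
}

# Usage with the file name from this page:
# CNT=$(count_snps ONTmin_IT3.fasta.vcf)
```

Sites mixing a SNP allele with an indel allele are skipped by this rule; whether they should count is a design choice left open here.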

Effective Genome Size Estimation

samtools stats ONTmin_IT3.fasta.bam > ONTmin_IT3.fasta.bam.stats

BG=$(grep "^COV" ONTmin_IT3.fasta.bam.stats | cut -f4 | paste -sd + | bc)
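The `COV` lines of `samtools stats` form a coverage histogram in which the fourth tab-separated column is the number of positions observed at a given depth, so their sum approximates the number of assembly bases covered by reads. A minimal sketch of the same sum using only awk follows; the function name `covered_bases` is hypothetical.

```shell
# covered_bases (hypothetical helper): sum column 4 of the COV histogram
# lines in a `samtools stats` report.
covered_bases() {
  # %.0f rather than %d so large genomes print correctly in every awk
  awk -F'\t' '$1 == "COV" { sum += $4 } END { printf "%.0f\n", sum }' "$1"
}

# Usage with the file name from this page:
# BG=$(covered_bases ONTmin_IT3.fasta.bam.stats)
```

This avoids the `paste` and `bc` steps and prints 0 instead of an empty string when the report contains no `COV` lines.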

Calculate SNP-based Assembly Quality

echo "-10*(l($CNT/$BG)/l(10))" | bc -l
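The expression above is a Phred-style score, QV = -10 * log10(CNT / BG), so one SNP per 10,000 covered bases corresponds to QV 40. A small sketch wrapping it in a function follows; the name `assembly_qv` and the `NA` fallback are assumptions added here, guarding the case of zero SNPs where the logarithm is undefined.

```shell
# assembly_qv (hypothetical helper): Phred-style quality from a SNP count
# and the number of covered bases.
assembly_qv() {
  local cnt="$1" bg="$2"
  awk -v cnt="$cnt" -v bg="$bg" 'BEGIN {
    # log() is undefined for zero or negative input; report NA instead
    if (cnt <= 0 || bg <= 0) { print "NA"; exit }
    # awk log() is the natural log, so divide by log(10) to get log10
    printf "%.2f\n", -10 * log(cnt / bg) / log(10)
  }'
}

# Usage with the variables computed on this page:
# assembly_qv "$CNT" "$BG"
```

For example, `assembly_qv 100 1000000` prints `40.00`.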