Assembly Generation - fbemm/onefc-oneasm GitHub Wiki

Software Packages

miniasm - long read assembler

minimap - long read mapper

racon - long read consensus caller

pilon - short read consensus caller

Genome Assembly

Raw read overlap detection (minimap)

minimap -Sw5 -L100 -m0 -t8 ont.fastq ont.fastq | gzip -1 > ont.paf.gz

Real time: 21790.387 sec; CPU: 59044.524 sec

OLC-based de novo assembly (miniasm)

miniasm -f ont.fastq ont.paf.gz > ONTmin.gfa

Real time: 63.362 sec; CPU: 62.930 sec

GFA-to-Fasta conversion

awk '/^S/{print ">"$2"\n"$3}' ONTmin.gfa | fold > ONTmin_IT0.fasta
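To illustrate what the conversion does, here is a minimal sketch on a toy GFA file. It assumes the usual GFA layout where each segment (S) line carries the segment name in column 2 and its sequence in column 3; the file names toy.gfa and toy.fasta and the segment names are made up for the example.

```shell
# Toy GFA: a header line, two segment (S) lines and one link (L) line, tab-separated.
printf 'H\tVN:Z:1.0\n' > toy.gfa
printf 'S\tutg000001l\tACGTACGTAC\n' >> toy.gfa
printf 'S\tutg000002l\tGGGTTTAAAC\n' >> toy.gfa
printf 'L\tutg000001l\t+\tutg000002l\t+\t4M\n' >> toy.gfa

# For every S line, print ">name" followed by the sequence on the next line.
# fold wraps long sequence lines (80 columns by default); H and L lines are skipped.
awk '/^S/{print ">"$2"\n"$3}' toy.gfa | fold > toy.fasta

cat toy.fasta
```

Only the S lines end up in the FASTA file, one record per unitig; the link information in the L lines is dropped.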

Long Read Polishing

Long read remapping - Iteration 1 (minimap)

minimap ONTmin_IT0.fasta ont.fastq > ONTmin_IT0.paf

Real time: 68.842 sec; CPU: 208.186 sec

Long read consensus call - Iteration 1 (racon)

racon -t 4 ont.fastq ONTmin_IT0.paf ONTmin_IT0.fasta ONTmin_IT1.fasta

Real time: 16769 sec

Long read remapping - Iteration 2 (minimap)

minimap ONTmin_IT1.fasta ont.fastq > ONTmin_IT1.paf

Real time: 68.370 sec; CPU: 206.952 sec

Long read consensus call - Iteration 2 (racon)

racon -t 4 ont.fastq ONTmin_IT1.paf ONTmin_IT1.fasta ONTmin_IT2.fasta

Real time: 16279 sec

Long read remapping - Iteration 3 (minimap)

minimap ONTmin_IT2.fasta ont.fastq > ONTmin_IT2.paf

Real time: 65.045 sec; CPU: 198.059 sec

Long read consensus call - Iteration 3 (racon)

racon -t 4 ont.fastq ONTmin_IT2.paf ONTmin_IT2.fasta ONTmin_IT3.fasta

Real time: 15444 sec
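The three remap and consensus rounds follow the same pattern, so they can be written as a loop. The sketch below only prints the commands rather than running them; the function name polish_rounds and the READS and ROUNDS variables are hypothetical helpers introduced here, not part of the original workflow.

```shell
#!/bin/sh
# Sketch: print the minimap + racon commands for each polishing round.
READS=ont.fastq   # long reads used for every round
ROUNDS=3          # number of polishing iterations

polish_rounds() {
    i=0
    while [ "$i" -lt "$ROUNDS" ]; do
        next=$((i + 1))
        # Remap the reads to the current assembly (ONTmin_IT<i>.fasta).
        echo "minimap ONTmin_IT${i}.fasta ${READS} > ONTmin_IT${i}.paf"
        # Call a new consensus, written to ONTmin_IT<i+1>.fasta.
        echo "racon -t 4 ${READS} ONTmin_IT${i}.paf ONTmin_IT${i}.fasta ONTmin_IT${next}.fasta"
        i=$next
    done
}

polish_rounds
```

Piping the output into sh (polish_rounds | sh) would execute the rounds in order, assuming minimap and racon are on the PATH and ONTmin_IT0.fasta exists.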

Short Read Polishing

Read trimming

perl prinseq-lite.pl -fastq il_P1.fq -fastq2 il_P2.fq -min_len 100 -trim_qual_right 38 -min_qual_mean 38 -out_good il_trimmed
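Before mapping, it can be worth checking that the two trimmed files still contain the same number of reads, since paired-end mapping expects mates in matching order. The sketch below assumes plain four-line FASTQ records; count_reads and check_pairs are hypothetical helper names introduced for this example.

```shell
# count_reads FILE: number of FASTQ records, assuming exactly 4 lines per record.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# check_pairs R1 R2: report both counts and succeed only if they are equal.
check_pairs() {
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    echo "$1: $n1 reads, $2: $n2 reads"
    [ "$n1" = "$n2" ]
}

# Example call (the files must exist):
#   check_pairs il_trimmed_1.fastq il_trimmed_2.fastq
```

A non-zero exit status from check_pairs indicates that the pair files have drifted apart and should be inspected before running bwa.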

BWA genome indexing & short read remapping

bwa index ONTmin_IT3.fasta

bwa mem -t 8 ONTmin_IT3.fasta il_trimmed_1.fastq il_trimmed_2.fastq | samtools view -@ 8 -bhS | samtools sort -@ 8 > ONTmin_IT3.bam
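Pilon expects its input BAM files to be coordinate-sorted and indexed, so an indexing step is likely needed before the consensus call. The following is a sketch under that assumption; the BAM variable and the guard around the call are additions for illustration and were not part of the recorded workflow.

```shell
BAM=ONTmin_IT3.bam

# Index the sorted BAM only if both the file and samtools are available.
if [ -f "$BAM" ] && command -v samtools >/dev/null 2>&1; then
    samtools index "$BAM"    # writes ONTmin_IT3.bam.bai next to the BAM
else
    echo "skipping index: $BAM or samtools not found" >&2
fi
```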

Short read consensus call - Iteration 1 (pilon)

java -Xmx16G -jar pilon-1.22.jar --genome ONTmin_IT3.fasta --frags ONTmin_IT3.bam --fix snps --output ONTmin_IT4