Assembly of Canu binned HiFi reads with hifiasm - USDA-ARS-GBRU/Pepper_TrioBinning GitHub Wiki
HDA149 TrioHifi code
hifiasm -o hifiasm_assembly/TrioHifi_HDA149 -t 32 /bins/haplotype/haplotype-HDA149.fasta.gz haplotype-unknown.fasta.gz
- We set the number of threads to 32 with
-t 32
- We wrote the output to the 'hifiasm_assembly/' directory and gave the files the 'TrioHifi_HDA149' prefix with
-o hifiasm_assembly/TrioHifi_HDA149
- Two input files, both generated through TrioCanu:
/bins/haplotype/haplotype-HDA149.fasta.gz
/bins/haplotype/haplotype-unknown.fasta.gz
Results
Five assembly graphs (.gfa) are generated by default:
- TrioHifi_HDA149.bp.hap1.p_ctg.gfa
- TrioHifi_HDA149.bp.hap2.p_ctg.gfa
- TrioHifi_HDA149.bp.p_ctg.gfa <- this is the one we carry forward to the FASTA conversion step
- TrioHifi_HDA149.bp.p_utg.gfa
- TrioHifi_HDA149.bp.r_utg.gfa
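Before moving on, it can help to confirm that all five graphs were actually written. The following is a minimal sketch, not part of the original workflow: the function names `expected_gfas` and `check_gfas` are hypothetical, and the suffixes are copied from the list above. It assumes paths without whitespace.

```shell
# Print the five GFA file names expected for a given hifiasm output prefix.
# (Hypothetical helper; suffixes are taken from the list above.)
expected_gfas() {
    prefix=$1
    for suffix in bp.hap1.p_ctg bp.hap2.p_ctg bp.p_ctg bp.p_utg bp.r_utg; do
        printf '%s.%s.gfa\n' "$prefix" "$suffix"
    done
}

# Report every expected file that is missing or empty.
# Returns non-zero if at least one file fails the check.
check_gfas() {
    status=0
    for f in $(expected_gfas "$1"); do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_gfas hifiasm_assembly/TrioHifi_HDA149` should print nothing and return 0 once the assembly has finished.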
HDA330 TrioHifi code
hifiasm -o hifiasm_assembly/TrioHifi_HDA330 -t 32 /bins/haplotype/haplotype-HDA330.fasta.gz haplotype-unknown.fasta.gz
It's the same command as for HDA149, but it uses the HDA330 binned reads and generates assemblies named with the 'TrioHifi_HDA330' prefix.
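Because the two runs differ only in the sample name, the command can be generated from a single template. This is a sketch under stated assumptions: the variable names and the `hifiasm_cmd` function are hypothetical, and the unknown-haplotype reads are assumed to sit in `/bins/haplotype/` alongside the binned reads, as in the input list above. It only prints the commands and does not run hifiasm.

```shell
# Locations and thread count taken from the commands above.
BIN_DIR=/bins/haplotype      # TrioCanu bins (assumed to also hold the unknown reads)
OUT_DIR=hifiasm_assembly     # hifiasm output directory
THREADS=32

# Print (do not execute) the hifiasm command for one sample.
hifiasm_cmd() {
    sample=$1
    printf 'hifiasm -o %s/TrioHifi_%s -t %s %s/haplotype-%s.fasta.gz %s/haplotype-unknown.fasta.gz\n' \
        "$OUT_DIR" "$sample" "$THREADS" "$BIN_DIR" "$sample" "$BIN_DIR"
}

# Dry run: show the command for each sample.
for sample in HDA149 HDA330; do
    hifiasm_cmd "$sample"
done
```

Printing the commands first makes it easy to spot a mismatched prefix or input file before committing 32 threads to a run. Once they look right, the output can be piped to `sh`.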
Convert the assembly graphs (.gfa) to fasta
awk '/^S/{print ">"$2;print $3}' TrioHifi_HDA149.bp.p_ctg.gfa > TrioHifi_HDA149.bp.p_ctg.fa
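The same conversion can be wrapped in a small function and paired with a sanity check. This is a minimal sketch with hypothetical function names. It matches on the first tab-separated field being exactly `S`, which is slightly stricter than the `/^S/` pattern above but should behave the same on hifiasm output.

```shell
# Convert GFA segment (S) lines to FASTA.
# Column 2 is the segment name, column 3 is the sequence.
gfa_to_fasta() {
    awk -F'\t' '$1 == "S" { print ">" $2; print $3 }' "$1"
}

# Count S lines in a GFA file.
count_segments() {
    awk -F'\t' '$1 == "S"' "$1" | wc -l | tr -d ' '
}

# Count FASTA records (header lines) in a FASTA file.
count_records() {
    grep -c '^>' "$1"
}
```

For example, `gfa_to_fasta TrioHifi_HDA330.bp.p_ctg.gfa > TrioHifi_HDA330.bp.p_ctg.fa` converts the HDA330 graph. Afterwards, `count_segments` on the .gfa and `count_records` on the .fa should report the same number.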