3. Assembly with hifiasm - USDA-ARS-GBRU/Pepper_TrioBinning GitHub Wiki

hifiasm was run in trio-binning mode to generate two haplotype-resolved assemblies of the F1 progeny, one for each parental haplotype (HDA149 and HDA330).

Step 1: Use yak (yet another k-mer counter) to get parental k-mers
Step 2: Use hifiasm in trio-binning mode to generate both haploid assemblies

Step 1: yak for parental k-mers

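Only the script for HDA149 is shown for brevity; the HDA330 k-mer file (HDA330_all.yak) used in Step 2 is produced the same way.
Input
fastp-trimmed parental short reads (paired-end, gzipped FASTQ).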
Output
HDA149_all.yak

#!/bin/bash
#SBATCH --job-name=yak
#SBATCH --ntasks-per-node=32
#SBATCH --output="%x_%j.o" # job standard output file (%j replaced by job id)
#SBATCH --error="%x_%j.e" # job standard error file (%j replaced by job id)

yak='/software/yak/yak'

# trimmed PE reads must be concatenated into one file for yak
cat \
    HDA149_BDPL200001952-1A_HJWJNDSXY_L3_1.fp.fq.gz \
    HDA149_BDPL200001952-1A_HJWJNDSXY_L3_2.fp.fq.gz \
    HDA149_BDPL200001952-1A_HJWKKDSXY_L2_1.fp.fq.gz \
    HDA149_BDPL200001952-1A_HJWKKDSXY_L2_2.fp.fq.gz \
    HDA149_BDPL200001952-1A_HJWKKDSXY_L3_1.fp.fq.gz \
    HDA149_BDPL200001952-1A_HJWKKDSXY_L3_2.fp.fq.gz \
    HDA149_BDPL200001952-1A_HJWKKDSXY_L4_1.fp.fq.gz \
    HDA149_BDPL200001952-1A_HJWKKDSXY_L4_2.fp.fq.gz \
    > HDA149_all.fp.fq.gz

# Generate the yak file
${yak} count -k31 -b37 -t32 -o HDA149_all.yak HDA149_all.fp.fq.gz
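
The same two steps apply to the second parent. The following is a minimal sketch, not the script that was actually run: `yak_count_cmd` is a hypothetical helper that only prints the commands so they can be reviewed before submission, and the file pattern `<sample>_*_L*_[12].fp.fq.gz` is an assumption based on the HDA149 file names above (the HDA330 file names are not listed on this page).

```shell
#!/bin/bash
# Hypothetical helper: print the cat + yak commands for one parent.
# Assumes trimmed reads are named <sample>_*_L*_[12].fp.fq.gz, as for HDA149;
# this naming is NOT confirmed for HDA330.
yak='/software/yak/yak'

yak_count_cmd() {
    local sample="$1"
    local threads="${2:-32}"
    # The pattern deliberately does not match <sample>_all.fp.fq.gz,
    # so a leftover output file is never fed back in as input.
    echo "cat ${sample}_*_L*_[12].fp.fq.gz > ${sample}_all.fp.fq.gz"
    # Same yak options as the HDA149 script: -k31 -b37
    echo "${yak} count -k31 -b37 -t${threads} -o ${sample}_all.yak ${sample}_all.fp.fq.gz"
}

# Print the commands for both parents (nothing is executed here).
for sample in HDA149 HDA330; do
    yak_count_cmd "${sample}"
done
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed lines can be pasted into a SLURM script like the one above once the file pattern has been checked against the real file names.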

Step 2: hifiasm for assemblies

Inputs
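-1 paternal (haplotype 1) yak file; HDA149_all.yak in the script below
-2 maternal (haplotype 2) yak file; HDA330_all.yak in the script below
Filtered HiFi reads of the F1 progeny. A wildcard (*) can be used to pass all HiFi FASTQ files; the files can be gzipped.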
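
Because the assembly job runs for days, it can be worth confirming that every input exists before submitting. The following is a minimal sketch; `check_inputs` is a hypothetical helper and not part of hifiasm or yak.

```shell
#!/bin/bash
# Hypothetical pre-flight check: succeeds only if every argument
# is an existing, non-empty file.
check_inputs() {
    local f missing=0
    for f in "$@"; do
        if [ ! -s "${f}" ]; then
            echo "missing or empty: ${f}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Example usage (file names taken from the scripts on this page):
# check_inputs HDA149_all.yak HDA330_all.yak m54334U*.filt.fastq.gz || exit 1
```

If the wildcard matches nothing, bash passes the literal pattern through by default, so the check reports it as missing instead of silently continuing.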

#!/bin/bash
#SBATCH --job-name=hifiasm
#SBATCH --nodes=1   # number of nodes
#SBATCH --ntasks-per-node=60
#SBATCH --mem=1400G   # Memory per node; use --mem-per-cpu= for memory per core
#SBATCH -t 6-10:00:00
#SBATCH --output="%x_%j.o" # job standard output file (%j replaced by job id)
#SBATCH --error="%x_%j.e" # job standard error file (%j replaced by job id)

hifiasm='/software/hifiasm/hifiasm'

mkdir -p hifiasm_yak_assembly
# Usage: hifiasm -o assembly_output_name -t 32 -1 pat.yak -2 mat.yak Filtered_HiFi_reads.fq.gz

${hifiasm} -o hifiasm_yak_assembly/hifiasm_yak.asm -t 32 -1 HDA149_all.yak -2 HDA330_all.yak m54334U*.filt.fastq.gz

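# Convert .gfa to .fa
# Trio-binning mode writes one contig graph per haplotype inside the output
# directory (hap1 for the -1 k-mer file, hap2 for the -2 k-mer file). The exact
# names depend on the hifiasm version (*.dip.hap1.p_ctg.gfa or *.hap1.p_ctg.gfa),
# hence the glob.
module load bbtools
for gfa in hifiasm_yak_assembly/hifiasm_yak.asm.*hap[12].p_ctg.gfa; do
    fa="${gfa%.gfa}.fa"
    awk '/^S/{print ">"$2;print $3}' "${gfa}" > "${fa}"
    # Check statistics with bbtools
    stats.sh -Xmx5g t=4 in="${fa}"
done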
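
To see what the awk one-liner does, it can be run on a tiny hand-made GFA. Segment (`S`) lines carry the contig name in column 2 and the sequence in column 3; every other record type is skipped. The contig names, sequences, and the abbreviated `A` line below are invented for illustration, and `gfa_to_fasta` is a hypothetical wrapper name.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk one-liner used above.
gfa_to_fasta() {
    awk '/^S/{print ">"$2;print $3}' "$1"
}

# Toy GFA: two segment (S) lines and one non-segment (A) line.
toy="$(mktemp)"
printf 'S\tctg1\tACGT\tLN:i:4\nA\tctg1\t0\t+\tread1\t0\t4\nS\tctg2\tGGCC\tLN:i:4\n' > "${toy}"

# Prints a two-record FASTA: >ctg1 / ACGT / >ctg2 / GGCC
gfa_to_fasta "${toy}"
rm -f "${toy}"
```

Only the `S` lines contribute to the FASTA, so the number of `>` headers in the output should equal the number of contigs in the graph.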