To output unaligned PacBio reads in FASTA and FASTQ - pb-cdunn/blasr GitHub Wiki
To output unaligned PacBio reads in FASTA format, use the blasr -unaligned option. Example:
$ blasr input.fofn ref.fasta -unaligned unaligned.fasta
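As a quick check on the result, the sketch below counts the records and total bases in the unaligned FASTA file. It uses only the Python standard library. The function name summarize_fasta is hypothetical and is not part of blasr or pbcore. It assumes an ordinary FASTA file in which every header line starts with ">".

```python
def summarize_fasta(path):
    """Return (number of records, total bases) for a FASTA file.

    Hypothetical helper, not part of blasr or pbcore. Assumes every
    header line starts with '>' and every other non-blank line is sequence.
    """
    n_records = 0
    n_bases = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                n_records += 1
            else:
                n_bases += len(line)
    return n_records, n_bases
```

For example, summarize_fasta("unaligned.fasta") returns a tuple such as (number of unaligned reads, total unaligned bases).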
blasr writes unaligned PacBio reads only in FASTA format. To get unaligned PacBio reads in FASTQ format, try the following Python script. You will need to install pbcore and have the SMRT Cells listed in input.fofn available.
#!/usr/bin/env python
from pbcore.io import BasH5Collection, FastaReader, FastqWriter
fofn = "input.fofn" # input fofn of SMRTCells
unaligned_fa = "unaligned.fasta" # input unaligned PacBio reads in fasta file
unaligned_fq = "unaligned.fastq" # output unaligned PacBio Reads in fastq file
# Scan over bas.h5 file in fofn
h5 = BasH5Collection(fofn)
# output fastq writer
fqwriter = FastqWriter(unaligned_fq)
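# iterate over unaligned reads
for read in FastaReader(unaligned_fa):
    # look up the subread in the bas.h5 collection by its read name
    subread = h5[read.header]
    # write to output fastq
    fqwriter.writeRecord(subread.readName, subread.basecalls(), subread.qv("QualityValue"))
# flush and close the output file
fqwriter.close()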
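To sanity-check the FASTQ output without pbcore, the sketch below reads the file and confirms that each record has a sequence and a quality string of equal length. The function name read_fastq_names is hypothetical. It assumes each record occupies exactly four lines (header, sequence, "+" line, qualities), which may not hold for FASTQ files from other tools that wrap long sequences.

```python
def read_fastq_names(path):
    """Return the read names in a FASTQ file, validating each record.

    Hypothetical helper, not part of pbcore. Assumes four-line records:
    '@name', sequence, '+...', quality string.
    Raises ValueError on a malformed record.
    """
    names = []
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            header = header.rstrip("\r\n")
            if not header:
                continue  # skip blank lines between records
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline().rstrip("\r\n")
            qual = handle.readline().rstrip("\r\n")
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError("malformed FASTQ record near %r" % header)
            if len(seq) != len(qual):
                raise ValueError("sequence/quality length mismatch in %r" % header)
            names.append(header[1:])
    return names
```

Comparing len(read_fastq_names("unaligned.fastq")) with the number of records in unaligned.fasta shows whether every unaligned read was converted.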