Quick start - mjsull/HapFlow GitHub Wiki

This section describes the workflow for using HapFlow with paired-end Illumina reads, starting from the creation of a sorted BAM file and a VCF file. If you already have a sorted, indexed BAM file and a VCF file, start at step 4. If you'd like to experiment with a premade flow file, one is available in the Example 1 zip file and can be loaded as described in step 6. For a more in-depth guide to HapFlow's features, please use the tutorials.

Workflow

1) The reads first need to be aligned against a reference genome to create a sorted BAM file. This can be done using the read mapper BWA to create a SAM file:

% bwa mem reference.fasta reads.fq > aln.sam
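
The command above assumes that the reference has already been indexed with `bwa index` and passes a single reads file. For paired-end data stored as two FASTQ files, a minimal sketch could look like the following. The function name `align_paired` and the file names `reads_1.fq` and `reads_2.fq` are placeholders chosen for illustration and are not part of HapFlow or BWA.

```shell
# Hypothetical helper (not part of BWA or HapFlow): index the reference if
# needed, then map paired-end reads with bwa mem.
# Arguments: reference FASTA, mate 1 FASTQ, mate 2 FASTQ, output SAM.
align_paired() {
    # bwa index writes reference.fasta.bwt (among other files); skip if present.
    [ -e "$1.bwt" ] || bwa index "$1" || return 1
    # Given two FASTQ files, bwa mem treats them as mate 1 and mate 2.
    bwa mem "$1" "$2" "$3" > "$4"
}
```

With the file names used in this guide, the call would be `align_paired reference.fasta reads_1.fq reads_2.fq aln.sam`.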

2) Samtools can be used to convert the SAM file to a sorted and indexed BAM file:

% samtools faidx reference.fasta

% samtools import reference.fasta.fai aln.sam aln.bam

% samtools sort aln.bam aln.sorted

% samtools index aln.sorted.bam
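
The commands above use the older samtools 0.1.x syntax, where `sort` takes an output prefix. If samtools 1.3 or later is installed (an assumption about your setup), `sort` takes the output file through `-o` and can read the SAM file directly, so the conversion might be sketched as below. The function name `sam_to_sorted_bam` is a placeholder for illustration only.

```shell
# Hypothetical helper for samtools >= 1.3 (assumed; the guide itself uses
# the 0.1.x syntax). Arguments: input SAM, output sorted BAM.
sam_to_sorted_bam() {
    # sort accepts SAM input and writes BAM by default; -o names the output.
    samtools sort -o "$2" "$1" || return 1
    # Creates the index next to the BAM file as <output>.bai.
    samtools index "$2"
}
```

For the files in this guide, the call would be `sam_to_sorted_bam aln.sam aln.sorted.bam`.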

3) Freebayes is the recommended software for creating a VCF file to be used with HapFlow:

% freebayes -F 0.05 -p 10 -f reference.fasta aln.sorted.bam > aln.vcf
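
To make the short flags easier to read, the same call can be written with freebayes' long option names: `-f` is `--fasta-reference`, `-p` is `--ploidy`, and `-F` is `--min-alternate-fraction`. The sketch below wraps this in a function; the name `call_variants` is a placeholder for illustration, and the values 10 and 0.05 are simply those used in this guide.

```shell
# Hypothetical wrapper (not part of freebayes or HapFlow) that spells out
# the short flags used in this guide as long options.
# Arguments: reference FASTA, sorted BAM, output VCF.
call_variants() {
    freebayes \
        --fasta-reference "$1" \
        --ploidy 10 \
        --min-alternate-fraction 0.05 \
        "$2" > "$3"
}
```

For the files in this guide, the call would be `call_variants reference.fasta aln.sorted.bam aln.vcf`.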

4) Launch HapFlow from the UNIX command line or using an executable:

% python Hapflow.py

5) Create the flow file using the sorted BAM and the filtered VCF file.

File -> Create flow file

6) Load the flow file.

File -> Load flow file