Generating a consensus - rrwick/Trycycler GitHub Wiki
Requirements
Before this step, you'll need to have run the previous ones. If your cluster directories contain 2_all_seqs.fasta, 3_msa.fasta and 4_reads.fastq files, you should be ready!
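If you would rather check this from the command line, the following is a minimal sketch. The function name check_cluster_ready is made up for illustration and is not part of Trycycler; it only tests that the three files named above exist and are non-empty.

```shell
# Hypothetical helper (not a Trycycler command): succeeds only if the given
# cluster directory holds the three input files, each non-empty.
# Any missing or empty file is reported on stderr.
check_cluster_ready() {
    cluster_dir=$1
    status=0
    for f in 2_all_seqs.fasta 3_msa.fasta 4_reads.fastq; do
        if [ ! -s "$cluster_dir/$f" ]; then
            echo "missing or empty: $cluster_dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage sketch (the path is an example):
#   check_cluster_ready trycycler/cluster_001 && echo "ready for consensus"
```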
Concept
The final step of Trycycler is to generate a consensus contig sequence for each cluster. It does this by converting the MSA into a graph form, containing "same" chunks (where all the input sequences agree) and "different" chunks (where there are two or more options). It then chooses the most popular option for each different chunk (see How variants are chosen for the consensus sequence for more details). When there is a tie between options, Trycycler aligns the reads to the alternative sequences and chooses the option with the best read alignment scores.
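To make the idea of a "different" chunk concrete, here is a toy sketch of a plain majority vote over the alternatives for one chunk. It is not Trycycler's code. It takes "most popular" literally as a simple count, which is an assumption (the page on how variants are chosen describes the actual rule), and the function name pick_majority is invented for this example. It also assumes every alternative is a non-empty string.

```shell
# Toy illustration only, not Trycycler's implementation.
# Reads one alternative per line on stdin (one line per input sequence) and
# prints the most common one, or "TIE" if the top two counts are equal.
pick_majority() {
    sort | uniq -c | sort -k1,1nr | awk '
        NR == 1 { best = $2; best_count = $1 }
        NR == 2 && $1 == best_count { tie = 1 }
        END { if (tie) print "TIE"; else print best }
    '
}

# Usage sketch:
#   printf 'ACGT\nACGT\nACCT\n' | pick_majority    # two of three agree
#   printf 'ACGT\nACCT\n' | pick_majority          # no majority
```

In this sketch a "TIE" result marks the case where, according to the description above, Trycycler falls back on read alignment scores.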
Running Trycycler consensus
The Trycycler consensus command must be run separately for each of your good clusters.
Assuming your Trycycler output directory is trycycler and your good clusters are numbers 1, 2 and 3, these are the commands you would run:
trycycler consensus --cluster_dir trycycler/cluster_001
trycycler consensus --cluster_dir trycycler/cluster_002
trycycler consensus --cluster_dir trycycler/cluster_003
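With more than a few good clusters, a loop may save some typing. The following is one possible sketch: run_consensus is a hypothetical wrapper, and the TRYCYCLER variable is an assumption added only so the executable can be swapped out (for example with echo for a dry run). It assumes cluster directories are named cluster_ followed by a zero-padded three-digit number, as in the commands above.

```shell
# Hypothetical wrapper around the commands above.
# $1 = Trycycler output directory, remaining arguments = good cluster numbers
# (plain integers such as 1 2 3, without leading zeros).
run_consensus() {
    out_dir=$1
    shift
    for n in "$@"; do
        cluster_dir=$(printf '%s/cluster_%03d' "$out_dir" "$n")
        # TRYCYCLER is an assumed override; it defaults to the real command.
        ${TRYCYCLER:-trycycler} consensus --cluster_dir "$cluster_dir" || return 1
    done
}

# Usage sketch, equivalent to the three commands above:
#   run_consensus trycycler 1 2 3
```

The loop stops at the first cluster that fails, so a problem in one cluster is not buried under output from the others.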
Settings
--linear: use this option if your input contigs are not circular. It will disable the circularisation steps when aligning reads and choosing variants.
--min_aligned_len: reads with fewer than this many bases aligned (default = 1000) will be ignored.
--min_read_cov: reads with less than this percentage of their length aligned (default = 90.0) will be ignored.
--threads: the number of threads Trycycler will use for read alignment. It only affects speed, so you'll probably want to use as many threads as you have available.
--verbose: use this flag to display extra output. For every read-assessed variant, this will show the alternative sequences and their read alignment scores.
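As a sketch of how these settings might be combined, the example below sets the thread count to the number of available cores and spells out the two read filters at their default values. The cluster path is an example, and using getconf to count cores is an assumption about your system. The command is printed rather than run so that you can check it first.

```shell
# Use every available core; fall back to 1 if getconf cannot report a count.
threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Dry run: print the command rather than executing it. Drop the leading
# "echo" to run it for real. Both thresholds are shown at their defaults.
echo trycycler consensus \
    --cluster_dir trycycler/cluster_001 \
    --threads "$threads" \
    --min_aligned_len 1000 \
    --min_read_cov 90.0 \
    --verbose
```

For a linear replicon you would add --linear to the same command.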
Output
When finished, you should have a 7_final_consensus.fasta file in each of your cluster directories. If you have multiple clusters, you can combine their consensus sequences into a single FASTA file like this:
cat trycycler/cluster_*/7_final_consensus.fasta > trycycler/consensus.fasta
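A quick sanity check on the combined file is to compare its number of FASTA records with the total across the per-cluster files. The helper below is a hypothetical sketch that simply counts header lines (lines starting with ">"). Since each cluster yields one consensus contig, the count would be expected to match your number of good clusters.

```shell
# Hypothetical helper: count FASTA records in one or more files by counting
# header lines. Prints a single total.
count_fasta_records() {
    cat "$@" | grep -c '^>'
}

# Usage sketch (paths follow the example above):
#   per_cluster=$(count_fasta_records trycycler/cluster_*/7_final_consensus.fasta)
#   combined=$(count_fasta_records trycycler/consensus.fasta)
#   [ "$per_cluster" -eq "$combined" ] && echo "record counts match: $combined"
```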
This is the end of Trycycler's pipeline! However, you might want to polish your consensus sequences to further improve their accuracy.