Fully automated assembly - rrwick/Autocycler GitHub Wiki

The following commands can be run without any human intervention. For more details on each step in the process, see the corresponding wiki pages.
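
Because an unattended run will not stop to ask about a missing tool, it can help to confirm that the required executables are on your PATH before starting. The following is a minimal sketch and not part of Autocycler: require_tools is a hypothetical helper name, and the executable names you pass for the assemblers are an assumption that depends on how each one was installed.

```bash
# Hypothetical helper (not part of Autocycler): report every listed
# executable that cannot be found on PATH. Returns 0 if all are present
# and 1 if any are missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing executable: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (extend the list with the assembler executables you use):
# require_tools autocycler || exit 1
```

Checking everything up front means a missing assembler is reported immediately rather than partway through Step 2.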

reads=ont.fastq.gz  # your read set goes here
threads=16  # set as appropriate for your system (no more than 128)
genome_size=$(autocycler helper genome_size --reads "$reads" --threads "$threads")  # can set this manually if you know the value

# Step 1: subsample the long-read set into multiple files
autocycler subsample --reads "$reads" --out_dir subsampled_reads --genome_size "$genome_size"

# Step 2: assemble each subsampled file
mkdir -p assemblies
for assembler in canu flye miniasm necat nextdenovo raven; do
    for i in 01 02 03 04; do
        autocycler helper "$assembler" --reads subsampled_reads/sample_"$i".fastq --out_prefix assemblies/"$assembler"_"$i" --threads "$threads" --genome_size "$genome_size"
    done
done

# Optional step: remove the subsampled reads to save space
rm subsampled_reads/*.fastq

# Step 3: compress the input assemblies into a unitig graph
autocycler compress -i assemblies -a autocycler_out

# Step 4: cluster the input contigs into putative genomic sequences
autocycler cluster -a autocycler_out

# Steps 5 and 6: trim and resolve each QC-pass cluster
for c in autocycler_out/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done

# Step 7: combine resolved clusters into a final assembly
autocycler combine -a autocycler_out -i autocycler_out/clustering/qc_pass/cluster_*/5_final.gfa

The final consensus assembly will be written to autocycler_out/consensus_assembly.fasta. Autocycler does not reorient circular sequences, so you may want to use Dnaapler to give them a consistent starting position.
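
As a rough illustration of that reorientation step, the sketch below wraps a Dnaapler call so that it fails cleanly when the assembly is missing or Dnaapler is not installed. reorient_assembly is a hypothetical wrapper name, and the Dnaapler options shown (the all subcommand with -i, -o and -t) are my understanding of its command line, so check them against dnaapler all --help for your installed version.

```bash
# Sketch: reorient the consensus assembly with Dnaapler, if it is installed.
# reorient_assembly is a hypothetical wrapper; the dnaapler options used
# here are assumptions to verify with `dnaapler all --help`.
reorient_assembly() {
    assembly=$1          # e.g. autocycler_out/consensus_assembly.fasta
    out_dir=$2           # output directory for Dnaapler
    n_threads=${3:-1}    # defaults to a single thread
    if [ ! -s "$assembly" ]; then
        echo "assembly not found or empty: $assembly" >&2
        return 2
    fi
    if ! command -v dnaapler >/dev/null 2>&1; then
        echo "dnaapler not found on PATH" >&2
        return 3
    fi
    dnaapler all -i "$assembly" -o "$out_dir" -t "$n_threads"
}
```

With the variables from the script above, this could be called as reorient_assembly autocycler_out/consensus_assembly.fasta dnaapler_out "$threads".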

If you perform many automated assemblies with Autocycler, I recommend running Autocycler table once they have finished. It produces a TSV that makes problematic genomes easier to spot.
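
For example, with one directory per sample that each contains an autocycler_out directory, the rows could be gathered as sketched below. summarise_runs is a hypothetical wrapper name, and the directory layout is an assumption. The sketch also assumes that autocycler table with no arguments prints the header line and that -a (Autocycler directory) and -n (sample name) print one row, so confirm this with autocycler table --help.

```bash
# Sketch: collect one TSV row per Autocycler run into a single table.
# summarise_runs is a hypothetical wrapper. Assumed behaviour of the CLI:
#   autocycler table                 -> prints the TSV header
#   autocycler table -a DIR -n NAME  -> prints one row for that run
summarise_runs() {
    out_tsv=$1
    shift
    autocycler table > "$out_tsv" || return 1
    for run_dir in "$@"; do
        # Name each row after the directory that contains autocycler_out.
        sample=$(basename "$(dirname "$run_dir")")
        autocycler table -a "$run_dir" -n "$sample" >> "$out_tsv" || return 1
    done
}
```

A call such as summarise_runs metrics.tsv */autocycler_out would then write one header line followed by one row per sample.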

If you want to automate the entire Autocycler assembly process, take a look at the pipelines directory in the Autocycler repo. It contains user-contributed pipelines designed to simplify and streamline running Autocycler. Feel free to use, modify, or contribute your own!