Fully Automated Assembly - rrwick/Autocycler GitHub Wiki

The following commands can be run without any human intervention.

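In addition to Autocycler, these commands use some of the helper scripts: genome_size_raven.sh for estimating the genome size and one script per assembler (canu.sh, flye.sh and so on). See the Generating input assemblies and Genome size estimation wiki pages for more details.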

For more details on each step in the process, see the corresponding wiki pages.
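
Because the commands below run unattended, a missing tool may only surface partway through the run. A minimal pre-flight sketch follows, assuming all of the tools are expected on your PATH. The function name require_tools is hypothetical and not part of Autocycler; the tool names in the example call are taken from the commands on this page.

```shell
# Sketch: confirm that every command the workflow calls can be found on PATH.
# require_tools is a hypothetical helper, not an Autocycler command.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required command: $tool" >&2
            missing=1
        fi
    done
    # Return 0 only if every tool was found.
    return "$missing"
}

# Example call (adjust the list to the assemblers you actually use):
# require_tools autocycler genome_size_raven.sh canu.sh flye.sh miniasm.sh \
#     necat.sh nextdenovo.sh raven.sh || exit 1
```

Reporting every missing tool before returning, rather than stopping at the first one, means a single pre-flight run shows everything that still needs installing.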

reads=ont.fastq.gz  # your read set goes here
threads=16  # set as appropriate for your system (no more than 128)
genome_size=$(genome_size_raven.sh "$reads" "$threads")  # can set this manually if you know the value

# Step 1: subsample the long-read set into multiple files
autocycler subsample --reads "$reads" --out_dir subsampled_reads --genome_size "$genome_size"

# Step 2: assemble each subsampled file
mkdir -p assemblies  # -p avoids an error if the directory already exists
for assembler in canu flye miniasm necat nextdenovo raven; do
    for i in 01 02 03 04; do
        "$assembler".sh subsampled_reads/sample_"$i".fastq assemblies/"$assembler"_"$i" "$threads" "$genome_size"
    done
done

# Optional step: remove the subsampled reads to save space
rm subsampled_reads/*.fastq

# Step 3: compress the input assemblies into a unitig graph
autocycler compress -i assemblies -a autocycler_out

# Step 4: cluster the input contigs into putative genomic sequences
autocycler cluster -a autocycler_out

# Steps 5 and 6: trim and resolve each QC-pass cluster
for c in autocycler_out/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done

# Step 7: combine resolved clusters into a final assembly
autocycler combine -a autocycler_out -i autocycler_out/clustering/qc_pass/cluster_*/5_final.gfa

The final consensus assembly will be named autocycler_out/consensus_assembly.fasta. Autocycler does not reorient circular sequences, so you may want to use Dnaapler for that.
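
In an unattended run it can be useful to log a quick summary of the final FASTA, for example to spot an empty or unexpectedly fragmented assembly. The following is a minimal sketch using only awk. The function name fasta_summary is hypothetical and not part of Autocycler.

```shell
# Sketch: print the number of sequences and the total number of bases in a FASTA file.
# fasta_summary is a hypothetical helper, not an Autocycler command.
fasta_summary() {
    awk '
        # Header line: count one sequence.
        /^>/ { n++; next }
        # Sequence line: strip whitespace, then add its length.
        { gsub(/[ \t\r]/, ""); bases += length($0) }
        END { printf "sequences=%d bases=%d\n", n, bases }
    ' "$1"
}

# Example call:
# fasta_summary autocycler_out/consensus_assembly.fasta
```

Comparing the reported base count against the genome size estimate used earlier gives a rough sanity check, though it says nothing about the correctness of the sequence itself.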

If you perform many automated assemblies with Autocycler, I recommend running autocycler table after they finish to produce a TSV that you can check for problematic genomes.
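
A sketch of how such a TSV could be built across several finished runs is shown below. It rests on assumptions to verify with autocycler table --help: that autocycler table prints a header line when run without -a, and prints one row for a run when given -a (the Autocycler directory) and -n (a sample name). The function name build_metrics_table and the one-directory-per-sample layout are hypothetical.

```shell
# Sketch: build one TSV covering several finished Autocycler runs.
# ASSUMPTION: `autocycler table` with no -a prints the header line, and
# `autocycler table -a DIR -n NAME` prints one row. Check --help to confirm.
# build_metrics_table is a hypothetical helper, not an Autocycler command.
build_metrics_table() {
    out_tsv=$1
    shift
    # Write the header line first.
    autocycler table > "$out_tsv"
    for dir in "$@"; do
        # Use the parent directory as the sample name,
        # e.g. sample_A for sample_A/autocycler_out.
        name=$(basename "$(dirname "$dir")")
        autocycler table -a "$dir" -n "$name" >> "$out_tsv"
    done
}

# Example call, assuming one directory per sample:
# build_metrics_table metrics.tsv */autocycler_out
```

Naming each row after the parent directory only works if every sample has its own directory; with a different layout, pass the names explicitly instead.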

And if you want to automate the entire Autocycler assembly process, take a look at the pipelines directory in the Autocycler repo. It contains user-contributed pipelines designed to simplify and streamline running Autocycler. Feel free to use, modify or contribute your own!
