Demo dataset - rrwick/Autocycler GitHub Wiki
This demo dataset is a small 'genome' consisting of some E. coli plasmids. Excluding the chromosome keeps the file sizes small, which makes this demo faster to download and assemble. The dataset provides a practical way to test Autocycler's workflow and become familiar with its commands.
Download the Demo Dataset
You can download the demo dataset from here: autocycler-demo-dataset.tar
The autocycler_demo_dataset.tar file contains the following:
reads.fastq.gz: 75 Mbp of ONT reads
truth.fasta: an error-free reference
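As a quick sanity check after extracting the archive, the total number of bases in reads.fastq.gz can be counted and compared against the stated 75 Mbp. The following is a minimal sketch, not part of Autocycler: the function name count_fastq_bases is hypothetical, and it assumes standard four-line FASTQ records.

```shell
# Sum the lengths of all sequence lines in a gzipped FASTQ file.
# Assumption: standard four-line FASTQ records, so the sequence is
# always the second line of each record.
count_fastq_bases() {
    gzip -dc "$1" | awk 'NR % 4 == 2 { total += length($0) } END { printf "%d\n", total }'
}

# Usage (prints the total base count):
#   count_fastq_bases reads.fastq.gz
```

A result far from 75 million would suggest an incomplete download or extraction.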
Running Autocycler on the Demo Dataset
The following commands will guide you through running a fully automated assembly on the demo dataset. These commands use only three different assemblers to minimise processing time.
threads="16"
genome_size="242000"
autocycler subsample --reads reads.fastq.gz --out_dir subsampled_reads --genome_size "$genome_size"
mkdir assemblies
for assembler in flye miniasm raven; do
    for i in 01 02 03 04; do
        "$assembler".sh subsampled_reads/sample_"$i".fastq assemblies/"$assembler"_"$i" "$threads" "$genome_size"
    done
done
rm subsampled_reads/*.fastq
autocycler compress -i assemblies -a autocycler_out
autocycler cluster -a autocycler_out
for c in autocycler_out/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done
autocycler combine -a autocycler_out -i autocycler_out/clustering/qc_pass/cluster_*/5_final.gfa
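The combine step above takes each QC-pass cluster's 5_final.gfa as input, so before running it, it can be useful to confirm that trim and resolve produced that file for every cluster. The following is a rough sketch rather than an Autocycler feature: the function name check_cluster_outputs is hypothetical, and the only file name it relies on is the 5_final.gfa used in the commands above.

```shell
# Report any cluster directory that lacks a non-empty 5_final.gfa.
# Returns non-zero if at least one cluster is missing its final graph.
check_cluster_outputs() {
    local qc_pass_dir="$1" missing=0 c
    for c in "$qc_pass_dir"/cluster_*; do
        [ -d "$c" ] || continue          # skip the literal pattern if nothing matched
        if [ ! -s "$c/5_final.gfa" ]; then
            echo "missing: $c/5_final.gfa" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_cluster_outputs autocycler_out/clustering/qc_pass && echo "all clusters resolved"
```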
The final consensus assembly will be saved as autocycler_out/consensus_assembly.fasta. This assembly should closely (ideally exactly) match truth.fasta, but since the plasmids are circular, the sequences will probably differ in strand and starting position.
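Because of those strand and start-position differences, a plain diff of the two FASTA files is not informative. One way to test for an exact match is to check whether one sequence occurs in the doubled copy of the other, on either strand. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, the sequences contain only A, C, G and T, and the records have to be paired up by hand. It is not a tool shipped with Autocycler.

```shell
# Print each FASTA record's sequence on a single line, in file order.
fasta_to_lines() {
    awk '/^>/ { if (seq != "") print seq; seq = ""; next }
         { seq = seq $0 }
         END { if (seq != "") print seq }' "$1"
}

# Reverse-complement a DNA string (A/C/G/T, either case).
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Succeed if circular sequences $1 and $2 are identical up to
# rotation (starting position) and strand.
same_circular_sequence() {
    local a b doubled
    a=$(printf '%s' "$1" | tr 'acgt' 'ACGT')
    b=$(printf '%s' "$2" | tr 'acgt' 'ACGT')
    [ "${#a}" -eq "${#b}" ] || return 1
    doubled="$a$a"                       # every rotation of a is a substring of a+a
    case "$doubled" in *"$b"*) return 0 ;; esac
    case "$doubled" in *"$(revcomp "$b")"*) return 0 ;; esac
    return 1
}

# Usage (comparing the first record of each file):
#   a=$(fasta_to_lines truth.fasta | sed -n 1p)
#   b=$(fasta_to_lines autocycler_out/consensus_assembly.fasta | sed -n 1p)
#   same_circular_sequence "$a" "$b" && echo "exact match"
```

This only detects exact matches. An assembly with a handful of base-level errors would need an alignment-based comparison instead.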
Other demo datasets
You can also try running Autocycler on the Trycycler demo datasets, which contain pre-made assemblies. These are somewhat dated (the assemblies have a higher error rate, with many homopolymer-length errors) but will still work with Autocycler. The 'great', 'good' and 'mediocre' datasets should yield a structurally correct assembly.