Generate MAGs from Reads - bokulich-lab/q2-annotate GitHub Wiki
In this tutorial we will transform raw sequencing reads into Metagenome-Assembled Genomes (MAGs) using QIIME 2. Before you start, make sure you have a working virtual environment (instructions available here).
Approximate runtime: 5 hours
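Before starting, it can help to confirm that the environment actually exposes the tools this tutorial relies on. The following is a minimal sketch, not part of the plugins themselves: `require_cmd` is a hypothetical helper name, and the commented usage assumes the plugins are exposed as `qiime assembly` and `qiime moshpit`, as in the commands used throughout this page.

```shell
# require_cmd NAME: succeed if NAME is an executable on PATH; otherwise
# print a message to stderr and return a non-zero status.
# (Hypothetical helper, introduced here for illustration only.)
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    printf 'missing command: %s\n' "$1" >&2
    return 1
}

# Usage, inside the activated environment (assumes the plugin names
# used in this tutorial):
#   require_cmd qiime \
#     && qiime assembly --help >/dev/null \
#     && qiime moshpit --help >/dev/null \
#     && echo "environment looks ready"
```

If any of the chained commands fails, the final message is not printed, which usually means the environment is not activated or a plugin is missing.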
We will not be using any real data in this tutorial; instead, we will simulate a small dataset based on a set of genomes of known origin. However, feel free to skip this step if you have read data you would like to apply this workflow to. Check out the QIIME 2 documentation to find out how to import your reads into QIIME 2.
To simulate reads, we can use the generate-reads command from the q2-assembly plugin.
We can specify how many samples should be generated, with how many reads, and with which abundance distribution.
For now, let's simulate 2 samples with 500000 reads (uniform abundance distribution for 5 random genomes):
Estimated runtime: 45 minutes
# Download genomes
cd <download_here>
curl -L -o genomes.qza https://github.com/bokulich-lab/q2-moshpit/wiki/genomes.qza
# Simulate reads
qiime assembly generate-reads \
--i-genomes genomes.qza \
--p-sample-names sample{1,2} \
--p-n-reads 500000 \
--p-abundance uniform \
--p-n-genomes 5 \
--p-cpus 7 \
--output-dir reads \
--verbose
We can use the simulated reads to perform metagenome assembly. There are two assemblers available in the q2-assembly
plugin: we will use the MEGAHIT assembler in this tutorial - feel free to try out the MetaSPAdes assembler though.
Estimated runtime: 60 minutes
qiime assembly assemble-megahit \
--i-seqs reads/reads.qza \
--p-presets meta-sensitive \
--o-contigs contigs.qza \
--verbose \
--p-num-partitions 3 \
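--parallel
Once the assembly has finished, we can evaluate the resulting contigs using the evaluate-contigs action, which produces a visualization:
Estimated runtime: 6 minutes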
qiime assembly evaluate-contigs \
--i-contigs contigs.qza \
--p-min-contig 100 \
--o-visualization contigs.qzv \
--verbose
Before we perform the actual binning (MAG generation), we will need to map the reads to the assembled contigs. The resulting alignment map can then be used directly in the binning action.
We begin by generating a Bowtie2 index of the assembled contigs. This can be achieved by using the index-contigs action from the q2-assembly plugin:
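The commands in this tutorial pass a fixed count of 7 threads (or CPUs), which may not match your machine. As a minimal sketch, assuming `getconf _NPROCESSORS_ONLN` is available (it is on most Linux and macOS systems), the thread count could be detected once and reused. `detect_threads` and `THREADS` are hypothetical names introduced here for illustration.

```shell
# detect_threads: print the number of online CPU cores, falling back to 1
# when the value cannot be determined or is not a plain number.
# (Hypothetical helper, not part of QIIME 2 or its plugins.)
detect_threads() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

THREADS=$(detect_threads)
echo "Using ${THREADS} threads"

# The value can then replace the hard-coded 7, for example:
#   --p-threads "$THREADS"
```

On a shared machine you may prefer a smaller value than the detected one, so that other jobs are not starved of CPU time.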
Estimated runtime: 60 seconds
qiime assembly index-contigs \
--i-contigs contigs.qza \
--p-threads 7 \
--p-seed 100 \
--o-index contigs-index.qza \
--verbose
Next, we will generate a reads-to-contigs alignment map using the map-reads-to-contigs action from q2-assembly:
Estimated runtime: 60 seconds
qiime assembly map-reads-to-contigs \
--i-indexed-contigs contigs-index.qza \
--i-reads reads/reads.qza \
--p-threads 7 \
--p-seed 100 \
--o-alignment-map reads-to-contigs-aln.qza \
--verbose
Finally, we are ready to perform contig binning using MetaBAT2 through the bin-contigs-metabat action from q2-moshpit:
Estimated runtime: 60 seconds
qiime moshpit bin-contigs-metabat \
--i-contigs contigs.qza \
--i-alignment-maps reads-to-contigs-aln.qza \
--p-num-threads 7 \
--p-seed 100 \
--o-mags mags.qza \
--o-contig-map map.qza \
--o-unbinned-contigs unbinned.qza \
--verbose
Once you have obtained your MAGs you can use q2-moshpit to do quality control, gene prediction, or functional annotation.
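Since the whole workflow takes several hours, the steps above can be collected into a script that reports how long each step took and stops at the first failure. The following is a sketch under stated assumptions: `run_step` and the `DRY_RUN` variable are hypothetical names introduced here, not features of QIIME 2, and timing relies on `date +%s`, which is widely available but not guaranteed on every system.

```shell
# run_step NAME CMD [ARGS...]: run one pipeline step and report its
# wall-clock time. With DRY_RUN=1 the command is printed, not executed.
# Returns the exit status of the command so callers can stop on failure.
# (Hypothetical helper, for illustration only.)
run_step() {
    step_name=$1
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '[dry-run] %s: %s\n' "$step_name" "$*"
        return 0
    fi
    step_start=$(date +%s)
    "$@" || {
        status=$?
        printf '[failed] %s (exit %s)\n' "$step_name" "$status" >&2
        return "$status"
    }
    printf '[done] %s in %ss\n' "$step_name" "$(( $(date +%s) - step_start ))"
}

# Usage with one of the commands from this tutorial:
#   run_step index qiime assembly index-contigs \
#     --i-contigs contigs.qza \
#     --p-threads 7 \
#     --p-seed 100 \
#     --o-index contigs-index.qza \
#     --verbose || exit 1
```

Running the script once with `DRY_RUN=1` prints every command without executing it, which is a cheap way to check paths and parameters before committing to the full runtime.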