Microbiome Helper 2 Tutorial data - LangilleLab/microbiome_helper GitHub Wiki

Authors: Robyn Wright

Modifications by: NA

Please note: We think that everything here should work, but we are still testing/developing this so use with caution :)

This page contains links to the studies that we have taken the tutorial data from, links to download the data, and details on how we have made these files.

Marker gene

We are using samples from this study:

Pascoal, F., Duarte, P., Assmy, P. & Costa, R. (2024) Full-length 16S rRNA gene sequencing combined with adequate database selection improves the description of Arctic marine prokaryotic communities. Annals of Microbiology.

This study contains both full-length (PacBio) and V4-V5 region (Illumina) 16S rRNA gene sequencing data for samples collected from the Arctic Ocean.

If you just want to run through the tutorials, then you can download the data to your laptop/server like so.

Illumina:

mkdir arctic_ocean_illumina/
cd arctic_ocean_illumina/

wget http://kronos.pharmacology.dal.ca/public_files/MH2/marker_gene/illumina/raw_data.tar.gz
wget http://kronos.pharmacology.dal.ca/public_files/MH2/marker_gene/illumina/arctic_study_metadata_illumina.txt

tar -xvf raw_data.tar.gz
rm raw_data.tar.gz

PacBio:

mkdir arctic_ocean_pacbio/
cd arctic_ocean_pacbio/

wget http://kronos.pharmacology.dal.ca/public_files/MH2/marker_gene/pacbio/raw_data.tar.gz
wget http://kronos.pharmacology.dal.ca/public_files/MH2/marker_gene/pacbio/arctic_study_metadata_pacbio.txt

tar -xvf raw_data.tar.gz
rm raw_data.tar.gz
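
The `rm` step in the commands above deletes the archive whether or not the extraction worked. A minimal sketch of a more defensive variant follows. `extract_and_clean` is a hypothetical helper name introduced here for illustration (it is not part of Microbiome Helper), and it relies only on standard `tar` options.

```shell
# Hypothetical helper (not part of Microbiome Helper): extract a .tar.gz
# archive and delete it only if the extraction succeeded.
extract_and_clean() {
  archive="$1"
  # -t lists the contents without extracting; this fails on a truncated
  # or corrupt download, so we stop before touching anything.
  if ! tar -tzf "$archive" > /dev/null 2>&1; then
    echo "Archive $archive looks incomplete or corrupt; keeping it." >&2
    return 1
  fi
  # Extract, then remove the archive only when tar exits successfully.
  tar -xzf "$archive" && rm "$archive"
}
```

After either `wget` step, running `extract_and_clean raw_data.tar.gz` would stand in for the separate `tar` and `rm` commands. If the download was interrupted, the archive is left in place so that it can be inspected or fetched again.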

Downloading the samples

We will provide details for downloading the samples for yourself here.

Metagenome

We are using samples from this study:

Meyer, F., Fritz, A., Deng, Z-L., et al. (2022) Critical Assessment of Metagenome Interpretation: the second round of challenges. Nature Methods.

This study contains samples that were simulated with long and short reads to represent different environments, and these can all be downloaded from this page.

You can either use the full samples (downloading them as we have done below) or use the sub-sampled versions that we have made so that they can be run in a shorter timeframe (for workshops, or for testing that methods work). The sub-sampled versions contain both a sub-sample of all reads for read-based taxonomic profiling and a sub-sample with a few selected taxa for assembly-based analyses.

If you just want to run through the tutorials (with the sub-sampled reads), then you can download the data to your server like so.

Short reads for read-based analyses:

mkdir short_read
cd short_read
wget http://kronos.pharmacology.dal.ca/public_files/MH2/metagenome/marine/short_read/subsampled_reads/reads.tar.gz
tar -xvf reads.tar.gz

Short reads for assembly-based analyses:

mkdir -p short_read
cd short_read
wget http://kronos.pharmacology.dal.ca/public_files/MH2/metagenome/marine/short_read/subsampled_reads/for_assembly.tar.gz
tar -xvf for_assembly.tar.gz

Long reads for read-based analyses:

mkdir long_read
cd long_read
wget http://kronos.pharmacology.dal.ca/public_files/MH2/metagenome/marine/long_read/subsampled_reads/reads.tar.gz
tar -xvf reads.tar.gz

Long reads for assembly-based analyses:

mkdir -p long_read
cd long_read
wget http://kronos.pharmacology.dal.ca/public_files/MH2/metagenome/marine/long_read/subsampled_reads/for_assembly.tar.gz
tar -xvf for_assembly.tar.gz
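
The four download blocks above differ only in the read type (`short_read` or `long_read`) and the archive name (`reads` or `for_assembly`). As a sketch, they could be collapsed into a loop. `MH2_BASE`, `mh2_url` and `download_subsampled` are hypothetical names introduced here for illustration, and the URL pattern is simply the one visible in the `wget` commands above.

```shell
# Hypothetical helpers for illustration; the URL pattern is copied from the
# wget commands above.
MH2_BASE="http://kronos.pharmacology.dal.ca/public_files/MH2/metagenome/marine"

# Print the URL for one sub-sampled archive.
#   $1: read type (short_read or long_read)
#   $2: archive name (reads or for_assembly)
mh2_url() {
  echo "${MH2_BASE}/$1/subsampled_reads/$2.tar.gz"
}

# Download and extract all four sub-sampled archives, one directory per read type.
download_subsampled() {
  for read_type in short_read long_read; do
    mkdir -p "$read_type"
    for archive in reads for_assembly; do
      # Run in a subshell so the cd does not affect the calling shell.
      (
        cd "$read_type" &&
        wget "$(mh2_url "$read_type" "$archive")" &&
        tar -xvf "$archive.tar.gz"
      ) || return 1
    done
  done
}
```

Calling `download_subsampled` from the directory where you want the data would create `short_read/` and `long_read/` side by side. `mkdir -p` is used so that re-running it does not fail when a directory already exists.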

Downloading and prepping the samples

We will provide details for downloading the samples for yourself here, as well as how we sub-sampled the reads within the samples.