Microbiome Helper 2: Setting up environments for analysis (LangilleLab/microbiome_helper GitHub Wiki)
Authors: Robyn Wright
Modifications by: NA
Note that this is still a work in progress! We don't guarantee that everything will work :)
Setup of AWS MH2-2025 image
Launched instance from previous CBW-ICG-2024 image.
Update conda:
conda update conda
Install latest QIIME2:
conda env create \
--name qiime2-amplicon-2025.4 \
--file https://raw.githubusercontent.com/qiime2/distributions/refs/heads/dev/2025.4/amplicon/released/qiime2-amplicon-ubuntu-latest-conda.yml
Got this warning:
For Linux 64, Open MPI is built with CUDA awareness but this support is disabled by default.
To enable it, please set the environment variable OMPI_MCA_opal_cuda_support=true before
launching your MPI processes. Equivalently, you can set the MCA parameter in the command line:
mpiexec --mca opal_cuda_support 1 ...
In addition, the UCX support is also built but disabled by default.
To enable it, first install UCX (conda install -c conda-forge ucx). Then, set the environment
variables OMPI_MCA_pml="ucx" OMPI_MCA_osc="ucx" before launching your MPI processes.
Equivalently, you can set the MCA parameters in the command line:
mpiexec --mca pml ucx --mca osc ucx ...
Note that you might also need to set UCX_MEMTYPE_CACHE=n for CUDA awareness via UCX.
Please consult UCX's documentation for detail.
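Before installing anything else, it can be worth confirming that the new environment actually exists. The following is a minimal sketch, not part of the original setup: `env_in_list` is a hypothetical helper name, and it assumes the usual `conda env list` layout where the environment name is the first column.

```shell
# Return success if the named environment appears in `conda env list`
# output supplied on stdin. Comment lines start with '#', so their first
# field never matches an environment name.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only query conda when it is actually on PATH.
if command -v conda >/dev/null 2>&1; then
    if conda env list | env_in_list qiime2-amplicon-2025.4; then
        echo "qiime2-amplicon-2025.4 is present"
    else
        echo "qiime2-amplicon-2025.4 was not found" >&2
    fi
fi
```

The helper reads from stdin rather than calling conda itself, so the same check can be reused for any of the environments created further down.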
Install fastqc and multiqc in QIIME2 environment:
conda activate qiime2-amplicon-2025.4
mamba install bioconda::fastqc
mamba install bioconda::multiqc
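A quick way to check that the installs landed in the active environment is to look for each command on `PATH`. This is only a sketch: `check_tools` is a hypothetical helper, and the command names `qiime`, `fastqc` and `multiqc` in the usage comment are assumed to be the executables these packages provide.

```shell
# Report which of the named commands are on PATH.
# Returns non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Run inside the activated environment, for example:
# check_tools qiime fastqc multiqc
```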
Update R:
sudo apt update
sudo apt install r-base
Remove previous environments:
conda env remove -n qiime2-amplicon-2024.2-backup
conda env remove -n picrust2
conda env remove -n biobakery3
conda env remove -n anvio-7
conda env remove -n rgi
conda env remove -n checkm
conda env remove -n functional
conda env remove -n taxonomic
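The same removals can be written as a loop. The sketch below is an alternative to the individual commands above, not what was actually run: `remove_envs` and the `DRY_RUN` variable are hypothetical names, and by default it only prints the commands so the list can be reviewed before anything is deleted.

```shell
# Remove each named environment. With DRY_RUN=1 (the default here) the
# commands are only printed; set DRY_RUN=0 to actually run them.
remove_envs() {
    for env in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "conda env remove -y -n $env"
        else
            conda env remove -y -n "$env"
        fi
    done
}

remove_envs qiime2-amplicon-2024.2-backup picrust2 biobakery3 anvio-7 \
    rgi checkm functional taxonomic
```

The `-y` flag skips the confirmation prompt, which is why the dry run is the default.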
Install PICRUSt2 from conda:
conda create -n picrust2-v2.6.2
mamba activate picrust2-v2.6.2
mamba install bioconda::picrust2
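The environment name includes a version, but the install command does not pin one, so the installed version could drift from the name over time. One way to keep them in step is to build the package spec from the environment name. This is a sketch only: `spec_from_env` is a hypothetical helper, and it assumes the named version is available on bioconda.

```shell
# Turn an environment name of the form <package>-v<version> into a conda
# match spec, e.g. picrust2-v2.6.2 -> bioconda::picrust2=2.6.2.
spec_from_env() {
    name=$1
    pkg=${name%-v*}     # everything before the last "-v"
    ver=${name##*-v}    # everything after the last "-v"
    echo "bioconda::${pkg}=${ver}"
}

spec_from_env picrust2-v2.6.2
# To apply it:
# mamba install -n picrust2-v2.6.2 "$(spec_from_env picrust2-v2.6.2)"
```

The same naming pattern is used for the kneaddata, Kraken2 and RGI environments on this page, so the helper would work for those too.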
Install kneaddata:
mamba create -n kneaddata-v0.12.2
mamba activate kneaddata-v0.12.2
mamba install bioconda::kneaddata
mamba install bowtie2 # unnecessary, as bowtie2 is already installed
mamba install bioconda::trimmomatic
mamba install bioconda::trf
mamba install bioconda::fastqc
mamba install bioconda::multiqc
mamba install conda-forge::parallel
Install Kraken2:
mamba create -n kraken2-v2.14
mamba activate kraken2-v2.14
mamba install bioconda::bracken
mamba install bioconda::kraken2
mamba install conda-forge::parallel
Kraken2 always reports its version as 2.1.3, even though mamba shows 2.14 when installing:
  Package      Version  Build             Channel   Size
─────────────────────────────────────────────────────────────
  Reinstall:
─────────────────────────────────────────────────────────────
  o kraken2       2.14  pl5321h077b44d_0  bioconda  Cached
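To compare what the binary reports with what the package manager has recorded, something like the following could be used. It is a sketch under assumptions: `kraken_reported_version` is a hypothetical helper, and it assumes the first line of `kraken2 --version` has the form "Kraken version X.Y.Z". Printing the binary's path is there to rule out an older kraken2 elsewhere on `PATH`, which is only a guess at the cause and not something these notes confirm.

```shell
# Extract the version number from the first line of `kraken2 --version`
# output, which is expected to look like "Kraken version X.Y.Z".
kraken_reported_version() {
    awk 'NR == 1 { print $3 }'
}

# Only run the comparison when both tools are available.
if command -v kraken2 >/dev/null 2>&1 && command -v conda >/dev/null 2>&1; then
    echo "binary path:    $(command -v kraken2)"
    echo "binary reports: $(kraken2 --version | kraken_reported_version)"
    # `conda list` shows the version recorded for the package in this environment.
    conda list -n kraken2-v2.14 '^kraken2$'
fi
```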
Install Anvi'o:
conda create -y --name anvio-8 python=3.10
conda activate anvio-8
mamba install -y -c conda-forge -c bioconda python=3.10 \
sqlite=3.46 prodigal idba mcl muscle=3.8.1551 famsa hmmer diamond \
blast megahit spades bowtie2 bwa graphviz "samtools>=1.9" \
trimal iqtree trnascan-se fasttree vmatch r-base r-tidyverse \
r-optparse r-stringi r-magrittr bioconductor-qvalue meme ghostscript \
nodejs=20.12.2
mamba install -y -c bioconda fastani
curl -L https://github.com/merenlab/anvio/releases/download/v8/anvio-8.tar.gz \
--output anvio-8.tar.gz
pip install anvio-8.tar.gz
mamba install bioconda::concoct
mamba install bioconda::metabat2
mamba install bioconda::maxbin2
mamba install bioconda::das_tool
mamba install bioconda::binsanity
mamba install bioconda::gtdbtk
mamba install usearch
# Note: at one point I had to downgrade scikit-learn to v1.1.0 after getting pickle errors:
#pip install scikit-learn==1.1.0
Install CheckM2:
mamba create -n checkm2 -c bioconda -c conda-forge checkm2
Install RGI (for CARD):
mamba create -n rgi-v6.0.4
mamba activate rgi-v6.0.4
mamba install bioconda::rgi
mamba install conda-forge::parallel
RStudio server:
sudo apt update
sudo apt upgrade
sudo apt-get install r-base
sudo apt-get install gdebi-core
wget https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2025.05.1-513-amd64.deb
sudo gdebi rstudio-server-2025.05.1-513-amd64.deb
Programs for getting a tree of metagenomic reads:
mamba create -n get_tree python=3.9
mamba activate get_tree
mamba install conda-forge::ete3
mamba install conda-forge::pandas
Programs for functional annotation of metagenomic reads with HMMs:
mamba create -n annotate_hmm
mamba activate annotate_hmm
mamba install anaconda::biopython
mamba install bioconda::clustalo
mamba install bioconda::hmmer
mamba install bioconda::raxml
mamba install bioconda::epa-ng
mamba install bioconda::gappa
mamba install conda-forge::r-castor
mamba install conda-forge::ete3
mamba install conda-forge::pandas
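Since most of the installs on this page are unpinned, it may help to record what ended up in each environment once the image is built. The following is a sketch, not something that was run for this image: `env_names` and `export_envs` are hypothetical helpers, and the output directory is whatever you pass in.

```shell
# Print environment names from `conda env list` output on stdin,
# skipping comment lines, blank lines, and unnamed (path-only) entries.
env_names() {
    awk '!/^#/ && NF >= 2 { print $1 }'
}

# Export every named environment to <outdir>/<name>.yml.
export_envs() {
    outdir=$1
    mkdir -p "$outdir"
    conda env list | env_names | while read -r env; do
        conda env export -n "$env" > "$outdir/$env.yml"
    done
}

# Example (writes one YAML file per environment):
# export_envs "$HOME/env_exports"
```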